notes for building a chatbot
When designing a chatbot, companies have two options: a machine learning based approach or a linguistic rules approach

“It gets better over time” may be the leading slogan for artificial intelligence (AI) these days. It’s also the technology’s biggest excuse. Is “better later” acceptable in today’s marketplace? How long should we wait for AI to become great? 

For a company that is trying to decide whether to use chatbots to serve customers, those questions matter.

Two Roads to Rome

In customer service, pretty much everything starts with a question:

  • “What is my balance?”
  • “Where is my order?”
  • “Why did I get charged for that?”

Because companies know that interactions are probably going to begin with a question, they need to program customer service chatbots to determine the intent of the message — i.e., what it is the customer wants. Once the chatbot knows the intent, it can engage in a more or less scripted dialogue to achieve the goal — just like a customer service agent, a bank teller or the person taking a food order in a restaurant.

To build such an “intent classification” algorithm, you can take one of two paths: the machine learning approach or the linguistic rules-based approach.

The Machine Learning Chatbot Approach

A machine learning (ML) engine, based on neural networks, looks at a pattern (say, a text message) and maps it to a concept such as the semantics, or the intent of the customer sending the message. 

For this approach, developers train the chatbot by showing it a number of different sample sentences (“What’s my balance?” “What is my balance?” “Balance.” “My balance, please.”) and telling it they all mean the same thing: BALANCEINQUIRY. The engine will observe that the sentences share a recurring pattern of characters, such as “my balance.” The engine has no knowledge whatsoever of words, their relationships or anything else; it only sees character sequences. And it needs to see a lot of character sequences (hundreds, thousands, tens of thousands) until it has observed enough to abstract from the original training sentences.

Designers then repeat the above for every intent they want the bot to distinguish, i.e., for every distinct question they want answered. When that training phase is done, the engine is deployed to the world. When it then sees a sentence that is exactly like one it was trained on, or one that is similar in surface structure, it will conclude that “the customer means BALANCEINQUIRY.”
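The train-then-classify loop described above can be sketched in a few lines of Python. This is a toy stand-in, not a neural network: it builds a character trigram profile per intent and picks the intent whose profile is closest by cosine similarity. The ORDERSTATUS intent, its sample sentences and all function names are assumptions made for illustration.

```python
from collections import Counter
import math

def char_ngrams(text, n=3):
    """Bag of character n-grams; the model sees character sequences, not words."""
    padded = f" {text.lower().strip()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def train(samples):
    """Sum the n-gram counts of all sample sentences tagged with each intent."""
    profiles = {}
    for sentence, intent in samples:
        profiles.setdefault(intent, Counter()).update(char_ngrams(sentence))
    return profiles

def classify(profiles, sentence):
    """Return the intent whose profile is most similar in surface structure."""
    grams = char_ngrams(sentence)
    return max(profiles, key=lambda intent: cosine(grams, profiles[intent]))

# Tagged sample sentences; ORDERSTATUS is a hypothetical second intent.
TRAINING = [
    ("What's my balance?", "BALANCEINQUIRY"),
    ("What is my balance?", "BALANCEINQUIRY"),
    ("Balance.", "BALANCEINQUIRY"),
    ("My balance, please.", "BALANCEINQUIRY"),
    ("Where is my order?", "ORDERSTATUS"),
    ("Track my order", "ORDERSTATUS"),
    ("Order status, please.", "ORDERSTATUS"),
]

profiles = train(TRAINING)
print(classify(profiles, "my balance pls"))
```

A production engine would need far more samples per intent and a real learning algorithm, but the shape is the same: tagged sentences go in, and unseen sentences are matched by how much their characters resemble the training data.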

The Linguistic Rules Chatbot Approach

For a linguistic rules approach, a chatbot developer takes a linguistic engine that has knowledge of a given language’s syntax, semantics and morphology (how words are built) and then adds program rules that look for the key semantic concepts that determine that a sentence has a certain meaning (e.g., if it is told to look for the word balance, it automatically knows to also look for money/cash/savings, etc.).

As in the ML-based approach, designers then repeat the above for every intent they want the bot to distinguish, i.e., for every distinct question they want answered. When the engine is deployed after that programming phase is done, it will conclude “the customer means BALANCEINQUIRY” when it sees a sentence that is exactly like one its rules were written for, or one that is similar in meaning.
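A minimal sketch of the rules-based idea follows, assuming a tiny hand-written concept lexicon in place of a real linguistic engine. The concept names, word lists and the ORDERSTATUS rule are hypothetical. A real engine would also handle morphology and syntax, which this sketch does not.

```python
import re

# Hypothetical concept lexicon standing in for the engine's built-in
# knowledge that "balance" is related to money, cash, savings, etc.
CONCEPTS = {
    "MONEY": {"balance", "money", "cash", "savings"},
    "ORDER": {"order", "package", "shipment", "delivery"},
}

# Each rule names an intent and the concepts that must all be present.
RULES = [
    ("BALANCEINQUIRY", {"MONEY"}),
    ("ORDERSTATUS", {"ORDER"}),
]

def concepts_in(sentence):
    """Return the set of concepts whose lexicon matches a word in the sentence."""
    words = set(re.findall(r"[a-z']+", sentence.lower()))
    return {name for name, lexicon in CONCEPTS.items() if words & lexicon}

def classify(sentence):
    """Return the first intent whose required concepts are all found, else None."""
    found = concepts_in(sentence)
    for intent, required in RULES:
        if required <= found:
            return intent
    return None

print(classify("How much money do I have in the account?"))
```

Because the rule targets the concept rather than a particular wording, a sentence that never mentions the word balance can still match.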

In both cases, once the engine has determined that the user intent was BALANCEINQUIRY, the chatbot script will then have logic that asserts, “If intent was BALANCEINQUIRY, then look up balance from backend and present it.” 
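That script logic might look like the sketch below. The in-memory BALANCES table, the customer ID and the handler names are assumptions for illustration; a real bot would call an actual backend service.

```python
# Hypothetical in-memory stand-in for the backend system.
BALANCES = {"alice": 1250.75}

def lookup_balance(customer_id):
    """Fetch the balance from the (fake) backend."""
    return BALANCES[customer_id]

def handle_balance_inquiry(customer_id):
    """Scripted reply for the BALANCEINQUIRY intent."""
    return f"Your balance is ${lookup_balance(customer_id):,.2f}."

# Map each intent to the script that fulfils it.
HANDLERS = {"BALANCEINQUIRY": handle_balance_inquiry}

def respond(intent, customer_id):
    """Dispatch on the classified intent, with a fallback for unknown intents."""
    handler = HANDLERS.get(intent)
    if handler is None:
        return "Sorry, I didn't understand that."
    return handler(customer_id)

print(respond("BALANCEINQUIRY", "alice"))
```

The dispatch step is identical regardless of whether the intent came from a trained model or from linguistic rules.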

How to Choose Between the Two Approaches

Both approaches essentially yield the same result. These two questions will help you decide which approach to use:

  1. What is the difference in cost between the two approaches?
  2. What level of understanding accuracy do we accept at different points in the early life of the application?

Those are truly the only two questions that should matter when deciding which approach is right for your business, machine learning or linguistic rules. At the end of the day, it shouldn’t matter how software solves your problem, as long as it solves the problem and makes business sense.

The Pros and Cons of Each Approach

Marketers promoting the ML approach to chatbot building make it sound magical: You give the engine a few sample questions, and it magically “learns” from those. And then it learns from every new sentence it sees in the field and gets better over time. 

Unfortunately, that’s more marketing than reality and it’s a result of a lack of common understanding of how neural networks work. Everyone understands IF-THEN-ELSE style programming, but few understand how ML works. Would you let a piece of software create your website automatically from scratch by letting it read all of your product documentation? That’s roughly the equivalent of thinking ML by itself could build a chatbot for you.

The initial training step in an ML-based approach is labor intensive. You need to gather many sample sentences, tag them with meaning and let the engine do its thing — and then retrain it when it misclassifies something. And even if you have a corpus (a collection of language data) from years of your agents’ chats on your website, that corpus might not work well for a chatbot deployed on SMS. People use language very differently when they text than they do when they type on a full keyboard. And the differences might be greater still when they know they are communicating with a machine rather than a human — they might use command-like keywords rather than full sentences, for example.

One other point to remember: a neural network knows nothing about language. It doesn’t know that the word balance is related to money, cash, savings, banking, etc., so you need to spoon-feed it every possible variation in the way a question can be worded. If you don’t, it will fail to recognize lots of sentences that mean BALANCEINQUIRY but happen to be expressed in a totally different way from the sentences in the training material. For example, a bot could be stumped by “How much money do I have in the account?”, which does not include the words balance or my at all.
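To make the surface-structure problem concrete, the sketch below measures how many character trigrams a new sentence shares with the training sentences. Jaccard overlap is used here as a simple, assumed proxy for surface similarity; it is not how a neural network scores input, and the function names are hypothetical.

```python
def trigrams(text):
    """Set of character trigrams in a lowercased, space-padded sentence."""
    padded = f" {text.lower().strip()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def overlap(a, b):
    """Jaccard overlap between the trigram sets of two sentences."""
    grams_a, grams_b = trigrams(a), trigrams(b)
    union = grams_a | grams_b
    return len(grams_a & grams_b) / len(union) if union else 0.0

# The BALANCEINQUIRY sample sentences used for training.
TRAINING = [
    "What's my balance?",
    "What is my balance?",
    "Balance.",
    "My balance, please.",
]

def best_overlap(sentence):
    """Highest overlap between the sentence and any training sample."""
    return max(overlap(sentence, sample) for sample in TRAINING)

for sentence in ("my balance pls", "How much money do I have in the account?"):
    print(f"{best_overlap(sentence):.2f}  {sentence}")
```

The near-variant scores well because it reuses the same characters, while the paraphrase shares almost nothing with the training data even though it means the same thing.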

What if you need to deploy the bot in a different language? You must start from scratch and do the entire training phase again for each new language you want to support, because the surface structure will look completely different.

However, a linguistic rules-based approach also takes time. As with the ML approach, semantic rules need to be programmed for every intent you want the bot to distinguish. The difference is that with a linguistic engine, the accuracy you get with your first attempt is typically higher than it would be with machine learning, since the linguistic engine knows a lot about language out of the box. And the rules you define are language-independent, so once the rules are constructed, the bot immediately understands all languages the engine supports.

Furthermore, the outcome of creating a set of rules is a fully verifiable model that can easily be changed when intent misclassifications occur. With a neural network, you must convince what’s essentially a “black box” to interpret a sentence differently by feeding it lots of counter-examples that would sway its belief.

The Hybrid Option

The value proposition of a system that gets better with more data over time remains appealing, but long term, it is unlikely that either approach can live on its own. The linguistic rules-based approach suffers from an inability to improve itself once enough data does exist (no matter how large or small the manual effort involved), while the machine learning approach is too limited, too generic on its own. A hybrid approach, one that combines the best of both worlds, offers the most promise. It will be interesting to see how technology companies address this need.