chatterteeth toy eating toy desserts
PHOTO: rawpixel

While they may sound like a throwback to the 2000s, chatbots are an increasingly important part of many customer engagement strategies. The dividing line between chatbots and virtual assistants is up for debate, but it helps to think about them in the same way. Essentially, a chatbot is a program that mimics human conversation. Embedded on many of your favorite websites, they’re designed to make the customer experience easier through a variety of functions. These can include anything from managing account information to answering customer questions.

However, this explanation barely scratches the surface. Behind the little pop-up in the corner of your screen is a host of complex processes, designed to help the algorithm work in the rapidly evolving, nuanced world of language. As chatbot technology is ever more widely adopted, an understanding of these processes could turn your chatbot from a gimmick into an integral part of your product. So, with the mindset that knowledge is power, let’s take apart the chatbot engine and see how this piece of tech really works.

Natural Language Processing: Why Language Is Difficult

Imagine you’re on an ecommerce website and ask a chatbot to update your credit card information. The query “update my card info” triggers a set of distinct processes, which can be summarized as follows:

  1. Your query is converted from plain text into commands that the algorithm can understand.
  2. These commands are then processed in a decision engine.
  3. If more information is required, the chatbot will compose a reply, such as “please tell me your card number.”
  4. When it receives this number, it again turns your responses into commands for itself.
  5. Once enough information has been received, it performs your request.
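
The five steps above can be sketched as a small loop. This is only an illustration under stated assumptions, not how any particular chatbot product works: the intent name `update_card`, the slot name `card_number`, the reply wording and every function name are hypothetical, and a regular expression stands in for the language understanding a real system would need.

```python
import re

# Hypothetical intent, slot and prompt names, invented for this sketch.
REQUIRED_SLOTS = {"update_card": ["card_number"]}
PROMPTS = {"card_number": "Please tell me your card number."}


def parse(text, pending_slot=None):
    """Steps 1 and 4: turn raw text into a structured command."""
    command = {"intent": None, "slots": {}}
    if re.search(r"\bupdate\b.*\bcard\b", text.lower()):
        command["intent"] = "update_card"
    digits = re.sub(r"\D", "", text)
    # Only read digits as a card number if the bot has just asked for one.
    if pending_slot == "card_number" and 12 <= len(digits) <= 19:
        command["slots"]["card_number"] = digits
    return command


def decide(state):
    """Steps 2, 3 and 5: ask for a missing slot, or perform the request."""
    for slot in REQUIRED_SLOTS.get(state["intent"], []):
        if slot not in state["slots"]:
            state["pending"] = slot
            return PROMPTS[slot]
    state["pending"] = None
    last4 = state["slots"]["card_number"][-4:]
    return f"Card ending in {last4} has been saved."


def handle_turn(state, text):
    """Run one user message through parsing and the decision engine."""
    command = parse(text, state.get("pending"))
    if command["intent"]:
        state["intent"] = command["intent"]
    state["slots"].update(command["slots"])
    if state["intent"] is None:
        return "Sorry, I did not understand that."
    return decide(state)


state = {"intent": None, "slots": {}, "pending": None}
first_reply = handle_turn(state, "update my card info")
second_reply = handle_turn(state, "4111 1111 1111 1111")
print(first_reply)
print(second_reply)
```

The conversation state is kept in a plain dictionary so that the second message can be interpreted in light of the first, which is the part of the process that steps 3 and 4 describe.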

All of these efforts to change your raw text input into commands for the chatbot are covered by a field of artificial intelligence called natural language processing, or NLP. This field is mainly concerned with improving the ability of machines to understand what is said to them, recognize intent, determine the appropriate action, and respond in language that the user will understand.

Within NLP, natural language understanding refers to the specific set of processes that are concerned with understanding language and its intended meaning, as well as converting these into a structured form that machines can act upon. Natural language generation covers the reverse set of actions, where the machine has to create text that the user can understand.

Natural language processing is the driving force behind a chatbot’s ability to communicate. While these steps sound simple at a basic level, the intricacies of language actually make every one of them extremely complex to navigate. A full explanation of any single step in our process could fill several books. However, an easy way to grasp the scale of the challenge is to focus on just one language problem that chatbots face. Let’s consider the struggle to pinpoint the intent of a sentence.

A high-performance chatbot is excellent at identifying the action that a user has in mind from the text they write. After all, if the chatbot makes a mistake here it can cause serious issues for the customer and damage the reputation of its business. However, this is easier said than done. Going back to our credit card example, consider the following sentences:

  • “Make this my preferred credit card.”
  • “Do you charge a transaction fee for credit cards?”
  • “Is free to use credit card?”

All of these examples contain the phrase "credit card," but the intent is not the same for all three. Since the first sentence is a request, its intent is an action: "make." The second and third sentences share a different intent: to get more information about credit card payments. If the chatbot can’t recognize this, it might request new card information from both the first and second users. In the worst-case scenario, the presence of the phrases "credit card" and "fee" could lead the bot to process unwanted transactions.
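
To make that failure concrete, here is a rough sketch that compares a naive keyword matcher with a slightly smarter rule that also looks at the shape of the sentence. The intent labels (`update_card`, `set_preferred_card`, `ask_about_fees`) and both functions are invented for illustration, and the hand-written rules are nowhere near robust enough for production use.

```python
sentences = [
    "Make this my preferred credit card.",
    "Do you charge a transaction fee for credit cards?",
    "Is free to use credit card?",
]


def keyword_match(text):
    """Naive approach: any mention of 'credit card' means a card update."""
    return "update_card" if "credit card" in text.lower() else "unknown"


def rule_based_intent(text):
    """Slightly better: use the sentence shape as well as the keywords."""
    lowered = text.lower().strip()
    first_word = lowered.split()[0]
    # Questions are treated as requests for information.
    if lowered.endswith("?") or first_word in {"do", "does", "is", "are", "can"}:
        return "ask_about_fees"
    # Imperatives that start with an action verb are treated as requests.
    if first_word in {"make", "set", "use"}:
        return "set_preferred_card"
    return "unknown"


naive = [keyword_match(s) for s in sentences]
better = [rule_based_intent(s) for s in sentences]
print(naive)
print(better)
```

The keyword matcher puts all three sentences in the same bucket, while the second function separates the request from the two questions. It does so only because the rules were written with these exact sentences in mind, which is why hand-written rules tend to break down on real user input.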

This is a difficult problem to solve, since there isn’t one single way for a user to express an intent. Although the second and third sentences share the intent to find out more about credit cards, they express it in completely different ways. "Charge" and "free" have polar opposite meanings in isolation, but in these sentences they serve the same intent. The chatbot also has to compensate for the grammatical mistakes in the third example, since it can’t control what a user puts into the system. It even has to know exactly where the mistake is in the sentence, as wrongly changing "free" to "fee" could dramatically change the intent in some situations.

In short, it’s extremely difficult to build a functioning chatbot. Before it can effectively serve customers, it needs to understand not only semantics, but also grammar and the relationships between words — often in multiple languages. Luckily, new NLP solutions are constantly finding ways to mitigate some of this complexity.

How Chatbots Tackle the Language Barrier

Not every chatbot needs to be built from scratch. In fact, many companies that develop chatbots rely on outsourced solutions to solve the technical problems that language throws up. Several of these solutions use entity extraction to resolve a range of issues and give software developers a strong foundation to build on.

An entity is a part of the text that fits into a pre-defined category, such as companies, people or places. Extracting these entities involves detecting their existence and adding the correct label. For example, a good entity extractor could identify Microsoft as a company, Bill Gates as a person and Seattle as a place. This is a popular method of data annotation for services like chatbots, as it adds a lot of metadata to previously unstructured text. By identifying which pieces of text do what, the chatbot can develop more than just an understanding of simple words. It can begin to understand the systems behind them and the importance of the relationships between different elements of the text.
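
As a minimal sketch of the idea, the snippet below detects and labels entities by looking them up in a hand-written table, often called a gazetteer. The table, the label names and the example sentence are assumptions made for this illustration. Commercial entity extractors typically rely on trained models rather than a fixed list.

```python
# Toy gazetteer: a hand-written lookup table, assumed for this sketch.
GAZETTEER = {
    "Microsoft": "COMPANY",
    "Bill Gates": "PERSON",
    "Seattle": "PLACE",
}


def extract_entities(text):
    """Return (entity text, label, start offset) for every gazetteer hit."""
    found = []
    for name, label in GAZETTEER.items():
        start = text.find(name)
        while start != -1:
            found.append((name, label, start))
            start = text.find(name, start + len(name))
    # Sort by position so the output follows the order of the sentence.
    return sorted(found, key=lambda item: item[2])


text = "Bill Gates spoke about Microsoft at an event in Seattle."
entities = extract_entities(text)
for name, label, start in entities:
    print(f"{start:>3}  {label:<8} {name}")
```

Keeping the start offset alongside each label is what turns unstructured text into annotated data, because later steps can look at what sits before and after each entity.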

Within entity extraction, there are a multitude of processes that can be layered to create an understanding of language. Many solutions have several of these baked into their systems, enabling data scientists to manipulate their data in a variety of useful ways. An exhaustive list would be too long to reproduce here, but these examples should give a sense of the broad possibilities on offer:

Phrase chunking consists of grouping the words of a sentence into phrases and tagging them with their grammatical role. For example, all nouns would be tagged as nouns, all verbs as verbs, and so on. This can be done word by word, in which case it is usually called part-of-speech tagging, but is also common at the phrase level, where a group of words is labelled as a noun phrase or verb phrase. The resulting data looks like a syntactical handbook to the workings of a particular language, which is essential to understanding any sentence.

phrase chunking
PHOTO: http://brat.nlplab.org/img/examples/train.txt-doc-109-full.png
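
The following sketch shows both levels of tagging on one sentence. It assumes a tiny hand-written part-of-speech lexicon and a single, deliberately crude rule for noun phrases. The tag names and functions are made up for this example, and a real chunker would use a trained tagger and a proper grammar instead.

```python
# Hand-written part-of-speech lexicon, assumed for this sketch only.
POS = {
    "the": "DET", "my": "DET", "a": "DET",
    "preferred": "ADJ", "new": "ADJ",
    "chatbot": "NOUN", "credit": "NOUN", "card": "NOUN",
    "updated": "VERB", "update": "VERB",
}


def tag(sentence):
    """Word-level tagging: pair each word with its part of speech."""
    words = sentence.lower().rstrip(".?!").split()
    return [(word, POS.get(word, "X")) for word in words]


def chunk_noun_phrases(tagged):
    """Phrase-level tagging: group DET/ADJ/NOUN runs that contain a noun."""
    chunks, current = [], []
    for word, pos in tagged:
        if pos in {"DET", "ADJ", "NOUN"}:
            current.append((word, pos))
            continue
        # Any other tag ends the current run.
        if any(p == "NOUN" for _, p in current):
            chunks.append(" ".join(w for w, _ in current))
        current = []
    if any(p == "NOUN" for _, p in current):
        chunks.append(" ".join(w for w, _ in current))
    return chunks


tagged = tag("The chatbot updated my preferred credit card.")
noun_phrases = chunk_noun_phrases(tagged)
print(tagged)
print(noun_phrases)
```

Here the word-level tags are the input to the phrase-level step, which is one way the two kinds of annotation can be layered.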

Named entity recognition (NER) is a process in which words or phrases are labelled with semantic tags according to a comprehensive classification system. These can differ significantly in both content and complexity from project to project, sometimes stretching to several tiers of sub-categories. In no particular order, some common categories for NER include: names, numbers, places, currencies, dates and companies. This allows the solution to begin to recognize new entities based on their common characteristics, such as capitalization or placement in a sentence.

named entity recognition
PHOTO: https://towardsdatascience.com/named-entity-recognition-and-classification-with-scikit-learn-f05372f07ba2
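
Surface characteristics such as capitalization can be illustrated with a few regular expressions. This is a simplified sketch, not a real NER system: the three categories, the patterns and the example sentence are assumptions chosen for the demonstration, and the rule for names is crude enough that it simply ignores a capitalized word at the very start of the text.

```python
import re

# Illustrative patterns only; real category systems are far richer.
PATTERNS = [
    ("MONEY", re.compile(r"[$€£]\d+(?:\.\d{2})?")),
    ("DATE", re.compile(r"\b\d{4}-\d{2}-\d{2}\b")),
    ("NAME", re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)*")),
]


def recognize(text):
    """Return (entity text, label) pairs in the order they appear."""
    hits = []
    for label, pattern in PATTERNS:
        for match in pattern.finditer(text):
            if label == "NAME" and match.start() == 0:
                # A capital letter at the start of the text is not evidence.
                continue
            hits.append((match.start(), match.group(), label))
    return [(value, label) for _, value, label in sorted(hits)]


entities = recognize("Please refund $25.00 to Maria Lopez before 2024-03-01.")
print(entities)
```

A trained model learns cues like these from labelled examples instead of having them written by hand, which is what lets it recognize entities it has never seen before.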

Intent extraction is a technical solution to the problem we outlined earlier. Rather than have the machine guess what the intent is, intent extraction explicitly labels intents in the data on a phrase or sentence level. By doing this, it’s possible to build out a library of ways that people request certain things and begin to extrapolate to new sentences from it.
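
One simple way to extrapolate from such a library is to compare a new sentence with every labelled phrase and borrow the label of the closest match. The sketch below does this with word overlap (Jaccard similarity). The labelled phrases and intent names are invented for illustration, and word overlap is a stand-in for the trained classifiers that chatbot platforms are more likely to use.

```python
import re

# Hand-labelled training phrases; the intent names are hypothetical.
LABELLED = [
    ("make this my preferred credit card", "set_preferred_card"),
    ("use this card by default", "set_preferred_card"),
    ("do you charge a transaction fee for credit cards", "ask_about_fees"),
    ("is it free to use a credit card", "ask_about_fees"),
]


def tokens(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def predict_intent(text):
    """Return the label of the most similar labelled phrase."""
    query = tokens(text)

    def score(example):
        words = tokens(example[0])
        # Jaccard similarity: shared words divided by all distinct words.
        return len(query & words) / len(query | words)

    best_phrase, best_label = max(LABELLED, key=score)
    return best_label


print(predict_intent("Is free to use credit card?"))
print(predict_intent("Please make this card my preferred one"))
```

The ungrammatical sentence from earlier still lands on the right intent because most of its words overlap with a labelled phrase. The more variations the library contains, the better this kind of matching copes with unpredictable input.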

Many of these solutions can be relied upon to provide a consistent, high-quality foundation for a chatbot. However, you might not have to immediately turn to outside help. In some cases, you may be able to use your company FAQs and existing customer support data as the basis for training your chatbot. If this isn’t enough to create a fully-fledged, independent algorithm, you may need additional data labeling or curation to finish building your product. For example, you may need more variations of ways to communicate your core intents in order to guarantee that your chatbot can deal with unpredictable input. In these cases, there are plenty of dataset services that can provide you with the variation that you need to make your chatbot great.

Should You Build a Chatbot?

To a critical eye, it’s possible to view the above as an exhausting technical exercise that results in nothing more than a pop-up on your website. However, it would be a serious mistake to dismiss chatbots as nothing more than a publicity stunt. If you have a chatbot capable of serving your clients quickly and capably in multiple languages, your customer service could rapidly become the talk of the industry.

Of course, a chatbot isn’t right for every business, but the benefits are large enough that it’s worth considering. Identify the pain points your customers are experiencing and begin to brainstorm the kind of intents that summarize these, as well as the kind of classification system that might help a chatbot to quickly recognize them and take action. Done correctly, this exercise will do more than just remind you of your customers’ expectations. It might be the start of an automated solution to some of their biggest problems, drastically decreasing your customer support costs and improving your Net Promoter Score (NPS).