These days, even well-intentioned technologists and vendors contribute to hype that's both misleading and counterproductive. PHOTO: Verena Yunita Yapi

(First article in a two-part series.)

We live in exciting times. A new wave of systems is emerging that applies machine learning, artificial intelligence (AI), natural language processing (NLP) and more to deliver more effective and adaptive search, intranets and digital workplaces.

Everywhere you look across our industry you see claims about cognitive computing, AI-based search, breakthrough machine learning and semantic technology.

Venture investors tell me every business plan they see has "machine learning for big data analytics" in it. Cool Vendor demos abound, along with visions of automated assistants having dialogs with customers and employees and providing perfect answers on demand.

Much of This Is Hype

This is a space where it's easy to make a cool demo and very hard to build an effective system, where most of the technology elements have been around for years without hitting the mark, and where the technology is so abstract and opaque that it's incredibly hard to tell what works and what doesn't.

In Gartner's report, "Hype Cycle for Emerging Technologies, 2016" (fee charged), machine learning and cognitive systems are listed at the Peak of Inflated Expectations. We've seen this movie before: heavy hype grabs attention, but then leads to dismissal as over-promised science-fiction make-believe.

But in truth, genuine progress is happening, and this area of technology is delivering real value. Useful new tools and products are available — however overhyped they may be.

Adopting these will change the ways we work in many small ways (and some big ones). Dismissing the hype entirely is throwing out the baby with the bathwater.

How to Identify Hype: 7 Tests

In this climate, even well-intentioned technologists and vendors can contribute to hype that is both misleading and counterproductive.

Plus, marketers have a strong incentive to overhype technology whose capabilities can't be described in black-and-white terms, and to deliberately coin new terms, or use existing ones in different ways, to get attention. AI and cognitive computing have become catchall phrases for many different technologies, to the point that the terms are almost meaningless.

Here are a few basic tests you can use to tell how much hype you are dealing with:

1. Making analogies to the human brain — machine learning does not work the same way the human brain works, nor does any current form of artificial intelligence. If someone says they are “building a brain” or “teaching the system to think” without quickly indicating it’s a rough analogy, beware.

2. Neglecting to mention error rates — human language is messy, and all software that tries to deal with it is imperfect. OCR and speech recognition systems regularly describe their error rates quantitatively or qualitatively, and search is a much harder problem. There may not be an easy way to measure the error rate for this kind of system — but if someone pretends it is zero, run away.

3. Portraying domain independence — although the underlying technology is domain-independent, machine learning and computational linguistics both work better within a limited domain, because there are more similar examples to work with and learn from. As a result, all systems of this type have some domain specificity. If there isn't a process to learn or adapt to your specific domain and language, something is probably missing.

4. Showing results on perfect data — if you have clean data, nicely organized, in a limited domain, it’s likely you can get great-looking results and gorgeous demos. In the real world, all data is dirty — and the old adage “garbage in, garbage out” still applies. Ask what happens if your data is dirty — or for a Proof of Concept on your own data.

5. Skipping the fundamentals — no matter how sophisticated the technology, effective search still requires the fundamentals — secure access to the important content, attention to content quality and good metadata at the tag or entity level. This all takes work. Ask how these things are done and listen for a pragmatic answer.

6. Claiming results without administrative work — one of the main reasons traditional search deployments suck is that people don't tend them. The new wave of "self-tuning" systems sounds like it eliminates any need for administration, but that's simply not true. Machine learning is only as good as the attributes (called "features") you tell it to use and the examples it learns from. People are still needed to supervise the process, at one level or another. If it sounds like a magic system that needs no care and feeding at all, it probably doesn't work.

7. Buzzword Bingo — perhaps this goes without saying, but the veracity of a statement seems inversely proportional to the number of buzzwords in it. Here’s a current buzzword list: AI, cognitive, intelligent, machine learning, semantic, conversational, self-tuning, self-learning, analytics, concept-based, big data. If you see or hear 3 or more of these in one sentence, you should yell “Bingo!” and ask for an explanation in plain English.
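To make test #2 concrete: in search evaluation, error is commonly quantified with precision and recall over judged results. Here's a minimal sketch — the document IDs and relevance judgments are invented for illustration, not taken from any real system:

```python
# Illustrative sketch for test #2: quantifying search error with
# precision and recall. The judgments below are made up for the example.

def precision_recall(retrieved, relevant):
    """Precision: fraction of retrieved docs that are relevant.
    Recall: fraction of relevant docs that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical query: the engine returned 4 docs; judges marked 5 relevant.
p, r = precision_recall(["d1", "d2", "d3", "d4"],
                        ["d1", "d3", "d5", "d6", "d7"])
print(f"precision={p:.2f} recall={r:.2f}")  # → precision=0.50 recall=0.40
```

A vendor who can't produce even this kind of rough measurement on your data is asking you to assume the error rate is zero.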
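Test #7 is simple enough to automate. This toy checker uses the article's buzzword list and its rule of thumb (three or more buzzwords in one sentence); the sample pitch is invented:

```python
# Toy "buzzword bingo" checker for test #7. The word list and the
# threshold of 3 come from the article; this is a joke-grade heuristic,
# not a serious text classifier.
import re

BUZZWORDS = {
    "ai", "cognitive", "intelligent", "machine learning", "semantic",
    "conversational", "self-tuning", "self-learning", "analytics",
    "concept-based", "big data",
}

def buzzword_count(sentence):
    """Count distinct buzzwords, matching on whole words only."""
    text = sentence.lower()
    return sum(1 for w in BUZZWORDS
               if re.search(r"\b" + re.escape(w) + r"\b", text))

def bingo(sentence):
    """The article's rule of thumb: 3+ buzzwords in one sentence."""
    return buzzword_count(sentence) >= 3

pitch = ("Our cognitive AI platform uses self-learning semantic "
         "analytics on big data.")
print(buzzword_count(pitch), bingo(pitch))  # → 6 True
```

Run it on the next vendor deck you receive and see how often you get to yell "Bingo!"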

Now that you can cut through the worst of the hype, you can start finding the real value in this new wave of intelligent search systems and consider how and when to apply them in your organization.

There is amazing potential in this technology and real value behind the hype. By asking "why now?" you can gain some useful insight.

(In the second part of this series, I'll provide some perspective on timing and tips for finding the right applications at the right time. Read Intelligent, Cognitive, AI-Based Search: Reasons for Optimism.)