Ever try to search for an unfolding event on Twitter as it happens? Sometimes the results are not so great. When news breaks, Twitter doesn't always have the context it needs to fulfill those requests. Until now.
Twitter has come up with a novel way to make sure those trending searches serve up the proper results. It's not an algorithm, but simply real people looking into what exactly something like #bindersfullofwomen, for example, really means.
Human Interpreters Hash out New Hashtags
When a new phrase or hashtag pops up and starts seeing a spike in Twitter activity, the new system kicks in. Twitter engineers then hand the term to a few select outsourced researchers, who work out what it means and send their answers back to Twitter. Welcome to crowdsourced search context, from Twitter!
The Twitter team can then add this information into its machine learning models so the next time someone searches for Big Bird, for example, political results from the presidential election show up and not Sesame Street.
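To make that flow concrete, here is a minimal sketch in Python of the two steps described above: noticing that a term is spiking, then combining several human judgments into one label. It is an illustration only, not Twitter's actual code. The function names, the thresholds and the simple majority vote are all assumptions made for this example.

```python
from collections import Counter


def is_spiking(counts, factor=3.0, min_count=50):
    """Hypothetical spike check: is the latest window far above the earlier average?

    counts is a list of query counts per time window, oldest first.
    factor and min_count are illustrative thresholds, not Twitter's values.
    """
    *history, latest = counts
    if not history:
        return False
    baseline = sum(history) / len(history)
    return latest >= min_count and latest > factor * max(baseline, 1.0)


def majority_label(judgments, min_agreement=0.6):
    """Hypothetical aggregation: keep a label only if enough evaluators agree."""
    if not judgments:
        return None
    label, votes = Counter(judgments).most_common(1)[0]
    return label if votes / len(judgments) >= min_agreement else None


# Example: a term whose query volume jumps, then gets labeled by four evaluators.
if is_spiking([10, 12, 11, 400]):
    category = majority_label(["politics", "politics", "tv", "politics"])
```

In this sketch a term is only sent out for labeling once it spikes, and a label is only kept when most evaluators agree on it. A term with split votes returns no label rather than a guess.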
It turns out Twitter uses human evaluators for several different types of tasks. In a detailed Twitter Engineering blog post about how this kind of evaluation works, authors Edwin Chen and Alpa Jain mentioned the team uses human computation to periodically test and calibrate its search quality.
Events like President Obama's reelection cause huge spikes in Twitter traffic related to one person, but sometimes Twitter can't tell right away exactly who that person is.
The Crowd is the Source
Twitter is very popular, but not so popular it can simply let its users give the search system more context. Instead, Twitter uses a pool of researchers who are actually contractors hired from an Amazon service called Mechanical Turk. These workers do a variety of Human Intelligence Tasks, as Amazon calls them, and anyone can hire them for thousands of different types of data work.
For Twitter, a custom pool of selected workers is chosen for the search results work. That way, anywhere from a few to many dozens can be deployed at once to help figure out what a particular name or hashtag in a search might be referring to. It's a way for Twitter to scale up a project quickly.
Additionally, because these are real-time searches, a globally distributed team helps reduce latency and speed up the improved searches. Not only do the searches improve; advertisements can also be better placed when a new term or hashtag appears and its nature can be deciphered quickly.