At TechFest 2012, however, there was an added step that Rashid didn't show off at his now-famous lecture in China. Researchers built a virtual talking-head simulation of Craig Mundie, head of research and strategy (and Rashid's boss), and used his voice to show off the software.
Craig Mundie lip-syncing Chinese, all simulated with a computer.
In this image, Mundie's voice and likeness are computer generated, but both are based on real data from the man himself. None of this would be possible without deep learning. The system is still far from perfect, however, a problem that voice recognition and interactive voice response (IVR) technology have had since the 1970s.
Speech Analytics in 2013
While Microsoft and Nuance, the company whose technology is behind Apple's Siri digital assistant, are improving their speech recognition and machine learning at a faster pace, the enterprise still has to rely on a bevy of less sci-fi tools.
Aberdeen Group has a new speech analytics buyer's guide out in November, and it focuses on practical goals such as improving financial returns at call centers. The report found that speech-to-text in particular is "especially helpful for contact centers serving a wide range of demographics as it helps them update their vocabulary with words and phrases used more widely by certain demographic groups."
We spoke to Aberdeen researcher Omer Minkara, who wrote the report, and he said revolutionizing customer interactions through voice recognition could take another five to 10 years.
"Despite the clear benefits of helping businesses personalize customer interactions, both consumers and businesses find many gaps and inaccuracies in real-life use cases of these tools," Minkara said in an email.
"This use case information needs to be compiled, analyzed and built into refinement of next generation speech recognition tools in order to improve the accuracy and timeliness of these tools within future customer interaction scenarios."