Microsoft recently completed its $8.5 billion acquisition of Skype. Many have speculated on topics such as continued support for iOS and Android devices, or the potential for Skype integration to boost lagging sales of Microsoft smartphones. I'm wondering something quite different: Will voice content finally grow up and be counted among the other types of documents produced by Microsoft Office and managed by content management systems (CMS)?
Speech-to-text systems have been around for years, but people in meetings still don't share voice "documents" as readily as they do emails, reports or spreadsheets. They say, "I'll forward you Bill's email" or "I'll email you Sarah's document." But they don't say "I'll copy you on Carol's voicemail" or "Here's a playback of what Derek said in the meeting last week." Why is that? Is vocal information less information-rich than other types of documents? Not at all. Is voice-based information less accurate than written information? No reason that it should be.
The Technology Gap
There is no question that a technology gap has contributed to the slow deployment of speech in computer-assisted communications. From the earliest days of computers, the keyboard transformed the process of handwriting into something that computers could understand. The vocal equivalent of a keyboard, with the same degree of accuracy, has been very slow in arriving. The first attempts were inaccurate, expensive and more trouble than they were worth for most people.
Spoken Communication Not as Valued in School
Another reason that voice content is not considered on a par with written content has to do with socialization. From an early age, schools teach students that written communication has more value than spoken communication. Teachers say "Stop talking" and "Be quiet and pay attention" many times throughout the day. Have you ever heard a teacher say, "Stop all that writing!"? When they refer to writing, teachers use a tone that conveys high expectations: "Please pass in your homework assignments from last night (always written) so that I can check them." Written documents are evaluated and graded. The vast majority of spoken communication in schools is not graded; if anything, it's punished.
Added to the generally unwelcome reception given to the spoken word were the dreaded occasions when you were forced to give a speech -- the flush of embarrassment, sweaty palms, a sense of dread, a total lack of recall. By the time we graduate from college and reach the work environment, the things we say and the things we write are on two separate, although somehow related, tracks. Not that schools don't turn out some gifted speakers. But that happens despite the schools' efforts, not because of them, with the exception of a few lucky students who had a great speech teacher.
Recent Developments Give Hope for Voice Content
A couple of recent developments give me reason to think that the spoken word may finally become a respected content type, worthy of being stored, disseminated and updated. One of these developments is the attention that heavy hitters like Google and Apple are paying to voice content. Google Voice and Voice Search are two Google products that have begun to bridge the gap between everyday spoken language and digital text. With the iPhone 4S, Apple has released Siri, a voice recognition feature that understands natural language, to a degree. Early reports indicate a warm reception and high satisfaction among users, particularly those who have experienced earlier attempts at voice recognition.
Another development that could raise the profile of voice content is Microsoft's purchase of Skype. Skype's statistics for voice and video traffic are staggering. Over 405 million people have a Skype account. Users log 700 million minutes every day talking on Skype-to-Skype connections, approximately 40% of which are video calls.
While many industry pundits are theorizing about the integration of Skype into Windows mobile devices, I think a more interesting development will be the integration of Skype into Office products. Microsoft Office is the de facto standard for producing and distributing office documents. It would make sense for Skype calls to get baked into the standard office communication workflow. Recording calls is already a feature of Skype. So the next logical step would be to treat voice communications as documents managed by a content management system.
Once Skype calls are stored and distributed as documents, it's only a short leap to a scenario where all voice communications receive the same elevated treatment -- even meetings and voicemail. So what's the best way to prepare for this newfound importance of the spoken word, when everything you say is on the record? Better take that old Speech 101 book and dust it off!