Does Siri soar on Dragon’s wings?
It’s right out of “Star Trek.’’
Siri, the smart virtual assistant that is the most sensational feature of the newest smartphone from Apple Inc., the iPhone 4S, is getting all the buzz. Talk to it out loud; ask questions in your own words. You’ll get answers (often correct), directions, and suggestions.
And behind some of those capabilities is Nuance Communications Inc., the Burlington maker of voice-recognition software.
Nuance confirmed that its software powers Siri’s uncanny ability to recognize human speech, but would not comment further on the specific technology it provided to Apple. Apple would not comment on that at all.
Some analysts, however, said Nuance’s Dragon Dictation app, the company’s voice-recognition program for smartphones, appears to be embedded in Siri.
“It’s quite likely that Apple is using Nuance technology under the covers,’’ said Frank Gillett, a technology analyst at Forrester Research in Cambridge. He said Apple is known to buy technologies, then make them its own through creative applications.
Smartphones have had the ability to take dictation for years. “What’s different about Siri is that it tries to go from voice to text to meaning to action,’’ Gillett said.
Nuance and its competitor, Vlingo, said the number of downloads of their speech-recognition apps jumped after Apple unveiled Siri on Oct. 4. Siri itself may be enough of an advance to change the behavior of users, making them more comfortable talking to - not just into - their devices, just like Captain Kirk.
When an iPhone 4S was told “Beam me up, Scotty,’’ Siri responded, “Energize.’’
Siri “has really been an exclamation point on speech capabilities,’’ said Mike Thompson, senior vice president and general manager of Nuance’s mobile technology division. “The awareness of speech [technology] is hitting an all-time high.’’
Still, there will have to be a psychological shift in the way people view their phones before speech recognition takes off. Do users really want to ask their phones aloud for restaurant recommendations, instead of searching for them on Google?
Some users will think that’s cool; others will think they look like geeks, said Mike Phillips, chief technology officer of Vlingo, a Cambridge producer of speech-recognition software. He is hoping Siri will help speed up the public’s embrace of speech-recognition tech tools like his company’s app for Android phones, iPhones, BlackBerrys, and other smartphones.
“Since 2008, we have been trying to convince the rest of the market that they should do this,’’ Phillips said. “There really is a long history of work being done on this. It’s not just simple programming, and there is really deep analysis going into it.’’
Nuance’s Thompson said his company helped to develop Siri, which evolved over the past decade from cognitive software research conducted by SRI International, of Menlo Park, Calif.
Siri was spun out as an independent company in 2007. A Siri mobile application was first offered on the online Apple App Store in February 2010. Two months later, Apple bought Siri for an undisclosed amount, reported to be around $200 million.
Nuance software has been adopted by automotive companies, health care businesses, and television set manufacturers, as well as smartphone builders, Thompson said. “Voice is quickly becoming a primary interface, and you’re going to see much more deeper integration between voice and touch and a variety of applications,’’ he said.
Nuance’s Dragon Go app, released this year, promises not only to help users find the nearest sushi bar but also, potentially, to tell them which of their friends like sushi.
Recent leaps in speech-recognition technology make it possible for Siri to do what it does - combine speech recognition, rapid processing, and Internet search algorithms to respond to spoken commands and produce answers in natural language. The goal is to determine the speakers’ meaning and intent, which Thompson and Phillips said are among the biggest challenges.
The science is still inexact. Apps from Nuance and Vlingo, and from Siri itself, make mistakes when processing some commands. But the technology continues to evolve, with improved ability to navigate the ambiguities of human conversation.
Forrester’s Gillett expects to see more interest in speech-recognition tools from both consumers and developers.
“We are in the fourth inning of the speech evolution,’’ Thompson said. “We have a very, very exciting next five innings in front of us.’’
B. Farrell can be reached at firstname.lastname@example.org.