Sunday, June 3, 2012

Nuance fights to maintain edge in speech recognition

Now that speech recognition has taken center stage, this technology enjoys massive R&D investments...

In the latest batch of television ads for Apple’s iPhone 4S, one high-profile Massachusetts resident takes center stage. But a second, perhaps more crucial, local player remains incognito.

Movie star John Malkovich, who lives in Cambridge, lounges in a leather chair and chats with Siri, the virtual personal assistant built into Apple’s latest mobile phone. “Life . . .,” Malkovich says languidly. “Try and be nice to people,” Siri replies. “Avoid eating fat.” The software that translates Malkovich’s words into text so that Siri can serve up an answer is made by Nuance Communications, a Burlington company that goes uncredited in the ads, on Apple’s website, and in all of Apple’s public statements.

The Apple campaign, unavoidable on the airwaves these days, is the highest-profile moment that the rather nerdy field of speech recognition has ever had. Yet the company that has grown into the dominant player over the past decade — mainly through a series of smart acquisitions — doesn’t get a chance to bask in the glory.

“We provide technology for certain Apple products,” says Peter Mahoney, Nuance’s chief marketing officer. “That’s all we can say. If you’ve worked with Apple before, you know they can be fairly restrictive when it comes to protecting their brand.” (Apple didn’t return my phone calls.)

Nuance is in the same position as many other successful Massachusetts tech companies: Most of its revenue — $1.4 billion in 2011 — comes from selling its products to other businesses, and yet it yearns to have a more direct relationship with consumers. When I spoke with Mahoney last week, the challenge facing Nuance was much the same as it was when we talked in 2008: “creating a recognizable, valuable consumer brand,” in his words.

You can’t fault Nuance chief executive Paul Ricci for his vision: In 2001, when Nuance was known as ScanSoft, he paid $40 million in a bankruptcy auction to acquire the assets of Lernout & Hauspie, a Belgian company that had bought up many pioneers of speech recognition. A parade of other purchases followed, including many local companies such as SpeechWorks International, Voice Signal Technologies, and, last December, Vlingo.

Recently, Nuance has demonstrated new technology called Dragon Drive that can be built into cars, allowing drivers to play certain songs or initiate calls by speaking. Samsung’s latest high-end “smart TVs” include voice recognition software from Nuance that allows viewers to turn them on, change channels, and even initiate Skype videoconferences by speaking. And the free Dragon Go app, available for iPhones and Android phones, gives you 80 percent of Siri’s features, without having to buy Apple’s latest phone.

Ask it to play Count Basie, find the closest Panera, or tell you when “The Avengers” movie is playing and it obliges. (Unlike Siri, it can only search the Web — not the calendar or contact information that’s stored on your phone.) In other words, Nuance is staking out a spot in your pocket, your vehicle, and your living room.

In speech recognition, “they skated to where the puck was going,” says Daniel Ives, a research analyst who follows the company for the investment bank FBR & Co. “Both organically and through acquisitions, they’ve built a massive treasure chest of technology and customers. They’re miles and miles ahead of any other competitor.”

While Mahoney, Nuance’s chief marketing officer, acknowledges that the bulk of the company’s revenue will continue to come from working with manufacturers of TVs, mobile devices, and cars, the company wants to keep exploring products that it can offer directly to consumers, like the Dragon Go app. While it’s free and doesn’t carry any advertising, Mahoney says, “If you can make these direct connections to restaurant reservations or buying movie tickets, potentially you can take a referral fee from those transactions.”

A next step for the company, Mahoney says, is software that doesn’t just respond to spoken instructions, but understands emotion and context. If you sound stressed because you’re late for a meeting and stuck in traffic, for instance, it might suggest an alternate route or offer to send a text message to let colleagues know you’re running late.

But as Nuance buys up smaller speech companies and tries to develop more products to deliver directly to consumers, it finds itself competing with two enormous companies that already have relationships with millions of us: Google and Microsoft. Both have been making major investments in speech recognition.

Google has been focused on improving the speech capabilities that are built into its Android operating system, which it gives away free to makers of cellphones and tablets. And, says industry analyst Walt Tetschner of Acton, Microsoft has been licensing its speech recognition technology to new players like nVoq, which is creating a less expensive medical dictation service that competes with Nuance’s. (Amazon.com also quietly acquired a small speech recognition company last year, and could be adding voice capabilities to its Kindle line of tablets and e-book readers.)

A decade of acquisitions has turned Nuance into one of those “pillar” companies that Massachusetts so desperately needs. It’s publicly traded, employs nearly 1,000 people in the state, and helps set the agenda for its industry. (Who cares if Ricci, the CEO, spends two-thirds of his time in Nuance’s smallish Silicon Valley office, according to a company spokesperson, and just one-third at the Massachusetts headquarters?)

But one of the things Nuance has sought in buying up so many of its rivals is pricing power — the ability to command a high price for its products and services when it negotiates with customers. Now, the company is facing off against two heavyweights with innovation in their DNA, and the ability to undercut Nuance on price. That means Nuance will have to devise new tactics to stay on the leading edge of the speech industry.

On that matter, Siri may not be much help.


The Boston Globe by Scott Kirsner
via http://www.speechtechnologygroup.com/speech-blog
