Voice recognition technology is critically important, not just for mobile phones but potentially for controlling many other devices, particularly televisions. It is still early days, but if you’re thinking about which side will win the battle between Apple’s Siri and Google Voice Search, consider the lesson of spell check.
When Eric Schmidt was still chief executive of Google, I asked him what the company owned that would make it particularly hard for any emerging search contender to wipe Google out. Spell check, he said. Google had observed the spelling mistakes and corrections typed into billions of queries, and had a vast understanding of what people really meant when they typed like thsi. Google was able to use this knowledge to offer a “did you mean” function in search, eventually completing queries before people were finished typing.
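To make the idea concrete, here is a toy sketch of how a “did you mean” suggestion could be learned from logged corrections. It is an illustration only, not Google’s actual method: the log format (pairs of what someone typed and what they retyped), the function names `build_corrections` and `did_you_mean`, and the `min_count` threshold are all assumptions made up for this example.

```python
from collections import Counter, defaultdict


def build_corrections(rewrite_log):
    """Count how often each typed query was followed by a different retyped query."""
    counts = defaultdict(Counter)
    for typed, retyped in rewrite_log:
        if typed != retyped:
            counts[typed][retyped] += 1
    return counts


def did_you_mean(query, counts, min_count=2):
    """Return the most common retyping of `query`, or None if evidence is too thin."""
    if query not in counts:
        return None
    suggestion, seen = counts[query].most_common(1)[0]
    return suggestion if seen >= min_count else None


# Hypothetical log: (what was typed, what the same person typed next).
log = [
    ("thsi", "this"),
    ("thsi", "this"),
    ("thsi", "thai"),
    ("recieve", "receive"),
]
counts = build_corrections(log)
print(did_you_mean("thsi", counts))     # this
print(did_you_mean("recieve", counts))  # None (seen only once, below min_count)
```

The threshold is the point of the sketch: one person’s correction proves little, but the same correction repeated across many queries is strong evidence of what people meant, which is why the volume of queries matters so much.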
Other companies would not be able to get that learning, he said, since people had come to expect search engines to fix their spelling. The customers would stay with Google, where that problem was solved. Microsoft’s Bing has proved that Mr. Schmidt was not entirely correct about Google owning spell check, but it does take a company of Microsoft’s size to take on the problem.
It is common around the world to use Google to check one’s spelling now, and it’s common inside Google to use that same ancillary learning on new products.
That is probably why Google Voice Search, though a newer product than Siri, appears to be winning the heart of my colleague Nick Bilton. Nick says Google Voice Search seems to have a better understanding of what he’s talking about, and can answer questions better.
If Google is better, it is most likely because of a product Google introduced in 2007, called Google-411, or Google Local Voice Search. Ostensibly, the product provided free directory assistance, but Google was mostly interested in capturing the way different people pronounced words.
While Voice Search is new, Google’s linguists have five years of data on billions of pronunciations. A year ago, just for the English language, Google had a database of 230 billion word strings, and had worked on 23 other languages, based largely on 411 and a related voice-based search product. It’s another spell check.
Apple never worked on that kind of feature, which is one reason Siri is one of the few products Apple officially released in beta form. It is building up its database of speech during Siri’s early life. Some of the cute ways Siri talks when it does not understand a question, such as repeating back what you have said, may in fact be efforts to see if you will correct its understanding, somewhat in the way Google learned spell check. Google Voice Search on Android is starting late, but all that learning beforehand is what makes it better in the early days.
That is not the only area where Google develops one product for the sake of another. The Google Goggles application on Android, which uses computer-driven image recognition to help identify an object the customer photographs, is also a product for use in connection with Google Maps. You can take a picture of a street in Goggles, and if Google Maps has taken a picture of that place with its Street View cars, it can tell you where you are.