Sunday, June 3, 2012

Siri voice recognition and the lean back experience

The interesting change in voice recognition is that we no longer have to concentrate on exactly which words to say; we can literally have a conversation with a machine.

Mobile phones have come a long way from the brick-heavy incarnations of yesteryear, with their emergency-phone frame and oversized antenna. Back then, phones were considered a necessary supplement in the rapidly moving world of business – a use that rendered them a somewhat gaudy symbol of opulence.

Now mobile phones are so ubiquitous as to be one of the main pieces of modern technology that transcends social boundaries – and in their most advanced iterations, they showcase an elegant marriage of sound ergonomic design and superior technological competence. Such is their capacity that an internet-ready phone replete with a megapixel camera and MP3 player no longer raises even the most impressionable eyebrow.

Understandably, one of the elements of sophisticated design that has been slower to catch up is voice recognition. I can personally remember some time spent testing voice recognition capability on Microsoft Word in the late 90s. Such software was considered more of a playful distraction than a serious practical tool, given how often it failed to recognise phrases and how regularly it misunderstood what was said.

The current version of the iPhone, however, has its own built-in voice recognition (Siri), which uses what is referred to as a natural language user interface – essentially a programme that bypasses the traditional route of speech recognition (identifying keywords) and instead focuses on a form of language processing that seeks to understand the nature of any question. In essence, this means transforming the verbs, phrases and clauses within a sentence into triggers for creating and searching for specific data.
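To make the contrast concrete, here is a toy sketch in Python. It is not how Siri works internally, which this post does not describe; the keyword table, the patterns, and the names `keyword_spot` and `parse_intent` are all invented for illustration. The first function only reacts to fixed keywords, while the second treats the verb phrase and its clauses as a trigger (an "intent") plus the specific data to act on.

```python
import re

# Hypothetical keyword table for the "traditional" approach:
# one fixed word maps straight to one command.
KEYWORD_COMMANDS = {
    "weather": "show_weather",
    "call": "start_call",
    "alarm": "set_alarm",
}


def keyword_spot(utterance):
    """Return the command for the first known keyword, or None."""
    for word in re.findall(r"[a-z']+", utterance.lower()):
        if word in KEYWORD_COMMANDS:
            return KEYWORD_COMMANDS[word]
    return None


# Hypothetical patterns for the toy "natural language" approach:
# a verb phrase selects the intent, named groups capture the data.
INTENT_PATTERNS = [
    ("set_alarm", re.compile(
        r"\bwake me(?: up)? at (?P<time>\d{1,2}(?::\d{2})?\s*(?:am|pm)?)")),
    ("show_weather", re.compile(
        r"\b(?:do i need|should i (?:bring|take)) (?:an? )?"
        r"(?:umbrella|coat)(?: (?P<day>today|tomorrow))?")),
    ("start_call", re.compile(
        r"\b(?:call|ring|phone) (?P<contact>[a-z]+)")),
]


def parse_intent(utterance):
    """Return {'intent': ..., 'slots': {...}} for the first matching pattern, or None."""
    text = utterance.lower()
    for intent, pattern in INTENT_PATTERNS:
        match = pattern.search(text)
        if match:
            # Keep only the named groups that actually captured something.
            slots = {k: v.strip() for k, v in match.groupdict().items() if v}
            return {"intent": intent, "slots": slots}
    return None
```

Asked "Do I need an umbrella tomorrow?", the keyword spotter finds nothing it knows, because the word "weather" never appears, whereas the pattern-based parser maps the question to a weather lookup for tomorrow. A real natural language interface would rely on far richer language models than a handful of regular expressions; the sketch only shows the shift from matching words to interpreting the question.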

There are two really exciting features of this technology within the iPhone format. The first is that it is not simply a distracting plaything. It works. And it does so 70-80% of the time, which shows the remarkable refinement of voice recognition over the last 10-15 years. The second is that Siri gets better over time, by recognising a user’s voice and learning their preferences and tastes.

The online blog TechCrunch gushingly calls Siri a signal that it is “game over” for the normal way of searching and locating information. It may well be over the long term, but there is undoubtedly some way to go before internet browsing and searching becomes a voice-driven experience. However, as John Battelle is quoted as saying in that same TechCrunch post, “the future of search isn’t search, it’s a conversation with someone we trust.”

The future lean back reading experience, framed as a conversation, opens up enormous possibilities for users to broaden and expand their knowledge. A future “conversation” may not involve us scouring the web or an app for articles of interest – it may involve a Siri-like companion doing the legwork for us, bringing up a whole host of possible reading (perhaps even from sources unknown to the user), based on what we have read about or commented on in the past. The conversation could even stretch to asking a voice-recognition companion to elaborate on a point, fact or figure of interest within an article: guiding us towards explanations and further reading, letting us compare facts and figures, or clarifying definitions.

That this is the beginning of voice recognition is not a matter of dispute. The question is, just how far will it be able to go? 

via http://www.speechtechnologygroup.com/speech-blog
