Tuesday, August 14, 2012

Cloud Based Solutions Grow Mobile Speech Recognition

Cloud-based speech recognition technology allows the heavy lifting to be done off the device, making it possible to deploy extremely powerful technology on mobile devices.…

Cloud-based solutions are providing impetus for mobile speech recognition platform sales, with revenue growth forecast at 68% through 2017, according to a report from ABI Research.

“Reaching a varied group of developers working on different OS and hardware platforms makes cloud-based solutions the optimum approach to enabling the masses,” said Michael Morgan, senior analyst for mobile devices, content and applications at market research company ABI Research, which authored the report. “It is the approach of using network-based solutions that will drive the rapid increase in cloud-based revenues.”

Historically, mobile speech recognition was delivered to consumers through relationships between device OEMs and platform vendors. The other route to the consumer came through virtual assistant applications that were often developed by the platform vendors. Smaller application development efforts lacked the resources and expertise to bring the benefit of speech recognition to their products. This dynamic has kept speech recognition trapped in functionally specific applications.

“Leveraging the cloud as a delivery mechanism, platform vendors can enable nearly any application developer that wishes to make its user interface experience more efficient,” adds Jeff Orr, senior practice director for mobile devices, content and applications. “ABI Research expects that consumers will first see the benefits of these efforts in mobile banking and retail applications.”
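The delivery model the report describes — the device records, the cloud recognizes — can be sketched in a few lines. This is an illustrative sketch only; the endpoint URL, query parameter, and response shape are invented, since the report names no specific service.

```python
import urllib.request

# Hypothetical endpoint -- illustrative only, not a real service.
ASR_ENDPOINT = "https://asr.example.com/v1/recognize"

def build_recognize_request(audio_bytes: bytes, language: str = "en-US"):
    """Package raw audio into an HTTP request for a cloud recognizer.

    The device does no heavy decoding itself; it only records audio and
    ships it to the server, which returns a transcript.
    """
    return urllib.request.Request(
        ASR_ENDPOINT + "?lang=" + language,
        data=audio_bytes,
        headers={"Content-Type": "audio/wav"},
        method="POST",
    )

# A caller would then do urllib.request.urlopen(request) and parse the
# JSON body for the transcript.
```

The point of the pattern is that any application developer, on any OS, only needs an HTTP client — which is exactly why cloud delivery reaches developers that OEM-level integrations never did.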


TeleNav Inc. : Telenav Adds Offline GPS Navigation to Scout for iPhone

Speech recognition on the phone for navigation sounds like a good idea when you don't have a signal and your cloud-based speech recognizer is not available.…
08/14/2012 | 09:48am US/Eastern

SUNNYVALE, CA — (Marketwire) — 08/14/12 — Telenav®, Inc. (NASDAQ: TNAV), the leader in personalized navigation, today published an update to Scout™ in the Apple® App Store that adds Always There Navigation, downloadable offline navigation that provides GPS navigation service at all times, regardless of whether drivers are in wireless coverage. Scout also now comes with free speech recognition, allowing users to conduct hands-free local business and address searches with voice commands.

“When you are driving in an unfamiliar area, the last thing you want to worry about is losing your GPS navigation if you drive out of wireless coverage and take a wrong turn. With our latest version, we are ensuring that this will no longer be a concern for our users,” said Ryan Drake, Scout product manager at Telenav. “Scout will guide you anywhere, anytime — with or without wireless coverage.”

Offline Navigation

Scout’s Always There Navigation is now available for download by U.S. region — Western, Central or Eastern. With a simple download, Scout’s voice-guided, turn-by-turn navigation service is now accessible with or without wireless coverage. Customers always have the option of downloading different regions at any time to ensure they will have GPS navigation if they are traveling.

“We originally launched offline navigation last year as part of the AT&T Navigator iPhone app,” continued Drake. “It’s been a popular feature that we are now very happy to offer to our Scout iPhone customers.”

Speech Recognition

With the addition of speech recognition, Scout makes it safer, faster and simpler to search for nearby destinations. With the touch of one button, a user can activate voice commands and hands-free search and begin speaking to Scout immediately with any command, such as “Find Starbucks” or “Drive Home.” Based on each voice query, Scout will either find nearby results or create the best route summaries based on traffic and current location.
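Telenav has not published Scout's internals, so the following is a purely hypothetical sketch of how spoken commands like "Find Starbucks" or "Drive Home" might be mapped to navigation intents once the recognizer has produced text.

```python
# Hypothetical command parser -- the intent names and the verb list are
# invented for illustration, not taken from Scout.

def parse_command(utterance: str):
    """Map a recognized voice command to an (intent, argument) pair."""
    words = utterance.strip().split(None, 1)
    if not words:
        return ("unknown", "")
    verb = words[0].lower()
    rest = words[1] if len(words) > 1 else ""
    if verb == "find":
        return ("search_nearby", rest)   # e.g. "Find Starbucks"
    if verb == "drive":
        return ("navigate_to", rest)     # e.g. "Drive Home"
    return ("unknown", utterance)
```

A "search_nearby" intent would feed the local business search, while "navigate_to" would trigger route building from the current location, matching the two behaviors the release describes.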

Download Scout Today

The updated Scout for iPhone is available today in the Apple® App Store. Offline navigation, the lowest-priced such option currently available for the iPhone, costs $9.99 per year or $2.99 per month and includes additional premium features. Telenav offers a free 30-day trial for new customers. Speech recognition is available to all users as part of the free version of Scout for iPhone.

A video of new features is available online.

Screenshots are available on the Telenav Flickr page.

About Telenav, Inc.

Telenav’s mission is to help make people’s lives easier, less stressful, more fun, and more productive while they are on the go. Our personalized navigation services help people make faster and smarter daily decisions about where to go, when to leave, how to get there, and what to do when they arrive. We have approximately 34 million users worldwide — connecting with us from mobile phones, tablets, computers, cars and developer applications. Our customers have scouted more than 1.6 billion personal journeys since 2007.

We aim to be everywhere people need us. Our partners are wireless carriers, automobile manufacturers and original equipment manufacturers (OEMs), app developers, advertisers and agencies, as well as enterprises large and small. These include QNX Software Systems, Rogers, Sony, Sprint Nextel, Telcel, T-Mobile UK, T-Mobile US, U.S. Cellular, Verizon Wireless and Vivo Brazil. You can also find us in mobile app stores and on the web at www.telenav.com and www.scout.me.

Follow Telenav on Twitter at www.twitter.com/telenav or on Facebook at www.facebook.com/telenav.

Copyright 2012 Telenav, Inc. All Rights Reserved.

TNAV-C

Image Available: http://www2.marketwire.com/mw/frame_mw?attachid=2064632
Image Available: http://www2.marketwire.com/mw/frame_mw?attachid=2064635

Media Contact: Aamir Syed Telenav, Inc. 408.207.4081 aamirs@telenav.com Investor Relations: Cynthia Hiponia The Blueshirt Group for Telenav, Inc. 415.217.4966 IR@telenav.com

Source: Telenav

News Provided by Acquire Media


Monday, August 13, 2012

Apple Siri vs Micromax Aisha Video

an Indian personal assistant compared to Apple's Siri…

Of all the companies that have decided to take on Apple's voice-control personal assistant, Siri, Aisha from Micromax is one that really caught our eye. We asked both Siri and Aisha a bunch of questions, and the results were quite surprising.

Thursday, August 9, 2012

Google Translate Update Adds Visual Translation Support

Cool new feature offers optical character recognition in addition to speech recognition and other input options to enter the text that should be translated.…

Google Translate camera

With the latest update to the Android version of Google Translate, you no longer have to type in those language queries.

The search giant has integrated Google Goggles’ optical character recognition (OCR) technology into the translate app, making it possible for a user to simply point their smartphone camera at unfamiliar text, click, brush, and translate, without having to manually type in words.

To use the new feature, press the camera button in the bottom right of the screen, point and tap to freeze the photo, then brush your finger over the particular segment you want translated, and a translation pops onto the screen. The usual text-to-speech option remains, to audibly learn what a sign, menu, or book says.

This update makes Google Translate for Android one of the most intelligent and learning-intensive apps the company has produced, Etienne Deguine, associate product manager for Google Translate, said in a blog post.

The app currently supports character recognition for Czech, Dutch, English, French, German, Italian, Polish, Portuguese, Russian, Spanish, and Turkish. Google is working to add more languages to the list.

Google Translate is available for free in the Google Play store for Android 2.1 and up.

Google isn’t the first to take on visual translation technology, though. In late 2010, the World Lens app hit the iPhone, providing users a Spanish or English translation instantly when the camera lens is pointed at foreign words.

Once referred to as a “futuristic” app, World Lens is still available in the Apple App Store, updated to include Italian, and the ability to translate languages in both directions – from Spanish to English, or English to Spanish. The app is compatible with iOS 4.0 and later devices, including the iPhone, iPod touch, and iPad.



For Google, keeping search relevant means baking big data into everything

More information about the Google knowledge graph.…

Google has opened its Knowledge Graph to the English-speaking world and has made intelligent voice search possible on mobile phones. Underneath it all, of course, are ever more-complex methods of analyzing data to make search smarter and easier than it has any business being.


It’s a fashionable practice in the Valley to write off Google’s search business, but the company is putting its big data chops to the test to prove doubters wrong. In a Wednesday morning blog post, Google SVP of Search Amit Singhal announced that Google’s Knowledge Graph is now live across every English-speaking country in the world, and that voice search on mobile phones has been improved to understand user intent. Useful, yes, but the real story is the technology that makes these features work.

For Google, it’s all about collecting and analyzing billions of data points to learn what each one really means. With Knowledge Graph, for example, Google uses a “database of more than 500 million real-world people, places and things with 3.5 billion attributes and connections among them.” It’s those connections that are the key, as they’re what make the system smart enough to know what you’re looking for that wouldn’t naturally show up in a standard keyword search.

Although Google hasn’t come out and said so, I’d imagine the Knowledge Graph utilizes Google’s Pregel graph processing engine. Graph processing and databases are catching on in social networks and other large-scale environments because they organize pieces of data by how they’re connected to one another. Those connections are called edges, and they’d keep Knowledge Graph results both informative and focused because the system knows how closely they’re related in any given circumstance.
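The node-and-edge idea behind a knowledge graph is easy to make concrete. The sketch below is a toy illustration only — the entities and relation names are invented examples, not Google's actual data or Pregel's API — but it shows why typed edges make results both informative and focused.

```python
# Toy knowledge graph: entities are nodes, typed edges are the
# connections. All facts below are invented for illustration.

graph = {
    "Tom Cruise": {
        "starred_in": ["Top Gun", "Mission: Impossible"],
        "born_in": ["Syracuse, New York"],
    },
    "Top Gun": {
        "directed_by": ["Tony Scott"],
        "genre": ["Action"],
    },
}

def related(entity: str, relation: str):
    """Follow one edge type out of an entity node."""
    return graph.get(entity, {}).get(relation, [])

def neighbors(entity: str):
    """All entities one hop away, regardless of edge type."""
    return [n for targets in graph.get(entity, {}).values() for n in targets]
```

Because each edge carries a type, a query about an actor can be answered with films rather than with every string that co-occurs with the name — the difference between a graph lookup and a keyword match.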

This example of a personalized interest graph from Gravity Labs illustrates how one might visualize a graph, in this case the connections between a reader’s perceived interests:

Of course, Google has another tool at its disposal, which is the collective wisdom it’s able to glean from billions of searches every day. So, as Singhal wrote when first explaining Knowledge Graph in May, “[W]e can now sometimes help answer your next question before you’ve asked it, because the facts we show are informed by what other people have searched for. For example, the information we show for Tom Cruise answers 37 percent of next queries that people ask about him.”

Google’s other big announcement today is improved voice search on mobile phones, both Android and iOS. Here’s how Singhal describes the new capability:

You just need to tap the microphone icon and ask your question, the same way you’d ask a friend. For example, ask “What movies are playing this weekend?” and you’ll see your words streamed back to you quickly as you speak. Then Google will show you a list of the latest movies in theaters near you, with schedules and even trailers. … When Google can supply a direct answer to your question, you’ll get a spoken response too.

On Monday, a Google Research blog post noted how the company’s work on neural networks — which it famously used to train a system capable of detecting cats and human faces in video streams — is being used to power speech recognition in the Jelly Bean release of Android. Seventeen-year-old Brittany Wenger recently won the Google Science Fair by building an application atop Google App Engine that uses a neural network to help detect breast cancer.

As one might imagine, however, the big challenge for Google, Microsoft, Apple and everyone else trying to provide intelligent but intuitive user experiences is figuring out how to package advanced computer science into easily digestible formats on ever-smaller devices. Search would certainly be a more effective tool if everyone could write complex queries directly against a company’s database, but the trick is making products good enough that we don’t have to. It’s boiling years of machine learning, natural-language processing and neural network research into “you ask a question and your phone spits back the right answer.”



Let Your Computer Talk To You: Text To Speech

Here are some hands-on instructions to get your computer to use a TTS engine to read information back to you…


Some people absorb information most efficiently through their ears. I happen to be one of them. I listen to podcasts, “read” audiobooks, and use text to speech engines to read long articles. If it wasn’t for text to speech, I wouldn’t read nearly as much as I do, and so it leaves me well informed. If you’re into auditory info, take a look at this how-to for turning on text to speech in OS X 10.8 and iOS 5.

First off, let’s tackle OS X. Launch System Preferences. Now, go to the System section, and click on the button labeled “Dictation & Speech.”


Now, tick the checkbox labeled “Speak selected text when the key is pressed.” You can click the “Change key” button to set an alternate key combo, and I suggest finding one that works for you. I like Ctrl+V, because it is easy to type with one hand, and it doesn’t conflict with any shortcuts I already use. Once it is set up, select some text, perform your key command, and listen to the dulcet tones of your Mac.


Now, let’s go for iOS. Launch the Settings app, and while in the General tab, scroll down until you see “Accessibility.” Tap it, and go to the next screen.


Now, tap on the “Speak Selection” button to take you to the toggle for this feature.


Finally, you’ll see a toggle. You’ll want that to be on. This screen also allows you to alter the speaking rate, but I leave it where it is. Whenever you select text, a new button will appear next to “Copy” labeled “Speak.” Tap it, and you’ll hear a lovely lady robot speaking what you selected.
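The GUI steps above have a command-line counterpart on the Mac: OS X ships a `say` utility that speaks any text aloud, with `-v` selecting the voice and `-r` the speaking rate. The sketch below only builds the command (so it runs anywhere); on a Mac you would hand the result to `subprocess.run`. The default voice and rate chosen here are just example values.

```python
import subprocess  # used on a Mac to actually run the command

def build_say_command(text: str, voice: str = "Alex", rate: int = 180):
    """Return the argv list for the OS X `say` text-to-speech tool.

    -v picks the voice, -r sets the rate in words per minute.
    """
    return ["say", "-v", voice, "-r", str(rate), text]

# On OS X: subprocess.run(build_say_command("Hello from your Mac"))
```

This is handy for scripting the same behavior the “Speak selected text” checkbox gives you interactively — for instance, piping a long article into a spoken read-through.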



Talking of gadgets…

Gadgets and speech recognition…

Charmingly efficient: since its integration into Apple iOS with the iPhone 4S in October 2011, Siri has been enticing users with its abilities and has set the trend for voice input in natural language. — File photo: AP

A concept that has long captured the imagination of sci-fi writers, the Human Machine Interface is today inching closer to bridging the gap between the worlds of machines and humans.

Though these machines may not take the form of the gentle humanoids Hollywood has taught us to imagine — like Robin Williams in Bicentennial Man, or giant robots like Bumblebee in Transformers — they are still rather powerful, and increasingly ubiquitous: they reside in our pockets in the form of smartphones.

While touch and gesture control have become all too common, with several original equipment manufacturers (OEMs) playing around with them to boost the consumer experience, the next big thing in this gadget segment appears to be voice input. Small and compact devices such as smartphones are now capable of processing complex voice input, and with the Siri application on the iPhone and the voice input support in recent versions of the Android operating system, tech majors appear to be working hard to outdo each other in this growing field.

The complexities

As a user, your phone might allow you to type, touch, gesture or speak to it; at the end of the day, however, all machines understand are digital commands. The process of converting these human inputs into a machine-understandable format grows in complexity with every attempt to emulate the ‘human elements’ in these devices.

Speech recognition of complex strings is the latest offering in smartphones. It is, as of today, the most complex mode of commanding a smartphone — complex, because processing speech is inherently an arduous task. Gestures and touch can be made uniform across all users; for instance, auto-rotate or ‘slide to unlock’ operations do not depend on the user’s behaviour, and even where they do, those dependencies can be easily eliminated and are hence easy to normalise.

Speech, however, inherently has multiple traits to be worked on — tone (frequency), volume (intensity) and speed of utterance. Each of these traits varies not only between individuals but, in most cases, also when the same person speaks at different times. The challenge in speech recognition is to normalise these traits into fundamental templates, which can be matched across different speakers to make consistent comparisons and logical decisions, explained Sneha Das, a signal processing student and intern at the Indian Institute of Science. Once normalised, these templates are compared with new inputs and logical commands are handed over to the device.
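The normalisation idea described above can be shown with a toy example: scale a feature sequence to zero mean and unit variance so that loudness and pitch offsets between speakers cancel out, then match the result against stored templates by distance. This is only the skeleton of the idea — real recognisers use far richer features (e.g. MFCCs) and alignment methods such as DTW or HMMs.

```python
# Toy template matcher illustrating trait normalisation. The feature
# values and template labels are invented for illustration.

def normalise(features):
    """Zero-mean, unit-variance scaling of a 1-D feature sequence."""
    n = len(features)
    mean = sum(features) / n
    var = sum((x - mean) ** 2 for x in features) / n
    std = var ** 0.5 or 1.0  # guard against a flat (zero-variance) input
    return [(x - mean) / std for x in features]

def closest_template(features, templates):
    """Return the label of the stored template nearest to the input."""
    x = normalise(features)
    def dist(label):
        t = normalise(templates[label])
        return sum((a - b) ** 2 for a, b in zip(x, t))
    return min(templates, key=dist)
```

After normalisation, an utterance spoken ten times louder still lands on the same template, which is exactly the speaker-independence the quoted explanation is after.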

Speech recognition

“Speech recognition is complex because to attain normalisation, the speech recognition engine must train itself by running multiple iterations of the same content. This involves heavy digital signal processing computation first, and a tedious look-up algorithm to make the comparisons,” added Ms. Das. With smartphone processors carrying Digital Signal Processing engines, sensible speech recognition has entered the smartphone segment.

Siri and Google Voice input capture spoken commands, convert them into text using complex algorithms and try matching them with a database of known terms and, in some cases, consult their backend servers to verify or derive more accurate decisions. With time, these applications personalise the results by better understanding the dialect of the speaker and by mapping the interests of the user to make more sensible suggestions.

Siri

Apple acquired Siri Inc. in 2010 and has since popularised Siri, the intelligent personal assistant with voice input. Since its integration into Apple iOS with the iPhone 4S in October 2011, Siri has been enticing users with its abilities and has set the trend for voice input in natural language. The knowledge navigation in Siri, which understands speech and performs tasks based on voice comprehension, happens at two levels — locally, using the on-board processing, and by communicating with Apple's servers. Users with good connectivity get results in almost real time.

Google, almost simultaneously albeit amid less hype, introduced voice input abilities starting with Android 2.2 (Froyo), and in many ways it has fared better than Siri with every updated release.

The advantage Google seems to hold over Siri can be traced back to GOOG411, a Google initiative in the U.S. in 2007 that provided free voice-based search of phone numbers. The project was discontinued in 2010, after functioning for about 30 months. While its purpose was not entirely clear at the time, Google had in the process begun analysing voice samples of its users, which must have come in handy during its later attempts at voice input.

This ability to understand user dialects, coupled with Google's massive understanding of user search patterns, gives it an evident advantage over Siri. With every release, one can expect both applications to get better, and in this tussle it is users who will benefit through better interaction with their smartphones.

