Monday, December 17, 2012

Is Apple's Epic Run Over? Analyzing Apple's R&D Advantages

Interesting analysis of the R&D budgets of the large mobile players...

Apple (NASDAQ: AAPL) has gone from a Wall Street darling that couldn't miss to a contrarian play trading at earnings multiples well below the broader market. 

With iPhones still flying off the shelves and the Mini an early hit, many investors are left wondering whether there are deeper problems with Apple that they're missing, or whether the current sell-off makes Apple a screaming buy. 

We've created a brand-new report titled "Is Apple's Epic Run Over?" that gives investors a comprehensive look at Apple. It answers whether the bears are right, or if Apple has huge advantages investors are overlooking. Below is a sample section of the report that focuses on Apple and its hidden cost advantages compared to competitors. We hope you enjoy this preview content from our premium Apple report.

Spend less, focus and innovate more: Apple and R&D

Company | R&D as % of Sales | Total R&D Spend (millions)
Apple | 2.2% | $3,381
Samsung | 5.9% | $10,169
HTC | 4.3% | $488
Nokia (NYSE: NOK) | 15.8% | $6,507
Research In Motion | 9.6% | $1,447
Dell | 1.7% | $1,001
Lenovo | 1.7% | $545
Microsoft (NASDAQ: MSFT) | 13.8% | $9,942
Google | 13.1% | $6,208

Source: S&P Capital IQ. All values for the last 12 months as of Dec. 1, 2012.

Research and development is an area with only a minor impact on Apple's bottom line, but that small footprint is precisely why it's so important. Apple's unique handling of R&D once again gives the company a key cost-structure advantage over rivals.

As you can see from the R&D table above, PC-reliant companies like Dell and Lenovo come in lowest, at 1.7%. That's not surprising, as intense price competition in Windows PCs long ago shifted the focus from design to supply chain management, scale, and marketing. Gross margins in the single-digit range don't leave much room for R&D spending. Apple has long been the only "PC" company that could collect outsized margins on its designs.

What's surprising, however, is how close Apple is to companies like Dell and Lenovo. HTC, a company whose focus is on selling Android phones -- so there's little R&D expense on the software platform -- spends almost twice what Apple does as a percent of sales. Samsung spends 168% more than Apple as a percent of sales, and 200% more on R&D overall.
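The percentage comparisons above follow directly from the table; a quick sanity check of the arithmetic:

```python
# R&D spend from the table above (last 12 months as of Dec. 1, 2012)
apple_pct, apple_abs = 2.2, 3381        # percent of sales, $ millions
samsung_pct, samsung_abs = 5.9, 10169

more_of_sales = (samsung_pct / apple_pct - 1) * 100   # ~168% more
more_absolute = (samsung_abs / apple_abs - 1) * 100   # ~201% more

print(f"Samsung spends {more_of_sales:.0f}% more as a percent of sales")
print(f"Samsung spends {more_absolute:.0f}% more in absolute terms")
```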

It's important to note that research and development can be a differentiator, but only if it's focused on the right area and the company is using it to bring the right products to market. Nokia's R&D is still 15.8% of sales in spite of the company trimming nearly $2 billion worth of annual expenses over the past two years.

Yet, despite having vast R&D resources, Nokia still couldn't bring an iPhone competitor to market in any timely manner, despite prototyping iPhone-like touchscreen phones early in the last decade. That's the result of two factors: a management team that didn't prioritize the right projects, and an R&D infrastructure that lacked the right focus. By nature, R&D can be scattershot, but in general it should have a viable commercial outcome. Nokia's research arm conducted experiments sawing a 1-ton block of ice into 50-cm slabs with infrared sensors to study heat trails left by hands, and sent anthropologists to Indian villages to observe rural phone usage.

Likewise, in Microsoft's case, the company kept R&D budgets concentrated on areas like giant "Surface" tables (years before the recent Surface tablet was released, Microsoft used the name on touchscreen tables) even while smartphones and tablets took off. The idea of a 4-foot table you can play a small set of games on and manipulate with touch is significantly less cool when you have an iPad on a table at home with hundreds of thousands of apps. Microsoft's research is often woefully out of touch with the commercial realities building around it, and the result is an abundance of waste.

To be fair to Microsoft, it has plenty of enterprise sales, and other companies like Cisco and Oracle have R&D budgets that are well over 10% of sales.

Apple doesn't suffer the same fate as competitors that spend far too much on R&D projects with no viable commercial outcome or that have divergent business units with little overlap. In Apple's case, the work of an R&D team looking into mobile advances can be applied across almost the entire revenue base, since iOS comprises the majority of Apple's revenue. Throw in the fact that Apple maintains very few product lines relative to its rivals, and you can see why the company has been so innovative even while spending less on R&D. Its management saw the right areas of growth, focused its resources there in an outsized way, and is now reaping the rewards.

The challenge for Apple is that the Internet services accompanying recent iOS launches -- iCloud, Maps, and Siri -- while still tied to iOS, begin to stretch that focus. As Apple looks to control more central parts of its platform, its R&D budget could expand toward the total absolute spend of a company like Google.

Bottom line
Comparing Apple's R&D as a percent of sales to competitors', it's safe to say the company enjoys an R&D cost advantage of 3-6 percentage points over its peer group. Comparing Apple to Microsoft or Google would be unfair, as both those companies' models are light on hardware. At the same time, with its development of a desktop operating system, a mobile operating system, and various adjacent services, it's safe to assume Apple's R&D as a percent of sales should be higher than Samsung's and HTC's, two companies that save on platform development thanks to Google's creation of Android.

Check out the chart below, which shows the theoretical impact on Apple's bottom line if it were to spend on R&D like two selected rivals, Google and Samsung.
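The chart's arithmetic can be re-derived from the table above. A minimal sketch, assuming the chart simply backs Apple's implied sales out of its own 2.2% / $3,381M figures and scales by each rival's R&D rate:

```python
# Pre-tax impact if Apple spent on R&D at rivals' rates (illustrative derivation)
apple_rd_mm = 3381      # Apple's trailing R&D, $ millions
apple_rd_pct = 2.2      # as a percent of sales

implied_sales = apple_rd_mm / (apple_rd_pct / 100)   # ~$153.7 billion in sales

for rival, pct in [("Samsung", 5.9), ("Google", 13.1)]:
    hypothetical_rd = implied_sales * pct / 100
    extra_pretax = hypothetical_rd - apple_rd_mm
    print(f"At {rival}'s {pct}% rate: ${hypothetical_rd:,.0f}M in R&D, "
          f"${extra_pretax:,.0f}M more pre-tax expense")
```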

[Chart: theoretical impact on Apple's bottom line at Google's and Samsung's R&D spending rates]

via http://www.speechtechnologygroup.com/speech-blog - Source: http://www.fool.com/investing/general/2012/12/17/is-apples-epic-run-over-analyzing-apples-rd-advant.aspx

Wednesday, December 12, 2012

Boosting Automatic Speech Recognition Through Articulatory Inversion

via http://www.speechtechnologygroup.com/speech-blog

Amazon Launching “Voice Guide” and “Explore by Touch” for Kindle Fire and Kindle Fire HD

Text to Speech is an invaluable tool for the vision impaired community…

Source:  google.com

Friday, December 07, 2012 - by Paul Lilly
Text-to-speech and adjustable font sizes are two features that have been around for years on Kindle devices to help vision-impaired Kindle owners and those with learning disabilities. Now Amazon is bringing “Voice Guide” and “Explore by Touch” to its standard-definition Kindle Fire and Kindle Fire HD 7” tablets. These two accessibility features were previously available on just the Kindle Fire HD 8.9” tablet. “We have heard from thousands of customers who are vision-impaired that Kindle has made a difference in their lives. With Kindle Fire HD 8.9” and soon our full line of new tablets, we are continuing our efforts to provide a range of accessibility features—Voice Guide, Explore by Touch, text-to-speech, optional text coloring and adjustable font sizes—for our vision-impaired customers,” said Dave Limp, Vice President, Amazon Kindle. “We plan to deliver additional accessibility features, so that our vision-impaired customers can have a better experience for reading, communicating and consuming media.”

With Voice Guide navigation enabled, any action performed by the Kindle owner will be read aloud. The idea is to provide immediate feedback to visually impaired individuals.

Taking it a step further, using the Explore by Touch mode, customers can swipe their fingers across the touchscreen and as they touch an item, the Kindle announces which item has been tapped. A second tap performs the default action.  

via http://www.speechtechnologygroup.com/speech-blog

How Should Investors Play Nuance Communications in 2013?

A "foolish" look at speech recognition and Nuance…

Source:   google.com  

The market hasn’t been too kind to mobile growth play Nuance Communications (Nasdaq: NUAN) so far this year. Heading into the end of 2012, the company finds its shares down nearly 11%. However, investors have hung with it, in no small part thanks to its close ties to the tech investing storyline of the decade — the rise of mobile devices such as smartphones and tablets.

Looking beyond its slumping share price, though, the company still managed a relatively strong set of financial performances this year, growing its top line by more than 20% each quarter. Translating those results to the bottom line, however, did prove somewhat problematic. Regardless, no one disputes the immense growth potential the company holds, and the potentially massive payouts it could generate for shareholders as a result.

The real question when it comes to sizing up Nuance, and whether to buy in or sit on the sidelines, is how likely shareholders are to see this outcome. The Fool recently enlisted one of our star tech writers to create a premium research report on Nuance. To acquaint our readers, we decided to include a brief excerpt from the report here for you today, free of charge. If you want to learn more about Nuance, you can access the report in its entirety by just clicking here. Enjoy!


Understanding the layers
Long ago, most of the voice recognition industry transitioned from local speech recognition engines housed in physical devices to a server-based approach that taps into the power of the cloud. The primary benefit was that far more computational horsepower is available in servers than what’s available in a local device, especially if the device is to be small and portable. The biggest disadvantage to this approach is the need for constant connectivity, although this downside continues to become less prominent thanks to advances in cellular data technology like 3G and 4G.

Voice interactions primarily involve two layers to function: the speech recognition engine and the application software. The speech engine serves as the ears, translating sounds into input data, and is typically located in the server. The application software is like the brain, processing and interpreting that data into meaning, and is typically processed locally.

When consumers have poor interactions with voice recognition -- calling the bank or cable company, say -- it's typically weaknesses in the application software layer (frequently built in-house by said bank or cable company) that lead to a poor experience. Many of these companies tap Nuance for the speech engine, since it has the best third-party speech engine available to license, working in over 60 languages, but that's little good if the application software doesn't recognize all the different ways that a speaker can say "yes."
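A toy sketch of the two layers (hypothetical code, not Nuance's actual API) makes the division of labor concrete: the engine only transcribes, and a narrow application layer is what produces the frustrating "I didn't catch that" loops:

```python
def speech_engine(audio: str) -> str:
    """Stand-in for the server-side recognizer: audio in, transcript out.
    (Here we pretend the 'audio' string is already a perfect transcript.)"""
    return audio.lower().strip()

# The application layer's coverage of phrasings is what shapes the experience.
AFFIRMATIVES = {"yes", "yeah", "yep", "sure", "correct", "that's right"}

def application_layer(transcript: str) -> str:
    """The 'brain': maps a transcript to meaning. Even a perfect engine
    can't help if this set of accepted phrasings is too small."""
    return "CONFIRMED" if transcript in AFFIRMATIVES else "UNRECOGNIZED"

print(application_layer(speech_engine("Yeah")))    # CONFIRMED
print(application_layer(speech_engine("uh-huh")))  # UNRECOGNIZED
```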

Buying the way to the top
Nuance is a serial acquirer, and scoops up companies at a breathtaking pace. The company has purchased over 50 companies since Paul Ricci became CEO in 2000. It’s even been known to pummel rivals with intense competition and patent infringement lawsuits only to turn around and make an outlandish acquisition offer.

This presents some unique challenges when assessing Nuance’s fundamentals, because in the pursuit of its highly acquisitive strategy, the company’s ballooning goodwill and intangible asset carrying values represent impairment risks should acquisitions fail to generate revenue growth or cost-saving synergies. For example, goodwill and intangible assets totaled $3.65 billion last quarter, or 76% of total assets.

At the same time, Nuance has accumulated quite a bit of debt to finance these acquisitions over the years. Total long-term debt and capital leases (including the current portion due within the next 12 months) now stands at $1.41 billion. That’s well above the $860 million it owed a year ago, and quarterly net interest expense has likewise increased from $8 million to $19.9 million since then. Nuance also recently announced that it plans on selling an additional $600 million in senior notes due 2020, further increasing its debt burden, to pursue more acquisitions.

Additionally, Nuance is continuously amortizing intangible assets, which puts a drag on reported net income and inflates traditional valuation multiples like price-to-earnings, making it appear more expensive relative to GAAP earnings. This is why investors should also factor in other metrics like enterprise value-to-EBITDA or enterprise value-to-free cash flow when evaluating Nuance.
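A small worked example, using hypothetical figures rather than Nuance's actual financials (and ignoring depreciation and interest for simplicity), shows how amortization depresses GAAP earnings and inflates P/E while EV/EBITDA adds the non-cash charge back:

```python
# Hypothetical serial-acquirer financials, $ millions
ebitda = 100.0
amortization = 40.0     # non-cash charge from acquired intangibles
tax_rate = 0.30

# GAAP earnings bear the full amortization drag
net_income = (ebitda - amortization) * (1 - tax_rate)

market_cap = 1000.0
net_debt = 400.0
enterprise_value = market_cap + net_debt

pe = market_cap / net_income               # ~23.8x -- looks expensive
ev_to_ebitda = enterprise_value / ebitda   # 14.0x -- non-cash charge added back
print(f"P/E {pe:.1f}x vs EV/EBITDA {ev_to_ebitda:.1f}x")
```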

[Chart: Nuance's enterprise value-to-EBITDA and enterprise value-to-free cash flow multiples]

These multiples have decreased since the beginning of the year, offering investors looking for an entry point a more attractive valuation.

via http://www.speechtechnologygroup.com/speech-blog

Tuesday, December 11, 2012

Apple Versus Google: The War Escalates - and speech recognition is at the center of it

The battle of the Titans is heating up - and it looks like speech recognition will play a pivotal role...
 
Read more here: google.com

via http://www.speechtechnologygroup.com/speech-blog - The battle of the Titans is heating up - and it looks like speech recognition will play a pivotal role... Google’s ( GOOG ) entry into the smartphone market continues to negatively impact other aspects of its business model in a classic case of “unintended consequences.” The company’s acquisition of ...

Dragon Systems Founders Take Goldman to Trial Over Advice

Will the people who set the stage for what speech recognition is today finally be heard?

Source:  BusinessWeek  

Jim and Janet Baker, pioneers in the field of computer speech recognition, turned to Goldman Sachs Group Inc. (GS) in late 1999 when they needed investment bankers to advise them on the sale of Dragon Systems Inc., the company they had spent 17 years building.

The Bakers, who started Dragon in their Boston-area home, had seen it grow into a company with $68 million in sales, more than 350 employees, and operations in the U.S., Germany, the U.K., France, and Japan. They wanted to sell to a company that would let them continue to develop the technology they had spent their professional and married lives creating.

“This was the most important business decision of our lives,” Janet Baker said in an interview in the couple’s home in West Newton, Massachusetts. “We chose Goldman because of their global reach and their reputation as the world’s most important investment bank.”

In a federal trial that began yesterday in Boston, the Bakers claim that shoddy work by Goldman Sachs on the $580 million all-stock sale of Dragon to a Belgian competitor, Lernout & Hauspie Speech Products NV, cost them their company and their fortune.

Within months of the sale’s June 2000 close, Lernout & Hauspie collapsed in an accounting scandal and its shares that the Bakers took as payment for their 51 percent stake in Dragon were worthless.

‘Like Our Child’

Worse, according to Jim Baker, they no longer had access to the speech-recognition technology they had created. The patents underlying Dragon products including their popular dictation program, Dragon NaturallySpeaking, were sold at a bankruptcy auction.

“Dragon Systems and the Dragon technology was like our child,” Jim Baker said in the interview in May.

Goldman Sachs says it isn’t to blame. The fraud at Lernout & Hauspie, which the bank said it couldn’t have been expected to discover, caused the Bakers’ losses, Goldman Sachs argued in court papers.

The bank also said its contract was with Dragon, which no longer exists, and argued that the Bakers themselves lack legal standing to sue. The New York-based bank said it advised Dragon to get its accountants, Arthur Andersen LLP, to probe Lernout & Hauspie’s financial status.

The four-man Goldman Sachs team assigned to the transaction provided Dragon with competent advice, the bank said in court filings.

Same Approach

“There’s no difference in the quality or the standard or the approach that we take in working on assignments for small companies or large companies,” Gene T. Sykes, Goldman Sachs’s head of mergers and acquisitions, said in a 2011 deposition in the case. “Every client, whether large or small, gets the same high-quality approach.”

The Bakers claim the opposite is true. The same year that the couple sold Dragon, Goldman Sachs advised mobile phone company Vodafone Group Plc (VOD) on its $185 billion takeover of Mannesmann AG and drugmaker Warner-Lambert Co. in its $120 billion sale to Pfizer Inc. (PFE)

The Dragon deal, for which Goldman Sachs was paid $5 million, was “small potatoes” for the firm, the Bakers said in court papers. The four Goldman Sachs bankers assigned to shepherd Dragon through the sale were “unsupervised, inexperienced, incompetent and lazy,” they said in a court filing.

Lost ‘Everything’

The Bakers lost “virtually everything” in the sale of their company, their lawyer, Alan Cotler of Reed Smith LLP, told jurors in his opening statement yesterday.

The couple, along with two other co-founders who are also suing Goldman Sachs, are “the nicest, salt-of-the-earth people, who are geniuses at what they do,” Cotler said. What they are not, he said, are financial professionals.

“Goldman Sachs emphasized the fact that it was the biggest and the best,” Cotler told the jury.

Lawyers for Goldman Sachs are scheduled to present their opening argument in the case today.

James Baker and Janet MacIver met as graduate students and married in 1971. Janet brought a fascination with Asian dragons to the marriage.

Jim, a mathematician, and Janet, a biophysicist, said they methodically considered what area of research to spend their lives pursuing. They were looking for a field in which they could make a concrete contribution within 40 years or so — the span of a professional career, Janet Baker said. They chose computer speech recognition.

Dragon Decor

Much of the couple’s early work was done at the dining-room table in their Victorian home, which is still filled with dragons — dragon kites, a dragon sculpture by the door, the Bakers’ dragon-pattern wedding china.

After founding Dragon, the Bakers’ work proceeded apace with the development of desktop computers with ever-increasing memory and processing speeds.

Jim’s work in mathematics led them to apply a mathematical principle called Hidden Markov Models to predict what word would follow another.

In the 1980s, Dragon sold an early voice recognition program to a U.K. company called Apricot Computers Ltd. In 1990, the company introduced DragonDictate, a commercial dictation program.

“You had to pause … between … each … word,” Jim Baker said, describing the limits of the program.

Grateful Customers

From the start, the Bakers got feedback from early adopters of the technology, including many disabled people. They’ve kept letters from grateful customers who credited Dragon’s products with giving them new powers of communication.

The ultimate goal, a large-vocabulary, continuous speech recognition product that could register words as they were spoken, remained out of reach until 1997 when Dragon introduced Dragon NaturallySpeaking.

NaturallySpeaking had a dictionary-sized vocabulary and was available in six languages. Users could speak at a normal speed; pauses were no longer necessary. Oscar-winning actor Richard Dreyfuss, a fan of the Dragon program, volunteered as master of ceremonies when it was introduced.

Industry Awards

The product was a hit with consumers and won dozens of industry awards. The current version of Dragon NaturallySpeaking is sold by Burlington, Massachusetts-based Nuance Communications Inc. (NUAN)

The success of NaturallySpeaking brought Dragon into competition with companies including International Business Machines Corp. (IBM) and Microsoft Corp. (MSFT), which were interested in developing their own speech-recognition products.

The Bakers, who were getting unsolicited offers to buy Dragon, knew they needed capital to develop their technology. That led to Goldman Sachs, the sale to Lernout & Hauspie and the loss of Dragon itself.

“It was devastating,” said Janet Baker. “You spend round the clock working on something for decades. And the train stops.”

Their case will be considered by a jury of six, with as many as six alternates, who were selected yesterday in the courtroom of U.S. District Judge Patti Saris. Saris, appointed by President Bill Clinton, has presided over civil and criminal cases, including one in which she accepted, in April, Merck & Co.'s agreement to pay a $321.6 million criminal fine and $628.3 million to resolve civil claims that it sold Vioxx for unapproved uses and improperly touted its safety.

Co-Founders, Witnesses

Some potential jurors gasped when Saris said the trial may take until Jan. 25 to complete. She told them the court will take a break from Dec. 21 to Jan. 2 for the Christmas and New Year holidays.

After opening statements, the first witness will be Jim Baker, Cotler told Saris yesterday.

The Bakers’ witnesses will include Paul Bamberg and Robert Roth, Dragon co-founders who held minority shares in the company. Bamberg’s and Roth’s claims against Goldman Sachs will be tried along with the Bakers’ in the trial.

Both sides said they will show jurors recorded testimony from Sykes. They also plan to present expert witnesses to testify about investment banking practices and to offer their opinions on Goldman Sachs’s conduct in the Dragon transaction.

Goldman Sachs didn’t take part in a crucial March 8, 2000, meeting set to put together final terms of the sale, according to the Bakers. The leader of the Goldman Sachs team was on vacation and said he couldn’t phone into the meeting, they said. It was at that meeting that Dragon agreed, disastrously, to the Belgian company’s request to change its half-cash, half-stock offer into one that paid only in Lernout & Hauspie stock.

Inconsistent Statements

The Bakers claim the Goldman Sachs team failed to investigate inconsistencies in Lernout & Hauspie’s financial statements that should have prompted them to steer Dragon away from the deal.

The Bakers point to press reports showing that Lernout & Hauspie’s reported revenue in Asia jumped from $9 million in 1998 to $138 million the following year.

Later, after the Dragon sale closed, the Wall Street Journal said reporters had contacted a group of Korean companies that Lernout & Hauspie identified as customers. Many of the companies said they didn’t do any business with the company.

The Bakers may tell jurors that Goldman Sachs considered investing $30 million of its own money in Lernout & Hauspie in 1998, before it was hired to advise Dragon. The firm abandoned the idea, which it called “Project Sermon,” after pursuing due diligence, including calls to customers of the Belgian company. The Goldman Sachs team made no such calls in preparation for the Dragon transaction, the Bakers said in court filings.

‘Huge Red Flag’

“If Goldman had made even one such phone call to one of the many fake Asian customers that L&H was claiming were sources of huge amounts of new revenue, it would have raised a huge red flag that would have stopped the merger,” the Bakers argued in court papers.

According to the U.S. Securities and Exchange Commission, Lernout & Hauspie created bogus customers, booked circular transactions with shell companies and recorded loans as sales from 1996 to 2000. The company was forced to restate $373 million in earnings and filed for bankruptcy in November 2000.

The suit against Goldman Sachs was put on hold by agreement of both sides for years while the Bakers sued other participants and advisers in the Lernout & Hauspie deal. They've reached a total of about $70 million in settlements, Cotler told Saris earlier.

Damages Sought

Based on the value of Dragon at the time of the sale, the Bakers lost as much as $288.8 million in the sale and its aftermath, they said in court filings. Bamberg and Roth have said they lost as much as $50 million.

In court filings, Goldman Sachs said the plaintiffs' damages are less than the amount they've already recovered in settlements.

The judge yesterday denied a request by Goldman Sachs to delay the trial because of statements made by Cotler to the press.

She ordered the lawyer not to talk to reporters about the case.

“You cannot talk to the press,” she told Cotler, who was quoted last week in a Boston Globe column about the case. “You cannot. I don’t know why you did.”

She ordered Goldman Sachs to remove from its website arguments it posted about its defenses.

The case is Baker v. Goldman Sachs & Co., 09-cv-10053, U.S. District Court, District of Massachusetts (Boston).

via http://www.speechtechnologygroup.com/speech-blog

Place Your Bets, People: Semantic Speech Recognition and the Future of Libraries

Will semantic and conversational speech recognition replace your Librarian?

Source:  google.com  

Some years ago, when cellphones were still mostly the province of celebrities and hardcore business travelers, I was walking through an airport and saw a well-groomed and prosperous-looking man engaged in animated conversation with, as far as I could tell, himself. He certainly didn’t seem to be conversing with anyone nearby, anyway. As I (carefully) got closer and continued to watch him talking and gesticulating into the empty social space around him I thought to myself, “That’s interesting; he doesn’t look crazy…”

But then as I passed him, I noticed that there was something stuck in his ear, and I realized he was using one of those newfangled Bluetooth devices that I had started hearing about.

Today, of course, we no longer think twice when we see a business-suited person striding purposefully down the street and talking animatedly to no one. When you witness someone having an out-loud conversation with an invisible interlocutor, you automatically assume “Bluetooth device.”

There’s a big conceptual gap between conversing with someone through your phone, though, and conversing with your phone. Apple took the first mass-market leap across that gap a year or so ago with the introduction of an iPhone-native voice recognition application called Siri. Now there’s a growing buzz around a new smartphone app called Google Voice Search, which some people are saying is vastly better than Siri. If, in fact, it’s that good—and my own experience suggests that it is—then I have to wonder: will Google Voice Search usher us into a culture in which we no longer think twice about someone sitting in her office carrying on a vocal conversation with her computer? Semantic speech recognition has been a standard trope of sci-fi movies for decades, but it has yet to become a common feature of home and office life. That may well be about to change.

Let’s squint our eyes and try to look forward five years. I don’t normally consider myself a betting man, but the fact is that we’re all betting people. Every day we allocate time, energy, and other resources to certain activities and processes based on the belief that things are going to be a certain way in the foreseeable future. Each of those allocations of time and energy constitutes a bet.

So, who’s willing to bet against semantic speech recognition software coming into full maturity within the next two years? Not me. There’s been too much progress already, and it offers too rich an array of solutions to too many problems in too many contexts for me to be willing to assume that its progress won’t continue and quickly accelerate. If it does, what implications would such development have for the future roles of librarians?

I suggest that it’s a relatively small conceptual jump from asking your phone “How close am I to a Mexican restaurant with at least a three-star rating?” to asking it “Please find me five peer-reviewed articles on demographic trends in Europe from no fewer than three journals, each with an impact factor not lower than 11, and email them to me as .pdf files.” And I would suggest that the jump from there to “What are the best journals in microbiology?” or even “Are there any important and relevant articles missing from my works-cited list?” is also relatively small.

Now, maybe you disagree that the jump between “find me some articles” and “analyze my citations for completeness” is a relatively small one. And maybe you feel that even in a research environment characterized by advanced semantic speech recognition, what the human librarian offers—social perceptivity; a finely-honed sensitivity to the subtleties of marginal relevance; an awareness of sources not yet fully mapped by the open Web—is still going to constitute a unique and essential value proposition for the foreseeable future.

And maybe you’re right. But what if those value-adds that humans alone can offer represent surplus value to our patrons? The fact that a service is unique does not automatically make it valuable, and the fact that it’s valuable does not guarantee that it will be seen by those to whom we’re trying to sell it as more valuable than the alternatives. And make no mistake: we librarians are selling something. We are offering our services in exchange for our patrons’ increasingly-scarce time and attention. By offering them research support we are asking them, in effect, to make a bet—the bet is that their investment of time, energy, and inconvenience will pay off more richly than the much smaller investment required to ask a question of their phones or computers. It’s the same bet we’re asking them to make when we encourage them to “start their research with the library website” rather than with Google or (shudder) Wikipedia.

Nor should we forget a major problem with human-based library research support: it’s not scalable. One librarian can’t supply one-on-one service to thousands of patrons. A computer powered by semantic speech recognition, on the other hand, can, and its results don’t have to be just as good as the results you get from talking to a librarian—they only have to seem good enough to obviate talking to a librarian.

Does that scare you? It should. It does me. The question is: does it scare us enough to start thinking in radically different ways about how we support our patrons in their work?

This article was featured in Library Journal’s Academic Newswire enewsletter.

google.com by Rick Anderson

Original Page: http://www.google.com/url?sa=X&q=http://lj.libraryjournal.com/2012/11/opinion/peer-to-peer-review/place-your-bets-people-semantic-speech-recognition-and-the-future-of-libraries-peer-to-peer-review/&ct=ga&cad=CAcQARgBIAAoATAAOABA4u-YhgVIAlAAWABiBWVuLVVT&cd=3WhPyVgfA5w&usg=AFQjCNEgObhin52_0A1qcXY_exkHLSMH3Q

Best,

Gerd
via http://www.speechtechnologygroup.com/speech-blog

Three Years Until In-Car Voice Recognition Really Rocks

Speech recognition and natural language understanding (NLU) need a lot more horsepower to work well in the car…

Source:  google.com

Voice recognition at its best in a car today is about 90 percent accurate. The goal of the car companies and of speech recognition software supplier Nuance is to reach about 95 percent accuracy.

And Nuance estimates that is about 3 years away.

A key ingredient to boosting recognition power is processor speed. In today’s cars, an infotainment system is running at about 300 to 1,000 million instructions per second (MIPS), said Nuance Director of Automotive Solution Architecture Brian Radloff. When you get to about 10 to 15X that level you can really support true natural language recognition, he said.

If processing power keeps growing on the cadence Moore’s law describes — a doubling roughly every 18 to 24 months — true natural speech recognition in the car is only a few years away.
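As a rough sanity check on that timeline, the multiplier Nuance quotes can be converted into years. This is a sketch using only the 10–15x figure from the article; the doubling periods are standard readings of Moore’s law, not figures from Nuance:

```python
import math

def years_to_multiplier(multiplier, doubling_period_years):
    """Years needed for compute to grow by `multiplier`
    if it doubles every `doubling_period_years` years."""
    return math.log2(multiplier) * doubling_period_years

# Target from the article: 10-15x today's 300-1,000 MIPS head units.
for doubling in (1.0, 2.0):  # 1-year (optimistic) vs. 2-year doubling
    lo = years_to_multiplier(10, doubling)
    hi = years_to_multiplier(15, doubling)
    print(f"doubling every {doubling:g} yr: {lo:.1f}-{hi:.1f} years to 10-15x")
```

Under annual doubling the 10–15x jump takes about 3.3–3.9 years, which is consistent with Nuance’s three-year estimate; under the more conventional two-year doubling it stretches to roughly 6.6–7.8 years.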

Some car makers claim their systems recognize natural speech right now, but it’s likely only for certain functions like making phone calls or searching through music.

Full natural speech recognition eliminates the need for key commands like “Call Bill,” and lets you say, “Please get Bill on the phone.”

“I believe the biggest gains to be made are going to be in conversational speech and understanding the intent of what the user is trying to accomplish. We’re starting to see that in telephony in the mobile space,” said Radloff.
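The gap Radloff describes — rigid key commands versus understanding intent — can be illustrated with a deliberately tiny sketch. Nothing here reflects Nuance’s actual implementation; the grammar, keyword list, and name heuristic are all invented for illustration:

```python
# Hypothetical sketch: fixed command grammar vs. loose intent matching.
FIXED_GRAMMAR = {"call bill": ("call", "Bill")}

CALL_CUES = ["call", "phone", "dial", "get", "on the phone"]

def fixed_parse(utterance):
    """Rigid grammar: only exact, pre-registered commands are understood."""
    return FIXED_GRAMMAR.get(utterance.lower().strip())

def intent_parse(utterance):
    """Loose matching: any cue word signals the 'call' intent; a
    capitalized word that isn't itself a cue is taken as the contact."""
    text = utterance.lower()
    if any(cue in text for cue in CALL_CUES):
        skip = {"please", "call", "phone", "dial", "get"}
        names = [w.strip(".,") for w in utterance.split()
                 if w.istitle() and w.lower() not in skip]
        if names:
            return ("call", names[0])
    return None

print(fixed_parse("Call Bill"))                      # ('call', 'Bill')
print(fixed_parse("Please get Bill on the phone"))   # None -- grammar too rigid
print(intent_parse("Please get Bill on the phone"))  # ('call', 'Bill')
```

Real NLU systems use statistical models rather than keyword lists, but the contrast is the same: the fixed grammar fails the moment phrasing varies, while intent matching recovers the user’s goal.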

It is also important that car makers improve other parts of the infotainment system to get the best speech recognition experience.  The graphics on the screen should properly tie into the speech control. And the microphone plays a role in the system as well.

“The biggest improvement we’ll see in system-level performance is when the car company really takes a holistic view of voice and focuses on the whole package: the graphics on the screen, the microphone. Also, how do these systems handle a recognition failure? Does it help you along if it doesn’t recognize that I said, ‘I feel like rocking today, let’s hear the Stones’? It can say, ‘I don’t understand,’ or it can come back and say, ‘Please say the name of the artist again.’”

Radloff added, “The reality is, there are systems that are very sophisticated today and for some users, they deliver a very good experience. The bulk of the focus over the next 5 years in the automotive space and in voice in general is going to be how do we take this experience, that is very good for a certain group, and make it very good for a large swath of the car buying public.”


Monday, December 10, 2012

Google’s Siri-competitor, Google Now, could be coming to Chrome

Rumors have it that Google Now, with its speech recognition feature, will be included in Google Chrome. This would mean that millions of desktops running Chrome could soon be equipped with personal-assistant capabilities.

Source: google.com

Google’s Siri competitor, Google Now, could be moving from Android-only devices onto Windows and Mac desktops, according to reports from CNET.

CNET is reporting: “Chrome team programmers accepted the addition of a ‘skeleton for Google Now for Chrome’ to the Google browser yesterday, an early step in a larger project to show Google Now notifications in Chrome.”

Google Now was released in mid-summer this year and has so far been available only to owners of the latest Android devices running 4.1 (Jelly Bean) or later. Google Now is the company’s answer to Apple’s Siri; the service performs certain tasks, such as providing directions or opening applications, based on voice commands.

Apart from voice operation, Google Now’s box of tricks includes automatic weather, sports, and flight information based on users’ locations and previous searches. Its main selling point is that it can work out a user’s typical daily commute and provide driving and transit information in real time.

Described as a killer feature for Android devices, the service has been compared favourably to Siri, but due to the relatively low number of Android Jelly Bean users it hasn’t become a mainstream Google product yet.

The likely reasons for releasing a version of Google Now for Chrome would be to encourage more users to remain with Android when upgrading their smartphones and to keep users in the big G’s ecosystem.

It’s a smart move from Google, which would allow it to capitalize on its large number of Chrome users to boost its growing number of Android 4+ users (depending on which report you read, Chrome is vying for second place with Firefox in the browser wars or has already overtaken Mozilla’s flagship product).

So how likely is this? Well, Google Chrome developer rumours have a habit of being right – the last one we reported was the possible release of Chrome for Android, so it’s quite appropriate that an Android service is now (possibly) coming to Chrome.

That said, it’s early days, so don’t expect this to appear in the Chrome Web Store any time soon.


Saturday, December 8, 2012

Are Apple And Google In Race For North Carolina’s ‘Black Gold’?

Even the cloud has a power plug…
Source: google.com

Apple, fuel cells, pig manure, and Google in rural North Carolina.

MINYANVILLE ORIGINAL To support its iCloud and Siri voice recognition services, Apple (NYSE:AAPL) runs a data center in Maiden, NC, 40 miles northwest of Charlotte. That center is powered by 24 hydrogen fuel cells that produce 4.8 megawatts of power. Apple, which prides itself on the transparency of this project, has made it clear through the NC Renewable Energy Tracking System that it will add 26 more fuel cells by January, upping its electricity production to 10 megawatts, enough to power 6,250 homes. In addition, the company is in the midst of building a 20-megawatt solar panel array. According to Tina Casey of Talking Points Memo’s Idea Lab, “It looks like Apple could become the next tech giant to join the state’s hog waste ‘black gold’ rush.” The question is, why North Carolina?
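The figures above are internally consistent, as a quick back-of-the-envelope check shows. All numbers come from the article; the per-home draw is implied by them rather than stated directly:

```python
# Sanity check on the Maiden, NC fuel cell figures (kW units avoid float noise).
cells_now = 24
power_now_kw = 4800                                  # 4.8 MW today
cells_added = 26

per_cell_kw = power_now_kw // cells_now              # 200 kW per cell
total_kw = (cells_now + cells_added) * per_cell_kw   # 50 cells -> 10,000 kW = 10 MW
homes = 6250
kw_per_home = total_kw / homes                       # implied average draw per home

print(per_cell_kw, total_kw, kw_per_home)  # 200 10000 1.6
```

So each cell is rated at 200 kW, the 50-cell total lands exactly on the article’s 10 MW, and the “6,250 homes” claim implies an average household draw of 1.6 kW.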

A fuel cell converts chemical energy from a fuel source — in Apple’s case, natural gas (though this will be changing; more on that to come) — into electricity through an electrochemical reaction with an oxidizing agent, most commonly oxygen. Energy efficiency for fuel cells is between 40% and 60%, though that figure rises to about 85% if waste heat is captured. Fuel cells were traditionally confined to hugely expensive experiments (NASA was one of the first pioneers of the technology), but with costs decreasing, fuel cell deployment is growing: The global fuel cell industry boasted a compound annual growth rate of 83% from 2009 to 2011. Despite this, fuel cell technology still occupies a relatively small sector of energy production. Through tax incentives, California offers to cover roughly half the cost of fuel cell projects, and the federal government offers companies in the field a 30% tax credit. So why is the country’s biggest fuel cell complex in North Carolina and not California?
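For a sense of what an 83% compound annual growth rate means over that two-year window, taking the article’s figure at face value:

```python
# Compound growth implied by an 83% CAGR from 2009 to 2011.
cagr = 0.83
years = 2011 - 2009                  # two compounding periods
total_growth = (1 + cagr) ** years   # 1.83^2
print(f"{total_growth:.2f}x")        # 3.35x
```

In other words, the industry roughly tripled in size over those two years — rapid growth, but from the small base the article notes.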

First of all, the electrical system as a whole is very reliable in North Carolina, and costs to industrial customers are about 30% lower than the national average. Secondly, the state is beautiful (full disclosure: I am a proud former resident of the state). But there may be something more interesting at play, namely pig manure. One of the potential alternatives to natural gas that Apple is exploring is methane, a by-product that can be harvested from pig manure, which is plentiful in North Carolina (the state holds 14% of the nation’s commercial swine population).

More interesting still, Google (NASDAQ:GOOG) is already researching the potential of methane from pig manure as an alternative energy source, collaborating on a project with Duke University, Duke Energy, and a pig farm in Yadkinville, NC. The farm’s 65-kilowatt turbine generates enough electricity to power the plant itself and five of farm owner Bryant’s nine hog barns. Is this perhaps indicative of a race between Apple and Google to see who can tap the potential of pig manure first?

If Apple wants to qualify for NC’s renewable energy credits, it will need to begin transitioning to renewable sources. Its proposed massive solar panel facility will contribute to that, and I speculate that methane produced from pig manure will too. Extrapolating from North Carolina, perhaps other states with an abundance of energy-producing manure and lower-than-average energy costs (like Iowa) can expect interest from other tech giants and their fuel cells. With Apple behind this unprecedented fuel cell project, other companies are sure to follow.


Siri TV analysis: Apple TV, Siri could revolutionize television

Making the case for speech recognition in the living room…

Let’s face it — TV is boring. Not the 57 Channels and Nothin’ On kind of boring, the kind of boring you experience when you’ve been using the same old technology for far too long. It’s always the same story: a flat panel on a wall or table, a viewer on a couch and a remote control that connects them. Sure, remotes have changed over the years but regardless of how many buttons or touchscreens you slap on them, the way users interact with their televisions is the same. And that’s why Apple’s (AAPL) upcoming reentry into the living room is so compelling.

I could not possibly be less interested in an HDTV from Apple. I like my TVs and if the various rumors we’ve read for more than a year now are to be believed, I’m not convinced television hardware from Apple would offer any real advantages over market leaders like Samsung (005930) and LG (066570). What I am very interested in, however, is Siri.

Siri is pretty great on an iPhone. The ability to speak conversationally to perform functions, get answers or find points of interest is a nice value-add. Siri is somewhat less useful on an iPad. The functionality is the same but the tablet form factor makes using Siri much less convenient.

But Siri on a TV remote? Now we’re talking.

Using voice commands to interact with a cell phone is feasible only for a fraction of the functions one might perform with a smartphone. Siri’s utility is further limited by the user’s situation and surroundings. Speaking to a phone is not ideal while in a meeting, while eating in a restaurant, while in close quarters with other people, while exercising, or in countless other scenarios.

Using voice commands to control a TV is almost always feasible. Nearly every function one might perform while watching TV can be performed using voice commands. More importantly, voice commands would significantly simplify functions compared to a standard remote.

Think of the possibilities.

Do you want to see if The Cabin in the Woods is available on demand? Instead of trying to remember which channel Movies On Demand occupies and then scrolling through genres and movie titles, why not just tell your TV to “buy The Cabin in the Woods on demand”?

Do you want to program your DVR to record new Big Bang Theory episodes each week? Instead of navigating through menus to locate the search screen and then pushing a cursor around an on-screen keyboard until you find the show and set it to record, why not just say, “record all new episodes of Big Bang Theory“?

Did you see a commercial for the Snuggie and want to learn more about it? You could pull out your phone and search the Web or get up and do the same on a PC, or you could just tell your TV to “go to mysnuggiestore.com,” or “find reviews of the Snuggie.”

Want to find out what action movies are on while watching TV? You could scroll through a barely responsive on-screen guide for 10 minutes or use a companion app on a smartphone or tablet, or you could simply tell your TV to “list all action movies on TV right now.”

Want to tune into the Yankees game? Don’t bother trying to find the YES channel, just tell your television to “put on the Yankees game.”

I could go on.
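The commands above all share a pattern-plus-slots shape, which is roughly how a simple intent router can be sketched. This is a hypothetical illustration, not Apple’s or Siri’s implementation; the patterns and intent names are invented, and a real assistant would use statistical language understanding rather than regular expressions:

```python
import re

# Toy intent router for the TV voice commands discussed above.
INTENTS = [
    (re.compile(r"buy (?P<title>.+) on demand", re.I), "purchase_vod"),
    (re.compile(r"record all new episodes of (?P<show>.+)", re.I), "dvr_series"),
    (re.compile(r"go to (?P<url>\S+)", re.I), "open_url"),
    (re.compile(r"list all (?P<genre>\w+) movies on tv right now", re.I), "guide_filter"),
    (re.compile(r"put on the (?P<team>.+) game", re.I), "tune_event"),
]

def route(utterance):
    """Return (intent_name, slots) for the first matching pattern, else None."""
    for pattern, name in INTENTS:
        m = pattern.search(utterance)
        if m:
            return name, m.groupdict()
    return None

print(route("record all new episodes of Big Bang Theory"))
# ('dvr_series', {'show': 'Big Bang Theory'})
print(route("put on the Yankees game"))
# ('tune_event', {'team': 'Yankees'})
```

Each recognized utterance resolves to an action plus its parameters, which is exactly what makes voice simpler than menu navigation: the user states the goal and the slots arrive for free.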

Siri is pretty awesome, but I’m not sure it can reach its full potential on a smartphone without some slick integration — and the living room would be an amazing start.

Using Siri on a new type of remote or simply on an iPhone, iPod touch or iPad as a remote control tied to an Apple HDTV — or better yet, an updated version of the $99 Apple TV, which can be connected to any existing HDTV — would completely change the way we watch TV. And here in America, we watch a whole lot of TV.

Gesture controls baked into smart TVs are nifty, but they don’t always work well on the first try. More importantly, gesture support doesn’t really simplify anything as it stands today. That may change if we see cool tech that uses natural movements to perform tasks hit the market, but waving arms and hands around to move a cursor or mute a TV isn’t exactly natural.

And yes, there are some smart TVs already on the market that support voice controls. Saying “hi TV” and waiting five seconds before you can speak a command is hardly what we’re talking about here. And when the only supported commands are simply replacements for button presses like changing channels or opening a Web browser, it’s just an added layer of complexity that most people likely won’t use.

Siri is perfect for the living room and using it to interact with a TV would simplify the user experience as opposed to complicating it like most current solutions do. You don’t have to learn anything or remember any commands to interact with Siri. If you can talk, you can use Siri — and it doesn’t get any easier than that.  

November 27, 2012 at 9:55 AM
