Sunday, June 10, 2012

Automatic Speech-to-Text Transcription: Preliminary Results - OpenSpires

How well does speech recognition work for transcription? Here's a research project that was set up to find some answers.

As part of the SPINDLE project we are running a series of experiments to evaluate the use of Large Vocabulary Continuous Speech Recognition software for the automatic transcription of podcasts. We already know that automatic transcription will not be 100% accurate at the transcript level, but we expect it to be ‘good enough’ to enrich the existing metadata of the University podcasts with a set of keywords generated from the automatic transcripts.

Today we present some preliminary results for three podcasts already available on the University of Oxford Podcasts website. We used the Speech Analysis tool from Adobe Premiere Pro CS5 to automatically transcribe these three podcasts. We selected the English UK language option and the High (slower) quality parameter.

The Table below shows the characteristics of the three podcasts (title, duration, number of words in the manual transcription and number of words in the automatic transcription). We report the automatic transcription results in terms of Word Accuracy using the Levenshtein distance between the manual transcript and the automatic transcript (the higher the better).
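To make the metric concrete, here is a minimal sketch of one common way to compute Word Accuracy from a word-level Levenshtein distance. The post does not say how the transcripts were normalised or which exact formula was used, so the function names, the lowercasing and whitespace tokenisation, and the definition accuracy = 1 - distance / (number of reference words) are all assumptions made for illustration.

```python
def word_edit_distance(reference, hypothesis):
    """Levenshtein distance between two lists of words.

    Substitutions, insertions and deletions each cost 1.
    """
    # prev[j] holds the distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def word_accuracy(manual, automatic):
    """Word Accuracy of an automatic transcript against a manual one.

    Assumed definition: 1 - (edit distance / words in manual transcript).
    Normalisation (lowercase, split on whitespace) is also an assumption.
    """
    ref = manual.lower().split()
    hyp = automatic.lower().split()
    if not ref:
        raise ValueError("manual transcript is empty")
    return 1.0 - word_edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    manual = "the lecture starts at noon"
    automatic = "the lecture start at new"
    print(f"Word Accuracy: {word_accuracy(manual, automatic):.0%}")
```

Under this definition the score can drop below zero when the automatic transcript contains many inserted words, so it is not strictly a percentage of words recognised correctly.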

Analysing the results, we see that accuracy ranges from 17% to 56%. Why is accuracy so variable? Listening to the recordings and analysing the audio signals, we found that the recording conditions of the three podcasts differ greatly, and we consider this the main factor behind such different results.

The first podcast contains background noise and even a video conference speaker.

The second podcast has a really low signal.

The last podcast was professionally recorded and edited and therefore obtains the best results.

Other factors may also affect the accuracy of the automatic transcription, such as the podcast topic (language model), out-of-vocabulary words (dictionary) or accents (acoustic model).

In the following weeks we will report on how we generate keywords automatically from these automatic transcripts. Stay tuned!

via http://www.speechtechnologygroup.com/speech-blog
