How to improve SDK accuracy?

Printed From: MSSpeech-Forum
Category: Windows™ Speech Recognition Forums
Forum Name: Transcriptionists
Forum Description: Discussion group for Medical, Legal and other transcriptionists
URL: https://www.msspeech-forum.com/forum_posts.asp?TID=122
Printed Date: 16/Nov/2024 at 6:06am


Topic: How to improve SDK accuracy?
Posted By: carolfly
Date Posted: 22/Sep/2009 at 10:45am
Hi, I'm not sure this is the right place to ask a question about the MS Speech Engine SDK. Hopefully someone can help me with my problem.
I've been stuck on this problem for a very long time. I'm developing an application that takes audio files as input and generates a transcript from them using SAPI 5.1. However, the accuracy is very disappointing: it is almost always below 30%. Most of the time the engine just guesses at what was uttered in the audio file, even with a good-quality recording that has no background noise and standard pronunciation. I use a dictation grammar, and the WAV file format is 16-bit, 44,100 Hz, mono. Could anyone tell me what I should do to improve the accuracy? Or is it the nature of MS SAPI that it can only recognize a voice correctly after training? Is there any way to train the speech engine with an audio file that might include multiple speakers?
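
For anyone who wants to confirm what format a file really has before handing it to a recognizer, here is a minimal Python sketch that uses only the standard `wave` module. The helper names `describe_wav` and `write_silent_wav` and the demo file are assumptions made for illustration; nothing here is specific to SAPI.

```python
import os
import tempfile
import wave


def describe_wav(path):
    """Return (channels, bits_per_sample, sample_rate_hz) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        return wav.getnchannels(), wav.getsampwidth() * 8, wav.getframerate()


def write_silent_wav(path, channels=1, sample_width=2, rate=44100, seconds=1):
    """Write a silent PCM WAV file (hypothetical helper, used only for the demo)."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample: 2 means 16-bit
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (channels * sample_width * rate * seconds))


if __name__ == "__main__":
    # Create a 16-bit, 44,100 Hz, mono file and read its header back.
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
    write_silent_wav(demo_path)
    print(describe_wav(demo_path))
```

Pointing `describe_wav` at the real input file shows whether it matches the format described above.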



Replies:
Posted By: mmarkoe_admin
Date Posted: 22/Sep/2009 at 12:27pm
You have several serious hurdles which are difficult to impossible to overcome.
 
First of all, you need to understand that large-vocabulary speech recognition is speaker-dependent and works best when it is trained for an individual's unique voice.
 
Next, large-vocabulary speech recognition software not only looks for the sounds of each word, but also compares each word to the words around it for context clues. For example, I dictate the following correctly every time: "Two boys went to see a doctor because they ate too much food." Context is how it knows which of "to," "two," or "too" to use.
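
To make the context idea concrete, here is a toy Python sketch that chooses among homophones by scoring each candidate against its neighbouring words with bigram counts. The tiny corpus, the `HOMOPHONES` set, and the function names are all made up for illustration. A real dictation engine relies on far larger statistical models, and this is not a description of how SAPI is implemented.

```python
from collections import Counter

# Tiny made-up training corpus (assumption: for illustration only).
CORPUS = [
    "two boys went to see a doctor",
    "they ate too much food",
    "two girls went to see a film",
    "he ate too much cake",
    "i want to see two dogs",
]

# Words that sound alike and must be told apart by context.
HOMOPHONES = {"to", "two", "too"}


def bigram_counts(sentences):
    """Count adjacent word pairs, with <s> and </s> marking sentence edges."""
    counts = Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        counts.update(zip(words, words[1:]))
    return counts


def pick_homophones(words, counts):
    """Replace each homophone with the candidate that best fits its neighbours."""
    padded = ["<s>"] + list(words) + ["</s>"]
    for i in range(1, len(padded) - 1):
        if padded[i] not in HOMOPHONES:
            continue
        left, right = padded[i - 1], padded[i + 1]

        def score(candidate):
            # How often the candidate followed `left` plus how often it preceded `right`.
            return counts[(left, candidate)] + counts[(candidate, right)]

        padded[i] = max(sorted(HOMOPHONES), key=score)
    return padded[1:-1]


if __name__ == "__main__":
    heard = "to boys went two see a doctor because they ate to much food".split()
    print(" ".join(pick_homophones(heard, bigram_counts(CORPUS))))
```

On this toy corpus the three misheard slots come back as "two", "to", and "too". With conversational speech, where phrases are broken up or mumbled, there is less reliable context to score against.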
 
In other words, you cannot input conversational speech (directly through a microphone or indirectly through a digital recording) and expect high accuracy. High accuracy is attained when each word is enunciated clearly, when words are spoken in phrases that provide context, and when the speaker uses spoken punctuation.
 
I hope this helps you understand why you have not been successful.
 
Marty


Posted By: carolfly
Date Posted: 23/Sep/2009 at 10:22am
Thanks a lot for your reply. Actually, I used to think it was impossible to improve the accuracy, since you cannot train the engine with audio files. However, my friend showed me a program named docsoft; with the same audio file, it produced a much better transcript than the MS engine did. I also noticed that Dragon NaturallySpeaking has promising accuracy. I'm starting to wonder whether the fault lies in how I'm using the MS engine, or whether the MS engine itself is unable to produce an accurate transcript.


Posted By: mmarkoe
Date Posted: 23/Sep/2009 at 4:36pm
The best way to tell is to run a test with a single digital recording file in each program. Compare the results and you will know immediately.
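
One way to put a number on such a comparison is word error rate (WER): the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words. Below is a minimal Python sketch. The function name `word_error_rate` and the sample sentences are assumptions for illustration, and the accuracy percentages quoted in this thread may have been measured differently.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from the output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "two boys went to see a doctor"
    # Hypothetical recognizer output for the same recording.
    hypothesis = "to boys went to sea a doctor"
    print(f"WER: {word_error_rate(reference, hypothesis):.2%}")
```

Running the same recording through each program and scoring both outputs against one hand-typed reference gives directly comparable figures.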
 
Marty


-------------
Marty Markoe, MVP
Microsoft Valued Partner
See us at: http://www.mymsspeech.com


Posted By: carolfly
Date Posted: 24/Sep/2009 at 10:34am
I've tested the same audio file on docsoft and in my application. The accuracy of docsoft is much better, above 80%, but my application captured only some of the words, and most of those are incorrect. I checked my code again and again, and I'm sure I followed every step, as follows:
1) create an SpInprocRecognizer,
2) create a context,
3) create a grammar (I've tried grammarid values from 0 to 10, but none of them gave good accuracy),
4) load the dictation grammar,
5) load the audio file.
The SpeechRecognized event fired as well, and I use PhraseInfo.GetText to get the recognized results. However, the accuracy is still very bad. I can't believe the accuracy of the engine is as low as this. Could you tell me what I did wrong?


Posted By: mmarkoe_admin
Date Posted: 24/Sep/2009 at 3:05pm
See my response above, posted 22/Sep/2009 at 12:27pm. It explains why you cannot get good accuracy.
 
Marty


Posted By: srinwantudey
Date Posted: 16/Nov/2009 at 5:06am
Dear carolfly,
I suggest you post your code here so that we can give tips to improve it.


