<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/xsl" href="RSS_xslt_style.asp" version="1.0" ?>
<rss version="2.0" xmlns:WebWizForums="https://syndication.webwiz.net/rss_namespace/">
 <channel>
  <title>MSSpeech Forum : Speech Recognition for a wave file</title>
  <link>https://www.msspeech-forum.com/</link>
  <description><![CDATA[This is an XML content feed of: MSSpeech Forum : New Users &amp; General Questions : Speech Recognition for a wave file]]></description>
  <copyright>Copyright (c) 2006-2013 Web Wiz Forums - All Rights Reserved.</copyright>
  <pubDate>Wed, 29 Apr 2026 15:37:10 +0000</pubDate>
  <lastBuildDate>Tue, 07 Apr 2009 09:25:50 +0000</lastBuildDate>
  <docs>http://blogs.law.harvard.edu/tech/rss</docs>
  <generator>Web Wiz Forums 12.02</generator>
  <ttl>360</ttl>
  <WebWizForums:feedURL>https://www.msspeech-forum.com/RSS_post_feed.asp?TID=72</WebWizForums:feedURL>
  <image>
   <title><![CDATA[MSSpeech Forum]]></title>
   <url>https://www.msspeech-forum.com/forum_images/msspeech_forum.png</url>
   <link>https://www.msspeech-forum.com/</link>
  </image>
  <item>
   <title><![CDATA[Speech Recognition for a wave file :   gkhanna wrote:Currently I am...]]></title>
   <link>https://www.msspeech-forum.com/speech-recognition-for-a-wave-file_topic72_post330.html#330</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="https://www.msspeech-forum.com/member_profile.asp?PF=7">mmarkoe</a><br /><strong>Subject:</strong> 72<br /><strong>Posted:</strong> 07/Apr/2009 at 9:25am<br /><br /><table width="99%"><tr><td class="BBquote"><img src="forum_images/quote_box.png" title="Originally posted by gkhanna" alt="Originally posted by gkhanna" style="vertical-align: text-bottom;" /> <strong>gkhanna wrote:</strong><br /><br />I am currently working on a project to recognize the speech in a wave file. I am using the SpeechRecognitionEngine class in .NET Framework 3.5. I want to display each word that is said in the wave file (something like subtitles in movies, where every spoken sentence is displayed as text below the picture). The wave file could contain a person's voice or a song. When I use the Recognize function, it only displays the first 4-5 words from the wave file, and they are completely wrong, with a confidence level of 0.004. When I try to do it asynchronously (the RecognizeAsync function), no speech is detected at all. 
I have also tried SAPI 5.3, but I get wrong results.</td></tr></table> <DIV>&nbsp;</DIV><DIV>It sounds to me as if you have been watching too many Star Wars movies and forgotten that this is the 21st century, not the 25th century. There's not enough computing power in the world to do what you are asking for. Large vocabulary speech recognition today is capable of very high accuracy. However, it is speaker dependent. This means that to work optimally (I would say usefully, to the point where you can understand what speakers say) it needs to be trained to an individual's voice. Speech recognition software does not just listen for the sounds of words; it also compares each word to the words around it for context clues.</DIV><DIV>&nbsp;</DIV><DIV>Therefore, large vocabulary speech recognition works best when:</DIV><OL><LI>An individual speaker creates a user training profile that is unique to their voice's audio quality and includes samples of their typical syntax.</LI><LI>The speaker enunciates each word clearly.</LI><LI>The speaker speaks in phrases.</LI><LI>The speaker uses punctuation marks while speaking to indicate pauses and the beginnings of sentences and paragraphs.</LI></OL><P>It is highly unlikely that a public speaker would be willing to do the above. Also, when you try to decipher more than one voice, the task becomes geometrically more difficult.</P><P>Marty Markoe</P>]]>
   </description>
   <pubDate>Tue, 07 Apr 2009 09:25:50 +0000</pubDate>
   <guid isPermaLink="true">https://www.msspeech-forum.com/speech-recognition-for-a-wave-file_topic72_post330.html#330</guid>
  </item> 
  <item>
   <title><![CDATA[Speech Recognition for a wave file :  Currently I am working on a...]]></title>
   <link>https://www.msspeech-forum.com/speech-recognition-for-a-wave-file_topic72_post329.html#329</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="https://www.msspeech-forum.com/member_profile.asp?PF=103">gkhanna</a><br /><strong>Subject:</strong> 72<br /><strong>Posted:</strong> 07/Apr/2009 at 2:55am<br /><br /><DIV =ForumPostText id=ctl00_ctl01_bcr_SinglePostView___PostViewWrapper><P>I am currently working on a project to recognize the speech in a wave file. I am using the SpeechRecognitionEngine class in .NET Framework 3.5. I want to display each word that is said in the wave file (something like subtitles in movies, where every spoken sentence is displayed as text below the picture). The wave file could contain a person's voice or a song. When I use the Recognize function, it only displays the first 4-5 words from the wave file, and they are completely wrong, with a confidence level of 0.004. When I try to do it asynchronously (the RecognizeAsync function), no speech is detected at all. I have also tried SAPI 5.3, but I get wrong results.</P><P>Below is my code:</P><DIV style="COLOR: black; : white"><PRE><SPAN style="COLOR: blue">Private</SPAN> <SPAN style="COLOR: blue">Sub</SPAN> Form1_Load(<SPAN style="COLOR: blue">ByVal</SPAN> sender <SPAN style="COLOR: blue">As</SPAN> System.Object, <SPAN style="COLOR: blue">ByVal</SPAN> e <SPAN style="COLOR: blue">As</SPAN> System.EventArgs) <SPAN style="COLOR: blue">Handles</SPAN> <SPAN style="COLOR: blue">MyBase</SPAN>.Load
    <SPAN style="COLOR: blue">Dim</SPAN> rec <SPAN style="COLOR: blue">As</SPAN> <SPAN style="COLOR: blue">New</SPAN> Speech.Recognition.SpeechRecognitionEngine
    <SPAN style="COLOR: blue">Dim</SPAN> fs <SPAN style="COLOR: blue">As</SPAN> <SPAN style="COLOR: blue">New</SPAN> FileStream(<SPAN style="COLOR: #a31515">"C:\Test.wav"</SPAN>, FileMode.Open, FileAccess.Read)
    rec.SetInputToWaveStream(fs)
    rec.LoadGrammar(<SPAN style="COLOR: blue">New</SPAN> DictationGrammar)
    <SPAN style="COLOR: blue">AddHandler</SPAN> rec.SpeechRecognized, <SPAN style="COLOR: blue">AddressOf</SPAN> SpeechRecognized_Click
    <SPAN style="COLOR: blue">AddHandler</SPAN> rec.RecognizeCompleted, <SPAN style="COLOR: blue">AddressOf</SPAN> RecognizedComplete_Click
    rec.RecognizeAsync(RecognizeMode.Multiple)
<SPAN style="COLOR: blue">End</SPAN> <SPAN style="COLOR: blue">Sub</SPAN>

<SPAN style="COLOR: blue">Private</SPAN> <SPAN style="COLOR: blue">Sub</SPAN> SpeechRecognized_Click(<SPAN style="COLOR: blue">ByVal</SPAN> sender <SPAN style="COLOR: blue">As</SPAN> <SPAN style="COLOR: blue">Object</SPAN>, <SPAN style="COLOR: blue">ByVal</SPAN> e <SPAN style="COLOR: blue">As</SPAN> SpeechRecognizedEventArgs)
    <SPAN style="COLOR: blue">Dim</SPAN> result <SPAN style="COLOR: blue">As</SPAN> RecognitionResult = e.Result
    RichTextBox1.AppendText(result.Text)
<SPAN style="COLOR: blue">End</SPAN> <SPAN style="COLOR: blue">Sub</SPAN>

<SPAN style="COLOR: blue">Private</SPAN> <SPAN style="COLOR: blue">Sub</SPAN> RecognizedComplete_Click(<SPAN style="COLOR: blue">ByVal</SPAN> sender <SPAN style="COLOR: blue">As</SPAN> <SPAN style="COLOR: blue">Object</SPAN>, <SPAN style="COLOR: blue">ByVal</SPAN> e <SPAN style="COLOR: blue">As</SPAN> RecognizeCompletedEventArgs)
    <SPAN style="COLOR: blue">Dim</SPAN> result <SPAN style="COLOR: blue">As</SPAN> RecognitionResult = e.Result
    MsgBox(result.Text)
<SPAN style="COLOR: blue">End</SPAN> <SPAN style="COLOR: blue">Sub</SPAN></PRE><SPAN style="COLOR: blue"></SPAN></DIV><DIV style="COLOR: black; : white"><SPAN style="COLOR: blue"><FONT color=#000000>The link below contains the wave file that I am giving as input. 
</FONT></SPAN></DIV><DIV style="COLOR: black; : white"><SPAN style="COLOR: blue"><FONT color=#000000></FONT></SPAN>&nbsp;</DIV><DIV style="COLOR: black; : white"><SPAN style="COLOR: blue"><a href="http://gauravkhanna.blog.co.in/files/2009/04/voicedemo.zip" target="_blank"><U><FONT color=#0066cc>http://gauravkhanna.blog.co.in/files/2009/04/voicedemo.zip</FONT></U></A></SPAN></DIV><DIV style="COLOR: black; : white"><PRE>Thanks,</PRE><PRE>Gaurav Khanna<BR></PRE></DIV></DIV>]]>
   </description>
   <pubDate>Tue, 07 Apr 2009 02:55:52 +0000</pubDate>
   <guid isPermaLink="true">https://www.msspeech-forum.com/speech-recognition-for-a-wave-file_topic72_post329.html#329</guid>
  </item> 
 </channel>
</rss>