Speech Recognition for a wave file
gkhanna
Member Joined: 07/Apr/2009 Status: Offline Points: 1
Posted: 07/Apr/2009 at 2:55am
I am currently working on a project that recognizes the speech in a wave file. I am using the SpeechRecognitionEngine class in .NET Framework 3.5. I want to display each word that is said in the wave file (something like the subtitles in movies, where every spoken sentence is displayed as text below the picture). The wave file could contain a person's voice or a song.

When I use the Recognize function, it returns only the first 4-5 words from the wave file, and they are totally wrong, with a confidence level of 0.004. When I try to do it asynchronously (the RecognizeAsync function), speech is not detected at all. I have also tried SAPI 5.3, but I get wrong results.

Below is my code:

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        Dim rec As New Speech.Recognition.SpeechRecognitionEngine
        Dim fs As New FileStream("C:\Test.wav", FileMode.Open, FileAccess.Read)
        rec.SetInputToWaveStream(fs)
        rec.LoadGrammar(New DictationGrammar)
        AddHandler rec.SpeechRecognized, AddressOf SpeechRecognized_Click
        AddHandler rec.RecognizeCompleted, AddressOf RecognizedComplete_Click
        rec.RecognizeAsync(RecognizeMode.Multiple)
    End Sub

    Private Sub SpeechRecognized_Click(ByVal sender As Object, ByVal e As SpeechRecognizedEventArgs)
        Dim result As RecognitionResult = e.Result
        RichTextBox1.AppendText(result.Text)
    End Sub

    Private Sub RecognizedComplete_Click(ByVal sender As Object, ByVal e As RecognizeCompletedEventArgs)
        Dim result As RecognitionResult = e.Result
        MsgBox(result.Text)
    End Sub

The link below points to the wave file that I am giving as input.
Thanks,
Gaurav Khanna
mmarkoe
Moderator Group Joined: 24/Jul/2008 Status: Offline Points: 210
It sounds to me as if you have been watching too many Star Wars movies and have forgotten that this is the 21st century, not the 25th century. There's not enough computing power in the world to do what you are asking for.

Large vocabulary speech recognition today is capable of very high accuracy. However, it is speaker dependent. This means that to work optimally (I would say usefully, to the point where you can understand what speakers say) it needs to be trained to an individual's voice. Speech recognition software does not just listen for the sounds of words; it also compares each word to the words around it for context clues.
Therefore, large vocabulary speech recognition works best:
It is highly unlikely that a public speaker would be willing to do the above. Also, when you try to decipher more than one voice, the task becomes geometrically more difficult.

Marty Markoe
Marty Markoe, MVP
Microsoft Valued Partner
See us at: http://www.mymsspeech.com