I'm keen to develop an intuitive and usable front end for the excellent speech recognition engine that ships with Vista, with the following key considerations:
- Targeted primarily towards an able-bodied user (but usable in an alternate configuration by a disabled user)
- Striking an optimal balance between speech and manual input
e.g. keys to turn the microphone on and off, hold down a key while dictating a name, hold down a key while dictating a command, use keys to navigate text by phrase, select alternatives, etc.
- Dictate seamlessly into any application, by dictating into an on-screen 'buffer' which then (on demand) dumps its content into the active window. This buffer would be something like NotePad, but designed specifically for synthesising keyboard and speech input.
- Provide an integrated and context sensitive GUI
e.g. the list of active commands would display/toggle in a semi-transparent overlay
- A powerful macro scripting language (such as AutoHotKey.com)
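To make the dictation-buffer idea concrete, here is a minimal sketch of how such a staging buffer might behave. All names here are hypothetical; a real front end would wire this to the speech engine and to keyboard hooks, and would dump the text into the active window via the clipboard or synthesised keystrokes:

```python
class DictationBuffer:
    """Hypothetical in-memory staging buffer for dictated text.

    Phrases are kept as a list so the user can navigate and correct
    by phrase (via keys) before the whole buffer is dumped, on demand,
    into the active window.
    """

    def __init__(self):
        self.phrases = []   # recognised phrases, in order
        self.cursor = -1    # index of the currently selected phrase

    def dictate(self, phrase):
        """Append a recognised phrase and select it."""
        self.phrases.append(phrase)
        self.cursor = len(self.phrases) - 1

    def previous_phrase(self):
        """Move the selection one phrase back (bounded at the start)."""
        self.cursor = max(0, self.cursor - 1)
        return self.phrases[self.cursor]

    def select_alternative(self, replacement):
        """Replace the selected phrase, e.g. with a recognition alternative."""
        self.phrases[self.cursor] = replacement

    def dump(self):
        """Return the buffer contents for pasting, then clear the buffer.

        A real implementation would send this to the active window,
        e.g. via the clipboard and a synthesised Ctrl+V.
        """
        text = " ".join(self.phrases)
        self.phrases.clear()
        self.cursor = -1
        return text
```

The point of the sketch is the separation of concerns: recognition fills the buffer, manual keys navigate and correct it, and only an explicit "dump" touches the target application.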
I can see a genuine need, and an opportunity to create a premium product to address this need.
Much as I would love to build this myself, it is not practical for me. RSI may be no barrier to long forum posts, but it is a show-stopper for coding.
I would like to throw this open for discussion, and get a temperature reading from the speech recognition community (developers and users).
If you are working with speech technology and fancy getting involved in this effort, please post a reply. If we hit critical mass, we can do something.
If you fancy taking on the core development, please involve me - post a reply or send me an e-mail. I can see a clear path to an effective solution.
Sam
PS The existing Vista SR front end is, for me, little short of a nightmare. Please don't get me wrong when I say this; I mean to be objective and constructive. This is an honest reflection of my experience with this product (and every other speech recognition product I have ever used).
- It allows practically no customisation.
- It intersperses commands with dictation. E.g. every time I say 'speech recognition', it detects a match on the webpage with 'WSR windows speech recognition help' and changes the webpage. After losing my post three times, I'm now dictating into Notepad and using copy/paste.
- It is specifically targeted at people with disabilities, and hence tries to accomplish everything through speech rather than focusing on an optimal balance between speech and manual input.
- It is unresponsive in a noisy environment. Clearly the code locks up as it tries to decipher a noisy signal, which means the user interface (i.e. the single on/off button for the microphone) does not respond, sometimes for several minutes.
- It does not work (or only half-works) in certain contexts/apps.
- etc
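The unresponsiveness complaint above is essentially an architectural one: if decoding runs on the same thread as the UI, a noisy signal freezes the microphone button. A sketch of the fix, under the assumption that the front end controls its own threading (the recogniser below is a stand-in for a real speech engine):

```python
import queue
import threading


class MicController:
    """Sketch of a mic toggle that never blocks on recognition.

    The UI thread only flips a flag; decoding happens on a worker
    thread, so a noisy signal cannot freeze the on/off button.
    """

    def __init__(self, recognise):
        self.recognise = recognise          # slow, possibly blocking call
        self.mic_on = threading.Event()     # starts off
        self.audio = queue.Queue()          # captured audio chunks
        self.results = queue.Queue()        # recognised text
        self.worker = threading.Thread(target=self._decode_loop, daemon=True)
        self.worker.start()

    def toggle(self):
        """Instantaneous, regardless of what the decoder is doing."""
        if self.mic_on.is_set():
            self.mic_on.clear()
        else:
            self.mic_on.set()

    def feed(self, chunk):
        """Called by the audio capture layer; drops input while the mic is off."""
        if self.mic_on.is_set():
            self.audio.put(chunk)

    def _decode_loop(self):
        """Worker thread: decode chunks as they arrive."""
        while True:
            chunk = self.audio.get()
            self.results.put(self.recognise(chunk))
```

Whether Vista's engine can actually be driven this way from a third-party front end is an open question, but the principle - never let decoding block the control surface - holds for any design we adopt.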
I'm starting to realise that the team at Microsoft developing this technology is opening channels of communication with its user base.
If it is of use to someone, I would be happy to record a diary of every frustrating incident over several days and categorise the incidents into key areas.