The Spokesman-Review Newspaper
Spokane, Washington  Est. May 19, 1883

Digital Dictation: Speech-Recognition Software Not Quite Ready To Hammer Out A Great American Novel

Speech-recognition technology is beginning to change the way people interact with computers and other wired systems. However, development still has a ways to go.

Current off-the-shelf speech-recognition software still has spotty accuracy, and, for many, regular keyboards are a better alternative.

For example, I recently tested the Office version of IBM’s ViaVoice ‘98 ($110).

The minimum requirements for ViaVoice, similar to those of Dragon Systems' Naturally Speaking, the market leader, are a 166-megahertz processor, 32 megabytes of memory and 180 megabytes of hard-disk space.

ViaVoice is easy to install, but getting it to work well takes time.

After ViaVoice - which comes with a microphone headset - is running, an hour or more is needed to get the software somewhat familiar with your speech patterns. This involves repeating hundreds of scripted text passages to give ViaVoice enough of a sample for vocal analysis. As you speak into the microphone, the computer types the words on the screen.

At first, I read 50 sentences from “Deep Blue vs. Garry Kasparov” into the computer so the software could train itself to recognize my voice. An on-screen message said I had recorded enough sentences to train the system. However, when I began dictating my own notes into Word 97, the accuracy was so poor it was readily apparent the program needed to further its education.

Despite extra training, ViaVoice still garbled a number of my words. On average, each sentence that appeared on the screen had at least one error.

However, ViaVoice has features that allow users to expand the vocabulary and correct errors, so the more the program is used, the less likely it is to misidentify words.

I found ViaVoice works best with controlled bursts of speech - 10 to 15 words - and pauses, rather than reading or talking non-stop.

Overall, the product does not yet seem reliable or efficient enough for dictating an article, especially on deadline. But, to the chagrin of some co-workers who don’t like hearing me talk to my computer, I will use ViaVoice for transcribing notes, quotes and other work-related ideas.

It’s an ever-expanding field

Speech-recognition software products extend far beyond desktop computers. Steve Simmons, a computer science instructor at Eastern Washington University, notes that many other programs are being developed for use in a variety of fields.

Currently, the largest market for speech-recognition technology is for fielding phone calls.

The Social Security Administration, for instance, uses a basic speech-recognition program for callers requesting forms or an account history.

Separately, an answering system used by some smaller firms answers calls, then routes callers to the proper extension once they say a name. It also handles voice mail and allows users to dial up to 150 phone numbers by just saying a name, such as “call Chuck.” The program, called Wildfire, is made by a Massachusetts company of the same name.

While the speech-recognition software market has seen solid growth in recent years, Information Associates, a Lexington, Mass.-based research firm, estimates the total market - for the desktop, telephones and other applications - is only about $500 million.

Walt Tetschner, an associate with the company, said speech-recognition devices will likely make their greatest inroads for telephone systems and other niche markets.

As for the desktop, Tetschner said, “Anyone who’s predicting that this will be a primary-user interface is in a different world.”

Still, Tetschner noted that speech recognition in the office environment is not a dying industry, and in many cases, ViaVoice and Naturally Speaking can be quite useful. He said the programs are especially helpful for workers unable to use a keyboard.

Simmons, the EWU instructor, is working on a speech-recognition project of his own, designed, for instance, to assist repair workers in noisy surroundings. Workers would be able to keep their hands free and ask a laptop computer for the next step in a repair process. He said such niche products likely hold more promise than desktop speech-recognition software.

Simmons said he plans to raise the accuracy rate on his product by limiting the recognizable vocabulary to 5,000 to 10,000 words.

The 64,000 words initially recognized by products like Naturally Speaking and ViaVoice contribute in part to higher error rates, he said.
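
Simmons' point about vocabulary size can be illustrated with a toy sketch. This is not how ViaVoice, Naturally Speaking or Simmons' project work: real recognizers compare sounds, not spellings, and the word lists, the garbled input "nezt" and the function names below are all invented for illustration. The sketch only assumes that a misheard word could be any vocabulary entry within one letter of what was captured, and shows that a larger word list leaves more look-alike candidates to choose from.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-letter edits to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete a letter from a
                curr[j - 1] + 1,           # insert a letter into a
                prev[j - 1] + (ca != cb),  # substitute (free if letters match)
            ))
        prev = curr
    return prev[-1]


def candidates(heard, vocab, max_dist=1):
    """Vocabulary words close enough to the garbled input to be plausible."""
    return [word for word in vocab if edit_distance(heard, word) <= max_dist]


# Hypothetical word lists: a small repair-shop vocabulary and a larger one
# that adds similar-looking words. Spelling stands in for sound here.
SMALL_VOCAB = ["next", "step", "repeat", "back", "done"]
LARGE_VOCAB = SMALL_VOCAB + ["text", "nest", "neat", "stop", "pack"]

heard = "nezt"  # a made-up garbled capture of the word "next"
print(candidates(heard, SMALL_VOCAB))  # ['next']
print(candidates(heard, LARGE_VOCAB))  # ['next', 'nest', 'neat']
```

With the small list, the garbled word has only one plausible match. With the larger list, the program has to pick among three, and every extra look-alike is another chance to guess wrong.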

While Simmons acknowledges that voice recognition has its technical shortcomings, he is more bullish on desktop usage than Tetschner. He argues that its growth pattern will be similar to that of Apple computers: It will start with those who need it, such as people with disabilities or, in Apple's case, designers, and as processor speeds continue to increase, the products will become better suited to desktop use.

Those days, however, are at least three to four years away, Simmons predicts.