The History of Voice Recognition Software

Description:

As you listen to your credit card company's automated phone voice asking you to describe your reason for calling, you probably don't care how the system originated. Why should you? The technology is ubiquitous these days. 

With most technological innovations, the average person simply wants to use them. The newer the toy, the sooner they want to avail themselves of its promised joy. For commercial entities, the question comes down to how much the newfangled device will contribute to a favorable bottom line.

Stepping Stones

The versatility of speech-recognition devices may be modern, but inventive thinkers in Austria, Russia and England were giving form to the concept of mechanical speaking machines as early as the eighteenth century.

For the purpose of voice recognition via electronic equipment, scientists experimenting at Bell Laboratories developed an electronic speech analyzer called a Vocoder in 1928. Homer W. Dudley, the lead scientist, further developed the Vocoder into an electronic synthesizer operated through a keyboard and dubbed it the Voder. Exhibitors from Bell demonstrated the device at the 1939 New York World's Fair. 

The rudiments of modern techniques for speech perception and recognition lie in a talking machine called the Pattern Playback. Built by Franklin S. Cooper and his collaborators at Haskins Laboratories in the late 1940s, the Playback could convert spectrograms, pictures of the frequency spectrum of speech, back into audible speech sounds.

The 1950s to the late 1970s saw the creation of the first formant synthesizer, the articulatory synthesizer and concatenative synthesis. These advancements permitted the phonetic quality of vowels to be determined, a major step. Additionally, work on continuous speech recognition, which eliminates the requirement to pause between words, began during this period.

In 1988, Apple Computer produced a futuristic video that envisioned the practical application of speech with computers. Set in the early 21st century, it introduced the ideas of a Speech User Interface and a Multimodal User Interface, combined with intelligent voice-enabled agents.

One underlying contribution to voice recognition not to be overlooked is the Hidden Markov Model. Developed in the late 1960s, the HMM permitted researchers to unite diverse sources of knowledge, including language, syntax and acoustics, into one probabilistic model. While the technique wasn't sophisticated enough to encompass a wide range of human language characteristics, it did serve as the dominant speech-recognition algorithm during the 1980s.

Stumbling Blocks

During most of the 20th century, the success of voice recognition researchers was hampered by limited computing power. Consider that the best available time-sharing system from 1964 to 1983 was the Digital Equipment Corporation's PDP-10, a 36-bit mainframe computer. Still, the system's interactive capability was a real advantage to the scientists in pursuing their goals. In the 1990s, some of those goals began to bear fruit.

Listen for It

Technological capabilities became more versatile. Computers were gaining in popularity with the consumer public, and manufacturers were trying to outdo each other in offering faster systems. A monumental development in speech recognition came from Carnegie Mellon University in 1992. Under the leadership of Xuedong Huang, a CMU research team created the SPHINX-II system. Not only was the system capable of speaker-independent recognition with a 5,000-word vocabulary, it also boasted a mere five percent error rate.

During the 1990s, the first commercially successful speech recognition programs were also introduced. The typical system's vocabulary was more expansive than that of the average human. Dragon Systems was a leading vendor with its multiple professional Dragon products. Its technology eventually passed to ScanSoft, which merged with Nuance in 2005 and adopted the Nuance name. Initially, Apple licensed software from Nuance to give voice to its digital assistant Siri.

Advanced voice recognition is now commonplace in diverse channels. Its usage is a reality of the 21st century.

*by andreascy*