The airport in El Borma, Tunisia, is fogbound. A dispatcher must reroute military traffic for the U.S. Transportation Command. He looks down at his computer screen, but instead of typing at the keyboard he speaks into a microphone.
"Give me a list of all airports in Tunisia," he says. His words appear instantaneously on the computer screen. Within seconds, his answer, a long list of airport locations, also appears. "Which airport is closest to El Borma?" he asks. Again, the system responds within seconds. The dispatcher takes the data, plots alternate routes and then sends out alerts that Transportation Command outposts around the world could receive.
For the last two decades, researchers have repeatedly proclaimed that they were on the verge of producing computer systems that could hear and respond to spoken commands. Performance has seldom matched the promise.
Now, several companies and universities have developed speech recognition systems that are proving themselves capable of performing tasks useful to business and the military. The military routing system is already being tested by Bolt, Beranek & Newman Inc., a computer research concern here with $260 million in annual revenues. Within a year, the company expects to have a prototype ready for the military.
The concern's system and others built under grants from the Defense Advanced Research Projects Agency, or DARPA, for the first time couple the ability to convert speech into electronic text with the artificial intelligence to understand that text. Though systems that accept spoken data are already for sale, these are the first in which computers answer spoken questions.
Bolt, Beranek runs its speech recognition software on Sun Microsystems workstations, which are powerful desktop computers. But the software can operate on other workstations that use the Unix operating system. The concern intends for its speech programs to run eventually on personal computers.
Bolt, Beranek is not alone in its work. Similar projects under DARPA's Spoken Language Systems initiative, with an annual budget of more than $5 million, are under way at SRI International in Menlo Park, Calif., the Massachusetts Institute of Technology and Carnegie-Mellon University. Researchers say that commercial products will be available in a year or two, even before the military could put its transport system into service.
SRI has a test version of an air travel system that lets a user say: "I want to fly from Boston to San Francisco on July 4. What flights are available?" In seconds, the information is pulled from a data base and displayed on the screen. "This is a prototype of the future of computing," said Alex Rudnicky, a researcher and systems scientist at Carnegie-Mellon. "It's a step toward the free interaction with the computer by voice."
Charles Wayne, program manager for the Spoken Language Systems initiative at DARPA, expects that computers that can listen and respond will prove critical to the military. Tank commanders or fighter pilots could converse with a computer without touching a keyboard. "A pilot in a combat situation can't be looking down at a CRT in his cockpit," Mr. Wayne said, referring to a cathode ray tube, or computer screen. "But he could speak to his computer and ask, 'What is my fuel level?' Information could be gotten more quickly and easily using voice."
Mr. Wayne, though pleased with the progress, said the technology was "nowhere near where we want to take this." Critics say this work is still a long leap from computers that can fully converse with a user on any subject. Until computers are infused with a wide breadth of human knowledge -- currently a technological impossibility -- researchers must limit their programs to narrow domains of knowledge.
Even these first steps have attracted the interest of several major corporations. Bolt, Beranek is discussing a partnership with the Digital Equipment Corp., and SRI is talking to American Airlines about giving its reservations system the ability to understand speech. Mr. Rudnicky said that Apple Computer was working on a voice system for offices and was evaluating the research at Carnegie-Mellon.
"Voice communication is absolutely crucial," said Kai-Fu Lee, manager of the speech and language technology group at Apple. Computers have already become more flexible, responding as readily to a click of a mouse as to typed instructions, he said. "The next step will be delegation, the computer acting as an agent that knows what you want and how to do it," he added. "And we don't just want voice input, we want voice output as well."
To this end, Apple includes stereo speakers in some of its computers, and plans to add microphones to future models. Although Bolt, Beranek's speech recognition system does not read messages aloud, the necessary equipment can be easily added.
Systems currently available from such pioneers as Kurzweil Applied Intelligence and Dragon Systems Inc. can display spoken words on a computer screen. The Bolt, Beranek approach takes speech recognition to its next logical step, providing answers to spoken queries.
The use of natural, spoken language, instead of typed commands in a specialized language, to converse with computers has been one of the industry's most difficult problems. Researchers have long sought a solution to what John Oberteuffer, an analyst with Voice Information Associates in Lexington, Mass., called the "bottleneck of the keyboard."
Bolt, Beranek has been at work on speech processing for 20 years, said Michael Krasner, manager of the company's speech and natural language processing department. The current project began in 1986 under a DARPA grant but has made real progress only in the last two years, Mr. Krasner said.
To compare the progress of several contractors in the program, DARPA asked each to develop an air travel information system (which is what SRI is offering to American Airlines). Each of the contractors has also worked on its own special project. Bolt, Beranek is adding speech recognition to the military's automated system for routing and keeping track of personnel and supplies, so that Transportation Command planners with no computer experience can speak to a computer and get the information they need immediately.
Bolt, Beranek's voice system matches the spoken word with a model composed of the individual sounds that make up that word. When a user asks a question, the computer generates its best estimate of what was said. To guard against misunderstandings, the computer also displays the next 10 best possibilities, from which the speaker can choose.
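The ranking step described above, a best estimate followed by the next 10 possibilities, can be illustrated with a short sketch. It assumes the recognizer has already produced candidate transcriptions with numeric scores; the function name, the score scale and the tuple layout are hypothetical and are not drawn from Bolt, Beranek's software.

```python
import heapq
from typing import List, Tuple

# A hypothesis is (transcription, score); higher scores mean a better match.
# This layout is an assumption made for illustration only.
Hypothesis = Tuple[str, float]


def best_with_alternatives(
    hypotheses: List[Hypothesis], n_alternatives: int = 10
) -> Tuple[str, List[str]]:
    """Return the top-scoring transcription and up to n_alternatives runners-up."""
    if not hypotheses:
        raise ValueError("the recognizer produced no hypotheses")
    # Keep only the best estimate plus the requested number of alternatives.
    ranked = heapq.nlargest(n_alternatives + 1, hypotheses, key=lambda h: h[1])
    best = ranked[0][0]
    alternatives = [text for text, _ in ranked[1:]]
    return best, alternatives
```

In a display like the one described, the first value would be shown as the system's guess and the list would be offered to the speaker as choices.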
The system understands continuous sentences from virtually any English speaker without a heavy accent; most commercially available systems require speakers to awkwardly pause about a quarter of a second between words. Bolt, Beranek researchers taught the system to respond to a variety of male and female voices by having more than a dozen native-born Americans speak into the microphone. Those with foreign accents can use the system after reading a few pages of material into the system to attune it to their speech patterns.
The more difficult problem is teaching a computer to understand natural, everyday language instead of the specialized commands most programs require. The Bolt, Beranek system, according to Mr. Krasner, asks itself: "Does this make sense? How likely, based on grammar and syntax, is it for someone to ask this question?" If the natural language program does not understand the question, it will flash a message asking the speaker for more information. Once the system grasps the question, it converts that question into a computerized query to a data base and finds the answer, which it then displays on the screen.
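The last step, turning an understood question into a data base query, might look roughly like the sketch below. It is a toy built on assumptions: a single hard-coded sentence pattern stands in for the grammar-based analysis Mr. Krasner describes, and the table layout, function names and sample rows (other than the name El Borma) are placeholders invented for illustration.

```python
import re
import sqlite3
from typing import List, Union

# One recognized sentence shape; a real system would use a full grammar.
LIST_PATTERN = re.compile(
    r"^give me a list of all airports in (?P<country>[a-z ]+)$"
)

NEED_MORE_INFO = "Please give more information."


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory data base with a hypothetical airports table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE airports (name TEXT, country TEXT)")
    conn.executemany(
        "INSERT INTO airports VALUES (?, ?)",
        [
            ("El Borma", "Tunisia"),
            ("Placeholder Field", "Tunisia"),  # invented row, not real data
            ("Sample Strip", "Elsewhere"),  # invented row, not real data
        ],
    )
    return conn


def answer(conn: sqlite3.Connection, utterance: str) -> Union[List[str], str]:
    """Map a transcribed question to a query, or ask for more information."""
    text = utterance.lower().strip().rstrip(".?!,")
    match = LIST_PATTERN.match(text)
    if match is None:
        # The question was not understood, so prompt the speaker.
        return NEED_MORE_INFO
    rows = conn.execute(
        "SELECT name FROM airports WHERE lower(country) = ? ORDER BY name",
        (match.group("country").strip(),),
    ).fetchall()
    return [name for (name,) in rows]
```

Calling `answer(build_demo_db(), "Give me a list of all airports in Tunisia")` returns the two placeholder Tunisian entries, while a question outside the pattern returns the request for more information.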