Microsoft’s speech recognizer can correctly identify 86 to 88 percent of the words in arbitrary speech, Rashid said.
Ever wondered what you’d sound like if you were fluent in Chinese, French or another language you don’t know? New software in development might give you an idea. Microsoft has created a program designed to provide on-the-fly spoken translations in the user’s own voice.
“We may not have to wait until the 22nd century for a usable equivalent of Star Trek’s universal translator,” Rick Rashid, Microsoft’s chief research officer, wrote in a blog post Nov. 8. Microsoft’s translator still makes errors at a noticeable rate, but significantly improves on previous speech translators, Rashid said.
“The results can sometimes be humorous,” he said. “Still, the technology has developed to be quite useful.”
Rashid presented the software on Oct. 25, getting some of his remarks translated into Mandarin Chinese during a conference held in Tianjin, China. In a video Microsoft posted online, the software’s Chinese voice doesn’t sound exactly like Rashid, but it does have the same general tone.
One of the biggest challenges in making the software came in getting it to recognize what users say, Rashid said. Computer scientists have been working on this problem virtually since computers were invented, and the fruits of a generation of research include the automated systems that U.S. banks use for call-in customer service (“Please enter or say your account number now”). In those systems, the speech recognizer only has to understand digits and perhaps some menu options, such as “make a transfer” or “bank hours.”
It’s more difficult for computers to understand freewheeling conversation, however. Until recently, speech-recognizing programs could only understand 75 to 80 percent of the words a person might say during a conversation, Rashid said. Microsoft Research has been working on improving that rate, he said, by using deep neural networks, which are layered networks of simple computing units whose connections act a little like those between cells in human and animal brains. Google used the same technique this summer to build a system that taught itself to recognize cat pictures on the Internet.
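To put the accuracy figures in perspective, it can help to flip them into error rates. The short Python sketch below is only a back-of-the-envelope illustration: it assumes the error rate is simply 100 minus the quoted accuracy, and the function names and the way the ends of the two ranges are paired are choices made here for illustration, not anything Rashid or Microsoft described.

```python
def error_rate(accuracy_pct: float) -> float:
    """Convert a word-accuracy percentage into a word-error percentage."""
    return 100.0 - accuracy_pct


def relative_error_reduction(old_accuracy_pct: float, new_accuracy_pct: float) -> float:
    """Fraction of the old errors that the new system no longer makes."""
    old_err = error_rate(old_accuracy_pct)
    new_err = error_rate(new_accuracy_pct)
    return (old_err - new_err) / old_err


# Accuracy ranges quoted in the article (percent of words recognized).
OLD_LOW, OLD_HIGH = 75.0, 80.0
NEW_LOW, NEW_HIGH = 86.0, 88.0

# Assumption: pair the best older figure with the worst newer one for a
# cautious estimate, and the reverse pairing for a generous one.
cautious = relative_error_reduction(OLD_HIGH, NEW_LOW)
generous = relative_error_reduction(OLD_LOW, NEW_HIGH)

print(f"Older systems miss {error_rate(OLD_HIGH):.0f} to {error_rate(OLD_LOW):.0f} words per 100")
print(f"Newer system misses {error_rate(NEW_HIGH):.0f} to {error_rate(NEW_LOW):.0f} words per 100")
print(f"Relative drop in errors: {cautious:.0%} to {generous:.0%}")
```

Under those assumptions, the older systems get roughly one word in four or five wrong, while the newer figures work out to about one word in seven or eight.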