Microsoft Creates Speech Recognition Tech With Human Accuracy

Microsoft researchers have announced details of the company’s latest speech recognition technology, which they claim can transcribe conversational speech as accurately as a human.

The team of researchers and engineers at Microsoft’s Artificial Intelligence (AI) and Research division noted that the speech recognition system they developed makes as many errors as professional transcriptionists, or fewer.

They reported a word error rate of 5.9 percent, about equal to that of people asked to transcribe the same conversation the system was tested against.
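For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions and insertions needed to turn the system’s transcript into the reference transcript, divided by the number of words in the reference. The article does not describe how Microsoft’s team scored its system, so the following is only a minimal sketch of that conventional definition; the function name `word_error_rate` and the sample sentences are illustrative assumptions, not taken from the research.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Conventional WER: word-level edit distance / reference length.

    Illustrative sketch only; not the scoring tool used by Microsoft.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substituted word and one dropped word
# against a six-word reference.
rate = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(f"{rate:.1%}")
```

By this definition, a rate of 5.9 percent corresponds to roughly one error in every 17 words of the reference transcript.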

AI milestone

“We’ve reached human parity,” said Xuedong Huang, the company’s chief speech scientist. “This is an historic achievement.”

It’s a bold claim, but with speech recognition in virtual assistants such as Cortana and Apple’s Siri still hit and miss, such improvements could turn speech recognition tools and smart software from gimmicks and nice-to-have features into genuinely useful day-to-day tools.

It is also indicative of the rapid evolution of AI and smart systems, which lends weight to concerns that the impact of intelligent machines and software needs to be considered sooner rather than later.

“Even five years ago, I wouldn’t have thought we could have achieved this. I just wouldn’t have thought it would be possible,” said Harry Shum, the executive vice president who heads the Microsoft Artificial Intelligence and Research group.

To reach transcription parity with humans, Microsoft made use of deep learning neural networks, which replicate in part how the human brain learns, to train the system to recognise patterns in sounds rather than having to be taught manually to make sense of each sound.

Using Microsoft’s Computational Network Toolkit, the researchers were able to run deep learning algorithms across multiple computers equipped with graphics processing chips for parallel processing, an important technique for crunching the vast amount of information a neural network needs to ingest. This allowed the researchers to carry out their testing and training at quite a lick.

While the researchers have some way to go before they can make sure the speech recognition technology works well in real-world settings with background noise, the current discoveries are likely to find their way into existing speech recognition features found in Windows and Xbox platforms.

“This will make Cortana more powerful, making a truly intelligent assistant possible,” Shum said.

Microsoft’s AI efforts are timely, given that Google is looking to make waves with the AI-powered Assistant found in its new Pixel smartphones and has figured out how to make its speech-based technology replicate human speech.

Roland Moore-Colyer

As News Editor of Silicon UK, Roland keeps a keen eye on the daily tech news coverage for the site, while also focusing on stories around cyber security, public sector IT, innovation, AI, and gadgets.
