Microsoft Researchers Reach Human Parity in Conversational Speech Recognition

Enterprise & IT | Oct 28, 2016

Microsoft researchers have set a world record for speech recognition, using GPU-accelerated deep learning and a toolkit the company announced this week to recognize words in a conversation as well as a person does. The team described how it achieved an error rate of 5.9 percent, the lowest ever for machine transcription and about as accurate as people who transcribed the same conversation. It’s also a 6 percent improvement over a record Microsoft set only a month ago.
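The article does not say how the error rate is computed. Assuming it is the word error rate commonly used to score transcription (the number of word substitutions, deletions and insertions needed to turn the machine transcript into the reference transcript, divided by the number of reference words), a minimal sketch looks like this. The function name and the sample sentences are illustrative only and are not taken from Microsoft's evaluation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences (assumed): one deleted word and one substituted
# word against a five-word reference.
rate = word_error_rate("we have reached human parity",
                       "we reached human parody")
print(f"{rate:.0%}")  # prints "40%"
```

Under this definition, an error rate of 5.9 percent corresponds to roughly six word-level mistakes per hundred reference words.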

"We’ve reached human parity," said Xuedong Huang, the company’s chief speech scientist and co-author of a paper published this week. "This is an historic achievement."

Conversational speech poses some of the biggest challenges to speech recognition, said Geoffrey Zweig, who manages the Speech & Dialog research group at Microsoft.

"Speech recognition gets hard when people are talking informally, when they get excited, when they make mistakes and correct themselves, when they change topics. All of these are characteristics of conversational speech," he said.

The researchers credit their breakthrough in conversational speech recognition to deep learning, in particular the systematic use of convolutional and recurrent neural networks. In their latest work, the team applied a type of recurrent neural network called Long Short-Term Memory (LSTM) to the language model.

LSTM networks have the advantage of being able to "remember" information for a longer period of time, so they are sensitive to more words than most neural network language models are.
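To illustrate what "remembering" means here, the sketch below runs a single scalar LSTM cell using the standard gate equations. It is a toy example and not Microsoft's language model: the weights are hand-picked rather than trained, and the names `lstm_step`, `WEIGHTS` and `run` are hypothetical. The forget gate is set close to 1, so a value written into the cell state at the first step is still largely present after 20 steps with no further input.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, weights):
    """One step of a scalar LSTM cell.

    weights maps a gate name to (input weight, recurrent weight, bias).
    """
    def gate(name, activation):
        w_x, w_h, bias = weights[name]
        return activation(w_x * x + w_h * h_prev + bias)

    f = gate("forget", sigmoid)       # how much of the old cell state to keep
    i = gate("input", sigmoid)        # how much new information to write
    o = gate("output", sigmoid)       # how much of the cell state to expose
    g = gate("candidate", math.tanh)  # the new information itself

    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# Hand-picked, untrained weights (an assumption for illustration only):
# the forget gate stays near 1, and the input gate opens only when x is 1.
WEIGHTS = {
    "forget": (0.0, 0.0, 6.0),
    "input": (10.0, 0.0, -5.0),
    "candidate": (5.0, 0.0, 0.0),
    "output": (0.0, 0.0, 6.0),
}


def run(inputs):
    h, c = 0.0, 0.0
    for x in inputs:
        h, c = lstm_step(x, h, c, WEIGHTS)
    return h, c


# One "word" at the first step, followed by 20 steps with no input.
h, c = run([1.0] + [0.0] * 20)
print(f"cell state after 20 empty steps: {c:.3f}")
```

In a real language model the states are vectors and the weights are learned from data, but the same gating mechanism is what lets earlier words influence predictions many words later.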

Microsoft’s Cognitive Toolkit (previously known as CNTK), an open source deep learning framework, played a key role in reaching human parity for conversational speech recognition. The Cognitive Toolkit, which Microsoft announced this week, is a deep learning system that runs on GPUs and is used to speed advances in areas such as speech recognition, image recognition and search relevance.

Zweig said that by using Nvidia's Tesla M40 GPUs, researchers reduced the training time for some language models from months to weeks. "That makes all the difference because the rate of progress we can make is linked to the number of experiments we can run," he said.

More work needs to be done to improve speech recognition in real-life settings like parties or city streets, where there may be music, traffic, people talking and other types of background noise. Researchers are also improving conversational speech recognition for meetings, where there are often multiple speakers seated at different distances from a microphone.

Zweig said the research milestone means the company has the right tools to quickly deploy a new generation of improved speech recognition in its Cortana personal digital assistant, Xbox gaming console and other products.

Their long-term goal is to move from speech recognition to understanding, he said. This would make it possible for devices to answer questions or take actions based on what they’re told.
