Google's WaveNet Technology Generates Speech That Matches Human Voice

Enterprise & IT Sep 9, 2016

Google’s DeepMind unit has created a system for machine-generated speech that it says narrows the gap between existing technology and human speech by more than 50 percent. DeepMind says that the 'WaveNet' deep generative model of raw audio waveforms is able to generate speech which mimics any human voice and which sounds more natural than the best existing Text-to-Speech systems.

The same network can be used to synthesize other audio signals such as music.

In blind tests for U.S. English and Mandarin Chinese, human listeners found WaveNet-generated speech sounded more natural than that created with any of Google’s existing text-to-speech programs, which are based on different technologies. WaveNet still underperformed recordings of actual human speech.

The ability of computers to understand natural speech has been revolutionised in the last few years by the application of deep neural networks (e.g., Google Voice Search). However, generating speech with computers - a process usually referred to as speech synthesis or text-to-speech (TTS) - is still largely based on so-called concatenative TTS, where a very large database of short speech fragments is recorded from a single speaker and then recombined to form complete utterances. This makes it difficult to modify the voice (for example, switching to a different speaker, or altering the emphasis or emotion of their speech) without recording a whole new database.
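As a rough illustration of the concatenative approach, the sketch below stitches pre-recorded fragments together by lookup. The fragment labels, the sample values and the `synthesize` helper are all hypothetical; real systems store far larger databases and select and smooth units much more carefully.

```python
# Hypothetical fragment database for ONE speaker: unit label -> recorded samples.
# The labels and sample values are invented purely for illustration.
SPEAKER_A_FRAGMENTS = {
    "HH": [0.1, 0.2],
    "EH": [0.3, 0.1, -0.2],
    "L": [0.0, -0.1],
    "OW": [0.2, 0.4, 0.1],
}


def synthesize(units, fragments):
    """Build an utterance by concatenating stored fragments in order."""
    waveform = []
    for unit in units:
        if unit not in fragments:
            # A unit that was never recorded cannot be produced at all.
            raise KeyError(f"no recording for unit {unit!r}")
        waveform.extend(fragments[unit])
    return waveform


utterance = synthesize(["HH", "EH", "L", "OW"], SPEAKER_A_FRAGMENTS)
```

Because every sample in `utterance` comes straight from one speaker's recordings, changing the voice or its emotion in this scheme would mean supplying a different fragment database.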

This has led to a great demand for parametric TTS, where all the information required to generate the data is stored in the parameters of the model, and the contents and characteristics of the speech can be controlled via the inputs to the model. So far, however, parametric TTS has tended to sound less natural than concatenative TTS, at least for syllabic languages such as English. Existing parametric models typically generate audio signals by passing their outputs through signal processing algorithms known as vocoders.

DeepMind says that WaveNet changes this paradigm by directly modelling the raw waveform of the audio signal, one sample at a time. As well as yielding more natural-sounding speech, using raw waveforms means that WaveNet can model any kind of audio, including music.
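Generating audio "one sample at a time" can be pictured as a loop in which each new sample is drawn from a distribution conditioned on the samples produced so far. The sketch below is only a toy under stated assumptions: `toy_next_sample_distribution` is a hypothetical stand-in for a trained network, and the 256 amplitude levels and 16 kHz sample rate are assumed values, not details taken from this article.

```python
import math
import random

NUM_LEVELS = 256  # assumed number of quantised amplitude levels


def toy_next_sample_distribution(history):
    """Hypothetical stand-in for a trained model (NOT a neural network).

    Returns one probability per amplitude level, given the past samples.
    Here it simply favours levels close to the previous sample.
    """
    previous = history[-1] if history else NUM_LEVELS // 2
    weights = [math.exp(-abs(level - previous) / 4.0) for level in range(NUM_LEVELS)]
    total = sum(weights)
    return [w / total for w in weights]


def generate(num_samples, seed=0):
    """Generate a waveform sample by sample, feeding each output back in."""
    rng = random.Random(seed)
    samples = []
    for _ in range(num_samples):
        probs = toy_next_sample_distribution(samples)
        next_sample = rng.choices(range(NUM_LEVELS), weights=probs, k=1)[0]
        samples.append(next_sample)  # the new sample becomes part of the context
    return samples


# 160 samples would be 10 ms of audio at an assumed 16 kHz sample rate.
waveform = generate(160)
```

The loop also hints at the cost of this design: every output sample requires its own pass through the model, so generation time grows with the length of the audio.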

DeepMind has described how WaveNet works in this paper.
