Microsoft brings new voice styles to Azure Cognitive Services

April 2, 2020 Technology

Microsoft today announced the launch of new neural text-to-speech (TTS) capabilities in Azure Cognitive Services, its suite of AI-imbued APIs and SDKs, that enable developers to tailor the voice of their apps and services to fit their brand. Each of the three new styles (newscast, customer service, and digital assistant) offers fluid and natural-sounding speech that matches the patterns and intonations of human voices, allowing customers to deliver better, more memorable user experiences, at least in theory.

“Built on a powerful base model, our neural TTS voices are very natural, reliable, and expressive. Through transfer learning, the neural TTS model can learn different speaking styles from various speakers, enabling nuanced voices,” wrote Microsoft in a blog post.

The newscast voice reflects a “professional tone” you might hear on a TV or radio newscast, which is to say it contains no trace of regionalism and uses standard broadcasting pronunciation, a form of pronunciation in which no letters are dropped. In addition to Azure Cognitive Services, Microsoft says that the newscast-style voice is in the Microsoft Listening Docs for WeChat, which can read aloud Word, PowerPoint, and Excel documents and generate audio for online trainings, news podcasts, and more. It’s also in the Bing mobile app: when you search with the voice search feature, you’ll hear the news briefs read in the newscast voice.

As for the customer service-style voice, it features a “friendly” and “engaging” tone that Microsoft says is tuned for scenarios involving customer support, like reporting a claim. By contrast, the digital assistant voice — which is available in two styles, a chat style for casual, conversational bots and a professional style for applications like in-car digital assistants — features a helpful tone that’s suited to relaying weather forecasts, navigation directions, reminders, and other such information.

Beyond the voice styles optimized for specific scenarios, Microsoft this morning released several new emotion styles, which can be adjusted to express different emotions to fit a given context. These include cheerfulness and empathy and, in Chinese, a lyrical style, which Microsoft describes as “heartfelt” and optimized for reading prose or poetry.

The new voice styles are available in English and Chinese, while the emotion styles are available for English, Chinese, and Brazilian Portuguese, though not all of the styles are available in all languages. Microsoft notes that the styles can be customized through the Custom Neural Voice feature within Microsoft Speech Studio, allowing brands to build unique voices that benefit from the new scenarios.
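For developers wondering what selecting a style might look like in practice, here is a minimal sketch in Python. It is not taken from Microsoft's announcement. It assumes that styles are requested through the `mstts:express-as` element in SSML (Speech Synthesis Markup Language), as described in Azure's Speech documentation. The voice name `en-US-AriaNeural`, the helper `build_styled_ssml`, and the exact style identifiers are illustrative assumptions, and which styles a given voice supports should be checked against the current documentation.

```python
from xml.sax.saxutils import escape

# Assumed style identifiers, chosen to mirror the styles named in the article.
# The identifiers a particular voice accepts may differ.
ASSUMED_STYLES = {
    "newscast", "customerservice", "chat",
    "cheerful", "empathetic", "lyrical",
}


def build_styled_ssml(text, style, voice="en-US-AriaNeural", lang="en-US"):
    """Return an SSML document asking for `text` to be spoken in `style`.

    `voice` and `lang` are placeholder defaults, not values from the article.
    """
    if style not in ASSUMED_STYLES:
        raise ValueError(f"unknown style: {style!r}")
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" '
        f'xml:lang="{lang}">'
        f'<voice name="{voice}">'
        # escape() protects characters such as & and < inside the spoken text.
        f'<mstts:express-as style="{style}">{escape(text)}</mstts:express-as>'
        '</voice></speak>'
    )


if __name__ == "__main__":
    print(build_styled_ssml("Here are today's top stories.", "newscast"))
```

The resulting string would then be passed to a speech synthesis call that accepts SSML. That step is left out here because it requires a subscription key and a network connection.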

Microsoft is effectively going toe to toe with Google, which last year debuted 31 new AI-synthesized WaveNet voices and 24 new standard voices in its Cloud Text-to-Speech service (bringing the total number of WaveNet voices to 57). It has another rival in Amazon, which recently launched a service — Brand Voice — that taps AI to generate custom spokespeople, and which offers a number of voice styles and emotion styles through Amazon Polly, Amazon’s cloud offering that converts text into speech.


VentureBeat
