Researchers’ AI system strips identifiable attributes like gender from speech recordings

July 20, 2020 Technology

In a study accepted to the 2020 International Conference on Machine Learning last week, researchers at the Chalmers University of Technology and the RISE Research Institutes of Sweden propose a privacy-preserving technique that learns to obfuscate attributes like gender in speech data. They use a model that’s trained to filter sensitive information in recordings and then generate new and private information independent of the filtered one, ensuring sensitive information remains hidden without sacrificing realism and utility.

Maintaining privacy without dispensing with technologies like voice assistants altogether is a challenging task, given that state-of-the-art AI techniques have been used to infer attributes like intention, gender, emotional state, and identity from timbre, pitch, and speaker style. Recent reporting revealed that accidental voice assistant activations exposed workers to private conversations; the risk is such that law firms including Mishcon de Reya have advised staff to mute smart speakers when they talk about client matters at home. Google Assistant, Siri, Cortana, and other major voice recognition platforms allow the deletion of recorded data, but this requires some effort on users’ part, and in some cases substantial effort.

The researchers’ solution employs a generative adversarial network (GAN) called PCMelGAN. A GAN is a two-part AI model consisting of a generator that creates samples and a discriminator that attempts to differentiate between the generated samples and real-world samples. PCMelGAN maps speech recordings to mel spectrograms, or representations of the spectrum of frequencies of the audio signal as it varies over time, and passes them through a filter that removes sensitive information and a generator that adds synthetic information in its place. PCMelGAN then inverts the mel spectrogram output into audio in the form of a raw waveform.
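To make the "mel" part of a mel spectrogram more concrete, here is a minimal Python sketch of the mel scale itself, which spaces frequency bands the way pitch is perceived rather than linearly in hertz. It is not taken from the paper: the conversion formula is one common variant (the HTK-style formula), and the function names, the band count, and the 0 to 4,000 Hz range are illustrative assumptions.

```python
import math
from typing import List


def hz_to_mel(freq_hz: float) -> float:
    """Convert a frequency in Hz to mels (HTK-style formula, an assumption here)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_bands: int, f_min: float, f_max: float) -> List[float]:
    """Center frequencies in Hz of n_bands bands spaced evenly on the mel scale."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_bands)]


# Hypothetical settings: 8 bands between 0 Hz and 4,000 Hz.
centers = mel_band_centers(n_bands=8, f_min=0.0, f_max=4000.0)
for center in centers:
    print(round(center))
```

The printed centers sit close together at low frequencies and spread out at high ones. A mel spectrogram applies bands like these to each short time slice of the audio, which is the representation the filter and generator described above operate on.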

In experiments, the researchers trained PCMelGAN on 10,000 samples from the open source AudioMNIST data set, which comprises 30,000 audio recordings of the digits zero through nine spoken in English. They measured privacy over five runs by determining whether a classifier could predict a speaker’s original gender with better than 50% accuracy from the spectrograms and the raw audio.
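The privacy check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not say how the five runs were aggregated, so averaging them is a guess, and the function names and toy labels below are hypothetical rather than the researchers' code or data.

```python
from statistics import mean
from typing import List, Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def mean_adversary_accuracy(runs: List[Sequence[int]], labels: Sequence[int]) -> float:
    """Average accuracy of the gender classifier across runs (aggregation is assumed)."""
    return mean(accuracy(preds, labels) for preds in runs)


def hides_attribute(runs: List[Sequence[int]], labels: Sequence[int],
                    threshold: float = 0.5) -> bool:
    """True if the classifier does no better than the threshold on average."""
    return mean_adversary_accuracy(runs, labels) <= threshold


# Toy data: 0 and 1 stand in for the two gender labels (hypothetical values).
labels = [0, 1, 0, 1, 0, 1, 0, 1]
runs = [
    [0, 0, 0, 0, 1, 1, 1, 1],
    [1, 1, 0, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 1, 0, 0, 1],
    [1, 0, 1, 0, 0, 1, 0, 1],
]
print(mean_adversary_accuracy(runs, labels), hides_attribute(runs, labels))
```

In this framing, a classifier that stays at or below the 50% mark on the transformed audio has learned nothing useful about the original gender, while one that scores well above it signals that the attribute is still leaking.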

[Audio samples: recordings of speakers saying “four” and “six,” each paired with PCMelGAN’s output.]

According to the researchers, the results show PCMelGAN makes it empirically difficult for adversaries to, for example, infer the gender of the speaker while retaining qualities including intonation and content. “The proposed method can successfully obfuscate sensitive attributes in speech data and generates realistic speech independent of the sensitive input attribute. Our results for censoring the gender attribute on the AudioMNIST dataset, demonstrate that the method can maintain a high level of utility,” they wrote. “As more data is collected in various settings across organizations, companies, and countries, there has been an increase in the demand of user privacy.”

VentureBeat
