Researchers’ AI system strips identifiable attributes like gender from speech recordings

July 20, 2020

In a study accepted to the 2020 International Conference on Machine Learning last week, researchers at the Chalmers University of Technology and the RISE Research Institutes of Sweden propose a privacy-preserving technique that learns to obfuscate attributes like gender in speech data. They use a model that’s trained to filter sensitive information out of recordings and then generate new, private information that is independent of what was filtered out, ensuring sensitive information remains hidden without sacrificing realism and utility.

Maintaining privacy without dispensing with technologies like voice assistants altogether is a challenging task, given that state-of-the-art AI techniques have been used to infer attributes like intention, gender, emotional state, and identity from timbre, pitch, and speaker style. Recent reporting revealed that accidental voice assistant activations exposed workers to private conversations; the risk is such that law firms including Mishcon de Reya have advised staff to mute smart speakers when they talk about client matters at home. Google Assistant, Siri, Cortana, and other major voice recognition platforms allow the deletion of recorded data, but this requires some, and in some cases substantial, effort on users’ parts.

The researchers’ solution employs a generative adversarial network (GAN) called PCMelGAN. A GAN is a two-part AI model consisting of a generator that creates samples and a discriminator that attempts to differentiate between the generated samples and real-world samples. PCMelGAN maps speech recordings to mel spectrograms, or representations of the spectrum of frequencies of the audio signal as it varies over time, and passes them through a filter that removes sensitive information and a generator that adds synthetic information in its place. It then inverts the mel spectrogram output into audio in the form of a raw waveform.
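To make the "mel" part of a mel spectrogram concrete, here is a minimal Python sketch of the mel scale, which spaces frequency bands the way human hearing roughly perceives pitch. It uses one common (HTK-style) conversion formula. The function names and the choice of formula are assumptions for illustration; the article does not say which variant PCMelGAN uses.

```python
import math


def hz_to_mel(hz: float) -> float:
    """Convert a frequency in Hz to mels (HTK-style formula, an assumption here)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(n_bands: int, f_min: float, f_max: float) -> list[float]:
    """Center frequencies (Hz) of n_bands bands spaced evenly on the mel scale.

    Hypothetical helper: a mel spectrogram pools ordinary spectrogram
    frequencies into bands like these, so the bands are narrow at low
    frequencies and wide at high frequencies.
    """
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_bands)]


if __name__ == "__main__":
    # Illustrative parameters only; not taken from the study.
    for center in mel_band_centers(n_bands=8, f_min=0.0, f_max=4000.0):
        print(f"{center:8.1f} Hz")
```

Running the script prints band centers that bunch together at the low end and spread out toward the high end, which is the property that distinguishes a mel spectrogram from a plain one.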

In experiments, the researchers trained PCMelGAN on 10,000 samples from the open source AudioMNIST data set, which comprises 30,000 audio recordings of the digits zero through nine spoken in English. They measured privacy by checking, across five runs on both the spectrograms and the raw audio, whether a classifier could predict a speaker’s original gender with better than 50% accuracy.
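The privacy measure described above can be sketched in a few lines of Python. This is a rough illustration rather than the researchers' evaluation code: it assumes a binary label with balanced classes, so that guessing yields 50% accuracy, and the function names and toy labels are hypothetical.

```python
from statistics import mean


def accuracy(predicted: list[str], actual: list[str]) -> float:
    """Fraction of samples where the adversary's guess matches the true label."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and equal in length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def privacy_gap(runs: list[tuple[list[str], list[str]]], chance: float = 0.5) -> float:
    """Mean adversary accuracy across runs, minus chance level.

    A gap near zero means the classifier does no better than guessing,
    which is the outcome a privacy-preserving transformation aims for.
    The 0.5 default assumes two equally likely classes.
    """
    return mean(accuracy(pred, true) for pred, true in runs) - chance


if __name__ == "__main__":
    # Made-up labels for illustration only; not data from the study.
    true_labels = ["f", "m", "f", "m"]
    runs = [
        (["f", "m", "m", "f"], true_labels),
        (["m", "m", "f", "f"], true_labels),
    ]
    print(privacy_gap(runs))
```

Averaging over several runs, as the study does with five, keeps a single lucky or unlucky classifier run from skewing the result.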

[Audio clips in the original article: a recording of someone saying “four” and a recording of someone saying “six,” each followed by PCMelGAN’s output.]

According to the researchers, the results show PCMelGAN makes it empirically difficult for adversaries to, for example, infer the gender of the speaker while retaining qualities including intonation and content. “The proposed method can successfully obfuscate sensitive attributes in speech data and generates realistic speech independent of the sensitive input attribute. Our results for censoring the gender attribute on the AudioMNIST dataset, demonstrate that the method can maintain a high level of utility,” they wrote. “As more data is collected in various settings across organizations, companies, and countries, there has been an increase in the demand of user privacy.”


VentureBeat
