Synthesize and recognize artificial voices

Audio deepfakes pose a challenge for Telekom. The solution synthesizes the voices of board members and can distinguish genuine recordings from fakes.

The problem

As a large corporation with a high level of digital affinity, Telekom is exposed to numerous risks and fraud attempts. In addition to classic phishing, these have increasingly included attacks using audio deepfakes in recent years. These are used, for example, to obtain internal Group information, to initiate bank transfers, or to manipulate the market with false audio and video messages.

With ever-improving open-source tools - and an almost endless pool of publicly available recordings of appearances by Telekom executives - these deepfakes are becoming increasingly difficult for the human ear to identify as such.

For this reason, a project was launched with Telekom's innovation arm, T-Labs, to use AI to reliably automate this very distinction and consistently prevent fraud attempts.

The solution

Our solution approach consisted of two parts: speech synthesis (used to test the robustness of our detection model) and the actual deepfake classification tool.

Speech synthesis consisted of an encoder that encodes a target's voice using an audio sample, a synthesizer that creates the audio spectrogram (i.e., a 2D image) of a given text using the encoded voice, and a vocoder that finally generates the audio from the spectrogram.
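The sketch below illustrates this three-stage structure (speaker encoder, spectrogram synthesizer, vocoder). The class names, layer sizes, and the Griffin-Lim inversion used as a stand-in vocoder are illustrative assumptions for the sake of the example, not the production models used in the project.

```python
# Illustrative sketch of a voice-cloning pipeline:
# speaker encoder -> spectrogram synthesizer -> vocoder.
import librosa
import numpy as np
import torch
import torch.nn as nn


class SpeakerEncoder(nn.Module):
    """Maps a mel spectrogram of a reference utterance to a fixed-size voice embedding."""
    def __init__(self, n_mels: int = 80, emb_dim: int = 256):
        super().__init__()
        self.lstm = nn.LSTM(n_mels, 256, num_layers=3, batch_first=True)
        self.proj = nn.Linear(256, emb_dim)

    def forward(self, mels: torch.Tensor) -> torch.Tensor:
        # mels: (batch, frames, n_mels)
        _, (h, _) = self.lstm(mels)
        emb = self.proj(h[-1])  # hidden state of the last LSTM layer
        return torch.nn.functional.normalize(emb, dim=-1)


class Synthesizer(nn.Module):
    """Predicts a mel spectrogram from text, conditioned on the voice embedding
    (a heavily simplified stand-in for a Tacotron-style model)."""
    def __init__(self, vocab_size: int = 100, emb_dim: int = 256, n_mels: int = 80):
        super().__init__()
        self.text_emb = nn.Embedding(vocab_size, emb_dim)
        self.decoder = nn.GRU(emb_dim * 2, 512, batch_first=True)
        self.to_mel = nn.Linear(512, n_mels)

    def forward(self, text_ids: torch.Tensor, voice_emb: torch.Tensor) -> torch.Tensor:
        # text_ids: (batch, chars), voice_emb: (batch, emb_dim)
        text = self.text_emb(text_ids)
        cond = voice_emb.unsqueeze(1).expand(-1, text.size(1), -1)
        hidden, _ = self.decoder(torch.cat([text, cond], dim=-1))
        return self.to_mel(hidden)  # (batch, frames, n_mels)


def vocode(mel: np.ndarray, sr: int = 16000) -> np.ndarray:
    """Vocoder step: invert a power mel spectrogram (n_mels, frames) to audio.
    Griffin-Lim via librosa is used here for simplicity; a trained neural
    vocoder would produce far more natural speech."""
    return librosa.feature.inverse.mel_to_audio(mel, sr=sr)
```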

The forgery detection tool was trained on publicly available datasets. Normalized audio sequences trimmed to 2 s were converted to fixed-dimension mel spectrograms, on which a CNN-based network was trained.
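The following is a minimal sketch of this preprocessing and classification setup, assuming 16 kHz mono audio and a small convolutional network; the sample rate, layer configuration, and file path are illustrative assumptions rather than the project's trained model.

```python
# Sketch of the fake/real classifier: fixed-length 2 s clips are converted to
# log-mel spectrograms and fed to a small CNN.
import librosa
import numpy as np
import torch
import torch.nn as nn

SR, CLIP_SECONDS, N_MELS = 16000, 2.0, 64  # assumed preprocessing parameters


def preprocess(path: str) -> torch.Tensor:
    """Load, peak-normalize, trim/pad to 2 s, and compute a log-mel spectrogram."""
    y, _ = librosa.load(path, sr=SR, mono=True)
    y = y / (np.max(np.abs(y)) + 1e-9)                     # peak normalization
    target = int(SR * CLIP_SECONDS)
    y = np.pad(y, (0, max(0, target - len(y))))[:target]   # fixed 2 s window
    mel = librosa.feature.melspectrogram(y=y, sr=SR, n_mels=N_MELS)
    logmel = librosa.power_to_db(mel)
    return torch.from_numpy(logmel).float().unsqueeze(0)   # (1, n_mels, frames)


class FakeVoiceCNN(nn.Module):
    """Binary classifier: real (class 0) vs. synthetic (class 1) speech."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x).flatten(1))


# Usage (hypothetical file path):
# logits = FakeVoiceCNN()(preprocess("sample.wav").unsqueeze(0))
# prob_fake = torch.softmax(logits, dim=-1)[0, 1]
```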

As a result, deepfakes could be created for numerous executives from publicly available material alone, and the tool classified recordings as real or fake with 98.6% reliability.

Our result

10+
Executive voices synthesized with deceptive realism
98.6%
Reliability in identifying fakes
>2s
Audio material needed for a classification
