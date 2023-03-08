“Grandma, I ended up in prison, I have no money or documents. Can you send me something for bail?” This is the plea for help that 73-year-old Canadian Ruth Card received, in what sounded like the voice of her grandson Brandon. Immediately after the phone call, she rushed to withdraw $3,000 to help him. When she arrived at a second bank to try to withdraw even more, a manager stopped her: she was not the first person to receive such a call, apparently from a relative. The request had come from a fake voice generated with artificial intelligence.

The story, reported by the Washington Post, illustrates the unintended consequences of a world in which AI keeps getting better at creating believable content, be it text, images or audio.

The same newspaper, which is owned by Jeff Bezos, reports that phone scams using fake voices are increasingly common in the United States: according to Federal Trade Commission data, in 2022 alone there were over 36,000 reports of people pressured into giving money to impostors posing as relatives or friends.

The situation threatens to get worse as artificial intelligence evolves: after text and images, audio appears to be one of the next frontiers of AI. As with ChatGPT and Dall-E 2, there are tools that democratize the creation of increasingly convincing audio deepfakes.


How speech synthesis works

While Microsoft’s Vall-E has yet to debut, some decidedly easy-to-use tools are already available online, working mainly in English. The best known is Eleven Labs, which lets you clone any voice in minutes for a $5-per-month subscription.

The process is not instantaneous, but it is easy enough to appeal both to those who want to use it for work and to those with less legal purposes in mind. It all comes down to supplying training material for the AI to learn from and replicate. In the case of Eleven Labs, you need to upload 25 files containing clear recordings of the voice you want to imitate. Once the voice is created, a blank text box appears where you can enter the text you want the artificial voice to read.

Another similar service is Resemble, which lets you create a new voice for free through a process similar to that of Eleven Labs. The difference is that here the voice can also be recorded live, by repeating sentences (for now in English) supplied by the service.

Famous and less famous voices

As you can see, the mechanism is relatively simple: the AI listens to examples of a given voice and learns to replicate it, so that it can be made to say whatever you choose. The system potentially puts anyone at risk, because our voice is among the data we give away online, especially in videos posted on social networks. “A year ago, it took an enormous amount of audio material to clone a person’s voice,” Hany Farid of UC Berkeley explained to the Washington Post. “Now, if you have a Facebook page, or if you recorded a TikTok and 30 seconds of your voice is online, it’s become pretty easy to replicate it.”

This is the real difference from the past and from services like FakeYou: potentially, anyone could visit one of our social profiles (if it is public), download the videos we have posted and train one of the artificial intelligence systems available online on that audio.

Even so, the people most affected by this practice are naturally those whose voices are most widely available online, especially US celebrities. An article published in Vice reports that, shortly after the launch of the Eleven Labs public beta, 4chan was flooded with audio deepfakes featuring controversial figures such as Joe Rogan and Ben Shapiro. Emma Watson, the actress who played Hermione in Harry Potter, fared worse: an AI-generated audio clip in which she reads passages from Mein Kampf still circulates on Twitter.

Some time later, Eleven Labs itself stepped in with a Twitter thread announcing that voice cloning would be available only to paying users and that, in the near future, it would introduce a system for reporting artificially generated audio.

The point is that it is not always easy to distinguish true from false, even for those with some online experience. There is a video on TikTok in which Taylor Swift appears to comment, provocatively to say the least, on the controversy over the price of her concert tickets on Ticketmaster: “I don’t give a damn, I don’t perform for poor people,” says a voice that sounds just like that of the record-breaking singer-songwriter. It is not her, but the video has so far drawn nearly 3.5 million views and 8,000 comments: some mention artificial intelligence, while others respond to the statement as if it were real.