{"id":40,"date":"2021-10-29T23:58:54","date_gmt":"2021-10-29T23:58:54","guid":{"rendered":"https:\/\/unsung.davidpogue.com\/?p=40"},"modified":"2023-10-28T00:30:34","modified_gmt":"2023-10-28T00:30:34","slug":"audio-deepfakes-and-the-end-of-trust","status":"publish","type":"post","link":"https:\/\/www.unsungscience.com\/index.php\/2021\/10\/29\/audio-deepfakes-and-the-end-of-trust\/","title":{"rendered":"Audio Deepfakes and the End of Trust"},"content":{"rendered":"\n<h3 class=\"wp-block-heading\">Season 1 \u2022 Episode 3<\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">The media is plenty freaked out about \u201cdeepfakes\u201d: Computer-generated videos of famous people saying things they never actually said. But only the video is faked; the audio parts, the voices of those fake celebrities, were supplied by human impersonators. Now, software exists to mimic anyone\u2019s voice, opening a Pandora\u2019s Box of fraud, deception, and what one expert calls \u201cthe end of trust.\u201d Fortunately, a new coalition of 60 news organizations and software companies think they have a way to shut down the nightmare before it begins.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Guests: Ragavan Thurairatnam, Dessa. Nina Schick, author and deepfakes expert. Joan Donovan, Harvard Kennedy School. Charlie Choi, CEO of Lovo. Dana Rao, chief counsel, Adobe.<\/p>\n\n\n\n<figure class=\"wp-block-audio\"><audio controls src=\"https:\/\/unsung.davidpogue.com\/wp-content\/uploads\/2023\/10\/unsungscience-20211029.mp3\"><\/audio><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Episode transcript<\/h2>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Intro<\/strong><\/h2>\n\n\n\n<p class=\"wp-block-paragraph\"><em>Theme begins.<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Deepfakes are the latest in computer-generated imagery: they\u2019re videos of people doing and saying things that they never actually did or said. 
Like, there\u2019s a video of Obama saying,<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><em>\u201cObama:\u201d \u201cPresident Trump is a total and complete dipshit.\u201d<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">But what\u2019s weird is the&nbsp;<em>voices&nbsp;<\/em>in those videos are still done by human beings. Impressionists. Impersonators. The technology to simulate their&nbsp;<em>voices&nbsp;<\/em>still wasn\u2019t good enough to fool anyone.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Until now.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><em>I\u2019m David Pogue\u2014And this\u2026is \u201cUnsung Science.\u201d<\/em><\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Season 1, Episode 3: Voice Deepfakes<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Every fall, Adobe hosts a conference called Adobe Max. It\u2019s a chance for the engineers to strut their stuff, show what they\u2019ve been working on, and make announcements to a captive audience of customers and press.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The conference focuses, of course, on creative software\u2014for photos, videos, music, and so on\u2014because that\u2019s Adobe\u2019s thing, right? They make Photoshop for editing photos, Premiere for editing videos, and so on.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">One session every year is called Adobe Max Sneak. Here\u2019s how Adobe describes this presentation:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Faux 1: The Max Sneaks session invites our engineers out of the lab and onto the stage. Many Sneaks from previous years have later been incorporated into our products.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">In 2016, the Sneak session featured the usual sorts of Adobe experiments. 
There was a prototype app that replaces the sky in a photo with a different sky, with one click<s>\u2026<\/s><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">There was an app that could adjust the colors in a bunch of photos to match the color scheme of an existing document.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">And then\u2026there was Project Voco, which was described as Photoshop for voice.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Speaker Let\u2019s hear from Zeyu you about Photoshop voiceovers. Please welcome to the stage\u2026 Zeyu.&nbsp;<em>(applause)<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Zeyu: Hello, everyone! Let\u2019s do something to human speech.<strong>&nbsp;<\/strong>I have obtained this piece of audio where there\u2019s Michael Key talking to Peele about his feeling after getting nominated.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">He\u2019s referring to Key and Peele, the comedy duo.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">I forgot to mention that Jordan Peele, half of that duo, was sitting right there on the stage. He\u2019d been hired as the cohost for this event. That\u2019s the Jordan Peele who went on to write and direct movies like \u201cGet Out\u201d and \u201cUs.\u201d&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Anyway, the Adobe researcher now played a recording of Peele\u2019s partner, Keegan-Michael Key. In the clip, Key is describing his reaction at learning that he\u2019d been nominated for an Emmy.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Key&nbsp;<strong><\/strong>I jumped on the bed and\u2014 and I kissed my dogs and my wife, in that order.&nbsp;<em>(laughter)<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Zeyu: So how about we mess with who he actually kissed? 
Project Voco allows you to edit speech in text, so let\u2019s bring it up.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The Voco window shows audio waveforms across the top\u2014and lined up beneath them, the corresponding words.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Zeyu And when we play back, the text and the audio should play back at the same time. So let\u2019s try that.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Key And I kissed my dogs and my wife.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Zeyu OK, so what do we do? Easily, copy paste. Let\u2019s do it.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Using his cursor, Zeyu&nbsp;<em>copied and pasted&nbsp;<\/em>the word \u201cwife\u201d to make it come earlier in the sentence\u2014<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Key And I kissed my wife and my wife.&nbsp;<em>(crowd)<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Zeyu: Oops!<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">\u2014and then&nbsp;<em>typed over<\/em>&nbsp;the second occurrence of the word \u201cwife\u201d with the word \u201cdogs.\u201d<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Zeyu We can just type the word \u201cdogs\u201d here.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Crowd No, no!&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Key And I kissed my wife and my dogs.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Crowd Woooo!!!!&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">But that was just rearranging recorded words. 
Now came the&nbsp;<em>really&nbsp;<\/em>nutty stuff.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Zeyu Wait, here\u2019s more, here\u2019s more; we can actually type something that\u2019s not here, so.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Using his keyboard, he deleted the word \u201cwife,\u201d and he&nbsp;<em>typed&nbsp;<\/em>the word \u201cJordan,\u201d as in Jordan Peele.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Zeyu: And here we go:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Key And I kissed Jordan and my dogs.&nbsp;<em>(crowd reacts)<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">At this point, Jordan Peele leaps out of his chair in mock horror. He\u2019s stomping across the stage, like, \u201cI\u2019m outta here.\u201d<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Peele You\u2014you a witch! You a demon!<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Zeyu I\u2019m magic. We\u2019re not just going to do with words, we can actually type small phrases. So we do \u201cthree times.\u201d&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Speaker Oooooh.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Zeyu: And, playback!&nbsp;&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Key And I kissed Jordan three times.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Crowd&nbsp;Ohhhhh!!&nbsp;<em>(Crowd cheering)<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">So yeah. They had fed 20 minutes of Key\u2019s voice recordings into Project Voco, and now, just by typing, they could make him say things that he had never actually said. And there was absolutely no way to tell that it wasn\u2019t real.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The crowd seemed to love it. The only person who seemed at all troubled\u2014was Jordan Peele.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Peele I, I\u2019m, I\u2019m blown away. I can\u2019t believe that\u2019s possible. You just type it in, and it interprets the person\u2019s voice. 
If this technology gets into the wrong hands\u2026&nbsp;<em>(laughter)<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">But Zeyu Jin was quick with reassurance.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Zeyu<strong>&nbsp;<\/strong><strong><\/strong>Don\u2019t worry. We actually have researched how to, like, prevent forgery. We have, like think about like a watermarking detection.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Later, the Adobe blog described the event like this.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Faux 2: Project VoCo, allows you to change words in a voiceover simply by typing new words. As always, we\u2019d love your feedback.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">And boy, did Adobe get feedback. From the BBC:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Faux 3: It seems that Adobe\u2019s programmers ignored the ethical dilemmas brought up by its potential misuse.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">From the CreativeBloq blog:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Faux 4: This raises ethical alarm bells about the ability to change facts after the event.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">From Affinity Magazine:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Faux 5: The ethical issues associated with its misuse are endless.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Adobe soon began issuing this statement to reporters:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Faux 6: Project Voco was shown at Adobe Max as a first look of forward looking technologies from Adobe\u2019s research labs, and may or may not be released as a product or product feature.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Well\u2014surprise, surprise: It was&nbsp;<em>not<\/em>&nbsp;released as a product or product feature. 
In fact, it was never heard from again.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Now, meanwhile, in the rest of the world, the tech media was abuzz with stories about the rise of&nbsp;<em>deepfakes<\/em>.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><em>&gt;&gt;TV &amp; NPR audio clips about \u201cdeepfakes\u201d<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Someone on Reddit first coined that term\u2014<em>deepfakes<\/em>\u2014to describe videos where the computer has replaced one person\u2019s face with another\u2019s. In the beginning, most deepfakes were made by amateurs grafting popular actresses\u2019 faces into porn videos.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">But there was also a hilarious-slash-creepy trend of people putting Nicolas Cage\u2019s face onto other actors in famous movie scenes.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">By 2018, deepfakes had gotten good enough that one video of President Obama was convincing in every way\u2014except for the words coming out of his mouth:&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Peele: They could have me say things like, I don\u2019t know, \u201cPresident Trump is a total and complete dipshit.\u201d<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Now, you see, I would never say these things. But someone else would.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Buzzfeed had made that video as a sort of public-service announcement about video deepfakes.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Oh, it made the point, alright. But 8.5 million views later, hardly anyone has commented on its one glaring flaw: The computer algorithm did a great job of generating the video of Obama. But they had to use an Obama impersonator\u2014a human being\u2014to do the voice. 
Guess who they got to do the impression?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Jordan Peele.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Yup\u2014same guy who\u2019d been on the stage two years earlier witnessing the unveiling of Project Voco.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Now, usually, audio technology always comes before video. There was radio before there was TV. There were cassette tapes before there were videotapes. There was streaming audio before there was streaming video.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">But for some reason, in deepfakes, video came first. Audio deepfakes came along only later\u2014and they took their time.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Let me play you the state of the art in voice deepfakes as of 2017. This is supposed to sound like Donald Trump:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Trump: I am not a robot. My intonation is always different.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">By the beginning of 2019, the state of the Trump deepfake had reached this level:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Trump: With this technology, it can make me say anything. Such as the following: \/ Barack Obama is a wonderful man. Do you think this sounds like me? We are working hard to improve these results. That is all for now. See you later, alligator.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Yeah. Maybe&nbsp;<em>much&nbsp;<\/em>later, alligator.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">But then, later in 2019, the world got a load of Fake Joe Rogan.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Rogan: It\u2019s me, Joe Rogan. Friends, I\u2019ve got something new to tell all of you. I\u2019ve decided to sponsor a hockey team made up entirely of chimps. Chimps are just superior athletes. And these chimps have been working out hard. I\u2019ve got them on a strict diet of bone broth and elk meat. 
See you on the ice, folks.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">That is not, in fact, the voice of comedian and top podcaster Joe Rogan. That\u2026is an audio deepfake. It\u2019s not Joe Rogan; it\u2019s&nbsp;<em>Faux&nbsp;<\/em>Rogan. Let\u2019s meet the guy who made it.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">David For my pronunciation pleasure, Ragavan, will you pronounce your name so I can get it right?&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Ragavan Yeah, it\u2019s uh\u2014 it\u2019s yeah, you could say Ragavan. The last name, even I don\u2019t attempt to pronounce.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">David What?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Ragavan It\u2019s Thurairatnam, but I\u2019m pretty sure I\u2019m saying it wrong.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Ragavan is the cofounder of Dessa, a Toronto company specializing in machine learning.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">David And what is Dessa\u2019s actual business? What \u2013what did you found it to do?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Ragavan We kind of started looking at banks as potential customers. And it kind of we\u2014 we ended up making like AI software for these sort of big, big, boring companies \u2014 But we also want to do crazy stuff because, like, it\u2019s\u2014 we just saw that, you know, this this technology can do so many things. And we \u2014we really wanted to show the world what it could do and also just have some fun.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">David So the RealTalk project was one of these side \u2014side hustles.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Ragavan That\u2019s right. Yeah. 
The RealTalk project was one of those side projects.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Dessa\u2019s dive into AI speech synthesis began at a company dinner in the summer of 2018.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Ragavan I asked the team like, what can we do to, like, really show people, like deep learning can do amazing things, and also get a lot of attention.&nbsp;&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">So one of the engineers who ended up working on the project, his name is Hashim Kadem. He said, like, you know, \u201cthere\u2019s this podcast that looks like the most popular podcast in the world, Joe Rogan\u2019s podcast, like, if we could get on there, get noticed on there, something, that could be that could be really good.\u201d&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">And that was the sort of seed of it.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The Dessa team figured they had plenty of source material to use for their Rogan voice clone. After all, Joe Rogan has made over 1600 episodes of his podcast\u2014and they tend to be&nbsp;<em>long.&nbsp;<\/em>Sometimes five hours long.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Ragavan On the surface it\u2019s like, \u201coh, there\u2019s hours of podcast recording. This should be easy, right?\u201d&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">But Joe Rogan, it\u2019s just \u2014like he\u2019s just crazy. He puts his mouth like right on the microphone. It has all these weird things in it. And also, it\u2019s like a conversation, which is just completely different. It\u2019s like \u2014one person talks, the other person talks in the middle of him talking, there\u2019s laughing, there\u2019s like coffee drinking and, you know, all sorts of things. And\u2014 and that makes it a lot harder.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">So what the team ended up doing was, they ended up using just his ad reads. 
So like, you know, whenever he\u2019s reading an ad for his podcast\u2014because we knew it\u2019s just him, you know, he\u2019s not going to be doing anything weird. It\u2019s a lot easier.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><em>&lt;&lt;&lt;Clip of Joe Rogan reading an ad &gt;&gt;&gt;<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Meanwhile, the team was also working through the tedious process of producing a perfect typed transcript of everything in those Rogan recordings.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Now, the next part of the story requires a gentle understanding of artificial intelligence, machine learning, and deep learning. It\u2019s technical, and I debated just cutting this whole section. But hey\u2014you\u2019ve put on a podcast to learn something, right?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Ragavan? Take it away.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">David<strong>&nbsp;<\/strong><strong><\/strong>Are you able to do a layman-friendly distinction between deep learning, and machine learning?&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Ragavan Yeah, let\u2019s try that.&nbsp; Normally, software, you have to write all the rules for it. So, like, you know, if you think of an app, like you have to say exactly, like \u201cwhen the user does this, I want this to happen.\u201d&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong><\/strong>But with machine learning, what we do is, we, we make software kind of\u2014 learn how to do things just by showing it data.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">So, for example, let\u2019s say I wanted to recognize something in an image. In machine learning, I would write by hand, like these sort of mathematical things that say, like, \u201coh, look for straight lines, you know, count how many straight lines there are. Count how many, you know, blobs of yellow there are or red there are.\u201d<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">OK, I got it, sort of. 
Then what\u2019s&nbsp;<em>deep&nbsp;<\/em>learning?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Ragavan With deep learning, we take \u2014take it a bit further, and it kind of learns directly from the data. With deep learning, it\u2019s just like, \u201cgive me the data and give me the answer; I will figure out the rest. I will learn everything in order to make this happen.\u201d&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">I think that\u2019s what\u2019s really powerful about deep learning. And that\u2019s\u2014 that\u2019s one of the reasons why, you know, in the past few years, we\u2019ve seen so many crazy things come out of it.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">I\u2019m beating this dead horse about how hard it is to create a deepfake voice to emphasize\u2026 how hard it is to create a deepfake voice! I mean, the early incarnations of the Faux Rogan voice were not that convincing. Here\u2019s an example:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Rogan:&nbsp;<em>Rogan on AlexNet (RealTalk early clip)<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Yeah, OK, it\u2019s\u2026something. But even after further work and hand-tweaking, there were still weird gaps and unnatural emphasis. Like this:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Rogan:&nbsp;<em>(Anyone else clip)&nbsp;<\/em><\/p>\n\n\n\n<p class=\"wp-block-paragraph\">But eventually, the Rogan voice got really good. At fakeJoeRogan.com, for example, you can take a little quiz. You can listen to sentences, and try to figure out if they\u2019re Joe Rogan\u2026or Faux Rogan.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">I\u2019ll play three examples, and you can test your deepfake radar. Ready? First one:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Rogan: What was the person thinking when they discovered cow\u2019s milk was fine for human consumption? And why did they do it in the first place?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Real or fake? Remember your vote. 
I\u2019ll give you the answers in a second. OK, second one:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Rogan: Some of you just need to improve the quality of your existence on earth. You gotta do the right things.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">And finally, example number 3:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Rogan: Fantastic old-world craftsmanship that you just don\u2019t see any more.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">OK, remember your answers. After the break\u2014we\u2019ll see how you did.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>BREAK&nbsp;<\/strong><\/h3>\n\n\n\n<p class=\"wp-block-paragraph\">Welcome back. Let\u2019s see how you did with your deepfake detection skills.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The first sentence\u2014<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Rogan: What was the person thinking when they discovered cow\u2019s milk was OK for human consumption? And why did they do it in the first place?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">That one\u2019s fake.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Rogan: Some of you just need to improve the quality of your existence on earth. You\u2019ve got to do the right things.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Fake again. And finally, example 3:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Rogan: Fantastic old-world craftsmanship that you just don\u2019t see any more.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">That\u2019s an actual recording of Joe Rogan.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">So how\u2019d you do? Pretty soon, it\u2019s going to matter.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Nina By the end of the decade, you\u2019re looking at a future where one Youtuber with limited resources or skills can kind of produce something that\u2019s better than what the best Hollywood studio can produce today for millions of dollars and with teams of special effects artists. 
Don\u2019t you worry, that is coming!<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This is Nina Schick, author of a book called&nbsp;<em>Deepfakes: The Coming Infocalypse.&nbsp;<\/em>Like, \u201capocalypse\u201d but with \u201cinfo.\u201d \u201cInfocalypse.\u201d<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Nina When I first started to come to deepfakes, you know, it was as they were emerging at the end of 2017 in the form of nonconsensual pornography on Reddit. And I immediately realized that deepfakes could become the most powerful weapon of political disinformation known to humanity.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Nina may be one of the world\u2019s most informed experts on why audio deepfakes are dangerous.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Nina Number one, you can fake media of anyone saying or doing anything. So you can imagine how, for instance, if you take the context of the United States after the George Floyd video came out, imagine there was a leaked recording of Donald Trump uttering a racial slur. You can see how that leaked audiotape could, in that incendiary kind of political environment, really kick off something far more dangerous.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">But I should note that this also has a very real risk to businesses. Imagine a business leader is caught on tape saying something that they didn\u2019t actually say. It could be potentially devastating.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">And, of course, the opportunities for scammers are delicious.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Nina Ultimately, it\u2019s something that can affect every individual, right? One of the classic frauds that is perpetrated against millions of us every day worldwide, is the desperate phone call from a loved one. Right? \u201cDad, I\u2019ve been in an accident. I need money now. 
I\u2019m in jail.\u201d<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Now, imagine fraudsters can use AI to scrape social media to find a video of your son, your wife, your daughter, and then use that AI to basically emulate their voice with just a few seconds of training data\u2014 and now you get the call and it\u2019s literally your son. It is absolutely terrifying, to say the least, that this technology can be deployed by malicious actors without control.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Anyway, deepfakes purporting to show people saying things they never said are only half the problem. The other half is the opposite situation\u2014people&nbsp;<em>blaming&nbsp;<\/em>deepfakes for things they actually&nbsp;<em>did&nbsp;<\/em>say!<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">David I remember that one of Trump\u2019s first responses to the \u201cgrab them by the pussy\u201d video was, \u201cI never said that! Software created that.\u201d<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Joan That kind of reaction is called the Liar\u2019s Dividend, which is that people can come out and say, \u201cwell, I didn\u2019t do that. I didn\u2019t say that \u2014that wasn\u2019t me.\u201d&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Meet Joan Donovan, research director at Harvard\u2019s Kennedy School Shorenstein Center on Media, Politics and Public Policy.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">David And that fits on a business card?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Joan Hey, when you\u2019re me, you don\u2019t want anyone to have your email or your phone number. I don\u2019t even have business cards, I don\u2019t want people to know how to get in touch.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">She\u2019s spent a lot of time studying misinformation. 
And she says that the antidote for the liar\u2019s dividend\u2014is other people as witnesses.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Joan You don\u2019t build a court case based on a single shred of evidence. Everything adds up. Right? We have to kind of build or weave a story here.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">And then also if it is an interaction that is being that is being faked, like, is there a way to legitimate those claims, just as we would as any good journalists would, you know, verify.&nbsp; But it\u2019s going to require people talking to people to make sense of the thing.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Now, Dessa, the company that created Faux Rogan, did it, as they say, to get attention.&nbsp;<strong>And they got it.<\/strong>&nbsp;The whole company was soon thereafter bought by Square, the digital payments company.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">But why did Adobe do it? Why did they make Project Voco in the first place?&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">It wasn\u2019t to torpedo our public trust in anything anybody ever says again. Here\u2019s what Adobe\u2019s blog post said:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Faux 7: When recording voiceovers, dialogue, and narration, wouldn\u2019t you love the option to edit or insert a few words without the hassle of recreating the recording environment or bringing the voiceover artist in for another session?&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Voco was created to make life easier for creative people. To fix stumbles in podcasts, audiobooks, and narration. To clean up dialogue in movies, TV shows, and games, when you need to edit lines after the actors are no longer available. 
To dub movies into other languages with the original actor\u2019s voice.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Here\u2019s Nina Schick again:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Nina You can see how this is going to basically change the future of the movies, change the future of advertising, I mean, change entire industries.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">But another really compelling example, is using synthetic voice to give those people who\u2019ve lost the ability to speak, for instance, through a neurodegenerative disease or a stroke, being able to give them their voice back, literally give them their voice back. And there\u2019s already a team of researchers working on this.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">And that\u2019s why there are now a bunch of companies that can turn&nbsp;<em>your&nbsp;<\/em>voice into a deepfake\u2014a voice clone\u2014so that you can type whatever you want to have read aloud in your voice.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To make a voice clone, you need to feed the machine-learning algorithm a lot of clean audio. You\u2019re usually asked to read 20 or 50 sentences into the mic.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">DP reading: Sentence number 4. The rainbow is composed of many bands of white light.&nbsp;&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">That\u2019s partly to teach the AI\u2014and partly to prevent you from cloning the voice of somebody else without their awareness. You\u2019d have to put a gun to their head, sit them down, and make them read those exact sentences.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">So: How good is the result? I tried all of the voice-cloning services I could find.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Here\u2019s a voice I generated for free at a site called Resemble.ai:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Resemble: Hello, and welcome to the brilliant new podcast called Unsung Science. I\u2019m David&nbsp;Pogue. 
Or not.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Wow\u2026Well, Resemble does offer sliders that let you change the pitch, emphasis, and emotion of each word. That \u201cor not\u201d at the end really sounded wrong\u2013<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Resemble: Or not.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">\u2014so I\u2019m going to make the pitch lower, and change the emotion to annoyed.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Resemble: Or not.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Much better! Or not.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Well, let\u2019s see if I could use it to pull off the phone scam that Nina Schick described:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Resemble: Hi Dad, it\u2019s David, as you can obviously tell by the sound of my voice. I\u2019ve been in an accident. I need money now. I\u2019m in jail. Can you send me some money right away?&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Yeah, probably not.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Well, how about its competitor, ReplicaStudios.com?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Replica: Hi Dad, it\u2019s me again. David. I have some bad news. I\u2019ve been brutally mugged in the streets of Paris! I need you to send me money. Lots of money. Please please please.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Nope. Not sold. Without a lot of hand work by engineers, the state of the art is just lame.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Now,&nbsp;<em>with&nbsp;<\/em>hand work by engineers, the state of the art is really good. This is the David Pogue voice clone made for me by a company called Lovo.ai:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">DP Lovo:&nbsp;<em>Now<\/em>&nbsp;I\u2019m in business. This fake Pogue is much more convincing than those free ones.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">To get something that good, I had to read 20 minutes of text. 
And if I were an actual customer, I would have had to pay a thousand dollars.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">You know the voices you\u2019ve been hearing in this episode, reading statements by Adobe, and quotes from various news outlets? They\u2019re all AI voices generated by Lovo.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Gotcha! Yeah\u2014I like my podcasts with a twist.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Charlie So a lot of the AI systems out there, if you feed it in gold, it will output gold. But if you feed it in garbage, it will output garbage.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Meet Charlie Choi. He\u2019s the CEO of Lovo, speaking to me from Korea.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">David I tried a bunch of the free voice cloning services and they were not good. Why is it that you can make ones that could actually fool someone and they can\u2019t?&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Charlie We have a team of data scientists who, after receiving the recording data, we go in and really try to understand if this person has spoken every single word. And we try to annotate every single emphasis or maybe breathing patterns or laughs, so that the AI voice sounds more natural and more human. And for us, we can even simulate stuttering or, all of these imperfect artifacts which make human voice so real. Because humans aren\u2019t perfect.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Charlie We\u2019re also teaching it where the emphasis goes in, or which part of it is a laugh or which part of it is a sigh. We\u2019re also feeding it, for example, pitch information, so that the model learns how to change around the pitch.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">By the way: Remember how Adobe\u2019s Project Voco was meant to make it easier to edit podcasts and audiobooks? Well\u2014that idea was too good to stay down. 
Today, you can have that freedom by paying for a service called Descript.com. It\u2019s a suite of tools for podcasters to make it easier to edit recordings. Here\u2019s their ad:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Ad: Meet Descript. It\u2019s a powerful new tool that makes editing easy. So easy that you\u2019ll want to edit videos.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Girl: Nice!<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">And if you\u2019re willing to pay $24 a month, you get this:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Ad: Get this. Descript can turn your text back into audio. It\u2019s called Overdub. Just type what you meant to say right into Descript.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Wait, what? Isn\u2019t that exactly what Project Voco was supposed to do\u2014five years ago?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">I tried it out. (Open parenthesis: Descript and the other companies mentioned here didn\u2019t pay me to talk about them; most of \u2018em didn\u2019t even know I was doing this. Close paren.)<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">First, I had to teach Descript my voice by reading 15 minutes\u2019 worth of prepared text:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">DP: \u201cThe penguins stay when all other creatures have fled, because each guards a treasure.\u201d<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">\u2026and then, 24 hours later, Descript was ready to do the Project Voco thing. Let\u2019s recreate the same Key and Peele joke that Adobe used, but using my own voice. 
Here\u2019s what I actually recorded:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">DP: I jumped on the bed and\u2014 and I kissed my dogs and my wife in that order.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">And then, I edited the sentence just the way the Adobe guy did onstage, to produce this hilarious result:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Faux DP: I jumped on the bed and I kissed Jordan three times.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">OK, that\u2019s pretty amazing.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Voco and Descript are meant to fix a word or two in a legitimate recording. You can\u2019t use them to generate a whole paragraph, or a whole speech.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><em>That&nbsp;<\/em>is a bigger challenge, and that\u2019s the purpose of services like Lovo\u2014to make a full-scale voice clone that can say anything of any length and sound convincingly human. Right now, they take a lot of work and a lot of money. But Harvard\u2019s Joan Donovan says that technology will march on soon enough.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Donovan: As deep fakes require fewer and fewer images of people, and audio fakes require fewer and fewer sound bites, it\u2019s pushing us into a future of forgery that is going to it\u2019s\u2013 it\u2019s going to be confusing for a while.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">So\u2014is that it? Society is doomed? Nobody will ever be able to trust any photo, video, or audio clip again?&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Well\u2014maybe not. It may be that you already know about the solution to the deepfakes problem\u2014you heard it described 20 minutes ago.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Remember Adobe\u2019s 2016 demo of Project Voco? In that session, the presenter promised that the company was also working on fraud-detection technology, so we\u2019d know the difference between real and phony recordings. 
Remember?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Zeyu Don\u2019t worry. We have, like think about like a watermarking detection.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Well, Adobe hasn\u2019t forgotten.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Dana: One of the early experiments we\u2014 we were working on was something we called Project VoCo which is a voice editing, synthesizing software. But we actually ended up deciding not to release it yet, because we actually didn\u2019t know how to protect it.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This is Dana Rao, who\u2019s Adobe\u2019s chief counsel. Ever since that Voco demo, he and Adobe\u2019s engineers have been trying to figure out how to prevent a deepfake-ageddon.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Or an Infocalypse.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Dana: I was talking to our chief product officer, I said, you know, we\u2019re probably at the point where, where this is going to be really hard, as we said, tell fact from fiction. In a world where you don\u2019t believe anything anymore, there are two big problems. One is, you believe a lie. And the other big problem is you no longer believe the truth. Right? And once you lose both of those things, if you\u2019re in a democracy, you\u2019ve sort of lost the ability to govern.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Their first thought was to use artificial intelligence to detect if some photo or recording is fake or not.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">So we took the question back to our research team. The first question is, can we use A.I. to detect fakes? Like, that would be the easiest answer, right? And the response we got back from our researchers was, the technology to do the editing, which is what we do, is always going to be at par or step ahead of any technology to detect it. 
It\u2019s just like the security in the arms race where you like, you\u2019re always\u2014 you\u2019re improving your security, but the bad guys are out there improving their attacks. And sooner or later, you\u2019re going to lose that battle, or at least something\u2019s going to get through.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">But then\u2014a eureka moment.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We don\u2019t necessarily need technology that can identify a fake. What would be just as good is a way to prove that something is real. That would solve the trust problem. If there\u2019s some leaked recording of the president saying, you know, \u201cI like to run over baby animals,\u201d knowing if it\u2019s authentic would be just as good as knowing if it\u2019s a fake.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Dana: And so we said, \u201call right, what is another way to talk about this problem?\u201d Let\u2019s flip the problem on its head. And what we meant by that was, why don\u2019t we give a place for good actors to go to be trusted, instead of trying to catch all the bad actors, which we think is a losing proposition? And that\u2019s what CAI is designed to do.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">CAI is the Content Authentication Initiative. Five years after the Project Voco demonstration, Zeyu Jin\u2019s reference to watermarking\u2014&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Zeyu Think about, like, a watermarking detection.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">\u2014has blossomed into a full-blown\u2014I don\u2019t know, program? Feature? Technology? Campaign? Consortium? All of the above.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">I\u2019ll let Dana Rao describe how it works.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Dana: It occurred to us that we\u2019re in this unique position to help the consumers understand, like what happened to an image? 
I\u2019m gonna, you know, enhance the image. I\u2019m going to make it sharper. I\u2019m going to make it clearer.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">You make all the edits, and then you publish it. Once you publish it on the social media platform or wherever it is, the people can see it, they can see a little icon and they\u2019re like, \u201coh, I wonder if the president really did go there,\u201d and they can click on it and they can say, \u201cwell, it was David who took it.\u201d They can see the location of the image, where it was taken. They can see the edits that were made if they want to. They can actually see the original, they can go to our website, see the original image and see edited image and decide for themselves.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Now you have the facts. You decide for yourself. We empower the user to do it. That\u2019s sort of the end to end system that we\u2019re working on with a bunch of different partners to build out and hopefully change the conversation around how you consume content.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Obviously, this idea can work only if every link of the chain preserves that encrypted metadata that\u2019s embedded in the picture or recording. The phone camera that takes it. The software that edits it. The social-media network that posts it. Every step of the way.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Dana: And that\u2019s why this is an open standard. It\u2019s not an Adobe tool, it\u2019s not proprietary. We\u2019re building it with a bunch of partners. We want everyone to use it, we want every news media outlet to use it. We want every social platform. We want everyone, whoever does this. This is not a play for us to get money. We\u2019re not charging for it. 
So if you want your story to be told, you can do it.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Already, a bunch of companies are on board, including chip makers like Intel, ARM, and Qualcomm; software makers like Adobe and Microsoft; news outlets like the New York Times, the BBC, and the CBC; websites like Twitter, Facebook, and Getty images; and 55 other companies.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Here\u2019s an ad from the CAI website, which gives you an idea of how these companies will explain CAI to the public:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Ad: I am photographing with a CAI-enabled prototype. It\u2019s saying, \u201cDon\u2019t take my word for it.\u201d There\u2019s literally software that can prove, like, I didn\u2019t mess with this photo. This is where it was taken, this is when it was taken, and this is the certification that it\u2019s me who\u2019s made that content.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The feature that the CAI companies are adopting has a name, too. It shall be known as \u201cContent Credentials.\u201d When you see something suspicious online, you\u2019ll click a Content Credentials icon to see that content\u2019s credentials. And the path that it took to your eyeballs.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">And now, the big punch line: after years of work, Adobe has finally introduced this feature to the public. Just this week\u2014assuming you\u2019re listening to this podcast when it\u2019s hot off the servers\u2014Adobe unveiled the Content Credentials at Adobe Max.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Yeah, that\u2019s right: the story that began at the Adobe Max conference five years ago\u2026ended with the Adobe Max conference last week. This episode has bookends! Now&nbsp;<em>that\u2019s<\/em>&nbsp;what you call an ingeniously structured podcast.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Of course, the Content Credentials technology isn\u2019t a silver bullet. 
For one thing, the version just released works only on photos. Adobe hopes to have video and audio authentication maybe next year. Meanwhile, Harvard\u2019s Joan Donovan says we\u2019ll still have a lot of work to do\u2014in policy, law, and public awareness:<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Joan People have figured out how to wield this technology for serious, serious and grave consequences. We have a duty to the future to say that we\u2019re not going to allow it. We\u2019re not going to let it proliferate.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">And so as we think about the future of technology policy, I believe we need a whole of society approach. What is our responsibility to one another? What is technology companies\u2019 responsibility for that distribution and that exposure? And then how do we as a society, like, figure out what the true costs of misinformation are, so that we can do something about it?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">You know, throughout all of these interviews, I kept thinking: A new technology. Capable of editing a record of actual events. Experts predicting the erosion of public trust\u2026Where have I heard all this before?<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Deborah: A picture may no longer be worth a thousand words. These days, the picture that the camera takes may well not be the picture that we end up seeing in newspapers and magazines. Technology makes it difficult, maybe even impossible, to tell what\u2019s real and what\u2019s not.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">That\u2019s Deborah Norville, the host of \u201cThe Today Show,\u201d in February 1990. Her guest that day was Russell Brown, from Adobe, demonstrating version 1.0 of a brand-new program called\u2026 Photoshop.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Russell: We\u2019ll take this shot of Nancy and Ron. I\u2019m gonna place myself into this photograph. 
Based upon the skill of the artist using the program,&nbsp; they can give the illusion that photograph was quite real.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">There was also another guest, a cautionary voice:&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Norville: Fred Ritchin is an author who has written a book. You warn against the dangers of what people like Russell do.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Fred: Well, the thing is, when you see a photograph, you really tend to believe that something happened and when people start monkeying with photographs, you don\u2019t know which photographs are real, which ones happened, and which didn\u2019t. My concern is that if the media takes to doing what Russell is demonstrating now, that people, the public, will begin to disbelieve photographs generally, and it won\u2019t be as effective and powerful a document of social communication as it has been for the last 150 years.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Of course, these days, nobody worries about Photoshop bringing down civilization. We\u2019re totally blas\u00e9 about edited photos. We just go, \u201coh, that must have been Photoshopped,\u201d and we go on with our lives.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">I asked Nina Schick if these audio and video deepfakes are really any different.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">David Is there a newness to audio and video deepfakes that makes it more terrifying?&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Nina Yes.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">David And maybe we\u2019ll just get to a place where everyone\u2019s like, \u201coh, that\u2019s probably a deepfake?\u201d<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Nina Photo and image manipulation has a long history. The difference now is that it is not just images. 
You are talking about video\u2014 video manipulation, which until now has only been in the realm of Hollywood studios.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Still, she does acknowledge that there\u2019s more to it than the dawn of the Infocalypse.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Nina Like all powerful technologies of the exponential age, this is going to be an amplifier of human intention. It will be used for bad, just as it will be used for good. So just as they will be used by malicious actors, they\u2019re going to be many commercially valid, legitimate applications.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Now, I wanted to end this episode with a twist: I thought I\u2019d let my own voice clone from Lovo speak the final paragraph. But when I got the results back from Charlie Choi, it sounded so much like me that I didn\u2019t think you\u2019d be able to tell when I stopped and the deepfake voice started, and the gag would lose all impact. So I\u2019m going to make it super clear. From the end of this sentence until the credits, you\u2019re going to hear nothing but software, starting\u2026now.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Clone: I thought I\u2019d give the last word to\u2014my clone. My voice clone, the one that Charlie Choi\u2019s team at Lovo made for me. You\u2019re listening to him right now.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">And what I\u2019d like my voice to say is that: Well, in the end, voice synthesis is just another technology. What happens from here isn\u2019t about the tool; it\u2019s about whoever\u2019s wielding it.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">I\u2019m David Pogue\u2014or a synthetic version thereof. 
And this\u2026is \u201cUnsung Science.\u201d<\/p>\n<div class=\"powerpress_player\" id=\"powerpress_player_7824\"><audio class=\"wp-audio-shortcode\" id=\"audio-40-1\" preload=\"none\" style=\"width: 100%;\" controls=\"controls\"><source type=\"audio\/mpeg\" src=\"https:\/\/unsung.davidpogue.com\/wp-content\/uploads\/2023\/10\/unsungscience-20211029.mp3?_=1\" \/><a href=\"https:\/\/unsung.davidpogue.com\/wp-content\/uploads\/2023\/10\/unsungscience-20211029.mp3\">https:\/\/unsung.davidpogue.com\/wp-content\/uploads\/2023\/10\/unsungscience-20211029.mp3<\/a><\/audio><\/div><p class=\"powerpress_links powerpress_links_mp3\" style=\"margin-bottom: 1px !important;\">Podcast: <a href=\"https:\/\/unsung.davidpogue.com\/wp-content\/uploads\/2023\/10\/unsungscience-20211029.mp3\" class=\"powerpress_link_pinw\" target=\"_blank\" title=\"Play in new window\" onclick=\"return powerpress_pinw('https:\/\/www.unsungscience.com\/?powerpress_pinw=40-podcast');\" rel=\"nofollow\">Play in new window<\/a> | <a href=\"https:\/\/unsung.davidpogue.com\/wp-content\/uploads\/2023\/10\/unsungscience-20211029.mp3\" class=\"powerpress_link_d\" title=\"Download\" rel=\"nofollow\" download=\"unsungscience-20211029.mp3\">Download<\/a><\/p>","protected":false},"excerpt":{"rendered":"<p>The media is plenty freaked out about \u201cdeepfakes\u201d: Computer-generated videos of famous people saying things they never actually said. But only the video is faked; the audio parts, the voices of those fake celebrities, were supplied by human impersonators. 
But now, software exists to mimic anyone\u2019s voice, opening a Pandora\u2019s Box of fraud, deception, and what one expert calls \u201cthe end of trust.\u201d<span class=\"excerpt-more-link\"><a class=\"more-link\" href=\"https:\/\/www.unsungscience.com\/index.php\/2021\/10\/29\/audio-deepfakes-and-the-end-of-trust\/\">More <svg class=\"svg-icon\" width=\"24\" height=\"24\" aria-hidden=\"true\" role=\"img\" focusable=\"false\" viewBox=\"0 0 24 24\" fill=\"none\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\"><path fill-rule=\"evenodd\" clip-rule=\"evenodd\" d=\"M6.96954 10.2804L11.9999 15.3107L17.0302 10.2804L15.9695 9.21973L11.9999 13.1894L8.0302 9.21973L6.96954 10.2804Z\" fill=\"currentColor\"\/><\/svg><\/a><\/span><\/p>\n<div class=\"excerpt-audio-block\">\n<figure class=\"wp-block-audio\"><audio controls src=\"https:\/\/unsung.davidpogue.com\/wp-content\/uploads\/2023\/10\/unsungscience-20211029.mp3\"><\/audio><\/figure>\n<\/div>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-40","post","type-post","status-publish","format-standard","hentry","category-uncategorized","entry"],"_links":{"self":[{"href":"https:\/\/www.unsungscience.com\/index.php\/wp-json\/wp\/v2\/posts\/40","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.unsungscience.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.unsungscience.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.unsungscience.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.unsungscience.com\/index.php\/wp-json\/wp\/v2\/comments?post=40"}],"version-history":[{"count":2,"href":"https:\/\/www.unsungscience.com\/index.php\/wp-json\/wp\/v2\/posts\/40\/revisions"}],"predecessor-version":[{"id":47,"href":"https:\/\/www.unsungscie
nce.com\/index.php\/wp-json\/wp\/v2\/posts\/40\/revisions\/47"}],"wp:attachment":[{"href":"https:\/\/www.unsungscience.com\/index.php\/wp-json\/wp\/v2\/media?parent=40"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.unsungscience.com\/index.php\/wp-json\/wp\/v2\/categories?post=40"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.unsungscience.com\/index.php\/wp-json\/wp\/v2\/tags?post=40"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}