The Path to Deepfake Harm
How, when, and why synthetic media (now called generative AI) can be used for harm
This post is an excerpted section from a working paper with Jess Whittlestone (shared in 2019, but minimal updates were needed). While the full paper was focused on synthetic media research, this section is far more broadly applicable and often referenced in other contexts: it applies in general to malicious (and beneficial) use of technologies, from video generation, to language models (e.g. GPT-3), to cryptocurrencies. This piece jumps into the meat, so for more background on this topic, see the paper overview here.
We aim to connect the dots between the theoretical potential for the malicious use (mal-use) of synthetic media technology, and what actually makes significant harm likely.
Factors Impacting the Likelihood of Harm
Below we explore the factors influencing whether a new capability overcomes activation energy and friction, and will lead to sustained mal-use in practice. We use artificial voice cloning — “copying” a voice so that it can be used to say anything — as an illustrative example. It is a relatively new capability with many useful applications (e.g. in voice translation and audio editing) but also significant potential for mal-use (e.g. in scams, political propaganda, and market manipulation).
1. Awareness: Do actors with malicious intent know about a capability and believe it can help them?
We can break this down into:
- Attention of adversaries: Are malicious actors likely to realize that they could use a new capability to further their ends? If adversary groups are already using closely related methods, this is much more likely: for example, if edited voice clips are already being used for political manipulation, groups doing this are more likely to pay attention to demonstrations of voice cloning.
- ‘Convincibility’ of those with resources: Are there compelling arguments, perhaps by authoritative third parties, for the effectiveness of new capabilities? For example, a scammer who realizes that voice cloning is useful might need to be able to convince a superior that this technology is effective enough to justify the costs and overcome institutional inertia.
2. Deployment: How difficult is it for adversaries to weaponize this capability in practice?
For a capability to be deployed for malicious purposes, adversaries need not only to be aware of it but also to have the necessary skills and resources to productize and weaponize it. This isn’t a binary (e.g. having ML expertise vs. not); rather, many different factors influence how easy a capability is to weaponize. At the extreme, we might have a product which can be immediately used by anyone, regardless of technical capability (such as free-to-use voice cloning software).
Factors that influence the ease of deployment for mal-use include:
- Talent pipelines: How difficult is it to source someone who can apply a new capability for the desired use case? (e.g. do malicious actors need someone with machine learning experience or programming experience, or can they just use a program directly to achieve their goals?)
- Reproducibility: How difficult is it to reproduce a capability given the information available? (e.g. is it easy to replicate a voice cloning capability given the available papers, models, code, etc.?)
- Modifiability: How difficult is it to modify or use a system in order to enable mal-use? (e.g. if a voice cloning product makes it difficult to clone a voice without consent or watermarks, how hard is it to overcome those limitations?)
- Slottability: Can new capabilities be slotted into existing organizational processes or technical systems? (e.g. are there already established processes for phone scams into which new voice generation capabilities can be slotted easily, without any need to change goals or strategy?)
- Environmental factors: How does the existing ‘environment’ or ‘infrastructure’ impact the usefulness of the new capability for malicious actors? (E.g. currently, in the US it is easy to ‘spoof’ phone numbers to make it appear like a call is coming from a family member, which could impact the likelihood of voice cloning being weaponized for phone scams.)
Websites now enabling anyone to instantly generate seemingly photorealistic faces are a concrete example of deployment barriers falling away and making mal-use easier. It had been possible for well over a year to generate synthetic images of faces with fairly high quality, but such websites have enabled anyone to do so with no technical expertise. This capability can also immediately slot into existing processes, such as fake account creation. Previously, malicious actors would often use existing photos of real people, which could be identified with reverse image search, unlike wholly generated synthetic images.
[Update: Since summer of 2019 when the previous paragraph was written, generated faces have been used for espionage, influence operations, phishing, etc., as predicted. At the time, in an attempt to limit such harm, I reached out to the creators of tools that made face generation broadly accessible, suggesting modifications such as watermarks to at least make mal-use more difficult; but no changes were made. There is reason to believe that if those changes had been made, it would have bought more time for mitigations before this technology was successfully weaponized.]
3. Sustained use: How likely is it that a capability will lead to sustained use with substantial negative impacts?
Even if adversaries are aware of and able to weaponize some new capability, whether or not this leads to sustained use depends on:
- Actual ROI: If malicious actors believe that the return on investment (ROI) for using a capability is low, they might not continue to use it in practice. For example, if a form of mal-use is easy to detect, then adversaries might decide it’s not worth the risk or might be shut down very quickly. [Social factors can also influence ROI, which partly explains why explicit deepfakes are disproportionately used to harass women.]
- Assessment of ROI: If malicious actors have no way of assessing whether new capabilities are helping them better achieve their goals, or if their assessments are flawed, they might not continue to put resources into using those capabilities.
We can think of this as a kind of progression, from a theoretical capability to scaled-up use in practice. Once a technology has progressed down this path, has become easy to use, and has proven to have high ROI for mal-use, it can be much more difficult to address than at earlier stages. We call this the access ratchet (like a ratchet, increased access to technology cannot generally be undone). For any capability with potential for mal-use, it is therefore worth thinking about where it currently sits on this progression: how much attention and interest it is receiving; whether it has been weaponized and/or how costly it would be to do so; and whether it is likely to be, or already is, in sustained use. This can help us think more clearly about where the greatest risks of mal-use are, and which kinds of interventions might be appropriate or necessary in a given situation.
Researchers may argue that a capability is unlikely to cause harm since it has not been used maliciously yet. What this doesn’t address is the fact that a capability which has not yet been used maliciously might sit anywhere along this progression, which makes a huge difference to how likely it is to cause harm. For example, Face2Face, a technique for real-time facial reenactment (i.e. changing a person’s expressions in a video), has existed for 4 years but has not been developed into any products that can easily be used. This lack of productization makes harmful use vastly less likely, especially given the competition for AI and engineering talent today. It is also worth considering how costly it would be to make a given capability easier to misuse: even the DeepFake application, which is more accessible to non-technical users, is currently resource-intensive to weaponize in practice.
Indirect Harms and Disinformation Ratchets
Sometimes the path to harm from synthetic media will be fairly direct and immediate. Returning to our example of voice cloning being used in financial scams, the harm is simply a person losing their money.
But in other cases, improved synthetic media capabilities might cause harm in more complex and indirect ways. Consider the case where misinformation purveyors get hold of sophisticated synthetic media capabilities and use them to win substantial democratic power, which they then use to control narratives further and undermine any mitigation efforts (not an uncommon path from democracy to authoritarianism). We can think about this as a disinformation ratchet: the ability to use disinformation to enhance one’s ability to distribute further disinformation; and the opportunity for this type of ratchet can be influenced by new technology impacting media distribution channels and capabilities.
These less direct kinds of harms may be harder to anticipate or imagine, but in the long run they may be much more important, particularly if they influence the future development of technology in ways that undermine our ability to deal with future threats. We suggest that it’s particularly important to consider these kinds of “sociotechnical-path dependencies” as well as more direct and immediate threats, and what kinds of risk mitigation strategies might best address them.
People often ask why synthetic media isn’t a significant problem yet, particularly given all of the prior concern around it. Was this a false fear?
The answer is simply that synthetic media technology has intentionally not been productized yet to a meaningful extent, decreasing the degree to which all of the factors described above are met (relative to other risks, such as the status quo of “shallow fakes”). While misuse does happen, and does cause real harm, the impact has thus not yet been as broadly significant as has been feared.
This is not a deep technical limitation; it is a result of the concern, and of the resulting intentional choices made by well-meaning actors and those sensitive to negative press or liability. Wherever we have seen consumer-grade synthetic media tools of sufficient quality, without mitigation measures, they have been weaponized. Thankfully, so far, most organizations and researchers that have had the resources to productize synthetic media tools have also put significant resources into mitigation measures, or they simply avoided productization entirely. There are many examples of this, with Overdub’s consent system and Adobe VOCO’s disappearance being among the most prominent. In other words, synthetic media’s impact has been comparatively small so far because of a concerted effort to limit misuse, often by essentially burying mature technology or significantly restricting valuable products. (2023 update: This has changed. There are far fewer guardrails now, and increasing levels of abuse.)
Moreover, while this excerpt is focused on the specific case of intentional malicious use, it is crucial to note that synthetic media technology does not need to be used by malicious actors to cause very significant harm. As is noted in the paper: “The simple existence of the technology, regardless of its use, can allow bad actors to claim that evidence of e.g. corruption or war crimes were synthesized in order to avoid accountability. For example, allegations of videos being faked have been used to justify a coup in Gabon, and exculpate a cabinet minister in Malaysia.” There are also other potential sources of harm, such as misinterpretations of synthetic content created as an act of parody.
Citation note: Cite the paper in academic works, but link to this piece in contexts where it is more applicable. This piece I wrote for MIT Technology Review also goes into more detail on concrete steps that developers of synthetic media tools can take to limit harm.