Improvement in image synthesis, from https://arxiv.org/abs/1406.2661, https://arxiv.org/abs/1511.06434, https://arxiv.org/abs/1606.07536, https://arxiv.org/abs/1710.10196, https://arxiv.org/abs/1812.04948; via Ian Goodfellow

The Path to Deepfake Harm

How, when, and why synthetic media can be used for harm

Factors Impacting the Likelihood of Harm

1. Awareness: Do actors with malicious intent know about a capability and believe it can help them?

  • Attention of adversaries: Are malicious actors likely to realize that they could use a new capability to further their ends? If adversary groups are already using closely related methods, this is much more likely: for example, if edited voice clips are already being used for political manipulation, groups doing this are more likely to pay attention to demonstrations of voice cloning.
  • ‘Convincibility’ of those with resources: Are there compelling arguments, perhaps by authoritative third parties, for the effectiveness of new capabilities? For example, a scammer who realizes that voice cloning is useful might need to be able to convince a superior that this technology is effective enough to justify the costs and overcome institutional inertia.

2. Deployment: How difficult is it for adversaries to weaponize this capability in practice?

  • Talent pipelines: How difficult is it to source someone who can apply a new capability for the desired use case? (e.g. do malicious actors need someone with machine learning or programming experience, or can they use an existing program directly to achieve their goals?)
  • Reproducibility: How difficult is it to reproduce a capability given the information available? (e.g. is it easy to replicate a voice cloning capability given the available papers, models, code, etc.?)
  • Modifiability: How difficult is it to modify or use a system in order to enable mal-use? (e.g. if a voice cloning product makes it difficult to clone a voice without consent or watermarks, how hard is it to overcome those limitations?)
  • Slottability: Can new capabilities be slotted into existing organizational processes or technical systems? (e.g. are there already established processes for phone scams into which new voice generation capabilities can be slotted easily, without any need to change goals or strategy?)
  • Environmental factors: How does the existing ‘environment’ or ‘infrastructure’ impact the usefulness of the new capability for malicious actors? (e.g. currently, in the US it is easy to ‘spoof’ phone numbers to make it appear like a call is coming from a family member, which could impact the likelihood of voice cloning being weaponized for phone scams.)

3. Sustained use: How likely is it that a capability will lead to sustained use with substantial negative impacts?

  • Actual ROI: If malicious actors believe that the return on investment (ROI) for using a capability is low, they might not continue to use it in practice. For example, if a form of mal-use is easy to detect, then adversaries might decide it’s not worth the risk or might be shut down very quickly. [Social factors can also influence ROI, which partly explains why explicit deepfakes are disproportionately used to harass women.]
  • Assessment of ROI: If malicious actors have no way of assessing whether new capabilities are helping them better achieve their goals, or if their assessments are flawed, they might not continue to put resources into using those capabilities.

Access Ratchets

Indirect Harms and Disinformation Ratchets

Addendum

Aviv Ovadya

Founder of the Thoughtful Technology Project & GMF non-res fellow. Prev Tow fellow & Chief Technologist @ Center for Social Media Responsibility. av@aviv.me