Reducing malicious use of synthetic media research
Jess Whittlestone and I recently distributed a working paper exploring the challenges and options around ensuring that machine learning research is not used for harm, focusing on synthetic media. This post is just a brief overview, so do read or skim the full paper here; it was written specifically to be skimmable and referenceable! (Here is the citable arXiv link, though it might be missing some minor fixes given update delays.)
Over the last few years, research advances — primarily in machine learning (ML) — have enabled the creation of increasingly convincing and realistic synthetic media: artificially generated and manipulated video, audio, images, and text. It is now possible to generate images of faces that are almost indistinguishable from photographs. It is becoming easier to manipulate existing videos of people: replacing a person’s facial expressions or movements with those of another person, and/or overlaying synthesized speech which imitates a person’s voice. Language models have advanced to the point where, given a sentence prompt, they can write articles that appear at least superficially realistic.
These advances could be used to impersonate people, sway public opinion, or more generally spread doubt about the veracity of all media. Modern synthetic media is in fact already being used for such harmful purposes: face-swapping tools are being used to harass journalists; synthetic voices are being used for financial crimes; and synthetic faces have allegedly been used for espionage. The researchers whose work contributed to these advances in synthetic media did not, of course, intend them to be used for such malicious purposes — and may not have anticipated this. We need more humility around what we don’t know about the potential impacts and uses of new technology, and we may need better systems for understanding and managing such impacts.
One approach being used to potentially mitigate harmful impacts is to consider partially withholding research outputs which seem particularly susceptible to malicious use (mal-use). Other fields, such as biotechnology and information security, have developed practices that can limit what researchers study or publish in order to reduce harm. Whether the machine learning community should also develop analogous practices, and specifically more cautionary release practices, has been the subject of recent discussion. In the full paper, we go into specific practices that might be adoptable from those fields.
This debate can easily become polarized, especially because the ML community has very strong norms around the open sharing of data, algorithms, models and research papers. While some argue that restricted release will be necessary to prevent increasingly powerful ML capabilities from being misused, others worry that doing so might undermine the ability to distribute the benefits of ML widely, and increase power concentration as just a few research groups disproportionately control the development of ML capabilities. Opinions on how effective release practices will be at reducing risks of mal-use also vary, as do views on how large the risks from ML research even are at present.
Our aim is to reduce this polarization, and encourage a more nuanced conversation about both the risks of ML (and particularly synthetic media) research, and different release practices for mitigating those risks. We start by simply laying out some useful tools, analogies, and options for thinking about these issues: some different ways of thinking about the harms that might arise from ML research; various approaches to risk mitigation from other fields that ML might adopt; and important considerations for making tradeoffs between the costs and benefits of different release practices.
We particularly want to emphasize that when thinking about release practices, the choice is not a binary one between ‘release’ and ‘don’t release’. There are several different dimensions to consider and many different options within each of these dimensions, including: (1) content: what is released (options ranging from a fully runnable system all the way to a simple use case idea or concept); (2) timing: when it is released (options include immediate release, release at a predetermined time or after a specific external event, and staged release of increasingly powerful systems); and (3) distribution: where and to whom it is released (options ranging from full public access to having release safety levels with auditing and approval processes for determining who has access).
It is equally, if not more, important to explore not just different release options, but all the processes that sit around them: under what conditions should different types of release generally be used? If decisions need to be made on a case-by-case basis, how should such evaluations be conducted? Who needs to be able to make these decisions, and what training should they receive? What kinds of institutions, systems, and processes need to coordinate and manage all of this?
We think that by considering this much wider array of options and questions, and taking seriously different perspectives on the tradeoffs involved in withholding publication of any research, it should be possible for the ML community to develop norms and processes which strike a careful balance between mitigating mal-use risks and preserving other important values and benefits of openness in research. For example, having formalized release procedures, perhaps developed and managed by external institutions, could help reduce concerns around power concentration while recognizing that different release options will be required in different circumstances, and full openness may not always be the best approach.
More concretely, we believe that the ML research community would benefit from:
- Working to better understand the landscape of potential risks from ML research and possible mitigation strategies by: mapping risks of different types of ML research in collaboration with subject matter experts and mapping mitigation options for specific areas of research.
- Building a community and norms around understanding and mitigating mal-use impacts of ML research by, for example: establishing regular workshops at major conferences; spreading awareness of risks to impacted groups and those who can help mitigate them; and encouraging impact evaluation for research publications, presentations, and proposals.
- Establishing institutions and systems to support research practices in ML, including potentially by: funding expert impact evaluation of research proposals; prototyping vetting systems to help enable shared access to potentially sensitive research; and developing release procedures for research already deemed as raising potential risks.
Finally, release practices are of course only one component of mitigating mal-use of research. Mitigation matters at every stage, from project inception to funding, execution, and productization. We must also more closely examine what kinds of work are incentivized by conferences, funders, and hiring decisions.
We simply do not know enough yet about negative impacts or the efficacy of new research practices. We do not advocate here for mandating policy change. But not knowing is not an invitation to ignore risks, especially concerning the irreversible act of publication. We must invest in knowing — in understanding the likely impacts of research while avoiding hyperbole — and in testing out processes to see if they do work to mitigate harm.
Ultimately, we want research to benefit humanity. We see this work as part of a maturing of the ML community, alongside crucial efforts to ensure that systems are fair, transparent, and accountable. As ML reshapes our lives, researchers will continue to come to terms with their new powers and impacts on world affairs.
Aviv Ovadya is the founder of the Thoughtful Technology Project, and was previously Chief Technologist at the Center for Social Media Responsibility at UMSI. Jess Whittlestone is a postdoc at the Leverhulme Centre for the Future of Intelligence at the University of Cambridge.