Technologist Roundtable: Key Issues in AI and Data Protection Post-Event Summary and Takeaways
Co-Authored with Marlene Smith, Research Assistant for AI
On November 27, 2024, the Future of Privacy Forum (FPF) hosted a Technologist Roundtable with the goal of convening an open dialogue on complex technical questions that impact law and policy, and assisting global data protection and privacy policymakers in understanding the relevant technical basics of large language models (LLMs). We invited a wide range of academic technical experts to convene with one another and with data protection regulators and policymakers from around the world.
We were joined by the following experts:
- Dr. Yves-Alexandre de Montjoye, Associate Professor of Applied Mathematics and Computer Science, Imperial College London; Lead, Computational Privacy Group, Imperial College London; Special Adviser on AI and Data Protection to EC Justice Commissioner Reynders; Parliament-appointed expert to the Belgian Data Protection Agency
- Dr. David Weinkauf, Senior IT Research Analyst, Office of the Privacy Commissioner of Canada; Member, Berlin Group
- Dr. Norman Sadeh, Professor in the School of Computer Science, Carnegie Mellon University; Co-Director, Privacy Engineering Program, Carnegie Mellon University
- Dr. Niloofar Mireshghallah, Post-doctoral scholar at the Paul G. Allen Center for Computer Science and Engineering, University of Washington
- Dr. Rachel Cummings, Associate Professor of Industrial Engineering and Operations Research, Columbia University; Co-chair of the Cybersecurity Research Center at the Data Science Institute, Columbia University
- Dr. Damien Desfontaines, Staff Scientist, Tumult Labs; Expert for the EDPB Support Pool of Experts
As a result of the emergence of LLMs, data protection authorities and lawmakers are exploring a range of novel data protection issues, including how to ensure lawful processing of personal data in LLMs, and how to comply with obligations such as data deletion and correction requests. While LLMs can process personal data at different stages,1 including in training and in the input and output of models, there is an emerging question of the extent to which personal data exists “within” a model itself.2 Navigating these complex emerging issues increasingly requires understanding the technical building blocks of LLMs.
This post-event summary contains highlights and key discussion takeaways from the three parts of the November 27 Roundtable:
- Basics of Transformer Technology and Tokenization
- Training and Data Minimization
- Memorization, Filters, and “Un-Learning”
We hope that this document supports ongoing efforts to explore and understand a range of novel issues at the intersection of data protection and artificial intelligence models.
If you have any questions, comments, or wish to discuss any of the topics related to the Roundtable and Post-Event Summary, please do not hesitate to reach out to FPF’s Center for AI at [email protected].