New White Paper Provides Guidance on Embedding Data Protection Principles in Machine Learning


Immuta and the Future of Privacy Forum (FPF) today released a working white paper, Data Protection by Process: How to Operationalise Data Protection by Design for Machine Learning, that provides guidance on embedding data protection principles within the life cycle of a machine learning model. 

Data Protection by Design (DPbD) is a core data protection requirement introduced in Article 25 of the General Data Protection Regulation (GDPR). In the machine learning context, this obligation requires engineers to integrate data protection and privacy measures from the very beginning of a new ML model’s life cycle and then take them into account at every stage of the process. The requirement has frequently been criticized for being vague and difficult to implement in practice.

The paper, co-authored by Sophie Stalla-Bourdillon of Immuta, Alfred Rossi of Immuta, and Gabriela Zanfir-Fortuna of FPF, provides clear instructions on how to fulfill the DPbD obligation and how to build a DPbD strategy in line with data protection principles.

“The GDPR has been criticised by many for being too high-level or outdated and therefore impossible to implement in practice,” said Stalla-Bourdillon, senior privacy counsel and legal engineer at Immuta. “Our work aims to bridge the gap between theory and practice, to make it possible for data scientists to seriously take into account data protection and privacy requirements as early as possible. Working closely with engineers, we have built a framework to operationalise data protection by design, which should be seen by all as the backbone of the GDPR.” 

The authors have released this as a working paper and welcome your comments. Please share your comments by sending an email to [email protected]. The ultimate goal of the paper is to begin shaping a framework for leveraging DPbD when developing and deploying ML models.