Privacy Scholarship Research Reporter: Issue 5, July 2020 – Preserving Privacy in Machine Learning: New Research on Data and Model Privacy


Notes from FPF

In this edition of the “Privacy Scholarship Reporter”, we build on the general knowledge from the first two and then explore some of the technical research being conducted to achieve ethical and privacy goals. 

“Is it possible to preserve privacy in the age of AI?” is a provocative question asked by both academic and industry researchers. The answer depends on a mixture of responses that secure privacy for training data and input data, and that preserve privacy and reduce the possibility of social harms arising from the interpretation of model output. Likewise, preservation of data privacy requires securing AI assets to ensure model privacy (Thaine and Penn 2020, Figure 1). As the work of machine learning and privacy researchers demonstrates, there are myriad ways to preserve privacy, and the best method may vary according to the purpose of the algorithm or the form of machine learning built.

Within the growing body of research addressing privacy considerations in artificial intelligence and machine learning, two approaches are emerging: a data centered approach and a model centered approach. One group of methods to secure privacy through attention to data is “differential privacy”. Differential privacy introduces mathematical perturbation (calibrated random noise) into a data set, or into the statistics computed from it, so that a specific individual’s inclusion in the data cannot be detected by comparing summary statistics generated with and without that individual’s record. Other methods for increasing the privacy protections for data prior to use in ML models include homomorphic encryption, secure multi-party computation, and federated learning. Homomorphic encryption preserves data privacy by allowing analysis to be performed directly on encrypted data. Secure multi-party computation is a protocol for collaboration between parties holding information they prefer to keep private from one another, without the intervention of a trusted third party. Federated learning allows data to be stored and analyzed locally through models, or segments of models, sent to the user’s device.
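To make the idea of differential privacy more concrete, the following is a minimal sketch of one common technique, the Laplace mechanism, applied to a simple counting query. It is an illustration only and is not drawn from any of the papers reviewed here. The function names, the example data, and the choice of epsilon are all assumptions made for the example.

```python
import random


def laplace_noise(scale, rng):
    """Sample zero-mean Laplace noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Return a noisy count of the records that satisfy predicate.

    Adding or removing one person changes a count by at most 1, so the
    sensitivity is 1 and the noise scale is sensitivity / epsilon.
    """
    true_count = sum(1 for record in records if predicate(record))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(2020)          # seeded only to make the demo repeatable
    ages = [23, 35, 41, 52, 67]        # hypothetical data set
    # Smaller epsilon means more noise and stronger privacy.
    print(private_count(ages, lambda age: age >= 40, epsilon=0.5, rng=rng))
```

Because the released count is noisy, an observer comparing results computed with and without one person's record cannot reliably tell whether that person was included. The cost is some loss of accuracy, which grows as epsilon shrinks.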

Data centered methods to preserve privacy leave open some possibility that models may be compromised, allowing for an attack on users’ data and on a company’s models. Conversely, research to design model centered methods to preserve privacy presently focuses on ways to secure models from attack. There are two general forms of attack against models that could reduce privacy for the developers of the models or for those whose data is used by the attacked model. “Black-box” attacks draw private information from machine learning models through maliciously gained functional access to a model, without knowledge of its internal details, while “white-box” attacks gain access to information about an individual’s contribution through ill-gained knowledge of the details of the model itself. Both represent risks to individual data privacy that arise from challenges to a company’s model privacy. The costs of lost privacy through machine learning may fall on individuals, through privacy breaches, or on companies, through the loss of proprietary assets and reputation, and can range into the millions of dollars. Working to reduce those losses is an important component of present privacy and machine learning research, as represented in the papers below.

As always, we would love to hear your feedback on this issue. You can email us at [email protected].

Sara Jordan, Policy Counsel, FPF

Preserving Privacy in Artificial Intelligence and Machine Learning: Theory and Practice


Perfectly Privacy-Preserving AI: What is it and how do we achieve it?


Perfect preservation of privacy in artificial intelligence applications will entail significant efforts across the full lifecycle of AI product development, deployment, and decommissioning, focusing on the privacy protections implemented by both data creators and model creators. Preservation of privacy would entail focusing on:

1. Training Data Privacy: The guarantee that a malicious actor will not be able to reverse-engineer the training data.
2. Input Privacy: The guarantee that a user’s input data cannot be observed by other parties, including the model creator.
3. Output Privacy: The guarantee that the output of a model is not visible by anyone except for the user whose data is being inferred upon.
4. Model Privacy: The guarantee that the model cannot be stolen by a malicious party.

A combination of the tools available may represent the best, albeit still theoretical, paths toward perfectly privacy-preserving AI.

Authors’ Abstract

Many AI applications need to process huge amounts of sensitive information for model training, evaluation, and real-world integration. These tasks include facial recognition, speaker recognition, text processing, and genomic data analysis. Unfortunately, one of the following two scenarios occur when training models to perform the aforementioned tasks: either models end up being trained on sensitive user information, making them vulnerable to malicious actors, or their evaluations are not representative of their abilities since the scope of the test set is limited. In some cases, the models never get created in the first place. There are a number of approaches that can be integrated into AI algorithms in order to maintain various levels of privacy. Namely, differential privacy, secure multi-party computation, homomorphic encryption, federated learning, secure enclaves, and automatic data de-identification. We will briefly explain each of these methods and describe the scenarios in which they would be most appropriate. Recently, several of these methods have been applied to machine learning models. We will cover some of the most interesting examples of privacy-preserving ML, including the integration of differential privacy with neural networks to avoid unwanted inferences from being made of a network’s training data. Finally, we will discuss how the privacy-preserving machine learning approaches that have been proposed so far would need to be combined in order to achieve perfectly privacy-preserving machine learning.

“Perfectly Privacy-Preserving AI” by Patricia Thaine from Towards Data Science, January 1, 2020.

Privacy-Preserving Deep Learning


Privacy preservation in machine learning depends on the privacy of the training and input data, privacy of the model itself, and privacy of the model’s outputs. Researchers have demonstrated an ability to improve privacy preservation in these areas for many forms of machine learning, but doing so for neural network training over sensitive data presents persistent problems. In this (now) classic article, the authors describe “collaborative neural network training” which “protects privacy of the training data, enables participants to control the learning objective and how much to reveal about their individual models, and lets them apply the jointly learned model to their own inputs without revealing the inputs or the outputs”. A collaborative architecture protects privacy by ensuring data is not revealed to a third party, such as a machine learning as a service (MLaaS) provider, passive adversary, or malicious attacker, and by ensuring that data owners have control over their data assets. This is particularly useful in areas where data owners cannot directly share their data with third parties due to privacy or confidentiality concerns (e.g., healthcare).

Authors’ Abstract

Deep learning based on artificial neural networks is a very popular approach to modeling, classifying, and recognizing complex data such as images, speech, and text. The unprecedented accuracy of deep learning methods has turned them into the foundation of new AI-based services on the Internet. Commercial companies that collect user data on a large scale have been the main beneficiaries of this trend since the success of deep learning techniques is directly proportional to the amount of data available for training. Massive data collection required for deep learning presents obvious privacy issues. Users’ personal, highly sensitive data such as photos and voice recordings is kept indefinitely by the companies that collect it. Users can neither delete it, nor restrict the purposes for which it is used. Furthermore, centrally kept data is subject to legal subpoenas and extra-judicial surveillance. Many data owners—for example, medical institutions that may want to apply deep learning methods to clinical records—are prevented by privacy and confidentiality concerns from sharing the data and thus benefiting from large-scale deep learning. In this paper, we design, implement, and evaluate a practical system that enables multiple parties to jointly learn an accurate neural network model for a given objective without sharing their input datasets. We exploit the fact that the optimization algorithms used in modern deep learning, namely, those based on stochastic gradient descent, can be parallelized and executed asynchronously. Our system lets participants train independently on their own datasets and selectively share small subsets of their models’ key parameters during training. This offers an attractive point in the utility/privacy tradeoff space: participants preserve the privacy of their respective data while still benefiting from other participants’ models and thus boosting their learning accuracy beyond what is achievable solely on their own inputs. 
We demonstrate the accuracy of our privacy preserving deep learning on benchmark datasets.

“Privacy-Preserving Deep Learning” by Reza Shokri and Vitaly Shmatikov, October 2015.


Chiron: Privacy-preserving Machine Learning as a Service


Machine learning is a complex process that is difficult for many of the companies who might benefit from it to replicate well. To fill this need, some machine learning companies now provide machine learning development and testing as a service, similar to analytics as a service. Machine learning as a service (MLaaS) is democratizing access to the powerful analytic insights of machine learning techniques. Increasing access to machine learning techniques corresponds to an increase in the risk that companies’ models, part of their intellectual property and corporate private goods, might be leaked to users of MLaaS. Uses of MLaaS also increase the risk to the privacy of individuals whose data is introduced into ML models, because service platforms could be compromised by internal or external attacks. This paper proposes a system that builds on Software Guard Extensions (SGX) enclaves, which limit an untrusted platform’s access to code or data, and Ryoan (distributed sandboxes that separate programs from one another to prevent unintentional transfer or contamination). Ryoan sandboxes confine the model-creation code, allowing it to define and train a model while ensuring that it does not leak data to untrusted parties. The result is an MLaaS platform that protects both those providing ML services and those seeking services and supplying data.

Authors’ Abstract

Major cloud operators offer machine learning (ML) as a service, enabling customers who have the data but not ML expertise or infrastructure to train predictive models on this data. Existing ML-as-a-service platforms require users to reveal all training data to the service operator. We design, implement, and evaluate Chiron, a system for privacy-preserving machine learning as a service. First, Chiron conceals the training data from the service operator. Second, in keeping with how many existing ML-as-a-service platforms work, Chiron reveals neither the training algorithm nor the model structure to the user, providing only black-box access to the trained model. Chiron is implemented using SGX enclaves, but SGX alone does not achieve the dual goals of data privacy and model confidentiality. Chiron runs the standard ML training toolchain (including the popular Theano framework and C compiler) in an enclave, but the untrusted model-creation code from the service operator is further confined in a Ryoan sandbox to prevent it from leaking the training data outside the enclave. To support distributed training, Chiron executes multiple concurrent enclaves that exchange model parameters via a parameter server. We evaluate Chiron on popular deep learning models, focusing on benchmark image classification tasks such as CIFAR and ImageNet, and show that its training performance and accuracy of the resulting models are practical for common uses of ML-as-a-service.

“Chiron: Privacy-preserving Machine Learning as a Service” by Tyler Hunt, Congzheng Song, Reza Shokri, Vitaly Shmatikov, and Emmett Witchel, March 15, 2018.


Privacy Preserving Machine Learning: Applications to Health Care Information


Privacy-Preserving Deep Learning for the Detection of Protected Health Information in Real-World Data: Comparative Evaluation


A common concern about privacy in machine learning systems is that the massive amounts of data involved represent a quantitatively definable risk that is over and above the risk to privacy when smaller datasets are used. However, as researchers using small datasets of sensitive information attest, the risk may lie not in how much data is used but in how small or disparate pieces of data are aggregated for use. Researchers studying collaborative learning models propose methods for securing data against leakage when it is gathered from multiple sources. Forms of collaborative learning proposed as methods to improve integration of sensitive private information in neural network settings include “round robin techniques” and “privacy-preserving distributed selective stochastic gradient descent (DSSGD)”. DSSGD protects private information in two ways: training takes place locally (on-device or on-site), and each participant restricts how much information is shared back to a central model server.
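The two ingredients described above, local training and restricted sharing, can be sketched in a few lines. The following is a heavily simplified illustration and not the implementation used in the paper: it swaps the neural network for a small linear model, and it assumes each participant downloads all current parameters each round and shares only the largest-magnitude half of its gradient entries. All function names and parameter values are hypothetical.

```python
import math


def local_gradient(weights, data):
    """Mean-squared-error gradient of a linear model on one site's private data."""
    grad = [0.0] * len(weights)
    for x, y in data:
        err = sum(w * xi for w, xi in zip(weights, x)) - y
        for j, xj in enumerate(x):
            grad[j] += 2.0 * err * xj / len(data)
    return grad


def select_top_fraction(grad, fraction):
    """Keep only the largest-magnitude gradient entries; the rest stay local."""
    k = max(1, math.ceil(fraction * len(grad)))
    top = sorted(range(len(grad)), key=lambda j: abs(grad[j]), reverse=True)[:k]
    return {j: grad[j] for j in top}


def train(participants, dim, fraction=0.5, lr=0.1, rounds=300):
    """Jointly fit shared weights; raw records are only read in local_gradient."""
    global_weights = [0.0] * dim               # held by the central server
    for _ in range(rounds):
        for data in participants:              # each site takes a turn
            grad = local_gradient(global_weights, data)
            shared = select_top_fraction(grad, fraction)
            for j, g in shared.items():        # server sees only shared entries
                global_weights[j] -= lr * g
    return global_weights
```

The server only ever receives a subset of gradient values, never the underlying records. The `fraction` parameter is the knob each participant could turn to trade accuracy against how much it reveals.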

Authors’ Abstract

Background: Collaborative privacy-preserving training methods allow for the integration of locally stored private data sets into machine learning approaches while ensuring confidentiality and nondisclosure. Objective: In this work we assess the performance of a state-of-the-art neural network approach for the detection of protected health information in texts trained in a collaborative privacy-preserving way. Methods: The training adopts distributed selective stochastic gradient descent (ie, it works by exchanging local learning results achieved on private data sets). Five networks were trained on separated real-world clinical data sets by using the privacy-protecting protocol. In total, the data sets contain 1304 real longitudinal patient records for 296 patients. Results: These networks reached a mean F1 value of 0.955. The gold standard centralized training that is based on the union of all sets and does not take data security into consideration reaches a final value of 0.962. Conclusions: Using real-world clinical data, our study shows that detection of protected health information can be secured by collaborative privacy-preserving training. In general, the approach shows the feasibility of deep learning on distributed and confidential clinical data while ensuring data protection.

“Privacy-Preserving Deep Learning for the Detection of Protected Health Information in Real-World Data: Comparative Evaluation” by Sven Festag and Cord Spreckelsen, May 5, 2020.


An Improved Method for Sharing Medical Images for Privacy Preserving Machine Learning using Multiparty Computation and Steganography


Preserving the security of data that cannot be easily shared due to privacy and confidentiality concerns presents opportunities for creativity in transfer mechanisms. Sharing digital images, which due to their rich and pixelated nature cannot be easily encrypted and decrypted without loss, represents a particular challenge and opportunity for creativity. One way in which images can be shared more securely is by using secret sharing protocols, which split secret information into small pieces; only when a threshold number of pieces is combined is the secret, such as a unique user key, revealed and the full information rendered. Rich image data can also be used to transfer low-dimensional text data as an embedded component of the image. Steganography, the insertion of small secret bits of data into an image without changing the perception of the image, is one way to transfer information for use in secret sharing protocols. Steganography can thus allow multiple machine learning uses of a single image in a privacy preserving way.
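As a rough illustration of how data can be hidden in an image without visibly changing it, here is a minimal sketch of least-significant-bit (LSB) embedding, one of the simplest steganographic techniques. It is an assumption that LSB embedding is a fair stand-in here; the paper's own scheme may differ. The image is modeled as a flat list of grayscale values from 0 to 255, and the function names are hypothetical.

```python
def embed(pixels, message):
    """Hide the bits of message (bytes) in the lowest bit of successive pixels."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("message is too long for this image")
    stego = list(pixels)                      # leave the cover image untouched
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit      # overwrite only the lowest bit
    return stego


def extract(pixels, n_bytes):
    """Read n_bytes back out of the lowest bits of the first 8 * n_bytes pixels."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for p in pixels[8 * b: 8 * b + 8]:
            value = (value << 1) | (p & 1)
        out.append(value)
    return bytes(out)
```

Each pixel changes by at most one intensity level out of 256, which is why the alteration is not visually perceptible. In a secret sharing setting, the embedded bytes could be one party's share rather than the secret itself.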

Authors’ Abstract

Digital data privacy is one of the main concerns in today’s world. When everything is digitized, there is a threat of private data being misused. Privacy-preserving machine learning is becoming a top research area. For machines to learn, massive data is needed and when it comes to sensitive data, privacy issues arise. With this paper, we combine secure multiparty computation and steganography helping machine learning researchers to make use of a huge volume of medical images with hospitals without compromising patients’ privacy. This also has application in digital image authentication. Steganography is one way of securing digital image data by secretly embedding the data in the image without creating visually perceptible changes. Secret sharing schemes have gained popularity in the last few years and research has been done on numerous aspects.

“An Improved method for sharing medical images for Privacy Preserving Machine Learning using Multiparty Computation and Steganography” by R. Vignesh, R. Vishnu, Sreenu M. Raj, M.B. Akshay, Divya G. Nair, and Jyothisha R. Nair, 2019.


Model Centered Privacy Protections


Dynamic Backdoor Attacks Against Machine Learning Models


Machine learning systems are vulnerable to attack from conventional methods, such as model theft, but also from backdoor attacks, in which malicious functions are introduced into the models themselves and then express undesirable behavior when appropriately triggered. Some model backdoors use “static” triggers which could be detected by defense techniques, but the authors of this paper propose three forms of dynamic backdoor attacks. Dynamic backdoor attacks raise specific privacy concerns as these attacks allow adversaries access to both centralized and decentralized systems. Once given access, the backdoor attack with a dynamic trigger will cause a model to misclassify any input. As a consequence, users may inadvertently adversely train machine learning models they rely upon. Likewise, backdoor attacked machine learning algorithms in users’ systems or devices may report users’ information without application of differential privacy techniques, thus compromising user privacy and personal information.

Authors’ Abstract

Machine learning (ML) has made tremendous progress during the past decade and is being adopted in various critical real-world applications. However, recent research has shown that ML models are vulnerable to multiple security and privacy attacks. In particular, backdoor attacks against ML models that have recently raised a lot of awareness. A successful backdoor attack can cause severe consequences, such as allowing an adversary to bypass critical authentication systems. Current backdooring techniques rely on adding static triggers (with fixed patterns and locations) on ML model inputs. In this paper, we propose the first class of dynamic backdooring techniques: Random Backdoor, Backdoor Generating Network (BaN), and conditional Backdoor Generating Network (c-BaN). Triggers generated by our techniques can have random patterns and locations, which reduce the efficacy of the current backdoor detection mechanisms. In particular, BaN and c-BaN are the first two schemes that algorithmically generate triggers, which rely on a novel generative network. Moreover, c-BaN is the first conditional backdooring technique, that given a target label, it can generate a target-specific trigger. Both BaN and c-BaN are essentially a general framework which renders the adversary the flexibility for further customizing backdoor attacks. We extensively evaluate our techniques on three benchmark datasets: MNIST, CelebA, and CIFAR-10. Our techniques achieve almost perfect attack performance on backdoored data with a negligible utility loss. We further show that our techniques can bypass current state-of-the-art defense mechanisms against backdoor attacks, including Neural Cleanse, ABS, and STRIP.

“Dynamic Backdoor Attacks Against Machine Learning Models” by Ahmed Salem, Rui Wen, Michael Backes, Shiqing Ma, and Yang Zhang, March 7, 2020.


Mind Your Weight(s): A Large-Scale Study on Insufficient Machine Learning Model Protection in Mobile Apps


To protect privacy when using machine learning, many researchers and developers focus on securing the data of individuals whose interaction with the internet of things, mobile phones, and other data gathering devices powers much of machine learning. But truly securing privacy in machine learning systems also means securing the models themselves. Securing models protects the privacy and security of the companies’ machine learning assets and protects users who could be subject to a higher risk of exposure due to inversion attacks by those who maliciously or surreptitiously gain access to models. This paper studies the methods that companies do and do not use when protecting machine learning in mobile apps. Importantly, these researchers also quantify the risk if models are stolen, finding that the cost of a stolen model can run into the millions of dollars.

Authors’ Abstract

On-device machine learning (ML) is quickly gaining popularity among mobile apps. It allows offline model inference while preserving user privacy. However, ML models, considered as core intellectual properties of model owners, are now stored on billions of untrusted devices and subject to potential thefts. Leaked models can cause both severe financial loss and security consequences. This paper presents the first empirical study of ML model protection on mobile devices. Our study aims to answer three open questions with quantitative evidence: How widely is model protection used in apps? How robust are existing model protection techniques? How much can (stolen) models cost? To that end, we built a simple app analysis pipeline and analyzed 46,753 popular apps collected from the US and Chinese app markets. We identified 1,468 ML apps spanning all popular app categories. We found that, alarmingly, 41% of ML apps do not protect their models at all, which can be trivially stolen from app packages. Even for those apps that use model protection or encryption, we were able to extract the models from 66% of them via unsophisticated dynamic analysis techniques. The extracted models are mostly commercial products and used for face recognition, liveness detection, ID/bank card recognition, and malware detection. We quantitatively estimated the potential financial impact of a leaked model, which can amount to millions of dollars for different stakeholders. Our study reveals that on-device models are currently at high risk of being leaked; attackers are highly motivated to steal such models. Drawn from our large-scale study, we report our insights into this emerging security problem and discuss the technical challenges, hoping to inspire future research on robust and practical model protection for mobile devices.

“Mind Your Weight(s): A Large-scale Study on Insufficient Machine Learning Model Protection in Mobile Apps” by Zhichuang Sun, Ruimin Sun, and Long Lu, February 18, 2020.



Privacy and security in a machine learning enabled world involve protecting both data and models. As the papers reviewed here show, this will require new ways of thinking about the analysis of privacy in the development and deployment of machine learning models. Whether in research on healthcare applications or mobile apps, researchers are pointing to these new ways of thinking and to new techniques to improve privacy in machine learning.