FPF Seeks Nominations for 2021 Research Data Stewardship Award

The Call for Nominations for the 2021 FPF Award for Research Data Stewardship is now open. You can find the nomination form here. We ask that nominations be submitted by Monday, March 1, 2021.

The FPF Award for Research Data Stewardship recognizes excellence in the privacy-protective stewardship of corporate data that is shared with academic researchers. The award highlights companies and academics who demonstrate novel best practices and approaches to sharing corporate data in order to advance scientific knowledge. 

The 2020 Research Data Stewardship Award was given to Lumos Labs and Professor Mark Steyvers for their pathbreaking work using data to model human attention and cognition. We look forward to reading submissions of innovative work in cognitive science and any other scientific field this year. We are keenly interested in multidisciplinary work that shows the power of shared data to answer questions that span multiple fields of inquiry.

Academics and their corporate partners are invited to nominate a successful data-sharing project that reflects privacy protective approaches to data protection and ethical data sharing. Nominations will be reviewed and selected by an Award Committee comprised of representatives from FPF, leading foundations, academics, and industry leaders. Nominated projects will be judged based on several factors, including their adherence to privacy protection in the sharing process, the quality of the data handling process, and the company’s commitment to supporting the academic research. 

The nomination form will be open until February 19th. If you have questions about the nomination form, the process of review, or any other information about the award, please contact Dr. Sara Jordan at sjordan at fpf dot org.

FPF in 2020: Adjusting to the Unexpected

With 2020 fast coming to a close, we wanted to take a moment to reflect on a year that forced us to re-focus our priorities, along with much of our lives, while continuing to produce engaging events, thought-provoking analysis, and insightful reports. 

Considering Privacy & Ethics at the Dawn of a New Decade

Early in the year, our eyes were on the future – at least through the rest of the decade – and how privacy and ethical considerations would impact our lives and new and upcoming technologies over the next ten years. 

The Privacy 2020: 10 Privacy Risks and 10 Privacy Enhancing Technologies to Watch in the Next Decade white paper, co-authored by FPF CEO Jules Polonetsky and hacklawyER Founder Elizabeth Renieris, helped corporate officers, nonprofit leaders, and policymakers better understand the privacy risks that will grow to prominence during the 2020s, as well as rising technologies that will be used to help manage privacy through the decade.

In February, we hosted the 10th-annual Privacy Papers for Policymakers event. The yearly event recognizes the year’s leading privacy research and analytical work that is relevant for policymakers in the United States Congress, federal agencies, and international data protection authorities. We were honored to be joined by FTC Commissioner Christine S. Wilson, who keynoted the event, and leading privacy scholars and policy and regulatory staff. 

FPF CEO Jules Polonetsky delivered a keynote address at RSA Conference 2020 in San Francisco, Navigating Privacy in a Data-Driven World: Treating Privacy as a Human Right. In his speech, Jules outlined the limitations of consumer protection laws in protecting individuals’ privacy and explored how to best safeguard data to protect human rights. 

“Are corporations having too much power over individuals because of how much data they have? Are foreign countries interfering in our elections? Are automated decisions being made where I’ll be turned down for healthcare, I’ll be turned down for insurance, my probation will be extended? These are not [only] privacy issues, right? These are issues of power. These are issues of human rights at the end of the day.”

Adjusting to a One-in-a-Century Pandemic

By March, it was clear that the COVID-19 pandemic would impact every aspect of our lives, and we moved nimbly to respond and re-assess our immediate priorities. FPF launched the Privacy and Pandemics series, a collection of resources published throughout the year that explores the challenges posed by the COVID-19 pandemic to existing ethical, privacy, and data protection frameworks. The series seeks to provide information and guidance to governments, companies, academics, and civil society organizations interested in responsible data sharing to support the public health response. Some of the initial materials published in the spring as part of this series included:

In April, FPF Senior Counsel Stacey Gray provided the Senate Committee on Commerce, Science, and Transportation with written testimony, including recommendations based on how experts in the U.S. and around the world are currently mitigating the risks of using data to combat the COVID-19 pandemic. 

Experts have used machine learning technologies to study the virus, test potential treatments, diagnose individuals, analyze the public health impacts, and more. In early May, FPF Policy Counsel Dr. Sara Jordan and FPF Senior Counsel Brenda Leong published a resource covering leading efforts, as well as the data protection and ethical issues, related to machine learning and COVID-19.

As part of our ongoing Privacy and Pandemics series, FPF, Highmark Health, and Carnegie Mellon University’s CyLab Security and Privacy Institute hosted a virtual symposium that took an in-depth look at the role of biometrics and privacy in the COVID-19 era. During the virtual symposium, expert discussants and presenters examined the impact of biometrics in the ongoing fight against the novel coronavirus. Discussions about law and policy were enhanced by demonstrations of the latest facial recognition and sensing technology and privacy controls from researchers at CMU’s CyLab.

On October 27 and 28, FPF hosted a workshop titled Privacy & Pandemics: Responsible Uses of Technology and Health Data During Times of Crisis. Dr. Lauren Gardner, creator of Johns Hopkins University’s COVID-19 Dashboard, and UC Berkeley data analytics researcher Dr. Katherine Yelick were keynote speakers. The workshop – held in collaboration with the National Science Foundation, Duke Sanford School of Public Policy, SFI ADAPT Research Centre, Dublin City University, OneTrust, and the Intel Corporation – also featured wide-ranging conversations with participants from the fields of data and computer science, public health, law and policy. Following the workshop, FPF prepared a report for the National Science Foundation to help speed the transition of research into practice to address this challenge of national importance. 

Global Expertise & Leadership

FPF’s international work continued to expand in 2020, as policymakers around the world focused on ways to improve privacy frameworks. More than 120 countries have enacted a privacy or data protection law, and FPF both closely followed and advised upon significant developments in the European Union, Latin America, and Asia.

We were proud to announce our new partnership with Dublin City University (DCU), which will lead to joint conferences and workshops, collaborative research projects, joint resources for policymakers, and applications for research opportunities. DCU is home to some of the leading AI-focused research and scholarship programs in Ireland. DCU is a lead university for the Science Foundation Ireland ADAPT program and hosts the consortium leadership for the INSIGHT research centre, two of the largest government-funded AI and tech-focused development programs. Notably, the collaboration has already resulted in a joint webinar, The Independent and Effective DPO: Legal and Policy Perspectives, and we’re excited about further collaboration in 2021 and beyond.

Following the Prime Minister of Israel’s announcement that the government planned to use technology to address the spread of COVID-19, Limor Shmerling Magazanik, Managing Director of the Israel Tech Policy Institute, published recommendations to ensure a balance between civilian freedoms and public health. Specifically, her recommendations centered around ensuring transparency, limits on the length of time that data is held, requiring a clear purpose for data collection, and robust security. 

The Schrems II decision from the Court of Justice of the European Union had serious consequences for data flows from the EU to the United States, as well as to most other countries in the world. In advance of the decision, FPF published a guide, What to Expect from the Court of Justice of the EU in the Schrems II Decision This Week, by FPF’s Dr. Gabriela Zanfir-Fortuna. FPF also conducted a study of the companies enrolled in the cross-border privacy program called Privacy Shield, finding that 259 European-headquartered companies are active Privacy Shield participants.

We released many papers and blog posts analyzing privacy legislation in the EU, Brazil, South Korea, Singapore, India, Canada, New Zealand and elsewhere. One example was the white paper published in May titled New Decade, New Priorities: A summary of twelve European Data Protection Authorities’ strategic and operational plans for 2020 and beyond. The paper provides guidance on the priorities and focus areas considered top concerns among European Data Protection Authorities (DPAs) for the 2020s and beyond.

Leveraging our growing focus on the international privacy landscape, and as part of our growing work related to education privacy, we published a report titled The General Data Protection Regulation: An Analysis and Guidance for US Higher Education Institutions, authored by FPF Senior Counsel Dr. Gabriela Zanfir-Fortuna. The report contains analysis and guidance to assist U.S.-based higher education institutions and their edtech service providers in assessing their compliance with the European Union’s General Data Protection Regulation. 

We hosted several events and roundtables in Europe. On December 2, 2020, the fourth iteration of the Brussels Privacy Symposium, Research and the Protection of Personal Data Under the GDPR, took place as a virtual international meeting where industry privacy leaders, academic researchers, and regulators discussed the present and future of data protection in the context of scientific data-based research and in the age of COVID. The virtual event is the latest aspect of an ongoing partnership between FPF and Vrije Universiteit Brussel (VUB). Keynote speakers were Malte Beyer-Katzenberger, Policy Officer at the European Commission; Dr. Wojciech Wiewiórowski, the European Data Protection Supervisor; and Microsoft’s Cornelia Kutterer. Their presentations sparked engaging conversations on the complex interactions between data protection and research as well as the ways in which processing of sensitive data can present privacy risks, and also unearth covert bias and discrimination. 

Scholarship & Analysis on Impactful Topics

The core of our work is providing insightful analysis on prevailing privacy issues. FPF convenes industry experts, academics, consumer advocates, and other thought leaders to explore the challenges posed by technological innovation, and develop privacy protections, ethical norms, and workable business practices. In 2020 – through events, awards, infographic guides, papers, studies, or briefings – FPF provided thoughtful leadership on issues ranging from corporate-academic data sharing to encryption. 

In mid-May, the Future of Privacy Forum announced the winners for the first-ever FPF Award for Research Data Stewardship: Professor Mark Steyvers, University of California, Irvine Department of Cognitive Sciences, and Lumos Labs. The first-of-its-kind award recognizes a privacy protective research collaboration between a company and academic researchers, based on the notion that when privately held data is responsibly shared with academic researchers, it can support significant progress in medicine, public health, education, social science, and other fields. In October, FPF hosted a virtual event honoring the winners, featuring – in addition to the awardees – Daniel L. Goroff, Vice President and Program Director at the Alfred P. Sloan Foundation, which funded the award, as well as FPF CEO Jules Polonetsky and FPF Policy Counsel Dr. Sara Jordan.  

Expanding upon its industry-leading best practices, in July FPF published Consumer Genetic Testing Companies & The Role of Transparency Reports in Revealing Government Requests for User Data, examining how leading consumer genetic testing companies require valid legal processes before disclosing consumer genetic information to the government, and how companies publish transparency reports around such disclosures. 

FPF published an interactive visual guide, Strong Data Encryption Protects Everyone, illustrating how strong encryption protects individuals, enterprises, and the government. The guide also highlights key risks that arise when crypto safeguards are undermined – risks that can expose sensitive health and financial records, undermine the security of critical infrastructure, and enable interception of officials’ confidential communications. 

Over the summer, we published interviews with senior FPF policy experts about their work on important privacy issues. As part of this series of internal interviews, we spoke with FPF Health Policy Counsel Dr. Rachele Hendricks-Sturrup, FPF Director of Technology and Privacy Research Christy Harris, FPF Managing Director for Europe Rob van Eijk, and FPF Policy Counsel Chelsey Colbert.

FPF Policy Fellow Casey Waughn, supported by Anisha Reddy and Juliana Cotto from FPF, and Antwan Perry, Donna Harris-Aikens, and Justin Thompson at the National Education Association, released new recommendations for the use of video conferencing platforms in online learning. The recommendations ask schools and districts to reconsider requiring students to have their cameras turned on during distanced learning. These requirements create unique privacy and equity risks for students, including increased data collection, an implied lack of trust, and conflating students’ school and home lives.

Following the 2020 election, FPF hosted several events looking ahead to the policy implications of a new Administration and Congress in 2021, including a roundtable discussion where Jules was joined by Jonathan Baron, Principal of Baron Public Affairs and a leading policy and political risk strategist, as well as FPF’s Global and Europe leads, Dr. Gabriela Zanfir-Fortuna and Rob van Eijk. In addition, Jules; FPF Senior Fellow Peter Swire; VP of Policy John Verdi; and Senior Counsel Stacey Gray held a briefing with members of the media to discuss expectations for what the Biden administration, the FTC, and the states will accomplish on privacy in the coming year. IAPP published an article summarizing the briefing.

Linking Equity & Fairness with Privacy

Alongside a pandemic that forced us to shift our priorities, 2020 saw a needed national reckoning with issues related to diversity and equity. From racial justice and the LGBTQ+ community to child rights, FPF took conscious steps to reflect on, understand, and address essential questions related to equity and fairness in the context of privacy. 

The data protection community has particular challenges as we grapple with the many ways that data can be used unfairly. In response, our team has focused on listening and learning from leaders with diverse life and professional experiences to help shape more careful thinking about data and discrimination. As part of that project, we published remarks on diversity and inclusion from Macy’s Chief Privacy Officer and FPF Advisory Board member Michael McCullough delivered at the WireWheel Spokes 2020 conference. We also discussed Ruha Benjamin’s Race After Technology: Abolitionist Tools for the New Jim Code as part of our ongoing book club series and were honored to be joined by the author for the discussion. 

LGBTQ+ rights are, and have always been, linked with privacy. Over the years, privacy-invasive laws, practices, and norms have been used to oppress LGBTQ+ individuals by criminalizing and stigmatizing individuals on the basis of their sexual behavior, sexuality, and gender expression. In honor of October as LGBTQ+ History Month, FPF and LGBT Tech explored three of the most significant privacy invasions impacting the LGBTQ+ community in modern U.S. history: anti-sodomy laws; the “Lavender Scare” beginning in the 1950s; and privacy invasions during the HIV/AIDS epidemic. These examples and many more were discussed as part of a LinkedIn Live event on International Human Rights Day, featuring LGBT Tech Executive Director Christopher Wood, FPF Founder and Board Chair Christopher Wolf, LGBT Tech Deputy Director and General Counsel Carlos Gutierrez, FPF Policy Counsel Dr. Sara Jordan, and FPF Christopher Wolf Diversity Law Fellow Katelyn Ringrose. 

FPF submitted feedback and comments to the United Nations Children’s Fund (UNICEF) on the Draft Policy Guidance on Artificial Intelligence (AI) for Children, which seeks “to promote children’s rights in government and private sector AI policies and practices, and to raise awareness of how AI systems can uphold or undermine children’s rights.” FPF encouraged UNICEF to adopt an approach that accounts for the diversity of childhood experiences across countries and contexts. Earlier in October, FPF also submitted comments to the United Nations Office of the High Commissioner for Human Rights Special Rapporteur on the right to privacy to inform the Special Rapporteur’s upcoming report on the privacy rights of children. FPF will continue to provide expertise and insight on child and student privacy, AI, and ethics to agencies, governments, and corporations to promote the best interests of children. 

This post is by no means an exhaustive list of our most important work in 2020, but we hope it gives you a sense of the scope of our impact. On behalf of everyone at FPF, best wishes for 2021!

FPF Health and AI & Ethics Policy Counsels Present a Scientific Position at ICML 2020 and at 2020 CCSQ World Usability Day

On November 12, 2020, FPF Policy Counsels Drs. Rachele Hendricks-Sturrup and Sara Jordan presented on privacy-by-design and human-centered design concepts during the 2020 CCSQ World Usability Day virtual conference. This presentation followed Drs. Hendricks-Sturrup’s and Jordan’s July 2020 scientific position paper presented at the International Conference on Machine Learning (ICML) 2020, entitled “Patient-Reported Outcomes: A Privacy-Centric and Federated Approach to Machine Learning.”

Drs. Hendricks-Sturrup and Jordan argued that patient-reported outcomes (PRO) data, by its nature, requires special privacy and security considerations when used within on-device or federated machine learning systems or in the development of artificial intelligence platforms. Patient-reported outcomes are a raw form of patient expression and feedback: they help clinicians, researchers, medical device and drug manufacturers, and the governmental stakeholders overseeing medical device and drug development, distribution, and safety to monitor, understand, and document, in a readable format, patients’ symptoms, preferences, complaints, and experiences following a clinical intervention. Gathering and using such data requires careful attention to security, data architecture, data use, and machine-readable consent tools and privacy policies.

However, on-device patient-reported outcome measurement tools, such as patient surveys within third-party mobile apps that use machine learning, may employ the best machine-readable privacy policies or consent mechanisms yet still leave key components of privacy protection up to the patient-user. Keeping data in the hands of users also opens those users up to unanticipated vectors of attack from adversaries striving to identify the valuable machine learning models or seeking to uncover data about a specific patient.

Drs. Hendricks-Sturrup and Jordan recommended that developers of patient-reported outcome measurement systems leveraging federated learning architectures:

  1. Intentionally design user device security, as well as security for transmission of either data (raw or processed) or model gradients, to the highest level of protections that do not degrade essential performance for critical health and safety monitoring procedures (e.g. remote monitoring for clinical trials, post-market drug safety surveillance, hospital performance scores, etc.);
  2. Ensure that models are not compromised and that valuable machine learning spending is not lost to competitors; 
  3. Design systems to operate atop a federated machine learning architecture, both when model components are sent and when gradients are received, to ensure the privacy of users’ data; and
  4. Design learning algorithms with algorithmic privacy techniques, such as differential privacy, which is essential to securing valuable and sensitive PRO data (see the sketch below).
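
To make the last recommendation more concrete, the sketch below illustrates one common way (not drawn from the paper itself) to combine federated updates with differential privacy: each device’s gradient is clipped to bound its influence, and calibrated Gaussian noise is added before averaging. The clipping norm and noise multiplier are assumed values for illustration only.

```python
import numpy as np

def dp_federated_average(client_gradients, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Aggregate per-device gradients with clipping and Gaussian noise.

    A minimal sketch of differentially private federated averaging; the
    hyperparameters are illustrative, not recommendations from the paper.
    """
    rng = rng or np.random.default_rng()
    clipped = []
    for g in client_gradients:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose L2 norm exceeds the clipping bound.
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    total = np.sum(clipped, axis=0)
    # Noise scaled to the per-device sensitivity (the clipping bound).
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(client_gradients)

# Toy usage: three simulated patient devices each report a local gradient.
grads = [np.array([0.2, -1.5, 0.7]),
         np.array([2.0, 0.1, -0.3]),
         np.array([-0.4, 0.9, 1.1])]
print(dp_federated_average(grads))  # noisy average used to update the shared model
```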

Drs. Hendricks-Sturrup’s and Jordan’s paper, poster, and presentation can be found at these links:

ICML paper

ICML poster

CCSQ World Usability Day recording

CCSQ World Usability Day slides

To learn more about FPF’s Health & Genetics and AI & Ethics initiatives, contact Drs. Hendricks-Sturrup and Jordan, respectively, at: [email protected] and [email protected]

Workshop Report: Privacy & Pandemics – Responsible Use of Data During Times of Crisis

In October 2020, the Future of Privacy Forum (FPF) convened a virtual workshop entitled “Privacy and Pandemics: Responsible Uses of Technology and Health Data During Times of Crisis.” The workshop brought together invited computer science, privacy law, public policy, social science, and health information experts from around the world to examine the benefits, risks, and strategies for collecting and using data in support of public health initiatives responding to COVID-19, and for consideration in future public health crises. With support from the National Science Foundation, Intel Corporation, Duke Sanford School of Public Policy, and Dublin City University’s SFI ADAPT Research Centre, the workshop identified research priorities to improve data governance systems and structures in the context of the COVID-19 pandemic.

To learn more about FPF’s work related to privacy and pandemics, please visit the Privacy & Pandemics page.

Drawing on the expertise of workshop participant submissions and session discussions, FPF prepared a workshop report which was submitted to the National Science Foundation for use in planning the Convergence Accelerator 2021 Workshops. This NSF program aims to speed the transition of convergence research into practice to address grand challenges of national importance. The final submitted workshop report is also available on our website.

Based on an analysis of the expert positions reflected in the workshop, FPF recommends that NSF consider a roadmap for research, practice improvements, and the development of privacy-preserving products and services to inform responses to the COVID-19 crisis and to prepare for future pandemics or other public crises.

To learn more about the Privacy & Pandemics conference, including its topics, participants, sessions, and presentations, and to read the final workshop report, head to the event page.

South Korean Personal Information Protection Commission Announces Three-Year Data Protection Policy Plan

by Jasmine Park

On November 24, 2020, the South Korean Personal Information Protection Commission (PIPC), the nation’s central administrative agency tasked with protecting the privacy rights of individuals by enforcing the country’s privacy laws, released its revised three-year “Personal Information Protection Master Plan” (‘21-‘23). A wide range of policies that balance both the protection and use of personal information will be implemented at the national level, such as improving the system for obtaining consent when collecting personal information, providing incentives for self-regulation, and reforming the system regulating the cross-border transfer of personal information. One innovative area where the PIPC will play a leading role is developing a comprehensive support system for the use of pseudonymized data. 

The plan includes three key strategies:

  1. Confident protection of personal information;
  2. Secure use of personal information that increases the value of data; and
  3. A fair balance between the protection and use of personal information, with the PIPC serving as the “control tower.”

The first strategy aims to 1) reinforce data subjects’ rights and promote citizens’ privacy literacy, 2) create a business ecosystem of voluntary personal information protection, and 3) advance personal information protection systems in the public sector. The second strategy will 1) support the safe use of personal information, 2) eliminate blind spots in the digital transformation environment, and 3) create a safer environment for personal information through research and development. The third strategy sets out to 1) take stern measures against and respond promptly to privacy violations, 2) build national governance for personal information protection, 3) strengthen global cooperation on personal information, and 4) reinforce the PIPC’s leadership as a unified supervisory body.

PIPC Chairman Yoon Jong-in also presented a report on the plan to the State Council of South Korea on November 24, 2020. The plan revises the “4th Personal Information Protection Plan” which was announced earlier in February 2020, and lays out the driving strategy and direction of major policies for the next three years, including the government’s plan for personal information protection. A need to revise the “4th Personal Information Protection Plan” arose due to the establishment of the PIPC as a central administrative agency on August 8, 2020, under the “Amendments to the Three Data Privacy Laws”, and the socially distanced and digital society brought about by COVID-19. 

Therefore, the PIPC revised its plan after conducting an analysis of the environment, public surveys, and system research. Yoon Jong-in announced that the new plan will take effect in 2021 on the 10th anniversary of the enactment of PIPA, and encouragingly stated that “if the past decade was the time to lay the foundation for personal information protection in Korea, the next decade is the key to action. Built on trust in the data economy, we will do our best to implement the personal information protection plan so data can be used safely and well.”

The PIPC also aims to strengthen policies that confidently protect people’s personal information in the private and public sectors. It sets out to improve the system for obtaining consent when collecting personal information, introduce new rights such as data portability, and effectively protect people’s control over their personal information as circumstances change. In addition, the PIPC plans to enhance the self-protection of personal information by enabling people to protect their own data in line with its sensitivity, providing incentives to businesses based on their performance in voluntarily protecting personal information under a self-regulatory system, and developing professional expertise.

The PIPC will also expand the existing standards for privacy impact assessments by considering emerging privacy risk factors from new technologies, and expand the data breach incident factors assessment standards to prevent data breaches in the public sector. The public sector itself will take the lead on strengthening the foundation of personal information management, raising the standard through on-site inspections.

Further, in an economy increasingly driven by data, the PIPC will activate a pseudonymized data system to ensure personal information is used securely, and will develop personal information protection systems and technologies. South Korea’s data protection law, the Personal Information Protection Act (PIPA), was amended in January 2020; the amendments centralize the data regulatory functions of the PIPC (established as an administrative agency in September 2011), the Ministry of the Interior and Safety (MOIS), and the Korea Communications Commission (KCC) under the PIPC, elevating it to the central data privacy regulatory authority in South Korea. While the PIPA has laid the foundation for processing and using pseudonymized personal information, protections need to be continuously enhanced, so the PIPC will develop a comprehensive support system and operate a government council to this end. The system will support the combination of pseudonymized information, including an application process and guidelines for submitting, receiving, and combining pseudonymized information, generating combined key-linked information, and managing the status of combinations.
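
As a generic illustration of how pseudonymized records from separate sources can be linked without exposing the underlying identifier, a keyed hash can turn a direct identifier into a linkage key that only the keyholder can reproduce. This is a hypothetical sketch, not a description of the PIPC’s actual combination system:

```python
import hmac
import hashlib

def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym (linkage key) from a direct identifier.

    Records from different datasets that share the same identifier map to the
    same pseudonym and can be joined, while the identifier itself is never
    stored. Only whoever holds secret_key can regenerate the mapping.
    """
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical example: two datasets pseudonymize the same ID with a shared key,
# so the combining body can join them on the resulting linkage key.
key = b"key-held-by-the-trusted-combining-body"
print(pseudonymize("880101-1234567", key))
```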

The PIPC will also develop new protection standards for a digital society where new technologies such as artificial intelligence, cloud, and self-driving technologies have become widespread, and will actively review and seek to improve regulatory sandboxes that have been proven to require modification. 

Finally, as the nation’s personal information protection “control tower,” the PIPC aims to strengthen its role in personal information protection domestically and internationally, and to lead public-private global governance in balancing the protection and use of personal information. The PIPC also announced that it will increase inspections of public institutions that hold large-scale personal information, carry out strict investigations and enforcement, and convene a joint government response consulting body to respond to data breaches. The PIPC will serve as a one-stop shop for advice related to personal information protection, with complaint handling among its most anticipated functions. It will also assess and improve the cross-border data transfer system in response to increasing overseas data transfers, including by reviewing the diversification of cross-border transfer requirements, such as non-consent standard contracts.

With thanks to Caroline Hopland for her contribution. 

***********************************************************

South Korean Personal Information Protection Plan Information (In Korean)
PIPC Launch Policy Vision Timeline

Machine Learning and Speech: A Review of FPF’s Digital Data Flows Masterclass

On Wednesday, December 9, 2020, FPF hosted a Digital Data Flows Masterclass on Machine Learning and Speech. This was the first class of a new series, following the completion of the original eight-topic VUB-FPF Digital Data Flows Masterclass series.

Professor Marine Carpuat (Associate Professor of Computer Science at the University of Maryland) opened with an introduction to the more advanced aspects of supervised and unsupervised learning, explaining mathematical models with unusual clarity for an audience without a mathematical background. Dr. Prem Natarajan (VP, Alexa AI-Natural Understanding) then guided the class through the intricacies of machine learning in the context of voice assistants, drawing on unique examples from his work as a practitioner. The presenters explored the differences between supervised, semi-supervised, and unsupervised machine learning (ML) and the impact of these recent technical developments on machine translation and speech recognition technologies. They also briefly traced the history of machine learning to draw lessons from the long-term development of artificial intelligence and put new advancements in context. FPF’s Dr. Sara Jordan and Dr. Rob van Eijk moderated the discussion that followed the expert overviews.

The recording of the class can be accessed here.

Machine Translation – An Introduction 

As a sub-field of computational linguistics, machine translation studies the use of software to translate text or speech from one language to another. Applications that use machine translation find the best fit for a translation through probabilistic modelling: for any given sentence in a source language, the translation model selects the candidate output with the highest probability of resembling a human translation and uses that sentence as the translation.

Machine translation has made considerable progress over the years. Previous models would break down simple sentences into individual words without taking into account the context of the sentence itself. This often led to incomplete, simplistic, or wholly incorrect translations. The introduction of deep learning and neural networks to machine translation greatly increased the quality and fluency of these translations by allowing translation tools to put the meaning of each word within the overall context of the sentence. 

Yet for all the progress, many challenges lie ahead. Deep learning requires very large datasets and paired translation examples (e.g., English to French), which might not exist for certain languages. Even when there is a large pool of data, sentences often have multiple correct translations, which poses additional problems for machine learning. In addition, the deployment of these models in the real world has also raised concerns, as translation errors can sometimes have severe consequences.

Machine translation tools generally use sequence-to-sequence models (Seq2Seq) that share architecture with word embedding and other representation models. Seq2Seq models convert sequences from one language (e.g., English) to generate text in another (e.g., French). Seq2Seq models form the backbone of more accurate natural language processing (NLP) tools due to their use of recurrent neural networks (RNNs), which allow models to take into account the context of the previous input word when processing the next output word. Developers use Seq2Seq models for multiple use cases, including dialogue construction, text summarization, question answering, image captioning, and style transfer.
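
A minimal sketch of this encoder-decoder pattern, assuming PyTorch (the vocabulary sizes, dimensions, and toy batch below are illustrative and not taken from the masterclass):

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """A bare-bones encoder-decoder translation model (illustrative sizes)."""

    def __init__(self, src_vocab, tgt_vocab, emb_dim=64, hid_dim=128):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # Encode the source sentence into a context vector (the final hidden state).
        _, context = self.encoder(self.src_emb(src_ids))
        # Decode the target sentence conditioned on that context (teacher forcing).
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), context)
        # Logits over the target vocabulary at every output position.
        return self.out(dec_out)

# Toy usage with random token ids: 2 sentences, 7 source and 9 target tokens each.
model = Seq2Seq(src_vocab=1000, tgt_vocab=1200)
src = torch.randint(0, 1000, (2, 7))
tgt = torch.randint(0, 1200, (2, 9))
print(model(src, tgt).shape)  # torch.Size([2, 9, 1200])
```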

Word embedding refers to language processing models that map words and phrases onto a vector space of real numbers using neural networks or other probabilistic models. This allows translation models to reduce the confusion arising from a large space of possible outputs and to better represent the context of the sentence in which words appear. These tools generally increase the performance of Seq2Seq models, and of NLP generally, because they help models understand the context in which words appear.
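
For example, with made-up embedding vectors, the similarity of two words can be measured as the cosine of the angle between their vectors; words that appear in similar contexts end up close together in the vector space (the values below are purely illustrative):

```python
import numpy as np

# Toy 4-dimensional embeddings (illustrative values only; real models learn
# hundreds of dimensions from large corpora).
embeddings = {
    "cat": np.array([0.9, 0.1, 0.4, 0.0]),
    "dog": np.array([0.8, 0.2, 0.5, 0.1]),
    "car": np.array([0.1, 0.9, 0.0, 0.6]),
}

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors (1.0 = identical direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(embeddings["cat"], embeddings["dog"]))  # high: similar contexts
print(cosine_similarity(embeddings["cat"], embeddings["car"]))  # low: dissimilar contexts
```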

Figure 1. How to Train a Machine Learning Model?

How to Train a Machine Learning Model – Supervised, Unsupervised, and Semi-Supervised Learning 

Training neural models with limited translation data poses multiple problems (see also, Figure 1). For instance, situations calling for a more formal translation need plenty of formal writing data to achieve accuracy and fluency. But issues may arise if the model must account for different writing styles or more informal contexts. Fortunately, developers can apply different training paradigms, including supervised, unsupervised, and semi-supervised learning, to better train models with diverse and limited translation data. 

Supervised learning is a process of learning from labeled examples. Developers feed the model instances of previous tests along with answer keys to improve the model’s accuracy and precision. Supervised learning allows the model to match input source and output target sequences with correct translations through parallel sampling, which refers to sampling a limited subset of data for one language against a whole body of data for another. In other words, the translation model can learn from examples of text in one language to better calculate the probability of a correct translation in another. Under this type of learning, it is important to choose parameters that give a high probability of success while avoiding overfitting, so that the model generalizes across contexts.

On the other end of the spectrum, unsupervised learning gives previous tests to the model without answer keys. Under this type of training, models learn by making connections between unpaired monolingual samples in each language. Put simply, machine translation models study individual words in a language (e.g., through a dictionary) and draw connections between those words. This helps the model find similarities or patterns in language across the data and predict the next word in a sentence from the previous word in that sentence. 
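
A toy sketch of this idea: learning word-to-word patterns from unlabeled, monolingual text and predicting the next word from the previous one. A simple bigram counter stands in here for the far richer models used in practice:

```python
from collections import defaultdict, Counter

# Unlabeled monolingual text: no translations, no answer keys.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows another (a simple bigram model).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Predict the most likely next word based on patterns seen in the corpus."""
    return following[word].most_common(1)[0][0] if following[word] else None

print(predict_next("the"))  # 'cat' (the most frequent word after 'the' in the corpus)
print(predict_next("cat"))  # 'sat' or 'ate' (each followed 'cat' once)
```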

Unsupervised learning is a simple yet powerful tool. It works extremely well when the developer has a large pool of data to help the model reinforce its understanding of the syntactic or grammatical structures between languages. For example, a deep learning model encodes an English sentence and feeds it into the French output model, which then generates a translation based on the cross-lingual correspondences it learned from the data. While many applications of unsupervised learning are still at the proof-of-concept stage, this learning technique offers a promising avenue for language processing developers.

Finally, semi-supervised learning combines elements of both supervised and unsupervised learning and is the primary training method for translation models. The model learns from a multitude of unpaired samples (unsupervised learning) but then receives paired samples (supervised learning) to reinforce cross-lingual connections. Put simply, unsupervised training helps the model learn syntactic patterns in languages, while supervised training helps the model understand the full meaning of a sentence in context. Combinations of both training techniques strengthen syntactic and semantic connections simultaneously.

Limitations and Challenges of Machine Translation 

While the accuracy and quality of machine translation have increased through supervised, unsupervised, and semi-supervised learning, there are still many obstacles to scaling across the 7,111 languages spoken in the world today. Indeed, while the lack of data poses its own challenges for training purposes, machine translation still runs into problems with adequacy, accuracy, and fluency even in languages where plenty of data exists.

Developers have addressed these problems by attempting to quantify adequacy and compute the semantic similarity between machine translations and human translations. One common method is BLEU, which counts the exact matches of word sequences between a human reference translation and the machine output. Another, BVSS, combines word-pair similarity into a sentence score to measure a translation against a real-world output. In practice, developers need to use both kinds of metrics together in order to make machine translation more adequate and fluent.
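
The intuition behind BLEU can be sketched with a simplified metric that counts n-gram overlaps between the machine output and a human reference. The toy version below computes only clipped unigram and bigram precision, omitting BLEU’s brevity penalty and geometric averaging:

```python
from collections import Counter

def ngram_precision(candidate, reference, n):
    """Fraction of candidate n-grams that also appear in the reference (clipped counts)."""
    cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
    ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / max(sum(cand.values()), 1)

reference = "the cat is on the mat".split()   # human reference translation
candidate = "the cat sat on the mat".split()  # machine output

print(ngram_precision(candidate, reference, 1))  # unigram precision: 5/6
print(ngram_precision(candidate, reference, 2))  # bigram precision: 3/5
```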

From a high-level perspective, neural models employing machine learning produce more fluent translations than older statistical models. Forward-looking models are beginning to build systems that adapt to different audiences and can incorporate other metrics of translation such as formality and style. While much work is needed in both supervised and unsupervised learning, case studies have already revealed a promising ground for future improvements.

Machine Learning and Speech 

Many of the lessons learned in recent advancements of machine translation also apply to the area of speech recognition. The introduction of deep learning and automatic features in both fields has accelerated development, bringing more complexity and new challenges to training and refining machine learning models than what came before. As the application of machine translation and speech recognition become more prevalent in the economy, it is important to understand where the field came from and where it is going. 

Like machine translation, speech recognition has a long history that stretches back to the invention of artificial intelligence (AI) as a formal academic discipline in the 1950s. The introduction of neural networks and statistical models in the 1980s and 1990s marked a watershed moment in the development of machine learning and speech. Inputting human interaction directly into the machine helped create trainable models and gave impetus to a variety of speech recognition projects. 

Yet while the idea of machine learning through neural networks became widespread in the discipline, the lack of computing power and data inhibited its development. With widespread increases in computational power in the 1990s and 2000s, researchers were able to introduce automatic machine learning tools, which greatly accelerated the development of speech recognition and machine translation. But as automatic models became more sophisticated at recognizing inputs and reducing recognition errors, they also became less interpretable.

Figure 2. Historical overview of the technological evolution of AI/ML.

Since then, correcting for poor interpretability in these models has involved a range of training methods, including supervised and unsupervised learning. The widespread availability of data has helped this process because developers can now find novel ways to make automatic features more reliable and accurate. 

As speech recognition technology begins to interface with humans through conversational AI, the competency of these models may greatly improve. This is because engaging with data in real time allows machine learning models to adapt to novel situations when presented with new information or new vocabulary that they did not previously know. To this end, research is currently moving in the direction of parameter defining and error evaluation in order to make training more accurate and effective.

To be sure, advancements in human-computer interfaces have a history going back to the era of Markov models in the 1980s (see Figure 2). However, today the proliferation of practical speech recognition applications fueled by automatic features has underscored a change in the pace of research. In addition, the role of data, including less-labelled data, has also become increasingly important in the process of training models.

Advancements in Training Models 

Indeed, present speech recognition technology utilizes a range of learning methods at the cutting edge of deep learning, including active learning, transfer learning, semi-supervised learning, learning to paraphrase, self-learning, and teaching AI. 

Active learning refers to a training process where different loss functions divide training data in different ways to test the model’s adaptability to new contexts with the same data. Transfer learning takes data-rich domains and finds ways to transfer what has been learned to new domains that lack corresponding data. For instance, in the context of language learning, this type of learning takes lessons learned from one language and transfers them to another.

As discussed above, semi-supervised learning trains large models with a combination of supervised and unsupervised learning and uses such models to train smaller models and filter unlabeled data. Indeed, with semi-supervised training, the current trend is moving away from supervised learning and reducing reliance on annotated data for training purposes.

Learning to paraphrase refers to a learning process where a model takes a first pass at guessing the meaning of an input, receives a new paraphrased version of the input, and then attempts again to decode a correct output. Related to this, self-learning takes signals from the real-world environment and makes connections between data without requiring any human intervention in the process. Finally, teaching AI allows models to integrate natural language with common-sense reasoning and learn new concepts through direct and interactive teaching from customers.

Each of these training methods marks a departure from past learning paradigms that required direct supervision and labelled data from human trainers. While limitations in the overall architecture of training models still exist, major areas of research today focus on approximating the performance of supervised learning models with less labelled data. 

Self-Learning in New Applications and Products 

These advancements are coinciding with the proliferation of consumer products designed and optimized for conversational translation. Speech recognition converts audio into sequences of phonemes and then uses machine translation to select the best output between alternate interpretations of those phonemes. New applications on the consumer market can directly learn through their interaction with the consumer by combining traditional machine translation models with neural language models that apply AI and self-learning. 

Learning models that can act more independently and filter unlabelled data will become more effective in adapting to novel contexts and processing language more accurately and fluently. Already, developers are improving the level of fluency in these products by incorporating conversational-speech data that includes tone of voice, formality, and colloquial expressions in the training. 

The recording of the class can be accessed here.

Digital Data Flows Masterclass Archive

The Digital Data Flows Masterclass Archive, with the recordings and resources for previous classes, can be accessed here.

To learn more about FPF in Europe, visit https://fpf.org/eu.

How Federal Privacy Legislation Could Affect US-EU-UK Relations

With more than 30 bills filed in the United States Congress since 2018 to regulate privacy, and with overwhelming support from the public, it looks like the US might be going through a ‘privacy renaissance’. This week, FPF Senior Counsel Dr. Gabriela Zanfir-Fortuna’s article, “America’s ‘privacy renaissance’: What to expect under a new presidency and Congress,” was featured on the Ada Lovelace Institute Blog. In the article, Gabriela lays out similarities and differences between the two major federal privacy bills currently in Congress, the SAFE DATA Act and the Consumer Online Privacy Rights Act (COPRA), before breaking down the potential implications that a federal privacy law passed under the incoming Biden administration would have for US-EU-UK relations at a time the President of the European Commission has called “an unprecedented window of opportunity to set a joint EU-US tech agenda.”

You can read Gabriela’s full article on the Ada Lovelace Institute website.

Seven Questions to Ask if You Have XR on Your Holiday Wish List

The holidays are right around the corner, and with so many of us sheltering in place in response to COVID-19, some are looking for an escape from the same four walls. Enter XR to help virtually transport us to new worlds, immersive games, and social interactions. XR, or extended reality, is an umbrella term for virtual reality (VR), augmented reality (AR), and other immersive technology. In 2020, we have seen increasingly accessible and affordable XR on the market, and undoubtedly much of this technology is at the top of holiday wishlists. But new tech raises new questions about the privacy and security of personal data. Below are seven questions consumers should be asking before buying an XR gift for a loved one (or themselves) this holiday season, along with answers to better understand the implications of jumping into virtual (or augmented) reality.

Top questions to consider when purchasing a gift for the holidays:

  1. What is XR anyway? And for that matter, what are VR and AR? Is there a difference?
  2. Will my XR system collect or share my data?
  3. Okay, but what does that mean for my privacy?
  4. Are there any safety risks?
  5. My kid has been asking me about an XR toy or game. Is it appropriate for children?
  6. Are there any psychological impacts associated with XR?
  7. What about inclusion? Is XR accessible for everyone?

1. What is XR anyway? And for that matter, what are VR and AR? Is there a difference?

The terms virtual reality (VR), augmented reality (AR) and extended reality (XR) are often used interchangeably but there are significant differences. XR serves as an umbrella term for AR, VR, and other immersive technology, but VR and AR are separate technologies that offer different experiences. Consumers should understand the differences between these terms prior to making a holiday purchase.

2. Will my XR system collect or share my data?

XR products collect personal information and other data about the user and how the user interacts with the product. Most immersive products must collect a large swath of data in order to function. Some estimates show that twenty minutes of VR use can generate approximately two million data points and unique recordings of body language, biometrics, or other physiological information. Unique recordings might include a user’s fingerprint, scans of hand or face geometry, iris and eye tracking, as well as other movement patterns, skin temperature, and heart rate. AR use may require a vast amount of spatial and mapping data to successfully overlay digital information onto images or video of private homes and public spaces. Thus, the collection of location data, as well as access to cameras and the physical spaces occupied by the user, is inherent to product operations. Many XR systems also collect name, address, email, IP address, and other personal information commonly collected by Internet-connected devices. In-product purchasing can add further details to an XR user’s profile, preferences, and interests.

Some XR providers share personal data with subsidiaries and third parties for a number of purposes—from improving content and informing future updates to using XR data for advertising and recommendations for online content.

3. Ok, but what does that mean for my privacy?

User privacy is impacted by data collection, use, and sharing involving XR technology. Data collected by XR systems can often be used to identify, analyze, track, or market to a particular user. Many leading consumer-focused VR headset manufacturers de-identify user data prior to sharing it with third parties, but risks can remain. For example, a recent study showed that a machine learning model trained on individuals’ head and hand movements during a five-minute VR session could identify the original user with 95% accuracy based on this data alone.

Moreover, the analysis of personal data from VR users can reveal sensitive details about the user’s body and life. Much of the tracking experienced in VR involves biometric or biologically-derived information, such as head and hand movement, hand geometry (the measurement and dimensions of a user’s hands), fingerprints, eye gaze and movement, and gait. Because these details are associated with the human body, these elements typically cannot be altered by the user and can reveal intimate details about a user’s height, weight, or medical condition. Additionally, tracking a user’s movements, such as eye gaze, can reveal intimate details about a user’s physical or emotional reactions to content, including a user’s likes and dislikes.

As for AR, information about a user’s location is often collected, whether latitude and longitude coordinates, images or video taken by an AR app of a recognizable location such as a park or a restaurant, or the dimensions or images of a user’s home. Location information is sensitive when recorded over time, as it can lead to inferences about a user such as where they live, work, worship, or seek medical care. Moreover, data collected within the home has historically been considered sensitive, as it could reveal personal information about the activities of a user or their family.

Companies’ data sharing practices should also be on users’ radar as potentially impacting privacy. XR providers may share data with third parties to serve ads to users both in XR environments and other digital contexts. The content of these ads could be largely based on how the user interacts with XR from the types of devices they access, experiences they choose to purchase or download, and how they interact with the technology.

4. Are there any safety risks?

XR developers are increasingly aware of the safety risks associated with immersive and emerging technology and have sought to address them. For example, when a user is fully immersed in a VR world, there is a risk that the user will trip over or bump into real-world objects or people. VR developers address these sorts of safety concerns by requiring the user to create virtual boundaries to prevent themselves from coming into unwanted contact with furniture, walls, and others. Additionally, many VR developers recommend at the start of an XR experience that users make certain they have a wide enough space to move their limbs without obstruction so they can safely enjoy the experience. AR experiences can be safer, particularly when AR content does not obscure a user’s surroundings. But AR content experienced in public runs the risk of turning users’ attention away from other pedestrians, nearby objects, or even moving vehicles.

Other safety concerns include online harassment in social XR. As with other social media, cyberbullying and harassment can impact XR social experiences. However, harassment and cyberbullying in VR can result in even more negative feelings than harassment in other forms of social media because of the immersive nature of VR. For example, reports have shown that women are often subject to harassment in VR by other users, sometimes leading them to choose non-gendered avatars in VR, or to opt out of VR entirely. Developers recognize the potential for especially harmful harassment in VR and are taking steps to mitigate harassment through heavy moderation, reporting tools, and multiple ways for users to quickly exit a VR experience.

Relatedly, deep fakes and other manipulative or harmful content may be present in XR experiences. VR avatars today are generally cartoonish, rather than realistic representations. But it is possible that new risks will emerge as more realistic avatar technology permits malicious users to create or manipulate avatars depicting another person without their consent.

5. My kid has been asking me about an XR toy or game. Is it appropriate for children?

XR can provide fun, engaging, and educational experiences for children, but parents should be aware of age restrictions on XR products and applications. Consumer-facing VR headsets are typically directed to children at least 13 years old and are best enjoyed by teens and adults. On the other hand, many AR games and applications are directed to younger children and available on gaming consoles and other devices enjoyed by kids as young as elementary school.

Adults should be aware that the psychological impact of XR on children has not been widely researched, but some studies indicate that developing minds could be especially susceptible to negative content. Moreover, parents should be aware that cyberbullying and harassment of children and teens that occurs in gaming and on social media can be present in social and gaming XR experiences; this could be especially harmful given the sense of presence children and teens feel in XR.

Screen time is another major concern for many parents. Most VR experiences are designed for shorter playtime sessions than traditional video games, with many titles clocking in at around 30 minutes. Shorter XR sessions can help users avoid the nausea or other negative health impacts associated with extended XR experiences. Unlike traditional gaming, VR often involves moving one’s limbs and, at times, exercise through content that encourages movement, including fitness-specific content. On the other hand, AR content experienced on smart devices and more traditional gaming consoles is often intended for longer and sometimes less active engagement. Parents concerned with screen time will want to consider whether XR will reduce, increase, or have little impact on children’s total screen time.

6. Are there any psychological impacts associated with XR?

Researchers are just starting to unpack the psychological impact of XR. Thought leaders have pointed to potential positive psychological impacts of XR, describing VR in particular as potentially the ultimate empathy machine. Imagine, for instance, a user being able to “walk a mile in someone else’s shoes” in an immersive environment, gaining a greater understanding of someone else’s lived experience through educational and engaging content. However, consumers should consider that the sense of presence users feel in immersive XR might also trigger negative, strange, or uncomfortable emotions in some situations. Additionally, XR can influence the ways in which some users walk, interact with others, and concentrate in real life, for good or for ill. Relatedly, some studies point to XR potentially resulting in body ownership illusions and body dysmorphia.

More research is needed in this area for consumers to gain a better understanding of how activities in the virtual world may affect life offline. In the meantime, users would be wise to be cautious and to consider for themselves how their behavior in XR affects their offline behavior, thoughts, and feelings.

7. What about inclusion? Is XR accessible for everyone?

Many XR developers are currently working to create inclusive experiences for a diverse group of users far beyond the early adopters of VR technology. Early adopters often skew well-off, tech-savvy, and largely male. But XR technologies can bolster inclusion, representation, and diversity. For example, some social VR platforms are adding a wider range of customizable avatar features, including a greater range of skin tones and virtual clothing such as hijabs.

Beyond increased representation in avatars, some developers are making hardware changes to make the technology more accessible. For example, a developer recently designed a specialty VR head strap that more comfortably fits Black women with larger hairstyles and head wraps than the more restrictive strap that comes standard with most VR headsets. Researchers are also designing XR hardware that better maps to women's physiology, aiming to mitigate the higher rates of cybersickness women experience when headsets cannot properly adjust to their field of view and distance between the pupils, which often differ from those of men. Other hardware developments include increasingly lighter headsets that allow for better mobility and more comfortable wear, which is especially important for older users and users with limited muscle coordination.

Additionally, VR developers are introducing a growing number of headsets that are both more user-friendly out of the box and offered at lower price points. Popular consumer VR headsets are now available as standalone devices that no longer need to be tethered to a high-end gaming PC, and many are priced more like other gaming consoles on the market.

Further, there is increasing recognition that XR can provide engaging experiences for the disability community. For example, users with limited mobility can enjoy virtual content that seemingly transports them to locations that were previously inaccessible. More research and development are required to make XR accessible for everyone, but consumers should be aware of current efforts to extend XR to an increasing number of individuals and communities.

Conclusion

The above questions and answers represent just a handful of the many questions, concerns, and fascinations expressed by consumers looking to purchase an XR experience for the holidays. There are already many XR devices, applications, games, and other content on the market, with more on the way. New, improved, and never-before-seen XR technical feats will debut at major events such as January's Consumer Electronics Show (CES), but the questions and answers above will apply equally to them. While stakeholders should continue to promote responsible XR, consumers would be wise to familiarize themselves with the potential risks and benefits of XR this holiday season and beyond.

*Image courtesy of Darlene Alderson from Pexels, available here.

The Complex Landscape of Enforcing the LGPD in Brazil: Public Prosecutors, Courts and the National System of Consumer Defense

Authors: Hunter Dorwart (FPF), Mariana Rielli (DPB) and Rafael Zanatta (DPB)

On Tuesday, November 24, 2020, the Future of Privacy Forum (FPF) and Data Privacy Brasil (DPB) co-hosted a landscape webinar exploring the relationship between Brazil's legal system and the implementation of Brazil's new data protection law, the Lei Geral de Proteção de Dados (LGPD). As a federation, Brazil hosts many separate authorities and courts with their own competencies and powers at the national, state/regional, and municipal levels. Brazil's recently created National Data Protection Authority (NDPA) will operate in a very complex system, alongside well-established enforcers of the law such as consumer protection authorities and public prosecutors, on top of broad private rights of action granted by the LGPD directly to individuals. Because of this complex environment, uncertainty may arise as to how the LGPD will be implemented and enforced in practice.

What are the various legal and regulatory institutions in Brazil that have authority over data protection? Will the implementation of the LGPD create more fragmentation and lead to conflicts between these competencies, or will it help produce more consistency across the board? What solutions exist for resolving potential conflicts within the Brazilian legal system? FPF's Gabriela Zanfir-Fortuna and Data Privacy Brasil's Bruno Bioni and Renato Leite Monteiro convened a panel of experts to discuss these questions.

The speakers included: 

This blog summarizes the contributions of our three guest speakers. It (1) provides background, then focuses on (2) public prosecutors under the Public Ministry, (3) recent case law from the two highest federal Brazilian courts, and (4) the national system of consumer defense, before (5) outlining potential conflicts of competence and (6) offering conclusions.

1. Background 

Since the enactment of the LGPD in 2018, commentators both within and outside of Brazil have pointed out that the lack of clarification around provisions and terms in the LGPD has created uncertainty as to how regulators will implement the law. From a broader perspective, however, the structure of Brazil's legal system makes clarity even more difficult, because many legal institutions in Brazil have competences to enforce consumer protection laws, including on issues that involve data protection and privacy.

In addition, while the LGPD operates as a federal law, state and municipal authorities introduced data and consumer protection measures within their jurisdictions well before the LGPD’s enactment. Such diffusion has created a mosaic of legal competences and introduced a range of complexities that all data protection practitioners should be aware of when engaging with Brazil and its new data protection law. 

Below we discuss how the Public Ministries, recent case law from Brazil’s Constitutional Court and Superior Court, and the National System of Consumer Defense, have each applied their own regulatory and legal authority to affect Brazil’s data protection ecosystem. We also examine the implications of these various authorities for a potential conflict of competences before discussing a proposed amendment to Brazil’s constitution that could provide for more legal harmonization.

2. Public Ministries  

Legally structured by the 1988 Constitution, the Public Ministry (Ministério Público) hosts independent public prosecutors at both the federal and state level. The Ministries have specific functions in Brazil to uphold justice and bring cases at all levels of Brazil's court system, including before the Supreme Federal Court and the state appellate courts. The public prosecutors operate independently from the three major branches of government and help protect constitutional rights by initiating civil actions to adjudicate issues that involve collective rights. There are currently 31 different Public Ministries throughout Brazil.

Every prosecutor in the Public Ministries can start a civil action or procedure if he or she believes there is a basis in law. This relative flexibility has wide implications for data protection, as a public prosecutor may take action under the LGPD outside of the NDPA, which could lead to a uniquely Brazilian way of enforcing and clarifying the law. On the one hand, legal uncertainty may arise from a profusion of individual initiatives by public prosecutors. On the other, the role of the Public Ministries is pivotal, as it could serve as a check on any NDPA action that runs contrary to the public consumer interest.

3. Recent Case Law 

In addition to the Public Ministries, recent case law in Brazil sheds light on some of the unique regulatory challenges facing the implementation of the LGPD. This case law illustrates how data protection was a fundamental issue in the judicial system even before the enactment of the law, particularly in the area of consumer protection. Some of the most important decisions have clarified key issues for data protection, such as the rights of data subjects, the scope of surveillance, and the application of core processing principles like purpose limitation. As such, grasping the implications of this case law is critical for understanding how regulators will implement the LGPD.

The Supreme Federal Court, which serves as Brazil's highest court, recently issued a decision related to Covid-19. In this case (ADI 6387), a provisional measure mandated personal data sharing for statistical purposes as an emergency response to the pandemic. Many organizations throughout Brazil contested the measure, arguing that it did not meet standards of purpose limitation, transparency, and information security, and that it was overly broad. The Court agreed, upholding a higher bar for purpose limitation and many key aspects of the LGPD, as well as clarifying some constitutional issues surrounding data protection.

While the decision has not neutralized all of the risks for data protection in Brazil, it established precedent for lower courts and sent a clear message by recognizing data protection as an autonomous fundamental right. In so holding, the Court acknowledged that other constitutional protections of individuals, such as privacy and due process, explicitly extend to the online world and the protection of personal data. It also clarified that, contrary to arguments made by the Federal Attorney General and the Attorney General of the Republic, there is no irrelevant data in this day and age; even personal data that may seem trivial, such as individuals' names, phone numbers, and addresses, deserves constitutional protection from abuse. The decision notably drew influence from the European Charter of Fundamental Rights.

Another recent case discussed the implications of consent for the credit scoring industry in Brazil. Although obtaining consent is not mandatory for companies that engage in credit scoring, the Superior Court of Justice, the highest court of appeal in the Brazilian jurisdiction, held that such companies must follow data protection standards in the credit scoring process. The Court discussed five broad principles that entities must follow going forward. 

In addition, courts have also independently clarified the right to be forgotten. In DPN v. Google Brasil Internet Ltda in 2018, a lower court in Brazil required search engines to uphold the right of individuals to be forgotten when indexing search results. While the Superior Court of Justice may still decide the scope of this right under the LGPD, this case illustrates that the issue has already received attention from at least one important court in the country and could influence ongoing legal decisions.

Finally, two additional cases shed light on how recent case law has influenced data protection in Brazil. One held that contracts precluding consumers from having a say about the scope of data disclosure were illegal (Case "José Galvão Silva vs Procob SA", Special Appeal 1.758.799, State of Minas Gerais, decided by the Superior Court of Justice in November 2019). Another required the government of São Paulo to remove cameras from the Metro, taking issue with such pervasive installation of surveillance equipment.

4. National System of Consumer Defense 

The National System of Consumer Defense (SNDC) also raises complexities for the implementation of the LGPD in Brazil. Established with the Brazilian Code of Consumer Protection in 1990 and regulated by Presidential Decree nº 2.181/1997, the SNDC brings together federal, state, and municipal agencies, as well as civil society organizations, to prevent, investigate and prosecute violations of consumer protection law. As a broad institutional framework for consumer protection, the SNDC has over 30 years of experience and covers 798 units spread across 591 Brazilian cities. 

The Procons (Procuradoria de Proteção e Defesa do Consumidor) function within the National System to help consumers file complaints administratively, provide instructions and information about consumer rights, and verify judgments. The Procons have issued several decisions related to data protection over the years that have attracted attention. For example, a 2019 decision by the Procon in São Paulo resulted in a large fine for Google and Apple for imposing unfair terms for the use of FaceApp without making those terms available in Portuguese. In 2020, the Procon-SP reached an agreement with the energy distributor Enel over consumer complaints of inflated and incorrect bills. In the agreement, the Procon stipulated that Enel must demonstrate the security and technical measures it will take to ensure the problem does not recur.

While the SNDC can take separate enforcement measures against companies that violate consumer protection laws, including those operating online, coordination problems between this competency and the LGPD may arise in the future. Article 18 of the LGPD states that data subjects can exercise all of their rights before consumer-defense entities such as the SNDC. However, Article 55(k) also specifies that the Brazilian DPA will have the final say in interpreting such rights. Because these two institutions may conflict in their interpretations of these issues, cooperation could be hindered, resulting in more legal confusion and fragmentation. The LGPD may have anticipated such a scenario, since Article 55(k) also states that the NDPA will coordinate its activities with those of other bodies and entities with sanctioning or normative competencies related to the protection of personal data, and that it will be the central body for the interpretation and implementation of the law. How this will play out is something to watch in the coming months (considering that the administrative sanctions provided for by the LGPD will only be enforceable after August 2021). 

5. Conflicts of Competencies 

Indeed, conflicts between all three of the institutions mentioned above may surface with the implementation of the LGPD. Because each of these authorities has competencies over online consumer protection, a ruling or judgment from one could be inconsistent with enforcement actions taken by the NDPA, especially given the ambiguity and lack of clarification around specific terms and provisions within the LGPD. Such conflict could create further uncertainty as to how data protection standards will be applied within the unique and complex institutional structure of Brazil's legal system.

While there are many potential ways these conflicts could be resolved, it is hard to predict exactly how the process will play out. The LGPD does not strip other competent bodies of their authority to enforce data protection in Brazil, nor will it dismantle the Brazilian legal system. However, it does state that the various public bodies engaged in data protection will coordinate with one another to fulfill their duties effectively.

The challenge lies in generating the operational capacity for cooperation within the Brazilian government itself, given that the employees and staff within these institutions change over time. Currently, the NDPA has brought together experts from different subject areas to create a National Council within the Data Protection Authority to provide technical and operational guidance on solving some of these institutional issues. As the NDPA gains more experience, some of these larger potential sources of conflict will hopefully be addressed.

Finally, a proposed amendment to Brazil’s constitution (Proposal of Constitutional Amendment n. 17/2019) could also help provide more coherency and coordination between the various institutions that enforce data protection. The proposed amendment would explicitly recognize data protection as a constitutional right, give exclusive competence over data protection to the Union (seeking to avoid regulation with antagonistic results), and ensure that the NDPA has functional, financial, and administrative independence to exercise authority under the law. 

6. Conclusion

Brazil has come a long way in constructing a solid data protection framework, of which the LGPD is a central part. Before the LGPD, the protection of individuals' personal data was pursued mainly through a robust system of consumer protection that brings together the Public Ministries, several administrative bodies such as the Procons, and civil society organizations.

The LGPD standardized the regulation of personal data protection in Brazil, creating general obligations for all sectors and systematizing the rights of data subjects. It has been driving compliance efforts by companies and the public sector alike, and the debates it has generated certainly represent the most important movement toward consolidating a data protection culture in the country.

However, it is essential to note that the law operates within an existing framework and therefore must be harmonized with other norms and institutions. Regulators face the challenge of interpreting and advancing the right to data protection while remaining coherent across the institutional competences that supervise and enforce the law.

In that sense, the LGPD has proposed coordination among all bodies that may have overlapping competencies on data protection, with the NDPA serving as the nerve center for interpreting the law and developing implementation guidelines. This suggests that cooperative initiatives lie ahead, but it is too early to know what issues may arise from the combination of the several different paths for data protection enforcement that the Brazilian legal environment provides, or how those issues will eventually be addressed and resolved.

This scenario makes harmonizing the interpretation of the General Personal Data Protection Law challenging. For companies operating in Brazil, it requires a more sophisticated capacity for mapping legal risks. For Brazilian authorities, it demands a greater capacity for institutional coordination. For civil society, it demands a broader monitoring capacity and multiple dialogues with authorities. For all stakeholders, the challenge is significant. As composer Antonio Carlos Jobim once said, "Brazil is not for beginners."

***

Learn more about Data Privacy Brasil HERE.

Policy Brief: Location Data Under Existing Privacy Laws

The Future of Privacy Forum released a new policy brief, Location Data Under Existing Privacy Laws.

Defining and regulating location data in a privacy law can be an elusive challenge. In part, this is due to its ubiquity in our lives: information about how devices and people move through spaces over time is utilized by Wi-Fi networks, smartphones, mobile apps, and a world of emerging screenless technologies, such as wearable fitness devices, scooters, autonomous vehicles, and video analytics. Existing legal and self-regulatory regimes in the United States (and globally) approach location data in a variety of ways that may serve as a model for policymakers. 

Read the policy brief to learn about the challenges associated with defining location data, when location data is considered “personal” data, and the specific legal protections for location data in the United States and around the world.