FPF Health and AI & Ethics Policy Counsels Present a Scientific Position at ICML 2020 and at 2020 CCSQ World Usability Day

On November 12, 2020, FPF Policy Counsels Drs. Rachele Hendricks-Sturrup and Sara Jordan presented on privacy-by-design and human-centered design concepts during the 2020 CCSQ World Usability Day virtual conference. This presentation followed Drs. Hendricks-Sturrup’s and Jordan’s July 2020 scientific position paper presented at the International Conference on Machine Learning (ICML) 2020, entitled “Patient-Reported Outcomes: A Privacy-Centric and Federated Approach to Machine Learning.”

Drs. Hendricks-Sturrup and Jordan argued that patient-reported outcomes (PROs) data, given its nature, requires special privacy and security considerations when used within on-device or federated machine learning constructs, as well as in the development of artificial intelligence platforms. Patient-reported outcomes are a raw form of patient expression and feedback. They help clinicians, researchers, medical device and drug manufacturers, and governmental stakeholders overseeing medical device and drug development, distribution, and safety to monitor, understand, and document, in a readable format, patients’ symptoms, preferences, complaints, and/or experiences following a clinical intervention. Gathering and using such data requires careful attention to security, data architecture, data use, and machine-readable consent tools and privacy policies.

Even so, on-device patient-reported outcome measurement tools, like patient surveys within third-party mobile apps that use machine learning, may employ the best machine-readable privacy policies or consent mechanisms yet still leave key components of privacy protection up to the patient-user. Keeping data in the hands of users opens those users up to unanticipated vectors of attack from adversaries seeking to extract valuable machine learning models or to uncover data about a specific patient.

Drs. Hendricks-Sturrup and Jordan recommended that developers of patient reported outcome measurement systems leveraging federated learning architectures:

  1. Intentionally design user device security, as well as security for transmission of either data (raw or processed) or model gradients, to the highest level of protections that do not degrade essential performance for critical health and safety monitoring procedures (e.g. remote monitoring for clinical trials, post-market drug safety surveillance, hospital performance scores, etc.);
  2. Ensure that models are not compromised and that valuable machine learning spending is not lost to competitors; 
  3. Design systems to operate atop a federated machine learning architecture, both when model components are sent and when gradients are received, to ensure the privacy of users’ data; and
  4. Design learning algorithms with algorithmic privacy techniques, such as differential privacy, which are essential to securing valuable and sensitive PRO data.
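To make the last recommendation concrete, here is a minimal sketch in the spirit of DP-SGD-style federated updates: each device clips its gradient and adds calibrated Gaussian noise before anything leaves the device. The clipping bound, noise multiplier, and gradient values below are invented for illustration and are not drawn from the paper.

```python
# Illustrative sketch only: clip a per-device gradient update and add
# Gaussian noise before transmission, so the raw gradient never leaves
# the device. All parameter values here are hypothetical.
import math
import random

def clip_gradient(grad, clip_norm=1.0):
    """Scale the gradient down so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]

def privatize_gradient(grad, clip_norm=1.0, noise_multiplier=1.1, seed=None):
    """Clip the gradient, then add Gaussian noise scaled to clip_norm."""
    rng = random.Random(seed)
    clipped = clip_gradient(grad, clip_norm)
    sigma = noise_multiplier * clip_norm
    return [g + rng.gauss(0.0, sigma) for g in clipped]

# A device would send the noisy update, never the raw gradient:
raw = [3.0, -4.0]  # L2 norm 5.0, well above the clip bound
noisy = privatize_gradient(raw, clip_norm=1.0, noise_multiplier=1.1, seed=42)
```

Clipping bounds each user's influence on the model, and the noise makes any single patient's contribution statistically deniable; real deployments track the cumulative privacy budget across training rounds.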

Drs. Hendricks-Sturrup’s and Jordan’s paper, poster, and presentation can be found at these links:

ICML paper

ICML poster

CCSQ World Usability Day recording

CCSQ World Usability Day slides

To learn more about FPF’s Health & Genetics and AI & Ethics initiatives, contact Drs. Hendricks-Sturrup and Jordan, respectively, at: [email protected] and [email protected]

Workshop Report: Privacy & Pandemics – Responsible Use of Data During Times of Crisis

In October 2020, the Future of Privacy Forum (FPF) convened a virtual workshop entitled “Privacy and Pandemics: Responsible Uses of Technology and Health Data During Times of Crisis” with invited computer science, privacy law, public policy, social science, and health information experts from around the world. The workshop examined benefits, risks, and strategies for the collection and use of data in support of public health initiatives responding to COVID-19 and in consideration of future public health crises. With support from the National Science Foundation, Intel Corporation, the Duke Sanford School of Public Policy, and Dublin City University’s SFI ADAPT Research Centre, the workshop identified research priorities to improve data governance systems and structures in the context of the COVID-19 pandemic.

To learn more about FPF’s work related to privacy and pandemics, please visit the Privacy & Pandemics page.

Drawing on the expertise of workshop participant submissions and session discussions, FPF prepared a workshop report, which was submitted to the National Science Foundation for use in planning the Convergence Accelerator 2021 Workshops. This NSF program aims to speed the transition of convergence research into practice to address grand challenges of national importance. The final submitted workshop report is also available on our website.

Based on analysis of expert positions reflected in the workshop, FPF recommends that NSF consider a roadmap for research, practice improvements, and the development of privacy-preserving products and services to inform responses to the COVID-19 crisis and to prepare for future pandemics or other public crises; the roadmap is set out in the full workshop report.

To learn more about the Privacy & Pandemics conference, including information about the topics, participants, sessions, and presentations, and to read the final workshop report, head to the event page.

South Korean Personal Information Protection Commission Announces Three-Year Data Protection Policy Plan

by Jasmine Park

On November 24, 2020, the South Korean Personal Information Protection Commission (PIPC), the nation’s central administrative agency tasked with protecting the privacy rights of individuals by enforcing the country’s privacy laws, released its revised three-year “Personal Information Protection Master Plan” (‘21-‘23). A wide range of policies that balance both the protection and use of personal information will be implemented at the national level, such as improving the system for obtaining consent when collecting personal information, providing incentives for self-regulation, and reforming the system regulating the cross-border transfer of personal information. One innovative area where the PIPC will play a leading role is developing a comprehensive support system for the use of pseudonymized data. 

The plan includes three key strategies:

  1. Confident protection of personal information;
  2. Secure use of personal information that increases the value of data; and
  3. A fair balance between the protection and use of personal information, with the PIPC acting as the “control tower.”

The first strategy aims to 1) reinforce data subjects’ rights and promote citizens’ privacy literacy, 2) create a business ecosystem of voluntary personal information protection, and 3) advance personal information protection systems in the public sector. The second strategy will 1) support the safe use of personal information, 2) eliminate blind spots in the digital transformation environment, and 3) create a safer environment for personal information through research and development. The third strategy sets out to 1) take stern measures against and respond promptly to privacy violations, 2) build national governance for personal information protection, 3) strengthen global cooperation on personal information, and 4) reinforce the PIPC’s leadership as a unified supervisory body.

PIPC Chairman Yoon Jong-in also presented a report on the plan to the State Council of South Korea on November 24, 2020. The plan revises the “4th Personal Information Protection Plan” which was announced earlier in February 2020, and lays out the driving strategy and direction of major policies for the next three years, including the government’s plan for personal information protection. A need to revise the “4th Personal Information Protection Plan” arose due to the establishment of the PIPC as a central administrative agency on August 8, 2020, under the “Amendments to the Three Data Privacy Laws”, and the socially distanced and digital society brought about by COVID-19. 

Therefore, the PIPC revised its plan after conducting an analysis of the environment, public surveys, and system research. Yoon Jong-in announced that the new plan will take effect in 2021 on the 10th anniversary of the enactment of PIPA, and encouragingly stated that “if the past decade was the time to lay the foundation for personal information protection in Korea, the next decade is the key to action. Built on trust in the data economy, we will do our best to implement the personal information protection plan so data can be used safely and well.”

The PIPC also aims to strengthen policies that confidently protect people’s personal information in the private and public sectors. It sets out to improve the process of obtaining consent when collecting personal information, introduce new rights such as data portability, and effectively protect people’s control over their personal information in step with the changing times. In addition, the PIPC plans to enhance self-protection of personal information by encouraging people to protect their own data in light of its sensitivity, providing incentives to businesses based on their performance in voluntarily protecting personal information under a self-regulatory system, and developing professional expertise.

The PIPC will also expand the existing standards for privacy impact assessments by considering emerging privacy risk factors from new technologies, and expand the data breach incident factors assessment standards to prevent data breaches in the public sector. The public sector itself will take the lead on strengthening the foundation of personal information management, raising the standard through on-site inspections.

Further, in an economy increasingly driven by data, the PIPC will activate a pseudonymized data system to ensure personal information is used securely, and will develop personal information protection systems and technologies. South Korea’s data protection law, the Personal Information Protection Act (PIPA), was amended in January 2020. The amendments centralize the data regulatory functions of the PIPC (established as an administrative agency in September 2011), the Ministry of the Interior and Safety (MOIS), and the Korea Communications Commission (KCC) under the PIPC, elevating it to the central data privacy regulatory authority in South Korea. While the PIPA has laid the foundation for processing and using pseudonymized personal information, protections must be continuously enhanced, so the PIPC will develop a comprehensive support system and operate a government council to this end. The system will enable the combination of pseudonymized information, including an application process and guidelines for submitting, receiving, and combining pseudonymized information, generating combined key-linked information, and managing the status of combinations.

The PIPC will also develop new protection standards for a digital society where new technologies such as artificial intelligence, cloud, and self-driving technologies have become widespread, and will actively review and seek to improve regulatory sandboxes that have been proven to require modification. 

Finally, as the nation’s personal information protection “control tower,” the PIPC aims to strengthen its role in personal information protection domestically and internationally, and to lead public-private global governance in balancing the protection and use of personal information. The PIPC also announced that it will increase inspections of public institutions that hold large-scale personal information, carry out strict investigations and enforcement, and convene a joint government consulting body to respond to data breaches. The PIPC will serve as a one-stop shop for advice related to personal information protection, with complaint handling among its most anticipated functions. It will also assess and improve the cross-border data transfer system in response to increasing overseas data transfers, including by reviewing the diversification of cross-border transfer requirements, such as non-consent standard contracts.

With thanks to Caroline Hopland for her contribution. 

***********************************************************

South Korean Personal Information Protection Plan Information (In Korean)
PIPC Launch Policy Vision Timeline

Machine Learning and Speech: A Review of FPF’s Digital Data Flows Masterclass

On Wednesday, December 9, 2020, FPF hosted a Digital Data Flows Masterclass on Machine Learning and Speech. It is the first class in a new series, following the completion of the original eight-topic VUB-FPF Digital Data Flows Masterclass series.

Professor Marine Carpuat (Associate Professor in Computer Science at the University of Maryland) opened with an introduction to the more advanced aspects of (un)supervised learning, explaining mathematical models with remarkable clarity for an audience without a mathematical background. Dr. Prem Natarajan (VP, Alexa AI-Natural Understanding) then guided the class through the intricacies of machine learning in the context of voice assistants, bringing a practitioner’s unique examples. The presenters explored the differences between supervised, semi-supervised, and unsupervised machine learning (ML) and the impact of these recent technical developments on machine translation and speech recognition technologies. In addition, they briefly explored the history of machine learning to draw lessons from the long-term development of artificial intelligence and put new advancements in context. FPF’s Dr. Sara Jordan and Dr. Rob van Eijk moderated the discussion following the expert overviews.

The recording of the class can be accessed here.

Machine Translation – An Introduction 

As a sub-field of computational linguistics, machine translation studies the use of software to translate text or speech from one language to another. Applications that use machine translation find the best fit for a translation based on probabilistic modelling: for any given sentence in a language, a translation model selects the candidate output with the highest probability of resembling a human translation.

Machine translation has made considerable progress over the years. Previous models would break down simple sentences into individual words without taking into account the context of the sentence itself. This often led to incomplete, simplistic, or wholly incorrect translations. The introduction of deep learning and neural networks to machine translation greatly increased the quality and fluency of these translations by allowing translation tools to put the meaning of each word within the overall context of the sentence. 

Yet for all the progress, many challenges lay ahead. Deep learning requires very large datasets and paired translation examples (e.g., English to French), which might not exist for certain languages. Even when there is a large pool of data, sentences often have multiple correct translations, which poses additional problems for machine learning. In addition, the deployment of these models in the real world has also raised concerns, as translation errors can sometimes have severe consequences. 

Machine translation tools generally use sequence-to-sequence models (Seq2Seq) that share architecture with word embedding and other representation models. Seq2Seq models convert sequences from one language (e.g., English) to generate text in another (e.g., French). Seq2Seq models form the backbone of more accurate natural language processing (NLP) tools due to their use of recurrent neural networks (RNN), which allow models to take into account the context of the former input word when processing the next output word. Developers use Seq2Seq models for multiple use cases including dialogue construction, text summarization, question answering, image captioning, and style transfer. 

Word embedding refers to language processing models that map words and phrases onto vector space of real numbers through neural networks or other probabilistic models. This allows translation models to reduce confusion arising from a large dimension of possible outputs and better represent the context of a sentence in which words appear. These tools generally increase the performance of Seq2Seq models and NLP generally because they help models understand the context in which words appear. 
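The vector-space idea above can be shown with a toy example: words mapped to vectors, with cosine similarity as a proxy for relatedness. Real embeddings are learned from data; the three-dimensional vectors below are invented for illustration.

```python
# Toy word embeddings: semantically related words get similar vectors,
# so their cosine similarity is higher. Vector values are made up.
import math

embeddings = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.7, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

def cosine(u, v):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Related words end up closer in vector space than unrelated ones:
sim_royal = cosine(embeddings["king"], embeddings["queen"])
sim_fruit = cosine(embeddings["king"], embeddings["apple"])
```

Because closeness in this space tracks closeness in meaning, a model comparing vectors rather than raw strings can resolve context that a word-by-word lookup would miss.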


Figure 1. How to Train a Machine Learning Model?

How to Train a Machine Learning Model – Supervised, Unsupervised, and Semi-Supervised Learning 

Training neural models with limited translation data poses multiple problems (see also, Figure 1). For instance, situations calling for a more formal translation need plenty of formal writing data to achieve accuracy and fluency. But issues may arise if the model must account for different writing styles or more informal contexts. Fortunately, developers can apply different training paradigms, including supervised, unsupervised, and semi-supervised learning, to better train models with diverse and limited translation data. 

Supervised learning is a process of learning from labeled examples. Developers feed the model with instances of previous tests and answer keys to improve the model’s accuracy and precision. Supervised learning allows the model to match input source and output target sequences with correct translations through parallel sampling, which refers to sampling a limited subset of data for one language against a whole body of data for another. In other words, the translation model can learn from examples of text in one language to better calculate the probability of a correct translation in another. Under this type of learning, it is important to create parameters that give a high probability of success to avoid overfitting and make the model generalizable across contexts. 
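A minimal sketch of learning from paired examples: count which target-language word co-occurs most often with each source word across parallel sentence pairs. Real supervised translation models are vastly more sophisticated; the sentence pairs below are invented.

```python
# Toy supervised learning from parallel (paired) data: word-level
# co-occurrence counts across English-French sentence pairs.
from collections import Counter, defaultdict

pairs = [("the cat", "le chat"), ("the dog", "le chien"), ("a cat", "un chat")]

cooccur = defaultdict(Counter)
for en, fr in pairs:
    for e in en.split():
        for f in fr.split():
            cooccur[e][f] += 1  # count every pairing within aligned sentences

def translate_word(word):
    """Pick the target word most often seen alongside the source word."""
    return cooccur[word].most_common(1)[0][0] if word in cooccur else word
```

Even this crude counting recovers "cat" → "chat" from the labeled pairs, which is the essence of supervised learning: the answer keys let the model estimate which outputs are probable.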

On the other end of the spectrum, unsupervised learning gives previous tests to the model without answer keys. Under this type of training, models learn by making connections between unpaired monolingual samples in each language. Put simply, machine translation models study individual words in a language (e.g., through a dictionary) and draw connections between those words. This helps the model find similarities or patterns in language across the data and predict the next word in a sentence from the previous word in that sentence. 
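The unsupervised case can be sketched just as simply: counting word co-occurrences in unlabeled text and predicting the next word from the previous one. The corpus below is a made-up stand-in for a large monolingual dataset.

```python
# Toy unsupervised learning: bigram counts over unlabeled text, used to
# predict the next word from the previous word. No answer keys involved.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ran on the grass".split()

bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent follower of `word` seen in the corpus."""
    followers = bigrams.get(word)
    return followers.most_common(1)[0][0] if followers else None
```

No labels were provided, yet the model has learned a pattern ("the" is usually followed by "cat" in this corpus) purely from the structure of the data.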

Unsupervised learning is a simple yet powerful tool. It works extremely well when the developer has a large pool of data to help the model reinforce its understanding of the syntactic or grammatical structures between languages. For example, deep learning encodes an English sentence and plugs it into the French output model, which then generates a translation based on the cross-lingual correspondences it learned from the data. While many applications of unsupervised learning are still at the proof of concept stage, this learning technique offers a promising avenue for language processing developers.

Finally, semi-supervised learning combines elements of both supervised and unsupervised learning and is the primary training method for translation models. The model learns from a multitude of unpaired samples (unsupervised learning) but then receives paired samples (supervised learning) to reinforce cross-lingual connections. Put simply, unsupervised training helps the model learn syntactic patterns in languages, while supervised training helps the model understand the full meaning of a sentence in context. Combinations of both training techniques strengthen syntactic and semantic connections simultaneously.

Limitations and Challenges of Machine Translation 

While the accuracy and quality of machine translation have increased through supervised, unsupervised, and semi-supervised learning, there are still many obstacles to scaling across the 7,111 languages spoken in the world today. Indeed, while the lack of data poses its own challenges for training purposes, machine translation still runs into problems with adequacy, accuracy, and fluency even in languages where plenty of data exists.

Developers have addressed these problems by attempting to quantify adequacy and compute the semantic similarity between machine translation and human translation. One common method is BLEU, which counts exact matches of word sequences between a human translation and the machine output. Another, BVSS, combines word pair similarity into a sentence score to measure a translation against a real-world output. In practice, developers need to use both metrics together in order to make machine translation more adequate and fluent.
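The core of the BLEU idea can be sketched as clipped n-gram overlap between a candidate and a reference. This is a heavily simplified stand-in: real BLEU combines several n-gram orders with a brevity penalty, and the sentences below are invented.

```python
# Simplified BLEU-style scoring: fraction of candidate n-grams that also
# appear in the human reference, with counts clipped by the reference.
from collections import Counter

def ngram_precision(candidate, reference, n=1):
    cand = candidate.split()
    ref = reference.split()
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    # Clip each candidate n-gram count by its count in the reference.
    overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
    return overlap / max(1, sum(cand_ngrams.values()))

reference = "the cat is on the mat"
good = ngram_precision("the cat is on the mat", reference, n=2)
poor = ngram_precision("cat the mat on", reference, n=2)
```

Bigram precision (n=2) is what makes word order matter: a scrambled candidate shares words with the reference but few word sequences, so it scores far lower.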

From a high-level perspective, neural models employing machine learning produce more fluent translations than older statistical models. Forward-looking models are beginning to build systems that adapt to different audiences and can incorporate other metrics of translation such as formality and style. While much work is needed in both supervised and unsupervised learning, case studies have already revealed a promising ground for future improvements.

Machine Learning and Speech 

Many of the lessons learned in recent advancements of machine translation also apply to the area of speech recognition. The introduction of deep learning and automatic features in both fields has accelerated development, while bringing more complexity and new challenges to training and refining machine learning models. As the applications of machine translation and speech recognition become more prevalent in the economy, it is important to understand where the field came from and where it is going.

Like machine translation, speech recognition has a long history that stretches back to the invention of artificial intelligence (AI) as a formal academic discipline in the 1950s. The introduction of neural networks and statistical models in the 1980s and 1990s marked a watershed moment in the development of machine learning and speech. Inputting human interaction directly into the machine helped create trainable models and gave impetus to a variety of speech recognition projects. 

Yet while the idea of machine learning through neural networks became widespread in the discipline, the lack of computing power and data inhibited its development. With widespread increases in computational power in the 1990s and 2000s, researchers were able to introduce automatic machine learning tools, which greatly accelerated the development of speech recognition and machine translation. But as automatic models became more sophisticated at recognizing inputs and reducing recognition errors, they also became less interpretable.


Figure 2. Historical overview of the technological evolution of AI/ML.

Since then, correcting for poor interpretability in these models has involved a range of training methods, including supervised and unsupervised learning. The widespread availability of data has helped this process because developers can now find novel ways to make automatic features more reliable and accurate. 

As speech recognition technology begins to interface with humans through conversational AI, the competency of these models may greatly improve. This is because engaging with data in real time allows machine learning models to adapt to novel situations when presented with new information or new vocabulary that it did not previously know. To this end, research is currently moving in the direction of parameter defining and error evaluation in order to make training more accurate and effective. 

To be sure, advancements in human-computer interface have a history going back to the era of Markov Models in the 1980s (see, Figure 2). However, today the proliferation of practical speech recognition applications fueled by automatic features has underscored a change in the pace of research. In addition, the role of data, including less labelled data, has also become increasingly important in the process of training models. 

Advancements in Training Models 

Indeed, present speech recognition technology utilizes a range of learning methods at the cutting edge of deep learning, including active learning, transfer learning, semi-supervised learning, learning to paraphrase, self-learning, and teaching AI. 

Active learning refers to a training process where different loss functions divide training data in different ways to test the model’s adaptability to new contexts with the same data. Transfer learning takes data-rich domains and finds ways to transfer what is learned to new domains that lack corresponding data. For instance, in the context of language learning, this type of learning takes lessons learned from one language and transfers them to another.

As discussed above, semi-supervised learning trains large models with a combination of supervised and unsupervised learning and uses such models to train smaller models and filter unlabeled data. Indeed, with semi-supervised training, the current trend is moving away from supervised learning and reducing reliance on annotated data for training purposes.
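The teacher/student pattern described here can be sketched as pseudo-labeling: a model trained on labeled data assigns labels to unlabeled text, and only confident labels are kept for training a smaller model. All data, the word-voting "teacher", and the confidence threshold below are invented for illustration.

```python
# Hypothetical pseudo-labeling sketch: a crude "teacher" labels
# unlabeled text, and low-confidence labels are filtered out.
from collections import Counter

labeled = [("great fun", "pos"), ("really great", "pos"),
           ("awful mess", "neg"), ("really awful", "neg")]

# "Teacher": count which label each word appears with in the labeled set.
word_votes = {}
for text, label in labeled:
    for word in text.split():
        word_votes.setdefault(word, Counter())[label] += 1

def pseudo_label(text, min_margin=1):
    """Label text by word votes; return None when the vote margin is small."""
    votes = Counter()
    for word in text.split():
        votes.update(word_votes.get(word, Counter()))
    if not votes:
        return None
    (top, c1), *rest = votes.most_common()
    margin = c1 - (rest[0][1] if rest else 0)
    return top if margin >= min_margin else None

unlabeled = ["great movie", "awful noise", "really something"]
kept = [(t, pseudo_label(t)) for t in unlabeled if pseudo_label(t) is not None]
```

The ambiguous sample ("really something") is dropped rather than guessed, which is exactly how filtering unlabeled data keeps the student model from learning the teacher's mistakes.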

Learning to paraphrase refers to a learning process where a model will take a first pass at guessing the meaning of an input, receive a new paraphrased version of the input, and then attempt again at decoding a correct output. Related to this, self-learning takes signals from the real-world environment and makes connections between data, without requiring any human intervention in the process. Finally, teaching AI allows models to integrate natural language with common-sense reasoning and learn new concepts through direct and interactive teaching from customers. 

Each of these training methods marks a departure from past learning paradigms that required direct supervision and labelled data from human trainers. While limitations in the overall architecture of training models still exist, major areas of research today focus on approximating the performance of supervised learning models with less labelled data. 

Self-Learning in New Applications and Products 

These advancements are coinciding with the proliferation of consumer products designed and optimized for conversational translation. Speech recognition converts audio into sequences of phonemes and then uses machine translation to select the best output between alternate interpretations of those phonemes. New applications on the consumer market can directly learn through their interaction with the consumer by combining traditional machine translation models with neural language models that apply AI and self-learning. 
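The "select the best output" step can be sketched with the classic homophone pair "recognize speech" vs. "wreck a nice beach": a recognizer proposes several word sequences for the same audio, and a language model score picks the most plausible one. The toy word frequencies below stand in for a real neural language model and are invented.

```python
# Hypothetical rescoring sketch: rank alternate interpretations of the
# same phoneme sequence by a toy language-model score (word frequencies).
word_freq = {"recognize": 50, "speech": 40, "wreck": 5, "a": 10,
             "nice": 30, "beach": 20}

def lm_score(sentence):
    """Sum of per-word frequencies; unknown words score zero."""
    return sum(word_freq.get(w, 0) for w in sentence.split())

# Two hypotheses that sound nearly identical when spoken:
hypotheses = ["recognize speech", "wreck a nice beach"]
best = max(hypotheses, key=lm_score)
```

Acoustically the two hypotheses are close, so it is the language model, not the audio, that decides which interpretation the product shows the user.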

Learning models that can act more independently and filter unlabelled data will become more effective in adapting to novel contexts and processing language more accurately and fluently. Already, developers are improving the level of fluency in these products by incorporating conversational-speech data that includes tone of voice, formality, and colloquial expressions in the training. 

The recording of the class can be accessed here.

Digital Data Flows Masterclass Archive

The Digital Data Flows Masterclass Archive, with the recordings and resources for previous classes, can be accessed here.

To learn more about FPF in Europe, visit https://fpf.org/eu.

How Federal Privacy Legislation Could Affect US-EU-UK Relations

With more than 30 bills filed in the United States Congress since 2018 to regulate privacy, and with overwhelming public support for such regulation, it looks like the US might be going through a ‘privacy renaissance’. This week, FPF Senior Counsel Dr. Gabriela Zanfir-Fortuna’s article, “America’s ‘privacy renaissance’: What to expect under a new presidency and Congress,” was featured on the Ada Lovelace Institute blog. In the article, Gabriela lays out similarities and differences between the two major federal privacy bills currently in Congress, the SAFE DATA Act and the Consumer Online Privacy Rights Act (COPRA), before breaking down the potential implications that a federal privacy bill passed under the incoming Biden administration would have on US-EU-UK relations, at a time the President of the EU Commission has called “an unprecedented window of opportunity to set a joint EU-US tech agenda.”

You can read Gabriela’s full article on the Ada Lovelace Institute website.

Seven Questions to Ask if You Have XR on Your Holiday Wish List

The holidays are right around the corner, and with so many of us sheltering in place in response to COVID-19, some are looking for an escape from the same four walls. Enter XR to help virtually transport us to new worlds, immersive games, and social interactions. XR, or extended reality, is an umbrella term for virtual reality (VR), augmented reality (AR), and other immersive technology. In 2020, we have seen increasingly accessible and affordable XR on the market, and undoubtedly much of this technology is at the top of holiday wishlists. But new tech raises new questions about the privacy and security of personal data. Below are seven questions consumers should be asking before buying an XR gift for a loved one (or themselves) this holiday season, along with answers to better understand the implications of jumping into virtual (or augmented) reality.

Top questions to consider when purchasing a gift for the holidays:

  1. What is XR anyway? And for that matter, what are VR and AR? Is there a difference?
  2. Will my XR system collect or share my data?
  3. Okay, but what does that mean for my privacy?
  4. Are there any safety risks?
  5. My kid has been asking me about an XR toy or game. Is it appropriate for children?
  6. Are there any psychological impacts associated with XR?
  7. What about inclusion? Is XR accessible for everyone?

1. What is XR anyway? And for that matter, what are VR and AR? Is there a difference?

The terms virtual reality (VR), augmented reality (AR) and extended reality (XR) are often used interchangeably but there are significant differences. XR serves as an umbrella term for AR, VR, and other immersive technology, but VR and AR are separate technologies that offer different experiences. Consumers should understand the differences between these terms prior to making a holiday purchase.

2. Will my XR system collect or share my data?

XR products collect personal information and other data about the user and how the user interacts with the product. Most immersive products must collect a large swath of data in order to function. Some estimates show that twenty minutes of VR use can generate approximately two million data points and unique recordings of body language, biometrics, or other physiological information. Unique recordings might include a user’s fingerprint, scans of hand or face geometry, iris and eye tracking, as well as other movement patterns, skin temperature, and heart rate. AR use may require a vast amount of spatial and mapping data to successfully overlay digital information onto images or video of private homes and public spaces. Thus, the collection of location data, as well as access to cameras and the physical spaces occupied by the user, is inherent in the product’s operation. Many XR systems also collect names, addresses, email addresses, IP addresses, and other personal information commonly collected by Internet-connected devices. In-product purchasing can add further details to an XR user’s profile, preferences, and interests.

Some XR providers share personal data with subsidiaries and third parties for a number of purposes—from improving content and informing future updates to using XR data for advertising and recommendations for online content.

3. Okay, but what does that mean for my privacy?

User privacy is impacted by data collection, use, and sharing involving XR technology. Data collected by XR systems can often be used to identify, analyze, track, or market to a particular user.  Many leading consumer-focused VR headset manufacturers de-identify user data prior to sharing it with third parties, but risks can remain. For example, a recent study showed that a machine learning model trained on individuals’ head and hand movements during a five minute VR session could identify the original user with 95% accuracy based on this data alone.

Moreover, the analysis of personal data from VR users can reveal sensitive details about the user’s body and life. Much of the tracking experienced in VR involves biometric or biologically derived information, such as head and hand movement, hand geometry (the measurement and dimensions of a user’s hands), fingerprints, eye gaze and movement, and gait. Because these details are associated with the human body, these elements typically cannot be altered by the user and can reveal intimate details about a user’s height, weight, or medical condition. Additionally, tracking a user’s movements, such as eye gaze, can reveal intimate details about a user’s physical or emotional reactions to content, including a user’s likes and dislikes.

As for AR, information about a user’s location is often collected, whether latitude and longitude coordinates, images or video taken by an AR app of a recognizable location such as a park or a restaurant, or the dimensions or images of a user’s home. Location information is sensitive when recorded over time, leading to inferences about a user such as where they live, work, worship, or seek medical care. Moreover, data collected within the home is historically considered sensitive, as it could reveal personal information about the activities of a user or their family.

Companies’ data sharing practices should also be on users’ radar as potentially impacting privacy. XR providers may share data with third parties to serve ads to users both in XR environments and other digital contexts. The content of these ads could be largely based on how the user interacts with XR: the types of devices they access, the experiences they choose to purchase or download, and how they engage with the technology.

4. Are there any safety risks?

XR developers are increasingly aware of the safety risks associated with immersive and emerging technology and have sought to address them. For example, when a user is fully immersed in a VR world, there is a risk that the user will trip over or bump into real-world objects or people. VR developers address these sorts of safety concerns by requiring the user to create virtual boundaries to prevent themselves from coming into unwanted contact with furniture, walls, and others. Additionally, many VR developers recommend at the start of an XR experience that users make certain they have a wide enough space to move limbs without obstruction to safely enjoy the experience. AR experiences can be safer, particularly when AR content does not obscure a user’s surroundings. But AR content experienced in public runs the risk of turning users’ attention away from other pedestrians, nearby objects, or even moving vehicles.

Other safety concerns include online harassment in social XR. As with other social media, cyberbullying and harassment can impact XR social experiences. However, harassment and cyberbullying in VR can result in even more negative feelings than harassment in other forms of social media because of the immersive nature of VR. For example, reports have shown that women are often subject to harassment in VR by other users, sometimes leading them to choose non-gendered avatars in VR, or to opt out of VR entirely. Developers recognize the potential for especially harmful harassment in VR and are taking steps to mitigate harassment through heavy moderation, reporting, and providing users with multiple ways to quickly exit a VR experience.

Relatedly, deep fakes and other manipulative or harmful content may be present in XR experiences. VR avatars today are generally cartoonish, rather than realistic representations. But it is possible that new risks will emerge as more realistic avatar technology permits malicious users to create or manipulate avatars depicting another person without their consent.

5. My kid has been asking me about an XR toy or game. Is it appropriate for children?

XR can provide fun, engaging, and educational experiences for children, but parents should be aware of age restrictions on XR products and applications. Consumer-facing VR headsets are typically directed to children at least 13 years old and are best enjoyed by teens and adults. On the other hand, many AR games and applications are directed to younger children and available on gaming consoles and other devices enjoyed by kids as young as elementary school age.

Adults should be aware that the psychological impact of XR on children has not been widely researched, but some studies indicate that developing minds could be especially susceptible to negative content. Moreover, parents should be aware that cyberbullying and harassment of children and teens that occurs in gaming and on social media can be present in social and gaming XR experiences; this could be especially harmful given the sense of presence children and teens feel in XR.

Screen time is another major concern for many parents. Most VR experiences are designed for shorter playtime sessions than traditional video games—many titles clock in at around 30 minutes. Shorter XR sessions can help users avoid the nausea or other negative health impacts associated with extended XR experiences. Unlike traditional gaming, VR often involves movement of the limbs and at times exercise through content that encourages movement, including fitness-specific content. On the other hand, AR content experienced on smart devices and more traditional gaming consoles is often intended for longer and sometimes less active engagement. Parents concerned with screen time will want to consider how XR will reduce, increase, or have little impact on children’s total screen time.

6. Are there any psychological impacts associated with XR?

Researchers are just starting to unpack the psychological impact of XR. However, thought leaders have pointed to potential positive psychological impacts of XR, describing VR in particular as the ultimate empathy machine. Imagine, for instance, a user’s ability to “walk a mile in someone else’s shoes” in an immersive environment, gaining a greater understanding of someone else’s lived experience through educational and engaging content. However, consumers should consider that the sense of presence users feel in immersive XR might also trigger negative, strange, or uncomfortable emotions in some situations. Additionally, XR can influence the ways in which some users walk, interact with others, and concentrate in real life, for good or for ill. Relatedly, some studies point to XR potentially resulting in body ownership illusions and body dysmorphia.

More research is needed in this area for consumers to gain a better understanding of how activities in the virtual world may impact life offline. Users would be wise to be cautious, and contemplate on their own how their online behavior in XR impacts their offline behavior, thoughts, and feelings.

7. What about inclusion? Is XR accessible for everyone?

Many XR developers are currently working to create inclusive experiences for a diverse group of users far beyond the early adopters of VR technology. Early adopters often skew well-off, tech-savvy, and largely male. But XR technologies can bolster inclusion, representation, and diversity. For example, some social VR platforms are adding a wider range of customizable avatar features, including a greater range of skin tones and virtual clothing such as hijabs.

Beyond increased representation of avatars, some developers are making hardware changes to make the technology more accessible. For example, a developer recently designed a specialty VR head strap made to more comfortably fit Black women with larger hairstyles and head wraps, rather than the more restrictive head strap that comes standard with most VR headsets. Researchers are designing XR technology that better maps to women’s physiology to mitigate the higher rates of cybersickness experienced by women when VR headsets cannot properly adjust to fit their field of view and distance between pupils, which often differs from that of men. Other hardware developments include increasingly lighter headsets that allow for better mobility and more comfortable wear, especially important for older users and users with limited muscle coordination.

Additionally, VR developers are introducing an increasing number of headsets that are both more user-friendly out of the box and offered at lower price points. Popular consumer VR headsets are available as standalone devices that no longer need to be tethered to an advanced gaming PC. Many of these headsets are now available at a price point more akin to other gaming consoles on the market.

Further, there is increasing recognition that XR can provide engaging experiences for the disability community. For example, users with limited mobility could enjoy virtual content that seemingly transports them to another location that was previously inaccessible. More research, study, and development is required to make XR accessible for everyone, but consumers should be aware of the current efforts to make XR accessible to an increasing number of individuals and communities.

Conclusion

The above questions and answers represent just a handful of the many questions, concerns, and fascinations expressed by consumers looking to purchase an XR experience for the holidays. There are already many XR devices, applications, games, and other content on the market with more on the way. New, improved, and never-before-seen XR technical feats will debut at major events, such as January’s Consumer Electronics Show (CES), but the above questions and answers will equally apply. While stakeholders should continue to promote responsible XR, consumers would be wise to familiarize themselves with the potential risks and benefits of XR this holiday season and beyond.

*Image courtesy of Darlene Alderson from Pexels, available here.

The Complex Landscape of Enforcing the LGPD in Brazil: Public Prosecutors, Courts and the National System of Consumer Defense

Authors: Hunter Dorwart (FPF), Mariana Rielli (DPB) and Rafael Zanatta (DPB)

On Tuesday, November 24, 2020, the Future of Privacy Forum (FPF) and Data Privacy Brasil (DPB) co-hosted a landscape webinar exploring the relationship between Brazil’s legal system and the implementation of Brazil’s new data protection law, Lei Geral de Proteção de Dados (LGPD). As a federation, Brazil hosts many separate authorities and courts with their own competencies and powers on the national, state/regional, and municipal levels. Brazil’s recently created National Data Protection Authority (NDPA) will operate in a very complex system, alongside well-established enforcers of the law, like consumer protection authorities and public prosecutors, on top of broad private rights of action granted by the LGPD directly to individuals. Because of this complex environment, uncertainty may arise as to how the LGPD will be implemented and enforced in practice.

What are the various legal and regulatory institutions in Brazil that have authority over data protection? Will the implementation of the LGPD create more fragmentation and lead to a conflict of these competencies or will the LGPD help produce more consistency across the board? What are the solutions to solve potential sources of conflict in the Brazilian legal system? FPF’s Gabriela Zanfir-Fortuna and Data Privacy Brasil’s Bruno Bioni and Renato Leite Monteiro convened a panel of experts to discuss these questions. 

The speakers included: 

This blog summarizes the contributions of our three guest speakers: it (1) provides background, then focuses on (2) public prosecutors under the Public Ministry, (3) recent case law from the two highest federal Brazilian courts, and (4) the national system of consumer defense, before (5) outlining potential conflicts of competence and (6) reaching conclusions.

1. Background 

Since the enactment of the LGPD into law in 2018, commentators both within and outside of Brazil have pointed out that the lack of clarification around provisions and terms in the LGPD has created uncertainty as to how regulators will implement the law. From a broader perspective, however, the structure of Brazil’s legal system makes clarity even more difficult. This is because many legal institutions in Brazil have competences to enforce consumer protection laws, including issues that involve data protection and privacy.

In addition, while the LGPD operates as a federal law, state and municipal authorities introduced data and consumer protection measures within their jurisdictions well before the LGPD’s enactment. Such diffusion has created a mosaic of legal competences and introduced a range of complexities that all data protection practitioners should be aware of when engaging with Brazil and its new data protection law. 

Below we discuss how the Public Ministries, recent case law from Brazil’s Constitutional Court and Superior Court, and the National System of Consumer Defense, have each applied their own regulatory and legal authority to affect Brazil’s data protection ecosystem. We also examine the implications of these various authorities for a potential conflict of competences before discussing a proposed amendment to Brazil’s constitution that could provide for more legal harmonization.

2. Public Ministries  

Legally structured by the 1988 Constitution, the Public Ministry (Ministério Público) hosts independent public prosecutors at both the federal and state level. The Ministries have specific functions in Brazil to uphold justice and bring cases at all levels of Brazil’s court system such as before the Supreme Federal Court and the state appellate courts. The public prosecutors operate independently from the three major branches of government and help protect constitutional rights by initiating civil actions to adjudicate issues that involve collective rights. There are currently 31 different Public Ministries throughout Brazil. 

Every prosecutor in the Public Ministries can start a civil action or procedure if he or she believes there is a basis in law. This relative flexibility presents wide implications for data protection as a public prosecutor may take action under the LGPD outside of the NDPA which could lead to a unique Brazilian way of enforcing and clarifying the law. On the one hand, legal uncertainty may arise from a profusion of individual initiatives by public prosecutors. On the other, the role of Public Ministries is pivotal as it could serve as a check to any NDPA action that runs contrary to public consumer interest. 

3. Recent Case Law 

In addition to the Public Ministries, recent case law in Brazil also shines light on some unique regulatory challenges facing the implementation of the LGPD. This case law illustrates how data protection was a fundamental issue in the judicial system before the enactment of the recent law, particularly in the area of consumer protection. Some of the most important decisions have clarified many issues for data protection such as the rights of data subjects, the scope of surveillance, and the application of key processing principles such as purpose limitation. As such, grasping the implications of this case law is critical for understanding how regulators will implement the LGPD. 

The Supreme Federal Court, which serves as Brazil’s highest court, recently issued a decision related to Covid-19. In this case (ADI 6387), a legal provision mandated personal data sharing for statistical purposes as an emergency measure in response to the pandemic. Many organizations throughout Brazil contested this provisional measure, arguing it did not meet the standards of purpose limitation, transparency, information security, and that it was overly broad. The Court agreed, upholding a higher bar for purpose limitation and many key aspects of the LGPD as well as clarifying some constitutional issues surrounding data protection. 

While the decision has not neutralized all of the risks for data protection in Brazil, it did establish precedent for lower courts and sent a clear message by recognizing data protection as an autonomous fundamental right. In so holding, the Court acknowledged that other constitutional protections of individuals such as privacy and due process explicitly extended to the online world and the protection of personal data. It also clarified that, contrary to arguments made by the Federal Attorney General and the Attorney General of the Republic, there is no irrelevant data in this day and age, and even personal data that may seem trivial, such as individuals’ names, phone numbers and addresses, deserve constitutional protection from abuse. The decision notably took influence from the European Charter of Fundamental Rights. 

Another recent case discussed the implications of consent for the credit scoring industry in Brazil. Although obtaining consent is not mandatory for companies that engage in credit scoring, the Superior Court of Justice, the highest court of appeal in the Brazilian jurisdiction, held that such companies must follow data protection standards in the credit scoring process. The Court discussed five broad principles that entities must follow going forward. 

In addition, courts have also independently clarified the right to be forgotten. In DPN v. Google Brasil Internet Ltda in 2018, a lower court in Brazil mandated that search engines had to uphold the right of individuals to be forgotten in indexing search results. While the Superior Court of Justice may still decide the scope of this right under the LGPD, this case illustrates that the issue has already received attention from at least one important court in the country and could be influential for ongoing legal decisions. 

Finally, two additional cases also shed light on how recent case law has influenced data protection in Brazil. One case held that contracts that preclude the ability of consumers to have a say about the scope of data disclosure were illegal (Case “José Galvão Silva vs Procob SA”, Special Appeal 1.758.799, State of Minas Gerais, decided by the Superior Court of Justice in November 2019). Another mandated that the government of São Paulo remove cameras from the Metro, finding that such pervasive installation of surveillance equipment violated passengers’ privacy.

4. National System of Consumer Defense 

The National System of Consumer Defense (SNDC) also raises complexities for the implementation of the LGPD in Brazil. Established with the Brazilian Code of Consumer Protection in 1990 and regulated by Presidential Decree nº 2.181/1997, the SNDC brings together federal, state, and municipal agencies, as well as civil society organizations, to prevent, investigate and prosecute violations of consumer protection law. As a broad institutional framework for consumer protection, the SNDC has over 30 years of experience and covers 798 units spread across 591 Brazilian cities. 

The Procons (Procuradoria de Proteção e Defesa do Consumidor) function within the National System to help consumers administratively file complaints, give instructions and information about consumer rights, and verify judgments. The Procons have issued a few decisions related to data protection over the years that have generated attention. For example, one decision in 2019 by the Procon in São Paulo resulted in a large fine for Google and Apple for imposing unfair terms for the use of FaceApp without making such terms available in Portuguese. Another in 2020 saw the Procon-SP reach an agreement with the energy distributor Enel over consumer complaints of increased and incorrect billings. In the agreement, the Procon stipulated that Enel must demonstrate the security and technical measures it will take to ensure that the problem does not recur. 

While the SNDC can take separate enforcement measures against companies that violate consumer protection laws, including those operating online, potential coordination problems between this competency and the LGPD may arise in the future. Article 18 of the LGPD states that data subjects can exercise all of their rights before consumer-defense entities such as the SNDC. However, Article 55(k) also specifies that the Brazilian DPA will have the final say in interpreting such rights. Because these two institutions may conflict in their subsequent interpretations concerning these issues, cooperation could be hindered and result in more legal confusion and fragmentation. The LGPD may have predicted such a scenario, since Article 55(k) also states that the NDPA will articulate its performance with that of other bodies and entities with sanctioning or normative competencies related to the protection of personal data, and that it will be the central body of interpretation and implementation of the law. How this will play out remains to be seen in the coming months (considering that the administrative sanctions provided for by the LGPD will only be enforceable after August 2021). 

5. Conflicts of Competencies 

Indeed, conflicts between all three of the institutions mentioned above may surface with the implementation of the LGPD. Because each of these authorities has competencies over online consumer protection, a ruling or judgment from one could be inconsistent with enforcement actions taken by the NDPA, especially given the ambiguity and lack of clarification around specific terms and provisions within the LGPD. Such conflict could create further uncertainty as to the application of data protection standards within the unique and complex institutional structure of Brazil’s legal system. 

While there are many potential resolutions of these conflicts, it is hard to predict exactly how the process will play out. The LGPD does not preclude other competences from enforcing data protection in Brazil. Nor will the law dismantle the Brazilian legal system. However, it does state that the various public bodies engaged in data protection will coordinate with one another to fulfill their duties effectively. 

The challenge is generating the operational capacity for cooperation within the Brazilian government itself, given that the employees and staff within these institutions change. Currently, the NDPA has coordinated experts from different subject areas to create a National Council within the Data Protection Authority to provide technical and operational guidance on solving some of these institutional issues. Hopefully as the NDPA gains more experience, some of these larger potential sources of conflict can be addressed. 

Finally, a proposed amendment to Brazil’s constitution (Proposal of Constitutional Amendment n. 17/2019) could also help provide more coherency and coordination between the various institutions that enforce data protection. The proposed amendment would explicitly recognize data protection as a constitutional right, give exclusive competence over data protection to the Union (seeking to avoid regulation with antagonistic results), and ensure that the NDPA has functional, financial, and administrative independence to exercise authority under the law. 

6. Conclusion

Brazil has come a long way in the construction of a solid data protection normative framework, of which the LGPD is a central part. Before the LGPD, the protection of individuals’ personal data was pursued mainly through a robust system of consumer protection that brings together Public Ministries, several administrative bodies such as the Procons, and civil society organizations. 

The LGPD standardized the discipline of personal data protection in Brazil, creating general obligations for all sectors and systematizing the rights of data subjects. It has been driving compliance efforts by companies and the public sector alike, and the debates it has generated certainly represent the most important movement towards the consolidation of a data protection culture in the country. 

However, it is essential to note that the law operates within an existing framework and must therefore be harmonized with other norms and institutions. Regulators face the challenge of interpreting and advancing the right to data protection while remaining cohesive across the institutional competences that supervise and enforce the law.

In that sense, the LGPD has proposed an articulation of all bodies that may have overlapping competencies on the matter of data protection, with the NDPA serving as the nerve center of interpretation and development of guidelines for implementation. This suggests that initiatives of cooperation are ahead of us, but it is too early to know what issues may arise from the combination of the several different paths for data protection enforcement that the Brazilian legal environment provides, as well as how those issues will be addressed and eventually resolved. 

This scenario makes the harmonization of the interpretation of the General Personal Data Protection Law challenging. For companies operating in Brazil, this requires a more sophisticated capacity for mapping legal risks. For Brazilian authorities, it demands a greater capacity for institutional articulation. For civil society, it demands a broader monitoring capacity and multiple dialogues with authorities. For all stakeholders, the challenge is significant. As composer Antonio Carlos Jobim once said, “Brazil is not for beginners”.

***

Learn more about Data Privacy Brasil HERE.

Policy Brief: Location Data Under Existing Privacy Laws

The Future of Privacy Forum released a new policy brief, Location Data Under Existing Privacy Laws.

Defining and regulating location data in a privacy law can be an elusive challenge. In part, this is due to its ubiquity in our lives: information about how devices and people move through spaces over time is utilized by Wi-Fi networks, smartphones, mobile apps, and a world of emerging screenless technologies, such as wearable fitness devices, scooters, autonomous vehicles, and video analytics. Existing legal and self-regulatory regimes in the United States (and globally) approach location data in a variety of ways that may serve as a model for policymakers. 

Read the policy brief to learn about the challenges associated with defining location data, when location data is considered “personal” data, and the specific legal protections for location data in the United States and around the world.

The Spectrum of Artificial Intelligence – An Infographic Tool

UPDATED August 3, 2021 and June 2023. FPF released the white paper, The Spectrum of AI: Companion to the FPF AI Infographic, to provide additional detail and analysis for use of this infographic tool as an educational resource for policymakers or regulators.

FPF has just completed its newest infographic educational tool, The Spectrum of Artificial Intelligence.

AI is the computerized ability to perform tasks commonly associated with human intelligence, including reasoning, discovering patterns and meaning, generalizing knowledge across spheres of application, and learning from experience. The growth of AI-based systems in recent years has garnered much attention, particularly in the sphere of Machine Learning (ML). A subset of AI, ML systems “learn” from the success or accuracy of their outputs, and can adapt their programming over time, with minimal human intervention. But there are other types of AI that, alone or in combination, lie behind many of the real-world applications in common use.
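As a loose illustration of what it means for an ML system to “learn” from the accuracy of its outputs (a toy sketch of our own, not part of the FPF infographic), consider a model with a single adjustable weight that repeatedly measures how wrong its predictions are and nudges itself to be less wrong:

```python
# Toy training data: the "rule" to discover is output = 2 * input.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # (input, correct output)

weight = 0.0         # the model starts out knowing nothing
learning_rate = 0.05

for _ in range(200):              # each pass adapts the model slightly
    for x, target in data:
        prediction = weight * x
        error = prediction - target           # how wrong was the output?
        weight -= learning_rate * error * x   # adjust to reduce the error

print(round(weight, 2))  # prints 2.0: the rule was learned from feedback alone
```

No one programmed the value 2.0 into the system; it emerged from repeated feedback on the model’s own mistakes. Real ML systems apply the same idea to millions of weights and far richer data, which is what allows them to adapt with minimal human intervention.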

General AI – truly human-level computational systems – does not yet exist. But Narrow AI exists in many fields and applications where computerized systems enhance human output or outperform humans at defined tasks. This chart is designed to identify and explain the main types of AI, their relationship to each other, and provide some specific examples of how they are currently in place throughout our day-to-day lives. It also demonstrates how AI exists within the timeline of human knowledge and development. Building on knowledge from philosophy, mathematics, physics, ethics, and logic, and more recently, statistical analytics and modeling, it also reflects the foundational role of computer design and security.

There is not 100 percent consensus around the labels for the various types of AI, but for our purposes, we have adopted the following categories as representing generally accepted terms in common use.

To aid in understanding these various types of AI programming, we have highlighted a specific use case in a number of broad topic areas.

For further information or to provide comments or suggestions, please contact Brenda Leong ([email protected]) or Dr. Sara Jordan ([email protected]).


FPF and LGBT Tech Host Discussion in Honor of Human Rights Day

Yesterday, in honor of International Human Rights Day, a time when individuals and organizations around the world celebrate the 1948 ratification of the Universal Declaration of Human Rights, FPF and LGBT Tech hosted a discussion exploring the “LGBTQ+ Right to Privacy.” 

The conversation featured tech privacy and policy experts from both FPF and LGBT Tech, including: Christopher Wood, Executive Director, LGBT Tech; Christopher Wolf, Founder and Board Chair, FPF; Carlos Gutierrez, Deputy Director & General Counsel, LGBT Tech; Dr. Sara Jordan, Artificial Intelligence and Ethics Policy Counsel, FPF; and Katelyn Ringrose, Christopher Wolf Diversity Law Fellow, FPF. The conversation was moderated by Jules Polonetsky, FPF CEO. 

The conversation revolved around issues related to the United Nations 2020 theme — which is “Stand Up for Human Rights” — and the speakers tackled entrenched and systemic inequalities experienced by the LGBTQ+ community from a privacy and equity perspective.

Throughout the discussion, speakers noted the many beneficial uses of data pertaining to an individual’s sexual orientation, gender identity and sex life; while also discussing the many ways that such data can be used to perpetuate systems that discriminate against, oppress, or otherwise harm members of the LGBTQ+ community. 

For example, while the collection of health data is important in the context of COVID-19 contact tracing, the LGBTQ+ community has not always benefited from inclusive data protection practices and may be less willing to have their data collected and used in the name of public health. According to Carlos Gutierrez, Deputy Director & General Counsel of LGBT Tech, “historically, HIV laws were weaponized against the LGBTQ community, and gay men. Getting an HIV positive test result could mean the loss of employment, dropped insurance, and more.” 

And while certain uses of data may pose risks for the LGBTQ community, speakers also pointed out that inclusive data collection for marginalized communities tends to benefit society as a whole. For example, Dr. Sara Jordan, Artificial Intelligence and Ethics Policy Counsel for FPF, noted that existing definitions for sexual orientation and gender identity data (SOGI data) often do not address issues pertaining to the intersex community, and how identifying such gaps and leveraging data for good can allow the research community to “advance research, or identify ways to redirect existing research streams.” This, Dr. Jordan noted, will allow the benefits of using this data to “flow directly” back to individuals and communities in need of such advances. 

Watch the full discussion on LGBTQ+ data privacy here:

https://www.youtube-nocookie.com/embed/2-GaUxv9Q8c?feature=oembed&width=500&height=750&discover=1

For questions or comments about FPF and LGBT Tech’s joint efforts related to LGBTQ+ privacy, or to get involved, contact Katelyn Ringrose at [email protected] or Chris Wood at [email protected].