Synthetic Content: Exploring the Risks, Technical Approaches, and Regulatory Responses

Today, the Future of Privacy Forum (FPF) released a new report, Synthetic Content: Exploring the Risks, Technical Approaches, and Regulatory Responses, which analyzes the various approaches being pursued to address the risks associated with “synthetic” content – material produced by generative artificial intelligence (AI) tools. As more people use generative AI to create synthetic content, civil society, media, and lawmakers are paying greater attention to some of the risks—such as disinformation, fraud, and abuse. Legislation to address these risks has focused primarily on disclosing the use of generative AI, increasing transparency around generative AI systems and content, and placing limitations on certain synthetic content. However, while these approaches may address some challenges with synthetic content, each one is individually limited in its reach and implicates a number of tradeoffs that policymakers should address going forward.

This report highlights the following themes:

This report is based on an extensive survey of existing technical and policy literature, recently proposed or enacted legislation, and emerging regulatory guidance and rulemaking. The appendix provides further details about the current major legislative and regulatory frameworks being proposed in the U.S. regarding synthetic content.

This report is part of a larger, ongoing FPF effort to monitor and analyze emerging trends in synthetic content, including its potential risks, technical developments, and relevant legislation and regulation. For previous FPF work on this issue, check out the following:

If you would like to speak with us about this work, or about synthetic content more generally, please reach out to Jameson Spivack.

Out, Not Outed: Privacy for Sexual Health, Orientations, and Gender Identities

Co-authored by: Judy Wang (FPF Intern), Jeter Sison (FPF Intern), Jordan Wrigley (FPF Data and Policy Analyst, Health & Wellness)

On National Coming Out Day, it’s important to recognize that coming out is a rite of passage for many LGBTQ+ individuals and a decision that they should be empowered to make for themselves.

Protections for health information are essential to ensuring the autonomy of an individual in choosing how to come out, to whom, and when. A person’s health information may facially reveal their sexual orientation and gender identity (“SOGI”). Alternatively, a person’s health information may not specifically include SOGI information, but SOGI information may be able to be inferred or extrapolated from other health information, especially in a personal profile that includes many different data points. 

Opportunities for Health Data and Services for LGBTQ+ Individuals

While uses of health data may carry heightened risk for LGBTQ+ individuals, it is also particularly critical for those same individuals to have access to safe, secure, and practicable physical and mental health services. A poll by LGBT Tech shows LGBTQ+ individuals use online and digital health resources extensively to navigate information and access to healthcare. The National Coalition for LGBTQ Health has found that “LGBTQ people are more likely to report poor physical and mental health than the general population, including increased incidence of HIV and other sexually transmitted infections (STIs), long term conditions such as arthritis and chronic fatigue, and elevated risk of depression, anxiety, and other mental illness.” In addition, LGBTQ+ youth have been found to face heightened risks for mental health issues, and are 300% more likely to suffer from symptoms of depression. Unfortunately, at the same time, LGBTQ+ adults are twice as likely to report having experienced a negative health care interaction. 

To address these health disparities, researchers have stressed the importance of collecting additional health information, including SOGI information. In addition, where LGBTQ+ individuals do not have local access to equitable health care resources, connected devices and services may provide new capacity for individuals to obtain valuable information about their health questions, engage with healthcare professionals, and even receive important physical and mental treatment. 

Specific categories of tech services and applications have been found to play an increased role in addressing the health and wellbeing needs of LGBTQ+ individuals. Some of these include:

Health Data Risks for LGBTQ+ People

While online health and wellbeing services can offer a lifeline to many LGBTQ+ individuals, they can also create new risks. For instance, improper uses of health information related to sexuality may undermine the autonomy of an LGBTQ+ individual. According to an article in the Oregon Law Review, data usage related to sexuality can result in pop-ups on an individual’s phone, creating worry and concern about potentially outing an individual in public or quasi-public situations. LGBTQ+ individuals are not always able to be safely out in public settings, with potential harms including impacts on employment, loss of housing opportunities, or jeopardization of personal relationships. 

These risks have increased as certain U.S. states have pursued and enacted laws to undermine the rights and freedoms of LGBTQ+ people and communities. Efforts to restrict gender-affirming care have gained traction in several states, and lawmakers in Oklahoma, Texas, and South Carolina have proposed legislation to prohibit such care for transgender individuals up to the age of 26. Additionally, multiple states have implemented policies barring the use of Medicaid or other state-sponsored insurance for gender-affirming treatments, regardless of the patient’s age. 

New Legal Protections for Health Data

When collected inside a health care environment, such as a telehealth service, health data, including SOGI information, is subject to protections under the Health Insurance Portability and Accountability Act (HIPAA) and related laws or regulations. However, these protections do not extend to similar information collected outside of that context. Filling that space, eighteen U.S. states (as of the writing of this post) have passed comprehensive privacy laws that apply more broadly, and each of them includes “sexual orientation” as a category of sensitive data or sensitive personal information subject to special protections. However, of these eighteen laws, only one (Maryland) affirmatively includes “gender-affirming treatment” in its scope of sensitive information, although three of the states (Oregon, Delaware, and New Jersey) do explicitly include “status as transgender or non-binary.”

In addition to comprehensive privacy laws, two states have implemented health data privacy laws – Washington (My Health, My Data Act (MHMD)) and Nevada (SB 370). Both laws have definitions for “consumer health data,” and include a non-exhaustive list of qualifying categories of information, including “gender-affirming care” information. Additionally, both laws define “consumer health data” to include “health information derived or inferred from non-health data,” which could cover data uses that unintentionally reveal information about a user’s sexual orientation or gender identity.

At the federal level, the Biden Administration has initiated the Federal Evidence Agenda on LGBTQI+ Equity, directing the collection of SOGI data in federal surveys and forms. The Agenda follows the recommendation of implementing appropriate security and privacy safeguards with “Guideline 1: Ensure relevant data are collected and privacy protections are properly applied.” However, without comprehensive privacy legislation at the federal level, there is a lack of guidance regarding collecting and sharing data while protecting user privacy.

Additionally, in 2024, the U.S. Department of Health and Human Services (HHS) issued a rule designed to prevent discrimination based on gender identity or sexual orientation by healthcare providers and insurers that receive federal funding. Unfortunately, a federal judge in Mississippi recently issued a preliminary injunction against the rule, citing the Supreme Court’s decision to overturn Chevron deference to administrative agencies. A recent Supreme Court decision also blocked the enforcement of a new rule from the Biden Administration that would protect transgender students from discrimination in education. The new rule included sexual orientation and gender identity within its discrimination protections for the first time.

Current best practices and recommendations

It is imperative for the collectors of health data about LGBTQ+ individuals, and in particular sexual orientation and gender identity (SOGI) data, to work toward a safer and more equitable use of SOGI data with meaningful privacy safeguards in mind. 

SOGI data is inherently complex. Given its revelatory nature, SOGI data should be treated with heightened sensitivity. In 2022, FPF and LGBT Tech published a report on the Role of Data Protection in Safeguarding Sexual Orientation and Gender Identity Information. Organizations should apply regulatory safeguards robustly and approach SOGI information with appropriate care and respect for its contextual sensitivity. This can be done, for instance, by requiring consent for the use and collection of SOGI data and considering limitations before data is sent to third parties, particularly if it will make that data publicly accessible in a way that could increase the risk of outing someone. Organizations should implement appropriate security and privacy safeguards to protect any SOGI data in proportion to the sensitivity of the underlying data. They should also note that the sensitivity should be considered individually and in the community context.

Additional reports provide more information on the challenges and risks associated with the collection and use of health information for LGBTQ+ people. These include a 2024 report by LGBT Tech recommending that platforms adopt multifaceted strategies that prioritize user protection, inclusivity, and community empowerment. A 2022 research brief from the Center for Democracy and Technology is also relevant: it explains how LGBTQ+ students are increasingly targeted by policies and practices that threaten their privacy in schools, with 29 percent of LGBTQ+ students reporting that they or someone they know has been outed by school-sponsored monitoring technology.

Conclusion

Coming Out Day is about having control over information. The decision to come out should be up to the discretion of the individual coming out. On Coming Out Day, and every day, robust privacy and data protection rules, policies, and practices are crucial to empower LGBTQ+ people to decide when and where to share their SOGI data.

FPF Analysis of New Requirements for Generative AI Use by Healthcare Entities in Patient Communications 

Co-Authored by Judy Wang, FPF Communications Intern

On September 28, Governor Gavin Newsom signed California AB 3030, among a host of AI bills. CA AB 3030 amended the California Health & Safety Code and requires specified healthcare entities to disclose the use of generative artificial intelligence (AI) in provider-patient communications through visual or verbal disclaimers presented before, during, and/or at the end of a communication.

A new chart from FPF provides a “cheatsheet” of the covered entities and activities, as well as requirements based on the relevant type of communication. The chart also includes information about the obligations of health facilities to which the law applies and enforcement provisions incorporated from the broader legislative package. 

Download the chart here.

FPF Key Takeaways from AB 3030:

Interested in getting a full analysis of AB 3030? Become an FPF member to access the complete analysis and breakdown of California AB 3030 in the member portal.

FPF Submits Comments to Inform New York Children’s Privacy Rulemaking Processes

At the end of the 2024 legislative session, New York State passed a pair of bills aimed at creating heightened protections for children and teens online. One, the New York Child Data Protection Act (NYCDPA), applies to a broad range of online services that are “primarily directed to children.” The NYCDPA creates novel substantive data minimization requirements, restricts the sale of data of children (defined as under 18), and requires businesses to respect “age flags” – new device signals intended to convey whether a user is a minor. The second law, the Stop Addictive Feeds Exploitation (SAFE) for Kids Act, is more narrowly focused on social media platforms and restricts the presentation of content to minors through “addictive feeds.”

On September 30, the Future of Privacy Forum (FPF) filed comments with the New York Office of the Attorney General (OAG) to inform forthcoming rulemaking for the implementation of these two frameworks. While raising the protections for youth privacy and online safety has been a priority area for lawmakers over the last several years, New York’s two new laws both take unique approaches. FPF’s comments seek to ensure that New York’s protections for youth online can protect and empower minors while supporting interoperability with existing state and federal privacy frameworks. 

New York Child Data Protection Act

The NYCDPA creates new restrictions on the collection and processing of the personal data of teen users who are outside the scope of the federal Children’s Online Privacy Protection Act (COPPA). Under the law, a covered business must obtain “informed consent” to use teen data, or the processing must be strictly necessary to meet one of nine enumerated “permissible purposes,” such as conducting internal business operations or preventing cybersecurity threats. For minors aged 13 and older, the requirement that processing be strictly necessary in the absence of informed consent could be stricter than COPPA standards, especially with respect to many digital advertising practices. The law also restricts the sale of minors’ personal data, including allowing a third party to sell the data.

Obtaining “informed consent” under the NYCDPA requires satisfying a number of conditions, some of which diverge from comparable privacy regimes. Consent must be made separately from any other transaction or part of a transaction; be made in the absence of any “dark patterns”; clearly and conspicuously state that the processing for which consent is requested is not strictly necessary and that the minor may decline without preventing continued use of the service; and clearly present an option to refuse to provide consent as the most prominent option.
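For teams translating these conditions into an internal compliance checklist, the logic described above can be sketched roughly as follows. This is a simplified illustration only: the class, field, and function names are hypothetical, the nine permissible purposes are collapsed into a single flag, and the statute and any forthcoming OAG rules govern the actual requirements.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ConsentRequest:
    """Hypothetical record of how a consent prompt was presented to a minor."""
    separate_from_other_transactions: bool
    free_of_dark_patterns: bool
    discloses_not_strictly_necessary: bool
    discloses_decline_keeps_service: bool
    refusal_is_most_prominent_option: bool


def meets_informed_consent_conditions(request: ConsentRequest) -> bool:
    """Return True only if every condition summarized in the text is met."""
    return all((
        request.separate_from_other_transactions,
        request.free_of_dark_patterns,
        request.discloses_not_strictly_necessary,
        request.discloses_decline_keeps_service,
        request.refusal_is_most_prominent_option,
    ))


def may_process(
    request: Optional[ConsentRequest],
    consent_granted: bool,
    strictly_necessary_for_permissible_purpose: bool,
) -> bool:
    """Processing is allowed if it is strictly necessary for a permissible
    purpose, or if the minor granted consent through a compliant prompt."""
    if strictly_necessary_for_permissible_purpose:
        return True
    return (
        request is not None
        and consent_granted
        and meets_informed_consent_conditions(request)
    )
```

In this sketch, a prompt that fails any single condition (for example, one where refusal is not the most prominent option) cannot support processing, even if the minor clicked “agree.”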

The NYCDPA is also unique in providing for the use of device signals to transmit legally binding information about a user’s age and their informed consent choices. Such technologies are not commonplace in the market and raise a number of both technical and policy questions and challenges.

With these unique provisions in mind, FPF’s comments recommend that the OAG: 

  1. Consider existing sources of law, including the COPPA Rule’s internal operations exception, state privacy laws, and the GDPR to provide guidance on the scope of “permissible processing” activities;
  2. Where appropriate, align core privacy concepts with the developing state comprehensive privacy landscape including the definition of “personal information” and opportunities for data sharing for research;
  3. Consult with the New York State Education Department to ensure alignment with New York’s existing student privacy laws and implementing regulations to avoid disruption to both schools’ and students’ access to and use of educational products and services;
  4. Mitigate privacy, technical, and practical implementation concerns with “age flags” by further consulting with stakeholders and establishing baseline criteria for qualifying signals. FPF offers technical and policy considerations the OAG should consider in furthering this emerging technology; and
  5. Explicitly distinguish informed consent device signals from “age flags,” given that providing consent at scale raises a separate set of challenges and may undermine the integrity of the NYCDPA’s opt-in consent framework.

Read our Child Data Protection Act comments here.

New York SAFE for Kids Act

The SAFE for Kids Act restricts social media platforms from offering “addictive feeds” unless the service has conducted “commercially reasonable” age verification to determine that a user is over 17 years of age, or the service has obtained verifiable parental consent (VPC). The legislative intent makes clear that ordering content in a chronological list would not be considered an “addictive feed.” A social media platform will also need to obtain VPC to provide notifications concerning an addictive feed to minors between the hours of midnight and 6 am.
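The gating described above can be sketched as two simple checks. This is an illustrative simplification, not a statement of what the law or any eventual OAG rule requires: the function and parameter names are hypothetical, users who have not been verified as over 17 are treated as minors, and the parental consent for overnight notifications is modeled as a separate flag from the consent for the feed itself.

```python
def may_offer_addictive_feed(verified_over_17: bool, has_parental_consent: bool) -> bool:
    """An addictive feed may be offered only after age verification shows the
    user is over 17, or after verifiable parental consent is obtained."""
    return verified_over_17 or has_parental_consent


def may_send_feed_notification(
    local_hour: int,
    verified_over_17: bool,
    has_notification_consent: bool,
) -> bool:
    """Notifications about an addictive feed need parental consent when sent
    to a minor between midnight (hour 0) and 6 am (hour 6, exclusive)."""
    if not 0 <= local_hour <= 23:
        raise ValueError("local_hour must be between 0 and 23")
    if verified_over_17:
        return True
    is_overnight = local_hour < 6
    return has_notification_consent or not is_overnight
```

For example, under these assumptions a 2 am notification to an unverified user without the relevant parental consent is blocked, while the same notification at 6 am is allowed.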

“Addictive feeds” are broadly defined as a service in which user-generated content is recommended, selected, or prioritized based, in whole or in part, on user data. There are six carve-outs to the definition of “addictive feed,” such as displaying or prioritizing content that was specifically and unambiguously requested by the user or displaying content in response to a specific search inquiry from a user.

Notably, the SAFE for Kids Act focuses on parental consent for teens to receive “addictive feeds.” In contrast, teens are empowered by the Child Data Protection Act to provide informed consent for a broad range of activities. The divergence in policy approaches between these two laws regarding who can provide consent for a teen using a service may lead to challenges in understanding individual rights and protections.

Given the critical role of age verification and parental consent within the SAFE for Kids Act, FPF’s comments to the OAG focus on highlighting considerations, risks, and benefits of various methods for conducting age assurance and parental consent. In particular we note that: 

  1. There are three primary categories of age assurance in the United States: age declaration, age estimation, and age verification. Each method has its own challenges and risks that should be carefully balanced across the state interest in protecting minors online, the state of current technologies, and end-user realities when developing age verification standards.
  2. When exploring appropriate methods for providing verifiable parental consent, the OAG should consider the known problems, concerns, and friction points that already exist with the existing verifiable parental consent framework under COPPA.
  3. Strong data minimization, use limitations, and data retention standards could enhance data protection and user trust in age assurance and VPC requirements.

Read our SAFE For Kids Act comments here.

Regulatory Strategies of Data Protection Authorities in the Asia-Pacific Region: 2024, and Beyond

The Asia-Pacific (APAC) region has emerged as a dynamic and rapidly evolving landscape for data protection regulation. As digital economies flourish and cross-border data flows intensify, data protection authorities (DPAs) across the region are grappling with complex challenges posed by technological advancements, changing business practices, and evolving societal expectations regarding privacy.

This Report provides a comprehensive analysis of strategy documents and key regulatory actions of the DPAs in 10 jurisdictions, published or developed in 2023 and 2024, setting out regulatory priorities for the following years:

  1. Australia
  2. China
  3. Hong Kong, Special Administrative Region of China (SAR)
  4. Japan
  5. Malaysia
  6. New Zealand
  7. Philippines
  8. Singapore
  9. South Korea
  10. Thailand

The Report is structured into two sections. 

Our analysis provides insights into how these DPAs have been working towards implementing their strategic priorities throughout 2023 and 2024. To the extent possible, the analysis in this Report is based on official strategy documents – that is, master plans, statements of regulatory priorities, annual reports, and the like – published by these DPAs in 2023 and 2024, supplemented by an examination of significant regulatory actions taken by the DPAs during this period.

While we offer a thorough examination of recent and ongoing initiatives, it is important to note that the data protection landscape is dynamic and rapidly evolving. Therefore, this report not only serves as a retrospective overview but also aims to highlight prospective directions that DPAs may pursue in 2025 and beyond. By highlighting the trajectory of these regulatory bodies, we hope that this Report will aid readers in anticipating potential developments in data protection regulation and enforcement across the region. However, readers should bear in mind that unforeseen technological advancements, geopolitical shifts, or other factors may influence future regulatory approaches in ways that cannot be fully predicted at the time of publication.

The Report recognizes that each jurisdiction faces unique challenges, operates within distinct legal and cultural contexts, and may prioritize different aspects of data protection based on its specific circumstances. The Report is therefore not intended to make value judgments on DPAs, rank them, or evaluate their effectiveness in key areas. Rather, our aim is to identify commonalities and divergences in the DPAs’ priorities and approaches, in order to shed light on key trends in the APAC region. We hope that these insights will prove useful to policymakers, businesses, and data protection and privacy professionals as they navigate the APAC region’s complex data protection landscape.

To ensure a comprehensive and accurate understanding of this Report’s scope and methodology, readers should note the following key considerations:

Analysis of key strategic documents and recent regulatory actions across the 10 APAC DPAs reveals several common priorities for 2024 and beyond. 

Finally, 50% of DPAs emphasized the protection of children’s personal data, recognizing the unique needs of young people in digital environments.

Click here to read the Issue Brief.

Updated FPF Infographic Explores Data in Connected Vehicles

Today, the Future of Privacy Forum is launching the Data and the Connected Vehicle Infographic 2.0, with updates to account for the types of data associated with connected vehicles, features in and outside of the vehicle, and data handlers who receive and process data. Lawmakers, manufacturers, privacy professionals, and consumers are actively engaged in work to examine and respond to privacy and transparency practices related to personal data collected in and around vehicles. The updated infographic provides a visual representation of where the data flows within the connected vehicle ecosystem.

In 2017, FPF launched the first vehicle infographic, “Data and the Connected Car.” FPF’s continued work on connected vehicles has built upon this initial product, providing additional resources, up to and including the Vehicle Safety Systems Privacy Risks and Recommendations report from March 2024. The report specifically highlights the potential for privacy risks to exist when new technology is incorporated, whether through requirement or choice. FPF has also submitted comments to the Department of Transportation regarding the privacy implications of future technology and of the future use of AI in transportation.

Click here to view the updated infographic. 

The updated infographic highlights three specific areas within the connected vehicle ecosystem:

  1. Types of Data in the Vehicle include vehicle and safety data, occupant data, location data, account data, and biometric and body-related data. Artificial intelligence is likely to be present in various features and functions throughout the vehicle. Understanding the types and categories of data associated with connected vehicles is essential for regulating data and increasing privacy literacy among individual drivers and passengers. Some data, like operational information or data on engine health, is integral to the vehicle functions, while other types of data can be user-generated and intended for personalization or driver assistance, including GPS navigation and usage of smartphone integration.
  2. Features Inside and Outside the Vehicle include technologies such as infotainment systems, event data recorders, and tire sensors. Additional novel technologies may be more commonly incorporated into vehicles in the future. Some of the vehicle technologies may be added after-market by individuals or are specific to a certain vehicle make and model, such as keyless entry, augmented reality displays, or external charging. Certain vehicle features may be governed by specific requirements and rules according to state and federal regulations. In addition, manufacturers are increasingly incorporating certain technologies specifically in response to emerging regulatory requirements. An increase in technology and data collection can increase the privacy risk associated with the vehicle.
  3. Data Receivers or Data Handlers are entities who collect and control the flow of data from inside and outside the vehicle for various purposes, including performance and safety. Once the data is collected, its transfer and use can depend on a number of factors, including agreements with the manufacturer, third parties and service providers, emergency services, and external infrastructure such as traffic lights and automatic license plate readers. Manufacturers may receive vehicle and safety data, location data, account data, occupant data, and biometric or body-related data (depending on the technology incorporated into the vehicle). Third parties and service providers may also receive information about the vehicle and potentially about the user. Some third parties in the connected vehicle ecosystem include insurance companies, dealerships and service centers, and entities that provide in-vehicle services through the infotainment system. Notice to individuals should provide information about when data is required for the vehicle to function or for important safety or regulatory requirements.

Individuals should feel physically and digitally safe in their vehicles. In 2023, FPF conducted a survey wherein consumers indicated that transparency is important to trust and adoption of in-vehicle technologies intended to increase safety. This updated infographic can help provide people with transparency by providing a visual demonstration to foster an understanding of how technology is utilized in a vehicle and where personal data may be implicated. Additionally, this infographic can serve as a resource for policymakers who need to understand the ecosystem in order to regulate effectively. As vehicle privacy continues to be top of mind for all individuals, the updated FPF infographic serves to help improve understanding and provide the transparency that is needed for a trusted mobility ecosystem. 

Infographic Explores Driver Data Collection and Use in Connected Cars

FPF’s “Data and the Connected Vehicle” Demystifies Connected Car Ecosystem as Policymakers Look to Regulate

SEPT. 16, 2024 — Vehicle technologies are evolving rapidly, in every facet of the system, from safety features to entertainment, and occupant convenience. Many of these new features are enabled by the collection of driver and occupant data – and data collected from their surroundings – for vehicles to function and communicate with service providers, with one another, and with sensors on and around the road. An updated infographic from the Future of Privacy Forum (FPF) provides drivers with an understanding of how their data is collected and used in connected vehicles and how data flows in the connected vehicle ecosystem.

Individuals and policymakers have increasingly called for additional transparency regarding vehicle data and what happens with it. FPF’s updated Data and the Connected Vehicle infographic provides an accessible visual of the critical data flows in today’s connected vehicles and how they collect and use data and AI to operate different systems. 

“Most new vehicles have some, if not all, of the features outlined in Data and the Connected Vehicle, from wireless connectivity to cabin monitoring and microphones. To foster a trusted mobility ecosystem, it is vital that data is transferred respectfully and securely between a network of carmakers, vendors, and others to support individuals’ established safety, logistics, and information expectations,” said Adonne Washington, Policy Counsel of Data, Mobility, and Location at FPF and the project lead. “We created this project to demystify the behind-the-scenes of an everyday tool people rely on worldwide.”

A previous FPF survey found that many individuals value advanced vehicle safety technologies, but worry about the privacy risks, accuracy, cost, and data transfers to third parties. FPF’s infographic looks to clear misconceptions and clarify the privacy implications of connected cars and vehicle safety systems. This will be particularly pertinent, as the National Highway Traffic Safety Administration (NHTSA) is establishing new safety technology requirements for vehicle manufacturers and policymakers are looking to establish specific vehicle data policies. 

“Data and the Connected Vehicle” updates a 2017 infographic created in response to the evolving landscape of smart and connected vehicles over the last few years.

“Ensuring privacy protections in vehicles is necessary, as is understanding how they work,” Washington continued. “As these systems continue to evolve and adapt to new driver accommodations, transparency will be key to their adoption and building trust between manufacturers, regulators, and consumers.”


Download the new infographic here. In connection with its launch, FPF will host a public webinar on September 18 with privacy leaders from major automotive manufacturers, including Ford, Rivian, and Honda, to discuss how data collection and processing have enabled many new features in connected cars. Learn more and register for the event here.

FPF Unveils Report on Emerging Trends in U.S. State AI Regulation

Today, the Future of Privacy Forum (FPF) launched a new report, U.S. State AI Legislation: A Look at How U.S. State Policymakers Are Approaching Artificial Intelligence Regulation, analyzing recent proposed and enacted legislation in U.S. states. As artificial intelligence (AI) becomes increasingly embedded in daily life and critical sectors like healthcare and employment, state lawmakers have begun crafting regulatory strategies to promote its opportunities while addressing its heightened risks. This report by FPF delves into the trends of these legislative efforts, examines core questions and issues, and offers key considerations for policymakers as they navigate the complexities of AI policy.

The report primarily focuses on ‘Governance of AI in Consequential Decisions,’ a legislative framework most frequently adopted by lawmakers, which applies to a broad range of entities and industries, and offers the most comprehensive approach to mitigating specific AI risks across various proposals and laws. The report also discusses alternative approaches focused on particular technologies, such as generative artificial intelligence and frontier or foundation models.

In this Report, we highlight the following: 

This report is based on FPF’s analysis of key bills introduced in 2023 and 2024 (detailed in Supplementary Content), as well as our engagement with state policymakers. It also incorporates insights from civil society groups, businesses, and technical experts, whose diverse perspectives have been crucial in shaping a comprehensive examination of the nuances and challenges in advancing AI regulations.

The emerging trends highlighted in the report point to a collaborative movement toward an interoperable framework, where consistent definitions and principles are important for supporting business compliance, safeguarding individual rights, and ensuring regulatory clarity. 

Call for Nominations: 15th Annual Privacy Papers for Policymakers Award

Future of Privacy Forum Award Elevates Privacy Research to Inform Policy Discussion

September 9, 2024 - The Future of Privacy Forum (FPF) invites scholars and authors with an interest in privacy issues to submit finished papers to be considered for its 15th annual Privacy Papers for Policymakers (PPPM) Awards.

The award provides privacy and data protection scholars, researchers, and authors in the U.S. and internationally with the opportunity to inject their ideas into the current policy discussion. It elevates and honors important work analyzing current and emerging privacy issues, with the potential to inform real-world policy solutions as the U.S. Congress, federal regulators, and international data protection agencies grapple with privacy issues.

FPF also offers a student paper award to honor work authored by students in undergraduate, graduate, and professional programs. Student submissions must follow the same guidelines as the general PPPM award.

“The accelerating pace of AI has raised complex challenges for policymakers, and scholarship from the privacy and technology academic community is increasingly critical for shaping legislative and compliance solutions,” said Jules Polonetsky, CEO of FPF. “As lawmakers worldwide seek to address urgent data protection issues, FPF’s Privacy Papers for Policymakers publication serves as a critical resource highlighting leading research and expert perspectives.”

We encourage you to share this opportunity with your peers and colleagues. Learn more about the Privacy Papers for Policymakers program and view previous years’ highlights and winning papers on our website.

FPF will invite the authors of winning papers focused on U.S. policy to present their work at an annual event in Washington, D.C., in March 2025, with top policymakers and privacy leaders. Authors of winning papers focused on international policy will be invited to showcase their work to global policymakers and data protection authorities at a virtual event in March 2025. FPF will also publish a printed digest of the summaries of the winning papers for distribution to policymakers in the United States and abroad.

Learn more and submit finished papers by October 11, 2024. Please note that the deadline for student submissions is the same. You can also learn more about last year’s event here.

Five ways in which the DPDPA could shape the development of AI in India

India enacted the Digital Personal Data Protection Act, 2023 (DPDPA) on August 11, 2023. This comprehensive data protection law followed a landmark Supreme Court decision recognizing a constitutional right to privacy in India, and discussions on multiple drafts spanning over half a decade. 1

The law comes at a time when, globally, there has been an exponential growth in artificial intelligence applications and use-cases, including consumer-facing generative AI systems. As a comprehensive data protection law, the DPDPA will significantly impact how organizations use and process personal data, which in turn affects the development and use of AI. Specifically, AI model developers and deployers will need to carefully consider the DPDPA’s regulatory scope concerning the processing of personal data, the limited grounds for processing, the rights of individuals in respect of their personal data, and the possible exemptions available to train and develop AI systems.

While the Central Government has yet to notify subordinate legislation to the DPDPA (the DPDP Rules), which will operationalize key provisions of the law, we can analyze the DPDPA for an early idea of how it could be applied to AI. While the new law may create challenges for AI training and development through its consent-centric regime, it also contains exemptions for publicly available data, exemptions for research, a limited territorial scope, and a risk-based approach to the classification of obligations. Taken together, this approach is likely to significantly shape the development of AI in India.

1. DPDPA’s consent-centric regime may pose challenges for AI training and development 

The DPDPA recognises consent and ‘certain legitimate uses’ as the two grounds for processing personal data. Section 7 of the DPDPA specifies scenarios where personal data can be processed without consent. These include situations where the data principal has voluntarily provided their personal data and has not objected to its use for a specific purpose, as well as cases involving natural disasters, medical emergencies, employment-related matters, and the provision of government services and benefits.

This means that the DPDPA creates a consent-centric regime for personal data processing. Notably, it does not recognise alternative legal bases for processing personal data, such as contractual necessity and legitimate interests, that are provided under other leading data protection laws internationally, such as the General Data Protection Regulation (GDPR) in the EU and Brazil’s Lei Geral de Proteção de Dados (LGPD). Previous work by FPF has identified challenges – for both organizations and individuals – in relying on consent as the primary basis for processing, especially in ensuring that it is provided meaningfully. In the context of AI development, FPF’s report on generative AI governance frameworks in the APAC region highlights the challenges of relying on consent for web crawling and scraping (however, this may not be an issue under the DPDPA for publicly available data – see point 2 below). Specifically, without an established legal relationship with the individuals whose data is scraped, it is practically impossible to identify and contact them to obtain their consent.

Certain sector-specific AI applications and generative AI systems that require curated personal data to develop AI models will need to be trained on personal data that is not publicly available. In such a context, data fiduciaries (i.e., “data controllers” or entities that determine the purposes and means of processing personal data) will likely need to rely on consent as the primary ground for processing personal data. As per the DPDPA, data fiduciaries (in this case, AI developers or deployers) must ensure that consent is accompanied by a notice clearly outlining the personal data being sought, the purpose of processing, and the rights available to the data principal. Furthermore, for personal data collected before the enactment of the DPDPA, data fiduciaries are required to provide notice to the “data principal” (i.e., the data subject, or the person whose personal data are collected or otherwise processed).

2. Exemptions for publicly available data could facilitate training AI models on scraped data, but require caution 

A significant provision under the DPDPA is the exclusion of publicly available data entirely from the scope of regulation. According to Section 3(c)(ii), the law does not apply to data that is made publicly available by the “data principal” or any other person legally obligated to make the data publicly available.

This blanket exemption goes further than similar provisions in other data protection laws, which, for instance, only exempt organizations from the obligation to obtain individuals’ consent for processing their personal data if the data is publicly available. This is the case in Singapore, where Section 13 of the Personal Data Protection Act (PDPA), read with the Act’s First Schedule, exempts organizations from the requirement to obtain consent to process personal data if the data is publicly available. However, unlike under the DPDPA, data protection obligations under the PDPA continue to apply even when processing publicly available data.

Similarly, Article 13 of China’s Personal Information Protection Law (PIPL), which, broadly, specifies the grounds for processing personal data, allows the processing of personal data without consent if the data has been disclosed by the individual concerned or has been lawfully disclosed. Such processing must be within reasonable scope and must balance the rights and interests of the individual and the larger public interest.

In Canada, the relevant exemption under the Personal Information Protection and Electronic Documents Act (PIPEDA) only applies to the processing of publicly available information in the circumstances mentioned in the Regulations Specifying Publicly Available Information, SOR/2001-7 (13 December, 2000). The Canadian data protection regulator provides guidance on the interpretation of what could be considered as publicly available.

Of note, the EU’s GDPR does not include any exemptions or even tailored rules applying to publicly available personal data. This is because the whole regulation applies equally to all personal data, including the provisions related to lawful grounds for processing. For instance, with regard to giving notice to data subjects, the GDPR even has a dedicated article that requires notice to be given when personal data was not collected directly from data subjects (Article 14). However, this obligation has an exception where “the provision of such information proves impossible or would involve a disproportionate effort, in particular for processing for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes”. There is currently an ongoing debate among European regulators on whether publicly available personal data can be processed lawfully, particularly by way of scraping, without the consent of individuals under the GDPR, with no clear answer yet.2

Globally, the scraping of webpages has come under increased regulatory scrutiny. In August 2023, members of the Global Privacy Assembly’s International Enforcement Cooperation Working Group issued a joint statement urging social media companies and other websites to guard against unlawful scraping of personal information from web pages. In May 2024, the European Data Protection Board’s ChatGPT Taskforce noted in its report that information automatically collected and extracted from webpages might contain personal data, including sensitive categories of personal data, which could “carry peculiar risks for the fundamental rights and freedoms” of individuals.

Processing of publicly available personal data would not be subject to obligations under the DPDPA to the extent that any personal data contained in the datasets was made publicly available by the data principal or by someone legally required to do so – this may include, for example, personal data from social media platforms and company directories. However, organizations will still need to incorporate appropriate safeguards to ensure that only permissible personal data is scraped and that the scraped data does not violate any other applicable laws. At the same time, questions may arise with regard to the applicability of the DPDPA to publicly available personal data that was collected for an initial processing operation, such as training an AI model, but which is no longer publicly available after being collected.

3. Exemptions for research purposes with clear technical and ethical standards could promote AI research and development

Section 17(2)(b) of the DPDPA also exempts processing of personal data for “research, archiving or statistical purposes” from obligations under the DPDPA. However, this exemption only applies if such processing complies with standards prescribed by the Central Government and is not done to take “any decision specific to a [d]ata [p]rincipal”. To date, the Central Government has not released any standards relating to this provision.

By contrast, data protection laws in most jurisdictions do not specifically provide an exemption for processing personal data for research purposes. Instead, they either recognize research as a secondary use that does not require a lawful basis distinct from the one originally relied on, or permit non-consensual processing for research, subject to certain conditions.

For instance, in the EU, under the GDPR, secondary use of personal data for archiving, statistical, or scientific research purposes is permissible, provided that ‘appropriate safeguards’ are in place to protect the rights of the data subject. These safeguards include technical and organizational measures aimed at ensuring data minimization. Furthermore, the GDPR allows the processing of sensitive categories of personal data when necessary for scientific or historical research purposes.

In Japan, the Act on the Protection of Personal Information (APPI) exempts consent requirements, in cases of secondary collection and use of personal data, if the data is obtained from an academic research institution and processed jointly with that institution. However, such processing must not be solely for commercial purposes and must not infringe upon the individual’s rights and interests.

In Singapore, the PDPA provides a limited additional basis for the use, collection, and disclosure of personal data for research purposes, if the organization can satisfy the following conditions: (a) the research purpose requires personally identifiable information; (b) there is a clear public benefit to the research; (c) the research results will not be used to make decisions affecting individuals; and (d) the published results do not identify individuals.

It is unclear at this stage if the research exemption under the DPDPA will extend only to academic institutions or also to private entities that engage in research. While such an exemption, with clearly outlined standards, could help create quality data sets for model development, it is crucial to have clearly defined technical and ethical standards that can prevent privacy harms.

4. Limited nature of DPDPA’s territorial scope may allow offshore providers of AI systems to engage in unregulated processing of personal data of data principals in India

Like many other global data protection frameworks, the DPDPA has extraterritorial applicability. Section 3(b) of the DPDPA indicates that the DPDPA applies to entities that process personal data outside India, if such processing is connected to any activity which is related to the offering of “goods or services” to data principals in India. 

This provision is narrower in scope than similar provisions under other global data protection laws. For example, the GDPR, unlike the DPDPA, also applies extraterritorially to processing which involves “the monitoring of behavio(u)r” of data subjects within the European Union. In fact, data protection authorities in Europe have fined foreign entities for unlawfully processing the personal data of EU residents, even when those entities have no presence in the region. Of note, under the EU’s AI Act, AI systems used in high-risk use cases3 “should be considered to pose significant risks of harm to the health, safety or fundamental rights if the AI system implies profiling” as defined by the GDPR (Recital 53), thereby linking “profiling” as a component of an AI system to heightened risks to the rights of individuals. Interestingly, the Personal Data Protection Bill, 2019, which was introduced in the Indian Parliament and withdrawn in 2022, and the Joint Parliamentary Committee’s version of the data protection bill also extended extraterritorial applicability to any processing that involved the “profiling of data principals within the territory of India”.

This narrower scope permits offshore providers of AI systems, which do not provide goods and services to data principals in India, to profile and monitor the behavior of data principals in India without being subject to any obligations following from the DPDPA. Additionally, such companies may engage in unregulated scraping of publicly available data to train their AI systems, beyond the exception explored above. As highlighted in point 2, publicly available personal data that has not been made available by the data principal or by any other person under a legal obligation still falls under the DPDPA’s scope of regulation. This could include personal data shared by others on blog pages, social media websites, or in public directories, among others. Compliance with the DPDPA obligations in these scenarios does not extend to offshore organizations, as long as they do not engage in activities related to offering goods or services in India. 

For the same types of data, all other data fiduciaries must ensure that the data is processed based on permissible grounds and is protected by appropriate security safeguards. Additionally, for personal data collected through consent, data fiduciaries must ensure that data principals are afforded the rights to access, correct, or erase their personal data held by the fiduciary.

5. Classification of significant data fiduciaries with objective criteria would allow a balanced and risk-based approach to data protection obligations relevant to AI systems

The DPDPA adopts a risk-based approach to imposing obligations by introducing a category of data fiduciaries known as ‘Significant Data Fiduciaries’ (SDFs). The DPDPA empowers the Central Government to designate any data fiduciary or class of data fiduciaries as an SDF based on the following factors:

  1. The volume and sensitivity of personal data processed; 
  2. The risk posed to the rights of data principals;
  3. The potential impact on the sovereignty and integrity of India; 
  4. Risk to electoral democracy; 
  5. Security of the state; and
  6. Public order.

In addition to complying with the obligations for data fiduciaries, SDFs are required to: 

The Data Protection Impact Assessment (DPIA) obligation is particularly relevant to identifying and mitigating risks to privacy and other rights that may be impacted by processing of personal data in the context of training or deploying an AI system.

The Central Government also has the power to impose additional obligations on SDFs. On the other hand, the Central Government is also empowered to exempt certain data fiduciaries or a class of data fiduciaries, “including startups”, from notice, data retention limitation, accuracy, and other obligations.

It is important to note that the DPDPA does not specify objective criteria, such as the categories of personal data that may be considered sensitive, or the volume of data or users required for the classification of SDFs or the easing of certain obligations for data fiduciaries. In the absence of these specific quantitative thresholds, the classification of AI-driven companies could be influenced by the Central Government’s perception of the potential threats posed by specific AI applications.

Conclusion 

With the AI market in India growing at 25-35% annually and projected to reach a market size of around $17 billion by 2027, the Indian government has recognized this opportunity by allocating over $1.2 billion for the IndiaAI Mission, aimed at developing domestic capabilities to boost the growth of AI in the country. As AI continues to evolve and integrate into various sectors, the DPDPA provides a crucial framework that will influence how organizations develop and deploy AI technologies in India. The law’s exemptions for publicly available data, its over-reliance on consent, and a graded approach to obligations for data fiduciaries present both opportunities and challenges. 

The provisions of the DPDPA will only take effect once the government issues a notification under Section 1(2) of the DPDPA. The forthcoming DPDP Rules are expected to clarify and operationalize key aspects of the Act. These include the form and manner of providing notices, breach notification procedures, how data principals can exercise their rights under the DPDPA, and the provisions on procedure and operations of the Data Protection Board. The effectiveness of the law in balancing privacy protection and harm prevention, on the one hand, with harnessing the benefits that AI could bring for people and society, on the other, will become clearer once these rules are in place.

Edited by: Gabriela Zanfir-Fortuna, Josh Lee Kok Thong, and Dominic Paulger

  1. You can refer to FPF’s previous blogs (here and here) for a brief history and overview of the DPDPA.
  2. See, for instance, Report of the work undertaken by the ChatGPT Taskforce of the EDPB, May 2024, paras. 15 to 19, and the Dutch Data Protection Authority’s Guidelines on the scraping of web data.
  3. As identified in Annex III of the regulation.