U.S. Legislative Trends in AI-Generated Content: 2024 and Beyond
Standing in front of the U.S. flag and dressed as Uncle Sam, Taylor Swift proudly proclaims that you should vote for Joe Biden for President. In a nearly identical image circulated by former President Trump himself, she instead urges you to vote for Donald Trump. Both images, and the sentiments they purport to convey, are fabricated: the output of a generative AI tool for creating and manipulating images. In fact, shortly after Donald Trump circulated his version of the image, and in response to the fear of spreading misinformation, the real Taylor Swift posted a genuine endorsement of Vice President Kamala Harris to her Instagram account.
Generative AI is a powerful tool, both in elections and more generally in people’s personal, professional, and social lives. In response, policymakers across the U.S. are exploring ways to mitigate risks associated with AI-generated content, also known as “synthetic” content. As generative AI makes it easier to create and distribute synthetic content that is indistinguishable from authentic or human-generated content, many are concerned about its potential growing use in political disinformation, scams, and abuse. Legislative proposals to address these risks often focus on disclosing the use of AI, increasing transparency around generative AI systems and content, and placing limitations on certain synthetic content. While these approaches may address some challenges with synthetic content, they also face a number of limitations and tradeoffs that policymakers should address going forward.
1. Legislative proposals to regulate synthetic content have primarily focused on authentication, transparency, and restrictions.
Generally speaking, policymakers have sought to address the potential risks of synthetic content by promoting techniques for authenticating content, establishing requirements for disclosing the use of AI, and/or setting limitations on the creation and distribution of deepfakes. Authentication techniques, which involve verifying the source, history, and/or modifications to a piece of content, are intended to help people determine whether they’re interacting with an AI agent or AI-generated content, and to provide greater insight into how content was created. Authentication requirements often involve giving people the option to embed, attach, or track certain information with a piece of content so that others can see where it came from, such as:
Watermarking: embedding information into content for the purpose of verifying the authenticity of the output, determining the identity or characteristics of the content, or establishing provenance (see below). Also referred to as “digital watermarking” in this context, to distinguish from traditional physical watermarks.
Provenance tracking: recording and tracking the origins and history of content or data (also known as “provenance”) in order to determine its authenticity or quality.
Metadata recording: tracking information about data or content itself, rather than its substance (also known as “metadata”) for the purpose of authenticating the origins and history of content.
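To make the watermarking concept concrete, here is a toy sketch that hides a short tag in the least-significant bits of simulated pixel values. This is an illustrative stand-in, not any standardized or production scheme; real digital watermarks are designed to be far more robust and imperceptible:

```python
# Toy digital watermark: write each bit of a short ASCII tag into the
# least-significant bit (LSB) of successive "pixel" values.

def embed_watermark(pixels: list[int], tag: str) -> list[int]:
    """Embed the bits of `tag` (LSB-first per byte) into pixel LSBs."""
    bits = [(byte >> i) & 1 for byte in tag.encode() for i in range(8)]
    if len(bits) > len(pixels):
        raise ValueError("carrier too small for tag")
    out = pixels[:]
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit  # clear the LSB, then set the tag bit
    return out

def extract_watermark(pixels: list[int], n_chars: int) -> str:
    """Reassemble `n_chars` bytes from the LSBs of the first pixels."""
    data = bytearray()
    for c in range(n_chars):
        byte = 0
        for i in range(8):
            byte |= (pixels[c * 8 + i] & 1) << i
        data.append(byte)
    return data.decode()

carrier = list(range(100, 200))        # stand-in for image pixel values
marked = embed_watermark(carrier, "AI")
assert extract_watermark(marked, 2) == "AI"

marked[0] ^= 1                         # a single-bit edit corrupts the mark
assert extract_watermark(marked, 2) != "AI"
```

The last two lines hint at why naive watermarks are easy to remove or corrupt: even a one-bit change to the carrier destroys the embedded tag.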
A number of bills require or encourage the use of techniques like watermarking, provenance tracking, and metadata recording. Most notably, California AB 3211 regarding “Digital Content Provenance Standards,” which was proposed in 2024 but did not pass, sought to require generative AI providers to embed provenance information in synthetic content and provide a tool to users for detecting synthetic content, and would have required recording-device manufacturers to offer users the ability to embed authenticity and provenance information in content. At the federal level, a bipartisan bill, the Content Origin Protection and Integrity from Edited and Deepfaked Media (COPIED) Act, has been introduced that would direct the National Institute of Standards and Technology (NIST) to develop standards for watermarking, provenance, and synthetic content detection, and would require generative AI providers to allow content owners to attach provenance information to content. If passed, the COPIED Act would build on NIST’s existing efforts to provide guidelines on synthetic content transparency techniques, as required by the White House Executive Order (EO) on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence.
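Provenance tracking of the kind these bills contemplate is commonly built on cryptographically chained records of a content item's history (industry efforts such as the C2PA standard use signed manifests, which are more elaborate). A minimal illustrative sketch, assuming nothing about any specific standard:

```python
import hashlib
import json

def provenance_entry(content: bytes, action: str, prev_hash: str) -> dict:
    """Record one step in a content item's history, chained to the prior record."""
    record = {
        "action": action,                                  # e.g. "created", "ai_edit"
        "content_sha256": hashlib.sha256(content).hexdigest(),
        "prev": prev_hash,                                 # links records into a chain
    }
    record["hash"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()
    ).hexdigest()
    return record

def verify_chain(chain: list[dict]) -> bool:
    """A break anywhere (tampered record or reordering) fails verification."""
    prev = "genesis"
    for rec in chain:
        body = {k: v for k, v in rec.items() if k != "hash"}
        expected = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        if rec["prev"] != prev or rec["hash"] != expected:
            return False
        prev = rec["hash"]
    return True

chain = [provenance_entry(b"photo bytes", "created", "genesis")]
chain.append(provenance_entry(b"edited photo bytes", "ai_edit", chain[-1]["hash"]))
assert verify_chain(chain)

chain[0]["action"] = "forged"      # tampering with history breaks the chain
assert not verify_chain(chain)
```

Real provenance systems add digital signatures so that a forger cannot simply recompute the hashes, but the chaining idea is the same: each record commits to the one before it, so edits to the history are detectable.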
Relatedly, policymakers are also exploring ways to improve transparency regarding synthetic content through labeling, disclosures, and detection. Some legislation, such as the recently-enacted Colorado AI Act and the pending federal AI Labeling Act of 2023, requires individuals or entities to label AI-generated content (labeling) or disclose the use of AI in certain circumstances (disclosure). Other legislation focuses on synthetic content detection tools, which analyze content to determine whether it’s synthetic and to provide further insight into the content. Detection tools can include those that evaluate the likelihood a given piece of content is AI-generated, as well as tools that can read watermarks, metadata, or provenance data to inform people about the content’s background. For example, the recently-enacted California AI Transparency Act requires, among other things, generative AI system providers to make an AI detection tool available to their users. Separately, the Federal Communications Commission (FCC) is exploring creating rules around the use of technologies that analyze the content of private phone conversations to alert users that the voice on the other end of the line may be AI-generated.
Another common approach to addressing synthetic content risks has been to place legal restrictions on the production or distribution of certain AI-generated content, particularly “deepfakes” that use AI to appropriate a person’s likeness or voice. In contrast to more technical and organizational approaches, legal restrictions typically involve prohibiting certain uses of deepfakes, providing mechanisms for those affected to seek relief, and potentially placing liability on platforms that distribute or fail to remove prohibited content. Over the past few years, many states have passed laws focused on deepfakes in political and election-related communications, non-consensual intimate imagery (NCII), and child sexual abuse material (CSAM), with some applying to deepfakes more generally. This year at the federal level, a number of similar bills have been introduced, such as the Candidate Voice Fraud Prohibition Act, DEEPFAKES Accountability Act, and Protect Victims of Digital Exploitation and Manipulation Act. The Federal Trade Commission (FTC) has also taken this approach, recently finalizing a rule banning fake reviews and testimonials (including synthetic ones), and exploring rulemaking on AI-driven impersonation of individuals. The FCC has also considered engaging in rulemaking on disclosures for synthetic content in political ads on TV and radio.
2. Legislative approaches to synthetic content need to be carefully considered to assess feasibility and impact.
While legally-mandated safeguards may help address some of the risks of synthetic content, they also currently involve a number of limitations, and may conflict with other legal and policy requirements or best practices. First, many of the technical approaches to improving transparency are relatively new, and often not yet capable of achieving the goals with which they may be tasked. For example, synthetic content detection tools—which have already been used controversially in schools—are, generally speaking, not currently able to reliably flag when content is meaningfully altered by generative AI. This is particularly true when a given tool is used across different media and content types (e.g., images, audio, text), and across languages and cultures, where accuracy can vary significantly. And because they often make mistakes, detection tools may fail to slow the distribution of misinformation while simultaneously fueling skepticism about their own reliability.
Even more established techniques may still have technical limitations. Watermarks, for instance, can still be removed, altered, or forged relatively easily, creating a false history for a piece of content. Techniques that are easy to manipulate could end up creating mistrust in the information ecosystem, as synthetic content may appear as non-synthetic, and non-synthetic content may be flagged as synthetic. Additionally, because watermarking only works when the watermark and detection tool are interoperable—and many are not—rolling this technique out at scale without coordination may prove unhelpful and exacerbate confusion. Finally, given that there is no agreement or standard regarding when content has been altered enough to be considered “synthetic,” techniques for distinguishing between synthetic and non-synthetic content are likely to face challenges in drawing a clear line.
Certain techniques that are intended to provide authentication through tracking, like metadata recording and provenance tracking, may also conflict with privacy and data protection principles. Provenance and metadata tracking, for example, may reveal individuals’ personal data, and digital watermarks can be individualized, which could then be used to monitor people’s personal habits or online behavior. These techniques require collecting more data about a piece of content, and keeping records of it for longer periods of time, which may be in tension with mandates to minimize data collection and limit retention. As previously mentioned, the FCC is investigating third-party AI call detection, alerting, and blocking technologies, which require real-time collection and analysis of private phone conversations, often without the other party’s knowledge. Notably, FCC Commissioner Simington has said the notion of the Commission putting its “imprimatur” on “ubiquitous third-party monitoring” tools is “beyond the pale.”
Beyond issues with technical feasibility and privacy, some approaches to addressing synthetic content risks are likely to face legal challenges under the First Amendment. According to some interpretations of the First Amendment, laws prohibiting the creation of deepfakes in certain circumstances—such as in the case of election-related content and digital replicas of deceased people—violate constitutionally-protected free expression. For example, in early October a federal judge enjoined a recently-enacted California law that would prohibit knowingly and maliciously distributing communications with “materially deceptive” content that could harm a political candidate, and which portrays them doing something they did not do—such as a deepfake—without a disclosure that the media is manipulated. According to the judge, the law may violate the First Amendment because its disclosure requirement is “overly burdensome and not narrowly tailored,” and because the law’s over-broad conception of “harm” may stifle free expression.
Finally, some have raised concerns about the intersection between regulation of synthetic content and other regulatory areas, including platform liability and intellectual property. Critics argue that laws holding republishers and online platforms liable for prohibited content run afoul not only of the First Amendment but also of Section 230 of the Communications Decency Act, which largely shields interactive computer service providers from liability for third-party content. Under this argument, exposing platforms to liability for failing to remove or block violative synthetic content that users have not reported to them contradicts Section 230 and places an unreasonable logistical burden on platforms. There is also concern that holding platforms responsible for removing “materially deceptive” content—such as in the context of elections and political communications—would put them in the position of determining what information is “accurate,” a task for which they are not equipped. In recognition of these technical and organizational limitations, some have pushed for legislation to include “reasonable knowledge” and/or “technical feasibility” standards.
3. More work lies ahead for policymakers intent on regulating synthetic content.
2024 has been called an election “super year,” and by the end of the year up to 3.7 billion people in 72 countries will have voted. This concentration of elections has likely motivated lawmakers to focus on the issues surrounding deepfakes in political and election-related communications. By contrast, there will be significantly fewer elections in the coming years. At the same time, emerging research is challenging the notion that deepfakes have a noticeable impact on either the outcome or integrity of elections. Additionally, the U.S. Federal Election Commission (FEC) recently declined to make rules regarding the use of AI in election ads, stating it doesn’t have the authority to do so, and has clashed with the FCC over that agency’s own attempt to regulate AI in election ads.
While political and election deepfakes may get less policymaker attention in the U.S. in 2025, deepfakes are only becoming harder to distinguish from authentic content. At the federal level, U.S. regulators and lawmakers have signaled strong interest in continuing to push for the development and implementation of content authentication techniques to allow people to distinguish between AI and humans, or between AI-generated content and human-generated content. NIST, for example, is currently responding to the White House EO on AI and finalizing guidance for synthetic content authentication, to be published by late December 2024. In May 2024 the Bipartisan Senate AI Working Group, led by Sen. Chuck Schumer, published its Roadmap for AI policy, recommending that congressional committees consider the need for legislation regarding deepfakes, NCII, fraud, and abuse. The FTC is also currently considering an expansion of existing rules prohibiting impersonation of businesses and government officials to cover individuals as well, including AI-enabled impersonation. Given generative AI’s increasing sophistication, and integration into more aspects of people’s daily lives, interest in content authentication will likely continue to grow in 2025.
In the same way that age verification and age estimation tools got a boost in response to children’s privacy and safety regulations requiring differential treatment of minors online, there may be a similar effect on authentication tools. The FCC is already interested in exploring real-time call detection, alerting, and blocking technologies to distinguish human callers from AI callers. Other similar solutions, such as “personhood credentials,” are also building on existing techniques like credentialing programs and zero-knowledge proofs to provide assurance that a particular individual online is in fact a human, or that a given online account is the official one and not an imposter.
As generative AI becomes more powerful, and synthetic content more convincing, malicious impersonation, disinformation, and NCII and CSAM may pose even greater risks to safety and privacy. In response, policymakers are likely to ramp up efforts to manage these risks, through a combination of technical, organizational, and legal approaches. In particular, lawmakers may focus on especially harmful uses of deepfakes, such as synthetic NCII and CSAM, as well as encouraging or mandating the use of transparency tools like watermarking, content labeling and disclosure, and provenance tracking.
Processing of Personal Data for AI Training in Brazil: Takeaways from ANPD’s Preliminary Decisions in the Meta Case
Data Protection Authorities (DPAs) across the globe are currently wrestling with fundamental questions raised by the emergence of generative AI and its compatibility with data protection laws. A key issue is the legal basis under which companies may process personal data to train AI models. Another is how the rights of individuals with regard to their personal data can be safeguarded, as well as how to mitigate the potential risks arising from the complex and novel processing of personal data entailed by generative AI in particular.
Brazil’s Autoridade Nacional de Proteção de Dados (ANPD) reviewed these issues recently in a set of Preliminary Decisions following an inspection into the lawfulness of Meta’s processing of personal data for the training of AI models. Notably, the DPA initially ordered Meta, under an emergency procedure, to suspend this processing, citing potential harm and irreparable damage to users. The emergency order was maintained following a first challenge, but it was subsequently reversed after the DPA was satisfied with the level of cooperation by the company and the measures it proposed. However, the main inspection process continues.
Although preliminary, the Brazil ANPD’s decisions contain insights into the assessment criteria that DPAs are starting to deploy when looking at the impact generative AI has on the rights and freedoms of individuals in the context of the compatibility of this new technology with data protection law. In particular, the salient issues that surface are related to:
Relying on “legitimate interests” as a lawful ground for processing publicly available personal data for AI training, paired with providing individuals an accessible right to opt out from such processing;
Meaningful transparency;
The scope of what constitutes “sensitive data” and the lawfulness of processing it in this context;
Protections for children’s personal data.
In this blog, we summarize the procedural steps that have occurred, from the initial suspension (Round 1) to the upholding of that decision (Round 2) and the currently proposed action plan (Round 3), including the ANPD’s reasoning at each stage, and offer our initial reflections and key takeaways, including what this means for the ANPD’s enforcement priorities and the future of “legitimate interests.”
Round 1: ANPD Suspends Meta’s Processing for AI Training Purposes
The ANPD issued a preventive Measure on July 2, 2024, requiring the immediate suspension of the processing of personal data by Meta for the purpose of training its generative AI model. The decision came after Meta announced a change in its privacy policy indicating that it could use “publicly available” information collected from users to train and enhance its AI system starting June 26, 2024. The ANPD initiated an ex-officio inspection (Proceeding No. 00261.004529/2024-36) and preliminarily ordered a suspension of that processing activity.
In this initial order, the ANPD determined that preventive measures were necessary to avoid irreparable damage or serious risk to individuals, and in turn, ordered a temporary suspension of the processing activity by Meta. The decision adopted the legal reasoning of Vote 11/2024/DIR-MW/CD, presented by Director Miriam Wimmer, which was supported by the ANPD’s General Coordination of Inspection (CGF) technical report proposing the preventive measure. In its deliberative vote, the Board determined Meta had potentially violated several provisions of the country’s general data protection law (LGPD) due to:
the ineffective use of “legitimate interest” as a legal basis for processing personal data for AI training purposes;
a lack of transparency and disclosure to users about its processing operations involving user data;
limiting the exercise of data subjects’ rights; and
processing of personal data of children and adolescents without proper safeguards.
Hurdles for relying on “legitimate interests”: processing sensitive data and the legitimate expectations of users
In its July 2 Order, the ANPD determined “legitimate interests” were not an adequate basis for processing personal data for Meta’s AI training activity, because the processing may have included users’ sensitive data. Of note, the LGPD requires that all processing of personal data be based on a lawful ground (Article 7), similar to the EU’s General Data Protection Regulation but with some variations. Meta’s privacy policy originally stated it relied on the “company, users, and third parties’ legitimate interest” to process any personal data from publicly available sources, including images, audio, texts, and videos. The ANPD found that such information might reveal sensitive information about an individual’s political, religious, and sexual preferences, among other aspects of their personality, and thus qualify as “sensitive data” under Article 5 of the LGPD. Article 5, section II, defines “sensitive personal data” as “personal data on racial or ethnic origin, religious conviction, political opinion, union affiliation or religious, philosophical or political organization, health or sexual life data, genetic or biometric data, when linked to a natural person.”
Under Article 11 LGPD, the processing of “sensitive data” can only be carried out with the data subject’s consent, or if the processing is “indispensable” for a set of specific scenarios, such as:
Compliance with a legal or regulatory obligation;
Processing data necessary for the execution of public policies provided by law or regulation;
Conducting studies by a research body, ensuring that data is anonymized where possible;
Processing that is part of the regular exercise of rights, including by contract or in judicial, administrative, or arbitral proceedings under the Arbitration Law;
Protecting the life or physical safety of the data subject or third parties;
Health protection, exclusive to a procedure performed by health professionals, health services, or the health authority;
Guaranteeing fraud prevention and security for the data subject in processes of identification and authentication of registration in electronic systems, safeguarding [the rights to access data given in Article 9] and except where fundamental rights and freedoms of the data subject that require the protection of personal data prevail.
Even if Meta’s processing activities did not include providing its model with sensitive data, the ANPD determined the company’s reliance on “legitimate interests” as a lawful ground would not be sufficient unless it met the legitimate expectations of the data subjects.
Under Article 10, Section II of the LGPD, a controller must be able to demonstrate the processing of personal data for the intended purpose “respects the legitimate expectations and fundamental rights and freedoms” of the data subjects. In this case, the ANPD argued data subjects could not reasonably expect their personal information would be used to train Meta’s AI model – given that the data was primarily shared for networking with family and friends and included information posted long before the policy change. One point that was not addressed in the decision was whether the source of the publicly available personal data used for AI training, on-platform or off-platform, would make a difference in such an assessment.
To adequately meet required expectations, the ANPD determined a controller must give clear and precise information to data subjects concerning how it intended to use their data and provide effective mechanisms for the exercise of consumer rights. As explained below, the ANPD determined Meta’s new policy was insufficiently transparent and potentially obstructed data subjects’ rights – two potential violations of the LGPD.
Transparency must also extend to how changes in the Privacy Policy are communicated
The ANPD noted that even if Meta’s legitimate interest was adequate, the June change to its privacy policy would nonetheless violate the principle of transparency. Article 10(2) of the LGPD requires data controllers to “adopt measures to guarantee the transparency of data processing based on its legitimate interest.” The ANPD found that Meta failed to provide clear, specific, and broad communication about the privacy policy change. Citing its Guidance on Legitimate Interest, the agency noted that, under this legal hypothesis, data controllers must provide information about the processing clearly and extensively and identify the duration and purpose of the processing, as well as data rights and channels available for their exercise.
Importantly, the agency highlighted the differences in the company’s communication with Brazilian users compared to those in the European Union (EU): EU users were notified about the privacy policy change via email and app notifications, while Brazilian users were not informed and only able to see the privacy policy’s update via Meta’s Privacy Policy Center. In addition, the CGF’s Technical Report, as cited in Vote 11/2024/DIR-MW/CD, highlighted how the failure to provide transparency heightened information asymmetries between the platform and its users, especially for those who are not users but whose personal data might have been employed for AI training.
Exercising Data Subjects’ Rights must be straightforward and involve few steps
The ANPD found that Meta’s privacy policy’s opt-out mechanism was difficult to use and required users to take several steps before successfully opting out of the processing. The CGF’s Technical Report highlighted that, unlike EU users, Brazilians were required to go through eight steps to access the opt-out form, which was hosted in a complex interface. The ANPD pointed to its Cookie Guidelines to show that companies must provide intuitive mechanisms and tools to assist users in exercising their rights and asserting control over their data, as well as a previous recommendation made to Meta in 2021, in which the ANPD specifically recommended the company adjust its privacy policy for full compliance with the LGPD. The agency specifically cited the lack of clear communication and the difficult mechanisms for exercising the right to opt out as particularly alarming, given that Meta’s processing operations also affect minors.
Processing Data of Children and Adolescents must be done in their “best interest”
The LGPD provides special protection to the data of children and adolescents. Under Article 14, any processing of children and adolescents must be carried out in the “best interests” of the minor and Article 14 Section 6 requires information on the processing to be “provided in a simple, clear and accessible manner, taking into account the physical-motor, perceptual, sensory, and intellectual and mental characteristics of the user.” The ANPD found Meta potentially failed to comply with this obligation and to demonstrate its legitimate interest was adequately balanced against the “best interest” of Brazilian children and adolescents.
While the LGPD does not prohibit reliance on “legitimate interest” to process the personal data of children, this activity must still satisfy the requirement that the processing is in the best interest of the child. The ANPD cited its Guidelines on Processing Personal Data Based on Legitimate Interest to indicate controllers must perform and document a “balancing test” to demonstrate (i) what it considered the “best interest” of the children; (ii) the criteria used to weigh the children’s “best interest” against the controller’s legitimate interest; and (iii) that the processing does not disproportionately impact the rights of children or pose excessive risk. In this case, the ANPD pointed out that Meta’s new policy was silent on how the processing for AI training was beneficial for children and adolescents and noted it did not include any measures to mitigate potential risks.
Round 2: Meta Requests Reconsideration, ANPD Upholds the Suspension
After notification of the July 2 Order, Meta filed for reconsideration to (i) fully lift the suspension or, in the alternative, (ii) obtain a deadline extension to certify the suspension of the processing of personal data for AI training in Brazil. In response to the request to lift the suspension, the ANPD upheld its original decision on the basis that Meta did not provide sufficient documentation to demonstrate it had adopted measures to mitigate the risks of harm and irreparable damage to data subjects.
The July 10 Decision was supported by the reasoning of Vote 19/2024/DIR-JR/CD issued by Director Joacil Rael. Although Meta’s intention to implement specific mitigating measures was considered, the Board determined the company failed to specify a date for putting the proposed actions into practice or show evidence that they were in effect.
In that sense, a full reversal of the suspension would not be considered until the company presented satisfactory documentation indicating a specific ‘work plan’ and a timeframe for its implementation. As for Meta’s alternative request, the ANPD granted a deadline extension for the company to certify it had suspended the relevant processing operations. The extension was based on the argument that it was “technically unfeasible” to confirm full suspension of the processing within the original deadline (five working days from notification of the July 2 Order), although the specific reasons for this argument were not included in the Decision. The agency granted Meta five additional business days to present its compliance plan and postponed analysis on the merits of fully lifting the suspension until that later date.
Round 3: The Proposed Action Plan allows Meta to Resume Processing for AI Training while Waiting for the Conclusion of the Full Inspection Process
After Meta provided the requested documentation, the ANPD reconsidered the company’s request to lift the suspension entirely. In its August 30 Decision, the agency determined the company’s compliance plan adequately improved transparency and allowed for the exercise of data subjects’ rights. The Board lifted the general suspension and allowed Meta to continue processing personal data for AI training, except for data from individuals under the age of 18.
Addressing its prior concerns about transparency and potential obstruction of data rights, the ANPD considered Meta’s revised plan sufficient to eliminate the previously identified risk of harm. Meta agreed to undertake several changes to its Privacy Policy, app, and website banners to better communicate the purposes of the processing and provide easier ways to opt out of AI training. Full details of Meta’s compliance plan are not given in the decision; however, some of the changes noted by the ANPD include:
sending email and app notifications to users at least 30 days before the beginning of the new processing, and
providing a link with easy access to the form to opt out of the processing for both users and non-users of Meta’s platforms. It is not entirely clear how accessible and straightforward the opt-out process must be under LGPD requirements, though the ANPD’s decision suggests the change was adequate because the process was altered to involve “fewer clicks.”
In lifting the suspension, the ANPD accepted Meta’s commitments to adopt safeguards to mitigate risks, including the implementation of pseudonymization techniques during the pre-training phase of its AI model and the adoption of security measures to prevent re-identification attacks. These measures, together with the proposed changes to communication and opt-out mechanisms, were sufficient for the ANPD to lift the suspension (except for the processing of personal data concerning minors), addressing its earlier concern that the company’s reliance on “legitimate interests” to process personal data to train its generative AI tool did not sufficiently balance the risks to data subjects.
Some Reflections
Importantly, the ANPD stresses that the legality of relying on “legitimate interests” as a lawful basis for AI training purposes under the LGPD requires further examination. Given the complexity and multifaceted nature of the issue, the authority leaves the question open for examination in administrative proceedings as part of the ongoing inspection, subject to further evidence that Meta’s safeguards and techniques effectively address the risks associated with processing personal information, including sensitive data, for AI training. Notably, the LGPD is also one of the few data protection laws to explicitly adopt a special level of protection for children’s and adolescents’ data, through the “best interest of the child” standard, which requires a detailed weighing of those individuals’ interests against the controller’s interest.
The goal of ANPD’s processing suspension was to prevent Meta from processing personal data to train its generative AI, as the authority considered that the company had given insufficient consideration for potential violations of data subjects’ rights and freedoms. The nature of the suspension, as a preliminary measure, was to prevent ongoing harm from violations identified during the inspection. In its initial decision, the ANPD’s key concern was the proper implementation of the required balancing test to rely on legitimate interests in order to process personal data for AI training as well as ensuring sufficient internal controls to mitigate associated risks. The authority noted that the complexity of the legal question, in combination with the technicality of the issue and information asymmetries between the company, users, and non-users, justified a preliminary suspension of the processing.
It is also important to highlight that not all issues identified in the original order are addressed in the reconsideration. For instance, it is not clear whether Meta had already processed the personal information of users with public accounts to train its AI model before the suspension, including sensitive data and data from children and adolescents, and what would happen to that data. Of note, the decision reversing the original order does not include specific details about the steps the company committed to take to effectively comply with the ongoing prohibition on processing children’s and adolescents’ data.
The ANPD nonetheless made clear that Meta has committed to cooperating with the authority to implement its compliance plan, and that such cooperation includes providing evidence of the security measures and internal controls to be adopted.
The ANPD’s initial suspension relied on finding potential violations of the LGPD, most significantly, for the potential lack of a valid lawful ground for processing both newly acquired and previously provided personal data to train the generative AI model. This determination, as well as the criteria taken into account to reverse it, involves a major question that can significantly impact the future of data processing in the context of AI training – a decision that may have a global impact as more authorities worldwide are inevitably faced with similar scenarios given the proliferation of generative AI technologies. The final determinations of this case will provide critical insight into the immediate future of data protection enforcement in Brazil and elsewhere.
According to article 4, sec. II, of the ANPD’s Internal Regulations, Directors can issue a vote when assigned the role of Rapporteur of a matter before the Board of Directors. Under article 17, sec. V, of the Internal Regulations, the CGF may propose the adoption of preventive measures and setting a daily fine for non-compliance to the Board of Directors. ↩︎
Do LLMs Contain Personal Information? California AB 1008 Highlights Evolving, Complex Techno-Legal Debate
By Jordan Francis, Beth Do, and Stacey Gray, with thanks to Dr. Rob van Eijk and Dr. Gabriela Zanfir-Fortuna for their contributions.
California Governor Gavin Newsom signed Assembly Bill (AB) 1008 into law on September 28, amending the definition of “personal information” under the California Consumer Privacy Act (CCPA) to provide that personal information can exist in “abstract digital formats,” including in “artificial intelligence systems that are capable of outputting personal information.”
The bill focuses on the issue of applying existing CCPA privacy rights and obligations to generative AI and large language models (LLMs). However, the bill introduces ambiguities that raise a significant emerging question within the privacy and AI regulatory landscape: whether, and to what extent, personal information exists within generative AI models. The legal interpretation of this question (whether “yes,” “no,” or “sometimes”) will impact how concrete privacy protections, such as deletion and access requests, apply to the complex data processes in generative AI systems.
Prior to its signing, the Future of Privacy Forum (FPF) submitted a letter to Governor Newsom’s office, highlighting the ambiguities in the bill, summarizing some preliminary analysis from European data protection regulators concerning whether LLMs “contain” personal information, and recommending that California regulators collaborate with technologists and their U.S. and international counterparts to share expertise and work toward a common understanding of this evolving issue.
This post examines:
The complex policy, legal, and technical challenges posed by the enactment of AB 1008 regarding whether generative AI models contain personal information;
Evolving perspectives from global regulators and experts on this issue; and
The implications of various approaches for privacy compliance and AI development.
AB 1008 Highlights a Complex Question: Do Generative AI Models “Contain” Personal Information?
AB 1008 amends the definition of “personal information” under the CCPA to clarify that personal information can exist in various formats, including physical formats (e.g., “paper documents, printed images, vinyl records, or video tapes”), digital formats (e.g., “text, image, audio, or video files”), and abstract digital formats (e.g., “compressed or encrypted files, metadata, or artificial intelligence systems that are capable of outputting personal information”). Specifically, the inclusion of “abstract digital formats” and “artificial intelligence systems” raises the complex question of whether generative AI models themselves can “contain” personal information.
Generative AI models can process personal information at many stages of their life cycle. Personal information may be present in the data collected for training datasets, often sourced from publicly available information that is exempt from U.S. privacy laws, as well as in the training processes themselves. Personal information can also be present in the input and output of a generative AI model when it is being used or trained. For example, asking an LLM such as ChatGPT, Claude, Gemini, or Llama a question such as “Who is Tom Cruise?” or “When is Tom Cruise’s birthday?” should generate a response that contains personal information, and the same is true for many lesser-known public figures.
Does this mean personal information exists “within” the model itself? Unlike typical databases that store and retrieve information, LLMs are deep neural networks trained on vast amounts of text data to predict the next word in a sequence. LLMs rely on the statistical relationships between “tokens” or “chunks” of text representing commonly occurring sequences of characters. In such a model, the tokens comprising the words “Tom” and “Cruise” are more closely related to each other than the tokens comprising “Tom” and “elevator” (or another random word). LLMs use a transformer architecture, which enables processing input text in parallel and captures long-range dependencies, allowing modern LLMs to engage in longer, more human-like conversations and greater levels of “understanding.”
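The statistical-relationship idea above can be illustrated with a toy sketch. The three-dimensional vectors below are hand-picked for illustration only; real LLMs learn embeddings with thousands of dimensions from training data, and no actual model weights are shown here:

```python
import math

# Toy, hand-picked 3-dimensional "embeddings" for illustration only.
# Real LLMs learn high-dimensional vectors from vast training corpora.
toy_embeddings = {
    "Tom":      [0.90, 0.80, 0.10],
    "Cruise":   [0.85, 0.75, 0.15],
    "elevator": [0.10, 0.20, 0.90],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: closer to 1.0 = more related."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Tokens that frequently co-occur in training data end up with similar
# vectors, so "Tom" sits closer to "Cruise" than to an unrelated word.
sim_cruise = cosine_similarity(toy_embeddings["Tom"], toy_embeddings["Cruise"])
sim_elevator = cosine_similarity(toy_embeddings["Tom"], toy_embeddings["elevator"])
assert sim_cruise > sim_elevator
```

The point of the sketch is that the model encodes relationships between tokens as geometry, not as stored records, which is why regulators disagree about whether such representations "contain" personal information.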
While the question may seem academic, the answer has material compliance implications for organizations, including for responding to deletion and access requests (discussed further below). An earlier draft version of AB 1008 would have provided that personal information can exist in “the model weights of artificial neural networks,” and legislative history supports that AB 1008’s original intent was to address concerns that, “[o]nce trained, these [GenAI] systems could accurately reproduce their training data, including Californians’ personal information.”
Despite the stated goal of clarifying the law’s definition of personal information, ambiguities remain. The statute’s reference to AI “systems,” rather than “models,” could impact the meaning of the law, and “systems” is left undefined. While a “model” generally refers to a specific trained algorithm (e.g., an LLM), an AI “system” could encompass the broader infrastructure surrounding the model, including user interfaces and application programming interfaces (APIs) for interacting with the model, tools for monitoring model performance and usage, and processes for periodically fine-tuning and retraining the model. Additionally, legislative analysis suggests the drafters were primarily concerned with personal information in a system’s output. The Assembly Floor analysis from August 31 suggests that organizations could comply with deletion requests by preventing their systems from outputting personal information through methods like:
1. Filtering and suppressing the system’s inputs and outputs.
2. Excluding the consumer’s personal information from the system’s training data.
3. Fine-tuning the system’s model in order to prevent the system from outputting personal information.
4. Directly manipulating model parameters in order to prevent the system from outputting personal information.
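As an illustration, the first method (filtering and suppressing outputs) can be sketched as a simple post-processing step. The names, deny-list, and redaction behavior below are hypothetical; production systems would use far more robust entity detection than string matching:

```python
import re

# Hypothetical deny-list built from deletion requests. Note the tradeoff
# discussed in the text: the deny-list itself retains personal information
# in order to enforce suppression on an ongoing basis.
SUPPRESSED_NAMES = {"Jane Doe", "John Q. Public"}

def suppress_personal_info(model_output: str) -> str:
    """Redact deny-listed names from a model's output before it reaches the user."""
    redacted = model_output
    for name in SUPPRESSED_NAMES:
        redacted = re.sub(re.escape(name), "[REDACTED]", redacted,
                          flags=re.IGNORECASE)
    return redacted

print(suppress_personal_info("Jane Doe was born in 1970."))
# The name is filtered from the output, but the model's weights are unchanged.
```

This kind of filter operates only at the output stage, which is precisely why its sufficiency as a "deletion" mechanism depends on whether the model itself is deemed to contain personal information.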
This emphasis on the output of generative AI models, rather than the models themselves, suggests that the bill does not necessarily define models as containing personal information per se. Though the CPPA issued a July 2024 letter on AB 1008, the letter does not provide a detailed conclusion or analysis on the application of the CCPA to the model itself, leaving room for additional clarification and guidance.
Why Does it Matter? Practical Implications for Development and Compliance
The extent to which AI models, either independently or as part of AI systems, contain personal information could have significant implications for organizations’ obligations under privacy laws. If personal information exists primarily in the training, input, and output of a generative AI system, but not “within” the model, organizations can implement protective measures to comply with privacy laws like the CCPA through mechanisms like suppression and de-identification. For example, a deletion request could be applied to training datasets (to the extent that the information remains identifiable), or applied to prevent the information from being generated in the model’s output through suppression filters. Some leading LLM providers already offer this option; for example, ChatGPT includes a feature allowing individuals to request to “Remove your personal data from ChatGPT model outputs.”
However, if there were a legal interpretation that personal information can exist within a model, then suppression of a model’s output would not necessarily be sufficient to comply with a deletion request. While a suppression mechanism may prevent information from being generated in the output stage, such an approach requires that a company retain personal information as a screening mechanism in order to effectuate the suppression on an ongoing basis. An alternative option could be for information to be “un-learned” or “forgotten,” but this remains a challenging feat given the complexity of an AI model that does not rely on traditional storage and retrieval, and the fact that models may continue to be refined over time. While researchers are beginning to address this concept, it remains at an early stage. Furthermore, there is growing research interest in architectures that separate knowledge representation from the core language model.
Other compliance operations could also look different: for example, if a model contains personal information, then the purchase or licensing of a model would have the same obligations that typically go along with purchasing or licensing large databases of personal information.
Emerging Perspectives from European Regulators and Other Experts
While regulators in the United States have mostly not yet begun to address this legal issue directly, some early views are beginning to emerge from European regulators and other experts.
Preliminary Perspectives that Large Language Models Do Not Contain Personal Information: In July 2024, the Hamburg Data Protection Authority (Hamburg DPA) in Germany released an informal discussion paper arguing that LLMs do not store personal information under the GDPR because these models do not contain any data that relates to an identified or identifiable person. According to the Hamburg DPA, the tokenization and embedding processes involved in developing LLMs transform text into “abstract mathematical representations,” losing “concrete characteristics and references to specific individuals,” and instead reflect “general patterns and correlations derived from the training data.”
Fig. 1: Example indexed values of tokens created using OpenAI’s tokenizer tool (setting: GPT-3 (Legacy)).
The Hamburg DPA added that LLM outputs “do not store the texts used for training in their original form, but process them in such a way that the training data set can never be fully reconstructed from the model.”
Similarly, a 2023 guidance document from the Danish DPA (Datatilsynet) on AI use by public authorities also assumed that an AI model “does not in itself constitute personal data, but is only the result of the processing of personal data.” The document notes that some models can be attacked in ways that re-identify individuals whose information existed in the training data. While this would be considered a data breach, this risk “does not, in the opinion of the Danish Data Protection Agency, mean that the model should be considered personal data in itself.”
The Irish Data Protection Commission’s (DPC) 2024 guidance on generative AI acknowledged that some models may unintentionally regurgitate passages of personal training data, a key point raised by the sponsor of the California law. Meanwhile, comprehensive reviews are underway, with the DPC seeking an opinion from the European Data Protection Board (EDPB) on “issues arising from the use of personal data in AI models.”
Recently, the EDPB’s ChatGPT Taskforce Report noted that the processing of personal information occurs at different stages of an LLM’s life cycle (including collection, pre-processing, training, prompts, output, and further training). While it did not address personal information within the model itself, the report emphasized that “technical impossibility cannot be invoked to justify non-compliance with GDPR requirements.”
Opposing Perspectives that LLMs Do Contain Personal Information: In contrast, in a detailed analysis of this issue, technology lawyer David Rosenthal has argued that, under the “relative approach” espoused by the Court of Justice of the European Union (CJEU Case C‑582/14), whether an LLM contains personal information should be assessed solely from the perspectives of the LLM user and the parties who have access to the output. Whether a data controller can identify a data subject based on information derived from an LLM is not material; the information is personal information as long as the data subject can be identified, or is “reasonably likely” to be identified, by the party with access, and that party has an interest in identifying the data subject. Consequently, a party that discloses information constituting personal information about a data subject is treated as disclosing personal information to a third party and must comply with the GDPR.
Conversely, if an LLM user formulates a prompt that cannot be reasonably expected to generate output relating to a specific data subject—or if those with access to the output do not reasonably have the means to identify those data subjects and lack the interest in doing so—then there is no personal information and thus data protection requirements do not apply. Other commentators who disagree with the Hamburg DPA’s discussion paper have focused on the reproducibility of training data, likening the data stored in an LLM to encrypted data.
What’s Next
Addressing these policy ambiguities will benefit from guidance and alignment between companies, researchers, policy experts, and regulators. Ultimately, it will be up to the California Attorney General and CPPA to make a determination under AB 1008 through advisories, rulemaking, or enforcement actions. Greater collaboration between regulators and technical experts may also help build a shared understanding of personal information in LLMs, non-LLM AI models, and AI systems and promote consistency in data protection regulation.
Even if California’s policy and legal conclusions ultimately differ from conclusions in other jurisdictions, building a shared (technical) understanding will help assess whether legislation like AB 1008 effectively addresses these issues and comports with existing privacy and data protection legal requirements, such as data minimization and consumer rights.
Updated February 25, 2025: FPF no longer coordinates the Multistate AI Policymaker Working Group
Please read here for more information.
Future of Privacy Forum Convened Over 200 State Lawmakers in AI Policy Working Group Focused on 2025 Legislative Sessions
The Multistate AI Policymaker Working Group (MAP-WG) was convened by FPF to help state lawmakers from more than 45 states collaborate on emerging technologies and related policy issues.
OCTOBER 21, 2024 — In the lead-up to the 2025 legislative session, FPF is excited to convene the expanded Multistate AI Policymaker Working Group (MAP-WG), a bipartisan coalition of over 200 state lawmakers from more than 45 states. This lawmaker-led initiative, facilitated by FPF, enables legislators to collaborate on developing a shared understanding of emerging technologies, particularly artificial intelligence, and to coordinate on related policy issues. In anticipation of significant state-level AI legislation in 2025, the group is expanding its efforts and launching a dedicated landing page (fpf.org/multistateAI) to centrally share its purpose, key resources, and insights with a broader audience.
In the absence of a comprehensive federal law regulating data privacy or AI, state lawmakers are quickly moving forward with their own legislation in response to the rapid advances in AI. This has led to questions about how to best craft appropriate, consistent protections for individuals. The MAP-WG seeks to navigate these challenges by fostering collaboration and promoting better understanding of AI technologies.
FPF serves as a neutral convenor and is proud to be a trusted source of nonpartisan, practical expertise and support. While the group primarily focuses on artificial intelligence, its scope extends to related areas such as data privacy, enforcement, regulation, AI workforce development, and combating non-consensual intimate images.
“To foster better communication, idea-sharing, and support among state legislators, FPF is excited to convene and assist this group,” said Tatiana Rice, FPF’s Deputy Director for U.S. Legislation. “As a forum that brings together industry, academics, consumer advocates, and civil society to discuss emerging technologies and privacy protections, FPF is uniquely positioned to support the group’s mission of promoting the safe and equitable use of AI. We are thrilled to now provide a public resource dedicated to this vital collaboration.”
Participation in the MAP-WG is open to any U.S. state-level senator, representative, or other public official, as well as their current staff members. FPF serves as a neutral facilitator of the group. Meetings include open sessions with the participation of outside experts, and closed sessions reserved for lawmakers and staff.
The MAP-WG’s bipartisan steering committee, chaired by Connecticut Senator James Maroney (D), collaborates to decide the topics and agenda.
Other members of the lawmaker steering committee include:
Alaska Senator Shelley Hughes (R)
California Privacy Protection Agency Deputy Director of Policy and Legislation Maureen Mahoney
Colorado Senator Robert Rodriguez (D)
Florida Representative Fiona McFarland (R)
Maryland Senator Katie Fry Hester (D)
Minnesota Representative Steve Elkins (D)
Minnesota Representative Kristin Bahner (D)
New York Senator Kristen Gonzalez (D)
Texas Representative Giovanni Capriglione (R)
Virginia Delegate Michelle Maldonado (D)
Learn more about the work of the Multistate AI Policymaker Working Group here.
In recent weeks, some critics opposed to state AI legislation have made inaccurate claims about FPF on social media and other forums. I am writing to set the record straight.
FPF is an independent think tank that works with companies and policymakers to help find pragmatic solutions to support the benefits of emerging technologies, including AI. We facilitate expert, bipartisan conversations, and we are optimistic about the benefits of data use when safeguarded by pragmatic protections. FPF does not work to import European-style regulation into US states. We agree with Vice President JD Vance’s recent remarks that this model will not work for the United States. American leadership on AI is necessary, with US companies succeeding and winning against China.
FPF is not partisan. Our credibility as an honest convenor and thought leader is crucial to serving as a bridge between lawmakers, companies, and other experts. We are proud of the fact that over 90% of our funding comes from the private sector. The government grants referenced in a recent op-ed do not fund the Multistate AI Policymaker Working Group. These contracts support training in support of cross-border data flows and efforts to advance privacy enhancing technologies, which are often essential to accessing data sets that are otherwise unavailable. FPF has neither sought nor received grant support for our work facilitating the multistate group or for other work regarding state legislation.
The multistate group is bipartisan. Lawmakers drive the agenda and discussion, and FPF serves as a neutral facilitator. We did not play any role in the Texas AI bill sponsored by Rep. Giovanni Capriglione (R-TX), nor have we taken a position on it. In fact, Rep. Capriglione worked with the Texas Public Policy Foundation on crafting the bill — a well-known conservative group that was previously led by President Donald J. Trump’s Agriculture Secretary Brooke Rollins.
At a recent Texas Public Policy Foundation event, Rep. Capriglione described the extensive local engagement process and local hearings he held to take input as he drafted a bill to respond to his Texas stakeholders’ concerns about AI. While lawmakers may choose to pursue different paths to legislating, the multistate group created a platform to find common ground about AI, mitigating a potential patchwork of conflicting and discordant AI laws that would be confusing for consumers and difficult to comply with. This is particularly important in the absence of a federal standard.
FPF greatly values the trust of stakeholders across the political spectrum. We do not want misperceptions about the Multistate AI Policymaker Working Group to create any confusion about the larger mission of FPF and our non-partisan role. As a result, we will be withdrawing from our work supporting the Working Group. We believe that Republicans and Democrats can find common ground and solutions to address issues raised by emerging technologies. FPF will continue to be non-partisan and fully transparent in our work.
Synthetic Content: Exploring the Risks, Technical Approaches, and Regulatory Responses
Today, the Future of Privacy Forum (FPF) released a new report, Synthetic Content: Exploring the Risks, Technical Approaches, and Regulatory Responses, which analyzes the various approaches being pursued to address the risks associated with “synthetic” content – material produced by generative artificial intelligence (AI) tools. As more people use generative AI to create synthetic content, civil society, media, and lawmakers are paying greater attention to some of the risks—such as disinformation, fraud, and abuse. Legislation to address these risks has focused primarily on disclosing the use of generative AI, increasing transparency around generative AI systems and content, and placing limitations on certain synthetic content. However, while these approaches may address some challenges with synthetic content, each one is individually limited in its reach and implicates a number of tradeoffs that policymakers should address going forward.
Synthetic content can raise a number of risks, including risks related to political disinformation and misinformation, fraud, and non-consensual intimate imagery (NCII) and child sexual abuse material (CSAM).
Policymakers and others are exploring various technical, organizational, and legal approaches to addressing synthetic content’s risks, such as requiring authentication techniques and placing limitations on certain uses of synthetic content.
Current approaches to regulating synthetic content may face a number of limitations and tradeoffs, including with privacy and security, and policymakers should evaluate the potential implications of these approaches.
This report is based on an extensive survey of existing technical and policy literature, recently-proposed and/or enacted legislation, and emerging regulatory guidance and rulemaking. The appendix provides further details about the current major legislative and regulatory frameworks being proposed in the U.S. regarding synthetic content.
This report is part of a larger, ongoing FPF effort to monitor and analyze emerging trends in synthetic content, including its potential risks, technical developments, and relevant legislation and regulation. For previous FPF work on this issue, check out the following:
Comment to the Federal Communications Commission (FCC) on disclosure and transparency of AI-generated content in political advertisements.
Comment to the National Institute of Standards & Technology (NIST) in response to NIST AI 100-4, “Reducing Risks Posed by Synthetic Content: An Overview of Technical Approaches to Digital Content Transparency.”
Comment to the Federal Trade Commission (FTC) on AI-driven impersonation.
Comment to the Federal Election Commission (FEC) on “fraudulent misrepresentation” in AI-generated political campaign ads.
One-pager analyzing California’s new AI Transparency Act (SB 942), which requires certain disclosures for AI-generated content (FPF members only).
Briefing analyzing general current legislative approaches to synthetic content (FPF members only).
If you would like to speak with us about this work, or about synthetic content more generally, please reach out to Jameson Spivack ([email protected]).
Out, Not Outed: Privacy for Sexual Health, Orientations, and Gender Identities
Co-authored by: Judy Wang (FPF Intern), Jeter Sison (FPF Intern), Jordan Wrigley (FPF Data and Policy Analyst, Health & Wellness)
On National Coming Out Day, it’s important to recognize that Coming Out is a rite of passage for many LGBTQ+ individuals and a decision that they should be empowered to make for themselves.
Protections for health information are essential to ensuring the autonomy of an individual in choosing how to come out, to whom, and when. A person’s health information may, on its face, reveal their sexual orientation and gender identity (“SOGI”). Alternatively, a person’s health information may not specifically include SOGI information, but SOGI information may be inferred or extrapolated from other health information, especially in a personal profile that includes many different data points.
Opportunities for Health Data and Services for LGBTQ+ Individuals
While uses of health data may carry heightened risk for LGBTQ+ individuals, it is also particularly critical for those same individuals to have access to safe, secure, and practicable physical and mental health services. A poll by LGBT Tech shows LGBTQ+ individuals use online and digital health resources extensively to navigate information and access to healthcare. The National Coalition for LGBTQ Health has found that “LGBTQ people are more likely to report poor physical and mental health than the general population, including increased incidence of HIV and other sexually transmitted infections (STIs), long term conditions such as arthritis and chronic fatigue, and elevated risk of depression, anxiety, and other mental illness.” In addition, LGBTQ+ youth have been found to face heightened risks for mental health issues, and are 300% more likely to suffer from symptoms of depression. Unfortunately, at the same time, LGBTQ+ adults are twice as likely to report having experienced a negative health care interaction.
To address these health disparities, researchers have stressed the importance of collecting additional health information, including SOGI information. In addition, where LGBTQ+ individuals do not have local access to equitable health care resources, connected devices and services may provide new capacity for individuals to obtain valuable information about their health questions, engage with healthcare professionals, and even receive important physical and mental treatment.
Specific categories of tech services and applications have been found to play an increased role in addressing the health and wellbeing needs of LGBTQ+ individuals. Some of these include:
Mental health services: Through social media and mobile applications, the internet has been able to help connect LGBTQ+ people to accessible and appropriate LGBTQ+ mental health care. Platforms like National Queer & Trans Therapists of Color Network and LGBTQ Therapy Space connect LGBTQ+ people with specialized mental health care, resources, and community support. Mental health apps oriented towards the LGBTQ community provide a virtual format for accessing specialized, culturally competent healthcare.
Gender-affirming online care: With half of all U.S. states having passed bans on gender-affirming care, now more than ever, safe and private ways to access gender-affirming care online are needed to protect the rights of transgender Americans. 39% of U.S. transgender youth live in states that have passed bans on gender-affirming care, preventing them from accessing medically necessary care and worsening the stigma and discrimination against all transgender youth and people.
Dating & connections services: Dating and connections apps often play a crucial role in connecting LGBTQ+ users, especially in situations where it may be unsafe to do so in person. About 50% of lesbian, gay, and bisexual adults report using dating apps, compared to 28% of heterosexual adults. These apps have also provided an essential way for LGBTQ+ people to share and connect with others who face similar health challenges. For instance, HIV-positive LGBTQ+ people may use dating services to connect with other people living with HIV, a community of people who often face stigma while dating and in general.
Health Data Risks for LGBTQ+ People
While online health and wellbeing services can offer a lifeline to many LGBTQ+ individuals, they can also create new risks. For instance, improper uses of health information related to sexuality may undermine the autonomy of an LGBTQ+ individual. According to an article in the Oregon Law Review, data usage related to sexuality can result in pop-ups on an individual’s phone, creating worry and concern about potentially outing an individual in public or quasi-public situations. LGBTQ+ individuals are not always able to be safely out in public settings, with potential harms including impacts on employment, loss of housing opportunities, or jeopardization of personal relationships.
These risks have increased as certain U.S. states have pursued and enacted laws to undermine the rights and freedoms of LGBTQ+ people and communities. Efforts to restrict gender-affirming care have gained traction in several states, and lawmakers in Oklahoma, Texas, and South Carolina have proposed legislation to prohibit such care for transgender individuals up to the age of 26. Additionally, multiple states have implemented policies barring the use of Medicaid or other state-sponsored insurance for gender-affirming treatments, regardless of the patient’s age.
New Legal Protections for Health Data
When collected inside a health care environment, such as a telehealth service, health data, including SOGI information, is subject to protections under the Health Insurance Portability and Accountability Act (HIPAA) and related laws and regulations. However, these protections do not extend to similar information collected outside of that context. Filling that gap, eighteen U.S. states (as of this writing) have passed comprehensive privacy laws that apply more broadly, and each of them includes “sexual orientation” as a category of sensitive data or sensitive personal information subject to special protections. However, of these 18 laws, only one (Maryland’s) affirmatively includes “gender-affirming treatment” in its scope of sensitive information, although three (Oregon’s, Delaware’s, and New Jersey’s) explicitly include “status as transgender or non-binary.”
In addition to comprehensive privacy laws, two states have implemented health data privacy laws – Washington (My Health, My Data Act (MHMD)) and Nevada (SB 370). Both laws define “consumer health data” and include a non-exhaustive list of qualifying categories of information, including “gender-affirming care” information. Additionally, both definitions extend to “health information derived or inferred from non-health data,” which could cover data uses that unintentionally reveal information about a user’s sexual orientation or gender identity.
At the federal level, the Biden Administration has initiated the Federal Evidence Agenda on LGBTQI+ Equity, directing the collection of SOGI data in federal surveys and forms. The Agenda recommends implementing appropriate security and privacy safeguards in “Guideline 1: Ensure relevant data are collected and privacy protections are properly applied.” However, without comprehensive privacy legislation at the federal level, there is little guidance on how to collect and share such data while protecting user privacy.
Additionally, in 2024, the U.S. Department of Health and Human Services (HHS) issued a rule designed to prevent discrimination based on gender identity or sexual orientation by healthcare providers and insurers that receive federal funding. However, a federal judge in Mississippi recently issued a preliminary injunction against the rule, citing the Supreme Court’s decision overturning Chevron deference to administrative agencies. A recent Supreme Court decision also blocked enforcement of a new Biden Administration rule that would protect transgender students from discrimination in education; that rule included sexual orientation and gender identity within its discrimination protections for the first time.
Current Best Practices and Recommendations
It is imperative for the collectors of health data about LGBTQ+ individuals, and in particular sexual orientation and gender identity (SOGI) data, to work toward a safer and more equitable use of SOGI data with meaningful privacy safeguards in mind.
SOGI data is inherently complex. Given its revelatory nature, SOGI data should be treated with heightened sensitivity. In 2022, FPF and LGBT Tech published a report on the Role of Data Protection in Safeguarding Sexual Orientation and Gender Identity Information. Organizations should apply regulatory safeguards robustly and approach SOGI information with care and respect for its contextual sensitivity. For instance, organizations can require consent for the collection and use of SOGI data and consider limitations before sending data to third parties, particularly if doing so would make that data publicly accessible in a way that could increase the risk of outing someone. Organizations should also implement security and privacy safeguards proportionate to the sensitivity of the underlying data, keeping in mind that sensitivity should be assessed both at the individual level and in the community context.
Additional reports can deepen understanding of the challenges and risks associated with the collection and use of health information about LGBTQ+ people. These include a 2024 report by LGBT Tech recommending that platforms adopt multifaceted strategies that prioritize user protection, inclusivity, and community empowerment. Also relevant is a 2022 research brief from the Center for Democracy & Technology explaining how LGBTQ+ students are increasingly targeted by policies and practices that threaten their privacy in schools, with 29 percent of LGBTQ+ students reporting that they or someone they know has been outed by school-sponsored monitoring technology.
Conclusion
Coming Out Day is about having control over information. The decision to come out should be up to the discretion of the individual coming out. On Coming Out Day, and every day, robust privacy and data protection rules, policies, and practices are crucial to empower LGBTQ+ people to decide when and where to share their SOGI data.
FPF Analysis of New Requirements for Generative AI Use by Healthcare Entities in Patient Communications
Co-Authored by Judy Wang, FPF Communications Intern
On September 28, Governor Gavin Newsom signed California AB 3030, among a host of AI bills. CA AB 3030 amends the California Health & Safety Code to require specified healthcare entities to disclose the use of generative artificial intelligence (AI) in provider-patient communications through visual or verbal disclaimers presented before, during, and/or at the end of a communication.
A new chart from FPF provides a “cheatsheet” of the covered entities and activities, as well as requirements based on the relevant type of communication. The chart also includes information about the obligations of health facilities to which the law applies and enforcement provisions incorporated from the broader legislative package.
Emphasis on transparency and indirect obligations for developers: CA AB 3030 places transparency obligations on those who deploy generative AI; however, it may also indirectly impose obligations on upstream developers as they create features for healthcare generative AI tools.
A long list of covered entity definitions: The law adopts definitions from the California Health & Safety Code for many covered entities, including health facilities, clinics, and group practice offices. These entities include most clinical, treatment, and care contexts.
A narrow, use-based scope: The scope of the law includes covered entities that use “generative artificial intelligence to generate written or verbal patient communications about patient clinical information.”
Codifies human involvement, which will help avoid “doom loops”: In addition to the transparency requirements discussed above, the law imposes a second requirement focused on human-in-the-loop approaches.
Interested in getting a full analysis of AB 3030? Become an FPF member to access the complete analysis and breakdown of California AB 3030 in the member portal.
FPF Submits Comments to Inform New York Children’s Privacy Rulemaking Processes
At the end of the 2024 legislative session, New York State passed a pair of bills aimed at creating heightened protections for children and teens online. One, the New York Child Data Protection Act (NYCDPA), applies to a broad range of online services that are “primarily directed to children.” The NYCDPA creates novel substantive data minimization requirements, restricts the sale of children’s data (with children defined as under 18), and requires businesses to respect “age flags” – new device signals intended to convey whether a user is a minor. The second law, the Stop Addictive Feeds Exploitation (SAFE) for Kids Act, is more narrowly focused on social media platforms and restricts minors’ access to content presented through “addictive feeds.”
On September 30, the Future of Privacy Forum (FPF) filed comments with the New York Office of the Attorney General (OAG) to inform forthcoming rulemaking for the implementation of these two frameworks. While raising protections for youth privacy and online safety has been a priority for lawmakers over the last several years, New York’s two new laws each take unique approaches. FPF’s comments seek to ensure that New York’s protections for youth online can protect and empower minors while supporting interoperability with existing state and federal privacy frameworks.
New York Child Data Protection Act
The NYCDPA creates new restrictions on the collection and processing of the personal data of teen users who are outside the scope of the federal Children’s Online Privacy Protection Act (COPPA). Under the law, a covered business must obtain “informed consent” to use teen data, or the processing must be strictly necessary to meet one of nine enumerated “permissible purposes,” such as conducting internal business operations or preventing cybersecurity threats. The requirement that, in the absence of informed consent, processing must be strictly necessary for minors 13 and older could be stricter than COPPA standards, especially with respect to many digital advertising practices. The law also restricts the sale of minors’ personal data, including permitting a third party to sell that data.
Obtaining “informed consent” under the NYCDPA requires satisfying a number of conditions, some of which diverge from comparable privacy regimes. Consent must be made separately from any other transaction or part of a transaction; be made in the absence of any ‘dark patterns;’ clearly and conspicuously state that the processing for which consent is requested is not strictly necessary and that the minor may decline without preventing continued use of the service; and clearly present an option to refuse to provide consent as the most prominent option.
The NYCDPA is also unique in providing for the use of device signals to transmit legally binding information about a user’s age and their informed consent choices. Such technologies are not commonplace in the market and raise a number of both technical and policy questions and challenges.
With these unique provisions in mind, FPF’s comments recommend that the OAG:
1. Consider existing sources of law, including the COPPA Rule’s internal operations exception, state privacy laws, and the GDPR to provide guidance on the scope of “permissible processing” activities;
2. Where appropriate, align core privacy concepts with the developing state comprehensive privacy landscape, including the definition of “personal information” and opportunities for data sharing for research;
3. Consult with the New York State Education Department to ensure alignment with New York’s existing student privacy laws and implementing regulations, to avoid disruption to schools’ and students’ access to and use of educational products and services;
4. Mitigate privacy, technical, and practical implementation concerns with “age flags” by further consulting with stakeholders and establishing baseline criteria for qualifying signals. FPF offers technical and policy considerations the OAG should consider in furthering this emerging technology; and
5. Explicitly distinguish informed consent device signals from “age flags”, given that providing consent at scale raises a separate set of challenges and may undermine the integrity of the NYCDPA’s opt-in consent framework.
Stop Addictive Feeds Exploitation (SAFE) for Kids Act
The SAFE for Kids Act restricts social media platforms from offering “addictive feeds” unless the service has conducted “commercially reasonable” age verification to determine that a user is over 17 years of age, or the service has obtained verifiable parental consent (VPC). The legislative intent makes clear that ordering content in a chronological list would not be considered an “addictive feed.” A social media platform will also need to obtain VPC to send minors notifications concerning an addictive feed between the hours of midnight and 6 am.
“Addictive feeds” are broadly defined as feeds in which user-generated content is recommended, selected, or prioritized based, in whole or in part, on user data. There are six carve-outs to the definition of “addictive feed,” such as displaying or prioritizing content that was specifically and unambiguously requested by the user, or displaying content in response to a specific search inquiry from the user.
Notably, the SAFE for Kids Act focuses on parent consent for teens to receive “addictive feeds.” In contrast, teens are empowered by the Child Data Protection Act to provide informed consent for a broad range of activities. The divergence in policy approaches between these two laws regarding who can provide consent for a teen using a service may lead to challenges in understanding individual rights and protections.
Given the critical role of age verification and parental consent within the SAFE for Kids Act, FPF’s comments to the OAG highlight the considerations, risks, and benefits of various methods for conducting age assurance and obtaining parental consent. In particular, we note that:
1. There are three primary categories of age assurance in the United States: age declaration, age estimation, and age verification. Each method has its own challenges and risks that should be carefully balanced across the state interest in protecting minors online, the state of current technologies, and end-user realities when developing age verification standards.
2. When exploring appropriate methods for providing verifiable parental consent, the OAG should consider the known problems, concerns, and friction points that already exist with the existing verifiable parental consent framework under COPPA.
3. Strong data minimization, use limitations, and data retention standards could enhance data protection and user trust in age assurance and VPC requirements.
Regulatory Strategies of Data Protection Authorities in the Asia-Pacific Region: 2024, and Beyond
The Asia-Pacific (APAC) region has emerged as a dynamic and rapidly evolving landscape for data protection regulation. As digital economies flourish and cross-border data flows intensify, data protection authorities (DPAs) across the region are grappling with complex challenges posed by technological advancements, changing business practices, and evolving societal expectations regarding privacy.
This Report provides a comprehensive analysis of strategy documents and key regulatory actions of the DPAs in 10 jurisdictions, published or developed in 2023 and 2024, setting out regulatory priorities for the following years:
1. Australia
2. China
3. Hong Kong, Special Administrative Region of China (SAR)
The first part provides an overview of key trends in the APAC region and identifies priority areas and future initiatives that APAC DPAs indicate they will focus on in the years to come.
The second part provides a brief profile of each DPA and summarizes its regulatory actions for the 2023-2024 period, as well as any key strategy documents available.
Our analysis provides insights into how these DPAs have been working to implement their strategic priorities throughout 2023 and 2024. To the extent possible, the analysis in this Report is based on official strategy documents – that is, master plans, statements of regulatory priorities, annual reports, and the like – published by these DPAs between 2023 and 2024, supplemented by an examination of significant regulatory actions taken by the DPAs during this period.
While we offer a thorough examination of recent and ongoing initiatives, it is important to note that the data protection landscape is dynamic and rapidly evolving. Therefore, this report not only serves as a retrospective overview but also aims to highlight prospective directions that DPAs may pursue in 2025 and beyond. By highlighting the trajectory of these regulatory bodies, we hope that this Report will aid readers in anticipating potential developments in data protection regulation and enforcement across the region. However, readers should bear in mind that unforeseen technological advancements, geopolitical shifts, or other factors may influence future regulatory approaches in ways that cannot be fully predicted at the time of publication.
The Report recognizes that each jurisdiction faces unique challenges, operates within distinct legal and cultural contexts, and may prioritize different aspects of data protection based on its specific circumstances. The Report is therefore not intended to make value judgments on DPAs, rank them, or evaluate their effectiveness in key areas. Rather, our aim is to identify commonalities and divergences in the DPAs’ priorities and approaches, in order to shed light on key trends in the APAC region. We hope that these insights will prove useful to policymakers, businesses, and data protection and privacy professionals as they navigate the APAC region’s complex data protection landscape.
To ensure a comprehensive and accurate understanding of this Report’s scope and methodology, readers should note the following key considerations:
Not all the above DPAs consistently publish official strategy documents. Where a given DPA has not published a strategy document for the period of 2023-2024, the Report’s analysis infers the relevant DPA’s priorities from its regulatory actions.
Not all the above DPAs provide official documents and information in English. Where official English language translations of relevant documents and information are unavailable, we have worked from machine translations.
Our analysis focuses primarily on the DPAs’ strategies and priorities regarding the private sector. While public sector data protection is an important area, it often raises distinct considerations that are beyond the scope of this Report.
Analysis of key strategic documents and recent regulatory actions across the 10 APAC DPAs reveals several common priorities for 2024 and beyond.
Cybersecurity and data breach response emerged as the most widespread priority, with 90% of the DPAs prioritizing efforts to combat cyber threats and enhance organizational readiness for data breaches. This reflects the growing frequency and sophistication of cyber attacks across the region and globally.
Cross-border data transfers were a key priority for 80% of the DPAs, highlighting the increasing importance of facilitating secure international data flows in an interconnected digital economy.
AI governance and regulation was a key focus for 70% of the DPAs, as authorities grappled with the rapid advancement and adoption of AI technologies, particularly generative AI, in recent years.
Regulation of the use of biometric data, including facial recognition technology (FRT), was prioritized by 60% of DPAs, indicating growing concerns about the privacy implications of these technologies.
Finally, 50% of DPAs emphasized the protection of children’s personal data, recognizing the unique needs of young people in digital environments.
Updated FPF Infographic Explores Data in Connected Vehicles
Today, The Future of Privacy Forum is launching the Data and the Connected Vehicle Infographic 2.0, including new updates to account for the types of data associated with connected vehicles, features in and outside of the vehicle, and data handlers who receive and process data. Lawmakers, manufacturers, privacy professionals, and consumers are actively engaged in work to examine and respond to privacy and transparency practices related to personal data collected in and around vehicles. The updated infographic provides a visual representation of where the data flows within the connected vehicle ecosystem.
In 2017, FPF launched the first vehicle infographic, “Data and the Connected Car.” FPF’s continued work on connected vehicles has built upon this initial product, providing additional resources, up to and including the Vehicle Safety Systems Privacy Risks and Recommendations report from March 2024. That report highlights the privacy risks that can arise when new technology is incorporated into vehicles, whether by requirement or by choice. FPF has also submitted comments to the Department of Transportation regarding the privacy implications of future technologies, including the use of AI in transportation.
The updated infographic highlights three specific areas within the connected vehicle ecosystem:
1. Types of Data in the Vehicle include vehicle and safety data, occupant data, location data, account data, and biometric and body-related data. Artificial intelligence is likely to be present in various features and functions throughout the vehicle. Understanding the types and categories of data associated with connected vehicles is essential for regulating data and increasing privacy literacy among individual drivers and passengers. Some data, like operational information or data on engine health, is integral to the vehicle’s functions, while other data can be user-generated and intended for personalization or driver assistance, such as GPS navigation and smartphone integration.
2. Features Inside and Outside the Vehicle include technologies such as infotainment systems, event data recorders, and tire sensors. Additional novel technologies may be more commonly incorporated into vehicles in the future. Some of the vehicle technologies may be added after-market by individuals or are specific to a certain vehicle make and model, such as keyless entry, augmented reality displays, or external charging. Certain vehicle features may be governed by specific requirements and rules according to state and federal regulations. In addition, manufacturers are increasingly incorporating certain technologies specifically in response to emerging regulatory requirements. An increase in technology and data collection can increase the privacy risk associated with the vehicle.
3. Data Receivers or Data Handlers are entities that collect and control the flow of data from inside and outside the vehicle for various purposes, including performance and safety. Once the data is collected, its transfer and use can depend on a number of factors, including agreements with the manufacturer, third parties and service providers, emergency services, and external infrastructure such as traffic lights and automatic license plate readers. Manufacturers may receive vehicle and safety data, location data, account data, occupant data, and biometric or body-related data (depending on the technology incorporated into the vehicle). Third parties and service providers may also receive information about the vehicle and potentially about the user. Third parties in the connected vehicle ecosystem include insurance companies, dealerships and service centers, and entities that provide in-vehicle services through the infotainment system. Notice to individuals should explain when data is required for the vehicle to function or for important safety or regulatory requirements.
Individuals should feel physically and digitally safe in their vehicles. In 2023, FPF conducted a survey in which consumers indicated that transparency is important to trust in, and adoption of, in-vehicle technologies intended to increase safety. This updated infographic provides that transparency through a visual demonstration of how technology is used in a vehicle and where personal data may be implicated. It can also serve as a resource for policymakers who need to understand the ecosystem in order to regulate effectively. As vehicle privacy continues to be top of mind, the updated FPF infographic serves to improve understanding and provide the transparency needed for a trusted mobility ecosystem.