Future of Privacy Forum Launches the FPF Center for Artificial Intelligence

The FPF Center for Artificial Intelligence will serve as a catalyst for AI policy and compliance leadership globally, advancing responsible data and AI practices for public and private stakeholders

Today, the Future of Privacy Forum (FPF) launched the FPF Center for Artificial Intelligence, established to better serve policymakers, companies, non-profit organizations, civil society, and academics as they navigate the challenges of AI policy and governance. The Center will expand FPF’s long-standing AI work, introduce large-scale novel research projects, and serve as a source for trusted, nuanced, nonpartisan, and practical expertise. 

FPF’s Center work will be international in scope, as AI continues to be deployed globally and at a rapid pace. Cities, states, countries, and international bodies are already grappling with implementing laws and policies to manage the risks. “Data, privacy, and AI are intrinsically interconnected issues that we have been working on at FPF for more than 15 years, and we remain dedicated to collaborating across the public and private sectors to promote their ethical, responsible, and human-centered use,” said Jules Polonetsky, FPF’s Chief Executive Officer. “But we have reached a tipping point in the development of the technology that will affect future generations for decades to come. At FPF, the word Forum is a core part of our identity. We are a trusted convener positioned to build bridges between stakeholders globally, and we will continue to do so under the new Center for AI, which will sit within FPF.”

The Center will help the organization’s 220+ members navigate AI through the development of best practices, research, legislative tracking, thought leadership, and public-facing resources. It will be a trusted evidence-based source of information for policymakers, and it will collaborate with academia and civil society to amplify relevant research and resources. 

“Although AI is not new, we have reached an unprecedented moment in the development of the technology that marks a true inflection point. The complexity, speed and scale of data processing that we are seeing in AI systems can be used to improve people’s lives and spur a potential leapfrogging of societal development, but with that increased capability comes associated risks to individuals and to institutions,” said Anne J. Flanagan, Vice President for Artificial Intelligence at FPF. “The FPF Center for AI will act as a collaborative force for shared knowledge between stakeholders to support the responsible development of AI, including its fair, safe, and equitable use.”

The Center will officially launch at FPF’s inaugural summit, DC Privacy Forum: AI Forward. The in-person, public-facing summit will feature high-profile representatives from the public and private sectors in the world of privacy, data, and AI.

FPF’s new Center for Artificial Intelligence will be supported by a Leadership Council of experts from around the globe. The Council will consist of members from industry, academia, civil society, and current and former policymakers.

See the full list of founding FPF Center for AI Leadership Council members here.

“I am excited about the launch of the Future of Privacy Forum’s new Center for Artificial Intelligence and honored to be part of its leadership council. This announcement builds on many years of partnership and collaboration between Workday and FPF to develop privacy best practices and advance responsible AI, which has already generated meaningful outcomes, including last year’s launch of best practices to foster trust in this technology in the workplace. I look forward to working alongside fellow members of the Council to support the Center’s mission to build trust in AI and am hopeful that together we can map a path forward to fully harness the power of this technology to unlock human potential.”

Barbara Cosgrove, Vice President, Chief Privacy Officer, Workday

“I’m honored to be a founding member of the Leadership Council of the Future of Privacy Forum’s new Center for Artificial Intelligence. AI’s impact transcends borders, and I’m excited to collaborate with a diverse group of experts around the world to inform companies, civil society, policymakers, and academics as they navigate the challenges and opportunities of AI governance, policy, and existing data protection regulations.”

Dr. Gianclaudio Malgieri, Associate Professor of Law & Technology at eLaw, University of Leiden

“As we enter this era of AI, we must require the right balance between allowing innovation to flourish and keeping enterprises accountable for the technologies they create and put on the market. IBM believes it will be crucial that organizations such as the Future of Privacy Forum help advance responsible data and AI policies, and we are proud to join others in industry and academia as part of the Leadership Council.”

Learn more about the FPF Center for AI here.

About Future of Privacy Forum (FPF)

The Future of Privacy Forum (FPF) is a global non-profit organization that brings together academics, civil society, government officials, and industry to evaluate the societal, policy, and legal implications of data use, identify the risks, and develop appropriate protections. 

FPF believes technology and data can benefit society and improve lives if the right laws, policies, and rules are in place. FPF has offices in Washington D.C., Brussels, Singapore, and Tel Aviv. Learn more at fpf.org.

FPF Develops Checklist & Guide to Help Schools Vet AI Tools for Legal Compliance

FPF’s Youth and Education team has developed a checklist and accompanying policy brief to help schools vet generative AI tools for compliance with student privacy laws. Vetting Generative AI Tools for Use in Schools is a crucial resource as the use of generative AI tools continues to increase in educational settings. It is critical for school leaders to understand how existing federal and state student privacy laws, such as the Family Educational Rights and Privacy Act (FERPA), apply to the complexities of machine learning systems so that they can protect student privacy. With these resources, FPF aims to provide much-needed clarity and guidance to educational institutions grappling with these issues.

Click here to access the checklist and policy brief.

“AI technology holds immense promise in enhancing educational experiences for students, but it must be implemented responsibly and ethically,” said David Sallay, the Director for Youth & Education Privacy at the Future of Privacy Forum. “With our new checklist, we aim to empower educators and administrators with the knowledge and tools necessary to make informed decisions when selecting generative AI tools for classroom use while safeguarding student privacy.”

The checklist, designed specifically for K-12 schools, outlines key considerations for incorporating generative AI into a school or district’s edtech vetting process.

These include: 

By prioritizing these steps, educational institutions can promote transparency and protect student privacy while maximizing the benefits of technology-driven learning experiences for students. 

The in-depth policy brief outlines the relevant laws and policies a school should consider, the unique compliance considerations of generative AI tools (including data collection, transparency and explainability, product improvement, and high-risk decision-making), and their most likely use cases (student, teacher, and institution-focused).

The brief also encourages schools and districts to update their existing edtech vetting policies to address the unique considerations of AI technologies (or to create a comprehensive policy if one does not already exist) instead of creating a separate vetting process for AI. It also highlights the role that state legislatures can play in ensuring the efficiency of school edtech vetting and oversight and calls on vendors to be proactively transparent with schools about their use of AI.


Check out the LinkedIn Live with CEO Jules Polonetsky and Youth & Education Director David Sallay about the Checklist and Policy Brief.

To read more of the Future of Privacy Forum’s youth and student privacy resources, visit www.StudentPrivacyCompass.org

FPF Releases “The Playbook: Data Sharing for Research” Report and Infographic

Today, the Future of Privacy Forum (FPF) published “The Playbook: Data Sharing for Research,” a report on best practices for instituting research data-sharing programs between corporations and research institutions. FPF also developed a summary of recommendations from the full report.

Facilitating data sharing for research purposes between corporate data holders and academia can unlock new scientific insights and drive progress in public health, education, social science, and a myriad of other fields for the betterment of broader society. Academic researchers use this data to consider consumer, commercial, and scientific questions at a scale they cannot reach using conventional research data-gathering techniques alone. Such data has also helped researchers answer questions on topics ranging from bias in targeted advertising and the influence of misinformation on election outcomes to early diagnosis of diseases through data collected by fitness and health apps.

The Playbook addresses vital steps for data management, sharing, and program execution between companies and researchers. Creating a data-sharing ecosystem that positively advances scientific research requires a better understanding of the established risks, opportunities to address challenges, and the diverse stakeholders involved in data-sharing decisions. This report aims to encourage safe, responsible data sharing between industry and researchers.

“Corporate data sharing connects companies with research institutions, by extension increasing the quantity and quality of research for social good,” said Shea Swauger, Senior Researcher for Data Sharing and Ethics. “This Playbook showcases the importance, and advantages, of having appropriate protocols in place to create safe and simple data sharing processes.”

In addition to the Playbook, FPF created a companion infographic summarizing the benefits, challenges, and opportunities of data sharing for research outlined in the larger report.


As a longtime advocate for facilitating the privacy-protective sharing of data by industry to the research community, FPF is proud to have created this set of best practices for researchers, institutions, policymakers, and data-holding companies. In addition to the Playbook, the Future of Privacy Forum has also opened nominations for its annual Award for Research Data Stewardship.

“Our goal with these initiatives is to celebrate the successful research partnerships transforming how corporations and researchers interact with each other,” Swauger said. “Hopefully, we can continue to engage more audiences and encourage others to model their own programs with solid privacy safeguards.”

Shea Swauger, Senior Researcher for Data Sharing and Ethics, Future of Privacy Forum

Established by FPF in 2020 with support from The Alfred P. Sloan Foundation, the Award for Research Data Stewardship recognizes excellence in the privacy-protective stewardship of corporate data shared with academic researchers. The call for nominations is open and closes on Tuesday, January 17, 2023. To submit a nomination, visit the FPF site.

FPF has also launched a newly formed Ethics and Data in Research Working Group; this group receives late-breaking analyses of emerging US legislation affecting research and data, meets to discuss the ethical and technological challenges of conducting research, and collaborates to create best practices to protect privacy, decrease risk, and increase data sharing for research, partnerships, and infrastructure. Learn more and join here.

FPF Testifies Before House Energy and Commerce Subcommittee, Supporting Congress’s Efforts on the “American Data Privacy and Protection Act”

This week, FPF’s Senior Policy Counsel Bertram Lee testified before the U.S. House Energy and Commerce Subcommittee on Consumer Protection and Commerce at its hearing, “Protecting America’s Consumers: Bipartisan Legislation to Strengthen Data Privacy and Security,” regarding the bipartisan, bicameral privacy discussion draft, the “American Data Privacy and Protection Act” (ADPPA). FPF has a history of supporting the passage of a comprehensive federal consumer privacy law, which would provide businesses and consumers alike with the benefit of clear national standards and protections.

Lee’s testimony opened by applauding the Committee for its efforts toward comprehensive federal privacy legislation and emphasized that the “time is now” for its passage. As written, the ADPPA would address gaps in the sectoral approach to consumer privacy, establish strong national civil rights protections, and establish new rights and safeguards for the protection of sensitive personal information.

“The ADPPA is more comprehensive in scope, inclusive of civil rights protections, and provides individuals with more varied enforcement mechanisms in comparison to some states’ current privacy regimes,” Lee said in his testimony. “It also includes corporate accountability mechanisms, such as requiring privacy designations, data security officers, and executive certifications showing compliance, which are missing from current state laws. Notably, the ADPPA also requires ‘short-form’ privacy notices to inform consumers of how their data will be used by companies and of their rights — a provision that is not found in any state law.”

Lee’s testimony also provided four recommendations to strengthen the bill, which include: 

Many of the recommendations would ensure that the legislation gives individuals meaningful privacy rights and places clear obligations on businesses and other organizations that collect, use and share personal data. The legislation would expand civil rights protections for individuals and communities harmed by algorithmic discrimination as well as require algorithmic assessments and evaluations to better understand how these technologies can impact communities. 

The submitted testimony and a video of the hearing can be found on the House Committee on Energy & Commerce site.

Reading the Signs: The Political Agreement on the New Transatlantic Data Privacy Framework

The President of the United States, Joe Biden, and the President of the European Commission, Ursula von der Leyen, announced last Friday, in Brussels, a political agreement on a new Transatlantic framework to replace the Privacy Shield. 

The announcement at this level marks a significant elevation of the topic within transatlantic affairs compared to 2016, when a new deal to replace the Safe Harbor framework was announced. Back then, it was Commission Vice-President Andrus Ansip and Commissioner Vera Jourova who announced, at the beginning of February 2016, that a deal had been reached.

The draft adequacy decision was only published a month after that announcement, and the adequacy decision was adopted six months later, in July 2016. Therefore, it should not be at all surprising if another six months (or more!) pass before the adequacy decision for the new Framework produces legal effects and can actually support transfers from the EU to the US, especially since the US side still has to adopt at least one Executive Order to provide for the agreed-upon new safeguards.

This means that transfers of personal data from the EU to the US may still be blocked in the coming months – possibly without a lawful alternative to continue them – as a consequence of Data Protection Authorities (DPAs) enforcing Chapter V of the General Data Protection Regulation in light of the Schrems II judgment of the Court of Justice of the EU, either as part of the 101 noyb complaints submitted in August 2020 and now slowly starting to be resolved, or as part of other individual complaints and court cases.

After the agreement “in principle” was announced at the highest possible political level, EU Justice Commissioner Didier Reynders doubled down on the point that this agreement was reached “on the principles” for a new framework, rather than on its details. Later on, he also gave credit to Commerce Secretary Gina Raimondo and US Attorney General Merrick Garland for their hands-on involvement in working towards this agreement.

In fact, “in principle” became the leitmotif of the announcement, as the first EU Data Protection Authority to react to the announcement was the European Data Protection Supervisor, who wrote that he “Welcomes, in principle”, the announcement of a new EU-US transfers deal – “The details of the new agreement remain to be seen. However, EDPS stresses that a new framework for transatlantic data flows must be sustainable in light of requirements identified by the Court of Justice of the EU”.

Of note, there is no catchy name for the new transfers agreement, which was referred to as the “Trans-Atlantic Data Privacy Framework”. Nonetheless, FPF’s CEO Jules Polonetsky submits the “TA DA!” Agreement, and he has my vote. For his full statement on the political agreement being reached, see our release here.

Some details of the “principles” agreed on were published hours after the announcement, both by the White House and by the European Commission. Below are a couple of things that caught my attention from the two brief Factsheets.

The US has committed to “implement new safeguards” to ensure that SIGINT activities are “necessary and proportionate” (an EU law standard – see Article 52 of the EU Charter on how the exercise of fundamental rights can be limited) in the pursuit of defined national security objectives. Therefore, the new agreement is expected to address the lack of safeguards for government access to personal data as specifically outlined by the CJEU in the Schrems II judgment.

The US also committed to creating a “new mechanism for the EU individuals to seek redress if they believe they are unlawfully targeted by signals intelligence activities”. This new mechanism was characterized by the White House as having “independent and binding authority”. Per the White House, this redress mechanism includes “a new multi-layer redress mechanism that includes an independent Data Protection Review Court that would consist of individuals chosen from outside the US Government who would have full authority to adjudicate claims and direct remedial measures as needed”. The EU Commission mentioned in its own Factsheet that this would be a “two-tier redress system”. 

Importantly, the White House mentioned in the Factsheet that oversight of intelligence activities will also be boosted – “intelligence agencies will adopt procedures to ensure effective oversight of new privacy and civil liberties standards”. Oversight and redress are different issues and are both equally important – for details, see this piece by Christopher Docksey. However, they tend to be thought of as one and the same, so the fact that they are addressed separately in this announcement is significant.

One of the remarkable things about the White House announcement is that it includes several EU law-specific concepts: “necessary and proportionate”, “privacy, data protection” mentioned separately, “legal basis” for data flows. In another nod to the European approach to data protection, the entire issue of ensuring safeguards for data flows is framed as more than a trade or commerce issue – with references to a “shared commitment to privacy, data protection, the rule of law, and our collective security as well as our mutual recognition of the importance of trans-Atlantic data flows to our respective citizens, economies, and societies”.

Last, but not least, Europeans have always framed their concerns related to surveillance and data protection as fundamental rights concerns. The US also gives a nod to this approach by referring a couple of times to “privacy and civil liberties” safeguards (adding the “civil liberties” dimension) that will be “strengthened”. All of these are positive signs for a “rapprochement” of the two legal systems and a clear improvement on the “commerce”-focused approach the US side has taken in the past.

Finally, it should also be noted that the new framework will continue to be a self-certification scheme managed by the US Department of Commerce.

What does all of this mean in practice? As the White House details, this means that the Biden Administration will have to adopt (at least) an Executive Order (EO) that includes all these commitments and on the basis of which the European Commission will draft an adequacy decision.

Thus, expectations are high following the White House and European Commission Factsheets, and the entire privacy and data protection community is waiting to see further details.

In the meantime, I’ll leave you with an observation made by my colleague, Amie Stepanovich, VP for US Policy at FPF, who highlighted that Section 702 of the Foreign Intelligence Surveillance Act (FISA) is set to expire on December 31, 2023. This presents Congress with an opportunity to act, building on the extensive work already done by the US Government in the context of the transatlantic data transfers debate.

Privacy Best Practices for Rideshare Drivers Using Dashcams

FPF & Uber Publish Guide Highlighting Privacy Best Practices for Drivers who Record Video and Audio on Rideshare Journeys

FPF and Uber have created a guide for US-based rideshare drivers who install “dashcams” – video cameras mounted on a vehicle’s dashboard or windshield. Many drivers install dashcams to improve safety, security, and accountability; the cameras can capture crashes or other safety-related incidents outside and inside cars. Dashcam footage can be helpful to drivers, passengers, insurance companies, and others when adjudicating legal claims. At the same time, dashcams can pose substantial privacy risks if appropriate safeguards are not in place to limit the collection, use, and disclosure of personal data. 

Dashcams typically record video outside a vehicle. Many dashcams also record in-vehicle audio and some record in-vehicle video. Regardless of the particular device used, ride-hail drivers who use dashcams must comply with applicable audio and video recording laws.

The guide explains relevant laws and provides practical tips to help drivers be transparent, limit data use and sharing, retain video and audio only for practical purposes, and use strict security controls. The guide highlights ways that drivers can employ physical signs, in-app notices, and other means to ensure passengers are informed about dashcam use and can make meaningful choices about whether to travel in a dashcam-equipped vehicle. Drivers seeking advice concerning specific legal obligations or incidents should consult legal counsel.

Privacy best practices for dashcams include: 

  1. Give individuals notice that they are being recorded
    • Place recording notices inside and on the vehicle.
    • Mount the dashcam in a visible location.
    • Consider, in some situations, giving an oral notification that recording is taking place.
    • Determine whether the rideshare service provides recording notifications in the app, and utilize those in-app notices.
  2. Only record audio and video for defined, reasonable purposes
    • Only keep recordings for as long as needed for the original purpose.
    • Inform passengers as to why video and/or audio is being recorded.
  3. Limit sharing and use of recorded footage
    • Only share video and audio with third parties for relevant reasons that align with the original reason for recording.
    • Thoroughly review the rideshare service’s privacy policy and community guidelines if using an app-based rideshare service, and be aware that many rideshare companies maintain policies against widely disseminating recordings.
  4. Safeguard and encrypt recordings and delete unused footage
    • Identify dashcam vendors that provide the highest privacy and security safeguards.
    • Carefully read the terms and conditions when buying dashcams to understand the data flows.

Uber will be making these best practices available to drivers in its app and on its website.

Many ride-hail drivers use dashcams in their cars, and the guidance and best practices published today provide practical guidance to help drivers implement privacy protections. But driver guidance is only one aspect of ensuring individuals’ privacy and security when traveling. Dashcam manufacturers must implement privacy-protective practices by default and provide easy-to-use privacy options. At the same time, ride-hail platforms must provide drivers with the appropriate tools to notify riders, and carmakers must safeguard drivers’ and passengers’ data collected by OEM devices.

In addition, dashcams are only one example of increasingly sophisticated sensors appearing in passenger vehicles as part of driver monitoring systems and related technologies. Further work is needed to apply comprehensive privacy safeguards to emerging technologies across the connected vehicle sector, from carmakers and rideshare services to mobility services providers and platforms. Comprehensive federal privacy legislation would be a good start. And in the absence of Congressional action, FPF is doing further work to identify key privacy risks and mitigation strategies for the broader class of driver monitoring systems that raise questions about technologies beyond the scope of this dashcam guide.

12th Annual Privacy Papers for Policymakers Awardees Explore the Nature of Privacy Rights & Harms

The winners of the 12th annual Future of Privacy Forum (FPF) Privacy Papers for Policymakers Award ask big questions about what the foundational elements of data privacy and protection should be and who will make key decisions about the application of privacy rights. Their scholarship will inform policy discussions around the world about privacy harms, corporate responsibilities, oversight of algorithms, and biometric data, among other topics.

“Policymakers and regulators in many countries are working to advance data protection laws, often seeking in particular to combat discrimination and unfairness,” said FPF CEO Jules Polonetsky. “FPF is proud to highlight independent researchers tackling big questions about how individuals and society relate to technology and data.”

This year’s papers also explore smartphone platforms as privacy regulators, the concept of data loyalty, and global privacy regulation. The award recognizes leading privacy scholarship that is relevant to policymakers in the U.S. Congress, at U.S. federal agencies, and among international data protection authorities. The winning papers will be presented at a virtual event on February 10, 2022. 

The winners of the 2022 Privacy Papers for Policymakers Award are:

From the record number of papers nominated this year, these six papers were selected by a diverse team of academics, advocates, and industry privacy professionals from FPF’s Advisory Board, based on the relevance of their research and proposed solutions to policymakers and regulators in the U.S. and abroad.

In addition to the winning papers, FPF has selected two papers for Honorable Mention: Verification Dilemmas and the Promise of Zero-Knowledge Proofs by Kenneth Bamberger, University of California, Berkeley – School of Law; Ran Canetti, Boston University (Department of Computer Science; Faculty of Computing and Data Science; Center for Reliable Information Systems and Cybersecurity); Shafi Goldwasser, University of California, Berkeley – Simons Institute for the Theory of Computing; Rebecca Wexler, University of California, Berkeley – School of Law; and Evan Zimmerman, University of California, Berkeley – School of Law; and A Taxonomy of Police Technology’s Racial Inequity Problems by Laura Moy, Georgetown University Law Center.

FPF also selected a paper for the Student Paper Award, A Fait Accompli? An Empirical Study into the Absence of Consent to Third Party Tracking in Android Apps by Konrad Kollnig and Reuben Binns, University of Oxford; Pierre Dewitte, KU Leuven; Max van Kleek, Ge Wang, Daniel Omeiza, Helena Webb, and Nigel Shadbolt, University of Oxford. The Student Paper Award Honorable Mention was awarded to Yeji Kim, University of California, Berkeley – School of Law, for her paper, Virtual Reality Data and Its Privacy Regulatory Challenges: A Call to Move Beyond Text-Based Informed Consent.

The winning authors will join FPF staff to present their work at a virtual event with policymakers from around the world, academics, and industry privacy professionals. The event will be held on February 10, 2022, from 1:00 – 3:00 PM EST. The event is free and open to the general public. To register for the event, visit https://bit.ly/3qmJdL2.

Organizations must lead with privacy and ethics when researching and implementing neurotechnology: FPF and IBM Live event and report release

The Future of Privacy Forum (FPF) and the IBM Policy Lab released recommendations for promoting privacy and mitigating risks associated with neurotechnology, specifically brain-computer interfaces (BCIs). The new report provides developers and policymakers with actionable ways this technology can be implemented while protecting the privacy and rights of its users.

“We have a prime opportunity now to implement strong privacy and human rights protections as brain-computer interfaces become more widely used,” said Jeremy Greenberg, Policy Counsel at the Future of Privacy Forum. “Among other uses, these technologies have tremendous potential to treat people with diseases and conditions like epilepsy or paralysis and make it easier for people with disabilities to communicate, but these benefits can only be fully realized if meaningful privacy and ethical safeguards are in place.”

Brain-computer interfaces are computer-based systems that are capable of directly recording, processing, analyzing, or modulating human brain activity. The sensitivity of data that BCIs collect and the capabilities of the technology raise concerns over consent, as well as the transparency, security, and accuracy of the data. The report offers a number of policy and technical solutions to mitigate the risks of BCIs and highlights their positive uses.

“Emerging innovations like neurotechnology hold great promise to transform healthcare, education, transportation, and more, but they need the right guardrails in place to protect individuals’ privacy,” said IBM Chief Privacy Officer Christina Montgomery. “Working together with the Future of Privacy Forum, the IBM Policy Lab is pleased to release a new framework to help policymakers and businesses navigate the future of neurotechnology while safeguarding human rights.”

FPF and IBM have outlined several key policy recommendations to mitigate the privacy risks associated with BCIs, including:

FPF and IBM have also included several technical recommendations for BCI devices, including:

FPF-curated educational resources, policy & regulatory documents, academic papers, thought pieces, and technical analyses regarding brain-computer interfaces are available here.

Read FPF’s four-part series on Brain-Computer Interfaces (BCIs), providing an overview of the technology, use cases, privacy risks, and proposed recommendations for promoting privacy and mitigating risks associated with BCIs.

FPF Launches Asia-Pacific Region Office, Global Data Protection Expert Clarisse Girot Leads Team

The Future of Privacy Forum (FPF) has appointed Clarisse Girot, PhD, LLM, an expert on Asian and European privacy legislation, as Director of its new FPF Asia-Pacific office, based in Singapore. This new office expands FPF’s international reach in Asia and complements FPF’s offices in the U.S., Europe, and Israel, as well as partnerships around the globe.
 
Dr. Clarisse Girot is a privacy professional with over twenty years of experience in the privacy and data protection fields. Since 2017, Clarisse has been leading the Asian Business Law Institute’s (ABLI) Data Privacy Project, focusing on the regulation of cross-border data transfers in 14 Asian jurisdictions. Prior to her time at ABLI, Clarisse served as Counsellor to the President of the French Data Protection Authority (CNIL), who was then also Chair of the Article 29 Working Party. She previously served as head of CNIL’s Department of European and International Affairs, where she sat on the Article 29 Working Party, the group of EU Data Protection Authorities, and was involved in major international cases in data protection and privacy.
 
“Clarisse is joining FPF at an important time for data protection in the Asia-Pacific region. The two most populous countries in the world, India and China, are introducing general privacy laws, and established data protection jurisdictions, like Singapore, Japan, South Korea, and New Zealand, have recently updated their laws,” said FPF CEO Jules Polonetsky. “Her extensive knowledge of privacy law will provide vital insights for those interested in compliance with regional privacy frameworks and their evolution over time.”
 
FPF Asia-Pacific will focus on several priorities through the end of the year, including hosting an event at this year’s Singapore Data Protection Week. The office will provide expertise in digital data flows and discuss emerging data protection issues in a way that is useful for regulators, policymakers, and legal professionals. Rajah & Tann Singapore LLP is supporting the work of the FPF Asia-Pacific office.
 
“The FPF global team will greatly benefit from the addition of Clarisse. She will advise FPF staff, advisory board members, and the public on the most significant privacy developments in the Asia-Pacific region, including data protection bills and cross-border data flows,” said Gabriela Zanfir-Fortuna, Director for Global Privacy at FPF. “Her past experience in both Asia and Europe gives her a unique ability to confront the most complex issues dealing with cross-border data protection.”
 
As over 140 countries have now enacted a privacy or data protection law, FPF continues to expand its international presence to help data protection experts grapple with the challenges of ensuring responsible uses of data. Following the appointment of Malavika Raghavan as Senior Fellow for India in 2020, the launch of the FPF Asia-Pacific office further expands FPF’s international reach.
 
Dr. Gabriela Zanfir-Fortuna leads FPF’s international efforts and works on global privacy developments and European data protection law and policy. The FPF Europe office is led by Dr. Rob van Eijk, who prior to joining FPF worked at the Dutch Data Protection Authority as Senior Supervision Officer and Technologist for nearly ten years. FPF has created thriving partnerships with leading privacy research organizations in the European Union, such as Dublin City University and the Brussels Privacy Hub of the Vrije Universiteit Brussel (VUB). FPF continues to serve as a leading voice in Europe on issues of international data flows, the ethics of AI, and emerging privacy issues. FPF Europe recently published a report comparing the regulatory strategy for 2021-2022 of 15 Data Protection Authorities to provide insights into the future of enforcement and regulatory action in the EU.
 
Outside of Europe, FPF has launched a variety of projects to advance tech policy leadership and scholarship in regions around the world, including Israel and Latin America. The work of the Israel Tech Policy Institute (ITPI), led by Managing Director Limor Shmerling Magazanik, includes publishing a report on AI Ethics in Government Services and organizing an OECD workshop with the Israeli Ministry of Health on access to health data for research.
 
In Latin America, FPF has partnered with the leading research association Data Privacy Brasil and provided in-depth analysis of Brazil’s LGPD privacy legislation and of various data privacy cases decided by the Brazilian Supreme Court. FPF recently organized a panel during the CPDP LatAm Conference that explored the state of Latin American data protection laws alongside experts from Uber, the University of Brasilia, and the Interamerican Institute of Human Rights.
 

Read Dr. Girot’s Q&A on the FPF blog. Stay updated: Sign up for FPF Asia-Pacific email alerts.
 

FPF and Leading Health & Equity Organizations Issue Principles for Privacy & Equity in Digital Contact Tracing Technologies

With support from the Robert Wood Johnson Foundation, FPF engaged leaders within the privacy and equity communities to develop actionable guiding principles and a framework to help bolster the responsible implementation of digital contact tracing technologies (DCTT). Today, seven privacy, civil rights, and health equity organizations signed on to these guiding principles for organizations implementing DCTT.

“We learned early in our Privacy and Pandemics initiative that unresolved ethical, legal, social, and equity issues may challenge the responsible implementation of digital contact tracing technologies,” said Jules Polonetsky, CEO of the Future of Privacy Forum. “So we engaged leaders within the civil rights, health equity, and privacy communities to create a set of actionable principles to help guide organizations implementing digital contact tracing that respects individual rights.”

Contact tracing has long been used to monitor the spread of various infectious diseases. In light of COVID-19, governments and companies began deploying digital exposure notification using Bluetooth and geolocation data on mobile devices to boost contact tracing efforts and quickly identify individuals who may have been exposed to the virus. However, as DCTT begins to play an important role in public health, it is important to take the necessary steps to ensure equitable access to DCTT and to understand the societal risks and tradeoffs that might accompany its implementation today and in the future. Governance efforts that seek to better understand these risks will be better able to bolster public trust in DCTT.

“LGBT Tech is proud to have participated in the development of the Principles and Framework alongside FPF and other organizations. We are heartened to see that the focus of these principles is on historically underserved and under-resourced communities everywhere, like the LGBTQ+ community. We believe the Principles and Framework will help ensure that the needs and vulnerabilities of these populations are at the forefront during today’s pandemic and future pandemics.”

Carlos Gutierrez, Deputy Director, and General Counsel, LGBT Tech

“If we establish practices that protect individual privacy and equity, digital contact tracing technologies could play a pivotal role in tracking infectious diseases,” said Dr. Rachele Hendricks-Sturrup, Research Director at the Duke-Margolis Center for Health Policy. “These principles allow organizations implementing digital contact tracing to take ethical and responsible approaches to how their technology collects, tracks, and shares personal information.”

FPF, together with Dialogue on Diversity, the National Alliance Against Disparities in Patient Health (NADPH), BrightHive, and LGBT Tech, developed the principles, which advise organizations implementing DCTT to commit to the following actions:

  1. Be Transparent About How Data Is Used and Shared.
  2. Apply Strong De-Identification Techniques and Solutions.
  3. Empower Users Through Tiered Opt-in/Opt-out Features and Data Minimization.
  4. Acknowledge and Address Privacy, Security, and Nondiscrimination Protection Gaps.
  5. Create Equitable Access to DCTT.
  6. Acknowledge and Address Implicit Bias Within and Across Public and Private Settings.
  7. Democratize Data for Public Good While Employing Appropriate Privacy Safeguards.
  8. Adopt Privacy-By-Design Standards That Make DCTT Broadly Accessible.

Additional supporters of these principles include the Center for Democracy and Technology and Human Rights First.

To learn more and sign on to the DCTT Principles visit fpf.org/DCTT.

Support for this program was provided by the Robert Wood Johnson Foundation. The views expressed here do not necessarily reflect the views of the Foundation.

Navigating Preemption through the Lens of Existing State Privacy Laws

This post is the second of two posts on federal preemption and enforcement in United States federal privacy legislation. See Preemption in US Privacy Laws (June 14, 2021).

In drafting a federal baseline privacy law in the United States, lawmakers must decide to what extent the law will override state and local privacy laws. In a previous post, we discussed a survey of 12 existing federal privacy laws passed between 1968 and 2003, and the extent to which they preempt similar state laws.

Another way to approach the same question, however, is to examine the hundreds of existing state privacy laws currently on the books in the United States. Conversations around federal preemption inevitably focus on comprehensive laws like the California Consumer Privacy Act or the Virginia Consumer Data Protection Act — but there are hundreds of other state privacy laws on the books that regulate commercial and government uses of data.

In reviewing existing state laws, we find that they can be categorized usefully into: laws that complement heavily regulated sectors (such as health and finance); laws of general applicability; common law; laws governing state government activities (such as schools and law enforcement); comprehensive laws; longstanding or narrowly applicable privacy laws; and emerging sectoral laws (such as biometrics or drones regulations). As a resource, we recommend: Robert Ellis Smith, Compilation of State and Federal Privacy Laws (last supplemented in 2018). 

  1. Heavily Regulated Sectoral Silos. Most federal proposals for a comprehensive privacy law would not supersede other existing federal laws that contain privacy requirements for businesses, such as the Health Insurance Portability and Accountability Act (HIPAA) or the Gramm-Leach-Bliley Act (GLBA). As a result, a new privacy law should probably not preempt state sectoral laws that: (1) supplement their federal counterparts and (2) were intentionally not preempted by those federal regimes. In many cases, robust compliance regimes have been built around federal and state parallel requirements, creating entrenched privacy expectations, privacy tools, and compliance practices for organizations (“lock in”).
  2. Laws of General Applicability. All 50 states have laws barring unfair and deceptive commercial and trade practices (UDAP), as well as generally applicable laws against fraud, unconscionable contracts, and other consumer protections. In cases where violations involve the misuse of personal information, such claims could be inadvertently preempted by a national privacy law.
  3. State Common Law. Privacy claims have been evolving in US common law over the last hundred years, and claims vary from state to state. A federal privacy law might preempt (or not preempt) claims brought under theories of negligence, breach of contract, product liability, invasions of privacy, or other “privacy torts.”
  4. State Laws Governing State Government Activities. In general, states retain the right to regulate their own government entities, and a commercial baseline privacy law is unlikely to affect such state privacy laws. These include, for example, state “mini Privacy Acts” applying to state government agencies’ collection of records, state privacy laws applicable to public schools and school districts, and state regulations involving law enforcement — such as government facial recognition bans.
  5. Comprehensive or Non-Sectoral State Laws. Lawmakers considering the extent of federal preemption should take extra care to consider the effect on different aspects of omnibus or comprehensive consumer privacy laws, such as the California Consumer Privacy Act (CCPA), the Colorado Privacy Act, and the Virginia Consumer Data Protection Act. In addition, however, there are a number of other state privacy laws that can be considered “non-sectoral” because they apply broadly to businesses that collect or use personal information. These include, for example, CalOPPA (requiring commercial privacy policies), the California “Shine the Light” law (requiring disclosures from companies that share personal information for direct marketing), data breach notification laws, and data disposal laws.
  6. Longstanding, Narrowly Applicable State Privacy Laws. Many states have relatively long-standing privacy statutes on the books that govern narrow use cases, such as: state laws governing library records, social media password laws, mugshot laws, anti-paparazzi laws, state laws governing audio surveillance between private parties, and laws governing digital assets of decedents. In many cases, such laws could be expressly preserved or incorporated into a federal law.
  7. Emerging Sectoral and Future-Looking Privacy Laws. New state laws have emerged in recent years in response to novel concerns, including for: biometric data; drones; connected and autonomous vehicles; the Internet of Things; data broker registration; and disclosure of intimate images. This trend is likely to continue, particularly in the absence of a federal law.

Congressional intent is the “ultimate touchstone” of preemption. Lawmakers should consider long-term effects on current and future state laws, including how they will be impacted by a preemption provision and how they might be expressly preserved through a savings clause. To help build consensus, lawmakers should also work with stakeholders and experts in the numerous categories of laws discussed above to consider how each might be affected by federal preemption.

ICYMI: Read the first blog in this series, Preemption in US Privacy Laws.

Manipulative Design: Defining Areas of Focus for Consumer Privacy

In consumer privacy, the phrase “dark patterns” is everywhere. Emerging from a wide range of technical and academic literature, it now appears in at least two US privacy laws: the California Privacy Rights Act and the Colorado Privacy Act (which, if signed by the Governor, will come into effect in 2025).

Under both laws, companies will be prohibited from using “dark patterns,” or “user interface[s] designed or manipulated with the substantial effect of subverting or impairing user autonomy, decision‐making, or choice,” to obtain user consent in certain situations–for example, for the collection of sensitive data.

When organizations give individuals choices, some forms of manipulation have long been barred by consumer protection laws, with the Federal Trade Commission and state Attorneys General prohibiting companies from deceiving or coercing consumers into taking actions they did not intend or striking bargains they did not want. But consumer protection law does not typically prohibit organizations from persuading consumers to make a particular choice. And it is often unclear where the lines fall between cajoling, persuading, pressuring, nagging, annoying, or bullying consumers. The California and Colorado laws seek to do more than merely bar deceptive practices; they prohibit design that “subverts or impairs user autonomy.”

What does it mean to subvert user autonomy, if a design does not already run afoul of traditional consumer protection law? Just as in the physical world, the design of digital platforms and services always influences behavior — what to pay attention to, what to read and in what order, how much time to spend, what to buy, and so on. To paraphrase Harry Brignull (credited with coining the term), not everything “annoying” can be a dark pattern. Some examples of dark patterns are both clear and harmful, such as a design that tricks users into making recurring payments, or a service that offers a “free trial” and then makes it difficult or impossible to cancel. In other cases, the presence of “nudging” may be clear, but harms may be less clear, such as in beta-testing what color shades are most effective at encouraging sales. Still others fall in a legal grey area: for example, is it ever appropriate for a company to repeatedly “nag” users to make a choice that benefits the company, with little or no accompanying benefit to the user?

In Fall 2021, Future of Privacy Forum will host a series of workshops with technical, academic, and legal experts to help define clear areas of focus for consumer privacy, and guidance for policymakers and legislators. These workshops will feature experts on manipulative design in at least three contexts of consumer privacy: (1) Youth & Education; (2) Online Advertising and US Law; and (3) GDPR and European Law. 

As lawmakers address this issue, we identify at least four distinct areas of concern:

This week, at the inaugural Dublin Privacy Symposium, FPF will join other experts to discuss principles for transparency and trust. The design of user interfaces for digital products and services pervades modern life and directly impacts the choices people make with respect to sharing their personal information.

India’s new Intermediary & Digital Media Rules: Expanding the Boundaries of Executive Power in Digital Regulation


Author: Malavika Raghavan

India’s new rules on intermediary liability and regulation of publishers of digital content have generated significant debate since their release in February 2021. The Information Technology (Intermediary Guidelines and Digital Media Ethics Code) Rules, 2021 (the Rules) introduce expanded due diligence obligations for intermediaries, a new regulatory regime for ‘publishers’ of digital content, and new Government content-blocking powers, each detailed below.

The majority of these provisions were unanticipated, resulting in a raft of petitions filed in High Courts across the country challenging the validity of various aspects of the Rules, including their constitutionality. On 25 May 2021, the three-month compliance period for some new requirements for significant social media intermediaries (so designated by the Rules) expired without many intermediaries being in compliance, opening them up to liability under the Information Technology Act as well as wider civil and criminal laws. This has reignited debates about the impact of the Rules on business continuity and liability, citizens’ access to online services, privacy, and security.

Following FPF’s previous blog post highlighting some aspects of these Rules, this article presents an overview of the Rules before deep-diving into critical issues regarding their interpretation and application in India. It concludes by taking stock of some of the emerging effects of these new regulations, which have major implications for millions of Indian users, as well as for digital service providers serving the Indian market.

1. Brief overview of the Rules: Two new regimes for ‘intermediaries’ and ‘publishers’ 

The new Rules create two regimes for two different categories of entities: ‘intermediaries’ and ‘publishers’.  Intermediaries have been the subject of prior regulations – the Information Technology (Intermediaries guidelines) Rules, 2011 (the 2011 Rules), now superseded by these Rules. However, the category of “publishers” and related regime created by these Rules did not previously exist. 

The Rules begin with commencement provisions and definitions in Part I. Part II of the Rules applies to intermediaries (as defined in the Information Technology Act 2000 (IT Act)) that transmit electronic records on behalf of others, including online intermediary platforms (like YouTube, WhatsApp, and Facebook). The rules in this part primarily flesh out the protections offered in Section 79 of the IT Act, which gives passive intermediaries the benefit of a ‘safe harbour’ from liability for objectionable information shared by third parties using their services — somewhat akin to protections under Section 230 of the US Communications Decency Act. To claim this protection from liability, intermediaries need to undertake certain ‘due diligence’ measures, including informing users of the types of content that cannot be shared and following content take-down procedures (for which safeguards evolved over time through important case law). The new Rules supersede the 2011 Rules and also significantly expand on them, introducing new provisions and additional due diligence requirements that are detailed further in this blog.

Part III of the Rules applies to a new, previously non-existent category of entities designated as ‘publishers’. This category is further classified into ‘publishers of news and current affairs content’ and ‘publishers of online curated content’. Part III then sets up extensive requirements for publishers to adhere to specific codes of ethics, onerous content take-down requirements, and a three-tier grievance process with appeals lying to an Executive Inter-Departmental Committee of Central Government bureaucrats.

Finally, the Rules contain two provisions relating to content-blocking orders that apply to all entities (i.e., both intermediaries and publishers). They lay out a new process by which Central Government officials can issue directions to intermediaries and publishers to delete, modify, or block content, either following a grievance process (Rule 15) or through ‘emergency’ blocking orders that may be passed ex parte. These provisions stem from the power to issue directions to intermediaries to block public access to any information through any computer resource (Section 69A of the IT Act). Interestingly, they have been introduced separately from the existing rules for blocking purposes, the Information Technology (Procedure and Safeguards for Blocking for Access of Information by Public) Rules, 2009.

2. Key issues for intermediaries under the Rules

2.1 A new class of ‘social media intermediaries’

The term ‘intermediary’ is broadly defined in the IT Act, covering a range of entities involved in the transmission of electronic records. The Rules introduce two new sub-categories: ‘social media intermediaries’ and ‘significant social media intermediaries’ (SSMIs), the latter designated by a registered-user threshold notified by the Central Government.

Given that a popular messaging app like WhatsApp has over 400 million users in India, the threshold appears to be fairly conservative. The Government may also order any intermediary to comply with the same obligations as SSMIs (under Rule 6) if its services are adjudged to pose a risk of harm to national security, the sovereignty and integrity of India, India’s foreign relations, or public order.

SSMIs have to follow substantially more onerous “additional due diligence” requirements to claim the intermediary safe harbour (including mandatory traceability of message originators and proactive automated screening, as discussed below). These new requirements raise privacy and data security concerns: they extend beyond traditional ideas of platform “due diligence”, potentially exposing the content of private communications and in doing so creating new risks for users in India.

2.2 Additional requirements for SSMIs: resident employees, mandated message traceability, automated content screening

Extensive new requirements are set out in the new Rule 4 for SSMIs. 

Provisions mandating modifications to the technical design of encrypted platforms to enable traceability seem to go beyond merely requiring intermediary due diligence. Instead, they appear to draw on separate Government powers relating to interception and decryption of information (under Section 69 of the IT Act). In addition, separate stand-alone rules laying out procedures and safeguards for such interception and decryption orders already exist in the Information Technology (Procedure and Safeguards for Interception, Monitoring and Decryption of Information) Rules, 2009. Rule 4(2) even acknowledges these provisions – raising the question of whether these Rules (relating to intermediaries and their safe harbours) can be used to expand the scope of Section 69 or the rules made thereunder.

Proceedings initiated by WhatsApp LLC in the Delhi High Court and by Free and Open Source Software (FOSS) developer Praveen Arimbrathodiyil in the Kerala High Court have both challenged the legality and validity of Rule 4(2), on grounds including that it is ultra vires, going beyond the scope of its parent statutory provisions (Sections 79 and 69A) and the intent of the IT Act itself. Substantively, the provision is also challenged on the basis that it would violate users’ fundamental rights, including the right to privacy and the right to free speech and expression, due to the chilling effect that the stripping back of encryption will have.

Though the objective of the automated screening provision (Rule 4(4)) is laudable (i.e. to limit the circulation of violent or previously removed content), the move towards proactive automated monitoring has raised serious concerns regarding censorship on social media platforms. Rule 4(4) appears to acknowledge the deep tensions that this requirement raises with privacy and free speech, as seen in the provisos requiring that these screening measures be proportionate to the free speech and privacy interests of users, be subject to human oversight, and that automated tools be reviewed for fairness, accuracy, propensity for bias or discrimination, and impact on privacy and security. However, given the vagueness of this wording compared to the trade-off of losing intermediary immunity, scholars and commentators have noted the obvious potential for 'over-compliance' and the excessive screening out of content. Many (including the petitioner in the Praveen Arimbrathodiyil matter) have also noted that automated filters are not sophisticated enough to differentiate between violent unlawful images and legitimate journalistic material. The concern is that such measures could lead to the large-scale screening out of 'valid' speech and expression, with serious consequences for constitutional rights to free speech and expression, which also protect 'the rights of individuals to listen, read and receive the said speech' (Tata Press Ltd v. Mahanagar Telephone Nigam Ltd, (1995) 5 SCC 139).
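
To illustrate concretely why such automated tools struggle with context, here is a minimal, purely illustrative sketch (in Python) of the hash-matching approach commonly used to detect re-uploads of previously removed content. The exact-hash blocklist and all names here are assumptions of this sketch; real deployments typically rely on perceptual hashing, and the Rules do not prescribe any particular technology.

```python
# Purely illustrative sketch of an automated matching tool of the kind
# contemplated by proactive screening obligations: uploads are compared
# against a blocklist of previously removed items. Exact SHA-256 hashing
# is used here for simplicity (an assumption of this sketch; production
# systems typically rely on perceptual hashing). The match is context-blind,
# which is the root of the over-blocking concern discussed above.
import hashlib


def sha256_of(content: bytes) -> str:
    """Return the hex digest identifying a piece of content."""
    return hashlib.sha256(content).hexdigest()


# Hypothetical blocklist built from content already taken down.
removed_content_hashes = {
    sha256_of(b"example bytes of previously removed content"),
}


def should_flag(upload: bytes) -> bool:
    """Flag an upload if it is byte-identical to previously removed content."""
    return sha256_of(upload) in removed_content_hashes


# The same bytes are flagged whether they appear in an unlawful post or in a
# news report quoting the material, illustrating why such filters cannot
# distinguish journalistic reuse from the original unlawful circulation.
print(should_flag(b"example bytes of previously removed content"))  # True
print(should_flag(b"unrelated content"))                            # False
```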

Such requirements appear to be aimed at creating more user-friendly networks of intermediaries. However, the imposition of a single set of requirements is especially onerous for smaller or volunteer-run intermediary platforms, which may not have the income streams or staff to operate such mechanisms. Indeed, the petition in the Praveen Arimbrathodiyil matter has challenged some of these requirements as a threat to the future of the volunteer-led Free and Open Source Software (FOSS) movement in India, since they place similar requirements on small FOSS initiatives as on large proprietary Big Tech intermediaries.

Other obligations that stipulate turn-around times for intermediaries include (i) a requirement to remove or disable access to content within 36 hours of receipt of a Government or court order relating to unlawful information on the intermediary's computer resources (under Rule 3(1)(d)), and (ii) a requirement to provide information within 72 hours of receiving an order from an authorised Government agency undertaking investigative activity (under Rule 3(1)(j)).

Similar to the concerns with automated screening, there are concerns that the new grievance process could lead to private entities becoming the arbiters of appropriate content and free speech, a position that was specifically reversed in a seminal 2015 Supreme Court decision (Shreya Singhal v. Union of India, discussed below), which clarified that a Government or court order was needed for content takedowns.

3. Key issues for the new ‘publishers’ subject to the Rules, including OTT players

3.1 New Codes of Ethics and three-tier redress and oversight system for digital news media and OTT players 

Digital news media and OTT players have been designated as 'publishers of news and current affairs content' and 'publishers of online curated content' respectively in Part III of the Rules. Each category has then been subjected to a separate Code of Ethics. In the case of digital news media, the Codes applicable to newspapers and cable television have been applied. For OTT players, the Appendix sets out principles regarding the content that can be created, along with display classifications. To enforce these codes and to address grievances from the public about their content, publishers are now mandated to set up a grievance system, which forms the first tier of a three-tier "appellate" system culminating in an oversight mechanism run by the Central Government with extensive powers of sanction.

At least five legal challenges have been filed in various High Courts contesting the competence and authority of the Ministry of Electronics & Information Technology (MeitY) to pass the Rules, as well as their validity, namely: (i) in the Kerala High Court, LiveLaw Media Private Limited vs Union of India WP(C) 6272/2021; in the Delhi High Court, three petitions tagged together, being (ii) Foundation for Independent Journalism vs Union of India WP(C) 3125/2021, (iii) Quint Digital Media Limited vs Union of India WP(C) 11097/2021, and (iv) Sanjay Kumar Singh vs Union of India and others WP(C) 3483/2021; and (v) in the Karnataka High Court, Truth Pro Foundation of India vs Union of India and others, W.P. 6491/2021. This is in addition to a fresh petition filed on 10 June 2021, TM Krishna vs Union of India, challenging the entirety of the Rules (both Parts II and III) on the basis that they violate the rights to free speech (Article 19 of the Constitution) and privacy (including under Article 21 of the Constitution), and that they fail the test of arbitrariness (under Article 14), being manifestly arbitrary and falling foul of principles of delegation of powers.

Some of the key issues emerging from these Rules in Part III and the challenges to them are highlighted below. 

3.2 Lack of legal authority and competence to create these Rules

There has been substantial debate on the lack of clarity regarding the legal authority of the Ministry of Electronics & Information Technology (MeitY) under the IT Act. These concerns arise at various levels. 

First, there is a concern that Levels I and II result in a privatisation of adjudications relating to the free speech and expression of creative content producers, matters which would otherwise be litigated in Courts and Tribunals. As noted by many (including the LiveLaw petition at page 33), this could have the effect of overturning judicial precedent in Shreya Singhal v. Union of India ((2013) 12 S.C.C. 73), which specifically read down Section 79 of the IT Act to avoid a situation where private entities were the arbiters determining the legitimacy of takedown orders. Second, despite referring to "self-regulation", this system is subject to executive oversight (unlike the existing models for offline newspapers and broadcasting).

The Inter-Departmental Committee is composed entirely of Central Government bureaucrats, and it may review complaints escalated through the three-tier system or referred directly by the Ministry, following which it can deploy a range of sanctions, from warnings, to mandating apologies, to deleting, modifying or blocking content. This also raises the question of whether the Committee meets the legal requirements for an administrative body undertaking a 'quasi-judicial' function, especially one that may adjudicate on matters relating to free speech and privacy. Finally, while the objective of creating some standards and codes for such content creators may be laudable, it is unclear whether such an extensive oversight mechanism with powers of sanction over online publishers can be validly created under the rubric of intermediary liability provisions.

4. New powers to delete, modify or block information for public access 

As described at the start of this blog, the Rules add new powers for the deletion, modification and blocking of content from intermediaries and publishers. While Section 69A of the IT Act (and the Rules thereunder) does include blocking powers for the Government, these only exist vis-à-vis intermediaries. Rule 15 expands this power to 'publishers'. It also provides a new avenue for issuing such orders to intermediaries, outside of the existing rules for blocking information under the Information Technology (Procedure and Safeguards for Blocking for Access of Information by Public) Rules, 2009.

Graver concerns arise from Rule 16, which allows emergency orders for blocking information to be passed, including without giving publishers or intermediaries an opportunity to be heard. There is a provision for such an order to be reviewed by the Inter-Departmental Committee within two days of its issue.

Both Rules 15 and 16 apply to all entities contemplated in the Rules. Accordingly, they greatly expand executive power and oversight over digital media services in India, including social media, digital news media and OTT on-demand services.

5. Conclusions and future implications

The new Rules in India have opened up deep questions for online intermediaries and providers of digital media services serving the Indian market. 

For intermediaries, this creates a difficult and even existential choice: the requirements (especially those relating to traceability and automated screening) appear to set an improbably high bar given the reality of their technical systems. However, failure to comply will not only result in the loss of the safe harbour from liability but, as seen in the new Rule 7, will also open them up to punishment under the IT Act and criminal law in India.

For digital news and OTT players, the consequences of non-compliance and the level of enforcement remain to be seen, especially given the open questions regarding the validity of the legal basis for creating these Rules. Given the numerous petitions filed against the Rules, there is also substantial uncertainty regarding the future, although the Rules themselves have the full force of law at present.

Overall, it does appear that attempts to create a 'digital media' watchdog would be better dealt with in standalone legislation, potentially sponsored by the Ministry of Information and Broadcasting (MIB), which has the traditional remit over such areas. Indeed, the administration of Part III of the Rules has been delegated by MeitY to MIB, pointing to the genuine split in competence between these Ministries.

Finally, potential overlaps with India's proposed Personal Data Protection Bill (if passed) also create tensions for the future. It remains to be seen whether the provisions on traceability will survive the test of constitutional validity set out in India's privacy judgment (Justice K.S. Puttaswamy v. Union of India, (2017) 10 SCC 1). Irrespective of that determination, the Rules appear to be in some dissonance with the data retention and data minimisation requirements seen in the last draft of the Personal Data Protection Bill, not to mention other obligations relating to Privacy by Design and data security safeguards. Interestingly, the definition of 'social media intermediary' included in an explanatory clause to Section 26(4) of the Bill (released back in December 2019) closely tracks the definition in Rule 2(w), but departs from it by carving out certain intermediaries. This is already resulting in moves such as Google's plea of 2 June 2021 in the Delhi High Court asking for protection from being declared a social media intermediary.

These new Rules have exposed the inherent tensions within digital regulation between the goals of freedom of speech and expression and the right to privacy, on the one hand, and competing governance objectives of law enforcement (such as limiting the circulation of violent, harmful or criminal content online) and national security, on the other. The ultimate legal effect of these Rules will be determined as much by the outcome of the various petitions challenging their validity as by the enforcement challenges raised by casting such a wide net, one that covers millions of users and thousands of entities, all engaged in creating India's growing digital public sphere.

Photo credit: Gerd Altmann from Pixabay

Read more Global Privacy thought leadership:

South Korea: The First Case where the Personal Information Protection Act was Applied to an AI System

China: New Draft Car Privacy and Security Regulation is Open for Public Consultation

A New Era for Japanese Data Protection: 2020 Amendments to the APPI

New FPF Report Highlights Privacy Tech Sector Evolving from Compliance Tools to Platforms for Risk Management and Data Utilization

As we enter the third phase of development of the privacy tech market, purchasers are demanding more integrated solutions, product offerings are more comprehensive, and startup valuations are higher than ever, according to a new report from the Future of Privacy Forum and Privacy Tech Alliance. These factors are leading to companies providing a wider range of services, acting as risk management platforms, and focusing on support of business outcomes.

“The privacy tech sector is at an inflection point, as its offerings have expanded beyond assisting with regulatory compliance,” said FPF CEO Jules Polonetsky. “Increasingly, companies want privacy tech to help businesses maximize the utility of data while managing ethics and data protection compliance.”

According to the report, "Privacy Tech's Third Generation: A Review of the Emerging Privacy Tech Sector," regulations are often the biggest driver for buyers' initial privacy tech purchases. Organizations also are deploying tools to mitigate potential harms from the use of data. However, buyers serving global markets increasingly need privacy tech that offers data availability and control and supports data utility, in addition to regulatory compliance.

The report finds the COVID-19 pandemic has accelerated global marketplace adoption of privacy tech as dependence on digital technologies grows. Privacy is becoming a competitive differentiator in some sectors, and TechCrunch reports that 200+ privacy startups have together raised more than $3.5 billion over hundreds of individual rounds of funding. 

“The customers buying privacy-enhancing tech used to be primarily Chief Privacy Officers,” said report lead author Tim Sparapani. “Now it’s also Chief Marketing Officers, Chief Data Scientists, and Strategy Officers who value the insights they can glean from de-identified customer data.”

The report highlights five trends in the privacy enhancing tech market:

The report also draws seven implications for competition in the market:

The report makes a series of recommendations, including that the industry define, as a priority, a common vernacular for privacy tech; set standards for technologies in the "privacy stack" such as differential privacy, homomorphic encryption, and federated learning; and explore the needs of companies for privacy tech based upon their size, sector, and structure. It also calls on vendors to recognize the need to provide adequate support to customers to increase uptake and shorten the time from contract signing to successful integration.

The Future of Privacy Forum launched the Privacy Tech Alliance (PTA) as a global initiative with a mission to define, enhance and promote the market for privacy technologies. The PTA brings together innovators in privacy tech with customers and key stakeholders.

Members of the PTA Advisory Board, which includes Anonos, BigID, D-ID, Duality, Ethyca, Immuta, OneTrust, Privacy Analytics, Privitar, SAP, Truata, TrustArc, Wirewheel, and ZL Tech, have formed a working group to address impediments to growth identified in the report. The PTA working group will define a common vernacular and typology for privacy tech as a priority project with chief privacy officers and other industry leaders who are members of FPF. Other work will seek to develop common definitions and standards for privacy-enhancing technologies such as differential privacy, homomorphic encryption, and federated learning and identify emerging trends for venture capitalists and other equity investors in this space. Privacy Tech companies can apply to join the PTA by emailing [email protected].


Perspectives on the Privacy Tech Market

Quotes from Members of the Privacy Tech Alliance Advisory Board on the Release of the “Privacy Tech’s Third Generation” Report


“The ‘Privacy Tech Stack’ outlined by the FPF is a great way for organizations to view their obligations and opportunities to assess and reconcile business and privacy objectives. The Schrems II decision by the Court of Justice of the European Union highlights that skipping the second ‘Process’ layer can result in desired ‘Outcomes’ in the third layer (e.g., cloud processing of, or remote access to, cleartext data) being unlawful – despite their global popularity – without adequate risk management controls for decentralized processing.” — Gary LaFever, CEO & General Counsel, Anonos


“As a founding member of this global initiative, we are excited by the conclusions drawn from this foundational report – we’ve seen parallels in our customer base, from needing an enterprise-wide solution to the rich opportunity for collaboration and integration. The privacy tech sector continues to mature as does the imperative for organizations of all sizes to achieve compliance in light of the increasingly complicated data protection landscape.’’—Heather Federman, VP Privacy and Policy at BigID


“There is no doubt of the massive importance of the privacy sector, an area which is experiencing huge growth. We couldn’t be more proud to be part of the Privacy Tech Alliance Advisory Board and absolutely support the work they are doing to create alignment in the industry and help it face the current set of challenges. In fact we are now working on a similar initiative in the synthetic media space to ensure that ethical considerations are at the forefront of that industry too.” — Gil Perry, Co-Founder & CEO, D-ID


“We congratulate the Future of Privacy Forum and the Privacy Tech Alliance on the publication of this highly comprehensive study, which analyzes key trends within the rapidly expanding privacy tech sector. Enterprises today are increasingly reliant on privacy tech, not only as a means of ensuring regulatory compliance but also in order to drive business value by facilitating secure collaborations on their valuable and often sensitive data. We are proud to be part of the PTA Advisory Board, and look forward to contributing further to its efforts to educate the market on the importance of privacy-tech, the various tools available and their best utilization, ultimately removing barriers to successful deployments of privacy-tech by enterprises in all industry sectors” — Rina Shainski, Chairwoman, Co-founder, Duality


“Since the birth of the privacy tech sector, we’ve been helping companies find and understand the data they have, compare it against applicable global laws and regulations, and remediate any gaps in compliance. But as the industry continues to evolve, privacy tech also is helping show business value beyond just compliance. Companies are becoming more transparent, differentiating on ethics and ESG, and building businesses that differentiate on trust. The privacy tech industry is growing quickly because we’re able to show value for compliance as well as actionable business insights and valuable business outcomes.” — Kabir Barday, CEO, OneTrust


“Leading organizations realize that to be truly competitive in a rapidly evolving marketplace, they need to have a solid defensive footing. Turnkey privacy technologies enable them to move onto the offense by safely leveraging their data assets rapidly at scale.” — Luk Arbuckle, Chief Methodologist, Privacy Analytics


“We appreciate FPF’s analysis of the privacy tech marketplace and we’re looking forward to further research, analysis, and educational efforts by the Privacy Tech Alliance. Customers and consumers alike will benefit from a shared understanding and common definitions for the elements of the privacy stack.” — Corinna Schulze, Director, EU Government Relations, Global Corporate Affairs, SAP


“The report shines a light on the evolving sophistication of the privacy tech market and the critical need for businesses to harness emerging technologies that can tackle the multitude of operational challenges presented by the big data economy. Businesses are no longer simply turning to privacy tech vendors to overcome complexities with compliance and regulation; they are now mapping out ROI-focused data strategies that view privacy as a key commercial differentiator. In terms of market maturity, the report highlights a need to overcome ambiguities surrounding new privacy tech terminology, as well as discrepancies in the mapping of technical capabilities to actual business needs. Moving forward, the advantage will sit with those who can offer the right blend of technical and legal expertise to provide the privacy stack assurances and safeguards that buyers are seeking – from a risk, deployment and speed-to-value perspective. It’s worth noting that the growing importance of data privacy to businesses sits in direct correlation with the growing importance of data privacy to consumers. Trūata’s Global Consumer State of Mind Report 2021 found that 62% of global consumers would feel more reassured and would be more likely to spend with companies if they were officially certified to a data privacy standard. Therefore, in order to manage big data in a privacy-conscious world, the opportunity lies with responsive businesses that move with agility and understand the return on privacy investment. The shift from manual, restrictive data processes towards hyper automation and privacy-enhancing computation is where the competitive advantage can be gained and long-term consumer loyalty—and trust— can be retained.” — Aoife Sexton, Chief Privacy Officer and Chief of Product Innovation, Trūata


“As early pioneers in this space, we’ve had a unique lens on the evolving challenges organizations have faced in trying to integrate technology solutions to address dynamic, changing privacy issues in their organizations, and we believe the Privacy Technology Stack introduced in this report will drive better organizational decision-making related to how technology can be used to sustainably address the relationships among the data, processes, and outcomes.” — Chris Babel, CEO, TrustArc


“It’s important for companies that use data to do so ethically and in compliance with the law, but those are not the only reasons why the privacy tech sector is booming. In fact, companies with exceptional privacy operations gain a competitive advantage, strengthen customer relationships, and accelerate sales.” — Justin Antonipillai, Founder & CEO, Wirewheel

The right to be forgotten is not compatible with the Brazilian Constitution. Or is it?

Brazilian Supreme Federal Court

Author: Dr. Luca Belli

Dr. Luca Belli is Professor at FGV Law School, Rio de Janeiro, where he leads the CyberBRICS Project and the Latin American edition of the Computers, Privacy and Data Protection (CPDP) conference. The opinions expressed in his articles are strictly personal. The author can be contacted at [email protected].

The Brazilian Supreme Federal Court, or "STF" in its Brazilian acronym, recently took a landmark decision concerning the right to be forgotten (RTBF), finding that it is incompatible with the Brazilian Constitution. This attracted international attention to Brazil for a topic quite distant from the sadly frequent environmental, health, and political crises.

Readers should be warned that, while reading this piece, they might experience disappointment, perhaps even frustration, then renewed interest and curiosity, and finally, and hopefully, an increased open-mindedness, coming to understand a new facet of the RTBF debate and how it is playing out at the constitutional level in Brazil.

This might happen because although the STF relies on the “RTBF” label, the content behind such label is quite different from what one might expect after following the same debate in Europe. From a comparative law perspective, this landmark judgment tellingly shows how similar constitutional rights play out in different legal cultures and may lead to heterogeneous outcomes based on the constitutional frameworks of reference.   

How it started: insolvency seasoned with personal data

As is well known, the first global debate on what it means to be "forgotten" in the digital environment arose in Europe, thanks to Mario Costeja Gonzalez, a Spaniard who, paradoxically, will never be forgotten by anyone due to his key role in the construction of the RTBF.

Costeja famously requested the deindexation from Google Search of information about himself that he considered to be no longer relevant. Indeed, when anyone "googled" his name, the search engine returned as top results links to articles reporting Costeja's past insolvency as a debtor. Costeja argued that, despite having been convicted for insolvency, he had already paid his debt to justice and society many years before, and it was therefore unfair that his name would continue to be associated ad aeternum with a mistake he made in the past.

The follow-up is well known in data protection circles. The case reached the Court of Justice of the European Union (CJEU), which, in its landmark Google Spain Judgment (C-131/12), established that search engines shall be considered data controllers and, therefore, have an obligation to de-index information that is inappropriate, excessive, not relevant, or no longer relevant, when the data subject to whom such data refer requests it. Such an obligation was a consequence of Article 12(b) of Directive 95/46 on the protection of personal data, a pre-GDPR provision that set the basis for the European conception of the RTBF, providing for the "rectification, erasure or blocking of data the processing of which does not comply with the provisions of [the] Directive, in particular because of the incomplete or inaccurate nature of the data."

The indirect consequence of this historic decision, and the debate it generated, is that we have all come to consider the RTBF in the terms set by the CJEU. However, what is essential to emphasize is that the CJEU approach is only one possible conception and, importantly, it was possible because of the specific characteristics of the EU legal and institutional framework. We have come to think that RTBF means the establishment of a mechanism like the one resulting from the Google Spain case, but this is the result of a particular conception of the RTBF and of how this particular conception should – or could – be implemented.

The fact that the RTBF has been predominantly analyzed and discussed through European lenses does not mean that this is the only possible perspective, nor that this approach is necessarily the best. In fact, the Brazilian conception of the RTBF is remarkably different from a conceptual, constitutional, and institutional standpoint. The main concern of the Brazilian RTBF is not how a data controller might process personal data (this is the part where frustration and disappointment might arise in the reader), but the STF itself leaves the door open to such a possibility (this is the point where renewed interest and curiosity may arise).

The Brazilian conception of the right to be forgotten

Although the RTBF has acquired fundamental relevance in digital policy circles, it is important to emphasize that, until recently, Brazilian jurisprudence had mainly focused on the juridical need for "forgetting" only in the analogue sphere. Indeed, before the CJEU's Google Spain decision, the Brazilian Superior Court of Justice, or "STJ" (the other apex Brazilian court, which deals with the interpretation of the law, as distinct from the previously mentioned STF, which deals with constitutional matters), had already considered the RTBF as a right not to be remembered, affirmed by the individual vis-à-vis traditional media outlets.

This interpretation first emerged in the "Candelaria massacre" case, a gloomy page of Brazilian history featuring a multiple homicide perpetrated in 1993 in front of the Candelaria Church, a beautiful colonial Baroque building in downtown Rio de Janeiro. The gravity and particularly striking setting of the massacre led Globo TV, a leading Brazilian broadcaster, to feature it in a TV show called Linha Direta. Importantly, the show included in its narration details about a man who had been suspected of being one of the perpetrators of the massacre but was later acquitted.

Understandably, the man filed a complaint arguing that the inclusion of his personal information in the TV show was causing him severe emotional distress, while also reviving suspicions against him for a crime of which he had already been acquitted many years before. In September 2013, further to Special Appeal No. 1,334,097, the STJ agreed with the plaintiff, establishing the man's "right not to be remembered against his will, specifically with regard to discrediting facts." This is how the RTBF was born in Brazil.

Importantly for our present discussion, this interpretation was not born out of digital technology and does not concern the delisting of specific types of information from search engine results. In Brazilian jurisprudence, the RTBF has been conceived as a general right to effectively limit the publication of certain information. The man featured in the Globo reportage had been acquitted many years before; hence he had a right to be "let alone," as Warren and Brandeis would argue, and not to be remembered for something he had not even committed. The STJ therefore constructed its vision of the RTBF based on Article 5.X of the Brazilian Constitution, which enshrines the fundamental rights to intimacy and preservation of image, two fundamental features of privacy.

Hence, although they utilize the same label, the STJ and the CJEU conceptualize two remarkably different rights when they refer to the RTBF. While both conceptions aim at limiting access to specific types of personal information, the Brazilian conception differs from the EU one on at least three levels.

First, their constitutional foundations. While both conceptions are intimately intertwined with individuals' informational self-determination, the STJ built the RTBF on the protection of privacy, honour and image, whereas the CJEU built it upon the fundamental right to data protection, which in the EU framework is a standalone fundamental right. Conspicuously, in the Brazilian constitutional framework an explicit right to data protection did not exist at the time of the Candelaria case, and it has only been in the process of being recognized since 2020.

Secondly, and consequently, the original goal of the Brazilian conception of the RTBF was not to regulate how a controller should process personal data but rather to protect the private sphere of the individual. In this perspective, the goal of the STJ was not (and could not have been) to regulate the deindexation of specific incorrect or outdated information, but rather to regulate the deletion of "discrediting facts" so that the private life, honour and image of an individual would not be illegitimately violated.

Finally, yet extremely importantly, the fact that an institutional framework dedicated to data protection was simply absent in Brazil at the time of the decision did not allow the STJ the same leeway as the CJEU. The EU Justices enjoyed the privilege of delegating the implementation of the RTBF to search engines because such implementation would receive guidance from, and be subject to the review of, a well-consolidated system of European Data Protection Authorities. At the EU level, DPAs are expected to guarantee a harmonious and consistent interpretation and application of data protection law. At the Brazilian level, a DPA was only established in late 2020 and announced its first regulatory agenda only in late January 2021.

This latter point is far from trivial and is, in the opinion of this author, an essential preoccupation that might have driven the subsequent RTBF conceptualization of the STJ.

The stress-test

The soundness of the Brazilian definition of the RTBF, however, was going to be tested again by the STJ, in the context of another grim and unfortunate page of Brazilian history, the Aida Curi case. This case originated with the sexual assault and subsequent homicide of the young Aida Curi in Copacabana, Rio de Janeiro, on the evening of 14 July 1958. At the time, the case attracted considerable media attention, not only because of its mysterious circumstances and the young age of the victim, but also because the perpetrators tried to disguise the assault by throwing the victim's body from the rooftop of a very high building on the Avenida Atlantica, the fancy avenue right in front of the Copacabana beach.

Needless to say, Globo TV considered the case a perfect story for yet another Linha Direta episode. Aida Curi's relatives, far from enjoying the TV show, sued the broadcaster for moral damages and demanded the full enjoyment of their RTBF, in the Brazilian conception, of course. According to the plaintiffs, it was not conceivable that, almost 50 years after the murder, Globo TV could publicly broadcast personal information about the victim and her family, including the victim's name and address, in addition to unauthorized images, thus bringing back a long-closed and extremely traumatic set of events.

The brothers of Aida Curi claimed reparation from Rede Globo, but the STJ decided that the time that had passed was enough to mitigate the effects of anguish and pain on the dignity of Aida Curi's relatives, while arguing that it was impossible to report the events without mentioning the victim. This decision was appealed by Ms Curi's family members, who demanded, by means of Extraordinary Appeal No. 1,010,606, that the STF recognize "their right to forget the tragedy." It is interesting to note that the way the demand is framed in this Appeal tellingly exemplifies the Brazilian conception of "forgetting" as erasure and a prohibition on divulgation.

At this point, the STF identified in the Appeal an interest in debating the issue "with general repercussion", a peculiar judicial process that the Court can utilize when it recognizes that a given case has particular relevance and transcendence for the Brazilian legal and judicial system. Indeed, the decision of a case with general repercussion does not only bind the parties but also establishes jurisprudence that must be replicated by all lower-level courts.

In February 2021, the STF finally deliberated on the Aida Curi case, establishing that “the idea of ​​a right to be forgotten is incompatible with the Constitution, thus understood as the power to prevent, due to the passage of time, the disclosure of facts or data that are true and lawfully obtained and published in analogue or digital media” and that “any excesses or abuses in the exercise of freedom of expression and information must be analyzed on a case-by-case basis, based on constitutional parameters – especially those relating to the protection of honor, image, privacy and personality in general – and the explicit and specific legal provisions existing in the criminal and civil spheres.”

In other words, what the STF has deemed incompatible with the Federal Constitution is a specific interpretation of the Brazilian version of the RTBF. What is not compatible with the Constitution is to argue that the RTBF allows one to prohibit the publication of true facts, lawfully obtained. At the same time, however, the STF clearly states that it remains possible for any court of law to evaluate, on a case-by-case basis and according to constitutional parameters and existing legal provisions, whether a specific episode allows the use of the RTBF to prohibit the divulgation of information that undermines the dignity, honour, privacy, or other fundamental interests of the individual.

Hence, while explicitly prohibiting the use of the RTBF as a general right to censorship, the STF leaves room for using the RTBF to delist specific personal data in an EU-like fashion, specifying that this must be done with guidance from the Constitution and the law.

What next?

Given the core differences between the Brazilian and EU conceptions of the RTBF highlighted above, it is understandable, in the opinion of this author, that the STF adopted a less proactive and more conservative approach. This must be considered especially in light of the very recent establishment of a data protection institutional system in Brazil.

It is understandable that the STF might have preferred to de facto delegate to the courts the interpretation of when and how the RTBF can rightfully be invoked, according to constitutional and legal parameters. First, in the Brazilian interpretation, the RTBF fundamentally rests on the protection of privacy, i.e. the private sphere of an individual, and, while data protection concerns are acknowledged, they are not the main ground on which the Brazilian conception of the RTBF relies.

It is also understandable that, in a country and a region whose recent history is marked by dictatorships, well-hidden atrocities, and opacity, the social need to remember and shed light on what happened outweighs the legitimate individual interest in prohibiting the circulation of truthful and legally obtained information. In the digital sphere, however, the RTBF quintessentially translates into an extension of informational self-determination, which the Brazilian General Data Protection Law, better known as the "LGPD" (Law No. 13.709/2018), enshrines in its Article 2 as one of the "foundations" of data protection in the country, and whose fundamental character was recently recognized by the STF itself.

In this perspective, it is useful to recall the dissenting opinion of Justice Luiz Edson Fachin in the Aida Curi case, stressing that "although it does not expressly name it, the Constitution of the Republic, in its text, contains the pillars of the right to be forgotten, as it celebrates the dignity of the human person (article 1, III), the right to privacy (article 5, X) and the right to informational self-determination – which was recognized, for example, in the disposal of the precautionary measures of the Direct Unconstitutionality Actions No. 6,387, 6,388, 6,389, 6,390 and 6,393, under the rapporteurship of Justice Rosa Weber (article 5, XII)."

It is the opinion of this author that the Brazilian debate on the RTBF in the digital sphere would be clearer if its dimension as a right to the deindexation of search engine results were to be clearly regulated. It is understandable that the STF did not dare to regulate this, given its interpretation of the RTBF and the very embryonic data protection institutional framework in Brazil. However, given the increasing datafication we are currently witnessing, it would be naïve not to expect that further RTBF claims concerning the digital environment and, specifically, the way search engines process personal data will keep emerging.

The fact that the STF has left the door open to applying the RTBF in the case-by-case analysis of individual claims may reassure the reader regarding the primacy of constitutional and legal arguments in such analysis. It may also lead the reader to wonder, very legitimately, whether such a choice is de facto the most efficient way to deal with a potentially enormous number of claims, and the most coherent one, given the margin of appreciation and interpretation that each different court may have.

An informed debate able to clearly highlight the existing options, and the most efficient and just ways to implement them in the Brazilian context, would be beneficial. This will likely be one of the goals of the upcoming Latin American edition of the Computers, Privacy and Data Protection conference (CPDP LatAm), which will take place in July, entirely online, and will explore the most pressing privacy and data protection issues for Latin American countries.

Photo Credit: “Brasilia – The Supreme Court” by Christoph Diewald is licensed under CC BY-NC-ND 2.0

If you have any questions about engaging with The Future of Privacy Forum on Global Privacy and Digital Policymaking contact Dr. Gabriela Zanfir-Fortuna, Senior Counsel, at [email protected].

FPF announces appointment of Malavika Raghavan as Senior Fellow for India

The Future of Privacy Forum announces the appointment of Malavika Raghavan as Senior Fellow for India, expanding our Global Privacy team into one of the key jurisdictions for the future of privacy and data protection law.

Malavika is a thought leader and a lawyer working on interdisciplinary research focusing on the impacts of digitisation on the lives of lower-income individuals. Her work since 2016 has focused on the regulation and use of personal data in service delivery by the Indian State and private sector actors. She founded and led the Future of Finance Initiative at Dvara Research (an Indian think tank) in partnership with the Gates Foundation from 2016 until 2020, anchoring its research agenda and policy advocacy on emerging issues at the intersection of technology, finance and inclusion. Research that she led at Dvara Research was cited by India's Data Protection Committee in its White Paper as well as in its final report containing proposals for India's draft Personal Data Protection Bill, with specific reliance placed on that research for aspects of regulatory design and enforcement. See Malavika's full bio here.

“We are delighted to welcome Malavika to our Global Privacy team. For the following year, she will be our adviser to understand the most significant developments in privacy and data protection in India, from following the debate and legislative process of the Data Protection Bill and the processing of non-personal data initiatives, to understanding the consequences of the publication of the new IT Guidelines. India is one of the most interesting jurisdictions to follow in the world, for many reasons: the innovative thinking on data protection regulation, the potentially groundbreaking regulation of non-personal data and the outstanding number of individuals whose privacy and data protection rights will be envisaged by these developments, which will test the power structures of digital regulation and safeguarding fundamental rights in this new era”, said Dr. Gabriela Zanfir-Fortuna, Global Privacy lead at FPF. 

We asked Malavika to share her thoughts for FPF's blog on the most significant developments in privacy and digital regulation in India, and on India's role in the global privacy and digital regulation debate.

FPF: What are some of the most significant developments in the past couple of years in India in terms of data protection, privacy, digital regulation?

Malavika Raghavan: "Undoubtedly, the turning point for the privacy debate in India was the 2017 judgement of the Indian Supreme Court in Justice KS Puttaswamy v Union of India. The judgment affirmed the right to privacy as a constitutional guarantee, protected by Part III (Fundamental Rights) of the Indian Constitution. It was also regenerative, bringing our constitutional jurisprudence into the 21st century by re-interpreting timeless principles for the digital age, and casting privacy as a prerequisite for accessing other rights, including the right to life and liberty, to freedom of expression and to equality, given the ubiquitous digitisation of human experience we are witnessing today.

Overnight, Puttaswamy also re-balanced conversations in favour of privacy safeguards, making these equal priorities for builders of digital systems, rather than framing these issues as obstacles to innovation and efficiency. In addition, it challenged the narrative that privacy is an elite construct that only wealthy or privileged people deserve, since many litigants in the original case that had created the Puttaswamy reference were from marginalised groups. Since then, a string of interesting developments has arisen as new cases are reassessing the impact of digital technology on individuals in India, e.g. the boundaries of private-sector data sharing (such as between WhatsApp and Facebook), or the State's use of personal data (as in the case concerning Aadhaar, our national identification system), among others.

Puttaswamy also provided a fillip for a big legislative development: the creation of an omnibus data protection law in India. A bill to create this framework was proposed by a Committee of Experts under the chairmanship of Justice Srikrishna (an ex-Supreme Court judge) and has been making its way through ministerial and Parliamentary processes. There's a large possibility that this law will be passed by the Indian parliament in 2021! Definitely a big development to watch.

FPF: How do you see India’s role in the global privacy and digital regulation debate?

Malavika Raghavan: “India’s strategy on privacy and digital regulation will undoubtedly have global impact, given that India is home to 1/7th of the world’s population! The mobile internet revolution has created a huge impact on our society with millions getting access to digital services in the last couple of decades. This has created nuanced mental models and social norms around digital technologies that are slowly being documented through research and analysis. 

The challenge for policy makers is to create regulations that match these expectations and the realities of Indian users to achieve reasonable, fair regulations. As we have already seen from sectoral regulations (such as those from our Central Bank around cross border payments data flows) such regulations also have huge consequences for global firms interacting with Indian users and their personal data.  

In this context, I think India can have the late-mover advantage in some ways when it comes to digital regulation. If we play our cards right, we can take the best lessons from the experience of other countries in the last few decades and eschew the missteps. More pragmatically, it seems inevitable that India’s approach to privacy and digital regulation will also be strongly influenced by the Government’s economic, geopolitical and national security agenda (both internationally and domestically). 

One thing is for certain: there is no path-dependence. Our legislators and courts are thinking in unique and unexpected ways that are indeed likely to result in a fourth way (as described by the Srikrishna Data Protection Committee’s final report), compared to the approach in the US, EU and China.”

If you have any questions about engaging with The Future of Privacy Forum on Global Privacy and Digital Policymaking contact Dr. Gabriela Zanfir-Fortuna, Senior Counsel, at [email protected].

India: Massive overhaul of digital regulation, with strict rules for take-down of illegal content and automated scanning of online content


On February 25, the Indian Government notified and published the Information Technology (Guidelines for Intermediaries and Digital Media Ethics Code) Rules 2021. These Rules mirror the EU's Digital Services Act (DSA) proposal to some extent: they propose a tiered approach based on the scale of the platform, and they touch on intermediary liability, content moderation, take-down of illegal content from online platforms, and internal accountability and oversight mechanisms. However, they go beyond such rules by adding a Code of Ethics for digital media, similar to the code of ethics classic journalistic outlets must follow, and by proposing an "online content" labelling scheme for content that is safe for children.

The Code of Ethics applies to online news publishers, as well as intermediaries that “enable the transmission of news and current affairs”. This part of the Guidelines (the Code of Ethics) has already been challenged in the Delhi High Court by news publishers this week. 

The Guidelines have raised several types of concerns in India, from their impact on freedom of expression and on the right to privacy (through the automated scanning of content and the imposed traceability of even end-to-end encrypted messages so that the originator can be identified), to the Government's choice to use executive action for such profound changes. The Government, through the two Ministries involved in the process, is scheduled to testify before the Standing Committee on Information Technology of the Parliament on March 15.

New obligations for intermediaries

“Intermediaries” include “websites, apps and portals of social media networks, media sharing websites, blogs, online discussion forums, and other such functionally similar intermediaries” (as defined in rule 2(1)(m)).

Here are some of the most important rules laid out in Part II of the Guidelines, dedicated to Due Diligence by Intermediaries:

“Significant social media intermediaries” have enhanced obligations

"Significant social media intermediaries" are social media services with a number of users above a threshold which will be defined and notified by the Central Government. This concept is similar to the DSA's "Very Large Online Platform" (VLOP); however, the DSA includes clear criteria in the proposed act itself on how to identify a VLOP.

As for "Significant Social Media Intermediaries" in India, they will have additional obligations (similar to how the DSA proposal in the EU scales obligations):

These "Guidelines" seem to have the legal effect of a statute, and they are being adopted through executive action to replace guidelines adopted by the Government in 2011, under powers conferred on it by the Information Technology Act 2000. The new Guidelines would enter into force immediately after publication in the Official Gazette (there is no information as to when publication is scheduled). The Code of Ethics would enter into force three months after publication in the Official Gazette. As mentioned above, there are already challenges in court against parts of these rules.

Get smart on these issues and their impact

Check out these resources: 

Another jurisdiction to keep your eyes on: Australia

Also note that, while the European Union is starting its heavy and slow legislative machine by appointing Rapporteurs in the European Parliament and holding first discussions on the DSA proposal in the relevant working group of the Council, another country is set to adopt digital content rules soon: Australia. The Government there is currently considering an Online Safety Bill, which was open to public consultation until mid-February and which would also include a "modernised online content scheme", creating new classes of harmful online content, as well as take-down requirements for image-based abuse, cyber abuse and harmful content online, requiring removal within 24 hours of receiving a notice from the eSafety Commissioner.

If you have any questions about engaging with The Future of Privacy Forum on Global Privacy and Digital Policymaking contact Dr. Gabriela Zanfir-Fortuna, Senior Counsel, at [email protected].

Russia: New Law Requires Express Consent for Making Personal Data Available to the Public and for Any Subsequent Dissemination

Authors: Gabriela Zanfir-Fortuna and Regina Iminova

Source: Pixabay.Com, by Opsa

Amendments to the Russian general data protection law (Federal Law No. 152-FZ on Personal Data) adopted at the end of 2020 enter into force today (Monday, March 1st), with some of them having their effective date postponed until July 1st. The changes are part of a legislative package that also amends the Criminal Code to criminalize the disclosure of personal data about "protected persons" (several categories of government officials). The amendments to the data protection law introduce consent-based restrictions for any organization or individual that initially makes personal data available to the public, as well as for those that collect and further disseminate personal data that has been distributed on the basis of consent in the public sphere, such as on social media, blogs or any other sources.

The amendments:

The potential impact of the amendments is broad. The new law prima facie affects social media services, online publishers, streaming services, bloggers, or any other entity that might be considered to be making personal data available to "an indefinite number of persons." They now have to collect, and be able to prove they have, separate consent for making personal data publicly available, as well as for further publishing or disseminating personal data allowed by the data subject to be disseminated ("PDD", defined below) which has originally been lawfully published by other parties.

Importantly, the new provisions in the Personal Data Law dedicated to PDD do not include any specific exception for processing PDD for journalistic purposes. The only exception recognized is processing PDD “in the state and public interests defined by the legislation of the Russian Federation”. The Explanatory Note accompanying the amendments confirms that consent is the exclusive lawful ground that can justify dissemination and further processing of PDD and that the only exception to this rule is the one mentioned above, for state or public interests as defined by law. It is thus expected that the amendments might create a chilling effect on freedom of expression, especially when also taking into account the corresponding changes to the Criminal Code.

The new rules seem to be part of a broader effort in Russia to regulate information shared online and available to the public. In this context, it is noteworthy that other amendments, to Law 149-FZ on Information, IT and Protection of Information, solely impacting social media services, were also passed into law in December 2020 and already entered into force on February 1st, 2021. Social networks are now required to monitor content and "restrict access immediately" where users post information about state secrets, justification of terrorism or calls to terrorism, pornography, the promotion of violence and cruelty, obscene language, the manufacturing of drugs, methods of committing suicide, or calls for mass riots.

Below we provide a closer look at the amendments to the Personal Data Law that entered into force on March 1st, 2021. 

A new category of personal data is defined

The new law defines a category of "personal data allowed by the data subject to be disseminated" (PDD), the definition being added as paragraph 1.1 to Article 3 of the Law. This new category of personal data is defined as "personal data to which an unlimited number of persons have access, and which is provided by the data subject by giving specific consent for the dissemination of such data, in accordance with the conditions in the Personal Data Law" (unofficial translation).

The old law had a dedicated provision that referred to how this type of personal data could be lawfully processed, but it was vague and offered almost no details. In particular, Article 6(10) of the Personal Data Law (the provision corresponding to Article 6 GDPR on lawful grounds for processing) provided that processing of personal data is lawful when the data subject gives access to their personal data to an unlimited number of persons. The amendments abrogate this paragraph, before introducing an entirely new article containing a detailed list of conditions for processing PDD only on the basis of consent (the new Article 10.1).

Perhaps in order to avoid misunderstanding on how the new rules for processing PDD fit with the general conditions on lawful grounds for processing personal data, a new paragraph 2 is introduced in Article 10 of the law, which details conditions for processing special categories of personal data, to clarify that processing of PDD “shall be carried out in compliance with the prohibitions and conditions provided for in Article 10.1 of this Federal Law”.

Specific, express, unambiguous and separate consent is required

Under the new law, "data operators" that process PDD must obtain specific and express consent from data subjects to process the personal data, which covers any use or dissemination of the data. Notably, under Russian law, "data operators" designate both controllers and processors in the sense of the General Data Protection Regulation (GDPR), or businesses and service providers in the sense of the California Consumer Privacy Act (CCPA).

Specifically, under Article 10.1(1), the data operator must ensure that it obtains a separate consent dedicated to dissemination, distinct from the general consent to the processing of personal data or any other type of consent. Importantly, "under no circumstances" may an individual's silence or inaction be taken to indicate their consent to the processing of their personal data for dissemination, under Article 10.1(8).

In addition, the data subject must be provided with the possibility to select the categories of personal data which they permit for dissemination. Moreover, the data subject also must be provided with the possibility to establish “prohibitions on the transfer (except for granting access) of [PDD] by the operator to an unlimited number of persons, as well as prohibitions on processing or conditions of processing (except for access) of these personal data by an unlimited number of persons”, per Article 10.1(9). It seems that these prohibitions refer to specific categories of personal data provided by the data subject to the operator (out of a set of personal data, some categories may be authorized for dissemination, while others may be prohibited from dissemination).

If the data subject discloses personal data to an unlimited number of persons without providing to the operator the specific consent required by the new law, then not only the original operator, but all subsequent persons or operators that processed or further disseminated the PDD, bear the burden of proof to “provide evidence of the legality of subsequent dissemination or other processing” under Article 10.1(2). This seems to imply that they must prove consent was obtained for dissemination (a probatio diabolica in this case). According to the Explanatory Note to the amendments, the intention was indeed to shift the burden of proof of the legality of processing PDD from data subjects to data operators, since the Note makes specific reference to the fact that, before the amendments, the burden of proof rested with data subjects.

If the operator does not obtain the separate consent for dissemination of personal data, but other conditions for lawfulness of processing are met, the personal data can be processed by the operator, but without the right to distribute or disseminate it – Article 10.1(4).

A Consent Management Platform for PDD, managed by the Roskomnadzor

The express consent to process PDD can be given directly to the operator or through a special “information system” (which seems to be a consent management platform) of the Roskomnadzor, according to Article 10.1(6). The provisions related to setting up this consent platform for PDD will enter into force on July 1st, 2021. The Roskomnadzor is expected to provide technical details about the functioning of this consent management platform, and guidelines on how it is to be used, in the coming months.

Absolute right to opt-out of dissemination of PDD

Notably, the dissemination of PDD can be halted at any time, on request of the individual, regardless of whether the dissemination is lawful or not, according to Article 12.1(12). This type of request is akin to a withdrawal of consent. The provision includes some requirements for the content of such a request. For instance, it requires the data subject to include contact information and to list the personal data whose dissemination should be terminated. Consent to the processing of the provided personal data is terminated once the operator receives the opt-out request – Article 10.1(13).

A request to opt-out of having personal data disseminated to the public when this is done unlawfully (without the data subject’s specific, affirmative consent) can also be made through a Court, as an alternative to submitting it directly to the data operator. In this case, the operator must terminate the transmission of or access to personal data within three business days from when such demand was received or within the timeframe set in the decision of the court which has come into effect – Article 10.1(14).

A new criminal offense: The prohibition on disclosure of personal data about protected persons

Sharing personal data or information about intelligence officers and their personal property is now a criminal offense under the new rules, which amended the Criminal Code. The law obliges any operators of personal data, including government departments and mobile operators, to ensure the confidentiality of personal information concerning protected persons, their relatives, and their property. Under the new law, “protected persons” include employees of the Investigative Committee, FSB, Federal Protective Service, National Guard, Ministry of Internal Affairs, and Ministry of Defense, as well as judges, prosecutors, investigators, law enforcement officers, and their relatives. Moreover, the list of protected persons can be further detailed by the head of the relevant state body in which the specified persons work.

Previously, the law allowed for the temporary prohibition of the dissemination of personal data of protected persons only in the event of imminent danger in connection with official duties and activities. The new amendments make it possible to take protective measures in the absence of a threat of encroachment on their life, health and property.

What to watch next: New amendments to the general Personal Data Law are on their way in 2021

There are several developments to follow in this fast changing environment. First, at the end of January, the Russian President gave the government until August 1 to create a set of rules for foreign tech companies operating in Russia, including a requirement to open branch offices in the country.

Second, a bill (No. 992331-7) proposing new amendments to the overall framework of the Personal Data Law (No. 152-FZ) was introduced in July 2020 and was the subject of a Resolution passed by the State Duma on February 16, which opened a period for amendments to be submitted until March 16. The bill is on the agenda for a potential vote in May. The changes would expand the possibility of obtaining valid consent through unique identifiers that the law does not currently accept, such as unique online IDs; modify purpose limitation; introduce a possible certification scheme for effective methods of erasing personal data; and give the Roskomnadzor new competences to establish requirements for deidentification of personal data and specific methods for effective deidentification.

If you have any questions on Global Privacy and Data Protection developments, contact Gabriela Zanfir-Fortuna at [email protected]

FPF Experts Take The Stage at the 2025 IAPP Global Privacy Summit

By FPF Communications Intern Celeste Valentino

Earlier this month, FPF participated in the IAPP’s annual Global Privacy Summit (GPS) at the Convention Center in Washington, D.C. The Summit convened top privacy professionals for a week of expert workshops, engaging panel discussions, and exciting networking opportunities on issues ranging from understanding U.S. state and global privacy governance to the future of technological innovation, policy, and professions.

FPF started out the festivities by hosting its annual Spring Social with a night full of great company, engaging discussions, and new connections. A special thank you to our sponsors FTI Consulting, Perkins Coie, Qohash, Transcend, and TrustArc!

The IAPP conference started with FPF Senior Director for U.S. Legislation Keir Lamont, who led an informative workshop, “US State Privacy Crash Course – What Is New and What Is Next” with Lothar Determann (Partner, Baker McKenzie) and David Stauss (Partner, Husch Blackwell). The workshop provided an overview of recent U.S. state privacy legislation developments and a lens into how these laws fit into the existing landscape.

The next day, FPF Senior Fellow Doug Miller hosted an insightful discussion with Jocelyn Aqua (Principal, PwC), providing guidance and tools for privacy professionals to avoid workplace burnout. Both began the discussion by arguing that because privacy professionals face different organizational and positional pressures from other business professionals, they experience varying types of burnout that require alternative remedies. The experts then detailed each kind of burnout and provided solutions for how individuals, teams, and leaders can provide support to avoid them. “Giving your team transparency about a decision gives them control, and feeling better about a decision,” Doug explained, highlighting leaders’ vital role in mitigating workplace burnout. You can find additional resources from Doug’s full presentation here.

Next, FPF Vice President for Global Privacy Gabriela Zanfir-Fortuna moderated a compelling conversation among European policymakers and regulators, including Brando Benifei (Member of European Parliament, co-Rapporteur of the AI Act), John Edwards (Information Commissioner, U.K. Information Commissioner’s Office), and Louisa Specht-Riemenschneider (Federal Commissioner for Data Protection and Freedom of Information, Germany), on Cross-regulatory Cooperation Between Digital Regulators.

Their panel began by painting a detailed portrait of how the proliferation of digital regulations has created a necessity for cross-regulatory collaboration between differing authorities. Using the EU Artificial Intelligence (AI) Act as an example, the panelists argued that the success of cross-regulation hinges on cooperation and knowledge sharing between data protection agencies of different countries. “It’s important to see how the authority of the data protection authority remains relevant and at the center of regulation around AI. One interesting point in the AI Act is that in the Netherlands, there were around 20 authorities appointed as having competence to enforce and regulate to a certain extent under the AI Act; this speaks to how complex the landscape is,” noted Gabriela Zanfir-Fortuna.

The panel also dissected concrete ways regulators can work together to enable cross-regulation, including a mandatory collaboration mechanism, supervisory authorities, and a more unified approach from governments and regulators alike. 

FPF CEO Jules Polonetsky served as a moderator of a timely dialogue among high-ranking leaders, including Kate Charlet (Director, Privacy, Safety, and Security; Government Affairs and Public Policy, Google), Kate Goodloe (Managing Director, Policy, BSA, The Software Alliance), and Amanda Kane Rapp (Head of Legal, U.S. Government, Palantir Technologies), covering tech in an evolving political era. 

The panel highlighted recent and expected shifts in technology, cybersecurity, privacy, AI governance, and online safety within a new U.S. executive administration. Jules opened the panel by asking, “We’ve seen increasing clashes between privacy and competition, privacy and kids’ issues, etc. Has anything changed in the current environment?” The panelists agreed that, regardless of government dynamics, privacy issues remain relevant for technology companies to address to protect and foster trust in the digital ecosystem with consumers. The panel also offered perspective on how tech leaders approach digital governance now and in the future through promoting interoperability, model transparency, and government experimentation and implementation of IT tools and procurement.

On the second day of the conference, FPF Managing Director for Asia-Pacific (APAC) Josh Lee Kok Thong spoke on a panel with Darren Grayson Chng (Regional Data Protection Director, Asia Pacific, Middle East, and Africa, Electrolux), Haksoo Ko (Chairperson, Personal Information Protection Commission, Republic of Korea), and Angela Xu (Senior Privacy Counsel, APAC Head, Google) exploring the nuanced landscape of AI regulation in Asia-Pacific.

The panelists highlighted the differing AI regulatory approaches across the Asia-Pacific region, noting that most APAC jurisdictions have preferred not to enact hard AI laws. Instead, these jurisdictions focus on regulating elements of AI systems, such as the use of personal data (Singapore), addressing risk in AI systems (Australia), promoting industry development (South Korea), fostering international cooperation and responsible AI practices (Japan), government oversight of the deployment of AI systems (India), and regulating misinformation and personal information protection (China). “The APAC region is like a huge experimental lens for AI regulation, with different jurisdictions trying out different approaches, so do pay attention to this region because it will be very influential going forward. There will be increasing diversity and regulation,” Josh noted, providing valuable insight into where audience members should focus their attention.

Throughout the week, FPF’s booth in the Exhibition Hall was a popular stop for IAPP GPS attendees. Policymakers, industry leaders, and privacy scholars stopped by to learn about FPF memberships, connect with FPF staff, and explore FPF’s ongoing work, ranging from the future of regulating AI agents to helping schools defend against deepfakes in the classroom. Visitors left with a collection of infographics, membership resources, and an “I Love Privacy” sticker.

FPF hosted two roundtable discussions early in the week, with Vice President for Global Privacy, Gabriela Zanfir-Fortuna, leading conversations on “Navigating Transatlantic Affairs and the EU-US Digital Regulatory Landscape” and “India’s new Data Protection law and what to expect from its implementation phase.” FPF’s U.S. Legislation team also hosted an event at our D.C. office for members to connect with the team and each other to discuss the U.S. legislative landscape.

FPF also hosted two Privacy Executives Network breakfasts and a lunch during the Summit week, featuring peer-to-peer discussions of top-of-mind issues in data protection, privacy, and AI governance. We discussed the current EU privacy landscape with Commissioner for Data Protection and Chairperson of the Irish Data Protection Commission, Des Hogan, and we spoke with Colorado Attorney General Office’s First Assistant Attorney General, Technology & Privacy Protection Unit, Stevie DeGroff. These roundtable discussions allowed our members to discuss critical topics with one another in a private and dynamic setting.

In partnership with the Mozilla Foundation, we also hosted a PETs Workshop featuring short, expert panels exploring new and emerging applications of privacy-enhancing technologies (PETs). Technology and policy experts presented several leading PETs use cases, analyzed how PETs work with other privacy protections, and discussed how PETs may intersect with data protection rules. This workshop was the first time that several of the use cases were shared in detail with independent experts.

We hope you enjoyed this year’s IAPP Global Privacy Summit as much as we did! If you missed us at our booth, visit FPF.org for all our reports, publications, and infographics. Follow us on X, LinkedIn, Instagram, and YouTube, and subscribe to our newsletter for the latest.

Lessons Learned from FPF “Deploying AI Systems” Workshop

On May 7, 2025, the Future of Privacy Forum (FPF) hosted a “Deploying AI Systems” workshop at the Privacy + Security Academy’s Spring Academy, which took place at The George Washington University in Washington, DC. Workshop participants included students and privacy lawyers from firms, companies, data protection authorities, and regulatory agencies around the world.

Pictured left to right: Daniel Berrick, Anne Bradley, Bret Cohen, Brenda Leong, and Amber Ezzell

The two-part workshop explored the emerging U.S. and global legal requirements for AI deployers, and attendees engaged in exercises involving case studies and demos on managing third-party vendors, agentic AI, and red teaming. The workshop was facilitated by FPF’s Amber Ezzell, Policy Counsel for Artificial Intelligence, who was joined by Anne Bradley (Luminos.AI), Brenda Leong (ZwillGen), Bret Cohen (Hogan Lovells), and Daniel Berrick (FPF).

From the workshop, a few key takeaways emerged:

As organizations, policymakers, and regulators grapple with the rapidly evolving landscape of AI development and deployment, FPF will continue to explore a range of issues at the intersection of AI governance.

If you have any questions, comments, or wish to discuss any of the topics related to the Deploying AI Systems workshop, please do not hesitate to reach out to FPF’s Center for Artificial Intelligence at [email protected].

Amendments to the Montana Consumer Data Privacy Act Bring Big Changes to Big Sky Country

On May 8, Montana Governor Gianforte signed SB 297, amending the Montana Consumer Data Privacy Act (MCDPA). This amendment was sponsored by Senator Zolnikov, who also championed the underlying law’s enactment in 2023. Much has changed in the state privacy law landscape since the MCDPA was first enacted, and SB 297 incorporates elements of further-reaching state laws into the MCDPA while declining to break new ground. For example, SB 297 adopts heightened protections for minors like those in Connecticut and Colorado, as well as privacy notice requirements and a narrowed right of access modeled on Minnesota’s law. The bill does not include an effective date for these new provisions, so by default the amendments should take effect on October 1, 2025.

This blog post highlights the important changes made by SB 297 and some key takeaways about what this means for the comprehensive consumer privacy landscape. Changes to the law include (1) a duty of care with respect to minors, (2) new requirements for processing minors’ personal data, (3) a disclaimer that the law does not require age verification, (4) lowered applicability thresholds and narrowed exemptions, (5) a narrowed right of access that prohibits controllers from disclosing certain sensitive information, (6) expanded privacy notice requirements, and (7) modifications to the law’s enforcement provisions. With these changes, Montana yet again reminds us that privacy remains a bipartisan issue as SB 297, like its underlying law, was passed with overwhelmingly bipartisan votes.

1.  New Connecticut- and Colorado-style duty of care with respect to minors. 

The biggest changes to the MCDPA concern protections for children and teenagers. Like legislation enacted by Connecticut in 2023 and Colorado in 2024, SB 297 amends the MCDPA to add privacy protections for consumers under the age of 18 (“minors”). These new provisions apply more broadly than the rest of the law, covering entities that conduct business in Montana without any small business exceptions (i.e., there are no numerical applicability thresholds, although the law’s entity-level and data-level exemptions still apply). 

Under these new provisions, any controller that offers an online service, product, or feature to a consumer whom the controller actually knows or willfully disregards is a minor must use “reasonable care” to avoid a “heightened risk of harm to minors” caused by the online service, product, or feature (“online service”). Heightened risk of harm to minors is defined as processing a minor’s personal data in a manner that presents a “reasonably foreseeable risk” of: (a) unfair or deceptive treatment of, or unlawful disparate impact on, a minor; (b) financial, physical, or reputational injury; (c) unauthorized disclosure of personal data as a result of a security breach (as described in Mont. Code Ann. § 30-14-1704); or (d) intrusion upon the solitude or seclusion or private affairs or concerns of a minor, whether physical or otherwise, that would be offensive to a reasonable person. This definition largely aligns with some of the existing triggers for conducting a data protection assessment under the MCDPA.

At a time when many youth privacy and online safety bills, such as the California Age-Appropriate Design Code (AADC), are mired in litigation over their constitutionality, it is notable that three states—Connecticut, Colorado, and Montana—have now opted for the framework in SB 297. Given that neither Connecticut’s nor Colorado’s laws have been subject to any constitutional challenges as of yet, this approach could be a more constitutionally resilient way than the AADC model to impose a duty of care with respect to minors. Specifically, the duties of care in Connecticut’s, Colorado’s, and now Montana’s laws are rooted in traditional privacy harms and torts (e.g., intrusion upon seclusion) whereas other frameworks that have been challenged have more amorphous concepts of harm that are more likely to implicate protected speech (e.g., the enjoined California AADC requires addressing whether an online service’s design could harm children by exposing them to “harmful, or potentially harmful, content”). 

2.  Controllers are entitled to a rebuttable presumption of having exercised reasonable care if they comply with statutory requirements.

Under Montana’s new duty of care to minors, a controller is entitled to a rebuttable presumption that it used reasonable care if it complies with certain statutory requirements related to design and personal data processing. With respect to design, controllers are prohibited from using consent mechanisms that are designed to impair user autonomy, they are required to establish easy-to-use safeguards to limit unsolicited communications from unknown adults, and they must provide a signal indicating when they are collecting precise geolocation data. For processing, controllers must obtain a minor’s consent before: (a) Processing a minor’s data for targeted advertising, sale, and profiling in furtherance of decisions that produce legal or similarly significant effects; (b) “us[ing] a system design feature to significantly increase, sustain, or extend a minor’s use of the online service, product, or feature”; or (c) collecting precise geolocation data, unless doing so is “reasonably necessary” to provide the online service, or retaining that data for longer than “necessary” to provide the online service.

Controllers subject to these provisions must also conduct data protection assessments for an online service “if there is a heightened risk of harm to minors.” These data protection assessments must comply with all existing requirements under the MCDPA and must provide additional information such as the online service’s purpose, the categories of personal data processed, and the processing purposes. Data protection assessments should be reviewed “as necessary” to account for material changes, and documentation should be retained until the later of three years after the processing operations cease or the date on which the controller ceases offering the online service. If a controller conducts an assessment and determines that a heightened risk of harm to minors exists, it must “establish and implement a plan to mitigate or eliminate the heightened risk.”

Although the substantive requirements of the protections for minors are similar across Connecticut’s, Colorado’s, and Montana’s laws, these states are not fully aligned with respect to the rebuttable presumption of reasonable care. Montana follows Colorado’s approach, whereby a controller is entitled to the rebuttable presumption if it complies with the processing and design restrictions described above. Connecticut’s law, in contrast, provides that a controller is entitled to the rebuttable presumption of having used reasonable care if the controller complies with the data protection assessment requirements.

3.  The bill clarifies that Montana’s privacy law does not require age verification. 

In addition to adding a duty of care and design and processing restrictions with respect to minors, SB 297 makes a small change to existing adolescent privacy protections. The existing requirement that a controller obtain a consumer’s consent before engaging in targeted advertising or selling personal data for consumers aged 13–15 now applies when a controller willfully disregards the consumer’s age, not just if the controller has actual knowledge of their age. This knowledge standard aligns with that in similar opt-in requirements for adolescents in California, Connecticut, Delaware, New Hampshire, New Jersey, and Oregon. It also aligns with the broader duty of care protections in SB 297, which apply when a controller “actually knows or willfully disregards” that a consumer is a minor. This change may be negligible, however, as the amendment already requires any controller that offers an online service, product, or feature to a consumer whom the controller actually knows or willfully disregards is a minor (under 18) to obtain consent before processing a minor’s data for targeted advertising, sale, and profiling in furtherance of decisions that produce legal or similarly significant effects.

These new protections and the introduction of a “willfully disregards” knowledge standard for minors implicate a broad, contentious policy debate over age verification, the process by which an entity affirmatively determines the age of individual users, often through the collection of personal data. Across the country, courts are litigating the constitutionality of such requirements under other laws. Presumably to head off any such constitutional challenges, SB 297 explicitly provides that nothing in the law shall require a controller to engage in age-verification or age-gating. However, it also provides that if a controller chooses to conduct commercially reasonable age estimation to determine which consumers are minors, then the controller is not liable for erroneous age estimation.

Such a clarification is arguably necessary if “willfully disregards” is implied to require some level of affirmative action on a controller’s part to estimate users’ ages under certain circumstances. For example, the Florida Digital Bill of Rights regulations provide that a controller willfully disregards a consumer’s age if it “should reasonably have been aroused to question whether a consumer was a child and thereafter failed to perform reasonable age verification,” and it incentivizes age verification by providing that a controller will not be found to have willfully disregarded a consumer’s age if it used “a reasonable age verification method with respect to all of its consumers” and determined that the consumer was not a child. Montana takes a different approach, explicitly disclaiming any requirement to engage in age verification, but still incentivizing age estimation. 

4.  Changed applicability requirements expand the law’s reach. 

Owing to Montana’s relatively low population, the MCDPA had the lowest numerical applicability thresholds of any of the state comprehensive privacy laws when the law was enacted in 2023. At that time, prior comprehensive privacy laws in Virginia, Colorado, Utah, Connecticut, Iowa, and Indiana all applied to controllers that either (1) control or process the personal data of at least 100,000 consumers (“the general threshold”), or (2) control or process the personal data of at least 25,000 consumers if the controller derived a certain percentage of its gross revenue from the sale of personal data. Montana broke that mold by lowering the general threshold to 50,000 affected consumers. Several states—Delaware, New Hampshire, Maryland, and Rhode Island—have since surpassed Montana’s low-water mark. Accordingly, SB 297 lowers the law’s applicability thresholds. The law will now apply to controllers that either (1) control or process the personal data of at least 25,000 consumers, or (2) control or process the personal data of at least 15,000 consumers (down from 25,000) if the controller derives at least 25% of gross revenue from the sale of personal data.

Following a broader legislative trend in recent years, this bill also narrows or eliminates several entity-level exemptions. Most notably, the entity-level exemption for financial institutions and affiliates governed by the Gramm-Leach-Bliley Act has been narrowed to a data-level exemption, aligning with the approach taken by Oregon and Minnesota. To counterbalance this change, SB 297 adds new entity-level exemptions for certain chartered banks, credit unions, insurers, and third-party administrators of self-insurance engaged in financial activities. SB 297 also narrows the non-profit exemption to apply only to non-profits that are “established to detect and prevent fraudulent acts in connection with insurance.” Thus, Montana’s law now joins those of Colorado, Oregon, Delaware, New Jersey, Maryland, and Minnesota in broadly applying to non-profits. 

5.  The newly narrowed right to access now prohibits controllers from disclosing certain types of highly sensitive information, such as social security numbers.

The consumer right to access one’s personal data carries a tension between the ability to access the specific data that an entity has collected concerning oneself and the risk that one’s data, especially one’s sensitive data, could be either erroneously or surreptitiously disclosed to a third party or even a bad actor. Responsive to that risk, SB 297 follows Minnesota’s approach by narrowing the right to access to prohibit disclosure of certain types of sensitive data. As amended, a controller now may not, in response to a consumer exercising their right to access their personal data, disclose the following information: social security number; government-issued identification number (including driver’s license number); financial account number; health insurance account number or medical identification number; account passwords, security questions, or answers; or biometric data. If a controller has collected this information, rather than disclosing it, the controller must inform the consumer “with sufficient particularity” that it has collected the information.

SB 297 also slightly expands one of the law’s opt-out rights. Consumers can now opt out of profiling in furtherance of “automated decisions” that produce legal or similarly significant effects, rather than only “solely automated decisions.”

6.  The MCDPA now includes more prescriptive privacy notice requirements.

SB 297 significantly expands the requirements for privacy notices and related disclosures, largely aligning with the more prescriptive provisions in Minnesota’s law. Changes made by SB 297 include—

The law provides that controllers do not need to provide a separate, Montana-specific privacy notice or section of a privacy notice so long as the controller’s general privacy notice includes all information required by the MCDPA. 

7.  The Attorney General now has increased investigatory power.

Finally, SB 297 reworks the law’s enforcement provisions. The amendments build out the Attorney General’s (AG) investigatory powers by allowing the AG to exercise powers provided by the Montana Consumer Protection Act and Unfair Trade Practices laws, to issue civil investigative demands, and to request that controllers disclose any data protection assessments that are relevant to an investigation. Furthermore, the AG is no longer required to offer an opportunity to cure before bringing an enforcement action, in effect closing the cure period six months prior to its previously scheduled expiration date. The statute of limitations is five years after a cause of action accrues.

* * *

Looking to get up to speed on the existing state comprehensive consumer privacy laws? Check out FPF’s 2024 report, Anatomy of State Comprehensive Privacy Law: Surveying the State Privacy Law Landscape and Recent Legislative Trends.

Tags: U.S. Legislation, Youth & Education Privacy

Consent for Processing Personal Data in the Age of AI: Key Updates Across Asia-Pacific

This Issue Brief summarizes key developments in data protection laws across the Asia-Pacific region since 2022, when the Future of Privacy Forum (FPF) and the Asian Business Law Institute (ABLI) published a series of reports examining 14 jurisdictions in the region. We found that while many offer alternative legal bases for data processing, consent remains the most widely used, often due to its familiarity, despite known limitations.

This Issue Brief provides an updated view of evolving consent requirements and alternative legal bases for data processing across key APAC jurisdictions: India, Vietnam, Indonesia, the Philippines, South Korea, and Malaysia.

In August 2023, India passed the Digital Personal Data Protection Act (DPDPA). Once in force, the DPDPA will provide a comprehensive framework for processing personal data. It affirms consent as the primary basis for processing but introduces structured obligations around notice, purpose limitation, and consent withdrawal, while enabling future flexibility for alternative legal bases.

Vietnam’s Decree on Personal Data Protection took effect in July 2023. It sets clearer standards for consent while formally recognizing alternative legal bases, including contractual necessity and legal obligations. This marks a key step in broadening lawful processing options for businesses.

Indonesia’s Personal Data Protection Law (PDPL), enacted in October 2022, introduces a unified national privacy law with an extended transition period. It affirms consent but also allows processing based on legitimate interest, public duties, and contract performance, bringing Indonesia closer to global privacy frameworks.

In November 2023, the Philippines’ National Privacy Commission issued a Circular on Consent, clarifying valid consent standards and promoting transparency. The guidance aims to reduce consent fatigue by encouraging layered, contextual consent interfaces and outlines when consent may not be strictly necessary.

South Korea’s amended PIPA (in force since September 2023) and related guidelines promote easy-to-understand consent practices and recognize additional legal grounds, especially in the context of AI. A 2025 bill under consideration would expand the use of non-consent bases for AI-related processing.

Malaysia’s Personal Data Protection (Amendment) Act 2024, published in October 2024, introduces stronger enforcement tools and administrative penalties. While the amendments do not change the legal bases for processing, they enhance the compliance environment and signal stricter oversight.

The Issue Brief also explores how the rise of AI is driving shifts in lawmaking and policymaking across the region when it comes to lawful grounds for processing personal data.

As the APAC region shifts from fragmented, sector-specific rules to unified legal frameworks, understanding the evolving role of consent and the growing adoption of alternative legal bases is essential. From improving user-friendly consent mechanisms to strengthening enforcement and expanding lawful processing grounds, these changes highlight a more flexible and accountable approach to data protection across the region.

The Curse of Dimensionality: De-identification Challenges in the Sharing of Highly Dimensional Datasets

The 2006 release by AOL of search queries linked to individual users and the re-identification of some of those users is one of the best known privacy disasters in internet history. Less well known is that AOL had released the data to meet intense demand from academic researchers who saw this valuable data set as essential to understanding a wide range of human behavior. 

As the executive appointed AOL’s first Chief Privacy Officer as part of a strategy to help prevent further privacy lapses, I made the benefits as well as the risks of sharing data a priority in my work. At FPF, our teams have worked on every aspect of enabling privacy-safe data sharing for research and social utility, including de-identification1, the ethics of data sharing, privacy-enhancing technologies2 and more3. Despite the skepticism of critics who maintain that reliable de-identification is a myth4, I maintain that it is hard, but that for many data sets it is feasible with the application of significant technical, legal and organizational controls. However, for highly dimensional data sets, or complex data sets that are made public or shared with multiple parties, the ability to provide strong guarantees at scale or without extensive impact on utility is far less feasible.

1. Introduction

The Value and Risk of Search Query Data

Search query logs constitute an unparalleled repository of collective human interest, intent, behavior, and knowledge-seeking activities. As one of the most common activities on the web, searching generates data streams that paint intimate portraits of individual lives, revealing interests, needs, concerns, and plans over time5. This data holds immense potential value for a wide range of applications, including improving search relevance and functionality, understanding societal trends, advancing scientific research (e.g., in public health surveillance or social sciences), developing new products and services, and fueling the digital advertising ecosystem. 

However, the very richness that makes search data valuable also makes it exceptionally sensitive and fraught with privacy risks. Search queries frequently contain explicit personal information such as names, addresses, phone numbers, or passwords, often entered inadvertently by users. Beyond direct identifiers, queries are laden with quasi-identifiers (QIs) – pieces of information that, while not identifying in isolation, can be combined with other data points or external information to single out individuals. These can include searches related to specific locations, niche hobbies, medical conditions, product interests, or unique combinations of terms searched over time. Furthermore, the integration of search engines with advertising networks, user accounts, and other online services creates opportunities for linking search behavior with other extensive user profiles, amplifying the potential for privacy intrusions. The longitudinal nature of search logs, capturing behavior over extended periods, adds another layer of sensitivity, as sequences of queries can reveal evolving life circumstances, intentions, and vulnerabilities. The database reconstruction theorem, referred to as the fundamental law of information reconstruction, posits that publishing too much data derived from a confidential data source, at too high a degree of accuracy, will, after a finite number of queries, certainly result in the reconstruction of the confidential data6. Extensive and extended releases of search data are a model example of this problem.

The De-identification Imperative and Its Inherent Challenges

Faced with the dual imperatives of leveraging valuable data and protecting user privacy, organizations rely heavily on data de-identification. De-identification encompasses a range of techniques aimed at removing or obscuring identifying information from datasets, thereby reducing the risk that the data can be linked back to specific individuals. The goal is to enable data analysis, research, and sharing while mitigating privacy harms and complying with legal and ethical obligations.

Despite its widespread use and appeal, de-identification is far from a perfected solution. Decades of research and numerous real-world incidents have demonstrated that supposedly “de-identified” or “anonymized” data have been re-identified, sometimes with surprising ease. This re-identification potential stems from several factors: the residual information left in the data after processing, the increasing availability of external datasets (auxiliary information) that can be linked to the de-identified data, and the continuous development of sophisticated analytical techniques. In some of these cases, a more rigorous de-identification process could have provided more effective protections, albeit with an impact on the availability of the data needed for analysis. In other cases, the impact of a de-identification failure might “only” be a threat to public figures7. In my experience, expert technical and legal teams can collaborate to support reasonable de-identification efforts for data that is well structured or closely held, but for complex, high-dimensional datasets or data shared broadly, the risks multiply.

Furthermore, the terminology itself is fraught with ambiguity. “De-identification” is often used as a catch-all term, but it can range from simple masking of direct identifiers (which offers weak protection) to more rigorous attempts at achieving true anonymity, where the risk of re-identification is negligible. This ambiguity can foster a false sense of security, as techniques that merely remove names or obvious identifiers have too often been labeled as “de-identified” while still leaving individuals vulnerable. Achieving a state where individuals genuinely cannot be reasonably identified is significantly harder, especially given the inherent trade-off between privacy protection and data utility: more aggressive de-identification techniques reduce re-identification risk but also diminish the data’s value for analysis. The concept of true, irreversible anonymization, where re-identification is effectively impossible, represents a high standard that is particularly challenging to meet for rich behavioral datasets, especially when data is shared with additional parties or made public. For more limited data sets that can be kept private and secure, or shared with extensive controls and legal and technical oversight, effective de-identification that maintains utility while reasonably managing risk can be feasible. This gap between the promise of de-identification and the persistent reality of re-identification risk for rich data sets that are shared lies at the heart of the privacy challenges discussed in this article.

Report Objectives and Structure

This article provides an analysis of the challenges associated with de-identifying massive datasets of search queries. It aims to review the technical, practical, legal, and ethical complexities involved. The analysis will cover:

  1. General De-identification Concepts and Techniques: Defining the spectrum of data protection methods and outlining common technical approaches.
  2. Unique Characteristics of Search Data: Examining the properties of search logs (dimensionality, sparsity, embedded identifiers, longitudinal nature) that make de-identification particularly difficult.
  3. The Re-identification Threat: Reviewing the mechanisms of re-identification attacks and landmark case studies (AOL, Netflix, etc.) where de-identification failed.
  4. Limitations of Techniques: Assessing the vulnerabilities and shortcomings of various de-identification methods when applied to search data.
  5. Harms and Ethics: Identifying the potential negative consequences of re-identification and exploring the ethical considerations surrounding user expectations, transparency, and consent.

The report concludes by synthesizing these findings to summarize the core privacy challenges, risks, and ongoing debates surrounding the de-identification of massive search query datasets.

2. Understanding Data De-identification

To analyze the challenges of de-identifying search queries, it is essential first to establish a clear understanding of the terminology and techniques involved in de-identification. The landscape includes various related but distinct concepts, each carrying different technical implications and legal weight.

Defining the Spectrum: De-identification, Anonymization, Pseudonymization8

The terms used to describe processes that reduce the linkability of data to individuals are often employed inconsistently, leading to confusion. 

Key De-identification Techniques and Mechanisms

A variety of techniques can be employed, often in combination, to achieve different levels of de-identification or anonymization. Each has distinct mechanisms, strengths, and weaknesses:

The following table provides a comparative overview of these techniques:

Table 1: Comparison of Common De-identification Techniques

Technique | Mechanism | Primary Goal | Key Strengths | Key Weaknesses/Limitations | Applicability to Search Logs
Suppression/Redaction | Remove specific values or records | Remove specific identifiers/sensitive data | Simple; effective for targeted removal | High utility loss if applied broadly; does not address linkage via remaining data | Low (insufficient alone; high utility loss for QIs)
Masking | Obscure parts of data values (e.g., XXXX) | Obscure direct identifiers | Simple; preserves format | Limited privacy protection; can reduce utility; hard for free text | Low (insufficient for QIs in queries)
Generalization | Replace specific values with broader categories | Reduce identifiability via QIs | Basis for k-anonymity | Significant utility loss, especially in high dimensions (“curse of dimensionality”) | Low (requires extreme generalization, destroying query meaning)
Aggregation | Combine data into summary statistics | Hide individual records | Simple; useful for high-level trends | Loses individual detail; vulnerable to differencing attacks; low utility for user-level analysis | Low (loses essential query sequence/context)
Noise Addition | Add random values to data/results | Obscure true values; enable DP | Basis for DP; provable guarantees possible | Reduces accuracy/utility; requires careful calibration | Low (core of DP, but the utility trade-off is the key challenge; application to non-numeric fields like query text uncertain)
Swapping | Exchange values between records | Preserve aggregates while perturbing records | Maintains marginal distributions | Introduces record-level inaccuracies; complex implementation; limited privacy guarantee | Low (disrupts relationships within user history)
Hashing (Salted) | Apply one-way function with unique salt per record | Create non-reversible identifiers | Can prevent simple lookups if salted properly | Vulnerable if salt/key is compromised; does not prevent linkage if the hash is used as a QI | Low (hash of query text loses semantics; hash of user ID is just pseudonymization)
Pseudonymization | Replace identifiers with artificial codes | Allow tracking/linking without direct IDs | Enables longitudinal analysis; reversible | Still personal data; high risk of pseudonym reversal/linkage; QIs remaining in the data set create major risks | Low (allows user tracking, but privacy relies on pseudonym security/unlinkability)
k-Anonymity | Ensure each record is indistinguishable among at least k others based on QIs | Prevent linkage via QIs | Intuitive concept | Fails in high dimensions; high utility loss; vulnerable to homogeneity/background attacks; not compositional | Medium (impractical due to data characteristics)
l-Diversity / t-Closeness | k-Anonymity variants adding sensitive-attribute constraints | Prevent attribute disclosure within k-groups | Stronger attribute protection than k-anonymity | Inherits k-anonymity issues; adds complexity; further utility reduction | Low (impractical due to k-anonymity’s base failure)
Differential Privacy (DP) | Mathematical framework limiting inference about individuals via noise | Provable privacy guarantee against inference/linkage | Strongest theoretical guarantees; composable; robust to auxiliary info | Utility/accuracy trade-off; implementation complexity; can be hard for complex queries | Low (theoretically strongest, but practical utility for granular search data is a major hurdle)
Synthetic Data | Generate artificial data mimicking original statistics | Provide utility without real records | Can avoid direct disclosure of real data | Hard to ensure utility and privacy simultaneously; risk of memorization/inference if the model overfits; bias amplification | Medium (promising but technically demanding for complex behavioral data like search; research still early)

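To make a couple of the mechanisms in the table above concrete, the following minimal sketch (with made-up values, not any operator's actual pipeline) shows salted hashing used as pseudonymization and Laplace noise addition, the basic building block of differential privacy. The identifiers, counts, and epsilon value are assumptions chosen purely for illustration.

```python
# Minimal sketch of two techniques from Table 1, using illustrative values only.
import hashlib
import os
import numpy as np

rng = np.random.default_rng(7)

# --- Salted hashing (pseudonymization) ---
# A secret random salt prevents simple dictionary lookups of the hash, but the
# output is still a stable pseudonym: records remain linkable to one another.
def salted_hash(user_id: str, salt: bytes) -> str:
    return hashlib.sha256(salt + user_id.encode("utf-8")).hexdigest()

salt = os.urandom(16)                      # kept secret by the operator
pseudonym = salted_hash("user-12345", salt)

# --- Noise addition (Laplace mechanism, the building block of DP) ---
# For a count query with sensitivity 1, adding Laplace(1/epsilon) noise yields
# an epsilon-differentially private release of that single count.
def dp_count(true_count: int, epsilon: float) -> float:
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

print(pseudonym[:16], round(dp_count(true_count=4213, epsilon=0.5)))
```

As the table notes, neither step alone addresses the quasi-identifiers embedded in query text itself; the sketch only illustrates the mechanics of the two techniques.
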
3. The Unique Nature and Privacy Sensitivity of Search Query Data

Search query data possesses several intrinsic characteristics that make it particularly challenging to de-identify effectively while preserving its analytical value. These properties distinguish it from simpler, structured datasets often considered in introductory anonymization examples.

High Dimensionality, Sparsity, and the “Curse of Dimensionality”

Search logs are inherently high-dimensional datasets. Each interaction potentially captures a multitude of attributes associated with a user or session: the query terms themselves, the timestamp of the query, the user’s IP address (providing approximate location), browser type and version, operating system, language settings, cookies or other identifiers linking sessions, the rank of clicked results, the URL or domain of clicked results, and potentially other contextual signals. When viewed longitudinally, the sequence of these interactions adds further dimensions representing temporal patterns and evolving interests.

Simultaneously, individual user data within this high-dimensional space is typically very sparse. Any single user searches for only a tiny fraction of all possible topics or keywords, clicks on a minuscule subset of the web’s pages, and exhibits specific patterns of activity at particular times17.

This combination of high dimensionality and sparsity poses a fundamental challenge known as the “curse of dimensionality18” in the context of data privacy. In high-dimensional spaces, data points tend to become isolated; the concept of a “neighbor” or “similar record” becomes less meaningful because points are likely to differ across many dimensions19. Consequently, even without explicit identifiers, the unique combination of attributes and behaviors across many dimensions can act as a distinct “fingerprint” for an individual user. This uniqueness makes re-identification through linkage or inference significantly easier.

The curse of dimensionality challenges traditional anonymization techniques like k-anonymity20. Since k-anonymity relies on finding groups of at least k individuals who are identical across all quasi-identifying attributes, the sparsity and uniqueness inherent in high-dimensional search data make finding such groups highly improbable without resorting to extreme measures. To force records into equivalence classes, one would need to apply such broad generalization (e.g., reducing detailed query topics to very high-level categories) or suppress so much data that the resulting dataset loses significant analytical value. 
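
A small synthetic simulation can illustrate the point. The attribute counts, skew, and population size below are arbitrary assumptions chosen only to show how quickly uniqueness grows with the number of quasi-identifying dimensions; they are not measurements from real search logs.

```python
# Toy illustration (synthetic data) of why k-anonymity breaks down as the
# number of quasi-identifying dimensions grows: the share of records that are
# unique on their QI combination rises quickly.
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)
n_users = 100_000

for n_dims in (2, 5, 10, 20):
    # Each dimension is a coarse categorical attribute with 10 possible values,
    # drawn with a skewed distribution to mimic sparse, long-tailed behavior.
    probs = 1.0 / np.arange(1, 11)
    probs /= probs.sum()
    data = rng.choice(10, size=(n_users, n_dims), p=probs)
    counts = Counter(map(tuple, data))
    unique_share = sum(1 for c in counts.values() if c == 1) / n_users
    print(f"{n_dims:2d} QI dimensions -> {unique_share:.1%} of records are unique")
```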

Implicit Personal Identifiers and Quasi-Identifiers in Queries

Beyond the metadata associated with a search (IP address, timestamp, etc.), the content of the search queries themselves is a major source of privacy risk. First, users frequently, though often unintentionally, include direct personal information within their search queries. This could be their own name, address, phone number, email address, social security number, account numbers, or similar details about others. The infamous AOL search log incident provided stark evidence of this, where queries directly contained names and location information that facilitated re-identification. Second, and perhaps more pervasively, search queries are rich with quasi-identifiers (QIs). These are terms, phrases, or concepts that, while not uniquely identifying on their own, become identifying when combined with each other or with external auxiliary information. Examples abound in the search context:

The challenge lies in the unstructured, free-text nature of search queries. Unlike structured databases where QIs like date of birth, gender, and ZIP code often reside in well-defined columns, the QIs in search queries are embedded within the semantic meaning and contextual background of the text string itself. Identifying and removing or generalizing all such potential QIs automatically is an extremely difficult task, particularly when done at large scale and by automated means. Standard natural language processing techniques might identify common entities like names or locations, but would struggle with the vast range of potentially identifying combinations and context-dependent sensitivities. Users may also enter passwords or unique coded URLs of private documents, which automated redaction cannot reliably recognize. This inherent difficulty in scrubbing QIs from unstructured query text makes search data significantly harder to de-identify reliably compared to structured data.
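
A toy sketch of rule-based scrubbing illustrates the gap. The regex patterns and sample queries below are invented for illustration and are not a production redaction pipeline: obvious direct identifiers are caught, while the kinds of location and neighborhood queries that re-identified the AOL user discussed later pass through untouched.

```python
# Minimal sketch of rule-based query scrubbing and its limits (illustrative only).
import re

PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scrub(query: str) -> str:
    # Replace each matched direct identifier with a placeholder label.
    for label, pattern in PATTERNS.items():
        query = pattern.sub(f"[{label}]", query)
    return query

queries = [
    "reset password for jane.doe@example.com",     # direct identifier: masked
    "landscapers in lilburn ga",                    # contextual QI: untouched
    "homes sold in shadow lake subdivision",        # highly specific: untouched
]
for q in queries:
    print(scrub(q))
```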

Temporal Dynamics and Longitudinal Linkability

Search logs are not static snapshots; they are longitudinal records capturing user behavior as it unfolds over time. A user’s search history represents a sequence of actions, reflecting evolving interests, ongoing tasks, changes in location, and shifts in life circumstances. This temporal dimension adds significant identifying power beyond that of individual, isolated queries.

Even if session-specific identifiers like cookies are removed or periodically changed, the continuity of a user’s behavior can allow for linking queries across different sessions or time periods. Consistent patterns (e.g., regularly searching for specific technical terms related to one’s profession), evolving interests (e.g., searches related to pregnancy progressing over months), or recurring needs (e.g., checking commute times) can serve as anchors to connect seemingly disparate query records back to the same individual. The sequence itself becomes a quasi-identifier.  This poses a significant challenge for de-identification. Techniques applied cross-sectionally—treating each query or session independently—may fail to protect against longitudinal linkage attacks that exploit these behavioral trails. Effective de-identification of longitudinal data requires considering the entire user history, or at least sufficiently long windows of activity, to assess and mitigate the risk of temporal linkage. This inherently increases the complexity of the de-identification process and potentially necessitates even greater data perturbation or suppression to break these temporal links, further impacting utility. Anonymization techniques that completely sever links between records over time would prevent valuable longitudinal analysis altogether.
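
The longitudinal linkage risk can be sketched with a toy example: two invented sets of pseudonymous query histories from different time windows, where overlap in rare terms is enough to re-link them. The identifiers, queries, and scoring heuristic are all assumptions for illustration, not a description of any real attack tool.

```python
# Toy sketch: re-linking rotated pseudonyms across time windows via rare terms.
from collections import Counter

period_1 = {"anon-A": {"gwinnett county zoning", "numb fingers", "sourdough starter"},
            "anon-B": {"nba scores", "weather", "cheap flights"}}
period_2 = {"anon-X": {"gwinnett county zoning", "numb fingers", "tax deadline"},
            "anon-Y": {"nba scores", "weather", "movie times"}}

def link(periods_a, periods_b):
    # Weight overlapping terms by rarity: terms seen in few histories are the
    # strongest anchors for linkage.
    term_freq = Counter(t for hist in list(periods_a.values()) + list(periods_b.values()) for t in hist)
    links = {}
    for id_a, hist_a in periods_a.items():
        scores = {id_b: sum(1.0 / term_freq[t] for t in hist_a & hist_b)
                  for id_b, hist_b in periods_b.items()}
        links[id_a] = max(scores, key=scores.get)
    return links

print(link(period_1, period_2))   # {'anon-A': 'anon-X', 'anon-B': 'anon-Y'}
```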

The Uniqueness and Re-identifiability Potential of Search Histories

The combined effect of high dimensionality, sparsity, embedded quasi-identifiers, and temporal dynamics results in search histories that are often highly unique to individual users. Research has repeatedly shown that even limited sets of behavioral data points can uniquely identify individuals within large populations. Latanya Sweeney’s seminal work demonstrated that 87% of the US population could be uniquely identified using just three quasi-identifiers: 5-digit ZIP code, gender, and full date of birth21. Search histories contain far more dimensions and potentially identifying attributes than this minimal set.
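
A rough back-of-envelope calculation suggests why so few attributes suffice; the figures for ZIP codes, birth cohorts, and population below are approximations assumed only to show the order of magnitude, not exact census values.

```python
# Back-of-envelope arithmetic (approximate figures): the (ZIP, gender, DOB)
# attribute space dwarfs the US population, so most occupied combinations
# contain a single person.
zip_codes   = 42_000        # roughly the number of 5-digit US ZIP codes (approximate)
genders     = 2
birth_dates = 79 * 365      # ~79 birth-year cohorts of daily dates (approximate)

cells = zip_codes * genders * birth_dates
population = 330_000_000
print(f"{cells:,} possible combinations vs {population:,} people "
      f"-> ~{cells / population:.0f}x more cells than people")
```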

Studies on analogous high-dimensional behavioral datasets confirm this potential for uniqueness and re-identification. The successful de-anonymization of Netflix users based on a small number of movie ratings linked to public IMDb profiles is a prime example. Similarly, research has shown high re-identification rates for mobile phone location data and credit card transactions, purely based on the patterns of activity. Su and colleagues showed that de-identified web browsing histories can be linked to social media profiles using only publicly available data22. Given that search histories encapsulate a similarly rich and diverse set of user actions and interests over time, it is highly probable that many users possess unique or near-unique search “fingerprints” even after standard de-identification techniques (like removing IP addresses and user IDs) are applied. This inherent uniqueness makes search logs exceptionally vulnerable to re-identification, particularly through linkage attacks that correlate the de-identified search patterns with other available data sources. The simple assumption that removing direct identifiers is sufficient to protect privacy is demonstrably false for this type of rich, behavioral data. The very detail that makes search logs valuable for understanding behavior also makes them inherently difficult to anonymize effectively.

4. The Re-identification Threat: Theory and Practice

The potential for re-identification is not merely theoretical; it is a practical threat demonstrated through various attack methodologies and real-world incidents. Understanding these mechanisms is crucial for appreciating the limitations of de-identification for search query data.

Mechanisms of Re-identification: Linkage, Inference, and Reconstruction Attacks

Re-identification attacks exploit residual information in de-identified data or leverage external knowledge to uncover identities or sensitive attributes. Key mechanisms include:

The threat landscape for re-identification is diverse and evolving. While linkage attacks relying on external data remain a primary concern, inference and reconstruction attacks, potentially powered by advanced AI/ML techniques, pose growing risks even to datasets processed with sophisticated methods. This necessitates robust privacy protections that anticipate a wide range of potential attack vectors.

Landmark Case Study: The AOL Search Log Release (2006)

In August 2006, AOL publicly released a dataset containing approximately 20 million search queries made by over 650,000 users during a three-month period. The data was intended for research purposes and was presented as “anonymized.” The primary anonymization step involved replacing the actual user identifiers with arbitrary numerical IDs. However, the dataset retained the raw query text, query timestamps, and information about clicked results (rank and domain URL). Later statements suggest IP address and cookie information were also altered, though potentially insufficiently.

The attempt at anonymization failed dramatically and rapidly. Within days, reporters Michael Barbaro and Tom Zeller Jr. of The New York Times were able to re-identify one specific user, designated “AOL user No. 4417749,” as Thelma Arnold, a 62-year-old widow living in Lilburn, Georgia23. They achieved this by analyzing the sequence of queries associated with her user number. The queries contained a potent mix of quasi-identifiers, including searches for “landscapers in Lilburn, Ga,” searches for individuals with the surname “Arnold,” and searches for “homes sold in shadow lake subdivision gwinnett county georgia,” alongside other personally revealing (though not directly identifying) queries like “numb fingers,” “60 single men,” and “dog that urinates on everything.” The combination of these queries created a unique pattern easily traceable to Ms. Arnold through publicly available information.

The AOL incident became a watershed moment in data privacy. It starkly demonstrated several critical points relevant to search data de-identification:

  1. Removing explicit user IDs is fundamentally insufficient when the underlying data itself contains rich identifying information.
  2. Search queries, even seemingly innocuous ones, are laden with Personally Identifiable Information (PII) and powerful quasi-identifiers embedded in the text.
  3. The temporal sequence of queries provides crucial context and significantly increases identifiability.
  4. Linkage attacks using query content combined with publicly available information are feasible and effective.
  5. Simple anonymization techniques fail to account for the identifying power of combined attributes and behavioral patterns.

The incident led to significant public backlash, the resignation of AOL’s CTO, and a class-action lawsuit. It remains a canonical example of the pitfalls of naive de-identification and the unique sensitivity of search query data.

Landmark Case Study: The Netflix Prize De-anonymization (2007-2008)

In 2006, Netflix launched a public competition, the “Netflix Prize,” offering $1 million to researchers who could significantly improve the accuracy of its movie recommendation system. To facilitate this, Netflix released a large dataset containing approximately 100 million movie ratings (1-5 stars, plus date) from nearly 500,000 anonymous subscribers, collected between 1998 and 2005. User identifiers were replaced with random numbers, and any other explicit PII was removed.

In 2007, researchers Arvind Narayanan and Vitaly Shmatikov published a groundbreaking paper demonstrating how this supposedly anonymized dataset could be effectively de-anonymized24. Their attack relied on linking the Netflix data with a publicly available auxiliary dataset: movie ratings posted by users on the Internet Movie Database (IMDb).

They developed statistical algorithms that could match users across the two datasets based on shared movie ratings and the approximate dates of those ratings. Their key insight was that while many users might rate popular movies similarly, the combination of ratings for less common movies, along with the timing, created unique signatures. They showed that an adversary knowing only a small subset (as few as 2, but more reliably 6-8) of a target individual’s movie ratings and approximate dates could, with high probability, uniquely identify that individual’s complete record within the massive Netflix dataset. Their algorithm was robust to noise, meaning the adversary’s knowledge didn’t need to be perfectly accurate (e.g., dates could be off by weeks, ratings could be slightly different).

Narayanan and Shmatikov successfully identified the Netflix records corresponding to several non-anonymous IMDb users, thereby revealing their potentially private Netflix viewing histories, including ratings for sensitive or politically charged films that were not part of their public IMDb profiles.
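The core matching idea can be conveyed with a simplified scoring function in the spirit of their approach; this is not the published algorithm, and the weighting, tolerances, and data below are illustrative assumptions. Rare items matched within a loose date window contribute most to the score, and the highest-scoring candidate record is returned.

```python
import math
from datetime import date

def match_score(aux_ratings, candidate_ratings, popularity, date_tolerance_days=14):
    """aux_ratings / candidate_ratings: dict movie_id -> (stars, date).
    popularity: dict movie_id -> number of raters; rare titles get more weight."""
    score = 0.0
    for movie, (stars, when) in aux_ratings.items():
        if movie not in candidate_ratings:
            continue
        c_stars, c_when = candidate_ratings[movie]
        if abs((when - c_when).days) <= date_tolerance_days and abs(stars - c_stars) <= 1:
            score += 1.0 / math.log(1 + popularity.get(movie, 1))
    return score

def best_candidate(aux_ratings, dataset, popularity):
    """dataset: dict pseudonymous_user_id -> ratings dict. Returns the top-scoring ID."""
    scores = {uid: match_score(aux_ratings, recs, popularity) for uid, recs in dataset.items()}
    return max(scores, key=scores.get)

popularity = {"obscure_documentary": 12, "blockbuster": 480_000}
dataset = {
    "user_0412": {"obscure_documentary": (5, date(2005, 3, 2)), "blockbuster": (4, date(2004, 7, 1))},
    "user_9981": {"blockbuster": (4, date(2004, 7, 3))},
}
aux = {"obscure_documentary": (5, date(2005, 3, 9))}  # e.g., scraped from a public profile
print(best_candidate(aux, dataset, popularity))  # -> "user_0412"
```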

The Netflix Prize de-anonymization study had significant implications:

  1. It demonstrated the vulnerability of high-dimensional, sparse datasets (characteristic of much behavioral data, including search logs) to linkage attacks.
  2. It proved that even seemingly non-sensitive data (movie ratings) can become identifying when combined with auxiliary information.
  3. It highlighted the inadequacy of simply removing direct identifiers and replacing them with pseudonyms when dealing with rich datasets.
  4. It underscored the power of publicly available auxiliary data in undermining anonymization efforts.

The research led to a class-action lawsuit against Netflix alleging privacy violations and the subsequent cancellation of a planned second Netflix Prize competition due to privacy concerns raised by the Federal Trade Commission (FTC). It remains a pivotal case study illustrating the fragility of anonymization for behavioral data.

Other Demonstrations of Re-identification Across Data Types

The AOL and Netflix incidents are not isolated cases. Numerous studies and breaches have demonstrated the feasibility of re-identifying individuals from various types of supposedly de-identified data, reinforcing the systemic nature of the challenge, especially for rich, individual-level records.

The following table summarizes some of these key incidents:

Table 2: Summary of Notable Re-identification Incidents

| Incident Name/Year | Data Type | “Anonymization” Method Used | Re-identification Method | Auxiliary Data Used | Key Finding/Significance |
| --- | --- | --- | --- | --- | --- |
| MA Governor Weld (1990s) | Hospital Discharge Data | Removal of direct identifiers (name, address, SSN) | Linkage Attack | Public Voter Registration List (ZIP, DoB, Gender) | Early demonstration that QIs in supposedly de-identified data allow linkage to identified data. |
| AOL Search Logs (2006) | Search Queries | User ID replaced with number; query text, timestamps retained | Linkage/Inference from Query Content | Public knowledge, location directories | Search queries themselves contain rich PII/QIs enabling re-identification. Simple ID removal is insufficient. |
| Netflix Prize (2007-8) | Movie Ratings (user, movie, rating, date) | User ID replaced with number | Linkage Attack | Public IMDb User Ratings | High-dimensional, sparse behavioral data is vulnerable. Small amounts of auxiliary data can enable re-identification. |
| NYC Taxis (2014) | Taxi Trip Records (incl. hashed medallion/license) | Weak (MD5) hashing of identifiers | Pseudonym Reversal (hash cracking) | Knowledge of hashing algorithm | Poorly chosen pseudonymization (weak hashing) is easily reversible. |
| Australian Health Records (MBS/PBS) (2016) | Medical Billing Data | Claimed de-identification (details unclear) | Linkage Attack | Publicly available information (e.g., birth year, surgery dates) | Government-released health data, claimed anonymous, was re-identifiable. |
| Browsing History / Social Media | Web Browsing History | Assumed de-identified (focus on linking) | Linkage Attack | Social Media Feeds (e.g., Twitter) | Unique patterns of link clicking in browsing history mirror unique social feeds, enabling linkage. |
| Genomic Beacons (various studies) | Aggregate Genomic Data (allele presence/absence) | Query interface limits information release | Membership Inference Attack (repeated queries, linkage) | Individual’s genome sequence, genealogical databases | Even aggregate or restricted-query genomic data can leak membership information. |
| Credit Card Data (de Montjoye et al. 2015) | Transaction Records (merchant, time, amount) | Assumed de-identified | Uniqueness Analysis / Linkage | (Implicit) external knowledge correlating purchases/locations | Sparse transaction data is highly unique; few points needed for re-identification. |
| Location Data (various studies) | Mobile Phone Location Traces | Various (often simple ID removal or aggregation) | Uniqueness Analysis / Linkage Attack | Maps, points of interest, public records | Human mobility patterns are highly unique; location data is easily re-identifiable. |

These examples collectively illustrate that re-identification is not a niche problem confined to specific data types but a systemic risk inherent in sharing or releasing granular data about individuals, especially when that data captures complex behaviors over time or across multiple dimensions. Search query logs share many characteristics with these vulnerable datasets (high dimensionality, sparsity, behavioral patterns, embedded QIs, longitudinal nature), strongly suggesting they face similar, if not greater, re-identification risks.
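One failure mode from the table, pseudonym reversal of weakly hashed identifiers (the NYC taxi case), is simple enough to demonstrate directly. The sketch below assumes a hypothetical, simplified medallion format; the point is only that a small, known identifier space can be exhaustively hashed and inverted.

```python
import hashlib
import string

def build_reverse_table(candidate_ids):
    """Precompute md5(id) -> id for every plausible identifier."""
    return {hashlib.md5(c.encode()).hexdigest(): c for c in candidate_ids}

# Hypothetical medallion format: digit, letter, digit, digit (~26,000 possibilities).
candidates = (f"{d1}{letter}{d2}{d3}"
              for d1 in string.digits
              for letter in string.ascii_uppercase
              for d2 in string.digits
              for d3 in string.digits)
reverse_table = build_reverse_table(candidates)

observed_pseudonym = hashlib.md5("5A23".encode()).hexdigest()  # a "pseudonym" from a release
print(reverse_table[observed_pseudonym])  # -> "5A23": the identifier is trivially recovered
```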

The Critical Role of Auxiliary Information

A recurring theme across nearly all successful re-identification demonstrations is the crucial role played by auxiliary information. This refers to any external data source or background knowledge an attacker possesses or can obtain about individuals, which can then be used to bridge the gap between a de-identified record and a real-world identity.

The sources of auxiliary information are vast and continuously expanding in the era of Big Data: public records such as voter registration lists, social media profiles and feeds, publicly posted ratings and reviews, genealogical databases, maps and points of interest, and other previously released datasets can all supply the background knowledge needed for linkage.

The critical implication is that the privacy risk associated with a de-identified dataset cannot be assessed in isolation. Its vulnerability depends heavily on the external data ecosystem and what information might be available for linkage. De-identification performed today might be broken tomorrow as new auxiliary data sets become available or linkage techniques improve. This makes robust anonymization a moving target. Any assessment of re-identification risk must therefore be contextual, considering the specific data being released, the intended recipients or release environment, and the types of auxiliary information reasonably available to potential adversaries. Relying solely on removing identifiers without considering this broader context creates a fragile and likely inadequate privacy protection strategy.

5. Limitations of De-identification Techniques on Search Data

Given the unique characteristics of search query data and the demonstrated power of re-identification attacks, it is essential to critically evaluate the limitations of specific de-identification techniques when applied to this context.

The Fragility of k-Anonymity in High-Dimensional, Sparse Data

As established in Section 3.1, k-anonymity aims to protect privacy by ensuring that any individual record in a dataset is indistinguishable from at least k-1 other records based on their quasi-identifier (QI) values. This is typically achieved through generalization (making QI values less specific) and suppression (removing records or values).

However, k-anonymity proves fundamentally ill-suited for high-dimensional and sparse datasets like search logs. The core problem lies in the “curse of dimensionality”:

  1. Uniqueness: In datasets with many attributes (dimensions), individual records tend to be unique or nearly unique across the combination of those attributes. Finding k search users who have matching patterns across numerous QIs (specific query terms, timestamps, locations, click behavior, etc.) is highly improbable.
  2. Utility Destruction: To force records into equivalence classes of size k, massive amounts of generalization or suppression are required. Generalizing query terms might mean reducing specific searches like “side effects of lisinopril” to a broad category like “health query,” destroying the semantic richness crucial for analysis. Suppressing unique or hard-to-group records could eliminate vast portions of the dataset. This results in an unacceptable level of information loss, potentially rendering the data useless for its intended purpose.
  3. Vulnerability to Attacks: Even if k-anonymity is technically achieved, it remains vulnerable. The homogeneity attack occurs if all k records in a group share the same sensitive attribute (e.g., all searched for the same sensitive topic), revealing that attribute for anyone linked to the group. Background knowledge attacks can allow adversaries to further narrow down possibilities within a group.

Refinements like l-diversity and t-closeness attempt to address attribute disclosure vulnerabilities by requiring diversity or specific distributional properties for sensitive attributes within each group. However, they inherit the fundamental problems of k-anonymity regarding high dimensionality and utility loss, while adding implementation complexity. Furthermore, k-anonymity lacks robust compositionality; combining multiple k-anonymous releases does not guarantee privacy. Therefore, k-anonymity and its derivatives face challenges when used for de-identifying massive, complex search logs. They force difficult choices between retaining minimal utility or providing inadequate privacy protection against linkage and inference attacks.
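The generalization trade-off can be seen in a few lines of code. The sketch below (columns and records are invented for illustration) measures the k actually achieved by a set of quasi-identifiers before and after generalization, showing that reaching even k = 2 here requires discarding the query-level detail.

```python
import pandas as pd

def achieved_k(df: pd.DataFrame, quasi_identifiers: list) -> int:
    """The k actually achieved = size of the smallest equivalence class."""
    return int(df.value_counts(subset=quasi_identifiers).min())

logs = pd.DataFrame({
    "zip5":  ["30047", "30046", "94110", "94112"],
    "age":   [62, 64, 35, 36],
    "query": ["side effects of lisinopril", "knee pain", "flu shot", "flu shot near me"],
})

print(achieved_k(logs, ["zip5", "age"]))  # raw quasi-identifiers: k = 1, every record is unique

# Generalize: ZIP -> 3-digit prefix, age -> decade band, query -> one broad topic.
generalized = pd.DataFrame({
    "zip3":     logs["zip5"].str[:3],
    "age_band": (logs["age"] // 10) * 10,
    "topic":    ["health"] * len(logs),   # the semantic detail analysts need is gone
})
print(achieved_k(generalized, ["zip3", "age_band"]))  # k = 2, but only after destroying utility
```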

Differential Privacy: The Utility-Privacy Trade-off and Implementation Hurdles

Differential Privacy (DP) offers a fundamentally different approach, providing mathematically rigorous, provable privacy guarantees29. Instead of modifying data records directly to achieve indistinguishability, DP focuses on the output of computations (queries, analyses, models) performed on the data. It ensures that the result of any computation is statistically similar whether or not any single individual’s data is included in the input dataset. This is typically achieved by adding carefully calibrated random noise to the computation’s output.

DP’s strengths are significant: its guarantees hold regardless of an attacker’s auxiliary knowledge, and privacy loss (quantified by ε and δ) composes predictably across multiple analyses. However, applying DP effectively to massive search logs presents substantial challenges:

  1. Applicability to Complex Queries and Data Types: DP is well-understood for basic aggregate queries (counts, sums, averages, histograms) on numerical or categorical data. Applying it effectively to the complex structures and query types relevant to search logs—such as analyzing free-text query semantics, mining sequential patterns in user sessions, building complex machine learning models (e.g., for ranking or recommendations), or analyzing graph structures (e.g., click graphs)—is more challenging and an active area of research. Standard DP mechanisms might require excessive noise or simplification for such tasks. Techniques like DP-SGD (Differentially Private Stochastic Gradient Descent) exist for training models, but again involve utility trade-offs30.
  2. The Utility-Privacy Trade-off31: This is the most fundamental challenge. The strength of the privacy guarantee (lower ε) is inversely proportional to the amount of noise added. More noise provides better privacy but reduces the accuracy and utility of the results. For the complex, granular analyses often desired from search logs (e.g., understanding rare query patterns, analyzing specific user journeys, training accurate prediction models), the amount of noise required to achieve a meaningful level of privacy (a small ε) might overwhelm the signal, rendering the results unusable. While DP performs better on larger datasets where individual contributions are smaller, the sensitivity of queries on sparse, high-dimensional data can still necessitate significant noise. Finding an acceptable balance between privacy and utility for diverse use cases remains a major hurdle.
  3. Implementation Complexity and Correctness: Implementing DP correctly requires significant expertise in both the theory and the practical nuances of noise calibration, sensitivity analysis (bounding how much one individual can affect the output), and privacy budget management. Errors in implementation, such as underestimating sensitivity or mismanaging the privacy budget across multiple queries (due to composition rules), can silently undermine the promised privacy guarantees. Defining the “privacy unit” (e.g., user, query, session) appropriately is critical; misclassification can lead to unintended disclosures. Auditing DP implementations for correctness is also non-trivial.
  4. Local vs. Central Models: DP can be implemented in two main models. In the central model, a trusted curator collects raw data and then applies DP before releasing results. This generally allows for higher accuracy (less noise for a given ε) but requires users to trust the curator with their raw data. In the local model (LDP), noise is added on the user’s device before data is sent to the collector. This offers stronger privacy guarantees as the collector never sees raw data, but typically requires significantly more noise to achieve the same level of privacy, often leading to much lower utility. The choice of model impacts both trust assumptions and achievable utility.
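As a concrete illustration of the noise calibration involved, the sketch below applies the standard Laplace mechanism in the central model to a single count query, assuming each user contributes at most one record (sensitivity 1); the count and epsilon values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_count(true_count: int, epsilon: float, sensitivity: float = 1.0) -> float:
    """Release a count with Laplace(sensitivity / epsilon) noise added (central model)."""
    return true_count + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

true_count = 1_042  # hypothetical: users who issued a given query yesterday
for eps in (0.1, 1.0, 10.0):
    print(f"epsilon={eps}: noisy count = {dp_count(true_count, eps):.1f}")
# Smaller epsilon -> stronger guarantee but noisier answers; the budget must also be
# composed across every statistic released from the same underlying logs.
```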

In essence, while DP provides the gold standard in theoretical privacy guarantees, its practical application to the scale and complexity of search logs involves significant compromises in data utility and faces non-trivial implementation hurdles. It is not a simple “plug-and-play” solution for making granular search data both private and fully useful.

Inadequacies of Aggregation, Masking, and Generalization for Search Logs

Simpler, traditional de-identification techniques prove largely insufficient for protecting privacy in search logs while preserving meaningful utility. Aggregation collapses the individual-level sequences that make the data analytically valuable, yet small cells and outliers can still leak information. Masking or removing direct identifiers leaves the quasi-identifiers embedded in query text and timestamps untouched, as the AOL release demonstrated. Generalizing query terms into broad categories destroys the semantic detail most analyses require long before it delivers meaningful protection.

These foundational techniques, while potentially useful as components within a more sophisticated strategy (e.g., aggregation combined with differential privacy), are individually incapable of addressing the complex privacy challenges posed by massive search query datasets without sacrificing the data’s core value.  As we discuss further, even combined they fall short.

Challenges with Synthetic Data Generation for Complex Behavioral Data

Generating synthetic data—artificial data designed to mirror the statistical properties of real data without containing actual individual records—has emerged as a promising privacy-enhancing technology. It offers the potential to share data insights without sharing real user information. However, creating high-quality, privacy-preserving synthetic search logs faces significant hurdles32:

  1. Utility Preservation: Search logs capture complex patterns: semantic relationships between query terms, sequential dependencies in user sessions, temporal trends, correlations between queries and clicks, and vast individual variability. Training a generative model (e.g., a statistical model or a deep learning model like an LLM) to accurately capture all these nuances without access to the original data is extremely challenging. If the synthetic data fails to replicate these properties faithfully, it will have limited utility for downstream tasks like training accurate machine learning models or conducting reliable behavioral research. Generating realistic sequences of queries that maintain semantic coherence and plausible user intent is particularly difficult.
  2. Privacy Risks (Memorization and Inference): Generative models, especially large and complex ones like LLMs, run the risk of “memorizing” or “overfitting” to their training data. If this happens, the model might generate synthetic examples that are identical or very close to actual records from the sensitive training dataset, thereby leaking private information. This risk is often higher for unique or rare records (outliers) in the original data. Even if exact records aren’t replicated, the synthetic data might still be vulnerable to membership inference attacks, where an attacker tries to determine if a specific person’s data was used to train the generative model. Ensuring the generation process itself is privacy-preserving, for example by using DP during model training, is crucial but adds complexity and can impact the fidelity (utility) of the generated data. Evaluating the actual privacy level achieved by synthetic data is also a complex task.
  3. Bias Amplification: Generative models learn patterns from the data they are trained on. If the original search log data contains societal biases (e.g., stereotypical associations, skewed representation of demographic groups), the synthetic data generated is likely to replicate, and potentially even amplify, these biases. This can lead to unfair or discriminatory outcomes if the synthetic data is used for training downstream applications.

Therefore, while synthetic data holds promise, generating truly useful and private synthetic search logs is a frontier research problem. The very complexity that makes search data valuable also makes it incredibly difficult to synthesize accurately without inadvertently leaking information or perpetuating biases. It requires sophisticated modeling techniques combined with robust privacy-preserving methods like DP integrated directly into the generation workflow.
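One of the memorization risks described above can be screened for with a simple, deliberately rough check: comparing generated queries against the training corpus for verbatim or near-verbatim matches. The sketch below uses edit-distance similarity and invented example queries; a real evaluation would require far more rigorous membership-inference testing.

```python
import difflib

def memorization_rate(synthetic_queries, training_queries, threshold=0.9):
    """Fraction of synthetic queries that closely match some training query."""
    synthetic = list(synthetic_queries)
    training = list(training_queries)

    def is_close(query):
        return any(difflib.SequenceMatcher(None, query, t).ratio() >= threshold
                   for t in training)

    flagged = sum(1 for q in synthetic if is_close(q))
    return flagged / max(len(synthetic), 1)

train = ["side effects of lisinopril", "homes sold in shadow lake subdivision"]
synth = ["side effects of lisinopril 10mg", "best hiking trails near tucson"]
print(memorization_rate(synth, train))  # -> 0.5 in this toy example
```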

6. Harms, Ethics, and Societal Implications

The challenges of de-identifying search query data are not merely technical or legal; they extend into architectural and organizational domains that fundamentally shape privacy outcomes. How data is released—through what mechanisms, under what controls, and with what oversight—represents an architectural problem bound by organizational principles and norms. The key architectural building block lies in the design of APIs (Application Programming Interfaces), which can act as critical shields between raw data and external access. Re-identification attempts can be partially mitigated at the API level through strict query limits, access controls, auditing mechanisms, and purpose restrictions—complementing the privacy-enhancing technologies discussed throughout this paper. These architectural choices embed ethical values and reflect organizational commitments to privacy beyond mere technical implementation, and they carry significant weight given the potential for real-world harm if privacy is compromised. Such controls can perhaps be observed and managed at the level of an individual organization, with extensive oversight and an enforceable data protection legal regime in place, but they are difficult to envision for ongoing, large-scale access to data by multiple unrelated, independent parties. Once data is released, it is beyond the API’s control; cutting off future API access when multiple releases create a re-identification risk may not be feasible; and the operator may have no way of knowing whether multiple API users are collaborating or combining data.
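As a sketch of what one such API-level control might look like in practice (a hypothetical illustration, not a description of any deployed system), the snippet below enforces a per-caller query budget and records an append-only audit trail.

```python
from collections import defaultdict
from datetime import datetime, timezone

class QueryBudget:
    """Per-caller query budget with an append-only audit trail."""

    def __init__(self, max_queries_per_key: int = 1000):
        self.max = max_queries_per_key
        self.used = defaultdict(int)
        self.audit_log = []  # retained for later oversight and review

    def authorize(self, api_key: str, query_description: str) -> bool:
        allowed = self.used[api_key] < self.max
        if allowed:
            self.used[api_key] += 1
        self.audit_log.append((datetime.now(timezone.utc).isoformat(), api_key,
                               query_description, "allowed" if allowed else "denied"))
        return allowed

budget = QueryBudget(max_queries_per_key=2)
print(budget.authorize("partner-123", "top queries by region"))        # True
print(budget.authorize("partner-123", "queries mentioning a clinic"))  # True
print(budget.authorize("partner-123", "per-user query sequences"))     # False: budget exhausted
```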

Potential Harms from Re-identified Search Data: From Embarrassment to Discrimination

If supposedly de-identified search query data is successfully re-linked to individuals, the consequences can range from personal discomfort to severe, tangible harms. Search histories can reveal extremely sensitive aspects of a person’s life, including health conditions and symptoms, financial circumstances, sexual orientation, religious and political beliefs, relationship issues, and legal concerns.

The exposure of such information through re-identification can lead to a spectrum of harms, from embarrassment and reputational damage to discrimination, financial loss, psychological distress, and chilling effects on free expression and inquiry.

These potential harms underscore the high stakes involved in handling search query data. The impact extends beyond individual privacy violations to potential societal harms, such as reinforcing existing inequalities through discriminatory profiling or undermining trust in digital services. Critically, legal systems often struggle to recognize and provide remedies for many of these harms, particularly those that are non-financial, cumulative, or relate to future risks.

7. Conclusion: Synthesizing the Challenges and Risks

The de-identification of massive search query datasets presents a complex and formidable challenge, sitting at the intersection of immense data value and profound privacy risk. While the potential benefits of analyzing search behavior for societal good, service improvement, and innovation are undeniable, the inherent nature of this data makes achieving meaningful privacy protection through de-identification exceptionally difficult.

The Core Privacy Paradox of Search Data De-identification

The fundamental paradox lies in the richness of the data itself. Search logs capture a high-dimensional, sparse, and longitudinal record of human intent and behavior. This richness, containing myriad explicit and implicit identifiers and quasi-identifiers embedded within unstructured query text and temporal patterns, creates unique individual fingerprints. Consequently, techniques designed to obscure identity often face a stark trade-off: either they fail to adequately protect against re-identification attacks (especially linkage attacks leveraging the vast ecosystem of auxiliary data), or they must apply such aggressive generalization, suppression, or noise addition that the data’s analytical utility is severely compromised.

Traditional methods like k-anonymity are fundamentally crippled by the “curse of dimensionality” inherent in this data type. More advanced techniques like differential privacy offer stronger theoretical guarantees but introduce significant practical challenges related to the privacy-utility balance, implementation complexity, and applicability to the diverse analyses required for search data. Synthetic data generation, while promising, faces similar difficulties in capturing complex behavioral nuances without leaking information or amplifying bias.

Summary of Key Risks and Vulnerabilities

The analysis presented in this report highlights several critical risks associated with attempts to de-identify search query data:

  1. High Re-identification Risk: Due to the data’s uniqueness and the power of linkage attacks using auxiliary information, the risk of re-identifying individuals from processed search logs remains substantial. Landmark failures like the AOL and Netflix incidents serve as potent warnings.
  2. Inadequacy of Simple Techniques: Basic methods like removing direct identifiers, masking, simple aggregation, or naive generalization are insufficient to protect against sophisticated attacks on this type of data.
  3. Limitations of Advanced Techniques: Even state-of-the-art methods like differential privacy and synthetic data generation face significant hurdles in balancing provable privacy with practical utility for complex, granular search data analysis.
  4. Evolving Threat Landscape: The continuous growth of available data and the increasing sophistication of analytical techniques, including AI/ML-driven attacks, mean that re-identification risks are dynamic and likely increasing over time.
  5. Potential for Serious Harm: Re-identification can lead to tangible harms, including discrimination, financial loss, reputational damage, psychological distress, and chilling effects on free expression and inquiry.

The Ongoing Debate

The challenges outlined fuel an ongoing debate about the viability and appropriate role of de-identification in the context of large-scale behavioral data. While organizations invest in Privacy Enhancing Technologies (PETs) and implement policies aimed at protecting user privacy, the demonstrable risks and technical limitations suggest that achieving true, robust anonymity for granular search query data, while maintaining high utility, remains an elusive goal.

During the preparation of this work the author used ChatGPT to reword and rephrase text and for a first draft of the two charts in the document. After using this tool/service, the author reviewed and edited the content as needed and takes full responsibility for the content of the publication.

  1. https://fpf.org/issue/deid/ ↩︎
  2. https://fpf.org/tag/privacy-enhancing-technologies/ ↩︎
  3.  https://fpf.org/issue/research-and-ethics/ ↩︎
  4. Ohm: https://heinonline.org/HOL/LandingPage?handle=hein.journals/uclalr57&div=48&id=&page= ↩︎
  5. Cooper: https://citeseerx.ist.psu.edu/document? ↩︎
  6. Dinur, Nissim: https://weizmann.elsevierpure.com/en/publications/revealing-information-while-preserving-privacy ↩︎
  7. Barth-Jones: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2076397 ↩︎
  8. Polonetsky, Tene and Finch: https://digitalcommons.law.scu.edu/cgi/viewcontent.cgi?article=2827&context=lawreview ↩︎
  9. We note the European Court of Justice Breyer decision and subsequent EU court decisions that may open up a legal argument that it may be possible to consider a party that does not reasonably have potential access to the additional data to be in possession of non-personal data. https://curia.europa.eu/juris/document/document.jsf?docid=184668&doclang=EN ↩︎
  10. Sweeney: https://www.hks.harvard.edu/publications/k-anonymity-model-protecting-privacy ↩︎
  11. Aggarwal, Charu C. (2005). “On k-Anonymity and the Curse of Dimensionality”. VLDB ’05 – Proceedings of the 31st International Conference on Very large Data Bases. Trondheim, Norway. CiteSeerX 10.1.1.60.3155 ↩︎
  12. Marcus Olsson: https://marcusolsson.dev/k-anonymity-and-l-diversity/ ↩︎
  13. Ninghui Li, Tiancheng Li, and Suresh Venkatasubramanian, “t-Closeness: Privacy Beyond k-Anonymity and ℓ-Diversity,” Proceedings of the 23rd IEEE International Conference on Data Engineering (2007). ↩︎
  14. Dwork, C. (2006). Differential Privacy. In: Bugliesi, M., Preneel, B., Sassone, V., Wegener, I. (eds) Automata, Languages and Programming. ICALP 2006. Lecture Notes in Computer Science, vol 4052. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11787006_1 ↩︎
  15. Simson Garfinkel NIST SP 800 ↩︎
  16. https://research.google/blog/protecting-users-with-differentially-private-synthetic-training-data/ ↩︎
  17. https://sparktoro.com/blog/who-sends-traffic-on-the-web-and-how-much-new-research-from-datos-sparktoro/ ↩︎
  18. Mitigating the Curse of Dimensionality in Data Anonymization – CRISES / URV, https://crises-deim.urv.cat/web/docs/publications/lncs/1084.pdf ↩︎
  19. Bellman: https://link.springer.com/referenceworkentry/10.1007/978-0-387-39940-9_133 ↩︎
  20. On k-anonymity and the curse of dimensionality, https://www.vldb.org/archives/website/2005/program/slides/fri/s901-aggarwal.pdf ↩︎
  21. Latanya Sweeney, “Uniqueness of Simple Demographics in the U.S. Population,” Carnegie Mellon University, Data Privacy Working Paper 3, 2000 ↩︎
  22. Su, Goel, Shukla, Narayanan: https://www.cs.princeton.edu/~arvindn/publications/browsing-history-deanonymization.pdf ↩︎
  23. Michael Barbaro and Tom Zeller Jr., “A Face Is Exposed for AOL Searcher No. 4417749,” The New York Times, August 9, 2006 ↩︎
  24. Narayanan, Shmatikov: How To Break Anonymity of the Netflix Prize Dataset. arXiv cs/0610105 ↩︎
  25. Systematic Review of Re-Identification Attacks on Health Data – PMC, https://pmc.ncbi.nlm.nih.gov/articles/PMC3229505/ ↩︎
  26. https://medium.com/vijay-pandurangan/of-taxis-and-rainbows-f6bc289679a1 ↩︎
  27. https://dspace.mit.edu/handle/1721.1/96321 ↩︎
  28. https://www.cs.princeton.edu/~arvindn/publications/browsing-history-deanonymization.pdf ↩︎
  29. Cynthia Dwork, “Differential Privacy,” in Automata, Languages and Programming, 33rd International Colloquium, ICALP 2006, Proceedings, Part II, ed. Michele Bugliesi et al., Lecture Notes in Computer Science 4052 (Berlin: Springer, 2006) ↩︎
  30. https://research.google/blog/generating-synthetic-data-with-differentially-private-llm-inference/ ↩︎
  31. Guidelines for Evaluating Differential Privacy Guarantees – NIST Technical Series Publications, https://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-226.pdf ↩︎
  32. Privacy Tech-Know blog: When what is old is new again – The reality of synthetic data, https://www.priv.gc.ca/en/blog/20221012/ ↩︎

FPF Launches Major Initiative to Study Economic and Policy Implications of AgeTech

FPF and University of Arizona Eller College of Management Awarded Grant by Alfred P. Sloan Foundation to Address Privacy Implications and Data Uses of Technologies Aimed at Aging At Home

The Future of Privacy Forum (FPF) — a global non-profit focused on data protection, AI, and emerging technologies — has been awarded a grant from the Alfred P. Sloan Foundation to lead a two-year research project entitled Aging at Home: Caregiving, Privacy, and Technology, in partnership with the University of Arizona Eller College of Management. The project, which launched on April 1, will explore the complex intersection of privacy, economics, and the use of emerging technologies designed to support aging populations (“AgeTech”). AgeTech includes a wide range of applications and technologies, from fall detection devices and health monitoring apps to artificial intelligence (AI)-powered assistants.

The number of seniors eighty-five and older is expected to nearly double by 2035 and nearly triple by 2060. This rapidly aging population presents complex challenges and opportunities, particularly in the increased demand for resources necessary for senior care and the use of AgeTech to promote improved autonomy and independence.

FPF will lead rigorous, independent research into these issues, with a particular focus on the privacy expectations of seniors and caregivers, cost barriers to adoption, and the policy gaps surrounding AgeTech. The research will include experimental surveys, roundtables with industry and policy leaders, and a systematic review of economic and privacy challenges facing AgeTech solutions.

The project will be led by co-principals Jules Polonetsky, CEO of FPF, and Dr. Laura Brandimarte, Associate Professor of Management Information Systems at the University of Arizona Eller College of Management. Polonetsky is an internationally recognized privacy expert and co-editor of the Cambridge Handbook on Consumer Privacy. Brandimarte’s work focuses on the ethics of technology, with an emphasis on privacy and security, and uses quantitative methods including survey and experimental design and econometric data analysis.

Jordan Wrigley, a data and policy analyst who leads FPF’s health data research, will play a lead role for FPF, along with members of FPF’s U.S., Global, and AI Policy teams. Wrigley is an award-winning health meta-analytic methodologist and researcher whose work has informed medical care guidelines and AI data practices.

“The privacy aspects of AgeTech, such as consent and authorization, data sensitivity, and cost, need to be studied and considered holistically to create sustainable policies and build trust with seniors and caregivers as the future of aging becomes the present,” said Wrigley. “This research will seek to do just that.”

“At FPF, we believe that technology and data can benefit society and improve lives when the right laws, policies, and safeguards are in place,” added Polonetsky. “The goal of AgeTech – to assist seniors in living independently while reducing healthcare costs and caregiving burdens – impacts us all. As this field grows, it’s essential that we have the right rules in place to protect privacy and preserve dignity.”

“Technology has the potential to increase the autonomy and overall wellbeing of an ageing population, but for that to happen there has to be trust on the part of users – both that the technology will effectively be of assistance and that it will not constitute another source of data privacy and security intrusions,” added Brandimarte. “We currently know very little about the level of trust the elderly place in AgingTech and the specific needs of this at-risk population when they interact with it, including data accessibility by family members or caregivers.”

Dr. Daniel Goroff, Vice President and Program Director for Sloan, agrees, “As AgeTech evolves, it brings enormous promise—along with pressing questions about equity, access, and privacy. This initiative will provide insights about how innovations can ethically and responsibly enhance the autonomy and dignity of older adults. We’re excited to see FPF and the University of Arizona leading the way on this timely research.”

Key project outputs will include:

Sign-up for our mailing list to stay informed about future progress, and reach out to Jordan Wrigley ([email protected]) if you are interested in learning more about the project. 

Aging at Home: Caregiving, Privacy, and Technology is supported by the Alfred P. Sloan Foundation under Grant No. G-2025-25191.

About The Alfred P. Sloan Foundation

The ALFRED P. SLOAN FOUNDATION is a not-for-profit, mission-driven grantmaking institution dedicated to improving the welfare of all through the advancement of scientific knowledge. Established in 1934 by Alfred Pritchard Sloan Jr., then-President and Chief Executive Officer of the General Motors Corporation, the Foundation makes grants in four broad areas: direct support of research in science, technology, engineering, mathematics, and economics; initiatives to increase the quality, equity, diversity, and inclusiveness of scientific institutions and the science workforce; projects to develop or leverage technology to empower research; and efforts to enhance and deepen public engagement with science and scientists.
sloan.org | @SloanFoundation

About Future of Privacy Forum (FPF)

FPF is a global non-profit organization that brings together academics, civil society, government officials, and industry to evaluate the societal, policy, and legal implications of data use, identify the risks, and develop appropriate protections. FPF believes technology and data can benefit society and improve lives if the right laws, policies, and rules are in place. FPF has offices in Washington D.C., Brussels, Singapore, and Tel Aviv. Follow FPF on X and LinkedIn.

About the University of Arizona Eller College of Management

The Eller College of Management at The University of Arizona offers highly ranked undergraduate (BSBA and BSPA), MBA, MPA, master’s, and doctoral (Ph.D.) degrees in accounting, economics, entrepreneurship, finance, marketing, management and organizations, management information systems (MIS), and public administration and policy in Tucson, Arizona and Phoenix, Arizona.

FPF and OneTrust publish the Updated Guide on Conformity Assessments under the EU AI Act

The Future of Privacy Forum (FPF) and OneTrust have published an updated version of their Conformity Assessments under the EU AI Act: A Step-by-Step Guide, along with an accompanying Infographic. This updated Guide reflects the text of the EU Artificial Intelligence Act (EU AIA), adopted in 2024.  

Conformity Assessments (CAs) play a significant role in the EU AIA’s accountability and compliance framework for high-risk AI systems. The updated Guide and Infographic provide a step-by-step roadmap for organizations seeking to understand whether they must conduct a CA. Both resources are designed to support organizations as they navigate their obligations under the AIA and build internal processes that reflect the Act’s overarching accountability framework. However, they do not constitute legal advice for any specific compliance situation.

Key highlights from the Updated Guide and Infographic:

You can also view the previous version of the Conformity Assessment Guide here.

South Korea’s New AI Framework Act: A Balancing Act Between Innovation and Regulation

On 21 January 2025, South Korea became the first jurisdiction in the Asia-Pacific (APAC) region to adopt comprehensive artificial intelligence (AI) legislation. Taking effect on 22 January 2026, the Framework Act on Artificial Intelligence Development and Establishment of a Foundation for Trustworthiness (AI Framework Act or simply, Act) introduces specific obligations for “high-impact” AI systems in critical sectors, including healthcare, energy, and public services, and mandatory labeling requirements for certain applications of generative AI. The Act also includes substantial public support for private sector AI development and innovation through its support for AI data centers, as well as projects that create and provide access to training data, and encouragement of technological standardization to support SMEs and start-ups in fostering AI innovation. 

In the broader context of public policies in South Korea that are designed to allow the advancement of AI, the Act is notable for its layered, transparency-focused approach to regulation, moderate enforcement approach compared to the EU AI Act, and significant public support intended to foster AI innovation and development. We cover these in Parts 2 to 4 below. 

Key features of the law include:

In Part 5, we provide a comparison below to the European Union (EU)’s AI Act (EU AI Act). We note that while the AI Framework Act shares some common elements with the EU AI Act, including tiered classification and transparency mandates, South Korea’s regulatory approach differs in its simplified risk categorization, including absence of prohibited AI practices, comparatively lower financial penalties, and the establishment of initiatives and government bodies aimed at promoting the development and use of AI technologies. The intent of this comparison is to assist practitioners in understanding and analyzing key commonalities and differences between both laws.

Finally, Part 6 of this article places the Act within South Korea’s broader AI innovation strategy and discusses the challenges of regulatory alignment between the Ministry of Science and IT (MSIT) and South Korea’s data protection authority, the Personal Information Protection Commission (PIPC) in South Korea’s evolving AI governance landscape.

1. Background 

On 26 December 2024, South Korea’s National Assembly passed the Framework Act on Artificial Intelligence Development and Establishment of a Foundation for Trustworthiness (AI Framework Act or Act). 

The AI Framework Act was officially promulgated on 21 January 2025 and will take effect on 22 January 2026, following a one-year transition period to prepare for compliance. During this period, MSIT will assist with the issuance of Presidential Decrees and other sub-regulations and guidelines to clarify implementation details.

South Korea first pursued comprehensive AI legislation in 2021 with the Bill on Fostering Artificial Intelligence and Creating a Foundation of Trust, making it the first country in the Asia-Pacific region to attempt such a law. However, the legislative process faced significant hurdles, including political uncertainty surrounding the April 2024 general elections, raising concerns that the bill could be scrapped entirely.

However, by November 2024, South Korea’s AI policy landscape had grown increasingly complex, with 20 separate AI governance bills introduced since the National Assembly began its new term in June 2024, each proposed independently by different members. That month, the Information and Communication Broadcasting Bill Review Subcommittee conducted a comprehensive review of these AI-related bills and consolidated them into a single framework, leading to the passage of the AI Framework Act.

At its core, the AI Framework Act adopts a risk-based approach to AI regulation. In particular, it introduces specific obligations for high-impact AI systems and generative AI applications. The AI Framework Act also has extraterritorial reach: it applies to AI activities that impact South Korea’s domestic market or users.

This blog post examines the key provisions of the Act, including its scope, regulatory requirements, and implications for organizations developing or deploying AI systems.

2. The Act establishes a layered approach to AI regulation

2.1 Definitions lay the foundation for how different AI systems will be regulated under the Act

Article 2 of the Act provides three AI-related definitions. 

At the core of the Act’s layered approach is its definition of “high-impact AI” (which is subject to more stringent requirements). “High-impact AI” refers to AI systems “that may have a significant impact on or pose a risk to human life, physical safety, and basic rights,” and is utilized in critical sectors identified under the AI Framework Act, including energy, healthcare, nuclear operations, biometric data analysis, public decision-making, education, or other areas that have a significant impact on the safety of human life and body and the protection of basic rights as prescribed by Presidential Decree.

The Act also introduces specific provisions for “generative AI.” The Act defines generative AI as AI systems that create text, sounds, images, videos, or other outputs by imitating the structure and characteristics of the input data. 

The Act also defines an “AI Business Operator” as corporations, organizations, government agencies, or individuals conducting business related to the AI industry. The Act subdivides AI Business Operators into two sub-categories (which effectively reflect a developer-deployer distinction): 

Currently, as will be covered in more detail below, the obligations under the Act apply to both categories of AI Business Operators, regardless of their specific roles in the AI lifecycle. For example, transparency-related obligations apply to all AI Business Operators, regardless of whether they are involved in the development and/or deployment phases of AI systems. It remains to be seen if forthcoming Presidential Decrees to implement the Act will introduce more differentiated obligations for each type of entity.

While the Act expressly excludes AI used solely for national defense and security from its scope, the Act applies to both government agencies and public bodies when they are involved in the development, provision, or use of AI technology in a business-related context. More broadly, the Act also assigns the government a significant role in shaping AI policy, providing support, and overseeing the development and use of AI.

2.2. The AI Framework Act has broad extraterritorial reach 

Under Article 4(1), the Act applies not only to acts conducted within South Korea but also to those conducted abroad that impact South Korea’s domestic market, or users in South Korea. This means that foreign companies providing AI systems or services to users in South Korea will be subject to the Act’s requirements, even if they lack a physical presence in the country. 

However, Article 4(2) of the Act introduces a notable exemption for AI systems developed and deployed exclusively for national defense or security purposes. These systems, which will be designated by Presidential Decree, fall outside the Act’s regulatory framework.

For global organizations, the Act’s jurisdictional scope raises key compliance considerations. Companies will likely need to assess whether their AI activities fall under South Korea’s regulatory reach, particularly if they:

This last criterion appears to be a novel policy proposition and differentiates the AI Framework Act from the EU AI Act, potentially making it broader in reach. This is because it does not seem necessary for an AI system to be placed on the South Korean market for the condition to be triggered, but simply for the AI-related activity of a covered entity to “indirectly impact” the South Korean market. 

2.3. The Act establishes a multi-layered approach to AI safety and trustworthiness requirements

(i) The Act emphasizes oversight of high-impact AI but does not prohibit particular AI uses 

For most AI Business Operators, compliance obligations under the AI Framework Act are minimal. There are, however, noteworthy obligations – relating to transparency, safety, risk management and accountability – that apply to AI Business Operators deploying high-impact AI systems. 

Under Article 33, AI Business Operators providing AI products and services must “review in advance” (this presumably means before the relevant product or service is released into a live environment or goes to market) whether their AI systems are considered “high-impact AI.” Businesses may request confirmation from the MSIT on whether their AI system is to be considered “high-impact AI.”

Under Article 34, organizations that offer high-impact AI, or products or services using high-impact AI, must meet much stricter requirements, including:

1. Establishing and operating a risk management plan.

2. Establishing and operating a plan to provide explanation for AI-generated results within technical limits, including key decision criteria and an overview of training data.

3. Establishing and operating “user protection measures.”

4. Ensuring human oversight and supervision of high-impact AI.

5. Preserving and storing documents that demonstrate measures taken to ensure AI safety and reliability.

6. Following any additional requirements imposed by the National AI Committee (established under the Act) to enhance AI safety and reliability.

Under Article 35, AI Business Operators are also encouraged to conduct impact assessments for high-impact AI systems to evaluate their potential effects on fundamental rights. While the language of the Act (i.e., “shall endeavor to conduct an impact assessment”) suggests that these assessments are not mandatory, the Act introduces an incentive: where a government agency intends to use a product or service using high-impact AI, the agency is to prioritize AI products or services that have undergone impact assessments in public procurement decisions. Legislatively stipulating the use of public procurement processes to incentivize businesses to conduct impact assessments appears to be a relatively novel move and arguably reflects the innovation-risk duality seen across the Act.

(ii) The Act prioritizes user awareness and transparency for generative AI products and services 

The AI Framework Act introduces specific transparency obligations for generative AI providers. Under Article 31(1), AI Business Operators offering high-impact or generative AI-powered products or services must notify users in advance that the product or service utilizes AI. Further, under Article 31(2), AI Business Operators providing generative AI as a product or service must also indicate that the output was generated by generative AI.

Beyond general disclosure, Article 31(3) of the Act mandates that where an AI Business Operator uses an AI system to provide virtual sounds, images, video or other content that are “difficult to distinguish from reality,” the AI Business Operator must “notify or display the fact that the result was generated by an (AI) system in a manner that allows users to clearly recognize it.” 

However, the provision also provides flexibility for artistic and creative expressions. It permits notifications or labelling to be displayed in ways intended to not hinder creative expression or appreciation. This approach appears aimed at balancing the creative utility of generative AI with transparency requirements. Technical details, such as how notification or labelling should be implemented, will be prescribed by Presidential Decree.

(iii) The Act establishes other requirements that apply when certain thresholds are met

The following requirements focus on safety measures and operational oversight, including specific provisions for foreign AI providers.

Under Article 32, AI Business Operators that operate AI systems whose computational learning capacity exceeds prescribed thresholds are required to identify, assess, and mitigate risks throughout the AI lifecycle, and establish a risk management system to monitor and respond to AI-related safety incidents. AI Business Operators must document and submit their findings to the MSIT. 

For accountability, Article 36 provides that AI Business Operators that lack a domestic address or place of business and that cross certain user-number or revenue thresholds (to be prescribed) must appoint a “domestic representative” with an address or place of business in South Korea. The details of the domestic representative must be provided to the MSIT.

These domestic representatives take on significant responsibilities, including:

3. The Act grants the MSIT significant investigative and enforcement powers

3.1 The legislation empowers the MSIT with broad authority to investigate potential violations of the Act 

Under Article 40 of the Act, the MSIT is empowered to investigate businesses that it suspects of breaching any of the following requirements under the Act:

When potential breaches are identified, the MSIT may carry out necessary investigations, including the authority to conduct on-site investigations and to compel AI Business Operators to submit relevant data. During these inspections, authorized officials can examine business records, operational documents, and other critical materials, following established administrative investigation protocols.

If violations are confirmed, the MSIT can issue corrective orders, requiring businesses to immediately halt non-compliant practices and implement necessary remediation measures. 

3.2 The Act takes a relatively moderate approach to penalties compared to other global AI regulations 

Under Article 43 of the Act, administrative fines of up to KRW 30 million (approximately USD 20,707) may be imposed for:

This enforcement structure caps fines at lower amounts than other global AI regulations. 

4. The Act promotes the development of AI technologies through strategic support for data infrastructure and learning resources

The MSIT is responsible for developing comprehensive policies to support the entire lifecycle of AI training data, ensuring that businesses have access to high-quality datasets essential for AI development. To achieve this, the Act mandates government-led initiatives to:

A key initiative under the Act can be found in Article 25, which provides for the promotion of policies to establish and operate AI Data Centers. Under Article 25(2), the South Korean government may provide administrative and financial support to facilitate the construction and operation of data centers. These centers will provide infrastructure for AI model training and development, ensuring that businesses of all sizes – including small and medium-sized enterprises (SMEs) – have access to these resources.

The Act also promotes the advancement and safe use of AI by encouraging technological standardization (Articles 13 and 14), supporting SMEs and start-ups, and fostering AI-driven innovation. It also facilitates international collaboration and market expansion while establishing a framework for AI testing and verification (Articles 13 and 14). Together, these measures aim to strengthen South Korea’s broader AI ecosystem and ensure its responsible development and deployment.

5. Comparing the approaches of South Korea’s AI Framework Act and the EU’s AI Act reveals both convergences and divergences

As South Korea is only the second jurisdiction globally to enact comprehensive national AI regulation, comparing its AI Framework Act with the EU AI Act helps illuminate both its distinctive features and its place in the emerging landscape of global AI governance. As many companies will need to navigate both frameworks, understanding their similarities and differences is essential for global compliance strategies.

Table 1. Comparison of Key Aspects of the South Korea AI Framework Act and EU AI Act

6. Looking ahead

South Korea’s AI Framework Act is the first omnibus AI regulation in the APAC region. The South Korean model is notable for establishing an alternative approach to AI regulation: one that seeks to balance the promotion of AI innovation, development, and use with safeguards for high-impact applications.

6.1 Though the Act establishes a framework for direct regulation of AI, several critical areas require further definition through Presidential Decree

The areas that are expected to be clarified through Presidential Decree include:

The interpretation and implementation of these provisions will significantly shape compliance expectations, influencing how AI businesses—both domestic and international—navigate the regulatory landscape.

6.2 The Act must also be considered in the context of South Korea’s broader efforts to position the country as a leader in AI innovation 

The first – and arguably most significant – of these efforts is a significant bill recently introduced by members of the National Assembly, which seeks to amend the Personal Information Protection Act (PIPA) by creating a new legal basis for the processing of personal information specifically for the development and use of AI. The bill introduces a new Article 28-12, which would permit the use of personal information beyond its original purpose of collection, specifically for the development and improvement of AI systems. This amendment would allow such processing provided that:

Second, South Korea’s government is also reportedly exploring other legal reforms to its data protection law to facilitate the development of AI. According to PIPC Chairman Haksoo Ko’s recent interview with a global regulatory news outlet, these reforms could potentially include reforming the “legitimate interests” basis for processing personal information under the PIPA.

South Korea’s Minister for Science and ICT Yoo Sang-im has also reportedly urged the National Assembly to swiftly pass a law on the management and use of government-funded research data to advance scientific and technological development in the AI era.

Third, while creating these pathways for innovation, the PIPC has simultaneously been developing mechanisms to provide oversight over AI systems. For instance, the PIPC’s comprehensive policy roadmap for 2025 (Policy Roadmap) announced in January 2025 outlines an ambitious regulatory framework for AI governance and data protection. In particular, the Policy Roadmap envisions the implementation of specialized regulatory and oversight provisions for the use of unmodified personal data in AI development. 

The Policy Roadmap is supplemented by the PIPC’s Work Direction for Investigations in 2025 (Work Direction). Published in January 2025, the Work Direction sets out measures intended to provide additional oversight of AI services, including preliminary onsite inspections of AI-powered services, such as AI agents, and reviews of the use of personal information in AI-based legal and human resources services.

A possible example of this heightened emphasis on oversight arose in February 2025, when the PIPC announced a temporary suspension of new downloads of the Chinese generative AI application DeepSeek over concerns about potential breaches of the PIPA.

Fourth, South Korea is seeking to strengthen the accountability of foreign organizations. The PIPC expressed its support for a bill amending the PIPA’s domestic representative system for foreign organizations; the amendment was subsequently passed and took effect on April 1, 2025. It addresses a significant gap in the previous system, which allowed foreign companies to designate unrelated third parties as their domestic agents in South Korea, often resulting in what one lawmaker described as “formal” compliance without meaningful accountability.

The new requirements mandate that foreign companies with established business units in South Korea designate those local entities as their representatives, while imposing explicit obligations on foreign headquarters to properly manage and supervise these domestic agents. The amendment also establishes sanctions for violations of these requirements, including fines of up to KRW 20 million (approximately USD 14,000).

Fifth, South Korea is seeking to position itself as a global leader in privacy and AI governance through international cooperation and thought leadership. As South Korea prepares to host the annual Global Privacy Assembly in September 2025 – an event involving participants from 95 countries – the PIPC is presenting itself as a bridge between different regional approaches to data protection and AI governance.

6.3 However, these efforts highlight a persistent challenge: ensuring clear alignment between key regulatory authorities in South Korea’s AI governance landscape

While the MSIT worked to finalize the AI Framework Act, the PIPC, like its counterparts in many other jurisdictions globally, was assuming a de facto regulatory role for AI applications involving personal data.

However, while the AI Framework Act assigns primary responsibility for AI governance to the MSIT, it does not appear to address or acknowledge the PIPC’s role in the regulatory landscape. This creates a potential situation where two parallel AI regulators – one de jure and the other de facto – will likely continue to operate: the MSIT overseeing general AI system safety and trustworthiness under the AI Framework Act, and the PIPC maintaining its oversight of personal data processing in AI systems under the PIPA.

As a result, organizations developing or deploying AI systems in South Korea may need to navigate compliance requirements from both authorities, particularly when their AI systems process personal data. How this dual regulatory structure evolves and whether a more unified governance approach emerges will be a critical factor in determining the success of South Korea’s ambitious AI strategy in the coming years.

Despite these practical challenges, South Korea’s approach to AI regulation offers a potential governance model for other APAC jurisdictions. Ultimately, the success of the Act will depend on how effectively it balances its dual objectives of fostering AI innovation and ensuring responsible deployment. As AI governance evolves globally, the South Korean experience will provide valuable insights for policymakers, regulators, and industry stakeholders worldwide.

Note: The summary of the AI Framework Act above is based on an English machine translation, which may contain inaccuracies. This information should not be considered legal advice; for specific legal guidance, please consult a qualified lawyer practicing in South Korea.

The authors would like to thank Josh Lee Kok Thong, Dominic Paulger, and Vincenzo Tiani for their contributions to this post.

Little Rock, Minor Rights: Arkansas Leads with COPPA 2.0-Inspired Law

With thanks to Daniel Hales and Keir Lamont for their contributions.

Shortly before the close of its 2025 session, the Arkansas legislature passed HB 1717, the Arkansas Children and Teens’ Online Privacy Protection Act, with unanimous votes. As the name suggests, Arkansas modeled this legislation after Senator Markey’s federal “COPPA 2.0” proposal, which passed the U.S. Senate as part of a broad child online safety package last year. Presuming enactment by Governor Sarah Huckabee Sanders, HB 1717 will take effect on July 1, 2026. The Arkansas law, or “Arkansas COPPA 2.0,” establishes privacy protections for teens aged 13 to 16, introduces substantive data minimization requirements including prohibitions on targeted advertising, and provides teens with new rights to access, delete, and correct their personal information. The legislature also considered an Arkansas version of the federal Kids Online Safety Act, but that proposal ultimately failed, with the bill’s sponsor noting some uncertainties about its constitutionality.

What to know about Arkansas HB 1717: 

The substantive data minimization trend continues

While the federal COPPA framework is largely focused on consent, former Commissioner Slaughter noted in 2022 that people “may be surprised to know that COPPA provides for perhaps the strongest, though under-enforced, data minimization rule in US privacy law.” Arkansas builds on these requirements and follows the recent shift towards substantive data minimization with a complex web of layered requirements that operators must satisfy to use both child and teen data:

In practice, the interaction between these distinct requirements may raise difficult questions of statutory interpretation.

Differences from federal COPPA 2.0

As originally introduced, Arkansas’s bill was nearly identical to last year’s federal COPPA 2.0 bill. Arkansas’s framework then went through various, largely business-friendly amendments (and one bill number switch) during its legislative journey. Though HB 1717 maintains the same general framework as COPPA 2.0, it includes several important divergences:

Could COPPA preempt the Arkansas law?

One question likely to emerge from Arkansas COPPA 2.0 is whether certain provisions, or the entire law, may be subject to federal preemption under the existing COPPA statute. COPPA includes an express preemption clause that prohibits state laws from imposing requirements that are inconsistent with COPPA. This is relevant in two ways, as the Arkansas law will both (1) extend protections to teens and (2) introduce new substantive limitations on the use of children’s and teens’ data, such as limits on targeted advertising and strict data minimization requirements, that go beyond COPPA’s scope.

The question of COPPA preemption was recently explored in Jones v. Google, with the FTC filing an amicus brief arguing that state laws that “supplement” or “require the same thing” as COPPA are not inconsistent. The FTC references the Congressional record from when COPPA was contemplated, arguing that “Congress viewed ‘the States as partners’ . . . rather than as potential intruders on an exclusively federal arena,” and that “the state law protections at issue ‘complement–rather than obstruct–Congress’s full purposes and objectives in enacting the statute.’” It is also worth keeping in mind that the FTC has been in the process of finalizing an update to the COPPA Rule, which could introduce additional inconsistencies, or at least compliance confusion, between the new final Rule and Arkansas COPPA 2.0 on key terms such as the definition of personal information and whether targeted advertising is allowed with consent.

A trend to watch?

The passage of Arkansas COPPA 2.0 may signal an emerging trend towards a potentially more constitutionally resilient approach to protecting children and teens online. Unlike age-appropriate design codes or social media age verification mandates, which have faced significant First Amendment challenges, Arkansas COPPA 2.0 takes a more targeted approach focused on privacy and data governance, rather than access, online safety, or content. Questions of preemption and drafting quirks aside, this approach may be on firmer ground by focusing on data protection practices and building on a longstanding federal privacy framework. As states explore new ways to safeguard youth online without triggering constitutional pitfalls, privacy-focused legislation modeled on COPPA standards could become a popular path forward. 

Chatbots in Check: Utah’s Latest AI Legislation

With the close of Utah’s short legislative session, the Beehive State is once again an early mover in U.S. tech policy. In March, Governor Cox signed into law several bills related to the governance of generative artificial intelligence systems. Among them, SB 332 and SB 226 amend Utah’s 2024 Artificial Intelligence Policy Act (AIPA), while HB 452 establishes new regulations for mental health chatbots.

The Future of Privacy Forum has released a chart detailing key elements of these new laws.

Amendments to the Artificial Intelligence Policy Act

SB 332 and SB 226 update Utah’s Artificial Intelligence Policy Act (SB 149), which took effect May 1, 2024. The AIPA requires entities that use consumer-facing generative AI services to interact with individuals in regulated professions (those requiring a state-granted license, such as accountants, psychologists, and nurses) to disclose that those individuals are interacting with generative AI rather than a human. The Act was initially set to be automatically repealed on May 7, 2025.

SB 332 extends the AIPA’s expiration date by two years, ensuring its provisions remain in effect until July 2027, while SB 226 narrows the law’s scope by limiting generative AI disclosure requirements only to instances when directly asked by a consumer or supplier, or during a “high-risk” interaction. The bill defines “high-risk” interactions to include instances where a generative AI system collects sensitive personal information and involves significant decisionmaking, such as in financial, legal, medical, and mental health contexts. SB 226 includes a safe harbor for AI suppliers if they provide clear disclosures at the start or throughout an interaction, ensuring users are aware they are engaging with AI. 

Mental Health Chatbots

Though HB 452 does not directly amend the AIPA, it is closely linked to the broader AI governance framework established by the law. As part of AIPA, Utah established a regulatory sandbox program and created the Office of Artificial Intelligence Policy to oversee AI governance and innovation in the state. One of the AI Office’s early priorities has been assessing the role of AI-driven mental health chatbots in licensed medical practice.

To address concerns surrounding these chatbots, the AI Office convened stakeholders to explore potential regulatory approaches. These discussions, along with the state’s first regulatory mitigation agreement under the AIPA’s sandbox program involving a student-focused mental health chatbot, helped shape the passage of HB 452. The bill establishes new rules governing the use of AI-driven mental health chatbots in Utah, including:

Utah’s latest round of legislation reflects a continued focus on targeted, risk-based regulation of emerging AI systems. Building on the foundation set by the 2024 Artificial Intelligence Policy Act, the new laws track an emerging national trend towards affirmatively supporting AI development and innovation while focusing regulatory interventions on particularly high-risk sectors such as healthcare. Utah’s approach to balancing innovation, regulation, and consumer protection in the AI space may produce lessons and influence legislators in other states.