Unveiling China’s Generative AI Regulation

Authors: Yirong Sun and Jingxian Zeng

The following is a guest post to the FPF blog by Yirong Sun, research fellow at the NYU School of Law's Guarini Institute for Global Legal Studies: Global Law & Tech, and Jingxian Zeng, research fellow at the University of Hong Kong's Philip K. H. Wong Centre for Chinese Law. The guest blog reflects the opinion of the authors only. Guest blog posts do not necessarily reflect the views of FPF.

The Draft Measures for the Management of Generative AI Services (the "Draft Measures") were released on April 11, 2023, and their public comment period closed on May 10. Public statements by industry participants and legal experts provided insight into the likely content of those comments. It is now the turn of China's cyber super-regulator, the Cyberspace Administration of China ("CAC"), to consider the comments received and likely produce a revised text.

This blog analyzes the provisions and implications of the Draft Measures. It covers the Draft Measures’ scope of application, how they apply to the development and deployment lifecycle of generative AI systems, and how they deal with the ability of generative AI systems to “hallucinate” (that is, produce inaccurate or baseless output). It also highlights potential developments and contextual points about the Draft Measures that industry and observers should pay attention to.

The Draft Measures aim to protect the "collective" interests of "the public" within the territory of the People's Republic of China (PRC) in relation to the provision of generative AI services. The primary risk foreseen by the CAC is that the novel technology could be used to manipulate public opinion and fuel social mobilization by spreading sensitive or false information. The Draft Measures also seek to tackle issues arising from high-profile societal events, such as data leaks, fraud, privacy breaches, and intellectual property infringement, as well as overseas incidents widely reported in Chinese media, including defamation and extreme cases of suicide following interactions with AI chatbots. Notably, the Draft Measures set high standards for data authenticity and impose safeguards for personal information and user input. They also mandate the disclosure of information that may affect users' trust and the provision of guidance on using the service rationally.

Meanwhile, concerns have arisen that the Draft Measures may slow down the development of generative AI-based products and services by Chinese tech giants. Companies providing services based on generative AI, including services provided through application programming interfaces ("APIs"), are all subject to stringent requirements under the Draft Measures. The Draft Measures thus concern not only those who have the means to train their own models, but also smaller businesses that want to leverage open-source pre-trained models to deliver services. In this regard, the Draft Measures are likely to present compliance challenges in the open-source context.

While this blog focuses on the Draft Measures, it is important to note that industrial policies from both central and local governments in China also exert substantial influence over the sector. Critically, the task of promoting AI advancement amid escalating concerns is overseen by authorities other than the CAC, such as the Ministry of Science and Technology ("MST") and the Ministry of Industry and Information Technology ("MIIT"). Recently, the China Academy of Information and Communications Technology ("CAICT"), a research institute affiliated with the MIIT, introduced China's first-ever industry standards[1] for assessing generative AI products. These agencies, through both competition and coordination, can and will play a significant role alongside the CAC in the regulation of generative AI.

1. Notable aspects of the Draft Measures’ scope of application: Definition of “public” and extraterritorial application

Ambiguity in the definition of “public”

The Draft Measures regulate all generative AI-based services offered to "the public within the PRC territory."[2] This scope of application diverges from existing Chinese laws and regulations, where the intended service recipients are not usually considered. For instance, the regulations targeting deep synthesis and recommendation algorithms both apply to the provision of services using these technologies regardless of whether the service recipients are individuals, businesses, or "the public." Looking at its context, Article 6 of the Draft Measures suggests that generative AI-based services have the potential to shape public opinion or stimulate social mobilization, essentially highlighting their impact on "the public." This new development thus likely signals the CAC's goal of prioritizing the protection of wider societal interests over individual ones, such as privacy or intellectual property, which are already addressed under existing regulations.

However, the Draft Measures leave "the public (公众)" undefined. This gives rise to ambiguity about the Draft Measures' scope of application. For example, would a service licensed exclusively to a Chinese private entity for in-house use fall within scope? What about a service accessible only to certain public institutions but not to unaffiliated individuals, or one customized for individual clients who each receive a unique product derived from a common foundation model, or simply an open-source model that is ready to download and install?

Extraterritorial application

The new approach also suggests a more extensive extraterritorial reach. Regardless of where a service is provided, the Draft Measures apply as long as the public within the PRC territory has access to it. To avoid being subject to Chinese law, OpenAI, for example, has reportedly begun blocking users based in mainland China. This development could further restrict Chinese users' access to overseas generative AI services, especially since, even before the Draft Measures were released, most Chinese users' access to such services was already geo-blocked, either by the service providers themselves (e.g., by requiring a foreign telephone number for registration) or by the Chinese government through enforcement measures. At the same time, the scale of China's user market and its involvement in AI development make it a "vital" jurisdiction for AI regulation. OpenAI's CEO has recently called for collaboration with China to counter AI risks, a trend we may see more of in the future.

2. The Draft Measures adopt a compliance approach based on the lifecycle of generative AI systems 

The Draft Measures are targeted at “providers” of generative AI-based services

The Draft Measures take the approach of regulating generative AI-based service providers. As per Article 5, "providers (提供者)" are those "using generative AI to offer services such as chat, text, image, audio generation; including providing programmable interface and other means which support others to themselves generate text, images, audio, etc." These providers bear compliance obligations that span the entire lifecycle of their services.

Incentivizing providers to allocate risk upstream to developers

By imposing lifecycle compliance obligations on the end-providers, the Draft Measures create incentives for end-providers to allocate risks to upstream developers through mechanisms like contracts. Whether the parties can distribute their rights and obligations fairly and efficiently depends on various factors, such as the resources available to them and the presence of asymmetric information among them. To better direct this “private ordering” with significant social implications, the EU has planned to create non-binding standard contractual clauses based on each party’s level of control in the AI value chain. The CAC’s stance in this new and fast-moving area remains to be seen.

The Draft Measures pose potential challenges for deploying open-source generative AI systems

Open-source models raise a related but distinct issue. Open-source communities are currently developing highly capable large language models ("LLMs"), and businesses have compelling commercial incentives to adopt them, as training a model from scratch is costly and technically demanding. However, many open-source models are released without full disclosure of their training datasets, for reasons such as the extensive effort required for data cleaning and privacy concerns, especially when user data is involved. Adding to this complexity, open-source LLMs are not typically trained in isolation. Rather, they form a modification chain in which models build on top of each other, with modifications made by different contributors. Consequently, for those using open-source models, several obligations in the Draft Measures become difficult or even impossible to fulfill, including pre-launch assessment, post-launch retraining, and information disclosure.

3. The Draft Measures target the “hallucination” of generative AI systems

The Draft Measures describe generative AI as "technologies generating text, image, audio, video, code, or other such content based on algorithms, models, or rules." In contrast to the EU's new compromise text on rules for generative AI, which adopts a technical definition of "foundation models," the Draft Measures focus on the technology's function, regardless of the underlying mechanism. Moreover, according to Article 6 of the Draft Measures, generative AI-based services automatically fall under the scope of the Regulations for the Security Assessment of Internet Information Services Having Public Opinion Properties or Social Mobilization Capacity, which mandate a security assessment. A group of seven Chinese scholars has proposed removing this provision and applying the security assessment only to services that actually possess these properties.

The Draft Measures contain provisions aimed at ensuring accuracy throughout the developmental lifecycle of generative AI systems. These echo the CAC's primary concern that the technology could be misused to generate and disseminate misinformation. Article 7(4) of the Draft Measures stipulates that providers must guarantee the "veracity, accuracy, objectivity, and diversity" of the training data. Article 4(4) requires that all generated content be "true and accurate," and that providers of generative AI-based products and services have measures in place to "prevent the generation of false information." Such providers are responsible for filtering out any non-compliant material and preventing its regeneration within three months (Article 15). However, industry representatives and legal practitioners in China have raised concerns about the baseline and technical feasibility of ensuring data authenticity, given the use of open internet information and synthetic data in the development of generative AI.

4. Looking Ahead

The CAC is expected to refine the Draft Measures after gathering public feedback. The final version and subsequent promulgation may be influenced by a broader set of contextual factors. We believe the following aspects also warrant consideration:

Notes

[1] Major Chinese players in the AI industry are forming interest groups to channel their influence on policymakers. For example, China's industry standards for generative AI were drafted by over 40 entities, including tech companies such as Baidu, SenseTime, Xiaomi, and NetEase. SenseTime also launched an open platform for AI safety governance to shape practices around AI regulatory issues such as cybersecurity, traceability, and IP protection.

[2] A widely circulated translation of Article 2 states: "These Measures apply to the research, development, and use of products with generative AI functions, and to the provision of services to the public within the territory of the People's Republic of China." However, we believe this is misleading. A more accurate reading of the original Chinese text and its context suggests that "the provision of services to the public" is a cumulative requirement rather than a separate one.

[3] The Draft Measures seem to exhibit technical sophistication in their terminology. In Articles 7 and 17, the data compliance obligation is split into two phases: pre-training and optimization. However, the choice of terminology is peculiar, as the prevailing terms in machine learning are pre-training and fine-tuning. Optimization is typically employed to describe a stage within the training process, often used in conjunction with forward and backward propagation.

(Health) Data is What (Health) Data Does in Nevada

Note: This title is inspired by Professor Daniel J. Solove’s recent essay, ‘Data Is What Data Does: Regulating Based on Harm and Risk Instead of Sensitive Data.’

On June 16, 2023, Nevada Senate Bill 370 (SB 370) was signed into law by Governor Lombardo, making Nevada the second state, after Washington, to pass broad-based consumer health data privacy legislation this session. The act will take effect on March 31, 2024.

The Washington ‘My Health, My Data’ Act (MHMD), which was enacted on April 27, 2023, established a first-of-its-kind, comprehensive framework within U.S. law for the protection of consumer health data and health-related inferences. To help stakeholders assess how SB 370 fits into the expanding U.S. state health privacy landscape, the Future of Privacy Forum has released a chart comparing SB 370 to MHMD.

Download the Chart

SB 370 and MHMD adopt similar, but not identical, frameworks for protecting personal health data. Both laws restrict the disclosure of personal health data to third parties and limit the use of geofencing to collect information from or target content to people entering health care facilities. SB 370, however, establishes a use-based definition of “consumer health data,” applies to a narrower scope of covered entities, contains greater flexibility for businesses in responding to access and deletion requests, and provides sole enforcement through the state Attorney General.

Key differences include the following:

1. SB 370 applies to a narrower, use-based range of “consumer health data.” Rather than governing all consumer personal data that could potentially identify health status, SB 370 applies to information that a regulated entity “uses to identify the past, present or future health status of the consumer.” Furthermore, SB 370 excludes certain personal information concerning consumer shopping habits and interests.

2. As compared to MHMD, SB 370 covers fewer organizations, excluding Health Insurance Portability and Accountability Act (HIPAA) and Gramm-Leach-Bliley Act (GLBA)-covered entities, among others, from coverage. By contrast, MHMD excludes data that is subject to HIPAA and GLBA, but not HIPAA and GLBA-regulated entities in their entirety. Both laws apply to entities that “conduct business” in the state in which they were enacted or provide products or services targeted to state consumers and, solely or with others, “determine the purpose and means of processing, sharing, or selling consumer health data.”

3. SB 370 grants individuals a more limited “right to access” than MHMD. The law allows consumers to request access to a list of the third parties with whom a regulated entity has shared their consumer health data, but, unlike MHMD, does not grant individuals the right to access a copy of their health data held by the regulated entity.

4. Under SB 370, regulated entities have greater flexibility in responding to deletion requests. Entities are granted up to two years to comply with deletion requests for consumer health data contained within archival or backup systems, as opposed to the six months provided for under MHMD.

5. While MHMD contains a provision for enforcement through a private right of action, SB 370 is enforceable solely by the state Attorney General.

FPF Releases Report on Verifiable Parental Consent 

Today, FPF released a new report on the effectiveness of a key federal children's privacy requirement known as verifiable parental consent (VPC). The Children's Online Privacy Protection Act (COPPA) requires operators of child-directed services to provide parents with detailed, direct notice and obtain parents' affirmative express consent, known as verifiable parental consent, before collecting personal information from kids. While companies are not required to use one of the Federal Trade Commission's seven approved methods for obtaining VPC, most elect to do so.

FPF’s report, The State of Play: Is Verifiable Parental Consent Fit for Purpose?, and an accompanying infographic detail the mechanics of how VPC works; implementation challenges from both the parent and industry perspectives; and potential solutions, including alternative VPC methods and new regulatory approaches.

Download the report and infographic

“Some of the same technology used to establish VPC is also the foundation for the age estimation technology required by new laws in California, Utah, and the United Kingdom,” said Jim Siegl, senior technologist with FPF’s Youth & Education Privacy team. “Utah’s law ups the stakes further by expanding age verification requirements to older and broader audiences. Understanding the challenges and opportunities posed by VPC has never been more important, as the FTC’s recent order against Edmodo makes abundantly clear. We hope this paper will inform the ongoing conversation about the privacy risks of estimating the ages of internet users and the trade-offs between the accuracy and invasiveness of VPC and age estimation technologies.”

FPF’s new report builds on a previous discussion draft and feedback from stakeholders. Based on public comments about COPPA and additional insights from parents, advocates, industry representatives, and academics, the report details unique challenges with the current VPC mechanisms and approaches, as well as potential solutions.

The identified concerns with VPC include efficacy (many VPC methods are easily circumvented by children), accessibility (not everyone has a government-issued ID or a credit card), privacy/security (concerns over sharing sensitive personal information like a credit card number or photo ID), and convenience (inconveniences in the process cause users to drop off, frustrating parents and online providers). The report also considers other potential avenues to obtaining VPC and age assurance, such as the use of mobile phone SMS or text messaging, the device’s operating system, the point of purchase or setup of a device by a parent, artificial intelligence, and profiling, as well as the associated privacy, security and accuracy tradeoffs.

“The online experience for kids has evolved tremendously in the last few years, and it is clear we need regulations and legislation that can keep up with the ever-changing digital environment and legal landscape,” said Alexa Mooney, policy counsel with FPF’s Youth & Education Privacy team. “While there is no single solution that will get us to that point, the recommendations and ideas outlined in this report provide a great place to start, and we hope will help advance this important conversation.”

To learn more, read FPF’s new report, The State of Play: Is Verifiable Parental Consent Fit for Purpose? in full, as well as its analysis of the new laws in California, Utah, and the United Kingdom that contain broader age assurance requirements.

First Japan Privacy Symposium Convening G7 Regulators Focuses on Global Trends and Enforcement Priorities

The Future of Privacy Forum (FPF), a global non-profit focused on data protection and privacy, and S&K Brussels LPC will jointly present the first edition of the Japan Privacy Symposium on June 22, 2023. The event will convene in Tokyo, bringing together leaders in the Japanese privacy community with data protection and privacy regulators from across the globe.

The event coincides with the G7 Data Protection Authorities and Privacy Commissioners' Summit, and the Symposium's discussions will cover key issues in AI governance and data protection law, the future of adtech, global cooperation, and enforcement trends. The line-up of speakers includes Ms. Rebecca Kelly Slaughter (Commissioner, U.S. Federal Trade Commission), Dr. Wojciech Wiewiórowski (European Data Protection Supervisor), Mr. Philippe Dufresne (Federal Privacy Commissioner, Canada), Ms. Ginevra Cerrina Feroni (Vice President of the Garante, Italy), and Mr. John Edwards (Information Commissioner, UK), with a keynote address from Mr. Shuhei Ohshima (Commissioner, Japan's Personal Information Protection Commission).

“We’re excited to co-host this valuable event that will bring together data protection and privacy regulators from around the world alongside the Japanese privacy community,” Gabriela Zanfir-Fortuna, FPF’s Vice President for Global Privacy, said. “Data protection and privacy regulators from the G7 economies are meeting in Tokyo to strategize about coordinated approaches to tackle the challenges raised by the advancement of new technologies fueled by data and their impact on society, people, and economy. This Symposium offers a forum for the regulators and the Japanese data protection and privacy community members to exchange ideas, share an overview of the state of play in global regulation and strategize for the future.”

Takeshige Sugimoto, Managing Director and Partner at S&K Brussels LPC, FPF’s Senior Fellow for Global Privacy, and Co-Founder and Board Member of Japan DPO Association, added: “S&K Brussels is delighted to co-host the inaugural Japanese privacy symposium to bring together esteemed privacy and data protection leaders from G7 countries.  Opportunities for collaboration in the global data protection and privacy community are vital, and we hope that the Japan Privacy Symposium will set the stage for important participation and dialogue for years to come.”

FPF is focused on the expansion of its international reach in Asia, with its August 2021 Asia Pacific office opening in Singapore and the announcement of a new FPF APAC Managing Director, Josh Lee Kok Thong, last July.

For more information about the event, the agenda, and speakers, visit the FPF site.

###

About Future of Privacy Forum (FPF)

The Future of Privacy Forum (FPF) is a global non-profit organization that brings together academics, civil society, government officials, and industry to evaluate the societal, policy, and legal implications of data use, identify the risks and develop appropriate protections.

FPF believes technology and data can benefit society and improve lives if the right laws, policies, and rules are in place. FPF has offices in Washington D.C., Brussels, Singapore, and Tel Aviv. Follow FPF on Twitter and LinkedIn.

About S&K Brussels LPC

S&K Brussels LPC is a Japanese law firm, opened in Brussels, Belgium in 2019, composed of Japanese and foreign lawyers whose main practice area is data protection and privacy law across five jurisdictions: the EU, UK, US, China, and Japan. The firm focuses on "future-proof" efforts to anticipate the shape of forthcoming regulation, including AI regulation and other data-related rules closely linked to data protection legislation.


Future of Privacy Forum Recognizes Two Privacy and Technology Leaders with Career Awards

The Future of Privacy Forum (FPF) presented Maneesha Mithal, a long-time leader in privacy and consumer protection at the Federal Trade Commission, with the Distinguished Public Service Award, and Jane Horvath, Apple's former Chief Privacy Officer and a privacy and technology trailblazer of more than two decades, with the Career Achievement Award. The awards were presented at the FPF Advisory Board Meeting on June 13, an annual gathering of senior privacy leaders from companies, academia, civil society, and government.

The Distinguished Public Service Award recognizes an individual whose public service career is notable for advancing privacy protections as a government regulator. FPF awards the Career Achievement Award to private sector leaders who have made major contributions to advancing the values of data protection.

Jane Horvath and Maneesha Mithal have been trailblazers. Their achievements have directly benefited millions of people in the U.S. and globally by elevating standards for data protection.

Christopher Wolf, FPF Founder and Board President

Horvath recently stepped down as Chief Privacy Officer at Apple, where she led the company’s regulatory, policy, and product strategy on all privacy and cybersecurity-related legal matters. Horvath is currently a partner at Gibson, Dunn & Crutcher and is the co-chair for the firm’s Privacy, Cybersecurity, and Data Innovation Practice Group. She is also a member of their Administrative Law and Regulatory, Artificial Intelligence, Crisis Management, Litigation, Media and Entertainment, and Technology Practice Groups. Horvath has also previously served as Google’s Global Privacy Counsel and the Department of Justice (DOJ)’s first Chief Privacy Counsel and Civil Liberties Officer.


Mithal is a privacy and cybersecurity partner at Wilson Sonsini Goodrich & Rosati. She is an internationally recognized expert on privacy and data security, having led the Federal Trade Commission (FTC)’s Division of Privacy and Identity Protection prior to joining her current firm. Mithal also previously served as Chief of Staff and Senior Counsel in the Bureau of Consumer Protection (BCP). In her time as a public servant, Mithal oversaw teams that were responsible for the enforcement of privacy and security laws and the development of policy positions.

Jane Horvath and Maneesha Mithal have inspired me and so many others working in data protection. Each has had a tremendous impact in shaping for the better how our data is managed by companies, and each has shown that courageous individuals can drive real data protection progress.

Jules Polonetsky, FPF CEO

Previous award winners have included former FTC official Jessica Rich, Georgia Tech Professor Peter Swire, Irish Data Protection Commissioner Helen Dixon, former Procter & Gamble executive Sandra Hughes, and former Dell and Kodak privacy leader Dale Skivington.

FPF at CPDP 2023: Covering Hot Topics, from Data Protection by Design and by Default, to International Data Transfers and Machine Learning

At this year's annual Computers, Privacy and Data Protection (CPDP) conference in Brussels, several Future of Privacy Forum (FPF) staff took part in panels organized by FPF as well as by academic, industry, and civil society groups. This blog post provides a brief overview of these events; CPDP will publish recordings of them shortly.

May 24: EU Commission and ASEAN launch Joint Guide to Model Clauses for Data Transfers

On the conference's first day, FPF Vice President for Global Privacy Gabriela Zanfir-Fortuna joined a panel on the GDPR's effectiveness organized by the Haifa Center for Law and Technology (Faculty of Law, University of Haifa), alongside Tal Zarsky, Dean and Professor of Law at the University of Haifa's Faculty of Law, Raphael Gellert, assistant professor in ICT and private law at Radboud University, Sam Jungyun Choi, associate in the technology regulatory group of Covington & Burling LLP, and Amit Ashkenazi, research student and adjunct lecturer on cyber law and policy at the University of Haifa. The panel contributed to current reflections on the effectiveness of the GDPR, five years after it became applicable, by focusing on challenges arising from the regulatory design it sets forth. Gabriela noted that data protection law is much broader than the GDPR, since the right to the protection of personal data is protected as a fundamental right at a constitutional level in the EU. She also stressed that the ongoing application of the GDPR has catalyzed broader societal interest in law, technology, and data protection rights and concepts.


Photo: CPDP Panel on Exploring the Many Faces of the GDPR – in Search of Effective Data Protection Regulation, 5/24/2023

Later that day, Gabriela moderated a panel organized by the European Commission, which served as a platform to launch a "Joint Guide to ASEAN Model Contractual Clauses and EU Standard Contractual Clauses." The Guide identifies commonalities between the two sets of model clauses and aims to "assist companies present in both jurisdictions with their compliance efforts under both sets of clauses." The panel was joined by Denise Wong, Deputy Commissioner-designate of the Personal Data Protection Commission Singapore and Assistant Chief Executive-designate of the Infocomm Media Development Authority, Alisa Vekeman of the European Commission's International Affairs and Data Flow team, and Philipp Raether, Group Chief Privacy Officer at Allianz. The panelists noted that model clauses are the most used mechanism for international data transfers and that efforts like the Joint Guide are a promising route toward a global regime underpinning flows of personal data across jurisdictions while providing safeguards for individuals and their data. Officials on the panel added that the Guide is just the first step in this EU-ASEAN collaboration on model clauses, and that a set of best practices from companies using both instruments is expected in the near future.

To wrap up the first day of the conference, FPF's Policy Counsel for Global Privacy, Katerina Demetzou, joined a panel on the constitutionalization of data rights in the Global South, along with Mariana Marques Rielli, Institutional Development Coordinator at Data Privacy Brazil, Laura Schertel Mendes, Law Professor at the University of Brasilia (UnB) and at the Brazilian Institute for Development, Education and Research (IDP) and Senior Visiting Researcher at the Goethe-Universität Frankfurt am Main under the Capes/Alexander von Humboldt Fellowship, and Risper Onyango, advocate of the High Court of Kenya, currently serving as Digital Policy Lead in the Digital Economy Department at the Lawyers Hub. In her intervention, Katerina explored how Data Protection Authorities in Europe have applied the GDPR to emotion recognition AI systems and to generative AI. Her examples emphasized that discussions about AI governance and regulation should examine how existing data protection law already applies to these systems and develop in response to gaps in those legal frameworks.


Photo: Panel on From Theory to Practice: Digital Constitutionalism and Data Justice in Movement in the Global South, 5/24/2023

May 25: High Level Discussion Spurred by FPF’s Data Protection by Design and by Default Case-Law Report

On May 17, FPF launched a comprehensive Report on the enforcement of the EU GDPR's Data Protection by Design and by Default (DPbD&bD) obligations, which are outlined in Article 25 GDPR. The Report is informed by extensive research covering more than 92 decisions from Data Protection Authorities (DPAs) and national courts, as well as specific guidance and other policy documents issued by regulators.

On May 25, FPF organized a panel, moderated by the Report's co-author Christina Michelakaki, FPF Policy Fellow for Global Privacy, on the enforcement of Article 25 GDPR and the uptake of Privacy Enhancing Technologies (PETs). Marit Hansen, State Data Protection Commissioner of Land Schleswig-Holstein, Jaap-Henk Hoepman, Professor at Radboud University Nijmegen/University of Groningen, Cameron Russell, Primary Privacy Advisor on Global Payments Matters at eBay, and Stefano Leucci, Legal and Technology Expert at the European Data Protection Supervisor, joined the panel. The speakers offered their perspectives on Article 25 GDPR enforcement, delving into topics such as the interrelation between dark patterns and by-default settings, the role of Article 25 GDPR in preventing harms from AI systems, and the maturity of PETs.

Photos: CPDP workshop on State-of-Play of Privacy Preserving Machine Learning (PPML), and CPDP Panel on the Enforcement of Data Protection by Design & Default: Consequences for the Uptake of Privacy-Enhancing Technologies, 5/25/2023


FPF's Managing Director for Europe, Rob van Eijk, organized and facilitated a workshop exploring how to clear the path toward alternative solutions for the processing of (personal) data with machine learning. Four data scientists, Lindsay Carignan (Holistic AI), Nigel Kingsman (Holistic AI), Victor Ruehle (Microsoft Research), and Reza Shokri (National University of Singapore), joined the workshop. The group introduced an easy-to-understand privacy auditing framework that quantitatively measures privacy risks in ML systems, while also exploring the relationship between bias and regulatory requirements in legislation such as the EU AI Act. You can watch the recording of the workshop here.

The same day, Rob also joined a panel on PETs, consumer protection, and the online ads ecosystem with Marek Steffen Jansen, Privacy Policy Lead – EMEA/Global at Google, Anthony Chavez, VP of Product Management of Google, Marie-Paule Benassi, lawyer, economist, data scientist, and Head of Enforcement of Consumer Law and Redress at the European Commission, Stefan Hanloser, VP Data Protection Law at ProSiebenSat.1 Media SE, and Christian Reimsbach-Kounatze, Information Economist and Policy Analyst at the OECD Directorate for Science, Technology and Innovation. You can watch the recording of the panel here.

May 26: Reflections on automation, compliance and data protection law

Finally, Gabriela participated in a day-long "philosopher's seminar" on compliance and automation in data protection law, organized by CPDP, ALTEP-DP, and COHUBICOL under the leadership of Prof. Mireille Hildebrandt; the seminar's discussions will feed into a series of research papers to be published later in 2023.

While celebrating the five-year anniversary of the GDPR becoming applicable, at a pivotal moment of growth and change for emerging technologies, CPDP 2023 in Brussels gave the FPF team an extraordinary opportunity to engage in and facilitate collaborative dialogues with leading academics, technologists, policy experts, and regulators.

Connecticut Shows You Can Have It All

On June 3rd, Connecticut Senate Bill 3 (SB 3), an “Act Concerning Online Privacy, Data and Safety Protections,” cleared the state legislature following unanimous votes in the House and Senate. If enacted by Governor Lamont, SB 3 will amend the Connecticut Data Privacy Act (CTDPA) to create new rights and protections for consumer health data and minors under the age of 18, and also make small-but-impactful amendments to existing provisions of the CTDPA. The bill also contains some standalone sections, such as a section requiring the operators of online dating services within the state to implement new safety features, including a mechanism to report “harmful or unwanted” behavior.

The children’s and health provisions of SB 3 appear to be informed by the California Age-Appropriate Design Code (AADC) and the recently enacted Washington State My Health, My Data Act, respectively, but contain numerous important distinctions.  FPF has prepared a comparison chart to help stakeholders assess how SB 3’s youth privacy provisions compare to the California AADC. The provisions related to consumer health data will take effect on October 1, 2023, while the new requirements governing minors’ data and accounts will take effect a year later, on October 1, 2024.

New protections for youth online (Sections 7-13)

Sections 8-13 of SB 3 create new protections for youth online by expanding youth-specific protections to include teens up to 18, placing limits on certain data processing activities, and requiring services to assess risk to minors through data protection assessments. SB 3 appears to draw inspiration from the California Age-Appropriate Design Code Act’s (AADC) obligations and prohibitions but includes many divergences, which are assessed in further detail in a comparison chart. If enacted, these provisions will go into effect on October 1, 2024, with a right to cure until December 31, 2025. Additionally, Section 7 of the bill specifically regulates social media platforms and is largely focused on facilitating requests from a minor or minor’s parent to “unpublish” a minor’s social media account within 15 business days.

1. Scope

The obligations in Sections 8-13 will apply to controllers offering any online service, product, or feature to consumers whom the controller has actual knowledge, or wilfully disregards, are minors. "Minors" is defined as any consumers under 18, in line with recently passed legislation in California and Florida. SB 3 borrows the California AADC's "online service, product, or feature" scope but retains the CTDPA's "actual knowledge, or wilfully disregards" knowledge standard rather than the California AADC's "likely to be accessed" standard. As written, it appears that the data protection and design obligations under the proposal would apply on an individualized basis to the minors the bill aims to protect, rather than governing the entire service. Additionally, there are no affirmative age estimation requirements in the proposal, meaning that the scope of SB 3 is narrower than the California AADC because it only applies to controllers who have actual knowledge, or wilfully disregard, that minors are using their service. These divergences may be in response to First Amendment objections raised in the NetChoice v. Bonta litigation seeking to strike down the California AADC.

2. Key obligations

SB 3 requires controllers to use reasonable care to avoid “any heightened risk of harm to minors” caused by their service. “Heightened risk of harm to minors” is defined to mean “processing minors’ personal data in a manner that presents any reasonably foreseeable risk of (A) any unfair or deceptive treatment of, or any unlawful disparate impact on minors, (B) any financial, physical or reputational injury to minors, or (C) any physical or other intrusion upon the solitude or seclusion, or the private affairs or concerns, of minors if such intrusion would be offensive to a reasonable person.” This requirement is reminiscent of the California AADC’s “material detriment” language, though “material detriment” and “harm” are undefined within the California AADC, and thus SB 3 may provide more clarity to controllers in scope.

Building off the data protection assessment requirements set forth in the CTDPA, SB 3 requires controllers' data protection assessments to address (1) the purpose of the service, (2) the categories of minors' personal data processed by the service, (3) the purpose of the data processing, and (4) any heightened risk of harm to minors that is a reasonably foreseeable result of offering the service. The bill specifically notes that a single data protection assessment may address a comparable set of processing operations that include similar activities. If controllers comply with the bill's data protection assessment requirements, there is a rebuttable presumption in any enforcement action brought by the State AG that the controller used the reasonable care required to avoid heightened risk of harm to minors.

SB 3 includes several data processing limits that are subject to the consent of a minor or the minor's parent. While 2023 has seen other states pass legislation requiring teens to obtain parental consent, thereby treating all minors the same for purposes of exercising rights online, SB 3 allows minors 13 and older to consent for themselves. Absent consent, controllers are prohibited from processing data not reasonably necessary to provide a service, retaining data for longer than necessary, and using any system design feature to "significantly increase, sustain or extend" a minor's use of the service. Although data minimization is a key privacy principle found in most privacy proposals, it is atypical for it to be subject to consent. Targeted advertising and the sale of a minor's personal data are also subject to the consent of a minor or the minor's parent, expanding the CTDPA's existing protections for teens, which create opt-in requirements for the sale, or processing for targeted advertising, of data from teens aged 13 to 15.

In addition to the above limits subject to the consent of a minor, SB 3 creates new prohibitions for controllers offering services to minors. Like the California AADC, SB 3 limits the collection of precise geolocation information and requires a signal to be provided when that information is being collected. While neither SB 3 nor the California AADC gives guidance or further definition on "signal," the California AADC specifies an "obvious signal." The bill also includes two design-related prohibitions: controllers are prohibited from providing any consent mechanisms designed to impact user autonomy or choice, and from offering direct messaging without providing "readily accessible and easy-to-use safeguards" to limit the ability to receive messages from adults with whom the minor is not connected.

New protections for consumer health data (Sections 1-6)

The CTDPA designates data revealing “health condition and diagnosis” information as a sensitive category of personal data subject to heightened protections, including an affirmative consent requirement for processing. SB 3 aims to expand the CTDPA’s protections for consumer health information by (1) creating a new sensitive data category under the CTDPA of “consumer health data,” (2) creating protections governing the collection and processing of “consumer health data,” applicable to a broad range of entities, and (3) establishing restrictions on the geofencing of healthcare facilities.

1. Definitions

If enacted, SB 3 will add eleven new health-related definitions to the CTDPA, including the terms “abortion,” “consumer health data,” “geofence,” “gender-affirming health data,” and “reproductive or sexual health data.” SB 3 is focused on establishing protections for “consumer health data,” defined as “any personal data that a controller uses to identify a consumer’s physical or mental health condition or diagnosis, and includes, but is not limited to, gender-affirming health data and reproductive or sexual health data” (emphasis added). This is a narrower definition of “consumer health data” than established under the Washington ‘My Health, My Data’ Act (MHMD), which applies to personal information that “identifies” a consumer’s health status, even if not used for a health-related purpose. 

SB 3's focus on "data used to identify physical or mental health condition or diagnosis" differs slightly from the CTDPA's original protections for "data revealing mental or physical health condition or diagnosis" in that it centers on a regulated entity's use of data, rather than the nature of a data point. Data is subject to these new health data protections when an entity uses it to identify something about a consumer's health, seemingly including through inference, whether or not that data "reveals" something about a consumer's health on its face. In addition, SB 3's definition of "consumer health data" explicitly includes "gender-affirming" and "reproductive and sexual" health information. It remains to be seen what the impact of this distinction will be when the CTDPA takes effect.

2. Expanded Protections for the Collection and Processing of “Consumer Health Data”

SB 3 would create several protections exclusive to consumer health data that apply to “persons,” a category that includes non-profits and small businesses, which are otherwise excluded from coverage under the CTDPA. First, SB 3 requires that any employee or contractor with access to consumer health data shall be subject to either a contractual or statutory duty of confidentiality. In addition, the Act will forbid entities that collect and process consumer health data from selling that health data without prior consumer consent.

3. Restrictions on Geofencing

SB 3 follows MHMD in responding to concerns about the geofencing-facilitated digital harassment of individuals visiting abortion and gender-affirming care facilities post-Dobbs v. Jackson Women’s Health Organization by forbidding “persons” from geofencing mental, reproductive, or sexual health facilities for certain purposes. These purposes include the geofencing of health facilities conducted in order to (1) identify, (2) track, (3) collect data from, or (4) send health-related notifications to consumers. The act defines “geofence” broadly, as “any technology that uses global positioning coordinates, cell tower connectivity, cellular data, radio frequency identification, wireless fidelity technology data or any other form of location detection, or any combination of such coordinates, connectivity, data, identification or other form of location detection, to establish a virtual boundary.”
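In engineering terms, a geofence of the kind the definition describes usually reduces to a set of coordinates plus a membership test run against incoming location signals. The sketch below is a minimal, purely illustrative example of such a radius-based boundary check; the coordinates, radius, and function names are hypothetical choices of ours and are not drawn from SB 3, MHMD, or any real ad-tech or analytics implementation.

from math import radians, sin, cos, asin, sqrt

def distance_meters(lat1, lon1, lat2, lon2):
    # Great-circle distance between two points via the haversine formula.
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6_371_000 * asin(sqrt(a))  # mean Earth radius of ~6,371 km

def inside_geofence(device_lat, device_lon, fence_lat, fence_lon, radius_m=500):
    # True if a device's reported location falls within the virtual boundary.
    return distance_meters(device_lat, device_lon, fence_lat, fence_lon) <= radius_m

# Hypothetical example: test a reported location against a 500 m boundary.
print(inside_geofence(41.7660, -72.6830, 41.7658, -72.6734))

Under SB 3, it is the purpose for which such a check is run (identifying, tracking, collecting data from, or sending health-related notifications to consumers near a health facility) that triggers the prohibition, not the particular location technology used.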

Other modifications to CTDPA

In addition to the substantive changes creating new consumer rights for consumer health data and youth data, SB 3 makes minor but meaningful changes to CTDPA. FPF observes 4 notable changes:

(1) “Data concerning an individual’s status as a victim of crime” is added to the “sensitive personal data” definition, perhaps inspired by pending legislation in Oregon.

(2) Consistent with other state privacy laws, Tribal nation government organizations and air carriers are carved out of scope of the CTDPA.

(3) The knowledge standard for processing youth data was modified from "actual knowledge and wilfully disregards" to "actual knowledge or wilfully disregards." This amendment fixes a likely drafting error and aligns the CTDPA's knowledge standard with the CCPA and Montana, strengthening privacy protections for children.

(4) Finally, SB 3 clarifies the Connecticut Attorney General may consider the “sensitivity of the data” involved in a violation of the CTDPA, along with other factors, when determining whether to grant a controller or consumer health data controller a right to cure.

Conclusion

Connecticut’s unanimous passage of SB 3 reflects the urgency of the new priorities around health and kids’ privacy that have permeated the 2023 legislative session. When these provisions take effect in October, the modified CTDPA will provide a template for other states that may wish to integrate protections for consumer health data within their comprehensive privacy laws, rather than passing standalone laws like MHMD. Similarly, Connecticut provides a template for states seeking to increase protections for youth online by first setting baseline standards for all consumers and then building off of that framework to create heightened protections for those under 18.

FPF Submits Comments in Response to the Consumer Financial Protection Bureau’s Request for Information on Data Brokers

On June 5, the Future of Privacy Forum filed comments with the Consumer Financial Protection Bureau (CFPB) in response to their Request for Information (RFI) Regarding Data Brokers and Other Business Practices Involving the Collection and Sale of Consumer Information.

In 2021, FPF explored the landscape of the current data broker industry in testimony presented to the Senate Finance Subcommittee on Fiscal Responsibility and Economic Growth. Since then, emerging data practices have continued to create potential risks for individuals and to raise novel questions about the scope of the Fair Credit Reporting Act (FCRA). Meanwhile, the exclusion of FCRA-covered activities from the state-level comprehensive privacy laws passed in recent years reinforces the critical need for federal leadership to establish jurisdictional clarity and to address privacy risks. 

FPF’s comments encourage the CFPB to analyze the broad range of business activities that can be considered “data brokerage,” and use the Bureau’s regulatory instruments to address specific risks posed by emerging technologies and business practices, including:

FPF’s full comments to the CFPB are available here.

AI Verify: Singapore’s AI Governance Testing Initiative Explained

In recent months, global interest in AI governance and regulation has expanded dramatically. Many identify a need for new governance and regulatory structures in response to the impressive capabilities of generative AI systems, such as OpenAI’s ChatGPT and DALL-E, Google’s Bard, Stable Diffusion, and more. While much of this attention focuses on the upcoming EU AI Act, there are other significant initiatives around the world proposing different AI governance models or frameworks.

This blog post covers "AI Verify," Singapore's AI governance testing framework and toolkit, announced in May 2022. Our analysis has three key parts. First, we summarize Singapore's overall approach to AI governance and the key initiatives the Singapore Government released on AI governance prior to the launch of AI Verify. Second, we explain the key components of AI Verify. Finally, as we approach the anniversary of AI Verify's roll-out, we explore what the future may hold for AI Verify and for Singapore's approach to AI governance and regulation. Briefly, the key takeaways are:

1. Singapore’s overall approach to AI governance

In Singapore’s high-level strategy for AI, the National AI Strategy (NAIS), the country announced it aims to be “at the forefront of development and deployment of scalable, impactful AI solutions,” hoping to cement itself as “a global hub for developing, test-bedding, deploying, and scaling AI solutions.” Among the five “ecosystem enablers” identified in the strategy to increase AI adoption is the development of a “progressive and trusted environment” for AI  – one that strikes a balance between innovation and minimization of societal risks. 

To create this "progressive and trusted environment," Singapore has so far adopted a light-touch and voluntary approach to AI regulation. This approach recognizes two practical realities about Singapore's AI ambitions. First, the Singapore Government sees AI as a key strategic enabler for developing its economy and improving the quality of life of its citizens. This explains why Singapore is not taking a heavy-handed approach to regulating AI, lest it stifle innovation and investment. Second, given its size, Singapore is aware that it is likely to be a price-taker rather than a price-setter as AI governance discourse, frameworks, and regulations develop globally. Thus, rather than introducing new AI principles afresh, the current approach is to "take the world where it is, rather than where it hopes the world to be."

Before the release of AI Verify in 2022, Singapore’s approach to AI regulation – as overseen by the Personal Data Protection Commission of Singapore (PDPC) – had three pillars: 

  1. The Model AI Governance Framework (Model Framework). 
  2. The Advisory Council on the Ethical Use of AI and Data (Advisory Council).
  3. The Research Programme on the Governance of AI and Data Use (Research Program). 

As we aim to highlight the substantive aspects of Singapore’s AI regulatory approach, the following paragraphs will focus on the Model Framework. 

The Model Framework

The Model Framework, first launched at the World Economic Forum Annual Meeting (WEF) in 2019, is a voluntary and non-binding framework that guides organizations in the responsible deployment of AI solutions at scale, noting that this framework does not concern the development phase of these technologies. As a guide, the Model Framework sets out practical recommendations for AI deployments for private sector entities, as the public sector’s use of AI is governed by internal guidelines and AI and data governance toolkits. The Model Framework is billed as a “living document,” as it is meant to evolve through future editions alongside technological and societal developments. The Model Framework is also technology-, industry-, scale- and business-model agnostic. 

Substantively, the Model Framework is guided by two fundamental principles to promote trust and understanding in AI. First, organizations using AI in decision-making should ensure that the decision-making process is explainable, transparent and fair. Second, AI systems should be human-centric: the protection of human well-being and safety should be primary considerations in designing, developing and using AI.

The Model Framework translates these guiding principles into implementable practices across four key areas of an organization's decision-making and technology-development processes:

(a) Internal governance structures and measures;

(b) Determining the level of human involvement in AI-augmented decision-making;

(c) Operations management; and

(d) Stakeholder interaction and communication.

The summary below sets out some of the suggested considerations, practices, and measures falling under each of these key areas.

Internal governance structures and measures
- Clear roles and responsibilities: use existing or set up new corporate governance and oversight processes; ensure staff are appropriately trained and equipped.
- Internal controls: a monitoring and reporting system to ensure awareness at the appropriate level of management; managing personnel risk; periodic reviews.

Human involvement in AI-augmented decision-making
- Appropriate level of human intervention: use a probability-severity of harm matrix to determine the level of human involvement (see the illustrative sketch after this summary).
- Incorporate corporate and societal values in decision-making.

Operations management
- Good data accountability: data lineage, quality, accuracy, completeness, veracity, relevance, integrity, etc.
- Minimizing bias in data / model: heterogeneous datasets; separate training, testing and validation datasets; repeatability assessments, counterfactual testing, etc.; regular review and tuning.

Stakeholder interaction and communication
- General disclosure: being transparent when AI is used in products and services; using simple language, with communication appropriate to the audience, purpose and context.
- Increased transparency: information on how AI decisions may affect individuals.
- Feedback channels: avenues for feedback and review of decisions.
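The Model Framework's probability-severity of harm matrix is essentially a decision aid for choosing between human-in-the-loop, human-over-the-loop, and human-out-of-the-loop oversight. The sketch below is only an illustration of how an organization might operationalize that idea; the numeric thresholds, scale, and function name are hypothetical assumptions of ours and are not taken from the Model Framework itself.

def oversight_level(probability_of_harm: float, severity_of_harm: float) -> str:
    # Map harm probability and severity (both on a 0-1 scale, as assessed by the
    # organization) to an indicative level of human oversight.
    if severity_of_harm >= 0.7 or (severity_of_harm >= 0.4 and probability_of_harm >= 0.5):
        return "human-in-the-loop"    # a human approves each decision
    if severity_of_harm >= 0.4 or probability_of_harm >= 0.5:
        return "human-over-the-loop"  # a human monitors and can intervene
    return "human-out-of-the-loop"    # fully automated, with periodic review

# Example: a low-probability but high-severity harm still calls for close oversight.
print(oversight_level(0.2, 0.9))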

Other initiatives accompanying the Model Framework

When Singapore released the second edition of the Model Framework at the WEF in 2020, it was released alongside two other documents: the Implementation and Self-Assessment Guide for Organisations (ISAGO) and the Compendium of Use Cases (Compendium – Volume 1 and Volume 2). The ISAGO is a checklist helping organizations assess the alignment of their AI governance processes with the Model Framework. The Compendium provides real-life examples of the adoption of the Model Framework’s recommendations across various sectors, use cases, and jurisdictions. 

Collectively, the Model Framework and its suite of accompanying documents anchored and outlined substantive thinking on AI regulation in Singapore. These initiatives led to Singapore winning a United Nations World Summit on the Information Society Prize in 2019, recognizing its efforts as a frontrunner in AI governance. 

2. AI Verify in a Nutshell

January 2020 marked a turning point for global discourse on AI regulation. On January 17, 2020, a leaked white paper from the European Commission brought international attention to the increasing possibility of government regulation of AI technology. In February 2020, the European Commission formally issued a White Paper on Artificial Intelligence, which, among other things, set out plans to create a regulatory framework for AI. In April 2021, the European Commission published its proposal for the AI Act: for the first time, a major government was making a serious attempt to introduce substantive rules to horizontally regulate the development and use of AI systems. Due to the expected extraterritorial reach of the AI Act, companies developing AI systems outside of Europe could potentially be covered by the new law.

These developments influenced thinking about the future of Singapore’s AI regulatory and governance landscape. While the PDPC maintained its voluntary and light-touch approach to AI regulation, it acknowledged a future in which AI faces heightened oversight. The PDPC seemed to also be mindful of growing consumer awareness and demand for trustworthiness from AI systems and developers, a need for international standards on AI to benchmark and assess AI systems against regulatory requirements, and an increasing need for interoperability of AI regulatory frameworks. With these in mind, Singapore began developing the framework that eventually coalesced into AI Verify.


What is AI Verify?

Launched by the Infocomm Media Development Authority (IMDA), a statutory board under the Singapore Ministry of Communications and Information, and the PDPC, AI Verify is an AI governance testing framework and toolkit. Using AI Verify, organizations can conduct a voluntary self-assessment of their AI systems through a combination of technical tests and process-based checks. The system, in turn, helps companies objectively and verifiably demonstrate to stakeholders that their AI systems have been implemented in a responsible and trustworthy manner.

Given that AI testing methodologies, standards, metrics and tools continue to develop, AI Verify is currently at a “Minimum Viable Product” (MVP) stage. This has two implications. First, the MVP version has several technical limitations, including limits on the types and sizes of AI models and datasets it can test or analyze. Second, AI Verify is expected to evolve as AI testing capabilities mature.

The four aims for developing an MVP version of AI Verify are:

(a) First, IMDA hopes that organizations are able to use AI Verify to determine performance benchmarks for their AI systems, and demonstrate these claimed benchmarks to stakeholders such as consumers and employees, thereby helping organizations enhance trust.

(b) Second, given that it was developed with various AI regulatory and governance frameworks and common trustworthy AI principles in mind, AI Verify seeks to help organizations find commonalities across global AI governance frameworks and regulations. IMDA is also continuing to engage regulators and standards organizations to map AI Verify’s testing framework onto established frameworks. These efforts are aimed at allowing businesses to operate and offer AI-enabled products and services in multiple markets, while allowing Singapore to act as a hub for AI governance and regulatory testing.

(c) Third, as organizations trial AI Verify and use its testing framework, IMDA will be able to collate industry practices, benchmarks and metrics. Since Singapore participates in global AI governance platforms such as the Global Partnership on AI and ISO/IEC JTC1/SC 42, these can feed valuable perspectives into the development of international standards on AI governance.

(d) Fourth, IMDA hopes AI Verify will allow Singapore to create a local AI testing community, consisting of AI developers and system owners (who are seeking to test AI systems), technology providers (who are developing AI governance implementation and testing solutions), advisory service providers (specializing in testing and certification support), and researchers (who are developing testing technologies, benchmarks and practices). 

It is also important to clarify several potential misconceptions about AI Verify. First, AI Verify is not an attempt to define ethical standards, nor does it attempt to classify AI systems along a clear bright line. Instead, AI Verify provides verifiability: it allows AI system developers and owners to demonstrate their claims about the performance of their AI systems. Second, an organization’s use of AI Verify does not guarantee that tested AI systems are free from risks or biases, or that they are completely “safe” or “ethical.” Third, AI Verify is designed to prevent organizations from unintentionally divulging sensitive information about their AI systems (such as their underlying code or training data). A key safeguard is that AI system developers and owners conduct the self-testing themselves, which allows the organization’s data and models to remain within the organization’s operating environment.

How does AI Verify work?

AI Verify consists of two parts. The first is a Testing Framework, which references eleven internationally accepted AI ethics and governance principles, grouped into five pillars. The second is a Toolkit that organizations use to execute technical tests and to record process checks from the Testing Framework.

AI Verify’s Testing Framework

The five pillars and eleven principles in AI Verify’s Testing Framework, as well as their expected assessment, are:

Pillar 1: Transparency on use of AI and AI systems. This pillar is about disclosing to individuals that AI is used in a technological system, so that they can be aware and make an informed choice about whether to use the AI-enabled system.
- Transparency (providing appropriate information to individuals impacted by AI systems): assessed through process checks of documentary evidence (e.g., company policy and communication collaterals) showing that appropriate information is provided to individuals who may be impacted by the AI system. This information includes (subject to the need to avoid compromising IP, safety, and system integrity) the use of AI in the system, its intended use, its limitations, and risk assessments.

Pillar 2: Understanding how an AI model reaches a decision. This pillar is about allowing individuals to understand the factors contributing to an AI model’s output, while also ensuring output consistency and accuracy in similar conditions.
- Explainability (understanding and interpreting the decisions and output of an AI system): assessed through a combination of technical tests and process checks. Technical tests identify the factors contributing to an AI model’s output; process checks include verifying documentary evidence of the considerations given to the choice of model, such as rationale, risk assessments, and trade-offs.
- Repeatability / reproducibility (ensuring consistency in AI output by being able to replicate an AI system, either internally or through a third party): assessed through process checks of documentary evidence, including evidence of AI model provenance, data provenance, and the use of versioning tools.

Pillar 3: Ensuring safety and resilience of the AI system. This pillar is aimed at helping individuals understand that the AI system will not cause harm, is reliable, and will perform according to its intended purpose even when it encounters unexpected input.
- Safety (conducting impact / risk assessments and ensuring that known risks have been identified and mitigated): assessed through process checks of documentary evidence of materiality and risk assessments, including how known risks of the AI system have been identified and mitigated.
- Security (ensuring the cyber-security of AI systems): presently not assessed.
- Robustness (ensuring that the AI system can still function despite unexpected input): assessed through a combination of technical tests and process checks. Technical tests assess whether a model performs as expected when provided with unexpected inputs (an illustrative sketch of this kind of test follows this overview); process checks include verifying documentary evidence of a review of the factors that may affect the performance of the AI model, including adversarial attacks.

Pillar 4: Ensuring fairness. This pillar is about evaluating whether the data used to train the AI model is sufficiently representative, and testing to ensure that the AI system will not unintentionally discriminate.
- Fairness (avoiding unintended bias, ensuring that the AI system makes the same decision even if a certain attribute is changed, and ensuring that the data used to train the model is representative): mitigation of unintended discrimination is assessed through a combination of technical tests and process checks. Technical tests check that the AI model does not produce biased results based on protected or sensitive attributes specified by the system owner, by checking the model output against the ground truth; process checks include verifying documentary evidence that there is a strategy for selecting fairness metrics aligned with the desired outcomes of the AI system’s intended application, and that the definition of sensitive attributes is consistent with legislation and corporate values.
- Data governance (ensuring the source and quality of data by adopting good data governance practices when training AI models): presently not assessed.

Pillar 5: Ensuring proper (human) management and oversight of the AI system. This pillar is about assessing human accountability and control in the development and/or deployment of AI systems, and whether the AI system is aimed at beneficial purposes for general society.
- Accountability (ensuring proper management oversight during AI system development): assessed through process checks of documentary evidence, including evidence of clear internal governance mechanisms for proper management and oversight of the AI system’s development and deployment.
- Human agency and oversight (ensuring that the AI system is designed in a way that will not diminish the ability of humans to make decisions): assessed through process checks of documentary evidence that the AI system is designed so that it does not reduce humans’ ability to make decisions or to take control of the system, including by defining the role of humans in its oversight and control (human-in-the-loop, human-over-the-loop, or human-out-of-the-loop).
- Inclusive growth, societal and environmental well-being (ensuring beneficial outcomes for people and the planet): presently not assessed.
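
AI Verify’s technical tests are run through its Toolkit rather than hand-written scripts, but as a rough illustration of the kind of robustness check described under the Robustness principle above (the model, synthetic data, and perturbation scale below are assumptions, not AI Verify’s actual procedure), one could compare a model’s predictions on clean and slightly perturbed inputs:

```python
# Illustrative robustness check (not AI Verify's actual test): compare a model's
# predictions on clean inputs with its predictions on slightly perturbed inputs.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Assumed synthetic data and model, standing in for a real system under test
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X, y)

rng = np.random.default_rng(0)
X_perturbed = X + rng.normal(scale=0.1, size=X.shape)  # assumed perturbation scale

agreement = (model.predict(X) == model.predict(X_perturbed)).mean()
print(f"Prediction agreement under perturbation: {agreement:.2%}")
```

A low agreement rate under small perturbations would flag the model for closer review under the Robustness principle.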

The actual Testing Framework has several key components:

(a) Definitions: The Testing Framework provides easy-to-understand definitions for each of the AI principles. For example, explainability is defined as the “ability to assess the factors that led to (an) AI system’s decision, its overall behavior, outcomes and implications.”

(b) Testable criteria: For each principle, a set of testable criteria is provided. These criteria are a mix of technical and/or non-technical (e.g. processes, procedures, or organizational structures) factors that contribute to the achievement of the desired outcomes of that governance principle.

Using the example of explainability, two testable criteria are provided. A developer can run explainability methods to help users understand the drivers of the AI model. A developer can also demonstrate a development preference for AI models that can explain their decisions or that are interpretable by default.  

(c) Testing process: For each testable criterion, AI Verify provides the processes or actionable steps to be carried out. These steps could be quantitative (such as statistical or technical tests) or qualitative (such as producing documentary evidence during process checks).

For explainability, a technical test could involve empirically analyzing and determining feature contributions to a model’s output. A process-based test would be to document the rationale, risk assessments, and trade-offs of an AI model. 

(d) Metrics: These are quantitative or qualitative parameters used to measure, or provide evidence for, each testable criterion.

Using the explainability example above, the metric for determining feature contributions could be the contributing features of a model’s output, as obtained from a technical tool (such as SHAP or LIME). The process-based metric could be documented evidence of the evaluations made when choosing the final model, such as risk assessments and trade-off weighing exercises.
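
As a minimal illustration of how such a technical tool produces feature contributions (the dataset and model below are assumptions, and this is not AI Verify’s own implementation), SHAP can be applied to a scikit-learn model as follows:

```python
# Illustrative only: obtaining per-feature contributions with SHAP for an assumed
# scikit-learn model; this is not AI Verify's own implementation.
import numpy as np
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

data = load_diabetes()
model = RandomForestRegressor(random_state=0).fit(data.data, data.target)

explainer = shap.TreeExplainer(model)           # explainer suited to tree ensembles
shap_values = explainer.shap_values(data.data)  # contribution of each feature to each prediction

# Rank features by mean absolute contribution across the dataset
mean_abs = np.abs(shap_values).mean(axis=0)
for name, value in sorted(zip(data.feature_names, mean_abs), key=lambda t: -t[1])[:5]:
    print(f"{name}: {value:.3f}")
```

The documented output of a tool like this could then serve as the quantitative metric, alongside the process-based evidence described above.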

(e) Thresholds (where applicable): Where available, the Testing Framework will provide recognized values or benchmarks for selected metrics. Such values or benchmarks could be defined by regulators, industry associations, or other recognized standard-setting organizations. For the MVP model of AI Verify, thresholds are not provided given the rapid evolution of AI technologies, their use cases, as well as methods to test AI systems. Nevertheless, as the space of AI governance matures and the use of AI Verify increases, IMDA intends to collate and develop context-specific metrics and thresholds to be added to the Testing Framework.

AI Verify’s Toolkit

While AI Verify’s Toolkit is currently only available to organizations that have successfully registered for the AI Verify MVP program, IMDA describes the Toolkit as a “one-stop” tool for organizations to conduct technical tests. Specifically, the Toolkit packages widely used open-source testing libraries. These include SHAP (SHapley Additive exPlanations) for explainability, the Adversarial Robustness Toolbox for robustness, and AIF360 and Fairlearn for fairness.
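
To give a flavor of what one of these packaged libraries computes (a sketch using assumed synthetic labels, predictions, and group membership, not AI Verify’s actual workflow), Fairlearn can report performance disaggregated by a sensitive attribute and a demographic parity gap:

```python
# Illustrative only: computing fairness metrics with Fairlearn on assumed synthetic
# labels, predictions, and group membership; not AI Verify's actual workflow.
import numpy as np
from fairlearn.metrics import MetricFrame, demographic_parity_difference
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)        # assumed ground-truth labels
y_pred = rng.integers(0, 2, size=1000)        # assumed model predictions
group = rng.choice(["A", "B"], size=1000)     # assumed sensitive attribute

# Accuracy disaggregated by group
frame = MetricFrame(metrics=accuracy_score, y_true=y_true, y_pred=y_pred,
                    sensitive_features=group)
print(frame.by_group)

# Gap in selection rates between groups (demographic parity difference)
print(demographic_parity_difference(y_true, y_pred, sensitive_features=group))
```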

Users of AI Verify deploy the Toolkit within their own environment and are guided by a user interface through the testing process. For example, the Toolkit contains a “guided fairness tree” for users to identify the fairness metrics relevant to their use case. At the end, AI Verify produces a summary report that helps system developers and owners interpret the test results. For process checks, the report provides a checklist indicating whether the documentary evidence specified in the Testing Framework is present. The Toolkit itself is packaged as a Docker® container for easy deployment within the user’s environment.

3. Conclusion

When IMDA released AI Verify, the wave of interest in generative AI seen today had yet to materialize. With the wave currently upon us, interest in demonstrating governance, testability and trustworthiness of AI systems has grown significantly. Initiatives like AI Verify appear poised to respond to this interest.

Singapore has previously demonstrated its ability to contribute to global discourse and thought leadership on AI governance and regulation, notably through the Model Framework. The stakes for AI Verify are high, but so is the global need for such an initiative. To succeed, AI Verify will likely require greater recognition and adoption. This depends on several factors. First, the tool’s accessibility is critical: AI-driven organizations hoping to use AI Verify will need to be able to access it at little or no cost. Second, convincing organizations of its value is key. This will require IMDA to demonstrate that AI Verify is technically and procedurally sound, that it can be effectively used on more (and newer) kinds and sizes of AI models and datasets, and that it does not impinge on commercial sensitivities around proprietary AI models and datasets. Third, and perhaps most importantly, it must remain relevant to international regulatory frameworks. IMDA will need to ensure that AI Verify can continue to help organizations address and interoperate within key emerging global AI regulatory frameworks, such as the EU AI Act, Canada’s AI and Data Act, the NIST AI Risk Management Framework in the US, and even Singapore’s own Model Framework.

Optum & The Mayo Clinic Win the 2022 Award for Research Data Stewardship

Author: Randy Cantz, U.S. Policy Intern, Ethics and Data in Research and former Communications Intern at FPF

On Wednesday, May 10, 2023, the Future of Privacy Forum (FPF) honored representatives from Optum and the Mayo Clinic for their outstanding corporate-academic research data-sharing partnership at the 3rd annual Awards for Research Data Stewardship. The awards honor companies and researchers that prioritize privacy-oriented and ethical data sharing for research.

In a keynote address, United States Congresswoman Lori Trahan applauded the winning partnerships for their ongoing commitment to responsible data sharing. “Ensuring that independent researchers can take a look under the hood of companies is essential to holding big tech executives accountable to the promises they make to their users,” Trahan said. “That’s why the work that you all do is so important, proving that this can be done in a responsible way on both the researcher and company sides.”

SHARING HEALTH DATA WHILE PROTECTING PRIVACY

Dr. Mehwish Qasim provided an overview of the award-winning research partnership. She emphasized that any collaboration that utilizes data from Optum must follow strict guidelines, and there are careful protections to ensure the appropriate use of data.

“This collaboration enabled important research that led to a broader public benefit and impact on diabetes management,” said Dr. Qasim, emphasizing the importance of accessible private data. She added that this was possible because of the variety of data, rigorous and standardized data cleaning and validation, and comprehensive safeguards.

LEVERAGING HEALTH DATA FOR RESEARCH

Dr. Rozalina McCoy introduced her research team’s study, explaining that their research was helping to understand how different communities receive different advice for diabetes-related incidents and how best to improve patient care at all stages of a person’s life.

“One out of every seven adults has been touched by diabetes in some way, and one out of every four healthcare dollars in the US is spent caring for people with diabetes,” said Dr. McCoy, describing the importance of diabetes research. In response to an audience question, Dr. McCoy recommended that researchers understand the benefits and risks of research data from the beginning and establish a clear data-sharing policy that has safeguards in place.

KEY QUESTIONS & ANSWERS

Audience members posed their most pressing questions to the winning team about the inherent challenges in data sharing for research. Below are some of the main takeaways.

What are some of the tensions between access to data and the limitations that may have been part of the research considerations?

Dr. McCoy: Optum has very strict privacy controls; they have the data, but they can’t give it to researchers all at once. We have to be strategic about what our specific questions are and what combination of variables we can look at to still answer our questions, but maintain patient privacy. We’re able to do a lot for privacy, for example, making sure that all the data linkages are done by someone besides the researcher to ensure that privacy is protected. This way, we can maintain privacy, objectivity, and rigor.

What are your thoughts on how we encourage more organizations to do this type of data sharing?

Dr. Qasim: From a corporate perspective, there are a couple of things that I would advise or look for in terms of best practices. The first is to understand the benefits and risks. Have a clear data-sharing policy that outlines what can be shared and what safeguards are in place to establish those data-sharing agreements. Of paramount importance are the data security requirements and privacy protections.

Do you see trade-offs that have to be made, given some of the challenges of ensuring that datasets are de-identified?

Dr. Qasim: There are legal requirements and ethical considerations that are critical, including controls that balance encryption and secure data storage. These comprehensive safeguards are the kinds of factors that would help me evaluate the utility of the research while maintaining the protection and privacy of the individuals.

FINAL THOUGHTS

Jules Polonetsky, CEO at FPF, expressed FPF’s appreciation for the research of the award runners-up, Gravy Analytics and the University of Florida, a public land-grant institution and emerging Hispanic-Serving Institution. Polonetsky emphasized that FPF is eager to promote responsible data-sharing practices.

“This is hard, grinding work. Kudos to the team,” Jules said. “It’s not something that a broad range of companies are able to do without real and significant expert partnership and collaboration. The end goal is to advance science and the broader social good.”

For more information on privacy-oriented and ethical data sharing for research, see The Playbook: Data Sharing for Research or join the Ethics and Data in Research Working Group.