New White Paper Explores Privacy and Security Risk to Machine Learning Systems

FPF and Immuta Examine Approaches That Can Limit Informational or Behavioral Harms

WASHINGTON, D.C. – September 20, 2019 – The Future of Privacy Forum (FPF) released a white paper, WARNING SIGNS: The Future of Privacy and Security in an Age of Machine Learning, exploring how machine learning systems can be exposed to new privacy and security risks, and explaining approaches to data protection.

“Machine learning is a powerful tool with many benefits for society,” said Brenda Leong, FPF Senior Counsel & Director, Artificial Intelligence and Ethics. “Its use will continue to grow, so it is important to explain the steps creators can take to limit the risk that data could be compromised or a system manipulated.”

The white paper presents a layered approach to data protection in machine learning, recommending techniques such as noise injection, inserting intermediaries between training data and the model, making machine learning mechanisms transparent, access controls, monitoring, documentation, testing, and debugging.

“Privacy or security harms in machine learning do not necessarily require direct access to underlying data or source code,” said Andrew Burt, Immuta Chief Privacy Officer and Legal Engineer. “We explore how creators of any machine learning system can limit the risk of unintended leakage of data or unauthorized manipulation.”

Co-authors of the paper are Leong, Burt, Sophie Stalla-Bourdillon, Immuta Senior Privacy Counsel and Legal Engineer, and Patrick Hall, H2O.ai Senior Director for Data Science Products.

The white paper released today builds on the analysis in Beyond Explainability: A Practical Guide to Managing Risk in Machine Learning Models, released by FPF and Immuta in June 2018.

Leong and Burt will discuss the findings of the WARNING SIGNS whitepaper at the Strata Data Conference in New York City during the “War Stories from the Front Lines of ML” panel at 1:15 p.m. on September 25, 2019, and the “Regulations and the Future of Data” panel at 2:05 p.m. on the same day.

READ THE FULL PAPER HERE

Media Contacts:

Tony Baker

Future of Privacy Forum

[email protected]

202-759-0811

Hadley Weinzierl

Immuta

[email protected]

617-388-7965

About the Future of Privacy Forum

Future of Privacy Forum is a global non-profit organization that serves as a catalyst for privacy leadership and scholarship, advancing principled data practices in support of emerging technologies. Learn more about FPF by visiting www.fpf.org.

About Immuta

Immuta was founded in 2014 based on a mission within the U.S. Intelligence Community to build a platform that accelerates self-service access and control of sensitive data. Immuta’s award-winning Automated Data Governance software platform creates trust across security, legal, compliance, and business teams so they can work together to ensure timely access to critical business data with minimal risks. Its automated, scalable, no-code approach makes it easy for users to access the data they need, when they need it, while protecting sensitive data and ensuring their customers’ privacy. Immuta is headquartered in College Park, Maryland. Learn more at www.immuta.com.

Warning Signs: Identifying Privacy and Security Risks to Machine Learning Systems

Machine learning (ML) is a powerful tool, providing better health care, safer transportation, and greater efficiencies in manufacturing, retail, and online services. That’s why FPF is working with Immuta and others to explain the steps machine learning creators can take to limit the risk that data could be compromised or a system manipulated.

Today, FPF released a whitepaper, WARNING SIGNS: The Future of Privacy and Security in an Age of Machine Learning, exploring how machine learning systems can be exposed to new privacy and security risks, and explaining approaches to data protection. Unlike with traditional software, privacy or security harms in machine learning systems do not necessarily require direct access to underlying data or source code.

The whitepaper presents a layered approach to data protection in machine learning, recommending techniques such as noise injection, inserting intermediaries between training data and the model, making machine learning mechanisms transparent, access controls, monitoring, documentation, testing, and debugging.
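To give a flavor of one of those techniques, the sketch below adds calibrated Laplace noise to an aggregate statistic before release, a simple form of noise injection in the style of differential privacy. This is a minimal illustration under assumed parameters (the dataset, query, and epsilon value are hypothetical, not drawn from the whitepaper):

```python
# A minimal sketch of noise injection: perturbing an aggregate statistic with
# Laplace noise before release. The records and epsilon are hypothetical.
import numpy as np

rng = np.random.default_rng()

def noisy_count(records, predicate, epsilon=0.5):
    # A counting query changes by at most 1 when a single record is added or
    # removed, so its sensitivity is 1 and the Laplace noise scale is 1/epsilon.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

# Example: counting (hypothetical) training records with age over 40.
ages = [23, 35, 41, 52, 29, 64, 38]
print(noisy_count(ages, lambda age: age > 40))
```

Smaller epsilon values mean more noise and stronger protection for any individual record, at the cost of less accurate outputs.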

My co-authors of the paper are Andrew Burt, Immuta Chief Privacy Officer and Legal Engineer, Sophie Stalla-Bourdillon, Immuta Senior Privacy Counsel and Legal Engineer, and Patrick Hall, H2O.ai Senior Director for Data Science Products.

The whitepaper released today builds on the analysis in Beyond Explainability: A Practical Guide to Managing Risk in Machine Learning Models, released by FPF and Immuta in June 2018.

Andrew and I will discuss the findings of the WARNING SIGNS whitepaper at the Strata Data Conference in New York City during the “War Stories from the Front Lines of ML” panel at 1:15 p.m. on September 25, 2019, and the “Regulations and the Future of Data” panel at 2:05 p.m. on the same day. If you’ll be at the conference, we hope you’ll join us!

READ THE PAPER

What is 5G Cell Technology? How Will It Affect Me?

(Except where otherwise noted, this post describes service and availability within the United States.)

5G, the fifth generation of wireless network technology, is the newest way of connecting wireless devices to cellular networks. Each previous generation of wireless technology has revolutionized the way people communicate and socialize, and led to waves of novel products and services using the new capabilities. The leap from 3G to 4G technology brought with it faster data transfer speeds, which supported widespread adoption of cloud and streaming services, video conferencing, and Internet of Things devices such as digital home assistants and smartwatches. 5G technology has the potential to enable another wave of smart devices: always connected and always communicating to provide faster, more personalized services. While these new products and services may create significant benefits for both businesses and consumers, connected devices can raise substantial privacy risks. Concerns about data collection, use, and sharing must be addressed.

How Does 5G Technology Work?

5G uses very fast, short-range signals on a largely unused, higher-frequency band to send and receive more total data within a dense network of specialized small cell sites. A small cell site refers to the area within which this short-range wireless signal can be received. In contrast, current 4G technology uses long-range signals transmitted across crowded low-frequency radio waves to send and receive data over a broad network of larger cell sites; the 2G, 3G, and 4G/LTE systems all use the same long-range band of airwaves, as does broadcast television. Because 5G operates on a higher-frequency part of the spectrum with little interference, it can deliver faster download speeds and more stable connections for wireless devices, but waves at these frequencies have a shorter range. This shift to a different part of the spectrum promises better connectivity, but will require a larger number of cell sites.

Current 4G networks provide service by broadcasting radio waves from large cell towers to extended areas. The capabilities of individual tower sites, the geographic features of covered areas, and the amount of available spectrum all impact the quality of service. These 4G/LTE services operate on a model of regional centralization – all service from large areas, including up to several states’ worth of traffic, is collected from broadly spaced cell towers, then aggregated into one central location before distribution (connection to the internet).

The initial deployment of 5G technology to cell towers will likely operate similarly, and many of the gains will not be immediately noticeable to users until the supporting small cell sites are in place. Some early use cases will likely be in stadiums, malls, or other large venues where small cell sites will be placed to create discrete, distributed systems serving those bounded locations in ways that will provide faster service.

5G service requires telecommunications providers to convert existing cell towers to accommodate the new technology and then incorporate a network of small cell sites, densely located for maximum coverage of service areas. Most carriers are currently upgrading the services available on 4G systems as an interim step before full deployment of 5G – a process that will take several years. For example, 4G towers are taking advantage of “MIMO” (multiple-input/multiple-output) technologies, which enable simultaneous operation of 2 or 4 channels to receive or send signals. Also, some carriers are opting to push internet connectivity out to more local levels than the current regionalization model – either to a single point per state, per metro area, or even to individual cell towers in some dense urban areas. These changes produce a noticeable increase in speed for individual users within those service areas.

5G technology will use a different type of signal, called millimeter waves, that, while significantly faster, is more limited in range and will ultimately require many more total sites. Such small cell sites must be much closer to each other than existing cell tower networks in order to allow the millimeter waves to carry a signal successfully. Service on this millimeter wave spectrum can only happen once the full infrastructure of small cell sites is in place. This transition is a massive process. In the U.S., it will likely take 2-3 years to upgrade all the existing cell towers – which will represent the initial level at which a carrier may claim to be providing 5G service. (AT&T, for example, has upgraded towers in over 20 metropolitan areas and is continually adding to that list.) Some other countries are ahead of the U.S. in completing this initial phase, but none has yet extensively deployed the network of small cell sites needed to bring the system to full capability.

Telecommunications companies are starting in the core of dense, urban areas, then spreading to the suburbs and fringes of metropolitan areas, and ultimately will reach all towers. Setting up the small cell sites can be done simultaneously with the cell tower conversions, but will almost certainly take much longer to reach full coverage. Many rural areas and less densely populated cities are offering accelerated permitting processes and other business incentives to entice carriers to prioritize upgrades in those locations.

How Will 5G Technology Change Wireless Services and Products?

When companies reach widespread 5G small cell coverage, the new technology will offer two major improvements: (1) increased signal coverage (reliability) and (2) significantly faster mobile speeds with lower latency, i.e. the lag time between a signal and a response. However, it’s important to note that no devices designed solely for 4G will work on 5G networks. 5G-enabled hardware will be required, and devices designed and sold during the transitional years will almost certainly have to include connectivity to both networks. Whereas current devices can switch between 4G and 3G connectivity based on which signal is available in any particular location, they operate on only one of these systems at a time. Dual-capability 4G/5G devices will be able to operate on 4G and 5G at the same time – thus ensuring minimal disruption to service as consumers move between locations with different levels of service availability.

When fully operational, 5G networks promise to provide significantly faster speeds with lower latency compared to current 4G/LTE networks. While speeds vary based on a variety of factors, most 4G networks average 40 megabits per second (“Mbps”) download speeds, with the fastest local networks reaching up to 500 Mbps. Although recent tests demonstrate that 4G networks can provide speeds of up to 2 gigabits per second (Gbps), or 2,000 Mbps, under some conditions, 5G networks promise still higher speeds and better connectivity. The early performance of 5G networks is likely to appear similar to current high-speed 4G networks, but 5G technology has been shown to reach speeds of up to 4.5 Gbps once small cells are fully deployed.
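To put those figures in perspective, here is a small back-of-the-envelope comparison of download times at the speeds quoted above (the 5 GB file size is a hypothetical example; real-world throughput varies widely):

```python
# Illustrative download times for a hypothetical 5 GB file at the speeds
# quoted above. These are idealized numbers; real throughput varies widely.
speeds_mbps = {
    "4G (typical average)": 40,
    "4G (fast local network)": 500,
    "4G (peak test)": 2_000,
    "5G (demonstrated peak)": 4_500,
}

file_size_gb = 5
file_size_megabits = file_size_gb * 8_000  # 1 GB = 8,000 megabits

for network, mbps in speeds_mbps.items():
    seconds = file_size_megabits / mbps
    print(f"{network}: {seconds:.0f} seconds")
```

At a typical 4G speed, that hypothetical download takes roughly 17 minutes; at 5G’s demonstrated peak, under 10 seconds.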

5G networks also promise drastic decreases in the latency of wireless transmissions, reducing the amount of lag time between an interaction with the network and the network’s response. To an ordinary consumer, the difference in speed and latency may not seem noticeable during everyday transactions, but for high-speed, data-intensive computing services these differences can completely revolutionize an industry. For example, lower latency in video conferencing may enable better feedback in communications and fewer dropped calls. Just as the shift from 3G to 4G/LTE provided capabilities that generated unforeseen applications and uses, it is impossible to predict exactly what may become feasible on 5G, and which consumer-facing services it will enable.

5G is unlikely to be used for all wireless communications. Although some analysts talk of autonomous vehicles relying on 5G, no current developers of these cars are designing with the intention to use 5G. Instead, carmakers are implementing technologies that use a different part of the spectrum. Likewise, connected devices in homes (IoT and “smart home” technology) will primarily remain connected via WiFi systems. However, home-related devices such as power and water meters, or smart city sensors, that rely on cell technology will likely transition to 5G. The improvement here will be less about speed – these devices use very little bandwidth – than about quantity: 5G networks are designed to manage far more connected devices in a given area than current systems can support.

5G Security

The transition to 5G networks, the growth of IoT networks, and the expansion of other advanced technologies raise questions about security safeguards, particularly with regard to access and authentication protocols. Some IoT devices employ lighter security measures in an effort to allow simple connection and communication; this is particularly common for IoT technologies that pose lower risks to users and networks. 5G technology can support more connected devices on cellular networks, which will increase the number of potential vulnerabilities. But 5G technology may include security improvements as well.

Work is under way to determine best practices for securing 5G networks in various applications. Experts expect the faster speeds and higher capacity of these networks to allow for more sophisticated threat assessment measures and more secure authentication frameworks, both aspects of 5G that could improve network security. The nature of the 5G network is a shift from a centralized system to distributed, virtual networks. The carriers developing these systems are embedding security functions at multiple stages of deployment, including virtual protections as well as hardware-based ones. Additionally, some techniques that work for 4G networks can be translated to 5G networks, as advanced 4G networks follow many of the same technological principles as 5G. While 5G may exacerbate some existing risks or create new security challenges, the technology may also provide new, effective ways to secure data and devices through more sophisticated networks and algorithms.

Global Use of 5G

5G standards are largely stable and interoperable in the United States. However, stakeholders have not yet agreed on a single standard for worldwide interoperability of 5G networks. The telecommunications industry is working to identify and reach agreement on standards for global 5G connectivity. Global cellular connectivity standards are set through the “3GPP” (3rd Generation Partnership Project), a consensus-driven oversight body that includes partners from Asia, Europe, and North America. The standards-setting process is handled by working groups formed within larger technical specification bodies, with both individual company capabilities and regional balancing as priorities. All countries and companies will need to adhere to an agreed-upon standard for the manufacture and operation of 5G networks, as non-conforming equipment will not be interoperable across networks. Because of the obvious importance of interoperability, the transition to 5G has drawn a greater number of standards proposals from both operators and manufacturers than past transitions did.

Privacy Impacts of 5G

Faster speeds and lower latency may mean better products and services, but the transition to 5G will likely create privacy risks associated with new devices, data collection, and use of personal information. 5G access will enable individuals to have more smart devices that can reliably connect and interact with online services; at the same time, the technology can also create more detailed personal data sets for device manufacturers and service providers. 

For video platforms, 5G promises to provide higher image quality with less lag time; this also means that facial recognition technology will have clearer images to analyze. Similarly, public video surveillance networks can transmit more detailed video of an individual’s activities, which artificial intelligence systems can more easily analyze to identify and track individuals in public. The ability to collect and share data more quickly is likely to result in more useful, personalized services; at the same time, increased data collection can heighten users’ concerns about the creation of more detailed profiles on individuals.

5G promises to be a revolutionary improvement in cellular network technology. Better connections for users and faster speeds with lower latency for data-intensive computing will likely lead to improvements in existing products and services, as well as the development and implementation of new technologies. However, such changes will require consistently prioritizing individual privacy in this new context; 5G technology will not eliminate existing privacy and security challenges. Even as it delivers beneficial services, a faster, more efficiently connected digital world will continue to pose data risks and will require continued, meaningful privacy safeguards to ensure appropriate handling and protection of personal information.

Authored by Daniel Neally and Brenda Leong

10 Reasons Why the GDPR Is the Opposite of a ‘Notice and Consent’ Type of Law

The below piece was originally published on Medium. For a version with humorous images, head to the original post.

A ‘notice and consent’ privacy law puts the entire burden of privacy protection on the individual and then doesn’t really give them any choice. The GDPR does the opposite.

There is so much misunderstanding about what the GDPR is and what it does that most of what is out there at this point is more mythology than anything else.

For example, an article in Axios claimed over the weekend that ‘the notice and consent approach forms the backbone of the GDPR’. This claim is simply not true.

Understanding and correctly categorizing the regulatory framework of the GDPR is actually very important right now. Look at the US Senate’s hearing yesterday, on ‘GDPR & CCPA: Opt-ins, Consumer Control, and the Impact on Competition and Innovation’. If this law is considered a point of reference for future privacy legislation in the US – in the sense of deciding how close to or far from it the future US privacy framework should be – then one should understand the mechanisms that make the GDPR what it is.

A ‘notice and consent’ framework puts all the burden of protecting privacy and obtaining fair use of personal data on the person concerned, who is asked to ‘agree’ to an endless text of ‘terms and conditions’ written in exemplary legalese, without actually having any sort of choice other than ‘all or nothing’ (agree to all personal data collection and use or don’t obtain access to this service or webpage). The GDPR is anything but a ‘notice and consent’ type of law.

There are many reasons why this is the case, and I could go on and get lost in the minutiae of it. Instead, I’m listing 10 high-level reasons, explained in plain language, to the best of my knowledge:

1. Data Protection by Design and by Default is a legal obligation

All organizations, public or private, that touch personal data (“processing” in the GDPR means anything from collection to storage to profiling and creating inferences to anything else you can think of that can be done to personal data) are under an obligation to bake privacy into all technologies and/or processes they create and, very importantly, to set privacy-friendly options as default. There are no exceptions to this obligation. Data Protection by Design and by Default (DPbD) must be implemented regardless of whether the personal data will be obtained based on an opt-in, an opt-out, or a legal obligation to collect the data. It doesn’t matter. All uses of personal data must be based on DPbD. Check out Article 25 GDPR.

2. Data Protection Impact Assessments are mandatory for large scale and other complex processing

All organizations that engage in any sort of sensitive, complex or large scale data uses must conduct a Data Protection Impact Assessment (DPIA) before proceeding. Think of the now-common Environmental Impact Assessments (EIA). The DPIA is just like an EIA, but instead of the impact of a project on the environment, it measures the impact of a project using personal data on all the rights of the individuals concerned, from free speech, to privacy, to non-discrimination. Depending on the results of the DPIA, safeguards must be put in place to minimize the impact on rights, or the project can simply be stopped if there is no way to minimize the risks. Again, this happens regardless of opt-ins, opt-outs, legal obligations, or other grounds relied on by organizations to collect and use the personal data. Check out Article 35 GDPR.

3. All processing of personal data must be fair

Absolutely all collection and uses of personal data must be fair and transparent, regardless of the ground for processing (opt-in, opt-out, legal obligation etc.). This is the Number 1 rule relating to processing of personal data listed in the GDPR (check out Article 5(1)(a)) and breaching it is sanctioned with the higher tier of fines. In practice, this means several things, including the fact that people should be expecting that their personal data is collected, used or shared in the way it is being collected, used or shared.

4. There must be a specific, well defined reason for all collection or uses of personal data

From the outset, and regardless of the justification relied upon by an organization to process personal data (opt-in, opt-out, fulfillment of contract etc.), the collection of that personal data, be it directly from individuals, observed or inferred, must be done only for specified, explicit and legitimate purposes and only processed either for those purposes, or for purposes compatible with them. This is the principle of purpose limitation. In practice, it means that it is illegal to collect personal data ‘because maybe some day I will find something useful to do with it’. Non-compliance with the purpose limitation obligations also triggers the higher level of fines.

5. Data grabs unrelated to the purpose of processing are illegal

Only those personal data that are relevant and limited to what is necessary to achieve the specified purpose can be collected or otherwise processed. Casting a net to grab as much personal data as possible, even if it is not needed for the purpose announced, is unlawful and, again, sanctioned with the higher tier of fines. This rule applies to all processing of personal data, even to those processing activities mandated by law, such as anti-money laundering.

6. The person can actually do things related to how his or her personal data is handled

The individual has well-defined rights that allow him or her to do many things to ensure their personal data are processed fairly and lawfully: obtaining a copy of the personal data being processed (regardless of whether the personal data is processed on the basis of consent, a legal obligation, or any other ground); erasing personal data that is not being processed lawfully; objecting, on his or her particular grounds, to processing of personal data, even lawful processing; or initiating Court proceedings against any unlawful processing, with the possibility of claiming moral or material damages.

7. State of the art security is an obligation

There is an obligation to ensure state of the art security measures for all processing of personal data, with hefty fines for data breaches. Check out Article 32.

8. There is someone in each organization engaging in complex processing whose job is to ensure personal data are processed fairly and lawfully

All organizations that engage in complex or sensitive or large scale data collection and use (this covers all Big Tech, but also many others) must appoint a Data Protection Officer, whose job as an independent adviser is well regulated and protected by the GDPR. Technically, the DPO is someone specialized or experienced in data protection law or applying data protection law, who advises the highest level of management on how to fairly and lawfully collect and use personal data. Check out Articles 37, 38 and 39.

9. Personal data is followed through the vendor maze

The GDPR provides for solid guarantees on how personal data is managed by the chain of vendors and suppliers of an organization. In particular, all vendors that process personal data on behalf of an organization have to enter into detailed contractual agreements which hold them accountable for how they protect the personal data entrusted to them. Vendors also have some direct statutory obligations, such as keeping the Record of processing activities and appointing a Data Protection Officer.

10. All processing of personal data must be kept in a comprehensive and updated Record

All organizations that collect and use personal data in any way are under an obligation to keep track in a Record of all the personal data they collect and use, for what purpose, for how long, about what categories of individuals, with whom they share the data and other details as prescribed by Article 30 GDPR, regardless of whether they collect it based on opt-in, opt-out, legal obligations, contract fulfillment etc. The only organizations exempted from this obligation are those with fewer than 250 employees that only occasionally process personal data. Even those must still keep a Record if the occasional processing may result in a high risk to the rights of individuals or involves sensitive personal data.

10th Annual Privacy Papers for Policymakers – Send Us Your Work!

The 10th Annual Privacy Papers for Policymakers awards have been announced. Register here to attend the event on February 6, 2020. We will open the submissions process for next year’s awards in fall 2020.

Have you conducted privacy-related research that policymakers should know about? If so, we can help you get it in front of key government officials. The Future of Privacy Forum will return to Capitol Hill early next year (date TBD) for the 10th installment of our annual Privacy Papers for Policymakers (PPPM) awards program.

PPPM recognizes the year’s leading privacy research and analysis that has practical implications for policymakers in Congress, federal agencies, and data protection authorities internationally. Winners are selected by a diverse team of academics, advocates, and industry privacy professionals. Awarded articles are chosen both for their scholarly value and because they offer policymakers concrete solutions and practical insights into real-world challenges.

Submit Your Work!

We are currently accepting finished papers to be considered for next year’s awards. The deadline for regular submissions is October 4, 2019, and the deadline for student submissions is October 25, 2019. For more on submission guidelines, please review this page.

Last year’s winning papers examined topical privacy issues: sexual privacy, data subject rights, how local laws can fill gaps in state and federal laws, and more. PPPM honors papers from academics, practitioners, technologists and lawyers, which allows for a wide range of approaches to research and analysis.

For 10 years, the research compiled in PPPM has informed the policy debate in Congress, in the states, and around the world. By submitting your work for consideration, you are providing a valuable tool for legislators and staff considering the structure and elements of a national privacy framework.

Please send any questions about the program or submission process to [email protected]. We look forward to reading your work!

New FPF Study: More Than 200 European Companies are Participating in Key EU-US Data Transfer Mechanism

Co-Authored by: Daniel Neally & Jeremy Greenberg

EU Companies’ Participation in Privacy Shield Grew by More than One-Third Over the Past Year 

EU-US Privacy Shield Essential to Leading European Companies

From Major Employers such as Aldi and Dr. Oetker to Leading Technology Firms like CRISPR Therapeutics and Workwave, European Companies Depend on the EU-US Agreement

Privacy Shield Program Supports European Employment While Adding to Employee Data Protections – Nearly One-Third of Privacy Shield Companies Rely on the Framework to Transfer HR Information of European Staff

The Future of Privacy Forum conducted a study of the companies enrolled in the EU-US Privacy Shield program and determined that 202 European-headquartered companies are active Privacy Shield participants. This is a 32% increase from last year’s total of 152 EU companies in the cross-border data transfer framework. These European firms rely on the program to transfer data to their US subsidiaries or to essential vendors that support their business needs. Nearly one-third of Privacy Shield companies use the mechanism to process human resources data – information that is crucial to employ, pay, and provide benefits to workers.

Thousands of major companies, many of which are headquartered or have offices in Europe, rely on the protections granted under the data transfer agreement. With dozens of new companies joining each week to retain and pay their employees or create new job opportunities in Europe, the Privacy Shield has become an integral data protection mechanism for European companies and the European marketplace as a whole.

Overall, FPF found that more than 5,000 companies have signed up for Privacy Shield since the program’s inception – more than 1,300 participants joined in the last year.

Leading EU companies that rely on Privacy Shield include:

– ALDI, German grocery market chain

– Alter Domus S.a.r.l., Luxembourg corporate and management services company

– CRISPR Therapeutics, Swiss gene editing technology company

– Dr. August Oetker KG, German food, drink, household goods and industrial firm

– EuroNext, European stock exchange

– International Drug Development Institute SA, Belgian biostatistical and eClinical services company

– LVMH (also known as Louis Vuitton), French luxury goods maker

– Modern Times Group, Swedish digital media entertainment company

– Omni Partners, British hedge fund

– Randstad, Dutch human resources consultancy

– Workwave, Swedish software developer

FPF research also determined that more than 1,580 companies, nearly one-third of the total number analyzed, joined Privacy Shield to transfer their human resources data.

The research identified 202 Privacy Shield companies headquartered or co-headquartered in Europe. This is a conservative estimate of companies that rely on the Privacy Shield framework – FPF staff did not include global companies that have major European offices but are headquartered elsewhere. The 202 companies include some of Europe’s largest and most innovative employers, doing business across a wide range of industries and countries. EU-headquartered firms and major EU offices of global firms depend on the Privacy Shield program so that their related US entities can effectively exchange data for research, to improve products, to pay employees and to serve customers.

The Privacy Shield is a living and growing instrument, guaranteeing protection of the personal data of European consumers and employees as the backbone of fundamental rights protection within transatlantic commercial exchanges. Given the importance of this mechanism to supporting protection while enabling commerce on both sides of the Atlantic, we encourage careful review of the agreement this year by the European Commission and continued oversight and enforcement of the framework by authorities on both sides of the Atlantic.

The conclusions follow previous FPF studies, which highlighted similar increases in participation and reliance by EU firms on the Privacy Shield program.

Methodology:

For the full list of European companies in the Privacy Shield program, or to schedule an interview with Jeremy Greenberg, John Verdi, or Jules Polonetsky, email [email protected].

FTC Reaches Landmark Settlement Regarding Kids’ Privacy, Clarifies Platforms’ and Video Creators’ COPPA Obligations for Child-Directed Content

By Sara Collins

Last week the Federal Trade Commission (FTC) released details of a settlement with YouTube under the Children’s Online Privacy Protection Act (COPPA). Although notable for its landmark monetary penalty, the settlement is probably more important for the other requirements that it places on YouTube and content creators. Some of YouTube’s settlement obligations exceed COPPA’s statutory and regulatory requirements. Under the settlement:

According to a public statement, YouTube will disable personalized advertising (i.e. behavioral advertising), commenting, and public playlist sharing on content identified as “child-directed.” While not required by the settlement, YouTube will also deploy a machine-learning algorithm to identify videos that are in fact child-directed, but have not been self-identified by content creators.

Basis for the Settlement

The FTC concluded that YouTube had actual knowledge that many channels hosting content on its platform are “child-directed” and therefore trigger COPPA obligations for the company. Google and its subsidiary YouTube will pay a record $170 million to settle allegations by the Commission and the New York Attorney General that the YouTube video sharing service illegally collected personal information from children without their parents’ consent. The settlement includes a monetary penalty of $136 million to the FTC and $34 million to New York State. The $136 million penalty is by far the largest amount the FTC has ever obtained in a COPPA case since Congress enacted the law in 1998; the previous record was $5.7 million against Musical.ly/TikTok.

Actual Knowledge

The settlement hinges on COPPA’s “actual knowledge” criteria. COPPA imposes obligations on online services that have actual knowledge that they are collecting, using, or disclosing personal information from children under 13. Services with actual knowledge of such data collection must implement privacy safeguards, including: obtaining verified parental consent to collection of personal information from children; providing parents with access to children’s data retained by the company; and publishing enhanced transparency regarding data practices.

The settlement characterizes YouTube as a third-party operator, rather than a platform. According to the FTC’s COPPA FAQ, third-party operators are deemed to have actual knowledge that children are using a service if:

  1. a child-directed content provider (which is strictly liable for any collection) directly communicates the child-directed nature of its content to the third-party operator; or

  2. a representative of the third-party operator’s ad network recognizes the child-directed nature of the content.

YouTube meets both prongs of this test. The complaint details Google’s knowledge that the YouTube platform hosts numerous child-directed channels. YouTube marketed itself as a top destination for kids in presentations to the makers of popular children’s products and brands. For example, Google and YouTube told Mattel, maker of Barbie and Monster High toys, that “YouTube is today’s leader in reaching children age 6-11 against top TV channels” and told Hasbro, which makes My Little Pony and Play-Doh, that YouTube is the “#1 website regularly visited by kids.” Several channel owners also told YouTube and Google that their channels’ content was directed to children. In other instances, YouTube’s own content rating system identified content as directed to children. 

This set of facts, combined with those from the Musical.ly settlement, puts companies on notice that their interactions with the public, board members, potential advertising partners, and content creators are all relevant factors when the FTC determines whether platforms have actual knowledge that they collect information from children under the age of thirteen.

Content Creators = Operators

Chairman Simons’ and Commissioner Wilson’s statement notes that individual channels on a general audience platform are “website[s] or online service[s]” and, therefore, each individual creator who operates a channel is “on notice that [the Commission] consider[s] them to be standalone ‘operators’ under COPPA.” This means that YouTube creators may face strict liability for COPPA non-compliance – likely a shock to most creators, as they do not control the information collected by YouTube, do not contract with advertisers, and cannot negotiate how revenue is split. However, because creators can choose whether or not to allow targeted advertising on their channels, the Commission construes YouTube as collecting personal information from children on behalf of individual creators. This raises economic and compliance challenges for creators, because YouTube warns creators that opting out of targeted advertising “may significantly reduce [the] channel’s revenue.” Channel owners who publish child-directed content will likely face a choice: either eschew targeted advertising and the associated data collection, or obtain verified parental consent from parents of under-thirteen viewers to collect personal information from their children.

The statement also announces that the FTC will conduct sweeps of channels following YouTube’s implementation of the order’s provisions. Because COPPA applies to commercial entities (defined broadly to include anyone who makes money from an enterprise that is not a governmental body or non-profit), monetized channels will need to be aware of potential COPPA implications. 

YouTube also stated that the platform would voluntarily implement a machine-learning algorithm to identify child-directed content that is not self-designated by content creators. We expect that this will be similar to other algorithmic screeners, such as those used for DMCA compliance. Critics have charged that YouTube’s algorithms are biased, opaque, and inaccurate. It will be challenging for the platform to algorithmically determine what content is or is not child-directed; this analysis will necessarily be nuanced and complex. The analysis is also likely to be subjective. Some YouTube content is plainly child-directed, some channels are clearly not child-directed, and reasonable people can and will disagree about a diverse range of content in between. Any algorithm will be imperfect, creating the risk that the assessments will be both over- and under-inclusive when determining what is and is not child-directed content.

One probable issue: videos that are appropriate for children (but directed to adults or general audiences) can be inadvertently flagged as directed toward children. The Commission’s Complaint notes that YouTube videos were manually reviewed prior to being featured on the homepage of the YouTube Kids app — a separate service specifically designed for kids, and distinct from the general audience YouTube service. The Commission did not detail what the review entailed; however, one could imagine reviewers making determinations based on the content’s appropriateness for children rather than assessing the content to see if it was specifically directed toward children. For example, a David Attenborough video about sharks might be perfectly acceptable on the homepage of the YouTube Kids app, but on the general audience YouTube service, it probably would not be considered child-directed. Conflating child-appropriate content and child-directed content is common. Machine-learning algorithms may mitigate or exacerbate this problem. The implications for creators of children’s media (and child-adjacent media) are uncertain, but, at a minimum, creating content for kids will likely be less lucrative.

Targeted Advertising vs. Internal Operations

Several aspects of the settlement hinge on the Commission’s interpretation of COPPA’s “internal operations” exception – a provision that permits operators to collect persistent identifiers from children without parental consent if the identifiers are only used for internal operations – e.g. personalizing content, protecting user security, or ensuring legal compliance. The settlement reinforces the FTC’s interpretation that, in the context of child-directed content, targeted advertising without verifiable parental consent is prohibited by COPPA. While this position is not necessarily controversial — the DMA, IAB, NAI, and most industry leaders agree — it is useful for the FTC to reiterate this point.

The “internal operations” exception also drives the Commissioners’ civil penalty analysis. Commissioner Chopra notes that YouTube used viewing data to enhance the effectiveness of its recommendation engines, and that the value of that enhancement should have been considered when negotiating the ultimate monetary penalty of the settlement. Chairman Simons and Commissioner Wilson disagree, characterizing recommendation engine data as being used within the bounds of the “internal operations” exception to obtaining parental consent. They argue that “obtaining penalties in this matter based on the argument that enhancement of Google’s other products and services through analytics such as page views, time spent on a video, or algorithms for recommending videos is ill-gotten, is highly speculative.” The FTC has requested comment regarding the appropriate boundaries of the “internal operations” exemption, and the subject is likely to be a focus of the upcoming FTC COPPA workshop.

ICYMI: FPF’s Amelia Vance Raises Concerns about School Surveillance Technologies on WOSU

WASHINGTON, D.C. – Future of Privacy Forum Senior Counsel and Director of Education Privacy Amelia Vance joined National Public Radio’s Ohio affiliate, WOSU, for an interview on All Sides with Ann Fisher to discuss student data privacy.

As more schools explore surveillance technologies to address school safety issues, Vance explained, it’s critical to ensure certain guardrails are in place to protect students’ information.

“Communities should absolutely adopt the school safety measures that they think are necessary for their community, but we [also] want to make sure that they don’t have unintended consequences – that they don’t actually harm students more than they help ensure school safety,” Vance said. Listen to the full interview.

During the interview, Vance highlighted concerns that must be considered as schools and state and local officials explore school safety technologies that collect data and personal information, including social media content, images, and website search history, and use this data to identify students that could be deemed a threat.

Specifically, Vance highlighted examples of students who have typed a sensitive word or phrase, like “shooting hoops,” or posted images that are falsely flagged as problematic. As a result, these students – and the school administrators – can end up trapped in a time-consuming “threat assessment” process that can lead to unjust school suspension or even expulsion.

Vance noted, “You have students who have gone through the threat assessment process, which is intended to make things better for students… but what we’ve seen is, in some cases, these threat assessments are discriminating against students with autism or students with disabilities… Those students aren’t threats, they’re simply students who need additional help.”

Vance also warned that some surveillance technologies could inadvertently deter students from seeking help (e.g. searching for resources and support for depression) because they believe certain search terms will get them ‘flagged’ as potential threats.

Click here to listen to the full interview. To learn more about the Future of Privacy Forum, visit www.fpf.org.

MEDIA CONTACT: [email protected]

How the FTC Became a "Super CNIL"

By Winston Maxwell

European data protection authorities are quick to remind citizens and companies that the U.S. lacks adequate protection of personal data. Many Europeans therefore assume that the U.S. is a privacy no-man’s-land. Yet on July 24, 2019 the FTC levied a privacy fine against Facebook that is far above GDPR levels, and imposed accountability obligations that come straight out of the GDPR playbook. The Facebook settlement order highlights a U.S. privacy paradox: The U.S. has inadequate privacy laws, but in some respects the US has the toughest privacy enforcement in the world. How can that be? In an August 11, 2019 article, I try to explain to French readers how the FTC uses Section 5 of the FTC Act and settlement orders to become a world class privacy enforcer. The title of the article – intentionally provocative – is “how the FTC transformed itself into a super CNIL”.

Some French readers objected to my calling the FTC the “most powerful privacy regulator in the world.” Others objected to my comparing the FTC to a “super CNIL”; others pointed out that Facebook’s market value increased after announcement of the fine, so the FTC’s sanction must have been too low. The purpose of my article was not to defend the Facebook settlement on the merits, but to explain how the FTC got there. The article draws on an article on the FTC and the New Common Law of Privacy by Professors Daniel J. Solove and Woodrow Hartzog, which explains how the FTC transformed the words “unfair and deceptive practices” in Section 5 of the FTC Act into a full corpus of privacy law, on par with many parts of the 1995 Data Protection Directive, and now the GDPR. Gaps remain, of course. A recent Op-Ed by Jessica Rich highlights the challenges faced by the FTC, and the huge gaps left open by the 100-year-old FTC Act. After reading Ms. Rich’s article, I would probably no longer call the FTC the “most powerful privacy regulator in the world.” But it is important for Europeans to understand how much the FTC has done to apply the FTC Act’s consumer protection language to new threats raised by massive data processing. The European Commission’s second annual review of the Privacy Shield framework does justice to the FTC’s work, noting the FTC’s enforcement program and its activities in fields such as algorithmic decision-making.

Of particular interest for Europeans is the FTC’s creative use of settlement orders. The FTC Act does not give the FTC direct sanctioning authority for violations of the “unfair and deceptive practices” rule. As pointed out by FPF CEO Jules Polonetsky, this is a big statutory gap that should be filled. Under current law the FTC needs to bring (or ask the DOJ to bring) a separate action in court. However, the FTC Act does give the FTC the right to impose penalties for breaching a prior consent order. The reason the FTC could act forcefully against Facebook in 2019 was that Facebook violated provisions of a previous settlement, signed in 2012. Settlement orders also permit the FTC to impose ongoing accountability and reporting obligations. When I tell privacy students in France that the FTC’s audit and reporting obligations under settlement orders last for 20 years, they are amazed – for digital businesses, 20 years is an eternity.

Christopher Wolf and I have tried for years to dispel the impression in Europe that the U.S. lacks any effective privacy protection outside of special sectors like health care. While it’s true that the U.S. does not have anything close to a GDPR, the FTC’s recent fine shows that the FTC can do a lot with little, sometimes going beyond the GDPR. Obviously there are limits on what the FTC can do under the FTC Act, particularly when it comes to prohibiting certain business practices that might be challengeable under the GDPR but are not challengeable under current U.S. legislation. Federal privacy legislation would help fill the remaining gaps and give the FTC the resources it needs to be more effective.

Yet Europeans can still learn from the FTC’s existing playbook. Settlement orders are frequent for competition law violations in Europe (they are called “behavioral commitments”), but so far nonexistent for privacy violations. Article 58 of the GDPR might be improved to expressly allow data protection regulators to accept behavioral, or even structural, commitments in the context of sanction proceedings.[1] Another interesting aspect of the Facebook settlement order is the FTC’s requirement that Facebook’s CEO, Mark Zuckerberg, sign a personal attestation. This measure is inspired by anti-corruption laws, which frequently require senior officers to sign attestations. The attestations increase the risk of personal liability and might add a useful layer to the GDPR’s already impressive array of accountability tools.

Whatever one thinks of the Facebook fine, it is by far the largest fine ever imposed for a privacy violation. And the 20-year-long governance and reporting obligations are far from trivial. Those in Europe who continue to think that the U.S. has no privacy laws should read the 2019 settlement order and the European Commission’s second review of the Privacy Shield framework, which together paint a more accurate picture of the U.S. situation.

Winston Maxwell is Director, Law and Digital Technology, for TELECOM Paris.

[1] A resourceful DPA could potentially include something that resembles commitments in a sanction order under the GDPR, and that may or may not be valid. My point is just that unlike competition law, European privacy law does not envisage commitments as a standard tool in the regulator’s toolbox.

Digital Deep Fakes

 

Why are they deep?

What are they faking?

What does it mean for the rest of us?

Co-authored by William Marks and Brenda Leong

Introduction

Nicolas Cage played Lois Lane in a Superman film. Nancy Pelosi drunkenly slurred her words on stage. Barack Obama claimed Killmonger, the villain in Black Panther, was right. And Mark Zuckerberg told the world that whoever controlled data would control the future. Or did they?

These events seem unlikely, but click the hyperlinks; they were all recorded on video – they must be real! Or so it seems. There have long been many varieties of video manipulation, but those processes were generally expensive, time-consuming, and imperfect. That is no longer the case. Professional tools are increasingly effective and affordable, and open-source code that even relative amateurs can use will soon be available to produce believable digital video forgeries.

Already impressive, these new tools are improving quickly. It may soon be nearly impossible for average viewers, and challenging even for technical experts, to distinguish the authentic from the digitally faked, a circumstance described as having “the potential to disrupt every facet of our society.” From the mass market of news and viral memes to the exchanges between individuals or businesses, bad actors will be able to create fake videos (and other media) with the intent to harm personal, political, economic, and social rivals. When people can no longer believe what their own eyes see, trust in the news will decline. This may be exacerbated by what Robert Chesney and Danielle Citron refer to as the liar’s dividend: the benefit to unscrupulous people from denouncing authentic events as forgeries, creating enough doubt or confusion to have the same impact or accompanying harm as if the events were fake, while minimizing any consequences to themselves.

The media has recently labeled these manipulated videos of people “deepfakes,” a portmanteau of “deep learning” and “fake,” on the assumption that AI-based software is behind them all. But the technology behind video manipulation is not all based on deep learning (or any form of AI), and what are lumped together as deepfakes actually differ depending on the particular technology used. So while the example videos above were all doctored in some way, they were not all altered using the same technological tools, and the risks they pose – particularly as to being identifiable as fake – may vary. 

First (and Still), There Were “Cheapfakes”

Video manipulation is not new. The Nancy Pelosi video, for example, uses no machine learning techniques, but rather editing techniques that have been around quite a while. It is what is now being called a “cheapfake.” Rather than requiring a tech-savvy troll equipped with state-of-the-art artificial intelligence, this result was achieved through more traditional means: running the video at 75 percent speed, with a modified pitch to correct for the resulting deepening of her voice. Because live (unedited) video of this type of event may also exist, it’s generally not as difficult for viewers to quickly check and identify this as fake. However, if Pelosi were a less famous person, and finding an unedited video record were more difficult, this type of fake could still confuse viewers, be harder to validate, and cause even more potential fallout for the person being misrepresented. Their simplicity does not render them harmless. As we have seen, even falsifiable fake news stories can mislead people and influence their views. But the editing process is likely to be discernible from the video file by those with the capacity to review it, and thus these types of fakes can be publicly identified.
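To illustrate just how “cheap” a cheapfake can be, here is a minimal sketch of a comparable speed manipulation using the free ffmpeg tool, driven from Python (the filenames are hypothetical; note that the atempo filter preserves pitch, whereas the cruder resampling used in such fakes deepens the voice and requires a separate pitch correction):

```python
# A minimal sketch of a "cheapfake"-style speed edit with ffmpeg.
# Slows both the video and audio of a hypothetical clip to 75 percent speed.
import subprocess

subprocess.run([
    "ffmpeg", "-i", "original.mp4",
    # setpts=PTS/0.75 stretches the video timestamps (75 percent speed);
    # atempo=0.75 slows the audio to match while preserving its pitch.
    "-filter_complex", "[0:v]setpts=PTS/0.75[v];[0:a]atempo=0.75[a]",
    "-map", "[v]", "-map", "[a]",
    "slowed.mp4",
], check=True)
```

No machine learning is involved; the entire effect comes from standard playback-speed filters that have shipped with consumer editing tools for years.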

When Machines Learn, the Fakes Get Smarter

Two more recent versions of machine learning-based video editing are causing more concern. One is what is accurately termed a “deepfake,” and the other, even newer process, is the deep video portrait (DVP). Because the processes are similar, and the outcomes and risks aligned, most common media references will likely consider them all deepfakes moving forward.

Technically, the term deepfake refers to what are essentially “faceswaps,” wherein the editor pastes someone’s face like a mask over the original person’s face in an existing source video. Deepfakes generally attribute one person’s actions and words to someone else. The term was coined in 2017 by a reddit user who used the technology to make believable fake celebrity porn videos. That is – the editor took a selected clip of a pornographic video, and then needed some amount of actual video of the celebrity to be defamed, which the AI software could use to “learn” that celebrity’s movements well enough to place his or her face over the original porn film actor. Thus, it appeared the celebrity was the person in the porn film. Here, John Oliver’s face is placed over Jimmy Fallon’s as though Oliver were the one dancing and telling jokes.

In these examples, the original videos may be fairly easily accessible to demonstrate that the altered versions were faked. Alternatively, observing other aspects of the video (such as height or other physical characteristics) may make it clear the substituted face is not the original person in the video. The technology is not seamless—close observation shows that the “fit” of Oliver’s face isn’t perfect, particularly when he turns to the side. And in some cases, the editing may be discernible upon examination of the altered file. However, quality is improving steadily and this sort of substitution may be harder to visually identify in the future. 

These extremely realistic deepfakes are the result of a powerful machine learning technology known as “generative adversarial networks” (GANs), a programming architecture based on deep neural networks.

The other process – creating deep video portraits (DVPs) – is even newer, more powerful, and potentially more dangerous. DVPs are also created through GANs and are likewise commonly referred to as deepfakes. But whereas actual deepfakes allow someone to place a mask of one person’s face onto an existing video clip, DVPs allow the creator to digitally control the movements and features of an already-existing face, essentially turning it into a puppet in a new video recording. As demonstrated here, using a relatively short amount of existing video of the target (the person to be faked), the program allows a source actor to move his head and mouth, talk, and blink so that the targeted person seamlessly appears to make exactly those movements and expressions – controlled to say or express exactly what the editor desires.

In a video created in this manner, Jordan Peele digitally manipulated Obama’s face to speak words and make facial expressions that Obama never did. After making this fake video with FakeApp (first released in 2018), Peele edited the file further with Adobe After Effects to create a particularly convincing altered performance.

Since the DVP process creates a new video file directly (rather than manipulating an existing file), there is no digital editing trail through which to identify changes technologically – and there is no original video to find and contrast it with. AI systems are being developed that may be able to detect DVPs based on inconsistencies of movement and other performance discrepancies, but whether they can keep up with the improving quality of the faked imagery remains to be seen.

“When Machines Compete”: How GANs Work

GANs are a relatively recent development in neural net technology, first proposed by Ian Goodfellow in a 2014 paper. GANs allow machine learning-based systems to be “creative.” In addition to producing deepfake videos sophisticated enough to fool both humans and machines, these programs can be written to make audio recordings, paint portraits that sell for hundreds of thousands of dollars, and write fiction.

GANs work by having two neural networks compete directly with each other.¹ The first (the “generator”) is tasked with creating fake data – in this case, a video – based on a training set of real data (video, audio, or text from existing files or recordings). The generator is trained to emulate the data in the real files: learning what human faces look like; how people move their heads, lips, and eyebrows when they talk; and what sounds are made in speech.

The second neural net (the “adversary,” or discriminator), a program also trained on the real video data, uses its learned analysis of how people move and speak to try to distinguish AI-generated video from real footage – that is, the adversary’s job is to spot the fakes created by the generator. The more original video data it is fed, the better this adversarial network becomes at catching the fake outputs. But concurrently, the generator uses the experience of being caught to improve its fakes. This drives an upward spiral, as each algorithm continually improves until the generator is finally able to consistently create outputs so believable that the adversary cannot distinguish them from real footage.
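To make the loop described above concrete, here is a minimal, self-contained sketch of a GAN in Python using PyTorch. It learns to fake samples from a simple one-dimensional distribution rather than video, and the network sizes and hyperparameters are illustrative assumptions, not values from any system discussed in this paper:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def real_batch(n=64):
    # "Real" training data: samples from a Gaussian (mean 4, std 1.5)
    # standing in for the real video/audio files described above.
    return torch.randn(n, 1) * 1.5 + 4.0

# The generator maps random noise to fake samples.
generator = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
# The adversary (discriminator) outputs a "real vs. fake" logit.
adversary = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))

loss_fn = nn.BCEWithLogitsLoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
a_opt = torch.optim.Adam(adversary.parameters(), lr=1e-3)

for step in range(5000):
    # 1) Train the adversary to tell real samples from the generator's fakes.
    real = real_batch()
    fake = generator(torch.randn(64, 8)).detach()  # freeze the generator here
    a_loss = (loss_fn(adversary(real), torch.ones(64, 1)) +
              loss_fn(adversary(fake), torch.zeros(64, 1)))
    a_opt.zero_grad(); a_loss.backward(); a_opt.step()

    # 2) Train the generator to produce fakes the adversary labels "real".
    #    Each side's progress forces the other to improve.
    fake = generator(torch.randn(64, 8))
    g_loss = loss_fn(adversary(fake), torch.ones(64, 1))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()

# After training, the fakes' statistics should drift toward the real
# distribution's mean of 4 and standard deviation of 1.5.
samples = generator(torch.randn(1000, 8))
print("fake mean/std:", samples.mean().item(), samples.std().item())
```

The same two-player structure scales up to images and video: only the data, the network architectures, and the compute budget change, not the underlying generator-versus-adversary dynamic.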

Identifying Fakes: “Technically,” It’s A Problem

As a society, we have long known that photos can be altered, even when they look real. And some are fairly obvious fakes to all but the most willfully ignorant, simply based on their level of outlandishness. Examples that come to mind include edited images of George Bush holding a book upside-down, Vladimir Putin riding a bear, and President Trump playing with plastic dinosaurs. But many fake images are more subtle, and as we have seen in recent years with the rise of “fake news” generally, the false can often mislead the unsuspecting or the less discerning. People accept as true outright lies, generated by human writers simply because fabrications get clicks. What will happen when people are urged to routinely and critically question even what appears to be unedited footage?

How will we deal with the problem when average users can generate realistic-looking images, videos, and stories with nothing more than an easily accessible program that requires minimal input? Potential concerns include an abundance of fake nude photos, fabricated admissions of treason or fraud, and false news stories spread across the internet and social media. The private sector, government, and individuals will all need to react.²

A skeptical eye may be able to identify many of the current deepfakes as forgeries, but this may be a temporary comfort. While researchers are applying the same challenge-and-improve process to systems designed to detect artificial files, the contest between creators and detectors is uneven, at least partly because of a disparity of attention and research: the number of people working on the fakers’ side, as opposed to the detection side, may be as high as 100 to 1, although a future breakthrough could quickly tip the balance. And just as the generator learns what gets it caught by the discriminator neural net, creators learn from their mistakes. For example, when some detection technology relied on the fact that deepfakes were not able to blink naturally, creators quickly improved the technology to incorporate smooth blinking.
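The blinking episode illustrates how simple, and how brittle, such detection cues can be. Many early blink detectors reduced to basic geometry on facial landmarks, such as the widely used “eye aspect ratio” heuristic sketched below in Python (the six landmark points per eye are assumed to come from an off-the-shelf facial-landmark detector; the sample coordinates are hypothetical):

```python
# Sketch of the eye-aspect-ratio (EAR) blink heuristic that some early
# deepfake detectors relied on. Landmarks are assumed to come from an
# off-the-shelf facial-landmark detector; the sample values are made up.
import numpy as np

def eye_aspect_ratio(eye: np.ndarray) -> float:
    # eye: six (x, y) landmarks ordered corner, top, top, corner,
    # bottom, bottom (as in the common 68-point landmark scheme).
    vertical = (np.linalg.norm(eye[1] - eye[5]) +
                np.linalg.norm(eye[2] - eye[4]))
    horizontal = np.linalg.norm(eye[0] - eye[3])
    return vertical / (2.0 * horizontal)

# An EAR that stays high for an entire clip suggests a face that never
# blinks -- a giveaway in early deepfakes, and one creators quickly fixed.
open_eye = np.array([[0, 2], [2, 4], [4, 4], [6, 2], [4, 0], [2, 0]], float)
closed_eye = np.array([[0, 2], [2, 2.4], [4, 2.4], [6, 2], [4, 1.6], [2, 1.6]], float)
print(eye_aspect_ratio(open_eye), eye_aspect_ratio(closed_eye))  # ~0.67 vs ~0.13
```

Once creators trained their generators on footage that included natural blinking, this cue evaporated, which is exactly the arms-race dynamic described above.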

Even once a fake is identified, technical controls are limited. Reddit shut down r/deepfakes, where the initial deepfake pornographic content was shared. YouTube removed the Pelosi video, and Facebook notifies users who re-post it that its veracity is disputed. But these actions are complicated and don’t scale well. Trying to automate this sort of monitoring leads to problems such as deleting historical documentation of Nazis in Germany while trying to suppress the imagery of white supremacists. It has also prompted questions over whether platforms are now fact-checking art. After all, the deepfake of Zuckerberg was created by an artist.

The developers of these technologies will also have to wrestle with the ethical implications of their decisions, as use cases run the gamut from useful and creative applications to concerning ones. OpenAI, a group that actively promotes cooperation in AI research, designed a fake-text generator it considered so good that the risks it posed made it too dangerous to release. The program was designed to flesh out a full story or news article based on just a short opening phrase. Despite the group’s central policy of supporting open-source code, it opted not to release the full model of the system, GPT-2, “due to […] concerns about malicious applications of the technology” – namely, the ease with which it could be used for social engineering attacks, to impersonate people, or to create fake news and content.

We’ve seen how rumors and fake news can crash stock prices, raise ethnic tensions, or lead a man to burst into a pizza shop with an assault rifle. Yet even when the true accounts are available to distinguish fake from real, problems emerge. The implications and challenges posed by such content-generation systems are significant. Technological fixes alone will be insufficient to combat undesired impacts. 

Identifying Solutions: Political Will

Any proposed solutions are likely to include at least some degree of regulatory intervention. In the U.S. legislative context, one proposal to deal with the potential proliferation of fake videos, news, and photos is to amend Section 230 of the Communications Decency Act. That law currently establishes that ISPs and many online services are not responsible for content posted by users. Free speech groups maintain that if companies were responsible for user-posted content, they would be forced to block or strictly censor at mass scale, with a chilling effect on internet free speech.

Danielle Citron and Benjamin Wittes have proposed amending Section 230 to hold companies liable for failing to take reasonable steps to prevent or address unlawful uses of their services. Members of Congress and state governments have proposed bills to criminalize the “malicious” creation and distribution of deepfakes, and to prohibit creating videos, photos, and audio of someone without their consent. These proposals presuppose that synthetic or manipulated media can be reliably identified. Imposing liability for detection that platforms cannot actually perform will not solve the problem, although technical breakthroughs remain possible.

The feasibility of finding practical ways to identify and enforce such requirements is questionable, and there would be inevitable First Amendment challenges to resolve. These questions only expand when considering global regulatory implications.

Identifying Solutions: Social Norms

An information flow in which it is difficult or even impossible to distinguish what is real from what is not is a dangerous one. Forgeries may be accepted as true, and truth can be undermined by doubt. In 1994, Johnson and Seifert called this the “continued influence effect”: when a fake is later proven false (the Protocols of the Elders of Zion), or a truth is doubted despite being clearly proven correct (scientific findings such as the history of women’s physiology, or the recent anti-vax movement based on a retracted study suggesting a connection between vaccines and autism), the initial negative or positive associations remain and can have significant, lingering effects.

Democracy depends on an informed electorate, so encouraging the population at large to think critically about the source and content validity of the news and stories they hear is essential. As stated in a 2018 New York Times op-ed, “Democracy assumes that its citizens share the same reality.” Or as Daniel Patrick Moynihan once put it, “You are entitled to your own opinion. You are not entitled to your own facts.” It is unhealthy for a democracy when facts are doubted and “alternative facts” are accepted. U.S. society is already grappling with this condition, and the introduction of false information via deepfakes will only exacerbate the problem.

Some people may align themselves with sources they deem reliable, while some may dismiss the pursuit of truth as a meaningful endeavor altogether. But many will want to proactively ensure they are receiving a comprehensive and accurate reporting of the world around them. If unable to confidently determine the truth for themselves, these people may seek an arbiter of some kind, looking to fact-checking applications, reliable media companies, or public and non-profit agencies to supply, verify, or validate information. It may be that more or different organizations are needed to formally fill this role, to objectively and disinterestedly provide news “about” the news.

People do adapt. New technologies, like film in the late 1800s or the proliferation of Photoshop in the late 20th century, force people to recognize and react to new realities. And technology will certainly play a role, as companies seek to leverage AI and other tools to identify fakes. But we are in the midst of the steep transition between a technology’s capabilities and the counter-strategies to match them. As Sandra Wachter of the Oxford Internet Institute reminds us, while this problem is not new, the rate at which the technology is developing is challenging our ability to adapt quickly enough.

We need to confront the technical and political issues arising from computer-generated fake videos and media. A first step is increasing public awareness that these technologies exist and are getting better, and that people must assume some responsibility for critically analyzing the information they consume.

 

¹ This paper addresses the use of GANs to generate fake videos specifically, but GANs have many other applications and use cases in machine learning, including generating music, recreating voices for those who cannot speak, addressing various medical conditions, and other simulated or synthetic applications that, while “artificial,” are not designed with the intent to deceive.

² While detection is one clear, and probably necessary, arm of research, there are multiple ways to meet this threat, including methods to track original files, establish provenance, and otherwise validate content. These methods raise their own risks, however, for anonymous users such as whistleblowers or civil rights activists, so every option carries challenges that preclude “easy” fixes.