Privacy Papers 2016

The winners of the 2016 PPPM Award are:

Law Enforcement Access to Data Across Borders: The Evolving Security and Human Rights Issues

by Jennifer Daskal, Associate Professor, American University Washington College of Law


A revolution is underway with respect to law enforcement access to data across borders. Frustrated by delays in accessing sought-after data located across territorial borders, several nations are taking action, often unilaterally, and often in concerning ways. Several nations are considering, or have passed, mandatory data localization requirements, pursuant to which companies doing business in their jurisdiction are required to store certain data, or copies of such data, locally. Such measures facilitate domestic surveillance, increase the cost of doing business, and undercut the growth potential of the Internet by restricting the otherwise free and most efficient movement of data. Meanwhile, a range of nations, including the United Kingdom, Brazil, and others, are asserting that they can unilaterally compel Internet Service Providers (ISPs) that operate in their jurisdiction to produce the emails and other private communications that are stored in other nations’ jurisdictions, without regard to the location or nationality of the target. ISPs are increasingly caught in the middle, forced to choose between the laws of a nation that seeks production of data and the laws of another nation that prohibits such production. In 2015, for example, Brazilian authorities detained a Microsoft employee for failing to turn over data sought by Brazil; U.S. law prohibited Microsoft from complying with the data request. Governments also are increasingly incentivized to seek other means of accessing otherwise inaccessible data, via, for example, use of malware or other surreptitious forms of surveillance.

While this is a problem of international scope, the United States has an outsized role to play, given a combination of the U.S.-based provider dominance of the market, blocking provisions in U.S. law that prohibit the production of the content of emails and other electronic communications to foreign-based law enforcement, and the particular ways that companies are interpreting and applying their legal obligations. It also means that the United States is uniquely situated to lay the groundwork for an alternative approach that better reflects the normative and practical concerns at stake—and do so in a privacy-protective way. This article analyzes the current state of affairs, highlights the urgent need for a new approach, and suggests a way forward, pursuant to which nations would be able to directly access data from U.S.-based providers when specified procedural and substantive standards are met. The alternative is a Balkanized Internet and a race to the bottom, with every nation unilaterally seeking to access sought-after data, companies increasingly caught between conflicting laws, and privacy rights minimally protected, if at all.

Accountable Algorithms

by Joshua A. Kroll, Engineer, Security Team, Cloudflare; Joanna Huey, Princeton University; Solon Barocas, Princeton University; Edward W. Felten, Princeton University; Joel R. Reidenberg, Stanley D. and Nikki Waxberg Chair in Law, Fordham University School of Law; David G. Robinson, Upturn; and Harlan Yu, Upturn



Many important decisions historically made by people are now made by computers. Algorithms count votes, approve loan and credit card applications, target citizens or neighborhoods for police scrutiny, select taxpayers for an IRS audit, grant or deny immigration visas, and more.

The accountability mechanisms and legal standards that govern such decision processes have not kept pace with technology. The tools currently available to policymakers, legislators, and courts were developed to oversee human decision-makers and often fail when applied to computers instead: for example, how do you judge the intent of a piece of software? Additional approaches are needed to make automated decision systems—with their potentially incorrect, unjustified or unfair results—accountable and governable. This Article reveals a new technological toolkit to verify that automated decisions comply with key standards of legal fairness.

We challenge the dominant position in the legal literature that transparency will solve these problems. Disclosure of source code is often neither necessary (because of alternative techniques from computer science) nor sufficient (because of issues analyzing code) to demonstrate the fairness of a process. Furthermore, transparency may be undesirable, such as when it permits tax cheats or terrorists to game the systems determining audits or security screening, or when it discloses private or protected information.

The central issue is how to assure the interests of citizens, and society as a whole, in making these processes more accountable. This Article argues that technology is creating new opportunities—more subtle and flexible than total transparency—to design decision-making algorithms so that they better align with legal and policy objectives. Doing so will improve not only the current governance of algorithms, but also—in certain cases—the governance of decision-making in general. The implicit (or explicit) biases of human decision-makers can be difficult to find and root out, but we can peer into the “brain” of an algorithm: computational processes and purpose specifications can be declared prior to use and verified afterwards.
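The idea of declaring a decision process before use and verifying it afterwards can be illustrated with a simple hash commitment. This is a minimal sketch under stated assumptions, not the toolkit described in the Article: the function names `commit` and `verify`, the choice of SHA-256, and the example policy text are all hypothetical.

```python
import hashlib
import hmac
import secrets


def commit(policy: bytes) -> tuple[str, bytes]:
    """Return (commitment, nonce) for a decision policy declared before use."""
    # Random blinding value, kept secret until the policy is revealed for audit.
    nonce = secrets.token_bytes(32)
    digest = hashlib.sha256(nonce + policy).hexdigest()
    return digest, nonce


def verify(commitment: str, nonce: bytes, policy: bytes) -> bool:
    """Check afterwards that the revealed policy matches the earlier commitment."""
    expected = hashlib.sha256(nonce + policy).hexdigest()
    return hmac.compare_digest(expected, commitment)


# Hypothetical policy text, used only for illustration.
policy = b"audit if reported_income < 0.5 * third_party_income"
published, nonce = commit(policy)

honest_ok = verify(published, nonce, policy)
tampered_ok = verify(published, nonce, policy + b" or name == 'X'")
```

Publishing only the digest lets an overseer later confirm that the rule applied was the rule declared, without the rule being disclosed in advance.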

The technological tools introduced in this Article apply widely. They can be used in designing decision-making processes from both the private and public sectors, and they can be tailored to verify different characteristics as desired by decision-makers, regulators, or the public. By forcing a more careful consideration of the effects of decision rules, they also engender policy discussions and closer looks at legal standards. As such, these tools have far-reaching implications throughout law and society.

The Privacy Policymaking of State Attorneys General

by Danielle Keats Citron, Professor of Law, University of Maryland Carey School of Law


Accounts of privacy law have focused on legislation, federal agencies, and the self-regulation of privacy professionals. Crucial agents of regulatory change, however, have been ignored: the state attorneys general. This article is the first in-depth study of the privacy norm entrepreneurship of state attorneys general. Because so little has been written about this phenomenon, I engaged with primary sources — first interviewing state attorneys general and current and former career staff, and then examining documentary evidence received through FOIA requests submitted to AG offices around the country.

Much as Justice Louis Brandeis imagined states as laboratories of the law, offices of state attorneys general have been laboratories of privacy enforcement. State attorneys general have been nimble privacy enforcement pioneers where federal agencies have been more conservative or constrained by politics. Their local knowledge, specialization, multi-state coordination, and broad legal authority have allowed them to experiment in ways that federal agencies cannot. These characteristics have enabled them to establish baseline fair information protections; expand the frontiers of privacy law to cover sexual intimacy and youth; and pursue enforcement actions that have harmonized privacy policy.

Although certain systemic practices enhance AG privacy policymaking, others blunt its impact, including an overreliance on informal agreements that lack law’s influence and a reluctance to issue closing letters identifying data practices that comply with the law. This article offers ways state attorneys general can function more effectively through informal and formal proceedings. It addresses concerns about the potential pile-up of enforcement activity, federal preemption, and the dormant Commerce Clause. It urges state enforcers to act more boldly in the face of certain shadowy data practices.

Privacy of Public Data

by Kirsten Martin, Assistant Professor of Strategic Management & Public Policy, George Washington University School of Business; and Helen Nissenbaum, Professor, Media, Culture, and Communication & Computer Science, New York University


The construct of an information dichotomy has played a defining role in regulating privacy: information deemed private or sensitive typically earns high levels of protection, while lower levels of protection are accorded to information deemed public or non-sensitive. Challenging this dichotomy, the theory of contextual integrity associates privacy with complex typologies of information, each connected with respective social contexts. Moreover, it contends that information type is merely one among several variables that shape people’s privacy expectations and underpin privacy’s normative foundations. Other contextual variables include key actors — information subjects, senders, and recipients — as well as the principles under which information is transmitted, such as whether with subjects’ consent, as bought and sold, as required by law, and so forth. Prior work revealed the systematic impact of these other variables on privacy assessments, thereby debunking the defining effects of so-called private information.

In this paper, we shine a light on the opposite effect, challenging conventional assumptions about public information. The paper reports on a series of studies, which probe attitudes and expectations regarding information that has been deemed public. Public records established through the historical practice of federal, state, and local agencies, as a case in point, are afforded little privacy protection, or possibly none at all. Motivated by progressive digitization and creation of online portals through which these records have been made publicly accessible, our work underscores the need for more concentrated and nuanced privacy assessments, even more urgent in the face of vigorous open data initiatives, which call on federal, state, and local agencies to provide access to government records in both human- and machine-readable forms. Within a stream of research suggesting possible guard rails for open data initiatives, our work, guided by the theory of contextual integrity, provides insight into the factors systematically shaping individuals’ expectations and normative judgments concerning appropriate uses of and terms of access to information.

Using a factorial vignette survey, we asked respondents to rate the appropriateness of a series of scenarios in which contextual elements were systematically varied; these elements included the data recipient (e.g. bank, employer, friend), the data subject, and the source, or sender, of the information (e.g. individual, government, data broker). Because the object of this study was to highlight the complexity of people’s privacy expectations regarding so-called public information, information types were drawn from data fields frequently held in public government records (e.g. voter registration, marital status, criminal standing, and real property ownership).
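A factorial vignette design crosses every level of each factor and shows each respondent a sample of the resulting scenarios. The sketch below illustrates that construction using only the example levels named above. The data subject factor is left out because its levels are not listed here, and the function names, sample size, and seed are assumptions for illustration rather than the study's actual instrument.

```python
import itertools
import random

# Example levels taken from the abstract; the full study may have used others.
factors = {
    "recipient": ["bank", "employer", "friend"],
    "source": ["individual", "government", "data broker"],
    "information": [
        "voter registration",
        "marital status",
        "criminal standing",
        "real property ownership",
    ],
}


def all_vignettes(factors):
    """Build the full factorial: one scenario per combination of factor levels."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in itertools.product(*factors.values())]


def sample_for_respondent(vignettes, k, seed):
    """Draw k distinct scenarios for one respondent (k and seed are hypothetical)."""
    rng = random.Random(seed)
    return rng.sample(vignettes, k)


vignettes = all_vignettes(factors)
shown = sample_for_respondent(vignettes, k=10, seed=1)
```

With three recipients, three sources, and four information types, the full factorial contains 3 × 3 × 4 = 36 scenarios, which is why each respondent typically rates only a subset.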

Our findings are noteworthy on both theoretical and practical grounds. In the first place, they reinforce key assertions of contextual integrity about the simultaneous relevance to privacy of other factors beyond information types. In the second place, they reveal discordance between truisms that have frequently shaped public policy relevant to privacy. For example,
• Ease of accessibility does not drive judgments of appropriateness. Thus, even when respondents deemed information easy to access (marital status) they nevertheless judged it inappropriate (“Not OK”) to access this information under certain circumstances.
• Even when it is possible to find certain information in public records, respondents cared about the immediate source of that information in judging whether given data flows were appropriate. In particular, even though the information in question was known to be available in public records, respondents deemed inappropriate all circumstances in which data brokers were the immediate source of information.
• Younger respondents (under 35 years old) were more critical of using data brokers and online government records as compared with the null condition of asking data subjects directly, debunking conventional wisdom that “digital natives” are uninterested in privacy.

One immediate application to public policy is in the sphere of access to records that include information about identifiable or reachable individuals. This study has shown that individuals have quite strong normative expectations concerning appropriate access and use of information in public records that do not comport with the maxim, “anything goes.” Furthermore, these expectations are far from idiosyncratic and arbitrary. Our work calls for approaches to providing access that are more judicious than a simple on/off spigot. Complex information ontologies, credentials of key actors (i.e. senders and recipients in relation to the data subject), and terms of access, even lightweight ones such as identity or role authentication, varying privilege levels, or a commitment to limited purposes, may all be used to adjust public access to align better with legitimate privacy expectations. Such expectations should be systematically considered when crafting policies around public records and open data initiatives.

Risk and Anxiety: A Theory of Data Breach Harms

(Full paper available pending publication)

by Daniel Solove, Professor of Law, George Washington University Law School; and Danielle Citron, Professor of Law, University of Maryland Carey School of Law


In lawsuits about data breaches, the issue of harm has confounded courts. Harm is central to whether plaintiffs have standing to sue in federal court and whether plaintiffs have viable claims in tort or contract. Plaintiffs have argued that data breaches create a risk of future injury from identity theft or fraud and that breaches cause them to experience anxiety about this risk. Courts have been reaching wildly inconsistent conclusions on the issue of harm, with most courts dismissing data breach lawsuits for failure to allege harm. A sound and compelling approach to harm has yet to emerge, resulting in a lack of consensus among courts and a rather incoherent jurisprudence.

Two U.S. Supreme Court cases within the past five years have contributed significantly to this tortured state of affairs. In 2013, the Court in Clapper v. Amnesty International concluded that fear and anxiety about surveillance – and the cost of taking measures to protect against it – were too speculative to constitute “injury in fact” for standing. The Court emphasized that injury must be “certainly impending” to be recognized. This past term, the U.S. Supreme Court in Spokeo v. Robins issued an opinion aimed at clarifying the harm required for standing in a case involving personal data. But far from providing guidance, the opinion fostered greater confusion. What the Court made clear, however, was that “intangible” injury, including the “risk” of injury, could be sufficient to establish harm.

Little progress has been made to harmonize this troubled body of law, and there is no coherent theory or approach. In this Article, we examine why courts have struggled when dealing with harms caused by data breaches. We contend that the struggle stems from the fact that data breach harms are intangible, risk-oriented, and diffuse. Although these characteristics have been challenging to courts in the past, courts have, in fact, been recognizing harms with these characteristics in other areas of law.

We argue that many courts are far too dismissive of certain forms of data breach harm. In many instances, courts should be holding that data breaches cause cognizable harm. We explore why courts struggle to recognize data breach harms and how existing foundations in the law should be used by courts to recognize such harm. We demonstrate how courts can assess risk and anxiety in a concrete and coherent way.

Online Tracking: A 1-million-site Measurement and Analysis

by Steven Englehardt, PhD Candidate, Princeton University


We present the largest and most detailed measurement of online tracking conducted to date, based on a crawl of the top 1 million websites. We make 15 types of measurements on each site, including stateful (cookie-based) and stateless (fingerprinting-based) tracking, the effect of browser privacy tools, and the exchange of tracking data between different sites (“cookie syncing”). Our findings include multiple sophisticated fingerprinting techniques never before measured in the wild. This measurement is made possible by our open-source web privacy measurement tool, OpenWPM, which uses an automated version of a full-fledged consumer browser. It supports parallelism for speed and scale, automatic recovery from failures of the underlying browser, and comprehensive browser instrumentation. We demonstrate our platform’s strength in enabling researchers to rapidly detect, quantify, and characterize emerging online tracking behaviors.
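Cookie syncing can be made concrete with a small heuristic: look for an identifier stored in one domain's cookie that later appears in a request URL sent to a different domain. The sketch below is an illustration on made-up data and does not use OpenWPM's API or reproduce the paper's detection method. The function name `find_cookie_syncs`, the minimum ID length, and all domains and values are assumptions.

```python
from urllib.parse import urlparse


def find_cookie_syncs(cookies, request_urls, min_id_length=8):
    """Return sorted (cookie_owner, receiving_host, value) triples where an
    ID-like cookie value appears in a URL requested from another host."""
    syncs = set()
    for owner, value in cookies:
        if len(value) < min_id_length:
            continue  # short values are unlikely to be unique identifiers
        for url in request_urls:
            host = urlparse(url).hostname or ""
            same_party = host == owner or host.endswith("." + owner)
            if not same_party and value in url:
                syncs.add((owner, host, value))
    return sorted(syncs)


# Hypothetical crawl records: (cookie owner domain, cookie value).
cookies = [("tracker-a.example", "u7f3k29xq1z8"), ("shop.example", "en-US")]
request_urls = [
    "https://tracker-b.example/sync?partner_uid=u7f3k29xq1z8",
    "https://cdn.tracker-a.example/pixel?id=u7f3k29xq1z8",
    "https://tracker-b.example/ad?lang=en-US",
]

syncs = find_cookie_syncs(cookies, request_urls)
```

Only the first request counts as a sync here: the second stays within the cookie owner's own domain, and the third carries a value too short to be treated as an identifier.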

The 2016 PPPM Honorable Mentions are:


The implementation of a universal digitalized biometric ID system risks normalizing and integrating mass cybersurveillance into the daily lives of ordinary citizens. ID documents such as driver’s licenses in some states and all U.S. passports are now implanted with radio frequency identification (RFID) technology. In recent proposals, Congress has considered implementing a digitalized biometric identification card, such as a biometric-based, “high-tech” Social Security Card, which may eventually lead to the development of a universal multimodal biometric database (e.g., the collection of the digital photos, fingerprints, iris scans, and/or DNA of all citizens and noncitizens). Such “high-tech” IDs, once merged with GPS-RFID tracking technology, would exponentially facilitate a convergence of cybersurveillance-body tracking and data surveillance, or dataveillance-biographical tracking. Yet, the existing Fourth Amendment jurisprudence is tethered to a “reasonable expectation of privacy” test that does not appear to restrain the comprehensive, suspicionless amassing of databases that concern the biometric data, movements, activities, and other personally identifiable information of individuals.

In this Article, I initiate a project to explore the constitutional and other legal consequences of big data cybersurveillance generally and mass biometric dataveillance in particular. This Article focuses on how biometric data is increasingly incorporated into identity management systems through bureaucratized cybersurveillance or the normalization of cybersurveillance through the daily course of business and integrated forms of governance.


Website privacy policies often contain ambiguous language that undermines the purpose and value of privacy notices for site users. This paper compares the impact of different regulatory models on the ambiguity of privacy policies in multiple online sectors. First, the paper develops a theory of vague and ambiguous terms. Next, the paper develops a scoring method to compare the relative vagueness of different privacy policies. Then, the theory and scoring are applied using natural language processing to rate a set of policies. The ratings are compared against two benchmarks to show whether government-mandated privacy disclosures result in notices less ambiguous than those emerging from the market. The methodology and technical tools can provide companies with mechanisms to improve drafting, enable regulators to easily identify poor privacy policies and empower regulators to more effectively target enforcement actions.
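A scoring method of this kind can be pictured as counting vague terms relative to policy length. The sketch below is a deliberately simple stand-in, not the paper's theory or scoring method: the term list, the per-100-words normalization, and the two sample policies are all assumptions chosen for illustration.

```python
import re

# Hypothetical list of vague terms; the paper's own taxonomy is not reproduced here.
VAGUE_TERMS = (
    "may", "might", "generally", "as needed",
    "from time to time", "certain", "some", "appropriate",
)


def vagueness_score(policy_text, terms=VAGUE_TERMS):
    """Return vague-term occurrences per 100 words (0.0 for empty text)."""
    text = policy_text.lower()
    words = re.findall(r"[a-z']+", text)
    if not words:
        return 0.0
    hits = sum(len(re.findall(r"\b" + re.escape(t) + r"\b", text)) for t in terms)
    return 100.0 * hits / len(words)


# Made-up policy sentences for comparison.
policies = {
    "vague": "We may share certain information with some partners as needed.",
    "specific": "We share your email address with our payment processor to send receipts.",
}

# Rank policies from most to least vague.
ranked = sorted(policies, key=lambda name: vagueness_score(policies[name]), reverse=True)
```

A relative score like this supports comparison across policies or sectors, but it says nothing about whether a given vague term is legally significant.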


A data revolution is transforming the workplace. Employers are increasingly relying on algorithms to decide who gets interviewed, hired or promoted. Proponents of the new data science claim that automated decision systems can make better decisions faster, and are also fairer, because they replace biased human decision-makers with “neutral” data. However, data are not neutral and algorithms can discriminate. The legal world has not yet grappled with these challenges to workplace equality. The risks posed by data analytics call for fundamentally rethinking anti-discrimination doctrine. When decision-making algorithms produce biased outcomes, they may seem to resemble familiar disparate impact cases, but that doctrine turns out to be a poor fit. Developed in a different context, disparate impact doctrine fails to address the ways in which algorithms can introduce bias and cause harm. This Article argues instead for a plausible, revisionist interpretation of Title VII, in which disparate treatment and disparate impact are not the only recognized forms of discrimination. A close reading of the text suggests that Title VII also prohibits classification bias, namely the use of classification schemes that have the effect of exacerbating inequality or disadvantage along the lines of race or other protected category. This description matches well the concerns raised by workplace analytics. Framing the problem in terms of classification bias leads to some quite different conclusions about how the anti-discrimination norm should be applied to algorithms, suggesting both the possibilities and limits of Title VII’s liability-focused model.


According to conventional wisdom, data privacy regulators in the European Union are unreasonably demanding, while their American counterparts are laughably lax. Many observers further assume that any privacy enforcement without monetary fines or other punishment is an ineffective “slap on the wrist.” This Article demonstrates that both of these assumptions are wrong. It uses the simultaneous 2011 investigation of Facebook’s privacy practices by regulators in the United States and Ireland as a case study. These two agencies reached broadly similar conclusions, and neither imposed a traditional penalty. Instead, they utilized “responsive regulation,” where the government emphasizes less adversarial techniques and considers formal enforcement actions more of a last resort.

When regulators in different jurisdictions employ this same responsive regulatory strategy, they blur the supposedly sharp distinctions between them, whatever may be written in their respective constitutional proclamations or statute books. Moreover, “regulatory friending” techniques work effectively in the privacy context. Responsive regulation encourages companies to improve their practices continually, it retains flexibility to deal with changing technology, and it discharges oversight duties cost-efficiently, thus improving real-world data practices.