Finding a Balance Between Privacy and Progress: Jules Polonetsky at TEDxMidAtlantic 2012

Seeking Submissions for Privacy Papers for Policy Makers 2013

FPF is pleased to invite privacy scholars, professionals, and others with an interest in privacy issues to submit papers to be considered for inclusion in FPF’s annual edition of “Privacy Papers for Policy Makers.”

The purpose of Privacy Papers for Policy Makers is to present policy makers with highlights of important research and analytical work on a variety of privacy topics.  Specifically, we wish to showcase papers that analyze cutting-edge privacy issues and propose either achievable short-term solutions or new means of analysis that could lead to solutions.

Academics, privacy advocates and Chief Privacy Officers on FPF’s Advisory Board will review the submitted papers to determine which papers are best suited and most useful for policy makers in Congress, at federal agencies and for distribution to data protection authorities internationally.  Selected papers will be presented at an event with privacy leaders in the Fall, and will be included in a printed digest that will be distributed to policy makers.

The entry can provide a link to a published paper or to a draft paper that has a publication date.  FPF will work with the authors of the selected papers to develop a digest.

Our deadline for submissions is July 19, 2013.  Please include the author’s full name, phone number, current postal address, and e-mail address.

Please send submissions via e-mail to [email protected] with the subject line “Privacy Papers for Policy Makers 2013,” or send by mail to:

Future of Privacy Forum

919 18th Street NW, Suite 901

Washington, DC  20006

Click here to view prior editions of “Privacy Papers for Policy Makers.”  We look forward to your submissions.

Looking at Privacy Protections for Facial Recognition

On Sunday, Google announced that it would not allow facial recognition applications on Google Glass until “strong privacy protections” were in place. But this announcement raises the obvious question: what sort of privacy protections can actually be put in place for this sort of technology?

Thus far, concerns about facial recognition technology have appeared mainly in the context of “tagging” images on Facebook or of how the technology might transform marketing, but these interactions are largely between users and service providers. Facial recognition on the scale offered by wearable technology such as Google Glass can change how we navigate the outside world. As one commenter put it, notice and consent mechanisms can protect Glass users from service providers, but they do nothing to protect bystanders from the users themselves.

Many suggestions have focused on sending signals to the outside world that Glass is at work, such as blinking lights or other audio or visual cues. This is similar to efforts such as requiring cameras to go “click” whenever a photo is taken in order to make surreptitious photography more difficult. However, these sorts of mechanisms place the responsibility on non-users to constantly be aware of their surroundings lest they be recognized without their approval.

In its report last year on best practices for facial recognition technology, the FTC specifically addressed scenarios where companies use facial recognition to identify anonymous images of a consumer to someone who could not otherwise identify him or her, pointing to mobile apps that could permit users to surreptitiously discover information about people on the street. Noting “the significant privacy and safety risks that such an app would raise,” the FTC suggested that “only consumers who have affirmatively chosen to participate in such a system should be identified.”

As a practical matter, for now, facial recognition on Glass could be tied to a user’s social network. Information that a user has access to about people out in the world would reflect information shared on that social network. Though a heads-up display could be permitted to recognize only “friends,” it seems inevitable that this technology will creep beyond this sort of artificial barrier. Drawing the line will be incredibly difficult. For example, what reason would there be to exclude professional email contacts or prominent public figures from being identified?  With some work, almost anyone who has set foot in a public space can be visually identified. Facial recognition on wearable devices simply lowers this already-diminishing bar. Empowering the general public to affirmatively choose to participate in broad-based, public facial recognition on the scale offered by wearable technologies poses a tremendous challenge to many of our traditional privacy protection tools.

Stopping the collection of this information may prove impossible. Even as Google has pledged to limit facial recognition abilities on Glass, Lambda Labs, which provides facial recognition services, has indicated that facial recognition is “a core feature” of wearable technology and that “Google will allow it or be replaced with something that does.” While a comprehensive opt-out program is one potential solution, such a system could create further privacy problems by requiring the collection of facial information in order for the application to “know” to ignore that face in the future. Another option could be for other wearable tech to send signals not to identify an individual’s face, creating a Google Glass duel of sorts.
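To see the paradox concretely, consider what a hypothetical opt-out check would have to do. The sketch below is illustrative only: the function names and the embedding-matching approach are invented, and no such Glass API exists. Honoring the opt-out requires first detecting the bystander’s face and computing its embedding, which means processing the very data the person asked to have withheld.

    import numpy as np

    def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    def should_identify(face_embedding: np.ndarray,
                        optout_registry: list[np.ndarray],
                        threshold: float = 0.8) -> bool:
        # The paradox in code: to honor an opt-out, the device must first
        # compute the bystander's facial embedding and compare it against
        # every registered face -- i.e., process the facial data of the
        # very people who opted out.
        for opted_out in optout_registry:
            if cosine_similarity(face_embedding, opted_out) >= threshold:
                return False  # face is registered as opted out: do not identify
        return True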

However, the challenge of stopping or restricting facial data collection suggests that a focus on regulating potential uses could be more productive. We could attempt to draw distinctions based on what facial recognition is being used to accomplish: is it being used to assist or augment the user’s memory? For example, using facial recognition technology to help recall a distant, long-absent relative could be distinguished from using additional data sources to learn about a stranger as you sit across from them at a table. Further, facial recognition applications could provide information based on contextual cues, such as identifying managers and staff at a restaurant while ignoring other patrons. In the end, applications will need to specifically enumerate how they will use the facial data they are collecting.
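One hypothetical way to make such enumeration enforceable is a declared-purpose manifest that the platform checks at runtime. The sketch below is purely illustrative; the purpose names and the enforcement hook are invented, not part of any real platform.

    # Hypothetical manifest of facial-data uses an application declares up front.
    DECLARED_PURPOSES = {
        "recall_known_contact",   # augmenting the user's memory of people they know
        "identify_venue_staff",   # contextual cue: staff at the restaurant you're in
    }

    def authorize_use(purpose: str) -> None:
        # Refuse any use of facial data the application did not enumerate.
        if purpose not in DECLARED_PURPOSES:
            raise PermissionError(f"undeclared use of facial data: {purpose}")

    authorize_use("recall_known_contact")        # allowed
    authorize_use("profile_stranger_at_table")   # raises PermissionError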

Both software developers and device manufacturers need to think creatively about how to establish guidelines around facial recognition technology. The alternative is a complete loss of anonymity in public, or a complete transformation of the public sphere into a place where individuals must cover up, lower their gazes, and avert their eyes—all actions that seem contrary to Google Glass’ effort to present individuals with new ways to experience our world.

Comments for the FTC's Workshop on "Internet of Things"

FPF today offered comments to the FTC in advance of a public workshop on new security and privacy issues presented by growing networks of connected devices.  Commonly referred to as the “Internet of Things,” these physical devices range from appliances and vehicles to our smartphones, and present an elaborate array of objects that capture, share, and use data.

The Internet of Things has been a focus of FPF’s work since our founding, starting with our original project on the Smart Grid and continuing to our recent projects on Connected Cars and Smart Stores.  While connected smart devices provide many benefits, new ways to protect consumer privacy may need to be explored.  Connected devices present circumstances where our traditional Fair Information Practice Principles (FIPPs) may not be available or practical.  Codes of conduct, seals, and other public-facing, enforceable commitments are examples of how to address the privacy issues in the Internet of Things.

Our full set of comments is available to read here.

New Report Shows Cybersecurity Risks from FBI “Going Dark” Proposal

Today’s New York Times discusses a major new report by 20 technologists about the cybersecurity risks that would result from an FBI plan to expand wiretapping capabilities on the Internet.  The administration is reportedly close to sending the FBI proposal to Capitol Hill to amend the Communications Assistance for Law Enforcement Act of 1994.

FPF Senior Fellow Peter Swire blogs about this issue today at the International Association of Privacy Professionals website.  His post draws on work he has done at FPF with Kenesa Ahmad.  Swire writes:

The FBI argues that new wiretapping mandates on the Internet are needed because it is “going dark,” because new and evolving Internet technologies mean that government may not have a way to get the content of communications with a wiretap order.  In a 2011 paper, Kenesa Ahmad and I argued that “going dark” is the wrong image, and that today should instead be understood as a “golden age of surveillance.”  As members of the IAPP know, law enforcement and national security agencies today have far greater data gathering capabilities than ever before, such as: (1) location information; (2) information about contacts and confederates; and (3) an array of new databases that create digital dossiers about individuals’ lives.

As the debate heats up about expanding CALEA requirements to the Internet, there are thus strong privacy and cybersecurity reasons for concern about the FBI’s proposed approach.

What's Scary About Big Data, and How to Confront It

Any discussion surrounding the benefits–and the risks–presented by Big Data often focuses on the far-off future.  The world of Minority Report is frequently invoked, but in the wake of April’s “Big Data Week,” it is time to recognize that Big Data is already here.  In their recent book, Big Data: A Revolution that Will Transform How We Live, Work, and Think, Viktor Mayer-Schönberger and Kenneth Cukier act as heralds of Big Data, and suggest that the real phenomenon is the “datafication” of our world.  They describe the transformation of our entire world into “oceans of data that can be explored” that can provide us with a new perspective on reality.  The language and rhetoric in the book highlight Big Data’s potential: the scale of Big Data, they suggest, allows us to “extract new insights” and “create new forms of value” in ways that will fundamentally change how we interact with one another.

These new insights can be used for good or for ill, but that’s true of any new piece of knowledge.  What exactly is it then that some find so disconcerting about Big Data?

Mayer-Schönberger and Cukier recognize that Big Data is on a “direct collision course” with our traditional privacy paradigms, and further, that it opens the door to create the sort of propensity models seen in Minority Report.  However, the pair are more concerned with what they term the “dictatorship of data.”  They fear that well-meaning organizations may “become so fixated on the data, and so obsessed with the power and promise it offers, that [they] fail to appreciate its limitations.”

And these limitations are very real.  The popular statistician Nate Silver argues that it is time to admit that “we have a prediction problem.  We love to predict things–and we aren’t very good at it.” It is this dynamic that presents the biggest worry about Big Data.  Its promise is that by transforming our entire world, our whole experience, into data points, the numbers will be able to speak for themselves; but this alone will not cure our prediction predilection.  As Kate Crawford of Microsoft Research recently pointed out, Big Data is full of hidden biases. “Data and data sets are not objective,” she states. “They are creations of human design.”

Google Flu Trends is often held out as something that can only be done at the scale provided by Big Data.  Using aggregated Internet searches to chart the spread of a disease demonstrates how seemingly mundane web browsing can produce new insights, but it is important to recognize the limitations of the project’s underlying algorithms.  Google Flu Trends got things wrong this year. Why?  As Google admits, not everyone who searches for “flu” is actually sick. This year, due to extensive media coverage, more people than anticipated were using Google simply to learn more about the flu.  The result was that the algorithms behind the scenes began to see signs of the flu’s spread where it didn’t actually exist. Google Flu Trends’ mistake can be excused for a number of reasons: not only is the tool largely a data experiment, but it also has a generally benevolent purpose.  Had a similar algorithm informed a decision by the CDC to quarantine a community or otherwise directly impact individuals, it would be a different conversation. Organizations and individuals need to become more aware of the biases and assumptions that underlie our datafied world.
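A toy calculation illustrates the failure mode. The numbers below are entirely invented, and this is in no way Google’s actual model; the point is only that an estimator calibrated on seasons when searches track sickness will overshoot in a season when media coverage drives extra searches.

    # Invented numbers; not Google's model. A naive estimator assumes every
    # "flu" search comes from a sick person.
    SEARCHES_PER_CASE = 3.0   # calibration from "normal" flu seasons

    def estimate_cases(search_volume: float) -> float:
        return search_volume / SEARCHES_PER_CASE

    true_cases = 10_000
    sickness_searches = true_cases * SEARCHES_PER_CASE   # 30,000 searches
    media_searches = 15_000                              # curiosity driven by news coverage

    print(estimate_cases(sickness_searches))                    # 10000.0: accurate
    print(estimate_cases(sickness_searches + media_searches))   # 15000.0: overshoots by 50%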

This requires establishing a data conversation among users. In order to strengthen individual privacy without cutting off technological innovation, individuals need to be educated about how their data is used. To start this conversation, we need more transparency. Jules Polonetsky and Omer Tene suggest that organizations should disclose the logic underlying their decision-making processes as fully as possible without compromising their algorithmic “secret sauce.” Such disclosure has two key benefits: it lets outsiders monitor how data is used, and it invites individuals to become more active participants in decisions made about them.

Today, the data deluge that Big Data presents encourages passivity and misguided efforts to get off the grid.  With an “Internet of Things” ranging from our cars to our appliances, even to our carpets, retreating to our homes and turning off our phones will do little to stem the datafication tide. Transparency for transparency’s sake is meaningless; we need mechanisms to achieve transparency’s benefits. We need to encourage users to see their data as a feature that can be turned on or off and toggled at will. Letting users declare their own data preferences will encourage them to care about what their data says about them and to engage actively in how their information is processed.

The challenge will be making this process both easily accessible and fun for users. The BlueKai Registry suggests one possible avenue by allowing consumers to see what data companies think about their computer, and Google and Yahoo already offer settings managers that let users select who sees what data. More organizations must think carefully about how best to strike a balance between controls that are user-friendly and controls that are comprehensive.

At the same time, transparency also allows experts to police companies in order to monitor, expose, and prevent practices we do not want. Mayer-Schönberger and Cukier call for the rise of the “algorithmist,” a new professional who would evaluate the selection of data sources, the choice of analytical tools, and the algorithms themselves. While offering individuals opportunities to understand and to challenge how decisions are made about them is important, internal algorithmists, alongside the watchful eyes of regulators and privacy advocates, can help ensure that companies are held accountable. This could go a long way toward alleviating fears about Big Data and providing an environment where society can safely maximize its benefits.

New Study Shows Need for De-identification Best Practices

Publicly releasing sensitive information is risky.  In 1997, Latanya Sweeney used full date of birth, 5-digit ZIP code, and gender to show that seemingly anonymous medical data could be linked to an actual person when she uncovered the health information of William Weld, the former governor of Massachusetts.  In a new study, Sweeney analyzes the data available in the Personal Genome Project (PGP) and shows once again that many people can be re-identified using date of birth, ZIP code, and gender when other data, such as a voter registration list, is available.

Sweeney’s work is important, but we don’t think it should be considered an indictment of de-identification.  The cases so often cited as proof that de-identification doesn’t work – the AOL search data release, the Netflix Prize, the Weld example, and the PGP data – are all examples of barely or very poorly de-identified data.  De-identification experts do NOT consider a publicly disclosed database with full date of birth, 5-digit ZIP code, and gender to be de-identified.  In fact, those three data points create over 3 billion possible unique combinations: full date of birth alone divides a population into over 36 thousand separate groups, and 5-digit ZIP codes further divide the US population into over 43 thousand separate groups.  Publicly releasing a database with such a large number of unique combinations allows additional databases to be linked and gives attackers all the time in the world to examine the data.  Thus, public disclosure greatly increases the risk of identifying individuals from a database.
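As a back-of-the-envelope check of those figures (the counts below are rough approximations for illustration, not numbers taken from Sweeney’s study):

    # Rough arithmetic behind the quasi-identifier counts cited above.
    birth_dates = 100 * 365   # ~36,500 distinct full dates of birth across ~100 years of ages
    zip_codes = 43_000        # roughly 43,000 five-digit ZIP codes in use
    genders = 2

    combinations = birth_dates * zip_codes * genders
    print(f"{combinations:,}")   # 3,139,000,000 -- over 3 billion buckets, roughly
                                 # ten times the US population, so most occupied
                                 # buckets contain exactly one person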

Sweeney’s study shows the importance of very strong de-identification practices when data is disclosed publicly.  With public data, organizations should use very strong de-identification techniques, such as the Privacy Analytics Risk Assessment Tool developed by Dr. Khaled El Emam or the differential privacy approach proposed by Dr. Cynthia Dwork.
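To give a flavor of the second approach, here is a minimal sketch of differential privacy’s core building block, the Laplace mechanism: a numeric query answer is perturbed with noise calibrated to the query’s sensitivity and a privacy parameter epsilon. This is illustrative only; real deployments involve much more, including privacy budgets, composition, and careful sensitivity analysis.

    import numpy as np

    def dp_count(true_count: int, epsilon: float) -> float:
        # A counting query has sensitivity 1: adding or removing one person
        # changes the answer by at most 1, so Laplace noise with scale
        # 1/epsilon hides any single individual's presence.
        return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

    # Example: release how many records in a medical database show the flu.
    print(dp_count(1_342, epsilon=0.1))   # noisier answer, stronger privacy
    print(dp_count(1_342, epsilon=1.0))   # less noise, weaker privacy guarantee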

For nonpublic databases, however, very strong de-identification techniques may not strike the right balance between data utility and privacy.  When nonpublic databases are protected by both technical and administrative controls, reasonable de-identification techniques, as opposed to very strong ones, may be appropriate.  In such settings, attackers do not have unlimited time to break the technical de-identification protections, third-party data is not readily available for linkage, and contractual measures provide legal commitments.  Data breaches can of course occur, but we need to recognize the very different status of protected versus unprotected data and appreciate the range of protections that can support a de-identification promise.

FPF staff are conducting research exploring the different risk profiles of nonpublic databases and publicly released databases, and the relevant best practices for “pretty good” de-identification of restricted databases.  Please contact us if you are interested.


Do Not Track Hearing Takeaways

Organized by Sen. Rockefeller (D-W. Virginia), who has repeatedly pushed for a “Do Not Track” law, yesterday’s Senate Commerce Committee hearing on Do Not Track (DNT) was billed as an opportunity for industry to provide senators with an update on how voluntary DNT standards were proceeding.  Joined by Senators Blumenthal, Heller, McCaskill, and Thune, Sen. Rockefeller engaged in a two-hour discussion that touched not only on the state of the online economy and behavioral advertising, but also on important consumer privacy concerns.  The hearing produced three key takeaways:

1)     Advertisers and Industry Must Be More Proactive

Advertising and industry groups need to be more proactive in encouraging the DNT process or risk the government imposing its own solution.  Sen. Rockefeller criticized industry for “deliberately dragging its feet” and “undermin[ing] the very essence of a meaningful Do-Not-Track standard.”

Part of the problem, as FPF’s Jules Polonetsky and Omer Tene have suggested previously, is that there remains wide debate surrounding the question of whether behavioral tracking is a net social good or an unnecessary evil.  Discussions surrounding the technical implementation of DNT “camouflage deep value judgments which have yet to be made,” the pair concludes.

This dilemma was on full display during the hearing.  Sen. Heller (R-Nevada) asked directly whether behavioral tracking was producing any sort of harm, and the panelists explained that this may be the most difficult question of all.  Determining whether tracking produces either quantitative or qualitative harm to consumer privacy is a huge challenge.  “Privacy is a highly subjective condition,” Adam Thierer of the Mercatus Center noted, explaining that behavior we find to be creepy may not be harmful in any real sense.

The Digital Advertising Alliance’s Lou Mastria suggested that the question should revolve around user choice, arguing that the DAA was already voluntarily providing a consumer opt-out mechanism largely in line with what Sen. Rockefeller has proposed.

Harvey Anderson, speaking for Mozilla, stated that the DNT debate has mistakenly focused on business revenue models that, he claimed, lack consumer transparency.  The solution he put forward was for Internet industries to emphasize developing and encouraging trust with consumers.

However, though the World Wide Web Consortium (W3C) provides the perfect forum to hash out technical standards, it is ill-positioned to make these types of privacy value judgments.  The inability of stakeholders to agree on which behaviors are good or bad may be hamstringing the process.

2)     Senators Are Skeptical of the W3C 

Perhaps as a result, senators appear skeptical of the ability of the World Wide Web Consortium (W3C) to adequately tackle the problem.  Acknowledging that Congress may be ill-equipped to handle complicated technical policy questions, Sen. McCaskill (D-Missouri) questioned whether a technical body such as the W3C was the proper forum to be making sweeping Internet policy decisions.  Justin Brookman noted that the W3C already includes all of the major players, and Harvey Anderson explained that the organization was better positioned than regulators or other entities to achieve a technically feasible agreement.

Sen. Rockefeller remained skeptical.  “The WC3, W3C, whatever it is, has no authority whatsoever,” he said, and none of its standards are legally enforceable.  Beyond that, he was worried about the group’s generally slow progress at developing a self-regulatory framework for DNT.

Thierer, a frequent critic of the process, defended the W3C, emphasizing that developing technical standards, let alone establishing Internet policy, is incredibly challenging work.

Peter Swire, a senior fellow at FPF and the co-chair of the W3C DNT standards process, wrote in advance of the hearing that failure to come to a negotiated standard threatens a “new digital arms race.”  Further, he warned that failure at the W3C would lead to a government-imposed solution, and if yesterday’s hearing was any indication, this is an avenue several senators want to explore.

3)     There Is Some Enthusiasm to Explore Legislative or Regulatory Solutions

Indeed, Sen. Rockefeller appears eager to pursue a legislative response.  He has reintroduced his Do-Not-Track Online Act, but it is worth noting that the bill currently has only one co-sponsor: Sen. Richard Blumenthal (D-Conn.).  Thus, it is unclear how successful Sen. Rockefeller’s effort will be.  For his part, Sen. Blumenthal, who also sits on the committee, was left wondering what sort of action might be required by either Congress or the FTC to spur the DNT process along.

“If voluntary agreements are not forthcoming, is it time for a law?” Sen. Blumenthal (D-Conn.) asked.

While the panelists did not directly address this question, the general sentiment was that stakeholders were on a path to finding a solution without congressional involvement.  Justin Brookman from the Center for Democracy & Technology noted that part of the problem remains that the United States simply lacks any sort of comprehensive privacy law to provide a baseline.  DNT receives much attention, but it is hardly “the worst thing out there,” he suggested.

Nonetheless, even as panelists pushed for more time, all eyes will be on the W3C’s next meeting among all the major stakeholders on May 6.

Peter Swire's Op-Ed on Do Not Track

FPF Senior Fellow and Ohio State University Moritz College of Law Professor Peter Swire wrote an op-ed today for Wired on “How To Prevent the ‘Do Not Track’ Arms Race.” The article highlights the challenges of implementation and the need for a multistakeholder negotiated Do Not Track standard.


Techworld: Our Internet Privacy is at risk – but not dead (yet)

With this year declared “The Year of Privacy on Steroids,” companies, policy makers, and privacy professionals alike agree that privacy is essential. But the real question is: where is the silver lining?

Future of Privacy Forum’s own Jules Polonetsky shared his professional expertise on the topic, specifically on companies’ tracking of users’ online behavior and their attempts at self-regulation.

To read the article, click here.