0 - 2014 - Future of Privacy Forum

Thoughts on the Data Innovation Pledge

Yesterday, as he accepted the IAPP Privacy Vanguard award, Intel’s David Hoffman made a “data innovation pledge” that he would work only to promote ethical and innovative uses of data. As someone who only relatively recently entered the privacy world by diving headfirst into the sea of challenges surrounding big data, I think an affirmative pledge of the sort David is proposing is a great idea.

While pledges can be accused of being mere rhetorical flourishes, words do matter. A simple pledge can communicate a good deal and engage the public in a way that drives conversation forward. Think of Google’s early motto, “don’t be evil.” For years this commitment fueled a large reservoir of trust that many have for the company. Every new product and service that Google releases is viewed through the lens of whether or not it’s “evil.” That places a high standard on the folks at Google, and for that, we should be pleased.

Of course, pledges present obligations and challenges. Using data only for good presents a host of new questions. As FPF explored in our whitepaper on benefit-risk analysis for Big Data, there are different aspects to consider when evaluating the benefits of data use, and some of these factors are largely subjective. Ethics is a broad field, and it also exposes the challenging philosophical underpinnings of privacy.

The very concept of privacy has always been a philosophical conundrum, but so much of the rise of the privacy profession has focused on compliance issues and the day-to-day reality of data protection. But today, we’re swimming in a sea of data, and all of this information makes us more and more transparent to governments, industry, and each other. It’s the perfect catalyst to consider what the value of “privacy” truly is. Privacy as an information on/off switch may be untenable, but privacy as a broader ethical code makes a lot of sense.

There are models to learn from. As David points out, other professions are bound by ethical codes, and much of that seeps into how we think about privacy. Doctors not only pledge to do no harm, but they also pledge to keep our confidences about our most embarrassing or serious health concerns. Questionable practices around innovation and data in the medical field led to review boards to protect patients and human test subjects and reaffirmed the role of every medical professional to do no evil.

Similar efforts are needed today as everything from our wristwatches to our cars is “datafied.” In particular, I think about all of the debates that have swirled around the use of technology in the classroom in recent years. A data innovation pledge could help reassure worried parents. If Monday’s FTC workshop is any indication, similar ethical conversations may even be needed for everyday marketing and advertising.

The fact is that there are a host of different data uses that could benefit from greater public confidence. A data innovation pledge is a good first step. There is no question that companies need to do more to show the public how they are promoting innovative and ethical uses of data. Getting that balance right is tough, but here’s to privacy professionals helping to lead that effort!

-Joseph Jerome, Policy Counsel

FTC Wants Tools to Increase Transparency and Trust in Big Data

However we want to define “Big Data” – and the FTC’s latest workshop on the subject suggests a consensus definition remains elusive – the path forward seems to call for more transparency and the establishment of firmer frameworks on the use of data. As Chairwoman Ramirez suggested in her opening remarks, Big Data calls for a serious conversation about “industry’s ethical obligations as stewards of information detailing nearly every facet of consumers’ lives.”

Part of that challenge is that some Big Data uses are often “discriminatory”. Highlighting findings from his paper on Big Data and discrimination, Solon Barocas began the workshop by noting that the whole point of data mining is to differentiate and to draw distinctions. In effect, Big Data is a rational form of discrimination, driven by apparent statistical relationships rather than any capriciousness. When humans introduce unintentional biases into the data, there is no ready solution at a technical or legal level. Barocas called for lawyers and public policy makers to have a conversation with the technologists and computer scientists working directly with data analytics – a sentiment echoed when panelists realized a predictive analytics conference was going on simultaneously across town.

But the key takeaway from the workshop wasn’t that Big Data could be used as a tool to exclude or include. Everyone in the civil rights community agreed that data could be a good thing, and a number of examples were put forth to suggest once more that data had the potential to be used for good or for ill. Pam Dixon of the World Privacy Forum noted that classifying individuals creates a “data paradox,” where the same data can be used to help or to harm that individual. For our part, FPF released a report alongside the Anti-Defamation League detailing Big Data’s ability to combat discrimination. Instead, the takeaway was a considerable desire to understand more about industry’s approach to big data. FTC staff repeatedly asked not just for more positive uses of big data by the private sector, but also inquired as to what degree of transparency would help policy makers understand Big Data decision-making.

FTC Chief Technologist Latanya Sweeney followed up her study that suggested web searches for African-American names were more likely than searches of white-sounding names to return ads suggesting the person had an arrest record by looking at credit card advertising and website demographics. Sweeney presented evidence that advertisements for harshly criticized credit cards were often directed to the homepage of Omega Psi Phi, a popular black fraternity.

danah boyd observed that there was a general lack of transparency about how Big Data is being used within industry, for a variety of complex reasons. FTC staff and Kristin Amerling from Senate Commerce singled out the opacity surrounding the practices of data brokers when describing some of the obstacles policy makers face when they try to understand how Big Data is being used.

Moreover, while consumers and policy makers are trying to grapple with what companies are doing with their streams of data, industry is also placed in the difficult position of making huge decisions about how that data can be used. For example, boyd cited the challenges JPMorgan Chase faces when using analytics to evaluate human trafficking. She applauded the positive work the company was doing, but noted that expecting it to have the ability or expertise to effectively intervene in trafficking perhaps asks too much. They don’t know when to intervene or whether to contact law enforcement or social services.

These questions are outside the scope of their expertise, but even general use of Big Data can prove challenging for companies. “A lot of the big names are trying their best, but they don’t always know what the best practices should be,” she concluded.

FTC Commissioner Brill explained that her support for a legislative approach to increase transparency and accountability among data brokers, their data sources, and their consumers, was to help consumers and policy makers “begin to understand how these profiles are being used in fact, and whether and under what circumstances they are harming vulnerable populations.” In the meantime, she encouraged industry to take more proactive steps. Specifically, she recommended again that data brokers explore how their clients are using their information and take steps to prevent any inappropriate uses and further inform the public. “Companies can begin this work now, and provide all of us with greater insight into – and greater assurances about – their models,” she concluded.

A number of legal regimes may already apply to Big Data, however. Laws that govern the provision of credit, housing, and employment will likely play a role in the Big Data ecosystem. Carol Miaskoff at the Equal Employment Opportunity Commission suggested there was real potential with Big Data to gather information about successful employees and use that to screen people for employment in a way that exacerbates prejudices built into the data. Emphasizing his recent white paper, Peter Swire suggested there were analogies to be made between sectoral regulation in privacy and sectoral legislation in anti-discrimination law. With existing laws in place, he argued that it was past time to “go do the research and see what those laws cover” in the context of Big Data.

“Data is the economic lubricant of the economy,” the Better Business Bureau’s C. Lee Peeler argued, and he supported the FTC’s continued efforts to explore the subject of Big Data. He cited earlier efforts by the Commission to examine inner-city marketing practices, which produced a number of best practices still valid today. He encouraged the FTC to look at what companies are doing with Big Data on a self-regulatory basis as a basis for developing workable solutions to potential problems.

So what is the path forward? Because Big Data is, in the words of Promontory’s Michael Spadea, a nascent industry, there is a very real need for guidelines on not just how to evaluate the risks and benefits of Big Data but also how to understand what is ethically appropriate for business. Chris Wolf highlighted FPF’s recent Data-Benefit Analysis and suggested companies were already engaged in detailed analysis of the use of Big Data, though everyone recognized that business practices and trade secrets precluded making much of this public.

FTC staff noted there was a “transparency hurdle” to get over in Big Data. Recognizing that “dumping tons of information” onto consumers would be unhelpful, staff picked up on Swire’s suggestion that industry needed some mechanism to justify what is going on to either regulators or self-regulatory bodies. Spadea argued that “the answer isn’t more transparency, but better transparency.” The Electronic Frontier Foundation’s Jeremy Gillula recognized the challenge companies face in revealing their “secret sauce,” but encouraged them to look at more ways to give consumers more general information about what was going on. Otherwise, he recommended, consumers ought to collect big data on big data and turn data analysis back on data brokers and industry at large through open-source efforts.

At the same time, Institutional Review Boards, which are used in human subject testing research, were again proposed as a model for how companies can begin affirmatively working through these problems. Citing a KPMG report, Chris Wolf insisted that strong governance regimes, including “a strong ethical code, along with process, training, people, and metrics,” were essential to confront the many ethical and philosophical challenges that flitted around the day’s discussions.

Jessica Rich, the Director of the FTC’s Bureau of Consumer Protection, cautioned that the FTC would be watching. In the meantime, industry is on notice. The need for clearer data governance frameworks is evident, and careful consideration of Big Data projects should be both reflexive and something every industry privacy professional talks about.

Other Relevant Reading from the Workshop:

iOS 8 and Privacy: Major New Privacy Features

iOS 8 includes several new privacy features founded on Apple’s core privacy principles of consent, choice and transparency. With these principles in mind, Apple created and incorporated increasingly granular controls for location, opportunities for developers to communicate to users how and why they use data, and limits on how third parties can track your device.

Users now have greater visibility regarding application access to location information.

In previous versions of iOS, apps could prompt users for permission to use Location Services, and, once a user gave an app access, the app could access the user’s location any time it was running, including when the app was not on screen (i.e., in the background). In iOS 8, Location Services has two modes: “While Using the App” – whereby the app can only access location when the app is on screen or made visible to a user by iOS making the status bar blue – or “Always.” Apps have to decide which Location Services mode to request and are encouraged by Apple to only request “Always” location permission when users would “thank them for doing so.” In fact, iOS 8 will at times present a reminder notice to users if an app that has “Always” location permission uses Location Services while the app is not on screen.

Users will be able to limit access to their contacts.

In iOS 8, users can use a picker, controlled and mediated by iOS, that allows users to share a specific contact with an app without giving the app access to their entire address book.

Apps will be able to link directly to privacy settings.

With iOS 8, apps will be able to link directly to their settings, including their privacy settings, making it easier for users to control their privacy. Before, apps could only give instructions on how to go to the phone’s settings to change the privacy controls. This new feature makes control over privacy settings more accessible to users.

Apple’s new Health app implements additional protections for users’ health data.

Apple’s new Health app and HealthKit APIs give third party health and fitness apps a secure location to store their data and give users an easy-to-read dashboard for their health and fitness data. Apple has implemented a number of features and safeguards to protect user privacy. First, a user has full control as to which apps can input data into Health and which apps can access Health data. Second, all Health data on the iOS device is encrypted with keys protected by a user’s passcode. Finally, developers are required to obtain express user consent before sharing Health data with third parties, and even then they may only do so for the limited purpose of providing health or fitness services to the user. These features and restrictions allow users to have control over their HealthKit data.

Apple requires apps accessing sensitive data to have a privacy policy disclosing their practices to users.

Apple requires apps that utilize the HealthKit or HomeKit APIs, offer third party keyboards, or target kids, to have a privacy policy, supporting industry standards and California law. App privacy policies should include what data is collected, what the app plans to do with that data, and if the app plans to share it with any third parties, who they are. Users will be able to see the privacy policy on the App Store before and after downloading an app.

iOS 8 places additional emphasis on disclosure of why developers want access to data.

Apple strongly encourages developers to explain why their apps request a user’s data or location when a user is prompted to give an app access. Developers can do so in “purpose strings,” which are part of the notice that appears when an app first tries to access a protected data class.

Apple’s iOS encourages a “just in time” model, where users should be prompted for access after they take an action in an app that requires the data. The “just in time” prompt and access flow is mediated by iOS and replaces consent models such as those consisting of strings of permissions that pop up after installation like a conga line or users having to give an app access to all data if they want to use an app. Moreover, Apple continues its practice of encouraging app developers to only ask for access to data when needed, and to gracefully handle not getting permission to access a user’s data.

MAC address randomization makes it more difficult to track and individualize iOS devices.

Wi-Fi enabled devices generally scan for available wireless networks. These scans include the device’s Media Access Control (MAC) address, which is a 12-character string of letters and numbers, required by networking standards to identify a device on a network and assigned by the manufacturer. Mobile location analytics companies have, at times, relied on these scans, and the fact that Wi-Fi devices’ MAC addresses do not change, to track individual mobile devices as they move around a venue.

In iOS 8, Apple devices will generate and use random MAC addresses to passively scan for networks, shielding users’ true MAC addresses until a user decides to associate with a specific network. Randomizing MAC addresses makes this kind of tracking much more difficult. However, your device can still be tracked when you are connected to a Wi-Fi network or using Bluetooth. FPF’s Mobile Location Analytics Code of Conduct governs the practices of the leading location analytics companies and provides an opt-out from mobile location tracking. Visit Smart-Places for more details or to opt out.
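As a rough illustration of what a randomized scanning address can look like, the sketch below generates a random MAC address in Python. It is not Apple’s implementation, which is not described here; the function name random_scan_mac is hypothetical. The only convention it relies on is the standard one that a set “locally administered” bit in the first octet marks an address as not assigned by a manufacturer, while a cleared multicast bit keeps it a single-device address.

```python
import secrets


def random_scan_mac() -> str:
    """Return a random, locally administered, unicast MAC address.

    Hypothetical helper for illustration only; real devices decide
    when and how often to rotate such addresses.
    """
    octets = bytearray(secrets.token_bytes(6))
    # Set the locally administered bit (0x02) and clear the
    # multicast bit (0x01) in the first octet.
    octets[0] = (octets[0] | 0x02) & 0xFE
    return ":".join(f"{b:02x}" for b in octets)


if __name__ == "__main__":
    # Each call yields a fresh address, so successive scans
    # cannot be trivially linked to one device.
    print(random_scan_mac())
    print(random_scan_mac())
```

Because each scan can carry a different address, an observer that keys on the MAC address alone sees what look like many unrelated devices.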

Summary

iOS 8’s new “prompting with purpose” disclosures, refined location settings, strict requirements for HealthKit, HomeKit, and kids apps, and MAC address randomization will provide greater transparency, protection, and control over privacy for iOS users.

Lessons from Fair Lending Law for Fair Marketing and Big Data

Where discrimination presents a real threat, big data need not necessarily lead us to a new frontier. Existing laws, including the Equal Credit Opportunity Act and other fair lending laws, provide a number of protections that are relevant when big data is used for online marketing related to lending, housing, and employment. In comments to be presented at the FTC public workshop, Professor Peter Swire will discuss his work in progress entitled Lessons from Fair Lending Law for Fair Marketing and Big Data. Swire explains that fair lending laws already provide guidance as to how to approach discrimination that allegedly has an illegitimate, disparate impact on protected classes. Data actually plays an important role in being able to assess whether a disparate impact exists! Once a disparate impact is shown, the burden shifts to creditors to show their actions have a legitimate business need and that no less discriminatory alternative exists. Fair lending enforcement has encouraged the development of rigorous compliance mechanisms, self-testing procedures, and a range of proactive measures by creditors.

Big Data: A Tool for Fighting Discrimination and Empowering Groups

Even as big data uses are examined for evidence of facilitating unfair and unlawful discrimination, data can help to fight discrimination. It is already being used in myriad ways to protect and to empower vulnerable groups in society. In partnership with the Anti-Defamation League, FPF prepared a report that looked at how businesses, governments, and civil society organizations are leveraging data to provide access to job markets, to uncover discriminatory practices, and to develop new tools to improve education and provide public assistance.

Big Data: A Tool for Fighting Discrimination and Empowering Groups explains that although big data can introduce hidden biases into information, it can also help dispel existing biases that impair access to good jobs, good education, and opportunity.

Big Data: A Benefit and Risk Analysis

On September 11, 2014, FPF released a whitepaper we hope will help to frame the big data conversation moving forward and promote better understanding of how big data can shape our lives. Big Data: A Benefit and Risk Analysis provides a practical guide for how benefits can be assessed in the future, but it also shows how data is already being used in the present.

Privacy professionals have become experts at evaluating risk, but moving forward with big data will require rigorous analysis of project benefits to go along with traditional privacy risk assessments. We believe companies or researchers need tools that can help evaluate the cases for the benefits of significant new data uses. Big Data: A Benefit and Risk Analysis is intended to help companies assess the “raw value” of new uses of big data. Particularly as data projects involve the use of health information or location data, more detailed benefit analyses that clearly identify the beneficiaries of a data project, its size and scope, and that take into account the probability of success and evolving community standards are needed. We hope this guide will be a helpful tool to ensure that projects go through a process of careful consideration.

Identifying both benefits and risks is a concept grounded in existing law. For example, the Federal Trade Commission weighs the benefits to consumers when evaluating whether business practices are unfair or not. Similarly, the European Article 29 Data Protection Working Party has applied a balancing test to evaluate legitimacy of data processing under the European Data Protection Directive. Big data promises to be a challenging balancing act.

Click here to read the full document: Big Data: A Benefit and Risk Analysis.

Data Protection Law Errors in Google Spain SL, Google Inc. v. Agencia Espanola de Proteccion de Datos, Mario Costeja Gonzalez

The following is a guest post by Scott D. Goss, Senior Privacy Counsel, Qualcomm Incorporated, addressing the recent “Right to be Forgotten” decision by the European Court of Justice.

There has been quite a bit of discussion surrounding the European Court of Justice’s judgment in Google Spain SL, Google Inc. v. Agencia Espanola de Proteccion de Datos (AEPD), Mario Costeja Gonzalez. In particular, some interesting perspectives have been shared by Daniel Solove, Ann Cavoukian and Christopher Wolf, and Martin Husovec. The ruling has been so controversial that the newly appointed EU Justice Commissioner, Martine Reicherts, delivered a speech defending it. I’d like to add to the discussion.[1] Rather than focusing on the decision’s policy implications or on the practicalities of implementing the Court’s ruling, I’d like to instead offer thoughts on a few points of data protection law.

To start, I don’t think “right to be forgotten” is an apt description of the decision; it instead distorts the discussion. Even if Google were to follow the Court’s ruling to the letter, the information doesn’t cease to exist on the Internet. Rather, the implementation of the Court’s ruling just makes internet content linked to people’s names harder to find. The ruling, therefore, could be thought of as “the right to hide”. Alternatively, the decision could be described as “the right to force search engines to inaccurately generate results.” I recognize that such a description doesn’t roll off the tongue quite so simply, but I’ll explain why that description is appropriate below.

I believe the Court made a few important legal errors that should be of interest to all businesses that process personal data. First was the Court’s determination that Google was a “controller” as defined under EU data protection law and second was the application of the information relevance question. Then, I’ll explain why “the right to force search engines to inaccurately generate results” may be a more appropriate description of the Court’s ruling.

1. “Controller” status must be determined from the activity giving rise to the complaint

To understand how the Court erred in determining that Google is a “controller” in this case, it helps to understand how search engines work. At a conceptual level, search is comprised of three primary data processing activities: (i) caching all the available content, (ii) indexing the content, and (iii) ranking the content. During the initial caching phase, a search engine’s robot minions scour the Internet noting all the content on the Internet and its location. The cache can be copies of all or parts of the web pages on the Internet. The cache is then indexed to enable much faster searching by sorting the content. Indexing is important because without it searching would take immense computing power and significant time for each page of the Internet to be examined for users’ search queries. Finally, the content within the index is ranked for relevance.
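The three activities can be made concrete with a toy sketch in Python. It is only an illustration under simplifying assumptions, not a description of how Google or any real search engine works: the cache is a hypothetical dictionary of three made-up pages, the index maps each word to the pages containing it, and the ranking is a deliberately naive count of how often the word appears.

```python
from collections import defaultdict

# Hypothetical "cache": URL -> page text copied by a crawler.
cache = {
    "http://example.com/a": "notice of auction published in newspaper",
    "http://example.com/b": "newspaper recipe archive",
    "http://example.com/c": "auction auction results",
}


def build_index(cache):
    """Indexing: map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in cache.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, cache, query):
    """Look up one word in the index, then rank the hits.

    Ranking here is a naive term-frequency score, with ties
    broken alphabetically by URL.
    """
    word = query.lower()
    hits = index.get(word, set())
    return sorted(
        hits,
        key=lambda url: (-cache[url].lower().split().count(word), url),
    )


index = build_index(cache)
print(search(index, cache, "auction"))
# -> ['http://example.com/c', 'http://example.com/a']
```

In this sketch, build_index makes no choices about the content: it records whatever words the cached pages contain. The only judgment sits in the sort key inside search, which is the part that corresponds to ranking.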

From a data processing perspective, I believe that caching and indexing achieve two objective goals: determining the available content of the internet and where it can be found. Tellingly, the only time web pages are not cached and indexed is when website publishers, not search engines, include a special code on their web pages instructing search engines to ignore the page. This special code is called robots.txt.
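To show how a crawler can heed such an instruction, here is a minimal sketch using the robots.txt parser in Python’s standard library. In practice robots.txt is a plain-text file that a publisher serves from the root of a website. The rules, the URLs, the crawler name ExampleBot, and the helper may_index below are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content a publisher might serve.
robots_txt = """\
User-agent: *
Disallow: /archive/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())


def may_index(url: str, user_agent: str = "ExampleBot") -> bool:
    """Return True if the publisher's rules allow this crawler to fetch the URL."""
    return parser.can_fetch(user_agent, url)


print(may_index("http://example.com/news/today.html"))           # True
print(may_index("http://example.com/archive/1998/notice.html"))  # False
```

The decision about what is excluded rests entirely in the text the publisher wrote; the crawler only reads and applies it.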

The web pages that are cached and indexed could be the text of the Gettysburg Address, the biography of Dr. Martin Luther King, Jr., the secret recipe for Coca-Cola, or newspaper articles that include people’s names. It is simply a fact that the letters comprising the name “Mario Costeja Gonzalez” could be found on certain web pages. Search engines cannot control that fact any more than they could take a picture of the sky and be said to control the clouds in the picture.

After creating the cache and index, the next step involves ranking the content. Search engine companies employ legions of the world’s best minds and immense resources to determine rank order. Such ranking is subjective and takes judgment. Arguably, ranking search results could be considered a “controller” activity, but the ranking of search results was never at issue in the Costeja Gonzalez case. This is a key point underlying the Court’s errors. Mr. Costeja Gonzalez’s complaint was not that Google ranked search results about him too high (i.e., Google’s search result ranking activity), but rather that the search engine indexed the information at all. The appropriate question, therefore, is whether Google is the “controller” of the index. The question of whether Google’s process of ranking search results confers “controller” status on Google is irrelevant. The Court’s error was to conflate Google’s activity of ranking search results with its caching and indexing of the Internet.

Some may defend the Court by arguing that controller status for some activities automatically confers controller status on all activities. This would be error. The Article 29 Working Party opined,

[T]he role of processor does not stem from the nature of an entity processing data but from its concrete activities in a specific context. In other words, the same entity may act at the same time as a controller for certain processing operations and as a processor for others, and the qualification as controller or processor has to be assessed with regard to specific sets of data or operations.

Opinion 1/2010 on the concepts of “controller” and “processor”, page 25, emphasis added. In this case, Mr. Costeja Gonzalez’s complaint focused on the presence of certain articles about him in the index. Therefore, the “concrete activities in a specific context” is the act of creating the index and the “specific sets of data” is the index itself. The Article 29 WP went on to give an example of an entity acting as both a controller and a processor of the same data set:

An ISP providing hosting services is in principle a processor for the personal data published online by its customers, who use this ISP for their website hosting and maintenance. If however, the ISP further processes for its own purposes the data contained on the websites then it is the data controller with regard to that specific processing.

I submit that creation of the index is analogous to an ISP hosting service. In creating an index, search engines create a copy of everything on the Internet, sort it, and identify its location. These are objective, computational exercises, not activities where the personal data is noted as such and treated with some separate set of processing. Following the Article 29 Working Party opinion, search engines could be considered processors in the caching and indexing of Internet content because such activities are mere objective and computational exercises, but controllers in the ranking of the content due to the subjective and independent analysis involved.

Further, as argued in the Opinion of Advocate General Jaaskinen, a controller needs to recognize that they are processing personal data and have some intention to process it as personal data. (See paragraph 82). It is the web publishers who decide what content goes into the index. Not only do they have discretion in deciding to publish the content on the Internet in the first instance, but they also have the ability to add the robots.txt code to their web pages which directs search engines to not cache and index. The mark of a controller is one who “determines the purposes and means of the processing of personal data.” (Art. 2, Dir. 95/46 EC). In creation of the index, rather than “determining”, search engines are identifying the activities of others (website publishers) and heeding their instructions (use or non-use of robots.txt). I believe such processing cannot, as a matter of law, rise to the level of “controller” activities.

Finding Google to be a “controller” may have been correct if either the facts or the complaint had been different. Had Mr. Costeja Gonzalez produced evidence that (i) the web pages he wanted removed contained the “robots.txt” instruction, or (ii) the particular web pages were removed from the Internet by the publisher but not by Google in its search results, then it may be appropriate to hold Google to be a “controller” due to these independent activities. Such facts would be similar to the example given by the Article 29 Working Party of an ISP’s independent use of personal data maintained by its web hosting customers. Similarly, had Mr. Costeja Gonzalez’s complaint been that search results regarding his prior bankruptcy were ranked too high, then I could understand (albeit I may still disagree) that Google would be found to be a controller. But that was not his complaint. His complaint was that certain information was included in the index at all – and over that, I believe, Google should be deemed to have no more control than it has over the content of the Internet itself.

2. “Relevance” of Personal Data must be evaluated in light of the purpose of the processing.

The Court’s second error arose in the application of the controller’s obligations. After finding that Google is the controller of the index, the Court incorrectly applied the relevancy question. To be processed legitimately, personal data must be “relevant and not excessive in relation to the purposes for which they are collected and/or further processed.” Directive 95/46 EC, Article 6(c) (emphasis added). Relevancy is thus a question in relation to the purpose of the controller – not as to the data subject, a customer, or anyone else. The purpose of the index, in Google’s own words, is to “organize the world’s information and make it universally accessible and useful.” (https://www.google.com/about/company/). With that purpose in mind, all information on the Internet is, by definition, relevant. While clearly there are legal boundaries to the information that Google can make available, the issue is whether privacy law contains one of those boundaries. I suggest that in the context of caching and creating an index of the Internet, it does not.

The court found that Google legitimizes its data processing under the legitimate interest test of Article 7(f) of the Directive. Google’s legitimate interests must be balanced against the data subjects’ fundamental rights under Article 1(1). Since Article 1(1) provides no guidance as to what those rights are (other than “fundamental”), the Court looks to subparagraph (a) of the first paragraph of Article 14. That provision provides data subjects with a right to object to data processing of their personal data, but offers little guidance as to when controllers must oblige. Specifically, it provides in cases of legitimate interest processing, a data subject may,

“object at any time on compelling legitimate grounds relating to his particular situation to the processing of data relating to him, save where otherwise provided by national legislation. Where there is a justified objection, the processing instigated by the controller may no longer involve those data.”

What are those “compelling legitimate grounds” for a “justified objection”? The Court relies on Article 12(b), which provides for “the rectification, erasure, or blocking of data the processing of which does not comply with the provisions of this Directive, in particular because of the incomplete or inaccurate nature of the data”. It is here that the Court erred.

The Court took the phrase “incomplete or inaccurate nature of the data” and erroneously applied it to the interests of the data subject. Specifically, the Court held that the question is whether the search results were “incomplete or inaccurate” representations of the data subject as he/she exists today. I submit that was not the intent of Article 12(b). Rather, Article 12(b) was referring back to the same use of that phrase in Article 6 providing that:

“personal data must be: . . . (d) accurate and, where necessary, kept up to date; every reasonable step must be taken to ensure that data which are inaccurate or incomplete, having regard to the purposes for which they were collected or for which they are further processed, are erased or rectified.”

The question is not whether the search results are “incomplete or inaccurate” representations of Mr. Costeja Gonzalez, but whether the search results are inaccurate as to the purpose of the processing. The purpose of the processing is to copy, sort, and organize the information on the Internet. In this case, queries for the characters “Mario Costeja Gonzalez” displayed articles that he admits were actually published on the Internet. Such results, therefore, are by definition not incomplete or inaccurate as to the purpose of the data processing activity. To put it simply, the Court applied the relevancy test to the wrong party (Mr. Costeja Gonzalez) as opposed to Google and the purpose of its index.

To explain by analogy, examine the same legal tests applied to a credit reporting agency. A credit reporting analogy is helpful because it also has at least three parties involved in the transaction. In the case of the search engine, those parties are the search engine, the data subject, and the end user conducting the search. In the case of credit reporting, the three parties involved are the credit ratings businesses, the consumers who are rated (i.e., the data subjects), and the lenders and other institutions that purchase the reports. It is well-established law that consumers can object to information used by credit ratings businesses as being outdated, irrelevant, or inaccurate. The rationale for this right is found in Article 12(b) and Article 6 in relation to the purpose of the credit reporting processing activity.

The purpose of credit reporting is to provide lenders an opinion on the creditworthiness of the data subject. Credit ratings businesses must take care that the information they use is not “inaccurate or incomplete” or they jeopardize the purpose of their data processing by generating an erroneous credit score. For example, if a credit reporting agency collected information about consumers’ height or weight, consumers would be able to legitimately object. Consumers’ objections would not be founded on the fact that the information is not representative of who they are – indeed such data may be completely accurate and current. Instead, consumers’ objections would be founded on the fact that height or weight are not relevant for the purpose of assessing consumers’ creditworthiness.

Returning to the Costeja Gonzalez case, the issue was whether the index (not the ranking of such results) should include particular web pages containing the name of Mr. Costeja Gonzalez. Since the Court previously determined that Google was the “controller” of the index (which I contend was error), the Court should have determined Google’s purpose for the index and then asked whether the contested web pages were incorrect, inadequate, irrelevant or excessive as to Google’s purpose. As discussed above, Google’s professed goal is to enable the discovery of the world’s information and to that end the purpose of the index is to, as much as technologically possible, catalog the entire Internet – all the good, bad, and ugly. For that purpose, any content on the Internet about the key words “Mario Costeja Gonzalez” is, by definition, not incorrect, inadequate, irrelevant or excessive because the goal is to index everything. Instead, however, the Court erred by asking whether the web pages were incorrect, inadequate, irrelevant or excessive as to Mr. Costeja Gonzalez, the data subject. Applying the relevancy question as to Mr. Costeja Gonzalez is, well, not relevant.

Some may argue that the Court recognized the purpose of the processing when making the relevancy determination by finding that Mr. Costeja Gonzalez’s rights must be balanced against the public’s right to know. By including the public’s interest in the relevancy evaluation, some may argue, the Court has appropriately directed the relevancy inquiry to the right parties. I disagree. First, I do not believe it was appropriate to inquire as to the relevancy of the links vis-a-vis Mr. Costeja Gonzalez in the first instance and, therefore, to balance it against other interests (in this case the public) does not cure the error. Secondly, to weigh the interests of the public, one must presume that the purpose of searching for individuals’ names is to obtain correct, relevant, not inadequate and not excessive information. I do not believe such presumptions are well-founded. For example, someone searching for “Scott Goss” may be searching for all current, relevant, and non-excessive information about me. On the other hand, fifteen years from now perhaps someone is searching for all privacy articles written in 2014 and they happened to know that I wrote one and so searched using my name. One cannot presume to know the purpose of an individual’s search query other than a desire to have access to all the information on the Internet containing the query term.

If not the search engines, where would it be pertinent to ask the question of whether information on the Internet about Mr. Costeja Gonzalez was incorrect, inadequate, irrelevant or excessive? The answer to this question is clear: it is the entities that have undertaken the purpose of publishing information about Mr. Costeja Gonzalez. Specifically, website publishers process personal data for the purpose of informing their readers about those individuals. The website publishers, therefore, have the burden to ensure that such information is not incorrect, inadequate, irrelevant or excessive as to Mr. Costeja Gonzalez. That there may be an exception in data protection law for web publishers does not mean that courts should be free to foist obligations onto search engines.

3. The right to force search engines to inaccurately generate results

Finally, the “right to force search engines to inaccurately generate results” is, I believe, an apt description of the ruling. A search engine’s cache and index are supposed to contain all of the web’s information that web publishers want the world to know. Users expect that search engines will identify all information responsive to their queries when they search. Users further expect that search engines will rank all the results based upon their determination of the relevancy of the results in relation to the query. The Court’s ruling forces search engines to generate an incomplete list of search results by gathering all information relevant to the search and then pretending that certain information doesn’t exist on the Internet at all. The offending content is still on the Internet; people just cannot rely on finding the content using individuals’ names entered into search engines (at least the search engines on European country-coded domains).

[1] These thoughts are my own and not those of the company for which I work, and I do not profess to be an expert in search technologies or the arguments made by the parties in the case.

FERPA | SHERPA: Providing a Guide to Education Privacy Issues

Education is changing. New technologies are allowing information to flow within classrooms, schools, and beyond, enabling new learning environments and new tools to understand and improve the way teachers teach and students learn. At the same time, however, the confluence of enhanced data collection with highly sensitive information about children and teens also makes for a combustible mix from a privacy perspective. Even the White House recognizes this challenge! Its recent Big Data Review specifically highlighted the need for responsible innovation in education.

There are many organizations – many of which we’ve partnered with – working tirelessly to address privacy issues in education and provide the best experience for students. So too is the Department of Education. Yet these resources are scattered. The need for an education privacy resource clearinghouse is clear. With “back to school” now in full swing, we thought it a great time to launch FERPA|SHERPA. The site – named after the core federal law that governs education privacy – aims to provide a one-stop shop for education privacy-related offerings of interest to parents and schools, as well as education service providers and the policymakers struggling to grapple with the legal landscape.

Everyone in the educational ecosystem has a role to play here, lest legitimate privacy concerns combine with other worries to overwhelm the benefits of education technologies and the expanded use of student data. One need only look at the recent collapse of inBloom – a new technology platform that school systems were clamoring for until a combination of poor communication and privacy fears came to dominate any and all conversations about the underlying technologies – as an example of the need for schools and the companies they partner with to better address education privacy issues.

To ensure parents have a voice in the ongoing privacy debate, the site will also host a blog written by parent privacy advocate Olga Garcia-Kaplan, a Brooklyn, NY public school parent of three children.

We’re also releasing an education privacy whitepaper by Jules Polonetsky, our executive director, and Omer Tene, Vice President, Research & Education, IAPP, that analyzes the opportunities and challenges of data-driven education technologies and how key stakeholders should address them. The piece – “The Ethics of Student Privacy: Building Trust for Ed Tech” – was recently published in a special issue of the International Review of Information Ethics, “The Digital Future of Education.”

We hope FERPA | SHERPA will help get everyone on the same page when it comes to privacy issues around student data. We would love your feedback and thoughts on the new site, and we look forward to helping to jump start conversations about education privacy in the new school year. If we’ve missed something or you’d like to join our effort, please reach out to [email protected].

Future of Privacy Forum Launches One-Stop Shop Website for Student Privacy

FPF Urges Parents, Teachers and Policymakers to Follow the “ABC’s” of Education Privacy & Make “D” for Data Protection

WASHINGTON, D.C. – August 21, 2014 – As schools increasingly rely on data to improve education, and as teachers increasingly rely on technology in the classroom to improve the learning experience, privacy concerns are being raised about the collection and use of student data. With ‘back to school’ now in full-swing, and to address both the promise and challenges surrounding privacy and data in education, the Future of Privacy Forum (FPF) today unveiled a first-of-its-kind, one-stop shop resource website providing parents, school officials, policymakers, and service providers easy access to the laws, standards and guidelines that are essential to understanding student privacy issues and navigating a responsible path to managing student data with trust, integrity, and transparency.

More than at any other time in the evolution of education, data-driven innovations and use of emerging technologies – such as online textbooks, apps, tablets and mobile devices, and internet-based learning – are bringing advances and critical improvements in teaching and learning, with profound implications.

At the same time, the increased use of vendors and data is matched by the need for heightened responsibility to manage and safeguard student data and implement policies that benefit education and minimize risk. Concerns are being raised about how student data is collected and used in a next-stage learning ecosystem buzzing with social media, mobile devices, central databases, student records, Big Data, and an array of vendors and software.

The new, resource-rich FERPA|SHERPA website – named after the core federal law that governs education privacy – seeks to address these opportunities and concerns. The unique site hosts a comprehensive, digital dashboard of quality education privacy-related offerings for four distinct audiences: parents, service providers, schools, and policymakers.

To ensure parents have a voice in the ongoing privacy debate, the site will also host a blog written by parent privacy advocate Olga Garcia-Kaplan, a Brooklyn, NY public school parent of three children.

Some of the assets available at FERPA|SHERPA include:

Vendor quick tips – for app and software developers
Overview and explanation of relevant federal laws and policies on student data – such as FERPA, COPPA, PPRA, ESRA and CIPA
Policy papers about education privacy
Clearinghouse of education websites and resources for parents and school administrators
Topical blog that brings a parent’s perspective to the many facets of privacy issues related to learning and education
Ongoing expertise provided by FPF staff, partners, and other stakeholders to help shape and guide understanding of data privacy and responsible use

“Getting privacy right in student education requires a partnership of trust between families, teachers and schools, technology companies and education officials,” said Jules Polonetsky, executive director, FPF. “Any weak link in this chain of responsibility could undermine education and risk student data. With FERPA|SHERPA, we are making sure that the laws and best practices are easy to find.”

“Since our creation, Edmodo has been focused on safeguarding user privacy, and we’re excited to partner with FPF on this effort to provide schools, teachers, and parents with great resources about student privacy issues,” said Aden Fine, chief privacy officer of Edmodo. “Education is critical to addressing questions about privacy, and we think the FERPA|SHERPA website will really help the public better understand these complicated issues.”

“Parents have to sort through a tremendous amount of information issued about student data privacy to learn how and why data compiled pertaining to their children may be used. As a parent, the FERPA|SHERPA site is an invaluable resource for obtaining timely, accurate and impartial information necessary to understand this evolving landscape,” said Olga Garcia-Kaplan, parent and advocate for student data privacy.

“Educational leaders, service providers, parents and policy makers increasingly need accurate and reliable information on privacy issues. For too long, it has been a real challenge to find that information. The Future of Privacy Forum’s new FERPA|SHERPA is a great starting place to find what you need,” said Keith Krueger, CEO, Consortium for School Networking.

“Protecting student data and privacy involves navigating myriad regulations, policies, and practices,” said Marsali Hancock, CEO & President, iKeepSafe. “iKeepSafe has worked with schools, parents, students, and industry to promote safe and effective use of technology, and we are thrilled that FERPA|SHERPA is providing these stakeholders with additional resources on important laws and best practices to protect student data.”

The FERPA|SHERPA website initiative – which began in the fall of 2013 – is the first of many FPF offerings on education privacy. That work began as the FPF invested its privacy expertise and staff talent in education issues and subsequently developed a comprehensive education privacy campaign with wide stakeholder engagement – including parents, teachers, school administrators, trade associations, and leading education and technology companies in the private sector.

In addition, the FPF today released an education privacy whitepaper that has been published in a special issue of the International Review of Information Ethics, “The Digital Future of Education.” The piece – “The Ethics of Student Privacy: Building Trust for Ed Tech” – is authored by Polonetsky and Omer Tene, Vice President, Research & Education, IAPP, and analyzes the opportunities and challenges of data-driven education technologies and how key stakeholders should address them.

About the Future of Privacy Forum

The Future of Privacy Forum (FPF) is a Washington, DC-based think tank that seeks to advance responsible data practices. The forum is led by internet privacy experts Jules Polonetsky and Christopher Wolf and includes an advisory board comprised of leading figures from industry, academia, law and advocacy groups. Visit fpf.org.

Media Contact:

Nicholas Graham, for Future of Privacy Forum

571-291-2967

[email protected]

FPF Statement on Today's Safe Harbor Complaint

Today, the Center for Digital Democracy filed a complaint with the Federal Trade Commission, alleging that companies are violating the U.S.-EU Safe Harbor agreement. CDD’s filing came with a report criticizing the practices of thirty companies.

“We are carefully reviewing the report’s claims, but the dozen we have examined so far seem to reflect the authors’ distaste for marketing, rather than legal safe harbor violations,” said Jules Polonetsky, Executive Director, Future of Privacy Forum.

The Future of Privacy Forum has long focused on the value of the Safe Harbor agreement, and issued a comprehensive report on the framework last fall.