Privacy Chutzpah: A Story for the Onion?

I recently received an email promoting a campaign by a group called Some of Us, an organization that generates petitions opposing various activities of large companies. This campaign was directed at Facebook, calling on the social network not to sell user data to advertisers. Facebook recently announced plans to allow advertisers to target ads to Facebook users based on the web sites those users have visited. Facebook is not selling user data to advertisers, but I can understand the confusion. Behavioral advertising is complicated, and although selling user data to advertisers is very different from choosing ads for users based on their web surfing, it’s not uncommon for critics to use broad language to blast targeted ads in general.


The surprise was what I found when I examined the privacy policy of the Some of Us site. In a move worthy of an Onion fake news story, the Some of Us policy discloses that it works with ad networks to retarget ads to users across the web after they visit the Some of Us site. Yup! Some of Us does exactly what it is calling on users to protest to Facebook. A quick scan of the site using the popular tracking-cookie scanner Ghostery finds the code of several ad companies, including leading data broker Acxiom.

Some of Us also complains that the Facebook opt-out process, in which Facebook links users to the industry’s central opt-out site at aboutads.info, is too tedious. But Some of Us doesn’t even bother to provide its visitors with a link or a URL to opt out, as the behavioral advertising code enforced by the Better Business Bureau requires. Some of Us just tells visitors they can visit the Network Advertising Initiative opt-out page, leaving them to figure out how to find that page on their own.

It gets better. Some of Us solicits users’ emails and names for petitions, but only if you read the site privacy policy will you learn that signing a petition adds you to the mailing list for future emails from Some of Us about other causes. The privacy policy also explains the use of email web bugs that enable Some of Us to track whether and when individual recipients open and read its emails.

I am used to reading stories in the media blasting behavioral ads on the home pages of newspapers embedded with dozens of web trackers. Reporters don’t run the web sites of newspapers, and although they might want to consider whether the ad tracking they find odious is funding their salaries, they can credibly argue that the business side of media and reporting are separate worlds. But how can an advocacy group blast behavioral ads while targeting behavioral ads at users who come to sign a petition against behavioral ads?!

I signed the petition and was immediately taken to a page where Some of Us encouraged me to share the news with my friends on Facebook.

-Jules Polonetsky, Executive Director

This post originally appeared on LinkedIn

Mexico Takes Step Toward Data Privacy Interoperability

Last week, Mexico’s Federal Institute for Access to Information and Data Protection (IFAI) hosted an event in Mexico City to discuss the recently announced “Parameters of Self-Regulation for the Protection of Personal Data.”  FPF participated in this workshop along with representatives from the Mexican government, TRUSTe, EuroPriSe and the Better Business Bureau.

As described in opening remarks by the Secretary for Data Protection, under the new regulation IFAI now has the authority to recognize codes of conduct for data protection and has developed a process through which an organization can be recognized as a certifying body for these codes.  The Mexican Accreditation Agency will evaluate applicant organizations against a set of recognition criteria.  Successful applicants will then receive formal recognition as certifying entities from the Ministry of the Economy.

This approach mirrors the process developed as part of the Asia-Pacific Economic Cooperation (APEC) Cross Border Privacy Rules (CBPR) system in several key ways.  First, the certifying organizations contemplated under this approach serve the same function as “Accountability Agents” under the CBPR system.  Second, both approaches require formal recognition based on established criteria.  And third, the standards to which these organizations will be certifying companies are both keyed to Mexico’s Federal Law on the Protection of Personal Information (the legal basis for Mexico’s participation in the CBPR system).  Given these parallels in both process and substance, a company that receives CBPR certification in Mexico should also be able to attain recognition under this approach.  But perhaps most importantly, CBPR certification should allow a company to avail itself of the incentives offered under Mexican law.

Article 68 of the implementing regulations of the privacy law encourages the development of self-regulatory frameworks and states that participation in a recognized framework (such as the CBPR system) will be taken into account in order to determine any reduction in sanctions determined by IFAI in the event of a violation of the privacy law.

What makes this development so critical to global interoperability is that it serves as a model for other APEC member economies to consider how an enforceable code of conduct based on an international standard can be successfully incorporated into a legal regime – including extending express benefits to certified companies.  It remains to be seen how other APEC economies  will manage this task – but Mexico’s approach offers a promising start.

-Josh Harris, Policy Director 

"Gambling? In This Casino?" Jules and Omer on the Facebook Experiment

Today, Re/code ran an essay by Jules Polonetsky and Omer Tene, offering their take on Facebook’s now-infamous experiment looking at the effects of tweaking the amount of positive or negative content in users’ News Feeds:

As the companies that serve us play an increasingly intimate role in our lives, understanding how they shape their services to influence users has become a vexing policy issue. Data can be used for control and discrimination or utilized to support fairness and freedom. Establishing a process for ethical decision-making is key to ensuring that the benefits of data exceed their costs.

FPFcast: Stalking and the Location Privacy Protection Act with Cindy Southworth

June 30, 2014: Stalking and the Location Privacy Protection Act

[audio player]

In this podcast, FPF Policy Counsel Joseph Jerome talks with Cindy Southworth from the National Network to End Domestic Violence about stalking apps and how Senator Franken’s proposed bill might curtail their use.

Click on the media player above to listen, or download the complete podcast here.

Synopsis: Education Privacy Hearing—How Data Mining Threatens Student Privacy

Yesterday, the House of Representatives’ Education Subcommittee on Early Childhood, Elementary, and Secondary Education and the Homeland Security Committee’s Subcommittee on Cybersecurity, Infrastructure Protection, and Security Technologies held a joint hearing to discuss “How Data Mining Threatens Student Privacy.”

Four witnesses presented testimony from a number of perspectives:

(1) Joel R. Reidenberg, Chair and Professor of Law and Founding Academic Director of the Center on Law and Information Policy at Fordham University School of Law; (2) Mark MacCarthy, Vice President of Public Policy at the Software and Information Industry Association; (3) Joyce Popp, Chief Information Officer at the Idaho State Department of Education; and (4) Thomas Murray, State and District Digital Learning Policy and Advocacy Director at the Alliance for Excellent Education.

Rep. Patrick Meehan, Chairman of the Cybersecurity Subcommittee, opened the hearing by noting that technology is increasingly used in positive ways to enhance student learning both in and out of the classroom, a point echoed by Rep. Todd Rokita in his opening remarks. The Subcommittees’ Ranking Members, Reps. Yvette Clarke and Dave Loebsack, homed in on privacy concerns. They specifically cited the need to ensure that companies contracted to examine student data for the purpose of improving individual learning are not also scanning the data for improper commercial gain.

Each witness made strong points in their opening testimony.  Joel Reidenberg highlighted many of the themes in his December 2013 study about school contracts with third-party service providers, as well as regulatory gaps in FERPA and COPPA.  Mark MacCarthy provided the industry point of view, noting that significant protections are already in place for student data through existing federal laws, state efforts, and contract protections between schools and vendors.  He explained that although FERPA is an old law, it has been updated a number of times with additional guidance from the Department of Education, which industry members abide by.  Joyce Popp brought a unique and practical perspective to the hearing.  She discussed practices that have been well received and effective in Idaho, such as a state policy requiring that schools document student data collection and provide notice to parents through their websites.  Additionally, Popp highlighted Idaho Senate Bill 1372, which ended the practice of allowing public education vendors to claim they “own” student data.  Finally, Thomas Murray began by explaining that too few students graduate high school on time, and argued that this could be combated by enabling teachers to use individual student data to keep more students on track.

During the question-and-answer phase of the hearing, the Chairs and Ranking Members asked a number of questions.  Chairman Meehan was concerned about just how much information is getting into the hands of third parties, and wondered what could be done to ensure this information was not used to make hiring decisions about students after graduation.  Ranking Member Clarke acknowledged that most vendor companies are probably not doing anything wrong, but broached the difficult topic of how to regulate potential bad actors.  Chairman Rokita returned to Idaho Senate Bill 1372 and the potential of using it as model language.  He also sought more information about using Title II funds to support oversight and enforcement of student privacy rules and regulations within schools.  Ranking Member Loebsack focused his questions on finding a balance between innovation and privacy.  He stressed that this is not an “either/or” but an “and”: because we must use data to improve education, we must also demand greater accountability from teachers, schools, and third-party vendors.  In other words, increased data collection and use requires increased data protection and security.

Representatives Roe and Bonamici also joined the conversation.  Rep. Roe noted that data mining takes place everywhere, citing his own supermarket saver card as an example. Rep. Bonamici responded that even though data is collected everywhere today, the education space warrants special consideration. Student data collection should be treated differently because it is not always clear that collection is occurring, and the information is highly sensitive and concerns minors.

Additionally, Members sought to resolve differences in the testimony given by Joel Reidenberg and Mark MacCarthy.  While MacCarthy stated that no new federal legislation is necessary because plenty of penalties already exist, Reidenberg noted that protections under FERPA are limited. He pointed to the fact that the law’s penalties have never been used against a single school. On the issue of contracts, both witnesses seemed to agree that many school vendor contracts do not expressly prohibit third-party commercial use or sharing, but both also agreed that there is no evidence that vendors are actually using student information inappropriately.

In closing remarks, some suggested that the very fact that people disagree about what present law covers is evidence that Congress has a role in reviewing the state of student privacy regulation to determine what, if anything, needs to be done.  Thomas Murray got the final word: he reminded everyone of the enormous benefits already emerging from education technology, and urged that whatever the next step is, it should not stifle innovation.

FPF Statement on Today's Joint Subcommittee Hearing on Education Privacy

One of the most important sections of the Administration’s recent report on Big Data focused on education technology and privacy. The report noted the need to ensure that innovations in educational technology, including new approaches and business models, have ample opportunity to flourish.

These benefits include robust tools to improve teaching and instructional methods; diagnose students’ strengths and weaknesses and adjust materials and approaches for individual learners; identify at-risk students so teachers and counselors can intervene early; and rationalize resource allocation and procurement decisions. Today, students can access materials, collaborate with each other, and complete homework all online.

Some of these new technologies and uses of data raise privacy concerns. Schools may not have the proper contracts in place to protect data and restrict uses of information by third parties. Many school officials may not even have an understanding of all the data they hold. As privacy expert Daniel Solove has noted, privacy infrastructure in K-12 schools is lacking. Without this support, some schools and vendors may not understand their obligations under student privacy laws such as COPPA, FERPA, and PPRA.

The Future of Privacy Forum believes it is critical that schools are given the help needed to build capacity for data governance, training of essential personnel, and basic auditing. Schools must also ensure additional data transparency to engender trust, tapping into innovative solutions such as digital backpacks and providing parent-friendly communications that explain how technology and data are used in schools.

Representatives Jared Polis and Luke Messer have called for bipartisan action on student data privacy, and the Future of Privacy Forum looks forward to working with them on their efforts.

Without measures to help parents see clearly how data are used to help their children succeed, the debate about data in education will remain polarized. With such measures in place, ed tech can be further harnessed to bridge educational inequalities, better tailor solutions for individual student needs, and provide objective metrics for measurement and improvement.

Striking a nuanced and thoughtful balance between harnessing digital innovation in education and protecting student privacy will help ensure trust, transparency, and progress in our education paradigm for years to come.

-Jules Polonetsky, Executive Director

Making Perfect De-Identification the Enemy of Good De-Identification

This week, Ann Cavoukian and Dan Castro waded into the de-identification debate with a new whitepaper, arguing that the risk of re-identification has been greatly exaggerated and that de-identification will play a central role in the age of big data. FPF has repeatedly called for informed conversations about what practical de-identification requires. Part of the challenge is that terms like “de-identification” and “anonymization” have come to mean very different things to different stakeholders, but privacy advocates have effectively made perfection the enemy of the good when it comes to de-identifying data.

Cavoukian and Castro highlight the oft-cited re-identification of Netflix users as an example of how re-identification risks have been overblown. Researchers were able to compare data released by Netflix with records available on the Internet Movie Database in order to uncover the identities of Netflix users.  While this example highlights the challenges facing organizations when they release large public datasets, it is easy to overlook that only two out of 480,189 Netflix users were successfully identified in this fashion. That’s a 0.0004 percent re-identification rate – only a little worse than anyone’s odds of being struck by lightning.*

De-identification’s limitations are often conflated with a lack of trust in how organizations handle data in general. Most of the big examples of re-identification, like the Netflix example, focus on publicly released datasets. When data is released into the wild, organizations need to be extremely careful; once data is out there, anyone with the time, energy, or technological capability can try to re-identify the dataset. There’s no question that companies have made mistakes when it comes to making their data widely available to the public.

But publicly released information does not describe the entire universe of data that exists today. In reality, much data is never released publicly. Instead, de-identification is often paired with a variety of administrative and procedural safeguards that govern how individuals and organizations can use data. When these are used in combination, a bad actor must (1) circumvent the administrative restraints and (2) then re-identify the data before getting any value from the malfeasance. As a matter of simple statistics, the probability of breaching both sets of controls and successfully re-identifying data in a non-public database is low.
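A toy calculation illustrates the point. The probabilities below are purely hypothetical assumptions for the sketch, not figures from any study: if the two hurdles are independent, the combined odds of success are the product of the individual odds.

```python
# Illustrative sketch with assumed, hypothetical probabilities:
# a bad actor must first defeat the administrative/contractual
# controls AND then re-identify a record. If the two hurdles are
# independent, the combined success probability is their product.

p_defeat_admin_controls = 0.01  # assumed odds of breaching access controls
p_reidentify_record = 0.001     # assumed odds of re-identifying a record

p_combined = p_defeat_admin_controls * p_reidentify_record
print(f"Combined probability: {p_combined:.6f}")  # prints 0.000010
```

Whatever the real numbers turn out to be, layering independent controls drives the joint probability well below either risk on its own.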

De-identification critics remain skeptical. Some have argued that any potential ability to reconnect information to an individual’s personal identity suggests inadequate de-identification. Perfect unlinkability may be an impossible standard, but this argument is less an attack on the efficacy of de-identification than a manifestation of distrust. When some in industry suggest we ignore privacy, it becomes easier for critics not to trust how businesses protect data. Fights about de-identification thus become a proxy for how much to trust industry.

In the process, discussions about how to advance practical de-identification are lost. As a privacy community, we should hash out exactly what de-identification means; FPF is currently engaged in just such a scoping project. Recognizing that academics, advocates, and industry apply many different standards for what counts as “de-identified” data should be the start of a serious discussion about what we expect from de-identification, not a reason to cast the concept aside altogether. Perfect de-identification may be impossible, but good de-identification isn’t.

-Joseph Jerome, Policy Counsel

* Daniel Barth-Jones notes that I’ve compared the Netflix re-identification study to the annual risk of being hit by lightning and responds as follows:

This was an excellent and timely piece, but there’s a fact that should be corrected, because it greatly diminishes the actual impact of the statistic you’ve cited. The article notes that only two out of 480,189 Netflix users were successfully identified using the IMDb data, which rounds to a 0.0004 percent (i.e., 0.000004, or 1 in 240,000) re-identification risk. This is correct, but the piece then goes on to say “that’s only a little bit worse than anyone’s odds of being struck by lightning,” which, without further explanation, is likely to be misconstrued.

The blog author cites the annual risk of being hit by lightning (which is, of course, exceedingly small). However, the way most people probably think about lightning risk is not “what’s the risk of being hit in the next year,” but rather “what’s my risk of ever being hit by lightning?” While estimates of the lifetime risk of being hit by lightning vary slightly (according to the precision of the formulas used to calculate the estimate), one’s lifetime odds of being hit by lightning are somewhere between 1 in 6,250 and 1 in 10,000. So even if you go with the more conservative number, the risk of being re-identified by the Netflix attack was only 1/24 of your lifetime risk of being hit by lightning (assuming you’ll make it to age 80 without something else getting you). This is truly a risk of a magnitude that no one rationally worries about.

Although the evidence base provided by the Netflix re-identification was extremely thin, the algorithm is intelligently designed. It will be helpful to the sound development of public policy to see what the re-identification potential of such an algorithm is for a real-world sparse dataset (perhaps medical data?) and a randomly selected data sample, examined with justifiable starting assumptions about the extent of realistic data-intruder background knowledge (which should reasonably account for practical data divergence issues).
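For readers who want to check the arithmetic in the exchange above, here is a quick sketch using only the figures already cited: two re-identified users out of 480,189, and the more conservative 1-in-10,000 lifetime lightning estimate.

```python
# Verifying the figures cited above: 2 re-identified users out of
# 480,189, versus a 1-in-10,000 lifetime chance of a lightning strike.
netflix_risk = 2 / 480_189
lifetime_lightning_risk = 1 / 10_000

ratio = netflix_risk / lifetime_lightning_risk
print(f"Re-identification risk: about 1 in {480_189 // 2:,}")
print(f"Fraction of lifetime lightning risk: 1/{round(1 / ratio)}")  # 1/24
```

The output confirms the numbers in the correction: roughly a 1-in-240,000 chance, or about 1/24 of the conservative lifetime lightning risk.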

Aspen Institute Task Force Releases “Learner at the Center of a Networked World”

The Aspen Institute launched a Task Force on Learning and the Internet to better understand how young people are learning today, and to address how to optimize that learning through innovation. With the support of the John D. and Catherine T. MacArthur Foundation, the Aspen Institute gathered a group of 20 respected minds in the fields of technology, public policy, education, business, and privacy to develop a comprehensive report.

Yesterday, the report was made public.  It can be accessed digitally, and the launch event is also available online.

Synopsis:

The Task Force provides a starting point for the kinds of actions policy makers, education officials, and industry can take to move the education system forward.

The Pillars of the Report are:

(1) Learners need to be at the center of new learning networks

(2) Every student should have access to learning networks

(3) Learning networks need to be interoperable

(4) Learners should have the digital literacies necessary to utilize media and safeguard themselves

(5) Students should have safe and trusted environments for learning

As it relates to privacy, the report focuses on creating trusted environments. “The Task Force recognizes that many of the benefits of data and technology require parents to be confident student information will be handled ethically and responsibly. The report’s call for policies, tools and practices that build a framework of trust is exactly the right prescription to address education privacy concerns,” Jules Polonetsky noted. At the core of that framework is ensuring (1) safety, (2) privacy, and (3) security.  The Task Force set out several principles intended to guide the process of developing a trusted environment.

Additionally, the report briefly examines the effectiveness of existing privacy laws like COPPA, CIPA, and FERPA.  For instance, some attention is given to the unintended consequences and ineffectiveness of COPPA’s restrictions, including how the law has caused some websites to bar underage children from using their services, and the illogical age trigger that leaves children over the age of 13 unprotected.  Moving forward, the Task Force recommends that policy makers “base their deliberation on evidence-based research” and encourages funders “to support researchers, legal scholars, and panels of experts to develop new approaches, tools, and practices.”

Similar to the White House Big Data and Privacy Working Group Review, the Task Force believes in balancing privacy and innovation.  Appropriate safeguards must be in place to protect learners, while not impeding access to high quality education.  Specifically, the report suggests that there needs to be a re-examination of federal and state regulations governing the collection and access to student education data.

In building a trust framework for students, the Task Force sees “privacy by design” as the key to designing, implementing, and evaluating technologies that engender trust.  One suggestion is “a tool that allows student access to their own data to encourage agency and allow the student to help define their learning pathway,” similar to electronic health care records.  Another thought was to have service providers and app developers “provide in-service user education on how to manage one’s privacy and safety.”

Finally, the report encourages funding public awareness campaigns that help inform safe and responsible on- and offline behavior, and reinforcing those campaigns with corresponding risk-prevention education that gives students the know-how to protect themselves online through media, digital, and social-emotional literacy tools.

Notable Announcements from the Launch Event:

Wall Street Journal: MLA-Driven Approach to Airport Wait Times

On Wednesday, The Wall Street Journal published an article about long lines at U.S. customs in airports around the country, and what airlines are doing to shorten them. The article includes a spreadsheet where you can see the kind of information that has been collected thanks to Mobile Location Analytics (“MLA”) technology. This is just one example of how MLA is being used to provide insights to venues and make life easier for consumers, or in this case, travelers.

Interest-Based Ads and More Transparency

Facebook Ads

Facebook wasn’t doing interest-based advertising until now?  Huh?

Most users of Facebook know that the ads they see are selected by Facebook based on information on their profile, what they have “liked” and interests they have selected.  Most have also noticed that if they visit a web site off Facebook like Zappos, they may get “retargeted” ads on Facebook for Zappos. Similarly, Facebook works with online and offline retailers to help them buy ads on Facebook aimed at users who have been their customers.

Today, Facebook announced with much fanfare that it is launching an interest-based advertising program. What’s new? Well, the one thing Facebook hasn’t been doing is selling ads targeted based on the web sites and apps you use outside of Facebook. An individual advertiser could buy an ad based on your visit to a particular site – but advertisers couldn’t buy an ad based on your visits to many sites. Now they can.

Got it? Ads on Facebook are selected in an attempt to make them relevant, based on your profile and your activity on and off Facebook. And now they will use more of your activity off Facebook.

What is new is a major effort to show users extensive detail about the many categories used to select ads, and to let users add or edit many categories of interest. This is one of the most extensive moves we have seen to give users a deep look at the data used to target ads, and it should make some users feel more in control of the experience.

Don’t like it?  Click the icon on any targeted ad and turn off interest-based targeting. On mobile, use the limit ad tracking settings on iOS or Android (which will actually tell all apps you don’t want interest-based ads, not just Facebook).