FPFcast: Stalking and the Location Privacy Protection Act with Cindy Southworth

June 30, 2014: Stalking and the Location Privacy Protection Act

[audio player]

In this podcast, FPF Policy Counsel Joseph Jerome talks with Cindy Southworth from the National Network to End Domestic Violence about stalking apps and how Senator Franken’s proposed bill might curtail their use.

Click on the media player above to listen, or download the complete podcast here.

Synopsis: Education Privacy Hearing—How Data Mining Threatens Student Privacy

Yesterday, the House Education and the Workforce Committee’s Subcommittee on Early Childhood, Elementary, and Secondary Education and the House Homeland Security Committee’s Subcommittee on Cybersecurity, Infrastructure Protection, and Security Technologies held a joint hearing to discuss “How Data Mining Threatens Student Privacy.”

Four witnesses presented testimony from a number of perspectives:

(1) Joel R. Reidenberg, Chair and Professor of Law and Founding Academic Director of the Center on Law and Information Policy at Fordham University School of Law; (2) Mark MacCarthy, Vice President of Public Policy at the Software and Information Industry Association; (3) Joyce Popp, Chief Information Officer at the Idaho State Department of Education; and (4) Thomas Murray, State and District Digital Learning Policy and Advocacy Director at the Alliance for Excellent Education.

Rep. Patrick Meehan, Chairman of the Cybersecurity Subcommittee, opened the hearing by noting that technology is increasingly used in a positive way to enhance student learning both in and out of the classroom, a point echoed by Rep. Todd Rokita in his opening remarks. The Subcommittees’ Ranking Members, Reps. Yvette Clarke and Dave Loebsack, homed in on privacy concerns. They specifically cited the need to ensure that companies contracted to examine student data for the purpose of improving individual learning are not also scanning the data for improper commercial gain.

Each witness made strong points in their opening testimony.  Joel Reidenberg highlighted many of the themes in his December 2013 study about school contracts with third party service providers and regulatory gaps in FERPA and COPPA.  Mark MacCarthy provided the industry point of view, noting that there are already significant protections in place for student data through existing federal laws, state efforts, and contract protections between schools and vendors.  He explained that although FERPA is an old law, it has been updated a number of times with additional guidance from the Department of Education, which industry members abide by.  Joyce Popp brought a unique and practical perspective to the hearing.  She discussed practices that have been well received and effective in Idaho, such as a state policy requiring schools to document student data collection and provide notice to parents through their websites.  Additionally, Popp emphasized Idaho Senate Bill 1372, which ended the practice of allowing public education vendors to claim that they “own” student data.  Finally, Thomas Murray began by explaining that too few students graduate high school on time, and argued that this could be combated by enabling teachers to use individual student data to keep more students on track.

During the question-and-answer phase of the hearing, the Chairs and Ranking Members asked a number of questions.  Chairman Meehan was concerned about just how much information is getting into the hands of third parties, and wondered what could be done to ensure this information is not used to make hiring decisions about students after graduation.  Ranking Member Clarke acknowledged that most vendor companies are probably not doing anything wrong, but broached the difficult topic of how to regulate potential bad actors.  Chairman Rokita returned to Idaho Senate Bill 1372 and the potential of using it as model language.  He also sought more information about using Title II funds to support oversight and enforcement of student privacy rules and regulations within schools.  Ranking Member Loebsack focused his questions on finding a balance between innovation and privacy.  He pressed the point that this is not an “either/or” but rather an “and”: because we must use data to improve education, we must also demand greater accountability from teachers, schools, and third party vendors.  In other words, increased data collection and use requires increased data protection and security.

Representatives Roe and Bonamici also joined the conversation.  Rep. Roe noted that data mining takes place everywhere, citing his own supermarket saver card as an example. Rep. Bonamici responded that even though data is collected everywhere today, the education space warrants special consideration. Student data collection should be treated differently because it is not always clear that collection is occurring, and the information involved is highly sensitive and concerns minors.

Additionally, Members sought to resolve differences in testimony given by Joel Reidenberg and Mark MacCarthy.  While MacCarthy stated that no new federal legislation is necessary because plenty of penalties already exist, Reidenberg noted that protections under FERPA are limited. He pointed to the fact that the law’s penalties have never been used against a single school. On the issue of contracts, both witnesses seemed to agree that many school vendor contracts do not expressly prohibit third party commercial use or sharing, but both also agreed that there is no evidence that vendors are actually using student information inappropriately.

In closing remarks, some suggested that the clear disagreement about what present law covers is itself evidence that Congress has a role in reviewing the state of student privacy regulation to determine what, if anything, needs to be done.  Thomas Murray got the final word, reminding everyone of the enormous benefits already emerging from education technology and urging that whatever the next step is, it should not stifle innovation.

FPF Statement on Today's Joint Subcommittee Hearing on Education Privacy

One of the most important sections of the Administration’s recent report on Big Data focused on education technology and privacy. The report noted the need to ensure that innovations in educational technology, including new approaches and business models, have ample opportunity to flourish.

Many of these benefits include robust tools to improve teaching and instructional methods; diagnose students’ strengths and weaknesses and adjust materials and approaches for individual learners; identify at-risk students so teachers and counselors can intervene early; and rationalize resource allocation and procurement decisions. Today, students can access materials, collaborate with each other, and complete homework all online.

Some of these new technologies and uses of data raise privacy concerns. Schools may not have the proper contracts in place to protect data and restrict uses of information by third parties. Many school officials may not even have an understanding of all the data they hold. As privacy expert Daniel Solove has noted, privacy infrastructure in K-12 schools is lacking. Without this support, some schools and vendors may not understand their obligations under student privacy laws such as COPPA, FERPA, and PPRA.

The Future of Privacy Forum believes it is critical that schools are provided with the help needed to build the capacity for data governance, training of essential personnel, and basic auditing. Schools must also ensure additional data transparency to engender trust, tapping into innovative solutions such as digital backpacks and providing parent-friendly communications that explain how technology and data are used in schools.

Representatives Jared Polis and Luke Messer have called for bipartisan action on student data privacy, and the Future of Privacy Forum looks forward to working with them on their efforts.

Without measures to help parents see clearly how data are used to help their children succeed, the debate about data in education will remain polarized. With such measures in place, ed tech can be further harnessed to bridge educational inequalities, better tailor solutions for individual student needs, and provide objective metrics for measurement and improvement.

Striking a nuanced and thoughtful balance between harnessing digital innovation in education and protecting student privacy will help ensure trust, transparency, and progress in our education system for years to come.

-Jules Polonetsky, Executive Director

Making Perfect De-Identification the Enemy of Good De-Identification

This week, Ann Cavoukian and Dan Castro waded into the de-identification debate with a new whitepaper, arguing that the risk of re-identification has been greatly exaggerated and that de-identification will play a central role in the age of big data. FPF has repeatedly called for informed conversations about what practical de-identification requires. Part of the challenge is that terms like de-identification and “anonymization” have come to mean very different things to different stakeholders, but privacy advocates have effectively made perfection the enemy of the good when it comes to de-identifying data.

Cavoukian and Castro highlight the oft-cited re-identification of Netflix users as an example of how re-identification risks have been overblown. Researchers were able to compare data released by Netflix with records available on the Internet Movie Database in order to uncover the identities of Netflix users.  While this example highlights the challenges facing organizations when they release large public datasets, it is easy to ignore that only two out of 480,189 Netflix users were successfully identified in this fashion. That’s a 0.0004 percent re-identification rate – that’s only a little bit worse than anyone’s odds of being struck by lightning.*
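The arithmetic behind that figure can be checked directly from the numbers cited above; here is a minimal sketch in Python using only the figures in this post:

```python
# Re-identification rate implied by the Netflix study figures cited above.
reidentified = 2
total_users = 480_189

rate = reidentified / total_users     # ~0.0000042, i.e., roughly 0.0004 percent
one_in = total_users / reidentified   # roughly 1 in 240,000 users

print(f"{rate * 100:.4f} percent, or about 1 in {one_in:,.0f}")
```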

De-identification’s limitations are often conflated with a lack of trust in how organizations handle data in general. Most of the big examples of re-identification, like the Netflix example, focus on publicly released datasets. When data is released into the wild, organizations need to be extremely careful; once data is out there, anyone with the time, energy, or technological capability can try to re-identify the dataset. There’s no question that companies have made mistakes when it comes to making their data widely available to the public.

But focusing on publicly released information does not describe the entire universe of data that exists today. In reality, much data is never released publicly. Instead, de-identification is often paired with a variety of administrative and procedural safeguards that govern how individuals and organizations can use data. When these controls are used in combination, bad actors must (1) circumvent the administrative restraints and (2) then re-identify the data before getting any value from their malfeasance. As a matter of simple statistics, the probability of breaching both sets of controls and successfully re-identifying data in a non-public database is low.
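A toy calculation illustrates that point. The probabilities below are purely hypothetical, chosen only to show how layered controls compound when each hurdle must be cleared:

```python
# Purely hypothetical probabilities, for illustration only.
p_defeat_admin_controls = 0.05  # attacker gets past contractual/procedural safeguards
p_reidentify_record = 0.01      # attacker then re-identifies a given de-identified record

# If the two hurdles are roughly independent, an attack succeeds only if both do.
p_successful_attack = p_defeat_admin_controls * p_reidentify_record
print(f"{p_successful_attack:.4f}")  # 0.0005 -- far lower than either risk on its own
```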

De-identification critics remain skeptical. Some have argued that any potential ability to reconnect information to an individual’s personal identity suggests inadequate de-identification. Perfect unlinkability may be an impossible standard, but this argument is less an attack on the efficacy of de-identification than it is a manifestation of a lack of trust. When some suggest we ignore privacy, it becomes easier for critics to distrust how businesses protect data. Fights about de-identification thus become a proxy for how much to trust industry.

In the process, discussions about how to advance practical de-identification are lost. As a privacy community, we should fight over exactly what de-identification means. FPF is currently engaged in just such a scoping project. Recognizing that there are many different standards for how academics, advocates, and industry understand “de-identified” data should be the start of a serious discussion about what we expect out of de-identification, not casting aside the concept altogether. Perfect de-identification may be impossible, but good de-identification isn’t.

-Joseph Jerome, Policy Counsel

* Daniel Barth-Jones notes that I’ve compared the Netflix re-identification study to the annual risk of being hit by lightning and responds as follows:

This was an excellent and timely piece, but there’s a fact that should be corrected because this greatly diminishes the actual impact of the statistic you’ve cited. The article cites the fact that only two out of 480,189 Netflix users were successfully identified using the IMDb data, which rounds to a 0.0004 percent (i.e., 0.000004 or 1/240,000) re-identification risk. This is correct, but then the piece goes on to say “that’s only a little bit worse than anyone’s odds of being struck by lightning,” which, without further explanation, is likely to be misconstrued.

The blog author cites the annual risk of being hit by lightning (which is, of course, exceedingly small). However, the way most people probably think about lightning risk is not “what’s the risk of being hit in the next year,” but rather “what’s my risk of ever being hit by lightning?” While estimates of the lifetime risk of being hit by lightning vary slightly (according to the precision of the formulas used to calculate this estimate), one’s lifetime odds of being hit by lightning are somewhere between 1 in 6,250 and 1 in 10,000. So even if you went with the more conservative number here, the risk of being re-identified by the Netflix attack was only 1/24 of your lifetime risk of being hit by lightning (assuming you’ll make it to age 80 without something else getting you). This is truly a risk at a magnitude that no one rationally worries about.

Although the evidence-base provided by the Netflix re-identification was extremely thin, the algorithm is intelligently designed and it will be helpful to the furtherance of sound development of public policy to see what the re-identification potential is for such an algorithm with a real-world sparse dataset (perhaps medical data?) for a randomly selected data sample when examined with some justifiable starting assumptions regarding the extent of realistic data intruder background knowledge (which should reasonably account for practical data divergence issues).
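For reference, the 1/24 figure in the note can be reproduced from the numbers quoted above:

```python
# Comparing the two risks cited in the note above.
netflix_risk = 2 / 480_189            # ~1 in 240,000 (the Netflix re-identification rate)
lifetime_lightning_risk = 1 / 10_000  # the more conservative lifetime estimate cited

ratio = netflix_risk / lifetime_lightning_risk
print(f"about 1/{1 / ratio:.0f} of the lifetime lightning risk")  # -> about 1/24
```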

Aspen Institute Task Force Releases “Learner at the Center of a Networked World”

The Aspen Institute launched a Task Force on Learning and the Internet to better understand how young people are learning today, and to address how to optimize that learning through innovation. With the support of the John D. and Catherine T. MacArthur Foundation, the Aspen Institute gathered a group of 20 respected minds in the fields of technology, public policy, education, business, and privacy to develop a comprehensive report.

Yesterday, the report was made public.  It can be accessed digitally, and the launch event is also available online.

Synopsis:

The Task Force provides a starting point for the kinds of actions policy makers, education officials, and industry can take to move the education system forward.

The Pillars of the Report are:

(1) Learners need to be at the center of new learning networks

(2) Every student should have access to learning networks

(3) Learning networks need to be interoperable

(4) Learners should have the digital literacies necessary to utilize media and safeguard themselves

(5) Students should have safe and trusted environments for learning

As it relates to privacy, the report focuses on creating trusted environments. “The Task Force recognizes that many of the benefits of data and technology require parents to be confident student information will be handled ethically and responsibly. The report’s call for policies, tools and practices that build a framework of trust is exactly the right prescription to address education privacy concerns,” Jules Polonetsky noted. At the core of that framework is ensuring (1) safety, (2) privacy, and (3) security.  The Task Force set out several principles intended to guide the process of developing a trusted environment.

Additionally, the report briefly examines the effectiveness of existing privacy laws like COPPA, CIPA, and FERPA.  For instance, some attention is paid to the unintended consequences and ineffectiveness of COPPA’s restrictions, including how the law has caused some websites to bar underage children from using their services and the illogical age trigger that leaves children over the age of 13 unprotected.  Moving forward, the Task Force recommends that policy makers “base their deliberation on evidence-based research” and encourages funders “to support researchers, legal scholars, and panels of experts to develop new approaches, tools, and practices.”

Similar to the White House Big Data and Privacy Working Group Review, the Task Force believes in balancing privacy and innovation.  Appropriate safeguards must be in place to protect learners, while not impeding access to high quality education.  Specifically, the report suggests that there needs to be a re-examination of federal and state regulations governing the collection and access to student education data.

In building a trust framework for students, the Task Force sees “privacy by design” as the key to designing, implementing, and evaluating technologies that engender trust.  One suggestion is “a tool that allows student access to their own data to encourage agency and allow the student to help define their learning pathway,” similar to electronic health care records.  Another thought was to have service providers and app developers “provide in-service user education on how to manage one’s privacy and safety.”

Finally, the report encourages funding public awareness campaigns that will help inform safe and responsible online and offline behavior, and pairing those campaigns with corresponding risk prevention education that gives students the know-how to protect themselves online through media, digital, and social-emotional literacy tools.


Wall Street Journal: MLA-Driven Approach to Airport Wait Times

On Wednesday, The Wall Street Journal published an article about long lines at U.S. customs in airports around the country and what airlines are doing to shorten them. The article includes a spreadsheet where you can see the kind of information that has been collected thanks to Mobile Location Analytics (“MLA”) technology. This is just one example of how MLA is being used to provide insights to venues and make life easier for consumers, or in this case, travelers.

Interest Based Ads and More Transparency

Facebook Ads

Facebook wasn’t doing interest based advertising until now?  Huh?

Most users of Facebook know that the ads they see are selected by Facebook based on information on their profile, what they have “liked” and interests they have selected.  Most have also noticed that if they visit a web site off Facebook like Zappos, they may get “retargeted” ads on Facebook for Zappos. Similarly, Facebook works with online and offline retailers to help them buy ads on Facebook aimed at users who have been their customers.

Today, with much fanfare, Facebook announced that it is launching an interest based advertising program. What’s new? Well, the one thing Facebook hasn’t been doing is selling ads targeted based on the web sites and apps you use outside of Facebook. An individual advertiser could buy an ad based on your visit to a particular site – but advertisers couldn’t buy an ad based on your visits to many sites. Now they can.

Got it? Ads on Facebook are selected in an attempt to make them relevant based on your profile, and your activity off of Facebook. And now they will use more activity off Facebook.

What is new is a major new effort to show users extensive detail about the many categories that are used to select ads, and to let users add or edit many categories of interest. This is one of the most extensive moves to give users a deep look at the data used to target ads that we have seen and should make some users feel more in control of the experience.

Don’t like it?  Click on the icon on every targeted ad and turn off the interest based targeting. On mobile, use the limit ad tracking settings on iOS or Android (which will actually tell all apps you don’t want interest based ads, not just Facebook).

Comments to the FTC on Consumer Generated Health Data

Today, the Future of Privacy Forum submitted comments to the FTC on the privacy issues surrounding consumer generated health data. Noting that innovative products and services that allow consumers to generate and manage their own health information are increasingly becoming part of how consumers manage their broader health care, FPF discusses the need for a thoughtful approach to applying core privacy principles in a way that is context-specific, use-based, and focused on real harms to consumers.

Seeking Submissions for Privacy Papers for Policy Makers 2014

FPF is pleased to invite privacy scholars, professionals, and others with an interest in privacy issues to submit papers to be considered for inclusion in FPF’s fifth annual edition of “Privacy Papers for Policy Makers.”

The purpose of Privacy Papers for Policy Makers is to present policy makers with highlights of important research and analytical work on a variety of privacy topics.  Specifically, we wish to showcase papers that analyze cutting-edge privacy issues and propose either achievable short-term solutions or new means of analysis that could lead to solutions.

Academics, privacy advocates and Chief Privacy Officers on FPF’s Advisory Board will review the submitted papers to determine which papers are best suited and most useful for policy makers in Congress, at federal agencies and for distribution to data protection authorities internationally.  Selected papers will be presented at an event with privacy leaders in the Fall, and will be included in a printed digest that will be distributed to policy makers.

The entry can provide a link to a published paper or a draft paper that has a publication date.  FPF will work with authors of the selected papers to develop a digest.

Our deadline for submissions is July 31, 2014.  Please include the author’s full name, phone number, current postal address, and e-mail address.

Please send submissions via e-mail to [email protected] with the subject line “Privacy Papers for Policy Makers 2014.”

Click here to view prior editions of “Privacy Papers for Policy Makers.”  We look forward to your submissions.