Paper highlights de-identification standards, re-identification research, and emerging technical, contractual, and policy protections that can safeguard genetic data while supporting research.
Genomic data is arguably the most personal of all personally identifiable information (“PII”). Techniques to de-identify genomic data to limit privacy and security risks to individuals – while that data is used for research and statistical purposes – are at the center of discussions among stakeholders engaged in genetic research.
The Future of Privacy Forum (FPF) and Privacy Analytics have partnered to publish “A Practical Path Toward Genetic Privacy in the United States.” The white paper is intended to highlight the personal nature of genetic data, describe existing regulatory requirements, and discuss emerging developments regarding the de-identification and re-identification of genetic data while highlighting consensus practices organizations are taking to safeguard genomic information.
“Genetics has become increasingly valuable to cutting-edge medical research, with implications from public health to rare disease diagnostics,” said Katelyn Ringrose, FPF Policy Fellow. “Observing this evolution, FPF and Privacy Analytics collaborated to create a practical path forward; one which will protect the privacy of those individuals who contribute their genomes to fuel such incredible discoveries.”
The white paper explores and drives discussion around two prominent examples of privacy engineering solutions applicable to genetic privacy: differential privacy and secure (multi-party) computation. Although technical solutions like these show promise in protecting genetic data, companies should also follow emerging privacy and security-centric norms that are evolving in the space, including the use of:
Access Controls – Depending on the nature of the data and its identifiability, access controls can limit access to certain individuals and institutions.
Contractual Controls – Researchers and institutions can be required to enter into a data use agreement before accessing data, to ensure that the data is accessed only for legitimate purposes and that identifiability remains low.
Security Protocols – Organizations sharing genetic data can create specific security protocols dictating how researchers utilize data in open access or controlled-access data repositories.
FPF hopes that this white paper will help guide stakeholders in the genetics arena, including those providing and utilizing genetic data to identify health risks, learn more about rare diseases, and create new treatments and precise diagnostics. We look forward to continuing to support cutting-edge research, while aiming to mitigate the risks associated with the use of genetic data.
For additional information about this publication or the Future of Privacy Forum’s health working group, please contact Rachele Hendricks Sturrup ([email protected]) and Katelyn Ringrose ([email protected]).
Privacy & Pandemics Virtual Workshop: The Role of Mobile Apps
The Future of Privacy Forum and the Israel Tech Policy Institute recently convened a briefing with experts from government, academia, and leading companies about the use of mobile apps related to the COVID-19 public health crisis, and how data protection and ethics can be managed when sensitive health and location data are collected. The briefing featured privacy experts from around the world, including:
Sarit Deshe, Head of Nationwide Information Projects Department, Ministry of Health Israel
Talia Agmon, Deputy Chief Legal Counsel, Ministry of Health Israel
Professor Michael Birnhack, Associate Dean for Research, The Buchmann Faculty of Law, Tel Aviv University
Hyunik Kim, Deputy Director, Planning & Management Division, and Head, International Cooperation for Personal Information Protection Commission (PIPC), Republic of Korea
Steve Penrod, Vice President of Product Development, TripleBlind (U.S.)
Bart Preneel, who leads COSIC (Computer Security and Industrial Cryptography group) in the Department of Electrical Engineering at KU Leuven, Belgium
Leaders from the Future of Privacy Forum and the Israel Tech Policy Institute, including FPF CEO Jules Polonetsky, Managing Director of the Israel Tech Policy Institute Limor Shmerling Magazanik, FPF Director of Technology and Privacy Research Christy Harris, Policy Counsel Polly Sanderson, and FPF Managing Director for Europe Rob van Eijk.
Participants discussed the privacy implications and utility of storing data locally versus centrally; strategies for improving the accuracy of data; promotion of apps to ensure sufficient scale; and how to assess the usefulness of certain data types (such as Bluetooth data) for public health purposes. Insights from the discussion will inform FPF’s ongoing work with stakeholders to identify best practices and policy recommendations for decision-makers.
To complement the virtual workshop, FPF released a detailed comparison of specific objectives and methods employed by “contact tracing” apps and software development kits (SDKs) that have been developed in various countries and regions to help public and private entities mitigate the COVID-19 pandemic. Stakeholders interested in how leading apps are collecting and using data in response to the COVID-19 pandemic and policymakers considering the use of one of these apps will want to take a look at the chart.
Through a series of original Privacy & Pandemics publications and resources, FPF is exploring the challenges the COVID-19 pandemic poses to existing ethical, privacy, and data protection frameworks. This series is intended to help governments, researchers, companies, and other organizations navigate essential privacy questions regarding the collection and use of data in response to a global pandemic.
FPF Provides Senate Testimony on Strategies to Mitigate Privacy Risks of Using Data to Combat COVID-19
Future of Privacy Forum (FPF) Senior Counsel Stacey Gray today provided the Senate Committee on Commerce, Science, and Transportation with written testimony, including recommendations based on how experts in the U.S. and around the world are currently mitigating the risks of using data to combat the COVID-19 pandemic.
“The collection and use of data, including personal data, to respond to a public health crisis like a pandemic can be compatible with privacy and data protection principles,” said Gray. “In many cases, commercial data can be shared in a way that does not reveal any information about identified or identifiable individuals.”
Gray offered recommendations, based on recent FPF workshops with global experts, to mitigate the risks of processing location data and other consumer data for public health initiatives, including:
Follow the lead of public health experts. Rather than leading the way with data that is already available, technology companies should play a supporting role to epidemiologists, established research partners, and public health experts and rely on their expertise in determining what data is useful for achieving specific, clear public health goals.
Ensure transparency and lawfulness. In order to ensure public trust, including in the use of voluntary pandemic apps, companies should be as transparent as possible about data shared with government or public health officials.
Apply privacy enhancing technologies (PETs). Companies should take advantage of advances made by privacy engineers in recent years, and apply privacy enhancing technologies (PETs), such as differential privacy, in accordance with principles of data minimization and privacy by design.
Employ privacy risk assessments. Companies should use well-established privacy and data protection impact assessment frameworks to help identify risks and find ways to mitigate or eliminate them.
Follow core purpose limitation principles. Any personal data collection and use enlisted to fight the pandemic should be limited in time and limited to a specific, well-defined purpose identified in advance, with clear limitations on secondary uses.
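As a rough illustration of the differential privacy technique named in the recommendations above, the following minimal Python sketch adds Laplace noise to a single count before it is shared. The function name `noisy_count`, the choice of the Laplace mechanism, and the example values are assumptions made for illustration; they are not drawn from the testimony, and a production system would also need to track the privacy budget across repeated queries.

```python
import random


def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Return a count with Laplace noise calibrated to a privacy parameter epsilon.

    Assumption: adding or removing one person changes the count by at most 1
    (sensitivity 1), so the Laplace scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = 1.0 / epsilon
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centered at 0 with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


if __name__ == "__main__":
    rng = random.Random(42)  # seeded only so the demo is repeatable
    # Hypothetical example: number of devices seen in an area on one day.
    print(noisy_count(1250, epsilon=0.5, rng=rng))
```

A smaller `epsilon` means more noise and stronger protection for any one individual, at the cost of a less accurate shared figure.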
Gray also explored the commercial sources and relative risks and benefits of precise location data generated by consumer devices, and highlighted the need for baseline federal consumer privacy legislation. In addition to providing legal protections for individuals, a federal privacy law would also provide much-needed legal clarity for US companies to be able to respond quickly and understand what kind of data they may or may not share legally and ethically to support emergency public health initiatives.
Gray provided testimony to a full Commerce, Science, and Transportation Committee paper hearing, “Enlisting Big Data in the Fight Against Coronavirus.” Witness testimony was published by the committee on Thursday, April 9, 2020, at 10:00 a.m. Questions from committee members will be posted by the end of the day, and witnesses will have 96 business hours to respond.
FPF is exploring the challenges posed by the COVID-19 pandemic to existing ethical, privacy, and data protection frameworks through a series of original Privacy and Pandemics publications, workshops, and resources, accessible on the FPF website. The series is intended to help governments, researchers, companies, and other organizations navigate essential privacy questions regarding the response to the coronavirus pandemic. Resources include a chart that compares the specific objectives and methods of apps and software development kits (SDKs) that have been deployed to help public and private entities tackle the COVID-19 pandemic, and lessons learned from a workshop on corporate data-sharing for COVID-19 research.
ICYMI: FPF Experts Raise Concerns about Protecting Student Privacy During Rapid Switch to Online Learning
Experts from the Future of Privacy Forum, the nation’s leading think tank focused on advancing responsible consumer privacy practices, have spoken out in numerous articles and publications to raise awareness about privacy concerns stemming from the rapid adoption of general-use technologies to support online learning at K-12 and higher education institutions nationwide.
Vance said, “You obviously have all of the privacy concerns that carry over from the use of ed tech generally… Is this company using data in an inappropriate way? Is this a privacy-protected product? Does the school have a data governance policy? When is information going to be deleted? Who has access to that information? Do people just have what information they need to do their job and no more? Because every additional person who has access to information can increase the risk that that information is shared inappropriately or breached.”
As more schools and teachers move to quickly adapt existing general use apps and software for the virtual classroom, Vance warned in EdSurge, “We are likely to see more uncontrolled and unregulated use of technology by educators and others who suddenly have to move things online without clear guidance from the institution.”
In an interview with the Washington Post, Vance stated, “There is a very complex legal landscape around student privacy, and products made for consumers generally – for offices, for adults – are unlikely to comply with those laws.” She added to EdSurge that those products generally have not been set up in a privacy-protective way, noting that “many companies are set up to allow ease of access and broad information collection as default settings instead of thinking more completely about preventing harms or protecting privacy.”
FPF CEO Jules Polonetsky spoke to the New York Times about the expanded use of Zoom in the virtual classroom. From the article: “some of [Zoom’s] standard terms are not consistent with the Family Educational Rights and Privacy Act, or FERPA, ‘in addition to many of the 130+ state student privacy laws passed since 2014,’ [Polonetsky] added.”
Vance echoed Polonetsky’s concerns about Zoom in interviews with EdSurge and NPR, flagging the privacy and legal implications of the tool: “A standard Zoom account is ‘not at all’ compliant with FERPA, COPPA or state student privacy laws” according to Vance.
She recommended that “schools stick with platforms designed for education” and noted to NPR that this problem is not unique to Zoom, saying: “‘I don’t know that Zoom is any worse, and it may in many ways be better than a lot of the platforms out there, especially when it comes to security, accessibility and certainly when it comes to ease of use.’ But, she says, Zoom could have anticipated these privacy issues. ‘And now Zoom has the very difficult task of attempting to regain trust.’”
Vance also spoke with Inside Higher Ed about the potential for online learning to result in increased monitoring of students due to accountability reporting requirements. “Moving classes online will also raise questions about the extent to which school-issued devices with surveillance software pre-installed will monitor student activity at home, since officials are still supposed to ensure that students are receiving an education at home. Vance asks: ‘How comfortable will we be with schools monitoring students and what they do at home, now that home is going to be school?’”
Last Friday, FPF hosted a webinar with California IT in Education (CITE) and education law firm Fagen Friedman & Fulfrost (F3) entitled “Classrooms in the Cloud: Student Privacy & Safety During the COVID-19 Pandemic” that examined the tough privacy questions facing K-12 schools and higher education institutions during the rapid transition to online learning platforms. View the archived webinar here.
To learn more about the Future of Privacy Forum’s student privacy work, visit studentprivacycompass.org.
About FPF
The Future of Privacy Forum (FPF) is a Washington, DC-based think tank that seeks to advance responsible data practices. The forum is led by Internet privacy experts and includes an advisory board comprised of leading figures from industry, academia, law, and advocacy groups. For more information, visit www.fpf.org.
Why Data Protection Law Is Uniquely Equipped to Let Us Fight a Pandemic with Personal Data
Data protection law is different than “privacy”. We, data protection lawyers, have been complacent recently and have failed to clarify this loud and clear for the general public. Perhaps happy to finally see this field of law taking the front stage of public debate through the GDPR, we have not stopped anyone from saying that the GDPR is a privacy law.
The truth is, the GDPR is a “data protection” law (it stands for the General “Data Protection” Regulation). And this makes a world of difference these days, when governments, individuals, companies, and public health authorities are looking at the collection of personal data and digital tracking of people as a potentially effective way to stop the spread of the COVID-19 pandemic.
The GDPR is the culmination of about half a century of legislative developments in Europe, which saw data protection evolve from a preoccupation of regional laws, to national laws, to EU laws, to a fundamental right in the EU Charter of Fundamental Rights. A fundamental right (Article 8) which is provided for distinctly from the fundamental right to respect for private and family life (Article 7). What a wondrous distinction!
The right to the protection of personal data has been conceived particularly to support societies in facing the reality of massive automation of systems fed with data about individuals. At the very beginning, the introduction of computerized databases in public administration pushed for the necessity of adopting detailed safeguards that would ensure the rights of individuals are not breached by the collection and use of their data.
In the following decades, waves of development added layers to those safeguards and shaped data protection law as we know it today: layers such as the need for a justification to collect and use personal data; fair information principles like purpose limitation and data minimization; transparency and fairness; control of data subjects over their own data through specific rights like access, correction and deletion; the need for a dedicated, independent supervisory authority to explain and enforce data protection law; and accountability of whoever is responsible for the collection and use of personal data.
The right to data protection is procedural in nature. It does have a flavor of substantial protection, which will certainly grow in importance and will likely be developed in the age of AI and Machine Learning – in particular I am thinking of fairness, but at its core the right to data protection remains procedural. Data protection sets up specific measures or safeguards that must be implemented to reach its goal, in relation to personal data being collected and used.
Importantly, the goal of data protection is to ensure that information relating to individuals is collected and used in such a way that all their other fundamental rights are protected. This includes freedom of speech, the right to private life/privacy, the right to life, the right to security, the right to non-discrimination and so on. Even though I have not seen this spelled out anywhere, I believe it has also been developed to support the rule of law.
This is why data protection is uniquely equipped to let us fight the pandemic using personal data. It has literally been conceived and developed to allow the use of personal data by automated systems in a way that guarantees the rule of law and the respect of all fundamental rights. This might be the golden hour for data protection.
That is, if its imperatives are being applied to any technological or digital responses to the COVID-19 pandemic relying on personal data:
The dataflow proposed must be clear, including all the categories of data that will be collected and used.
The purpose(s) must be clear, specific, granular, well-defined.
Have a lawful ground for processing in place.
Building any solution that necessitates personal data must be done by taking into account from the outset data protection requirements (data protection by design).
The web of responsibility must be clear (who are the controllers and the processors?).
Personal data must not be shared, or given access to, beyond the defined web of responsibility (for example, through controller-processor agreements).
There must be transparency in an intelligible way for the individuals whose personal data are collected.
The necessity of collecting any of the personal data items must be assessed (can the project do without some of them and achieve the same purpose?).
All personal data must be accurate.
Ensure that individuals have a way to obtain access to their own data and to ask for correction, erasure if it is justified (as well as for the other rights they have).
Ensure the security of data.
The personal data collected must be retained only for as long as it is necessary to achieve the purpose (afterwards, it must be deleted; anonymization may be accepted as an alternative to deletion, but there is an ongoing debate about this).
Data Protection Impact Assessments (even if loose) should be conducted; engaging afterwards with supervisory authorities to discuss any identified risks that cannot be mitigated could be helpful (and may even be obligatory under certain circumstances).
Therefore, the data-based solutions proposed to diminish the effects of the COVID-19 pandemic are not being proposed and accepted in Europe in spite of the GDPR, as the media has been portraying it. It is almost as if data protection has been developing over the past half century to give us the right instruments to face this challenge and preserve our freedoms and our democracies. I hope we will be smart enough to use them properly.
FPF Charts the Role of Mobile Apps in Pandemic Response
Multiple apps and software development kits (SDKs) have been deployed to help both private and public entities tackle the COVID-19 pandemic. In order to better understand these technologies, the Future of Privacy Forum has created a comparison chart to contrast the objectives and methods of specific apps and SDKs.
The chart compares relevant privacy and data protection issues – such as data collection, retention, purpose, and sharing – as well as what privacy and data security safeguards are employed. The key question is the extent to which each technology appropriately and ethically balances public health and safety with privacy risks and other interferences with civil liberties throughout the crisis and in the future.
If you’re interested in data collection and use in response to the COVID-19 pandemic – or a decision-maker considering the use of one of these apps – you’ll want to take a look at the chart.
FPF Offers New Resources on Privacy and Pandemics
Today, the Future of Privacy Forum (FPF) released a collection of new publications and resources to help governments, educators, researchers, companies, and other organizations navigate essential privacy questions regarding the response to the coronavirus pandemic. Global leaders responding to the coronavirus pandemic are increasingly relying on data from individuals and communities to analyze the virus’ progression, deploy resources, and make policy decisions.
“We want to help organizations make data available for leaders, researchers, and the public without opening the door to lasting or limitless surveillance,” said Jules Polonetsky, CEO of the Future of Privacy Forum. “The information we have compiled will help decision makers think clearly about – and document – what personal information they will collect or disclose, to whom, and under what conditions.”
COVID-19: Privacy & Data Protection Resources consolidates privacy resources from sources around the world, highlighting resources that are useful to organizations grappling with questions about pandemic-related data. The site will be updated on a regular basis with new content.
A Closer Look at Location Data: Privacy and Pandemics. Public health agencies and epidemiologists are analyzing device location data to track the COVID-19 pandemic. Reporters and researchers considering the implications of using location data to respond to the pandemic will find this blog post valuable.
Student Privacy During the COVID-19 Pandemic. K-12 and higher education administrators and educators will appreciate this joint School Superintendents Association (AASA)-FPF white paper’s guidance on how the Family Educational Rights and Privacy Act (FERPA) applies to schools in the context of COVID-19. FERPA and the Health Insurance Portability and Accountability Act (HIPAA) govern the disclosure of students’ health information held by schools; both laws permit emergency disclosures to protect the health or safety of others in some circumstances.
Privacy and Pandemics: A Thoughtful Discussion. On March 26, 2020, FPF brought together a dozen ethicists, academics, government officials, and corporate privacy and “data for good” leaders for a virtual workshop with more than 100 attendees to discuss data sharing in times of crisis and effective privacy and civil liberties measures. It’s the first in a series of events about privacy and pandemics that will be used to develop best practices and policy recommendations for decision makers.
Additional U.S. and international privacy resources that cover civil liberties and ethical best practices, security, technical tools, and emerging solutions.
As the COVID-19 virus spreads, governments, researchers, and healthcare institutions are seeking to obtain and deploy consumer data to track the spread of the virus, deliver emergency supplies, target travel restrictions and quarantines, and develop vaccines and cures. But can data collected from phones, credit cards, and other sources be used in this emergency without opening the door to lasting or limitless surveillance?
Yesterday, FPF convened a Virtual Workshop with a dozen ethicists, academics, government officials, and corporate leaders, and over 100 corporate attendees, to discuss responsible data sharing in times of crisis. It’s the first in a series of events about privacy and pandemics that FPF will use to develop best practices and policy recommendations for decision makers.
Participants discussed how recent “data for good” initiatives have informed data sharing during the crisis, concerns about data sharing in a time of low trust, lessons learned from past pandemics, how to effectively protect privacy and civil liberties, and what the COVID-19 pandemic means for the future of data sharing between companies, academics, and governments.
A more detailed workshop report is forthcoming, but in the interest of urgency we share the most important advice that arose in the Workshop for companies with data that could be of value to public health:
Understand how your own data sets relate to the needs of health experts. Any data set should be just one input into a broader epidemiological model. Some sets are not large enough, accurate enough, or relevant enough to be useful. Several participants warned that sharing flawed data or treating one data set as a “silver bullet” can lead decision-makers astray. Instead, companies should be sure to understand both the best ways that their data can be used and the risks associated with sharing their specific data. It is essential to work with medical and public health partners to understand their data needs, rather than merely provide analysis based on data collected for commercial uses.
Continue to follow your guidelines for data protection during the crisis, and recognize that your standards for sharing have not changed. Participants agreed that data protection principles should not be abandoned because there is a crisis, but pointed out that the standards for prioritizing review of projects have changed because of pandemic-driven urgency. Many companies regularly face smaller scale emergency requests for data. Some companies have established expedited processes to quickly elevate exigent data-sharing decisions to the highest levels.
Establish clear boundaries. History tells us that it is difficult to discontinue practices started in an emergency. In the absence of clear systemic rules, organizations should establish an exit strategy up front to protect against continued “emergency” practices after the crisis. Companies must be clear that data shared now should not be kept forever or used for other purposes; clear rules help maintain and build public trust in their programs.
Use data protection safeguards, such as anonymization and aggregating data. These are established techniques, but there is no standard definition of what they mean, and much skepticism about the ability to guarantee anonymization. Companies should explain how they use techniques in specific situations. While data is being shared during this emergency, organizations must continue to follow principles such as data minimization, proportionality and destroying data after it is transferred or used.
Work with a partner that has controls in place. Companies with established data for good programs have been working with partners to ensure data sets are appropriate, anonymized, and aggregated as much as possible. Participants expressed that working through existing arrangements is preferable to developing new partnerships in the midst of a crisis. For example, many university research groups already have data sharing agreements in place that have been vetted by Institutional Review Boards. These groups could act as a trusted partner between companies and public agencies.
Be transparent. To maintain public trust, companies need to clearly explain what data is being shared, with whom, and for what purpose.
The Workshop’s participants agreed that it would be better if more companies, non-profits, governments, and academics had been working collaboratively on the technical infrastructure, governance structures, and legal frameworks for data sharing in an emergency before the COVID-19 pandemic hit.
Some participants recommended ways to strengthen the “data for good” ecosystem over time, including standing up new trust structures. One participant recommended strengthening the “data enablers” in the system, such as institutional or ethical review boards, which can serve as checks on ill-advised data sharing and also facilitate connecting data sources – often, companies that have data with socially beneficial uses – with data users, like researchers and policymakers.
Participants also agreed that data protection and humanitarian action are completely compatible. While the trade-offs for decisions about sharing data have changed, there still should be a thoughtful and legally justified process for considering what data to share, with whom, for what purposes, and how it should be protected.
Many more insights and details were gathered and will inform FPF’s ongoing work with stakeholders to identify best practices and policy recommendations for decision makers.
A Closer Look at Location Data: Privacy and Pandemics
In this series, Privacy and Pandemics, the Future of Privacy Forum explores the challenges posed by the COVID-19 crisis to existing ethical, privacy, and data protection frameworks, and will seek to provide information and guidance to companies and researchers interested in responsible data sharing to support public health response. Future posts will examine pandemic-tracking mobile apps, regulatory guidance across the world, and more.
Part 1: A Closer Look at Location Data
Principal author: Stacey Gray (Senior Counsel) ([email protected]). Contributors: Chelsey Colbert (Policy Counsel, Mobility and Location); Polly Sanderson (Policy Counsel, Legislative Analysis); Katelyn Ringrose (Policy Fellow); Dr. Sara Jordan (Policy Counsel, Artificial Intelligence and Ethics). Email us at [email protected].
In light of COVID-19, there is heightened global interest in harnessing location data held by major tech companies to track individuals affected by the virus, better understand the effectiveness of social distancing, or send alerts to individuals who might be affected based on their previous proximity to known cases. Governments around the world are considering whether and how to use mobile location data to help contain the virus: Israel’s government passed emergency regulations to address the crisis using cell phone location data; the European Commission requested that mobile carriers provide anonymized and aggregate mobile location data; and South Korea has created a publicly available map of location data from individuals who have tested positive.
Public health agencies and epidemiologists have long been interested in analyzing device location data to track diseases. In general, the movement of devices effectively mirrors movement of people (with some exceptions discussed below). However, its use comes with a range of ethical and privacy concerns.
In order to help policymakers address these concerns, we provide below a brief explainer guide of the basics: (1) what is location data, (2) who holds it, and (3) how is it collected? Finally, we discuss some preliminary ethical and privacy considerations for processing location data. Researchers and agencies should consider: how and in what context location data was collected; the fact and reasoning behind location data being classified as legally “sensitive” in most jurisdictions; challenges to effective “anonymization”; representativeness of the location dataset (taking into account potential bias and lack of inclusion of low-income and elderly subpopulations who do not own phones); and the unique importance of purpose limitation, or not re-using location data for other civil or law enforcement purposes after the pandemic is over.
What is precise location data?
Precise location data, or “mobility data,” involves information about how devices and people move through spaces over time. Most of this information comes from the devices we carry with us, with smartphones acting as proxies for people (according to Pew, smartphone ownership in 2019 was near-universal at 81% of Americans).
Why is this the case? Even the most basic connectivity, or the ability to send and receive wireless content on devices, has to involve information about where those devices are located. For example, providers of wireless services know where devices are located because they provide the service through local cell towers and networks. At a more general level, an IP address (an identifier that is freely and openly shared by devices to send and receive Internet traffic) is often sufficient to know a person’s city and state.
However, most researchers analyzing COVID-19 are interested in highly “precise” information about where devices (and therefore people) are located over time. The fact that an individual is located in “Washington, DC” is not sufficient for tracking an infectious disease, but information such as “works in the same building” or “visited the same restaurant at the same time as a diagnosed person” (precise location) can be very useful. Typically, we think of location data as having privacy implications when it is precise enough to single out an individual with reasonable specificity. This is often GPS-level specificity, and would usually not include information like an IP address. Whether location data counts as precise depends in part on context, such as population density (for example, in a rural or remote area, a lower level of specificity might be more able to identify a person than if that same person were standing in Times Square). Recent legislative proposals have attempted to create strict cut-offs (such as a 1,640-foot radius under the U.S. House Energy and Commerce Discussion Draft, or a 1,850-foot radius under the California Privacy Rights Act ballot initiative of 2020).
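To make these cut-offs concrete, the short sketch below converts the two radii to meters and checks a location fix against them. The function names, the dictionary keys, and the idea of comparing a fix's uncertainty radius directly against the cut-off are assumptions made for illustration; the legal definitions themselves are more detailed than this.

```python
FEET_TO_METERS = 0.3048  # exact, by definition of the international foot

# Radii quoted in the text above, in feet (keys are illustrative labels).
CUTOFFS_FT = {
    "discussion_draft": 1640,
    "cpra_2020": 1850,
}


def feet_to_meters(feet: float) -> float:
    """Convert a length in feet to meters."""
    return feet * FEET_TO_METERS


def is_precise(uncertainty_radius_m: float, cutoff_ft: float) -> bool:
    """Hypothetical test: treat a fix as 'precise' when it places the
    device within a circle no larger than the cut-off radius."""
    return uncertainty_radius_m <= feet_to_meters(cutoff_ft)


if __name__ == "__main__":
    for label, radius_ft in CUTOFFS_FT.items():
        print(label, round(feet_to_meters(radius_ft)), "m")
```

Under this reading, a fix accurate to a few tens of meters falls well inside either cut-off, while a city-level estimate tens of kilometers wide does not.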
Sometimes mobility or location data is tied to known individuals (such as a name associated with a cell phone subscription), and at other times it is tied only to a unique identifier associated with a device. In the latter case, the individualized data is often referred to as “anonymized.” In other cases, if a dataset has been modified to show movements of groups of people (and not individuals), it is often referred to as “aggregated.”
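The difference between the two terms can be sketched in a few lines. In this illustration, which is an assumption rather than a description of any particular company's pipeline, `pseudonymize` swaps a device identifier for a keyed hash (the records stay individual-level), while `aggregate` discards identifiers and reports only how many distinct devices were seen in each grid cell. The record shape and the 0.01-degree cell size are hypothetical.

```python
import hashlib
import hmac
import math
from collections import defaultdict


def pseudonymize(device_id: str, secret_key: bytes) -> str:
    """Replace a device identifier with a keyed hash. The output is still a
    unique, persistent identifier, so the data remains individual-level."""
    return hmac.new(secret_key, device_id.encode("utf-8"), hashlib.sha256).hexdigest()


def aggregate(records, cell_deg: float = 0.01):
    """Count distinct devices per grid cell and drop identifiers entirely.

    records: iterable of (device_id, lat, lon) tuples (hypothetical shape).
    Returns a dict mapping (row, col) grid cells to device counts.
    """
    devices_per_cell = defaultdict(set)
    for device_id, lat, lon in records:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        devices_per_cell[cell].add(device_id)
    return {cell: len(ids) for cell, ids in devices_per_cell.items()}
```

The keyed hash hides the original identifier, but every record from the same device still carries the same value, which is why such data can often be linked back to a person.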
Who has access to location data?
Location data is held by a variety of commercial entities that provide different services, including as part of the core functionality of a device (mobile phone carriers and operating systems), as part of a consumer-facing feature (mobile apps), or as part of tracking in physical spaces that relies on device connectivity (Internet of Things):
Mobile phone carriers. Cell phone carriers know where phones are located because they direct calls to phones through local cell towers, which may be enhanced with GPS location data.
Operating Systems. Providers of mobile operating systems — Android (Google) and iOS (Apple) — may know where devices are located as a result of providing services, improving functionality, or enabling opt-in location features. In addition, some users may have opted in to the use of cell tower and Wi-Fi data being used to improve location services.
[Image: left, iOS (Apple); middle and right, Android (Google)]
Apps and App Partners. Many people have installed apps with location-based features, such as weather alerts, ridesharing, or grocery delivery. In many cases, this location data is shared with partners in order to provide personalized ads or to monetize the free app. Many apps use Software Development Kits (SDKs), or code developed by third parties. Frequently, location data is shared with these SDK providers to improve their service or in exchange for monetization or other services.
Location Analytics Providers (Internet of Things). Connected devices emit identifying information that allows them to be tracked, even when they are not actively connected to a network. This includes mobile phones (when Wi-Fi or Bluetooth are turned on), but also other Internet of Things (IoT) devices such as fitness trackers, smart toys, or vehicles. As a result, many airports, stadiums, and brick-and-mortar stores analyze this signal data to better understand when their busiest hours are, where the highest in-store foot traffic is, what products customers show an interest in, or how long people are waiting in lines.
How is location data collected?
When most people think of location data, they think of GPS (Global Positioning System). In fact, GPS is only one of many ways to infer where devices are located, most of which are used in some combination by carriers, operating systems, apps, and others. Commonly used methods include: GPS; Cell Towers; Wi-Fi Networks; and Beacons (among others). Each provides a different level of precision and can be used for different purposes:
GPS. Smartphones and other devices can detect location via satellite GPS independently of any telephone or internet reception, although a phone’s GPS chip is only one sensor among many. The accuracy of GPS signals varies widely, and can be affected by weather or physical interference. For example, it is much less accurate in urban areas, and especially poor for detecting specific locations inside large buildings. As a result, modern cell phones use GPS in combination with other forms of location signal (Wi-Fi, Bluetooth) at various times to create a more accurate location determination.
Cell Towers. The main function of cell towers is to let carriers provide cell service. As a result, mobile carriers (such as AT&T, Sprint, Verizon, T-Mobile, and many others in the United States) know approximately where devices are located because they know which cell towers the devices connect with. In addition to this core function, cell towers also emit unique “Cell Tower IDs,” which can be freely detected. There are many private and public databases of the Cell Tower IDs associated with mapped locations of known cell towers. As a result, the proximity of nearby cell towers (and the signal strength of their IDs) can be used to infer where a device is located. Local cell towers can be looked up in OpenCellID.
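One simple way to turn tower proximity and signal strength into a position is a signal-weighted centroid. The sketch below is a simplified illustration, not the method used by any specific carrier or operating system. It assumes tower coordinates have already been looked up in a tower database, and the input shape and function name are hypothetical.

```python
def weighted_centroid(observations):
    """Estimate a device position from nearby towers.

    observations: list of (tower_lat, tower_lon, signal_dbm) tuples, where
    the tower coordinates come from a Cell Tower ID database (assumed input).
    Stronger signals (less negative dBm) pull the estimate toward that tower.
    Averaging raw latitude and longitude is only reasonable over small areas.
    """
    if not observations:
        raise ValueError("need at least one tower observation")
    # Convert dBm to linear milliwatts so that every weight is positive.
    weights = [10 ** (dbm / 10.0) for _, _, dbm in observations]
    total = sum(weights)
    lat = sum(w * obs[0] for w, obs in zip(weights, observations)) / total
    lon = sum(w * obs[1] for w, obs in zip(weights, observations)) / total
    return lat, lon
```

With two towers heard at equal strength the estimate lands at their midpoint; when one signal is much stronger, the estimate moves close to that tower.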
Wi-Fi Networks. Mobile devices can infer their location by scanning for nearby Wi-Fi networks. Nearby networks or “access points” might include, for example, neighbors’ Wi-Fi, or the Wi-Fi available in cafes and shops. Large databases exist of the unique identifiers (MAC addresses and SSIDs) of wireless routers and their known locations, with companies such as Mozilla and Combain reporting databases of millions of unique Wi-Fi networks. Despite the relatively public nature of these identifiers, most (but not all) commercial databases offer an opt-out mechanism for users who prefer that their own network not be included. In 2011, Google created an approach for opting a particular access point out of its database, which involves appending the phrase “_nomap” to the end of the wireless router’s SSID. Mozilla similarly honors the _nomap method, but other databases do not, or offer their own opt-outs.
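A database operator that honors the “_nomap” convention would need to drop matching networks before storing them. A minimal sketch of that check follows; the function names and the `(mac_address, ssid)` record shape are assumptions for illustration and are not taken from any provider's code.

```python
NOMAP_SUFFIX = "_nomap"


def is_opted_out(ssid: str) -> bool:
    """Return True if the network name ends with the '_nomap' opt-out suffix."""
    return ssid.endswith(NOMAP_SUFFIX)


def filter_opted_out(access_points):
    """Keep only access points whose owners have not opted out.

    access_points: iterable of (mac_address, ssid) pairs (hypothetical shape).
    """
    return [(mac, ssid) for mac, ssid in access_points if not is_opted_out(ssid)]
```

Because the suffix must appear at the end of the SSID, a name such as "nomap_cafe" would not count as an opt-out under this check.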
Bluetooth Beacons. Many apps are designed to detect their proximity to “beacons,” small radio transmitters that broadcast one-way Bluetooth signals. Beacons are inexpensive and can be attached to personal items (such as a person’s keys or wallet). They can also be installed at known locations, for example in a retail space or in front of a special display of products in a shop. In these cases, an app that a user has given permission to access Bluetooth can infer the device’s location or send proximity-based alerts or other content.
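Proximity to a beacon is commonly estimated from received signal strength. The sketch below uses the standard log-distance path-loss model as an illustration; the text above does not say which method any given app uses, and the default values for the calibrated 1-meter signal strength and the path-loss exponent are assumptions that vary by hardware and environment.

```python
def estimate_distance_m(rssi_dbm: float,
                        rssi_at_1m_dbm: float = -59.0,
                        path_loss_exponent: float = 2.0) -> float:
    """Estimate distance to a beacon with the log-distance path-loss model:

        rssi = rssi_at_1m - 10 * n * log10(distance)

    rssi_at_1m_dbm and path_loss_exponent are assumed calibration values;
    real readings are noisy, so the result is a rough estimate only.
    """
    if path_loss_exponent <= 0:
        raise ValueError("path_loss_exponent must be positive")
    return 10 ** ((rssi_at_1m_dbm - rssi_dbm) / (10.0 * path_loss_exponent))


def is_nearby(rssi_dbm: float, threshold_m: float = 5.0) -> bool:
    """Hypothetical trigger for a proximity-based alert."""
    return estimate_distance_m(rssi_dbm) <= threshold_m
```

A reading equal to the calibrated 1-meter value yields an estimate of one meter, and weaker readings map to larger distances.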
Combining Signals for Precision. Modern smartphones combine signals detected from the sources above to create a more precise location than any one signal (such as GPS) would provide alone. For example, iOS and Android harness the signals from many different sensors on the device, such as altimeter and accelerometer sensors, to provide a consolidated “Location Services” feature that offers highly precise location information to apps (with a user’s permission) and that users can control in Settings. Signals can also be combined to create predictive location services, for example to predict a future traffic jam, or show users upcoming attractions on their predicted path.
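Why a combined estimate can be more precise than any single source can be illustrated with inverse-variance weighting, a textbook way to fuse independent estimates. This is a simplified stand-in and not a description of how iOS or Android actually implement their location services; the input shape and function name are hypothetical.

```python
def fuse_estimates(estimates):
    """Fuse independent position estimates by inverse-variance weighting.

    estimates: list of (lat, lon, accuracy_m) tuples, where accuracy_m is
    treated as a one-sigma error radius (an assumption of this sketch).
    Returns (lat, lon, fused_accuracy_m).
    """
    if not estimates:
        raise ValueError("need at least one estimate")
    if any(acc <= 0 for _, _, acc in estimates):
        raise ValueError("accuracy_m must be positive")
    # More accurate sources (smaller radius) receive larger weights.
    weights = [1.0 / (acc ** 2) for _, _, acc in estimates]
    total = sum(weights)
    lat = sum(w * est[0] for w, est in zip(weights, estimates)) / total
    lon = sum(w * est[1] for w, est in zip(weights, estimates)) / total
    fused_accuracy_m = (1.0 / total) ** 0.5
    return lat, lon, fused_accuracy_m
```

Under these assumptions the fused accuracy radius is always smaller than the best individual radius, which matches the intuition that several imperfect signals together narrow down a position.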
Ethical and Privacy Considerations for Location Data
Lawmakers are beginning to navigate whether and how to make use of the many sources of commercial location data. As they do so, we recommend that they consider: how and in what context location data was collected (described above), as well as: the fact and reasoning behind location data being classified as legally “sensitive” in most jurisdictions; challenges to effective “anonymization”; representativeness of the location dataset (taking into account potential bias and lack of inclusion of low-income and elderly subpopulations who do not own phones); and the unique importance of purpose limitation, or not re-using location data for other civil or law enforcement purposes after the pandemic is over.
Precise location data is legally sensitive. In most jurisdictions, location data is treated as a special category of data subject to greater protections, such as heightened security standards, and the requirement of affirmative express consent. For example, the longstanding approach of the US Federal Trade Commission (FTC) has been to require affirmative consent for location data. In 2016, the FTC settled with ad platform InMobi for failing to respect users’ choice not to agree to share location data with apps. Affirmative express consent is also a feature of most US legislative proposals from 2018-2020, such as the proposed California Privacy Rights Act of 2020 and U.S. Senator Cantwell’s proposed Consumer Online Privacy Rights Act. The U.S. Supreme Court has also held that location data carries unique sensitivities because of its ability to reveal highly sensitive data about people’s behaviors, patterns, and personal life, most recently in Carpenter v. United States (requiring law enforcement to obtain a warrant for cell site location data). In the EU, access to location data is normally regulated as a matter of confidentiality of telecommunications, by the strict provisions of the ePrivacy Directive which require individual consent (with very narrow exceptions).
Precise location data is very challenging to fully “anonymize.” Many government entities are interested in gaining access to “anonymous” or “anonymous and aggregated” location data, to observe population-level trends and movements. While in some cases this is possible, it is very challenging to make any dataset of individual precise location data truly “anonymous.” Even if unique identifiers are used instead of names, most people’s behavior can be easily traced back to them — for example, from the location of their home (where the device “dwells” at night). These challenges are not insurmountable, but policymakers should be very careful not to overpromise, and should treat location datasets as private, sensitive information. This means it should be subject to administrative, technical, and legal controls to ensure it remains protected and limited in who can access it and for what purposes.
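The home-location example can be made concrete with a few lines of code, which is part of why replacing names with identifiers offers limited protection. The sketch below is illustrative only: the night-time window, the grid cell size, and the record shape are all assumptions.

```python
import math
from collections import Counter

# Assumed "night" window in local time: 10 pm through 5:59 am.
NIGHT_HOURS = set(range(0, 6)) | {22, 23}


def infer_home_cell(pings, cell_deg: float = 0.001):
    """Guess where one pseudonymous device 'dwells' at night.

    pings: iterable of (hour_of_day, lat, lon) tuples for a single device
    identifier (hypothetical shape). Returns the most frequently visited
    grid cell during night hours, or None if there are no night pings.
    """
    night_cells = Counter(
        (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        for hour, lat, lon in pings
        if hour in NIGHT_HOURS
    )
    if not night_cells:
        return None
    return night_cells.most_common(1)[0][0]
```

A cell of roughly this size often contains only a handful of residences, so combining the result with public address records can be enough to attach a name to the identifier.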
Even fully “aggregate” location data can sometimes be revealing. At times, even highly aggregated data about patterns of large groups of people (such as high-level heat maps) can inadvertently reveal information. In 2017, an interactive “Global Heat Map” of movements of users of the Strava fitness app inadvertently revealed the locations of deployed military personnel at classified locations. This incident highlights some of the wider ethical issues associated with open data and default public data sharing. In FPF’s privacy assessment of the City of Seattle, we recommended that companies thoroughly analyze all risks, not only risks to privacy and re-identification, but also to “group privacy,” and impact on other values such as data quality, fairness, equity, and public trust.
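One common safeguard for aggregated maps is to suppress cells that contain too few people before publishing. The sketch below illustrates that idea only; the threshold value and the input shape are assumptions, and suppression alone would not necessarily prevent an incident of the kind described above, since a sparsely populated area can still stand out.

```python
def suppress_small_cells(cell_counts, min_users: int = 10):
    """Drop grid cells with fewer than min_users distinct users.

    cell_counts: dict mapping a grid cell to a distinct-user count
    (hypothetical shape). min_users is an illustrative threshold.
    """
    if min_users < 1:
        raise ValueError("min_users must be at least 1")
    return {cell: n for cell, n in cell_counts.items() if n >= min_users}
```

Choosing the threshold is a policy decision as much as a technical one, because a higher value protects small groups at the cost of removing data about remote areas.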
Representativeness and bias are uniquely important for location datasets. Unfair data processing practices involving geolocation fall disproportionately on marginalized and vulnerable communities. As such, heightened privacy protections are especially critical for these groups. Voluntary apps are more likely to capture affluent communities. For example, a municipal authority released a mobile app, ‘Street Bump,’ in an attempt to crowdsource data on which roads it needed to repair. However, affluent citizens downloaded the app more than people in poorer neighborhoods. As such, the system reported a disproportionate number of potholes in wealthier neighborhoods, and could have led the city to distribute or prioritize its repair services inequitably. In contrast, mobile phone carrier data may be more representative, but may still miss many of the elderly, very young, or lowest income people who may not own cellphones.
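A basic representativeness check compares each group's share of the dataset with its share of the population. The sketch below is a minimal illustration; the function name and the group labels used with it are hypothetical, and any numbers passed in would need to come from real census and dataset counts.

```python
def representation_ratios(sample_counts, population_counts):
    """Compare each group's share of a dataset with its population share.

    Both arguments map a group or neighborhood name to a count
    (hypothetical shape). A ratio near 1.0 means the group is represented
    in proportion to its size; well below 1.0 means under-represented.
    """
    sample_total = sum(sample_counts.values())
    population_total = sum(population_counts.values())
    if sample_total == 0 or population_total == 0:
        raise ValueError("counts must not sum to zero")
    ratios = {}
    for group, population in population_counts.items():
        if population == 0:
            continue  # no meaningful share to compare against
        sample_share = sample_counts.get(group, 0) / sample_total
        population_share = population / population_total
        ratios[group] = sample_share / population_share
    return ratios
```

For instance, if two neighborhoods have equal populations but one contributes 80% of the app's reports, its ratio is 1.6 and the other's is 0.4, a signal that decisions based on the raw counts would favor the first.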
Purpose limitation is uniquely important in a crisis. Purpose limitation is a core guiding light of the US-based Fair Information Practice Principles (FIPPs) and the EU’s General Data Protection Regulation (GDPR). Because location data is sensitive and challenging to truly “de-identify” (i.e. to significantly reduce or eliminate all privacy risks), there is a serious concern that once collected by a public health agency for pandemic tracking, it could be retained or used for other purposes. Governments should consider how the location data was collected in the first instance (with users’ knowledge or consent?), and if the decision is made to repurpose it for pandemic tracking, it should be clearly siloed for that purpose and not re-used or retained for other civil or law enforcement uses. Researchers or agencies should have clear policies and procedures in place that describe the operational and technical aspects of data management.
Conclusion
As COVID-19 continues to spread, we are facing global challenges to existing norms and best practices for data collection and use. In some cases, location and mobility data might provide one path to better understanding and combatting the pandemic. Governments and researchers seeking to address concerns and risks should ask: how and in what context the location data was collected; whether it is necessary and appropriate to achieving their goals (including whether the data is truly representative of the overall population and takes into account vulnerable populations such as the elderly); whether those goals can be achieved through less invasive means; and how that data will be used, safely stored, retained, or re-purposed following the conclusion of the pandemic.
Image Attribution: “My New York heat map” by matteoc is licensed under CC BY-NC-SA 2.0.
Additional Resources:
AAAS Science Article on How Aggregated Mobility Data could Help Fight COVID-19 (Mar. 23, 2020) (arguing that mobile location data is useful to battling the pandemic, but advocating against the use of individual-level data);
Nature Article on The Privacy Bounds of Human Mobility (Mar. 25, 2020) (revealing the lack of anonymity associated with even coarse or aggregated mobile location datasets);
FPF’s blog post on the FTC Settlement with InMobi (June, 2016) (outlining the InMobi settlement for misrepresenting the fact that they were collecting location data via Wi-Fi networks);
FPF’s City of Seattle Open Data Risk Assessment (Jan. 30, 2018) (providing tools and guidance to the City of Seattle and other municipalities navigating the complex policy, operational, technical, organizational, and ethical standards that support privacy-protective open data programs);
SharedStreets’ Post on Using Location Data for Guiding Micromobility Outcomes (Mar. 26, 2019);
European Data Protection Board, 1/2020 Guidelines on Processing Personal Data (Feb. 2020) (providing guidance in the context of connected vehicles and mobility related applications).
“There’s no question that schools and institutions are struggling to manage this unprecedented situation and need as much support and information as possible to do their jobs,” said Amelia Vance, FPF’s Director of Youth and Education Privacy. “The Future of Privacy Forum is tracking the situation closely in an effort to anticipate and help address the challenges that schools may encounter as they work to navigate the COVID-19 pandemic, and we expect to release additional resources in the days ahead.”
“As our nation’s public school superintendents navigate through the extraordinary set of circumstances they face in light of COVID-19, AASA remains committed to gathering, creating, and disseminating as many resources as possible to answer, to the best of our ability, the myriad questions they raise,” said Noelle Ellerson Ng, AASA’s Associate Executive Director for Advocacy & Governance. “Through our work with FPF, we are happy to provide this collection of frequently asked questions in the context of student data and privacy and FERPA. Protecting student data and privacy is just one of the many factors they need to consider, and we are pleased to have the opportunity to share this resource today.”
The white paper offers insight into how the health or safety emergency exception under the Family Educational Rights and Privacy Act (FERPA) allows schools to share students’ personally identifiable information (PII) with the community and relevant officials during the COVID-19 pandemic.
According to FPF and AASA, under the FERPA health or safety emergency exception, “if a school determines that there is an articulable and significant threat to the health or safety of a student or other individuals and that someone needs PII from education records to protect the student’s or other individuals’ health or safety, it may disclose that information to the people who need to know it without first gaining the student’s or parent’s consent.”
The white paper also addresses a number of frequently asked questions, including:
If a student has COVID-19, what information from education records can the school share with the community?
If the school suspects that a student has COVID-19, what information can the school share with its community?
If a school suspects that a student may have COVID-19, can school officials contact the student’s primary care physician?
If a student has COVID-19 and the school’s health records are covered by HIPAA rather than FERPA, what information may the school disclose to its community?
What if the school receives a voluntary request from a local, state, or federal agency for student records to assist the agency in responding to the COVID-19 outbreak?
What should a school do if it receives a request under a mandatory reporting law to share student health records with a public health agency?
Do interagency agreements with other state or local agencies allow schools to disclose education records without obtaining consent?
To read the white paper, click here. To learn more about the Future of Privacy Forum’s student privacy work, click here.