FPF Welcomes New Senior Fellow

FPF is pleased to welcome Stanley W. Crosley as a senior fellow. Stanley has over 20 years of applied experience in law, data governance, and data strategy across a broad range of sectors, including inside multinational corporations, academia, large-firm and boutique law practices, not-for-profit advocacy organizations, and governmental agencies. He is the Co-Director of the Indiana University Center for Law, Ethics, and Applied Research in Health Information (CLEAR), Counsel to the law firm of Drinker Biddle & Reath in Washington, DC, and Principal of Crosley Law Offices, LLC. Stan is also a Senior Strategist at the Information Accountability Foundation and a Senior Fellow at the Future of Privacy Forum, where he leads health policy efforts. He is an Adjunct Professor of Law at the Maurer School of Law, Indiana University, and lectures on global privacy and data protection, health privacy, and data strategy.

From 2014 to 2016, Stan served as Co-Chair of the federal Privacy and Security Workgroup for the Health and Human Services Office of the National Coordinator for Health Information Technology (HHS/ONCHIT), providing guidance on health privacy and data security to HHS and the White House. Stan also serves on the board of The Privacy Projects and as Co-Chair of the Research Committee for C-Change. He is the former Chief Privacy Officer for Eli Lilly and Company and a former board member of the International Association of Privacy Professionals (IAPP), the International Pharmaceutical Privacy Consortium, and Indiana Health Informatics Technology, Inc. He served on the IOM HIPAA and Research Workgroup that delivered the first report on the research implications of HIPAA, and on the Brookings Institution workgroup providing guidance on the FDA’s Sentinel project.

In his global legal practice, and in his activity as a speaker and lecturer, Stan provides thought leadership and practical guidance on topics ranging from the full spectrum of health data issues, applied and innovative technology, and privacy to data strategy and applied data ethics. He works with a network of regulators from the US, the EU, and every region of the world, as well as with peers in multinational companies and start-ups in the technology, biopharma, medical device, biotech, communications, social media, and health provider industries.

Please join us in welcoming Stanley to the team!

Consumer Reports Publishes Initial Findings for Privacy and Security of Smart TVs

Today, Consumer Reports released their initial findings on the privacy and security aspects of Smart TVs. Applying their Digital Standard (developed with Ranking Digital Rights and other partner organizations), Consumer Reports identified a range of important privacy aspects and potential security vulnerabilities in Smart TVs from five leading manufacturers (Sony, Samsung, LG, TCL, and Vizio).

As we noted last week in our discussion of Smart TVs, it can be challenging for even well-informed buyers to locate and fully understand the data policies of their TVs. This is complicated by the fact that data from internet-connected TVs may be collected and processed by multiple entities (manufacturers, operating system providers, and third-party apps such as Netflix and Hulu). In addition, the market for TV data is still relatively nascent, although growing rapidly. As buyers become attuned to privacy and security features of all connected devices — including the broader Internet of Things (TVs, smartphones, smart homes, and connected cars) — we expect that the market for secure, privacy-conscious consumer technology will improve and grow.

In response, Roku has expressed its disagreement with Consumer Reports’ characterization of a potential security vulnerability, and Consumer Reports will be conducting a live Q&A Privacy Hour this evening (8-9pm ET / 5-6pm PT) about digital privacy, smart TVs and toys, and best practices.

Looking Ahead

We expect Consumer Reports to continue studying and publishing their findings related to privacy and security features of connected devices. Here are some things we look forward to seeing:

Overall, we commend Consumer Reports on their important work. As internet-connected home devices continue to proliferate, these initial findings represent an important milestone in making privacy and security features accessible to consumers, researchers, and advocates.

FPF Welcomes New Interns

Lin-hsiu Huang

Lin is our Communication/Design intern for the Spring Semester. Lin is originally from Kaohsiung, Taiwan, and she is pursuing a dual degree in BFA Art/Design and BS Mathematics at Morehead State University. She has presented her advocacy research and productions – documentaries and music videos – at over a dozen conferences (e.g. Posters on the Hill, the Appalachian Studies Conference, the Southern Honors Council Regional Conference). Lin works with Melanie Bates, the Director of Communications, and she will be focusing on various design and re-branding tasks such as the monthly newsletter, the Annual Report, and one-pagers, as well as updating FPF’s privacy calendar and monitoring its website analytics.

Esther Lim

Esther is studying for her Master of Laws in Global Health Law at the Georgetown University Law Center, where she previously served as a Global Health student fellow and a research assistant at the O’Neill Institute for National and Global Health Law. She earned her Bachelor of Laws degree from University College London. She previously worked as a Health Policy Analyst in Singapore’s Ministry of Health, working on issues of regulatory policy and health information.

Seeing the Big Picture on Smart TVs and Smart Home Tech

CES 2018 brought to light many exciting advancements in consumer technologies. Without a doubt, Smart TVs, Smart Homes, and voice assistants were dominant: LG has a TV that rolls up like a poster; Philips introduced a Google Assistant-enabled TV designed for the kitchen; and Samsung revealed its new line of refrigerators, TVs, and other home devices powered by Bixby, its intelligent voice assistant. More than ever before, companies are emphasizing “seamless connectivity” between TVs and other connected home devices. In other words, users will be able to instruct their TV to dim the lights, display footage from their home security camera, show who is standing at the front door, or even see what’s inside the fridge – all features envisioning the TV as the command center of the ideal, futuristic Smart Home.

In the midst of this ongoing explosion in “intelligent” consumer electronics, we continue to see concern about TVs that “listen,” TVs that allegedly “spy,” and more recently, mobile apps that can “hear” what might be playing on the TV or happening in users’ living rooms. It can be challenging to distinguish accurate reports from inaccurate ones (for example, we wrote in 2015 about Samsung TVs and the confusion that stemmed from a privacy policy that was misinterpreted by many journalists).[1]

Nonetheless, Smart TVs are raising serious data privacy questions that apply broadly to all Smart Home devices – for example, how long should a manufacturer be responsible for installing software updates to keep an Internet-connected TV secure? Do buyers fully understand what kinds of data their TV manufacturer is collecting, and how to control it? How much information should be presented on the box before purchase? It is critical to identify solutions that maximize consumer benefits while avoiding privacy risks and harms.

A Deep Dive into Leading Smart TVs

In order to better understand the privacy and security issues raised by Smart TVs, we recently had the opportunity to informally review the policies and user interfaces of 2017 models from three leading manufacturers: Sony (which uses the Android TV interface), LG, and Samsung.[2] We aimed to learn more about the privacy and security aspects of leading Smart TVs.

Overall, Smart TV data practices vary considerably. Consumer choices are not always easy to exercise, and there remains a great need for transparency and consensus around how TV data should be used. Advertising using Smart TV data, for example, is a nascent but rapidly growing industry, and many TV buyers are not yet aware of the extent to which their other activities (online and offline) may be synced with their TV viewing information in a way that informs and drives advertising. Security is also a critical aspect — in today’s TVs, software updates are not necessarily automatic, or guaranteed to continue, and it can be difficult for even a well-informed person to make a purchasing decision on the basis of a company’s security practices. Although Smart TVs promise great benefits to consumers, there is clearly more work to be done to build consensus around privacy and security.

 

Skip ahead to:

What Makes a TV “Smart?”

Smart TVs Vary in their Privacy Settings and Features

Advertising using Smart TV Data is a Nascent, but Growing Industry

Conclusions

 

What Makes a TV “Smart?”

Smart TVs – or, as they are often promoted, “intelligent TVs” – are TVs that connect to the Internet to allow users to access streaming video services (such as Netflix or Hulu) as well as other online media and entertainment, such as music, on-demand video, and web browsers. Many Smart TVs have their own App Stores, making them more similar to large-screen computers than traditional displays. Many are now integrating connectivity with other Smart Home technologies like lights, baby monitors, or kitchen appliances.

In addition to the wide variety of new entertainment options, a less appreciated benefit of Smart TVs is the ability to generate accurate, reliable TV viewing measurement data. Historically, TV viewing was difficult to measure, resulting in efforts by companies such as Nielsen to encourage families to voluntarily track their TV viewing habits. The inevitable result, as discussed by several speakers in the Federal Trade Commission’s recent workshop, was that only the most popular and mainstream TV viewing would typically be measured accurately.

In the last ten years, as TVs and streaming media have become more sophisticated, it has become possible to measure less popular or even obscure content. The ability to know what kinds of content people are actually interested in, even if it isn’t mainstream, has allowed for greater investment in content that previously would have been too risky – for example (as discussed by Samba TV’s Ashwin Navin at the FTC’s Smart TV workshop), Arrested Development (canceled and then re-launched by Netflix), or The Mindy Project (picked up by Hulu).

Nonetheless, the same data collection that allows for accurate TV viewing measurement often creates concerns around individual privacy. For example, individual data can be used to create detailed profiles based on viewing habits, sometimes in expected ways (e.g. Netflix suggestions), and sometimes in unexpected ways.

Smart TVs Vary in their Privacy Settings and Features

Are all Smart TVs the same with respect to their data practices? In many ways, they are not. The TVs we unboxed and set up had significant differences in privacy features, including things like: whether relevant policies are easily available; whether and how users are prompted to consent to data collection and uses; whether users can delete their personal data; and whether software updates are installed automatically. Notably, some TV manufacturers run software from other companies (e.g. Sony TVs that run Android TV), while other manufacturers operate their own platforms and actively collect data for advertising and other purposes. Hardware manufacturers are responsible for many of the things buyers care about, like screen size, picture quality, and durability, but when it comes to data privacy, buyers should also think about the operating system and apps.

There are also some issues applicable to all modern TVs that deserve greater attention – specifically, the fact that digital advertising using TV data is a rapidly growing industry that has not yet developed consensus around privacy norms. These important questions about data privacy have broader implications for other connected devices in the Internet of Things (IoT) and the Smart Home.

Key privacy and security issues:

Relevant privacy policies are not always available.

A minimum standard for any connected device is the existence of a relevant, accurate, and, as far as possible, easy to comprehend privacy policy. Despite longstanding critiques of privacy policies – i.e. that many or most people do not read them – written policies nonetheless play a crucial role in U.S. privacy law. In addition to providing the basis for enforcement by the Federal Trade Commission if companies don’t keep their promises, they provide information to researchers, tech journalists, and privacy advocates, who routinely analyze and compare practices.

All Smart TVs that we reviewed provided users with access to privacy policies within the TV settings menu (although they varied somewhat in ease of navigation and text size). Some Smart TV manufacturers also provide access to their policies outside of the TV itself. For example, Samsung, Vizio, and Google (whose Android TV software powers devices offered by several manufacturers) make policies available online. In contrast, 2017 LG TVs provide a detailed privacy policy on the device, but it does not appear to be available on LG’s website or outside of the TV interface. No major manufacturers provide comprehensive privacy statements on TV packaging, which can make it challenging for prospective buyers in physical retail stores to compare privacy policies across brands.

Unsurprisingly, the TV manufacturer with the most readily available and clear Privacy Policy is most likely Vizio, which is also the only TV manufacturer to date (that we know of) whose TVs have been the subject of regulatory investigations, including a $2.2M settlement with the U.S. Federal Trade Commission and New Jersey Attorney General. Smart TV manufacturers would do well to follow the example of increased transparency and consumer education, and leaders in this space could go much further and present information “on the box” (as we called for in our connected toys report) so that prospective buyers can make well-informed decisions.

Automated Content Recognition (ACR) is a common feature of Smart TVs.

All Smart TVs that we reviewed – and probably, nearly all modern Smart TVs – are equipped with automated content recognition (ACR) technology. ACR is usually built into the TV software, but it is also present in many third-party apps. An early example of ACR technology is Shazam, the popular music recognition app that is now available on many leading TVs.

Generally, ACR technologies use one of two methods: fingerprinting or watermarking. The most common method, audio/video-based fingerprinting, relies on periodically extracting a “fingerprint” of unique characteristics of the content being watched, and sending it to a third party matching service to identify the content. Watermarking, in contrast, relies on the content creator to embed a unique “overlay” or “watermark” (often imperceptible) into the audio or video file so that it can be recognized again in the future.

Fingerprinting vs. Watermarking

Fingerprinting:

  • Audio or video-based
  • Extracts a set of unique properties from an audio or video signal (does not add any information)
  • Requires comparison with an external “matching” database to identify the content

Common uses:

  • TV viewing measurement
  • TV interactivity features (e.g. in-show trivia, live polls, or identifying songs or actors’ names during a show)
  • Behavioral advertising
  • Detecting copyright infringement

Watermarking:

  • Audio or video-based
  • Adds a unique “overlay” or “watermark” of data (either perceptible or imperceptible), embedding it within an audio or video signal
  • Requires knowing what you’re looking for, i.e. allowing content creators to identify their own content

Common uses:

  • Tracking individually owned content
  • Identifying content creators
  • Tracing proprietary media (anti-piracy), e.g. in cinema distribution
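To make the fingerprinting workflow more concrete, here is a minimal, hypothetical sketch in Python. The toy signal, function names, and hash-based tokens are all invented for illustration; real ACR systems derive robust perceptual features (such as spectral peaks) rather than exact hashes, but the division of labor is the same: the device computes compact tokens, and a remote matching service identifies the content.

```python
# A minimal, illustrative sketch of fingerprint-based ACR (not any vendor's
# actual algorithm). The TV derives compact "fingerprint" tokens from the
# content itself and sends only those tokens to an external matching service.
import hashlib

def fingerprint(samples, window=256):
    """Split a signal into fixed windows and hash each one into a short token.
    Real ACR systems hash robust perceptual features (e.g. spectral peaks) so
    that matching survives compression and noise; exact hashing is used here
    only to keep the sketch simple."""
    tokens = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        tokens.append(hashlib.sha1(repr(chunk).encode()).hexdigest()[:12])
    return tokens

def match(tokens, reference_db):
    """Server-side 'matching service' step: score each known title by how many
    fingerprint tokens it shares with the submitted clip."""
    scores = {title: len(set(tokens) & set(ref)) for title, ref in reference_db.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Hypothetical reference database built in advance from known broadcast content.
show_a = [(i * 37) % 101 - 50 for i in range(4096)]
show_b = [(i * 53) % 89 - 44 for i in range(4096)]
reference_db = {"show_a": fingerprint(show_a), "show_b": fingerprint(show_b)}

# "Viewing" a clip of show_a: only the tokens leave the device, not the audio.
clip = show_a[1024:3072]
print(match(fingerprint(clip), reference_db))  # ('show_a', 8)
```

Watermarking, by contrast, would not require a reference database of fingerprints: the identifying data is embedded in the content itself by its creator and simply read back out.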

Most Smart TVs provide users with notice of ACR data collection through on-screen notices, but the policies typically describe the collection of “Viewing Information” or “Viewing History,” providing little detail about what the ACR technology collects or how it works. Some examples of how ACR-enabled viewing data is described in on-screen policies are on file with the author.

Some manufacturers also describe ACR data collection in privacy policies that users can access online; the most comprehensive description of ACR technology and its related uses appears in Vizio’s Privacy Policy, which contains a Viewing Data Supplement. The data practices and privacy disclosures of Vizio Smart TVs have been the subject of substantial public scrutiny, regulatory investigations, and a settlement with the U.S. Federal Trade Commission and New Jersey Attorney General.

TVs vary in how they obtain consent to collect and use ACR data.

A key principle of U.S. privacy law is that technology providers should ask users for their consent prior to the collection or use of their personal information, especially sensitive information such as granular TV viewing data. However, what this consent should look like, and how to structure users’ choices, is a frequent source of debate.

Occasionally, offering choices is not practical, because data might be necessary for a device to work as intended – for example, any Internet-connected device necessarily sends an IP address and MAC address in order to connect to a network. In contrast, automated content recognition data (“ACR Data”), while it may enable certain benefits, is not necessary for the TV to function. Furthermore, ACR involves the collection of sensitive information about everything that viewers are watching and when. As a result, it is appropriate for TV manufacturers to offer robust choices around the collection of this kind of data.

In the TVs that we “unboxed,” notice and consent for the collection of ACR data varied. For example, Samsung TVs ask users to opt in to optional data collection during set-up. In contrast, the LG TV presented a basic privacy statement during set-up, and then asked for additional permissions later when we tried to use the specific features that required those extra permissions. In general, these sorts of “just in time” notices reflect a more privacy-conscious design. Although in some ways it makes sense to place all the information “up front,” the set-up process is also a time when users are eager to get the device running, and not necessarily well-positioned to distinguish between routine terms of service (which may be required to set up the TV at all) and optional privacy choices related to added benefits.

Software updates (a key component of good security) are not necessarily automatic or guaranteed to continue.

As TVs become more like computers, a growing issue is the extent to which manufacturers have an obligation to continue supporting software and pushing updates to fix security vulnerabilities. As any smartphone user knows, receiving persistent reminders for app and OS updates can be frustrating, but updating software is crucial to good security.

In leading 2017 TVs, security updates are possible, but not necessarily automatic by default. In addition, it is not clear how often manufacturers push updates for security vulnerabilities. Most TV manufacturers have bug bounty programs (for example, Samsung’s Smart TV bug bounty program; Google’s vulnerability rewards program, which applies to Android TV; and Sony’s Secure@Sony program), which provide an incentive for independent security researchers to report security flaws so that companies can fix software before consumers are affected. However, without automatic updating or prominent notices on the TV interface, it can be difficult to ensure that TV buyers take the steps necessary to secure their devices.

Finally, many manufacturers do not yet make explicit assurances to their customers about how long they will continue to support older TV models. Given that the average lifespan of a TV is around 7-10 years, it is crucial that Smart TVs, with all of their added connectivity and software-dependent services, continue to be updated for a reasonable time. Furthermore, with enough transparency, software support can be a powerful selling point for a budget-impacting purchase. The importance of Smart TV security is heightened as the newest TVs become linked to other devices in smart homes.

Smart TVs vary in policies for data retention and deletion.

Finally, as with all connected devices, a key question for Smart TV providers is how they should handle users’ requests for deletion of accounts and associated data. Deletion of data is not only a practical consideration – for example, if a buyer decides to sell or re-purpose a TV – but increasingly viewed as an aspect of consumer privacy rights.

Leading Smart TV manufacturers have very different policies regarding data retention and users’ opportunity to meaningfully delete their personal information:

TV Manufacturer – On-Screen Retention Policy

Sony (Bravia): “You can stop uploading the TV usage logs at any time in [Help] -> [Privacy setting]. If this uploading is disabled, the above information about the use of this TV will no longer be uploaded to Sony Corporation . . . Information already uploaded to Sony Corporation with a unique number shall be deleted or converted to anonymized statistical data within approximately six months. The viewing history data stored in this TV will also be deleted and as a result the functions which use viewing history data may not be available (such as “Popular” program recommendations).”

Samsung: “Interest based advertisements will be linked to a randomized, non-persistent, and resettable device identifier called the “PSID.” You may reset your PSID at any time by visiting the settings menu, and once reset, your viewing history and Smart TV usage information will be cleared and de-linked.”

LGE: “We will take reasonable steps to make sure that we keep your personal information for as long as is necessary for us to provide you with LG Smart TV Services or for the purpose for which it was collected, or as required by law.”

Vizio*: “If you request removal of Personal Information, you acknowledge that residual Personal Information may continue to reside in VIZIO’s records and archives, but VIZIO will not use that Personal Information going forward for commercial purposes.”

* We did not unbox a Vizio TV – the above quote is from Vizio’s online Privacy Policy.

 

Digital advertising using Smart TV data is a nascent, but rapidly growing, industry.

While Smart TV data can be used for a wide range of useful features (including measurement, recommendations, and interactivity features such as in-show trivia, polling, or song recognition), it can also be used for personalized advertising in potentially unexpected or intrusive ways. As a result, there is a need for greater transparency, understanding, and consumer education on issues of TV data privacy.

Programmatic advertising, while well-established in the online ecosystem, is still nascent and growing rapidly for Smart TV data. There are two sides to TV data and advertising: (1) the use of TV viewing data for serving advertisements elsewhere, such as on associated devices; and (2) the use of data from other sources (online browsing behavior on associated devices, social media activities, or demographics) to display an advertisement on a TV. Although both activities may be surprising to consumers, they carry different implications for individual privacy.

In discussions around best practices, processors of TV data are often inclined to apply the same, or similar, standards as those that exist for online behavioral advertising, such as the Network Advertising Initiative’s Code of Conduct. Although similarities exist, direct application of standards for online advertising may not be appropriate unless they take into account key differences:

Finally, it was surprising to note that several leading Smart TVs are integrating advertising into their main user interfaces. In other words, the TV’s main screens are being designed to contain static placements for digital advertisements (whether personalized or otherwise). For many, this will be a serious downside that might not be understood at the time of purchase. Will we start to see differential pricing for Smart TVs – a less expensive TV with advertisements, and a more expensive TV without advertisements? Although this would not necessarily be unusual for connected devices, it remains to be seen how TV buyers would respond to such a pricing model.

Conclusions

Overall, the industry for Smart TVs and Smart TV data, much like the broader “Internet of Things” ecosystem, is relatively nascent. In the absence of baseline privacy legislation that would provide minimum standards for commercial collection and use of personal information, there is little consensus or consistency between different TV manufacturers about the appropriate ways to collect and use data. Smart TVs promise a range of benefits and interactive features – but are also collecting data for advertising and commercial purposes that might surprise many Smart TV users.

Unfortunately, even well-informed prospective buyers of Smart TVs do not yet have easily available tools to compare TVs on the basis of their privacy and security features. As more consumers start to use Smart TVs as a central hub for connected home devices, good security is also critical. Ironically, strong security practices can make it more difficult for independent researchers to evaluate privacy features. For example, a side effect of the increased use of SSL encryption (an important security safeguard for well-designed connected devices) is that security researchers cannot easily inspect the data being sent and received by a Smart TV.

We look forward to Consumer Reports’ emerging Digital Standard, which promises to provide such tools for prospective buyers to compare connected devices on many key privacy aspects – such as, for example, the existence of clear policies, or the default settings of ACR. Inevitably, it will be difficult for outside observers to compare Smart TVs on the basis of their internal business practices (especially as data is increasingly better secured and challenging to assess from external observation). For these reasons, independent trusted organizations will likely play a key role in addressing these challenges in years to come. By working towards greater transparency and stronger privacy commitments, Smart TV manufacturers can help build the consumer trust on which this growing market depends.

 

 

[1] For more information on the 2015 Samsung concerns and other privacy issues related to voice data and speech-enabled devices, read Future of Privacy Forum’s 2015 report, “Always On: Privacy Implications of Microphone-Enabled Devices.”

[2] In evaluating Smart TVs, we approached each 2017-model TV from the perspective of an average user, relying only on public-facing documents and the communications presented in the TV’s user interface. Although some of the companies discussed here are also supporters of FPF, we applied the same approach to all TVs and believe that we have presented a fair, accurate comparison of key privacy and security aspects.

If You Can't Take the Heat Map: Benefits & Risks of Releasing Location Datasets

Strava’s location data controversy demonstrates the unique challenges of publicly releasing location datasets (open data), even when the data is aggregated.

This weekend, the Washington Post reported that an interactive “Global Heat Map,” published by fitness data company Strava, had revealed sensitive information about the location and movements of servicemen and women in Iraq, Syria, and other conflict zones. The data in Strava’s Global Heat Map originates from individual users’ fitness tracking data, although the company took steps to de-identify and aggregate the information. The resulting controversy has highlighted some of the serious privacy challenges of publicly releasing open data sets, particularly when they are based on sensitive underlying information (in this case, geo-location).

Until recently, almost all conversations around open data related to the risks of re-identification, or the possibility that individual users might be identified from within an otherwise anonymous dataset. The controversy around Strava demonstrates clearly that risks of open data can go far beyond individual privacy. In addition to the identity of any individual user, location data can also reveal: the existence of – or activities related to – sensitive locations; group mobility patterns; and individual mobility patterns (even if particular people aren’t identified).

As we recommended in our recent privacy assessment for the City of Seattle, companies should thoroughly analyze the range of potential unintended consequences of an open data set, including not only risks to individual privacy (re-identification) but also societal risks: data quality, fairness, equity, and public trust.

What happened?

Strava is a San Francisco-based company that provides location-based fitness tracking for runners and cyclists (calling itself the “social network for athletes”). Users can download Strava’s free app and use it to directly map their workouts, or pair it with a fitness device, such as a Fitbit. In this sense, Strava is very similar to dozens of other popular fitness tracking apps, such as MapMyRun, RunKeeper, or Nike+ Run Club.

In providing this service, Strava collects precise location data from users’ smartphones as they run or cycle. Like many fitness apps, Strava makes this data “public” by default, but provides adjustable privacy controls for its community-based features, such as public leaderboards, visibility to nearby runners, and group activities. In fact, Strava deserves some credit for providing more granular privacy controls than most fitness apps — they allow users to selectively hide their activities from others, or to create “Privacy Zones” around their home or office. While there is undoubtedly work to be done to align defaults with users’ expectations (we note, for example, that Strava’s request for access to location data does not mention public sharing, but asks for location “so you can track your activities”), many users of fitness apps will be familiar with this kind of setup, and able to exercise informed privacy choices.

Apart from users’ privacy controls with respect to other athletes, Strava itself maintains users’ historical location data that it collects in the process of providing its fitness tracking. Like nearly all location-based apps, Strava states in its Privacy Policy that it has the right to use de-identified and aggregated location information for its own purposes, including selling or licensing it, or using it for research or commercial purposes. Because de-identified information is considered by some to present fewer privacy risks (or none at all), this is a common statement that can be found in many privacy policies.

Facing Heat for Heat Maps

In 2017, Strava began publishing an updated “Global Heat Map” of the aggregated jogging and cycling routes of its 27 million users. As far as we can tell, the Heat Map comprises the anonymous location data collected and aggregated from all users, even when they have enabled the app’s primary privacy controls vis-à-vis other athletes — although according to Strava’s support page, the Heat Map excludes “private activities” (activities that users have set to be totally private) and “zones of privacy” (users’ activities within areas that they have specifically submitted to Strava to be excluded, such as home and work addresses).

Per Strava, the aggregated patterns of its users are meant to be used to “improve bike- and pedestrian-friendly infrastructure in your area . . . [including for] departments of transportation and city planning groups to plan, measure and improve infrastructure for bicyclists and pedestrians.”

Strava is not alone in this endeavor — there is a growing industry for location data from a wide variety of sources, including fitness trackers and location-based mobile apps. For example, Uber Movement promises to harness the power of its user base by providing “anonymized data from over two billion trips to help urban planning around the world.” Similarly, many state and local governments are partnering with Waze to share mobility data and reduce traffic.

Strava allows users to opt out of contributing to the Global Heat Map, although the option is only available in the online dashboard (not the app).

Despite Strava’s removal of personal information from the data, and their aggregation (to the level of the street grid), careful observers noticed last week that the Heat Map contained location patterns from within U.S. military bases in active conflict zones, including Syria and Afghanistan. While the existence of many of these military installations is already known, others have noted that the Heat Map revealed other “airstrips and base-like shapes in places where neither the American-led military forces nor the Central Intelligence Agency are known to have personnel stations.” (NYTimes)
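The Heat Map episode illustrates a general point about aggregation: snapping individual GPS points to coarse grid cells and publishing only per-cell counts removes names, but it does not hide activity in places where nobody else is generating data. Below is a minimal, hypothetical sketch of this kind of grid aggregation; the coordinates, cell size, and variable names are invented for illustration and do not reflect Strava’s actual pipeline.

```python
# Minimal, hypothetical sketch of heat-map-style aggregation: GPS points are
# snapped to coarse grid cells and only per-cell activity counts are kept.
# This is not Strava's actual pipeline; it only illustrates why aggregation
# alone does not conceal activity in otherwise empty regions.
from collections import Counter

def aggregate(points, cell_size=0.001):
    """Snap (lat, lon) points to grid cells (~100 m at this cell size) and
    count activity per cell. No user identifiers are retained."""
    counts = Counter()
    for lat, lon in points:
        counts[(round(lat / cell_size), round(lon / cell_size))] += 1
    return counts

# Invented data: dense activity in a large city, plus a handful of runs
# around an isolated site with no other users nearby.
city_runs = [(47.6062 + i * 0.0001, -122.3321) for i in range(500)]
remote_runs = [(34.0000 + i * 0.0001, 43.0000) for i in range(20)]

heat_map = aggregate(city_runs + remote_runs)

# The remote cells have low counts, but they are the only lit cells in their
# region, so the shape of the activity (e.g. a perimeter loop) stands out.
remote_cells = {cell: n for cell, n in heat_map.items() if cell[1] > 0}
print(remote_cells)
```

A privacy-zone filter could drop points near user-submitted home or work coordinates before aggregation, but, as the Heat Map shows, it does nothing for sensitive locations the publisher does not know about.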

Perhaps more importantly, the Heat Map reveals mobility patterns between and within military installations that give rise to security and safety concerns. Accordingly, the U.S. coalition against the Islamic State has stated that it is “refining” its existing rules on fitness tracking devices (WP), and a spokesperson for US Central Command has noted that the military is looking “into the implications of the map.” (The Verge)

Addressing Open Data Challenges (Aggregating is Not Always Enough)

While there is certainly an important conversation to be had about individual users having a better understanding of the information they share, it is equally important to consider the responsibilities that are involved with any public release of a large dataset. In Strava’s case, the company was under no obligation to make the data they held available to the public. They did so in order to provide an interesting, useful feature, and perhaps to demonstrate their powerful mapping capabilities. Once they decided to release the data, however, they had a responsibility to thoroughly review it for potential risks (no easy task).

These kinds of challenges are by no means new. As FPF’s Jules Polonetsky, Omer Tene, and Kelsey Finch described in Shades of Grey: Seeing the Full Spectrum of Practical Data De-Identification, some of the most (in)famous examples of “re-identification” arose from the public release of AOL search data, a Massachusetts medical database, Netflix recommendations, and an open genomics database. In each of these cases, “even though administrators had removed any data fields they thought might uniquely identify individuals, researchers . . . unlocked identity by discovering pockets of surprising uniqueness remaining in the data.” Repeatedly, researchers have shown that in a big data world, even mundane data points, such as the battery life remaining on an individual’s phone, can serve as potent identifiers singling out an individual from the crowd.
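To see how this kind of “surprising uniqueness” arises, here is a minimal, hypothetical sketch in Python. The records, field names, and attribute values are invented for illustration; the point is simply that combinations of mundane attributes, none identifying on its own, can single out most records in a dataset.

```python
# Minimal, hypothetical sketch of the "surprising uniqueness" problem: even
# with names removed, a combination of mundane attributes can isolate
# individual records. All data and field names here are invented.
from collections import Counter

records = [
    {"zip": "98101", "age_band": "30-39", "device": "phone_a", "battery": 41},
    {"zip": "98101", "age_band": "30-39", "device": "phone_b", "battery": 78},
    {"zip": "98101", "age_band": "40-49", "device": "phone_b", "battery": 41},
    {"zip": "98115", "age_band": "30-39", "device": "phone_a", "battery": 12},
]

def share_unique(records, quasi_identifiers):
    """Return the fraction of records whose combination of quasi-identifier
    values is unique in the dataset (an 'equivalence class' of size one)."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(keys)

# A single coarse attribute leaves most records blended into a crowd...
print(share_unique(records, ["zip"]))                         # 0.25
# ...but a few mundane attributes together, including something as innocuous
# as remaining battery level, make every record unique.
print(share_unique(records, ["zip", "age_band", "battery"]))  # 1.0
```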

Until recently, almost all conversations around open data related to the risks of re-identification, or the possibility that individual users might be identified from within an otherwise anonymous dataset. However, risks of open data can go far beyond individual privacy. In addition to the identity of any individual user, location data can also reveal:

  • the existence of – or activities related to – sensitive locations;
  • group mobility patterns; and
  • individual mobility patterns (even if particular people aren’t identified).

In our recent work with the City of Seattle, we explored risks related to Seattle’s Open Data Program, in which the local government releases a variety of useful data about city activities (such as 911 calls, building permits, traffic flow counts, public parks and trails, and food bank locations). While Seattle’s civic data is “open by preference,” the city recognized that “some data elements, if released, could cause privacy harms, put critical infrastructure at risk, or put public safety personnel and initiatives at risk.”

In our Open Data Privacy Risk Assessment, we recommended that any entity deciding whether (or how) to publicly release a large dataset should undergo a privacy benefit-risk assessment. Specifically, we recommended a thorough analysis of several areas of risk:

#1. Re-identification. As we describe in the Open Data Risk Assessment, one of the principal risks of open datasets is the possibility that the data might reveal sensitive information about a specific individual. Once information has been published publicly, it is difficult or impossible to retract – and as a result (per our Program Maturity Assessment), data holders should:

Although initial reports indicated that the Strava Heat Map did not reveal individual information, combining the aggregate Heat Map data with Strava’s leaderboards can potentially reveal individual users and the details of their top runs. Reportedly, a computer scientist has already developed a programmatic method for accessing individual names directly from the Heat Map datasets without relying on outside information.

#2. Data Quality. If data is going to be released publicly, the data should be (as far as possible) accurate, complete, and current. Although this is certainly more important in the context of government-held data, it is applicable to companies like Strava in the sense that the data should reliably inform future decisions. Companies can take steps to check for inaccurate or outdated information, and institute procedures for individuals to submit corrections as appropriate.

#3. Equity and Fairness. Another key aspect of open data — equally applicable to private datasets as to government-held datasets — is equity and fairness. If data is going to be released publicly to be used by others, it should be collected fairly and assessed for its representativeness. For example, Strava permits its users to opt out of contributing to the open dataset (although many may have been unaware), an important aspect of fairness.

Equally, though, datasets should be representative. In his comments to the Washington Post, the Australian researcher who first noticed problems with Strava’s Heat Map mentioned that his father had suggested he check it out as “a map of rich white people.” Although the comment was likely offhand, the issues are serious – particularly in “smart” systems that use algorithmic decision-making, bad data can lead to bad policies. For example, predictive policing and criminal sentencing tools have repeatedly demonstrated racial bias in both their inputs (historic arrest and recidivism data) and their outputs, leading to new forms of institutional racial profiling and discrimination.

#4. Public Trust. Although we might typically think of “public” trust as a government issue, trust is critical in the private sector as well. As we have seen in the resulting discussions of Strava and the collection of location data from fitness trackers, there is concern both in the military and amongst average consumers around the use of connected devices. Particularly in the absence of baseline privacy legislation, trust is critical for the growth of new technologies, including Internet of Things (IoT) devices, fitness trackers, Smart Cities, and connected cars.

Conclusion

Beyond individual privacy, the Strava Heat Map demonstrates that there are societal risks that inhere in sensitive datasets. In particular, geo-location data can reveal individual and group patterns of movement, as well as information related to sensitive locations that must be taken into account. As technology advances and we address the challenges of connected Internet of Things (IoT) devices in our homes, on our bodies, and embedded in our cities, it is more important than ever to address the privacy challenges of location and open data.

FPF Publishes Model Open Data Benefit-Risk Analysis

FPF recently released its City of Seattle Open Data Risk Assessment. This Report provides tools and guidance to the City of Seattle and other municipalities navigating the complex policy, operational, technical, organizational, and ethical standards that support privacy-protective open data programs.

Given the risks described in the report, FPF developed a Privacy Maturity Assessment in order to help municipalities around the United States better evaluate their organizational structures and data handling practices related to open data privacy.


This Report first describes inherent privacy risks in an open data landscape, with an emphasis on potential harms related to re-identification, data quality, and fairness. To address these risks, the Report includes a Model Open Data Benefit-Risk Analysis (“Model Analysis”). The Model Analysis evaluates the types of data contained in a proposed open dataset, the potential benefits – and concomitant risks – of releasing the dataset publicly, and strategies for effective de-identification and risk mitigation. This holistic assessment guides city officials to determine whether to release the dataset openly, in a limited access environment, or to withhold it from publication (absent countervailing public policy considerations). The Report methodology builds on extensive work done in this field by experts at the National Institute of Standards and Technology, the University of Washington, the Berkman Klein Center for Internet & Society at Harvard University, and others,[2] and adapts existing frameworks to the unique challenges faced by cities as local governments, technological system integrators, and consumer facing service providers.[3]



Additional Resources:


 

New Future of Privacy Forum Study Finds the City of Seattle’s Open Data Program a National Leader in Privacy Program Management

FOR IMMEDIATE RELEASE 

January 25, 2018

Contact: Melanie Bates, Director of Communications, [email protected]

New Future of Privacy Forum Study Finds the City of Seattle’s Open Data Program a National Leader in Privacy Program Management

Washington, DC – Today, the Future of Privacy Forum released its City of Seattle Open Data Risk Assessment. The Assessment provides tools and guidance to the City of Seattle and other municipalities navigating the complex policy, operational, technical, organizational, and ethical standards that support privacy-protective open data programs.

“Although there is a growing body of research on open data privacy, open data managers and departmental data owners need to be able to employ a standardized methodology for assessing the privacy risks and benefits of particular datasets,” said Kelsey Finch, FPF Policy Counsel and lead author of the Assessment.

“The City of Seattle made the decision to be ‘Open by Preference,’ making it possible for problem solvers outside of government to help us find solutions to civic challenges and improve our community’s quality of life. At the same time, we must honor the privacy of those reflected in our data. We are proud to have partnered with the Future of Privacy Forum on this effort to make sure we can both open our data and maintain the public’s trust in how we collect and use their data,” said Michael Mattmiller, Chief Technology Officer, City of Seattle.

To address inherent privacy risks in the open data landscape, the Assessment includes a Model Open Data Benefit-Risk Analysis, which evaluates the types of data contained in a proposed open dataset, the potential benefits – and concomitant risks – of releasing the dataset publicly, and strategies for effective de-identification and risk mitigation. This holistic assessment guides city officials to determine whether to release the dataset openly, in a limited access environment, or to withhold it from publication (absent countervailing public policy considerations).

“By optimizing its internal processes and procedures, developing and investing in advanced statistical disclosure control strategies, and following a flexible, risk-based assessment process, the City of Seattle – and other municipalities nationwide – can build mature open data programs that maximize the utility and openness of civic data while minimizing privacy risks to individuals and addressing community concerns about ethical challenges, fairness, and equity,” Finch said.

“The City of Seattle is very grateful to the Future of Privacy Forum for their comprehensive privacy risk assessment of our open data program, and for providing a framework within which we can enhance our existing privacy protections when releasing open data to the public,” said David Doyle, Open Data Manager, City of Seattle. “We are excited to be continually improving our open data program maturity levels where needed, and to continue to act as a role model for other municipal governments when mitigating for privacy risk during the process of releasing open data.”

FPF found that the City of Seattle Open Data Program has developed and managed robust and innovative policies around data quality, public engagement, and transparency. Specifically:

Seattle’s Open Data and Privacy programs are already collaborating on a number of initiatives related to recommendations called out in the report. The programs are also implementing an updated internal process for reviewing new open datasets for privacy risks based on the Model Open Data Benefit Risk Analysis framework. The City of Seattle is committed to this work as part of the 2018 Open Data Plan. In the coming weeks, both the Open Data and Privacy programs will assess what additional recommendations to commit to addressing in 2018 and beyond.

“The City of Seattle is one of the most innovative cities in the country, with an engaged and civic-minded citizenry, active urban leadership, and a technologically sophisticated business community,” said Finch. “By continuing to complement its growing Open Data Program with robust privacy protections and policies, the City of Seattle will be able to fulfill that program’s goals, supporting civic innovation while protecting individual privacy.”

 

###

The Future of Privacy Forum (FPF) is a non-profit organization that serves as a catalyst for privacy leadership and scholarship, advancing principled data practices in support of emerging technologies. Learn more about FPF by visiting www.fpf.org.

Examining the Open Data Movement

The transparency goals of the open data movement serve important social, economic, and democratic functions in cities like Seattle. At the same time, some municipal datasets about the city and its citizens’ activities carry inherent risks to individual privacy when shared publicly. In 2016, the City of Seattle declared in its Open Data Policy that the city’s data would be “open by preference,” except when doing so may affect individual privacy.[1] To ensure its Open Data Program effectively protects individuals, Seattle committed to performing an annual risk assessment and tasked the Future of Privacy Forum with creating and deploying an initial privacy risk assessment methodology for open data.

Today, FPF released its City of Seattle Open Data Risk Assessment. This Report provides tools and guidance to the City of Seattle and other municipalities navigating the complex policy, operational, technical, organizational, and ethical standards that support privacy-protective open data programs. Although there is a growing body of research regarding open data privacy, open data managers and departmental data owners need to be able to employ a standardized methodology for assessing the privacy risks and benefits of particular datasets internally, without access to a bevy of expert statisticians, privacy lawyers, or philosophers. By optimizing its internal processes and procedures, developing and investing in advanced statistical disclosure control strategies, and following a flexible, risk-based assessment process, the City of Seattle – and other municipalities – can build mature open data programs that maximize the utility and openness of civic data while minimizing privacy risks to individuals and addressing community concerns about ethical challenges, fairness, and equity.

This Report first describes inherent privacy risks in an open data landscape, with an emphasis on potential harms related to re-identification, data quality, and fairness. To address these risks, the Report includes a Model Open Data Benefit-Risk Analysis (“Model Analysis”). The Model Analysis evaluates the types of data contained in a proposed open dataset, the potential benefits – and concomitant risks – of releasing the dataset publicly, and strategies for effective de-identification and risk mitigation. This holistic assessment guides city officials to determine whether to release the dataset openly, in a limited access environment, or to withhold it from publication (absent countervailing public policy considerations). The Report methodology builds on extensive work done in this field by experts at the National Institute of Standards and Technology, the University of Washington, the Berkman Klein Center for Internet & Society at Harvard University, and others,[2] and adapts existing frameworks to the unique challenges faced by cities as local governments, technological system integrators, and consumer facing service providers.[3]
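As a purely illustrative sketch (not the Report’s actual methodology), the three-way decision structure described above can be thought of as a function that weighs a dataset’s expected benefits against its residual privacy risks and returns a release tier. The scores, thresholds, and tier names below are invented assumptions; the Model Analysis itself should be consulted for the real criteria.

```python
# Purely illustrative sketch of the decision structure described above: weigh
# a dataset's expected benefits against its residual privacy risks and map
# the result to one of three release tiers. The scores, thresholds, and tier
# names are invented; see the Report's Model Analysis for the actual criteria.

def release_decision(benefit: float, risk: float) -> str:
    """benefit and risk are assumed to be normalized to a 0-1 scale after a
    structured review (data types, de-identification applied, mitigations)."""
    if risk < 0.3 or benefit - risk > 0.4:
        return "release openly"
    if risk < 0.7:
        return "release in a limited access environment"
    return "withhold from publication"

# Example: a well de-identified, high-value dataset vs. riskier alternatives.
print(release_decision(benefit=0.8, risk=0.2))  # release openly
print(release_decision(benefit=0.8, risk=0.6))  # release in a limited access environment
print(release_decision(benefit=0.5, risk=0.9))  # withhold from publication
```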

FPF published a draft report and proposed methodology for public comment in August 2017. Following this period of public comment and input, FPF assessed the City of Seattle as a model municipality, considering the maturity of its Open Data Program across six domains:

  1. Privacy leadership and management
  2. Benefit-risk assessments
  3. De-identification tools and strategies
  4. Data quality
  5. Data equity and fairness
  6. Transparency and public engagement

In our analysis, we found that the Seattle Open Data Program has largely demonstrated that its procedures and processes to address privacy risks are fully documented and implemented, and cover nearly all relevant aspects of these six domains. Specifically:

Although most aspects of Seattle’s programs are documented and implemented, some aspects are not as developed. This is unsurprising, given the novel challenges posed by the intersection of open government equities and privacy interests with emerging technologies and data analysis techniques.

The Report concludes by detailing concrete technical, operational, and organizational recommendations to support the Seattle Open Data Program’s ability to identify and address key privacy, ethical, and equity risks, in light of the city’s current policies and practices. For example, we recommend that the City of Seattle and the Open Data Program:

The City of Seattle is one of the most innovative cities in the country, with an engaged and civic-minded citizenry, active urban leadership, and a technologically sophisticated business community. By continuing to complement its growing Open Data Program with robust privacy protections and policies, the City of Seattle will be able to fulfill that program’s goals, supporting civic innovation while protecting individual privacy.

[1] Exec. Order No. 2016-01 (Feb. 4, 2016), available at http://murray.seattle.gov/wp-content/uploads/2016/02/2.26-EO.pdf.

[2] See infra Appendix A for a full list of resources.

[3] See Kelsey Finch & Omer Tene, The City as a Platform: Enhancing Privacy and Transparency in Smart Communities, Cambridge Handbook of Consumer Privacy (forthcoming).


Additional Resources:


 

Public comments on proposed Open Data Risk Assessment for the City of Seattle

The Future of Privacy Forum requested feedback from the public on its proposed Draft Open Data Risk Assessment for the City of Seattle. In 2016, the City of Seattle declared in its Open Data Policy that the city’s data would be “open by preference,” except when doing so may affect individual privacy. To ensure its Open Data program effectively protects individuals, Seattle committed to performing an annual risk assessment and tasked FPF with creating and deploying an initial privacy risk assessment methodology for open data.

This report is intended to provide tools and guidance to the City of Seattle and other municipalities navigating the complex policy, operational, technical, organizational, and ethical standards that support privacy-protective open data programs. In the spirit of openness and collaboration, FPF invited public comments from the Seattle community, privacy and open data experts, and all other interested individuals and stakeholders regarding its proposed framework and methodology for assessing the privacy risks of a municipal open data program. The public comment period extended from April 18, 2017 to October 2, 2017.

Following this period of public comment, a Final Report was published that assessed the City of Seattle as a model municipality and provided detailed recommendations to support the Seattle Open Data Program’s ability to identify and address key privacy, ethical and equity risks, in light of the city’s current policies and practices.

PUBLIC COMMENTS:

FPF wishes to thank all of those who provided public comments to this draft report for their thoughtful feedback and active participation in this process. FPF received the following timely and responsive public comments via email and Madison:


Additional Resources:


From cross-border transfers to privacy engineering, check out all panels and events FPF will be a part of at CPDP2018

The Computers, Privacy and Data Protection conference (CPDP) kicks off this week in Brussels, and the theme this year is “The Internet of Bodies”. The conference will gather 400 speakers across 80 panels to set the stage for the privacy and data protection conversation in Europe for 2018. And this is such an important year for data protection – not only does the General Data Protection Regulation become applicable in May, but the text of the new ePrivacy Regulation will also likely be finalized.

Given the global impact of developments in EU data protection and privacy regulation, the Future of Privacy Forum is taking part in the conversation, aiming to drive understanding between the privacy cultures in the US and the EU.

If you’re in Brussels this week, don’t miss the panels and events we’ll take part in, listed chronologically:

Cross-border data transfers: effective protection and government access, in particular in the transatlantic context

Tuesday, January 23, 18.00, Petite Halle

The Brussels Privacy Hub and the Privacy Salon will be hosting an exclusive launch event for the 11th edition of CPDP on 23 January 2018. This year’s launch will be an Evening Roundtable on “Cross-border Data Transfers” starting at 18.00, which will be followed by a cocktail reception.

The EU data protection regime holds that cross-border data transfers should not prejudice the level of protection individuals are entitled to. However, enforcing this principle in a cross-border context is not straightforward, especially since the EU data protection rules should also provide effective remedies against governments of third countries, where EU jurisdiction is only indirect. This problem is widely recognised, and in recent years the CJEU has set high standards in Schrems I, C-362/14 and Opinion 1/15 PNR Canada. The roundtable will address the latest developments and relevant court cases and discuss the contributions of the various actors, including privacy advocates, data protection authorities, companies, and governments.

Renate Nikolay, Head of Cabinet of Commissioner Věra Jourová, DG JUST, European Commission; Max Schrems, noyb; and Gabriela Zanfir-Fortuna, FPF EU Policy Counsel, will discuss the latest developments in cross-border transfers and the outlook for 2018. The discussion will be moderated by Omer Tene, VP of IAPP and FPF Senior Fellow.


Physical tracking

Wednesday, January 24, 8.45, Grand Halle

FPF CEO Jules Polonetsky is speaking on a panel on “Physical tracking” together with Anna Fielder, Trans Atlantic Consumer Dialogue (UK), Mathieu Cunche, INSA Lyon (FR), and Monica McDonnell, Informatica (UK). The panel is organized by INRIA, chaired by Daniel Le Métayer, INRIA, and moderated by Gloria González Fuster, VUB (BE).

Tracking people in the physical world, through a variety of sensors, cameras, and mobile devices, is now common, but it is becoming increasingly controversial. Human dynamics such as crowd sizes, paths, or visit durations are extremely valuable information for many applications, offering great prospects to retailers and for urban planning. More generally, the extension to the physical world of the tracking already in place on the internet, and the lack of control or even awareness of individuals, raise serious questions. The goal of this panel is to discuss in a multidisciplinary way the issues raised by physical tracking, including the following questions:

  • Is it possible to enhance individuals’ control over their information (including information, consent and “do not track” options) in the context of physical tracking?
  • What recommendations could be made regarding the ePrivacy Regulation and the implementation of the GDPR to ensure better protection against physical tracking?
  • Can self-regulation initiatives such as the Future of Privacy Forum code of conduct help improve the situation?

Privacy engineering, lingua franca for transatlantic privacy

Wednesday, January 24, 11.45, La Cave

The Future of Privacy Forum is organizing a panel that aims to contribute to the essential transatlantic privacy debate by focusing on privacy engineering and the role it could play as a “lingua franca” between the US and EU privacy and data protection worlds. Privacy engineering is more important than ever at a time when accountability is increasingly shifted to the creators of data-centric systems. The state of the art of privacy engineering will be discussed with leading experts from the US and Europe, focusing on the latest solutions for embedding privacy safeguards in processing systems from the design stage, as well as on quintessential issues such as de-identification, data portability, encryption, and user control over data.

  • How big a role does privacy engineering play in enhancing the privacy of individuals/users/consumers/digital citizens? Should it bear most of the “burden”? With whom else should privacy engineering share the “burden”?
  • Could privacy engineering become a lingua franca of the US and EU privacy worlds? Is it already in this position?
  • Is privacy by design achievable on a mass scale? What factors would facilitate the overall adoption of privacy by design?

The guest speakers are Simon Hania, TomTom (NL), Naomi Lefkovitz, NIST (US), Seda Gürses, KU Leuven (BE), and Ari Ezra Waldman, New York Law School (US). The panel is chaired by Achim Klabunde, EDPS (EU) and moderated by Gabriela Zanfir-Fortuna.


Data processing beneficial to individuals: the use of legitimate interests

Wednesday, January 24, 14.15, Area 42 Grand

In today’s age of the Internet of Things, Artificial Intelligence (AI), cloud computing, mobile devices, advanced analytics and profiling, it is becoming ever more difficult to obtain valid consent to process an individual’s personal data. More organisations are therefore looking to use the legitimate interest ground to process personal data, while at the same time struggling with its correct application. The balancing test needs to be completed, a good overview of potential benefits to individuals when processing their personal data needs to be produced, and an ethical assessment needs to be maintained. All this to ensure that the risks of the data processing for the individual are minimised. This panel will discuss various approaches to the use of legitimate interest and the pros and cons of leveraging the benefits to individuals to process personal data.

  • What needs to be done to use legitimate interest as a ground for processing personal data under GDPR?
  • How do you balance benefits to individuals against risks to their rights and freedoms?
  • How does the balancing test contribute to ethical data processing?

Paul Breitbarth, Nymity (NL), Dominique Hagenauw, Considerati (NL), Leonardo Cervera Navas, EDPS (EU), and Gabriela Zanfir-Fortuna, FPF (US) will speak about using the legitimate interests of a controller or a third party as a lawful ground for processing under EU data protection law. The panel is organized by Nymity, chaired by Raphaël Gellert, Tilburg University (NL), and moderated by Aurélie Pols, Mind Your Privacy (ES).


Can citizenship of the target ever be a justified basis for different surveillance rules?

Thursday, January 25, 8.45, Grande Halle

This panel, organized by Georgia Institute of Technology, will examine the topic of whether the nationality of an individual under surveillance (the “target”) is and should be relevant to the legal standards for surveillance.

Legislation in effect today, in countries including the United States and Germany, applies stricter protections for national security surveillance of a nation’s own citizens than for foreigners. To date, there has been no systematic discussion of whether and on what basis those stricter standards might be justified. Some writers have asserted a “universalist” position, that national security surveillance must apply identically to both citizens and non-citizens. This panel will provide a description of current law and practice. It will then discuss and debate whether, and in what circumstances, if any, it may be justified to apply different surveillance standards based on whether the individuals under surveillance have citizenship or other significant connections to the country undertaking the surveillance.

Peter Swire, Georgia Institute of Technology and FPF Senior Fellow, Joseph Cannataci, UN Special Rapporteur on the Right to Privacy (INT), Mario Oetheimer, Fundamental Rights Agency (EU), and Thorsten Wetzling, Stiftung NV (DE) will discuss these questions in a panel moderated by Amie Stepanovich, Access Now (US) and chaired by Wendy Grossman, Independent Journalist (US).


SIDE EVENT


Privacy by Design, Privacy Engineering

Thursday, January 25, 19.30 (preceded by a cocktail at 18.30), Grande Halle

“Privacy by Design, Privacy Engineering” is a side event organized by the European Data Protection Supervisor and supported by FPF and Qwant that will explore the role of privacy by design in the current privacy landscape.

Giovanni Buttarelli, EDPS, Jules Polonetsky, CEO, Future of Privacy Forum, Marit Hansen, Data Protection Commissioner, ULD Schleswig-Holstein, and Eric Léandri, Co-founder and CEO of Qwant, will discuss privacy by design in a panel moderated by Seda Gürses, KU Leuven.

The GDPR introduces the obligation of data protection by design and by default. This is a very important step forward and a challenge for many organisations, but it cannot be the end of the road. For technology to serve humans, a broader view on privacy and ethical principles must be taken into account in its design and development. The panel will discuss the perspectives of businesses and regulators: which principles they see as important in this context, which approaches their own organisations are taking, and what their demands and recommendations are for other stakeholders.


PLSC EUROPE


Saturday, January 27, 9.15

Peter Swire, Georgia Institute of Technology and FPF Senior Fellow, and Jesse Woo, Georgia Institute of Technology, will have their paper, “Understanding Why Nationality Matters for Surveillance Rules”, discussed at PLSC Europe.


ADDITIONAL INFORMATION