Comprehensive Online Tracking is Not Unique to ISPs


Last week, the Senate Judiciary Committee (Subcommittee on Privacy, Technology, and the Law) held a hearing to explore the FCC’s proposed privacy rules regulating Broadband Internet Access Service providers (a subset of Internet Service Providers, or ISPs).

The discussion among top privacy regulators returned at several points to the value of consistency of privacy protection across the ecosystem. The proposed FCC rules would restrict ISPs from using customer proprietary information—defined broadly to include things like IP addresses, unique identifiers, and other personal information—for any purpose outside of providing their own services and marketing their own (or affiliates’) communications-related services, at least without seeking the customer’s affirmative Opt In consent. The justification given for these rules is that ISPs are uniquely “in a position to develop highly detailed and comprehensive profiles of their customers.” See para. 4, Notice of Proposed Rulemaking.

Near the end of the discussion, the FCC Chairman stated:

“When I go to Google [. . .] that is a decision that I am making. [. . .] I go to WebMD, and WebMD collects information on me. I go to Weather.com and Weather.com collects information on me. I go to Facebook and Facebook collects information on me. But only one entity connects all of that information, that I’m going to all those different sites, and can turn around and monetize it.” – Chairman Wheeler (at approx. 1h:30m)

This framing of the issue reflects a fundamental misunderstanding of the current online advertising ecosystem, which is fully capable of tracking individual behavior across the Internet as well as between devices.

Mozilla’s Lightbeam for Firefox extension quickly shows that the third-party tracking industry is interwoven and comprehensive. After installing the Firefox browser and visiting only one website (WebMD.com; see Fig. 1, below), I had connected with 24 third-party sites.


Fig. 1. Mozilla’s “Lightbeam for Firefox” extension demonstrates that my visit to a single website generated 24 third party connections. Circular nodes represent websites visited, and triangular nodes are third party sites. Purple lines identify when a site has stored data on the browser (cookies).

After visiting three additional sites, for a grand total of four websites, I had connected with 119 third-party entities (see Fig. 2, below). Each “connection” means that the entity can identify the web page the consumer is visiting and can share that data with other parties with which it is interconnected. Some parties are linked to many websites. But even when a party is not directly connected to a particular site, third-party entities that are connected can buy and sell this data at third-party data exchanges. These data exchanges, by linking and compiling data from hundreds of different online and offline sources, can “match up” consumer behavior across the Internet, creating comprehensive and detailed individual profiles.


Fig. 2. Mozilla “Lightbeam for Firefox” display after visiting only four websites. Circular nodes represent websites visited, and triangular nodes are third party sites. Purple lines identify when a site has stored data on the browser (cookies).
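The cross-site visibility shown in the figures can be illustrated with a minimal Python sketch. It assumes a hypothetical list of (visited site, third-party domain) pairs. The third-party names are placeholders rather than data from the Lightbeam session, and the function names are invented for illustration. The sketch only shows the basic idea: a third party embedded on several sites is in a position to observe visits to all of them.

```python
from collections import defaultdict

# Hypothetical observations: (site visited, third-party domain contacted).
# The third-party domains are placeholders, not data from the Lightbeam session.
connections = [
    ("webmd.com", "ads.example"),
    ("webmd.com", "social.example"),
    ("cnn.com", "ads.example"),
    ("cnn.com", "metrics.example"),
    ("weather.com", "ads.example"),
    ("weather.com", "social.example"),
]


def sites_seen_by_third_party(pairs):
    """Map each third-party domain to the set of visited sites it was loaded on."""
    seen = defaultdict(set)
    for site, third_party in pairs:
        seen[third_party].add(site)
    return dict(seen)


def cross_site_observers(pairs, min_sites=2):
    """Return third parties present on at least `min_sites` visited sites."""
    return {
        third_party: sorted(sites)
        for third_party, sites in sites_seen_by_third_party(pairs).items()
        if len(sites) >= min_sites
    }


if __name__ == "__main__":
    # List each third party that could observe visits to more than one site.
    for third_party, sites in sorted(cross_site_observers(connections).items()):
        print(third_party, "->", ", ".join(sites))
```

In this toy data, a third party that appears on only one site is filtered out, while one that appears on several is reported together with every site on which it was loaded.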

The third-party advertising networks and data partners visualized above use a variety of methods designed to create comprehensive profiles of a user’s entire web browsing history. These methods include persistent identifiers (cookies), IP addresses, device identifiers, direct authentication (such as email addresses), and probabilistic methods (such as browser fingerprinting). For a more extensive explanation of these tracking methods, see our 2015 report on Cross-Device Tracking. Furthermore, this information can be combined with offline data (appended data), such as a user’s in-store purchase history, for an even more comprehensive consumer profile.

Many of the leading online platforms also correlate data across websites. For example, many websites (including WebMD, seen above) carry social media plug-ins that allow those social media platforms to compile browsing histories of individuals across the Internet and link that browsing activity to the same user’s social media behavior. If a consumer browsing the Web sees a Twitter button on a website they visit, that data goes to Twitter to help serve ads on Twitter.

Mobile apps often collect even more granular information, such as information about the user’s in-app behavior, and other mobile data such as the Calendar or Contacts. Access to some mobile data (such as Location Services) requires the user’s Opt In permission, but access to other mobile information (such as nearby Wi-Fi networks, from which location can be inferred) sometimes does not. Some leading apps serve ads based on knowing what other apps are installed on a user’s device. WebMD, for example, in addition to tracking and sharing the data visualized above, has a mobile app that enables it to track users across desktop and mobile platforms.

This data collection usually occurs without directly sharing explicitly personal information—rather, for security and privacy reasons, industry players typically match up individual behavior using “hashed” identifiers. And many, including Commissioner Ajit Pai, have pointed out that online tracking has generated benefits for consumers, including the availability of free and reduced-cost online content subsidized by online advertising that can be made more efficient and relevant through information about online audiences.
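As a rough illustration of how matching on “hashed” identifiers can work, here is a minimal Python sketch. The choice of SHA-256, the normalization step (trimming and lowercasing), the sample records, and the function names are all assumptions made for illustration. The article does not specify which hashing scheme any particular company uses.

```python
import hashlib


def hash_identifier(email: str) -> str:
    """Normalize an email address and return its SHA-256 hex digest.

    SHA-256 and the trim/lowercase normalization are assumptions for this sketch.
    """
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def match_profiles(first: dict, second: dict) -> dict:
    """Join two datasets on the hashed identifiers they have in common."""
    shared = first.keys() & second.keys()
    return {digest: first[digest] + second[digest] for digest in shared}


# Hypothetical records held by two separate parties, keyed by hashed email.
publisher_records = {
    hash_identifier("Alice@Example.com"): ["read an article on allergies"],
}
retailer_records = {
    hash_identifier(" alice@example.com "): ["bought antihistamines in store"],
    hash_identifier("bob@example.com"): ["bought running shoes"],
}

if __name__ == "__main__":
    # Neither party sends a raw email address, yet the records still line up.
    for digest, events in match_profiles(publisher_records, retailer_records).items():
        print(digest[:12], events)
```

Because the same normalized input always produces the same digest, the two parties can combine their records without exchanging the email address itself.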

There have been many industry efforts in recent years to self-regulate the market in order to alleviate these privacy concerns and build consumer trust. For example, the Network Advertising Initiative and Digital Advertising Alliance have enforceable codes and guidelines covering the uses of online data collected and used for online behavioral advertising (interest-based advertising). The Wireless Association (CTIA) has issued high-level voluntary guidelines around mobile data, especially geo-location. And the Future of Privacy Forum (FPF) has a Location & Ad Practices Working Group that is developing best practices and consumer awareness information around online advertising, and another group that has developed best practices for data from wearables and wellness apps.

Despite industry efforts, the comprehensive and pervasive nature of online tracking continues to be debated among advocates and consumers. But one thing is certain: it is not unique to ISPs.