However we want to define “Big Data” – and the FTC’s latest workshop on the subject suggests a consensus definition remains elusive – the path forward seems to call for more transparency and the establishment of firmer frameworks on the use of data. As Chairwoman Ramirez suggested in her opening remarks, Big Data calls for a serious conversation about “industry’s ethical obligations as stewards of information detailing nearly every facet of consumers’ lives.”
Part of that challenge is that some Big Data uses are often “discriminatory”. Highlighting findings from his paper on Big Data and discrimination, Solon Barocas began the workshop by noting that the whole point of data mining is to differentiate and to draw distinctions. In effect, Big Data is a rational form of discrimination, driven by apparent statistical relationships rather than any capriciousness. When humans introduce unintentional biases into the data, there is no ready solution at a technical or legal level. Barocas called for lawyers and public policy makers to have a conversation with the technologists and computer scientists working directly with data analytics – a sentiment echoed when panelists realized a predictive analytics conference was going on simultaneously across town.
But the key takeaway from the workshop wasn’t that Big Data could be used as a tool to exclude or include. Everyone in the civil rights community agreed that data could be a good thing, and a number of examples were put forth to suggest once more that data had the potential to be used for good or for ill. Pam Dixon of the World Privacy Forum observed that classifying individuals creates a “data paradox,” where the same data can be used to help or to harm that individual. For our part, FPF released a report alongside the Anti-Defamation League detailing Big Data’s ability to combat discrimination. Instead, there was considerable desire to understand more about industry’s approach to Big Data. FTC staff repeatedly asked not just for more positive uses of Big Data by the private sector, but also inquired as to what degree of transparency would help policy makers understand Big Data decision-making.
FTC Chief Technologist Latanya Sweeney followed up on her earlier study, which suggested that web searches for African-American names were more likely than searches for white-sounding names to return ads suggesting the person had an arrest record, by looking at credit card advertising and website demographics. Sweeney presented evidence that advertisements for harshly criticized credit cards were often directed to the homepage of Omega Psi Phi, a popular black fraternity.
danah boyd observed that there was a general lack of transparency about how Big Data is being used within industry, for a variety of complex reasons. FTC staff and Kristin Amerling from Senate Commerce singled out the opacity surrounding the practices of data brokers when describing some of the obstacles policy makers face when they try to understand how Big Data is being used.
Moreover, while consumers and policy makers are trying to grapple with what companies are doing with their streams of data, industry is also placed in the difficult position of making huge decisions about how that data can be used. For example, boyd cited the challenges JPMorgan Chase faces when using analytics to evaluate human trafficking. She applauded the positive work the company was doing, but noted that expecting it to have the ability or expertise to effectively intervene in trafficking perhaps asks too much. They don’t know when to intervene or whether to contact law enforcement or social services.
These questions are outside the scope of their expertise, but even general use of Big Data can prove challenging for companies. “A lot of the big names are trying their best, but they don’t always know what the best practices should be,” she concluded.
FTC Commissioner Brill explained that her support for a legislative approach to increase transparency and accountability among data brokers, their data sources, and their consumers was meant to help consumers and policy makers “begin to understand how these profiles are being used in fact, and whether and under what circumstances they are harming vulnerable populations.” In the meantime, she encouraged industry to take more proactive steps. Specifically, she recommended again that data brokers explore how their clients are using their information and take steps to prevent any inappropriate uses and further inform the public. “Companies can begin this work now, and provide all of us with greater insight into – and greater assurances about – their models,” she concluded.
A number of legal regimes may already apply to Big Data, however. Laws that govern the provision of credit, housing, and employment will likely play a role in the Big Data ecosystem. Carol Miaskoff at the Equal Employment Opportunity Commission suggested there was real potential with Big Data to gather information about successful employees and use that to screen people for employment in a way that exacerbates prejudices built into the data. Emphasizing his recent white paper, Peter Swire suggested there were analogies to be made between sectoral regulation in privacy and sectoral legislation in anti-discrimination law. With existing laws in place, he argued that it was past time to “go do the research and see what those laws cover” in the context of Big Data.
“Data is the economic lubricant of the economy,” the Better Business Bureau’s C. Lee Peeler argued, and he supported the FTC’s continued efforts to explore the subject of Big Data. He cited earlier efforts by the Commission to examine inner-city marketing practices, which produced a number of best practices still valid today. He encouraged the FTC to look at what companies are doing with Big Data on a self-regulatory basis as a starting point for developing workable solutions to potential problems.
So what is the path forward? Because Big Data is, in the words of Promontory’s Michael Spadea, a nascent industry, there is a very real need for guidelines on not just how to evaluate the risks and benefits of Big Data but also how to understand what is ethically appropriate for business. Chris Wolf highlighted FPF’s recent Data-Benefit Analysis and suggested companies were already engaged in detailed analysis of the use of Big Data, though everyone recognized that business practices and trade secrets precluded making much of this public.
FTC staff noted there was a “transparency hurdle” to get over in Big Data. Recognizing that “dumping tons of information” onto consumers would be unhelpful, staff picked up on Swire’s suggestion that industry needed some mechanism to justify what is going on to either regulators or self-regulatory bodies. Spadea argued that “the answer isn’t more transparency, but better transparency.” The Electronic Frontier Foundation’s Jeremy Gillula recognized the challenge companies face in revealing their “secret sauce,” but encouraged them to look for more ways to give consumers general information about what was going on. Otherwise, he recommended, consumers ought to collect big data on Big Data and turn data analysis back on data brokers and industry at large through open-source efforts.
At the same time, Institutional Review Boards, which are used in human subject research, were again proposed as a model for how companies can begin affirmatively working through these problems. Citing a KPMG report, Chris Wolf insisted that strong governance regimes, including “a strong ethical code, along with process, training, people, and metrics,” were essential to confront the many ethical and philosophical challenges that flitted around the day’s discussions.
Jessica Rich, the Director of the FTC’s Consumer Protection Bureau, cautioned that the FTC would be watching. In the meantime, industry is on notice. The need for clearer data governance frameworks is evident, and careful consideration of Big Data projects should be both reflexive and something every industry privacy professional talks about.
Other Relevant Reading from the Workshop: