Big Data Research
TechAmerica hosted a Congressional Briefing, Big Data: What it Means and How it Drives Innovation, this week. The event’s distinguished panel included a variety of industry experts and focused on the meaning of “Big Data”, how data is being used in the marketplace, government uses of “Big Data”, and the future of “Big Data”.
The panel explained that “Big Data” as a concept has existed for several years and is also referred to as “Data Mining”. Both terms refer to the processing, or mining, of large raw data sets with the goal of extracting usable information. Often-cited examples of Big Data use are identity verification, market research, and fraud protection.
Although the concept itself is not new, the evolution of technological capabilities is changing the way in which Big Data is used. Heightened data processing capabilities mean that analysis can be done on larger data sets using fewer resources. This is changing how companies analyze data and allowing them to draw more insights from their analysis.
Bill Perlowitz, CTO of the Science Technology & Engineering Group at Wyle, described the emerging form of data analysis as a shift from hypothesis-driven research to data-driven research. Rather than analyzing data to reach pre-determined information goals, actors can now process data first and let the results establish their goals and point to new findings. Big Data research is no longer limited to what researchers can imagine.
This shift in analytical paradigms highlights two factors that could lead to privacy concerns. First, the new model relies on large amounts of data, which incentivizes companies to collect and retain data on a larger scale and for longer periods of time. This can potentially conflict with privacy practices such as data minimization and purpose limitation.
Second, the new data-driven research model, which no longer necessarily relies on a pre-established research goal, maximizes the number of discoveries made in each analytic cycle. When the data is about individuals, the insights gleaned can be intrusive.
The increased usability of Big Data is making it harder for companies that deal with personal information to maintain their privacy practices. How can companies maintain privacy under these circumstances? For Nuala O’Connor, privacy lead at General Electric, industry actors should focus on good data stewardship and on establishing best practices to keep data confidential. Ms. O’Connor cited data de-identification as an example of a good privacy practice. However, although it can mitigate privacy threats, de-identification is not a privacy ‘silver bullet’. In particular, some data processors question whether de-identifying data limits its usability and may therefore be reluctant to adopt the practice.
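To make the trade-off concrete, here is a minimal sketch of what de-identification might look like in practice. It is an illustration only, not a method described by the panel: the field names, the `de_identify` and `pseudonym` helpers, and the choice of a keyed hash plus coarsened ZIP code and birth year are all assumptions made for this example.

```python
import hashlib
import hmac

# Hypothetical field names, chosen only for illustration.
DIRECT_IDENTIFIERS = {"name", "email"}


def pseudonym(value: str, secret_key: bytes) -> str:
    """Return a keyed hash so records can be linked without exposing the value."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def de_identify(record: dict, secret_key: bytes) -> dict:
    """Drop direct identifiers and coarsen fields that could single someone out."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Replace the email address with a stable pseudonym.
    out["subject_id"] = pseudonym(record["email"], secret_key)
    # Keep only the first three digits of the ZIP code.
    out["zip"] = record["zip"][:3] + "**"
    # Round the birth year down to the start of its decade.
    out["birth_year"] = record["birth_year"] // 10 * 10
    return out


# Made-up sample record, not real data.
sample = {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "zip": "20001",
    "birth_year": 1984,
    "purchase_total": 129.50,
}
print(de_identify(sample, secret_key=b"example-key-not-for-production"))
```

The output keeps the purchase total, so aggregate analysis remains possible, but the ZIP code and birth year are now less precise. That loss of detail is the usability cost that makes some data processors hesitant, and coarsened records of this kind can still sometimes be re-identified when combined with other data sets.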
As Big Data becomes increasingly accessible and usable, good privacy practices also depend on the ability of policymakers and companies to balance privacy against the gains expected from data analysis. This involves making a value judgment about which data uses are appropriate. As indicated by Jules Polonetsky and Omer Tene, “It is doubtful that such a value choice has consciously been made”. Privacy leaders would do well to begin considering both the tools that can promote privacy protections when data is used and the societal merits of the use of Big Data.
-Julian Flamant