One of the defining features of the data economy is that research increasingly takes place outside universities and traditional academic settings. With information becoming the raw material for the production of goods and services, more organizations are collecting and closely examining vast amounts of personal data about citizens, consumers, patients, and employees. These include companies in industries ranging from technology and education to financial services and healthcare, as well as non-profit entities pursuing societal causes or other agenda-driven projects.
For research on data subject to the Common Rule, institutional review boards (IRBs) provide an essential ethical check on experimentation and research. However, much of the research relying on corporate data falls outside the scope of IRBs, whether because the data was previously collected, the project or researcher is not federally funded, the data comes from a public dataset, or for other reasons.
Future of Privacy Forum (FPF) has received a Schmidt Futures grant to create an independent party of experts for an ethical review process that can provide trusted vetting of corporate-academic research projects. FPF will establish a pool of respected reviewers to operate as a standalone, on-demand review board to evaluate research uses of personal data and create a set of transparent policies and processes to be applied to such reviews.
FPF will define the review structure, establish procedural guidelines, and articulate the substantive principles and requirements for governance. Other considerations to be addressed include companies’ common concerns about risk analysis, disclosure of intellectual property and trade secrets, and exposure to negative media coverage and public reaction. Following this phase, reviewers available to serve on demand will be recruited from a range of backgrounds. The project will include input and review by government, civil society, industry, and academic stakeholders.
Sara Jordan, who will be collaborating with FPF on this project, has proposed one model for addressing this challenge. Her paper, Designing an AI Research Review Committee, gives serious consideration to the design of a review committee dedicated to ethical oversight of AI research. The proposed design draws on the history and structure of existing research review committees, including IRBs, Institutional Animal Care and Use Committees (IACUCs), and Institutional Biosafety Committees (IBCs). It most closely follows the IBC model, blended with features of human subject and animal care and use committees in order to improve the implementation of risk-adjusted oversight mechanisms.
Another analysis was published recently by the Northeastern University Ethics Institute and Accenture: Building Data and AI Ethics Committees. The paper argues that an ethics committee can be a valuable component of the responsible collection, sharing, and use of data, machine learning, and AI within and between organizations. However, to be effective, such a committee must be thoughtfully designed, adequately resourced, clearly charged, sufficiently empowered, and appropriately situated within the organization.
European bodies are likewise considering these challenges in several recent AI guidance initiatives. Notably, the Council of Europe has established an ad hoc committee on Artificial Intelligence, which will examine, on the basis of broad multi-stakeholder consultations, the feasibility and potential elements of a legal framework for the development, design, and application of artificial intelligence, grounded in the Council of Europe’s standards on human rights, democracy, and the rule of law.
The ethical framework applying to human subject research in the biomedical and behavioral research fields dates back to the Belmont Report. Published in 1979 and codified by the United States government in 1991 as the Common Rule, the Belmont principles were geared toward a paradigmatic controlled scientific experiment with a limited population of human subjects who interact directly with researchers and provide informed consent. Today, researchers in academic institutions, as well as private sector businesses not subject to the Common Rule, seek to analyze a wide array of data sources, from massive commercial or government databases to individual tweets and Facebook posts publicly available online, with little or no opportunity to directly engage human subjects to obtain their consent or even inform them of research activities. Data analysis is now used in multiple contexts, such as combatting fraud in the payment card industry, reducing the time commuters spend on the road, detecting harmful drug interactions, improving marketing mechanisms, personalizing the delivery of education in K-12 schools, encouraging exercise and weight loss, and much more.
These data uses promise tremendous research opportunities and societal benefits, but at the same time they create new risks to privacy, fairness, due process, and other civil liberties. Increasingly, researchers and corporate officers find themselves struggling to navigate unsettled social norms and to make ethical choices about how to use this data for appropriate ends. The ethical dilemmas arising from data analysis may transcend privacy, triggering concerns about stigmatization, discrimination, human subject research, algorithmic decision making, and filter bubbles.
In many cases, the scoping definitions of the Common Rule are strained by new data-focused research paradigms, which are often product-oriented and based on the analysis of preexisting datasets. For starters, it is not clear whether research on large datasets collected from public or semi-public sources even constitutes human subject research. “Human subject” is defined in the Common Rule as “a living individual about whom an investigator (whether professional or student) conducting research obtains (1) data through intervention or interaction with the individual, or (2) identifiable private information.” Yet data-driven research often leaves little or no footprint on individual subjects (“intervention or interaction”), as in the case of automated testing for security flaws.
While obtaining individuals’ informed consent may be feasible in a controlled research setting involving a well-defined group of individuals, such as a clinical trial, it is untenable for researchers experimenting on a database that contains the footprints of millions, or indeed billions, of data subjects. In response to these developments, the Department of Homeland Security commissioned a series of workshops in 2011-2012, leading to the publication of the Menlo Report on Ethical Principles Guiding Information and Communication Technology Research. That report remains anchored in the Belmont principles, which it interprets and adapts to the domain of computer science and network engineering, while introducing a fourth principle, respect for law and public interest, to reflect the “expansive and evolving yet often varied and discordant, legal controls relevant for communication privacy and information assurance.”
Ryan Calo foresaw the establishment of “Consumer Subject Review Boards” to address ethical questions about corporate data research. Calo suggested that organizations should “take a page from biomedical and behavioral science” and create small committees with diverse expertise that could operate according to predetermined principles for ethical use of data. No existing model maps directly onto the current challenges, however. The categorical, non-appealable decision making of an academic IRB, which is staffed by tenured professors to ensure independence, will be difficult to reproduce in a corporate setting. And corporations have legitimate concerns about sharing trade secrets and intellectual property with the external stakeholders who may serve on such boards.
FPF’s work under this grant will seek to demonstrate the composition and viability of one approach to addressing these challenges.