Public Trust in Artificial Intelligence Starts With Institutional Reform
October 2022 Update: We recently commented on NIST’s latest draft of a framework to manage the risks of AI. To improve the framework before it is finalized in January 2023, we think NIST should adjust it in several ways, including by better centering communities impacted by AI systems, interrogating the broader structures around AI systems, and providing guidance on decommissioning AI systems or considering non-AI alternatives when appropriate.
Gaining the public’s trust in artificial intelligence (AI) will take more than just setting technical standards. It will require that institutions using this technology first prove they are worthy of our trust. Earlier this year, Congress passed its most significant law to date on AI. As part of this law, Congress tasked the National Institute of Standards and Technology (NIST) with creating a framework to manage risks associated with the use of AI. While this is a critical first step towards AI accountability, technical standards alone can’t guarantee trustworthiness in AI development and use. That’s why last week we sent a letter responding to NIST’s draft proposal on methods to manage bias in AI.
AI might evoke science fiction to some, but it is already being deployed throughout our society in ways that directly impact rights and liberties. For example, AI is used to determine where police resources will be deployed and who will stay incarcerated or be released on bail. Often, it is introduced without anything close to a sufficient evaluation of whether it is replicating or even supercharging existing biases in our society. That’s why we’re glad to see that NIST is soliciting feedback on the issue of bias in AI. Still, we’re concerned that its focus might be too narrow.
AI bias takes many forms, as NIST’s own proposal makes clear by identifying more than 35 types of biases. While some AI biases stem from technical factors, such as the choice of variables that are closely related to demographic characteristics (for example, when a tool can encode race based on zip code), many are not technical. These non-technical biases include population bias, which occurs when the people contributing data differ from the general population; historical bias, which occurs when models are trained on past biased data; and systemic biases, like institutional racism and sexism. Given this array of biases, it makes sense that bias can’t be mitigated only through technological changes, especially when technical fields themselves still struggle with a severe lack of diversity.
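To make the zip code example concrete, here is a minimal sketch using made-up toy records, not real data. The zip codes, group labels, and the `flag` rule are all hypothetical, invented only for illustration. The rule never sees the demographic group, yet it flags the two groups at very different rates because zip code stands in for group membership.

```python
from collections import defaultdict

# Made-up toy records for illustration only: (zip_code, demographic_group).
# The group label is never passed to the decision rule below.
records = [
    ("00001", "A"), ("00001", "A"), ("00001", "A"), ("00001", "B"),
    ("00002", "B"), ("00002", "B"), ("00002", "B"), ("00002", "A"),
]

# Hypothetical rule that looks only at zip code.
HIGH_SCRUTINY_ZIPS = {"00002"}


def flag(zip_code: str) -> bool:
    """Return True if the rule flags this zip code."""
    return zip_code in HIGH_SCRUTINY_ZIPS


def flag_rate_by_group(rows):
    """Share of each group that the zip-code-only rule flags."""
    flagged = defaultdict(int)
    total = defaultdict(int)
    for zip_code, group in rows:
        total[group] += 1
        if flag(zip_code):
            flagged[group] += 1
    return {group: flagged[group] / total[group] for group in total}


rates = flag_rate_by_group(records)
print(rates)  # {'A': 0.25, 'B': 0.75}
```

Removing a demographic field from the inputs does not, by itself, remove the disparity in this toy case, which is one reason technical fixes alone fall short.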
NIST’s draft proposal places a lot of emphasis on technical methods and not enough on the societal context in which AI tools are designed and used. For example, the publication recommends engaging with the community when developing AI, but this engagement appears limited to “experts” and “end users.” Notably absent from the discussion is outreach to those who will be directly impacted by the use of AI systems, such as people in pretrial detention and families in the child welfare system. Impacted community members may hold very different views on how AI should be designed in a particular context, and on whether it should be used at all.
Further, the draft proposal frames managing bias as a way to cultivate public trust. In doing so, it puts the burden on the public to increase its trust rather than on institutions to prove they are creating trustworthy technologies. AI with well-documented bias is being used in some of this country’s least trusted institutions, from the criminal legal system to the banking system. While we commend NIST for seeking public input, the other institutions where AI is already in use have historically failed to respond to public opinion and critique. That needs to be fixed before the public can begin to trust the use of even more AI by these institutions.
It’s also important to remember that algorithmic design choices aren’t just technical choices. The choices made when designing algorithms are essentially policy decisions, and they often go unexamined. For example, in the context of bail decisions, a practitioner designing a tool to predict who is at risk of failing to reappear in court is effectively acting as a member of the legal system when they choose the variables that will or won’t go into the model and set the boundaries between high and low risk. In doing so, they enforce policies in the coding choices they make.
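A small sketch shows how a single number in the code can act as policy. The scores and both cutoffs below are hypothetical, chosen only for illustration, and do not come from any real tool. The same ten people are scored once, and the only thing that changes is where the designer draws the line between high and low risk.

```python
# Hypothetical risk scores between 0 and 1 for ten people; not real data.
scores = [0.12, 0.25, 0.31, 0.38, 0.44, 0.52, 0.58, 0.63, 0.71, 0.90]


def count_high_risk(scores, cutoff):
    """Count how many people score at or above the cutoff."""
    return sum(1 for s in scores if s >= cutoff)


# Two cutoffs a designer might plausibly pick (both are assumptions).
lenient = count_high_risk(scores, cutoff=0.6)
strict = count_high_risk(scores, cutoff=0.4)

print(lenient)  # 3
print(strict)   # 6
```

Moving the cutoff from 0.6 to 0.4 doubles the number of people labeled high risk without any change to the underlying data or model, which is why a choice like this deserves the same scrutiny as any other policy decision.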
Transparency is vital in algorithmic decision-making systems. Making this process truly transparent may require regular audits and ongoing impact assessments by neutral third parties, with the results made public. NIST should take an active role in setting clear standards for these audits and impact assessments. It should also articulate the safeguards entities must put in place for those affected by algorithmic decision-making, which could include effective appeals processes for impacted individuals and compensation when people have been misjudged by AI tools.
NIST has proposed mitigating AI bias through a process of gradual refinement at the pre-design, development, and deployment stages. However, this approach seems to accept the deployment of AI as inevitable and always the best solution to a problem. We encourage NIST to reframe its approach to reflect the possibility that an algorithm might need to be terminated at any of these stages if unacceptable biases, harms, or impacts arise. Moreover, we suggest NIST set clear standards for instances in which AI should either be dropped or not developed in the first place.
Finally, NIST should emphasize that data privacy violations, flawed or unethical collection methods, and data inaccuracy can all contribute to algorithmic bias. Examples of each abound, from the creation of facial recognition databases without people’s consent, to problematic predictive policing technologies based on historically biased data.
Reducing AI bias is a noble goal, but the public’s trust in AI and other technology-enabled decision-making tools can’t be earned through the release of technical standards alone. Earning the public’s trust will require the meaningful engagement of impacted communities, increased transparency, improvements in data quality, and, most importantly, institutional reform.