Meaningful Standards for Auditing High-Stakes Artificial Intelligence
When hiring, many organizations use artificial intelligence (AI) tools to analyze resumes and predict job-relevant skills. Colleges and universities are using AI to automatically grade essays, process transcripts, and review extracurricular activities to determine in advance who is likely to be a “good student.” With so many unique use cases, it’s important to ask: can AI tools ever be truly unbiased decision makers? In response to allegations of unfairness and bias in tools used for hiring, college admissions, predictive policing, health interventions, and more, researchers at the University of Minnesota (U of M) and Purdue University recently developed a new set of audit guidelines for AI tools.
The audit guidelines, published in American Psychologist, were developed by Richard Landers, an associate professor of psychology at the University of Minnesota, and Tara Behrend of Purdue University. They apply a century of research and professional standards for measuring personal characteristics, developed by researchers in psychology and education, to ensuring fairness in AI.
The researchers developed guidelines for auditing AI by first considering the ideas of fairness and bias through three broad lenses:
- How individuals decide if a decision was fair and impartial
- How society’s legal, ethical and moral standards present fairness and bias
- How individual technical fields, such as computer science, statistics, and psychology, define fairness and bias internally
Using these lenses, the researchers presented psychological audits as a standardized approach to assess the fairness and bias of AI systems that make predictions about humans in high-stakes application areas, such as hiring and college admissions.
The audit framework consists of twelve components divided into three categories:
- Components related to the creation of, processing performed by, and predictions created by the AI
- Components related to how AI is used, who its decisions affect, and why, and
- Components related to overarching challenges: the cultural context in which AI is used, respect for those affected by it, and the scientific integrity of the research used by AI vendors to support their claims.
“Using AI, especially when hiring employees, is a decades-old practice, but recent advances in AI sophistication have created a bit of a ‘wild west’ feel for AI developers,” Landers said. “There are a ton of startups out there now that don’t know the existing ethical and legal standards for hiring people using algorithms, and they sometimes harm people by ignoring established practices. So we developed this framework to help inform both these companies and the relevant regulators.”
The researchers recommend that the standards they developed be followed both by internal auditors when developing high-stakes predictive AI technologies, and by independent external auditors thereafter. Any system that claims to make meaningful recommendations about how people should be treated must be evaluated within this framework.
“Industrial psychologists have unique expertise in evaluating high-stakes assessments,” Behrend said. “Our goal for this document was to educate developers and users of AI-based assessments on existing requirements for fairness and efficiency, and to guide the development of future policy that will protect workers and candidates.”
AI models are developing so rapidly that it can be difficult to keep up with the most appropriate approach to auditing a particular type of AI system. The researchers hope to develop more precise standards for specific use cases, partner with other organizations around the world interested in establishing auditing as the default approach in these situations, and work more broadly towards a better future with AI.