Earlier this month, researchers from the Massachusetts Institute of Technology and Stanford University reported that three commercial facial-analysis programs from major technology companies showed bias along both skin-type and gender lines. Error rates for determining the gender of light-skinned men were as low as 0.8%, while error rates for darker-skinned women were far higher, reaching 20% and even 34% in some cases.
This is not the first time an algorithm powering an AI application has delivered an erroneous, to say nothing of embarrassing, result. In 2015, Flickr, a photo-sharing site owned by Yahoo, launched image-recognition software that automatically created tags for photos. The problem? Some of the tags were highly offensive: "sport" and "jungle gym" for pictures of concentration camps, and "ape" for pictures of humans, including an African American man. The service also tagged a white woman wearing face paint as an ape.
In 2016, when Microsoft unveiled Tay, a chatbot for Twitter, it took about 24 hours for Tay to pick up misogynistic and racist language from Twitter users and begin repeating it back to them.
How do such events happen? Algorithms, after all, are formal rules that make predictions based on historical patterns. That would seem to be the antithesis of bias.
Why Algorithms Go Bad
There are several reasons why algorithms deliver unexpected results, said Julia Stoyanovich, a professor in Drexel’s College of Computing & Informatics who studies the ethical development of algorithms and artificial intelligence, but it usually comes down to bias in the original data on which the algorithm is trained, validated and ultimately deployed.
Humans play a role too, she added, through the scoring methods developed for the data set and the decisions about how to weight its different attributes. Here is one example she offered: “For college admissions, someone might say 'I’m going to give an equal importance to SAT scores and to GPAs,' and they may not realize that math SAT scores are systematically lower for women and that English SAT scores are systematically lower for African Americans.”
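The mechanism Stoyanovich describes is easy to see in a few lines of code. This is a minimal sketch with entirely hypothetical numbers: a weighting scheme that looks neutral on its face still passes any systematic gap in one input straight through to the final score.

```python
def admissions_score(sat, gpa, w_sat=0.5, w_gpa=0.5):
    """Combine a normalized SAT score and GPA with fixed, 'neutral' weights."""
    return w_sat * sat + w_gpa * gpa

# Two applicants with identical GPAs. Applicant B's SAT score is shifted
# down by a hypothetical systematic gap unrelated to ability.
applicant_a = admissions_score(sat=0.80, gpa=0.90)
applicant_b = admissions_score(sat=0.80 - 0.10, gpa=0.90)

print(applicant_a)  # 0.85
print(applicant_b)  # 0.80 -- the group-level gap survives the "equal" weighting
```

Nothing in the formula mentions a protected attribute, yet the ranking it produces reflects whatever bias the inputs carry.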
Or humans might give the algorithm faulty data to train on. In a research paper, Eliezer Yudkowsky of the Machine Intelligence Research Institute tells of a computer vision system the US Army set out to build to automatically detect camouflaged enemy tanks. The system was supposed to identify pictures of tanks, but in reality it learned to identify the backgrounds of those pictures.
Yudkowsky explained that the researchers trained a neural net on 50 photos of camouflaged tanks in trees, and 50 photos of trees without tanks. “It turned out that in the researchers’ dataset, photos of camouflaged tanks had been taken on cloudy days, while photos of plain forest had been taken on sunny days. The neural network had learned to distinguish cloudy days from sunny days, instead of distinguishing camouflaged tanks from empty forest,” he wrote.
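The failure mode in Yudkowsky's anecdote can be sketched in a few lines. All of the "photos" below are synthetic stand-ins (a single brightness number per image), and the classifier is a deliberately trivial threshold rather than a real neural net, but the lesson is the same: when a spurious feature (brightness, standing in for cloudy versus sunny) perfectly separates the training set, a learner can score flawlessly in training while ignoring the feature that actually matters (tank or no tank).

```python
import random

random.seed(0)

def fake_image(tank, cloudy):
    """Return (mean_brightness, tank_label) for a synthetic photo."""
    brightness = random.uniform(0.2, 0.4) if cloudy else random.uniform(0.6, 0.8)
    return brightness, tank

# Flawed dataset: every tank photo was taken on a cloudy day,
# every tank-free photo on a sunny day.
train = [fake_image(tank=1, cloudy=True) for _ in range(50)] + \
        [fake_image(tank=0, cloudy=False) for _ in range(50)]

# A simple brightness threshold separates the training set perfectly...
def classify(brightness):
    return 1 if brightness < 0.5 else 0

train_acc = sum(classify(b) == y for b, y in train) / len(train)
print(train_acc)  # 1.0 on the biased training data

# ...but a tank photographed on a sunny day is misclassified.
sunny_tank_brightness, label = fake_image(tank=1, cloudy=False)
print(classify(sunny_tank_brightness), label)  # predicts 0, true label is 1
```

The model isn't malicious or broken; it simply learned the easiest pattern the flawed data offered.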
While examples of African Americans being tagged as apes, or concentration camps as jungle gyms, are highly offensive, such incidents ultimately don’t have a broad societal impact, Stoyanovich said. “The algorithms that can do the most damage are both discriminating against groups of individuals systematically and are opaque, so an individual cannot know that there’s something going on,” she said. One of the first examples of this kind was documented in 2013 by Latanya Sweeney, a professor of Government and Technology in Residence at Harvard University, when she published work she had conducted on the ads served against searches for racially identifiable names.
Sweeney, an African American, Googled her own name and received an ad asking, “Would you like to see the criminal record of Latanya Sweeney?” Sweeney, who has no criminal record, began a study comparing the incidence of such ads in the results of Google searches on racially identifiable names. She found the ads were served at statistically significantly higher rates for names like hers than for names like, say, Mary Jones. “This, of course, is terrible, because when a potential employer Googles your name and receives an ad offering to serve up that person’s criminal records, they might assume that person actually has a criminal record,” Stoyanovich said.