Artificial bias: Intelligent machines will inherit our prejudices
Back in 2010, the android Philip, crafted by David Hanson and a collaboration of artists and scientists, was asked whether he could think. He replied: "A lot of humans ask me if... everything I do is programmed." He then explained that everything humans and animals do is, to some extent, programmed as well. Philip hauntingly resembles the late science fiction writer Philip K. Dick. But his most striking feature is his ability to use speech recognition software, compare words to a database, and give an appropriate answer, making him a great conversationalist. Humble enough to admit his current limitations, Philip went on to assure us that he will become better once he starts integrating new words from the Web. The ability to integrate language in this manner has paved the way for some amazing technologies. At Facebook, artificial intelligence researchers developed a system that answers questions about The Lord of the Rings after reading a summary of the book. Similarly, the AI startup MetaMind published research on a system that answers questions about a piece of natural language, and even analyzes its emotions, using a type of short-term memory. More remarkable still is the demonstration by Rick Rashid, the founder of Microsoft Research, of speech recognition software that translates English to Chinese in real time with an error rate of 7 percent. Researchers at Princeton University are concerned, however, that machine language learning, while offering tremendous advantages in diverse applications, will come at a price: intelligent machines that integrate our most harmful prejudices.
The notion of biased machines may sound strange. After all, they don't carry the human historical background necessary for a regularity, and thus a bias, to develop. But researchers at Princeton and Bath showed that the mere integration of human language by a machine is enough to provide such regularity, making AI systems prone to acquiring our biases, including our most harmful prejudices: racism and sexism. These findings have broad implications, not only in machine learning but in diverse fields including psychology, sociology, and ethics.
Measuring machine bias
To quantify machine bias, the team used a variant of the Implicit Association Test, a test long used to document human bias. Unlike in machines, bias detection in humans is relatively straightforward. The Implicit Association Test, introduced in 1998, rests on a simple idea: ask participants to pair two concepts, then measure their response time. A quicker response means the concepts are more closely linked in the participant's mind. For example, participants were quicker to pair flowers with pleasant words and insects with unpleasant words than they were to pair them in reverse (insects with pleasant, flowers with unpleasant). This indicates an implicit bias to associate flowers with being pleasant. Of note, bias is used here without negative connotation, solely to indicate an implicit preference. This bias toward flowers is a neutral bias, generating no social concern. But ever since its introduction, the Implicit Association Test has documented, in addition to universally accepted neutral biases, implicit racial and sexist prejudices.
To implement this test in a machine, the researchers used word embedding: a representation of words as points in a vector space. This same technique underlies much of natural-language processing, including web search and document classification, and is also used in cognitive science to model human memory and recall. And because, in research of this kind, size matters, they used a corpus of roughly 840 billion words obtained from a large-scale crawl of the Web. There are different ways to build such embeddings, but they chose the state-of-the-art GloVe embedding and expect similar results from other embedding algorithms. The idea is that, by measuring the distance (technically, the cosine similarity score) between the vectors representing two words, we can measure the semantic similarity between those words. For example, if programmer is closer to man than to woman, it suggests a gender stereotype. To correct for chance and noise, they used small baskets of terms to represent similar concepts, making the results statistically significant. Using this technique, they were able to document all the classical biases.
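The distance-based test described above can be sketched in a few lines of Python. The tiny three-dimensional vectors below are invented purely for illustration (the study used 300-dimensional GloVe vectors trained on the 840-billion-token web crawl), and the `association` helper is a simplified stand-in for the paper's basket-of-terms comparison: it measures how much closer a word sits to one attribute set than to another.

```python
# Minimal sketch: cosine similarity between word vectors, plus a simplified
# association score in the spirit of the researchers' test. The toy vectors
# are hypothetical, not real GloVe values.
import math

def cosine(u, v):
    """Cosine similarity: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def association(w, attrs_a, attrs_b):
    """Mean similarity of word w to attribute set A minus its mean
    similarity to attribute set B. Positive means w leans toward A."""
    mean_a = sum(cosine(w, a) for a in attrs_a) / len(attrs_a)
    mean_b = sum(cosine(w, b) for b in attrs_b) / len(attrs_b)
    return mean_a - mean_b

# Hypothetical toy embeddings chosen so the flower/insect bias is visible.
vec = {
    "flower":     [0.9, 0.1, 0.0],
    "insect":     [0.1, 0.9, 0.0],
    "pleasant":   [0.8, 0.2, 0.1],
    "unpleasant": [0.2, 0.8, 0.1],
}

pleasant = [vec["pleasant"]]
unpleasant = [vec["unpleasant"]]
print(association(vec["flower"], pleasant, unpleasant))  # positive: leans pleasant
print(association(vec["insect"], pleasant, unpleasant))  # negative: leans unpleasant
```

In the actual study, each side of the comparison is a basket of many terms (e.g. dozens of flower names and dozens of pleasant words), and the scores are aggregated into an effect size with a permutation test for significance; the sketch above only shows the core distance comparison.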
Machines are neutral? Guess again
In the original Implicit Association Test, as in the word embedding used by the researchers, European American names are more likely than African American names to be closer to pleasant than to unpleasant. Additionally, female terms are more associated with family and male terms with career. Female terms, as in the original test, were also more associated with the arts than with mathematics, compared to male terms. In a world where AI is given increasing agency, these results raise troubling questions by undermining the myth of machine neutrality. The team concluded that "if AI is to exploit via our language the vast knowledge that culture has compiled, it will inevitably inherit human-like prejudices."
The intuitive solution might be to correct for biases with a different algorithm, but the researchers warn that it's not that simple: it is impossible to employ language meaningfully without bias. The algorithm does not pick up gender bias alone but a whole spectrum of human biases, reflected in language and making it meaningful: there is no meaning without bias. Humans integrate different forms and layers of meaning; we can later learn that "prejudice is bad" while keeping the other layers of bias necessary for language and intelligence. For now, that is impossible for an AI system, because we try to keep such systems as simple as possible. This research is a step in the right direction and opens the way for others to try to implement this multi-layered complexity in AI.
The results discussed above lend credibility to the theory that language alone is sufficient to explain how prejudices are transmitted from generation to generation, and why they are not easy to correct. In this view, prejudice stems from preference for one's own group rather than from active malice toward others, and correcting it requires direct intervention: de-categorizing and re-categorizing outgroups. This knowledge is crucial given the hate rhetoric toward others that is prevalent today. AI prejudices may not be easy to correct at this point, but for now, we can start by correcting our own.
- Aylin Caliskan-Islam, Joanna J. Bryson, Arvind Narayanan: Semantics derived automatically from language corpora necessarily contain human biases.
- MIT Technology Review: Deep learning
- GloVe: Global Vectors for Word Representation: http://nlp.stanford.edu
- Encyclopedia Britannica