Be honest: do you believe that the data is always right? Or that algorithms never make mistakes? It may be very tempting to hide behind the data and the algorithms, but don’t forget that with great algorithmic power comes great responsibility. So, let’s face the truth: as much as we would like to believe that we have perfect data and algorithms, more often than not this is simply not the case. As algorithms increasingly replace human decision-making, it is important that you understand the implications and risks. Today, algorithms already make high-impact decisions such as: whether or not you are eligible for a mortgage, whether you will be hired, how likely you are to commit fraud, and so on. Algorithms are great at finding patterns we are most likely unable to find ourselves. But if you are not careful, the algorithm might favour unwanted patterns.
Case: Amazon and AI recruiting
In 2014, Amazon launched an experimental, AI-driven recruitment tool for its technical branch that rated incoming applications. The model was trained on resumes submitted over a 10-year period and on prior human recruitment decisions. After a year, however, it was found that the model had, for some reason, started to penalise applications from women.
So, what went wrong? Because the technical branch was male-dominated at the time, the very data used to train the model carried a bias towards men. To counter this, Amazon decided to strip indicative information such as name and gender from the applications. Case closed? Well, no. The model simply learned new patterns, penalising resumes that included the word ‘women’s’ (for example, ‘women’s chess club’) and graduates of all-women’s colleges. In the end, Amazon abandoned the recruitment tool as it was unable to address this issue.
The Black Box
The problem with complex AI models is that it is often very difficult to determine which features in the data were used to find predictive patterns. This phenomenon is also referred to as ‘the black box’: a ‘machine’ that takes a certain input, uses or transforms it in some way, and delivers an output. In many cases, though, you would want to know how the AI model arrived at a certain decision, especially when the automated decision could have a significant impact on your personal life (as with fraud detection).
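To make this a bit more concrete, here is a minimal sketch of how you could at least peek inside such a model by measuring which features drive its predictions, using permutation importance from scikit-learn. The dataset, feature names and model are purely hypothetical stand-ins for illustration (this is not Amazon’s tool or any real screening system):

```python
# Minimal sketch: inspecting which features drive a "black box" model.
# The synthetic data and feature names below are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for resume-derived features
X, y = make_classification(n_samples=1000, n_features=5, random_state=0)
feature_names = ["years_experience", "n_keywords", "education_level",
                 "gap_in_cv", "extracurriculars"]

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Permutation importance: how much does shuffling each feature hurt the score?
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for name, score in sorted(zip(feature_names, result.importances_mean),
                          key=lambda pair: -pair[1]):
    print(f"{name}: {score:.3f}")
```

Feature attributions like these do not fully ‘open’ the black box, but they at least reveal whether the model is leaning on features it should not be using.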
Profiling and the law
Such automated processing of personal data in order to analyse or predict certain aspects of individuals is also referred to as ‘profiling’. Legal safeguards against unlawful profiling do exist, for example through the General Data Protection Regulation (GDPR), the legal framework concerning the collection and processing of personal data of individuals in the European Union. Article 22 of the GDPR, for instance, specifies that individuals have the right not to be subject to automated processing and profiling that may produce negative (legal) effects.
One well-known case in the Netherlands with significant implications for individuals was the SyRI (System Risk Indication) case, in which the Dutch government used algorithms to detect social benefits and tax fraud. The problems with this system were that the amount of data used was unknown, datasets were linked using undisclosed risk models, and ‘suspicious’ individuals were automatically flagged and stored in a dossier without being informed in any way. Individuals affected by this automated decision-making suffered significant financial and mental problems for several years before the Dutch court ruled such profiling to be in violation of the European Convention on Human Rights. While the Dutch government has resigned over this case and promised to compensate all affected individuals, it has so far only managed to compensate a fraction of them.
Countering bias
While AI models can achieve high accuracy scores in terms of making correct classifications, this does not automatically mean that their predictions are fair, free of bias or non-discriminatory. So, what can you do? Here are some pointers according to the FACT principle (Fairness, Accuracy, Confidentiality, Transparency):
- Be mindful when processing personal data and beware of the potential implications for individuals. Ensure that decisions are fair and find ways to detect unfair decisions (a minimal example of such a check follows this list).
- Ensure that decisions are accurate, such that misleading conclusions are avoided. Test multiple hypotheses before deploying your model and make sure that the input data is ‘clean’.
- Confidentiality should be ensured in order to use the input data in a safe and controlled manner.
- Transparency is crucial. People should be able to trust, verify and correctly interpret the results.
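As a small illustration of the first pointer, the sketch below compares selection rates between two groups in a model’s decisions. The column names and the 80% rule-of-thumb threshold are assumptions made for this example and are not part of the FACT principle itself:

```python
# Minimal sketch of a fairness check: compare selection rates per group.
# The toy data, column names and 0.8 threshold are illustrative assumptions.
import pandas as pd

predictions = pd.DataFrame({
    "gender": ["F", "F", "F", "M", "M", "M", "M", "F"],
    "hired":  [0,   1,   0,   1,   1,   0,   1,   0],   # model's decision
})

# Selection rate per group
rates = predictions.groupby("gender")["hired"].mean()
print(rates)

# Disparate impact ratio: selection rate of the least-favoured group
# divided by that of the most-favoured group (the "80% rule" of thumb).
ratio = rates.min() / rates.max()
if ratio < 0.8:
    print(f"Possible unfair bias: disparate impact ratio = {ratio:.2f}")
```

A simple check like this will not catch every form of bias (as the Amazon case shows, proxies such as ‘women’s chess club’ can slip through), but it is a cheap first test to run before deploying a model.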
References
https://www.reuters.com/article/us-amazon-com-jobs-automation-insight-idUSKCN1MK08G
https://www.bbc.com/news/business-44466213
Very interesting post. I wonder if we will ever find a solution for discriminating algorithms. The algorithms learn from human input, so as long as humans discriminate, we cannot expect completely fair algorithms. It is an important issue, as the groups that are disadvantaged by algorithms, for example women or people with a migrant background, are often already disadvantaged in society in some way. Using algorithms increases the risk that these gaps in opportunity grow even larger, which is something we should be aware of. As it stands now, I think that for important decisions, such as fraud detection, AI models can be used to detect risks and flag suspects, but the final decision should still be in human hands, as the consequences are devastating for the victims of discriminating algorithms.
I really liked the article; it is true that more and more things are being decided based on the output of algorithms, and the importance of those decisions seems to increase fairly quickly too. The examples in this post were very interesting and illustrated well that you always have to critically assess the output of these processes. Progressively more tasks that humans used to do are being taken over by these types of processes, and it’s good to be aware of their flaws: they simply don’t have the emotional intelligence of humans. I do find it interesting to see that even a company as big as Amazon wasn’t able to make its algorithm act fairly and non-discriminatorily. That says a lot about the current state of our knowledge on this topic.
Hi Andrew,
In the past I worked on a project to identify opportunities for one of my previous employers to make recruiting and hiring more inclusive. In your post you clearly describe the case of Amazon and how they decided to abandon their recruitment tool because of its unfair bias against women. As Niek also mentions, the algorithms in these AI models are shaped by the initial human input. My experience, however, is that AI-based screening tools in HR processes are usually more transparent than any form of human screening; the difference between screening with or without AI is mostly the transparency. I believe that, once we are able to understand the scoring mechanism of an AI model, it will be a more reliable way of screening than any human ever could offer. However, given the current state of AI development, I too would argue that we should not opt for the autonomous intelligence approach described by PwC, but should go with an augmented intelligence approach. This mitigates some of the risks mentioned, but also brings a lot of advantages for employers. In my opinion, a very nice balance between company profits and CSR requirements. Do you feel that such an approach would be acceptable, or should we stick with a fully human approach?
Sources:
https://www.pwc.nl/nl/assets/documents/artificial-intelligence-in-hr-a-no-brainer.pdf
https://hbr.org/2019/04/the-legal-and-ethical-implications-of-using-ai-in-hiring
I am very curious to discuss further!
Kwint
Hi Kwint! One of the largest challenges in automated decision-making is that the models are, in most cases, not transparent at all. The SyRI case I discussed in my article shows that even the Dutch government, which should be transparent and accountable, failed to design fair algorithms. I would love to hear about your experiences with AI in HR, though!
The problem with semi-autonomous decision-making is that the person who has to evaluate the scores produced by the AI model often does not have a thorough understanding of how such a score is calculated. Moreover, that same person may be tempted to just blindly follow the advice produced by the scoring model. Other variations, such as those where ‘bad’ applicants are automatically removed from human review, may also introduce unfair bias if the AI model is not properly trained or encounters an outlier.
Despite these large challenges, it is important that we stay critical of our use of AI models in sensitive use cases. I don’t think a fully human approach is necessary, as long as the AI models are transparent and the end user (or ‘data subject’) can hold an entity accountable in case of unfair automated decision-making.
Hi Andrew,
The case you point out is really alarming. Algorithms seem to be a solution for many businesses, and I totally agree that they make business processes more effective. However, as you already stated, there are risks that come along with them, and lots of people underestimate these threats. The four suggestions you make at the end sound promising; nevertheless, don’t you think Amazon took these suggestions into account when creating their recruitment tool? They really underestimated the power of Artificial Intelligence, which is threatening for our future. Don’t you think that at some point in time we will no longer even notice the harms of Artificial Intelligence and simply accept it as the new standard?
Kwint, your comment sheds a different light on this case. Due to your experience in this industry, you even think algorithms are less harmful than the human approach. However, what should we do as humans to understand the scoring mechanism of an AI model? I think that once we know how to understand these tools, there are real opportunities to make use of an augmented intelligence approach, but we should never overlook the human role in this process, in order to protect our social values.
Hi Myrthe! The FACT data principle was introduced by the Responsible Data Science (RDS) consortium in 2016, while the development of Amazon’s recruiting tool started in 2014. I would personally say that in those days there was a fixed mindset in which the majority of people working with AI models falsely assumed that algorithms and data must always be right. It is only in recent years that we have become more conscious of the implications and risks of using such models.
Great article! I think we often forget that algorithms are made by people and are therefore open to being influenced by common human biases or errors. However, determining whether a certain algorithm can be trusted also has its challenges. While a lot of us want to be critical of algorithms and how they arrive at their output, the technical complexity of these systems and the black-box phenomenon make it difficult for ordinary people to fully evaluate the trustworthiness of the algorithms. As a result, we have to rely on the reputation of the companies that develop these algorithms, or on the number of people that use them, as indicators of trustworthiness. But, as we can see in the Amazon example, this does not always work.
I think we still have a long way to go before we can create algorithms that are fully free from bias and errors. However, we could work on creating more systems that can detect and correct these errors, to hopefully move us closer to a world where algorithms can be fully trusted. There are already researchers working towards this, for example this report by Lee, Resnick, and Barton, which proposes some solutions for mitigating the common biases that we observe in algorithms (https://www.brookings.edu/research/algorithmic-bias-detection-and-mitigation-best-practices-and-policies-to-reduce-consumer-harms/).
Interesting blogpost, Andrew! As you briefly touch upon, creating an algorithm without any sort of bias whatsoever is tricky, as bias can enter during every stage of its lifecycle (development, training, etc.).
In my blogpost I actually take a more surface-level approach in examining one of the practical applications you mention, namely algorithms that determine whether a candidate is hired or advances to the next stage. Considering our posts are inherently intertwined, I also refer to your post for more background information on algorithmic bias. I invite you to read my blogpost as I’m curious to hear your thoughts on Pymetrics!