The Dark Side of ChatGPT: Dubious Labor

1 October 2023


ChatGPT is a large language model developed by OpenAI. It has enjoyed major success since its launch in November 2022, reaching 100 million monthly users in just two months and making it the fastest-growing consumer application in history (Hu, 2023). But how did it get so good? GPT-3, ChatGPT’s underlying large language model, already exhibited a strong grasp of human language at an early stage. A major requirement in the training process is to feed the model as many texts as possible, because it needs to extract patterns from them in order to ‘understand’ them, or at least exhibit a form of understanding. This is possible thanks to the enormous amount of open data on the internet. You are probably aware, however, that much content on the internet is not particularly well thought-out and is sometimes downright factually wrong, morally objectionable, or biased. So, aside from having a linguistic understanding, the model should also follow instructions on its behavior, to prevent it from imitating the many morally dubious texts on the internet.

So, how did OpenAI accomplish this? Well, they took a leaf out of the playbook of social media companies. Experienced players like Meta, X, and Google have been battling “harmful” content for many years now. Since human moderation cannot come close to covering all the content uploaded to their enormous platforms, they rely on AI as well. Their strategy is to supply these models with texts that humans have labeled as containing violence, sexual abuse, misinformation, or other forms of harmful content. Once the AI model has extracted a representation of these forms of content (i.e., ‘learned’ them), it can greatly support these tech companies in flagging and banning harmful content.
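The learning step described above — a model extracting a representation of harmful content from human-labeled examples — can be sketched with a toy word-counting classifier. This is a hypothetical miniature illustration with made-up labels and data; real moderation systems use large neural networks trained on millions of labeled texts.

```python
from collections import Counter

# Hypothetical human-labeled examples (real datasets are vastly larger).
labeled_data = [
    ("i will hurt you", "harmful"),
    ("you deserve to suffer", "harmful"),
    ("have a wonderful day", "safe"),
    ("thanks for your help", "safe"),
]

def train(examples):
    """Count word frequencies per label: a minimal learned 'representation'."""
    counts = {"harmful": Counter(), "safe": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts

def classify(model, text):
    """Score each label by summed word counts; return the higher-scoring label."""
    scores = {
        label: sum(counter[word] for word in text.lower().split())
        for label, counter in model.items()
    }
    return max(scores, key=scores.get)

model = train(labeled_data)
print(classify(model, "I will make you suffer"))  # → harmful
```

The point of the sketch is only the workflow: humans supply the labels, the model extracts statistical patterns from them, and new text can then be flagged automatically.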

So, what is the issue? Well, before an AI model can learn what harmful text looks like, it must be supplied with a large number of texts labeled as such. Bigger deep learning models have more parameters (more neurons) that need to be calibrated, meaning that even more training data is needed. So, OpenAI resorted to cheap offshore labor to get this done. A revealing article by Perrigo (2023) exposes that the company hired Kenyan workers for this job, paying them between $1.32 and $2 per hour to label texts. What is more, these were very dark, harmful texts from the internet, describing awful things like murder, torture, and worse.

These ethical concerns are rarely raised amidst the hype around ChatGPT and similar technologies, but they pose a serious dilemma. Labeled texts transformed the GPT model from merely understanding language into a docile colleague for everyday use. This improvement was a key development that made the technology commercially viable, thus creating its big impact on business and our everyday lives. The question is: was it worth the method? That is a question we should answer as a collective. People should at least be aware of it so it can be incorporated into the public discourse on AI.

Sources

Hu, K. (2023, February 2). ChatGPT sets record for fastest-growing user base – analyst note. Reuters. https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/

Perrigo, B. (2023, January 18). Exclusive: The $2 per hour workers who made ChatGPT safer. Time. https://time.com/6247678/openai-chatgpt-kenya-workers/


Generative AI in education: a pivotal technology

23 September 2023


Generative artificial intelligence is capable of generating text, images, and other media. These AI models learn (a representation of) the patterns in data and can, as a result, generate ‘new’ data from those patterns. The technology has become so sophisticated that the texts and images it generates are often indistinguishable from those created by humans. Needless to say, its introduction to the general public sparked a lot of enthusiasm, doubt, and debate.
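The idea of learning patterns from data and then generating ‘new’ data from them can be illustrated with a toy word-level Markov chain. This is a deliberately simplistic, hypothetical example (the corpus and names are made up); real generative AI models are deep neural networks trained on vast corpora.

```python
import random
from collections import defaultdict

# A tiny made-up training corpus; real models train on billions of words.
corpus = "the model learns patterns from data and the model generates new data"

def train(text):
    """Learn which word follows which: a minimal pattern 'representation'."""
    transitions = defaultdict(list)
    words = text.split()
    for current, following in zip(words, words[1:]):
        transitions[current].append(following)
    return transitions

def generate(transitions, start, length=6, seed=0):
    """Sample 'new' text by repeatedly picking a learned next word."""
    random.seed(seed)
    out = [start]
    for _ in range(length - 1):
        options = transitions.get(out[-1])
        if not options:
            break
        out.append(random.choice(options))
    return " ".join(out)

model = train(corpus)
print(generate(model, "the"))
```

The generated sentence is not copied from the corpus, yet every transition in it was observed there — the same basic principle, scaled up enormously, underlies generative AI.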

In education, generative AI models like ChatGPT made a big impact. It became obvious very quickly how much easier they made it for students to cheat on their work. Because of its ability to generate well thought-out, human-like texts, ChatGPT is the perfect tool for a wide range of assignments and tests. Some institutions embraced it, others fought it. Stopping this technology’s influence on education is, however, an unrealistic expectation.

Rather, generative AI models should be sensibly incorporated within the vision of education, utilizing their potential and limiting their drawbacks. The quality of education is always under pressure; in the Netherlands, for example, there has been a persistent shortage of well-qualified teachers. First, these AI tools could come to the rescue with their unique ability to offer personalized learning for every student, tailoring content to the student’s strengths and weaknesses and providing a level of attention that could not have been attained before. Second, these AI tools could create study materials and lecture notes, allowing educators to focus more on teaching and oversight. Third, generative AI models are particularly good at ‘understanding’ and producing text, so they can be outstanding assistants when learning a new language: they can correct one’s grammar and formulation, as well as provide engaging assignments related to the culture or history of the associated country.

All in all, generative AI holds the potential to massively improve education. Thoughtful policy from educators and legislators is a precondition, however. Shifting the focus from testing a student’s memorization, which is easily gamed with these tools, to testing comprehension, which these tools can support, would be a big step in the right direction.
