Powerful and unreleased – Learnings from GPT-2

15 October 2019

Whenever you read an article, blog post or book, you assume that another human sat down, took the time and wrote the piece. The next generation may no longer take this for granted.

In February 2019 OpenAI, a non-profit organization researching AI, announced that it had created a language model able to generate realistic text in various forms, including news articles and fiction, in an unprecedented way. Given an input text, the model generates coherent text over a page or more, adapting to the writing style of the input (an example can be found here). Concerned about possible malicious use, OpenAI decided not to publish the full model with its 1.5 billion parameters and released only a much smaller version (Radford, 2019).
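To get a feel for this behavior, here is a minimal sketch of prompting the publicly released small GPT-2 checkpoint to continue an input text. It assumes the Hugging Face transformers library and the "gpt2" model name; it is not OpenAI's own code, and the prompt is purely illustrative.

```python
# Minimal sketch: continue a prompt with the small, publicly released GPT-2.
# Assumes the Hugging Face "transformers" library is installed (pip install transformers).
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "The new language model surprised researchers because"
# Generate one continuation of up to 100 tokens in the style of the prompt.
result = generator(prompt, max_length=100, num_return_sequences=1)

print(result[0]["generated_text"])
```

The released 124M-parameter model produces noticeably less coherent text than the withheld 1.5B-parameter version, but the mechanism is the same.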

Natural Language Processing (NLP) models are trained on large amounts of (online) text, in the case of GPT-2 about 40 GB. During training the model learns to predict the next word in a sentence. With better network architectures, larger amounts of data and improved training methods, the accuracy of such models steadily increases (Radford, 2019).
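The next-word-prediction objective can be illustrated in a few lines. The sketch below, again assuming the Hugging Face transformers library and the small public GPT-2 checkpoint, computes the loss the model would be trained to minimize: how badly it predicts each token of a sentence from the tokens before it. The example sentence is arbitrary.

```python
# Minimal sketch of the next-word-prediction objective used to train GPT-2.
# Assumes "transformers" and "torch" are installed; not OpenAI's training code.
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

text = "The quick brown fox jumps over the lazy dog."
inputs = tokenizer(text, return_tensors="pt")

# Passing the input ids as labels makes the model return the cross-entropy
# loss for predicting every token from its preceding context.
outputs = model(**inputs, labels=inputs["input_ids"])
print(float(outputs.loss))  # lower loss = better next-word predictions
```

Training consists of repeating this step over billions of words and adjusting the parameters to drive the loss down.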

Six months after the announcement, OpenAI posted a follow-up. In the meantime, two larger versions of the GPT-2 model with more parameters had been released. In the statement OpenAI revealed that it had spoken to five other teams that were able to replicate the model. It also presented studies suggesting that humans can be convinced by AI-generated, synthetic text, and admitted that detecting such synthetic text is very difficult (Clark, 2019).

OpenAI's cautious release strategy so far did not prevent others from replicating the model or even training more complex ones. In August NVIDIA announced that it had trained a language model with 8.3 billion parameters, much larger than Google AI's BERT or GPT-2 (NVIDIA, 2019).

Given the incredibly fast development of the last two years and the astonishing increase in size and performance of current language models, I suggest that the deployment of synthetically generated text, with its upsides and downsides, will be highly relevant within the next three years. The possible implications are huge, the public debate is just starting, and, as the case of GPT-2 shows, further progress is hard to delay.

 

References:

Clark, J. (2019). GPT-2: 6-Month Follow-Up. Retrieved October 14, 2019, from OpenAI website: https://openai.com/blog/gpt-2-6-month-follow-up/

NVIDIA Newsroom. (2019). NVIDIA Achieves Breakthroughs in Language Understanding to Enable Real-Time Conversational AI. Retrieved October 14, 2019, from NVIDIA Newsroom website: https://nvidianews.nvidia.com/news/nvidia-achieves-breakthroughs-in-language-understanding-to-enable-real-time-conversational-ai

Radford, A. (2019). Better Language Models and Their Implications. Retrieved October 14, 2019, from OpenAI website: https://openai.com/blog/better-language-models/
