GPT-3 is a promising new program developed by OpenAI, a research company co-founded by (who else but) Elon Musk, among others. GPT-3 is a linguistic neural network, which in essence is nothing more or less than a stochastic language model that predicts the most likely word or token to follow a prompted sequence, just like GPT-2. What truly sets GPT-3 apart, however, is its scale: it contains 175 billion parameters, in contrast to GPT-2’s 1.5 billion (arXiv, 2020), an increase of more than a hundredfold. This gives rise to new and surprising possibilities, which may have implications for the skills that you are developing right now.
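To make "predicting the most likely next word" concrete, here is a minimal sketch of a toy language model that simply counts which word follows which. It is an illustration only, not how GPT-3 is built: GPT-3 is a large neural network, whereas this is a word-pair counter, and the corpus and function names are made up for the example.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words follow it and how often."""
    follow_counts = defaultdict(Counter)
    words = text.lower().split()
    for current_word, next_word in zip(words, words[1:]):
        follow_counts[current_word][next_word] += 1
    return follow_counts


def next_word_probabilities(model, word):
    """Turn the raw follow-up counts for `word` into probabilities."""
    counts = model.get(word, Counter())
    total = sum(counts.values())
    return {candidate: count / total for candidate, count in counts.items()}


def most_likely_next_word(model, word):
    """Return the most probable follow-up word, or None if the word has none."""
    probabilities = next_word_probabilities(model, word)
    return max(probabilities, key=probabilities.get) if probabilities else None


# Hypothetical toy corpus, chosen only for illustration.
corpus = "the cat sat on the mat and the cat ate the fish"
model = train_bigram_model(corpus)

print(next_word_probabilities(model, "the"))
print(most_likely_next_word(model, "the"))
```

The principle is the same as in a large model: given what came before, assign a probability to each possible continuation. The difference lies in scale and in how much context is taken into account.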
You may think that the scale of a language model merely improves the accuracy of its language interpretation and prediction, which would amount to an embellished auto-completer, but GPT-3 can do more. It can generate differently formatted responses to whatever prompt it is given. This means that it not only produces text that seems to be written by humans, but can also give surprisingly accurate answers to questions, as long as it is given a few examples and a prompt. This is known as “few-shot” learning, something that its predecessors were far less capable of. Without being fine-tuned for a specific task, GPT-3 may perform some tasks as effectively as the traditional domain-specific programs that were written for them (ZDNet, 2020). Such a general-purpose, low-effort program that is prompted in natural language provides interesting opportunities.
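As a rough sketch of what "a few examples and a prompt" can look like in practice, the snippet below assembles a few-shot prompt as plain text. The "Input:"/"Output:" layout, the function name, and the translation examples are assumptions made for illustration; the snippet only builds the text and does not call any model.

```python
def build_few_shot_prompt(task_description, examples, query):
    """Assemble a task description, worked examples, and a new query into one prompt."""
    lines = [task_description, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The final "Output:" is left open so the model can complete it.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Hypothetical examples; any small set of input/output pairs would do.
examples = [("cheese", "fromage"), ("house", "maison")]
prompt = build_few_shot_prompt("Translate English to French.", examples, "cat")
print(prompt)
```

The examples live entirely inside the prompt, so the model itself is not changed in any way; swapping the examples is enough to switch to a different task.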
Training and fine-tuning language models to perform specific tasks are strenuous endeavours that require lots of data, human capital, and patience. With the scale of GPT-3, such task-specific training is taken out of the equation; the newest language model by OpenAI requires no gradient updates or fine-tuning for a new task. This means that GPT-3 can be deployed as-is, and it tends to grow more effective at a task as it is shown more examples in the prompt.
Although GPT-3 is currently only accessible to a group of beta users, it could potentially spur another wave of automation. Customer support and content writing are the most obvious tasks that can be automated, but GPT-3 can do more than that. The linguistic neural network can actually infer and generate code in different programming languages to create functional scripts (DaleonAI, 2020). Just consider the applications thought of by the following Twitter users:
- @sharifshameem with…
  - a layout generator that generates JSX code: https://twitter.com/sharifshameem/status/1282676454690451457
  - a todo list app by simply describing how it should work: https://twitter.com/sharifshameem/status/1284421499915403264
- @mattshumer_ with…
  - an automatically built machine learning model, by simply describing the dataset and required output to GPT-3: https://twitter.com/mattshumer_/status/1287125015528341506
Does this make you second-guess whether choosing the BIM: Data Science track is still the right choice? Well, my advice would be not to overthink it. GPT-3 still has a number of limitations that keep it from being viable for enterprise adoption (ZDNet, 2020).
GPT-3 …
- becomes repetitive and loses coherence throughout longer texts.
- does not possess common sense; it is not a ‘reasoning’ machine, although some answers may lead you to think it is. It is a stochastic linguistic model at its core, and cannot reason through logic.
- has variable output quality; some answers are amazingly accurate, whereas others are complete nonsense.
- operates on ‘garbage in, garbage out’. Some prompts will be more effective at retrieving high-quality answers than others.
- amplifies human biases and can sneak racism, sexism, and other types of discrimination and prejudice into its prompted answers.
- drives out creative answers that are sometimes needed; due to its stochastic nature, the program tends to stay within the high-probability bulk of the distribution rather than venturing into its tails.
- cannot be tuned to company-specific data, and is difficult to specialize for an industry.
  - Some start-ups, such as Sapling, may bridge this gap. However, it is not yet known whether this is even possible.
- may lead people to infringe copyrights; GPT-3 was trained on text scraped from different online sources, and does not consider whether these are copyrighted when producing its answers.
- is a black box for its users. This becomes particularly worrying when considering its variable output quality, amplification of human biases, and copyright issues.
- requires tremendous computational power and investments to run. It is in no way commercially viable as of yet.
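The points in the list above about variable output quality and a lack of creative answers both come down to how the next word is picked from a probability distribution. The sketch below contrasts always taking the most likely word with sampling at random. The word probabilities are invented for the example, and the "temperature" rescaling is a common sampling knob that is assumed here for illustration; it is not something the sources of this post describe.

```python
import math
import random


def apply_temperature(probabilities, temperature):
    """Rescale a distribution: below 1 sharpens it, above 1 flattens it.

    Assumes every probability is strictly positive.
    """
    scaled = {
        token: math.exp(math.log(p) / temperature)
        for token, p in probabilities.items()
    }
    total = sum(scaled.values())
    return {token: value / total for token, value in scaled.items()}


def greedy_choice(probabilities):
    """Always pick the single most likely token: predictable, never surprising."""
    return max(probabilities, key=probabilities.get)


def sample_choice(probabilities, rng):
    """Pick a token at random, weighted by probability: varied, sometimes odd."""
    tokens = list(probabilities)
    weights = [probabilities[token] for token in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Hypothetical next-word probabilities, made up for the example.
next_token_probs = {"good": 0.6, "great": 0.3, "luminous": 0.1}

sharpened = apply_temperature(next_token_probs, 0.5)
flattened = apply_temperature(next_token_probs, 2.0)

rng = random.Random(0)  # fixed seed so repeated runs behave the same
print(greedy_choice(next_token_probs))
print(sample_choice(flattened, rng))
```

Leaning towards the most likely word gives safe but bland text, while giving rare words more weight makes the output more original at the cost of more frequent nonsense.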
So should you then not care at all? Well, not quite.
This could be a new chapter in machine learning. Consider the synergies of linguistic neural networks with neuromorphic systems (link to my other blog discussing this: https://digitalstrategy.rsm.nl//2020/09/29/biomimicry-from-neural-networks-to-neural-architecture/), the integration of audio and image processing in future versions of GPT-x, and the possibility of running these algorithms at the edge rather than in the cloud.
Who knows what the future may hold? Perhaps in a decade or two, we will all be working with virtual employee assistants that integrate data flows from all of our enterprise systems, that can query whatever system we want through conversational prompts, and that generate their own code based on our description of the end product. Imagine asking your Siri-on-steroids to optimize your scheduling, set agendas for meetings, interpret data for decision-making, and perform your customer due diligence work.
However, we are far from there yet. For now, learning how to program and analyse data still seems like a great choice. Just make sure to keep an eye out for future developments.
References
ZDNet. (2020) https://www.zdnet.com/article/what-is-gpt-3-everything-business-needs-to-know-about-openais-breakthrough-ai-language-program/ [Accessed October 6, 2020]
arXiv. (2020) https://arxiv.org/abs/2005.14165 [Accessed October 6, 2020]
DaleonAI. (2020) https://daleonai.com/gpt3-explained-fast [Accessed October 6, 2020]