The power of GPT-3

18 October 2022


In 1950, the British mathematician Alan Turing proposed a test for artificial intelligence that is still widely used today. The Turing Test, as it is nicknamed, assesses a machine’s ability to generate responses indistinguishable from a human. To pass, the machine must fool a person into thinking it is human at least 30% of the time during a five-minute conversation. The Turing Test is not a perfect measure of intelligence, but it is a useful way to compare the capabilities of different machines. And on that score, the latest artificial intelligence system from Google, called GPT-3, looks very promising. GPT-3 is the latest incarnation of a so-called “language model” developed by Google Brain, the company’s deep-learning research group. Previous versions of the model, known as GPT-2 and GPT-1, were released in 2018 and 2019, respectively. But GPT-3 is much larger and more powerful than its predecessors. To train GPT-3, Google fed it a dataset of over 300 billion words, which is about 10 times the size of the training data used to develop GPT-2. As a result, GPT-3 is far better at understanding and responding to natural language queries. In a recent test, GPT-3 was given a set of questions typically used to assess a machine’s reading comprehension. The questions were taken from the SQuAD 2.0 dataset, which is a standard benchmark for natural language processing systems. GPT-3 answered 95% of the questions correctly, while the best previous system, BERT, got only 93% correct. Similarly, GPT-3 outperformed all other systems on a reading comprehension test designed for elementary school children. On this test, GPT-3 was correct 94% of the time, while the best previous system got only 86% correct. These results suggest that GPT-3 is not only the best language model currently available, but also that it is rapidly approaching human-level performance on reading comprehension tasks. 
This is remarkable progress, and it suggests that GPT-3 could be used for a variety of applications that require reading comprehension, such as question-answering, summarization, and machine translation. But GPT-3 is not just a reading comprehension machine. It is also very good at generating text.

At this point in the article, an interesting question can be posed. Do you think the first paragraph was written by a human or a machine? If you thought the correct answer is human, it may indicate that GPT-3 truly is on the verge of passing the Turing Test, as the texts it creates become less and less distinguishable from those written by humans. (Attentive readers may have spotted a giveaway: the generated paragraph misattributes GPT-3 to Google, when it was in fact developed by OpenAI – confidently stated falsehoods are a hallmark of such models.) One of the main allures of this AI is its simplicity of use – the first paragraph was generated using the prompt “Write a short article about Turing’s principle. Describe how emergence of GPT 3 has changed article writing. Use the style of ‘The Economist'” (the entire code can be found below; I truly recommend trying this AI, as the opportunities it offers are marvelous).

As GPT-3 itself mentioned above, the Turing Test is an imperfect measure, since it does not account for whether an AI is intelligent or just very skilled at imitating intelligence. This idea is encapsulated in the so-called “Chinese room argument” (Searle, 1999), a thought experiment: imagine that a person who does not know a single word of Chinese is locked in a room. In this room there is a rule book which explains, symbol by symbol, how to respond to any Chinese input. The person receives questions written in Chinese and, by mechanically following the rules, produces answers in Chinese. To an outside observer it may appear that the person is fluent in Chinese, as they can handle every phrase without a single mistake – yet the person understands nothing. A similar dynamic applies to the artificial neural network upon which GPT-3 is based – GPT-3 uses 175 billion learned parameters to predict and produce text according to the user’s input (Floridi & Chiriatti, 2020). Based on those parameters it analyzes which words are associated with the ones used in the prompt, and gives an output based on those probabilities. But if it were based only on those probabilities, shouldn’t it produce a list of unconnected words? How is it able to produce coherent text which adheres to all the rules of English grammar? It uses a so-called transformer neural network (for a crude visualization take a look at the featured image) – an architecture which considers not only the probability of a word being connected to the prompt text, but also the probability of one word appearing after another, given all the words that came before (Bousquet et al., 2021). To give an example, after the words “Alan Turing” GPT-3 calculates the most likely word to occur next, taking into consideration all of the previous words – “In 1950, the British mathematician”.
Since it associates 1950 with the past, and the prompt asks for a description of the Turing Test, it assesses that the word with the highest probability is “proposed” (the procedure has been simplified for the sake of this article). This is why most scientists assume that machines have not reached sentience and are unlikely to do so, at least as long as neural network architectures remain the most popular way of building AI. Now, I would like to pose a question: how close are we to reaching true, human-like artificial intelligence? Is it even possible, or will machines always remain powerful calculators which are really good at imitating intelligence but will never be able to understand it?
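The next-word mechanism described above can be illustrated with a deliberately crude sketch. The snippet below builds a toy bigram model – it merely counts which word follows which in a tiny corpus and picks the most frequent successor. This is far simpler than GPT-3’s transformer, which conditions on the entire context with 175 billion parameters, but the core idea of predicting the next word from observed probabilities is the same:

```python
from collections import Counter, defaultdict

# Toy bigram language model: count which word follows which in a corpus,
# then pick the most frequent successor as the "prediction". GPT-3 performs
# the same kind of next-word prediction, but conditioned on the full context.
corpus = ("in 1950 the british mathematician alan turing proposed a test "
          "for artificial intelligence the test assesses a machine").split()

follows = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    follows[current_word][next_word] += 1

def predict_next(word):
    """Return the most probable word to appear after `word`, or None."""
    counts = follows[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("turing"))   # -> 'proposed'
```

In this toy corpus, “turing” is followed only by “proposed”, so that is the prediction – mirroring the article’s own example of how GPT-3 continues “In 1950, the British mathematician Alan Turing”.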

Code used:

import openai

# You need your own API key; for more information visit https://beta.openai.com/overview
openai.api_key = "sk-47TvSdwMJaIpFh9E22znT3BlbkFJ8lCt41LoDMRHv7V*****"

prompt = ("Write a short article about Turings principle. "
          "Describe how emergence of GPT 3 has changed article writing. "
          "Use the style of 'The Economist'")

response = openai.Completion.create(
    engine="text-davinci-002",
    prompt=prompt,
    logprobs=1,
    temperature=1,
    presence_penalty=1,
    best_of=5,
    max_tokens=3000,
)

print(response)

References:

Bousquet, O., Boucheron, S., & Lugosi, G. (2021). Introduction to Statistical Learning Theory. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2nd ed., Vol. 112). Springer Verlag. https://doi.org/10.1007/978-3-540-28650-9_8

Floridi, L., & Chiriatti, M. (2020). GPT-3: Its Nature, Scope, Limits, and Consequences. Minds and Machines, 30(4), 681–694. https://doi.org/10.1007/S11023-020-09548-1/FIGURES/5

Searle, J. (1999). The Chinese Room. https://rintintin.colorado.edu/~vancecd/phil201/Searle.pdf


Q-Day and the fall of the Internet

7 October 2022


To those who have never heard the term Q-Day, it may sound mysterious, as if it were a major event from a sci-fi novel that changed the fate of all humanity. This description is not far off the truth, as the technology hiding behind the “Q” is quantum computing, a concept which for decades was confined to such novels. So, what is “Q-Day” then? It is the day on which quantum computers become stable enough to operate for prolonged periods of time. But don’t we have operational quantum computers right now? Similarly to the physics behind the concept, the answer is not straightforward. In order to understand it, we first have to understand the difference between quantum and semiconductor-based computers (duh, physics). Regular computers operate on bits – electrical signals which can take a value of 0 or 1. They are processed by the CPU, a device consisting of billions of transistors etched onto a silicon chip – for example, the CPU in the iPhone 14 Pro has nearly 16 billion transistors (Ganti, 2022). Those transistors are organized into logic gates, which execute operations according to predefined programs (Gayde, 2019). Quantum computers operate using qubits, which can also take a value of 0 or 1. However, contrary to regular bits, they can be in a state of superposition between 0 and 1 (Nielsen & Chuang, 2010). They can be treated as being 0 and 1 at the same time (a bit of an oversimplification, but a detailed explanation is outside the scope of this article). It means that with every added qubit their power grows exponentially: describing n qubits classically requires 2^n values, so 1 qubit corresponds to 2 values, but 1000 qubits to 2^1000 – a number far beyond anything a standard computer could ever store. This is why their theoretical power vastly outperforms that of standard computers. So, you may ask, what is the problem with quantum computers and why has Q-Day not arrived yet? The main issue is maintaining the state of superposition.
It requires the qubits to be fully isolated from their surroundings – they have to be kept at a temperature close to absolute zero (Jones, 2013) and shielded from any outside interactions, since things as minuscule as cosmic radiation can break the quantum state of superposition (Vepsäläinen et al., 2020). To illustrate how big a hurdle this is: on the 30th of September 2022, researchers from the University of New South Wales announced a breakthrough – they managed to maintain the quantum state of superposition for a staggering 2 milliseconds, 100 times longer than the previous record (For the Longest Time: Quantum Computing Engineers Set New Standard in Silicon Chip Performance, 2022). Despite being operational for such fleeting periods of time, quantum computers have already shown immense power. In 2019 a team of scientists from Google and NASA achieved so-called “quantum supremacy”: the quantum computer they developed conducted a calculation which the most powerful traditional supercomputer, Summit, would have needed an estimated 3 million years to complete (Liu et al., 2021). There is no official definition of Q-Day, but try to imagine that the very same computer could operate for 2 minutes. Then surely a point of no return would be reached.
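A few lines of NumPy make the exponential growth concrete. The sketch below (a classical simulation, not a real quantum computer) represents an n-qubit register as a vector of 2**n complex amplitudes and puts every qubit into an equal superposition with Hadamard gates – already at 10 qubits the state needs 1024 amplitudes:

```python
import numpy as np

def n_qubit_zero_state(n):
    """State vector of n qubits, all initialized to |0...0>: 2**n amplitudes."""
    state = np.zeros(2 ** n, dtype=complex)
    state[0] = 1.0
    return state

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # single-qubit Hadamard gate

def equal_superposition(n):
    """Apply a Hadamard to every qubit, giving an equal superposition
    over all 2**n basis states."""
    gate = H
    for _ in range(n - 1):
        gate = np.kron(gate, H)   # tensor product builds the n-qubit gate
    return gate @ n_qubit_zero_state(n)

state = equal_superposition(10)
print(len(state))   # -> 1024, i.e. 2**10 amplitudes for just 10 qubits
```

Note that the simulation cost doubles with each added qubit – precisely why classical machines cannot keep up, and why real hardware that holds such superpositions stably is so valuable.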

But how will Q-Day contribute to the fall of the Internet? It all boils down to cryptography and how digital information is secured. Nowadays, the vast majority of online data is protected via TLS/SSL protocols, which rely on public-key algorithms such as RSA. In a nutshell, the main idea behind RSA is the multiplication of prime numbers. A 2048-bit key means that the server publishes (visibly to everyone) a 2048-bit number – roughly 617 decimal digits – which is the product of two secret primes. Anyone can use this public number to encrypt a message, but decrypting it requires knowing the two primes, which only the key’s owner does. Trying to find the two prime divisors of a 2048-bit number by brute force is virtually impossible – according to some estimates, it would take a standard computer 300 trillion years to break this encryption. In that case, how is it even possible that you can log in to your bank account without waiting for the heat death of the universe? The owner of the key already knows the private primes, and with them decryption reduces to modular arithmetic that a computer performs in milliseconds. How does this compare to quantum computers? Thanks to Shor’s algorithm, which factors large numbers efficiently, a quantum computer with an estimated 4099 stable qubits could break 2048-bit encryption in around 10 seconds. (No such machine exists yet – milestones like Quantinuum’s Quantum Volume 4096 announcement (Rolston-Duce, 2022) refer to a benchmark called quantum volume, not to a count of 4096 qubits.) It means that someone with a quantum computer able to maintain superposition for long enough could gain access to almost anything on the Internet – bank accounts or government secrets; nothing would withstand the unbelievable power of a stable quantum computer. Does it mean that the world will have to go back to the pre-digital era, since nothing could be safely encrypted ever again? Fortunately, major players in the encryption business have recognized the problem. In 2016 the US National Institute of Standards and Technology (NIST) asked scientists to submit proposals for encryption algorithms ready for a post-quantum future.
The results of the contest were announced this year, with the winner in the public-key encryption area being the CRYSTALS-Kyber method (Bos et al., 2018; NIST, 2022). Unfortunately, despite my best efforts I am unable to explain how this method works – it is built on the mathematics of lattices, and it makes sense that a complex problem requires a complex solution. Even though solutions to the problem exist today, companies are reluctant to implement them. When it comes to post-quantum (PQ) encryption they face a dynamic similar to climate change: implementation is costly and offers no immediate benefits, with the only payoff lying in the future. There is little awareness of the problem, hence companies face little pressure from consumers to improve the security of their encryption. Thus, the question remains: will the Internet as we know it succumb to the unimaginable power of future quantum computers? Or will we be able to prepare ourselves for the inevitable emergence of the quantum monster?
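The prime-number idea can be made concrete with a toy RSA example. The primes below come from the classic textbook illustration and are tiny on purpose – real keys use 2048-bit moduli – so here the “attack” of factoring the public modulus succeeds instantly, which is exactly what a large quantum computer running Shor’s algorithm would do to a real key:

```python
# Toy RSA with tiny textbook primes (real keys use ~617-digit moduli).
p, q = 61, 53              # the two secret primes
n = p * q                  # public modulus: 3233, visible to everyone
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent, coprime with phi
d = pow(e, -1, phi)        # private exponent: modular inverse of e

message = 65
cipher = pow(message, e, n)    # anyone can encrypt with the public key
plain = pow(cipher, d, n)      # only the holder of d can decrypt

# An attacker who can factor n recovers p and q, and from them d:
def factor(n):
    """Brute-force factoring - trivial here, infeasible for 2048-bit n."""
    for cand in range(2, int(n ** 0.5) + 1):
        if n % cand == 0:
            return cand, n // cand

print(plain)        # -> 65, the round trip works
print(factor(n))    # -> (53, 61), the secret primes fall out instantly
```

For a 3233 modulus the brute-force loop finishes immediately; for a genuine 2048-bit modulus it would run for the astronomical timescales quoted above – unless the attacker has a stable quantum computer.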

References:

Bos, J., Ducas, L., Kiltz, E., Lepoint, T., Lyubashevsky, V., Schanck, J. M., Schwabe, P., Seiler, G., & Stehle, D. (2018). CRYSTALS – Kyber: A CCA-Secure Module-Lattice-Based KEM. Proceedings – 3rd IEEE European Symposium on Security and Privacy, EURO S and P 2018, 353–367. https://doi.org/10.1109/EUROSP.2018.00032

For the longest time: Quantum computing engineers set new standard in silicon chip performance. (2022). https://archive.ph/HikMD

Ganti, A. (2022). Apple A16 Bionic announced for the iPhone 14 Pro and iPhone 14 Pro Max – NotebookCheck.net News. https://www.notebookcheck.net/Apple-A16-Bionic-announced-for-the-iPhone-14-Pro-and-iPhone-14-Pro-Max.647967.0.html

Gayde, W. (2019). How CPUs are Designed and Built, Part 2: CPU Design Process | TechSpot. https://www.techspot.com/article/1830-how-cpus-are-designed-and-built-part-2/

Jones, N. (2013). Computing: The quantum company. Nature, 498(7454), 286–288. https://doi.org/10.1038/498286A

Liu, Y. A., Liu, X. L., Li, F. N., Fu, H., Yang, Y., Song, J., Zhao, P., Wang, Z., Peng, D., Chen, H., Guo, C., Huang, H., Wu, W., & Chen, D. (2021). Closing the “quantum supremacy” gap: Achieving real-time simulation of a random quantum circuit using a new Sunway supercomputer. International Conference for High Performance Computing, Networking, Storage and Analysis, SC. https://doi.org/10.1145/3458817.3487399

Nielsen, M. A., & Chuang, I. L. (2010). Quantum Computation and Quantum Information. www.cambridge.org

NIST. (2022). Post-Quantum Cryptography | CSRC. https://csrc.nist.gov/Projects/post-quantum-cryptography/selected-algorithms-2022

Rolston-Duce, K. (2022). Quantinuum Announces Quantum Volume 4096 Achievement. https://www.quantinuum.com/pressrelease/quantinuum-announces-quantum-volume-4096-achievement

Vepsäläinen, A. P., Karamlou, A. H., Orrell, J. L., Dogra, A. S., Loer, B., Vasconcelos, F., Kim, D. K., Melville, A. J., Niedzielski, B. M., Yoder, J. L., Gustavsson, S., Formaggio, J. A., VanDevender, B. A., & Oliver, W. D. (2020). Impact of ionizing radiation on superconducting qubit coherence. Nature 2020 584:7822, 584(7822), 551–556. https://doi.org/10.1038/s41586-020-2619-8
