Welcome to the Future: Google DeepMind’s Artificial Intelligence Algorithm

21 September 2016


In March 2016, something historic happened, something scientists did not expect for at least another decade: an Artificial Intelligence (A.I.) algorithm named AlphaGo beat the reigning world champion, Lee Sedol, at the board game Go. To clarify why this has historic consequences, some background information is required. First, some information about the ancient Chinese board game Go: Go is widely regarded as the world's hardest board game, with a practically unlimited number of possible plays. The game is comparable to chess, only with more possible board positions than there are atoms in the observable universe.

Given the number of possible moves in Go, it was not possible for the algorithm to win by 'brute forcing'. Previously, when computer systems played other games (e.g. chess), brute forcing was used to make in-game decisions. Simply put, brute forcing means that an algorithm calculates every move it could possibly make and selects the most 'successful' one. Because brute forcing is not feasible in Go, this ancient Chinese game is seen as the Holy Grail (i.e. the ultimate test) of Artificial Intelligence. As brute forcing was not an option, Google DeepMind's AlphaGo used a different technique to win. And this is where things get spectacular.
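To make 'brute forcing' concrete, here is a minimal, runnable sketch of exhaustive game-tree search. The game used here is a made-up toy (take 1 or 2 stones from a pile; whoever takes the last stone wins), chosen only because it is small enough to search completely; the point is that this strategy of visiting every continuation is exactly what Go's branching factor rules out.

```python
# Brute force on a toy game: two players alternately take 1 or 2 stones from a
# pile, and whoever takes the last stone wins. The search enumerates every
# possible continuation before choosing a move -- the essence of brute forcing.
from functools import lru_cache

@lru_cache(maxsize=None)
def solve(stones):
    """Return (best_take, outcome) for the player to move; outcome +1 = forced win."""
    if stones == 0:
        return None, -1                      # opponent took the last stone: we lost
    best_take, best_outcome = None, -2
    for take in (1, 2):
        if take <= stones:
            _, opp_outcome = solve(stones - take)
            if -opp_outcome > best_outcome:  # our outcome is the negation of the opponent's
                best_take, best_outcome = take, -opp_outcome
    return best_take, best_outcome

print(solve(10))  # -> (1, 1): taking one stone from a pile of 10 is a forced win

# Go, by contrast, has roughly 250 legal moves per position and games run on the
# order of 150 moves, so an exhaustive search would have to visit something like
# 250**150 continuations -- hopelessly beyond any computer.
```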

For the first time in history, we are seeing a successful demonstration of general-purpose Artificial Intelligence. What does general purpose mean in this case? Because the machine learns from experience and data, it can perform well across a wide variety of tasks, not just within one specific area.
By using reinforcement learning and neural networks, AlphaGo could mimic the learning process of a human brain, which shows that machines have the potential to learn on their own.
AlphaGo's win over a human at Go shows that machines can really learn and, in a sense, think in a human way. According to the DeepMind founders, the algorithm can learn many more things besides Go, without alteration or guidance by human hand. In other words, AlphaGo can indeed be defined as general-purpose A.I.
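As a toy illustration of the 'neural network' part (explicitly not AlphaGo's real architecture; the 9x9 board and layer sizes below are made up for the example), the sketch runs a board position through one hidden layer and turns the output into a probability for each candidate move. Learning would consist of nudging the weights so that moves which led to wins become more probable.

```python
# A toy policy network (illustration only): it maps a board position to a
# probability distribution over moves. Training would adjust W1 and W2 so that
# winning moves get higher probability -- the "learning from experience" above.
import numpy as np

rng = np.random.default_rng(0)
BOARD_CELLS, HIDDEN, MOVES = 9 * 9, 64, 9 * 9    # a small 9x9 board for the toy example

W1 = rng.normal(0, 0.1, (HIDDEN, BOARD_CELLS))   # input -> hidden weights
W2 = rng.normal(0, 0.1, (MOVES, HIDDEN))         # hidden -> move scores

def move_probabilities(board):
    """board: flat array of 81 values (-1 opponent stone, 0 empty, +1 own stone)."""
    hidden = np.tanh(W1 @ board)                 # hidden-layer activations
    scores = W2 @ hidden                         # one raw score per candidate move
    exp = np.exp(scores - scores.max())          # softmax turns scores into probabilities
    return exp / exp.sum()

probs = move_probabilities(np.zeros(BOARD_CELLS))
print(probs.shape, round(probs.sum(), 3))        # (81,) 1.0
```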

Google DeepMind, the company responsible for the creation of AlphaGo, is a British Artificial Intelligence company that was founded in September 2010 as DeepMind Technologies. It was renamed Google DeepMind after it was acquired by Google for a reported $500 million. Fun fact: besides Google's interest in the company, other well-established names are connected with DeepMind. For example, Elon Musk was one of the investors behind the company.
DeepMind's stated goal is 'to solve intelligence'. The team is trying to achieve this by combining the best techniques from machine learning and systems neuroscience to build powerful general-purpose algorithms. In short, the goal of Google DeepMind is to formalize intelligence: to find out what it is and how it works. The end goal is not just to implement this in machines, but also to understand the human brain.

Back to the AlphaGo algorithm. So far, we have established that it can be defined as a general-purpose A.I. with the potential to learn on its own.
But how does the program learn? Through deep reinforcement learning, which again makes it very different from other A.I.s. Other current A.I.s were developed for a predefined purpose and only function within their scope (e.g. Siri on your new iPhone 7). DeepMind claims that its algorithm is not pre-programmed and learns from experience, using only raw pixels as data input.
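To give a feel for what 'learning from experience' means in reinforcement learning, here is a heavily simplified tabular sketch (a stand-in for the deep-network version, not DeepMind's actual method): the agent improves a table of action values using nothing but observed rewards. The 5-cell corridor environment is invented purely for illustration.

```python
# Minimal tabular Q-learning: the agent learns action values purely from
# observed rewards, with no built-in knowledge of the environment's rules.
# Toy environment: a 5-cell corridor; reaching the rightmost cell pays reward 1.
import random

N_STATES, ACTIONS = 5, (-1, +1)                      # move left or right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.1                # learning rate, discount, exploration

for episode in range(500):
    state = 0
    while state != N_STATES - 1:
        if random.random() < epsilon:
            action = random.choice(ACTIONS)                      # explore
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])   # exploit current knowledge
        next_state = min(max(state + action, 0), N_STATES - 1)
        reward = 1.0 if next_state == N_STATES - 1 else 0.0
        # Q-learning update: move the estimate toward reward + discounted future value.
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
        state = next_state

print({s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)})
# After training, the learned policy is to move right in every state: {0: 1, 1: 1, 2: 1, 3: 1}
```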
Reinforcement learning can be seen as the foundation of the program's win over the world champion. AlphaGo was first trained by showing the program over 100,000 recorded games of amateur Go players. Its initial task was to mimic these human plays. After it learned to imitate the amateurs' plays, the algorithm was allowed to play against itself, 13 million times to be exact.
Using deep reinforcement learning, the system then improves itself incrementally by increasing its win rate against older versions of itself. The reinforcement learning system DeepMind uses is model-free, meaning it does not rely on a hand-built model of the game's rules or structure; it learns directly from experience.
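The self-play idea can be sketched as a loop (again a toy illustration, not DeepMind's training code): a new version of the player is produced, pitted against a frozen older copy of itself, and kept only if it wins more often than it loses. The `play_game` and `improve` functions below are made-up stand-ins for the real game simulation and learning update.

```python
# A heavily simplified self-play loop: keep challenging the current champion
# with an improved version of itself, and promote the challenger only when it
# wins the majority of games. A "player" here is just a single strength number.
import random

def play_game(challenger, champion):
    """Toy stand-in for a full game: the stronger player wins more often."""
    return random.random() < challenger / (challenger + champion)

def improve(player):
    """Toy stand-in for a learning update driven by self-play games."""
    return player + random.uniform(0.0, 0.1)

champion = 1.0                                   # the current best version of the player
for generation in range(50):
    challenger = improve(champion)               # train a candidate new version
    wins = sum(play_game(challenger, champion) for _ in range(200))
    if wins > 100:                               # promote it only if it beats the old
        champion = challenger                    #   version in most of the 200 games
print(f"final champion strength: {champion:.2f}")
```

Repeating this promotion step many times is what "increasing its win rate against older versions of itself" amounts to in practice.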

What can DeepMind's algorithm be used for in the future? In an interview, one of the founders, Mustafa Suleyman, explains their vision of applying A.I. in healthcare and science. He mentions it will help speed up major breakthroughs by helping human experts discover patterns in data.

In conclusion, general-purpose artificial intelligence that uses reinforcement learning could lay the groundwork for unimaginable things in the near future.
