In the 1990s, IBM’s Deep Blue AI beat the reigning world champion in a game of chess. It was an astounding achievement, but scientists quickly set their sights on the ancient Chinese game of Go, a game with far more possible outcomes. Go is estimated to have around 10^170 possible board configurations, far more than the roughly 10^80 atoms that make up the known universe. AI had begun playing amateurs in Go in recent years, but it was predicted in 2014 that it would be at least a decade before an AI would rank among the elite Go players in the world. Three years later, Google’s DeepMind AlphaGo is the world’s best Go player.
While AI has bested human players in games in the past, AlphaGo is an entirely different beast. Past AIs leveraged algorithms that would exhaustively search every possible move that could be made and select the most advantageous one for the current layout. AlphaGo does something entirely different: it uses a technique known as Monte Carlo tree search, a search algorithm in which it plays out the remainder of the game thousands of times and picks the best move based on those simulations. AlphaGo combines the tree search algorithm with two deep neural networks, the policy network and the value network, both of which are made up of millions of neuron-mimicking connections. The policy network predicts the next move, narrowing the search to the moves most likely to lead to a win. The value network reduces the depth of the search by estimating the winner from each position instead of searching all the way to the end of the game.
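As a rough illustration of the playout idea described above, the sketch below picks a move by simulating many random games and keeping the move with the best win rate. It is a simplified, "flat" Monte Carlo search, not a full tree search and not AlphaGo's implementation: there are no neural networks, and the game is a toy version of Nim (take 1 to 3 stones, whoever takes the last stone wins) chosen only because it fits in a few lines. The names `legal_moves`, `random_playout`, and `best_move` are hypothetical.

```python
import random


def legal_moves(pile):
    """Toy Nim rules (an assumption for this sketch): take 1, 2, or 3 stones."""
    return [n for n in (1, 2, 3) if n <= pile]


def random_playout(pile, my_turn, rng):
    """Finish the game with random moves; return True if we took the last stone."""
    while pile > 0:
        pile -= rng.choice(legal_moves(pile))
        if pile == 0:
            return my_turn  # whoever just moved took the last stone
        my_turn = not my_turn
    # The pile was already empty on entry, so the previous mover won.
    return not my_turn


def best_move(pile, playouts=2000, seed=0):
    """Estimate each move's win rate from random playouts and pick the best."""
    rng = random.Random(seed)
    win_rates = {}
    for move in legal_moves(pile):
        # After our move it is the opponent's turn, hence my_turn=False.
        wins = sum(random_playout(pile - move, False, rng) for _ in range(playouts))
        win_rates[move] = wins / playouts
    return max(win_rates, key=win_rates.get), win_rates


if __name__ == "__main__":
    move, rates = best_move(5)
    print("chosen move:", move)
    print("estimated win rates:", rates)
```

With five stones on the table, taking one stone should come out ahead, since it leaves the opponent in a position from which random play loses more often. A tree search would go further by reusing statistics across simulations and, in AlphaGo's case, by letting the networks guide which moves to simulate.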
The original AlphaGo was trained on 30 million moves from games played by human players, until it could predict a human’s move 57% of the time. To beat human players rather than merely imitate them, AlphaGo had to develop new techniques of its own. By playing thousands of games against itself, AlphaGo showed tremendous improvement through reinforcement learning, a trial-and-error process.
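To make the trial-and-error idea concrete, here is a minimal sketch of learning through self-play. It is not how AlphaGo is built: instead of neural networks it uses a small lookup table, and instead of Go it uses a toy version of Nim (take 1 to 3 stones, whoever takes the last stone wins). The function `train_by_self_play` and all of its parameters are assumptions made for this illustration.

```python
import random
from collections import defaultdict


def legal_moves(pile):
    """Toy Nim rules (an assumption for this sketch): take 1, 2, or 3 stones."""
    return [n for n in (1, 2, 3) if n <= pile]


def train_by_self_play(games=20000, max_pile=10, epsilon=0.2, step=0.1, seed=0):
    """Learn move values by playing both sides and rewarding the winner's moves."""
    rng = random.Random(seed)
    # value[(pile, move)] estimates the outcome for the player making that move:
    # close to +1 means the move tends to win, close to -1 means it tends to lose.
    value = defaultdict(float)

    def greedy(pile):
        return max(legal_moves(pile), key=lambda m: value[(pile, m)])

    for _ in range(games):
        pile = rng.randint(1, max_pile)
        history = []  # (pile, move) pairs in play order; the two sides alternate
        while pile > 0:
            if rng.random() < epsilon:
                move = rng.choice(legal_moves(pile))  # trial: explore a random move
            else:
                move = greedy(pile)  # otherwise play the best move found so far
            history.append((pile, move))
            pile -= move
        # The last mover took the final stone and won. Walk backwards through the
        # game, flipping the sign each move because the players alternate.
        outcome = 1.0
        for state_action in reversed(history):
            value[state_action] += step * (outcome - value[state_action])
            outcome = -outcome
    return greedy


if __name__ == "__main__":
    policy = train_by_self_play()
    for pile in (5, 6, 7):
        print("pile", pile, "-> take", policy(pile))
```

The table starts with no knowledge beyond the rules, and the learned policy ends up leaving the opponent a multiple of four stones whenever it can, which is the known winning strategy for this toy game. The same loop of play, score, and adjust is what the paragraph above describes, only at a vastly smaller scale.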
Jumping ahead to late 2015, Google’s DeepMind invited three-time European champion Fan Hui to play against the AI. AlphaGo defeated Hui 5 games to 0, the first time a computer program had ever beaten a professional Go player, shattering predictions that it would be at least a decade before this could happen. In 2016, DeepMind invited Go legend and world champion Lee Sedol to take on AlphaGo, and again Google’s computer Go system surpassed predictions, beating the world champion in 4 out of 5 games.
Now, in late 2017, DeepMind has just introduced AlphaGo Zero, the latest version of the AI. Recently, this new iteration of AlphaGo played against the older version that had beaten world champion Lee Sedol. AlphaGo Zero beat the older version of itself 100 games to 0. Just as AlphaGo took a revolutionary approach to AI game computing, AlphaGo Zero leverages an entirely new approach of its own. It learns completely from scratch, using only one neural network. The system competes against itself to improve, and has no prior experience playing the game other than the basic rules. In just 40 days of learning, AlphaGo Zero mastered a 2,500-year-old game and surpassed all previous iterations of itself to become the greatest Go player in history.