The AlphaGo artificial intelligence (AI) program has defeated the world’s best Go player, China’s 19-year-old Ke Jie. Developed by DeepMind, a subsidiary of Google parent corporation Alphabet Inc., AlphaGo previously beat the Korean grandmaster Lee Sedol in March 2016, marking the first time an AI defeated a professional 9-dan Go player without handicaps.
AlphaGo is based on an algorithm that selects moves using game knowledge acquired through machine learning. The AI acquires that knowledge using artificial neural networks, a type of machine learning framework loosely modeled on the way biological brains work.
AlphaGo’s neural network was trained using both human and computer play. Initially, the program was fed about 30 million match moves from recorded historical games of expert human players. It was then set to play against copies of itself to seek the optimal solution through a trial-and-error process called reinforcement learning.
In past cases of computer programs beating human players, such as when Deep Blue beat Garry Kasparov in 1997, a largely brute-force technique was used, in which vast numbers of possible move sequences were calculated and evaluated. AlphaGo, by contrast, uses two neural networks to narrow the field of candidate moves, a necessary approach given Go's enormous number of possible board positions (for reference, more than the number of atoms in the observable universe).
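As a rough check on that comparison, the short Python sketch below computes a simple upper bound on the number of Go board configurations. It assumes a standard 19×19 board and the commonly cited estimate of about 10^80 atoms in the observable universe; neither figure comes from this article. The bound counts every arrangement of stones, including illegal ones, so it overestimates the number of legal positions.

```python
# Rough upper bound on Go board configurations (assumes a 19x19 board).
BOARD_POINTS = 19 * 19   # 361 intersections
STATES_PER_POINT = 3     # each point is empty, black, or white

# Counts every arrangement, legal or not, so this overestimates legal positions.
upper_bound = STATES_PER_POINT ** BOARD_POINTS

# Commonly cited order-of-magnitude estimate (an assumption, not from the article).
ATOMS_IN_OBSERVABLE_UNIVERSE = 10 ** 80

digits = len(str(upper_bound))
print(f"3^361 has {digits} digits")
print("Exceeds the atom estimate:", upper_bound > ATOMS_IN_OBSERVABLE_UNIVERSE)
```

Even this crude bound dwarfs the atom estimate by many orders of magnitude, which is why enumerating every continuation is not practical for Go.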
Its “policy network” and “value network” are each composed of millions of connections loosely mimicking the neurons of a brain. The policy network restricts the search to the moves judged most likely to lead to a win. The value network, meanwhile, reduces the depth of the search tree by estimating which player is ahead in a given position, instead of playing out every possible continuation.
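The effect of the two networks on the search can be illustrated with a toy example. The Python sketch below is not AlphaGo's code: it swaps Go for a tiny stone-taking game (take one to three stones; whoever takes the last stone wins), and `toy_policy` and `toy_value` are hypothetical hand-written stand-ins for the trained networks. It only shows how ranking moves trims the breadth of a search and how a position estimate trims its depth.

```python
from typing import List, Tuple

MOVES = (1, 2, 3)  # toy game: take 1-3 stones; whoever takes the last stone wins


def toy_policy(stones: int) -> List[Tuple[int, float]]:
    """Hypothetical stand-in for a policy network: (move, prior) pairs."""
    moves = [m for m in MOVES if m <= stones]
    # Hand-written preference for moves that leave a multiple of four stones.
    weights = [3.0 if (stones - m) % 4 == 0 else 1.0 for m in moves]
    total = sum(weights)
    return [(m, w / total) for m, w in zip(moves, weights)]


def toy_value(stones: int) -> float:
    """Hypothetical stand-in for a value network: +1 if the player to move is ahead."""
    return -1.0 if stones % 4 == 0 else 1.0


def search(stones: int, depth: int, top_k: int, counter: List[int]) -> float:
    """Negamax search; returns the value for the player to move and counts nodes."""
    counter[0] += 1
    if stones == 0:
        return -1.0  # the opponent just took the last stone
    if depth == 0:
        return toy_value(stones)  # depth cut: estimate instead of searching on
    ranked = sorted(toy_policy(stones), key=lambda mp: mp[1], reverse=True)
    # Breadth cut: only the top_k moves by prior are explored.
    return max(-search(stones - m, depth - 1, top_k, counter) for m, _ in ranked[:top_k])


def compare(stones: int = 15) -> Tuple[float, int, float, int]:
    """Run an exhaustive search and a pruned search from the same position."""
    full_count, pruned_count = [0], [0]
    full_value = search(stones, depth=stones, top_k=len(MOVES), counter=full_count)
    pruned_value = search(stones, depth=2, top_k=2, counter=pruned_count)
    return full_value, full_count[0], pruned_value, pruned_count[0]


if __name__ == "__main__":
    full_value, full_nodes, pruned_value, pruned_nodes = compare(15)
    print(f"exhaustive: value={full_value}, nodes visited={full_nodes}")
    print(f"pruned:     value={pruned_value}, nodes visited={pruned_nodes}")
```

In this toy setting the pruned search visits far fewer positions than the exhaustive one. The hand-written estimates here happen to be reliable for this simple game; a real system depends on how well its trained networks judge moves and positions.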
On each turn of a game of Go, AlphaGo uses a Monte Carlo tree search algorithm to simulate, many times over, how the game might play out. Its policy network suggests moves to play, while its value network evaluates the position reached. The most successful move from these simulations is then chosen.
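A minimal sketch of this kind of search, in Python, is shown below. It is an illustration under stated assumptions rather than DeepMind's implementation: the game is a toy stone-taking game (take one to three stones; whoever takes the last stone wins), `toy_policy` and `toy_value` are hypothetical stand-ins for the two networks, and the selection rule and the `c_puct` constant are one common way of balancing a move's average result against its prior.

```python
import math
from dataclasses import dataclass, field
from typing import Dict

MOVES = (1, 2, 3)  # toy game: take 1-3 stones; whoever takes the last stone wins


def toy_policy(stones: int) -> Dict[int, float]:
    """Hypothetical policy stand-in: a uniform prior over legal moves."""
    moves = [m for m in MOVES if m <= stones]
    return {m: 1.0 / len(moves) for m in moves}


def toy_value(stones: int) -> float:
    """Hypothetical value stand-in: +1 if the player to move is ahead, else -1."""
    return -1.0 if stones % 4 == 0 else 1.0


@dataclass
class Node:
    stones: int
    prior: float
    visits: int = 0
    value_sum: float = 0.0  # from the viewpoint of the player who moved into this node
    children: Dict[int, "Node"] = field(default_factory=dict)

    def mean_value(self) -> float:
        return self.value_sum / self.visits if self.visits else 0.0


def select_child(node: Node, c_puct: float = 1.5) -> Node:
    """Pick the child with the best mix of average result and prior-driven exploration."""
    sqrt_total = math.sqrt(node.visits)

    def score(child: Node) -> float:
        return child.mean_value() + c_puct * child.prior * sqrt_total / (1 + child.visits)

    return max(node.children.values(), key=score)


def simulate(node: Node) -> float:
    """Run one simulation; return the value for the player to move at this node."""
    if node.stones == 0:
        value = -1.0  # the opponent took the last stone
    elif not node.children:
        # Leaf: the policy proposes moves, the value estimate scores the position.
        for move, prior in toy_policy(node.stones).items():
            node.children[move] = Node(node.stones - move, prior)
        value = toy_value(node.stones)
    else:
        value = -simulate(select_child(node))
    node.visits += 1
    node.value_sum += -value  # flip sign: stored for the player who moved here
    return value


def best_move(stones: int, simulations: int = 200) -> int:
    """Return the root move that received the most simulation visits."""
    root = Node(stones, prior=1.0)
    for _ in range(simulations):
        simulate(root)
    return max(root.children, key=lambda m: root.children[m].visits)


if __name__ == "__main__":
    print("Move chosen from 10 stones:", best_move(10))
```

The move is chosen by visit count rather than by raw average score, a common choice because the most-visited move is the one the simulations kept returning to.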
The latest victory for AlphaGo happened at the Future of Go Summit taking place in the town of Wuzhen in China’s Zhejiang province. The summit will continue with a second match between Ke Jie and AlphaGo at 10:30 a.m. China Standard Time on May 25 (10:30 p.m. Eastern Daylight Time, Wednesday, May 24). A final match starts at 8:30 a.m. China Standard Time on May 27 (8:30 p.m. Eastern Daylight Time, Friday, May 26). You can watch future games live or see replays at the event’s website.