According to a paper published December 5 by Google DeepMind researchers, their AlphaZero algorithm achieved a superhuman level of play in chess, shogi (Japanese chess) and Go, all within 24 hours.
AlphaZero was given no domain knowledge except the rules of each game, and its training started from random play. Within 24 hours, the algorithm had mastered all three games and “convincingly defeated a world-champion program” in each domain. This self-taught approach is fundamentally different from that of IBM’s Deep Blue supercomputer, which defeated chess grandmaster Garry Kasparov in 1997. Deep Blue relied on brute-crafted search over possible moves and evaluation rules written by human experts, whereas AlphaZero taught itself strategy in a remarkably short time.
AlphaZero is the generalized algorithm behind AlphaGo Zero, an improved derivative of DeepMind’s AlphaGo program that has repeatedly beaten master Go players. Earlier this year, AlphaGo won a three-game match against Ke Jie, who had continuously held the No. 1 world ranking for two years. In October of this year, AlphaGo Zero defeated the original AlphaGo 100 games to zero after training itself on the game’s strategy for only 72 hours.
AlphaZero trains by playing millions of games against itself and adjusting its neural network to increase its win rate. While the original AlphaGo used two separate networks, one to choose moves and one to evaluate positions, AlphaZero combines both roles in a single neural network and needs far less computing power.
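For readers curious about the mechanics, below is a minimal, illustrative sketch in PyTorch of that idea: one network with a policy head (which move to play) and a value head (who is likely to win), trained on the outcomes of its own games. This is not DeepMind’s code; the names are invented, the game is tic-tac-toe rather than chess or Go, and the real system guides each move with a Monte Carlo tree search rather than sampling directly from the policy.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

BOARD_SIZE = 9  # a 3x3 toy board; AlphaZero itself used game-specific input planes


class PolicyValueNet(nn.Module):
    """A single network with two heads: move probabilities and a win/loss estimate."""

    def __init__(self, board_size=BOARD_SIZE, hidden=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(board_size, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.policy_head = nn.Linear(hidden, board_size)  # logits over moves
        self.value_head = nn.Linear(hidden, 1)            # expected outcome in [-1, 1]

    def forward(self, board):
        h = self.body(board)
        return self.policy_head(h), torch.tanh(self.value_head(h))


def winner(board):
    """Return +1 or -1 if that player has three in a row, else 0."""
    b = board.view(3, 3)
    lines = list(b) + list(b.t()) + [torch.diagonal(b), torch.diagonal(torch.fliplr(b))]
    for line in lines:
        if abs(line.sum().item()) == 3:
            return line[0].item()
    return 0.0


def self_play_game(net):
    """Play one game by sampling moves from the policy head; record every position."""
    board = torch.zeros(BOARD_SIZE)
    history, player = [], 1.0
    for _ in range(BOARD_SIZE):
        logits, _ = net(board.unsqueeze(0))
        legal = (board == 0).float()
        probs = torch.softmax(logits.squeeze(0), dim=0) * legal
        probs = probs / probs.sum()
        move = torch.multinomial(probs, 1).item()
        history.append((board.clone(), move, player))
        board[move] = player
        if winner(board) != 0.0:
            break
        player = -player
    z = winner(board)  # +1, -1, or 0, from the first player's point of view
    # Store each position's outcome from the perspective of the player to move.
    return [(s, m, z * p) for s, m, p in history]


net = PolicyValueNet()
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(100):  # AlphaZero ran millions of such self-play games
    batch = self_play_game(net)
    states = torch.stack([s for s, _, _ in batch])
    moves = torch.tensor([m for _, m, _ in batch])
    outcomes = torch.tensor([z for _, _, z in batch]).unsqueeze(1)

    logits, values = net(states)
    value_loss = F.mse_loss(values, outcomes)  # learn to predict who wins
    # Crude REINFORCE-style stand-in: reinforce moves that led to a win, discourage
    # the rest. AlphaZero's actual policy target is the visit distribution of its
    # Monte Carlo tree search, which is omitted here for brevity.
    per_move = F.cross_entropy(logits, moves, reduction="none")
    policy_loss = (per_move * outcomes.squeeze(1)).mean()
    loss = value_loss + policy_loss

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The key design point the sketch tries to convey is the shared body feeding two heads: the same learned representation both picks moves and judges positions, which is part of why the single-network approach is cheaper to run than its two-network predecessor.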
DeepMind was founded in London in 2010 and acquired by Google in 2014 for a reported $500 million.