Coffee Break with AI - Games and AI Part 1

by Guest Blogger | Jul 16, 2020 | Estimated reading time: 8 minutes | in Creativity & Innovation, IFS Labs | tagged AI, Artificial Intelligence, Coffee Break with AI, Deep Learning, IFS Labs, Machine Learning

Coffee Break with AI is brought to you by Elisio Quintino and Martijn Loos.

In this edition of Coffee Break with AI, we want to take you on a journey through AI and games. There have been multiple milestones where hardly anybody thought an AI model would be able to beat professionals in their respective fields, but every single time the doubters have been proven wrong.

There will be two blogs about this topic: this blog will discuss AI models that do not involve a neural network and the next blog will talk about models that do use them.

Let’s get into it, and turn the page of our history book to 1997.

Deep Blue beats Kasparov at Chess

One of the earliest events where an AI was pitted against a human was in 1997. IBM built a chess-playing machine, called Deep Blue. It beat the reigning chess world champion, Garry Kasparov, in a six-game match, with a score of 3½ to 2½ (half points are awarded when a game ends in a draw), under standard chess time constraints.

Why chess?

During Deep Blue’s development, from 1989 to 1997, chess was the generally accepted measure for the ability of machines to act intelligently. One can debate whether this is actually correct, but at that time, the idea was that if an AI could beat a human at chess, it would prove that the AI indeed had a form of human-like intelligence.

How did it work?

The AI methods that are used today weren’t practical yet in the 90s. There was not enough processing power and not enough data to train big neural networks. What IBM used instead was a brute-force approach: for each position on the board, the machine worked out which moves could follow in future turns. The final version of Deep Blue was able to evaluate 200,000,000 positions per second, score the resulting positions with an evaluation function, and decide on a move within the time limits of tournament chess. This version was able to beat Kasparov on May 11th, 1997.
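As a rough illustration of the look-ahead idea, here is a minimal Python sketch of exhaustive game-tree search on a toy game instead of chess. The game (two players alternately take 1 to 3 stones from a pile, and whoever takes the last stone wins) and the names `search` and `best_move` are our own assumptions for illustration. Deep Blue’s actual search and evaluation were far more elaborate and ran on custom hardware.

```python
MAX_TAKE = 3  # assumption: a player may take 1, 2 or 3 stones per turn


def legal_moves(stones: int) -> range:
    """All amounts the player to move is allowed to take."""
    return range(1, min(MAX_TAKE, stones) + 1)


def search(stones: int) -> int:
    """Return +1 if the player to move can force a win, otherwise -1."""
    if stones == 0:
        # The previous player took the last stone, so the player to move lost.
        return -1
    # Brute force: try every legal move and look at the resulting position
    # from the opponent's point of view. The opponent's loss is our win.
    return max(-search(stones - take) for take in legal_moves(stones))


def best_move(stones: int) -> int:
    """Pick the move that leaves the opponent in the worst position."""
    return max(legal_moves(stones), key=lambda take: -search(stones - take))


print(best_move(10))  # prints 2: leaving 8 stones is a lost position for the opponent
```

This toy game is small enough to search all the way to the end. Chess is not, which is why a chess engine has to stop at some depth and score the position it reaches instead.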

What are the benefits?

The IBM team needed as many lookups as possible per second, but the computer chips at that time didn’t allow for such intense parallel processing. So they built a chip that could do that and heavily optimized their parallel programming techniques. This kind of work helped pave the way for the parallel processing found in today’s computer chips and smart devices.

It’s now possible to have three different social media apps open on your phone while simultaneously listening to music. Parallel processing also allows big data to be used to train very complex models, like the protein folding model we have talked about in this blog. And all of this got a push because IBM needed better parallel programming to win a game of chess!

Watson wins Jeopardy

The next milestone involves IBM again. They were looking for a new PR event that would put them in the spotlight. This turned into the creation of a big machine (it literally filled a room the size of a master bedroom) named Watson to play Jeopardy. Jeopardy is a game where the quiz-master provides an answer and a contestant needs to provide the associated question. Again, success. In 2011, Watson beat the two top Jeopardy players, Brad Rutter and Ken Jennings, by a huge margin: it finished with a lead of more than 50,000 dollars over its opponents.

Why Jeopardy?

It started out as a PR stunt only, with the bonus that it would also challenge IBM’s team to tackle natural language processing. Watson needed to parse the question given by the quiz-master, understand the question, find an appropriate answer and respond, all the while having to adhere to the rules of the game like any human player would.

After a prompt was given and Watson pressed the buzzer, it had 6 seconds to provide a response. Otherwise, the response counted as wrong and the opportunity passed to one of the other contestants. A wrong answer also carried a monetary penalty. And on top of all of that, just like the human players, Watson could not access the internet during the game, so it needed to have a big knowledge base of its own.

How did it work?

The IBM team opted to put multiple smaller systems in place that worked together. They created a system called DeepQA (Deep Question Answering), built on top of a system they had already worked on previously. This system parsed the prompt of the quiz-master and worked out what possible responses were available by using parallel computation (throwback to the chess accomplishment!).

In mere seconds, hundreds of algorithms came back with possible responses, each with an evaluation score indicating how probable it was that the option was correct. These were turned into a ranked list and the top candidate was chosen as the final response.
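The ranking step can be sketched in a few lines of Python. Everything here is hypothetical: the candidate responses, the scores, the function name `rank_candidates` and the simple averaging rule are made up for illustration, and this is not how DeepQA actually combined its evidence.

```python
from statistics import mean


def rank_candidates(scores_by_candidate: dict[str, list[float]]) -> list[tuple[str, float]]:
    """Combine the scores of several algorithms per candidate and sort, best first."""
    # Assumption: a plain average stands in for the real scoring model.
    combined = {cand: mean(scores) for cand, scores in scores_by_candidate.items()}
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores from three scoring algorithms for three candidate responses.
example_scores = {
    "What is Toronto?": [0.10, 0.30, 0.20],
    "What is Chicago?": [0.90, 0.70, 0.80],
    "What is Boston?": [0.40, 0.50, 0.30],
}

ranking = rank_candidates(example_scores)
top_response, confidence = ranking[0]
print(top_response)  # prints "What is Chicago?"
```

Because a wrong answer costs money, a system like this could also decide not to buzz in at all when the confidence of its top candidate is too low.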

What are the benefits?

Watson really pushed the field of natural language processing forward. Any field that has loads of unstructured data, such as text, could use a system that is similar to Watson. You can think of healthcare, where there are lots of patient records that could be parsed to provide help with a diagnosis. Nowadays, a version of Watson is also used in customer service, for example, in chatbots.

Libratus wins at poker

It is important to know that in 2012 there was a boom in neural network usage. The science community realized there was now enough data and computing power for neural networks to outperform the state of the art. From that point on, almost all AI-related game playing models used neural networks and we will talk about those milestones in part two of this blog series.

However, in this section of this blog, we will talk about a model that didn’t use a neural network, but could still hold its own in this new era.

In 2017, Carnegie Mellon University created a machine learning model called Libratus. This model beat four top poker players in a 20-day tournament of no-limit Texas hold ’em. Libratus ended up more than 1.7 million dollars ahead. Luckily for the contestants, they weren’t playing with real money!

In 2019, Libratus was used as a basis for an even better model, called Pluribus. This model beat two of the best poker players in the world: Darren Elias, who holds the record for most World Poker Tour titles, and Chris “Jesus” Ferguson, winner of six World Series of Poker events. Where Libratus only played one-on-one, Pluribus played six-player games and also won!

Why poker?

Poker is an imperfect information game, which means that not all the information is known at all times. In chess, both players know the configuration of the board at all times. However, in poker, you only know your own cards and the cards that are on the table, but not which cards will be drawn and you also don’t know the cards of your opponents. This is an imperfect information situation. On top of that, your opponents will also try to throw you off by bluffing.
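To get a feel for how much is hidden, the short Python sketch below counts the two-card hands a single opponent could be holding in Texas hold ’em, given the number of cards one player can see. It assumes a standard 52-card deck, and the function name `possible_opponent_hands` is ours.

```python
from math import comb

DECK_SIZE = 52  # assumption: a standard deck without jokers


def possible_opponent_hands(visible_cards: int) -> int:
    """Count the two-card hands an opponent could hold among the unseen cards."""
    unseen = DECK_SIZE - visible_cards
    return comb(unseen, 2)


# Before any table cards are dealt, a player sees only their own two cards.
print(possible_opponent_hands(2))  # prints 1225
# With all five table cards dealt, a player sees seven cards in total.
print(possible_opponent_hands(7))  # prints 990
```

Even at the very end of a hand, with every table card showing, there are still 990 hands the opponent might have. A chess player never faces this kind of uncertainty about the current position.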

How did it work?

Libratus has three different systems working together. The first system is a model that trained on the game of poker for millions of hours of computing time, learning from scratch how poker works and getting better after each iteration through reinforcement learning.

Think of reinforcement learning as a computer model trying out different things, getting a reward when it does well, getting a penalty when it doesn’t go well, and learning from this interaction. This first system gives the model a basic understanding of the game and a global strategy.
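The trial-and-error loop described above can be sketched in Python with a deliberately tiny example. This is not how Libratus was trained: the three abstract actions, their win probabilities, the function name `train` and the explore-or-exploit rule (often called epsilon-greedy) are all our assumptions for illustration.

```python
import random


def train(win_probability: list[float], steps: int = 5000,
          epsilon: float = 0.1, seed: int = 0) -> list[float]:
    """Learn an estimated value for each action by trial and error."""
    rng = random.Random(seed)
    values = [0.0] * len(win_probability)  # current estimate per action
    counts = [0] * len(win_probability)    # how often each action was tried

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(len(values))
        else:
            # Exploit: otherwise pick the action that looks best so far.
            action = max(range(len(values)), key=values.__getitem__)
        # Reward of +1 when things go well, penalty of -1 when they don't.
        reward = 1.0 if rng.random() < win_probability[action] else -1.0
        counts[action] += 1
        # Running average: nudge the estimate toward the observed reward.
        values[action] += (reward - values[action]) / counts[action]
    return values


# Hypothetical environment: the third action pays off most often.
learned = train([0.2, 0.5, 0.8])
best_action = max(range(len(learned)), key=learned.__getitem__)
print(best_action)
```

The model is never told which action is best. It finds out from the rewards and penalties alone, which is the core idea behind reinforcement learning.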

The second system uses the general information of the first system to plan a detailed strategy for the game that is currently being played. That strategy will differ per game, as each plays out differently. Since poker has so many different possibilities (around 10^160), there are always game situations that the model hasn’t encountered yet. This is where the third system comes in: it regularly retrains the model, using the data of the games that were played most recently.

What are the benefits?

The idea of teaching poker to an AI is that, if a model can make strategic decisions in a situation with imperfect information and can also bluff correctly, this could be extrapolated and applied in other fields as well. These include automated negotiations, better fraud detection, self-driving cars, setting military or cybersecurity strategies, or planning medical treatment.

Conclusion

We have discussed three big milestones in the world of AI and games. Hopefully, we have started to give insight into why putting research into AI and games can be very beneficial for other areas. Of course, we are not finished yet. Our next blog will talk about three milestones where AI models used neural networks to beat games, including video games!
