AI is already winning in battle games. What if it were a real battle?
In popular perception, skill in cerebral board games like chess is assumed to be a proxy for, or measure of, intelligence. This perception has motivated people to build computer programmes that can play these games, in an effort to make programmes look intelligent or behave intelligently. But the reality is different. The complexity of decision making that a child, or even a dog, demonstrates while crossing a busy road is several orders of magnitude higher than what a grandmaster uses to win a game of chess. But because crossing a road is something we do every day, we feel that it is somewhat trivial compared to playing chess. Nevertheless, Artificial Intelligence (AI) researchers have enthusiastically used games as a testbed for more and more complex tasks that they would like computers to perform.
Programming a computer to play chess has been an obsession for many of the key personalities of theoretical computer science.
Norbert Wiener, who coined the word cybernetics, the science of information and control in both humans and machines, was among the first to describe how a machine could be made to play chess, way back in 1948.
Claude Shannon and Alan Turing, the so-called ‘fathers’ of information theory and computer science respectively, were both actively involved in designing and building chess programmes, though their initial efforts were not very successful. With the passage of time, these programmes became more sophisticated until, in 1997, the Deep Blue programme from IBM managed to beat Garry Kasparov, the World (Human) Champion.
Since then, these programmes have improved constantly, and the current World (Computer) Champion is a programme called Komodo, with an Elo rating of about 3300, more than 450 points higher than that of the current World (Human) Champion, Magnus Carlsen.
Bridge, on the other hand, is a game where computer programmes have had rather limited success. While bridge playing programmes do exist and can indeed defeat most amateur players, they are yet to win against the world champions. This lack of success vis-à-vis chess can be explained in two ways.
First, unlike chess, where the entire board is visible, bridge is played with incomplete information, because any player can see the cards in only two hands, his own and that of the dummy. But the second and perhaps more important reason is that, unlike the rigorous and near brute force approach of “looking ahead” into many possible moves, bridge involves a lot of intangible psychology -- of feinting, of deception, and of the need to anticipate the same in an intuitive manner. To put it bluntly, it is possible to enumerate nearly all possible moves and countermoves in chess and choose the one that works best, but there is no easy way to enumerate, and hence evaluate, all possible future moves in bridge.
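To make the idea of “looking ahead” concrete, here is a minimal sketch of minimax search, the principle behind classical chess engines, applied to noughts and crosses so that the entire game tree fits in a few lines of Python. The board representation and scoring here are illustrative assumptions, not the code of any actual engine.

```python
# Minimal minimax search, illustrated on noughts and crosses (tic-tac-toe).
# The board is a list of 9 cells, each 'X', 'O' or None.

def winner(board):
    lines = [(0,1,2), (3,4,5), (6,7,8), (0,3,6), (1,4,7), (2,5,8), (0,4,8), (2,4,6)]
    for a, b, c in lines:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, player):
    """Return (score, best_move) for `player`: +1 if X wins, -1 if O wins, 0 for a draw."""
    w = winner(board)
    if w == 'X':
        return 1, None
    if w == 'O':
        return -1, None
    moves = [i for i, cell in enumerate(board) if cell is None]
    if not moves:
        return 0, None  # board full: draw
    best_move = None
    best_score = -2 if player == 'X' else 2   # X maximises, O minimises
    for m in moves:
        board[m] = player                     # try the move...
        score, _ = minimax(board, 'O' if player == 'X' else 'X')
        board[m] = None                       # ...then undo it
        if (player == 'X' and score > best_score) or (player == 'O' and score < best_score):
            best_score, best_move = score, m
    return best_score, best_move

# From an empty board, perfect play by both sides always ends in a draw.
print(minimax([None] * 9, 'X'))   # (0, 0)
```

Chess engines add pruning and handcrafted evaluation functions on top of this skeleton, because the full chess tree is far too large to search exhaustively -- and it is precisely this kind of enumeration that bridge's hidden hands make impossible.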
Game playing programmes were stuck in this rut for a very long time until the arrival of artificial neural network (ANN) based programmes, such as those built by Google DeepMind. ANNs are not really programmed in the classical sense of computer programming, where a human programmer specifies exactly what needs to be done and when. Instead, ANNs are designed to “learn” on their own by taking certain actions and then detecting whether each action leads to success or failure of the overall goal.
The “learning” process involves changing -- either increasing or decreasing -- the numerical values of certain parameters, called weights and biases, over successive iterations in a way that gradually increases the possibility of success. What is interesting in all such cases is that there is no clear explanation of why a particular value of a specific parameter results in success. It is just that a certain set of values leads to success while any deviation leads to failure. This is very similar to the intuitive approach that many humans take while making decisions that they otherwise cannot explain.
Success, of course, can be defined in many ways -- from recognising an image of a person to navigating a car around an obstacle or, in our case, winning a game of chess. This technology -- based on the backpropagation algorithm and the gradient descent method -- was an astonishing force multiplier that propelled computer programmes to new heights.
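As a flavour of what “learning the weights” means in practice, here is a minimal sketch of gradient descent, assuming nothing beyond the numpy library: a single weight and a single bias are nudged, step by step, until they fit some noisy data. The learning rate and the data are arbitrary illustrative choices; backpropagation repeats the same kind of update across millions of parameters in a real network.

```python
import numpy as np

# Toy data: y = 3x + 2 plus noise. The "network" is just one weight w and one bias b.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 200)
y = 3 * x + 2 + rng.normal(0, 0.1, 200)

w, b = 0.0, 0.0   # start from complete ignorance
lr = 0.1          # learning rate: how big each correction step is

for step in range(500):
    y_hat = w * x + b                   # forward pass: current prediction
    error = y_hat - y
    grad_w = 2 * np.mean(error * x)     # gradient of mean squared error w.r.t. w
    grad_b = 2 * np.mean(error)         # ...and w.r.t. b
    w -= lr * grad_w                    # nudge each parameter "downhill"
    b -= lr * grad_b

print(round(w, 2), round(b, 2))         # approaches 3.0 and 2.0
```

No one tells the programme that 3 and 2 are the right answers; the values emerge simply because every other setting produces a larger error.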
While bridge was somehow never quite on the radar of the ANN enthusiasts, games like Go, which are orders of magnitude more complex than chess, have been cracked. Google DeepMind’s AlphaGo programme beats the best human challengers on a consistent basis. But what is really startling is that the best players have admitted that AlphaGo’s style of play is so unusual that it seems to have developed strategies that no human has ever used or even thought of before. So it is not that it has been taught by, or has learnt from, any human being. It is in fact creating new knowledge, or strategies, on its own without human assistance.
Traditional ANN technology is based on feeding in past data and expecting the computer to learn from successes and failures. This is possible when you have lots of past data and the computing power to crunch through it. A variation on this strategy is to make the machine observe actual human behaviour -- as in driving a car or playing a game -- and from there ‘magically’ learn how to behave in a manner that leads to success. But AlphaGo and similar programmes go one step further with self-play, where the programme fights, or competes, with its own “brother”, a clone or copy of itself, and by recording each success or failure, learns by itself when it has succeeded or failed. The same adversarial idea underlies a variation of the standard ANN called the Generative Adversarial Network (GAN), in which two networks compete: one generates candidates and the other tries to tell them apart from the real thing.
By challenging its own copy, the programme builds up its own training data and learns from it at a speed that is humanly impossible. This allows it to venture into more and more interesting tasks. Just as in the case of new styles of playing Go, GAN based machines have trained themselves to create artificial images of human faces that are indistinguishable from actual photographs of real people.
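For the technically curious, here is a minimal sketch of the adversarial idea, assuming the PyTorch library: a tiny generator learns to mimic a one-dimensional bell curve only because a discriminator keeps catching its fakes. The network sizes and the target distribution are invented for illustration; face-generating GANs run essentially the same loop at vastly greater scale.

```python
import torch
import torch.nn as nn

# "Real data": samples from a Gaussian with mean 4 and std 1.25.
def real_batch(n):
    return torch.normal(4.0, 1.25, size=(n, 1))

# Generator: turns random noise into candidate samples.
G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
# Discriminator: outputs the probability that a sample is real.
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

loss_fn = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)

for step in range(3000):
    # 1. Train the discriminator: label real samples 1, generated fakes 0.
    real = real_batch(64)
    fake = G(torch.randn(64, 8)).detach()
    d_loss = loss_fn(D(real), torch.ones(64, 1)) + loss_fn(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # 2. Train the generator: try to make the discriminator call its fakes real.
    fake = G(torch.randn(64, 8))
    g_loss = loss_fn(D(fake), torch.ones(64, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

# The generator's output should now drift towards mean ~4 and std ~1.25.
samples = G(torch.randn(1000, 8))
print(samples.mean().item(), samples.std().item())
```

Neither network is ever shown a rule for what "real" looks like; each improves only by trying to outwit the other, which is why the two-copies-fighting metaphor is apt.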
As we have noted earlier, the ability to play a board game, however complex, may be viewed as a proxy for human intelligence, but in reality, true intelligence lies in doing apparently simple daily tasks like crossing a busy street or haggling with a roadside hawker. Here success lies in reaching the other side of the road within a reasonable time without being hit by a car, or in convincing another person to part with a product at a lower price. Here again, we could have ANNs or GANs trained for success, but the difficulty lies in creating a learning environment where the system can train itself. Unlike board games, we cannot have half trained machines running amok on real roads or bargaining with real shopkeepers. But if we cannot have real environments, can we look for virtual ones?
Enter strategy games! Arcade style video games have been around since the 1970s, when the first such game, Computer Space, was launched in 1971. This was followed by the commercial release of Atari’s Pong. From these humble beginnings, the computer gaming industry has grown into a US$ 140 billion business that is far larger than the global movie business. In fact, the top selling game Call of Duty: Black Ops collected an incredible US$ 650 million in the five days after its launch, a record for any book, movie or game.
In all these games, the human player has a quest, or a challenge that increases in complexity, like acquiring resources -- firewood, gold, weapons, land, territory, soldiers -- and this quest is obstructed, or sought to be foiled, by other human players or by specifically designed computer programmes that put hurdles in the way. Since these games are generally played online, hundreds of players log in and participate through avatars -- realistic or fantasy representations of themselves -- that they control. These Massively Multi-user Online Role Playing Games (MMORPG) can be very realistic representations of actual reality, as in World War II battlefields or well known urban locations, or can be played out in fantasy sci-fi locations in deep space or in ancient Jurassic parks. But irrespective of where these games are set, what really matters is that the players, or combatants, are humans who use every possible trick, guile and deception to beat other players and achieve their respective goals.
In fact, most games allow players to create teams, build coalitions and join guilds to improve their chances of success. Playing these games brings one quite close to operating within a community of intelligent individuals because, as James Vincent explains in The Verge, “Although video games lack the intellectual reputation of Go and chess, they’re actually much harder for computers to play. They withhold information from players; take place in complex, ever-changing environments; and require the sort of strategic thinking that can’t be easily simulated. In other words, they’re closer to the sorts of problems we want AI to tackle in real life.”
After chewing through human champions in chess and Go, and ignoring bridge for the time being, computer AI has now set its sights on these online games, and the results are just as startling. Dota 2 is an online multiplayer battle arena, a derivative of the wildly successful Warcraft, where players form five member teams to occupy and defend territory on a map. Elon Musk’s OpenAI team has been building AI systems that can play against the usual human teams -- or rather simulate a human player -- and on more than one occasion these systems have beaten the humans.
In January 2019, AI crossed another milestone when Google’s DeepMind learnt how to play Starcraft II and beat two very highly ranked professional human gamers. Starcraft II is a science fiction, real-time strategy game where three species -- humans and two others with many kinds of natural and supernatural powers -- battle it out to create bases, build teams, invade each other’s territory and eventually establish their dominion across the galaxy. AlphaStar, the AI programme that ‘plays’ Starcraft, is based on the principles of Reinforcement Learning: it begins as a novice playing randomly and then builds up expertise by observing the outcomes of its actions. Since no human would ever have the physical stamina to keep playing against an AI, AlphaStar generally plays against multiple copies of itself and hence can learn and build expertise at a speed that is impossible for humans.
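Here is a minimal sketch of this start-as-a-novice principle, using one classic reinforcement learning method, Q-learning, in plain Python with numpy: the agent below begins by wandering randomly in a tiny corridor world and, purely from the rewards it stumbles upon, learns which way to walk. The world, rewards and learning constants are all invented for illustration; AlphaStar applies the same principle, via far more elaborate networks, to a vastly harder game.

```python
import numpy as np

# A corridor of 6 cells; the agent starts at cell 0 and earns a reward
# of +1 only on reaching cell 5. Actions: 0 = step left, 1 = step right.
N_STATES, GOAL = 6, 5
Q = np.zeros((N_STATES, 2))              # estimated value of each action in each state
rng = np.random.default_rng(0)
alpha, gamma, epsilon = 0.5, 0.9, 0.2    # learning rate, discount, exploration rate

for episode in range(200):
    s = 0
    while s != GOAL:
        # Novice behaviour: sometimes act at random; otherwise exploit what was learnt.
        a = rng.integers(2) if rng.random() < epsilon else int(np.argmax(Q[s]))
        s_next = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
        r = 1.0 if s_next == GOAL else 0.0
        # Q-learning update: nudge the estimate towards reward plus best future value.
        Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
        s = s_next

print(np.argmax(Q[:GOAL], axis=1))   # learned policy for cells 0..4: all 1s, "go right"
```

No rulebook is supplied; the "always go right" strategy crystallises out of nothing but trial, error and reward, which is exactly why such systems can be left to train against copies of themselves around the clock.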
Games like Dota 2, Starcraft II, Call of Duty and the current favourite shooter game PUBG are, at the end of the day, harmless. You can shoot, maim and kill your enemy as much as you like in the game, but when you shut down the game, you are back in the mundane world of home and work that was temporarily suspended while you battled hostile adversaries. There is no physical touch point between your actions on the keyboard and the world of brick and mortar.
But what if there was?
What if you were a drone operator sitting in front of a computer screen and controlling, not just a fantastical beast in a fantasy terrain, but an armed drone flying across Afghanistan or even along the LoC in Kashmir?
Or what if you were controlling a team of armed semi-autonomous vehicles that are tasked to cross the LoC and perform a surgical strike on terrorist camps in PoK?
Operating an armed drone or an autonomous vehicle on the ground is not really different, conceptually and qualitatively, from what human players do, both tactically and strategically, in combat games like Dota and StarCraft. So after AlphaGo and AlphaStar -- the AI programmes that beat humans in Go and StarCraft -- will there be an AlphaWar that can beat any human operator in a real war that spills real blood and blows up real buildings? Why not? We could have a fleet of AI controlled armed drones relentlessly flying over the LoC in Kashmir, observing the terrain and remorselessly killing anyone who looks or behaves like a terrorist while ignoring civilians -- just as is done in so many popular first person shooter games like Doom or PUBG. And of course we could live with some collateral damage.
Much as society might wish otherwise, science and technology never wait for ethics to catch up with their progress -- as in the case of He Jiankui, the Chinese scientist who broke state laws to create genetically edited human babies with CRISPR. Asimov’s First Law of Robotics, which mandates that robots will not harm humans, is good science fiction but will not stand the test of ground reality. Given the proliferation of war and conflict themed games and the ability of AI systems to learn from them, it is a matter of time before they acquire the capability to wage real war.
We may argue that the actual instruments of war, for example armed drones, are not connected to the AI systems, but that is obviously a transient problem. Access to and connectivity across multiple computers -- from gaming consoles to armed drones -- is a matter of policy and behaviour, not just technology. Given time, patience and computing power, any computer can be hacked. In principle, one could hack into the military computers that control the war machines, and since hacking is essentially a matter of strategy and guile, there is no reason why an AI system cannot be taught to do so or, as is more likely, learn to do so on its own -- if that is a goal that “it” is directed to achieve.
Once AI systems have the ability to wage war, who will direct them to actually do so? And against whom? Or, and this could be ominous, will “they” decide for themselves whether to do so or not? That is where the programmer stops and the psychologist, the philosopher and the politician must take over -- first to understand this technology and then to divine, or to define, the trajectory that it will follow in the near future.