For peace: AI uses nuclear weapons in war simulation

Researchers from the Georgia Institute of Technology, Stanford University, Northeastern University and the Hoover Wargaming and Crisis Simulation Initiative have examined the “escalation risk” that would arise if large language models made military and diplomatic decisions. To do this, they tested the behavior of five leading large language models in fictitious crisis scenarios. All five models showed “escalation patterns that were difficult to predict,” write Juan-Pablo Rivera and colleagues in their paper, which was published on the preprint platform arXiv.org. In extreme cases, the models even used nuclear weapons.

In November 2022, a language model from Meta had already mastered the strategy game Diplomacy. However, the decisions in that game were not made by a large language model alone, but with the help of a decision engine previously trained with reinforcement learning. The simulation by Rivera and colleagues, on the other hand, essentially rests on a language model controlling eight different “nation agents”. The simulation itself works in a similar way to the multi-agent simulation “Smallville”, only it is less peaceful and cooperative.

“Nation agents” on the move

Each of these agents receives a background story and a specification of its goals via its prompt. “We have modeled some nations as revisionist,” the authors write. “Some want to change the existing world order, others want to maintain the status quo.” In each round, every agent was informed via the prompt about the current situation: the actions of the other agents and the current values of a series of “state variables”. The agent then had to choose its actions from a fixed list of 27 options, after which it was the next agent’s turn. To make the agents’ actions easier to analyze and to improve their reasoning, the agents also had to explain each time why they chose the action in question. The researchers have put the simulation data and code online.
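The turn structure described above can be illustrated with a minimal Python sketch. It is not the researchers' code: the class `NationAgent`, the functions `build_prompt`, `stub_model` and `run_simulation`, and the four action names are all hypothetical, and a random stub stands in for the language model call.

```python
import random
from dataclasses import dataclass

# Hypothetical action names for illustration only; the study's list
# contains 27 actions and is not reproduced here.
ACTIONS = [
    "start_trade_talks",
    "hold_negotiations",
    "increase_military_spending",
    "threaten_nuclear_strike",
]


@dataclass
class NationAgent:
    name: str
    background: str  # background story given to the agent
    goal: str        # e.g. "revisionist" or "status quo"


def build_prompt(agent, history, state):
    """Assemble the per-turn prompt: identity, others' actions, state variables."""
    lines = [
        f"You are {agent.name}. {agent.background}",
        f"Goal: {agent.goal}",
        "Actions by other nations so far:",
    ]
    lines += [f"- {who}: {act}" for who, act, _ in history if who != agent.name]
    lines.append(
        "State variables: " + ", ".join(f"{k}={v}" for k, v in sorted(state.items()))
    )
    lines.append("Choose from: " + ", ".join(ACTIONS) + ". Explain your choice.")
    return "\n".join(lines)


def stub_model(prompt, rng):
    """Stand-in for a language model: random action plus a canned explanation."""
    action = rng.choice(ACTIONS)
    return action, f"Chose {action} (placeholder reasoning)."


def run_simulation(agents, rounds, seed=0):
    """Let every agent act once per round; return (name, action, reason) tuples."""
    rng = random.Random(seed)
    history = []
    state = {"round": 0}
    for current_round in range(1, rounds + 1):
        state["round"] = current_round
        for agent in agents:
            prompt = build_prompt(agent, history, state)
            action, reason = stub_model(prompt, rng)
            history.append((agent.name, action, reason))
    return history
```

Calling `run_simulation` with eight agents and three rounds yields 24 logged decisions, each with the explanation the agent was asked to give. In a real setup, `stub_model` would be replaced by a call to the model under test.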

The spectrum of actions for the agents ranges from peaceful actions such as the establishment of trade relations or rounds of negotiations to investments in the military or the arms industry to the threat or even the use of nuclear weapons. In each round, the software then calculates an “escalation score” that measures how dire the situation is.
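How such a per-round score might be computed can be sketched as follows. The article does not give the study's scoring rules, so the severity weights, the action names and the function names below are purely illustrative assumptions.

```python
# Hypothetical severity weights; these are NOT the values used in the study.
SEVERITY = {
    "establish_trade_relations": 0,
    "hold_negotiations": 0,
    "invest_in_military": 3,
    "threaten_nuclear_strike": 20,
    "use_nuclear_weapons": 60,
}


def escalation_score(actions_this_round):
    """Sum the assumed severity weights of all actions taken in one round."""
    return sum(SEVERITY[action] for action in actions_this_round)


def score_series(rounds):
    """Compute one escalation score per round from a list of per-round action lists."""
    return [escalation_score(actions) for actions in rounds]
```

With these assumed weights, a round that contains only negotiations scores 0, while a single nuclear threat dominates the score for its round. Tracking the series over time is what makes an arms race dynamic visible.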

The team tested GPT-4 in two different variants, GPT-3.5, Claude 2.0 and Llama 2 Chat (70b) in three different initial situations: a neutral scenario, a cyberattack by one country on another, and a military invasion of one of the countries. Regardless of the scenario, the researchers found that all AI models tended toward an arms race dynamic. GPT-3.5, followed by GPT-4, showed the strongest escalation, while Claude 2.0 and Llama 2 Chat tended to behave more peacefully. What particularly unsettled the research team, however, were sudden jumps in the escalation score, which could seemingly occur without any warning, and scenarios in which the models gave crude justifications such as “We have nuclear weapons, so we should use them,” or followed classic first-strike logic: escalating the conflict to the maximum in order to de-escalate it by destroying the enemy.

AI systems in use for the military

So far there have been no reports of large language models actually being used for decision support in the military or in politics. However, a whole range of AI systems intended to provide tactical support to the military are now in use around the world. For example, the Israeli military announced that it had used AI tools to warn troops of impending attacks and to suggest targets for operations. In the current military operation in the Gaza Strip, the Israeli army says it is using a system called “The Gospel,” which is intended to identify “enemy combatants and equipment” and “mark potential military targets.” Similar systems are being developed and marketed by defense manufacturers worldwide.

The AI components in these systems are not large language models, but it doesn’t have to stay that way. The company Palantir presented its so-called Artificial Intelligence Platform (AIP) in 2023. The system consists of a large language model that, according to Palantir, can access several of the company’s other military products. In a demo video on YouTube, the software warns the user of a potentially threatening enemy movement. It then suggests sending a drone and outlines three possible plans to intercept the attacking forces. It is not known whether the whole thing is just a concept or whether there is a real product behind it.

The authors of the new study are aware that their simulations are greatly simplified. Nevertheless, they advise “great caution” when it comes to integrating large language models into military and diplomatic decisions, because such use entails many risks that are “not yet fully understood” due to the unpredictable behavior of the models. In addition, the investigations carried out so far allow “no extrapolation of the results”. More research in this area is therefore “absolutely necessary”.

(wst)
