LLMs meet “Street Fighter”: 14 large language models face off to see which is the strongest | Computer King Ada

by admin
Plenty of large language models (LLMs) are now available online. For AI chatbots, more training data generally means a more capable model. Fighting games turn out to be a different story. Recently, developers overseas hooked LLMs up to the game “Street Fighter” and tested 14 large language models against each other, and the winners were all small models.

This open source project is called LLM Colosseum, developed by Stan Girard and Quivr Brain. According to its introduction, the game runs in an emulator, and the LLMs control the characters and fight each other (the character is limited to Ken). Anyone can download and install the project to try it out.

Amazon employee Banjo Obayomi published an article a few days ago on the results of using this project to test 14 LLMs. The article also details how an LLM controls a character in “Street Fighter”. The LLM continuously reads the current game state, such as character positions, health, and score. This data is turned into a prompt, together with the available actions and suggested strategies, so that the LLM can understand it and act on it.
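As a rough illustration of the state-to-prompt step described above, here is a minimal Python sketch. The `GameState` fields, the move names, the distance threshold, and the prompt wording are all assumptions made for this example; they are not taken from the LLM Colosseum code.

```python
from dataclasses import dataclass

# Hypothetical move list for illustration; the real project defines its own.
AVAILABLE_MOVES = ["approach", "retreat", "hadouken", "shoryuken"]


@dataclass
class GameState:
    # Hypothetical snapshot of one frame; field names are illustrative only.
    own_x: int
    opponent_x: int
    own_health: int
    opponent_health: int
    score: int


def build_prompt(state: GameState) -> str:
    """Turn a game-state snapshot into a text prompt for the model."""
    distance = abs(state.own_x - state.opponent_x)
    # Assumed threshold: beyond 100 units the opponent counts as "far".
    if distance > 100:
        hint = "The opponent is far away: approach or use a ranged attack."
    else:
        hint = "The opponent is close: attack now."
    lines = [
        "You are playing Ken in a fighting game.",
        f"Your health: {state.own_health}. Opponent health: {state.opponent_health}.",
        f"Your score: {state.score}. Distance to opponent: {distance}.",
        hint,
        "Reply with one move per line, chosen from: " + ", ".join(AVAILABLE_MOVES),
    ]
    return "\n".join(lines)


print(build_prompt(GameState(own_x=50, opponent_x=250,
                             own_health=120, opponent_health=80, score=300)))
```

Keeping the prompt short matters in this setting, since every extra token the model has to read adds to the time before the character moves.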

After receiving the prompt, the LLM analyzes the current game state and decides on its next action. That decision is converted into game inputs and executed in the emulator, such as approaching, retreating, Hadouken (fireball), and Shoryuken. The video below shows the details:
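The step from the model's reply to emulator inputs could look roughly like the Python sketch below. The move names, the `MOVE_INPUTS` table, and the button labels are hypothetical and assume the character faces right; the actual project uses its own mapping.

```python
from typing import Dict, List

# Hypothetical mapping from a move name to a button sequence (facing right).
MOVE_INPUTS: Dict[str, List[str]] = {
    "approach": ["RIGHT"],
    "retreat": ["LEFT"],
    "hadouken": ["DOWN", "DOWN+RIGHT", "RIGHT", "PUNCH"],
    "shoryuken": ["RIGHT", "DOWN", "DOWN+RIGHT", "PUNCH"],
}


def parse_moves(reply: str) -> List[str]:
    """Keep only the lines of the reply that name a known move."""
    moves = []
    for line in reply.splitlines():
        # Drop list bullets, numbering, and surrounding whitespace.
        cleaned = line.strip().lstrip("-*0123456789. ").lower()
        if cleaned in MOVE_INPUTS:
            moves.append(cleaned)
    return moves


def to_inputs(moves: List[str]) -> List[str]:
    """Flatten a list of moves into one sequence of button presses."""
    inputs: List[str] = []
    for move in moves:
        inputs.extend(MOVE_INPUTS[move])
    return inputs


reply = "- Approach\n- Hadouken\n- Fly away"
print(to_inputs(parse_moves(reply)))
```

Lines that do not name a known move, like "Fly away" here, are silently dropped, which is one simple way to cope with a model that invents moves.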

The video shared by Matthew Berman, a well-known YouTuber, shows a fairly complete match, with the MISTRAL SMALL model on the left and the MISTRAL MEDIUM model on the right. The two models fight quite fluidly, but one detail stands out: neither appears to make any defensive moves, only movement and attacks. Against a human opponent, the human would most likely win easily.

In any case, this is a battle between LLMs, and MISTRAL SMALL wins in the end: the small model beats the larger one. Unlike AI chat, fighting games reward speed and reaction above all, and small LLMs usually have lower latency and respond faster.
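A quick back-of-the-envelope calculation shows why latency matters so much. The Python sketch below assumes the model makes one decision per request and that requests run back to back; the round length and latency figures are made-up numbers for illustration, not measurements from the tests.

```python
def decisions_per_round(round_seconds: float, latency_seconds: float) -> int:
    """How many full decisions fit into one round at a given response latency."""
    if latency_seconds <= 0:
        raise ValueError("latency must be positive")
    return int(round_seconds // latency_seconds)


# Illustrative numbers only: a 60-second round, 0.5 s vs. 2.0 s per response.
fast = decisions_per_round(60, 0.5)
slow = decisions_per_round(60, 2.0)
print(fast, slow)  # 120 30
```

Under these assumed numbers, the faster model gets four times as many chances to act in the same round, regardless of how good each individual decision is.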

The second half of Matthew Berman's video walks through the steps for installing the LLM Colosseum project, which is useful for anyone who wants to try it themselves.

Among the 14 large language models Banjo Obayomi tested, over a total of 314 matches, the final winner was claude_3_haiku. He also found that small models have lower latency, react faster, and make more moves per match, so it is not surprising that Anthropic’s Claude took first place.

However, capable as LLMs are, they are not without shortcomings. Some odd situations come up, such as “hallucinations” and “refusing to play”. Each LLM also has its own play style: some attack aggressively, others prefer defensive counterattacks, and some simply spam the same action over and over.
