DeepMind game AI learns fast programming

by admin

DeepMind’s conquest of science continues. Last year, the Google-owned company used a version of its gaming AI, AlphaZero, to speed up a crucial math calculation — underlying many different types of code — beating a 50-year-old record.


Now the company has done the same thing again, twice to be precise. With a new version of AlphaZero called AlphaDev, the UK-based company (recently renamed Google DeepMind after a merger with its parent company’s AI lab in April) has found a way to sort items in a list up to 70 percent faster than the best existing method can.

It also found a way to speed up a key algorithm used in cryptography by 30 percent. These algorithms are among the most common building blocks of software. Small increases in speed can make a big difference, reducing costs and saving energy. “Moore’s Law is coming to an end, and chips are nearing their fundamental physical limits,” says Daniel Mankowitz, a researcher at Google DeepMind. “We need to find new and innovative ways to streamline data processing.”

“It’s an interesting new approach,” says Peter Sanders, who works on the design and implementation of efficient algorithms at the Karlsruhe Institute of Technology (KIT) and was not involved in the work. “Sorting is still one of the most used subroutines in computer science.”

DeepMind recently published its results in the journal Nature. But the techniques discovered by AlphaDev are already being used by millions of software developers. In January last year, DeepMind submitted its new sorting algorithms to the organization responsible for governing C++, one of the world’s most popular programming languages. After two months of rigorous independent testing, AlphaDev’s algorithms were incorporated into the language. This was the first change to the C++ sorting algorithms in more than a decade, and the first update ever to include an algorithm discovered with the help of AI.

DeepMind added its other new algorithms to Abseil, an open-source collection of pre-built C++ algorithms that anyone who programmes with C++ is free to use. These cryptographic algorithms calculate hashes that serve as unique identifiers for all types of data. DeepMind estimates that its new algorithms are now being used trillions of times a day.
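To make the idea of a hash as an identifier concrete, here is a minimal sketch. It uses SHA-256 from Python's standard library purely as a stand-in; it is not the hashing code DeepMind contributed to Abseil, and the `fingerprint` helper is a hypothetical name.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Return a hex digest that acts as a compact identifier for `data`."""
    return hashlib.sha256(data).hexdigest()

a = fingerprint(b"sorted list: 1, 2, 3")
b = fingerprint(b"sorted list: 1, 2, 4")

# The same input always produces the same hash.
print(a == fingerprint(b"sorted list: 1, 2, 3"))  # True
# A one-character change produces a different hash.
print(a == b)  # False
```

Strictly speaking, hashes are identifiers that are unique only in practice: two different inputs can collide, but a good hash function makes that extremely unlikely.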


AlphaDev is based on AlphaZero, a reinforcement learning model that DeepMind trained to master games like Go and chess. The company’s breakthrough was to treat the problem of finding a faster algorithm as a game, and then train the AI to win it, the same approach it used to speed up matrix multiplication last year.

In the case of AlphaDev, the game consists of selecting computer commands and arranging them in such a way that the resulting lines of code form an algorithm. AlphaDev wins the game when the algorithm is both correct and faster than the existing algorithms. This sounds simple, but to play well, AlphaDev has to search through a huge number of possible moves.
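The winning condition can be sketched as a toy scoring function. Everything here is an assumption made for illustration: the single `cswap` instruction, the use of program length as a stand-in for speed, and the reward values. AlphaDev itself works with real assembly instructions.

```python
from itertools import permutations

# Hypothetical toy instruction: ("cswap", i, j) swaps slots i and j
# if they are out of order, so the smaller value ends up in slot i.
def run(program, values):
    regs = list(values)
    for op, i, j in program:
        if op == "cswap" and regs[i] > regs[j]:
            regs[i], regs[j] = regs[j], regs[i]
    return regs

def is_correct(program, n):
    """True if the program sorts every ordering of n distinct values."""
    return all(run(program, p) == sorted(p) for p in permutations(range(n)))

def reward(program, n, best_known_length):
    """1.0 for a win (correct and shorter), 0.5 for correct only, 0.0 otherwise."""
    if not is_correct(program, n):
        return 0.0
    return 1.0 if len(program) < best_known_length else 0.5

candidate = [("cswap", 0, 1), ("cswap", 1, 2), ("cswap", 0, 1)]
print(is_correct(candidate, 3))                    # True
print(reward(candidate, 3, best_known_length=4))   # 1.0
```

Checking every input ordering is only practical because the lists are tiny; three items have just six orderings.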

DeepMind chose to work in assembly, a low-level programming language that sits close to the hardware. Few people write assembly today. Code written in other languages, such as C++, is translated into assembly instructions before it is executed. The advantage of assembly is that algorithms can be broken down into fine-grained steps. This is a good place to start when looking for shortcuts.

AlphaDev makes a move by adding a new assembly instruction to the algorithm it is building. Initially, AlphaDev added instructions randomly and generated algorithms that didn’t run. Over time, just as AlphaZero did with board games, AlphaDev learned to play winning moves. It added instructions that resulted in algorithms that were not only correct but also fast.
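The move-by-move construction can be illustrated with plain random search, which is what the starting point described above resembles. This is not reinforcement learning and not AlphaDev's method: the three compare-and-swap moves and the `search` helper are hypothetical simplifications.

```python
import random
from itertools import permutations

# Hypothetical moves: compare-and-swap the values in slots i and j.
MOVES = [(0, 1), (0, 2), (1, 2)]

def run(program, values):
    regs = list(values)
    for i, j in program:
        if regs[i] > regs[j]:
            regs[i], regs[j] = regs[j], regs[i]
    return regs

def sorts_everything(program, n=3):
    return all(run(program, p) == sorted(p) for p in permutations(range(n)))

def random_playout(rng, max_len=6):
    """Append random moves until the program sorts, or give up."""
    program = []
    while len(program) < max_len:
        program.append(rng.choice(MOVES))
        if sorts_everything(program):
            return program
    return None

def search(seed=0, attempts=1000):
    """Keep the shortest correct program found across many random games."""
    rng = random.Random(seed)
    best = None
    for _ in range(attempts):
        program = random_playout(rng)
        if program is not None and (best is None or len(program) < len(best)):
            best = program
    return best

best = search()
print(best, len(best))
```

Random play is enough in a space this small. With hundreds of possible instructions per move, a learned policy is needed to guide the search.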

DeepMind focused on algorithms for sorting short lists of three to five items. Such algorithms are called again and again in programs that sort longer lists. Speed increases in these short algorithms therefore have a cumulative effect.
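A rough sketch of why short sorts matter: a sort for long lists can hand every small piece to a fixed short routine. The merge sort below is an illustrative choice, not the C++ library's actual algorithm, and `sort_small` and `stats` are hypothetical names.

```python
import random

def sort_small(items):
    """Fixed compare-and-swap sequence for lists of up to three items."""
    a = list(items)
    if len(a) == 3:
        if a[0] > a[1]:
            a[0], a[1] = a[1], a[0]
        if a[1] > a[2]:
            a[1], a[2] = a[2], a[1]
        if a[0] > a[1]:
            a[0], a[1] = a[1], a[0]
    elif len(a) == 2 and a[0] > a[1]:
        a[0], a[1] = a[1], a[0]
    return a

def merge_sort(items, stats):
    """Merge sort that delegates pieces of three or fewer items to sort_small."""
    if len(items) <= 3:
        stats["small_calls"] += 1
        return sort_small(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid], stats)
    right = merge_sort(items[mid:], stats)
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

rng = random.Random(0)
data = [rng.randrange(10_000) for _ in range(1000)]
stats = {"small_calls": 0}
result = merge_sort(data, stats)
print(result == sorted(data), stats["small_calls"])
```

Sorting 1,000 items triggers hundreds of calls to the short routine, so any time saved there is multiplied.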

But even short algorithms have been studied and optimized by humans for decades. To test their concept, Mankowitz and his colleagues started with an algorithm for sorting a list of just three entries. The best human-developed version of this algorithm has 18 instructions.

“To be honest, we didn’t expect to achieve anything better,” says Mankowitz. “But to our surprise, we managed to make the algorithm faster. We initially thought it was a bug, but when we analyzed the program, we realized that AlphaDev had actually spotted something here.”

AlphaDev found a way to sort a list of three items in 17 instructions instead of 18. The AI had discovered that certain steps could be skipped. “When we looked at it afterwards, we thought, ‘Wow, that definitely makes sense,’” says Mankowitz. “But to discover something like that [without AlphaDev], you need a lot of people who are experts in assembly.”
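One way a step can become skippable, shown as a hypothetical illustration rather than the exact shortcut AlphaDev found: if an earlier step has already guaranteed that b is no larger than c, the minimum of all three values can be computed without looking at c.

```python
from itertools import product

def smallest_full(a, b, c):
    # Considers all three values.
    return min(a, b, c)

def smallest_short(a, b, c):
    # Valid only when an earlier step guarantees b <= c:
    # c can then never be the smallest on its own, so it is skipped.
    return min(a, b)

# Check every small case that satisfies the precondition b <= c.
cases = [(a, b, c) for a, b, c in product(range(4), repeat=3) if b <= c]
print(all(smallest_full(*t) == smallest_short(*t) for t in cases))  # True
```

Without the precondition the shortcut is wrong, for example for the values (2, 3, 1), which is why such savings depend on the exact order of earlier instructions.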

AlphaDev couldn’t beat the best human version of the algorithm for sorting a four-element list, which takes 28 instructions. However, for five entries, it beat the best human version by reducing the number of instructions from 46 to 42.

That means a significant acceleration. The existing C++ algorithm for sorting a five-item list took about 6.91 nanoseconds on a typical Intel Skylake chip. AlphaDev’s did it in 2.01 nanoseconds, about 70 percent faster.
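The 70 percent figure follows directly from the two timings quoted above. The variable names are illustrative.

```python
old_ns = 6.91  # existing C++ routine, five items
new_ns = 2.01  # AlphaDev's routine, five items

time_saved = (old_ns - new_ns) / old_ns  # fraction of running time removed
speedup = old_ns / new_ns                # how many times faster

print(f"{time_saved:.1%}")   # 70.9%
print(f"{speedup:.2f}x")     # 3.44x
```

Put differently, the new routine takes about 29 percent of the original time, or runs roughly 3.4 times as fast.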

DeepMind compares AlphaDev’s discovery to one of AlphaGo’s weird but successful moves in its Go match against grandmaster Lee Sedol in 2016. “All the pundits looked at it and said, ‘That’s not the right move. That’s a bad move,’” says Mankowitz. “But actually it was the right move, and AlphaGo not only won the game in the end, but also influenced the strategies that professional Go players began to use.”

Sanders is impressed but believes the results shouldn’t be overstated. “I agree that machine learning techniques are increasingly transforming programming, and everyone expects that AI will soon be able to create new, better algorithms,” he says. “But we’re not quite there yet.”

First, Sanders points out that AlphaDev uses only a subset of the instructions available in assembly language. Many existing sorting algorithms use instructions that AlphaDev has not tried. That makes it harder to compare AlphaDev with the best competing approaches.

AlphaDev does have limitations. Its longest algorithm, for sorting a list of up to five items, was 130 instructions long. At each step, AlphaDev chose from 297 possible assembly instructions. “With more than 297 instructions and ‘games’ longer than 130 instructions, learning became slow,” says Mankowitz.

This is because even with 297 instructions (or moves), the number of possible algorithms AlphaDev could construct is greater than the number of possible chess games (10^120) and the number of atoms in the universe (which is believed to be 10^80).
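A back-of-the-envelope check of that comparison, under a simplifying assumption that is not from the source: count every sequence of exactly 130 instructions, each chosen freely from 297 options, whether or not it forms a working program.

```python
import math

instructions = 297   # choices per move
max_length = 130     # longest program AlphaDev built

# Assumption: every position can hold any of the 297 instructions.
sequences = instructions ** max_length

print(math.floor(math.log10(sequences)))  # 321, i.e. roughly 10^321
print(sequences > 10 ** 120)              # True: more than possible chess games
print(sequences > 10 ** 80)               # True: more than atoms in the universe
```

Even this crude count exceeds both reference numbers by some 200 orders of magnitude, which is why exhaustive search is out of the question.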

For longer algorithms, the team plans to adapt AlphaDev to work with C++ instructions instead of assembly. With less fine-grained control, AlphaDev might not find certain shortcuts, but the approach would be applicable to a wider range of algorithms.

Sanders would also like to see a more comprehensive comparison with the best human-developed ideas, especially for longer algorithms. According to DeepMind, this is already planned. Mankowitz wants to combine AlphaDev with human methods and get the AI to build on human intuition instead of starting from scratch.

Finally, there may be other ways to speed things up. “For a person to do that, it takes a lot of expertise and a lot of hours — maybe days, maybe weeks — to look through these programs and find improvements,” Mankowitz says. “That’s why it hasn’t been tried before.”

