U.S. sanctions are prompting Chinese tech companies to step up research as they seek to develop cutting-edge AI without relying on the most advanced U.S. chips.

A Wall Street Journal review of research papers and interviews with employees found that Chinese companies are working on techniques that might allow them to achieve cutting-edge AI performance with fewer or less powerful semiconductors. They are also studying how to combine different types of chips to avoid dependence on any one kind of hardware.

Chinese communications equipment supplier Huawei Technologies Co., search company Baidu Inc. (9888.HK, BIDU) and e-commerce giant Alibaba Group Holding Limited (9988.HK, BABA), among others, are trying to make more efficient use of existing computer chips.

Using these workarounds to catch up to America’s AI leaders remains a major challenge, researchers and analysts say. Still, they say some experiments have shown promise and, if successful, could allow Chinese technology companies to overcome the difficulties posed by U.S. sanctions and become more resilient to future restrictions.

Huawei and Baidu declined to comment. Alibaba did not respond to a request for comment.

The race to commercialize models like ChatGPT is heating up, with companies around the world desperate for more powerful chips and looking for ways to use them more efficiently to drive down surging AI development costs.

More critically, employees, AI researchers and industry analysts said, U.S. sanctions have made it difficult for Chinese companies to obtain the most advanced chips made by companies such as Nvidia Corp. To build products that can compete with ChatGPT, these Chinese companies have rapidly consumed their existing inventories of U.S. chips.

According to Susan Zhang, an AI researcher at Meta Platforms who specializes in AI infrastructure and large language models, there are various signs that Chinese companies are trying to tap all available computing power to make up for the lack of top-level hardware. In the AI industry, computing power refers to the processing capacity that a set of chips can provide.

China’s top decision-making body said last month that China should attach importance to the development of artificial general intelligence and create an innovation ecosystem.

The Biden administration has hinted at the possibility of further sanctions since the Commerce Department imposed sweeping restrictions on chip supplies to China in October.

Sanctions have cut Chinese companies off from Nvidia’s A100, the most popular chip in the AI development industry, as well as its next-generation version, the H100, released in March. The latter offers greater computing power.

Nvidia designed downgraded chips for the Chinese market, namely the A800 and H800, to comply with sanctions. Both modified chips have a reduced ability to communicate with other chips.

These products offer an effective alternative for developing small AI models, such as the one used by ByteDance Ltd.’s short-video app TikTok for recommendations. But the reduced chip-to-chip communication can hinder the development of larger AI models, which require hundreds or even thousands of chips to work together.

One month after the United States imposed chip sanctions on China, OpenAI released ChatGPT. The release triggered a wave of generative AI development around the world. Generative AI, software that can generate text and images, requires unprecedented computing power to develop. Analysts at UBS estimate that between 5,000 and 10,000 A100 chips are needed to train such large AI models. OpenAI did not respond to a request for comment.

At a recent closed-door industry meeting, a survey released by a semiconductor industry association linked to the Chinese government pointed to supply constraints, finding that there were about 40,000 to 50,000 A100 chips in China that can be used to train large AI models, according to a person who attended the meeting. The association did not respond to a request for comment.

According to people familiar with the matter, Chinese companies such as Alibaba and Baidu, which stockpiled A100 chips before the U.S. sanctions, have strictly restricted internal use of advanced foreign chips, reserving them for the most computing-intensive tasks.

According to a previous report by The Wall Street Journal, Baidu has requisitioned A100 chips from various business teams, including the autonomous driving department, to focus on developing Ernie Bot (known in Chinese as Wenxin Yiyan), Baidu’s own AI product positioned as a rival to ChatGPT.

Baidu has in recent years sought to incorporate domestic chips into its AI development, including Hygon Information Technology Co., Ltd.’s DCU, Huawei’s AI training chip Ascend, and Baidu’s own Kunlun chip, according to publicly available research papers and people familiar with the matter. However, many domestic chips are still unreliable for training large models because they are very prone to crashing, some people familiar with the matter said.

You Yang, a professor at the National University of Singapore who also runs AI infrastructure company HPC-AI Tech, said many Chinese companies are now looking at combining three or four less powerful chips, including the A800 and H800, to substitute for Nvidia’s most advanced processors.

Tencent Holdings Ltd. (0700.HK) released a new computing cluster for training large AI models in April this year. The interconnected chips in the cluster are Nvidia H800s.

This method of combining low-performance chips can be costly, You Yang said. If an American company needs 1,000 H100 chips to train a large language model, a Chinese company may need at least 3,000 H800 chips to achieve the same result.
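The arithmetic behind that comparison can be sketched in a few lines of Python. This is only an illustration built on the two figures You Yang cited; the function name `h800_needed` is hypothetical, and the assumption that the three-to-one ratio holds at other cluster sizes is ours, not the article's. In practice the ratio would vary with the model and with how the chips are connected.

```python
import math

# Figures cited by You Yang: 1,000 H100 chips vs. at least 3,000 H800 chips.
H100_BASELINE = 1_000
H800_EQUIVALENT = 3_000

# Assumption (not from the article): the ratio stays constant as the
# cluster grows. Real training clusters do not scale this cleanly.
H800_PER_H100 = H800_EQUIVALENT / H100_BASELINE


def h800_needed(h100_count: int) -> int:
    """Rough lower bound on H800 chips needed to match h100_count H100 chips."""
    return math.ceil(h100_count * H800_PER_H100)


if __name__ == "__main__":
    for n in (1_000, 2_500):
        print(f"{n} H100 chips -> at least {h800_needed(n)} H800 chips")
```

Under that assumption, every additional H100 an American company deploys translates into at least three more H800 chips for a Chinese competitor, which is why the approach becomes expensive quickly.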

This is prompting some companies to accelerate the development of technologies that use different types of chips to train large-scale AI models, You Yang said. This area of research has previously been common among Chinese companies with limited hardware resources and a desire to keep costs down. Alibaba, Baidu and Huawei have sought to use various combinations of the A100, the older-generation Nvidia V100 and P100 chips, and Huawei’s Ascend chip, research papers show.

By contrast, using multiple types of chips at the same time is rare among U.S. companies because of the technical challenges of getting the different types to work reliably together, AI experts said. It’s a last resort, Meta’s Zhang said.

Meanwhile, Chinese companies are also researching software techniques to reduce the computational intensity of training large-scale AI models, an approach that has accelerated globally, including among U.S. companies. However, research papers show that, unlike American firms, Chinese firms have been more proactive in combining multiple software techniques.

While many of these approaches are still evolving and remain difficult to implement, even for the global research community, Chinese researchers have had some success.

In a paper in March, Huawei researchers showed how they could use these techniques to train the company’s latest generation of large language models using only its Ascend chips, rather than Nvidia’s. Despite some shortcomings, the large language model, called Pangu, achieved state-of-the-art performance on several Chinese-language tasks, including reading comprehension and grammar challenges, the researchers wrote in the paper.

Without access to the new Nvidia H100 chips, the pain points for Chinese researchers will only intensify, said Dylan Patel, principal analyst at SemiAnalysis, a semiconductor research and consulting firm. The H100 chip includes an additional performance-boosting feature that is especially helpful for training ChatGPT-like language models.

But a paper last year from Baidu and the Peng Cheng Laboratory, a Shenzhen-based research institute, showed that researchers were training large language models in a way that didn’t require that capability. Although the research is in its early stages, it looks promising, Patel said.

If the research goes well, the sanctions could be effectively circumvented, he said.