Ant Group Leverages Homegrown Chips to Power AI Innovation and Reduce Costs

Ant Group is taking significant steps to reduce costs and lessen its reliance on US technology by using Chinese-made semiconductors to train its artificial intelligence models. According to sources familiar with the matter, the Alibaba-affiliated company has opted for chips from domestic suppliers, including Alibaba and Huawei, to develop large language models using the Mixture of Experts (MoE) method. Reports suggest that the performance of these models is comparable to those trained with Nvidia's H800 chips. Although Ant still uses Nvidia hardware for some AI projects, one source said the company is shifting towards alternatives such as AMD and Chinese chipmakers for its latest models.

This shift highlights Ant's deepening involvement in the AI race between China and the US, particularly as Chinese companies seek cost-effective ways to train AI models. The decision to explore domestic semiconductor options reflects a larger strategy among Chinese firms to navigate US export restrictions on high-end chips such as Nvidia's H800. Though not Nvidia's most advanced chip, the H800 is still among the more powerful GPUs that Chinese organizations have been able to obtain.

Ant has also published a research paper detailing its work, claiming that in some instances, its AI models outperformed those developed by Meta. Bloomberg News, which initially reported on the matter, has not independently verified these claims. If accurate, Ant’s research represents a step forward in China’s broader efforts to make AI development more cost-efficient while reducing reliance on foreign hardware.

The Mixture of Experts (MoE) Approach and Its Impact

MoE models are designed to divide complex tasks into smaller, specialized components, allowing for a more efficient approach to AI model training. This method has gained traction among AI researchers and data scientists and is currently used by industry giants like Google and the Hangzhou-based startup DeepSeek. The concept is similar to assembling a team of specialists, each handling a specific part of a task, making the overall process more streamlined and effective.
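
To make the idea concrete, here is a minimal sketch of an MoE layer written in PyTorch. It illustrates the general routing technique described above, not Ant's Ling models; the class name, layer sizes, expert count and top-k value are all hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F


class MoELayer(nn.Module):
    """Toy Mixture-of-Experts layer: a router picks the top-k experts per token."""

    def __init__(self, d_model: int, d_hidden: int, num_experts: int, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Each expert is a small feed-forward block; in a full model these would
        # replace the dense feed-forward layers of a transformer.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model)
            )
            for _ in range(num_experts)
        )
        # The router (gating network) scores every expert for every token.
        self.router = nn.Linear(d_model, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model)
        scores = self.router(x)                             # (tokens, num_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)  # keep only the top-k experts
        weights = F.softmax(weights, dim=-1)

        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e                # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out


# Route 8 tokens of width 64 through 4 experts, activating 2 experts per token.
layer = MoELayer(d_model=64, d_hidden=256, num_experts=4, top_k=2)
print(layer(torch.randn(8, 64)).shape)  # torch.Size([8, 64])

Because only the top-k experts run for each token, the compute per token stays well below that of a dense layer with the same total parameter count, which is what makes the approach attractive when high-end GPUs are scarce.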

The success of MoE models largely depends on access to high-performance GPUs, which can be prohibitively expensive for smaller businesses. Ant’s research focuses on overcoming this cost barrier. The company’s research paper explicitly outlines its goal of scaling models "without premium GPUs," highlighting its efforts to develop a more affordable AI training strategy.

The Cost Factor: Optimizing AI Training Expenses

Ant's approach to AI training contrasts starkly with Nvidia's business model. Nvidia CEO Jensen Huang believes that demand for computing power will continue to rise, even with the development of more efficient AI models like DeepSeek's R1. He argues that rather than reducing costs, companies will prioritize acquiring more powerful chips to drive revenue growth. Nvidia's strategy accordingly revolves around producing GPUs with ever more cores, transistors, and memory.

In contrast, Ant's paper presents a more cost-conscious perspective. It states that training a model on one trillion tokens (the basic units of data from which AI models learn) would typically cost around 6.35 million yuan (approximately $880,000) using conventional high-performance hardware. By optimizing its training method and using lower-specification chips, the company reduced that figure to about 5.1 million yuan, a saving of roughly 20 percent.
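
As a quick sanity check on those figures, the short calculation below recomputes the savings. The yuan amounts are taken from the article; the exchange rate is an assumption back-derived from the article's own conversion of 6.35 million yuan to roughly $880,000.

# Back-of-the-envelope check of the cost figures quoted above.
conventional_cost_cny = 6_350_000  # reported cost of training on 1T tokens with high-end GPUs
optimized_cost_cny = 5_100_000     # Ant's reported cost using lower-spec chips
cny_per_usd = 7.2                  # assumed rate, consistent with the ~$880,000 figure

savings_cny = conventional_cost_cny - optimized_cost_cny
print(
    f"Savings: {savings_cny:,} yuan (~${savings_cny / cny_per_usd:,.0f}), "
    f"or {savings_cny / conventional_cost_cny:.0%} of the conventional cost"
)
# -> Savings: 1,250,000 yuan (~$173,611), or 20% of the conventional cost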

Applications of Ant's AI Models

Ant has ambitious plans for deploying its AI models, Ling-Plus and Ling-Lite, across various industries, including healthcare and finance. Earlier this year, the company acquired Haodf.com, a leading Chinese online medical platform, signaling its intent to integrate AI-driven solutions into the healthcare sector. Ant also operates other AI services, such as the virtual assistant app Zhixiaobao and the financial advisory platform Maxiaocai, further expanding its AI portfolio.

Robin Yu, the chief technology officer of Beijing-based AI firm Shengshang Tech, emphasized the importance of real-world applications when assessing AI performance. "If you find one point of attack to beat the world’s best kung fu master, you can still say you beat them, which is why real-world application is important," he remarked.

Open Source Innovation and Model Specifications

Ant has chosen to make its AI models open-source, allowing researchers and developers to access and build upon its advancements. Ling-Lite features 16.8 billion parameters, while Ling-Plus has 290 billion. For context, closed-source AI models like GPT-4.5 are estimated by MIT Technology Review to contain around 1.8 trillion parameters, a significantly larger scale.

Challenges in Training AI Models

Despite Ant's progress, the company acknowledges the persistent challenges associated with training AI models. The research paper notes that even minor changes to hardware configurations or model structures can lead to instability, sometimes causing error rates to spike unexpectedly. These challenges highlight the complexities involved in refining AI models and ensuring consistent performance across different hardware setups.

The Broader Implications of Ant's Strategy

Ant Group's AI strategy underscores a broader movement among Chinese tech firms to develop self-reliant AI ecosystems. By leveraging domestic semiconductor technology, companies like Ant are not only reducing costs but also mitigating risks associated with geopolitical tensions and trade restrictions.

The adoption of MoE models and cost-effective training methods may inspire other Chinese companies to follow suit, potentially leading to a more competitive AI landscape in the global market. As China continues to push forward with AI innovation, developments like these will play a crucial role in shaping the future of artificial intelligence and its applications across various industries.

In conclusion, Ant Group’s investment in domestic chip technology and its innovative AI training approach highlight a new era of AI development in China. As the AI race between China and the US intensifies, the company’s efforts to optimize training costs and explore alternative hardware solutions will be crucial in determining its long-term success in the AI sector.
