Baidu ups the ante with the release of its latest AI models


Baidu Inc on Friday unveiled its latest large language model, Ernie 4.5 Turbo, and a new deep-thinking model, Ernie X1 Turbo, with enhanced multimodal capabilities, stronger reasoning and lower costs, as the Chinese tech heavyweight doubles down on the fast-evolving artificial intelligence sector.
Multimodal capability, meaning the ability to process and generate various types of content including text, images, audio and video, will become a common feature of future foundational models, said Robin Li, co-founder, chairman and CEO of Baidu, at the company's annual AI developer conference on Friday.
Li said the market for AI models that can handle only text prompts will shrink, while that for multimodal AI models will continue to expand. He also highlighted the importance of bolstering the application of AI.
Without applications, chips and models are worthless, Li said. "There are many models, but it's the apps that rule the world. The application is the king," he added.
"The application of LLMs will not be outdated if right scenarios and appropriate foundational models are selected and these models are fine-tuned," Li said, adding that with the enhancement of AI model capabilities, there will be more and more models integrated with application scenarios, which are the real opportunities for AI developers.
Li believes that one of the major obstacles for AI developers at present is the high cost of LLMs. Developers and entrepreneurs can boldly engage in developing models only if costs are reduced, and a substantial reduction in costs will ultimately drive the explosive use of AI in various industries, he added.
As a deep-thinking reasoning model, Ernie X1 Turbo outperforms DeepSeek R1 and V3 in overall performance at only 25 percent of DeepSeek R1's price, Baidu said.
It boasts improvements in Q&A, literary creation, logical reasoning and multimodal capabilities. The updated Ernie 4.5 Turbo model responds faster, and its price is 80 percent lower than that of the previous version.
At the conference, Baidu launched its multi-agent collaborative app Xinxiang, which can handle complex tasks such as legal consultation, travel planning and knowledge analysis.
Charlie Dai, vice-president and principal analyst at research company Forrester, said Baidu's advancements in key AI products and services, such as multimodal LLMs and its multi-agent collaboration app, will accelerate AI adoption in various industries in China and lower barriers for developers to boost AI application and innovation.
Dai noted that the company's upgraded developer platforms will further simplify AI application creation and deployment through a range of optimizations covering deep-learning frameworks, while the evolution of its own hardware infrastructure, such as the supercluster built around the Kunlun P800 chipset, is crucial for achieving technological self-reliance.
Baidu has officially switched on a computing power cluster comprising 30,000 of its self-developed Kunlun chips. The cluster can support the training of AI models like DeepSeek with hundreds of billions of parameters, or allow 1,000 customers to fine-tune models with tens of billions of parameters at the same time.
"The multimodal LLM is an undeniable future development direction for generative AI technology," said Lu Yanxia, research director at market consultancy IDC China, adding that such LLMs place greater demands on data and knowledge in professional fields, and on talent that can fine-tune specialized models based on diverse industrial needs.
The continuous advancements in AI models will bring fresh business opportunities for domestic AI server, cloud computing and chip companies, she said, adding that Chinese tech companies should pour more resources into improving computing power, algorithms and data quality to gain a competitive edge in the increasingly fierce international AI chatbot race.
Pan Helin, a member of the Expert Committee for Information and Communication Economy, which operates under the Ministry of Industry and Information Technology, said the Ernie models have made progress in multimodal and reasoning abilities, and that more efforts are required to bolster the efficient circulation of data elements and expand the application of LLMs in a wider range of sectors.