Robin Li: Open source is costly for large models; multimodality is key to AGI

“In the future, natural language will be the new universal programming language. If you can speak, you can be a developer,” proclaimed Robin Li, Founder, Chairman, and CEO of Baidu, at the 2024 Baidu AI Developer Conference on April 16. Li argued that AI is sparking a creativity revolution that makes building an app as straightforward as shooting a short video; in his view, everyone has the potential to be a developer.

Li highlighted Baidu’s commitment, as a technology company, to providing the essential building blocks for developers: a family of foundation models and three major AI development tools. He also noted the state’s strong support and encouragement for the ‘AI+’ initiative, which he sees as pivotal to turning these tools into real creativity and productivity.

The tools and examples Li discussed at the conference were primarily based on large language models. Over the longer term, however, he forecasts a significant shift toward multimodal large models that integrate text, images, voice, and video, which he regards as an essential step toward artificial general intelligence (AGI). Baidu continues to invest heavily in these areas to keep its technology current.

Latest achievements of Baidu’s Wenxin Model: More than 200 million daily API calls

Li first shared the latest figures for the Wenxin model family, whose chatbot Wenxin Yiyan launched in March 2023. Over the past year, its user count and daily API call volume have each surpassed 200 million, it has served 85,000 enterprise customers, and more than 190,000 AI-native applications have been built on the Qianfan platform.

The Wenxin model has evolved significantly across its 3.0, 3.5, and now 4.0 versions, improving in comprehension, generation, logic, and memory. Li pointed to marked gains in general capabilities such as code generation, interpretation, and optimization, which he said place the model at an internationally leading level.

Cost efficiency has also improved dramatically: training efficiency has increased 5.1-fold, the weekly effective training rate has reached 98.8%, and inference performance has risen 105-fold, cutting inference costs to just 1% of their previous level.

During the conference, Baidu released a tool-augmented version of the Wenxin 4.0 model whose code interpreter feature lets developers perform complex data analysis and file processing through natural language interactions.
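To make the code-interpreter idea concrete: given a plain-language request such as “summarize monthly sales from sales.csv,” a tool of this kind generates and runs code roughly along the following lines behind the scenes. This is a generic sketch; the file name and column names are hypothetical, and it is not output from the Wenxin tool itself.

```python
import pandas as pd

# Load the (hypothetical) sales file and parse the date column.
df = pd.read_csv("sales.csv", parse_dates=["date"])

# Aggregate revenue by calendar month and print a summary.
monthly = df.groupby(df["date"].dt.to_period("M"))["amount"].sum()
print(monthly.describe())
```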

“The media might not be thrilled by a 99% cost reduction, but for businesses and developers, performance and cost are what matter most,” Li explained, crediting Baidu’s full-stack layout across chips, frameworks, models, and applications for the savings.

Despite the continuous buzz around large models in 2024 and ongoing technological breakthroughs, Li stressed that large models themselves don’t create value; it’s the AI applications developed from these models that meet real market demands.

Baidu’s strategy for developing AI-native applications is based on lessons learned from previous mistakes

Li shared that the strategies and tools for developing AI-native applications are the result of costly setbacks Baidu experienced over the past year. He sees mixture-of-experts (MoE) architectures, smaller models, and agents as the critical directions ahead.

Future large AI-native applications, he suggests, will primarily adopt MoE architectures rather than relying on a single model for every problem. Smaller models, being cheap and fast, can match large models in certain scenarios, which is why Baidu released its Speed, Lite, and Tiny models.
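For readers unfamiliar with the MoE idea Li refers to, the sketch below shows the core mechanism in miniature: a gating network routes each input to a small subset of expert sub-models instead of running one monolithic model on every problem. It is a generic PyTorch illustration, not Baidu’s Wenxin or Qianfan implementation, and all names in it are invented for the example.

```python
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    """Minimal mixture-of-experts layer: route each input to its top-k experts."""

    def __init__(self, dim: int, num_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(nn.Linear(dim, dim) for _ in range(num_experts))
        self.gate = nn.Linear(dim, num_experts)  # learns which experts suit which input
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scores = self.gate(x).softmax(dim=-1)            # expert affinity per input
        weights, idx = scores.topk(self.top_k, dim=-1)   # keep only the top-k experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[..., k] == e                  # inputs routed to expert e at rank k
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out

moe = TinyMoE(dim=64)
print(moe(torch.randn(8, 64)).shape)  # torch.Size([8, 64])
```

Because only the selected experts run for each input, total capacity can grow without every query paying the full compute cost, which is the efficiency argument behind MoE.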

Agent technologies, with their capabilities for understanding, planning, reflection, and evolution, allow machines to think and act more like humans: autonomously completing complex tasks and continuously learning and improving within their environment.
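A minimal sketch of that understand-plan-act-reflect loop is shown below. It assumes a generic `llm(prompt)` text-completion function and a dictionary of callable tools, both hypothetical; it is a conceptual illustration of the agent pattern Li describes, not Baidu’s AgentBuilder API.

```python
from typing import Callable, Dict

def run_agent(goal: str, llm: Callable[[str], str],
              tools: Dict[str, Callable[[str], str]], max_steps: int = 5) -> str:
    memory: list[str] = []  # the agent's running scratchpad of past steps
    for _ in range(max_steps):
        # Plan: ask the model which tool to call next, given the goal and history.
        plan = llm(f"Goal: {goal}\nHistory: {memory}\n"
                   f"Reply as 'tool_name: input' or 'FINISH: answer'.")
        if plan.startswith("FINISH:"):
            return plan.removeprefix("FINISH:").strip()
        name, _, arg = plan.partition(":")
        # Act: run the chosen tool (or report that it does not exist).
        observation = tools.get(name.strip(), lambda a: f"unknown tool {name}")(arg.strip())
        # Reflect: record what happened so the next planning step can adapt.
        memory.append(f"{plan} -> {observation}")
    return "gave up after max_steps"
```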

At the conference, Baidu launched three new tools: AgentBuilder for developing intelligent agents, AppBuilder for building AI-native applications, and ModelBuilder for customizing models of various sizes.

Li also commented on the disadvantages of open-source models, noting that while open-source might seem cheaper, it is actually more expensive in large model scenarios.

Although open-source models at home and abroad, such as Meta’s Llama and those from the French startup Mistral, are gaining traction, Baidu has decided against open-sourcing its own model, since doing so would mean maintaining a separate version, which it does not consider cost-effective.

Li believes that proprietary models can better attract talent and computational power, forming a viable commercial strategy.

According to its Q4 and full-year 2023 financial report, Baidu posted solid growth, with annual revenue reaching 134.598 billion yuan, up 9% year over year, and non-GAAP net profit of 28.7 billion yuan, up 39%. For 2024, Li expects generative AI and foundation models to contribute meaningfully to Baidu’s revenue.

The report also indicated a 4% increase in R&D expenditure, driven by server depreciation and rack fees supporting generative AI research.

As of midday, Baidu’s stock in Hong Kong had fallen by 2.17%, while its NASDAQ shares dropped by 1.25% as of the last close.
