2024 Baidu Create: Robin Li introduces 3 new AI tools!

On April 16, at the Baidu AI Developer Conference Create 2024, Baidu’s founder, chairman, and CEO Robin Li delivered a keynote speech titled “Everyone is a Developer.” He emphasized that “AI is sparking a creativity revolution, where in the future, developing applications will be as simple as shooting a short video. Everyone is a developer, everyone is a creator.”

He mentioned that as a technology company, Baidu aims to provide everyone with the necessary development tools to enhance societal creativity. This includes a powerful series of basic models and three major AI development tools, collectively forming a toolbox for developers to use on-the-go.

The impressive basic model series, known as the Wenxin Big Model Series, consists of flagship models such as ERNIE 3.5 and 4.0, and lightweight versions like ERNIE Speed, Lite, and Tiny. Li announced the official release of the tool version of the Wenxin Big Model 4.0. With over 200 million users since its launch over a year ago, Wenxin Big Model has become China’s most advanced and widely applied AI basic model.

Li highlighted that the smaller-sized models derived from the powerful Wenxin 4.0 through dimensionality reduction and pruning are significantly more effective and cost-efficient compared to directly using open-source models. He emphasized that open-source models are becoming increasingly obsolete.

Furthermore, Baidu has prepared three major “ready-to-use” AI development tools, including the Intelligent Entity Development Tool AgentBuilder, AI Native Application Development Tool AppBuilder, and various model customization tools such as ModelBuilder. According to Li, these three tools represent advanced productivity.

During the event, Li shared Baidu’s specific approach to developing AI native applications based on large models, pointing out MoE, small models, and intelligent entities as three key areas to focus on. He stated, “This is the result of Baidu’s countless trials and costly lessons over the past year.”

Below is a transcript of Robin Li’s keynote speech “Everyone is a Developer”:

As long as you can speak, you can become a developer

Hello, welcome to Create 2024, the Baidu AI Developer Conference. This is the first time Create is held in the Guangdong-Hong Kong-Macao Greater Bay Area. Today, we have over 5000 developers and tech enthusiasts in attendance. Over the past year, I have interacted with many entrepreneurs and developers, and it seems like everyone is in a “FOMO” state – excited yet afraid of missing out. Indeed, large models and generative AI will completely transform the developer community.

In the past, developers changed the world with code; in the future, natural language will become the new universal programming language. As long as you can speak, you can become a developer and use your creativity to change the world.

That day is not far away; with the powerful basic large models and many low-threshold or even zero-threshold development tools, developers’ productivity has greatly increased.

For instance, Comate, the intelligent code assistant based on the Wenxin Big Model, not only supports over 100 languages and all mainstream IDE platforms but also offers code suggestions, generates code comments, detects code defects, provides optimization suggestions, and can deeply interpret code repositories, linking private knowledge to generate new code. After more than a year in operation, Comate has been adopted by tens of thousands of enterprises such as Ximalaya, Mitsubishi Elevator, and Softbank Dynamics, with a code adoption rate of 46%. Within Baidu, 27% of each day’s new code is generated by Comate.

Today, you don’t need to write code to create an AI application; you don’t have to program to develop an intelligent entity. AI is leading a creativity revolution, where developing applications in the future will be as easy as shooting a short video. Everyone is a developer, everyone is a creator.

As a technology company, Baidu’s role is to provide the necessary development tools to everyone and continually enhance the creativity of society. Specifically, we offer a powerful basic model series, the Wenxin Big Model Series, which includes flagship versions like ERNIE 3.5 and ERNIE 4.0, as well as lightweight versions like ERNIE Speed, Lite, and Tiny.

We also provide tools for developing various applications based on large models, including the AgentBuilder for intelligent agent development, the AppBuilder for AI-native application development, and the ModelBuilder for customizing models of various sizes. These three tools represent advanced productivity. Below, I will introduce them one by one.

Wenxin Yiyan surpasses 200 million users, Wenxin large model 4.0 tool version released

First, let’s talk about the latest developments in Wenxin Yiyan and the Wenxin large models:

Wenxin Yiyan was launched on March 16th last year and has now been in operation for one year and one month. Its user base has exceeded 200 million, and daily API calls have also surpassed 200 million. The number of clients served has reached 85,000, and the number of AI-native applications developed on the Qianfan platform has exceeded 190,000.

Let’s see what everyone is doing with Wenxin Yiyan.

The foundation model supporting Wenxin Yiyan is the Wenxin large model. Over the past year, it has evolved from version 3.0 to 3.5, and now to version 4.0. Wenxin 4.0 has reached the industry-leading level in four major capabilities: understanding, generation, logic, and memory.

In recent months, the Wenxin large model has made further significant improvements in general capabilities such as code generation, interpretation, and optimization, reaching the international leading level.

Today, we officially release the tool version of Wenxin large model 4.0. Now, you can experience the code interpreter function on the tool version, which enables you to process and analyze complex data and files through natural language interaction. You can also generate charts or files to quickly gain insights into the characteristics of the data, analyze changes, and provide efficient and accurate support for subsequent decisions.

The Wenxin large model has become the most advanced and widely used AI basic model in China.

Moreover, compared to a year ago, the algorithm training efficiency of the Wenxin large model has increased by 5.1 times, with the weekly average effective training rate reaching 98.8%, and inference performance has increased by 105 times, reducing the cost of inference to 1% of the original.

In other words, customers who used to make 10,000 calls a day can now make 1 million calls at the same cost. Media may not be excited about a 99% cost reduction, but for businesses and developers, once they start using it, the focus is on effectiveness and cost.
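The arithmetic behind this claim can be checked in a few lines. This is a minimal sketch: the function name `calls_for_budget` is hypothetical, and it assumes that cost scales linearly with the number of calls.

```python
def calls_for_budget(baseline_calls: int, cost_ratio: float) -> int:
    """Calls affordable on the old daily budget once the per-call cost
    drops to `cost_ratio` of its original value (assumes linear pricing)."""
    if not 0 < cost_ratio <= 1:
        raise ValueError("cost_ratio must be in (0, 1]")
    return round(baseline_calls / cost_ratio)


# Inference cost at 1% of the original, 10,000 calls per day before.
new_calls = calls_for_budget(10_000, 0.01)
print(new_calls)
```

With the cost at 1% of the original, the same budget covers 100 times as many calls, which matches the 10,000 to 1 million figure in the speech.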

The reason we can reduce the inference cost to 1% while improving performance is because Baidu has a full-stack layout in the four layers of chip, framework, model, and application. Through end-to-end optimization, we continuously reduce costs, allowing more people to use large models to develop AI applications efficiently and inexpensively.

Undoubtedly, topics related to large models will still be hot in 2024, and various technological breakthroughs will continue to emerge. The media will continue to play them up with headlines such as “shocking release” and “epic update.” But what I want to emphasize is that large models themselves do not directly create value; only AI applications developed on top of large models can meet real market demands.

Developing AI Native Applications: A Journey of Challenges and Insights

Today, I would like to share some specific approaches and tools for developing AI native applications based on large models. These insights stem from Baidu’s practices over the past year, where we’ve encountered numerous challenges and paid substantial learning fees in return for valuable lessons.

Firstly, MoE (Mixture of Experts). The future of large-scale AI native applications largely revolves around the MoE architecture. Here, MoE refers not merely to an academic concept but to the integration of small and large models. Instead of relying on a single model to address all issues, it involves strategically selecting when to utilize small models, large models, or no models at all, tailored to different application scenarios.

Secondly, small models. With low inference costs and fast response times, finely tuned small models, especially those optimized through SFT (Supervised Fine-Tuning), can rival the performance of large models in specific scenarios. This rationale led to the introduction of our Speed, Lite, and Tiny lightweight models. By compressing and distilling a base model from a large model and then training it on data, we achieve far better outcomes than training small models from scratch. These models outperform those derived from open-source models, delivering better results at lower costs.

Thirdly, Intelligent Agents. Intelligent Agents are currently a hot topic and are expected to spawn numerous new applications as their capabilities advance. The agent mechanism, encompassing understanding, planning, reflection, and evolution, enables machines to think and act like humans. Agents autonomously execute complex tasks, continually learn from their environment, and achieve self-iteration and self-evolution. In complex systems, letting different intelligent agents interact and collaborate significantly improves the quality of task completion. These capabilities have already been developed, and we’ve made them fully accessible to developers.
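The distillation mentioned for the small models above can be illustrated with the generic knowledge-distillation loss, in which a small "student" model is trained to match the softened output distribution of a large "teacher". This is a sketch of the textbook technique only; the speech does not describe Baidu's actual compression pipeline, and the function names and temperature value here are assumptions.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax; higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits: List[float],
                      student_logits: List[float],
                      temperature: float = 2.0) -> float:
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical logits over three classes, for illustration only.
teacher = [4.0, 1.0, 0.5]
print(distillation_loss(teacher, [3.5, 1.2, 0.4]))  # small: student is close
print(distillation_loss(teacher, [0.5, 1.0, 4.0]))  # large: student disagrees
```

Minimizing this loss pulls the student toward the teacher's full output distribution rather than only its top answer, which is one common reason a distilled small model can beat a same-sized model trained from scratch.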

In these three domains of MoE, small models, and Intelligent Agents, Baidu has prepared “out-of-the-box” tools for everyone. Here, I’ll introduce three distinct tools: AgentBuilder for Intelligent Agents development, AppBuilder for AI native application development, and ModelBuilder for customizing models of various sizes.

AgentBuilder for Intelligent Agents Development:

Intelligent Agents are the closest and most mainstream use of large models for everyone

First up is the AgentBuilder for Intelligent Agents development. Intelligent Agents are poised to become the closest and most mainstream use of large models for everyone. Leveraging powerful base models, Intelligent Agents can be mass-produced and applied across diverse scenarios.

Baidu has recently upgraded the Wenxin Intelligent Agent platform. To date, over 30,000 Intelligent Agents have been created, with over 50,000 developers and thousands of businesses participating. Our aim is to empower individuals and organizations to become Intelligent Agent developers, fostering the most comprehensive Intelligent Agent ecosystem in China.

So, how do we achieve this goal? By offering a zero-threshold Intelligent Agent development tool – AgentBuilder.

Next, let’s explore how an Intelligent Agent is created using the example of the “Singapore Tourism Board.”

Firstly, we access the Wenxin Intelligent Agent platform, where the creation page provides two modes: “zero-code” and “low-code.” Beginners can directly opt for the “zero-code mode” to craft an Intelligent Agent in just a few sentences, using natural language.

Creating Specialized AI Assistants for Tourism and Education

We start by naming the AI assistant as the “Singapore Tourism Board” and specifying in the settings that it should be designed to create travel plans, answer questions, and provide hotel and ticket booking services. These settings are meant to guide the AI assistant on what it can do.

If only a basic AI assistant is required, the platform will automatically fill in the details. However, we aim for the “Singapore Tourism Board” to be a professional AI assistant, so advanced configurations are necessary. I can add Singapore encyclopedia entries and official website links to the knowledge base for daily updates. Then, by adding tools such as hotel inquiries and attraction ticket purchases, we enhance its service capabilities. Currently, we have partnered with Ctrip to provide tools for hotel bookings, attraction visits, ticket purchase, and other travel services. This way, an AI assistant for the Singapore Tourism Board is ready, allowing for further previewing and fine-tuning.

Now, open the Baidu app and search for “When is the least crowded time to visit Singapore?” as travelers often seek to avoid crowds while going on trips. The AI assistant will consolidate information from various sources to generate an answer, indicating “The least crowded period is from January to March.” We can also interact further with the AI assistant for additional details on traveling to Singapore, recommendations for the top three hotels in Singapore, and even directly book tickets for Universal Studios Singapore. This one-stop solution significantly saves users time.

In addition to Singapore, intelligent agents in the cultural and tourism sectors for cities like Dalian and Shenyang are also available on the Wenxin Intelligent Agent Platform, along with various types of agents for knowledge, creativity, learning, entertainment, etc., all built using AgentBuilder.

When Wenxin Yiyan was first launched last year, I mentioned that Wenxin Yiyan would impact every company, given its powerful natural language understanding, expression, and reasoning capabilities that bring any company closer to its customers.

Now, every business and every customer can have their own dedicated intelligent agent on Baidu. The entire process requires no programming; through information input similar to prompts and simple steps for optimization, a personalized intelligent assistant can be quickly generated, becoming a 24/7 online top-notch salesperson.

Let’s take a look at how a business’s intelligent agent is built.

Xuexue Education is a well-known education company with over 60 branches nationwide and many overseas subsidiaries. It serves a wide range of countries and has high requirements for customer reception dialogue. How can it reply to customer inquiries 24/7, enhance service quality, and reduce operating costs?

Xuexue Education utilized Baidu’s AgentBuilder to create a dedicated intelligent agent.

Now let’s see how to create an intelligent agent with basic capabilities. It’s straightforward: on the platform, fill in the agent’s avatar, name, business scope, and welcome message, then specify the information users should provide, such as age and education level. In just 5 minutes, with zero barriers, an intelligent agent is ready.

Xuexue Education also wanted this intelligent agent to be a knowledgeable study abroad consultant, understanding students’ various situations like preferences for the US or Australia, pursuing a master’s or bachelor’s degree, scores in IELTS and TOEFL, etc., providing professional analyses and precise answers. By adding knowledge, roles, and tools modules, we can create a more advanced intelligent agent.

In the knowledge module, upload exclusive knowledge for real-time analysis by the platform to generate conversation scripts; in the role module, filter out study abroad countries not within the business scope to enhance user lead efficiency; in the tool module, include services like scheduling appointments. Through these simple steps, a professional Xuexue Education intelligent agent with specialized capabilities is created.

Now, let’s search for “Requirements for studying in Australia” to see how quickly the intelligent agent provides essential language skills, professional choices, and corresponding study abroad consultation plans, ensuring thorough responses to various inquiries and requests.

The Xuexue Education intelligent agent has been well received. In the first week after launch, it was distributed 1.55 million times and interacted with users 58,000 times; lead conversion volume rose sharply, the conversion cost of effective leads fell significantly, and operational efficiency improved greatly.

Next, let’s introduce an intelligent agent in the home industry.

Sophia is a home furnishing brand specializing in full-house customization. As shown earlier, it can create a basic business intelligent agent by providing only minimal information. However, for the home industry, offline consumer experience is crucial, so Sophia aims to build a top-notch sales representative online to replicate the offline service experience.

During further settings, Sophia chose a digital avatar as the display method in the role module, then selected suitable background and voice for the avatar, and combined with the platform’s intelligent parsing capabilities, automatically summarized a set of sales scripts. Eventually, a gentle, professional, top-notch sales representative was created, ready to address user needs and provide high-quality service experiences 24/7.

When Baidu search users have decoration needs, Sophia’s intelligent agent uses the Wenxin large model’s capabilities to answer their questions promptly. In addition, it actively confirms specific requirements with customers, such as decoration type and budget, and recommends nearby offline stores.

Since the Sophia business intelligent agent went online, the cost of effective leads has decreased by 30%. That is, a lead that previously cost 100 yuan to acquire now costs only 70 yuan.

Currently, more than 10,000 of Baidu’s clients have adopted business intelligent agents, spanning over 30 industries including education and training, real estate, machinery and equipment, business services, and more.

Above, through three demos, I showcased how developers and businesses can utilize AgentBuilder to create intelligent agents tailored to different industries.

Nowadays, creating an intelligent agent is as easy as pie. But here’s the catch! Without traffic, distribution, visibility, and user engagement, developers and businesses won’t yield any profits. And without profits, there’s no drive. How to tackle this pain point?

Our Wenxin Intelligent Agent Platform provides developers with a channel for traffic monetization. Apart from Baidu Search, other products within the Baidu ecosystem such as Xiaodu, Maps, Tieba, and CarLife can all integrate with the capabilities of intelligent agents, alleviating concerns about traffic distribution for developers and ensuring tangible profits.

With distribution comes data feedback; with data feedback, the flywheel starts turning, allowing the intelligent agents to autonomously iterate and become smarter over time. The Wenxin Intelligent Agent Platform also launched modules for data analysis and question-answer tuning of agents, with more new capabilities coming soon. Through the data flywheel of distribution-diagnosis-profit, the Wenxin Intelligent Agent Platform aims to drive the formation of a positive cycle with better quality, more traffic, and greater profits for the agents.

AppBuilder: Develop an App in Three Steps with Natural Language

Next, let me introduce the second development tool, AppBuilder. It’s currently the most user-friendly AI-native application development tool. On AppBuilder, we have pre-packaged and pre-set various components and frameworks required for developing AI-native applications, significantly lowering the barrier to entry for developers.

In just three steps, developers can create an AI-native application using natural language and seamlessly deploy it into various business environments. Let’s look at some examples:

Earlier this year, we organized an AI-native application development challenge, where participants used AppBuilder to create a “Theme Park Queue Planning Assistant” to help visitors better understand queue situations and design personalized routes for optimal amusement park experiences within a limited time frame.

The champion of this challenge developed the application without writing a single line of code and won a prize of 100,000 RMB provided by Baidu. While this challenge might not be difficult if you can code, the ability to develop it without writing any code heavily relies on the foundation model and the capabilities of AppBuilder.

Now, let’s see how to use AppBuilder to create this AI application.

Let’s first review the challenge. It assumed the queue times and excitement levels of various attractions at “Universal Studios,” and the task was to achieve the highest excitement level within a limited time.

Firstly, open the development interface of AppBuilder and name the application “Theme Park Queue Assistant.” The second step is to describe specific requirements in the role instructions, including calling the code interpreter, calculating the optimal combination within a fixed time, and outputting the results. The third step is to add the code interpreter to the tool components to assist in computation.

Now, let’s test the effect. By inputting the question “I have 3 and a half hours, how can I have the most excitement?” on the right side, you can see that the code interpreter translates this question into code, which is then processed by the data understanding tool to analyze the given conditions. After a series of calculations, it suggests that the combination of “Forbidden Journey of Harry Potter,” “Jurassic Adventure,” “Tyrannosaurus Rex Roller Coaster,” and “Bumblebee Swinging” would provide the optimal amusement. With no issues in testing, click publish, and an application is generated with zero code.
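The calculation the code interpreter performs here is a classic 0/1 knapsack problem: choose a set of rides that maximizes total excitement without exceeding the time budget. The sketch below shows one way to solve it with dynamic programming. The ride names, times, and scores are hypothetical placeholders, since the challenge data is not given in the speech, and this is not the code AppBuilder actually generated.

```python
from typing import List, Tuple

Ride = Tuple[str, int, int]  # (name, minutes including queue, excitement score)


def best_plan(rides: List[Ride], budget_minutes: int) -> Tuple[int, List[str]]:
    """0/1 knapsack: maximize total excitement within the time budget."""
    # best[t] holds (total excitement, chosen rides) achievable in at most t minutes.
    best: List[Tuple[int, List[str]]] = [(0, []) for _ in range(budget_minutes + 1)]
    for name, minutes, score in rides:
        # Walk the budget downward so each ride is chosen at most once.
        for t in range(budget_minutes, minutes - 1, -1):
            prev_score, prev_names = best[t - minutes]
            if prev_score + score > best[t][0]:
                best[t] = (prev_score + score, prev_names + [name])
    return best[budget_minutes]


# Hypothetical rides and numbers, for illustration only.
RIDES: List[Ride] = [
    ("Ride A", 90, 9),
    ("Ride B", 60, 7),
    ("Ride C", 45, 5),
    ("Ride D", 30, 3),
    ("Ride E", 75, 8),
]

total, plan = best_plan(RIDES, budget_minutes=210)  # 3.5 hours
print(total, plan)
```

With these made-up numbers, skipping the single longest ride and taking the other four fills the 210 minutes exactly and gives the highest total score.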

Now, AppBuilder has been further upgraded. Throughout the creation process, all steps can be automatically optimized through the “AI optimization configuration” feature, further enhancing development efficiency for developers.

Let’s look at another example.

Earlier this year, North China Electric Power University proposed providing intelligent exclusive services for all students and teachers. Based on Baidu’s AppBuilder, we jointly created a Hua Dian AI Assistant. Now, let me show you how the Hua Dian AI Assistant was created:

Step one: Open AppBuilder, enter the creation page, and click on AI auto-generation configuration. Firstly, set the name, description, avatar, and other basic information for the application.

Step two: Describe specific requirements in the role instructions through natural language, including tasks, component capabilities, requirements, and limitations.

Step three: Insert custom components such as book borrowing inquiries, schedule inquiries, and student grade inquiries to enable the campus assistant to provide intelligent services. Then, add an opening statement for the campus assistant to configure the application.

Next, in the preview interface, debug the assistant based on user questions, such as inquiring about the registration time for the College English Test Band 4 or 6, to test the automatic calling effects of various components.

As you can see, through these simple operations, the application is completed. It has been tested on a small scale and has been launched, providing services for common scenarios such as checking regulations, courses, meal cards, and borrowing books for the vast number of students and teachers. In the future, we will further deepen our cooperation with North China Electric Power University to provide more rich and convenient services.

Baidu has also accumulated years of technology in cross-modal fields. In AppBuilder, we also provide certain cross-modal capabilities. With just a piece of text or a few sentences, you can quickly create applications related to drawing, such as comics or children’s picture books.

The process is simple: open AppBuilder, click on “App Creation,” input the character instructions, select the “Add Graphics from Text” component, input three recommended questions, and then you can click on publish. Once the application is created, all we need to do is input a rough idea of the character or plot we desire, and AppBuilder can automatically generate the story and output the graphics.

The latest feature introduced by Baidu Wenku, the intelligent comic and picture book generation, utilizes such components provided on AppBuilder. Let’s take a look at how Baidu Wenku’s comic generation feature allows anyone with creative ideas to produce great works.

Let’s take the classic story of “Zhou Chu Eliminating Three Evils” as an example.

Open Baidu Wenku, input the theme “Zhou Chu Eliminating Three Evils,” which is documented in “Jin Shu” and “Shi Shuo Xin Yu.” Upon searching in Wenku, the platform will generate a story based on the original text, which we can further modify. Then, click on the AI toolbar on the right side to begin creating this comic.

Upon entering the comic creation interface, Wenku will automatically generate comic frames for us based on the story’s plot. Then, from various styles such as light and shadow, realism, and cartoon, choose the comic style that best fits the story. Finally, select different character images based on the roles, and the comic generation is complete.

After the comic is generated, we can browse the complete comic in Baidu Wenku’s intelligent comic editor. Moreover, Wenku supports editing, modifying, and refining each frame. For example, by selecting the frame “Zhou Chu and the White-Spotted Tiger” on the left side, clicking on edit, and adding a description like “characters with clear facial features, bright scene,” we can fine-tune the comic to better meet our needs. As you can see, Baidu Wenku’s comic feature excels in maintaining consistency in character and scene styles.

Baidu Wenku’s intelligent comic capabilities greatly enhance the efficiency of creating comics, reducing the cost and threshold of comic creation, allowing more people with ideas and creativity to realize their comic creation dreams.

Beyond comic generation, Baidu Wenku can also help users create picture books effortlessly. You may not know that Chinese children read only about 10 picture books per capita each year, compared with around 50 in European and American countries. AI now enables parents with no drawing skills to create a personalized picture book for their children. Let’s take a look at this illustrated sound book!

Since last year, we’ve redesigned Baidu Wenku with AI, making it the “starting point for content creation” for users. Now, with the support of AppBuilder, the newly introduced intelligent comic and picture book features of Baidu Wenku extend the scene to a more interesting cross-modal creative field.

I just showcased three cases to demonstrate how to use Baidu’s AppBuilder to create AI-native applications. You should be able to feel two obvious advantages of AppBuilder:

First, it’s powerful. Leveraging Wenxin 4.0’s ability to understand and follow instructions, AppBuilder delivers a decent level of performance from a cold start, so developers avoid spending a long time on tuning because of poor initial results, which significantly lowers the development threshold. With the support of Retrieval-Augmented Generation (RAG) technology, in typical scenarios such as knowledge Q&A, our Q&A accuracy and friendly response level have exceeded 95%, far surpassing other similar products. AppBuilder also provides a rich set of component tools, including AI capability components built on Baidu’s years of technological accumulation such as Baidu Search, large model capability components, and 55 exclusive business components opened up by Baidu, as well as some mainstream third-party APIs such as flight inquiry and paper inquiry. We have also just added support for custom components, so customers can directly connect their own proprietary tools and data. Together, these rich components support efficient development of AI-native applications.

Second, it’s simple and easy to use. With AppBuilder, you can create and distribute applications quickly in just three steps. We also support open-source SDKs for convenient secondary development.
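The Retrieval-Augmented Generation approach mentioned above can be sketched in a few lines: find the passages most relevant to a question, then place them in the prompt sent to the model. This is a simplified illustration and not AppBuilder's implementation. The bag-of-words similarity stands in for a real embedding model, and the function names and sample passages are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> Counter:
    """Lowercased word counts; a stand-in for a real embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question: str, passages: List[str], k: int = 2) -> List[str]:
    """Return the k passages most similar to the question."""
    q = tokenize(question)
    ranked = sorted(passages, key=lambda p: cosine(q, tokenize(p)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, passages: List[str]) -> str:
    """Assemble the retrieved context and the question into one prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only the context below.\nContext:\n{context}\nQuestion: {question}"


# Hypothetical knowledge base for a campus assistant.
PASSAGES = [
    "The library lends books for 30 days and allows one renewal.",
    "The cafeteria meal card can be topped up at the service desk.",
    "Course schedules are published one week before the semester starts.",
]
QUESTION = "How long can I borrow library books?"

top = retrieve(QUESTION, PASSAGES, k=1)
print(build_prompt(QUESTION, top))
```

Grounding the answer in retrieved passages, rather than in the model's memory alone, is what lets a knowledge Q&A application answer from an organization's own documents.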

ModelBuilder, the model customization tool: Efficient and low-cost model production

Let’s take a look at how ModelBuilder achieves model fine-tuning for essay correction.

(See [Juxian Education] demo)

Step one: Create a dataset. The effectiveness of model fine-tuning largely depends on the quality of our data. In this case, the original data only consisted of 180 samples, and the quality was not high enough. Thus, we need to utilize three functions: data cleaning, data labeling, and data augmentation. Data cleaning quickly removes issues like data gaps and garbled characters. For data labeling, we added more dimensions to essays, such as content depth and writing techniques evaluation. Data augmentation generates similar but non-repetitive data to expand the dataset. After augmentation, ModelBuilder provided us with 920 high-quality data samples.

Step two: Enter the model fine-tuning stage. First, we need to select a base model for fine-tuning; here we chose ERNIE Speed. Then, following the platform’s recommended values for a dataset of close to 1,000 samples, we configured 10 rounds of iteration and started training the model.

Step three: Deploy the model on the platform, and the entire fine-tuning process is completed.
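The data cleaning in step one, which removes data gaps and garbled characters, can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not ModelBuilder's actual cleaning logic: the `essay` and `comment` field names are hypothetical, and garbled text is detected only by the Unicode replacement character.

```python
from typing import Dict, List

REPLACEMENT_CHAR = "\ufffd"  # appears where text failed to decode


def clean_samples(samples: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop samples with missing fields or garbled characters; trim whitespace."""
    cleaned = []
    for sample in samples:
        essay = sample.get("essay", "").strip()
        comment = sample.get("comment", "").strip()
        if not essay or not comment:  # data gap
            continue
        if REPLACEMENT_CHAR in essay or REPLACEMENT_CHAR in comment:  # garbled text
            continue
        cleaned.append({"essay": essay, "comment": comment})
    return cleaned


# Hypothetical raw samples, for illustration only.
RAW = [
    {"essay": "Growing up means taking responsibility.", "comment": "Clear thesis."},
    {"essay": "", "comment": "Missing essay."},
    {"essay": "Gr\ufffdwing up", "comment": "Garbled."},
    {"essay": "  I look forward to the future.  ", "comment": " Good imagery. "},
]

print(clean_samples(RAW))
```

Labeling and augmentation would then run on the cleaned samples, so filtering first avoids multiplying bad data during augmentation.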

Let’s test the effect of the fine-tuned model with a high school essay titled “Looking Forward to Growing Up,” with no specific theme and a minimum of 800 words.

We input a student’s essay and the required correction format into the model before and after fine-tuning. Let’s compare the two models’ outputs with the teacher’s comments. The fine-tuned model produces more professional comments and follows the formatting requirements better. Essay correction that used to take teachers a long time can now be done in a few seconds with the help of large models, greatly improving the speed and efficiency of feedback to students.

Application of Fine-Tuned Models in the Education Industry

Fine-tuned models like these are now being employed by leading companies in various sectors of the education industry.

Earlier, I mentioned that most large AI native applications are usually MoE (Mixture of Experts). So, how can we combine these models to achieve the best value for money? Below, I’ll illustrate a case of MoE model routing using XiaoDu. Let’s first see if XiaoDu has become smarter with the support of large models:

Robin: XiaoDu, are you there?

XiaoDu: Yes, I’m here.

Robin: Any news today?

XiaoDu: XiaoDu provides 3-5 daily news briefs.

Robin: XiaoDu, XiaoDu. Heard you’ve learned new skills recently?

XiaoDu: Not only can I chat with you, but I can also serve as your AI scheduler.

Robin: Let me test you then. Help me schedule my day. After today’s activities, I’ll have dinner with Mr. Zhang, followed by a management meeting at the hotel, which should end by 10 PM. Tomorrow morning, I plan to jog and check the weather. Then, there’s a morning meeting at 8 AM. Please arrange my schedule.

XiaoDu: According to your requirements, here’s the optimal schedule:

  • You’ll have dinner with Mr. Zhang from 6:00 PM to 7:30 PM. After that, you’ll attend the management meeting at 8:00 PM, ending around 9:30 PM.
  • Tomorrow morning at 6:00 AM, your alarm will remind you to wake up for jogging. Keep the jogging time within an hour to ensure you’re not late for the 8:00 AM meeting.
  • The weather tomorrow morning will be sunny with a temperature of 25°C. I suggest wearing short sleeves and applying sunscreen when going out.
  • Can I be your secretary? I have more new skills to offer.

Robin: Sounds good, impressive! (thumbs up to XiaoDu) Sync these schedules to my phone, please.

Each question posed to XiaoDu today was delegated to a different model for execution. When calling an application’s API, the ERNIE Functions model was used. For teaching children, the Wenxin large model 3.5 or 4.0 was used. And for creating intelligent assistants, the ERNIE Character model was employed to enhance consistency and stimulate user engagement in conversations.
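The routing described above can be pictured as a dispatcher that inspects each request and picks a model. The sketch below is illustrative only: the model names come from the speech, but the keyword rules and the `route` function are hypothetical, and a production router such as the one in DuerOS X would presumably use a learned classifier rather than keyword matching.

```python
from enum import Enum


class Model(Enum):
    FUNCTIONS = "ERNIE Functions"  # tool and API calls
    CHARACTER = "ERNIE Character"  # persona-consistent conversation
    FLAGSHIP = "ERNIE 4.0"         # complex reasoning and teaching


# Hypothetical keyword rules, for illustration only.
TOOL_KEYWORDS = ("schedule", "alarm", "sync", "weather")
CHAT_KEYWORDS = ("chat", "are you there", "skills")


def route(request: str) -> Model:
    """Pick the cheapest model that is expected to handle the request well."""
    text = request.lower()
    if any(keyword in text for keyword in TOOL_KEYWORDS):
        return Model.FUNCTIONS
    if any(keyword in text for keyword in CHAT_KEYWORDS):
        return Model.CHARACTER
    return Model.FLAGSHIP  # fall back to the most capable model


for request in ("Help me schedule my day",
                "XiaoDu, are you there?",
                "Explain why the sky is blue"):
    print(request, "->", route(request).value)
```

Sending only the hard requests to the flagship model is what lets a routed system keep quality while cutting average latency and cost.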

Through this combination and scheduling of models of varying sizes, XiaoDu not only smoothly underwent a “brain replacement” operation, installing the brand-new AI native operating system DuerOS X, but also achieved the optimal combination of effectiveness, speed, and cost. Compared to using only the flagship version of the Wenxin large model, this approach doubled the response speed while reducing costs by 99%. The XiaoDu Tiantian AI tablet robot I conversed with just now was officially launched on major platforms yesterday. Interested friends can place orders immediately for a firsthand experience.
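The routing idea described above can be sketched in a few lines of Python. This is only an illustrative sketch, not how DuerOS X works: the model names come from the talk, but the intent labels, the keyword lists, and the functions `classify_intent` and `route` are hypothetical, and a real system would use a trained classifier rather than keyword matching.

```python
# Model names are taken from the talk; the mapping itself is an assumption.
# The talk says teaching uses Wenxin 3.5 or 4.0; ERNIE 3.5 is picked here arbitrarily.
ROUTES = {
    "tool_call": "ERNIE Functions",        # calling application APIs
    "teaching": "ERNIE 3.5",               # teaching children
    "assistant_chat": "ERNIE Character",   # persona-driven conversation
}

# Hypothetical keyword lists standing in for a real intent classifier.
INTENT_KEYWORDS = {
    "tool_call": ("schedule", "sync", "alarm", "weather"),
    "teaching": ("explain", "teach", "homework"),
}


def classify_intent(utterance: str) -> str:
    """Return the first intent whose keywords appear in the utterance."""
    text = utterance.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    # Anything unrecognized falls back to ordinary assistant chat.
    return "assistant_chat"


def route(utterance: str) -> str:
    """Pick the model that should answer this utterance."""
    return ROUTES[classify_intent(utterance)]


if __name__ == "__main__":
    for line in (
        "XiaoDu, are you there?",
        "Sync these schedules to my phone, please.",
        "Explain why the sky is blue",
    ):
        print(f"{line!r} -> {route(line)}")
```

The point of the design is that only requests that need a flagship model reach one; everything else goes to a smaller, cheaper specialist, which is where the speed and cost gains come from.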

These ModelBuilder examples demonstrate Baidu’s ability to produce models efficiently and inexpensively. Starting from the most powerful base model, Wenxin 4.0, we can tailor smaller models to each scenario by weighing effectiveness, response speed, and inference cost, with support for fine-tuning and post-pretraining. Models derived through this dimensionality reduction perform significantly better than open-source models of the same size used directly, and they cost noticeably less at the same level of performance. People used to think open source was cheaper, but in the context of large models, open source is the most expensive option. Therefore, open-source models will become increasingly outdated.

To facilitate rapid adoption, ModelBuilder comes preloaded with the most comprehensive set of large models. It includes ERNIE 3.5 and ERNIE 4.0, the flagship large models with powerful capabilities suited to complex scenarios. There are also three lightweight large models: ERNIE Speed, Lite, and Tiny. In addition, there are two models for vertical scenarios: ERNIE Character for role-playing, and ERNIE Functions for dialogue or question-answering scenarios that involve external tool use and business function calls. Of course, ModelBuilder also supports mainstream third-party models, both domestic and international. With 77 models in total, it is the platform with the largest number of large models in China.

The Second “Wenxin Cup” Entrepreneurship Competition Officially Launches: Offering a Special Award of 50 Million RMB

For developers, in addition to providing these development tools, we also offer support in terms of funds and resources.

In May last year, Baidu launched the “Wenxin Cup” Entrepreneurship Competition to build a more vibrant large model ecosystem and to help entrepreneurs and developers create AI-native applications. In the first “Wenxin Cup” Entrepreneurship Competition, we received nearly 1,000 applications from entrepreneurial teams. Baidu provided nearly 100 million RMB in investment support to 15 outstanding teams and continued to provide comprehensive support in technology, team, and resources.

Today, I am pleased to announce that the second “Wenxin Cup” Entrepreneurship Competition is officially launched. This time, we will expand the scope of project selection, set up sub-competitions, and recruit entrepreneurial and innovative teams globally and from universities. As long as your entrepreneurial direction is AI-native applications, you can sign up on the competition’s official website. We have also increased our support for entrepreneurs, with more investment funds and richer business resources, and for the first time we have set up a “Special Award.” Exceptional projects will have the opportunity to receive up to 50 million RMB in cash and resource support.

Chinese entrepreneurs and developers are very good at using new technologies to develop applications. I believe that Wenxin large models will become the first choice for Chinese AI entrepreneurs and developers, and more and more applications will be built on Wenxin large models. I also look forward to more entrepreneurs and developers joining us in building a prosperous AI ecosystem.

The biggest application scenario for visual large models is autonomous driving

Most of the tools and cases we’ve discussed so far are based on large language models. Looking to the future, I believe multimodal large models, which integrate text, images, speech, video, and other modalities, are a very important long-term direction for base models and a necessary path to AGI (Artificial General Intelligence). Baidu has been investing in these areas for a long time and will share its technological progress in a timely manner.

Based on data from testing over 100 million kilometers of complex urban roads in China, Baidu has developed the Apollo Vision Perception Large Model. It possesses four core capabilities: detection, tracking, understanding, and mapping. This enables Baidu to offer a more intelligent, adaptable, and safer autonomous driving solution.

Baidu Maps has also pioneered the application of the Vision Perception Large Model in the field of cartography. Currently, the world’s largest-scale lane-level map data has been launched in 360 cities nationwide. Wherever Baidu Maps can navigate, autonomous driving can go.

“Take it, use it anytime”

Just now, I showed you the Baidu Wenxin Large Model Series and three development tools—AgentBuilder, AppBuilder, ModelBuilder. Together, they form a toolbox that you can take with you right away and use as needed.

Standing here at this moment, I am also a developer and an entrepreneur, and like all of you, I feel excited. Today’s China has 1 billion internet users, robust foundational large models, numerous AI application scenarios, the most comprehensive industrial system in the world, and strong national encouragement and support for the “artificial intelligence+” initiative. Every individual and every business only needs to make full use of these tools to unleash infinite creativity and productivity.

Today, everyone can be a developer.
