Top LLMs in China and the U.S. Only 5 Months Apart: Kai-Fu Lee

Photo by TMTPost App Editor Lin Zhijia

"Recently, there were claims that several of the six major Chinese large language model companies have abandoned pre-training. We officially refute this. Zero One Infinity will never abandon pre-training, and our pre-training is both fast and excellent," said Kai-Fu Lee,the founder of One Infinity, on Wednesday.

Lee made the comments at the launch of the company's new flagship pre-trained model, Yi-Lightning (the "Lightning Model").

He also called Yi-Lightning a "top model at a bargain price," with advantages in both inference speed and cost: its maximum generation speed is up by nearly 40%, it costs only 0.99 yuan per million tokens, and that price still yields a profit.
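
To make the quoted price concrete, here is a minimal back-of-envelope sketch in Python. The workload figures (requests per day, tokens per request) are hypothetical; only the 0.99 yuan per million tokens comes from the article.

```python
# Cost sketch at Yi-Lightning's quoted price of 0.99 yuan per million tokens.
# The workload numbers below are hypothetical, for illustration only.
PRICE_PER_MILLION_TOKENS_YUAN = 0.99

def api_cost_yuan(total_tokens: int) -> float:
    """Cost in yuan for a given total number of tokens (prompt + completion)."""
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS_YUAN

# Example: 500 requests a day, ~2,000 tokens each, for 30 days.
tokens_per_month = 500 * 2_000 * 30          # 30 million tokens
print(f"{api_cost_yuan(tokens_per_month):.2f} yuan/month")  # -> 29.70 yuan/month
```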

"It is the first Chinese model to achieve a very high ranking on an international authoritative list, surpassing most American major models, and becoming the first to exceed the global leading OpenAI GPT-4o. The Yi-Lightning Lightning Model not only has world-class model performance and very fast inference but also comes at a very low price, making it very suitable for both app calls and enterprise application scenarios," stated Kai-Fu Lee.

During the post-event discussion, Lee told TMTPost that many people ask whether China's pre-training lags behind the United States. "We candidly admit that China is behind the U.S., but some say it's ten or twenty years behind. Today we can calculate with very precise numbers: the GPT-4o model OpenAI released in May has been surpassed by us as of October. If we really work out how far China is behind the U.S., 01.AI at least is only five months behind OpenAI's model," said Lee.

01.AI was founded on May 16, 2023, with the goal of building an AI 2.0 platform and a globally oriented, AI-first productivity application company. Its founder is Kai-Fu Lee, chairman and CEO of Innovation Works, who also serves as CEO of 01.AI. Core team members come from companies such as Google, Microsoft, IBM, and Baidu.

In terms of financing, 01.AI has so far completed three rounds, including a $250 million (approximately 1.808 billion RMB) pre-A round led by Alibaba, at a valuation exceeding $1 billion, making it a unicorn.

On the product and commercialization front, 01.AI focuses on the overseas To C (consumer-level) paid market and the domestic To B (enterprise-level) paid market.

Regarding its B-end and C-end strategy, Lee said it is challenging for a large model company to pursue To B and To C simultaneously, since the teams have different DNA, ways of operating, and KPI measures, and therefore need different management approaches. In the To B field, 01.AI is currently focused on the domestic market, for example digital human solutions in the catering sector; in the To C field it is mainly targeting overseas markets, because traffic costs for domestic To C products are significant and demand cautious judgment in the current, challenging environment.

Lee said, "For to C, we mainly focus on overseas markets for several reasons: When we started working on Zero One Ten Thousand, there was no suitable Chinese model domestically, so we had to first experiment abroad. After some time, we gained insights and iterated one, two, three products. Some of these products are performing very well, while others are not as successful, and we are continuously adjusting. We are also observing when it would be suitable to launch certain products domestically. The cost of traffic for domestic to C products is increasing, and we have noticed that some competitors have seen their user acquisition costs rise from over ten yuan to more than thirty yuan, with significant user attrition recently. In such a challenging environment, we will be very cautious and will not launch new to C applications in China for now. We will continue to maintain our existing products, but more of our efforts will be focused on acquiring high-quality users at a lower cost abroad, or directly selling the app and charging users through subions. This subion habit is relatively well-established overseas. These are the main reasons. The biggest reason at present is that for to C products abroad, we can balance monetization capabilities and user growth costs. We will focus on domestic opportunities when they arise."

Currently, 01.AI has once again chosen to keep optimizing its pre-trained models, and for the first time at the event it unveiled its first industry application product under the new To B strategy, the AI 2.0 Digital Human, focused on domestic retail and e-commerce To B scenarios.

"The generation of responses relies on our Yi-Lightning large model, with a certain hospitality and tourism company's GMV sales soaring by 170%. I believe that rather than selling models or using models for customer service applications, the best approach is for a large model company to understand the complete user needs and create an end-to-end comprehensive solution. This allows companies purchasing large models and digital humans to immediately see profits with each use, earning more the more they use it, which in turn increases their willingness to pay us," Lee said. Zero One Ten Thousand's choice to focus on domestic to B is because they have found some breakthrough areas, such as using digital humans for retail and catering, forming a complete solution. There are also two or three other fields they are starting to explore, but it is not convenient to disclose them at the moment.

Lee said, "Globally speaking, to B suppliers are basically local. Establishing subsidiaries for to B operations across countries is not something we or other startups can do, so we have abandoned to B overseas. For to B, we focus on domestic operations, creating profitable solutions rather than just selling models or doing project-based work. This is our approach to B."

Regarding cost, Lee said that 01.AI's pre-training run used 2,000 GPUs, took one and a half months, and cost over three million dollars, roughly 1%-2% of what Grok's training cost.
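
As a rough sanity check on those figures, here is a sketch assuming "one and a half months" means about 45 days; the per-GPU-hour rate is simply implied by the quoted numbers, not a reported figure.

```python
# Back-of-envelope check of the quoted pre-training run:
# 2,000 GPUs for roughly 45 days, at a total cost of over $3 million.
gpus = 2_000
days = 45
gpu_hours = gpus * days * 24                 # 2,160,000 GPU-hours

quoted_cost_usd = 3_000_000                  # "over three million dollars"
implied_rate = quoted_cost_usd / gpu_hours   # ~$1.39 per GPU-hour (implied, illustrative)
print(f"{gpu_hours:,} GPU-hours, implied ~${implied_rate:.2f}/GPU-hour")
```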

Lee believes OpenAI is a very impressive company. Although the newly released OpenAI o1 hides all of its intermediate thinking states, o1 has extended the scaling trend from pre-training to inference-time reasoning, changing the industry's perception and making it realize that post-training SFT (supervised fine-tuning) and reinforcement learning are very important. "I believe many companies in China and the US are racing toward the o1 direction," said Lee.

Speaking about the industry's future, Lee emphasized that companies will not abandon pre-training, but that it is a technical undertaking requiring an understanding of chips, inference, models, and algorithms. "If a company has enough excellent talent and can collaborate across these fields, I believe China can produce a general pre-trained model that ranks in the global top ten. However, because the cost is relatively high, there may be fewer and fewer large model companies doing pre-training in the future."

"For China's six large model companies, as long as they have enough good talent and the determination to do pre-training, financing and chips will not be a problem," said Lee.

The following is part of the conversation between Kai-Fu Lee and TMTPost:

Q: How did you manage to make the model "sixth in the world, first in China" while bringing it to market at a relatively low price?

Kai-Fu Lee: 01.AI did not lose money on Yi-Lightning's pricing.

From day one, 01.AI built up three teams in parallel: model training, AI Infra, and AI applications. Once these teams matured, they were integrated. 01.AI summarizes this model as two major strategies: Model-Based Co-Building and Model-Application Integration. AI Infra supports both model training and inference, producing high-performance models at lower training cost and supporting application-layer exploration at lower inference cost.

We won't sell models at a loss, nor will we make a lot of money. Instead, we add a small profit margin on top of the cost line, resulting in today's price of 0.99 yuan per million tokens.

The most important aspect of selecting a model API is that the model's performance must be excellent. Only under this premise should you choose the cheapest option. I believe that, considering the quality and price of the Yi-Lightning model, Yi-Lightning is likely to be the most recognized and cost-effective model for many developers.

Q: There have been reports that some of the "Six Little Tigers" of large models (01.AI, Zhipu AI, Baichuan, MiniMax, Moonshot AI, and StepFun) have abandoned pre-training. From an industry perspective, will gradually abandoning model pre-training become an overall industry trend?

Kai-Fu Lee: Creating a good pre-trained model is a technical undertaking that requires many talented people working together and takes time to produce good results. It needs people who understand chips, inference, infrastructure, and models, plus excellent algorithm colleagues, all working together.

If a company is fortunate enough to have such outstanding talent and can collaborate across fields, I believe China can definitely produce a top ten globally ranked general pre-trained model. However, not every company can do this, and the cost of doing so is relatively high. In the future, there may be fewer and fewer large model companies doing pre-training.

However, as far as I know, the funding levels of these six companies are sufficient. Our production run for pre-training costs three to four million dollars per training session, and the leading companies can afford this. I believe that as long as China's six large model companies have enough talent and the determination to do pre-training, funding and chips will not be an issue.

Q: 01.AI has announced its To B product matrix for the first time. Will it deepen its efforts in the To B direction going forward?

Kai-Fu Lee: In China, the approach to large model To B is different from the AI 1.0 era. The primary task is to find a few offerings that can charge by usage rather than as customized projects, and to take only orders with relatively high profit margins.

The AI 2.0 digital human solution 01.AI launched today does not lose money on each order. It focuses on major pain points and profit levers for customers, such as the time a store manager or KOL ties up during a live broadcast. Even if an hour of live streaming earns a thousand yuan, it is still just a thousand yuan. A digital human, however, is not limited to one hour; it could stream a thousand hours. Even if each hour earns only half as much, a thousand hours still earns five hundred times the money, so the math is very straightforward.
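
As a minimal worked version of that arithmetic, using only the figures in the answer above:

```python
# The live-streaming comparison from the answer above, worked through.
human_hours = 1                       # a human host streams for one hour
human_yuan_per_hour = 1_000
human_total = human_hours * human_yuan_per_hour            # 1,000 yuan

digital_hours = 1_000                 # a digital human is not limited to one hour
digital_yuan_per_hour = human_yuan_per_hour / 2            # "half the money" per hour
digital_total = digital_hours * digital_yuan_per_hour      # 500,000 yuan

print(digital_total / human_total)    # 500.0 -> "five hundred times the money"
```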

If digital humans can truly be made end-to-end, where you just feed in internal company information, select an avatar and a voice, and press a button to start broadcasting, it is like selling the enterprise a money-printing machine, and charging a rental fee for that machine is feasible. Beyond live streaming, our AI 2.0 digital human solution has already been applied to more scenarios, such as AI companions, IP characters, and office meetings.

Overall, 01.AI's To B business will adopt a "horizontal and vertical" strategy: first penetrate a single industry deeply, then, based on its technical capabilities and accumulated industry experience, refine standardized To B solutions that improve efficiency for enterprise clients across industries.

Q: Besides the digital human solution, does 01.AI have other To B solutions?

Kai-Fu Lee: In addition to the AI 2.0 digital humans and the API we have already released, 01.AI has other To B businesses, such as AI Infra solutions and privately customized models. We will officially announce these soon, so please stay tuned.

Q: 01.AI has launched To C products overseas and is gradually rolling out To B products domestically. What is the current status of the B-end and C-end products?

Kai-Fu Lee: It is quite challenging for a large model company to pursue To B and To C at the same time. The sales methods, profit ratios, and evaluation systems, such as how much investment it takes to generate revenue, are completely different. Different management approaches are also required, because the teams have different DNA, ways of operating, and KPI measures. I have experience in both fields and am trying to do both, but it is definitely not possible to do everything.

For To B, 01.AI chose to focus on the domestic market because we found some breakthrough opportunities, such as using digital humans for retail and catering, where we can provide a complete solution. There are also two or three other fields we are starting to explore, but it is not convenient to disclose them now. We are not targeting the overseas To B market because, globally, To B suppliers are mostly local. Doing To B domestically also means choosing profitable solutions rather than just selling models or working project by project; that is our approach to To B.

For To C, we mainly focus on overseas markets. When we started 01.AI, there was no suitable Chinese model domestically, so we had to try abroad first, iterating through one, two, three products. Some of these products are performing well, others less so, and we are continuously adjusting.

We are also watching for when it would be appropriate to launch certain products in the domestic market. A major issue for domestic To C products right now is the rising cost of traffic: we have seen some competitors' user acquisition costs climb from a dozen RMB to over thirty RMB, with significant user attrition recently. In such an environment we will be very cautious. We will not launch new To C applications in China for now, while continuing to maintain existing products. More of our effort will go toward acquiring high-quality users at lower cost overseas, or selling the app directly to users for subscription fees, as the subscription habit is relatively mature there.

In other words, the current situation favors developing To C products overseas, where monetization capability and the cost of user growth are manageable. We will look for opportunities to launch in the domestic market later.

Q: In May this year, Yi-Large reduced the time gap between top models in China and the US to six months. This time, Yi-Lightning's release directly surpassed GPT-4o, further shortening the gap to five months. How do you think the gap between large models in China and the US can be further reduced in the future?

Kai-Fu Lee: Closing the gap further is very difficult, and I do not predict that we can shorten it. After all, others are training on 100,000 GPUs while we are using 2,000.

Yi-Lightning's strength comes from a passionate, capable team and community that use and learn from each other's work, combined with our own R&D strengths in areas such as data processing and training optimization; this methodology has matured at 01.AI. We are confident in adding our own innovations while keeping an eye on new technologies released by OpenAI and other companies, quickly grasping what matters most about them and integrating those capabilities into our own models and products.

I think keeping the gap to about six months with this approach is already a good result. If we hope for a breakthrough, it would probably take an unprecedented algorithm to have a chance. We should not see being six months behind as shameful or as something that must be closed at all costs, because many of my overseas friends believed China would fall far behind, with others using 100,000 GPUs, and that we might be left three, five, or even ten years behind. 01.AI has now shown that we will not fall that far behind, and this time other Chinese companies also performed well on the LMSYS leaderboard.
