A Silicon Valley AI Researcher’s Ten Days in China: Chinese Labs Are More Humble, the U.S. Is Full of Zero-Sum Games

Overview

This article records the author’s first-hand observations from visiting top Chinese AI labs. It compares the cultural, operational, and ecosystem differences between Chinese and American AI R&D, analyzes the distinctive strengths and growth potential of China’s AI industry, and challenges many outside stereotypes about AI development in China. It is well worth reading for anyone interested in the global AI competitive landscape.

Staring out the window of a new high-speed train from Hangzhou to Shanghai, I’m gifted with views of dramatic ridgelines speckled with wind turbines, silhouetted against the setting sun. The mountains form a backdrop to a mix of sprawling fields and clustered skyscrapers. I’m returning from China with great humility. It’s a very warming, human experience to go somewhere so foreign and be so welcomed. I had the honor of meeting so many people in the AI ecosystem whom I knew from afar, and they greeted me with big smiles and cheer, reminding me how global my work and the AI ecosystem are.

The Chinese companies building language models are set up as the perfect fast-followers for the technology, building on long-standing cultural traditions in education and work, along with subtly different approaches to building technology companies. When you look at the outputs (the latest, biggest models enabling agentic workflows) and the ingredients (excellent scientists, large-scale data, and accelerated computing), the Chinese and American labs look largely similar. The lasting differences emerge in how these are organized and conditioned.

I’ve long thought that one reason the Chinese labs are so good at catching up and keeping up with the frontier is that they’re culturally aligned for this task, but without talking to people directly, I felt like it wasn’t my place to attribute substantial influence to this hunch. Speaking with many wonderful, humble, and open scientists at the leading Chinese labs has crystallized a lot of my beliefs.

So much of building the best LLMs today comes down to meticulous work across the entire stack, from data to architecture details and RL algorithm implementations. Every part of the model can yield some improvement, and fitting the parts together is a complex process in which the work of some brilliant individuals needs to get shelved so that the overall model does best on a multi-objective optimization.

While American researchers are obviously also brilliant at solving the individual components, there’s more of a culture of speaking up for yourself in the U.S. As a scientist, you’re more successful when you speak up for your work, and modern culture is pushing the new path to fame of “leading AI scientists”. This results in direct conflict. The Llama organization is heavily rumored to have collapsed under the political weight of these interests embedding themselves in a hierarchical organization. I’ve heard of other labs saying that it is sometimes necessary to pay off a top researcher to get them to stop complaining about their idea not making it into the final model. Whether or not that’s exactly true, the idea is clear. Ego and the desire for career advancement really can get in the way of building the best model. Small cultural differences of this kind between China and the U.S. can have a very large effect on the final output.

Some of this has to do with who is building the models in China. There’s an immediate reality at all of the labs that a large proportion of the core contributors are active students. The labs are quite young, and it reminds me of our setup at Ai2, where students are seen as peers and directly integrated into the LLM team. This is incredibly different from the top labs in the US, where the likes of OpenAI, Anthropic, Cursor, etc. simply don’t offer internships. Other companies like Google nominally have internships related to Gemini, but many people worry that such an internship would be siloed, with no access to the real core work.

The thing that makes building an AI model today so interesting is that it’s not just about getting a group of great researchers together in one building to produce an engineering marvel. It used to be this, but to sustain AI businesses, working on LLMs is becoming a mix of building, deploying, funding, and getting adoption for this creation. The leading AI companies exist in complex ecosystems that supply money, compute, data, and more in order to keep pushing the frontier.

I’ve documented the biggest “AI Industry” level take-aways from talking to these labs:

First, early signs of domestic AI demand have emerged. A widely circulated view holds that Chinese companies are not used to paying for software, so the AI market will be limited in size. But this applies only to SaaS-style software, which has always been a small market in China, whereas China’s cloud computing market is actually very large. The prevailing judgment in the industry now is that AI spending will behave more like cloud infrastructure spending, so no one is worried about the market’s future growth.

Second, most developers are very fond of Claude. Even though Claude is officially banned in China, most Chinese AI developers are obsessed with it, and it has changed the way they build software. Although some people use domestic tools such as the Kimi or GLM command-line tools, almost everyone mentioned using Claude for development. Surprisingly, Codex, which is becoming more and more popular in the Bay Area, is rarely mentioned here.

Third, Chinese companies have a strong mentality of owning their own technology. Many companies are willing to invest in building their own large models not because of any master plan, but because they believe large models will be the core of future technology products, and that owning a model strengthens their own technology stack. The “open source first” philosophy behind many domestic models is also very pragmatic: open-sourcing brings in a lot of feedback from the community, gives back to the open-source ecosystem, and also helps advance the company’s own mission.

Fourth, government support exists, but its scale is not clear. It is often said that the Chinese government is strongly backing the race to build open-source large models. In practice, the support shows up more as reduced bureaucratic procedures, such as permits. There is no sign that the top levels of government are interfering with technical decisions about the models, and there is no clear evidence for judging how much government support can affect the trajectory of AI development.

Fifth, the data industry is still underdeveloped. Unlike American labs, which spend hundreds of millions of dollars a year buying data and training environments, Chinese labs generally feel that the quality of domestic data services is low, so they prefer to build training environments and datasets themselves. Researchers spend a lot of time building RL training environments, and large companies like ByteDance and Alibaba have their own internal data labeling teams.

Sixth, there is strong demand for Nvidia chips. Nvidia compute is still the gold standard for training, and almost every lab’s progress is limited by a shortage of Nvidia chips. If supply were sufficient, they would definitely buy more. For inference, domestic accelerators, with Huawei as the leading example, are well regarded, and many labs are already using Huawei chips.

I knew so little about China going into the trip and came out with the feeling of just starting to learn. China isn’t a place that can be expressed by rules or recipes, but one with very different dynamics and chemistry. The culture is so old, so deep, and still completely intertwined with how domestic technology is built. I have much more learning ahead.


Source: https://www.interconnects.ai/p/notes-from-inside-chinas-ai-labs