StepFun’s Series B+ funding round, exceeding RMB 5 billion (USD 700 million), has drawn attention for its size. The investor mix and the speed at which the round closed are equally revealing. From launch to completion, the process took roughly six months.
Following Hong Kong IPOs by Z.ai (Zhipu AI) and MiniMax, as well as Moonshot AI’s recent fundraising announcement, StepFun has extended the capital market’s artificial intelligence narrative into the new year.
Long known for maintaining a low profile, StepFun is now stepping into public view. Beyond the size of the raise, another significant development was the formal appointment of Yin Qi as chairman.
Yin, chairman of Afari and co-founder and former CEO of Megvii, previously appeared in StepFun’s orbit as a strategic partner from the automotive sector. His move into the chair role places him at the center of the company’s decision-making. With Yin, StepFun gains a spokesperson with greater visibility across technology, capital, and industrial circles. His assumption of strategic control also means that the technical systems, operational experience, and commercial resources he built at Megvii and Afari will now shape StepFun as it enters a more decisive phase.
Why investors are backing AI again
After Z.ai and MiniMax went public, a view briefly circulated that the private market could no longer support AI model developers, and that IPOs would become the dividing line between winners and laggards.
Moonshot’s and StepFun’s consecutive large rounds have pushed back against that assumption. For both investors and founders, the signal is cautiously optimistic: the private market still has the capacity, and the willingness, to support foundation model development. Investors remain prepared to back base models, though with higher standards.
At the same time, aggressive investments by ByteDance, Alibaba, and Tencent, combined with uncertainty around DeepSeek’s next model release, have tightened expectations. According to one industry insider, investors are now focused on whether independent foundation models still have room to survive, whether their commercialization strategies are truly differentiated, and whether those strategies can hold up under competition.
Against that backdrop, StepFun’s latest financing carries added weight. To understand the round, it helps to examine the two areas private investors care about most today: technology and commercialization.
On the technical side, StepFun is the only company among the so-called “six tigers” that has consistently focused on multimodality. On the commercial side, its differentiation lies in its push into physical-world terminals.
As one StepFun executive told 36Kr:
“Others want to be China’s OpenAI or Anthropic. StepFun is closer to being China’s xAI plus Tesla.”
This positioning has shaped StepFun’s capital structure, which balances US dollar funds, state-backed capital, and industrial investors. Among its peers, StepFun emerged later and maintained the lowest profile. Yet it is also one of the few companies to draw backing from all three capital types, despite being a notably non-consensus player.
Choosing a different path
Founded in April 2023, StepFun did not gain visibility in the sector until nearly a year later. By then, private-market attention and capital had largely been absorbed by earlier entrants.
At the time, the most sought-after target was Light Year (Guangnian Zhiwai), founded by Wang Huiwen after his return to the technology sector. The former Meituan co-founder’s reentry quickly attracted more than a dozen top-tier institutions and individual investors, drawing in much of the available capital. AI companies that emerged afterward faced a more difficult fundraising environment.
StepFun’s first round therefore relied heavily on the personal networks of its co-founders, drawing investors including HongShan, Qiming Venture Partners, and IDG Capital. Internally, the company viewed the delay as intentional. StepFun CEO Jiang Daxin later told 36Kr that the perceived lateness reflected preparation rather than hesitation.
“We wanted to accumulate enough capacity to build trillion-parameter models,” Jiang said.
That preparation included computing power, algorithms, data, and system-level capabilities that were often overlooked at the time. StepFun recruited Zhu Yibo, formerly of ByteDance and experienced in operating large-scale clusters, to lead systems construction. The team also mapped out a clear development sequence: unimodal models, multimodal models, unified multimodal understanding and generation, world models, and ultimately artificial general intelligence (AGI). That roadmap, Jiang said, was not subject to change.
Despite launching during an unfavorable fundraising window, StepFun became the only company among the “six tigers” to reach unicorn status in its first round. Within two months, it trained its first 100-billion-parameter model, Step 1, on its first attempt. That milestone helped anchor its second round, which drew a broader investor base, including USD-denominated funds managed by 5Y Capital and Shunwei Capital, as well as strategic investors such as Lenovo Capital.
Resisting the traffic race
After a turbulent start, the sector entered a period of rapid acceleration. Consensus and fear of missing out formed quickly around a handful of breakout products, only to dissipate just as quickly.
The peak came in 2024, when AI companies aggressively purchased traffic. Quarterly marketing budgets reaching hundreds of millions of RMB fueled Moonshot’s Kimi and ByteDance’s Doubao. Traffic brought visibility and data, but also capital. In early 2024, Moonshot raised more than USD 1 billion led by Alibaba, pushing its valuation to USD 2.5 billion. Additional rounds followed, ultimately giving Moonshot the highest valuation among the “six tigers.”
StepFun’s Series B round, launched in mid-2024, proved its most difficult. Investors repeatedly asked why the company avoided traffic purchases. Internally, the team debated whether to allocate funds toward growth.
According to StepFun, it declined based on first-principles reasoning. Increased usage would have amplified losses, consumer data offered limited value for model training, and traffic-driven growth did not guarantee retention. The company instead used limited resources to test consumer products while remaining focused on foundation models, particularly multimodality.
The team’s view was that reaching artificial general intelligence (AGI) requires models not only to think, but also to see, hear, perceive, and understand the physical world. In 2024, StepFun set its goal as moving from native multimodality to unified multimodal understanding and generation, a path with longer R&D cycles and fewer short-term rewards.
Series B investors were therefore betting on conviction. According to one investment manager, confidence was bolstered by StepFun’s July 2024 releases, including the trillion-parameter language model Step 2 and the multimodal understanding model Step 1.5V. Investors included Shanghai State-owned Capital Investment and affiliated funds, Hong Kong Investment Corporation, Tencent, and returning shareholders.
“StepFun isn’t aggressive enough in some areas and may miss certain windows,” one executive said. “But time will prove that steadiness is a real advantage.”
A year later, the Series B+ round, launched in July 2025, rewarded that conviction.
A tightening market
After several years of experimentation underwritten by investors, AI has moved beyond its exhibition phase. Both private and public markets now demand measurable results.
By early 2026, signs of consolidation were already visible. Z.ai and MiniMax listed in Hong Kong, while StepFun and Moonshot secured large financing rounds. In the view of some investors, these four now form a group with sufficient capital to remain competitive.
For StepFun, the shift has also prompted a more public posture. Since 2025, several senior leaders have stepped into the spotlight. The company’s technical leadership includes Jiang Daxin, previously involved in products such as Bing and Microsoft 365; CTO Zhu Yibo, formerly head of AI infrastructure at ByteDance; and chief scientist Zhang Xiangyu, a core author of ResNet.
With Yin formally in place, StepFun appears positioned for a longer, more disciplined run, pairing technical depth with an operator who has navigated prior industry cycles.
Commercial performance will be decisive. Most remaining players have adopted conservative strategies modeled on OpenAI and Anthropic. Z.ai focuses on enterprise clients, while MiniMax and Moonshot emphasize consumer software.
Consumer subscriptions and enterprise AI sales remain the preferred approaches, which makes StepFun’s focus on consumer terminal devices distinct. According to people familiar with the company cited by 36Kr, StepFun ruled out mainstream models one by one. Consumer subscriptions face weak willingness to pay in China. Enterprise APIs risk price competition with cloud providers. Customization depends heavily on relationships and is difficult to scale.
What remained were on-device deployment and outcome-based pricing. Both require strong edge-cloud coordination, a full multimodal technology stack, and stable model performance. These demands align with StepFun’s long-term focus, though terminal-side deployment in China remains challenging due to deep operating system integration requirements.
StepFun has pursued this path through close co-development with partners including Geely and Oppo. By the end of 2025, terminal API calls had reportedly grown by nearly 170% for three consecutive quarters. Its partners, Oppo among them, now include roughly 60% of China’s leading smartphone brands, with more than 42 million devices shipped and nearly 20 million daily users served. In automotive applications, StepFun is targeting integration with one million vehicles this year.
Its fundraising strategy reflects the same logic. StepFun has sought capital that brings strategic value alongside funding, drawing long-term financial investors, industrial partners, and market-oriented state-backed investors.
After a year of consolidation, investors are practicing pragmatic idealism. While belief in AGI persists, fewer companies can withstand heightened scrutiny. For some, physical AI represents a credible direction. After all, large models enabling natural-language interaction could reshape hardware terminals into new traffic gateways.
It remains a non-consensus frontier, but one that StepFun has already begun to explore through customer collaboration and hardware experimentation.
KrASIA Connection features translated and adapted content that was originally published by 36Kr. This article was written by Xiao Xi for 36Kr.