Luo Fuli, head of Xiaomi’s MiMo large model team, said Anthropic’s Claude Fable 5 remains an interim-stage product, even as the model has drawn attention for its coding and agent capabilities.

Luo made the comments while discussing Fable 5 at the eighth Beijing Academy of Artificial Intelligence (BAAI) Conference, held on June 12.

Anthropic launched Claude Fable 5 on June 9, describing it as its most capable generally available artificial intelligence model and saying it was built for demanding reasoning and long-horizon agentic work. The company said the model showed state-of-the-art performance on nearly all tested benchmarks, with particular strength in software engineering, knowledge work, vision, scientific research, and other complex tasks.

In one example cited by Anthropic, Stripe reported that Fable 5 completed a migration across a 50 million-line Ruby codebase in one day, a project that would otherwise have taken a full team more than two months to do by hand.

Luo said Fable 5 marks a substantial improvement in coding and agent capabilities, but she argued that it is still best understood as the result of continued scaling across three dimensions:

  • The first is parameter scale. Luo said she speculates that Fable 5’s parameter count may be several times larger than that of today’s strongest open-source models, reflecting further scale-up during pretraining.
  • The second is computing power. She said the model appears to have required significant compute for test-time scaling and reinforcement learning.
  • The third is data. As AI moves from the chat era into the agent era, Luo said training data is also changing. Model training is expanding beyond internet text to synthetic data generated jointly by humans and agents, pushing data scale into a new order of magnitude. In the past, she said, the full corpus of internet text data may have involved 40–80 trillion unique tokens. Today, that scale has entered a new phase.

“Fable 5 is the product of a very natural and outward extension of large models across three dimensions: parameter scale during pretraining, data, including agent-generated synthetic data, and the combination of post-training and reinforcement learning. It is an interim model,” Luo said.

As foundation models and agent technologies advance, self-evolution has become a recurring topic in the industry. Anthropic recently published a post titled “When AI builds itself,” outlining recursive self-improvement, or the possibility that AI systems could eventually automate the design and development of next-generation AI systems. Anthropic said that outcome is not here and is not inevitable, but could arrive sooner than institutions are prepared for.

Asked how she views model self-improvement, Luo said today’s leading models can already solve some abstract problems at the execution layer. By contrast, the previous generation of leading models could execute well only when human instructions were clear.

She pointed to how models are being used in research workflows. A complete research process involves proposing hypotheses, designing experiments, carrying them out, validating results, communicating with peers, and revising one’s views. Today’s large models, Luo said, can already design reasonable validation metrics, verify the accuracy of their own execution results, and plan the overall experimental process.

The main gap between models and top researchers, she said, lies in the ability to propose hypotheses or raise questions worth testing. That ability amounts to research judgment, including knowing when to stop pursuing unproductive directions. Luo said stronger models, combined with better recursive improvement systems, are gradually narrowing that gap at the research frontier.

If AI’s capacity for self-improvement continues to strengthen, the next question is how it will reshape the world. Will it first rebuild the digital world before entering the physical world, or will the physical world change how foundation models themselves are understood?

Luo said language models and world models are advancing in parallel, but language models are moving faster for now. She said their current capabilities are better suited to recreating the conditions in which intelligence emerges from agent environments.

“We use a set of agent systems that can drive models to reach a higher ceiling, then layer models on top of them and let them freely explore in an environment. This allows us to design a more precise incentive system to drive models’ self-improvement,” she said. This path, she added, is already unfolding in the digital world.

In Luo’s view, the world model layer has yet to produce a highly efficient vision model. The first step should be an efficient generator architecture, followed by a scaffolded agent system capable of handling complex real-world tasks. The full paradigm then needs to be scaled. Language models, she said, are likely to clarify this path first.

This article was adapted from a feature originally written by SY and published on IPO Zaozhidao. KrASIA is authorized to translate, adapt, and publish its contents.