When generating cinematic, dynamic stories from a single sentence or a sketch is no longer a dream, what transformation will the entire visual content industry undergo?
Alibaba's Wan series is redefining the boundaries of AI video at an astonishing pace. The four video generation models launched in quick succession in late July 2025 signaled that large-model technology has entered an explosive phase of "weekly-level iteration." Just over a month after Wan2.2 was open-sourced, Alibaba unveiled the Wan2.5-preview series at the September 2025 Apsara Conference.
If this cadence holds, industry observers widely expect Wan2.6 to debut soon, once again delivering an industry-shaping technological breakthrough.
Since its inception, Alibaba's Wan series has advanced along a trajectory of rapid iteration. In February 2025, Alibaba released the open-source video generation model Wan2.1, which quickly garnered widespread attention from the global technical community.
By July, Wan2.2 was officially open-sourced and fully integrated into the Tongyi APP, achieving significant improvements in key dimensions such as facial expressions, multi-person interactions, and complex movements.
Just one month later, on August 26, Alibaba further open-sourced the multimodal video generation model "Wan2.2-S2V," which can generate cinematic digital human videos from just a single static image and an audio clip.
Meanwhile, Wan2.5-preview was showcased at the September Apsara Conference, supporting video generation of up to 10 seconds with qualitative leaps in motion naturalness, scene coherence, and lighting realism.
At this iteration speed, the launch of Wan2.6 appears imminent, and it is expected to bring further breakthroughs across multiple dimensions.
The performance improvement of the Wan series follows a clear trajectory. The initial Wan2.1 already demonstrated robust capabilities: its 14B-parameter version was trained on a massive dataset of billions of images and videos, achieving excellent results across multiple benchmarks.
With the launch of Wan2.2, the model's capabilities were further expanded. This version introduced a mixture-of-experts (MoE) architecture for its flagship models, alongside a 5-billion-parameter text/image-to-video model that can run on consumer-grade GPUs.
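By public accounts, Wan2.2's MoE splits the denoising schedule between two experts rather than routing individual tokens: a high-noise expert handles early steps (overall layout) and a low-noise expert handles late steps (fine detail), so only one expert's weights are active at any step. A minimal PyTorch sketch of that routing idea follows; the class and parameter names are illustrative, not the actual Wan implementation.

```python
import torch
import torch.nn as nn

class TimestepRoutedMoE(nn.Module):
    """Illustrative two-expert diffusion backbone: the experts divide
    the denoising schedule, so total capacity is the sum of both
    experts while per-step compute is that of a single expert."""

    def __init__(self, high_noise_expert: nn.Module,
                 low_noise_expert: nn.Module, switch_point: float = 0.5):
        super().__init__()
        self.high_noise_expert = high_noise_expert  # early steps: global layout
        self.low_noise_expert = low_noise_expert    # late steps: fine detail
        self.switch_point = switch_point  # normalized timestep where routing flips

    def forward(self, latents: torch.Tensor, t: float,
                cond: torch.Tensor) -> torch.Tensor:
        # Every latent in a sampling step shares the same timestep, so the
        # router activates exactly one expert per step.
        expert = (self.high_noise_expert if t >= self.switch_point
                  else self.low_noise_expert)
        return expert(latents, t, cond)

# Hypothetical usage with stub experts (real ones would be video DiT stacks):
class StubExpert(nn.Module):
    def forward(self, latents, t, cond):
        return latents  # placeholder; a real expert predicts noise

model = TimestepRoutedMoE(StubExpert(), StubExpert())
out = model(torch.randn(1, 16, 8, 32, 32), t=0.8, cond=torch.randn(1, 77, 1024))
```

The appeal of this split is that the two denoising phases have different demands, so each expert can specialize without increasing per-step compute.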
In terms of technical characteristics, the Wan series has formed a clear product positioning: a powerful 14B-parameter model for maximum performance alongside an efficient 1.3B-parameter model that runs on consumer-grade hardware.
Based on the existing development trajectory of the Wan series and industry trends, Wan2.6 may achieve breakthroughs in the following areas:
Architectural Innovation: Wan2.6 is likely to further optimize the mixture-of-experts system, expanding from the two timestep-split experts in Wan2.2 to a more complex multi-expert collaboration framework. Such a design could activate different expert networks for different generation tasks, improving both efficiency and quality; a speculative sketch of task-level routing follows this list.
Enhanced Multimodal Understanding: The existing Wan2.2-S2V already combines a static image with an audio clip to generate digital human videos. Wan2.6 may integrate cross-modal understanding of text, images, audio, and video more deeply, enabling more natural content generation; the same sketch below shows one way such conditioning could be fused.
Generation Length and Consistency: While the current Wan2.5 supports 10-second videos, Wan2.6 is expected to break this limit, extending to 15-20 seconds or even longer while maintaining consistency in characters, scenes, and styles; the second sketch after this list outlines one common technique for doing so.

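To make the first two predictions concrete, here is a minimal, purely speculative sketch in PyTorch. Every name in it is hypothetical, and nothing here reflects a published Wan design: it simply routes a request to a task-specific expert over a conditioning sequence fused from whichever modalities are present.

```python
import torch
import torch.nn as nn
from typing import Optional

class MultimodalTaskMoE(nn.Module):
    """Speculative sketch: per-task expert routing over fused
    text/image/audio conditioning. All names are hypothetical."""

    def __init__(self, dim: int = 1024):
        super().__init__()
        # One expert per generation task; Wan2.2 ships these as separate
        # models, so routed experts are one plausible consolidation.
        self.experts = nn.ModuleDict({
            task: nn.TransformerEncoderLayer(dim, 8, batch_first=True)
            for task in ("t2v", "i2v", "s2v")
        })
        # Modality-specific projections into a shared conditioning space.
        self.proj = nn.ModuleDict({
            name: nn.Linear(dim, dim) for name in ("text", "image", "audio")
        })

    def forward(self, task: str,
                text: Optional[torch.Tensor] = None,
                image: Optional[torch.Tensor] = None,
                audio: Optional[torch.Tensor] = None) -> torch.Tensor:
        # Fuse whichever modalities were supplied into one token sequence.
        streams = [self.proj[name](feat)
                   for name, feat in (("text", text), ("image", image),
                                      ("audio", audio)) if feat is not None]
        cond = torch.cat(streams, dim=1)
        # Activate only the expert matching the requested task.
        return self.experts[task](cond)

# Hypothetical usage: an image + audio ("speech-to-video") request.
model = MultimodalTaskMoE()
out = model("s2v", image=torch.randn(1, 256, 1024),
            audio=torch.randn(1, 128, 1024))
```

The point of such a layout is that total capacity grows with the number of experts while each request pays only for the one it activates, which is the efficiency argument behind the MoE trend.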
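For the third prediction, longer coherent clips are most commonly approached today with overlapping-window generation, where each new window is conditioned on the tail frames of the previous one. The sketch below shows that general technique rather than anything Wan-specific; `generate_chunk` is a hypothetical stand-in for the model call.

```python
import torch

def generate_long_video(generate_chunk, total_frames: int,
                        overlap: int = 16) -> torch.Tensor:
    """Overlapping-window long-video sketch. `generate_chunk(prefix)`
    is a hypothetical callable returning a window of frames whose first
    `overlap` frames re-render the given prefix; reusing each window's
    tail as the next prefix is what keeps characters and scenes stable."""
    frames = generate_chunk(prefix=None)  # first window: no history
    while frames.shape[0] < total_frames:
        prefix = frames[-overlap:]              # recent tail anchors the next window
        window = generate_chunk(prefix=prefix)
        frames = torch.cat([frames, window[overlap:]], dim=0)  # drop re-rendered overlap
    return frames[:total_frames]

# Hypothetical usage with a stub generator (a real one would call the model):
def stub_chunk(prefix):
    return torch.randn(81, 3, 64, 64)  # (frames, channels, height, width)

video = generate_long_video(stub_chunk, total_frames=200)
print(video.shape)  # torch.Size([200, 3, 64, 64])
```

The hard part, and presumably where a Wan2.6 would have to innovate, is preventing identity and lighting drift as windows accumulate, since each window only sees a short local history.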
If it arrives on this cadence, Wan2.6 could have a profound impact on the entire AI video generation industry. The Wan series' consistent open-source strategy has already put pressure on closed-source models, forcing the industry to rethink its business models.
If Wan2.6 can deliver generation quality close to that of commercial platforms while remaining open source with a low hardware barrier to entry, it may significantly reduce video production costs. For content creators, this means a further lowering of entry barriers: a sufficiently capable open-source video model would let individual creators and small studios produce high-quality visual content at far lower cost.
Looking ahead from the current juncture, Wan2.6 is not only the next step in technological iteration but also a key milestone in the popularization of AI video generation. Alibaba has announced plans to invest 380 billion RMB in cloud computing and AI development over the next three years, providing solid support for the continuous advancement of the Wan series.
The open-source strategy is transforming the competitive landscape of the AI industry. With the emergence of more high-quality open-source models, the pace of innovation across the entire industry is accelerating.
The process of technological democratization is gathering momentum. Wan2.6 is expected to further lower the technical threshold for high-quality video generation, enabling more developers and creators to access and apply this technology.
When the power of open source meets top-tier engineering capabilities, and when consumer-grade hardware can run models that once required massive computing power, AI video generation is evolving from a technical challenge in professional fields to a creative tool accessible to the general public.
The above is a summary of current online predictions about Wan2.6. Please note that this information has not been officially released and is for reference only.