Qwen3-Max: A 1-Trillion-Parameter MoE That Pushes Coding, Agents, and Reasoning to the Edge
Qwen has unveiled Qwen3-Max, its largest and most capable model to date—and the headline numbers are eye-catching: roughly 1 trillion parameters trained on 36 trillion tokens, delivered as a Mixture-of-Experts (MoE) architecture that emphasizes both training stability and throughput. The team says the preview of Qwen3-Max-Instruct hit the top three on the Text Arena leaderboard, and the official release further improves coding and agent performance. You can try Qwen3-Max-Instruct via the Alibaba Cloud API or in Qwen Chat, with a Thinking variant under active training. ...
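For readers who want to try the API route, here is a minimal sketch of calling the model through an OpenAI-compatible chat-completions endpoint. The `base_url` and the `qwen3-max` model id are assumptions based on Alibaba Cloud Model Studio conventions rather than details given in this article, so check the official documentation for the exact values.

```python
# Minimal sketch: querying Qwen3-Max-Instruct via an OpenAI-compatible endpoint.
# The base_url and model id below are assumptions, not confirmed by the article.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DASHSCOPE_API_KEY"],  # your Alibaba Cloud Model Studio key
    base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",  # assumed endpoint
)

response = client.chat.completions.create(
    model="qwen3-max",  # assumed model id for Qwen3-Max-Instruct
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a Python function that reverses a linked list."},
    ],
)
print(response.choices[0].message.content)
```

Because the endpoint follows the OpenAI chat-completions format, existing client code can typically be pointed at it by swapping the API key, base URL, and model name.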