The 397B native multimodal agent with 17B active params
An open-weight, native vision-language model built for long-horizon agentic tasks. Its hybrid architecture (linear attention + MoE) delivers the capabilities of a 397B giant with the inference speed of a 17B model.
Qwen3.5 is here. It is a native vision-language model with a massive 397B parameter count.
It is built on the Qwen3-Next architecture (linear attention + MoE), and only 17B parameters are active per forward pass. This hits a specific sweet spot: you get the reasoning depth of a giant model with the inference latency of a much smaller one.
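To make the active-versus-total parameter idea concrete, here is a toy sketch of top-k expert routing in NumPy. It is illustrative only, not Qwen3-Next's actual routing code; the expert count, k, and dimensions are invented for the example.

```python
# Toy MoE top-k routing: each token only touches k of the n_experts
# weight matrices, so active parameters are a fraction of the total.
# Not Qwen's implementation; all sizes here are made up.
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, k = 64, 16, 2

router_w = rng.normal(size=(d_model, n_experts))
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

def moe_layer(x):
    """Route a single token vector x through its top-k experts."""
    logits = x @ router_w                 # router scores, shape (n_experts,)
    top = np.argsort(logits)[-k:]         # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()              # softmax over the selected experts only
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

out = moe_layer(rng.normal(size=d_model))
active = k * d_model * d_model + d_model * n_experts
total = n_experts * d_model * d_model + d_model * n_experts
print(f"parameters touched per token: {active}/{total} ({active / total:.0%})")
```

Scaled up to Qwen3.5's numbers, this is what lets roughly 17B of the 397B weights do the work for any given token, about 4% of the model.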
For agentic applications, this efficiency is key.
It is natively multimodal, with no glued-on vision adapters, and posts outstanding results on agentic tasks. In practice, that means handling complex workflows without burning through tokens.
Apache 2.0 and ready for vLLM/SGLang out of the box!
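As a quick-start sketch, assuming the weights are published on Hugging Face under an ID like Qwen/Qwen3.5 (a placeholder, not a confirmed repo name), serving with vLLM and hitting its OpenAI-compatible endpoint could look like this:

```python
# Sketch only: "Qwen/Qwen3.5" and the image URL are placeholders.
# 1) Start an OpenAI-compatible server (shell):
#        vllm serve Qwen/Qwen3.5 --tensor-parallel-size 8
# 2) Send a multimodal request to it:
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen/Qwen3.5",  # placeholder model ID
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this screenshot and list any UI bugs."},
            {"type": "image_url", "image_url": {"url": "https://example.com/screenshot.png"}},
        ],
    }],
    max_tokens=512,
)
print(response.choices[0].message.content)
```

The same request should also work against SGLang's OpenAI-compatible server by pointing base_url at it instead.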
I’ve been using Qwen for building a simple code and website generator, and it works really well for fast iterations. Great for prototyping and lightweight generation.
What needs improvement
I'd like more on the history pages: a section where we can re-edit the input/process/output with an easy UX. Basically, better handling of edge cases without extra prompting.
vs Alternatives
I chose Qwen because it's fast, lightweight, and great for turning ideas into simple, working code or websites. It was also the first web-based tool I explored for code generation, which made it easy to start prototyping right away.
How accurate is Qwen3 on the real coding tasks you tried?
Quite good, but it still needs some touch-up, especially on the logic.
Does Qwen3-Coder reduce PR review time or defects?
I’ve been trying Qwen alongside GPT-4o, and honestly it feels great — it’s noticeably faster and cheaper, yet most of the time the answer quality is hard to tell apart. For quick everyday tasks, I barely notice any trade-offs, which makes it a super practical choice.
I chose the Qwen model as the default starting in version 1.2 because it delivers an ideal balance of speed, accuracy, and lightweight performance. It runs efficiently on-device, uses very little storage, and responds quickly even on less powerful hardware. This makes it a perfect fit for an offline AI assistant where reliability, low resource usage, and a smooth user experience are essential.
Flowtica Scribe