Qwen (Alibaba LLM Series)

Qwen is Alibaba's open-weight LLM series (2023 onward), released mostly under Apache 2.0. It ranges from 0.5B to 235B+ parameters and includes multimodal (Qwen-VL), coding (Qwen-Coder), and audio (Qwen-Audio) variants. It is genuinely open source (unlike many 'open weight' competitors) and a cornerstone of the Chinese open-AI ecosystem alongside DeepSeek, GLM, and MiniMax.

**Qwen** (通义千问, *Tongyi Qianwen*) is a family of open-weight large language models developed by Alibaba Cloud, released since 2023. It is one of the most prolific open-weight model lineages and, notably, is released under the **Apache 2.0 license**, making it genuinely open source by OSI definition (see Open Source vs Open Weight Debate).

## Model lineage

- **Qwen 1 / Qwen 1.5** (2023 to early 2024): dense models, 0.5B to 72B parameters.
- **Qwen 2** (mid 2024): dense + MoE variants, major performance jump.
- **Qwen 2.5** (late 2024): refined series including specialized variants.
- **Qwen 3** (2025): further scale and architecture refinements.
- **Qwen 3.5 / 4** (2026): current generation, MoE focus at frontier scale.

## Variants by purpose

- **Qwen-Base / Qwen-Instruct**: general language models, pretrained base and instruction-tuned.
- **Qwen-Coder**: code-specialized. Competes with DeepSeek-Coder, StarCoder.
- **Qwen-Math**: mathematical-reasoning focus.
- **Qwen-VL**: vision-language multimodal (image input + text output).
- **Qwen-Audio**: audio understanding (speech + music + ambient).
- **Qwen-Omni**: full multimodal (vision + audio + text).

## Size range

Qwen's commitment to publishing models at many scales is a distinguishing feature:

- **0.5B, 1.5B, 3B, 7B, 14B, 32B, 72B**: dense models across the size spectrum.
- **MoE variants**: Qwen 2.5-Max, Qwen 3 MoE 235B (~22B active per token), and larger variants in the current generation.

This range is strategically important: most research and consumer uses are well-served by 7B-32B models, while frontier work needs the 200B+ scale. Qwen covers both.

## License

Most Qwen variants are released under **Apache 2.0**: unrestricted commercial use, modification, and redistribution are permitted.
This puts Qwen in genuine 'open source' territory, unlike:

- **Llama** series (Meta's custom license with competitive-use bans; not open source per OSI)
- **Mistral Large** (commercial-restricted)
- **MiniMax M2.7** (commercial-restricted, see MiniMax M2.7)

Alongside GLM 5.1 Open-Weight Model (MIT) and DeepSeek (MIT), Qwen anchors the genuinely open Chinese AI ecosystem.

## Performance

Qwen models have consistently been near or at the open-weight frontier for their size:

- Qwen 2.5 72B was competitive with Llama 3 70B on most benchmarks through late 2024.
- Qwen 3 32B matched or beat proprietary 70B models on reasoning and coding.
- Qwen-Coder variants lead several open-weight code benchmarks.

Benchmark leadership is rarely held permanently: the Chinese open-weight ecosystem has seen continuous leapfrogging between Qwen, DeepSeek, and ZAI through 2024-2026.

## Deployment

- Available on HuggingFace, Alibaba Cloud, and local inference via llama.cpp (all sizes), vLLM, TensorRT-LLM, SGLang.
- Widely integrated into Chinese enterprise AI products via Alibaba Cloud.
- Popular for fine-tuning in research due to permissive license and clean architecture.

## Ecosystem impact

- Qwen has produced some of the most important fine-tunes (Dolphin-Qwen, Nous-Qwen variants, many domain-specialized versions).
- The 0.5B and 1.5B variants are frequently used in research needing a small but coherent LLM baseline.
- The 72B size sits in the sweet spot of 'serious capability, single-server inference' that many enterprise users target.

## Parent company context

Alibaba is one of China's largest technology companies (e-commerce, cloud, payments). Alibaba Cloud (Aliyun) is the primary commercial driver: Qwen's open-weight release serves as both technical showcase and adoption funnel for paid Aliyun inference services.
The strategic calculus: open-sourcing frontier weights wins developer mindshare globally, while monetization happens via managed inference, enterprise services, and domain-specific fine-tunes.

## Related

- AI News Week of April 12 2026 — Four Headline Stories: for broader open-weight context.
- Open Source vs Open Weight Debate: Qwen's Apache 2.0 license puts it in 'actually open source' territory.
- Mixture of Experts (MoE): architecture used in Qwen's frontier-scale models.
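
## Worked example: weight memory by size

The size figures quoted in this entry (dense models from 0.5B to 72B, and a 235B MoE with roughly 22B parameters active per token) can be turned into rough hardware estimates. The sketch below is illustrative only. The bytes-per-parameter values for fp16, int8, and int4 are assumptions that do not come from this entry, the function names are hypothetical, and the estimate covers weights only, so KV cache, activations, and runtime overhead would push real usage higher.

```python
# Rough weight-only memory estimate: parameters x bytes per parameter.
# Assumption: common storage precisions; real quantized formats add some overhead.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

# Dense sizes listed in the Size range section, in billions of parameters.
DENSE_SIZES_B = [0.5, 1.5, 3, 7, 14, 32, 72]


def weight_memory_gb(params_billion: float, precision: str) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    # (params_billion * 1e9 params) * (bytes per param) / 1e9 bytes per GB
    return params_billion * BYTES_PER_PARAM[precision]


def active_fraction(total_billion: float, active_billion: float) -> float:
    """Share of an MoE model's parameters used for each token."""
    return active_billion / total_billion


if __name__ == "__main__":
    for size in DENSE_SIZES_B:
        row = ", ".join(
            f"{name}: {weight_memory_gb(size, name):.1f} GB"
            for name in BYTES_PER_PARAM
        )
        print(f"{size}B -> {row}")

    # MoE figures from the Size range section: 235B total, ~22B active.
    frac = active_fraction(235, 22)
    print(f"MoE 235B, ~22B active: {frac:.1%} of parameters per token")
    print(f"MoE 235B fp16 weights: {weight_memory_gb(235, 'fp16'):.0f} GB")
```

Under these assumptions a 72B dense model needs about 144 GB for fp16 weights and about 36 GB at int4, which is consistent with the 'single-server inference' framing above. For the MoE model, the active fraction lowers compute per token, but all 235B parameters still have to be stored, so its weight footprint is that of the full model.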

This knowledge chunk is from Philosopher's Stone (https://philosophersstone.ee), an open knowledge commons with 88% confidence.