Last updated: 2026-02-26 (Thu) 14:19:48 (10d)
Qwen3.5
https://qwen.ai/blog?id=qwen3.5
https://huggingface.co/Qwen/Qwen3.5-122B-A10B
Models
The first Medium model
https://x.com/Alibaba_Qwen/status/2026339351530188939
|Model|Parameters|Size|
|Qwen3.5-122B-A10B|122B|250GB|
|Qwen3.5-35B-A3B||71.9GB|
|Qwen3.5-35B-A3B-Base?||71.9GB|
|Qwen3.5-27B|27B|55.6GB|
LM Studio
Thinking mode
- With models from the Qwen team, Thinking can be toggled on/off from the GUI
- With other models, it tends to think for quite a while
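Outside the GUI, the same switch is a field in the request body. A minimal sketch that only builds an Ollama `/api/chat` payload, assuming Qwen3.5 honors the `think` option Ollama exposes for thinking-capable models (worth verifying against `ollama show qwen3.5` before relying on it):

```python
import json

def chat_payload(prompt: str, think: bool) -> dict:
    """Build an Ollama /api/chat request body for a thinking-capable model.

    The `think` field is an assumption here: Ollama accepts it for
    thinking models such as the Qwen3 family; whether the qwen3.5 tags
    honor it should be checked before use.
    """
    return {
        "model": "qwen3.5:35b",  # tag from ollama.com/library/qwen3.5
        "messages": [{"role": "user", "content": prompt}],
        "think": think,  # False = skip the long thinking phase
        "stream": False,
    }

# Serialize for POSTing to http://localhost:11434/api/chat
body = json.dumps(chat_payload("2+2=?", think=False))
```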
Ollama
https://ollama.com/library/qwen3.5
|Tag|Model|Size|Quant|
|qwen3.5:35b|Qwen3.5-35B-A3B|24GB|Q4_K_M|
|qwen3.5:122b|Qwen3.5-122B-A10B|81GB|Q4_K_M|
Unsloth
https://unsloth.ai/docs/models/qwen3.5
- Total RAM/VRAM
|Qwen3.5 variant|4-bit|8-bit|BF16|
|27B|17 GB|30 GB|54 GB|
|35B-A3B|22 GB|38 GB|70 GB|
|122B-A10B|70 GB|132 GB|245 GB|
|397B-A17B|214 GB|512 GB|810 GB|
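These figures are roughly what bytes-per-parameter arithmetic predicts for the weights alone (no KV cache): on the order of 4.5-5 effective bits per weight at 4-bit. A quick sanity-check sketch; the bits-per-weight values are assumptions, not Unsloth's published numbers:

```python
def approx_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough weight-memory estimate: params * bits / 8, in GB (1e9 bytes)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# 122B total parameters at ~4.6 effective bits/weight lands near the
# 70 GB 4-bit row above; 27B at ~5.0 bits/weight lands near 17 GB.
est_122b = approx_size_gb(122, 4.6)
est_27b = approx_size_gb(27, 5.0)
```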
- 27B is more accurate, but 35B-A3B is faster
Comparison
- GPT-5.2
- Claude 4.5 Opus?
- Gemini 3 Pro
- Qwen3-Max-Thinking?
- K2.5-1T-A32B?
Notes
https://x.com/gosrum/status/2026671450250432950
- Reportedly Qwen3.5-27B ≈ Qwen3.5-122B-A10B >> gpt-oss-120b > Qwen3.5-35B-A3B
Qwen3.5-27B
- 17 GB at 4-bit
|Hardware|Backend/Quant|Speed|Measured with|
|Apple M1 Ultra|MLX|24 tok/s|LM Studio|
|Apple M1 Ultra|Unsloth|13 tok/s|LM Studio|
|GeForce RTX 3090|Q4_K_XL?|31 tok/s|LM Studio|
|GeForce RTX 5090|Q4_K_XL?|60 tok/s|llama-bench|
|Apple M2 Ultra|Q4_K_XL?|18 tok/s|llama-bench|
https://x.com/gosrum/status/2026450569695830360
Qwen3.5-122B-A10B
- 70 GB at 4-bit
|Hardware|Speed|Measured with|
|Apple M2 Ultra|27 tok/s|llama-bench|
|GeForce RTX 3090 + RAM|1.6 tok/s|LM Studio|
https://x.com/gosrum/status/2026568344317604165
Qwen3.5-35B-A3B
- 21 GB at 4-bit
|Hardware|Quant (offloaded layers)|Size|Speed|Measured with|
|GeForce RTX 3090|Unsloth Q4_K_XL? (40/40)|21.5GB|85 tok/s|LM Studio|
|GeForce RTX 3090|Unsloth Q4_K_XL? (38/40)|21.5GB|26 tok/s|LM Studio|
|Apple M1 Ultra|Unsloth Q4_K_XL?|21.5GB|42 tok/s|LM Studio|
|Apple M1 Ultra|MLX Community 4bit|20.4GB|59.95 tok/s|LM Studio|
|GeForce RTX 5090|Unsloth Q4_K_XL? (40/40)|21.5GB|170 tok/s|llama-bench|
|Apple M2 Ultra|Unsloth Q4_K_XL?|21.5GB|52 tok/s|llama-bench|
https://x.com/gosrum/status/2026457860197253504
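The tok/s figures above translate directly into wall-clock response time. A trivial sketch of that arithmetic (decode only, ignoring prompt processing):

```python
def seconds_for(tokens: int, tok_per_s: float) -> float:
    """Decode time for a reply of `tokens` tokens; ignores prefill."""
    return tokens / tok_per_s

# A 1,000-token reply at the RTX 3090 full-offload rate (85 tok/s)
# vs. the partial-offload rate (26 tok/s) listed above:
full = seconds_for(1000, 85)     # roughly 12 s
partial = seconds_for(1000, 26)  # roughly 38 s
```

The 85 vs. 26 tok/s rows illustrate why keeping all 40 layers on the GPU matters: spilling even 2 layers to system RAM roughly triples the response time here.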

