Last updated: 2025-05-30 (Fri) 11:10:15
DeepSeek-R1
https://github.com/deepseek-ai/DeepSeek-R1
- DeepSeek-R1-Zero: 671B, 669.28GB (BF16)
- DeepSeek-R1: 671B, 656.38GB (BF16)
DeepSeek-R1-0528
DeepSeek-R1-Distill
- DeepSeek-R1-Distill models are fine-tuned based on open-source models, using samples generated by DeepSeek-R1. We slightly change their configs and tokenizers. Please use our setting to run these models.
Model                           BF16       Base Model
DeepSeek-R1-Distill-Qwen-1.5B   3.55GB     Qwen2.5-Math-1.5B
DeepSeek-R1-Distill-Qwen-7B     15.23GB    Qwen2.5-Math-7B
DeepSeek-R1-Distill-Llama-8B    16.06GB    Llama-3.1-8B
DeepSeek-R1-Distill-Qwen-14B    29.54GB    Qwen2.5-14B
DeepSeek-R1-Distill-Qwen-32B    65.56GB    Qwen2.5-32B
DeepSeek-R1-Distill-Llama-70B   136.87GB   Llama-3.3-70B-Instruct
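The BF16 sizes listed above can be cross-checked against the model names with simple arithmetic, since BF16 stores each weight in 2 bytes. The following is a minimal sketch, assuming 1 GB = 10^9 bytes and that each checkpoint holds only BF16 weights. The helper name `estimate_params_billion` is hypothetical and exists only for illustration.

```python
BYTES_PER_BF16_PARAM = 2  # BF16 uses 16 bits (2 bytes) per weight


def estimate_params_billion(size_gb: float) -> float:
    """Estimate the parameter count (in billions) from a BF16 checkpoint size.

    Assumption: 1 GB = 10**9 bytes, and the checkpoint contains only BF16 weights.
    """
    total_bytes = size_gb * 1e9
    params = total_bytes / BYTES_PER_BF16_PARAM
    return params / 1e9


# BF16 sizes (GB) copied from the table above
distill_sizes_gb = {
    "DeepSeek-R1-Distill-Qwen-1.5B": 3.55,
    "DeepSeek-R1-Distill-Qwen-7B": 15.23,
    "DeepSeek-R1-Distill-Llama-8B": 16.06,
    "DeepSeek-R1-Distill-Qwen-14B": 29.54,
    "DeepSeek-R1-Distill-Qwen-32B": 65.56,
    "DeepSeek-R1-Distill-Llama-70B": 136.87,
}

for name, size_gb in distill_sizes_gb.items():
    print(f"{name}: ~{estimate_params_billion(size_gb):.2f}B parameters")
```

Under these assumptions the estimate is simply half the size in GB, so the nominal size in each model name (7B, 8B, 14B, ...) can be compared against the computed value as a rough sanity check.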
Notes
- DeepSeek-R1-Distill-Qwen-32B outperformed OpenAI o1-mini on a variety of benchmarks
Technical report
Related
- DeepSeek-R1-Zero