Last updated: 2025-01-22 (Wed) 11:43:03
DeepSeek-R1
https://github.com/deepseek-ai/DeepSeek-R1
- DeepSeek-R1-Zero: 671B, 669.28GB (BF16)
- DeepSeek-R1: 671B, 656.38GB (BF16)
DeepSeek-R1-Distill
- DeepSeek-R1-Distill models are fine-tuned based on open-source models, using samples generated by DeepSeek-R1. We slightly change their configs and tokenizers. Please use our setting to run these models.
Model: BF16 size, base model
- DeepSeek-R1-Distill-Qwen-1.5B: 3.55GB, Qwen2.5-Math-1.5B
- DeepSeek-R1-Distill-Qwen-7B: 15.23GB, Qwen2.5-Math-7B
- DeepSeek-R1-Distill-Llama-8B: 16.06GB, Llama-3.1-8B
- DeepSeek-R1-Distill-Qwen-14B: 29.54GB, Qwen2.5-14B
- DeepSeek-R1-Distill-Qwen-32B: 65.56GB, Qwen2.5-32B
- DeepSeek-R1-Distill-Llama-70B: 136.87GB, Llama-3.3-70B-Instruct
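The BF16 sizes listed above can be turned into a rough parameter count, since bfloat16 stores each weight in 2 bytes. The following is a minimal sketch, not something taken from the DeepSeek repository: the helper name `estimate_params_billions` is hypothetical, and it assumes that "GB" means 10^9 bytes and that the whole checkpoint is BF16 weights. If the sizes are actually in GiB, or the files include non-weight data, the estimates shift accordingly.

```python
# Rough parameter-count estimate from a BF16 checkpoint size.
# Assumption: "GB" means 10**9 bytes and the file is essentially all weights.

BYTES_PER_BF16_PARAM = 2  # bfloat16 uses 16 bits per value

# BF16 sizes in GB, copied from the distilled-model list above.
DISTILL_SIZES_GB = {
    "DeepSeek-R1-Distill-Qwen-1.5B": 3.55,
    "DeepSeek-R1-Distill-Qwen-7B": 15.23,
    "DeepSeek-R1-Distill-Llama-8B": 16.06,
    "DeepSeek-R1-Distill-Qwen-14B": 29.54,
    "DeepSeek-R1-Distill-Qwen-32B": 65.56,
    "DeepSeek-R1-Distill-Llama-70B": 136.87,
}


def estimate_params_billions(size_gb: float, bytes_per_gb: float = 1e9) -> float:
    """Approximate parameter count, in billions, for a BF16 checkpoint.

    `estimate_params_billions` is a hypothetical helper for illustration.
    """
    total_bytes = size_gb * bytes_per_gb
    return total_bytes / BYTES_PER_BF16_PARAM / 1e9


for name, size_gb in DISTILL_SIZES_GB.items():
    print(f"{name}: ~{estimate_params_billions(size_gb):.2f}B parameters")
```

Under these assumptions the 3.55GB entry works out to roughly 1.78 billion parameters, so the "1.5B" in a model name should be read as a label rather than an exact count.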
Notes
- DeepSeek-R1-Distill-Qwen-32B outperformed OpenAI o1-mini on a range of benchmarks
Related
- DeepSeek-R1-Zero