Last updated: 2025-01-24 (Fri) 17:11:49

llama.cpp/convert-hf-to-gguf.py

https://github.com/ggerganov/llama.cpp/blob/master/convert_hf_to_gguf.py

Notes

--outtype

  • "--outtype", type=str, choices=["f32", "f16", "bf16", "q8_0", "tq1_0", "tq2_0", "auto"], default="f16",
    output format - use f32 for float32, f16 for float16, bf16 for bfloat16, q8_0 for Q8_0, tq1_0 or tq2_0 for ternary, and auto for the highest-fidelity 16-bit float type depending on the first loaded tensor type
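
The option quoted above can be tried in isolation with a small argparse sketch. This is not the converter script itself: `build_parser` and `parse_outtype` are hypothetical helper names introduced only for illustration, and the help string is shortened. The choices and the default are copied from the definition quoted above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-alone parser; only the --outtype option is reproduced.
    parser = argparse.ArgumentParser(description="sketch of the --outtype option")
    parser.add_argument(
        "--outtype",
        type=str,
        choices=["f32", "f16", "bf16", "q8_0", "tq1_0", "tq2_0", "auto"],
        default="f16",
        help="output format (shortened help text for this sketch)",
    )
    return parser


def parse_outtype(argv: list[str]) -> str:
    # Return the selected output type for a given argument list.
    return build_parser().parse_args(argv).outtype


if __name__ == "__main__":
    print(parse_outtype([]))                     # no flag given: falls back to the default
    print(parse_outtype(["--outtype", "q8_0"]))  # an explicit, valid choice
```

Because `choices` is set, argparse rejects any value outside the list and exits with a usage error, so a typo such as `--outtype q8` fails before any conversion work would start. When the flag is omitted, the value is `f16`.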