llama.cpp/convert_hf_to_gguf.py
A script that converts Hugging Face models into the GGUF format used by llama.cpp
https://github.com/ggerganov/llama.cpp/blob/master/convert_hf_to_gguf.py
Notes
- torch~=2.2.1
- With Python 3.13, the NumPy dependency fails with an error
- Python 3.11 works fine
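A minimal setup sketch, assuming a Python 3.11 virtual environment and the requirements file shipped in the llama.cpp repository:
python3.11 -m venv .venv
source .venv/bin/activate
# installs torch, numpy, gguf, etc. needed by the conversion scripts
pip install -r requirements.txt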
Example
python convert_hf_to_gguf.py <path to Hugging Face model> --outtype <output format>
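A concrete invocation sketch (the local model directory and output filename below are hypothetical):
# convert a locally downloaded Hugging Face model to a 16-bit GGUF file
python convert_hf_to_gguf.py ./models/SmolLM2-1.7B-Instruct --outtype f16 --outfile SmolLM2-1.7B-Instruct-f16.gguf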
Usage
--vocab-only
    extract only the vocab
--outfile OUTFILE
    path to write to; default: based on input. {ftype} will be replaced by the outtype.
--outtype {f32,f16,bf16,q8_0,tq1_0,tq2_0,auto}
    output format - use f32 for float32, f16 for float16, bf16 for bfloat16, q8_0 for Q8_0, tq1_0 or tq2_0 for ternary, and auto for the highest-fidelity 16-bit float type depending on the first loaded tensor type (default: f16)
--bigendian
    model is executed on big endian machine
--use-temp-file
    use the tempfile library while processing (helpful when running out of memory, process killed)
--no-lazy
    use more RAM by computing all outputs before writing (use in case lazy evaluation is broken)
--model-name MODEL_NAME
    name of the model
--verbose
    increase output verbosity
--split-max-tensors SPLIT_MAX_TENSORS
    max tensors in each split
--split-max-size SPLIT_MAX_SIZE
    max size per split N(M|G)
--dry-run
    only print out a split plan and exit, without writing any new files
--no-tensor-first-split
    do not add tensors to the first split (disabled by default)
--metadata METADATA
    Specify the path for an authorship metadata override file
--print-supported-models
    Print the supported models
--remote
    (Experimental) Read safetensors file remotely without downloading to disk. Config and tokenizer files will still be downloaded. To use this feature, you need to specify Hugging Face model repo name instead of a local directory. For example: 'HuggingFaceTB/SmolLM2-1.7B-Instruct'. Note: To access gated repo, set HF_TOKEN environment variable to your Hugging Face token.
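For instance, the experimental --remote flag reads tensors straight from the Hub, and --dry-run previews a split plan without writing anything. A sketch of both (the repo name follows the help text above; the local path and 4G split size are made-up values):
# read safetensors remotely from the Hub; pass a repo name instead of a local path
# (for a gated repo, also export HF_TOKEN with your Hugging Face token)
python convert_hf_to_gguf.py HuggingFaceTB/SmolLM2-1.7B-Instruct --remote --outtype f16
# preview how the output would be sharded into splits of at most 4 GB, without writing any files
python convert_hf_to_gguf.py ./models/SmolLM2-1.7B-Instruct --split-max-size 4G --dry-run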
--outtype
"--outtype", type=str, choices=["f32", "f16", "bf16", "q8_0", "tq1_0", "tq2_0", "auto"], default="f16", output format - use f32 for float32, f16 for float16, bf16 for bfloat16, q8_0 for Q8_0, tq1_0 or tq2_0 for ternary, and auto for the highest-fidelity 16-bit float type depending on the first loaded tensor type