Last updated: 2024-04-05 (Fri) 11:35:50

GGUF

GPT-Generated Unified Format

GGUF is a binary format designed for fast loading and saving of models, and for ease of reading.

https://github.com/ggerganov/ggml/blob/master/docs/gguf.md
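As a rough illustration of what "ease of reading" means in practice, the sketch below parses the fixed-size header at the start of a GGUF file. It assumes the layout described in the linked specification for version 2 and later: the 4-byte magic `GGUF`, a little-endian uint32 version, then uint64 tensor and metadata key-value counts. The function name `read_gguf_header` and the returned dictionary keys are made up for this example and are not part of any library.

```python
import struct

GGUF_MAGIC = b"GGUF"   # first four bytes of every GGUF file
HEADER_SIZE = 24       # 4 (magic) + 4 (version) + 8 + 8 (counts), assuming v2+


def read_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size GGUF header (hypothetical helper, v2+ layout assumed)."""
    if len(data) < HEADER_SIZE:
        raise ValueError("too short to contain a GGUF header")
    magic = data[:4]
    if magic != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file: magic={magic!r}")
    # Little-endian: uint32 version, uint64 tensor count, uint64 metadata KV count.
    version, tensor_count, metadata_kv_count = struct.unpack_from("<IQQ", data, 4)
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": metadata_kv_count,
    }
```

To inspect a real file, read its first 24 bytes in binary mode and pass them in, e.g. `read_gguf_header(open(path, "rb").read(24))`. The metadata key-value pairs and tensor descriptions follow the header and are not handled here.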

Quantization

  • q4_0
  • q4_1
  • q5_0
  • q5_1
  • q8_0
  • q2_K
  • q3_K_S
  • q3_K_M
  • q3_K_L
  • q4_K_S
  • q4_K_M
  • q5_K_S
  • q5_K_M
  • q6_K
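To give a feel for what these types do, here is a minimal sketch of block-wise 8-bit quantization in the spirit of q8_0. It assumes blocks of 32 values that share a single scale, with each value stored as a signed 8-bit integer. The function names are invented for this example; the real format stores the scale as a 16-bit float and rounds slightly differently, so this is not bit-compatible with ggml.

```python
BLOCK_SIZE = 32  # values per block (assumed, as in q8_0-style quantization)

# One 16-bit scale plus 32 8-bit values per block works out to 8.5 bits per weight.
BITS_PER_WEIGHT = (16 + 8 * BLOCK_SIZE) / BLOCK_SIZE


def quantize_block_q8(values):
    """Map one block of floats to (scale, list of ints in [-127, 127])."""
    if len(values) != BLOCK_SIZE:
        raise ValueError(f"expected {BLOCK_SIZE} values, got {len(values)}")
    amax = max(abs(v) for v in values)
    if amax == 0.0:
        return 0.0, [0] * BLOCK_SIZE
    scale = amax / 127.0  # the largest magnitude maps to +/-127
    quants = [round(v / scale) for v in values]
    return scale, quants


def dequantize_block_q8(scale, quants):
    """Recover approximate floats from a quantized block."""
    return [scale * q for q in quants]
```

Each restored value differs from the original by at most half the scale. The lower-bit types in the list trade a larger error for a smaller file.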

Notes

  • K: a model quantized with a newer scheme known as the k-quant method
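The naming convention in the list above can be picked apart mechanically. The sketch below is a hypothetical helper, not part of any library: it splits a type name into its nominal bit width, whether it is a k-quant, and the trailing suffix. It assumes that `_0` and `_1` mark the two older variants and that `S`, `M`, and `L` denote small, medium, and large k-quant mixes.

```python
import re

# q<bits>_K[_S|_M|_L] for k-quants, q<bits>_0 / q<bits>_1 for the older types.
_QUANT_NAME = re.compile(r"^q(\d)_(?:(K)(?:_([SML]))?|([01]))$", re.IGNORECASE)


def parse_quant_name(name):
    """Split a quantization type name such as 'q4_K_M' into its parts."""
    m = _QUANT_NAME.match(name)
    if m is None:
        raise ValueError(f"unrecognized quantization type: {name!r}")
    bits, k, variant, revision = m.groups()
    return {
        "bits": int(bits),                                  # nominal bits per weight
        "k_quant": k is not None,                           # True for the K types
        "variant": variant.upper() if variant else None,    # S, M, L, or None
        "revision": int(revision) if revision is not None else None,  # 0, 1, or None
    }
```

For example, `parse_quant_name("q4_K_M")` reports a 4-bit k-quant with variant `M`, while `parse_quant_name("q5_1")` reports a 5-bit older type with revision 1.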

References

Related