
ExLlama

https://github.com/turboderp/exllama

ExLlamaV3

ExLlamaV2

  • An inference library for running local LLMs on modern consumer GPUs (see the usage sketch below).
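
A minimal usage sketch follows. It assumes the exllamav2 Python package is installed and that a quantized model is available at the placeholder path; class and method names follow the package's basic generator interface in one published version of the library and may differ between releases.

    from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
    from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

    # Placeholder path to a directory containing a quantized model.
    config = ExLlamaV2Config()
    config.model_dir = "/path/to/model"
    config.prepare()

    model = ExLlamaV2(config)
    cache = ExLlamaV2Cache(model, lazy=True)   # allocate the KV cache lazily
    model.load_autosplit(cache)                # split layers across available GPUs
    tokenizer = ExLlamaV2Tokenizer(config)

    generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
    settings = ExLlamaV2Sampler.Settings()
    settings.temperature = 0.8

    # Returns the prompt plus up to 64 newly generated tokens as text.
    output = generator.generate_simple("Hello, my name is", settings, 64)
    print(output)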

API server
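
Nothing is recorded in this section yet. As a hedged illustration only: an OpenAI-compatible API server backed by ExLlamaV2 (for example TabbyAPI) can typically be queried as sketched below; the host, port, model name, and API key are placeholder assumptions and depend on how the server is configured.

    import requests

    # Placeholder endpoint; adjust to the actual server's host and port.
    url = "http://127.0.0.1:5000/v1/chat/completions"

    payload = {
        "model": "local-model",                       # placeholder model name
        "messages": [{"role": "user", "content": "Hello!"}],
        "max_tokens": 128,
    }
    headers = {"Authorization": "Bearer dummy-key"}   # placeholder API key

    response = requests.post(url, json=payload, headers=headers, timeout=60)
    response.raise_for_status()
    print(response.json()["choices"][0]["message"]["content"])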