Last updated: 2025-01-23 (Thu) 15:28:16

AutoAWQ

AutoAWQ implements the AWQ (Activation-aware Weight Quantization) algorithm for 4-bit weight quantization, with a reported speedup of about 2x during inference compared to FP16.

https://github.com/casper-hansen/AutoAWQ
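
To make "4-bit quantization" concrete, here is a minimal sketch of plain round-to-nearest quantization of one group of weights into 4-bit integer codes (0 to 15) plus a scale and zero point. It is not AutoAWQ's API and it leaves out AWQ's activation-aware scaling step. The function names `quantize_group` and `dequantize_group` are hypothetical and exist only for this illustration.

```python
def quantize_group(weights, bits=4):
    """Asymmetric round-to-nearest quantization of one group of floats.

    Hypothetical helper for illustration; not part of AutoAWQ.
    Returns integer codes in [0, 2**bits - 1], the scale, and the zero point.
    """
    qmax = (1 << bits) - 1            # 15 for 4-bit codes
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / qmax
    if scale == 0.0:                  # all weights equal: avoid dividing by zero
        scale = 1.0
    zero = round(-lo / scale)         # integer code that represents 0.0
    codes = [min(qmax, max(0, round(w / scale) + zero)) for w in weights]
    return codes, scale, zero


def dequantize_group(codes, scale, zero):
    """Map integer codes back to approximate float weights."""
    return [(c - zero) * scale for c in codes]


if __name__ == "__main__":
    group = [-0.8, -0.1, 0.0, 0.35, 0.7, 1.0]
    codes, scale, zero = quantize_group(group)
    restored = dequantize_group(codes, scale, zero)
    for original, code, approx in zip(group, codes, restored):
        print(f"{original:+.3f} -> code {code:2d} -> {approx:+.3f}")
```

Each weight is stored as a 4-bit code instead of a 16-bit float, and the scale and zero point are shared by the whole group. That sharing is where the memory saving comes from. The cost is a rounding error of up to about half the scale per weight.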