Last updated: 2025-01-23 (Thu) 15:28:16

AutoAWQ

AutoAWQ implements the AWQ (Activation-aware Weight Quantization) algorithm for 4-bit weight quantization, with a 2x speedup during inference.

https://github.com/casper-hansen/AutoAWQ
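
To make "4-bit quantization" concrete, here is a minimal pure-Python sketch of group-wise round-to-nearest quantization, in which each weight is stored as an integer code from 0 to 15 plus a per-group scale and minimum. This is not AutoAWQ's code and it is not the AWQ algorithm itself: AWQ additionally rescales weights based on activation statistics before quantizing, which is omitted here. The function names and the `group_size` value are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: plain round-to-nearest 4-bit group quantization.
# This is NOT AutoAWQ's implementation; all names here are hypothetical.

def quantize_group(weights, bits=4):
    """Map one group of floats to integer codes in [0, 2**bits - 1]."""
    qmax = (1 << bits) - 1                 # 15 for 4-bit codes
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / qmax
    if scale == 0.0:                       # all weights equal: any scale works
        scale = 1.0
    # Round to the nearest code, clamping to guard against float drift.
    codes = [max(0, min(qmax, round((w - w_min) / scale))) for w in weights]
    return codes, scale, w_min


def dequantize_group(codes, scale, w_min):
    """Recover approximate floats from the codes of one group."""
    return [c * scale + w_min for c in codes]


def quantize(weights, group_size=4, bits=4):
    """Quantize a flat weight list in consecutive groups of group_size."""
    return [
        quantize_group(weights[i:i + group_size], bits)
        for i in range(0, len(weights), group_size)
    ]


def dequantize(groups):
    """Concatenate the dequantized values of all groups."""
    restored = []
    for codes, scale, w_min in groups:
        restored.extend(dequantize_group(codes, scale, w_min))
    return restored


weights = [0.12, -0.53, 0.98, -0.07, 0.33, 0.41, -0.88, 0.05]
groups = quantize(weights, group_size=4)
restored = dequantize(groups)

for original, approx in zip(weights, restored):
    print(f"{original:+.3f} -> {approx:+.3f}")
```

Each restored weight differs from the original by at most half of its group's scale, so smaller groups give tighter reconstruction at the cost of storing more scale and minimum values.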