Quantize Stable Diffusion examples #368

LukeLIN-web · 2025-01-09T03:13:11Z

I am using an A6000 GPU.
I tried python quantize_StableDiffusion.py --batch_size=1 --torch_dtype="fp16" and it works well.
But python quantize_StableDiffusion.py --batch_size=1 --unet_qtype="fp8" is very slow. Why?

Following readme.md, I have installed the latest version:

git clone https://github.com/huggingface/quanto
cd quanto
pip install -e .

Problem

What is the difference between this repo (Optimum Quanto) and https://github.com/huggingface/quanto?
What is the difference between torch_dtype and unet_qtype?


dacorvo · 2025-01-09T07:59:07Z

quanto has just been renamed to optimum-quanto.
The dtype is the type used in non-quantized operations (basically everything except Linear layers), and the qtype is the weight quantization type for the Linear layers in the UNet.
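
To make the dtype/qtype distinction concrete, here is a minimal pure-Python sketch of weight-only quantization for a single Linear layer. It is only an illustration, not how optimum-quanto is implemented: the function names (quantize_weights, linear_forward) are hypothetical, and symmetric per-tensor int8 is assumed instead of fp8 to keep the arithmetic easy to follow. The integer weights plus the scale play the role of the qtype; the floating-point arithmetic used for everything else plays the role of the dtype.

```python
# Conceptual sketch only: NOT the optimum-quanto implementation.
# Assumption: symmetric per-tensor int8 is used here in place of fp8.

def quantize_weights(weights, bits=8):
    """Return (integer weights, scale) for a 2-D list of float weights."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(w) for row in weights for w in row)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [[round(w / scale) for w in row] for row in weights]
    return q, scale

def linear_forward(x, q_weights, scale):
    """Dequantize the weights on the fly, then compute y = W @ x in float."""
    return [sum(qw * scale * xi for qw, xi in zip(row, x)) for row in q_weights]

weights = [[0.5, -1.0], [0.25, 0.75]]  # "Linear" weights, stored quantized
x = [1.0, 2.0]                         # activations stay in floating point

q_weights, scale = quantize_weights(weights)
y_ref = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
y_q = linear_forward(x, q_weights, scale)

# Largest deviation between the quantized and the reference output.
max_err = max(abs(a - b) for a, b in zip(y_ref, y_q))
print(q_weights, scale, max_err)
```

Only the weights are stored at reduced precision here; the input and the output stay in floating point. That is why the two settings are independent of each other.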
