Compression Ratio #1

Open
NicoNico6 opened this issue Oct 11, 2023 · 1 comment

Comments

@NicoNico6

Really solid work!

May I ask what the actual compressed model size is, given that the binarization is only partial and each weight matrix still contains some 8-bit parameters? Can the model be compressed with techniques like bit-packing?

@hahnyuan
Owner

hahnyuan commented Jan 4, 2024

Apologies for the delayed response.

Regarding your question: the 1-bit weights are indeed stored in a packed format. The 8-bit parameters within each weight matrix are sparse, with low density, so conventional sparse formats such as CSR may not be the most suitable. We are currently exploring a modified run-length encoding (RLE) to achieve an efficient compression ratio for the sparse 8-bit data.

In our modified RLE, each nonzero 8-bit value is stored as a pair: the value itself and the number of consecutive zeros that precede it. For example,
original sequence: 0 0 0 0 0 0 5 0 0 1
zero-run counts: (6, 2); values: (5, 1)
as (value, count) pairs: (5, 6) (1, 2)
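
A minimal Python sketch of this pair encoding, for illustration only and not the repository's implementation. The function names `rle_encode` and `rle_decode` are hypothetical. Two details are assumptions not stated above: a zero run longer than the 4-bit count can hold is split by emitting a pair whose value is 0, and the original length is kept alongside the pairs so trailing zeros can be restored.

```python
def rle_encode(values, count_bits=4):
    """Encode a sparse sequence as (value, leading_zero_count) pairs.

    Assumption: when a zero run exceeds the largest count that fits in
    count_bits, a pair (0, max_run) is emitted. It stands for max_run
    zeros followed by one explicit zero value.
    """
    max_run = (1 << count_bits) - 1  # 15 for a 4-bit count
    pairs = []
    run = 0
    for v in values:
        if v == 0:
            run += 1
            if run > max_run:
                pairs.append((0, max_run))  # covers max_run + 1 zeros
                run = 0
            continue
        pairs.append((v, run))
        run = 0
    # Trailing zeros are not stored; the length lets the decoder restore them.
    return pairs, len(values)


def rle_decode(pairs, length):
    """Rebuild the dense sequence from (value, leading_zero_count) pairs."""
    out = []
    for value, zeros in pairs:
        out.extend([0] * zeros)
        out.append(value)
    out.extend([0] * (length - len(out)))  # restore trailing zeros
    return out


if __name__ == "__main__":
    seq = [0, 0, 0, 0, 0, 0, 5, 0, 0, 1]
    pairs, n = rle_encode(seq)
    print(pairs)
    print(rle_decode(pairs, n) == seq)
```

On the example sequence this yields the pairs (5, 6) and (1, 2), and decoding them restores the original ten values.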

In terms of storage cost, the RLE representation stores the value and the count as a pair, and each pair might require 12 bits (8 bits for the value and 4 bits for the count).

With 10% outliers quantized to 8 bits, the average number of bits per weight is 1+(8+4)*0.1=2.2 (compression ratio relative to 16-bit weights: 1-2.2/16=86.3%). If the outliers are quantized to 4 bits instead, the average drops to 1+(4+4)*0.1=1.8 bits (compression ratio=1-1.8/16=88.8%).
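
The arithmetic above can be reproduced with a short Python sketch. The helper names `average_bits` and `compression_ratio` are hypothetical, and the 16-bit baseline is taken from the divisor in the formulas above.

```python
def average_bits(outlier_fraction, value_bits, count_bits=4, base_bits=1):
    """Average bits per weight: 1 bit for every weight, plus one
    (value, count) pair for each outlier."""
    return base_bits + (value_bits + count_bits) * outlier_fraction


def compression_ratio(avg_bits, baseline_bits=16):
    """Fraction of storage saved relative to the baseline width."""
    return 1 - avg_bits / baseline_bits


if __name__ == "__main__":
    for value_bits in (8, 4):
        bits = average_bits(0.1, value_bits)
        ratio = compression_ratio(bits)
        print(f"{value_bits}-bit outliers: {bits:.1f} bits/weight, "
              f"ratio {ratio:.2%}")
```

Unrounded, the two ratios are 86.25% and 88.75%.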
