
fix: handle full_output compression correctly #461

Open · wants to merge 3 commits into base: ft/compress

Conversation

whichqua (Collaborator)

Issue Number: N/A

Type

  • feature
  • bugfix
  • dev (no functional changes, no API changes)
  • fmt (formatting, renaming)
  • build
  • docs
  • testing

Description

This PR provides:

  • Fixes felt handling by using BigUint.
  • Preserves insertion order by replacing HashMap with IndexMap.
  • Fixes minor logical errors in the compress implementation.
  • Adds parameterised tests for the compress method.
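The HashMap-to-IndexMap change above matters because the compression emits unique values in first-seen order and stores pointers back into that ordering; an unordered map would scramble the pointers. A minimal sketch of that idea (not the PR's actual code), mimicking IndexMap with a Vec plus a HashMap index and using u64 in place of BigUint:

```rust
use std::collections::HashMap;

// Collect unique values in first-seen order, and for every input value
// record a pointer to its slot in that ordered unique list. This is the
// property IndexMap provides out of the box.
fn unique_in_order(values: &[u64]) -> (Vec<u64>, Vec<usize>) {
    let mut index: HashMap<u64, usize> = HashMap::new();
    let mut unique: Vec<u64> = Vec::new();
    let mut pointers: Vec<usize> = Vec::new();
    for &v in values {
        let i = *index.entry(v).or_insert_with(|| {
            unique.push(v); // first sighting: append, remember its slot
            unique.len() - 1
        });
        pointers.push(i);
    }
    (unique, pointers)
}

fn main() {
    let (unique, pointers) = unique_in_order(&[5, 3, 5, 7, 3]);
    assert_eq!(unique, vec![5, 3, 7]);
    assert_eq!(pointers, vec![0, 1, 0, 2, 1]);
}
```

With a plain HashMap the iteration order of `unique` would be unspecified, so the same input could decompress differently across runs.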

Breaking changes?

  • yes
  • no

@whichqua whichqua marked this pull request as ready for review January 13, 2025 15:01
@whichqua whichqua changed the title fix: handle kzg compression correctly fix: handle full_output compression correctly Jan 13, 2025
@whichqua whichqua requested a review from ftheirs January 13, 2025 15:03
@ftheirs ftheirs (Collaborator) commented Jan 13, 2025

This looks REALLY good 🚀 The check-orphan job is failing because it is using the old script. That will be fixed once this is merged into the other branch 💪

@ftheirs ftheirs self-requested a review January 13, 2025 17:04
@HermanObst HermanObst (Collaborator) left a comment

Looking great in general @whichqua.
As this is a very complex hint, I've asked for much more documentation and unit tests.

static ref HEADER_ELM_BOUND: BigUint = BigUint::one() << HEADER_ELM_N_BITS;
}

#[derive(Default, Clone, Debug)]

Add docs explaining what this struct is needed for, what data it contains, etc.

}

pub const COMPRESS: &str = indoc! {r#"from starkware.starknet.core.os.data_availability.compression import compress
fn get_n_elms_per_felt(elm_bound: &BigUint) -> usize {

Add docs explaining what this function does.
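For context on the question above, a hedged sketch of what `get_n_elms_per_felt` plausibly computes: how many values, each strictly below `elm_bound`, fit into a single felt. The 251-bit usable width is an assumption based on the Cairo field size, and u128 stands in for BigUint; the real code may round differently.

```rust
// Assumed usable bit width of a felt (Cairo field elements are ~252 bits;
// 251 bits are safely packable). This constant is an assumption.
const MAX_N_BITS: usize = 251;

fn get_n_elms_per_felt(elm_bound: u128) -> usize {
    if elm_bound <= 1 {
        return MAX_N_BITS; // degenerate bound: every packed "digit" is 0
    }
    // Bits needed to represent any value in [0, elm_bound).
    let n_bits = 128 - (elm_bound - 1).leading_zeros() as usize;
    MAX_N_BITS / n_bits
}

fn main() {
    assert_eq!(get_n_elms_per_felt(1 << 16), 15); // 251 / 16
    assert_eq!(get_n_elms_per_felt(2), 251);      // one bit per element
}
```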

}
}

#[derive(Default, Clone, Debug)]

Add docs on what this struct and its data are for.


let packed_bucket_index_per_elm = pack_in_felts(bucket_index_per_elm, TOTAL_N_BUCKETS as usize);
fn pack_in_felts(elms: &[BigUint], elm_bound: &BigUint) -> Vec<BigUint> {

Docs

.chain(packed_repeating_value_pointers.into_iter())
.chain(packed_bucket_index_per_elm.into_iter())
.collect::<Vec<Felt252>>();
fn pack_in_felt(elms: &[BigUint], elm_bound: &BigUint) -> BigUint {

Docs
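As a starting point for the requested docs, a hedged sketch of the packing that `pack_in_felt` appears to perform: treat each element (all strictly below `elm_bound`) as a base-`elm_bound` digit of one big number. The digit order (first element least significant) is an assumption, and u128 stands in for BigUint, so this sketch would overflow where the real code does not.

```rust
// Pack a chunk of elements into one number, base elm_bound.
fn pack_in_felt(elms: &[u128], elm_bound: u128) -> u128 {
    // Horner-style fold from the last element down, so elms[0] lands in
    // the least-significant digit.
    elms.iter().rev().fold(0u128, |acc, &e| acc * elm_bound + e)
}

fn main() {
    // 1 + 2 * 2^16 = 131073
    assert_eq!(pack_in_felt(&[1, 2], 1 << 16), 131073);
    // A single element packs to itself.
    assert_eq!(pack_in_felt(&[42], 1 << 8), 42);
}
```

`pack_in_felts` would then just chunk the input into groups of `get_n_elms_per_felt(elm_bound)` elements and apply this per chunk.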

if value.bits() as usize <= bucket.n_bits {
if bucket.contains(value) {
self.repeating_value_locations.push((*bucket_index, bucket.get_index(value)));
self.bucket_index_per_elm.push(self.repeating_values_bucket_index());

I don't follow why this is the len() of buckets and not the bucket_index.


@ftheirs @whichqua seems a bug

self.bucket_index_per_elm.push(*bucket_index);
}
break;

This break is to store the value in the first bucket that can store it, right? (and not continue adding it to other "bigger" buckets)
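The first-fit dispatch the comment above describes can be sketched as follows (a hedged illustration, not the PR's code): a value goes into the first bucket wide enough to hold it, and the `break` stops the scan so the value is not also added to larger buckets. The bucket widths here are illustrative, not the real constants.

```rust
// Return the index of the first bucket whose bit width can hold `value`.
fn first_fit_bucket(value: u64, bucket_n_bits: &[usize]) -> Option<usize> {
    let value_bits = (64 - value.leading_zeros()) as usize;
    for (i, &n_bits) in bucket_n_bits.iter().enumerate() {
        if value_bits <= n_bits {
            return Some(i); // plays the role of the `break` in the PR
        }
    }
    None // value too wide for every bucket
}

fn main() {
    let widths = [8, 16, 32];
    assert_eq!(first_fit_bucket(200, &widths), Some(0)); // fits in 8 bits
    assert_eq!(first_fit_bucket(300, &widths), Some(1)); // needs 9 bits
    assert_eq!(first_fit_bucket(u64::MAX, &widths), None);
}
```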

/// Returns a list of pointers corresponding to the repeating values.
/// The pointers point to the chained unique value buckets.
pub fn get_repeating_value_pointers(&self) -> Vec<usize> {
fn get_repeating_value_pointers(&self) -> Vec<BigUint> {

Also, let's add docs and unit tests.
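To make the doc comment above concrete, a hedged sketch of what `get_repeating_value_pointers` seems to return: each repeated value is recorded as a (bucket_index, index_in_bucket) location, and its pointer into the chained unique-value buckets is the sum of all earlier bucket lengths plus the local index. The helper below is hypothetical; the real method reads these from the struct's fields.

```rust
// Translate per-bucket locations into indices in the chained unique list.
fn repeating_value_pointers(
    bucket_lengths: &[usize],
    locations: &[(usize, usize)],
) -> Vec<usize> {
    // Prefix sums give the offset of each bucket in the chained list.
    let mut offsets = Vec::with_capacity(bucket_lengths.len());
    let mut acc = 0usize;
    for &len in bucket_lengths {
        offsets.push(acc);
        acc += len;
    }
    locations.iter().map(|&(b, i)| offsets[b] + i).collect()
}

fn main() {
    // Buckets of lengths 2 and 3 chain into global indices 0..5.
    let ptrs = repeating_value_pointers(&[2, 3], &[(0, 1), (1, 0), (1, 2)]);
    assert_eq!(ptrs, vec![1, 2, 4]);
}
```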

self.buckets.len()
}

fn update(&mut self, values: &[BigUint]) {

Let's add docs and some unit tests for this function.
It's complex to follow what it should be doing.

Comment on lines +278 to +285
#[rstest]
#[case::single_value_1(vec![1u32], vec!["1393796574908163946345982392040522595172352", "1", "5"])]
#[case::single_value_2(vec![2u32], vec!["1393796574908163946345982392040522595172352", "2", "5"])]
#[case::single_value_3(vec![10u32], vec!["1393796574908163946345982392040522595172352", "10", "5"])]
#[case::two_values(vec![1u32, 2], vec!["2787593149816327892691964784081045190344704", "65537", "40"])]
#[case::three_values(vec![2u32, 3, 1], vec!["4181389724724491839037947176121567785517056", "1073840130", "285"])]
#[case::four_values(vec![1u32, 2, 3, 4], vec!["5575186299632655785383929568162090380689408", "140740709646337", "2000"])]
#[case::extracted_kzg_example(vec![1u32, 1, 6, 1991, 66, 0], vec!["1461508606313777459023416562628243222268909453312", "2324306378031105", "0", "98047"])]

How were these test cases calculated? Could you add a link to where these tests live?

3 participants