partition_info ordering and partition ordering #80

Open
smcguire-cmu opened this issue Dec 5, 2023 · 2 comments

Comments

@smcguire-cmu
Contributor

In Dask DataFrames, the partitions should be ordered so that the hipscat index increases from one partition to the next. Some methods, such as polygon and cone filtering, order the partitions by the partition_info of the filtered hipscat catalog. Currently, hipscat catalog filtering generates the partition_info from the pixel tree, so the partitions come out in the correct order. In newly imported catalogs, however, the partition_info is sorted first by order and then by pixel number. The two orderings are inconsistent, and any code that assumes a particular ordering could break if we make changes.
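To make the difference concrete, here is a minimal sketch of the two orderings in plain Python. It does not use the lsdb or hipscat APIs. The helper `nested_start`, the reference order of 19, and the example pixels are assumptions made for illustration; the sketch relies only on the nested HEALPix property that pixel `p` at order `k` contains pixels `4p` to `4p + 3` at order `k + 1`.

```python
# Illustrative only: plain tuples of (order, pixel) in the nested HEALPix scheme.
REFERENCE_ORDER = 19  # assumed reference order for comparing pixels


def nested_start(order, pixel, reference_order=REFERENCE_ORDER):
    """First nested pixel at reference_order covered by (order, pixel).

    Hypothetical helper: each extra order splits a pixel into 4 children,
    so the first descendant is the pixel number shifted left by 2 bits per order.
    """
    return pixel << (2 * (reference_order - order))


# Made-up set of non-overlapping pixels at mixed orders.
pixels = [(2, 180), (1, 44), (2, 8), (1, 1)]

# Ordering described for newly imported catalogs: by order, then pixel number.
by_order_then_pixel = sorted(pixels)

# Ordering in which the spatial index increases between partitions.
by_spatial_index = sorted(pixels, key=lambda p: nested_start(*p))

print(by_order_then_pixel)  # [(1, 1), (1, 44), (2, 8), (2, 180)]
print(by_spatial_index)     # [(1, 1), (2, 8), (1, 44), (2, 180)]
```

In this example the two sorts disagree as soon as pixels of different orders interleave on the sky, which is the situation where an ordering assumption could fail.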

@hombit
Contributor

hombit commented Dec 7, 2023

It may be related to astronomy-commons/hats#163 and astronomy-commons/hats#168

@nevencaplar nevencaplar added this to the LSDB 0.2 milestone Dec 29, 2023
@delucchi-cmu delucchi-cmu self-assigned this Jan 11, 2024
@delucchi-cmu
Contributor

@smcguire-cmu - Can you confirm that you're seeing this kind of inconsistent behavior in the wild?

We're sorting the pixels by nested healpix ordering during the dataframe catalog loading process, and I'm not able to create a failing unit test to start development off of.

https://github.com/astronomy-commons/lsdb/blob/main/src/lsdb/loaders/dataframe/dataframe_catalog_loader.py#L140
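A check along the following lines might serve as a starting point for such a test. This is only a sketch under stated assumptions: `is_spatially_ordered` is a hypothetical helper, not part of lsdb or hipscat, and it takes a plain list of `(order, pixel)` tuples in partition order rather than a real catalog object. It also assumes no pixel has an order above the chosen reference order.

```python
def is_spatially_ordered(pixels, reference_order=19):
    """Return True if (order, pixel) tuples are in strictly increasing nested order.

    Hypothetical helper: compares the first nested pixel each entry covers
    at reference_order (an assumed value), so mixed-order lists are comparable.
    """
    starts = [pixel << (2 * (reference_order - order)) for order, pixel in pixels]
    return all(a < b for a, b in zip(starts, starts[1:]))


# Made-up partition lists: one in nested order, one sorted by (order, pixel).
nested_order = [(1, 1), (2, 8), (1, 44), (2, 180)]
order_then_pixel = [(1, 1), (1, 44), (2, 8), (2, 180)]

assert is_spatially_ordered(nested_order)
assert not is_spatially_ordered(order_then_pixel)
```

A real test would need to pull the pixel list out of the catalog in partition order, and how to do that is left open here.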

@delucchi-cmu delucchi-cmu removed their assignment Jan 31, 2024
@delucchi-cmu delucchi-cmu removed this from the LSDB 0.2 milestone Feb 23, 2024

4 participants