Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch from ngff-zarr to ome-zarr-models #80

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

imagejan
Copy link

@imagejan imagejan commented Jan 8, 2025

Link to Relevant Issue

See bioio-devs/bioio-base#25 (comment), as well as bioio-devs/bioio-ome-zarr#29.

This PR also makes #78 obsolete.

Description of Changes

The ngff-zarr dependency is replaced by ome-zarr-models, mapping the same metadata classes of the OME-NGFF spec.

@imagejan imagejan force-pushed the ome-zarr-models branch 2 times, most recently from 4ff121b to d2d64e7 Compare January 8, 2025 15:31
@imagejan imagejan marked this pull request as ready for review January 8, 2025 15:34
@imagejan imagejan requested a review from a team as a code owner January 8, 2025 15:34
@toloudis
Copy link
Contributor

toloudis commented Jan 8, 2025

Code changes look good, nice to see that it was a pretty simple change.

I am not super familiar with the ome-zarr-models repo yet.
One of the considerations here needs to be future versions of ome-zarr 0.5 and above. Any idea how soon this library or the Janelia pydantic one will have full implementation of 0.5? Do the maintainers of ome-zarr-models want to be aggressive about supporting published ome-ngff specs?

Since this code is used by our ome-zarr writer, we want, at least internally, to be ready to deal with sharded zarrv3 some time soon (vague, I know).

@toloudis
Copy link
Contributor

toloudis commented Jan 8, 2025

I just found context around the choice of package over here: bioio-devs/bioio-base#25 (comment)

@codecov-commenter
Copy link

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.09%. Comparing base (454181f) to head (d2d64e7).
Report is 111 commits behind head on main.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@             Coverage Diff             @@
##             main      #80       +/-   ##
===========================================
+ Coverage   78.54%   89.09%   +10.55%     
===========================================
  Files          17       21        +4     
  Lines         974     1531      +557     
===========================================
+ Hits          765     1364      +599     
+ Misses        209      167       -42     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@imagejan
Copy link
Author

imagejan commented Jan 9, 2025

Any idea how soon this library or the Janelia pydantic one will have full implementation of 0.5? Do the maintainers of ome-zarr-models want to be aggressive about supporting published ome-ngff specs?

ome-zarr-models-py aims to be a reference implementation for the NGFF spec.
See https://ome-zarr-models-py.readthedocs.io/en/latest/#design

Also, RFC authors are encouraged to implement proposed changes directly in ome-zarr-models-py to test them in practice, so I propose this would be the package of choice to support any new version of the spec.
See https://ome-zarr-models-py.readthedocs.io/en/latest/contributing/#implementing-ome-zarr-rfcs

Since this code is used by our ome-zarr writer, we want, at least internally, to be ready to deal with sharded zarrv3 some time soon (vague, I know).

Fully agreed with this goal 🙂

How strong is the need to support Python 3.9 and 3.10 in bioio?

ome-zarr-models-py is following the scientific python recommendations here:
https://scientific-python.org/specs/spec-0000/#support-window

I'd suggest that bioio-base and bioio might adapt that scheme as well. Happy to submit pull requests along that line if others agree.

@toloudis
Copy link
Contributor

toloudis commented Jan 9, 2025

The only reason right now that bioio should support 3.9 or 3.10 is to maintain compatibility with pre-exising user codebases that have already adopted bioio. Of course, we can use versioned bioio releases to drop old python compatibility.

I would love to advocate for following the scientific-python lifecycle.
We probably need to do an internal survey at AICS to see how much of our code uses the legacy versions of Python.

Copy link
Contributor

@SeanLeRoy SeanLeRoy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code changes seem good to me too! Does this drop support for 3.10/3.11 or was that just a side question @imagejan?

@imagejan
Copy link
Author

Does this drop support for 3.10/3.11 or was that just a side question

Yes, unfortunately, ome-zarr-models only support Python 3.11+, so I assume this might be a blocker?!

(My motivation for removing ngff-zarr is to simplify deployment of bioio to conda-forge.)

@toloudis
Copy link
Contributor

As much as I support this PR, I think supporting 3.9 and 3.10 is a requirement to continue to use this within AICS for now. We have to update other computational python code before we can restrict bioio in that way.

If this code change removes support for 3.9 and 3.10 then we have to modify pyproject.toml and the unit tests and possibly consider it a major version change, or say it very loudly in release notes..

We will have to decide how best to move forward!

@evamaxfield
Copy link
Contributor

As much as I support this PR, I think supporting 3.9 and 3.10 is a requirement to continue to use this within AICS for now. We have to update other computational python code before we can restrict bioio in that way.

If this code change removes support for 3.9 and 3.10 then we have to modify pyproject.toml and the unit tests and possibly consider it a major version change, or say it very loudly in release notes..

We will have to decide how best to move forward!

Is there anything coming down the pipeline for this repo / plugins that AICS internal is looking to use? If there isn't anything that needs upgrading in BioIO for AICS internal usage, I don't necessarily see the problem in allowing this PR to go through (and bumping the minimum supported versions) if there aren't features or bugfixes needed from AICS internally because everyone should be able to continue using the various BioIO versions already in use.

That said, if it is a case of waiting for current features and bugfixes to go through before updating min supported versions, here is the list of PRs open in the overarching project:

Maybe AICS Internal?

Maybe AICS External?

@toloudis
Copy link
Contributor

We have one or two major internal computational packages that need updating to 3.11+.

Future key work for bioio:

  • Internal: Hard to predict right now but there are lots of anticipated issues IF we ramp up our usage of non-zeiss imaging. (e.g. nd2, sldy, lif)

  • zarr v3 adoption is internal+external impact. Unclear when we will start on it but the sooner we start experimenting the better, because it affects storage efficiency.

  • Also we have a long standing issue with the way we read chunks only at whole-slice granularity, if I am remembering right. Which affects dask efficiency and the ability to read the same chunk sizes that we will write back to zarr. I think most of our zarr conversions could incur a rechunking on dask array.

@imagejan
Copy link
Author

FWIW, I changed the python support in this PR to >=3.11, <=3.13. Feel free to close (or leave around unmerged) if this doesn't fit the current plans for bioio.

I'd be happy with any route towards making bioio and all its dependencies available on conda-forge.

@toloudis
Copy link
Contributor

toloudis commented Jan 30, 2025

I guess I could be on board with:

  • basically version every package in bioio-devs at something like 1.99 (silly example number)
  • then cut a 2.0 that kills the older python support. (doesn't have to be a major version but you get the idea) Announce what we did.
  • move on with our lives
    BUT: I'm fairly sure that if AICS internal is stuck on old python, we will still need improvements to some of the bioio readers and make a zarrv3 writer AT LEAST.
    @pgarrison is there some kind of path to drag AICS internal python code up to >=3.11?

@pgarrison
Copy link

@pgarrison is there some kind of path to drag AICS internal python code up to >=3.11?

With the exception of cyto-dl and current projects that use cyto-dl, AFAIK most of our code that is stuck on Python < 3.11 is legacy code that probably doesn't need to be upgraded to the latest bioio version.

Even if bioio 2 drops support for Python < 3.11, we could consider backporting some bugfixes to bioio 1.x as necessary, though I know we'd much rather have a single version to maintain.

@toloudis
Copy link
Contributor

After one more round of internal discussions, personally I feel totally comfortable with moving ahead with this and basically following (or keeping up with) the scientific-python roadmap. I'm all in favor of it. I would just like us to be careful about communicating what is the last bioio version before changing the supported python version.

Copy link
Contributor

@toloudis toloudis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, maybe a followup pr for documentation about python version etc before we do a release

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants