-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: support bbox in in geometry #138
Conversation
@dabhicusp @naschmitz @alxmrs (+others; not sure who to cc) would you all be receptive to something like this? Any thoughts? |
xee/ext_integration_test.py
Outdated
@@ -41,17 +41,14 @@ | |||
] | |||
|
|||
|
|||
def _read_identity_pool_creds() -> identity_pool.Credentials: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to revert to this, but it was easier for me to get tests up and running with google.auth.default
, which looks for GOOGLE_APPLICATION_CREDENTIALS
by default anyways.
xee/ext_integration_test.py
Outdated
# This is off slightly due to bounds determined by geometry e.g. .getInfo() | ||
# seems to cause a super slight shift in the bounds. Thhe coords change before | ||
# and after the call to .getInfo()! | ||
np.testing.assert_almost_equal( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really want this to pass with assert_equal
.
This reverts commit 7efe992.
I removed changes around ee credentials/auth in case that was the issue with the previous workflow run failure. |
I am not sure what's up with failing ci workflows. Looks like it may be a secret issue. |
Thanks for the PR, Leonard! You can mostly ignore the ci workflows, they'll be run in a different environment. Note that currently we're on a biweekly issue and PR triage cycle. Thanks for your patience as we get this assigned for review! |
xee/ext.py
Outdated
@@ -40,6 +40,7 @@ | |||
from xarray.backends import store as backends_store | |||
from xarray.core import indexing | |||
from xarray.core import utils | |||
import xarray as xr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Xarray is already imported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use double precision sounds like a useful feature, but I am uncertain about match_xarray
. This doesn’t look like a standard part of rasterio — am I mistaken?
I would advise not modifying the base interface to achieve this kind of compatibility (given a review on mobile).
|
xee/ext.py
Outdated
@@ -1020,6 +1044,13 @@ def open_dataset( | |||
frameworks. | |||
ee_init_kwargs: keywords to pass to Earth Engine Initialize when | |||
attempting to auto init for remote workers. | |||
use_coords_double_precision: Whether to use double precision for coordinates |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should float64 just be the default? I think float32 was an offhand desire to save bytes, but it could be simply incorrect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Someone on the EE team will better be able to clarify.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think this might be a good idea. If people want float32, they can cast before they write things out.
I think this may exist already, or is close to existing. You don’t have to pass in ee.Geometry to set the geometry. Does proj’s CRS object provide one the way?
I like this idea! It makes sense that it would be related to the projection. |
Sorry, what may exist already or close to existing? Support for
Me, too, but I am not so sure there is a straightforward way to determine if the bounds provided are float32 or float64. I am starting to think that the safest/simplest thing would be to change to float64 for coords to avoid truncation, and add support for a |
Sorry Leonard, I have to admit I’m not looking that closely. I see now that
you’re right: the geometry arg should take a tuple of bounds.
No objections to your proposal from my end! Thanks for your contribution. :)
…On Fri, Feb 16, 2024 at 11:30 PM Leonard ***@***.***> wrote:
I think this may exist already, or is close to existing. You don’t have to
pass in ee.Geometry to set the geometry. Does proj’s CRS object provide one
the way?
Sorry, what may exist already or close to existing? Support for tuple[float,
float, float, float] type in geometry? As far as I understand, you can
pass geometry or None. If None, the bounds are taken from the CRS's area
of use
<https://pyproj4.github.io/pyproj/stable/api/crs/crs.html#pyproj.crs.CRS.area_of_use>.
This object doesn't have a set method for area_of_use as far as I can tell,
which makes sense 🤷 .
I like this idea! It makes sense that it would be related to the
projection.
Me, too, but I am not so sure there is a straightforward way to determine
if the bounds provided are float32 or float64.
I am starting to think that the safest/simplest thing would be to change
to float64 for coords to avoid truncation, and add support for a tuple[float,
float, float, float] type in geometry. That would let me precisely define
the dataset's transform. Thoughts? I can implement and push up those
changes if there are no major objections.
—
Reply to this email directly, view it on GitHub
<#138 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/AARXAB2V7NTXJ2ASLRZW6I3YT6CSZAVCNFSM6AAAAABDF2AQDOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNBYHAZDAMJWGQ>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
match_xarray
and use_coords_double_precision
@alxmrs I am not sure how you will feel about keeping geometry only or if you would prefer a bounds or bbox argument. I added the proposed changes. Also, apologies for the messy commit history, but the last three should be helpful in demonstrating things. |
xee/ext.py
Outdated
@@ -139,7 +139,7 @@ def open( | |||
crs: Optional[str] = None, | |||
scale: Optional[float] = None, | |||
projection: Optional[ee.Projection] = None, | |||
geometry: Optional[ee.Geometry] = None, | |||
geometry: Optional[Union[ee.Geometry, types.types.BBox]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it just be just types
and not types.types
?
xee/ext.py
Outdated
x_min_0, y_min_0, x_max_0, y_max_0 = _ee_bounds_to_bounds( | ||
self.get_info['bounds'] | ||
) | ||
elif isinstance(geometry, Union[List, Tuple, np.ndarray]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does typing.Sequence work here?
Consider adding a check for length of the sequence, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah! You have one :)
xee_dataset = xr.open_dataset( | ||
ee.ImageCollection(ic), | ||
engine='ee', | ||
geometry=tuple(raster.rio.bounds()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
xee/ext_integration_test.py
Outdated
ee.ImageCollection(ic), | ||
engine='ee', | ||
geometry=tuple(raster.rio.bounds()), | ||
scale=raster.rio.resolution()[0], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related: should we let users pass in x and y scale?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Is this possible somehow today, eg via proj?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is. Great call! I added this to the integration test in a2b3c55
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Added a couple of nits, but overall I really like this improvement. I think CI needs to pass for this to be merged.
xee/ext.py
Outdated
@@ -139,7 +139,7 @@ def open( | |||
crs: Optional[str] = None, | |||
scale: Optional[float] = None, | |||
projection: Optional[ee.Projection] = None, | |||
geometry: Optional[ee.Geometry] = None, | |||
geometry: Optional[Union[ee.Geometry, types.BBox]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider just using a union with None. Optional is like a Union, so this is somewhat redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you misread or I am misunderstanding. It sounds like you think I put Optional[Union[ee.Geometry, None]]
in which case, yes, agreed--redundant, but I added types.BBox. I actually am not sure if the type types.BBox
will display as tuple[float, float, float, float] in jupyter or something and I am somewhat inclined to change to Optional[Union[ee.Geometry, Tuple[float, float, float, float]] = None
to make it as clear as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented in 4c0dd6b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Union[ee.Geometry, types.BBox, None]
is equivalent to Optional[Union[ee.Geometry, types.BBox]]
and in my opinion a bit cleaner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for being unclear. I'm referring to the definition of Optional: https://docs.python.org/3/library/typing.html#typing.Optional
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I was confused, but your preference is clear now! Thanks. Updated in 23ea0c9
xee/ext.py
Outdated
x_min_0, y_min_0, x_max_0, y_max_0 = _ee_bounds_to_bounds( | ||
self.get_info['bounds'] | ||
) | ||
elif isinstance(geometry, Union[List, Tuple, np.ndarray, Sequence]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I meant can sequence replace the rest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing to consider is that sequence will let in a simple string e.g. isinstance("hotdog", Sequence)
evaluates to True. Still want me to drop all types in favor of Sequence alone? There is a part of me that is also weirded out by even using np.ndarray since that could be an incorrect shape that still evaluates to length four.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would like to be more stringent with types and changes this to only Union[List, Tuple]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented in 4c0dd6b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for checking. SGTM.
engine='ee', | ||
geometry=tuple(raster.rio.bounds()), | ||
projection=ee.Projection( | ||
crs=str(raster.rio.crs), transform=raster.rio.transform()[:6] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also nice!
@alxmrs thanks for another review! I am going to simplify types as I mentioned in comments and add a small demo in the readme. As far as CI, I think auth is failing because my user name does have the credentials? I am not sure if this is on my end or yours? I am guessing an author/google employee may need to trigger CI for auth to work correctly? |
I think @jdbcode or @naschmitz can help with CI. I’m not familiar with how it’s set up. |
@jdbcode or @naschmitz any tips? The tests workflow fails at authentication. |
@jdbcode @naschmitz kindly bumping |
Sorry for the delay. We're extremely understaffed right now. I'm going to take a look at your PR now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me! I'll approve once the changes are made.
xee/ext.py
Outdated
geometry: Optional[ee.Geometry] = None, | ||
geometry: Optional[ | ||
Union[ee.Geometry, Tuple[float, float, float, float]] | ||
] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you update the type annotation to the following. Geometry | Tuple | None
is preferred over Optional[Geometry | Tuple]
.
Union[ee.Geometry, Tuple[float, float, float, float], None]
Here and other places in the PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
implemented here b640d95
xee/ext.py
Outdated
try: | ||
x_min_0, y_min_0, x_max_0, y_max_0 = self.crs.area_of_use.bounds | ||
except AttributeError: | ||
# `area_of_use` is probable `None`. Parse the geometry from the first |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: probably
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
implemented here b640d95
xee/ext.py
Outdated
x_min_0, y_min_0, x_max_0, y_max_0 = _ee_bounds_to_bounds( | ||
self.get_info['bounds'] | ||
) | ||
elif isinstance(geometry, Union[List, Tuple]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be able to check against Sequence
here instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is fine, but a string is also sequence and this would catch and raise in that case. Updated anyways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
implemented here b640d95
xee/ext.py
Outdated
@@ -231,19 +235,31 @@ def __init__( | |||
# Parse the dataset bounds from the native projection (either from the CRS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider breaking this out into a separate function. This logic is much more complex now.
Looks like the only input is geometry
. Return tuple should be a 4-length tuple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added in b640d95
@naschmitz thanks for the review and no worries on the timeline. I'm just stoked to get it in as this is basically my most substantial OSS contribution to date after 6 years of startup life. Tests pass locally btw! Thanks again and let me know if there is anything else you would like to see 🙏 |
Any idea what is up with CI test step authentication? |
This pr adds support for matching an existing xarray dataset that adheres to rioxarray standards and to support float64 precision with coordinates.
Motivating Use Case of
match_xarray
:Motivating Use Case of
use_coords_double_precision
:.getInfo()
seems to change the values so slightly! This motivatednp.testing.assert_all_close
instead ofnp.testing.assert_equal
🦩