-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate to tiff crate #17
Conversation
580fbd7
to
10676f9
Compare
In case you want a reference for parsing the geokey tags, I have an implementation here: https://github.com/developmentseed/aiocogeo-rs/blob/main/src/geo_key_directory.rs (it doesn't use the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @gschulze for pulling this together! Just some minor comments for now, mostly related to variable names (yes, naming is hard), and a panic!
that could probably be removed.
I'll be doing a Rust pair programming session with a colleague later, and might try to convince him to go though this PR so there might be a few more comments coming!
src/lib.rs
Outdated
if sample >= *num_samples { | ||
panic!( | ||
"sample out of bounds: the number of samples is {} but the sample is {}", | ||
num_samples, sample | ||
) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we raise an error (e.g. TiffError::LimitsExceeded
) here instead of panicking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought this was similar to accessing an array element, which also panics if the index is out of bounds. Also, returning a result might make it more cumbersome to use as a library consumer. What should be checked in addition is whether x
and y
are less than width
and height
, respectively. However, this will change again if we integrate coordinate transformations.
If we decide on returning a result from this function, I'd vote for having a separate GeoTiffError
type. As far as I understand, the TiffError
deals with errors related to encoding/decoding, whereas here, we are handling a potential programming error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought this was similar to accessing an array element, which also panics if the index is out of bounds.
A panic can happen if you access using data[index]
, but not if you use data.get(index)
which returns an Option
(see https://doc.rust-lang.org/std/vec/struct.Vec.html#indexing and https://doc.rust-lang.org/std/primitive.slice.html#method.get.
What should be checked in addition is whether
x
andy
are less thanwidth
andheight
, respectively. However, this will change again if we integrate coordinate transformations.
Yes, ok with leaving x > width
and y > height
unchecked for now, and do it after coordinate transforms are implemented.
If we decide on returning a result from this function, I'd vote for having a separate
GeoTiffError
type. As far as I understand, theTiffError
deals with errors related to encoding/decoding, whereas here, we are handling a potential programming error.
Don't want us to get sidetracked with implementing GeoTIFFError
and GeoTIFFResult
(though I'm sure that will come at some point). For now, I'm happy to compromise on keeping the panic here since it matches previous behaviour, and we can refactor later to return a Result
instead of panic.
Edit: there's actually an issue at #6 that mentions panicking, which we could use to track this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just got distracted by the looks of GeoTIFFError
and GeoTIFFResult
. I think it should be written GeoTiffError
and GeoTiffResult
, according to the Rust naming conventions. Consequently, I believe the type in the crate should also be called GeoTiff
.
src/lib.rs
Outdated
pub struct GeoTIFF { | ||
pub dimensions: (usize, usize), | ||
pub num_samples: usize, | ||
image_data: ImageData, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another thought I had was whether we should just expose the tiff Decoder
, and have the attributes like dimensions
, num_channels
, etc be accessible via methods only. Thoughts on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that Decoder<R>
consumes an underlying stream of type R
, I think this would prevent us from implementing write support in the future.
For instance, I have a use case where I need to create a new, empty GeoTIFF, which is then populated with data. Finally, I want to save it to a file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes, I've been thinking in read-only mode for now, but you're correct that we can't put in Decoder<R>
if implementing write support later.
For instance, I have a use case where I need to create a new, empty GeoTIFF, which is then populated with data. Finally, I want to save it to a file.
Secretly, I would love to see this implemented soon, mostly just so we can write some roundtrip read/write tests and not have to store sample GeoTIFF files in this repo!
10676f9
to
1d3b66d
Compare
1d3b66d
to
9d74e3d
Compare
Hi @weiji14, I've incorporated your feedback and force-pushed the branch. I've renamed the I wonder whether we should rename Besides that, I think the PR is ready for merging. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @weiji14, I've incorporated your feedback and force-pushed the branch. I've renamed the
GeoTIFF
type toGeoTiff
, to follow the Rust naming conventions, if you don't mind.
Awesome, thanks @gschulze! I do need to check https://rust-lang.github.io/api-guidelines/naming.html more often 😄 Will merge this in after you do the ImageData
-> RasterData
rename (probably after the weekend).
On the force-pushing, I'm a little against it as a reviewer because it undos all the Viewed
checkboxes I tick in the 'Files Changed' tab, and somewhat forces me to have to carefully look over everything again if I'm checking things locally on my diff viewer or on GitHub mobile (it's a little better on the GitHub webpage where I can click on 'Compare'). Also, I tend to squash merge so many little commits don't matter, but if you prefer regular merges or rebase+merge, I can go with that since you've only got one commit here.
assert_eq!(geotiff.get_value_at::<u8>(761, 599, 0), 147); | ||
assert_eq!(geotiff.get_value_at::<u8>(761, 599, 1), 128); | ||
assert_eq!(geotiff.get_value_at::<u8>(761, 599, 2), 165); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/image_data.rs
Outdated
use std::fmt; | ||
use std::fmt::{Debug, Formatter}; | ||
|
||
pub(super) enum ImageData { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder whether we should rename
ImageData
toRasterData
to align more closely withraster_width
andraster_height
. What do you think?
Yep, ok with renaming to RasterData
.
I had a half-thought about marking this enum as non_exhaustive
in case we need to support complex numbers later (CInt16
, CInt32
, CFloat32
and CFloat64
are listed at https://gdal.org/en/latest/drivers/raster/gtiff.html#gtiff-geotiff-file-format), but https://docs.rs/tiff/0.9.1/tiff/decoder/enum.DecodingResult.html doesn't list those yet so nevermind for now.
Understandable. Just tell me the workflow you prefer. I usually try to do commits that are somehow semantically complete. That's why I try to avoid "Typo"-level commits. I'm okay with you squashing them together later. |
@weiji14 Renamed 'ImageData' to 'RasterData' |
This PR migrates everything to the
tiff
crate, doing significant simplification and cleanup along the way.TIFF
struct has been renamed toGeoTIFF
. The functionality inreader.rs
andlowlevel.rs
has been removed; reading TIFF data is accomplished through theread
function in the struct implementation, which directly calls the functionality provided by thetiff
crate.get_value_at<T>(x: usize, y: usize, sample: usize)
function. The desired output type is specified by the generic parameterT
. Thesample
parameter has been added to the function signature to allow retrieving data from multi-channel TIFFs. The function panics if the value is not representable as the desired output type or the requested sample is out of range.This refactoring aims to reduce the code to the bare minimum needed to provide the functionality covered by the integration tests. The next step is to apply the correct coordinate transformation defined by the corresponding GeoTIFF tags.