force-level1-landsat does not download all the found scenes and fails to populate the queue #320

Open
kelewinska opened this issue Apr 11, 2024 · 8 comments

@kelewinska

Hi,
I am observing a peculiar behavior when executing the force-level1-landsat command.
I ran the following command:

force-level1-landsat search --download /data/test3/aoi.txt -s OLI -d 20240101,20240131 --secret /home/user/secret.txt -q /data/test3/pool3.txt /data/test3

seeking Landsat OLI scenes acquired in January 2024 for Landsat tile 005-047 (Puerto Rico); my aoi.txt consists of the single line 005047.

The output is:

Sensor(s): OLI
Tile(s): 005047
Date range: 2024-01-01 to 2024-01-31
Included months: 1,2,3,4,5,6,7,8,9,10,11,12
Cloud cover: 0% to 100%

3 Landsat Level 1 scenes matching criteria found
3.14 GB data volume found
Downloading: 100%|=========================================================================================| 1/1 [00:57<00:00, 57.90s/product bundle]
Download complete

For some reason, only one of the three scenes found is downloaded and added to the queue file.

Upon the second execution of exactly the same command, the two remaining scenes are downloaded, but the queue file is not updated.

Sensor(s): OLI
Tile(s): 005047
Date range: 2024-01-01 to 2024-01-31
Included months: 1,2,3,4,5,6,7,8,9,10,11,12
Cloud cover: 0% to 100%

3 Landsat Level 1 scenes matching criteria found
3.14 GB data volume found
1 product bundles found in output directory, 2 not downloaded yet.
Remaining download size: 2.22 GB
Downloading: 100%|=========================================================================================| 2/2 [00:46<00:00, 23.34s/product bundle]
Download complete

Even if I point to a new queue file on the second execution, the file is created but remains empty. The .tar archives are downloaded correctly and the data are not corrupted.

I ran the same test for Landsat tiles 200-028 and 103-073 (selected randomly). For 200-028, the behavior was exactly as described above: a single scene was downloaded on the first run, the two remaining scenes were downloaded on the second run, and the queue file was not updated on the second run.
For the 103-073 tile, however, the first execution resulted in zero scenes being downloaded:

Sensor(s): OLI
Tile(s): 103073
Date range: 2024-01-01 to 2024-01-31
Included months: 1,2,3,4,5,6,7,8,9,10,11,12
Cloud cover: 0% to 100%

2 Landsat Level 1 scenes matching criteria found
2.35 GB data volume found
Downloading: 0product bundle [00:00, ?product bundle/s]
Download complete

Only the second execution of the command downloaded the data. A queue file was generated, but it remained empty.

Tile(s): 103073
Date range: 2024-01-01 to 2024-01-31
Included months: 1,2,3,4,5,6,7,8,9,10,11,12
Cloud cover: 0% to 100%

2 Landsat Level 1 scenes matching criteria found
2.35 GB data volume found
Downloading: 100%|=========================================================================================| 2/2 [00:39<00:00, 19.53s/product bundle]
Download complete

I am running FORCE v3.7.12 and tested it on two servers: one running Ubuntu 20.04.6 LTS (kernel Linux 5.4.0-172-generic, x86-64) and the other running Ubuntu 22.04.4 LTS (kernel Linux 5.15.0-91-generic, x86-64). In both cases, FORCE is executed in a Docker container from the davidfrantz/force:latest image, pulled from hub.docker.com; the name and tag match those used in the docker pull command.

On both servers, I observe exactly the same behavior: to download all the data I need to execute the command more than once, and the files downloaded on the second attempt are not added to the queue.
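As a possible stopgap, the queue file can be reconciled with the archives that are actually on disk. The following is only a minimal sketch under stated assumptions: it assumes the queue holds one line per archive in the form `<path> QUEUED` (or `DONE`), and that the .tar archives sit somewhere under the output directory. `sync_queue` is a hypothetical helper and not part of FORCE.

```python
from pathlib import Path


def sync_queue(download_dir: str, queue_file: str) -> list[str]:
    """Append archives that exist on disk but are missing from the queue.

    Assumption: each queue line is '<archive path> <STATUS>'.
    Returns the archive paths that were added.
    """
    queue = Path(queue_file)
    text = queue.read_text() if queue.exists() else ""

    # Paths already listed in the queue, whatever their status is.
    known = {
        line.rsplit(None, 1)[0]
        for line in text.splitlines()
        if line.strip()
    }

    # All .tar archives below the download directory, as absolute paths.
    archives = sorted(str(p.resolve()) for p in Path(download_dir).rglob("*.tar"))
    missing = [a for a in archives if a not in known]

    with queue.open("a") as fh:
        # Avoid gluing the first new entry onto an unterminated last line.
        if text and not text.endswith("\n"):
            fh.write("\n")
        for path in missing:
            fh.write(f"{path} QUEUED\n")
    return missing
```

With the paths from the command above, this would be called as `sync_queue("/data/test3", "/data/test3/pool3.txt")`. Entries are compared as absolute paths, so a queue that stores relative paths would need to be normalized first.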

Furthermore, when I just ran

force-level1-landsat search /data/test3/aoi.txt -s OLI -d 20240101,20240131 --secret /home/user/secret.txt /data/test3

the download links were also generated for only one image on the first run. The links for the two remaining files were created (in a new file) upon the second execution of the command.

Has anyone come across anything like this? Do you have any idea what the issue might be and how to troubleshoot it? It is easy to spot when only a handful of images need to be downloaded, but it becomes a pain for bigger data pulls and processing.

Thank you in advance!
Kasia

@davidfrantz
Owner

... huh

I never noticed this myself, but I can reproduce it 100%...

@ernstste, do you have some idea why this could happen?

Best,
David

@ernstste
Collaborator

I spent some time looking into this and it's a bit tricky.
The way the API responds to download requests seems to have changed. The results now come in two different objects ('available downloads' and 'preparing downloads'), and not all of the results contain the product ID or any other identifier that would let us determine which product is behind a given URL.

Making sure that we get all downloads with the first request was easy to implement. Creating the queue files, however, is a different story. I had already implemented a workaround, but then realized that the API can change the order of the returned results for download requests, so we have no way of knowing which URL belongs to which product. I'll have to look into another solution.
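To illustrate the two problems, here is a minimal sketch, not the actual fix in FORCE. The response keys `availableDownloads` and `preparingDownloads`, the `url` field, and both function names are assumptions made for illustration. The first function merges both objects so no download is missed; the second only assigns a product when an identifier appears unambiguously in the entry, and never falls back to list position.

```python
from typing import Optional


def collect_downloads(response: dict) -> list[dict]:
    """Merge ready and still-preparing entries, deduplicated by URL."""
    merged: dict[str, dict] = {}
    # Assumed key names for the two objects in the API response.
    for key in ("availableDownloads", "preparingDownloads"):
        for entry in response.get(key) or []:
            url = entry.get("url")
            if url:
                merged.setdefault(url, entry)
    return list(merged.values())


def match_product(entry: dict, product_ids: list[str]) -> Optional[str]:
    """Return the product ID found in an entry, or None if it is unclear.

    The order of the results is deliberately ignored, because the API
    may return them in a different order than they were requested.
    """
    haystack = " ".join(str(value) for value in entry.values())
    hits = [pid for pid in product_ids if pid in haystack]
    return hits[0] if len(hits) == 1 else None
```

Entries for which `match_product` returns `None` are exactly the ones that make building the queue file difficult: the URL can be downloaded, but the resulting file name is only known once the download has finished.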

@ernstste
Collaborator

Okay, I think it should all be fixed now. I'm not exactly thrilled by the changes, but it should work. It would be great if you could confirm, as I have very limited time for testing right now.

@ernstste
Collaborator

@davidfrantz
Owner

Thanks so much Stefan!

No, I don't think so. It might be a problem with the dispatch action. The signal didn't arrive in the base repository.

I will try to have a look next week.

@ernstste
Collaborator

Ah true, it's actually the credentials for the dispatch that seem to fail.
I saw you updated the version of the dispatch action and did the same, but unfortunately the result has not changed.
Let me know if something needs to be changed on my end once you have had the chance to look into it next week.

@davidfrantz
Owner

I cannot see anything obvious. I fear we need to sit together to solve this...

@ernstste
Collaborator

Sure, feel free to give me a call and we'll see what we can do.
