Image loading in dataloader code #108

jshtok · 2023-09-06T13:02:15Z

Hello, and thank you very much for the contribution of the code.
While running your code, I noticed that only the first page of each PDF file is loaded (as an image). Indeed, in your class PregeneratedDatasetBase, the add_images() routine contains the line
im = convert_from_path(im_path)[0]
whereas in the original DUE code the _get_page_img() routine uses the page_no field to fetch the relevant page.

Could you please explain this discrepancy?
Thank you!

Coobiw · 2023-09-06T17:04:28Z

Hello, I'm also working on this! I'm curious how to get the images and the corresponding Q-A pairs. Do you have any experience with that?

jshtok · 2023-09-06T18:17:21Z

Well, I just took the DUE repo as a reference and brute-force fixed the dataloader to fetch the relevant page.
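
A minimal sketch of that kind of fix, assuming the dataloader relies on pdf2image's convert_from_path (as the line quoted in the first post suggests) and that page_no is zero-based. The helper name load_page_image and its converter parameter are hypothetical and are not part of the UDOP or DUE code.

```python
def load_page_image(pdf_path, page_no, converter=None):
    """Return the image of one PDF page instead of always the first page.

    Assumptions (not taken from the UDOP or DUE code): page_no is
    zero-based, and converter accepts the same keyword arguments as
    pdf2image.convert_from_path.
    """
    if page_no < 0:
        raise ValueError(f"page_no must be non-negative, got {page_no}")
    if converter is None:
        # Imported lazily so the helper can be exercised without poppler.
        from pdf2image import convert_from_path as converter
    # pdf2image numbers pages from 1, so shift the zero-based index by one.
    # Requesting a single page also avoids rendering the whole document.
    pages = converter(pdf_path, first_page=page_no + 1, last_page=page_no + 1)
    if not pages:
        raise IndexError(f"{pdf_path} has no page with index {page_no}")
    return pages[0]
```

In add_images(), the quoted line would then become something like im = load_page_image(im_path, page_no), with page_no taken from wherever the dataset record stores it. If page_no turns out to be one-based in your data, drop the + 1.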

Coobiw · 2023-09-07T07:39:32Z

Thanks for your reply! So, as I understand it, you first generate the memmaps, then use the memmaps to build the dataloader, and save the results (image-QA pairs) to a file in a format such as JSON?

jshtok · 2023-09-09T15:22:49Z

Yes, I used the DUE-baselines repo to generate the memmaps, then did the finetuning, and then the --evaluate option in the train config saves the predictions. Please note that you need to convert these predictions to another format (back in DUE-baselines) in order to run them against the GT annotations in the DUE-evaluator repo. Then, finally, you get the performance numbers.

Coobiw · 2023-09-09T15:57:52Z

Thank you very much! I want to convert the memmaps into PNG files because I want to use the images as inputs. Which preprocessing repo did you use: the benchmarker in the UDOP repo or the original one in DUEBenchmark/baselines?
