Document Classification Metric (UDOP) #93

Open
ofir1080 opened this issue Jul 30, 2023 · 2 comments

@ofir1080

Hello,
Thank you very much for sharing the code for this amazing work.
I have been trying to reproduce and evaluate document classification. I noticed that you encode the class names and those represent your labels. However, this creates sequences of 2-3 tokens. How do you use these to evaluate accuracy?

Thank you very much

@ofir1080
Author

ofir1080 commented Aug 1, 2023

I manually decoded the outputs and mapped the resulting strings to class labels (which does not seem intuitive). However, the results were sometimes close but not exact; here are some examples where the model "failed":

news
advertisementwritten
news
scientific
formwritten
presentationwritten
news
news
file
scientific
file report
news
news
file
scientific
email report
file
presentationwritten
file report
scientific
scientific
scientific
hand
advertisement article
news
scientific
form folder
news
file
presentationwritten
scientific
scientific
scientific article
presentation report
presentation article
news
scientificwritten
file
scientificwritten
hand
scientific

How do you deal with such cases?
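
For reference, a minimal sketch of the decode-and-map scoring described above (the label list and all variable names are illustrative assumptions, assuming a Hugging Face style tokenizer, not code from this repo):

```python
# Sketch of decode-and-map scoring for generated class names.
# Label list and names here are illustrative assumptions, not repo code.
RVL_CDIP_LABELS = {
    "letter", "form", "email", "handwritten", "advertisement",
    "scientific report", "scientific publication", "specification",
    "file folder", "news article", "budget", "invoice",
    "presentation", "questionnaire", "resume", "memo",
}

def decode_and_map(generated_ids, tokenizer, labels=RVL_CDIP_LABELS):
    """Decode generated token ids and keep the string only if it is a known label."""
    text = tokenizer.decode(generated_ids, skip_special_tokens=True).strip().lower()
    return text if text in labels else None  # unmapped strings count as errors

def accuracy(batch_generated_ids, gold_labels, tokenizer):
    correct = sum(
        decode_and_map(ids, tokenizer) == gold.lower()
        for ids, gold in zip(batch_generated_ids, gold_labels)
    )
    return correct / len(gold_labels)
```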

Another thing: are the checkpoints you supplied fine-tuned on RVL-CDIP? It seems that you use it for the example_io notebook.
Thanks

@zinengtang
Collaborator

For example, if "scientificwritten" is tokenized into 2-3 tokens, then the evaluation is an exact match over those tokens. The model should predict all of the subtokens, and a prediction evaluates to correct only if all tokens match.
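
A minimal sketch of that exact-match check (assuming a Hugging Face style tokenizer; names are illustrative, not from the repo):

```python
def exact_match_accuracy(batch_generated_ids, gold_labels, tokenizer):
    """A prediction is correct only if every generated subtoken matches the gold
    label's subtokens, i.e. the decoded string equals the label string exactly."""
    correct = 0
    for ids, gold in zip(batch_generated_ids, gold_labels):
        pred = tokenizer.decode(ids, skip_special_tokens=True).strip()
        correct += int(pred == gold.strip())
    return correct / len(gold_labels)
```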
