[PR] Adding benchmarks between models #31

Merged
merged 49 commits into from
Dec 24, 2023

Conversation

LuchoTurtle
Member

closes #12

This PR will create a benchmark comparing some of the multimodal models available in Bumblebee that support image captioning.

I'm going to run a performance test with the COCO dataset (perhaps the most famous open-source labelled set) to evaluate the performance of each model in Elixir.

Although the benchmark tests will be run in Elixir, the results will be exported to a file and then processed with Python, because it has more libraries for NLP metric evaluation (R was also considered, but Python is more beginner-friendly for anyone who's curious about this repo).

To measure model performance, I'll try to get scores on different metrics: BLEU, CIDEr, METEOR, SPICE and ROUGE-L. Not all of these will necessarily be measured (BLEU and ROUGE are probably the most relevant), but the others are worth mentioning as alternative routes.
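
For readers unfamiliar with these metrics, here is a minimal sketch of the two named as most relevant. It is not the script used in this PR: the function names are hypothetical, tokenization is a plain lowercase whitespace split, ROUGE-L is reported as a simple F1 score (precision and recall weighted equally), and only the clipped n-gram precision at the core of BLEU is shown, without the brevity penalty or the averaging over n-gram orders.

```python
from collections import Counter


def tokenize(text):
    # Simplifying assumption: lowercase and split on whitespace only.
    return text.lower().split()


def lcs_length(a, b):
    # Dynamic programming for the longest common subsequence length,
    # keeping only the previous row of the table.
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    # ROUGE-L as an F1 score over the LCS (assumption: equal weighting
    # of precision and recall).
    cand, ref = tokenize(candidate), tokenize(reference)
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def modified_precision(candidate, reference, n=1):
    # Clipped n-gram precision: each candidate n-gram is credited at most
    # as many times as it appears in the reference.
    cand, ref = tokenize(candidate), tokenize(reference)
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    total = sum(cand_ngrams.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_ngrams[gram]) for gram, count in cand_ngrams.items())
    return clipped / total


if __name__ == "__main__":
    generated = "a dog runs on the grass"
    ground_truth = "a dog is running on the grass"
    print(rouge_l_f1(generated, ground_truth))
    print(modified_precision(generated, ground_truth, n=1))
    print(modified_precision(generated, ground_truth, n=2))
```

In practice an established library would be preferable to hand-rolled code like this, since published scores depend on tokenization, smoothing and multi-reference handling that the sketch leaves out.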

@LuchoTurtle LuchoTurtle added documentation Improvements or additions to documentation enhancement New feature or enhancement of existing functionality in-progress An issue or pull request that is being worked on by the assigned person epic A feature idea that is large enough to require a sprint (5 days) or more and has smaller sub-issues. labels Dec 12, 2023
@LuchoTurtle LuchoTurtle self-assigned this Dec 12, 2023

codecov bot commented Dec 12, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (9d38d67) 100.00% compared to head (8ecf891) 100.00%.
Report is 36 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##              main       #31   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            2         3    +1     
  Lines           54        76   +22     
=========================================
+ Hits            54        76   +22     


@LuchoTurtle
Member Author

I think this PR is getting big already and it provides a good foundation to add more benchmarks in the future - it's already fairly streamlined.

Finding out which models Bumblebee supports is a bit cumbersome. I'm using https://jonatanklosko-bumblebee-tools.hf.space/apps/repository-inspector to check, and support for image captioning (not to be confused with image classification) is fairly limited. It seems that, of the top 10 most downloaded models on Hugging Face, only BLIP-base and BLIP-large are supported.

I've added benchmarks for ResNet-50 because, even though it is an image classification model, it sometimes yields proper captions. However, its metric scores are poor compared with the others, which is expected.

I'm submitting for review now 👌

@LuchoTurtle LuchoTurtle marked this pull request as ready for review December 21, 2023 00:18
@LuchoTurtle LuchoTurtle added awaiting-review An issue or pull request that needs to be reviewed and removed in-progress An issue or pull request that is being worked on by the assigned person labels Dec 21, 2023
@LuchoTurtle LuchoTurtle assigned nelsonic and unassigned LuchoTurtle Dec 21, 2023
Member

@nelsonic nelsonic left a comment

Great additions @LuchoTurtle 👌

@nelsonic nelsonic merged commit f9890ba into main Dec 24, 2023
3 checks passed
@nelsonic nelsonic deleted the evaluation branch December 24, 2023 17:26
Labels
awaiting-review An issue or pull request that needs to be reviewed documentation Improvements or additions to documentation enhancement New feature or enhancement of existing functionality epic A feature idea that is large enough to require a sprint (5 days) or more and has smaller sub-issues.
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Feat: Comparing Pre-trained Image Classification Models
2 participants