About reproducing Table 2 #49

Open
SeongRyong0726 opened this issue Jan 1, 2025 · 9 comments

@SeongRyong0726

Hi!
Thank you for your great work.

I am interested in this paper, and before studying it in depth, I want to reproduce Table 2 (which uses the COIN and Ego4D LTA benchmarks).
However, I could not find instructions for reproducing the Table 2 results.
I don't need to run training (I also only have a single 3090 GPU); I just want to evaluate the two benchmarks with a pretrained model.

If you could point me to the relevant documentation or the steps to follow, I would be very grateful.

Thank you!
Happy new year.

@SeongRyong0726
Author

The demo runs well.
I then ran the /scripts/coin/live1+_evaluate.sh script, and the errors suggest that the model and dataset are not prepared.

First, I hit ValueError: Can't find 'adapter_config.json' at 'outputs/coin_benchmarks/live1+/', so I changed --resume_from_checkpoint from 'outputs/coin_benchmarks/live1+/' to chenjoya/videollm-online-8b-v1plus, which I used in the demo.
Then I hit FileNotFoundError: [Errno 2] No such file or directory: 'datasets/coin/videos_2fps_384_1+3x3_google--siglip-large-patch16-384'

So I believe I have to prepare the dataset and model properly.

However, I couldn't figure out the exact steps.

Thank you in advance.
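
For reference, the missing directory name itself hints at the preprocessing the script expects: frames sampled at 2 fps at resolution 384, encoded by google/siglip-large-patch16-384 into 1 + 3x3 spatial tokens per frame. Below is a minimal sketch of such a preprocessing pass; the per-video .pt output format, the batching, and the helper name encode_video are my assumptions, so the repo's own preprocessing code should be treated as authoritative.

import glob, os, subprocess
import torch
from PIL import Image
from transformers import AutoProcessor, SiglipVisionModel

ENCODER = "google/siglip-large-patch16-384"  # encoder named in the directory
OUT_DIR = "datasets/coin/videos_2fps_384_1+3x3_google--siglip-large-patch16-384"

processor = AutoProcessor.from_pretrained(ENCODER)
model = SiglipVisionModel.from_pretrained(ENCODER, torch_dtype=torch.float16).cuda().eval()

def encode_video(video_path, video_uid, fps=2):
    # 1) Sample frames at 2 fps with ffmpeg (assumed to be on PATH).
    frame_dir = f"/tmp/{video_uid}"
    os.makedirs(frame_dir, exist_ok=True)
    subprocess.run(["ffmpeg", "-i", video_path, "-vf", f"fps={fps}",
                    f"{frame_dir}/%06d.jpg", "-loglevel", "error"], check=True)
    # 2) Encode frames in small batches; the processor resizes them to 384x384.
    paths = sorted(glob.glob(f"{frame_dir}/*.jpg"))
    feats = []
    with torch.no_grad():
        for i in range(0, len(paths), 32):  # small batches fit a single 3090
            images = [Image.open(p).convert("RGB") for p in paths[i:i + 32]]
            pixel_values = processor(images=images, return_tensors="pt").pixel_values
            out = model(pixel_values=pixel_values.half().cuda())
            feats.append(out.last_hidden_state.cpu())  # per-frame patch tokens
    # 3) Save one tensor per video; pooling the patch grid down to the
    #    1 + 3x3 tokens is assumed to happen in the model/dataloader.
    os.makedirs(OUT_DIR, exist_ok=True)
    torch.save(torch.cat(feats), os.path.join(OUT_DIR, f"{video_uid}.pt"))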

@chenjoya
Collaborator

chenjoya commented Jan 4, 2025

Hello, thanks for your interest & happy new year! Sorry for the late response. It seems that you want to evaluate the model's performance on COIN and Ego4D LTA. Let me clarify first: Table 2 is obtained by fine-tuning specifically on each benchmark's training set, i.e., the model is trained on the COIN / Ego4D LTA training set and then evaluated on the COIN / Ego4D LTA test set.

chenjoya/videollm-online-8b-v1plus is not the model for that. We did not release those two checkpoints, as they were just downstream experiments. But if you want to retrain the model on these sets, I am very happy to help!

@SeongRyong0726
Author

Thank you for your reply!
My understanding is that I have to start from the baseline model and fine-tune it on each dataset; then I can get the evaluation results, right?

My concern is that, in other issues, the hyperparameter values seem to differ across GPU configurations.
I think that if the authors released the fine-tuned model weights, I could easily check the evaluation results and move on to the next step.

My reasons for thinking this are:

  1. Table 2 is the only way to check the performance of your model on these benchmarks.
  2. It is easier to check performance with a released fine-tuned model than to search for hyperparameters by trial and error and fine-tune again.

I'm not an expert in this area, so I may have some things wrong. Feel free to correct me!

So, could you tell me whether the fine-tuned checkpoints are available? If not, could you tell me how I can retrain?

Thank you so much!!!

@chenjoya
Collaborator

chenjoya commented Jan 7, 2025

Hi, thanks for your question!

start from the baseline model and fine-tune it on each dataset

Actually we do not need to start from the baseline model. We just directly fine-tune the newly initialized model and get the results.

check the performance of your model

The results are reliable, please rest assured. For reproducing COIN, please check #26. For Ego4D LTA, the results are obtained by submitting the inference outputs to the official server, which performs the evaluation (https://eval.ai/web/challenges/challenge-page/1598/leaderboard/3881/Action).

These two experiments were conducted a long time ago, and since they were not very important, the checkpoints have already been cleared. Sorry about that. Furthermore, we cannot release the COIN checkpoint due to company policy. If you really need the Ego4D LTA checkpoint, I can consider redoing the experiment ;)

@SeongRyong0726
Author

To prepare the COIN dataset, I first used the COIN.json file from https://github.com/coin-dataset/annotations/blob/master/COIN.json.
However, I notice that the COIN class (/data/coin/coin.py) checks for 'train' and 'test' as the 'subset' value, while in the file those values are "training" and "testing".
I believe that is the official file (uploaded years ago); is there a different file for this repo? (Also, are there other files I should get?)
Sorry, this is my first time training a model.

Thank you in advance!

@chenjoya
Collaborator

chenjoya commented Jan 8, 2025

Hi! Yes, we use this file; it is the correct one. The split is an argument we pass ourselves, either train or test. This naming keeps it compatible with other datasets, and the actual usage is split in anno['subset'].lower():

} for video_uid, anno in annos.items() if (split in anno['subset'].lower()) and (video_uid in self.metadata)]

So it should be okay.
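
To make the substring match concrete, here is a minimal illustration (the anno dict is a hypothetical stand-in for a single COIN.json entry):

anno = {"subset": "training"}           # value as stored in the official COIN.json
split = "train"                         # argument passed by the dataloader
print(split in anno["subset"].lower())  # True: "train" is a substring of "training"
print("test" in "testing")              # likewise, 'test' matches "testing"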

@SeongRyong0726
Author

Hi!
I tried to download the COIN dataset, but some of the videos are private and some did not download properly.
Also, I don't have enough resources.
Could I get the Ego4D LTA checkpoint for reproducing the results? (I really want it!)

If you could do this, I would really appreciate it!
Thank you in advance.

@chenjoya
Collaborator

Ok, I will help by retraining an Ego4D LTA checkpoint.

The missing COIN videos are okay; we also do not have the full set.

@SeongRyong0726
Author

Thanks a lot! When can I get it?
Have a good holiday.
