
Inquiry Regarding Experiment with Aristo-RoBERTa Encoder on OBQA Dataset in GreaseLM Paper #15

Open
EchoDreamer opened this issue Mar 11, 2024 · 7 comments

@EchoDreamer

I am trying to reproduce the experimental results reported in the GreaseLM paper, specifically those obtained with the Aristo-RoBERTa encoder on the OBQA dataset.

Despite multiple attempts, I have been unable to match the reported performance. Could you share more details about the hyperparameters used in this experiment?

Any guidance would be greatly appreciated. Thank you for your attention to this matter.

@chit-ang

Have you succeeded in replicating it so far?

@EchoDreamer
Author

Not yet. Do you have any thoughts on this?

@chit-ang

Were you able to reproduce the results with roberta-large?

@chit-ang

While replicating the roberta-large results, I was unable to load the model directly through the transformers library. I downloaded roberta-large from FacebookAI on Hugging Face instead, but the training result was very poor, only a bit over 30% accuracy. Have you encountered this problem?
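For reference, here is a minimal sketch of the two usual ways to load roberta-large with transformers. The local path is a placeholder; an incomplete or mismatched local download (which leaves parts of the encoder randomly initialized) is one plausible cause of near-chance accuracy on a 4-way multiple-choice task like OBQA.

```python
# Minimal sketch: loading roberta-large by hub ID or from a local copy.
from transformers import AutoModel, AutoTokenizer

# Option 1: load by hub ID (requires access to huggingface.co).
tokenizer = AutoTokenizer.from_pretrained("FacebookAI/roberta-large")
model = AutoModel.from_pretrained("FacebookAI/roberta-large")

# Option 2: load from a manually downloaded copy. "./roberta-large" is a
# placeholder path; the directory needs config.json, the weight file, and
# the tokenizer files (vocab.json, merges.txt). Note that mismatched
# checkpoint keys only trigger a warning and leave those weights randomly
# initialized, which can silently collapse accuracy.
local_dir = "./roberta-large"
tokenizer = AutoTokenizer.from_pretrained(local_dir)
model = AutoModel.from_pretrained(local_dir)
```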

@EchoDreamer
Author

> While replicating the roberta-large results, I was unable to load the model directly through the transformers library. I downloaded roberta-large from FacebookAI on Hugging Face instead, but the training result was very poor, only a bit over 30% accuracy. Have you encountered this problem?

I downloaded the roberta-large model directly from the Hugging Face mirror site (https://hf-mirror.com/) and then trained with it; the results were close to those reported in the original paper.
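In case it helps others, here is a minimal sketch of routing downloads through hf-mirror.com via the HF_ENDPOINT environment variable (the override documented by the mirror site). This assumes a huggingface_hub version recent enough to honor it; the variable must be set before the library is imported, since the endpoint is read at import time.

```python
# Minimal sketch: pull roberta-large through the hf-mirror.com mirror.
import os

# HF_ENDPOINT is read when huggingface_hub is imported, so set it first.
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"

from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModel.from_pretrained("roberta-large")
```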

@chit-ang

Would it be convenient to exchange contact information? I have also been reproducing this paper recently, and we could discuss it together.

@biegekekeke

> Would it be convenient to exchange contact information? I have also been reproducing this paper recently, and we could discuss it together.

How much GPU memory is needed to reproduce this paper?
