PR #35438 introduced a new bug #35649
Comments
PR link: #35438
I think PR #35438 should be reverted, and the proper way to fix the bug mentioned in that PR is as follows. In the following code, But the true meaning of this code is to scale the loss when the gradient accumulation (GA) bug fix is not performed. This is not identical to
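To illustrate the point about scaling the loss only when the GA fix is not performed, here is a minimal sketch in plain Python. It is not the Trainer's actual code: the function `accumulated_loss` and its parameters are hypothetical names chosen for this example, and the per-micro-batch loss sums and token counts are made-up numbers.

```python
def accumulated_loss(loss_sums, token_counts, ga_fix_applied, scale_by_steps):
    """Sum the per-micro-batch losses over one accumulation cycle.

    loss_sums:      summed token losses of each micro-batch (hypothetical data)
    token_counts:   number of tokens in each micro-batch
    ga_fix_applied: if True, each loss is normalized by the token count of the
                    whole accumulated batch; otherwise by its own token count
    scale_by_steps: if True, each loss is also divided by the number of
                    accumulation steps
    """
    steps = len(loss_sums)
    total_tokens = sum(token_counts)
    total = 0.0
    for loss_sum, n_tokens in zip(loss_sums, token_counts):
        if ga_fix_applied:
            # Already normalized over the whole accumulated batch.
            loss = loss_sum / total_tokens
        else:
            # Plain per-micro-batch mean.
            loss = loss_sum / n_tokens
        if scale_by_steps:
            loss = loss / steps
        total += loss
    return total


loss_sums = [4.0, 6.0]
token_counts = [2, 6]

# GA fix applied, no extra scaling: the true mean over all 8 tokens.
fixed = accumulated_loss(loss_sums, token_counts, True, False)
# GA fix not applied, so the loss is scaled by the number of steps.
unfixed = accumulated_loss(loss_sums, token_counts, False, True)
# GA fix applied AND scaled again: the loss is too small by a factor of steps.
double_scaled = accumulated_loss(loss_sums, token_counts, True, True)
```

In this sketch the scaling by the number of steps is only appropriate in the second case; applying it on top of the global normalization shrinks the loss.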
Thanks! Would you like to make a PR for this? Else I can do so today
@muellerzr Thanks for the reply. I will open a PR today.
Hi @techkang, our rigorous experiments in #35438 provide evidence that #35121 introduced a bug that makes the loss of the Qwen2VL model incorrect. I think we should not only focus on models with a loss function but also pay attention to the models without
@techkang Yep, the new PR looks better to me, let us perform some experiments on it
System Info
(base) MBP-HD6JD9Q599-2052 :: ~/code/transformers ‹main*› % transformers-cli env 1 ↵
Copy-and-paste the text below in your GitHub issue and FILL OUT the two last points.
transformers version: 4.49.0.dev0
Who can help?
@muellerzr @hiyouga @ArthurZucker @SunMarc
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
Expected behavior
Test Passed.