
About the training loss #22

Open
yenanjing opened this issue May 6, 2023 · 2 comments

@yenanjing

Hello,
(1) In the training loss, besides the two contrastive losses, is the generation task's L(task) just the cross-entropy loss (i.e. torch.nn.CrossEntropyLoss)?
(2) The original T5 paper mentions a BERT-style masking objective similar to SpanBERT's. Does your paper apply this objective as well, or only the seq2seq objective?
Thanks a lot!

@LeMei
Owner

LeMei commented May 6, 2023

(1) Yes. Each decoding step of the generated sequence is still a classification task, so the cross-entropy loss is used.
(2) The BERT-style, SpanBERT-like masking loss mentioned in the original T5 paper should be the objective used to pretrain T5 itself. We fine-tune on top of T5, and the loss we use is the supervised loss (the seq2seq objective) plus the unsupervised contrastive losses.
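
For concreteness, here is a minimal sketch of this combined objective, assuming a HuggingFace transformers T5 fine-tuning setup and a generic in-batch InfoNCE formulation for the contrastive term. The temperature, the 0.1 weight, and the way positives/negatives are paired are illustrative assumptions, not the paper's exact recipe:

```python
# Sketch only: supervised seq2seq cross-entropy (L_task) + contrastive losses.
import torch
import torch.nn.functional as F
from transformers import T5ForConditionalGeneration, T5Tokenizer

model = T5ForConditionalGeneration.from_pretrained("t5-base")
tokenizer = T5Tokenizer.from_pretrained("t5-base")

def seq2seq_loss(src_texts, tgt_texts):
    """Supervised L(task): passing `labels` makes T5 compute
    torch.nn.CrossEntropyLoss over each generation step internally."""
    enc = tokenizer(src_texts, return_tensors="pt", padding=True, truncation=True)
    labels = tokenizer(tgt_texts, return_tensors="pt", padding=True, truncation=True).input_ids
    labels[labels == tokenizer.pad_token_id] = -100  # ignore padding positions
    out = model(input_ids=enc.input_ids, attention_mask=enc.attention_mask, labels=labels)
    return out.loss  # cross-entropy averaged over non-ignored target tokens

def info_nce(z1, z2, temperature=0.07):
    """Unsupervised contrastive term with in-batch negatives (an assumption:
    the paper may pair positives/negatives differently). z1[i] and z2[i] are
    treated as a positive pair; all other rows serve as negatives."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature                    # (batch, batch) similarities
    targets = torch.arange(z1.size(0), device=z1.device)  # positives on the diagonal
    return F.cross_entropy(logits, targets)

# Usage: total loss = L(task) + weighted contrastive loss.
l_task = seq2seq_loss(["summarize: the movie was great"], ["positive"])
z1, z2 = torch.randn(4, 768), torch.randn(4, 768)  # stand-in representations
loss = l_task + 0.1 * info_nce(z1, z2)             # 0.1 is an illustrative weight
loss.backward()
```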

@huigeStudent

Hi, did you manage to get this running? Could you share the relevant code?
