Using NTXentLoss with PairMarginMiner in different distance functions #555
-
Hi, Kevin. @KevinMusgrave I noticed that the default distance used by NTXentLoss is cosine similarity, while the one used by PairMarginMiner is L2. Does it make sense to use NTXentLoss and PairMarginMiner together without modifying their default distances? Or should I change one of the distance functions so that the two distance computations are consistent?
Replies: 4 comments
-
Conceptually, it's nice if they use the same distance function. But in practice, I don't know if it matters.
-
Thanks to @KevinMusgrave for the prompt reply. Although combining the loss and the miner with their different default distances works at the program level, I am still hoping that someone will notice this issue and explain theoretically whether such an approach is reasonable. There are 2 main concerns about this topic:
-
Moving this to Discussions
-
Hi, Kevin. @KevinMusgrave I think this discussion can be closed. After looking more deeply into NTXentLoss and SupConLoss, I found that both of them have an "intrinsic ability to perform hard positive/negative mining" (section 3.2.2 of the paper https://arxiv.org/abs/2004.11362). So maybe we don't need to apply a miner along with these loss functions. For a more detailed explanation, we could find that:
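As a hedged illustration of the "intrinsic mining" idea (a sketch, not the poster's own explanation): for a single anchor with one positive, the NT-Xent loss is L = -log(exp(s_p/t) / (exp(s_p/t) + sum_n exp(s_n/t))), and its gradient with respect to each negative similarity s_n is that negative's softmax probability divided by t. Negatives that are more similar to the anchor therefore receive larger gradients. The function name, similarity values, and temperature below are hypothetical, and only the negative side is shown.

```python
import math

def ntxent_negative_weights(pos_sim, neg_sims, temperature=0.1):
    """Return dL/ds_n for each negative of a single anchor, where
    L = -log(exp(s_p/t) / (exp(s_p/t) + sum_n exp(s_n/t)))."""
    logits = [pos_sim / temperature] + [s / temperature for s in neg_sims]
    # Subtract the max logit for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # The gradient w.r.t. each negative similarity is softmax_n / t.
    return [p / temperature for p in probs[1:]]

# Made-up similarities: one hard negative (0.8) and two easier ones.
neg_sims = [0.8, 0.3, -0.2]
weights = ntxent_negative_weights(pos_sim=0.9, neg_sims=neg_sims)
for sim, w in zip(neg_sims, weights):
    print(f"negative sim={sim:+.1f}  gradient weight={w:.6f}")
```

The weights decrease as the negatives get easier, which is the sense in which the loss already emphasizes hard negatives without a separate miner.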