
NaN gradients for tensors with singleton dimensions for budget_bisect. #35

Open
ssantos97 opened this issue Feb 28, 2024 · 5 comments

@ssantos97 commented Feb 28, 2024

Tensors with singleton dimensions like

    x = torch.tensor([[[-0.0744, -0.0904]],
                      [[-0.0452, -0.0386]],
                      [[-0.0187, -0.0100]],
                      [[-0.0060,  0.0100]],
                      [[-0.0660, -0.1066]],
                      [[-0.0289, -0.0087]],
                      [[-0.0227, -0.0159]],
                      [[-0.0547, -0.0428]],
                      [[-0.0941, -0.0747]],
                      [[-0.0653, -0.0478]],
                      [[-0.0747, -0.0417]],
                      [[-0.0740, -0.0367]]], dtype=torch.float64, requires_grad=True)

yield NaN gradients for budget_bisect with budget = 2. If we squeeze the singleton dimension, the gradients are correct.
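
A workaround consistent with this observation is to squeeze the singleton dimension before the call and restore it afterwards. This is only a sketch: the budget_bisect import path and the (x, budget, dim) call signature are assumed from the reproduction later in this thread.

    import torch
    from entmax import budget_bisect  # import path assumed

    def budget_bisect_squeezed(x, budget, dim=-1):
        # x has shape (12, 1, 2); drop the singleton middle dimension,
        # apply budget_bisect along the last dimension, then restore the shape.
        y = budget_bisect(x.squeeze(1), budget, dim)
        return y.unsqueeze(1)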

@ssantos97 changed the title from "NaN gradients for tensors with singleton dimensions." to "NaN gradients for tensors with singleton dimensions for budget_bisect." on Feb 28, 2024
@bpopeters (Collaborator)

Hi Saul,

Could you provide some code that causes this error? I haven't been able to reproduce it.

@bpopeters (Collaborator)

A couple of things I've noticed (a quick sweep illustrating both is sketched below):

  1. The NaNs occur elsewhere if you change from float64 to float32.
  2. The bug does not only occur with budget == 2. I can produce it for many inputs with budget == 1.
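
A sweep along these lines makes both points easy to reproduce. A sketch, assuming budget_bisect is importable from entmax:

    import torch
    from torch.autograd import gradcheck
    from entmax import budget_bisect  # import path assumed

    for dtype in (torch.float32, torch.float64):
        for budget in (1, 2):
            x = torch.randn(12, 1, 2, dtype=dtype, requires_grad=True)
            try:
                gradcheck(budget_bisect, (x, budget, -1), eps=1e-5)
                print(dtype, budget, "passed")
            except RuntimeError as err:
                print(dtype, budget, "failed:", err)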

@ssantos97 (Author) commented Mar 6, 2024

Here is some code:

    import torch
    from torch.autograd import gradcheck
    from entmax import budget_bisect  # import path assumed

    x = torch.tensor([[[-0.0743, -0.0903]],
                      [[-0.0453, -0.0387]],
                      [[-0.0187, -0.0100]],
                      [[-0.0060,  0.0101]],
                      [[-0.0660, -0.1066]],
                      [[-0.0289, -0.0087]],
                      [[-0.0226, -0.0159]],
                      [[-0.0548, -0.0428]],
                      [[-0.0940, -0.0746]],
                      [[-0.0654, -0.0479]],
                      [[-0.0747, -0.0418]],
                      [[-0.0741, -0.0367]]], requires_grad=True)

    check = gradcheck(budget_bisect, (x, 2, -1), eps=1e-5)
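
One caveat on the snippet above: gradcheck is normally run in double precision, and x here is float32, so precision noise can mask or mimic the bug. A hedged variant that casts first:

    # Sketch: repeat the check in float64, since finite differences at
    # eps=1e-5 are noisy in single precision.
    x64 = x.detach().double().requires_grad_(True)
    check64 = gradcheck(budget_bisect, (x64, 2, -1), eps=1e-5)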

@bpopeters (Collaborator) commented Mar 6, 2024

Backward also fails if dim == len(x.shape). I believe this problem may be related to the gradient issues.

(it's also something we should have already been checking for, but test_arbitrary_dimension_grad only includes entmax_bisect)
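
A parametrized test in that spirit could extend the coverage to budget_bisect. This is a sketch; the test-suite layout and the import path are assumptions:

    import pytest
    import torch
    from torch.autograd import gradcheck
    from entmax import budget_bisect  # import path assumed

    @pytest.mark.parametrize("dim", (0, 1, 2))
    def test_budget_bisect_arbitrary_dimension_grad(dim):
        # float64 input so that gradcheck's finite differences are stable
        x = torch.randn(4, 3, 2, dtype=torch.float64, requires_grad=True)
        assert gradcheck(budget_bisect, (x, 1, dim), eps=1e-5)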

@bpopeters (Collaborator)

@ssantos97 @andre-martins does either of you have a solution for this?
