Fails on zero grad #21
Comments
That should've been fixed in issue #7 by the following line: https://github.com/MichiganCOG/ViP/blob/dev/train.py#L182. Do you have this version pulled from dev?
I'm using an older version (apart from pulling from master, I immediately made train.py unmergeable). My mistake for missing that issue.
I came back to this --- it appears the modification in the dev branch resolves a different problem. That is, the weights that are causing an issue for me are not frozen, but have no gradient because they do not contribute to the loss. Consider three regression nodes --- yaw, pitch, and roll. I modify training to only regress yaw by performing backpropagation on that node directly. The gradients of the weights leading into the roll and pitch nodes are left as None by autograd on loss.backward(), so the cited line fails.
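A minimal sketch that reproduces the situation described above. The `PoseHeads` model, its layer sizes, and the squared-output loss are all hypothetical stand-ins for illustration and are not taken from ViP; only the behaviour of `loss.backward()` is the point.

```python
import torch
import torch.nn as nn


class PoseHeads(nn.Module):
    """Hypothetical model: a shared trunk feeding three regression heads."""

    def __init__(self):
        super().__init__()
        self.trunk = nn.Linear(8, 4)
        self.yaw = nn.Linear(4, 1)
        self.pitch = nn.Linear(4, 1)
        self.roll = nn.Linear(4, 1)

    def forward(self, x):
        h = torch.relu(self.trunk(x))
        return self.yaw(h), self.pitch(h), self.roll(h)


torch.manual_seed(0)
model = PoseHeads()
yaw, pitch, roll = model(torch.randn(2, 8))

# Only the yaw output contributes to the loss (placeholder loss).
loss = yaw.pow(2).mean()
loss.backward()

# Parameters that never entered the loss graph keep grad == None.
missing = [name for name, p in model.named_parameters() if p.grad is None]
print(missing)
```

The trunk and yaw parameters receive gradients, while the pitch and roll heads are the ones listed in `missing`, even though none of them has `requires_grad=False`.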
Can you post your code? Training and relevant loss and model files. A GitHub link would work.
In instances where a neuron doesn't factor into the loss (e.g., a component of the loss is disabled for a specific experiment, resulting in a neuron or set of neurons being unused), autograd returns None for the unused connections. This results in a crash at the line:
param.grad *= 1./float(args['psuedo_batch_loop']*args['batch_size'])
With the error:
TypeError: unsupported operand type(s) for *=: 'NoneType' and 'float'
This can be remedied by inserting:
if param.grad is not None:
prior to the line in question, but I'm unsure of any upstream consequences.
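For reference, a sketch of what the guarded version could look like in isolation. The `scale_grads` helper and the two-layer toy model are hypothetical and only illustrate the check; the `args` keys are the ones from the line quoted above, and the real loop in train.py may be structured differently.

```python
import torch
import torch.nn as nn


def scale_grads(model, args):
    """Scale accumulated gradients, skipping parameters that have none."""
    scale = 1.0 / float(args['psuedo_batch_loop'] * args['batch_size'])
    for param in model.parameters():
        # Parameters that did not contribute to the loss have grad == None.
        if param.grad is not None:
            param.grad *= scale


# Toy setup: only `used` takes part in the loss.
used = nn.Linear(3, 1)
unused = nn.Linear(3, 1)
model = nn.ModuleList([used, unused])

loss = used(torch.ones(2, 3)).sum()
loss.backward()

before = used.weight.grad.clone()
scale_grads(model, {'psuedo_batch_loop': 2, 'batch_size': 2})
```

With the guard, the gradients of `used` are scaled as before and `unused` is left alone. As far as I can tell, the built-in `torch.optim` optimizers also skip parameters whose `.grad` is None in `step()`, so the unused weights would simply stay unchanged, but I have not checked this against every optimizer the repo supports.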