Distributed training #33

Open
weiMytian opened this issue Jul 31, 2022 · 1 comment
Comments

@weiMytian

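Thank you very much for sharing your code. Have you ever trained it on multiple GPUs? I am using the 3D deformable convolution in my own task. Training on a single GPU runs fine, but with multiple GPUs the program raises the following error, which I have not been able to resolve: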
error in deformable_col2im_cuda: an illegal memory access was encountered
error in deformable_im2col_cuda: an illegal memory access was encountered
Traceback (most recent call last):
  File "train.py", line 598, in <module>
    main()
  File "train.py", line 396, in main
    loss.backward() # cal grad
  File "/home/jp/anaconda3/envs/torch/lib/python3.8/site-packages/torch/_tensor.py", line 307, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/home/jp/anaconda3/envs/torch/lib/python3.8/site-packages/torch/autograd/__init__.py", line 154, in backward
    Variable._execution_engine.run_backward(
  File "/home/jp/anaconda3/envs/torch/lib/python3.8/site-packages/torch/autograd/function.py", line 199, in apply
    return user_fn(self, *args)
  File "/home/jp/anaconda3/envs/torch/lib/python3.8/site-packages/torch/autograd/function.py", line 340, in wrapper
    outputs = fn(ctx, *args)
  File "/MIP/zhang/3d_code/dcn/functions/deform_conv_func.py", line 46, in backward
    D3D.deform_conv_backward(input, weight,
RuntimeError: CUDA error: an illegal memory access was encountered
Have you run into the same problem? I look forward to your reply!

@fzs347

fzs347 commented Mar 20, 2023

Did you manage to solve this problem? I have run into the same issue.
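For anyone trying to narrow this down, here is a minimal diagnostic sketch. Nothing in this thread confirms the cause. It rests on an assumption: a custom CUDA extension can launch its kernels on the current device (often GPU 0) while the tensors it receives live on another GPU, and this commonly shows up as an illegal memory access only in multi-GPU runs. The helpers `call_on_tensor_device` and `check_same_device` are hypothetical names introduced here. `torch.cuda.device_of` and the `CUDA_LAUNCH_BLOCKING` environment variable are existing PyTorch/CUDA features.

```python
import os

# Make CUDA kernel launches synchronous so the Python traceback points at the
# call that actually failed. This must be set before CUDA is initialised.
os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")

import torch


def call_on_tensor_device(fn, tensor, *args, **kwargs):
    """Call fn(tensor, *args, **kwargs) with the current CUDA device set to tensor's device.

    torch.cuda.device_of is a no-op for CPU tensors, so this helper
    (a hypothetical name, not part of D3D) also runs without a GPU.
    """
    with torch.cuda.device_of(tensor):
        return fn(tensor, *args, **kwargs)


def check_same_device(*tensors):
    """Return the common device of the tensors, or raise if they differ."""
    devices = {t.device for t in tensors}
    if len(devices) != 1:
        names = sorted(str(d) for d in devices)
        raise RuntimeError(f"expected one common device, got: {names}")
    return devices.pop()
```

One way to use it, assuming the same names as in the traceback, would be to call `check_same_device` on the tensors passed to `D3D.deform_conv_backward` in `deform_conv_func.py`, and to route that call through `call_on_tensor_device` with `input` as the reference tensor. If the error disappears, the extension probably needs a device guard in its C++/CUDA code. If it persists, the cause is likely elsewhere.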
