GPU memory usage during training #8

Open
wren93 opened this issue Aug 25, 2023 · 1 comment

wren93 commented Aug 25, 2023

Hi, thank you for sharing this great work. I'm training LVDM on UCF-101 for unconditional generation and observed some odd GPU memory behavior. With batch size 2, nvidia-smi reports roughly 73,000 MB of GPU memory in use during training; when I increase the batch size to 32, usage drops to about 35,000 MB. Debugging suggests the UNet is the culprit at small batch sizes: memory jumps from ~8 GB to ~73 GB across lines 626–634 of lvdm/models/modules/openaimodel3d.py. Do you have any insight into this issue? Thanks!
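For reference, this is a minimal sketch of how I measured the spike. The helper name and the dummy 3D block are illustrative stand-ins only, not the repo's actual modules; in practice I wrapped the suspect calls in openaimodel3d.py the same way.

```python
import torch
import torch.nn as nn

def report_peak(label, fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and print the peak CUDA memory allocated during the call."""
    torch.cuda.synchronize()
    torch.cuda.reset_peak_memory_stats()
    out = fn(*args, **kwargs)
    torch.cuda.synchronize()
    peak_gb = torch.cuda.max_memory_allocated() / 1024 ** 3
    print(f"{label}: peak allocated {peak_gb:.2f} GB")
    return out

if torch.cuda.is_available():
    # Dummy stand-in for one of the UNet's 3D blocks (not the real LVDM module).
    block = nn.Conv3d(4, 4, kernel_size=3, padding=1).cuda()
    x = torch.randn(2, 4, 16, 32, 32, device="cuda")  # (batch, channels, frames, height, width)
    report_peak("dummy 3D block, batch=2", block, x)
```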

wren93 commented Aug 25, 2023

Also, what are the batch_size and num_workers parameters under the "trainer" section of the config file used for (as shown in the figure)? The parameters that actually control the dataloader appear to be the ones under "data"; see the sketch below for how I currently read the layout.
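Here is a small self-contained sketch of my current understanding. The keys and values are made up to mirror the usual PyTorch Lightning layout (the exact nesting in the repo's config may differ), so this is an assumption, not the repo's actual code:

```python
import torch
from omegaconf import OmegaConf
from torch.utils.data import DataLoader, TensorDataset

# Illustrative config mirroring the layout in question (values are made up).
config = OmegaConf.create({
    "data": {"params": {"batch_size": 32, "num_workers": 8}},  # seem to control the DataLoader
    "lightning": {"trainer": {"max_epochs": 10}},              # forwarded to pl.Trainer(**...)
})

# Dummy stand-in for UCF-101 clips: (N, channels, frames, height, width).
dataset = TensorDataset(torch.randn(64, 3, 16, 64, 64))

loader = DataLoader(
    dataset,
    batch_size=config.data.params.batch_size,
    num_workers=config.data.params.num_workers,
    shuffle=True,
)

# pytorch_lightning.Trainer has no batch_size/num_workers arguments of its own, so a
# batch_size key under "trainer" would not affect the dataloader (and could even raise
# a TypeError when the dict is unpacked into Trainer(**kwargs)).
```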
