
CUDA_OUT_OF_MEMORY #11

Open
mayanktiwariiiitdmj opened this issue Mar 1, 2021 · 1 comment
mayanktiwariiiitdmj commented Mar 1, 2021

When I run the code with the following command:

./scripts/train/train_on_target.sh Obama head2headDataset

where train_on_target.sh contains:

target_name=$1
dataset_name=$2

python train.py --checkpoints_dir checkpoints/$dataset_name \
                --target_name $target_name \
                --name head2head_$target_name \
                --dataroot datasets/$dataset_name/dataset \
                --serial_batches

I get the following error:

Traceback (most recent call last):
  File "train.py", line 108, in <module>
    flow_ref, conf_ref, t_scales, n_frames_D)
  File "/home/nitin/head2head/util/util.py", line 48, in get_skipped_flows
    flow_ref_skipped[s], conf_ref_skipped[s] = flowNet(real_B[s][:,1:], real_B[s][:,:-1])
  File "/home/nitin/anaconda3/envs/head2head/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nitin/anaconda3/envs/head2head/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/nitin/anaconda3/envs/head2head/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nitin/head2head/models/flownet.py", line 38, in forward
    flow, conf = self.compute_flow_and_conf(input_A, input_B)
  File "/home/nitin/head2head/models/flownet.py", line 55, in compute_flow_and_conf
    flow1 = self.flowNet(data1)
  File "/home/nitin/anaconda3/envs/head2head/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nitin/head2head/models/flownet2_pytorch/models.py", line 156, in forward
    flownetfusion_flow = self.flownetfusion(concat3)
  File "/home/nitin/anaconda3/envs/head2head/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nitin/head2head/models/flownet2_pytorch/networks/FlowNetFusion.py", line 62, in forward
    concat0 = torch.cat((out_conv0,out_deconv0,flow1_up),1)
RuntimeError: CUDA out of memory. Tried to allocate 82.00 MiB (GPU 0; 5.80 GiB total capacity; 4.77 GiB already allocated; 73.56 MiB free; 4.88 GiB reserved in total by PyTorch)

I have checked the batch size in options/base_options.py; it is already set to 1. How can I solve this exception? My system has a 6 GB NVIDIA GTX 1660 Super GPU.
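For what it's worth, the figures in the error message already explain the failure: PyTorch has reserved almost the whole card, so even a small allocation cannot be served. A quick sanity check using the numbers copied from the traceback (pure arithmetic, no GPU needed):

```python
# Figures copied from the RuntimeError message (all in MiB).
total_capacity = 5.80 * 1024   # "5.80 GiB total capacity"
already_alloc  = 4.77 * 1024   # "4.77 GiB already allocated"
reserved       = 4.88 * 1024   # "4.88 GiB reserved in total by PyTorch"
free           = 73.56         # "73.56 MiB free"
request        = 82.00         # "Tried to allocate 82.00 MiB"

# The request fails because it exceeds the free pool inside the reservation.
print(request > free)                    # True
# The reservation leaves well under 1 GiB of headroom on the 6 GB card.
print(total_capacity - reserved < 1024)  # True
```

Since the batch size is already 1, typical next steps are lowering the training resolution or running the FlowNet pass under `torch.no_grad()` if the reference flows are not back-propagated through; both reduce peak activation memory. (These are general PyTorch suggestions, not options exposed by this repository.)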

cyrala commented Jul 29, 2022


Hi, do you have the head2headDataset? The file showed as corrupted when I unzipped it. Could you provide dataset.zip for me? Thank you! My email: [email protected]
