
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1]) #143

YadongLau opened this issue Dec 10, 2019 · 5 comments



YadongLau commented Dec 10, 2019

When I train my COCO-style dataset with bash train_coco.sh, I get the following error:

Namespace(backbone='resnet', base_size=513, batch_size=4, checkname='deeplab-resnet', crop_size=513, cuda=True, dataset='coco', epochs=10, eval_interval=1, freeze_bn=False, ft=False, gpu_ids=[0], loss_type='ce', lr=0.01, lr_scheduler='poly', momentum=0.9, nesterov=False, no_cuda=False, no_val=False, out_stride=16, resume=None, seed=1, start_epoch=0, sync_bn=False, test_batch_size=4, use_balanced_weights=False, use_sbd=True, weight_decay=0.0005, workers=4)
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
Using poly LR Scheduler!
Starting Epoch: 0
Total Epoches: 10
0%| | 0/1 [00:00<?, ?it/s]
=>Epoches 0, learning rate = 0.0100, previous best = 0.0000
Traceback (most recent call last):
File "train.py", line 306, in
main()
File "train.py", line 299, in main
trainer.training(epoch)
File "train.py", line 104, in training
output = self.model(image)
File "/home/dachen/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/dachen/conda/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
return self.module(*inputs[0], **kwargs[0])
File "/home/dachen/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/dachen/Music/pytorch-deeplab-xception/modeling/deeplab.py", line 30, in forward
x = self.aspp(x)
File "/home/dachen/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/dachen/Music/pytorch-deeplab-xception/modeling/aspp.py", line 70, in forward
x5 = self.global_avg_pool(x)
File "/home/dachen/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/dachen/conda/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward
input = module(input)
File "/home/dachen/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 541, in call
result = self.forward(*input, **kwargs)
File "/home/dachen/conda/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 81, in forward
exponential_average_factor, self.eps)
File "/home/dachen/conda/lib/python3.7/site-packages/torch/nn/functional.py", line 1666, in batch_norm
raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 256, 1, 1])
What should I do?

@HuangLian126

@jfzhang95 @lyd953621450 Hi, I have faced the same problem. Are you training your own dataset? I think the key is the label.

@hlwang1124

@lyd953621450 @HuangLian126 I think the reason is that the BatchNorm layer in global_avg_pool requires the batch size to be larger than 1. If you have already set a batch size larger than 1 and still face this problem, it is probably because the number of training samples divided by the batch size leaves a remainder of 1, so the last batch of every epoch contains only a single sample. In that case, I suggest setting the drop_last flag of the DataLoader to True to drop that last single-sample batch, as in the sketch below.
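
For reference, a minimal sketch of that change, assuming a typical PyTorch DataLoader setup (train_set and the numeric values here are placeholders, not the repo's exact names):

```python
from torch.utils.data import DataLoader

# train_set stands in for the COCO-style segmentation dataset built in train.py.
train_loader = DataLoader(
    train_set,
    batch_size=4,
    shuffle=True,
    num_workers=4,
    drop_last=True,  # discard the final batch when only a single leftover sample remains
)
```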

@linzhenyuyuchen

Set the batch size to be larger than 1.
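
For example, something like the following (reusing the flags that appear elsewhere in this thread; check train.py for the exact flag names):

python train.py --backbone resnet --lr 0.01 --epochs 10 --batch-size 4 --gpu-ids 0 --checkname deeplab-resnet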

@kimsu1219

@hlwang1124 I have the same issue.
How can I set the drop_last flag to True?
Also, if there is a problem with my dataset, could the dataset trigger this issue?

@SpectorSong

python train.py --backbone xception --lr 0.0001 --epochs 10 --batch-size 2 --gpu-ids 0 --checkname deeplab-xception
Hi, I still get this error even when my batch size is 2. Have you solved this problem? @YadongLau @kimsu1219 @HuangLian126
