Error(s) in loading state_dict for Cnn14: #8

Open
callzhang opened this issue May 5, 2021 · 5 comments

Comments

@callzhang

Strictly following the sample code yields the following error:

Exception has occurred: RuntimeError
Error(s) in loading state_dict for Cnn14:
	size mismatch for fc_audioset.weight: copying a param with shape torch.Size([527, 2048]) from checkpoint, the shape in current model is torch.Size([0, 2048]).
	size mismatch for fc_audioset.bias: copying a param with shape torch.Size([527]) from checkpoint, the shape in current model is torch.Size([0]).
  File "/home/stardust/algorithms-playground/audio/kedaxunfei/audio_event_detection.py", line 19, in <module>
    at = AudioTagging(checkpoint_path=None, device='cuda')

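Code to reproduce:

# Imports added for completeness; the package name panns_inference is assumed
# from the project's sample code and is not shown in the original post.
from glob import glob

import librosa
from panns_inference import AudioTagging, SoundEventDetection

paths = glob('path/to/audio/*.mp3')
audio_path = paths[0]
(audio, _) = librosa.core.load(audio_path, sr=32000, mono=True)
audio = audio[None, :]  # (batch_size, segment_samples)

print('------ Audio tagging ------')
at = AudioTagging(checkpoint_path=None, device='cuda')  # the error is raised here
(clipwise_output, embedding) = at.inference(audio)

print('------ Sound event detection ------')
sed = SoundEventDetection(checkpoint_path=None, device='cuda')
framewise_output = sed.inference(audio)
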
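The error message itself narrows the problem down: the checkpoint holds a 527-row output layer, while the freshly constructed model has 0 rows, so the model appears to have been built with zero classes. A minimal sketch of how such mismatches could be listed before loading is shown here. The helper name `find_shape_mismatches` is hypothetical, and the shapes are copied from the traceback rather than read from a real checkpoint.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for parameters whose shapes differ."""
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        # Skip parameters the model does not have; only report real shape conflicts.
        if model_shape is not None and tuple(model_shape) != tuple(ckpt_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# Shapes copied from the error message in this issue.
checkpoint_shapes = {
    "fc_audioset.weight": (527, 2048),
    "fc_audioset.bias": (527,),
}
model_shapes = {
    "fc_audioset.weight": (0, 2048),
    "fc_audioset.bias": (0,),
}

mismatches = find_shape_mismatches(checkpoint_shapes, model_shapes)
for name, (ckpt_shape, model_shape) in mismatches.items():
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

With PyTorch, the two shape dictionaries could be built from any state dict with `{k: tuple(v.shape) for k, v in state_dict.items()}`. A leading dimension of 0 in the model would then point at whatever determines the number of classes when the model is constructed.
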
@qiuqiangkong
Owner

qiuqiangkong commented May 6, 2021 via email

@callzhang
Author

callzhang commented May 6, 2021

But I didn't change anything after installing the package from pip. Any idea?

@qiuqiangkong
Owner

qiuqiangkong commented May 9, 2021 via email

@Dexter1618

Using the new 16 kHz model, as updated on 24 August 2020, gave me a similar error. I haven't even passed an audio signal yet.

The line that caused the error: AudioTagging(checkpoint_path = "Cnn14_16k_mAP%3D0.438.pth", device = "cuda")

Error:

RuntimeError: Error(s) in loading state_dict for Cnn14:
	size mismatch for spectrogram_extractor.stft.conv_real.weight: copying a param with shape torch.Size([257, 1, 512]) from checkpoint, the shape in current model is torch.Size([513, 1, 1024]).
	size mismatch for spectrogram_extractor.stft.conv_imag.weight: copying a param with shape torch.Size([257, 1, 512]) from checkpoint, the shape in current model is torch.Size([513, 1, 1024]).
	size mismatch for logmel_extractor.melW: copying a param with shape torch.Size([257, 64]) from checkpoint, the shape in current model is torch.Size([513, 64]).

Please advise @qiuqiangkong
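
The shapes in this second error are consistent with a mismatch in STFT window size rather than in the number of classes. The sketch below assumes the STFT convolution kernel is laid out as (n_fft // 2 + 1, 1, n_fft), which both reported shapes fit. The helper name `n_fft_from_stft_kernel` is hypothetical and only illustrates the arithmetic.

```python
def n_fft_from_stft_kernel(shape):
    """Infer the FFT size from a (freq_bins, 1, n_fft) STFT conv kernel shape."""
    freq_bins, _, n_fft = shape
    # A one-sided spectrum of an n_fft-point FFT has n_fft // 2 + 1 bins.
    if freq_bins != n_fft // 2 + 1:
        raise ValueError(f"unexpected kernel shape: {shape}")
    return n_fft


# Shapes copied from the error message in this comment.
checkpoint_n_fft = n_fft_from_stft_kernel((257, 1, 512))
model_n_fft = n_fft_from_stft_kernel((513, 1, 1024))
print(f"checkpoint n_fft={checkpoint_n_fft}, model n_fft={model_n_fft}")
```

If this reading is right, the 16 kHz checkpoint was saved from a model with a 512-sample window, while the model being constructed uses a 1024-sample window. Whether `AudioTagging` exposes a setting for this is not stated in the thread.
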

@qiuqiangkong
Owner

qiuqiangkong commented Jul 31, 2021 via email

3 participants