What is the minimum input audio length/size of AudioTagging class? #13

Open
underdogliu opened this issue Sep 29, 2023 · 0 comments

@underdogliu

Hi, first of all, thanks for the amazing work!

I am trying out PANNs by following example.py. With some short audio clips, I found that I have to pad the input to make it work; otherwise the error below occurs. I arbitrarily padded the audio to a length of 10000 samples and it worked, but I know that is just a placeholder.

RuntimeError: Given input size: (1024x1x4). Calculated output size: (1024x0x2). Output size is too small

Since the audio is resampled to 32 kHz, I wonder:

  1. What is the minimum length of the audio to be input into the model?
  2. If the input is shorter than that, is it valid to pad the audio with, for example, numpy.pad(audio, (0, shortage), 'wrap')?
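
To make the two questions concrete, here is a minimal sketch of the padding described in question 2, together with a rough estimate of the minimum length. The hop size of 320 samples, the five 2x2 pooling stages, and the centered STFT are assumptions about the model configuration that this thread does not confirm. The helper names `estimate_min_samples` and `pad_to_min_length` are hypothetical and not part of the library.

```python
import numpy as np

# Assumptions (not confirmed in this thread): values to check against
# the actual model configuration before relying on the estimate.
ASSUMED_HOP_SIZE = 320      # STFT hop in samples (assumption)
ASSUMED_NUM_POOLINGS = 5    # number of 2x2 pooling stages (assumption)


def estimate_min_samples(hop_size=ASSUMED_HOP_SIZE,
                         num_poolings=ASSUMED_NUM_POOLINGS):
    """Rough lower bound on the input length in samples.

    Assumes a centered STFT (frames = samples // hop_size + 1) and that
    each pooling stage halves the number of time frames, so at least
    2 ** num_poolings frames are needed.
    """
    min_frames = 2 ** num_poolings
    return (min_frames - 1) * hop_size


def pad_to_min_length(audio, min_samples):
    """Pad the last axis of `audio` up to `min_samples` by wrapping.

    Hypothetical helper: repeats the clip from its start, as in
    numpy.pad(audio, (0, shortage), 'wrap'). Longer clips are returned
    unchanged.
    """
    audio = np.asarray(audio)
    if audio.shape[-1] == 0:
        raise ValueError("cannot wrap-pad an empty signal")
    shortage = min_samples - audio.shape[-1]
    if shortage <= 0:
        return audio
    # Pad only at the end of the time axis; leave other axes untouched.
    pad_width = [(0, 0)] * (audio.ndim - 1) + [(0, shortage)]
    return np.pad(audio, pad_width, mode="wrap")


if __name__ == "__main__":
    short_clip = np.random.uniform(-1, 1, 4000).astype(np.float32)
    padded = pad_to_min_length(short_clip, estimate_min_samples())
    print(padded.shape)
```

Under these assumptions the estimate is consistent with padding to 10000 samples being enough, but it is only a sketch. Whether wrapping or zero padding (mode="constant") gives better predictions on short clips is a separate question that the estimate does not answer.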