Can not recognize Silence, less stable than yamnet #6
Comments
Honghe changed the title from "Can not recognize Silence, and it seems different to yamnet" to "Can not recognize Silence, less stable than yamnet" on Jan 28, 2021
Hi Jack,
Thank you very much for the feedback! I ran panns_inference on the wav
you attached and got the following result:
[image: image.png]
The panns_inference version is 0.0.7. Note that the number of frames
here is 800.
For this example, Yamnet performs better than PANNs at detecting
silence. Here are two possible reasons:
1. Yamnet is trained on 1-second segments, whereas PANNs are trained on
10-second segments with weak labels to obtain better audio tagging
performance.
2. PANNs apply mixup to improve the detection of other sound events,
but mixup lowers the performance on silence.
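To make the second reason concrete, here is a minimal sketch of mixup on raw waveforms. It is an illustration only, not the PANNs training code: the two-class label layout, the white-noise stand-in for a noisy clip, and the fixed mixing weight `lam=0.5` are all assumptions (in practice the weight is usually sampled from a Beta distribution). It shows that once a silent clip is mixed with any other clip, the input is no longer silent, even though the target still assigns part of its weight to the silence class.

```python
import random


def mixup(x1, y1, x2, y2, lam):
    """Blend two waveforms and their label vectors with mixing weight lam."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y


def rms(x):
    """Root-mean-square energy of a waveform."""
    return (sum(v * v for v in x) / len(x)) ** 0.5


# Hypothetical two-class label layout for illustration: [Silence, Noise]
rng = random.Random(0)
num_samples = 16000
silence = [0.0] * num_samples
noise = [rng.gauss(0.0, 0.1) for _ in range(num_samples)]  # white-noise stand-in

mixed_x, mixed_y = mixup(silence, [1.0, 0.0], noise, [0.0, 1.0], lam=0.5)

print("RMS of silent clip:", rms(silence))
print("RMS of mixed clip: ", rms(mixed_x))
print("Mixed target:      ", mixed_y)
```

Under these assumptions the mixed clip has non-zero energy while its target is split between the two classes, so a model trained this way would rarely, if ever, see a truly silent input paired with a confident silence label.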
This feedback is very useful for us! We would be very happy to
hear about any further comparisons between Yamnet and PANNs!
Best wishes,
Qiuqiang
On Thu, 28 Jan 2021 at 11:48, Jack wrote:
Hi qiuqiangkong, thanks for your great work.
Recently, I tested panns_inference with the following wav audio.
silence.zip
<https://github.com/qiuqiangkong/panns_inference/files/5884423/silence.zip>
The wav is generated audio consisting of "little noise, silence, little noise".
[image: image]
<https://user-images.githubusercontent.com/1092722/106086684-9c799a80-615d-11eb-95ca-efdf54903873.png>
The panns_inference output is shown below. It cannot recognize Silence,
and the gap in the Pink noise probability between the head and the tail
of the wav is rather large.
[image: image]
<https://user-images.githubusercontent.com/1092722/106083000-b82d7280-6156-11eb-8109-62adff25b3d3.png>
In contrast, the yamnet output is more reasonable, as follows.
[image: image]
<https://user-images.githubusercontent.com/1092722/106085778-cc27a300-615b-11eb-82ce-19d1b18c7185.png>
The panns_inference code I used was 013c0f6.
Sincerely!
@qiuqiangkong Thank you for your reply, but it seems your pic upload failed.
@Honghe Sorry! See the prediction figure attached: