Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

question about gumbel softmax #9

Open
XiaoyuShi97 opened this issue Sep 21, 2021 · 2 comments
Open

question about gumbel softmax #9

XiaoyuShi97 opened this issue Sep 21, 2021 · 2 comments

Comments

@XiaoyuShi97
Copy link

Hi, nice work! I am a bit confused about gumbel softmax. You mention in your paper that, during traininig, gumbel softmax is used. I wonder if it can be replaced by pure softmax (i.e. torch.softmax)? Could you please give more explanation on this design choice? Thx!

@LongguangWang
Copy link
Member

Hi @btwbtm, thanks for your interest in our work. Softmax is also used in several network quantization or pruning methods to soften one-hot distributions. In my opinion, softmax may also works in our SMSR but I have not tried it. In our experiments, gumbel softmax is adopted since it is theorically identical to one-hot distribution while softmax is not.

@wangqiim
Copy link

wangqiim commented Apr 1, 2022

Hi @btwbtm, thanks for your interest in our work. Softmax is also used in several network quantization or pruning methods to soften one-hot distributions. In my opinion, softmax may also works in our SMSR but I have not tried it. In our experiments, gumbel softmax is adopted since it is theorically identical to one-hot distribution while softmax is not.

https://github.com/The-Learning-And-Vision-Atelier-LAVA/SMSR/blob/daac49c9a107778c95e11a16fd5b4a8b45513678/model/smsr.py#L12-L21

image

I found the implement of gumbel softmax in your code is different from original paper("Categorical reparameterization with gumbel-softmax"), why do you modify this? which is better?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants