Hi, nice work! I am a bit confused about Gumbel softmax. You mention in your paper that Gumbel softmax is used during training. I wonder if it can be replaced by pure softmax (i.e. torch.softmax)? Could you please give more explanation on this design choice? Thanks!
Hi @btwbtm, thanks for your interest in our work. Softmax is also used in several network quantization or pruning methods to soften one-hot distributions. In my opinion, softmax may also work in our SMSR, but I have not tried it. In our experiments, Gumbel softmax is adopted since its output is theoretically identical to a one-hot distribution, while the softmax output is not.
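To illustrate the difference, here is a minimal PyTorch sketch using torch.nn.functional.gumbel_softmax (a toy example with hypothetical 2-way gating logits, not our SMSR code):

```python
import torch
import torch.nn.functional as F

# Hypothetical gating logits for a 2-way "skip / compute" decision at 4 locations.
logits = torch.randn(4, 2, requires_grad=True)

# Plain softmax: always a soft mixture, never exactly one-hot.
soft_mask = torch.softmax(logits, dim=-1)

# Gumbel softmax with hard=True (straight-through): the forward pass emits an
# exact one-hot mask, while gradients flow through the soft relaxation, so the
# training-time masks have the same one-hot form as those used at inference.
hard_mask = F.gumbel_softmax(logits, tau=1.0, hard=True, dim=-1)

print(soft_mask)  # rows sum to 1 but entries stay fractional
print(hard_mask)  # rows are exact one-hot vectors
```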
I found that the implementation of Gumbel softmax in your code differs from the original paper ("Categorical Reparameterization with Gumbel-Softmax"). Why did you modify it? Which one is better?
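For reference, my understanding of the sampling procedure in the original paper is roughly the following sketch (my own paraphrase, not your code):

```python
import torch

def gumbel_softmax_original(logits, tau=1.0):
    # Sketch of the formulation in "Categorical Reparameterization with
    # Gumbel-Softmax": add i.i.d. Gumbel(0, 1) noise to the (unnormalized)
    # log-probabilities, then apply a temperature-scaled softmax.
    u = torch.rand_like(logits)
    gumbel_noise = -torch.log(-torch.log(u + 1e-20) + 1e-20)
    return torch.softmax((logits + gumbel_noise) / tau, dim=-1)
```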