Question about MMD implementation #2

Open
hiwonjoon opened this issue May 3, 2018 · 7 comments

@hiwonjoon

Thanks for sharing the code for your amazing paper! I really enjoyed reading it.

Anyway, I am interested in extending your work in another direction, and I have come up with a question about the MMD part. I was able to understand the overall concept, but I'm not sure about this multi-scale part:

wae/wae.py, line 294 (commit 068a257):

for scale in [.1, .2, .5, 1., 2., 5., 10.]:

Are you just trying multiple kernels to get a better estimate of MMD?

It would also be very nice of you to recommend some readings to get a better understanding of MMDs.

@tolstikhin
Owner

Dear Wonjoon,

thank you for asking. The property we are using here is that a sum of positive definite kernels is also a positive definite kernel. We were initially using the IMQ kernel with one fixed width parameter, but noticed it works slightly better if you sum such kernels over a range of widths, which allows the kernel to "look at various scales" simultaneously. This is a bit hand-wavy, but I hope it gives you the right intuition.
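
To make that a bit more concrete, here is a minimal NumPy sketch (not the repo's TensorFlow code) of an MMD estimate built from a sum of IMQ kernels over the same list of scales as in wae.py. The function name, the biased V-statistic form, and tying the base width to 2 * z_dim * sigma_p^2 are illustrative assumptions, not the exact implementation:

```python
# Illustrative NumPy sketch: a sum of IMQ kernels k_C(x, y) = C / (C + ||x - y||^2)
# over several widths, plugged into a (biased, V-statistic) MMD estimate.
# The choice c_base = 2 * z_dim * sigma_p**2 is an assumption, not a quote from wae.py.
import numpy as np

def multiscale_imq_mmd(z_q, z_p, scales=(.1, .2, .5, 1., 2., 5., 10.), sigma_p=1.0):
    """Biased MMD estimate between encoded codes z_q and prior samples z_p."""
    z_dim = z_q.shape[1]
    # Pairwise squared Euclidean distances within and across the two samples.
    d_qq = np.sum((z_q[:, None, :] - z_q[None, :, :]) ** 2, axis=-1)
    d_pp = np.sum((z_p[:, None, :] - z_p[None, :, :]) ** 2, axis=-1)
    d_qp = np.sum((z_q[:, None, :] - z_p[None, :, :]) ** 2, axis=-1)
    c_base = 2.0 * z_dim * sigma_p ** 2   # base width tied to the prior's scale
    mmd = 0.0
    for scale in scales:
        c = c_base * scale                # each width is a positive definite IMQ kernel
        mmd += np.mean(c / (c + d_qq)) + np.mean(c / (c + d_pp)) \
               - 2.0 * np.mean(c / (c + d_qp))
    return mmd
```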

Regarding MMDs in general, I can recommend looking into this overview: https://arxiv.org/pdf/1605.09522.pdf

Best wishes,
Ilya

@hiwonjoon
Author

Thanks for the instant response! So the sigma of the kernel is not related to the prior distribution's sigma. Is that correct?

@tolstikhin
Owner

Correct, these are two different things. But you may want to choose the kernel width depending on your prior.

@ttgump

ttgump commented Feb 8, 2019

Thanks for the great discussion. I have a question: when using the MMD penalty, I trained my WAE model on some other datasets (not MNIST or CelebA) and saw the MMD become negative after training for hundreds of epochs. Is it possible to have a negative MMD penalty?

@tolstikhin
Owner

The penalty used in WAE-MMD is not precisely the population MMD, but a sample-based U-statistic. Since it is an unbiased estimate (that is, its expected value coincides with the quantity of interest, the MMD in this case), it necessarily has to take negative values from time to time whenever the population MMD is zero. In summary, yes, negative values are OK.
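
For illustration, here is a small NumPy sketch of the unbiased (U-statistic) form for a single IMQ kernel; the names and the fixed width are made up for the example, but it shows how the estimate fluctuates around zero, and is often negative, when both samples come from the same distribution:

```python
# Sketch of an unbiased (U-statistic) MMD estimate for one IMQ kernel.
# Illustrative names and width; the point is that the estimate can dip
# below zero when the two samples really do share a distribution.
import numpy as np

def imq_kernel(x, y, c=2.0):
    d = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return c / (c + d)

def unbiased_mmd(z_q, z_p, c=2.0):
    n, m = z_q.shape[0], z_p.shape[0]
    k_qq, k_pp, k_qp = imq_kernel(z_q, z_q, c), imq_kernel(z_p, z_p, c), imq_kernel(z_q, z_p, c)
    # Drop the diagonal terms so the within-sample averages are unbiased.
    term_qq = (np.sum(k_qq) - np.trace(k_qq)) / (n * (n - 1))
    term_pp = (np.sum(k_pp) - np.trace(k_pp)) / (m * (m - 1))
    return term_qq + term_pp - 2.0 * np.mean(k_qp)

rng = np.random.default_rng(0)
codes = rng.standard_normal((128, 8))   # stand-in for encoded training codes
prior = rng.standard_normal((128, 8))   # samples from the prior, same distribution here
print(unbiased_mmd(codes, prior))       # hovers around 0 and is often negative
```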

@ttgump

ttgump commented Feb 8, 2019

Thanks for the explanation! Should we consider the MMD to have converged once it reaches negative values? That is, when the MMD is negative, can we consider q(z|x) to be equal to the prior p(z)?

@tolstikhin
Owner

Dear ttgump,

q(z|x) is not being matched to p(z) in WAE. Instead, the aggregate posterior is.
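
Spelled out (notation roughly as in the WAE paper, so read it as a paraphrase rather than a quote), the regularizer compares distributions as follows:

```latex
% WAE matches the aggregate posterior q_Z to the prior p_Z,
% not the individual conditionals q(z|x):
q_Z(z) \;=\; \int q(z \mid x)\, p_X(x)\, \mathrm{d}x,
\qquad
\text{penalty} \;\propto\; \mathrm{MMD}_k\bigl(q_Z,\, p_Z\bigr).
```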
