
Batch Normalization protocol does not match code implementation. #22

Open
llCurious opened this issue Nov 10, 2021 · 5 comments

@llCurious

Hey, snwagh.
I have been reading your Falcon paper and found this repo. I am interested in how you perform the computation of Batch Normalization.

I have the following three questions:

  1. The implementation of BN seems to be just a single division (screenshots omitted; a plaintext sketch of what I would expect BN to compute follows this list).

  2. The Pow protocol seems to reveal information about the exponent, i.e., \alpha.

  3. The BIT_SIZE in your paper is 32, which seems too small. How do you guarantee accuracy (or rather, precision)? Is BN actually essential to your ML training and inference?
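
For reference, here is a minimal plaintext sketch (my own code, not taken from this repo) of what I would expect a full BN layer to compute, i.e., more than a single division:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Plaintext batch normalization over one feature:
//   y_i = gamma * (x_i - mean) / sqrt(var + eps) + beta
// The "division" is by sqrt(var + eps), and there is also the mean
// subtraction and the learned scale/shift (gamma, beta).
std::vector<float> batchNormForward(const std::vector<float> &x,
                                    float gamma, float beta, float eps = 1e-5f) {
    float mean = 0.0f, var = 0.0f;
    for (float v : x) mean += v;
    mean /= x.size();
    for (float v : x) var += (v - mean) * (v - mean);
    var /= x.size();

    const float invStd = 1.0f / std::sqrt(var + eps);
    std::vector<float> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        y[i] = gamma * (x[i] - mean) * invStd + beta;
    return y;
}
```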

@snwagh
Owner

snwagh commented Nov 11, 2021

@llCurious I've added responses to your questions, in order, below:

  • Right, the Pow function needs to be vectorized. Take a look at this git issue for more details. The division protocol also needs to be modified accordingly; as it stands, the function reports the run-time correctly but effectively computes only the first component correctly (see the sketch after this list).
  • Pow does indeed reveal information about the exponent \alpha, and this is by design (see Fig. 8 here). It considerably simplifies the computation, and the leakage is well quantified. However, the broader implications of revealing this value (such as whether an adversary can launch an attack using that information) are not studied in the paper.
  • A BIT_SIZE of 32 is sufficient for inference, and the code to reproduce this is given in files/preload/. End-to-end training in MPC was not performed (given the prohibitive time and parameter tuning), though I suspect you're right: it would require either a larger bit-width or an adaptive setting of the fixed-point precision.
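
To make the vectorization point concrete, here is a rough plaintext sketch of per-element division (my paraphrase of the Fig. 8 idea, not the actual protocol code): each divisor carries its own public exponent \alpha, the reciprocal of the normalized divisor is seeded with a linear approximation and refined with Newton-Raphson, then rescaled. The current code effectively reuses one \alpha for the whole vector.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Element-wise division a[i] / b[i] for positive divisors, sketched in plaintext.
// For each b[i], a public exponent alpha with 2^alpha <= b[i] < 2^(alpha+1) is
// used to normalize the divisor into [0.5, 1); the reciprocal is then seeded
// with a linear approximation and refined by Newton-Raphson (x <- x * (2 - b*x)).
std::vector<double> elementwiseDivide(const std::vector<double> &a,
                                      const std::vector<double> &b,
                                      int iterations = 3) {
    std::vector<double> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        int alpha = static_cast<int>(std::floor(std::log2(b[i]))); // per-element public exponent
        double bn = std::ldexp(b[i], -(alpha + 1));                // normalized divisor in [0.5, 1)
        double x  = 2.9142 - 2.0 * bn;                             // linear initial guess for 1/bn
        for (int it = 0; it < iterations; ++it)
            x = x * (2.0 - bn * x);                                // Newton-Raphson refinement
        out[i] = a[i] * std::ldexp(x, -(alpha + 1));               // 1/b = (1/bn) * 2^-(alpha+1)
    }
    return out;
}
```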

@llCurious
Author

Thank you for your responses.
Do you mean the division protocol can currently only handle the case where all divisors b share the same exponent? In other words, if the divisors in the vector have different exponents, does the current division protocol fail?

BTW, you seem to have missed my question about the BN protocol. You mention that a larger bit-width or an adaptive setting of the fixed-point precision could help in end-to-end training; do you mean to employ BN to tackle this problem?

@snwagh
Owner

snwagh commented Nov 13, 2021

Yes, that is correct. Either all the exponents have to be the same or the protocol doesn't really guarantee any correctness.

About your BN question: as I said, end-to-end training in MPC was not studied (there are still many open challenges there), so it is hard to comment empirically on the use of BN for training. However, BN is well understood in the (plaintext) ML literature, and the idea is that its benefits (improved convergence/stability) will translate to secure computation too. Does this answer your question? If you're asking whether BN will help train a network in the current code base, then I'll say no; while it is an issue, it is not the only issue preventing training.
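
As a toy plaintext illustration of what goes wrong when the divisors have different exponents (reusing the reciprocal sketch from my earlier reply): if the whole vector reuses the \alpha of its first element, the first quotient comes out right and the second one blows up.

```cpp
#include <cmath>
#include <iostream>

int main() {
    double b[2] = {8.0, 64.0};  // exponents differ: alpha = 3 vs alpha = 6
    // Exponent taken from the first divisor only.
    int alphaShared = static_cast<int>(std::floor(std::log2(b[0])));
    for (double bi : b) {
        double bn = std::ldexp(bi, -(alphaShared + 1));  // wrong normalization for b[1]
        double x  = 2.9142 - 2.0 * bn;
        for (int it = 0; it < 3; ++it) x = x * (2.0 - bn * x);
        std::cout << "1/" << bi << " ~ " << std::ldexp(x, -(alphaShared + 1)) << "\n";
    }
    // Prints ~0.125 for the first element and a wildly wrong value for the second.
    return 0;
}
```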

@llCurious
Author

llCurious commented Nov 16, 2021

OK, I got it. Sorry for the late reply~

  • I also notice that in Section 5.6 of the paper you present data on training performance with (and without) BN. I am a little confused about how the accuracy is obtained. Does that correspond to end-to-end secure training?

  • In addition, I wonder how the comparison to prior works was conducted. Do you run the experiments of the prior works using 32 bits (identical to your setting) or the settings from their papers (like 64 bits in ABY3)?

Thanks a lot for your patient answers!

@snwagh
Owner

snwagh commented Nov 16, 2021

  • The numbers are for end-to-end training, but unfortunately only for plaintext training.
  • I think the numbers are identical (the fastest way to verify would be to run the Falcon code).
