Create a melanoma encrypted image classification #18

Closed
AmT42 opened this issue May 1, 2023 · 4 comments
Labels
📁 Concrete library targeted: Concrete 💵 Grant accepted This project received a grant from the Zama team

Comments


AmT42 commented May 1, 2023

Zama Bounty Program: Melanoma Image Classification

Please give us as much information as possible on the bounty you would like to submit. You can find inspiration from our existing list of bounties here.

  • Bounty type: major_bounty
  • Category: Application
  • Overview: This project demonstrates the application of Zama AI's Concrete-ML library for FHE in machine learning on melanoma image classification using a Kaggle dataset. We will explore compatibility with TensorFlow/ONNX, study different CNN models, and provide a tutorial for developers. Key tasks include data preprocessing, model training, performance trade-offs, FHE implementation with client-server architecture, and documentation. Our goal is to address the challenges of deploying encrypted machine learning models in healthcare while maintaining data privacy.
  • Library targeted: Concrete
  • Reward: $7,500 - $11,500, depending on included components.
  • Related links and reference: Melanoma Classification (https://www.kaggle.com/c/siim-isic-melanoma-classification)

  • Description:

Title: Melanoma Image Classification with Privacy-Preserving FHE Encryption using Zama AI's Concrete-ML Library

Summary:

In this use case, we aim to showcase the potential of Zama AI's Concrete-ML library for machine learning with Fully Homomorphic Encryption (FHE) on private healthcare data. Our focus will be on melanoma image classification using a publicly available dataset from Kaggle. The primary objectives are to study the compatibility between TensorFlow/ONNX and Concrete-ML and to provide a tutorial and baseline for developers interested in using the Concrete-ML library.

Tasks:

1. Data preparation and preprocessing

  • Macro sizing: 2-3 days
  • Tasks:
    • Load the dataset and apply a series of preprocessing techniques such as resizing, normalization, and data augmentation to ensure compatibility with the selected machine learning models.
    • Perform exploratory data analysis to gain insights into the data distribution and quality, and identify potential challenges and opportunities for model development.
  • Deliverables: A notebook detailing data preparation and preprocessing steps, including any data augmentation techniques or transformations required for the models, along with data exploration and visualization.
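As an illustration of the preprocessing steps listed above, here is a minimal sketch of resizing, normalization, and one augmentation (a horizontal flip). It uses plain nested lists as a stand-in for a grayscale image so that it runs without dependencies; the function names and the tiny 4x4 image are hypothetical, and an actual notebook would more likely rely on an imaging library.

```python
def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of a 2D list of pixel values."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def normalize(image, max_value=255.0):
    """Scale pixel values from [0, max_value] to [0, 1]."""
    return [[px / max_value for px in row] for row in image]


def horizontal_flip(image):
    """Mirror each row; a simple augmentation that keeps the label unchanged."""
    return [row[::-1] for row in image]


# Hypothetical 4x4 grayscale image (values in 0..255).
image = [
    [0, 51, 102, 153],
    [51, 102, 153, 204],
    [102, 153, 204, 255],
    [153, 204, 255, 255],
]

small = resize_nearest(image, 2, 2)   # keeps rows 0 and 2, columns 0 and 2
normalized = normalize(small)         # values now lie in [0, 1]
augmented = horizontal_flip(normalized)
```

Keeping the input resolution small matters more than usual here, since every extra pixel adds work to the encrypted inference later on.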

2. Model training and evaluation

  • Macro sizing: 7-8 days
  • Tasks:
    • Train and evaluate various CNN models, including simple CNN models trained from scratch and pre-trained models like ResNet and LeNet used with transfer learning.
    • Report and compare relevant metrics for each model.
    • Convert the trained models to their Concrete-ML equivalents and evaluate their performance.
    • Investigate the difference in performance between Quantization-Aware Training (QAT) when training from scratch and Post-Training Quantization (PTQ) when using transfer learning.
    • Investigate and report any issues encountered with TensorFlow compatibility when using Concrete-ML. If necessary, switch to PyTorch.
  • Deliverables: A notebook detailing model training, evaluation, and comparison of different CNN models, as well as their performance when converted to Concrete-ML equivalents.
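To make the quantization step more concrete, the following sketch applies a simple uniform (affine) quantizer to a handful of made-up weights and measures the round-trip error at several bit widths. This is a generic illustration of post-training quantization, not the scheme Concrete-ML uses internally; the helper names and the weight values are assumptions chosen for the example.

```python
def quantize(values, n_bits):
    """Map floats to integers in [0, 2**n_bits - 1] with a uniform affine scheme."""
    lo, hi = min(values), max(values)
    levels = 2 ** n_bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    quantized = [round((v - lo) / scale) for v in values]
    return quantized, scale, lo


def dequantize(quantized, scale, lo):
    """Map the integers back to approximate float values."""
    return [q * scale + lo for q in quantized]


def max_abs_error(values, n_bits):
    """Largest round-trip error introduced by quantizing to n_bits."""
    quantized, scale, lo = quantize(values, n_bits)
    restored = dequantize(quantized, scale, lo)
    return max(abs(a - b) for a, b in zip(values, restored))


# Hypothetical weights of a single layer.
weights = [-0.82, -0.31, 0.0, 0.12, 0.47, 0.95]

errors = {n_bits: max_abs_error(weights, n_bits) for n_bits in (2, 4, 8)}
for n_bits, err in errors.items():
    print(f"n_bits={n_bits}: max round-trip error = {err:.4f}")
```

The error shrinks as the bit width grows, which is the accuracy side of the trade-off; the cost side (larger bit widths making encrypted inference slower) is what the comparison between QAT and PTQ would need to measure.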

3. Performance and runtime trade-offs with Concrete-ML

  • Macro sizing: 3-5 days
  • Tasks:
    • Study the trade-offs between performance and runtime of Concrete-ML models by experimenting with different values of n_bits, p_error, and global_p_error, as well as other relevant arguments.
    • Analyze the impact of varying p_error on inference time and decision boundaries.
    • Discuss the relationship between these parameters and their effect on model performance, accuracy, and computational resources.
  • Deliverables: A notebook presenting the results of the parameter variation study, including visualizations and insights on the trade-offs between performance and runtime.
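One way to reason about p_error and global_p_error before running experiments is a back-of-the-envelope model. The sketch below assumes that p_error is the failure probability of a single table lookup, that global_p_error is the failure probability of the whole circuit, and that lookups fail independently. The independence assumption and the helper names are mine, not something stated in this proposal, so treat the numbers as rough intuition only.

```python
def global_error(p_error, n_lookups):
    """Probability that at least one of n_lookups independent lookups fails."""
    return 1.0 - (1.0 - p_error) ** n_lookups


def per_lookup_error(global_p_error, n_lookups):
    """Per-lookup error budget that yields the given whole-circuit error."""
    return 1.0 - (1.0 - global_p_error) ** (1.0 / n_lookups)


# Hypothetical circuit sizes, expressed as number of table lookups.
for n_lookups in (100, 10_000, 1_000_000):
    g = global_error(1e-5, n_lookups)
    print(f"{n_lookups:>9} lookups at p_error=1e-5 -> global error ~ {g:.4f}")
```

Under this model a per-lookup error that looks negligible can still dominate in a large CNN, which is why the study should report both parameters alongside the circuit size.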

4. FHE implementation with Concrete-ML and client-server architecture

  • Macro sizing: ~2-3 days
  • Tasks:
    • Implement FHE encryption on private data using the Concrete-ML library, and adapt the chosen model(s) for use with FHE.
    • Set up a client-server architecture using FHEModelClient, FHEModelDev, and FHEModelServer, demonstrating how the client possesses plaintext data, the server only sees encrypted data and needs serialized_evaluation_keys, and the server returns encrypted results that the client can decrypt.
    • Analyze the performance, latency, and impact of varying n_bits values on the size of serialized_evaluation_keys.
  • Deliverables: An enhanced notebook showcasing FHE implementation with Concrete-ML and client-server architecture. Provide a clear explanation of the server's limitations and requirements when using serialized_evaluation_keys.
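The message flow described in this task can be sketched with a toy stand-in. The code below does not use Concrete-ML, FHEModelClient, or FHEModelServer; it implements textbook Paillier encryption with tiny primes, which is only additively homomorphic and completely insecure at this size. The classes ToyClient and ToyServer are hypothetical. The point is only to show who holds what: the client keeps the private key and the plaintext, the server receives ciphertexts plus public material (loosely analogous to the evaluation keys), and the result comes back encrypted.

```python
import math
import random


class ToyClient:
    """Holds the private key; the only party that ever sees plaintext."""

    def __init__(self, p=1009, q=1013):  # tiny primes, for illustration only
        self.n = p * q
        self.n2 = self.n * self.n
        self._lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
        self._mu = pow(self._lam, -1, self.n)  # modular inverse (Python 3.8+)

    def public_key(self):
        return self.n

    def encrypt(self, m):
        while True:
            r = random.randrange(1, self.n)
            if math.gcd(r, self.n) == 1:
                break
        return (pow(self.n + 1, m, self.n2) * pow(r, self.n, self.n2)) % self.n2

    def decrypt(self, c):
        u = pow(c, self._lam, self.n2)
        return ((u - 1) // self.n * self._mu) % self.n


class ToyServer:
    """Evaluates a linear score on ciphertexts using only public material."""

    def __init__(self, weights, bias):
        self.weights = weights  # non-negative integers in this toy
        self.bias = bias

    def run(self, ciphertexts, n):
        n2 = n * n
        acc = pow(n + 1, self.bias, n2)  # encryption of the public bias
        for c, w in zip(ciphertexts, self.weights):
            # Multiplying ciphertexts adds plaintexts; c**w scales by w.
            acc = (acc * pow(c, w, n2)) % n2
        return acc


client = ToyClient()
server = ToyServer(weights=[3, 1, 2], bias=5)

features = [12, 7, 30]                              # private, already quantized
ciphertexts = [client.encrypt(x) for x in features]
encrypted_score = server.run(ciphertexts, client.public_key())
print(client.decrypt(encrypted_score))              # 3*12 + 1*7 + 2*30 + 5
```

A real deployment differs in the important ways this task asks to document: the evaluation keys are large, their size depends on the chosen parameters, and the server can evaluate non-linear functions, none of which this toy captures.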

5. Documentation and tutorial (optional, except for the first task)

  • Macro sizing: 1-4 days
  • Tasks:
    • Provide a summary of all issues encountered with Concrete-ML for improvement.
    • Create a comprehensive blog post or series of posts on Medium describing the use case, methodologies, techniques, and results obtained.
    • Provide clear explanations and code samples for developers to follow as a tutorial.
    • Emphasize the FHE-related aspects of the workflow to cater to the interests of the Zama AI team.
  • Deliverables: A blog post or series of posts, including code snippets and visualizations, illustrating the application of FHE encryption using Concrete-ML for melanoma image classification.

Total Macro Sizing (Estimated): 15-23 days

@zaccherinij zaccherinij changed the title Melanoma Image Classification Melanoma Image Classification May 2, 2023
AmT42 (Author) commented Jun 4, 2023

Hello,

  • The clear model took close to 10 hours to train, so I'm a bit concerned about its FHE counterpart.
  • I also noticed a significant drop in accuracy on large datasets like CIFAR-100 in the new documentation of concrete-ml (https://docs.zama.ai/concrete-ml/deep-learning/torch_support), and my dataset is even more challenging than that. Therefore, I expect the accuracy to be very low.

Based on these observations, I have a strong feeling that we won't be able to achieve acceptable performance. It seems like the technology isn't yet mature enough to handle this kind of problem.

If you agree that this task is infeasible, I would prefer not to "waste" time on it, so I would appreciate your opinion on the matter. In that case, I can search for an easier medical dataset. However, if you think it's worth trying even if we don't get good results, I can start working on it.

Thank you.

@RomanBredehoft

Hello,
It seems that you have opened a topic on our Community forum. Let's continue the discussion over there, as we have already given you a first answer!

@aquint-zama aquint-zama changed the title Melanoma Image Classification Create a Melanoma Encrypted Image classification Sep 28, 2023
@zaccherinij zaccherinij changed the title Create a Melanoma Encrypted Image classification Create a Melanoma Encrypted Image Classification Sep 28, 2023
@zaccherinij zaccherinij added the 📁 Concrete library targeted: Concrete label Sep 28, 2023
@zaccherinij zaccherinij changed the title Create a Melanoma Encrypted Image Classification Create a melanoma encrypted image classification Oct 2, 2023
@zaccherinij zaccherinij added 💵 Grant accepted This project received a grant from the Zama team and removed 🤝 Bounty proposition accepted labels Feb 9, 2024
aquint-zama (Collaborator) commented Mar 19, 2024

@AmT42 closed as inactive

@github-project-automation github-project-automation bot moved this from Grants to Awarded Contributions in Zama Bounty and Grant Program Overview Mar 19, 2024
AmT42 (Author) commented Mar 19, 2024

Hello Aquint,

I'm sorry, I completely forgot to close this ticket. I had put it on pause for a personal project, and I'm now fully dedicated to that project.

I really enjoyed working on this project. You can close it. If I ever have free time, I'll work on it again, but just personally.

Projects
Status: Awarded Contributions

4 participants