Stable Diffusion nano is a simplified implementation of latent diffusion models, inspired by the short course How Diffusion Models Work by deeplearning.ai. The repository aims to make the concepts and structure of diffusion models, particularly Figure 3 of the paper High-Resolution Image Synthesis with Latent Diffusion Models, accessible to beginners.
The repository implements models proposed in two key papers:
- Denoising Diffusion Probabilistic Models
- High-Resolution Image Synthesis with Latent Diffusion Models
The official implementations of the above papers are often too complex for beginners. Stable Diffusion nano simplifies these concepts and presents them in an intuitive, beginner-friendly manner using Jupyter notebooks. Our goal is to provide a hands-on learning experience by focusing on essential components while avoiding unnecessary complexity.
For those interested in deeper theoretical insights, refer to A Gentle Introduction to Diffusion Model: Part 1 - DDPM.
This repository pays particular attention to the multi-head attention (MHA) mechanism between latent feature maps and condition embeddings, illustrating it with diagrams. We believe this will be especially helpful for readers who find MHA difficult to understand.
A simplified U-Net structure was used, emphasizing the integration of multi-head attention across feature maps. The goal is to demonstrate how multi-head attention can be effectively applied to each feature map within the network.
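As a concrete illustration, cross-attention between a latent feature map and condition embeddings can be sketched with PyTorch's `nn.MultiheadAttention`. The shapes and variable names below are illustrative, not the repository's actual code:

```python
import torch
import torch.nn as nn

# Illustrative shapes: a small latent feature map and a short condition sequence.
batch, channels, height, width = 4, 64, 8, 8
cond_len, cond_dim = 5, 64

feat = torch.randn(batch, channels, height, width)   # latent feature map
cond = torch.randn(batch, cond_len, cond_dim)        # condition embeddings

# Flatten the spatial grid so each spatial location becomes a query token.
q = feat.flatten(2).transpose(1, 2)                  # (batch, H*W, channels)

attn = nn.MultiheadAttention(embed_dim=channels, num_heads=4, batch_first=True)
out, _ = attn(query=q, key=cond, value=cond)         # cross-attention: queries from the
                                                     # image, keys/values from the condition

# Restore the spatial layout for the next U-Net stage.
out = out.transpose(1, 2).view(batch, channels, height, width)
print(out.shape)  # torch.Size([4, 64, 8, 8])
```

The key point is that the feature map supplies the queries while the condition supplies the keys and values, so each spatial location attends to the conditioning signal.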
This repository includes the following notebooks:
- DDPM notebook (`01.ddpm.ipynb`)
  - Description: Implements the basic Denoising Diffusion Probabilistic Model (DDPM).
  - Goal: Understand the fundamental process of adding and removing noise to generate images from random noise.
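The noise-adding half of this process has a closed form, which the following minimal sketch illustrates (the schedule values are illustrative, not the notebook's exact hyperparameters):

```python
import torch

# A linear beta schedule, as in the original DDPM paper (values illustrative).
T = 500
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bar = torch.cumprod(alphas, dim=0)             # cumulative product: alpha-bar_t

def q_sample(x0, t, noise):
    """Sample x_t ~ q(x_t | x_0) in one step, without iterating."""
    ab = alpha_bar[t].view(-1, 1, 1, 1)
    return ab.sqrt() * x0 + (1 - ab).sqrt() * noise

x0 = torch.rand(8, 3, 16, 16)                        # a batch of clean 16x16 sprites
t = torch.randint(0, T, (8,))                        # a random timestep per image
noise = torch.randn_like(x0)
xt = q_sample(x0, t, noise)                          # noisy images the model learns to denoise
print(xt.shape)
```

Training then amounts to asking a network to predict `noise` from `xt` and `t`; sampling runs the process in reverse.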
- VAE Latent 2D notebook
  - Description: Implements an image encoder-decoder (VAE) that converts pixel-space images into latent-space representations, a crucial step for latent diffusion models.
  - Goal: Learn how to compress and reconstruct images using a Variational Autoencoder (VAE).
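A minimal convolutional VAE along these lines might look as follows. The layer sizes are illustrative, not the repository's actual architecture:

```python
import torch
import torch.nn as nn

class TinyVAE(nn.Module):
    """Sketch: 3x16x16 pixels -> small spatial latent -> reconstruction."""
    def __init__(self, latent_ch=4):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),   # 16 -> 8
            nn.Conv2d(32, 2 * latent_ch, 3, stride=2, padding=1),  # 8 -> 4; channels hold mu|logvar
        )
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(latent_ch, 32, 4, stride=2, padding=1), nn.ReLU(),  # 4 -> 8
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),       # 8 -> 16
        )

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
        return self.dec(z), mu, logvar

vae = TinyVAE()
x = torch.rand(2, 3, 16, 16)
recon, mu, logvar = vae(x)
print(recon.shape, mu.shape)  # reconstruction in pixel space, 4x4 latent mean
```

The diffusion model then operates on `z` rather than on pixels, which is what makes it a latent diffusion model.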
- LDM Nano notebook (`03.ldm_nano.ipynb`)
  - Description: Simplifies the structure in Figure 3 of the LDM paper to create a basic latent diffusion model.
  - Goal: Implement a complete but simplified latent diffusion model while maintaining the essential architecture and principles.
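Putting the pieces together, latent-space sampling roughly follows the shape below. Here `denoiser` is a hypothetical stand-in for a trained conditional U-Net (the real model would attend to condition embeddings via cross-attention), and the resulting latents would be passed through the VAE decoder to obtain a pixel-space image:

```python
import torch

# Illustrative DDPM-style schedule (not the notebook's exact hyperparameters).
T = 500
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bar = torch.cumprod(alphas, dim=0)

@torch.no_grad()
def sample_latents(denoiser, cond, shape=(1, 4, 4, 4)):
    z = torch.randn(shape)                           # start from pure noise in latent space
    for t in reversed(range(T)):
        eps = denoiser(z, torch.tensor([t]), cond)   # predicted noise at step t
        # One reverse-diffusion step (DDPM posterior mean).
        z = (z - betas[t] / (1 - alpha_bar[t]).sqrt() * eps) / alphas[t].sqrt()
        if t > 0:
            z = z + betas[t].sqrt() * torch.randn_like(z)
    return z

# Dummy stand-in so the loop runs end to end; a trained U-Net goes here.
denoiser = lambda z, t, cond: torch.zeros_like(z)
z = sample_latents(denoiser, cond=None)
print(z.shape)  # latents to feed through the VAE decoder
```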
We use a custom dataset of 16x16 image sprites prepared from sprites by FrootsnVeggies and kyrise. This dataset was also used in the course How Diffusion Models Work. The small resolution ensures fast training and inference, making it well suited for educational purposes.
- Python 3.8 or higher
- Jupyter Notebook
- PyTorch
- torchvision
- numpy
- matplotlib
- plotly
- Clone the repository:
```bash
git clone https://github.com/yourusername/stable-diffusion-nano.git
cd stable-diffusion-nano
```
- Install dependencies:
```bash
pip install -r requirements.txt
```
Open any notebook in Jupyter and run the cells sequentially. Start with `01.ddpm.ipynb` for the basics and progress to `03.ldm_nano.ipynb` for the complete model. You can also run the notebooks on Google Colab for free: simply upload the desired notebook to Colab and ensure the necessary dependencies are installed.
Chapter | Colab |
---|---|
DDPM Notebook | |
VAE Latent 2D Notebook | |
LDM Nano Notebook | |
Contributions are welcome! If you find any issues or want to add new features, feel free to open an issue or submit a pull request.
This project is licensed under the Non-Commercial Use Only License.
- Non-Commercial Use Only: This software is provided for personal, educational, and non-commercial purposes only.
- Commercial Use Prohibited: Commercial use of this software is strictly prohibited without prior written consent from the copyright holder.
- For inquiries about commercial licensing, please contact [email protected].
- How Diffusion Models Work by deeplearning.ai for the inspiration.
- Authors of Denoising Diffusion Probabilistic Models and High-Resolution Image Synthesis with Latent Diffusion Models for their groundbreaking work.
- FrootsnVeggies and kyrise for the dataset.