initial commit
tanganke committed May 10, 2023
0 parents commit a46bcea
Showing 35 changed files with 5,753 additions and 0 deletions.
166 changes: 166 additions & 0 deletions .gitignore
@@ -0,0 +1,166 @@
/data/
/output/

# backup files
*.backup

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
173 changes: 173 additions & 0 deletions README.md
@@ -0,0 +1,173 @@
# Improving Heterogeneous Model Reuse by Density Estimation

<!-- TOC start (generated with https://github.com/derlin/bitdowntoc) -->

- [Improving Heterogeneous Model Reuse by Density Estimation](#improving-heterogeneous-model-reuse-by-density-estimation)
* [1. Toy Example](#1-toy-example)
* [2. Benchmark Experiments on Fashion-MNIST](#2-benchmark-experiments-on-fashion-mnist)
+ [prepare dataset](#prepare-dataset)
+ [2.1 Ours](#21-ours)
+ [2.2 Centralized Baseline](#22-centralized-baseline)
+ [2.3 HMR (compared)](#23-hmr-compared)
+ [2.4 RKME (compared)](#24-rkme-compared)

<!-- TOC end -->

my environment:

- Python 3
- Linux

install package dependencies:

```bash
pip install -r ./requirements.txt
```

project layout:

```conf
# code for toy example experiment
toy_example/
# benchmark experiments
data/
    fashion_mnist/              # Fashion-MNIST datasets under multiparty settings
        A/
            ${party_name}/
                ${class_name}/
                    ${image_name}.png
        B/
        C/
        D/
        train/                  # global train dataset
        test/                   # global test dataset
fashion_mnist.py                # code to load multiparty fashion datasets
fashion_mnist.ipynb             # reproduce all figures in the paper
fashion_mnist.conv/             # train classifiers on local datasets and train the centralized baseline model
    data/                       # symbolic link to ../data/
    output/                     # symbolic link to ../output/
fashion_mnist.realnvp/          # train density estimators locally
fashion_mnist.global/           # global model (Ours)
    deploy_global_model.py      # deploy the global model
    prepare_global_model.py     # calibrate the global model from raw local models (randomly initialized)
fashion_mnist.RKME/             # deploy the global model (RKME)
output/                         # training logs, model checkpoints
```
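The multiparty directory layout above maps naturally onto a nested index. The following is a minimal sketch and not the repository's `fashion_mnist.py`: the function name `index_party_images` and its return shape are assumptions made for illustration. It only walks `<root>/<setting>/<party_name>/<class_name>/*.png` using the standard library.

```python
from collections import defaultdict
from pathlib import Path


def index_party_images(root, setting):
    """Map party name -> class name -> sorted list of PNG paths.

    Hypothetical helper: assumes the layout
    <root>/<setting>/<party_name>/<class_name>/<image_name>.png
    """
    setting_dir = Path(root) / setting
    index = defaultdict(dict)
    # Each sub-directory of the setting is one party.
    for party_dir in sorted(p for p in setting_dir.iterdir() if p.is_dir()):
        # Each sub-directory of a party is one class held by that party.
        for class_dir in sorted(c for c in party_dir.iterdir() if c.is_dir()):
            index[party_dir.name][class_dir.name] = sorted(class_dir.glob("*.png"))
    return dict(index)
```

Under these assumptions, `index_party_images("data/fashion_mnist", "A")` would list the images each party holds in setting `A`, grouped by class.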

## 1. Toy Example

- [Ours](toy_example/HMR_Ours.ipynb)
- [HMR (ICML 2019)](toy_example/HMR_ICML2019.ipynb)

## 2. Benchmark Experiments on Fashion-MNIST

### prepare dataset

```bash
cd data
unzip fashion_mnist.zip
```

### 2.1 Ours

train classifiers on Fashion-MNIST:

```bash
cd fashion_mnist.conv
python3 prepare_conv.py # log dir: output/fashion_mnist/conv/log
```

train density estimators on Fashion-MNIST:

```bash
cd fashion_mnist.realnvp
python3 prepare_realnvp.py # log dir: output/fashion_mnist/realnvp/log
```

evaluate global model on global test set (10k images, 10 classes):

```bash
cd fashion_mnist.global
python3 deploy_global_model.py
# zero-shot accuracy: output/fashion_mnist/global/deploy
# calibration log: output/fashion_mnist/global/calibration/log
```
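This README does not describe how the global model uses the local classifiers and density estimators. As a rough illustration only, and not the code in `deploy_global_model.py`, one plausible way to combine them is to weight each party's class probabilities by a softmax over the parties' log-densities of the input. All names below (`combine_parties`, `log_densities`, `party_probs`) are hypothetical.

```python
import math


def combine_parties(log_densities, party_probs, num_classes):
    """Mix local class probabilities using density-based party weights.

    log_densities: list of log p_i(x), one value per party (assumed inputs).
    party_probs:   list of dicts {global_class_index: probability}, one per
                   party, covering only the classes that party has seen.
    Returns a list of length num_classes; it sums to 1 when every party's
    probabilities sum to 1.
    """
    # Softmax over log-densities, shifted by the max for numerical stability.
    top = max(log_densities)
    weights = [math.exp(ld - top) for ld in log_densities]
    total = sum(weights)
    weights = [w / total for w in weights]

    # Accumulate each party's weighted prediction onto the global label space.
    global_probs = [0.0] * num_classes
    for weight, probs in zip(weights, party_probs):
        for cls, p in probs.items():
            global_probs[cls] += weight * p
    return global_probs
```

In this sketch, a party whose density estimator assigns the input a much higher likelihood dominates the mixture, so classes that a party has never seen can only receive mass through other parties.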

train global model from raw model on global train set:

```bash
cd fashion_mnist.global
python3 prepare_global_model.py
# raw accuracy: output/fashion_mnist/global/raw
# log dir: output/fashion_mnist/global/raw/log
```

*NOTE*: structure of log directories:

```bash
.
├── A
│   ├── party_0
│   │   ├── version_0
│   │   │... version_XX
│   └── party_1
├── B
│   ├── party_0
│   ├── party_1
│   └── party_2
├── C
│   ├── party_0
│   ├── party_1
│   └── party_2
└── D
    ├── party_0
    ├── party_1
    ├── party_2
    ├── party_3
    ├── party_4
    ├── party_5
    └── party_6
```
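Because each party directory can accumulate several `version_XX` runs, it is often useful to pick the most recent one per party. The sketch below assumes exactly the layout shown above (`<setting>/party_<k>/version_<n>`); the helper name `latest_versions` is hypothetical and not part of the repository.

```python
import re
from pathlib import Path

_VERSION = re.compile(r"version_(\d+)")


def latest_versions(log_root):
    """Return {(setting, party): Path of the highest-numbered version_N dir}."""
    latest = {}
    for path in Path(log_root).glob("*/party_*/version_*"):
        match = _VERSION.fullmatch(path.name)
        if not (path.is_dir() and match):
            continue
        key = (path.parent.parent.name, path.parent.name)
        number = int(match.group(1))
        # Compare numerically so that version_10 ranks above version_2.
        if key not in latest or number > latest[key][0]:
            latest[key] = (number, path)
    return {key: path for key, (number, path) in latest.items()}
```

For example, `latest_versions("output/fashion_mnist/conv/log")` would return one path per `(setting, party)` pair, assuming that log directory follows the tree above.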

### 2.2 Centralized Baseline

```bash
cd fashion_mnist.conv
python3 prepare_baseline.py # log dir: output/fashion_mnist/conv/baseline/log
```

### 2.3 HMR (compared)

> Wu, Xi Zhu, Song Liu, and Zhi Hua Zhou. 2019. “Heterogeneous Model Reuse via Optimizing Multiparty Multiclass Margin.” 36th International Conference on Machine Learning, ICML 2019 2019-June: 11862–71.

see [GitHub](https://github.com/YuriWu/HMR).

pre-run results: [output/fashion_mnist.HMR/result.csv](./output/fashion_mnist.HMR/result.csv)

### 2.4 RKME (compared)

> X. Wu, W. Xu, S. Liu, and Z. Zhou. Model reuse with reduced kernel mean embedding specification. IEEE Transactions on Knowledge and Data Engineering, 35(01):699–710, jan 2023.

```bash
cd fashion_mnist.RKME
```

Kernel methods usually cannot work directly on the raw-pixel level or raw-document level due to the high input dimension.
We extract features from the penultimate layer of a pre-trained ResNet-110.

```bash
python3 prepare_features.py # save features to: output/fashion_mnist.RKME/features.resnet101
```

fit reduced kernel mean embedding, find optimal betas and reduced points (M = 10).

```bash
python3 prepare_rkme.py # log dir: output/fashion_mnist.RKME/features.resnet101.RKME.M=10
```
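For intuition, fitting a reduced kernel mean embedding means choosing weights (`betas`) and reduced points so that the weighted embedding is close, in RKHS norm, to the empirical mean embedding of the features. The sketch below evaluates that squared distance for a Gaussian kernel. It is an illustration under stated assumptions (a Gaussian kernel, a bandwidth parameter `gamma`, hypothetical function names) and not the code in `prepare_rkme.py`.

```python
import math


def gaussian_kernel(a, b, gamma=1.0):
    """k(a, b) = exp(-gamma * ||a - b||^2) for two equal-length vectors."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def rkme_objective(points, reduced, betas, gamma=1.0):
    """Squared RKHS distance between the empirical kernel mean embedding of
    `points` and the weighted embedding sum_m betas[m] * k(reduced[m], .)."""
    n = len(points)
    # (1/n^2) * sum_ij k(x_i, x_j)
    data_term = sum(
        gaussian_kernel(a, b, gamma) for a in points for b in points
    ) / n**2
    # (1/n) * sum_i sum_m beta_m * k(x_i, z_m)
    cross_term = sum(
        beta * gaussian_kernel(x, z, gamma)
        for x in points
        for z, beta in zip(reduced, betas)
    ) / n
    # sum_ml beta_m * beta_l * k(z_m, z_l)
    reduced_term = sum(
        bi * bj * gaussian_kernel(zi, zj, gamma)
        for zi, bi in zip(reduced, betas)
        for zj, bj in zip(reduced, betas)
    )
    return data_term - 2.0 * cross_term + reduced_term
```

As a sanity check, using the data points themselves as reduced points with uniform weights `1/n` drives the objective to zero (up to rounding), while fewer reduced points generally leave a positive residual that the fitting step tries to minimize.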

deploy RKME on global test set.

```bash
python3 deploy_rkme.py # log dir: output/fashion_mnist.RKME/features.resnet101.RKME.M=10/deploy
```
2 changes: 2 additions & 0 deletions data/.gitkeep
@@ -0,0 +1,2 @@
/fashion_mnist/
/fashion_mnist.zip
1 change: 1 addition & 0 deletions fashion_mnist.RKME/ConvNet.py
@@ -0,0 +1 @@
../fashion_mnist.conv/ConvNet.py