Add ImageNet datamodule, refactor datamodule configs [RT-76] (#6)
* Add stub for imagenet datamodule

Signed-off-by: Fabrice Normandin <[email protected]>

* Add SLURM_TMPDIR in devcontainer, add notes

Signed-off-by: Fabrice Normandin <[email protected]>

* Add Olexa's imagenet recipe in python form

Signed-off-by: Fabrice Normandin <[email protected]>

* Add DataModule for the ImageNet dataset

Signed-off-by: Fabrice Normandin <[email protected]>

* Use .yaml files with structured base 4 datamodules

Signed-off-by: Fabrice Normandin <[email protected]>

* Fix typo in log, change preparation benchmark dir

Signed-off-by: Fabrice Normandin <[email protected]>

* Reduce logging verbosity a bit

Signed-off-by: Fabrice Normandin <[email protected]>

* Temporarily add a project cmd -> project/main.py

Signed-off-by: Fabrice Normandin <[email protected]>

* Fix tiny bugs and import errors

Signed-off-by: Fabrice Normandin <[email protected]>

* Add notes about extending structured configs

Signed-off-by: Fabrice Normandin <[email protected]>

* Fix some issues, fix defaults

Signed-off-by: Fabrice Normandin <[email protected]>

* Minor touchup

Signed-off-by: Fabrice Normandin <[email protected]>

* Touchups and import fixes

Signed-off-by: Fabrice Normandin <[email protected]>

* Switch to full config files for datamodule configs

Signed-off-by: Fabrice Normandin <[email protected]>

* Add marks for combinations of configs

Signed-off-by: Fabrice Normandin <[email protected]>

* Add regression file for ImageNet batch

Signed-off-by: Fabrice Normandin <[email protected]>

* Add pytest-testmon dev dependency

Signed-off-by: Fabrice Normandin <[email protected]>

* Fix pre-commit issue

Signed-off-by: Fabrice Normandin <[email protected]>

* Fix bug in VisionDataModule.train/test/val kwargs

Signed-off-by: Fabrice Normandin <[email protected]>

* Fix issues with imagenet32+VisionDataModule

Signed-off-by: Fabrice Normandin <[email protected]>

* Add a regression file for first batch ImageNet32

Signed-off-by: Fabrice Normandin <[email protected]>

* Move VisionDataModule, fix import issues

Signed-off-by: Fabrice Normandin <[email protected]>

* Update dependencies

Signed-off-by: Fabrice Normandin <[email protected]>

* Find working combination of pydantic/lightning

Signed-off-by: Fabrice Normandin <[email protected]>

* Fix integration tests that have to do with dataset

Signed-off-by: Fabrice Normandin <[email protected]>

* Fix typo

Signed-off-by: Fabrice Normandin <[email protected]>

* Update devcontainer file

Signed-off-by: Fabrice Normandin <[email protected]>

---------

Signed-off-by: Fabrice Normandin <[email protected]>
lebrice authored Jun 25, 2024
1 parent 8c3d69b commit a57ce73
Showing 35 changed files with 1,126 additions and 1,108 deletions.
25 changes: 19 additions & 6 deletions .devcontainer/devcontainer.json
@@ -59,15 +59,25 @@
     }
   },
   "containerEnv": {
-    "SCRATCH": "/home/vscode/scratch"
+    "SCRATCH": "/home/vscode/scratch",
+    "SLURM_TMPDIR": "/tmp"
   },
-  // Mount a "$SCRATCH" directory in the host to ~/scratch in the container.
-  // Mount /network to use this to mount a "$SCRATCH" directory in the host to ~/scratch in the container.
   "mounts": [
     // https://code.visualstudio.com/remote/advancedcontainers/add-local-file-mount
-    "source=${localEnv:HOME}/.cache/pdm,target=/home/vscode/.pdm_install_cache,type=bind,consistency=cached",
+    // Mount a directory which will contain the pdm installation cache (shared with the host machine).
+    // This will use $SCRATCH/.cache/pdm, otherwise
+    // Mount a "$SCRATCH" directory in the host to ~/scratch in the container.
     "source=${localEnv:SCRATCH},target=/home/vscode/scratch,type=bind,consistency=cached",
-    "source=${localEnv:NETWORK_DIR:/network},target=/network,type=bind,readonly"
+    "source=${localEnv:SCRATCH}/.cache/pdm,target=/home/vscode/.pdm_install_cache,type=bind,consistency=cached",
+    // Mount a /network to match the /network directory on the host.
+    // FIXME: This assumes that either the NETWORK_DIR environment variable is set on the host, or
+    // that the /network directory exists.
+    "source=${localEnv:NETWORK_DIR:/network},target=/network,type=bind,readonly",
+    // Mount a /tmp on the host machine to /tmp/slurm_tmpdir in the container.
+    // note: there's also a SLURM_TMPDIR env variable set to /tmp/slurm_tmpdir in the container.
+    // NOTE: this assumes that either $SLURM_TMPDIR is set on the host machine (e.g. a compute node)
+    // or that `/tmp/slurm_tmpdir` exists on the host machine.
+    "source=${localEnv:SLURM_TMPDIR:/tmp/slurm_tmpdir},target=/tmp,type=bind,consistency=cached"
   ],
   "runArgs": [
     "--gpus",
@@ -76,7 +86,10 @@
   ],
   // create the pdm cache dir on the host machine if it doesn't exist yet so the mount above
   // doesn't fail.
-  "initializeCommand": "mkdir -p ~/.cache/pdm",
+  "initializeCommand": {
+    "create pdm install cache": "mkdir -p ${SCRATCH?need the SCRATCH environment variable to be set.}/.cache/pdm", // todo: put this on $SCRATCH on the host (e.g. compute node)
+    "create fake SLURM_TMPDIR": "mkdir -p ${SLURM_TMPDIR?need the SLURM_TMPDIR environment variable to be set.}" // this is fine on compute nodes
+  },
   // NOTE: Getting some permission issues with the .cache dir if mounting .cache/pdm to
   // .cache/pdm in the container. Therefore, here I'm making a symlink from ~/.cache/pdm to
   // ~/.pdm_install_cache so the ~/.cache directory is writeable by the container.
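
Note on the new `initializeCommand` entries: they rely on the shell's `${VAR?message}` parameter expansion, which makes a non-interactive shell abort with `message` when `VAR` is unset. The sketch below reproduces that behavior outside the devcontainer, assuming a POSIX shell on the host. The helper name `ensure_dirs` is hypothetical and is not part of this commit.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): create the host directories
# that the devcontainer bind mounts expect to exist.
ensure_dirs() {
    # ${VAR?msg} fails the expansion when VAR is unset; in a non-interactive
    # shell this terminates the shell. The subshell keeps that exit local, so
    # the caller only sees a non-zero status.
    (
        mkdir -p "${SCRATCH?need the SCRATCH environment variable to be set.}/.cache/pdm" &&
        mkdir -p "${SLURM_TMPDIR?need the SLURM_TMPDIR environment variable to be set.}"
    )
}
```

The form without a colon only rejects an unset variable. An empty `SCRATCH` would still expand, giving `/.cache/pdm`; the `${VAR:?message}` form rejects empty values as well, which may be the safer choice here.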