# How to install datasets

We suggest putting all datasets under the same folder (say `$DATA`) to ease management, and following the instructions below to organize them so that you do not need to modify the source code. The file structure looks like

```
$DATA/
|–– imagenet/
|–– caltech-101/
|–– oxford_pets/
|–– stanford_cars/
```

If you have already installed some datasets elsewhere, you can create symbolic links at `$DATA/dataset_name` that point to the original data, to avoid downloading duplicates.
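
For example, a minimal sketch that reuses an existing ImageNet copy (the source path `/path/to/imagenet` is a placeholder, adjust it to your setup):

```bash
# Placeholder path: point this at wherever your ImageNet copy already lives.
# The symlink makes it appear under $DATA without copying any files.
ln -s /path/to/imagenet $DATA/imagenet
```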

Datasets list:

- [Caltech101](#caltech101)
- [OxfordPets](#oxfordpets)
- [StanfordCars](#stanfordcars)
- [Flowers102](#flowers102)
- [FGVCAircraft](#fgvcaircraft)
- [DTD](#dtd)
- [EuroSAT](#eurosat)

The instructions for preparing each dataset are detailed below. To ensure reproducibility and fair comparison for future work, we provide fixed train/val/test splits for all datasets except ImageNet, where the validation set is used as the test set. The fixed splits are either taken from the original datasets (if available) or created by us.

### Caltech101
- Create a folder named `caltech-101/` under `$DATA`.
- Download `101_ObjectCategories.tar.gz` from http://www.vision.caltech.edu/Image_Datasets/Caltech101/101_ObjectCategories.tar.gz and extract the file under `$DATA/caltech-101`.
- Download `split_zhou_Caltech101.json` from this [link](https://drive.google.com/file/d/1hyarUivQE36mY6jSomru6Fjd-JzwcCzN/view?usp=sharing) and put it under `$DATA/caltech-101`.

The directory structure should look like
```
caltech-101/
|–– 101_ObjectCategories/
|–– split_zhou_Caltech101.json
```
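
These steps can be scripted roughly as follows. This is an unofficial sketch, assuming `wget`, `tar`, and the third-party `gdown` CLI (which downloads Google Drive files by the file ID embedded in the link above) are installed:

```bash
mkdir -p $DATA/caltech-101 && cd $DATA/caltech-101
wget http://www.vision.caltech.edu/Image_Datasets/Caltech101/101_ObjectCategories.tar.gz
tar -xzf 101_ObjectCategories.tar.gz      # creates 101_ObjectCategories/
gdown 1hyarUivQE36mY6jSomru6Fjd-JzwcCzN   # split_zhou_Caltech101.json
```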

### OxfordPets
- Create a folder named `oxford_pets/` under `$DATA`.
- Download the images from https://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz.
- Download the annotations from https://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz.
- Download `split_zhou_OxfordPets.json` from this [link](https://drive.google.com/file/d/1501r8Ber4nNKvmlFVQZ8SeUHTcdTTEqs/view?usp=sharing).

The directory structure should look like
```
oxford_pets/
|–– images/
|–– annotations/
|–– split_zhou_OxfordPets.json
```
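
A shell sketch of the same steps, under the same tooling assumptions (`wget`, `tar`, `gdown`):

```bash
mkdir -p $DATA/oxford_pets && cd $DATA/oxford_pets
wget https://www.robots.ox.ac.uk/~vgg/data/pets/data/images.tar.gz
wget https://www.robots.ox.ac.uk/~vgg/data/pets/data/annotations.tar.gz
tar -xzf images.tar.gz && tar -xzf annotations.tar.gz
gdown 1501r8Ber4nNKvmlFVQZ8SeUHTcdTTEqs   # split_zhou_OxfordPets.json
```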

### StanfordCars
- Create a folder named `stanford_cars/` under `$DATA`.
- Download the train images from http://ai.stanford.edu/~jkrause/car196/cars_train.tgz.
- Download the test images from http://ai.stanford.edu/~jkrause/car196/cars_test.tgz.
- Download the train labels from https://ai.stanford.edu/~jkrause/cars/car_devkit.tgz.
- Download the test labels from http://ai.stanford.edu/~jkrause/car196/cars_test_annos_withlabels.mat.
- Download `split_zhou_StanfordCars.json` from this [link](https://drive.google.com/file/d/1ObCFbaAgVu0I-k_Au-gIUcefirdAuizT/view?usp=sharing).

The directory structure should look like
```
stanford_cars/
|–– cars_test/
|–– cars_test_annos_withlabels.mat
|–– cars_train/
|–– devkit/
|–– split_zhou_StanfordCars.json
```
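
Scripted roughly, with the same caveats (`wget`, `tar`, and `gdown` assumed available; the Drive file ID comes from the link above):

```bash
mkdir -p $DATA/stanford_cars && cd $DATA/stanford_cars
wget http://ai.stanford.edu/~jkrause/car196/cars_train.tgz
wget http://ai.stanford.edu/~jkrause/car196/cars_test.tgz
wget https://ai.stanford.edu/~jkrause/cars/car_devkit.tgz
wget http://ai.stanford.edu/~jkrause/car196/cars_test_annos_withlabels.mat
tar -xzf cars_train.tgz && tar -xzf cars_test.tgz && tar -xzf car_devkit.tgz
gdown 1ObCFbaAgVu0I-k_Au-gIUcefirdAuizT   # split_zhou_StanfordCars.json
```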

### Flowers102
- Create a folder named `oxford_flowers/` under `$DATA`.
- Download the images and labels from https://www.robots.ox.ac.uk/~vgg/data/flowers/102/102flowers.tgz and https://www.robots.ox.ac.uk/~vgg/data/flowers/102/imagelabels.mat respectively.
- Download `cat_to_name.json` from [here](https://drive.google.com/file/d/1AkcxCXeK_RCGCEC_GvmWxjcjaNhu-at0/view?usp=sharing).
- Download `split_zhou_OxfordFlowers.json` from [here](https://drive.google.com/file/d/1Pp0sRXzZFZq15zVOzKjKBu4A9i01nozT/view?usp=sharing).

The directory structure should look like
```
oxford_flowers/
|–– cat_to_name.json
|–– imagelabels.mat
|–– jpg/
|–– split_zhou_OxfordFlowers.json
```
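
As an unofficial shell sketch (assuming `wget`, `tar`, and `gdown`; per the structure above, the image tarball extracts into `jpg/`):

```bash
mkdir -p $DATA/oxford_flowers && cd $DATA/oxford_flowers
wget https://www.robots.ox.ac.uk/~vgg/data/flowers/102/102flowers.tgz
wget https://www.robots.ox.ac.uk/~vgg/data/flowers/102/imagelabels.mat
tar -xzf 102flowers.tgz                   # images land in jpg/
gdown 1AkcxCXeK_RCGCEC_GvmWxjcjaNhu-at0   # cat_to_name.json
gdown 1Pp0sRXzZFZq15zVOzKjKBu4A9i01nozT   # split_zhou_OxfordFlowers.json
```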

### FGVCAircraft
- Download the data from https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/archives/fgvc-aircraft-2013b.tar.gz.
- Extract `fgvc-aircraft-2013b.tar.gz` and keep only `data/`.
- Move `data/` to `$DATA` and rename the folder to `fgvc_aircraft/`.

The directory structure should look like
```
fgvc_aircraft/
|–– images/
|–– ... # a bunch of .txt files
```
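
A rough equivalent in shell form (assuming `wget` and `tar`, and that the archive unpacks into a `fgvc-aircraft-2013b/` folder):

```bash
cd $DATA
wget https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/archives/fgvc-aircraft-2013b.tar.gz
tar -xzf fgvc-aircraft-2013b.tar.gz
mv fgvc-aircraft-2013b/data fgvc_aircraft   # keep only data/, renamed
rm -r fgvc-aircraft-2013b                   # drop the rest of the archive contents
```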

### DTD
- Download the dataset from https://www.robots.ox.ac.uk/~vgg/data/dtd/download/dtd-r1.0.1.tar.gz and extract it to `$DATA`. This should lead to `$DATA/dtd/`.
- Download `split_zhou_DescribableTextures.json` from this [link](https://drive.google.com/file/d/1u3_QfB467jqHgNXC00UIzbLZRQCg2S7x/view?usp=sharing).

The directory structure should look like
```
dtd/
|–– images/
|–– imdb/
|–– labels/
|–– split_zhou_DescribableTextures.json
```
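
Sketched in shell form (assuming `wget`, `tar`, and `gdown`):

```bash
cd $DATA
wget https://www.robots.ox.ac.uk/~vgg/data/dtd/download/dtd-r1.0.1.tar.gz
tar -xzf dtd-r1.0.1.tar.gz                # creates $DATA/dtd/
cd dtd
gdown 1u3_QfB467jqHgNXC00UIzbLZRQCg2S7x   # split_zhou_DescribableTextures.json
```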

### EuroSAT
- Create a folder named `eurosat/` under `$DATA`.
- Download the dataset from http://madm.dfki.de/files/sentinel/EuroSAT.zip and extract it to `$DATA/eurosat/`.
- Download `split_zhou_EuroSAT.json` from [here](https://drive.google.com/file/d/1Ip7yaCWFi0eaOFUGga0lUdVi_DDQth1o/view?usp=sharing).

The directory structure should look like
```
eurosat/
|–– 2750/
|–– split_zhou_EuroSAT.json
```
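
And as a shell sketch (assuming `wget`, `unzip`, and `gdown`; per the structure above, the archive extracts into `2750/`):

```bash
mkdir -p $DATA/eurosat && cd $DATA/eurosat
wget http://madm.dfki.de/files/sentinel/EuroSAT.zip
unzip EuroSAT.zip                         # extracts the 2750/ folder
gdown 1Ip7yaCWFi0eaOFUGga0lUdVi_DDQth1o   # split_zhou_EuroSAT.json
```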

MIT License

Copyright (c) 2024 Jihwan Bang

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

# Active Prompt Learning in Visual Language Models
## <center> [![Paper](https://img.shields.io/badge/arXiv-2311.11178-b31b1b.svg)](https://arxiv.org/abs/2311.11178) [![Youtube Badge](https://img.shields.io/badge/Youtube-b31b1b?style=round-square&logo=youtube&link=https://www.youtube.com/c/ZipCookResearcher)](https://www.youtube.com/watch?v=JHC9zaDYf5o&ab_channel=%08ZipCookResearcher) [![Citation Badge](https://api.juleskreuer.eu/citation-badge.php?doi=10.48550/arXiv.2311.11178)](https://juleskreuer.eu/citation-badge/) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

This repository is the official PyTorch implementation of the following paper.

> **Active Prompt Learning in Visual Language Models**
> Jihwan Bang, Sumyeong Ahn, Jae-Gil Lee
> CVPR 2024

<img src='architecture.png'>

If you are interested in our paper, please star and watch this repository.

## Updates

- **[Jun.17th.2024]** First release of the code.

## How to Install
This code is built on the [CoOp repository](https://github.com/KaiyangZhou/CoOp), which is itself built on top of the excellent [Dassl.pytorch](https://github.com/KaiyangZhou/Dassl.pytorch) toolbox. For ease of use, we include the `dassl` directory in this repository and revise `requirements.txt` accordingly. Set up the environment with the following commands:
```bash
conda create -n pcb python=3.10
conda activate pcb
cd pcb
pip install -r requirements.txt
```

Next, prepare the datasets by following the instructions in [DATASETS.md](DATASETS.md).

## How to Run
To run the code, first open `scripts/alvlm/main.sh` and set the `DATA` variable to the directory where your datasets are stored. You can then launch a run with the following command (a concrete example is given after the parameter list):
```bash
CUDA_VISIBLE_DEVICES=XX sh scripts/alvlm/main.sh [DATASET NAME] [MODEL NAME] [AL METHOD] [SEED NUMBER] [MODE]
```
- **DATASET NAME** $\in$ [oxford_flowers, dtd, oxford_pets, caltech101, stanford_cars, eurosat, fgvc_aircraft]
- **MODEL NAME** $\in$ [RN50, RN101, vit_b32, vit_b16]
- **AL METHOD** $\in$ [random, entropy, coreset, badge]
- **SEED NUMBER**: an integer
- **MODE** $\in$ [none, AS, AE]: selects the description-augmentation mode
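
For instance, the following hypothetical invocation (GPU index 0 and the specific parameter choices are arbitrary examples) runs BADGE-based active learning on OxfordPets with a ViT-B/16 backbone, seed 1, and no description augmentation:
```bash
CUDA_VISIBLE_DEVICES=0 sh scripts/alvlm/main.sh oxford_pets vit_b16 badge 1 none
```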

## Citation
If you use this code in your research, please kindly cite the following paper:

```bibtex
@inproceedings{bang2024active,
  title={Active Prompt Learning in Vision Language Models},
  author={Bang, Jihwan and Ahn, Sumyeong and Lee, Jae-Gil},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={27004--27014},
  year={2024}
}
```