# Setup Dataset

We follow Layout2Im and LostGAN, and use the download/preprocess scripts from Sg2Im.

## COCO-stuff 2017 (the deprecated segmentation challenge)

```bash
bash bash/download_coco.sh
```
COCO 2017 segmentation challenge split file structure:

```
├── annotations
│   ├── deprecated-challenge2017
│   │   ├── train-ids.txt
│   │   └── val-ids.txt
│   ├── instances_train2017.json
│   ├── instances_val2017.json
│   ├── stuff_train2017.json
│   ├── stuff_val2017.json
│   └── ...
└── images
    ├── train2017
    │   ├── 000000000872.jpg
    │   └── ...
    └── val2017
        ├── 000000000321.jpg
        └── ...
```
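After the download finishes, it can help to verify that the tree above is in place before training. Below is a minimal sketch in Python; the `datasets/coco` root is an assumption (the repo only documents `datasets/vg` for Visual Genome), so adjust `COCO_ROOT` to wherever `download_coco.sh` unpacks the data in your setup.

```python
# Minimal layout check for the COCO download. The datasets/coco root is
# an assumption; change COCO_ROOT if your files live elsewhere.
from pathlib import Path

COCO_ROOT = Path("datasets/coco")  # assumed download location

expected = [
    "annotations/deprecated-challenge2017/train-ids.txt",
    "annotations/deprecated-challenge2017/val-ids.txt",
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
    "annotations/stuff_train2017.json",
    "annotations/stuff_val2017.json",
    "images/train2017",
    "images/val2017",
]

missing = [p for p in expected if not (COCO_ROOT / p).exists()]
for p in missing:
    print("missing:", COCO_ROOT / p)
if not missing:
    print("COCO layout looks complete.")
```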

## Visual Genome

```bash
# Download and unpack the relevant parts of the Visual Genome dataset.
# This creates the directory datasets/vg and downloads about 15 GB of data
# to it; after unpacking it takes about 30 GB of disk space.
bash bash/download_vg.sh

# After downloading, preprocess the dataset. This splits the data into
# train / val / test splits, consolidates all scene graphs into HDF5
# files, and applies several heuristics to clean the data: images and
# objects that are too small are ignored, only object and attribute
# categories that appear some number of times in the training set are
# kept, and minimum and maximum values are set on the number of objects
# and relationships that appear per image.
# This creates train.h5, val.h5, test.h5, and vocab.json in datasets/vg.
python scripts/preprocess_vg.py
```
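Once preprocessing completes, a quick sanity check of the generated files can catch a broken download early. A minimal sketch follows, assuming only the output names stated above (`train.h5`, `val.h5`, `test.h5`, and `vocab.json` in `datasets/vg`); the internal HDF5 schema is not documented here, so the sketch simply lists whatever datasets each file contains rather than assuming one.

```python
# Sanity-check the preprocessed Visual Genome outputs. Requires h5py.
# Only the file names come from the text above; the HDF5 layout is not
# assumed, so we just enumerate the datasets in each file.
import json
import h5py

with open("datasets/vg/vocab.json") as f:
    vocab = json.load(f)
print("vocab.json keys:", sorted(vocab))

for split in ("train", "val", "test"):
    with h5py.File(f"datasets/vg/{split}.h5", "r") as h5:
        print(f"{split}.h5:")
        for name, obj in h5.items():
            if isinstance(obj, h5py.Dataset):
                print(f"  {name}: shape={obj.shape}, dtype={obj.dtype}")
```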
Visual Genome file structure:

```
├── VG_100K
│   ├── captions_val2017.json
│   └── ...
├── objects.json
├── train.json
└── ...
```