code for [ECCV 2022 paper] Contributions of Shape, Texture, and Color in Visual Recognition

gyhandy/Humanoid-Vision-Engine

Humanoid-Vision-Engine

[ECCV 2022] Contributions of Shape, Texture, and Color in Visual Recognition

The code is being actively updated.

Figure: Left: The contributions of shape, texture, and color may differ across scenarios and tasks. Right: The Humanoid Vision Engine takes a dataset as input and summarizes how shape, texture, and color contribute to the given recognition task purely through learning (e.g., in ImageNet classification, shape is the most discriminative feature and contributes the most to visual recognition).

Figure: Pipeline for the humanoid vision engine (HVE). (a) shows how the human visual system processes an image: after the eyes perceive the object, different parts of the brain are activated, and the brain organizes and summarizes that information to reach a conclusion. (b) shows how HVE is designed so that its components correspond to each part of the human visual system.

Getting Started

Installation

  • Clone this repo:
git clone https://github.com/gyhandy/Humanoid-Vision-Engine.git
cd Humanoid-Vision-Engine

Datasets

  • Please download the preprocessed dataset from here, then unzip it and place it in ./data (the commands below expect paths under data/iLab/). (Note: if the download does not start automatically, right-click the link, copy it, and open it in a new tab.)
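
Before training, it can help to confirm that the unzipped data landed where the later commands look for it. The sketch below is a minimal check, assuming the layout implied by the commands in this README (data/iLab/feature_images/ with shape, texture, and color subfolders). The function name missing_feature_dirs is hypothetical and not part of this repo.

```python
from pathlib import Path

# Feature folders referenced by the training commands in this README.
FEATURES = ("shape", "texture", "color")


def missing_feature_dirs(data_root="data/iLab"):
    """Return the feature-image folders that are absent under data_root.

    Hypothetical helper: the folder layout is inferred from the README
    commands, not from an official specification of the dataset archive.
    """
    base = Path(data_root) / "feature_images"
    return [str(base / name) for name in FEATURES if not (base / name).is_dir()]


if __name__ == "__main__":
    missing = missing_feature_dirs()
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("All feature-image folders found.")
```

Run it from the repository root. An empty result only means the folders exist, not that their contents are complete.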

Analyze Dataset Bias

  1. Please train the feature models with
# train shape model
python preprocess_dataset/1_train_resnet/main.py --data data/iLab/feature_images/shape --arch data/iLab/model/shape_resnet18/
# train texture model
python preprocess_dataset/1_train_resnet/main.py --data data/iLab/feature_images/texture --arch data/iLab/model/texture_resnet18/
# train color model
python preprocess_dataset/1_train_resnet/main.py --data data/iLab/feature_images/color --arch data/iLab/model/color_resnet18/
  2. Please train the humanoid neural network with
python HNN/train_HNN.py --root_shape data/iLab/feature_images/shape --root_texture data/iLab/feature_images/texture --root_color data/iLab/feature_images/color --shape_model data/iLab/model/shape_resnet18/1.pth --texture_model data/iLab/model/texture_resnet18/1.pth --color_model data/iLab/model/color_resnet18/1.pth --save_model_dir data/iLab/model/Attention
  3. Analyze the bias of the dataset
python HNN/compute_bias.py --root_shape data/iLab/feature_images/shape --root_texture data/iLab/feature_images/texture --root_color data/iLab/feature_images/color --shape_model data/iLab/model/shape_resnet18/1.pth --texture_model data/iLab/model/texture_resnet18/1.pth --color_model data/iLab/model/color_resnet18/1.pth --attention_model_dir data/iLab/model/Attention/model_ck0.pth
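
The commands above repeat the same paths for each feature, so a small driver can make the sequence easier to rerun. The following is a sketch only. It uses just the scripts, flags, and paths shown above, while the function names (feature_train_cmd, hnn_cmd, all_commands) and the --run switch are hypothetical and not part of this repo. It also calls the current interpreter (sys.executable) instead of a bare python.

```python
import subprocess
import sys

FEATURES = ("shape", "texture", "color")


def feature_train_cmd(feature, root="data/iLab"):
    """Command that trains the model for one feature."""
    return [
        sys.executable, "preprocess_dataset/1_train_resnet/main.py",
        "--data", f"{root}/feature_images/{feature}",
        "--arch", f"{root}/model/{feature}_resnet18/",
    ]


def hnn_cmd(script, last_flag, last_value, root="data/iLab"):
    """Command for HNN/train_HNN.py or HNN/compute_bias.py."""
    cmd = [sys.executable, script]
    for feature in FEATURES:
        cmd += [f"--root_{feature}", f"{root}/feature_images/{feature}"]
    for feature in FEATURES:
        cmd += [f"--{feature}_model", f"{root}/model/{feature}_resnet18/1.pth"]
    return cmd + [last_flag, last_value]


def all_commands(root="data/iLab"):
    """All five commands, in the order given in this README."""
    cmds = [feature_train_cmd(f, root) for f in FEATURES]
    cmds.append(hnn_cmd("HNN/train_HNN.py", "--save_model_dir",
                        f"{root}/model/Attention", root))
    cmds.append(hnn_cmd("HNN/compute_bias.py", "--attention_model_dir",
                        f"{root}/model/Attention/model_ck0.pth", root))
    return cmds


if __name__ == "__main__":
    # Dry run by default: only print. Pass --run (hypothetical switch) to execute.
    execute = "--run" in sys.argv
    for cmd in all_commands():
        print(" ".join(cmd))
        if execute:
            subprocess.run(cmd, check=True)
```

By default the script only prints the commands, which makes it easy to check the paths before starting a long training run.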

Analyze your customized dataset

  1. The foreground is filtered with Grad-CAM, so a model must be trained first.
python preprocess_dataset/1_train_resnet/main.py --data YOUR_DATA_ROOT --arch PATH_TO_SAVE_MODEL
  2. Entity segmentation. Please follow this repo: https://github.com/dvlab-research/Entity/tree/main/Entity
  3. Identify the foreground. Please look through the code and change the arguments to match your customized data.
python preprocess_dataset/2_find_foreground_with_gradcam/select_mask.py
  4. Compute texture feature images.
python preprocess_dataset/3_compute_feature_images/generate_texture_feature.py
  5. Run DPT to get the monocular depth estimation.
python preprocess_dataset/3_compute_feature_images/preprocessed_shape/DPT/run_monodepth.py
  6. Compute shape feature images.
python preprocess_dataset/3_compute_feature_images/preprocessed_shape/generate_shape_feature.py
  7. Compute the images that are used to generate color features.
python preprocess_dataset/3_compute_feature_images/preprocess_color/FourierScrambled/generate_input.py
  8. Use MATLAB to run FourierScrambled. The MATLAB code is at preprocess_dataset/3_compute_feature_images/preprocess_color/FourierScrambled/main.m.
  9. After generating all the feature images, go to the dataset-bias analysis section above and analyze the dataset bias with the humanoid neural network.
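
The Python scripts listed above, from foreground selection through color-input generation, need to run in order. The sketch below records that order and checks that each script is present in a checkout. It is an illustration under stated assumptions: PYTHON_STEPS and check_steps are hypothetical names, the paths are copied from the commands above, and the entity-segmentation and MATLAB steps are left out because they run outside Python.

```python
from pathlib import Path

BASE = "preprocess_dataset"

# (description, script path), in the order listed in this README.
PYTHON_STEPS = [
    ("identify foreground",
     f"{BASE}/2_find_foreground_with_gradcam/select_mask.py"),
    ("texture feature images",
     f"{BASE}/3_compute_feature_images/generate_texture_feature.py"),
    ("DPT monodepth",
     f"{BASE}/3_compute_feature_images/preprocessed_shape/DPT/run_monodepth.py"),
    ("shape feature images",
     f"{BASE}/3_compute_feature_images/preprocessed_shape/generate_shape_feature.py"),
    ("color inputs",
     f"{BASE}/3_compute_feature_images/preprocess_color/FourierScrambled/generate_input.py"),
]


def check_steps(repo_root="."):
    """Return (description, path, exists) for each Python preprocessing step."""
    root = Path(repo_root)
    return [(desc, path, (root / path).is_file()) for desc, path in PYTHON_STEPS]


if __name__ == "__main__":
    for i, (desc, path, ok) in enumerate(check_steps(), start=1):
        status = "ok" if ok else "MISSING"
        print(f"{i}. {desc}: python {path} [{status}]")
```

The script does not run anything, because several of these scripts expect their arguments to be edited in the code first.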


Imagination of HVE

Experiment of Section 5.2 "Cross Feature Imagination with HVE"

1. Train the Model

  • In a terminal, run cd Imagine

  • Edit script.sh: set the dataset path and the output path.

#!/usr/bin/env bash

FEATURE=texture # choose from texture, color, shape

python main.py --cuda 0, 1 \
               --mode train \
               --batch_size 16 \
               --dataset_path /lab/tmpig8d/u/yao_data/human_simulation_engine/V3_${FEATURE}_dataset \
               --output_path out/deeper/${FEATURE}
  • Run sh script.sh

2. Run Generation

  • Edit test.sh.
    • Set the dataset path and the output path.
    • Set the test checkpoint file
    • To use mismatched shape, texture, and color as input, set --mismatch
    • Example
FEATURE=texture # choose from texture, color, shape

python main.py --cuda 0 \
               --mode predict \
               --batch_size 16 \
               --dataset_path /lab/tmpig8d/u/yao_data/human_simulation_engine/V3_${FEATURE}_dataset \
               --output_path out/deeper/${FEATURE} \
               --test_epoch 269
  • Run sh test.sh
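
Since script.sh and test.sh differ only in a few arguments, the two invocations can be generated from one place when switching between features. The sketch below is a hedged illustration, not the repo's tooling. The helper imagine_args is hypothetical, the V3_<feature>_dataset folder naming is copied from the example paths above and reflects the authors' local layout, and --mismatch is assumed to be a flag that takes no value.

```python
import shlex

FEATURES = ("texture", "color", "shape")


def imagine_args(feature, mode, dataset_root, cuda="0", batch_size=16,
                 test_epoch=None, mismatch=False):
    """Build a main.py command line like those in script.sh / test.sh."""
    if feature not in FEATURES:
        raise ValueError(f"feature must be one of {FEATURES}")
    if mode not in ("train", "predict"):
        raise ValueError("mode must be 'train' or 'predict'")
    args = [
        "python", "main.py",
        "--cuda", cuda,
        "--mode", mode,
        "--batch_size", str(batch_size),
        # Folder naming follows the example paths in this README.
        "--dataset_path", f"{dataset_root}/V3_{feature}_dataset",
        "--output_path", f"out/deeper/{feature}",
    ]
    if mode == "predict" and test_epoch is not None:
        args += ["--test_epoch", str(test_epoch)]
    if mismatch:
        # Assumption: --mismatch is a bare switch with no value.
        args.append("--mismatch")
    return args


if __name__ == "__main__":
    root = "/path/to/datasets"  # placeholder, change to your own location
    for feature in FEATURES:
        print(shlex.join(imagine_args(feature, "train", root)))
        print(shlex.join(imagine_args(feature, "predict", root,
                                      test_epoch=269, mismatch=True)))
```

The printed lines can be pasted into a shell from inside the Imagine directory. Check the flags against main.py before relying on them.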

3. Calculate the FID

pip install pytorch-fid
  • Resize the ground-truth images

For example (you need to change the path):

#!/usr/bin/env bash
FEATURE=texture # choose from texture, color, shape

# Ground truth images dir
dataset_path=/lab/tmpig8d/u/yao_data/human_simulation_engine/V3_${FEATURE}_dataset/ori/valid/

# Processed gt images dir
process_path=out/deeper_deeper_res_new_texture/${FEATURE}/gt

python create_dataset.py --ori_path ${dataset_path} --path ${process_path}
  • Run FID code:

For example (you need to change the path):

#!/usr/bin/env bash
FEATURE=texture # choose from texture, color, shape

# Processed gt images dir
process_path=out/deeper_deeper_res_new_texture/${FEATURE}/gt

# Generation result dir
output_path=out/deeper_deeper_res_new_texture/${FEATURE}/result_mismatch

python -m pytorch_fid ${process_path} ${output_path} --device cuda:1 --batch-size 128
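
For reference, the number printed by pytorch-fid is the Fréchet distance between two Gaussians fitted to Inception features of the two image folders. The standard definition is given below. The symbols are notation chosen here, not taken from this repo: μ_r and Σ_r are the feature mean and covariance of the ground-truth images, and μ_g and Σ_g are those of the generated images. Lower values mean the two feature distributions are closer.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```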

Zero shot Segmentation

1. Generate distance file

Please generate the distance file with

python Zero_shot/cross_modality_two_latents.py

2. Download vector

Please download the vectors from ConceptNet.

3. Zero shot Learning

Please run zero-shot learning with

python Zero_shot/zero_shot.py
