[WACV2023] Dense Prediction with Attentive Feature Aggregation

This is the official implementation of our paper "Dense Prediction with Attentive Feature Aggregation".

Yung-Hsu Yang, Thomas E. Huang, Min Sun, Samuel Rota Bulò, Peter Kontschieder, Fisher Yu

Abstract

Aggregating information from features across different layers is essential for dense prediction models. Despite its limited expressiveness, vanilla feature concatenation dominates the choice of aggregation operations. In this paper, we introduce Attentive Feature Aggregation (AFA) to fuse different network layers with more expressive non-linear operations. AFA exploits both spatial and channel attention to compute weighted averages of the layer activations. Inspired by neural volume rendering, we further extend AFA with Scale-Space Rendering (SSR) to perform a late fusion of multi-scale predictions. AFA is applicable to a wide range of existing network designs. Our experiments show consistent and significant improvements on challenging semantic segmentation benchmarks, including Cityscapes and BDD100K, at negligible computational and parameter overhead. In particular, AFA improves the performance of the Deep Layer Aggregation (DLA) model by nearly 6% mIoU on Cityscapes. Our experimental analyses show that AFA learns to progressively refine segmentation maps and improve boundary details, leading to new state-of-the-art results on the NYUDv2 and BSDS500 boundary detection benchmarks.
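To make the idea of an attention-weighted average concrete, here is a minimal, dependency-free sketch of fusing two feature maps of the same shape. It is an illustration only, not the code in this repository: the function names are hypothetical, the attention weights are derived from simple pooling plus a sigmoid rather than from learned layers, and the way the spatial and channel weights are combined is an assumption chosen so that the result is a convex combination of the two inputs.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def spatial_attention(feat):
    """One weight per pixel: sigmoid of the mean over channels.

    Assumption: stands in for a learned convolutional attention block.
    `feat` is a nested list indexed as feat[channel][row][col].
    """
    channels = len(feat)
    height, width = len(feat[0]), len(feat[0][0])
    return [
        [sigmoid(sum(feat[c][y][x] for c in range(channels)) / channels)
         for x in range(width)]
        for y in range(height)
    ]


def channel_attention(feat):
    """One weight per channel: sigmoid of the global average pool.

    Assumption: stands in for a learned pooling + MLP attention block.
    """
    weights = []
    for plane in feat:
        total = sum(sum(row) for row in plane)
        count = len(plane) * len(plane[0])
        weights.append(sigmoid(total / count))
    return weights


def attentive_fuse(shallow, deep):
    """Per-element weighted average a * shallow + (1 - a) * deep.

    The weight `a` mixes a spatial weight (from the shallow map) with a
    channel weight (from the deep map); both lie in (0, 1), so `a` does too.
    """
    a_sp = spatial_attention(shallow)
    a_ch = channel_attention(deep)
    fused = []
    for c, (s_plane, d_plane) in enumerate(zip(shallow, deep)):
        plane = []
        for y, (s_row, d_row) in enumerate(zip(s_plane, d_plane)):
            row = []
            for x, (s, d) in enumerate(zip(s_row, d_row)):
                a = a_sp[y][x] * (1.0 - a_ch[c])  # weight on the shallow feature
                row.append(a * s + (1.0 - a) * d)
            plane.append(row)
        fused.append(plane)
    return fused


# Toy example: 2 channels, 2x2 spatial resolution.
shallow = [[[1.0, 1.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]]
deep = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
fused = attentive_fuse(shallow, deep)
```

Because every weight lies strictly between 0 and 1, each fused value falls between the corresponding shallow and deep activations, which is what distinguishes this kind of aggregation from plain concatenation or summation.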

Installation

Please refer to INSTALL.md for installation and to PREPARE_DATASETS.md for dataset preparation.

Get Started

Please see GETTING_STARTED.md for basic usage.

Model Zoo

Cityscapes

Model	Crop Size	Batch Size	Training Epochs	mIoU (val)	mIoU (test)	config	weights	Preds	Visuals
AFA-DLA (Train)	1024x2048	8	375	85.14	-	config	model	val	val
AFA-DLA (Train + Val)	1024x1024	16	275	-	83.58	config	model	test	test

BDD100K

Model	Crop Size	Batch Size	Training Epochs	mIoU (val)	mIoU (test)	config	weights	Preds	Visuals
AFA-DLA	720x1280	16	200	67.46	58.70	config	model	val | test	val | test

NYUDv2

Model	Crop Size	Batch Size	Training Epochs	ODS	OIS	config	weights	Preds	Visuals
AFA-DLA (RGB)	480x480	16	54	0.762	0.775	config	model	test	test
AFA-DLA (HHA)	480x480	16	54	0.718	0.730	config	model	test	test

BSDS500

Model	Crop Size	Batch Size	Training Epochs	ODS	OIS	config	weights	Preds	Visuals
AFA-DLA	416x416	16	14	0.812	0.826	config	model	test	test
AFA-DLA (PASCAL)	416x416	16	20	0.810	0.826	config	model	test	test

Qualitative Results

Cityscapes Test Set

BDD100K Test Set

You can find more visualizations on our project page.

Citation

@inproceedings{yang2023dense,
    title={Dense prediction with attentive feature aggregation},
    author={Yang, Yung-Hsu and Huang, Thomas E and Sun, Min and Bul{\`o}, Samuel Rota and Kontschieder, Peter and Yu, Fisher},
    booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
    pages={97--106},
    year={2023}
}

Acknowledgement

The codebase is developed from NVIDIA segmentation. We are deeply grateful for their open-source code.
