Commit

Add Explanation Alignment paper (#62)
* add eXCV to venues.yml

* Add explanation alignment publications page

* Add explanation alignment teaser image

* Add explanation-alignment thumbnail image

* Update explanation-alignment.md
hhybang authored and arvind committed Sep 25, 2024
1 parent 031a7eb commit dbfd4d1
Showing 4 changed files with 29 additions and 0 deletions.
6 changes: 6 additions & 0 deletions _data/venues.yml
@@ -141,3 +141,9 @@ mit-genai:
  bibtex:
    type: article
    venue: publisher
eXCV:
  short: eXCV Workshop at ECCV
  full: 'ECCV Workshop on Explainable Computer Vision: Where are We and Where are We Going?'
  bibtex:
    type: inproceedings
    venue: booktitle
23 changes: 23 additions & 0 deletions _pubs/explanation-alignment.md
@@ -0,0 +1,23 @@
---
title: 'Explanation Alignment: Quantifying the Correctness of Model Reasoning At Scale'
authors:
- key: hbang
- key: aboggust
- key: arvindsatya
venue: eXCV
type: workshop
year: 2024
date: 2024-09-29
tags:
- empirical study
- quantitative methods
- machine learning interpretability
- human-ai interaction
teaser: Explanation alignment measures the agreement between model-generated explanations and human annotations to detect spurious correlations and model biases by aggregating saliency-based metrics, such as Shared Interest and The Pointing Game, across datasets. In pretrained ImageNet models, it reveals that models with similar accuracy can focus on vastly different image regions, highlighting significant variations in explanation alignment despite comparable performance.
materials:
- name: Code
  url: https://github.com/mitvis/explanation_alignment
  type: cube

---
To improve the reliability of machine learning models, researchers have developed metrics to measure the alignment between model saliency and human explanations. Thus far, however, these saliency-based alignment metrics have been used to conduct descriptive analyses and instance-level evaluations of models and saliency methods. To enable evaluative and comparative assessments of model alignment, we extend these metrics to compute explanation alignment—the aggregate agreement between model and human explanations. To compute explanation alignment, we aggregate saliency-based alignment metrics over many model decisions and report the result as a performance metric that quantifies how often model decisions are made for the right reasons. Through experiments on nearly 200 image classification models, multiple saliency methods, and MNIST, CelebA, and ImageNet tasks, we find that explanation alignment automatically identifies spurious correlations, such as model bias, and uncovers behavioral differences between nearly identical models. Further, we characterize the relationship between explanation alignment and model performance, evaluating the factors that impact explanation alignment and how to interpret its results in-practice.
Binary file added imgs/teasers/explanation-alignment.png
Binary file added imgs/thumbs/explanation-alignment.png
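
Note on the abstract added in `_pubs/explanation-alignment.md`: it describes explanation alignment as an aggregate of per-instance saliency metrics (Shared Interest, The Pointing Game) over many model decisions. The following is a minimal sketch of that idea, not the code from the linked repository. The function names, the 0.5 saliency threshold, the mean as the aggregation, and the toy 4x4 arrays are all assumptions made for illustration.

```python
import numpy as np


def pointing_game(saliency, mask):
    """Return 1.0 if the most salient pixel lies inside the human annotation, else 0.0."""
    row, col = np.unravel_index(np.argmax(saliency), saliency.shape)
    return float(mask[row, col])


def iou_score(saliency, mask, threshold=0.5):
    """IoU between thresholded saliency and the annotation (in the spirit of Shared Interest).

    The threshold value is an assumption for this sketch.
    """
    salient = saliency >= threshold
    annotated = mask.astype(bool)
    union = np.logical_or(salient, annotated).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(salient, annotated).sum() / union)


def explanation_alignment(saliencies, masks, metric):
    """Aggregate a per-instance metric over many decisions (here: the mean)."""
    scores = [metric(s, m) for s, m in zip(saliencies, masks)]
    return float(np.mean(scores))


# Toy data (hypothetical): one 2x2 annotated region in a 4x4 image.
mask = np.zeros((4, 4), dtype=int)
mask[1:3, 1:3] = 1

# Saliency that peaks inside the annotated region.
sal_a = np.zeros((4, 4))
sal_a[1, 1] = 1.0
sal_a[1, 2] = 0.6

# Saliency that peaks outside the annotated region.
sal_b = np.zeros((4, 4))
sal_b[0, 0] = 1.0
sal_b[3, 3] = 0.7

pg = explanation_alignment([sal_a, sal_b], [mask, mask], pointing_game)
iou = explanation_alignment([sal_a, sal_b], [mask, mask], iou_score)
print(pg, iou)
```

In this toy case, two models with the same accuracy could still receive different aggregate scores, which is the kind of comparison the abstract describes.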
