SHAP analysis for h2o.deeplearning model #16463

Closed

bappa10085 opened this issue Dec 17, 2024 · 12 comments

@bappa10085 commented Dec 17, 2024

From the documentation of the h2o package I learned that SHAP analysis is not available for h2o.deeplearning models, so I was unable to use shapviz with an h2o.deeplearning model. Here is a minimal reproducible example:

library(shapviz)
library(tidyverse)
library(h2o)
h2o.init()

set.seed(1)
# Get rid of those darn ordinals
ord <- c("clarity", "cut", "color")
diamonds[, ord] <- lapply(diamonds[, ord], factor, ordered = FALSE)

dia_h2o <- as.h2o(diamonds)

### Deep Learning Model
fit <- h2o.deeplearning(x = c("carat", "clarity", "color", "cut"),
                        y = "price", training_frame = dia_h2o, seed = 123456)

fit

# SHAP analysis on about 2000 diamonds
X_small <- diamonds %>%
  filter(carat <= 2.5) %>%
  sample_n(2000) %>%
  as.h2o()

shp <- shapviz(fit, X_pred = X_small)

It returns the following error:

Error in .check_model_suitability_for_calculation_of_contributions(object, :
Calculation of feature contributions without a background frame requires a tree-based model.

According to the developer of the shapviz package, "h2o provides SHAP only for certain tree based models. You could ask h2o for a model-agnostic explainer such as permutation SHAP or Kernel SHAP (or some deep learning-specific variant). If they would implement something like this, it could be easily added to the pure plotting package shapviz.", as discussed here. Are there any plans to add this feature?

@mayer79 commented Dec 20, 2024

A (slightly unrelated) comment from my side: it is fantastic that h2o random forests now provide SHAP values as well.

@tomasfryda (Contributor) commented:

We actually support SHAP for Deep Learning, as well as for GBM, DRF, GLM, Stacked Ensembles, and XGBoost (see https://docs.h2o.ai/h2o/latest-stable/h2o-docs/performance-and-prediction.html#predict-contributions).

The problem in the shapviz package likely stems from a difference in the API: all models that support SHAP in h2o (all models that come out of h2o AutoML) can be run with a background_frame, but only tree-based models can be run without one. AFAIK, the recommendation is to use a background_frame even for the tree-based models, although it is slower.

A background frame is a frame that contains the baselines. The Generalized Deep SHAP paper has some examples of when it might be beneficial to use a particular subset as the background_frame.
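
For illustration, the contributions can also be requested from h2o directly (a minimal sketch, assuming the fit and dia_h2o objects from the example above; taking the first 100 rows as the background frame is purely for brevity, a random sample would be more typical):

# background frame containing the baselines (here simply the first 100 training rows)
bg <- dia_h2o[1:100, ]

# SHAP-style contributions for the deep learning model:
# one column per feature plus a BiasTerm column
contrib <- h2o.predict_contributions(fit, dia_h2o[1:10, ], background_frame = bg)
head(contrib)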

@bappa10085 (Author) commented:

@tomasfryda Can you show the code using the example data I have provided?

@mayer79 commented Jan 7, 2025

@tomasfryda very neat! I was not aware of additional SHAP algos in h2o.

I will study the implementation a bit closer and see if adaptations to {shapviz} are required.

This seems to work:

# SHAP analysis on 200 diamonds, using the first 50 rows as the background frame
X_small <- diamonds %>%
  filter(carat <= 2.5) %>%
  sample_n(200) %>%
  as.h2o()

X_bg <- X_small[1:50, ]

shp <- shapviz(fit, X_pred = X_small, background_frame = X_bg)
sv_importance(shp)                # bar plot of mean absolute SHAP values
sv_importance(shp, kind = "bee")  # beeswarm summary plot
sv_dependence(shp, v = c("carat", "clarity", "color", "cut"))

[Screenshots of the resulting SHAP importance and dependence plots]

@bappa10085 (Author) commented:

@mayer79 What is the optimal size of the background_frame (as a % of the samples)?

@mayer79 commented Jan 8, 2025

My feeling is around 100-500 rows (not a percentage), but I need to study the implementation.

@tomasfryda (Contributor) commented:

@mayer79 If you have any questions regarding the implementation, feel free to ask me (I implemented it in H2O-3).

@bappa10085 The number of samples depends on the use case and the complexity of the task (e.g., how big the dataset needs to be to be representative). It can be memory intensive, since we internally build a matrix with nrows(test_frame) * nrows(background_frame) rows. That is the worst case, for output_per_reference=True, which outputs a contribution for each row of the test_frame compared to each row of the background frame (this is used in Generalized Deep SHAP in Stacked Ensembles). But 100-500 samples seems like a good first try. (You can always add more, or pick some other background frame sample, to find out how sensitive the SHAP values are to the particular background frame.)
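
That sensitivity check could look roughly like this (a sketch, not an official recipe; it simply recomputes the contributions against a second, disjoint background frame, reusing fit, dia_h2o, and the 200-row X_small from above):

# two disjoint background samples of 100 rows each;
# 200 test rows x 100 background rows -> a 20,000-row internal matrix
bg1 <- dia_h2o[1:100, ]
bg2 <- dia_h2o[101:200, ]

shap1 <- h2o.predict_contributions(fit, X_small, background_frame = bg1)
shap2 <- h2o.predict_contributions(fit, X_small, background_frame = bg2)

# if the largest discrepancy is small, the background sample is large enough
max(abs(as.data.frame(shap1) - as.data.frame(shap2)))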

@bappa10085 (Author) commented:

@mayer79 and @tomasfryda Another question: generally we use 70% of the data for model training (train set) and 30% for model validation (test set). Which set should be used as X_pred and which as background_frame?

@tomasfryda (Contributor) commented:

@bappa10085 Short answer: I'd use a subset of the training data as the background_frame and the test set as X_pred.

But it's actually quite complicated to decide, and it depends on what question you are trying to answer.

For example, according to the Consumer Financial Protection Bureau, for credit denials in the US the regulatory commentary suggests to “identify the factors for which the applicant’s score fell furthest below the average score for each of those factors achieved by applicants whose total score was at or slightly above the minimum passing score.” According to Machine Learning for High-Risk Applications (by Patrick Hall, James Curtis, Parul Pandey), this can be done by using the applicants just above the cutoff for the credit product as the background dataset.
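
As a purely hypothetical sketch of that idea (train, model, denied, score, and the cutoff value are illustrative names, not objects from this thread; h2o frames support this kind of logical row filtering):

# applicants scoring at or slightly above the minimum passing score
cutoff <- 620  # hypothetical minimum passing score
near_cutoff <- train[train$score >= cutoff & train$score <= cutoff + 20, ]

# explain the denied applications relative to the "barely approved" baseline group
shap_denials <- h2o.predict_contributions(model, denied, background_frame = near_cutoff)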

@mayer79 commented Jan 8, 2025

IMHO it does not matter which data you sample from, because the response variable is not used. Thus, I'd sample both from the training data.

@tomasfryda (Contributor) commented:

@mayer79 I think it depends on the question you're trying to answer. When you have real-world data, it can have temporal dependencies that influence the model, e.g., crop production being influenced by climate change.

Depending on the choice of the background dataset, you can get either SHAP values that were applicable 50 years ago (e.g., when there were more rainy days, artificial irrigation might appear much less important) or values that are applicable now.

Since we're often interested in the generalization capabilities of the model, we keep the interesting/"future" data in the test set. By that logic, I think it can be beneficial to use the test set as X_pred and a relevant subset of the training data as the background_frame (in my example, something like a subset from the last decade), so we get SHAP values that correspond to our use case of the model.
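
In code, that idea could look like this hypothetical sketch (train, test, model, and the year column are illustrative names, not objects from this thread):

# baselines from the most recent decade of the training data only,
# so the SHAP values reflect current conditions rather than historical ones
bg_recent <- train[train$year >= 2015, ]

shap_now <- h2o.predict_contributions(model, test, background_frame = bg_recent)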

Does that make it clearer, or did I make a logical mistake somewhere?

@mayer79 commented Jan 8, 2025

I think that makes perfect sense. I had a "simple" splitting scheme in mind, where I'd want to reduce unnecessary use of the test data.
