% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/rule_fit_h2o.R
\name{details_rule_fit_h2o}
\alias{details_rule_fit_h2o}
\title{RuleFit models via h2o}
\description{
\code{\link[h2o:h2o.rulefit]{h2o::h2o.rulefit()}} fits a model that derives simple feature rules from a tree
ensemble and uses them as features in a regularized (LASSO) model. \code{\link[agua:h2o_train]{agua::h2o_train_rule()}}
is a wrapper around this function.
}
\details{
For this engine, there are multiple modes: classification and regression.
\subsection{Tuning Parameters}{
This model has 3 tuning parameters:
\itemize{
\item \code{trees}: # Trees (type: integer, default: 50L)
\item \code{tree_depth}: Tree Depth (type: integer, default: 3L)
\item \code{penalty}: Amount of Regularization (type: double, default: 0). Note
that \code{penalty} for the h2o engine in \code{rule_fit()} corresponds to
the L1 penalty (LASSO).
}
Other engine arguments of interest (a sketch of passing these via
\code{set_engine()} follows this list):
\itemize{
\item \code{algorithm}: The algorithm used to generate rules; one of
“AUTO”, “DRF”, or “GBM”. Defaults to “AUTO”.
\item \code{min_rule_length}: Minimum length of tree depth (the counterpart of
\code{tree_depth}); defaults to 3.
\item \code{max_num_rules}: The maximum number of rules to return. The default
value of -1 means the number of rules is selected by diminishing
returns in model deviance.
\item \code{model_type}: The type of base learners in the ensemble; one of
“rules_and_linear”, “rules”, or “linear”. Defaults to
“rules_and_linear”.
}
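As a sketch (not from the package documentation itself), these engine
arguments can be supplied through \code{set_engine()}; the values shown below
are illustrative only:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{library(rules)

# Illustrative values only; `max_num_rules` and `model_type` are passed
# along to h2o::h2o.rulefit() by the engine fit function.
rule_fit(trees = 100, tree_depth = 5, penalty = 0.01) \%>\%
  set_engine("h2o", max_num_rules = 50, model_type = "rules_and_linear") \%>\%
  set_mode("regression")
}\if{html}{\out{</div>}}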
}
\subsection{Translation from parsnip to the underlying model call (regression)}{
\code{\link[agua:h2o_train]{agua::h2o_train_rule()}} is a wrapper around
\code{\link[h2o:h2o.rulefit]{h2o::h2o.rulefit()}}.
The \strong{agua} extension package is required to fit this model.
\if{html}{\out{<div class="sourceCode r">}}\preformatted{library(rules)
rule_fit(
trees = integer(1),
tree_depth = integer(1),
penalty = numeric(1)
) \%>\%
set_engine("h2o") \%>\%
set_mode("regression") \%>\%
translate()
}\if{html}{\out{</div>}}
\if{html}{\out{<div class="sourceCode">}}\preformatted{## RuleFit Model Specification (regression)
##
## Main Arguments:
## trees = integer(1)
## tree_depth = integer(1)
## penalty = numeric(1)
##
## Computational engine: h2o
##
## Model fit template:
## agua::h2o_train_rule(x = missing_arg(), y = missing_arg(), weights = missing_arg(),
## validation_frame = missing_arg(), rule_generation_ntrees = integer(1),
## max_rule_length = integer(1), lambda = numeric(1))
}\if{html}{\out{</div>}}
}
\subsection{Translation from parsnip to the underlying model call (classification)}{
\code{\link[agua:h2o_train]{agua::h2o_train_rule()}} for \code{rule_fit()} is a
wrapper around \code{\link[h2o:h2o.rulefit]{h2o::h2o.rulefit()}}.
The \strong{agua} extension package is required to fit this model.
\if{html}{\out{<div class="sourceCode r">}}\preformatted{rule_fit(
trees = integer(1),
tree_depth = integer(1),
penalty = numeric(1)
) \%>\%
set_engine("h2o") \%>\%
set_mode("classification") \%>\%
translate()
}\if{html}{\out{</div>}}
\if{html}{\out{<div class="sourceCode">}}\preformatted{## RuleFit Model Specification (classification)
##
## Main Arguments:
## trees = integer(1)
## tree_depth = integer(1)
## penalty = numeric(1)
##
## Computational engine: h2o
##
## Model fit template:
## agua::h2o_train_rule(x = missing_arg(), y = missing_arg(), weights = missing_arg(),
## validation_frame = missing_arg(), rule_generation_ntrees = integer(1),
## max_rule_length = integer(1), lambda = numeric(1))
}\if{html}{\out{</div>}}
}
\subsection{Preprocessing requirements}{
Factor/categorical predictors need to be converted to numeric values
(e.g., dummy or indicator variables) for this engine. When using the
formula method via \code{\link[=fit.model_spec]{fit()}}, parsnip will
convert factor columns to indicators.
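As an additional sketch (not part of the original documentation), factor
handling can also be made explicit with a recipe; \code{example_data} and the
\code{outcome} column are hypothetical:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{library(recipes)

# Hypothetical data and column names; step_dummy() turns factor predictors
# into indicator columns before the model sees them.
rec <- recipe(outcome ~ ., data = example_data) \%>\%
  step_dummy(all_nominal_predictors())
}\if{html}{\out{</div>}}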
}
\subsection{Other details}{
To use the h2o engine with tidymodels, please run \code{h2o::h2o.init()}
first. By default, this connects R to the local h2o server; this needs
to be done in every new R session. You can also connect to a remote h2o
server with an IP address; for more details, see
\code{\link[h2o:h2o.init]{h2o::h2o.init()}}.
You can control the number of threads in the thread pool used by h2o
with the \code{nthreads} argument. By default, it uses all CPUs on the host.
This is different from the usual parallel processing mechanism in
tidymodels for tuning: tidymodels parallelizes over resamples, while h2o
parallelizes over hyperparameter combinations for a given resample.
h2o will automatically shut down the local h2o instance started by R
when R is terminated. To manually stop the h2o server, run
\code{h2o::h2o.shutdown()}.
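A minimal sketch of this session lifecycle (the \code{nthreads} value is
illustrative only):
\if{html}{\out{<div class="sourceCode r">}}\preformatted{library(h2o)

# Start (or connect to) a local h2o server; `nthreads = 2` is illustrative,
# the default uses all available CPUs.
h2o::h2o.init(nthreads = 2)

# ... fit or tune tidymodels/agua models here ...

# Manually stop the local server (otherwise it stops when R exits).
h2o::h2o.shutdown(prompt = FALSE)
}\if{html}{\out{</div>}}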
}
\subsection{Saving fitted model objects}{
Models fitted with this engine may require native serialization methods
to be properly saved and/or passed between R sessions. To learn more
about preparing fitted models for serialization, see the bundle package.
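A minimal sketch with the bundle package, assuming \code{fitted_wflow} is a
hypothetical workflow or model fit produced with this engine:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{library(bundle)

# `fitted_wflow` is a hypothetical fitted object from this engine.
bundled <- bundle(fitted_wflow)
saveRDS(bundled, "rule_fit_h2o.rds")

# In a fresh R session (after h2o::h2o.init()), restore the fit:
restored <- unbundle(readRDS("rule_fit_h2o.rds"))
}\if{html}{\out{</div>}}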
}
}
\keyword{internal}