% man/cubist_rules.Rd

% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/cubist_rules.R
\name{cubist_rules}
\alias{cubist_rules}
\title{Cubist rule-based regression models}
\usage{
cubist_rules(
  mode = "regression",
  committees = NULL,
  neighbors = NULL,
  max_rules = NULL,
  engine = "Cubist"
)
}
\arguments{
\item{mode}{A single character string for the type of model.
The only possible value for this model is "regression".}

\item{committees}{A non-negative integer (no greater than 100) for the number
of members of the ensemble.}

\item{neighbors}{An integer between zero and nine for the number of training
set instances that are used to adjust the model-based prediction.}

\item{max_rules}{The largest number of rules.}

\item{engine}{A single character string specifying what computational engine
to use for fitting.}
}
\description{
\code{cubist_rules()} defines a model that derives simple feature rules from a tree
ensemble and creates regression models within each rule. This function can fit
regression models.

\Sexpr[stage=render,results=rd]{parsnip:::make_engine_list("cubist_rules")}

More information on how \pkg{parsnip} is used for modeling is at
\url{https://www.tidymodels.org/}.
}
\details{
Cubist is a rule-based ensemble regression model. A basic model tree
(Quinlan, 1992) is created that has a separate linear regression model
for each terminal node. The paths along the model tree are
flattened into rules, and these rules are simplified and pruned. The parameter
\code{min_n} is the primary method for controlling the size of each tree, while
\code{max_rules} controls the number of rules.

Cubist ensembles are created using \emph{committees}, which are similar to
boosting. After the first model in the committee is created, the second
model uses a modified version of the outcome data based on whether the
previous model under- or over-predicted the outcome. For iteration \emph{m}, the
new outcome \verb{y*} is computed using

\figure{comittees.png}

If a sample is under-predicted in the previous iteration, its outcome is
adjusted so that the next model is more likely to over-predict it and
compensate. This adjustment continues for each ensemble iteration. See
Kuhn and Johnson (2013) for details.
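
As a minimal sketch of this adjustment, assume the form given in Kuhn and
Johnson (2013), where the new outcome is the original outcome minus the error
of the previous model. The vectors \code{y} and \code{pred_prev} hold made-up
illustrative values, not output from Cubist:

\if{html}{\out{<div class="sourceCode r">}}\preformatted{# Hypothetical observed outcomes and predictions from committee m - 1
y <- c(10, 20, 30)
pred_prev <- c(12, 17, 30)

# Assumed adjustment: subtract the previous error from the outcome
y_star <- y - (pred_prev - y)
y_star
#> [1]  8 23 30
}\if{html}{\out{</div>}}

Here the first sample was over-predicted, so its adjusted outcome is lowered.
The second was under-predicted, so its adjusted outcome is raised. The third
was predicted exactly and is unchanged.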

After the model is created, there is also an option for a post-hoc
adjustment that uses the training set (Quinlan, 1993). When a new sample is
predicted by the model, it can be modified by its nearest neighbors in the
original training set. For \emph{K} neighbors, the model-based predicted value is
adjusted by the neighbors using:

\figure{adjust.png}

where \code{t} is the training set prediction and \code{w} is a weight that is
inversely related to the distance to the neighbor.
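
A rough sketch of the idea behind this adjustment follows. It assumes a
weighted average of neighbor-corrected predictions, with weights normalized to
sum to one. The exact formula is the one shown in the figure above and in
Quinlan (1993), and may differ from this sketch. All object names and values
are hypothetical:

\if{html}{\out{<div class="sourceCode r">}}\preformatted{# Model-based prediction for the new sample (hypothetical value)
pred_new <- 25

# K = 3 nearest training set neighbors: observed outcomes, model
# predictions for those neighbors, and distances to the new sample
t_obs  <- c(24, 27, 23)
t_pred <- c(25, 26, 24)
dist   <- c(0.5, 1.0, 2.0)

# Assumed weights: larger for closer neighbors, normalized to sum to one
w <- 1 / (dist + 0.5)
w <- w / sum(w)

# Shift each neighbor's outcome by the difference between the new
# prediction and that neighbor's own prediction, then take the weighted sum
pred_adj <- sum(w * (t_obs + (pred_new - t_pred)))
pred_adj
}\if{html}{\out{</div>}}

Setting \code{neighbors = 0} corresponds to skipping this step and using the
model-based prediction as is.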

This function only defines what \emph{type} of model is being fit. Once an engine
is specified, the \emph{method} to fit the model is also defined. See
\code{\link[=set_engine]{set_engine()}} for more on setting the engine, including how to set engine
arguments.

The model is not trained or fit until the \code{\link[=fit.model_spec]{fit()}} function is used
with the data.
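
For example, a minimal sketch of defining and then fitting a specification. It
assumes that the \pkg{rules} and \pkg{Cubist} packages are installed, on the
understanding that \pkg{rules} registers the engine implementation. The
\code{mtcars} data and the argument values are arbitrary choices for
illustration:

\if{html}{\out{<div class="sourceCode r">}}\preformatted{library(parsnip)
library(rules)  # assumed to provide the Cubist engine for cubist_rules()

# Define the model type and engine; nothing is estimated yet
spec <- cubist_rules(committees = 5, neighbors = 3, max_rules = 20)
spec <- set_engine(spec, "Cubist")

# The model is only trained when fit() is called with data
cubist_fit <- fit(spec, mpg ~ ., data = mtcars)
predict(cubist_fit, new_data = head(mtcars))
}\if{html}{\out{</div>}}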

All of the arguments in this function other than \code{mode} and \code{engine} are
captured as \link[rlang:topic-quosure]{quosures}. To pass values
programmatically, use the \link[rlang:injection-operator]{injection operator} like so:

\if{html}{\out{<div class="sourceCode r">}}\preformatted{# Inject the current value of `value` into the captured argument
value <- 1
cubist_rules(committees = !!value)
}\if{html}{\out{</div>}}
}
\references{
\url{https://www.tidymodels.org}, \href{https://www.tmwr.org/}{\emph{Tidy Modeling with R}}, \href{https://www.tidymodels.org/find/parsnip/}{searchable table of parsnip models}

Quinlan R (1992). "Learning with Continuous Classes." Proceedings
of the 5th Australian Joint Conference on Artificial Intelligence, pp.
343-348.

Quinlan R (1993). "Combining Instance-Based and Model-Based Learning."
Proceedings of the Tenth International Conference on Machine Learning, pp.
236-243.

Kuhn M and Johnson K (2013). \emph{Applied Predictive Modeling}. Springer.
}
\seealso{
\code{\link[Cubist:cubist.default]{Cubist::cubist()}}, \code{\link[Cubist:cubistControl]{Cubist::cubistControl()}}, \Sexpr[stage=render,results=rd]{parsnip:::make_seealso_list("cubist_rules")}
}