% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/poisson_reg_hurdle.R
\name{details_poisson_reg_hurdle}
\alias{details_poisson_reg_hurdle}
\title{Poisson regression via pscl}
\description{
\code{\link[pscl:hurdle]{pscl::hurdle()}} uses maximum likelihood estimation to fit a model for
count data that has separate model terms for predicting the counts and for
predicting the probability of a zero count.
}
\details{
For this engine, there is a single mode: regression.
\subsection{Tuning Parameters}{
This engine has no tuning parameters.
}
\subsection{Translation from parsnip to the underlying model call (regression)}{
The \strong{poissonreg} extension package is required to fit this model.
\if{html}{\out{<div class="sourceCode r">}}\preformatted{library(poissonreg)
poisson_reg() \%>\%
  set_engine("hurdle") \%>\%
  translate()
}\if{html}{\out{</div>}}
\if{html}{\out{<div class="sourceCode">}}\preformatted{## Poisson Regression Model Specification (regression)
##
## Computational engine: hurdle
##
## Model fit template:
## pscl::hurdle(formula = missing_arg(), data = missing_arg(), weights = missing_arg())
}\if{html}{\out{</div>}}
}
\subsection{Preprocessing and special formulas for zero-inflated Poisson models}{
Factor/categorical predictors need to be converted to numeric values
(e.g., dummy or indicator variables) for this engine. When using the
formula method via \code{\link[=fit.model_spec]{fit()}}, parsnip will
convert factor columns to indicators.
}
\subsection{Specifying the statistical model details}{
For this particular model, a special formula is used to specify which
columns affect the counts and which affect the model for the probability
of zero counts. These sets of terms are separated by a bar. For example,
\code{y ~ x | z}. This type of formula is not used by the base R
infrastructure (e.g., \code{model.matrix()}).
When fitting a parsnip model with this engine directly, the formula
method is required and the formula is just passed through. For example:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{library(tidymodels)
tidymodels_prefer()
data("bioChemists", package = "pscl")
poisson_reg() \%>\%
  set_engine("hurdle") \%>\%
  fit(art ~ fem + mar | ment, data = bioChemists)
}\if{html}{\out{</div>}}
\if{html}{\out{<div class="sourceCode">}}\preformatted{## parsnip model object
##
##
## Call:
## pscl::hurdle(formula = art ~ fem + mar | ment, data = data)
##
## Count model coefficients (truncated poisson with log link):
## (Intercept)     femWomen   marMarried
##    0.847598    -0.237351     0.008846
##
## Zero hurdle model coefficients (binomial with logit link):
## (Intercept)         ment
##     0.24871      0.08092
}\if{html}{\out{</div>}}
However, when using a workflow, the best approach is to avoid using
\code{\link[workflows:add_formula]{workflows::add_formula()}} and use
\code{\link[workflows:add_variables]{workflows::add_variables()}} in
conjunction with a model formula:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{data("bioChemists", package = "pscl")
spec <-
  poisson_reg() \%>\%
  set_engine("hurdle")

workflow() \%>\%
  add_variables(outcomes = c(art), predictors = c(fem, mar, ment)) \%>\%
  add_model(spec, formula = art ~ fem + mar | ment) \%>\%
  fit(data = bioChemists) \%>\%
  extract_fit_engine()
}\if{html}{\out{</div>}}
\if{html}{\out{<div class="sourceCode">}}\preformatted{##
## Call:
## pscl::hurdle(formula = art ~ fem + mar | ment, data = data)
##
## Count model coefficients (truncated poisson with log link):
## (Intercept)     femWomen   marMarried
##    0.847598    -0.237351     0.008846
##
## Zero hurdle model coefficients (binomial with logit link):
## (Intercept)         ment
##     0.24871      0.08092
}\if{html}{\out{</div>}}
The reason for this is that
\code{\link[workflows:add_formula]{workflows::add_formula()}} will try to
create the model matrix and either fail or create dummy variables
prematurely.
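A workflow fitted this way can then be used for prediction in the usual
manner. The following is a minimal sketch, assuming the tidymodels,
poissonreg, and pscl packages are installed; the object names
\code{hurdle_spec}, \code{hurdle_wflow_fit}, and \code{hurdle_preds} are
illustrative and do not come from the package itself:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{library(tidymodels)
library(poissonreg)
data("bioChemists", package = "pscl")

# Illustrative object names; assumed for this sketch only
hurdle_spec <-
  poisson_reg() \%>\%
  set_engine("hurdle")

# Raw columns are passed through add_variables(); the bar formula is
# attached to the model so it reaches pscl::hurdle() unchanged
hurdle_wflow_fit <-
  workflow() \%>\%
  add_variables(outcomes = c(art), predictors = c(fem, mar, ment)) \%>\%
  add_model(hurdle_spec, formula = art ~ fem + mar | ment) \%>\%
  fit(data = bioChemists)

# Predictions for the first few rows, returned in a `.pred` column
hurdle_preds <- predict(hurdle_wflow_fit, new_data = head(bioChemists))
hurdle_preds
}\if{html}{\out{</div>}}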
}
\subsection{Case weights}{
This model can utilize case weights during model fitting. To use them,
see the documentation in \link{case_weights} and the examples
on \code{tidymodels.org}.
The \code{fit()} and \code{fit_xy()} functions each have an argument called
\code{case_weights} that expects a vector of case weights.
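As a rough sketch of how case weights might be passed through
\code{fit()} with this engine, the example below attaches made-up
frequency weights to the \code{bioChemists} data. The weights are randomly
generated purely for illustration, \code{wts} and \code{weighted_fit} are
hypothetical names, and the sketch assumes that the hardhat package (a
dependency of parsnip) is installed so that
\code{hardhat::frequency_weights()} is available:
\if{html}{\out{<div class="sourceCode r">}}\preformatted{library(tidymodels)
library(poissonreg)
data("bioChemists", package = "pscl")

# Made-up integer frequency weights, for illustration only
set.seed(1)
wts <- hardhat::frequency_weights(
  sample(1:3, size = nrow(bioChemists), replace = TRUE)
)

# The weights vector is supplied through the case_weights argument
weighted_fit <-
  poisson_reg() \%>\%
  set_engine("hurdle") \%>\%
  fit(art ~ fem + mar | ment, data = bioChemists, case_weights = wts)

weighted_fit
}\if{html}{\out{</div>}}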
}
}
\keyword{internal}