-
Notifications
You must be signed in to change notification settings - Fork 8
/
Copy pathrandom-effects-within-between-effects-model.Rmd
427 lines (315 loc) · 19.5 KB
/
random-effects-within-between-effects-model.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
---
title: "Fixed and Random Effects Models"
author: "Daniel Lüdecke"
date: "22 May 2019"
output:
html_document:
theme: cerulean
toc: yes
bibliography: random-effects-within-between-effects-model.bib
---
```{r setup, include=FALSE,echo=FALSE}
library(knitr)
knitr::opts_chunk$set(
echo = TRUE,
collapse = TRUE,
warning = FALSE,
message = FALSE,
comment = "#>",
dev = "png"
)
```
![cc](cc-attrib-nc.png)
<!---
(http://i.creativecommons.org/l/by-nc-sa/3.0/88x31.png)
--->
This document is licensed under the
[Creative Commons attribution-noncommercial license](http://creativecommons.org/licenses/by-nc-sa/2.5/ca/).
Please share \& remix noncommercially, mentioning its origin. Sourcecode and data are available [here](https://github.com/strengejacke/mixed-models-snippets).
## The violation of model-assumptions in RE-models for panel data
This example shows how to address the issue when group factors (random effects) and (time-constant) predictors correlate for mixed models, especially in panel data. Models, where predictors and group factors correlate, may have compromised estimates of uncertainty as well as possible bias. In particular in econometrics, fixed-effects models are considered the gold-standard to address such issues. However, it often makes no sense to consider group-effects as "fixed" over a long time period. Apart from this, there are more shortcomings of FE-models as well, see [@bell_fixed_2018], [@bell_understanding_2018] and [@bafumi_fitting_2006].
The following equations and discussion on FE vs. RE are based on [@bell_fixed_2018]. Further discussion on FE vs. RE, also at the end of this document, refer to [@gelman_data_2007] and [@bafumi_fitting_2006].
### Adding group meaned predictors to solve this issue
The solution to the critics from "FE-modelers" is simple: If you include a group-mean of your variables in a random effects model (that is, calculating the mean of the predictor at each group-level and including it as a group-level predictor), it will give the same answer as a fixed effects model (see table 3 very below, and [@bell_understanding_2018] as reference). This is why FE-modelers often call this type of RE-model also a "kind of" FE-model, i.e. they define a RE model as a model where predictors are assumed uncorrelated with the residuals. However,
> "Calling it a FE model is not just inaccurate. It also does down its potential. Eg FE models don’t usually include random slopes, and failing to do so can lead to incorrect SEs as well as being a less interesting and realistic model."
> "A random effects model is such because it has random effects (that is, higher-level entities treated as a distribution) in it rather than fixed effects (higher-level entities treated as dummy variables) in it."
source: [Twitter-Discussion 1](https://twitter.com/AndrewJDBell/status/1026764338370105344), [Twitter-Discussion 2](https://twitter.com/AndrewJDBell/status/1026764347480178689)
### Problems of ignoring random slopes in Fixed Effects models
[@heisig_costs_2017] demonstrate how ignoring random slopes, i.e. neglecting "cross-cluster differences in the effects of lower-level controls reduces the precision of estimated context effects, resulting in unnecessarily wide confidence intervals and low statistical power". You may refer to this paper to justify a mixed model with random slopes over "simple" FE-models.
## Examples
The following code snippets show how to translate the Equations from [@bell_fixed_2018] into R-code, using `lmer()` from the **lme4**-package.
```{r message=FALSE}
library(lme4)
library(sjPlot)
library(parameters)
library(lfe)
load("example.RData")
```
_Sourcecode and data are available [here](https://github.com/strengejacke/mixed-models-snippets)._
## Description of the data
* Variables:
* `x_tv` : time-varying variable
* `z1_ti` : first time-invariant variable, co-variate
* `z2_ti` : second time-invariant variable, co-variate
* `QoL` : Response (quality of life of patient)
* `ID` : patient ID
* `time` : time-point of measurement
## "Classical" growth-model for longitudinal data
```{r}
model_re <- lmer(
QoL ~ time + age + x_tv + z1_ti + z2_ti + (1 + time | ID),
data = d
)
```
## Computing the de-meaned and group-meaned variables
Next is a model from Eq. 10, which includes the _"de-meaned"_ time-varying variable as well as the _"group-meaned"_ time-varying variable.
```{r}
# compute mean of "x_tv" for each subject (ID) and
# then "de-mean" x_tv
d <- cbind(
d,
demean(d, select = c("x_tv", "QoL"), group = "ID") # from package "parameters"
)
```
Now we have:
* `x_tv_between` : time-varying variable with the mean of `x_tv` accross all time-points, for each patient (ID).
* `x_tv_within` : the de-meaned time-varying variable `x_tv`
`QoL_between` and `QoL_within` are used to test different FE-models, which are described later. In those models, I also use a "de-meaned" response variable without the group-variable (`ID`) as fixed effect (see Equation 6 in the paper).
## The complex random-effect-within-between model (REWB)
Eq. 10 suggests allowing the "within-effect" (de-meaned) vary across individuals, that's why `x_tv_within` is added as random slope as well.
Here, the estimate of `x_tv_within` indicates the _within-subject_ effect, while the estimate of `x_tv_between` indicates the _between-subject_ effect. This model also allows for heterogenity across level-2 units, that's why `x_tv_within` also appears in the random effects. The estimates of `z1_ti`
and `z2_ti` also indicate a _between-subject_ effect, as this is a level-2 variable, which cannot have a within-subject effect.
### Model from Equation 10
Here is the equation 10 from Bell et al. 2018:
```{r echo=FALSE}
f <- "y<sub>it</sub> = β<sub>0</sub> + β<sub>1W</sub> (x<sub>it</sub> - ͞x<sub>i</sub>) + β<sub>2B</sub> ͞x<sub>i</sub> + β<sub>3</sub> z<sub>i</sub> + υ<sub>i0</sub> + υ<sub>i1</sub> (x<sub>it</sub> - ͞x<sub>i</sub>) + ε<sub>it</sub>"
knitr::asis_output(f)
```
```{r echo=FALSE}
f <- "<ul><li>x<sub>it</sub> - ͞x<sub>i</sub> is the de-meaned predictor, <em>x_tv_within</em></li><li>͞x<sub>i</sub> is the group-meaned predictor, <em>x_tv_between</em></li><li>β<sub>1W</sub> is the coefficient for x_tv_within (within-subject)</li><li>β<sub>2B</sub> is the coefficient for x_tv_between (bewteen-subject)</li><li>β<sub>3</sub> is the coefficient for `z1_ti` or `z2_ti` (bewteen-subject)</li></ul>"
knitr::asis_output(f)
```
```{r}
# This model this leads to an error (number of observations <= number of
# random effects), the check for nobs vs. re is ignored here.
model_rewb <- lmer(
QoL ~ time + age + x_tv_within + x_tv_between + z1_ti + z2_ti + (1 + time + x_tv_within | ID),
data = d,
control = lmerControl(check.nobs.vs.nRE = "ignore")
)
# An alternative could be to model the random effects as not correlated.
# m2b <- lmer(
# QoL ~ time + age + x_tv_within + x_tv_between + z1_ti + z2_ti + (1 + time + x_tv_within || ID),
# data = d
# )
# here we get no error message, model runs fine...
model_complex_rewb <- lmer(
QoL ~ time + age + x_tv_within + x_tv_between + z1_ti + z2_ti +
(1 + time | ID) + (1 + x_tv_within | ID),
data = d
)
# an alternative would be to assume independence between random slopes
# and no covariance...
model_complex_rewb_2 <- lmer(
QoL ~ time + age + x_tv_within + x_tv_between + z1_ti + z2_ti +
(1 + time | ID) + (0 + x_tv_within | ID),
data = d
)
```
We compare all model fits, but we go on with `model_complex_rewb` (the second model in the table below) for now...
#### Table 1: Comparison of complex REWB-Models
```{r message=FALSE}
tab_model(
model_rewb, model_complex_rewb, model_complex_rewb_2,
show.ci = FALSE,
show.se = TRUE,
auto.label = FALSE,
string.se = "SE",
show.icc = FALSE,
dv.labels = c("Complex REWB (1)", "Complex REWB (2)", "Complex REWB (3)")
)
```
## The simple random-effect-within-between model (REWB) and Mundlak model
After email correspondance, the paper's authors suggest that, depending on the research interest and necessary complexity of the model, a "simple" random-slope might be suitable as well. As stated in the paper, this is useful if homogenity across level-2 units is assumed. This model usually yields the same results as a FE-model, however, we additionally have information about the random effects - and the model can incorporate time-invariant covariates.
Again, the estimate of `x_tv_within` indicates the within-subject effect, while the estimate of `x_tv_between` indicates the between-subject effect.
### Model from Equation 2
```{r echo=FALSE}
f <- "y<sub>it</sub> = β<sub>0</sub> + β<sub>1W</sub> (x<sub>it</sub> - ͞x<sub>i</sub>) + β<sub>2B</sub> ͞x<sub>i</sub> + β<sub>3</sub> z<sub>i</sub> + (υ<sub>i</sub> + ε<sub>it</sub>)"
knitr::asis_output(f)
```
```{r}
model_simple_rewb <- lmer(
QoL ~ time + age + x_tv_within + x_tv_between + z1_ti + z2_ti + (1 + time | ID),
data = d
)
```
An alternativ would be the **Mundlak** model. Here, the estimate of `x_tv` indicates the _within-subject_ effect, while the estimate of `x_tv_between` indicates the _contextual_ effect.
### Model from Equation 3
```{r echo=FALSE}
f <- "y<sub>it</sub> = β<sub>0</sub> + β<sub>1W</sub> x<sub>it</sub> + β<sub>2C</sub> ͞x<sub>i</sub> + β<sub>3</sub> z<sub>i</sub> + (υ<sub>i</sub> + ε<sub>it</sub>)"
knitr::asis_output(f)
```
```{r}
model_mundlak <- lmer(
QoL ~ time + age + x_tv + x_tv_between + z1_ti + z2_ti + (1 + time | ID),
data = d
)
```
The contextual effect, i.e. the coefficient for `x_tv_between`, indicates the effect of an individual (at level 1) that moves from one level-2 group into another one. In the above model, or in general: in case of longitudinal data, the contextual effect is meaningless, as level-2 predictors are individuals (or subjects) themselves, and by definition cannot "move" to another individual. Therefor, the REWB-model is more informative.
## Comparison of models
In table 2, we compare the "classical" RE-model, the complex REWB-model, the "simple" REWB-model and the Mundlak-Model.
#### Table 2: Comparison of RE, REWB and Mundlak Models
```{r message=FALSE}
tab_model(
model_re, model_complex_rewb, model_simple_rewb, model_mundlak,
show.ci = FALSE,
show.se = TRUE,
auto.label = FALSE,
string.se = "SE",
show.icc = FALSE,
dv.labels = c("Classical RE", "Complex REWB", "Simple REWB", "Mundlak")
)
```
## Check if a REWB- or simple RE-model suitable
If the estimates of the within- and between-effect (`x_tv_within` and `x_tv_between`) are (almost) identical, or if the contextual effect (`x_tv_between`) in the **Mundlak**-model is zero and doesn't give a significant improvement for the model, you can also use a simple RE-model.
A simple way to check this is a likelihood-ratio test between the simple RE-model and the Mundlak-model:
```{r}
anova(model_re, model_mundlak)
```
Here we see a significant improvement of the Mundlak-model over the simple RE-model, indicating that it makes sense to model within- and between-subjects effects, i.e. to apply a REWB-model.
## Comparison FE- and REWB-Model
The function `felm()` from the package **lfe** was use to compute the fixed effects regression models. Base R's `lm()` gives the same result, however, the output is much longer due to the ID-parameters.
```{r echo=FALSE}
f <- "y<sub>it</sub> = β<sub>1</sub> (x<sub>it</sub> - ͞x<sub>i</sub>) + (υ<sub>i</sub> + ε<sub>it</sub>)"
knitr::asis_output(f)
```
```{r}
# Model from Equation 5
model_fe_ID <- felm(
QoL ~ time + x_tv_within | ID,
data = d
)
# same as this lm-model
# model_fe_ID <- lm(
# QoL ~ 0 + time + x_tv_within + ID,
# data = d
# )
```
Equation 6 describes a fixed effects model with de-meaned dependent variable.
```{r echo=FALSE}
f <- "(y<sub>it</sub> - ͞y<sub>i</sub>) = β<sub>1</sub> (x<sub>it</sub> - ͞x<sub>i</sub>) + ε<sub>it</sub>"
knitr::asis_output(f)
```
```{r}
# Model from Equation 6
model_fe_y_within <- felm(
QoL_within ~ time + x_tv_within,
data = d
)
# or ...
# model_fe_y_within <- lm(
# QoL_within ~ 0 + time + x_tv_within,
# data = d
# )
```
We compare the results from the FE-models with a simple RE-model and the REWB-model.
```{r}
model_re_2 <- lmer(
QoL ~ time + x_tv_within + x_tv_between + (1 | ID),
data = d
)
# Compare with complex REWB-model
model_complex_rewb3 <- lmer(
QoL ~ time + x_tv_within + x_tv_between +
(1 + time | ID) + (1 + x_tv_within | ID),
data = d
)
```
As we can see, the estimates of the FE-models and the RE-model are identical. However, the estimates from the REWB-model differ. This is because the time-varying predictors, the within-subject effect `x_tv_within`, is allowed to vary between subjects as well (i.e. it is modelled as random slope).
#### Table 3: Comparison of FE- and RE-models
```{r message=FALSE}
tab_model(
model_fe_ID, model_fe_y_within, model_re_2, model_complex_rewb3,
show.ci = FALSE,
show.se = TRUE,
auto.label = FALSE,
string.se = "SE",
show.icc = FALSE,
dv.labels = c("FE-model with ID", "FE, de-meaned Y (with Intercept)", "RE", "Complex REWB")
)
```
## Comparison with the panelr-package
The [panelr-package](https://panelr.jacob-long.com/) provides functions to fit models similar to those suggested by Bell et al. 2018, especially the "simple REWB model" (`model_simple_rewb`) and the Mundlak-model (`model_mundlak`). For the complex REWB model (`model_complex_rewb`), I needed some slight modification when using `panelr::wbm()`, so the following model `model_complex_rewb_panelr` that mimics the complex REWB model is similar to the above model `model_rewb`.
Here we compare the results from `panelr::wbm()` with our previous models `model_complex_rewb` (complex REWB), `model_simple_rewb` (simple REWB) and `model_mundlak` (Mundlak).
```{r}
library(panelr)
# prepare the data for processing with "panelr"
pd <- panel_data(d, id = ID, wave = time)
# the complex REWB-model
model_complex_rewb_panelr <- wbm(QoL ~ x_tv | age + z1_ti + z2_ti + time | (time + x_tv | ID),
data = pd,
control = lmerControl(check.nobs.vs.nRE = "ignore"))
# the simple REWB-model
model_rewb_panelr <- wbm(QoL ~ x_tv | age + z1_ti + z2_ti + time | (time | ID), data = pd)
# the Mundlak model
model_mundlak_panelr <- wbm(QoL ~ x_tv | age + z1_ti + z2_ti + time | (time | ID), data = pd, model = "contextual")
tab_model(
model_complex_rewb_panelr, model_rewb_panelr, model_mundlak_panelr,
show.ci = FALSE,
show.se = TRUE,
auto.label = FALSE,
string.se = "SE",
show.icc = FALSE,
dv.labels = c("Complex REWB", "Simple REWB", "Mundlak")
)
```
```{r}
# compare with other models
tab_model(
model_complex_rewb, model_simple_rewb, model_mundlak,
show.ci = FALSE,
show.se = TRUE,
auto.label = FALSE,
string.se = "SE",
show.icc = FALSE,
dv.labels = c("Complex REWB", "Simple REWB", "Mundlak")
)
```
As we can see, coefficients, standard errors and p-values of all relevant parameters are identical for the simple REWB and Mundlak models from both packages (`panelr::wbm()` and `lme4::lmer()`). This confirms the correct "translation" of the formulae from Bell at al. 2018 into `lmer()`-syntax.
The complex REWB models are also (almost) identical, the minor variation after the second fractional part is most likely due to the slightly different random effects specification.
## Conclusion
When group factors (random effects) and (time-constant) predictors correlate, it's recommended to fit a complex random-effect-within-between model (REWB) instead of a "simple" mixed effects model. This requires de- and group-meaning the time-varying predictors. Depending on the data structure, random slope and intercept may correlate or not.
The random effects structure, i.e. how to model random slopes and intercepts and allow correlations among them, depends on the nature of the data. The benefits from using mixed effects models over fixed effects models are more precise estimates (in particular when random slopes are included) and the possibility to include between-subjects effects.
In case of convergence problems or singular fits, note that changing the optimizer might help. In this context, some models ran fine in _lme4_, while other models that had problems being fitted in _lme4_ ran without any problems in [**glmmTMB**](random-effects-within-between-effects-model-glmmtmb.html)
```{r eval=FALSE}
# compute group-mean of "x_tv" for each subject (ID) and
# then "de-mean" x_tv
d <- cbind(
d,
demean(d, select = c("x_tv", "QoL"), group = "ID") # from package "parameters"
)
# fit complex REWB-model
m <- lmer(
QoL ~ time + age + x_tv_within + x_tv_between + z1_ti + z2_ti +
(1 + time | ID) + (1 + x_tv_within | ID),
data = d
)
# an alternative would be to assume independence between random slopes
# and no covariance...
m <- lmer(
QoL ~ time + age + x_tv_within + x_tv_between + z1_ti + z2_ti +
(1 + time | ID) + (0 + x_tv_within | ID),
data = d
)
```
* `x_tv_within` indicates the _within-subject_ effect
* `x_tv_between` indicates the _between-subject_ effect
* `z1_ti` and `z2_ti` also indicate a _between-subject_ effect
## Further critics of the FE-approach
(source: http://andrewgelman.com/2012/04/02/fixed-effects-and-identification/)
> "But the so-called fixed effects model does not in general minimize bias. It only minimizes bias under some particular models. As I wrote above, 'it’s just another model.' Another way to see this, in the time-series cross-sectional case, is to recognize that there’s no reason to think of group-level coefficients as truly 'fixed'. One example I remember was a regression on some political outcomes, with 40 years of data for each of 50 states, where the analysis included 'fixed effects' for states. I’m sorry but it doesn’t really make sense to think of Vermont from 1960 through 2000 as being 'fixed' in any sense."
> "I just don’t see how setting the group-level variance to infinity can be better than estimating it from the data or setting it to a reasonable finite value. That said, the big advantage of multilevel (“random effects”) modeling comes when you are interested in the varying coefficients themselves, or if you’re interested in predictions for new groups, or if you want the treatment effect itself to vary by group. On a slightly different note, I’m unhappy with many of the time-series cross-sectional analyses I’ve seen because I don’t buy the assumption of constancy over time. That is, I don’t really think those effects are “fixed”!"
> "I don’t know that there’s anything much that’s time-invariant in what I study. But, in any case, the so-called fixed-effects analysis is mathematically a special case of multilevel modeling in which the group-level variance is set to infinity. I agree that there’s no need to “believe” that model for the method to work; however, I think it works because of some implicit additivity assumptions. I’d prefer to (a) allow the group-level variance to be finite, and (b) work in the relevant assumptions more directly."
## Further Readings
- Discussion at [Cross-Validted](https://stats.stackexchange.com/q/100227)
# References