
report() Cohen's d for a t-test differs from output from lsr::cohensD and effectsize::cohens_d #183

Closed
leighclark opened this issue Jun 9, 2021 · 7 comments
Labels: question ⁉️ Further information is requested

@leighclark

Question and context

I apologise if this is the wrong place to ask this and/or if I am doing something obviously stupid - I'm quite new to all this. I have two questions about the output of report() for a t-test.

This is the output from report():

> report(t.test(self_efficacy_mean ~ user_sex, paired = FALSE, alternative = "less", data = ratings2))
Effect sizes were labelled following Cohen's (1988) recommendations.

The Welch Two Sample t-test testing the difference of self_efficacy_mean by user_sex (mean in group female = 4.61, mean in group male = 5.14) suggests that the effect is positive, statistically significant, and medium (difference = 0.53, 95% CI [-Inf, -0.41], t(522.17) = -7.14, p < .001; Cohen's d = -0.63, 95% CI [-0.80, -0.45])

However, the d value differs from what I get from lsr::cohensD() and effectsize::cohens_d():

> cohensD(self_efficacy_mean ~ user_sex, method = "unequal", data = ratings2)
[1] 0.4844071

> cohens_d(self_efficacy_mean ~ user_sex, pooled_sd = FALSE, data = ratings2)
Cohen's d |         95% CI
--------------------------
-0.48     | [-0.62, -0.35]

- Estimated using un-pooled SD.

So my first question is, is this a bug or a user error?

My second question refers to the output from report(). Despite the t-test being alternative = "less", report() reports the effect as positive and swaps the user_sex groups around. Is this expected behaviour?

Thanks in advance, and also thanks for the wonderful package - it's been very useful!

@IndrajeetPatil (Member)

To answer the first question, unfortunately, this is unavoidable.

The report() function internally uses effectsize::effectsize(), and not effectsize::cohens_d().

As you can see, report() and effectsize() give matching values:

library(report)
library(effectsize)

mod <- t.test(wt ~ as.factor(am), data = mtcars)

report(mod)
#> Effect sizes were labelled following Cohen's (1988) recommendations.
#> 
#> The Welch Two Sample t-test testing the difference of wt by as.factor(am) (mean in group 0 = 3.77, mean in group 1 = 2.41) suggests that the effect is negative, statistically significant, and large (difference = -1.36, 95% CI [0.85, 1.86], t(29.23) = 5.49, p < .001; Cohen's d = 2.03, 95% CI [1.13, 2.91])

effectsize(mod)
#> Warning in effectsize.htest(mod): Unable to retrieve data from htest object.
#> Using t_to_d() approximation.
#> d    |       95% CI
#> -------------------
#> 2.03 | [1.13, 2.91]

This is because cohens_d() works with raw data, while effectsize() works with model objects and needs to rely on an approximation (via t_to_d()).

cohens_d(wt ~ as.factor(am), data = mtcars)
#> Cohen's d |       95% CI
#> ------------------------
#> 1.89      | [1.03, 2.73]
#> 
#> - Estimated using pooled SD.

lsr::cohensD(wt ~ as.factor(am), data = mtcars)
#> [1] 1.892406

Created on 2021-06-09 by the reprex package (v2.0.0)
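
For reference, the t_to_d() approximation mentioned in the warning above can be checked by hand. A minimal sketch, assuming the standard conversion d = 2t / sqrt(df) for independent-samples tests:

# Hand check of the t_to_d() approximation, assuming d = 2 * t / sqrt(df)
# for independent samples:
mod <- t.test(wt ~ as.factor(am), data = mtcars)
2 * unname(mod$statistic) / sqrt(unname(mod$parameter))  # ~2.03, matching effectsize(mod)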

@mattansb (Member) commented Jun 9, 2021

> My second question refers to the output from report(). Despite the t-test being alternative = "less", report() reports the effect as positive and swaps the user_sex groups around. Is this expected behaviour?

The "order" is consistent with the t.test() function:

effectsize::cohens_d(extra ~ group, paired = TRUE, data = sleep) # negative d value
#> Cohen's d |         95% CI
#> --------------------------
#> -1.28     | [-2.23, -0.44]

t.test(extra ~ group, paired = TRUE, data = sleep) # negative t value
#> 
#>  Paired t-test
#> 
#> data:  extra by group
#> t = -4.0621, df = 9, p-value = 0.002833
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#>  -2.4598858 -0.7001142
#> sample estimates:
#> mean of the differences 
#>                   -1.58
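
As a quick check of which group comes first: the sign of t (and d) follows the factor level order, first level minus second:

levels(sleep$group)
#> [1] "1" "2"

So the estimate is group 1 minus group 2, regardless of the alternative hypothesis.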

> This is because cohens_d() works with raw data, while effectsize() works with model objects and needs to rely on an approximation.

Adding to @IndrajeetPatil: when the data is recoverable, report() will compute Cohen's d exactly, not just estimate it:

t1 <- t.test(extra ~ group, paired = TRUE, data = sleep)

t2 <- t.test(sleep$extra[sleep$group==1], sleep$extra[sleep$group==2], paired = TRUE)

report::report(t1)
#> Effect sizes were labelled following Cohen's (1988) recommendations.
#> 
#> The Paired t-test testing the difference of extra by group (mean of the differences = -1.58) suggests that the effect is negative, statistically significant, and large (difference = -1.58, 95% CI [-2.46, -0.70], t(9) = -4.06, p < .01; Cohen's d = -1.35, 95% CI [-2.23, -0.44])

report::report(t2)
#> Effect sizes were labelled following Cohen's (1988) recommendations.
#> 
#> The Paired t-test testing the difference between sleep$extra[sleep$group == 1] and sleep$extra[sleep$group == 2] (mean of the differences = -1.58) suggests that the effect is negative, statistically significant, and large (difference = -1.58, 95% CI [-2.46, -0.70], t(9) = -4.06, p < .01; Cohen's d = -1.28, 95% CI [-2.23, -0.44])

Created on 2021-06-09 by the reprex package (v2.0.0)
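
The same kind of approximation explains the -1.35 from report(t1). A hand check, assuming the standard paired-sample conversion d = t / sqrt(df):

-4.0621 / sqrt(9)  # t and df from t1 above
#> [1] -1.354033

versus the exact Cohen's d of -1.28 computed from the raw data.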

@IndrajeetPatil added the question ⁉️ Further information is requested label on Jun 9, 2021
@IndrajeetPatil (Member)

Thanks, Mattan. My train commute ended before I could get to the second question 😅

Closing for now. Please feel free to ask any further questions you may have about this.

@leighclark (Author)

Thank you all for your responses, I will take a moment (several) to digest them :)

Much appreciated!

L.

@leighclark (Author)

Once again, thank you @IndrajeetPatil and @mattansb for taking the time to respond. I've reproduced everything you said above.

I am not sure if this is the right place to post since the issue is closed (sorry if it's not), but my last question might be a bit more subjective:

Since there is a difference between the d values, would you consider the output from cohens_d() more "correct" than the default from report()? Would either of you have a preference if you were reporting your own results in a similar case?

Thanks again and all the best from Berlin!

@IndrajeetPatil (Member)

Created a new issue to discuss this: #184

Thanks for your feedback! 😸

Hi from Berlin! 😅

@mattansb (Member)

> Since there is a difference between the d values, would you consider the output from cohens_d() more "correct" than the default from report()? Would either of you have a preference if you were reporting your own results in a similar case?

The output from cohens_d() also depends on the input:

library(effectsize)

t1 <- t.test(extra ~ group, paired = TRUE, data = sleep)

t2 <- t.test(sleep$extra[sleep$group==1], sleep$extra[sleep$group==2], paired = TRUE)

cohens_d(t1)
#> Warning in effectsize.htest(x, type = "d", ci = ci, verbose = verbose): Unable
#> to retrieve data from htest object. Using t_to_d() approximation.
#> d     |         95% CI
#> ----------------------
#> -1.35 | [-2.23, -0.44]

cohens_d(t2)
#> Cohen's d |         95% CI
#> --------------------------
#> -1.28     | [-2.23, -0.44]

cohens_d(extra ~ group, paired = TRUE, data = sleep)
#> Cohen's d |         95% CI
#> --------------------------
#> -1.28     | [-2.23, -0.44]

And as I noted above, report() can return the exact Cohen's d if the right t.test() object is passed to it (t2 in the above example).

So this isn't an issue of effectsize vs. report; it's a question of "can the original data be recovered from the t.test() object?"
When the data can be recovered, the results from report() will not be an approximation.

Generally, we would like the true value, and not just an approximation (:

Conclusion: don't use formula notation in t.test()! It makes the original data unrecoverable!
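
For the original example from the question, that would mean passing the two vectors directly. A sketch (ratings2 and its columns come from the question above and are not shown here):

# Passing vectors instead of a formula keeps the data recoverable,
# so report() can compute the exact Cohen's d rather than the
# t_to_d() approximation:
t.test(ratings2$self_efficacy_mean[ratings2$user_sex == "female"],
       ratings2$self_efficacy_mean[ratings2$user_sex == "male"],
       alternative = "less")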
