Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

outputs of rbmi::lsmeans() don't match those of emmeans::lsmeans() #412

Closed
adcascone opened this issue Mar 11, 2024 · 8 comments · Fixed by #413 or #473
Closed

outputs of rbmi::lsmeans() don't match those of emmeans::lsmeans() #412

adcascone opened this issue Mar 11, 2024 · 8 comments · Fixed by #413 or #473
Assignees
Labels
bug Something isn't working

Comments

@adcascone
Copy link

On pg. 70 of the rbmi reference manual, users are directed to the references of lsmeans() for identical implementations of estimating the least square means from a linear model in SAS and in R via the emmeans package.

However, the outputs of rbmi::lsmeans() don't match those of emmeans::lsmeans()


  iris2 <- iris[ iris$Species %in% c("versicolor", "virginica"), ]
  iris2$Species <- factor(iris2$Species)
  
  emmeans::lsmeans(lm("Sepal.Length ~ Species + Petal.Length*Petal.Width", data=iris2), specs = "Species",data=iris2, weights="proportional")
    #  Species    lsmean     SE df lower.CL upper.CL
    # versicolor   6.49 0.0796 95     6.33     6.65
    # virginica    5.99 0.0768 95     5.84     6.14
  rbmi:::lsmeans(lm("Sepal.Length ~ Species + Petal.Length*Petal.Width", data=iris2), Species = "versicolor",.weights = "proportional")$est[[1]]
    #6.512539
  rbmi:::lsmeans(lm("Sepal.Length ~ Species + Petal.Length*Petal.Width", data=iris2), Species = "virginica",.weights = "proportional")$est[[1]]
    #6.011461

Environment

  • OS: Linux
  • R version 4.3.2
  • rbmi version 1.2.5

Thank you,
Arianna

@adcascone adcascone added the bug Something isn't working label Mar 11, 2024
@gowerc
Copy link
Collaborator

gowerc commented Mar 19, 2024

Hi @adcascone,

Thanks for raising this. Digging into this a bit more I don't think this is a bug but more of a "what are you estimating" issue. For reference the discrepancy comes from how we handle interactions between continuous covariates where we use mean(A*B) whilst emmeans uses mean(A) * mean(B). For additional reference it appears that SAS's default is equivalent to what we've implemented however they do provide options for the user to select between the two (see Setting Covariate Values ).

Talking to @wolbersm, if I have understood him correctly, the difference comes down to whether you are wanting to construct a single hypothetical patient (in which case the emmeans approach makes more sense) or if you are wanting to construct a counterfactual e.g. what would have happened if all patients had taken drug vs all patients taking placebo (which is what our current implementation represents).

If you would definitely prefer to use the hypothetical patient approach we could look to add an option to rbmi to allow users to select between the two.

(@wolbersm - Please correct me if I've got any of the details above wrong)

@wolbersm
Copy link
Collaborator

Hi both

Thanks @gowerc! Just to small additional clarification to what @gowerc wrote:

rbmi is primarily targeted to the analysis of randomized clinical trials.
In this setting it is often useful to report the estimated (counterfactual) population mean outcomes if all subjects had received intervention or control treatment, respectively. This can be done via standardization and, for linear models, equivalently via LSMEANS as per our implementation. See e.g. this slide deck (slide 30f) for further regarding the connection of standadization and LSMEANS: https://baselbiometrics.github.io/home/docs/trainings/20210202/2_Bornkamp.pdf

@wolbersm
Copy link
Collaborator

So, while I think our choice is reasonable, it could be useful to document in the help pages that rbmi::lsmeans() may give different results to emmeans::lsmeans() if interaction terms are included in the model. What do you think @gowerc?

@gowerc
Copy link
Collaborator

gowerc commented Mar 22, 2024

Just to clarify we ended up making the following changes which hopefully address this issue (please re-open if you feel it hasn't been sufficiently addressed)

  • Renamed the weighting "proportional" to "counterfactual" to be better represent what this does
  • Introduced weighting "proportional_em" to mirror the emmeans "proportional" weighting
  • The weighting "proportional" still exists for backwards compatibility and is an alias for "counterfactual" but now gives a message suggesting the user to update to either "proportional_em" or "counterfactual" to be explicit
  • Updated our documentation to hopefully make this clearer.

Hope this helps !

@adcascone
Copy link
Author

Hi @gowerc and @wolbersm! Many thanks for your response and attention to this matter. I don't have permissions to reopen this issue, so I am commenting instead.

I have an additional question related to the "proportional_em" weighting option in rbmi:::lsmeans(), which mirrors the "proportional" weighting in emmeans::lsmeans().

Based on the updates, I would have expected the following rbmi:::lsmeans() calls with .weights="proportional.em" to produce outputs that match those generated with emmeans::lsmeans() with weights="proportional". However, I'm running into some errors, which makes me wonder if I'm misunderstanding the details in the reference manual:

Data used in below examples

##This example code is from p. 15 ref. manual
iris2 <- iris[ iris$Species %in% c("versicolor", "virginica"), ]
iris2$Species <- factor(iris2$Species)

rbmi code

rbmi:::lsmeans(lm("Sepal.Length ~ Species + Petal.Length*Petal.Width", data=iris2), Species = "versicolor",.weights = "proportional_em")

rbmi:::lsmeans(lm("Sepal.Length ~ Species + Petal.Length*Petal.Width", data=iris2), Species = "virginica",.weights = "proportional_em")

Produces the following errors:

Image

emmeans code

emmeans::lsmeans(lm("Sepal.Length ~ Species + Petal.Length*Petal.Width", data=iris2), specs = "Species",data=iris2, weights="proportional")

Image

Can you please help me understand the rbmi:::lsmeans() errors I am running into? Thanks so much!

Arianna C.

Environment

  • OS: Linux
  • R version 4.4.2
  • rbmi version 1.3.0

@gowerc
Copy link
Collaborator

gowerc commented Jan 20, 2025

Will try and take a look sometime this week

@gowerc gowerc reopened this Jan 20, 2025
@gowerc gowerc self-assigned this Jan 20, 2025
@gowerc
Copy link
Collaborator

gowerc commented Jan 21, 2025

Hey @adcascone - Apologies there was a "school boy" bug that I accidentally introduced to the code. Apparently I forgot to stop the data.frame from reducing to a vector if only 1 variable was selected :(

as.list(dat2[, categorical_vars]),

Should have a drop = FALSE, fixing this appears to get the correct result:

Image

This will be fixed in the next release.

@adcascone
Copy link
Author

Awesome, I'm glad it was an easy fix. Thanks for looking into this, I appreciate your time!

Arianna

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
3 participants