For binomial data, the sample size to use in the calculation of BIC should be the total number of trials, not the number of binomial observations: see p. 779 of Kass and Raftery (1995), which I have attached below.
Currently, BICtab (and BIC in base R) use the number of binomial observations, unless, of course, the data are entered in binary format.
To avoid having to put the data in binary format, it would be helpful to be able to change the value of k in BICtab (i.e. the log(n) penalty multiplier, as base R's AIC already allows via its k argument) so that the correct definition of sample size is used.
Is it possible to allow for this?
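In case it helps, here is a minimal sketch (with made-up data) of the workaround I have in mind, using the k argument of base R's AIC to substitute log(total trials) for the default log(number of observations):

```r
## Hypothetical data: 20 binomial observations of 50 trials each,
## so the total number of trials is 1000.
set.seed(1)
ntrials <- rep(50, 20)
x <- runif(20)
y <- rbinom(20, size = ntrials, prob = plogis(-1 + 2 * x))

fit <- glm(cbind(y, ntrials - y) ~ x, family = binomial)

nobs(fit)                       # 20 binomial observations
BIC(fit)                        # penalty is 2 * log(20)
AIC(fit, k = log(sum(ntrials))) # penalty is 2 * log(1000), the Kass and Raftery n
```

(The aggregated and binary formats give log-likelihoods that differ only by an additive constant from the binomial coefficients, which is the same for every model fitted to the same data, so only the log(n) term in BIC is affected by the choice of format.)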
As Kass and Raftery point out, when analysing a contingency table, the sample size used in BIC should be the sum of the frequencies, not the number of cells in the table (so again BICtab and BIC in base R will be wrong if the data are in contingency-table format).
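The same workaround applies there; a sketch with made-up counts, fitting a Poisson log-linear model to a 2 x 2 table:

```r
## Hypothetical 2 x 2 table: 4 cells, but 200 individuals in total.
counts <- c(60, 40, 30, 70)
a <- gl(2, 2)              # row factor
b <- gl(2, 1, length = 4)  # column factor
fit2 <- glm(counts ~ a + b, family = poisson)

BIC(fit2)                       # penalty uses n = 4 (number of cells)
AIC(fit2, k = log(sum(counts))) # penalty uses n = 200 (sum of frequencies)
```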
In these two settings (binomial data and contingency tables) the problem arises because we are analysing a summary of the data, rather than the original data collected at the "sampling unit" level.
It is possible that the definition of sample size in AICc needs to be thought about carefully in these settings as well, but I haven't seen any theoretical work on this. I seem to recall that there has been some discussion in the mark-recapture world about what we mean by "sample size" when calculating AICc (after fitting a Cormack-Jolly-Seber multinomial model, for example).
Kass 1995.pdf