updated errata
avehtari committed Feb 24, 2022
1 parent c9c008f commit 2ee1312
Showing 2 changed files with 9 additions and 3 deletions.
5 changes: 4 additions & 1 deletion errata.Rmd
@@ -17,9 +17,11 @@ If you notice an error that is not mentioned in the errata below, submit an issue
- p. 167, the code line `y_rep[s,1] <- y[1]` should be `y_rep[s,1] <- unemp$y[1]` (thanks to Ravi Shroff)
- p. 181, ex 11.5(b) "Figure 10.2." should be "Figure 11.2." (thanks to Ravi Shroff)
- p. 187, "a procedure that is equivalent to centering and rescaling is to leave" $\rightarrow$ "we can get the same inferences for the coefficients other than the intercept by leaving" (thanks to Ravi Shroff)
- p. 209, "$\frac{p_0}{D-p_0}\frac{\sigma}{\sqrt{n}}$, where $p_0$ is the expected number of relevant predictors" $\rightarrow$ "$\frac{p_0}{p-p_0}\frac{\sigma}{\sqrt{n}}$, where $p$ is the number of predictors, $p_0$ is the expected number of relevant predictors" (thanks to Roberto Viviani)
- p. 231, $\log(-.595)$ $\rightarrow$ $\log(0.595)$ (thanks to Daniel Timar)
- p. 267, `fit_nb[[k]] <- stan_glm(y ~ x, family=neg_binomial_2(link="log"), data=fake, refresh=0)` $\rightarrow$ `fit_nb[[k]] <- stan_glm(y ~ x, family=neg_binomial_2(link="log"), data=fake_nb[[k]], refresh=0)` (thanks to Brian Bucher; a sketch of the corrected loop follows this hunk)
- p. 268, `offset=log(exposure)` $\rightarrow$ `offset=log(exposure2)` (thanks to A. Solomon Kurz)
- p. 269, `y_rep_1 <- posterior_predict(fit_1)` $\rightarrow$ `yrep_1 <- posterior_predict(fit_1)` (thanks to Darci Kovacs)
- p. 285, The numbers for the earnings model on p. 285 should be the same as on p. 284 (thanks to David Galley)
- p. 295, $n=(2.8* 0.49/0.1)^2=196$ $\rightarrow$ $n=(2.8* 0.5/0.1)^2=196$, where $0.5$ is used as a conservative estimate of the standard deviation (thanks to Solomon A. Kurz)
- p. 299, $0.5/1.15 = 0.43$ $\rightarrow$ $0.5/1.25 = 0.4$ (thanks to Solomon A. Kurz)
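
For the p. 267 fix above, a minimal runnable sketch of the corrected loop; the data-generating setup (`K`, `x`, and the contents of `fake_nb`) is hypothetical, and only the `stan_glm` call reflects the erratum:

```r
library(rstanarm)

# Hypothetical setup: K fake datasets drawn from a negative binomial model.
K <- 4
x <- 1:10
fake_nb <- lapply(1:K, function(k)
  data.frame(x = x, y = rnbinom(length(x), mu = exp(1 + 0.2 * x), size = 2)))

fit_nb <- vector("list", K)
for (k in 1:K) {
  # The erratum: fit the k-th simulated dataset, not a single data frame `fake`.
  fit_nb[[k]] <- stan_glm(y ~ x, family = neg_binomial_2(link = "log"),
                          data = fake_nb[[k]], refresh = 0)
}
```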
@@ -39,7 +41,7 @@ the true $\beta$ to be 2.8 standard errors from zero, so that there is
an 80\% probability that $\hat{\beta}$ is at least 2 standard errors
from zero. If $\beta=0.018\%$, then its standard error would have to be
no greater than $0.018/2.8$, so that the survey would need a sample
size of $1629*(2.8*0.015/0.018)^2=900$. (thanks to comment by Solomon A. Kurz)
size of $1629*(2.8*0.015/0.018)^2=9000$. (thanks to comment by Solomon A. Kurz, and typo fix by Patrick Wen; this arithmetic is checked in the sketch after this hunk)
- p. 304, "only 44% of their children were girls" $\rightarrow$ "only 48% of their children were girls" (thanks to comment by Solomon A. Kurz)
- p. 375, three regression equations (unnumbered, 19.2 and 19.3) are missing $\alpha + $ from the beginning of RHS (right after $=$). (thanks to Junhui Yang)
- p. 407, $\sum_{k=1}^K(X_{ik}-X_{jk})^2$ $\rightarrow$ $(\sum_{k=1}^K(X_{ik}-X_{jk})^2)^{1/2}$ (thanks to Stefan Gehrig)
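
The corrected arithmetic in the p. 295 entry and the p. 300 passage above, and the p. 407 distance fix, can be verified directly in base R; the matrix `X` below is a made-up stand-in for the book's predictors:

```r
# p. 295: sample size with the conservative sd estimate 0.5
(2.8 * 0.5 / 0.1)^2                # 196

# p. 300: rescaling the survey of 1629 for 80% power (2.8 standard errors)
1629 * (2.8 * 0.015 / 0.018)^2     # 8869.3, i.e. roughly 9000, not 900

# p. 407: Euclidean distance is the square root of the summed squares
X <- matrix(rnorm(8), nrow = 2)    # two hypothetical points, 4 predictors
sqrt(sum((X[1, ] - X[2, ])^2))
```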
@@ -78,6 +80,7 @@
- p. 208, Figure 12.11: all top-row subplots show the prior distribution for the regularized horseshoe prior. The correct subplots are produced by the R code in the [Student example](https://avehtari.github.io/ROS-Examples/Student/student.html) and shown below (thanks to Zhengchen Cai for reporting the issue)
![](Figure_12_11_correct.png){ width=800 }

- p. 213, Ex 12.11, "$\beta$ is 0.3" $\rightarrow$ "$\beta$ is $-0.3$" (thanks to Justin Gross)
- p. 217, First line in section 13.1 "The logistic function, logit(x)" $\rightarrow$ "The logit function, logit(x)" (thanks to VicentModesto)
- p. 219, "The inverse logistic function is curved" $\rightarrow$ "The inverse logit function is curved" (thanks to VicentModesto; see the one-line illustration after this hunk)
- p. 241-242, in paragraphs starting "The steps go..." and "Figure 14.2...": $(4.0/4.1)x_1$ $\rightarrow$ $(4.4/4.1)x_1$ (thanks to Doug Davidson)
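
As an aside on the pp. 217 and 219 naming fixes above, base R already distinguishes the two functions: `qlogis` is the logit and `plogis` its inverse. A one-line illustration:

```r
qlogis(0.595)           # logit: log(0.595 / 0.405) = 0.385
plogis(qlogis(0.595))   # the inverse logit maps back to 0.595
```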
7 changes: 5 additions & 2 deletions errata.html
@@ -343,7 +343,7 @@

<h1 class="title toc-ignore">Regression and Other Stories - Errata</h1>
<h4 class="author">Andrew Gelman, Jennifer Hill, Aki Vehtari</h4>
<h4 class="date">Page updated: 2022-01-13</h4>
<h4 class="date">Page updated: 2022-02-24</h4>

</div>

@@ -359,16 +359,18 @@ <h2>1st and 2nd printing</h2>
<li>p. 167, the code line <code>y_rep[s,1] &lt;- y[1]</code> should be <code>y_rep[s,1] &lt;- unemp$y[1]</code> (thanks to Ravi Shroff)</li>
<li>p. 181, ex 11.5(b) “Figure 10.2.” should be “Figure 11.2.” (thanks to Ravi Shroff)</li>
<li>p. 187, “a procedure that is equivalent to centering and rescaling is to leave” <span class="math inline">\(\rightarrow\)</span> “we can get the same inferences for the coefficients other than the intercept by leaving” (thanks to Ravi Shroff)</li>
<li>p. 209, “<span class="math inline">\(\frac{p_0}{D-p_0}\frac{\sigma}{\sqrt{n}}\)</span>, where <span class="math inline">\(p_0\)</span> is the expected number of relevant predictors” <span class="math inline">\(\rightarrow\)</span> “<span class="math inline">\(\frac{p_0}{p-p_0}\frac{\sigma}{\sqrt{n}}\)</span>, where <span class="math inline">\(p\)</span> is the number of predictors, <span class="math inline">\(p_0\)</span> is the expected number of relevant predictors” (thanks to Roberto Viviani)</li>
<li>p. 231, <span class="math inline">\(\log(-.595)\)</span> <span class="math inline">\(\rightarrow\)</span> <span class="math inline">\(\log(0.595)\)</span> (thanks to Daniel Timar)</li>
<li>p. 267, <code>fit_nb[[k]] &lt;- stan_glm(y ~ x, family=neg_binomial_2(link="log"), data=fake, refresh=0)</code> <span class="math inline">\(\rightarrow\)</span> <code>fit_nb[[k]] &lt;- stan_glm(y ~ x, family=neg_binomial_2(link="log"), data=fake_nb[[k]], refresh=0)</code> (thanks to Brian Bucher)</li>
<li>p. 268, <code>offset=log(exposure)</code> <span class="math inline">\(\rightarrow\)</span> <code>offset=log(exposure2)</code> (thanks to A. Solomon Kurz)</li>
<li>p. 269, <code>y_rep_1 &lt;- posterior_predict(fit_1)</code> <span class="math inline">\(\rightarrow\)</span> <code>yrep_1 &lt;- posterior_predict(fit_1)</code> (thanks to Darci Kovacs)</li>
<li>p. 285, The numbers for the earnings model on p. 285 should be the same as on p. 284 (thanks to David Galley)</li>
<li>p. 295, <span class="math inline">\(n=(2.8* 0.49/0.1)^2=196\)</span> <span class="math inline">\(\rightarrow\)</span> <span class="math inline">\(n=(2.8* 0.5/0.1)^2=196\)</span>, where <span class="math inline">\(0.5\)</span> is used as a conservative estimate of the standard deviation (thanks to Solomon A. Kurz)</li>
<li>p. 299, <span class="math inline">\(0.5/1.15 = 0.43\)</span> <span class="math inline">\(\rightarrow\)</span> <span class="math inline">\(0.5/1.25 = 0.4\)</span> (thanks to Solomon A. Kurz)</li>
<li>p. 300, The paragraph starting “We illustrate with the example of the survey earnings and height discussed in Chapter 4.” and the next two paragraphs have been edited to:<br />
   We illustrate with the survey of earnings and height discussed in Chapter 12. The coefficient for the sex-earnings interaction in model (12.2) is plausible (a positive interaction, implying that an extra inch of height is worth 2% more for men than for women), with a standard error of 1%.<br />
   Extracting another significant figure from the fitted regression yields an estimated interaction of 0.018 with standard error 0.015. How large a sample size would have been needed for the coefficient on the interaction to be “statistically significant” at the conventional 95% level? A simple calculation uses the fact that standard errors are proportional to <span class="math inline">\(1/\sqrt{n}\)</span>. For a point estimate of 0.018 to be two standard errors from zero, it would need a standard error of 0.009, which would require the sample size to be increased by a factor of <span class="math inline">\((0.015/0.009)^2\)</span>. The original survey had a sample of 1629; this implies a required sample size of <span class="math inline">\(1629*(0.015/0.009)^2=4500\)</span>.<br />
   To perform a power calculation for this hypothetical larger survey of 4500 people, we could suppose that the true <span class="math inline">\(\beta\)</span> for the interaction is equal to 0.018 and that the standard error is as we have just calculated. With a standard error of 0.009 the estimate from the regression would then be conventionally “statistically significant” only if <span class="math inline">\(\hat{\beta}&gt;0.018\)</span> (or, in other direction, if <span class="math inline">\(\hat{\beta}&lt; -0.018\)</span>, but that latter possibility is highly unlikely given our assumptions). If the true coefficient <span class="math inline">\(\beta\)</span> is 0.018, then we would expect <span class="math inline">\(\hat{\beta}\)</span> to exceed 0.018, and thus achieve statistical significance, with a probability of <span class="math inline">\(\frac{1}{2}\)</span>—that is, 50% power. To get 80% power, we need the true <span class="math inline">\(\beta\)</span> to be 2.8 standard errors from zero, so that there is an 80% probability that <span class="math inline">\(\hat{\beta}\)</span> is at least 2 standard errors from zero. If <span class="math inline">\(\beta=0.018\%\)</span>, then its standard error would have to be no greater than <span class="math inline">\(0.018/2.8\)</span>, so that the survey would need a sample size of <span class="math inline">\(1629*(2.8*0.015/0.018)^2=900\)</span>. (thanks to comment by Solomon A. Kurz)</li>
   To perform a power calculation for this hypothetical larger survey of 4500 people, we could suppose that the true <span class="math inline">\(\beta\)</span> for the interaction is equal to 0.018 and that the standard error is as we have just calculated. With a standard error of 0.009 the estimate from the regression would then be conventionally “statistically significant” only if <span class="math inline">\(\hat{\beta}&gt;0.018\)</span> (or, in other direction, if <span class="math inline">\(\hat{\beta}&lt; -0.018\)</span>, but that latter possibility is highly unlikely given our assumptions). If the true coefficient <span class="math inline">\(\beta\)</span> is 0.018, then we would expect <span class="math inline">\(\hat{\beta}\)</span> to exceed 0.018, and thus achieve statistical significance, with a probability of <span class="math inline">\(\frac{1}{2}\)</span>—that is, 50% power. To get 80% power, we need the true <span class="math inline">\(\beta\)</span> to be 2.8 standard errors from zero, so that there is an 80% probability that <span class="math inline">\(\hat{\beta}\)</span> is at least 2 standard errors from zero. If <span class="math inline">\(\beta=0.018\%\)</span>, then its standard error would have to be no greater than <span class="math inline">\(0.018/2.8\)</span>, so that the survey would need a sample size of <span class="math inline">\(1629*(2.8*0.015/0.018)^2=9000\)</span>. (thanks to comment by Solomon A. Kurz, and typo fix by Patrick Wen)</li>
<li>p. 304, “only 44% of their children were girls” <span class="math inline">\(\rightarrow\)</span> “only 48% of their children were girls” (thanks to comment by Solomon A. Kurz)</li>
<li>p. 375, three regression equations (unnumbered, 19.2 and 19.3) are missing <span class="math inline">\(\alpha +\)</span> from the beginning of RHS (right after <span class="math inline">\(=\)</span>). (thanks to Junhui Yang)</li>
<li>p. 407, <span class="math inline">\(\sum_{k=1}^K(X_{ik}-X_{jk})^2\)</span> <span class="math inline">\(\rightarrow\)</span> <span class="math inline">\((\sum_{k=1}^K(X_{ik}-X_{jk})^2)^{1/2}\)</span> (thanks to Stefan Gehrig)</li>
@@ -407,6 +409,7 @@ <h2>1st printing (original printing)</h2>
<li>p. 190 “an expected positive difference of about 5% in the outcome variable” <span class="math inline">\(\rightarrow\)</span> “an expected positive difference of about 6% in the outcome variable”</li>
<li>p. 208, <code>/sqrt(0.3*26)</code> <span class="math inline">\(\rightarrow\)</span> <code>*sqrt(0.3/26)</code> (this is correct in the code in the web page, typo fix in the book thanks to Eugenia Migliavacca)</li>
<li><p>p. 208, Figure 12.11: all top-row subplots show the prior distribution for the regularized horseshoe prior. The correct subplots are produced by the R code in the <a href="https://avehtari.github.io/ROS-Examples/Student/student.html">Student example</a> and shown below (thanks to Zhengchen Cai for reporting the issue) <img src="Figure_12_11_correct.png" width="800" /></p></li>
<li>p. 213, Ex 12.11, “<span class="math inline">\(\beta\)</span> is 0.3” <span class="math inline">\(\rightarrow\)</span> “<span class="math inline">\(\beta\)</span> is <span class="math inline">\(-0.3\)</span>” (thanks to Justin Gross)</li>
<li>p. 217, First line in section 13.1 “The logistic function, logit(x)” <span class="math inline">\(\rightarrow\)</span> “The logit function, logit(x)” (thanks to VicentModesto)</li>
<li>p. 219, “The inverse logistic function is curved” <span class="math inline">\(\rightarrow\)</span> “The inverse logit function is curved” (thanks to VicentModesto)</li>
<li>p. 241-242, in paragraphs starting “The steps go…” and “Figure 14.2…”: <span class="math inline">\((4.0/4.1)x_1\)</span> <span class="math inline">\(\rightarrow\)</span> <span class="math inline">\((4.4/4.1)x_1\)</span> (thanks to Doug Davidson)</li>
