-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathGA_Opioid_Report.Rmd
346 lines (303 loc) · 13.7 KB
/
GA_Opioid_Report.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
---
title: "GA_Opioid_Report"
author: "Hayoung Jung"
output: html_document
---
```{r, set-chunk-opts, echo = FALSE}
library(knitr)
opts_chunk$set(
warning = FALSE, message = FALSE, echo = FALSE
)
```
```{r, here-i-am, include = FALSE}
here::i_am("GA_Opioid_Report.Rmd")
```
```{r, import-dat-rds, include = FALSE}
# Import Data
dat <- readRDS(
file = here::here("data/dat.rds")
)
```
```{r, import-data, include = FALSE}
# Import Libraries
library(tidyverse)
library(ggplot2)
library(tmap)
library(sf)
library(lme4)
```
## Introduction
**The data set is about the number of opioid overdose deaths and population in each county in Georgia in 2020. The goal of this study is to examine county-level measures of social vulnerability and access to health care in relation to observed overdose mortality rates. Specifically, we aim to identify county-level factors associated with opioid mortality in Georgia, focusing on measures of social vulnerability such as poverty, unemployment, housing vacancy, urbanization, and access to health care and treatment.**
## Data Structure and Summary Table
```{r}
# Data Structure
glimpse(dat)
```
```{r}
# Table 1
table_one <- readRDS(
file = here::here("output/table_one.rds")
)
table_one
```
* This table presents an analysis of difference among low, moderate, and high mortality rate groups across various demographic, socioeconomic, and geographic factors. Significant disparities were found in rural-urban continuum code (p=0.008), county population (p=0.040), poverty rate (p=0.002), vacancy rate (p=0.042), percentage of black population (p=0.012), and distance to interstate (p=0.035). These variablees differ meaningfully across mortality rate groups, suggesting potential associations with mortality outcomes. <br>
* Counties that are missing distance to interstate and treatment are **Columbia** and **Echols**.
## Exploratory Data Analysis
### 1. NAME
#### **Name of the county**
```{r, NAME}
#| fig.align = "center"
dat %>%
tm_shape() +
tm_polygons() +
tm_text(text = "NAME",
size = 0.35) +
tm_layout(title = "County \n Name",
title.size = 1,
title.position = c(.8, .9),
frame = FALSE)
```
* There are 159 counties in Georgia.
### 2. rucc_code13 / rucc_code13_n
#### **Rural-urban continuum code of the County**
```{r, rucc_code bar-plot}
#| fig.align = "center",
#| out.width = "500px"
knitr::include_graphics(
here::here("output/bar_rucc.png")
)
```
```{r, rucc_code}
#| fig.align = "center"
dat %>%
tm_shape() +
tm_polygons(col = c("rucc_code13"),
title = "Rural-Urban Continuum Code",
style = "order",
pal = "-Greens") +
tm_text(text = "rucc_code13_n",
size = 0.6) +
tm_layout(frame = FALSE,
legend.title.size = 1)
```
* Most counties in Georgia are non-metro, followed by small metro, large fringe metro, micropolitan, and medium metro areas. <br> Only **Fulton** county is classified as a large central metro.
* Most major metro areas, including large central metro, large fringe metro, and medium metro areas, are concentrated in the northern part of the state, particularly around Atlanta.
### 3 & 4. mortality / population
#### **The number of opioid-related deaths and population in each county**
```{r, mortality & population}
#| fig.align = "center"
# mortality count
p1 <- dat %>%
tm_shape() +
tm_polygons(col = c("mortality"),
title = "Mortality Count",
style = "cont",
pal = "Greens",
legend.reverse = TRUE) +
tm_text(text = "mortality",
size = 0.6) +
tm_layout(frame = FALSE,
legend.title.size = 1)
# population
p2 <- dat %>%
tm_shape() +
tm_polygons(col = c("population"),
title = "County Population",
style = "cont",
pal = "Reds",
legend.reverse = TRUE) +
tm_text(text = "population",
size = 0.4) +
tm_layout(frame = FALSE,
legend.title.size = 1)
tmap_arrange(p1, p2, nrow=1)
```
* Mortality counts are concentrated in urban areas, meaning higher counts are typically found in areas with large populations, such as Fulton, Gwinnett, Cobb, and Dekalb counties.
* In southern Georgia, **Tift** County stands out with a relatively high count (**31**), suggesting a higher expected mortality rate.
### 5. mort_rate
#### **Mortality rate = mortality count/population in each county**
```{r, mort_rate}
#| fig.align = "center"
# incidence
tm_shape(dat) +
tm_polygons(col = c("incidence"),
title = "Mortality rate",
style = "cont",
pal = "Greens",
legend.reverse = TRUE) +
tm_layout(frame = FALSE,
legend.title.size = 1)
```
* Located in southern and central Georgia, **Turner** County has the highest mortality rate at 0.001, followed by Tift (0.0008), Randolph (0.0006), Telfair (0.0006), and Wilkinson (0.0006) counties, which are also in the southern part of the st
* We aim to identify whether certain covariates in counties are associated with higher or lower mortality rates. On each covariate map, we are looking for emerging patterns across these areas.
##### The following are summary of the geographic distribution of the covariates that we are interested in.
### 6. pct_poverty
#### **Percentage of people whose income in the past 12 months is below the poverty level in each county**
```{r, pct_poverty}
#| fig.align = "center"
# pct_poverty
dat %>%
tm_shape() +
tm_polygons(col = c("pct_poverty"),
title = "Poverty Rate",
style = "cont",
pal = "Greens",
legend.reverse = TRUE) +
tm_text(text = "pct_poverty",
size = 0.5) +
tm_layout(frame = FALSE,
legend.title.size = 1)
```
* **Taylor** County has the highest poverty rate at 28.6%, followed by Taliaferro (28.5%), Terrell (28.0%), Jenkins (27.5%), and Seminole (27.1%) counties.
* Overall, counties in southern and central Georgia tend to have relatively higher poverty rates compared to those in northern Georgia.
### 7. vacancy_rate
#### **Percentage of housing vacancy in each county**
```{r, vacancy_rate}
#| fig.align = "center"
# vacancy_rate
dat %>%
tm_shape() +
tm_polygons(col = c("vacancy_rate"),
title = "Vacancy Rate",
style = "cont",
pal = "Blues",
legend.reverse = TRUE) +
tm_text(text = "vacancy_rate",
size = 0.5) +
tm_layout(frame = FALSE,
legend.title.size = 1)
```
* **Quitman** County has the highest vacancy rate at 53.2%, followed by Rabun (44.6%), Hancock (43.3%), Towns (39.6%), and Clay (39.2%) counties.
* Vacancy rates vary widely across the state, with high rates observed in northern, central-eastern, and southern Georgia.
### 8. unemployment_rate / _out
#### **Unemployment rate (before / after treating outliers) in each county**
```{r, unemployment_rate histogram}
#| fig.align = "center",
#| out.width = "500px"
# histogram
knitr::include_graphics(
here::here("output/hist_unemp.png")
)
```
```{r, unemployment_rate}
#| fig.align = "center"
dat %>%
tm_shape() +
tm_polygons(col = c("unemployment_rate"),
title = "Unemployment Rate",
style = "cont",
pal = "Oranges",
legend.reverse = TRUE) +
tm_text(text = "unemployment_rate",
size = 0.5) +
tm_layout(frame = FALSE,
legend.title.size = 1)
```
* **Quitman** county has a notably high unemployment rate of **21.4%**, making it an outlier among other counties. To address this outlier, we replaced any values above the 99th percentile with the 99th percentile of the ranked value.
```{r, unemployment_rate_out, echo = FALSE}
#| fig.align = "center"
# unemployment_rate_out
dat %>%
tm_shape() +
tm_polygons(col = c("unemployment_rate_out"),
title = "Unemployment Rate\n(Replaced Outliers)",
style = "cont",
pal = "Oranges",
legend.reverse = TRUE) +
tm_text(text = "unemployment_rate_out",
size = 0.5) +
tm_layout(frame = FALSE,
legend.title.size = 1)
```
* After adjusting for outliers, **Quitman** and **Baker** counties have the highest unemployment rate at 13.83%, followed by Charlton (13.7%), Long (12.4%), and Crisp (12%) counties.
* Overall, counties in southern and rural Georgia tend to have higher unemployment rates compared to those in northern Georgia.
### 9. pct_black
#### **Percentage of county population that is made up of Black people**
```{r, pct_black}
#| fig.align = "center"
# pct_black
dat %>%
tm_shape() +
tm_polygons(col = c("pct_black"),
title = "Percent of Black Population",
style = "cont",
pal = "Purples",
legend.reverse = TRUE) +
tm_text(text = "pct_black",
size = 0.5) +
tm_layout(frame = FALSE)
```
* **Hancock** County has the largest percentage of Black population at 72.7%, followed by Clayton (72.5%), Dougherty (71.2%), Randolph (64.4%), and Macon (62.6%) counties.
* In general, counties with higher percentages of Black populations are more common in central and southwestern Georgia.
### 10. dist_to_usroad
#### **Distance from County Seat to Interstate**
```{r, dist_to_usroad}
#| fig.align = "center"
# dist_to_usroad
tm_shape(dat) +
tm_polygons(col = c("dist_to_usroad"),
title = "Distance to Interstate",
style = "cont",
pal = "Greens",
legend.reverse = TRUE) +
tm_layout(frame = FALSE,
legend.title.size = 1)
```
* **Seminole** County has the longest distance from the county seat to the interstate at 446,752 feet, followed by Miller (338,135 feet), Early (371,695 feet), Decatur (364,239 feet), and Bacon (313,542 feet) counties.
* Counties in southwestern and southeastern Georgia tend to have longer distances to interstates, particularly in more rural areas.
### 11. dist_to_treatment
#### **Distance to Treatment Centers**
```{r, dist_to_treatment}
#| fig.align = "center"
# dist_to_treatment
tm_shape(dat) +
tm_polygons(col = c("dist_to_treatment"),
title = "Distance to Treatment Centers",
style = "cont",
pal = "Greens",
legend.reverse = TRUE) +
tm_layout(frame = FALSE,
legend.title.size = 1)
```
* U.S. Department of Health and Human Services, Substance Abuse and Mental Health Services Administration, Behavioral Health Services Information System. (2024). FindTreament_Facility_listing. Retrieved from https://findtreatment.gov/locator.
* **Quitman** County has the longest distance to treatment centers at 185,132 feet, followed by Warren (164,830 feet), Stewart (161,331 feet), Glascock (160,496 feet), and Randolph (153,437 feet) counties.
* Counties in southwestern and rural central Georgia generally have longer distances to treatment centers.
## Poisson Random Intercept Model with a Population Offset
<br>
$$
y_i|\mu_i \sim \text{Poisson}(\mu_i), \\ where \ \mu_i = E(y_i) = Var(y_i)
\\\
\\
\log\left(\frac{\mu_i}{pop_i}\right) = \beta_0 + \beta_1\,poverty\_rate_i + \beta_2\,vacancy\_rate_i + \beta_3\,unemployment\_rate_i + \beta_4\,pct\_black_i + \beta_5\,dist\_to\_road_i + \beta_6\,dist\_to\_treatment_i + \theta_i
\\\
\\
\log(\mu_i) = \log(pop_i) + \beta_0 + \beta_1\,poverty\_rate_i + \beta_2\,vacancy\_rate_i + \beta_3\,unemployment\_rate_i + \beta_4\,pct\_black_i + \beta_5\,dist\_to\_road_i + \beta_6\,dist\_to\_treatment_i + \theta_i
\\\
\\
\theta_i \sim N(0,\tau^2)
$$
* \(y_i\) : mortality count for county i
* \(\mu_i\) : expected mortality count for county i
* \(\log pop_i\) : population of county i, used as an offset to adjust for the different population sizes across the counties
* \(\beta_0\) : baseline log expected mortality rate
* \(\theta_i\) : random intercept for county i, county-specific deviation in baseline log expected mortality rate
* \(e^{\beta_1}\) : relative mortality rate change for a one standard deviation increase in the poverty rate
* \(e^{\beta_2}\) : relative mortality rate change for a one standard deviation increase in the vacancy rate
* \(e^{\beta_3}\) : relative mortality rate change for a one standard deviation increase in the unemployment rate
* \(e^{\beta_4}\) : relative mortality rate change for a one standard deviation increase in the percentage of black population
* \(e^{\beta_5}\) : relative mortality rate change for a one standard deviation increase in the distance to the interstate
* \(e^{\beta_6}\) : relative mortality rate change for a one standard deviation increase in the distance to the treatment center
### Fit the Model
```{r, poisson regression, echo = TRUE}
# Fit the poisson regression model
dat$log_pop = log(dat$population)
fit = glmer(mortality ~ offset(log_pop) + pct_poverty_std + vacancy_rate_std +
unemployment_rate_out_std + pct_black_std + dist_to_usroad_std +
dist_to_treatment_std + (1|county),
family = poisson(link = "log"), data = dat)
summary(fit)
```
* The baseline relative mortality rate is \(e^\hat{\beta_0}\) = \(e^{-8.93}\) = **0.0001**.
* There exists **heterogeneity** in baseline mortality rate with a between-county standard deviation \(\tau\) of **0.44**. <br> So 95% of the counties have baseline mortality rates between \(e^{-8.93 \pm 1.96 \times 0.44}\) = **(0.00006, 0.0003)**.
* There is evidence that mortality rate **increases** by approximately **17.4%** (\(e^\hat{\beta_2}\) = \(e^{0.16}\) = 1.174) for a one standard deviation **increase** in the **vacancy rate**.
* There is evidence that mortality rate **decreases** by approximately **16.5%** (\(e^\hat{\beta_5}\) = \(e^{-0.180}\) = 0.835) for a one standard deviation **increase** in the **distance to the interstate**.