Commit

resolved lead comments
jmorkin committed Apr 11, 2024
1 parent 7032680 commit 1b22bab
Showing 2 changed files with 21 additions and 13 deletions.
26 changes: 13 additions & 13 deletions exposure_to_lead_in_schools_nys.qmd
@@ -135,7 +135,7 @@ theme<-theme(axis.text=element_text(size=12, face="bold"),

## Overview

-This use case explores the risks of exposure to lead via drinking water. Lead contamination is a serious issue that poses severe health risks and requires remedial action. In this lesson, we will analyze data on lead levels in [NY State schools](https://health.data.ny.gov/Health/Lead-Testing-in-School-Drinking-Water-Sampling-and/rkyy-fsv9/data) collected from 2016, 2017, 2018 and 2019 and compare it with population characteristics at county level in NYS to understand its impact. No “safe” levels of lead have been established, but we will discuss what level of lead can be detected. The lesson examines the sources of lead exposure and its adverse effects. The review also discusses the importance of data transparency, public participation, visualizing contamination as maps/graphs, and estimating population risks to address water quality issues.
+This use case explores the risks of exposure to lead via drinking water. Lead contamination is a serious issue that poses severe health risks and requires remedial action. In this lesson, we will analyze data on lead levels in [NY State schools](https://health.data.ny.gov/Health/Lead-Testing-in-School-Drinking-Water-Sampling-and/rkyy-fsv9/data) collected from 2016, 2017, 2018, and 2019 and compare it with population characteristics at the county level in New York State (NYS) to understand its impact. No “safe” levels of lead have been established, but we will discuss what level of lead can be detected. The lesson examines the sources of lead exposure and its adverse effects. The review also discusses the importance of data transparency, public participation, visualizing contamination as maps/graphs, and estimating population risks to address water quality issues.

## Learning Objectives

@@ -157,12 +157,12 @@ It is crucial to highlight that the degree of lead able to be detected depends u

New York State (NYS) Lead Testing in School Drinking Water dataset shows the school drinking water lead sampling and results information reported by each NYS public school and Boards of Cooperative Educational Services (BOCES) [(NYS Department of Health)](https://health.data.ny.gov/Health/Lead-Testing-in-School-Drinking-Water-Sampling-and/rkyy-fsv9/data). More information on the NYS dataset sampling is available [here](https://www.health.ny.gov/environmental/water/drinking/lead/lead_testing_of_school_drinking_water.htm).

-Analysis of the dataset reveals that as of 2022, 1,864 schools had lead outlets testing higher than 15 ppb [@NYS2016]. While 527 schools finished their remediation, 1,851 schools reported taking remedial action. There are now 12 schools with outlets exceeding 15 ppb in operation, indicating possible continuous exposure. However, there are gaps in following up and documenting the corrective measures. More transparency is necessary for schools with high exposure to lead to address the hazards of lead contamination and implement improved repeated testing protocols. The New York State Department of Health provides guidelines, rules, and resources pertaining to lead testing and remediation in schools [@DOH2023]. However, it seems that there is currently a lack of financial and technical support for schools to handle lead hazards.
+Analysis of the dataset reveals that as of 2022, 1,864 schools had lead outlets testing higher than 15 ppb [@NYS2016]. While 527 schools finished their remediation, 1,851 schools reported taking remedial action. There are now 12 schools with outlets exceeding 15 ppb in operation, indicating possible continuous exposure. However, there are gaps in following up and documenting the corrective measures. More transparency is necessary for schools with high exposure to lead to address the hazards of lead contamination and implement improved repeated testing protocols. The New York State Department of Health provides guidelines, rules, and resources on lead testing and remediation in schools [@DOH2023]. However, it seems that there is currently a lack of financial and technical support for schools to handle lead hazards.

::: {.callout-tip style="color: #7d2748;"}
### Lead in the News

-Thousands of people were exposed to dangerously high lead levels in their drinking water when the [Flint water crisis](https://www.nrdc.org/stories/flint-water-crisis-everything-you-need-know) broke out in 2014. A study by Virginia Tech researchers, through their resident-organized sampling to testing data of 252 homes, revealed that lead levels in the city had increased. Over 17% of samples tested higher than the federal "action level" of 15 ppb, which calls for the need for corrective action. More than 40% had lead readings higher than 5 ppb, which the researchers deemed indicative of a "very serious" issue.
+Thousands of people were exposed to dangerously high lead levels in their drinking water when the [Flint water crisis](https://www.nrdc.org/stories/flint-water-crisis-everything-you-need-know) broke out in 2014. A study by Virginia Tech researchers, through their resident-organized sampling to testing data of 252 homes, revealed that lead levels in the city had increased [@NRDC]. Over 17% of samples tested higher than the federal "action level" of 15 ppb, which calls for the need for corrective action. More than 40% had lead readings higher than 5 ppb, which the researchers deemed indicative of a "very serious" issue.

Even years after the crisis began, elevated lead levels remained in Flint's schools. An article by [The New York Times](https://www.nytimes.com/2019/11/06/us/politics/flint-michigan-schools.html) discusses how, in 2019, drinking water samples from 30 Flint school buildings still exhibit excessive lead levels. The elevated levels demonstrate remaining problems with a prolonged impact on children's health and development. Schools have an obligation to supply their pupils with clean drinking water. The Flint water crisis brought to light the long-term consequences of prolonged exposure to lead, especially for vulnerable groups such as children.

@@ -173,7 +173,7 @@ In 2021, the Biden-Harris administration announced an ambitious [Lead Pipe and P

### Read data

-To work with the NYS data, first we will read the NYS school lead testing results from 2016 to 2019. The dataset is hosted on a GitHub repository and we will read the dataset by using the dataset url.
+To work with the NYS data, first we will read the NYS school lead testing results from 2016 to 2019. The dataset is hosted on a GitHub repository and we will read the dataset by using the dataset URL.

```{r, eval=FALSE}
# Dataset url on GitHub repository.
@@ -185,15 +185,15 @@ school_lead_df<-read_csv(url(data_url))

### Preparing the lead dataset for the analysis

-All datasets require some pre-cleaning and formatting. In the section below, we will format field names. R does not like fields names with spaces, so we need to convert space to an underscore "\_". Also, we need to extract the year from the date field for the next step of our work.
+All datasets require some pre-cleaning and formatting. In the section below, we will format field names. R does not like field names with spaces, so we need to convert space to an underscore "\_". Also, we need to extract the year from the date field for the next step of our work.

```{r, eval=FALSE}
# There are empty spaces in the field names which R does not like.
# Replace empty space with "_".
names(school_lead_df) <- names(school_lead_df) %>% stringr::str_replace_all("\\s","_")
-# Extract the year from date field. We are using Date_Results_Updated for the date.
+# Extract the year from the date field. We are using Date_Results_Updated for the date.
school_lead_df<-school_lead_df %>% mutate(year=format(as.Date(Date_Results_Updated, format="%d/%m/%Y"),"%Y"))
# The data report, for each outlet, whether its lead level is above or below 15 ppb.
@@ -209,7 +209,7 @@ school_lead_df<-school_lead_df %>% mutate(lead_summary_by_school=case_when(

Getting familiar with the dataset is the first step of an analysis. To understand the attributes, we will query data by geographic region (county) and different attributes (fields).

-The code below creates a [Shiny app](https://shiny.rstudio.com/) which allows users to select a county and specific fields from a data frame (school_lead_df), and then it displays the corresponding data table based on the user's selections.
+The code below creates a [Shiny app](https://shiny.rstudio.com/) that allows users to select a county and specific fields from a data frame (school_lead_df), and then it displays the corresponding data table based on the user's selections.

::: column-margin
::: {.callout-tip style="color: #5a7a2b;"}
@@ -243,13 +243,13 @@ if (input$county=="All Counties")

<br>

-### Converting from tabular data to geo-spatial data
+### Converting from tabular data to geospatial data

-A dataset needs to have a geometry attribute to plot the data on a map or to do different spatial analysis. The NYS dataset has xy coordinates of schools. The xy coordinates will allow us to convert the tabular dataset to a spatial dataset.
+A dataset needs to have a geometry attribute to plot the data on a map or to conduct different spatial analyses. The NYS dataset has xy coordinates of schools. The xy coordinates will allow us to convert the tabular dataset to a spatial dataset.

-We first need to address that the xy coordinates are not properly formatted. The coordinates are currently stored with school addresses, for example: 31-02 67 AVENUE Queens, NY 11364(40.74779141700003, -73.74551716499997). We need to extract the coordinates (40.74779141700003, -73.74551716499997) and store the value on each side of the comma as a separate field. The first number refers to the y coordinate (latitude), the second number refers to the x coordinate (longitude).
+We first need to address that the xy coordinates are not properly formatted. The coordinates are currently stored with school addresses, for example: 31-02 67 AVENUE Queens, NY 11364(40.74779141700003, -73.74551716499997). We need to extract the coordinates (40.74779141700003, -73.74551716499997) and store the value on each side of the comma as a separate field. The first number refers to the y coordinate (latitude), and the second number refers to the x coordinate (longitude).
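
The extraction step described above can be sketched in base R roughly as follows. This is only an illustration on a toy vector: the names `location`, `latitude`, and `longitude` are assumptions made for the example and are not necessarily the field names used in the lesson's own code.

```r
# Toy example of the combined address + coordinate string described above.
# `location` is a hypothetical name, not a confirmed field of the NYS dataset.
location <- c(
  "31-02 67 AVENUE Queens, NY 11364(40.74779141700003, -73.74551716499997)"
)

# Capture the two numbers inside the trailing parentheses.
pattern <- "\\((-?[0-9.]+),[[:space:]]*(-?[0-9.]+)\\)[[:space:]]*$"
matches <- regmatches(location, regexec(pattern, location))

# Each match holds: the full match, the first number (latitude, y),
# and the second number (longitude, x). Rows without a match become NA.
latitude  <- as.numeric(vapply(matches, function(m) m[2], character(1)))
longitude <- as.numeric(vapply(matches, function(m) m[3], character(1)))
```

Keeping the two values as separate numeric fields is what later allows them to be passed as x and y when the table is converted to a spatial object.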

-While converting the data, we also need to know the projection of xy coordinates. XY coordinates can be in different projections systems. Projection information is typically stored in the metadata of a dataset. However, in the NYS dataset there is not any metadata attached to the dataset.
+While converting the data, we also need to know the projection of xy coordinates. XY coordinates can be in different projection systems. Projection information is typically stored in the metadata of a dataset. However, in the NYS dataset, there is not any metadata attached to the dataset.

The most commonly used geographic coordinate system is the [WORLD GEODETIC SYSTEM 1984 (WGS 84)](https://earth-info.nga.mil/index.php?dir=wgs84&action=wgs84). We will use the WGS84 projection to convert the NYS dataset to spatial data.
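
As a hedged sketch of what such a conversion could look like, the example below uses the `sf` package, which is an assumption here; the lesson's own conversion code is not shown in this excerpt and may differ. WGS 84 corresponds to EPSG code 4326. The data frame `schools` and its column names are hypothetical.

```r
# Hypothetical table with one school and its already-extracted coordinates.
schools <- data.frame(
  school    = "Example school",
  latitude  = 40.74779141700003,
  longitude = -73.74551716499997
)

# Convert to a spatial object. `coords` is given as x (longitude) first,
# then y (latitude); crs = 4326 is the EPSG code for WGS 84.
# The conversion is skipped when the sf package is not installed.
if (requireNamespace("sf", quietly = TRUE)) {
  schools_sf <- sf::st_as_sf(
    schools,
    coords = c("longitude", "latitude"),
    crs = 4326
  )
  print(sf::st_crs(schools_sf)$epsg)
}
```

Passing the columns in x, y order matters: swapping latitude and longitude is a common mistake that places points in the wrong part of the world.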

@@ -323,9 +323,9 @@ tmap_leaflet(tmap)

<br>

-### Obtain population data from US Census Bureau
+### Obtain population data from the US Census Bureau

-We will next pull population datafrom US Census Bureau by using the Census Bureau's Application Programming Interface (API).
+We will next pull population data from the US Census Bureau by using the Census Bureau's Application Programming Interface (API).

```{r , eval=FALSE }
#| context: "render"
8 changes: 8 additions & 0 deletions lead-references.bib
@@ -115,3 +115,11 @@ @article{brooks2021
url = {http://dx.doi.org/10.1017/dmp.2021.41},
langid = {en}
}
+
+@online{NRDC,
+author = {Natural Resources Defense Council},
+shortauthor = {{NRDC}},
+title = {Flint Water Crisis: Everything You Need to Know},
+url = {https://www.nrdc.org/stories/flint-water-crisis-everything-you-need-know#summary},
+note = {Accessed: April 11, 2024}
+}
