# Dataset URL in the GitHub repository.
data_url <- "https://raw.githubusercontent.com/renastechschool/Python_tutorials/main/Lead_Testing_in_School_Drinking_Water_Sampling_and_Results_Compliance_Year_2016_formated.csv?token=GHSAT0AAAAAACNH7S3BJGTQXNGH4UPQCJI6ZNQB3VA"
@@ -4907,14 +4907,14 @@ data_urlRead data
Preparing the lead dataset for the analysis
-All datasets require some pre-cleaning and formatting. In the section below, we will format field names. R does not like fields names with spaces, so we need to convert space to an underscore “_”. Also, we need to extract the year from the date field for the next step of our work.
+All datasets require some pre-cleaning and formatting. In the section below, we will format field names. R does not handle field names with spaces well, so we need to replace each space with an underscore “_”. We also need to extract the year from the date field for the next step of our work.
# There are empty spaces in the field names which R does not like.
# Replace empty space with "_".
names(school_lead_df) <- names(school_lead_df) %>% stringr::str_replace_all("\\s","_")
-# Extract the year from date field. We are using Date_Results_Updated for the date.
+# Extract the year from the date field. We are using Date_Results_Updated for the date.
school_lead_df <- school_lead_df %>% mutate(year=format(as.Date(Date_Results_Updated, format="%d/%m/%Y"),"%Y"))
school_lead_df
# The data report lead levels by outlet, i.e., whether an outlet's lead level is above or below 15 ppb.
@@ -4928,7 +4928,7 @@
Get familiar with the dataset
Getting familiar with the dataset is the first step of an analysis. To understand the attributes, we will query data by geographic region (county) and different attributes (fields).
-The code below creates a Shiny app which allows users to select a county and specific fields from a data frame (school_lead_df), and then it displays the corresponding data table based on the user’s selections.
+The code below creates a Shiny app that allows users to select a county and specific fields from a data frame (school_lead_df), and then it displays the corresponding data table based on the user’s selections.
@@ -4962,11 +4962,11 @@ Get familiar
-
-Converting from tabular data to geo-spatial data
-A dataset needs to have a geometry attribute to plot the data on a map or to do different spatial analysis. The NYS dataset has xy coordinates of schools. The xy coordinates will allow us to convert the tabular dataset to a spatial dataset.
-We first need to address that the xy coordinates are not properly formatted. The coordinates are currently stored with school addresses, for example: 31-02 67 AVENUE Queens, NY 11364(40.74779141700003, -73.74551716499997). We need to extract the coordinates (40.74779141700003, -73.74551716499997) and store the value on each side of the comma as a separate field. The first number refers to the y coordinate (latitude), the second number refers to the x coordinate (longitude).
-While converting the data, we also need to know the projection of xy coordinates. XY coordinates can be in different projections systems. Projection information is typically stored in the metadata of a dataset. However, in the NYS dataset there is not any metadata attached to the dataset.
+
+Converting from tabular data to geospatial data
+A dataset needs to have a geometry attribute to plot the data on a map or to conduct different spatial analyses. The NYS dataset has xy coordinates of schools. The xy coordinates will allow us to convert the tabular dataset to a spatial dataset.
+We first need to address that the xy coordinates are not properly formatted. The coordinates are currently stored with school addresses, for example: 31-02 67 AVENUE Queens, NY 11364(40.74779141700003, -73.74551716499997). We need to extract the coordinates (40.74779141700003, -73.74551716499997) and store the value on each side of the comma as a separate field. The first number refers to the y coordinate (latitude), and the second number refers to the x coordinate (longitude).
+While converting the data, we also need to know the projection of xy coordinates. XY coordinates can be in different projection systems. Projection information is typically stored in the metadata of a dataset. However, in the NYS dataset, there is not any metadata attached to the dataset.
The most commonly used geographic coordinate system is the World Geodetic System 1984 (WGS 84, EPSG:4326). We will use WGS 84 as the coordinate reference system when converting the NYS dataset to spatial data.
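The extraction and conversion steps described above can be sketched roughly as follows. This is a minimal illustration, not the tutorial's actual code: the data frame `schools`, the column name `Location`, and the new `latitude` and `longitude` fields are assumptions made for the example, and the spatial conversion only runs if the sf package is installed.

```r
# Hypothetical one-row example; the real column name in the NYS dataset may differ.
schools <- data.frame(
  School = "Example school",
  Location = "31-02 67 AVENUE Queens, NY 11364(40.74779141700003, -73.74551716499997)",
  stringsAsFactors = FALSE
)

# Capture the two signed decimal numbers inside the trailing parentheses.
coord_pattern <- "\\((-?[0-9.]+),[[:space:]]*(-?[0-9.]+)\\)[[:space:]]*$"
parts <- regmatches(schools$Location, regexec(coord_pattern, schools$Location))

# Each successful match is c(full match, first number, second number).
# Rows without a coordinate pair get NA instead of raising an error.
pick <- function(p, i) if (length(p) == 3) as.numeric(p[i]) else NA_real_
schools$latitude  <- vapply(parts, pick, numeric(1), i = 2)  # first number: y
schools$longitude <- vapply(parts, pick, numeric(1), i = 3)  # second number: x

# Convert to a spatial object, assuming the coordinates are in WGS 84 (EPSG:4326).
# st_as_sf expects the x column (longitude) first, then the y column (latitude).
if (requireNamespace("sf", quietly = TRUE)) {
  schools_sf <- sf::st_as_sf(
    schools[!is.na(schools$latitude), ],
    coords = c("longitude", "latitude"),
    crs = 4326
  )
}
```

Note the order of `coords`: the address string lists latitude first, but the spatial conversion takes longitude (x) before latitude (y), so swapping them would place the schools in the wrong location.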
@@ -5026,9 +5026,9 @@ Mapping the dataset
-
-Obtain population data from US Census Bureau
-We will next pull population datafrom US Census Bureau by using the Census Bureau’s Application Programming Interface (API).
+
+Obtain population data from the US Census Bureau
+We will next pull population data from the US Census Bureau by using the Census Bureau’s Application Programming Interface (API).
#|panel: fill
@@ -5264,6 +5264,9 @@ Explore the data
Brooks, Samantha K, and Sonny S Patel. 2021. “Psychological Consequences of the Flint Water Crisis: A Scoping Review.” Disaster Medicine and Public Health Preparedness 16 (3): 1259–69. https://doi.org/10.1017/dmp.2021.41.
+
+Council, Natural Resources Defense. n.d. “Flint Water Crisis: Everything You Need to Know.” https://www.nrdc.org/stories/flint-water-crisis-everything-you-need-know#summary.
+
Environmental Health Sciences, National Institute of. n.d. “Safe Water and Your Health.” https://www.niehs.nih.gov/health/topics/agents/water-poll.