forked from sol-eng/bike_predict
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathdocument.qmd
116 lines (94 loc) · 2.98 KB
/
document.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
---
title: "ETL Step 1 - Raw Data Refresh"
date: "`r lubridate::now(tzone = 'EST')` EST"
output: html_document
---
This notebook uses the [bikeHelpR](https://packagemanager.rstudio.com/client/#/repos/1/packages/bikeHelpR) package to update a database with the latest bike share data. The package pulls data from <https://capitalbikeshare.com> which provides an API to access bike share data. The raw data is written to the *Content DB* database to the *bike_raw_data* table.
```{r setup, include=FALSE}
knitr::opts_chunk$set(warning = FALSE, message = FALSE)
library(dplyr)
library(dbplyr)
library(odbc)
```
## Get data from API
Use the the `bikeHelpR` package to get the latest data from <https://capitalbikeshare.com>.
### Station status
```{r station_status_data}
feeds_station_status <-
bikeHelpR::feeds_urls() %>%
filter(name == "station_status") %>%
pull(url) %>%
bikeHelpR::get_data()
station_status <-
feeds_station_status %>%
magrittr::extract2("data") %>%
dplyr::mutate(time = feeds_station_status$last_updated) %>%
dplyr::select(
is_installed,
num_bikes_available,
last_reported,
is_renting,
eightd_has_available_keys,
num_docks_available,
num_docks_disabled,
is_returning,
station_id, num_ebikes_available,
num_bikes_disabled,
time
)
glimpse(station_status)
```
### Station info
```{r station_info_data}
# The station information endpoint.
station_information_url <-
bikeHelpR::feeds_urls() %>%
filter(name == "station_information") %>%
pull(url)
# Call the endpoint to obtain the JSON data.
request <- httr2::request(station_information_url)
response <- httr2::req_perform(request)
json_data <- httr2::resp_body_json(response)
# Convert the JSON data into a tibble.
station_info <-
json_data$data %>%
as_tibble() %>%
tidyr::unnest_wider(stations) %>%
select(station_id, name, lat, lon) %>%
distinct() %>%
mutate(
last_updated = as.POSIXct(
json_data$last_updated,
origin = "1970-01-01 00:00:00 UTC"
)
)
glimpse(station_info)
```
## Update database
Write the new data from the API to the database.
```{r write_to_db}
con <- dbConnect(odbc(), "Content DB", timeout = 10)
dbWriteTable(con, "bike_raw_data", station_status, append = TRUE)
dbWriteTable(con, "bike_station_info", station_info, overwrite = TRUE)
dbDisconnect(con)
print("Raw data updated 🎉")
```
## Update the pin
Station info will also be written to a pin. This pin will be accessed by the shiny app so that it can easily get the bike station info without connecting to the database.
```{r pin_station_info}
board <- pins::board_rsconnect(
server = Sys.getenv("CONNECT_SERVER"),
key = Sys.getenv("CONNECT_API_KEY"),
versioned = TRUE
)
# Write the model to the board.
pins::pin_write(
board,
x = station_info,
type = "csv",
name = "bike-predict-r-station-info-pin",
title = "Bike Predict - ETL - Pinned Station Info",
description = "Bike station info from https://capitalbikeshare.com."
)
print("Pin 🎉")
```