<html>
<head>
<!-- Global site tag (gtag.js) - Google Analytics -->
<script async src="https://www.googletagmanager.com/gtag/js?id=UA-121888276-1"></script>
<script>
window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'UA-121888276-1');
</script>
<title>MTA Project</title>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<!--[if lte IE 8]><script src="assets/js/ie/html5shiv.js"></script><![endif]-->
<link rel="stylesheet" href="assets/css/main.css" />
<!--[if lte IE 9]><link rel="stylesheet" href="assets/css/ie9.css" /><![endif]-->
<!--[if lte IE 8]><link rel="stylesheet" href="assets/css/ie8.css" /><![endif]-->
</head>
<body class="single">
<!-- facebook comments script -->
<div id="fb-root"></div>
<script>(function(d, s, id) {
var js, fjs = d.getElementsByTagName(s)[0];
if (d.getElementById(id)) return;
js = d.createElement(s); js.id = id;
js.src = 'https://connect.facebook.net/en_US/sdk.js#xfbml=1&version=v3.1';
fjs.parentNode.insertBefore(js, fjs);
}(document, 'script', 'facebook-jssdk'));</script>
<!-- twitter initialization script -->
<script>window.twttr = (function(d, s, id) {
var js, fjs = d.getElementsByTagName(s)[0],
t = window.twttr || {};
if (d.getElementById(id)) return t;
js = d.createElement(s);
js.id = id;
js.src = "https://platform.twitter.com/widgets.js";
fjs.parentNode.insertBefore(js, fjs);
t._e = [];
t.ready = function(f) {
t._e.push(f);
};
return t;
}(document, "script", "twitter-wjs"));</script>
<!-- Wrapper -->
<div id="wrapper">
<!-- Header -->
<header id="header">
<h1><a href="https://clarencestephen.github.io/">Clarence Stephen</a></h1>
<nav class="links">
<ul>
<li><a href="about.html">About Me</a></li>
<li><a href="resume.html">Resume</a></li>
<li><a href="projects.html">Projects</a></li>
<li><a href="archive.html">Blog Archive</a></li>
<li><a href="/links.html">Links</a></li>
</ul>
</nav>
<nav class="main">
<ul>
<li class="search">
<a class="fa-search" href="#search">Search</a>
<form id="search" method="get" action="#">
<input type="text" name="query" placeholder="Search" />
</form>
</li>
<li class="menu">
<a class="fa-bars" href="#menu">Menu</a>
</li>
</ul>
</nav>
</header>
<!-- Menu -->
<section id="menu">
<!-- Search -->
<section>
<form class="search" method="get" action="#">
<input type="text" name="query" placeholder="Search" />
</form>
</section>
<!-- Links -->
<section>
<ul class="links">
<li>
<a href="https://clarencestephen.github.io/">
<h3>Home</h3>
<p>Main page</p>
</a>
</li>
<li>
<a href="resume.html">
<h3>Resume</h3>
<p>Data Scientist & Investment Professional</p>
</a>
</li>
<li>
<a href="about.html">
<h3>About Me</h3>
<p> A few more details...</p>
</a>
</li>
<li>
<a href="archive.html">
<h3>Archive</h3>
<p>Prior blog posts</p>
</a>
</li>
<li>
<a href="projects.html">
<h3>Projects</h3>
<p>Archival of projects</p>
</a>
</li>
<li>
<a href="/links.html">
<h3>Links</h3>
<p>Other sites of interest</p>
</a>
</li>
</ul>
</section>
<!-- Actions -->
<section>
<ul class="actions vertical">
<li><a href="#" class="button big fit">Log In</a></li>
</ul>
</section>
</section>
<!-- Main -->
<div id="main">
<!-- Post -->
<article class="post">
<header>
<div class="title">
<hr width="100%">
<h2><a>MTA Project</a></h2>
<p>An analysis of MTA turnstile data in a hypothetical context</p>
</div>
<div class="meta">
<time class="published" datetime="2018-07-16">July 16, 2018</time>
<a href="about.html" class="author"><span class="name">Clarence Stephen</span><img src="images/avatar.jpg" alt="" /></a>
<!------------------Social share buttons-------------->
<div class="fb-like" data-width="100" data-layout="button_count" data-action="like" data-size="small" data-show-faces="true" data-share="false"></div>
<div class="fb-share-button" data-layout="button_count" data-size="small" data-mobile-iframe="true"><a target="_blank" href="https://www.facebook.com/sharer/sharer.php?u&src=sdkpreparse" class="fb-xfbml-parse-ignore">Share</a></div>
<br>
<a href="https://twitter.com/share?ref_src=twsrc%5Etfw" class="twitter-share-button" data-text="Check out Clarence's newest post!" data-via="clarencestephen" data-hashtags="DataScience, Finance, BigData, Stocks, MachineLearning" data-related="clarencestephen" data-lang="en" data-dnt="true" data-show-count="false" align = 'center' >Tweet</a><script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
<br>
<script src="//platform.linkedin.com/in.js" type="text/javascript"> lang: en_US</script><script type="IN/Share"></script>
<br>
<form action="https://getsimpleform.com/messages?form_api_token=76db078bca99f64a035e0d2d06de6932" method="post">
<!-- the redirect_to is optional, the form will redirect to the referrer on submission -->
<!--<input type='hidden' name='redirect_to' value='' /> -->
<!-- all your input fields here.... -->
<input type='text' name='email' />
<input type='submit' value='Subscribe via Email' />
</form>
</div>
</header>
<img src="images/DOHS.png" alt="" width="750" height="421" />
<br>
</div>
<p>MTA Traffic Analysis: Data cleaning, exploratory analysis, and visualization of MTA turnstile data to find areas of heavy traffic in NYC subway stations. (Matplotlib, pandas, seaborn)</p>
<i>An analysis of MTA turnstile data showed that peak traffic occurs at Grand Central, Monday through Friday between 4 and 8pm, with particular concentration around control unit R238.
Given the hypothetical risk of a terrorist attack, we conclude that the MTA should increase turnstile distribution and redistribute traffic volumes.
<br><br>
Warning/disclaimer: The situation described here is a hypothetical construct. Any resemblance to reality is purely coincidental,
as the goal was to conduct data analytics on an existing data set in a business context. </i>
<br> <br>
<p>I ride the subway every day. My commute is currently just under an hour, from Crown Heights,
where I live with my wife and soon-to-be-born daughter, to Metis near 28th and Park Ave in Manhattan.
So when our first project was to analyze MTA turnstile data, my initial reaction was to seize on my own anxieties.
New York is <b>CROWDED</b>. Having grown up in Oviedo, FL, where we drive EVERYWHERE, I resisted taking the subway
for most of my decade as a New Yorker. It's crowded, smelly, dirty, and hot and humid during the summer. But it's fast, inexpensive,
and an iconic element of New York.</p>
<p> I usually spend the first minute or so in a subway car assessing my surroundings--yes, "assessing." I want to know if there is a
dangerous or inebriated individual or someone who doesn't seem safe. Plus, as the MTA always says, "If you see something, say
something." So when given access to the MTA's turnstile data, my thoughts led me to a hypothetical application: thwarting a terrorist attack in NYC.</p>
<p> Let's look at the numbers:
<ul>
<li>MTA daily ridership exceeds 5 million commuters</li>
<li>Annual downtown traffic is approximately 110 million people</li>
<li>In 2017, there were more than 4,000 transit-related incidents</li>
</ul>
Those numbers are staggering, even for a New Yorker. More alarming, however, is the fact that the NYPD has only 40,000
uniformed officers to cover those 5 million commuters, and only a measly 4,000 (~10%) work as transit officers--that means on average
each transit officer is responsible for roughly 1,250 daily commuters and, on the figures above, handled about one reported incident over the year.
</p><p>
Oh, and this data, by the way--and I do not mean this facetiously--is the ugliest dataset I've seen to date. I'd love to hear your horror stories, but this is my first.
As the saying goes, there's no such thing as a free lunch, and this project made me work for it. For starters, I took the negative turnstile counts, no doubt indicative
of improperly installed meters, and used their absolute values. I then removed all significant outliers (~2%) using a reasonable threshold of 100k commuters per turnstile in
a 4-hour window. Missing values were backfilled as appropriate, using interpolation to keep the distributions continuous and smooth.
</p>
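<p>The cleaning steps described above can be sketched in pandas roughly as follows. This is a minimal illustration, not the project's actual code: the column names (<code>turnstile</code>, <code>entries</code>), the sample values, and the helper <code>clean_counts</code> are assumptions made for the example. Only the 100k-per-window threshold comes from the text.</p>
<pre>
```python
import numpy as np
import pandas as pd

# Hypothetical sample: entries per turnstile per 4-hour window.
# Column names and values are illustrative assumptions, not the real MTA schema.
raw = pd.DataFrame({
    "turnstile": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "entries": [120.0, -340.0, np.nan, 500.0, 90.0, 2_000_000_000.0, 110.0, np.nan],
})

MAX_PER_WINDOW = 100_000  # threshold from the post: 100k riders per turnstile per 4 hours


def clean_counts(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: fix signs, drop outliers, fill gaps per turnstile."""
    out = df.copy()
    # Negative counts are treated as sign errors, so keep their magnitude.
    out["entries"] = out["entries"].abs()
    # Drop rows above the threshold; rows with missing values are kept for filling.
    out = out[~(out["entries"] > MAX_PER_WINDOW)].copy()
    # Interpolate within each turnstile, then fill any gaps left at the edges.
    out["entries"] = out.groupby("turnstile")["entries"].transform(
        lambda s: s.interpolate().bfill().ffill()
    )
    return out


cleaned = clean_counts(raw)
print(cleaned)
```
</pre>
<p>Filling within each turnstile, rather than across the whole table, keeps one unit's readings from leaking into another's. Whether the original analysis grouped this way is an assumption of the sketch.</p>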
<p>Let's explore the data...</p>
<p>The chart below shows the distribution of turnstile counts for the highest-volume turnstiles around 42nd St/Grand Central/Times Square.
The outliers--highlighted in red--are extreme. It should be obvious, for example, that Port Authority never has 2B+ commuters.</p>
<p><img src="images/timeperiod1484_922.png" alt="" width="800" height="497" /></p>
<p>After removing the outliers, the first goal is to determine which station has the highest traffic and tag it as at risk.
Grand Central wins by a fairly large margin:</p>
<p><img src="images/trafficvol1696_978.png" alt="" width="800" height="462" /></p>
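<p>A ranking like the one charted above could be produced with a simple group-and-sum. The sketch below uses made-up station totals and assumed column names (<code>station</code>, <code>entries</code>); it shows the shape of the computation, not the project's real figures.</p>
<pre>
```python
import pandas as pd

# Hypothetical cleaned data; station labels and counts are made up for illustration.
cleaned = pd.DataFrame({
    "station": ["Grand Central", "Grand Central", "Times Square", "Port Authority"],
    "entries": [900, 1100, 1500, 1200],
})

# Total entries per station, largest first.
station_totals = (
    cleaned.groupby("station")["entries"].sum().sort_values(ascending=False)
)
busiest = station_totals.idxmax()

print(station_totals)
print(busiest)
```
</pre>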
<p>Next, looking at traffic in Grand Central by weekday, we can determine which day is riskiest. Here Thursday appears to be the most popular day to commute, with the weekend falling off dramatically, presumably as commuters from NJ, Long Island, etc.
return home for the weekend: </p>
<p><img src="images/trafficvol1760_1012.png" alt="" width="800" height="460" /></p>
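<p>One way to get a weekday breakdown like this is to group readings by the day name of their timestamp. The data below is invented for the example (one reading per day for a single week), and the column names are assumptions.</p>
<pre>
```python
import pandas as pd

# Hypothetical readings for one station; timestamps and counts are illustrative.
readings = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2018-06-04 16:00", "2018-06-05 16:00", "2018-06-06 16:00",
        "2018-06-07 16:00", "2018-06-08 16:00", "2018-06-09 16:00",
        "2018-06-10 16:00",
    ]),
    "entries": [5200, 5400, 5500, 5900, 5100, 2100, 1800],
})

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Sum entries per weekday, then put the days in calendar order.
by_day = (
    readings.groupby(readings["timestamp"].dt.day_name())["entries"]
    .sum()
    .reindex(WEEKDAYS)
)

print(by_day)
print(by_day.idxmax())
```
</pre>
<p>The <code>reindex</code> call is there only because grouping by name would otherwise sort the days alphabetically.</p>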
<p> Finally, we look at the distribution over the time of day so that NYPD officers can be deployed appropriately.
Traffic wanes from 8pm to 4am, then builds through the day, dipping slightly around lunch
before rallying with the evening rush hour:
</p>
<p><img src="images/trafficvol1676_1006.png" alt="" width="800" height="480" /></p>
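<p>The time-of-day profile can be approximated by averaging readings per 4-hour window. In this sketch the values are invented, and each timestamp is assumed to mark the start of its window, which may not match how the real feed labels its audit periods.</p>
<pre>
```python
import pandas as pd

# Hypothetical 4-hour readings over two days; the values are illustrative only.
# Assumption: each timestamp marks the START of its 4-hour window.
readings = pd.DataFrame({
    "timestamp": pd.date_range("2018-06-07 00:00", periods=12, freq="4h"),
    "entries": [300, 150, 2400, 2100, 3600, 1200,
                280, 160, 2500, 2000, 3800, 1100],
})

# Average entries for each window start hour (0, 4, 8, 12, 16, 20).
by_hour = readings.groupby(readings["timestamp"].dt.hour)["entries"].mean()
peak_start = by_hour.idxmax()

print(by_hour)
print(f"Peak window starts at {peak_start}:00")
```
</pre>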
<p>So far we've determined that if an attack were timed for peak traffic in NYC, it would occur at Grand Central on a Thursday during the evening rush hour.
I'm not quite satisfied with this result. Grand Central is huge, and it would still be a major undertaking for a police force
to comb through all of it. We need to narrow this down further...
</p>
<p>Breaking down the data into individual Grand Central turnstiles reveals an unusual result. Below is a heat map where red shading represents heavier traffic:</p>
<p><img src="images/trafficvol1610_990.png" alt="" width="800" height="492" /></p>
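<p>The grid behind a heat map like this is a pivot of turnstiles against time windows. The sketch below builds such a grid from invented numbers. The turnstile labels (<code>T1</code> to <code>T3</code>) and column names are placeholders, not real MTA identifiers.</p>
<pre>
```python
import pandas as pd

# Hypothetical per-turnstile readings; labels and counts are illustrative.
readings = pd.DataFrame({
    "turnstile": ["T1", "T1", "T2", "T2", "T3", "T3"],
    "hour": [8, 16, 8, 16, 8, 16],
    "entries": [400, 650, 2200, 4100, 380, 700],
})

# Rows are turnstiles, columns are time windows, cells are total entries.
grid = readings.pivot_table(
    index="turnstile", columns="hour", values="entries", aggfunc="sum"
)

# Share of all traffic handled by each turnstile.
share = grid.sum(axis=1) / grid.to_numpy().sum()

print(grid)
print(share.round(2))
```
</pre>
<p>A grid in this shape can be passed to <code>seaborn.heatmap(grid, cmap="Reds")</code> to draw the shaded view. The per-turnstile share makes the imbalance explicit as a number as well as a color.</p>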
<p>It seems one turnstile gets an inordinate amount of traffic in Grand Central. If I were to guess, this is the turnstile adjacent to the
escalators from the subway into Grand Central. This also makes it clear that the current layout poses asymmetric risks.
It's in the best interest of commuters, the NYPD, the MTA, and New Yorkers to redistribute traffic flow in Grand Central. This would improve safety
and allow commuters to travel more efficiently. An immediate step, perhaps, would be to post additional officers there until the imbalance is resolved.
</p>
<p></p>
<p>Associated presentation slides:</p>
<p><embed src="https://drive.google.com/viewerng/viewer?embedded=true&url=https://clarencestephen.github.io/projects/MTA.pdf" align = "middle" width="1000" height="750"></p>
<p>And if you'd like to dig deeper into the code:</p>
<p><a href ="https://github.com/clarencestephen/MTA-Traffic-Analysis" style = "color:blue">MTA Traffic Analysis - GitHub Repository</a></p>
<p>Coming soon: #FinancialSingularity</p>
</div>
<footer>
<div class="fb-comments" data-href="http://cognosis.solutions/MTA.html" data-width="750" data-numposts="5"></div>
</footer>
</article>
</div>
<!-- Footer -->
<section id="footer">
<ul class="icons">
<li><a href="https://twitter.com/clarencestephen" class="fa-twitter"><span class="label">Twitter</span></a></li>
<li><a href="#" class="fa-rss"><span class="label">RSS</span></a></li>
<li><a href="mailto: [email protected]" class="fa-envelope"><span class="label">Email</span></a></li>
</ul>
<p class="copyright">© Cognosis. Images: <a href="http://unsplash.com" target="_blank">Unsplash</a>.</p>
</section>
</div>
<!-- Scripts -->
<script src="assets/js/jquery.min.js"></script>
<script src="assets/js/skel.min.js"></script>
<script src="assets/js/util.js"></script>
<!--[if lte IE 8]><script src="assets/js/ie/respond.min.js"></script><![endif]-->
<script src="assets/js/main.js"></script>
</body>
</html>