Validate flow estimates for walking and cycling #10
Note that Brutus (#11) analyses flows between 11 types of destinations: own home, own work, other residential/visit place, work related visit, own school, “kiss&ride”, day care, shopping, restaurant, sport/culture/other free time place. Our layers cover all of those, which is good. |
@Robinlovelace New, usable data for Kathmandu in Those data should be able to be plugged straight in to |
Hey @mpadge that looks great to me! Many thanks, looks awesome. I plan to set up a traffic flow validation repo in ITSLeeds that we can use for multi-modal flow estimate validation using data from
|
Update for @mpadge: here's a place where I've created a 'visualisation challenge': https://github.com/ITSLeeds/trafficEstimatr |
It's private ... |
Yes - feel free to take code from there to validate the flows. Or is it that you cannot see the repo? I thought you were in the organization. |
I can't see it 🙈 |
Apologies, my bad! |
Just added you: https://github.com/ITSLeeds/trafficEstimatr/invitations |
🚶♂️ 🚲 🚀 |
Calibration of bicycle estimates can be done via
New York could then be used as a benchmark city for current plan:
A good additional bike city might be Guadalajara, because at least that is somewhat beyond the Global North. |
@Robinlovelace Very first try at moveability versus actual pedestrian counts across New York City, noting that moveability is not intended to model such things at all: Relationship is definitely positive and highly significant (p = 0.002). |
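A minimal sketch of how a relationship like the one above could be tested, using synthetic data only. The variables `moveability` and `ped_counts` are hypothetical stand-ins, and none of the numbers relate to the actual New York City results.

```r
set.seed (1)
n <- 50
# hypothetical moveability score per counting station (synthetic values)
moveability <- runif (n)
# synthetic pedestrian counts: positive dependence on moveability plus noise
ped_counts <- 100 + 500 * moveability + rnorm (n, sd = 100)

# simple linear regression of counts on moveability
mod <- lm (ped_counts ~ moveability)
coefs <- summary (mod)$coefficients

# sign and significance of the relationship, plus variance explained
slope <- coefs ["moveability", "Estimate"]
p_value <- coefs ["moveability", "Pr(>|t|)"]
r2 <- summary (mod)$r.squared
c (slope = slope, p = p_value, r2 = r2)
```

A positive `slope` with a small `p_value` corresponds to the "positive and highly significant" statement; `r2` indicates how much of the variation in counts the single predictor explains.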
R2 values: Link to paper: https://arxiv.org/ftp/arxiv/papers/1803/1803.10500.pdf |
Great reference - thanks for the link! The result above is the crudest possible, with no detail whatsoever, but nevertheless provides most of the ground work for dividing into the usual layers and adding detail. That should all be (mostly) done tomorrow. |
Awesome progress sir. Suggest we catch up Thursday morning on all this. |
@Robinlovelace flows are coming together, but check this intermediate result: And one of the coolest things is of course that we can find out who is behind this work, which is largely people associated with OSM Ghana, like Etse Lossou, Sammy Hawkrad, and even Essuanlive who is on github - hi and thanks @essuanlive and others! Keep up the great OSM work - it's being used for this and other World Health Organization projects! |
Can you provide an update on the status and planned next steps on this please @mpadge ? |
Thanks for the nudge - this is now all subsumed within the calibration work. This week should see quite some more being added to the manuscript, and I'll then also ensure that I detail next steps here, which will essentially involve procedures for generalising the specific calibration steps applied there to other areas (Accra, Kathmandu). Coming very soon - just got centrality implemented this morning, exactly agreeing with |
Sounds amazing @mpadge, cheers for the update, pls nudge me when you want feedback 👍 |
Shall do! There'll be lots of nudges as soon as it's in "production" mode, which should be very soon. We'll need to discuss a lot of aspects of how to generalise from the calibration results |
Apologies for nudge @mpadge but any updates on this from weekend of re-running the code? |
arghhh ... don't apologise, it's absolutely high time that i report ... all ran well, but some results were a bit odd. I dug in to it today, and there are some deep internal issues with |
Maybe run the non parallel version on a subset of the data? |
and finally @Robinlovelace, the ridiculously long-awaited near-final version of what we've been waiting for all this time ... not exactly

Note: I also improved the statistical robustness of the variable selection procedures, which has slightly reduced the final R2 value from my claimed 0.925 as unadjusted value to 0.885 in adjusted form. Whatever, that's still spectacularly high.

dat <- readRDS ("final-model.Rds")
x <- dat$flowvars
mod <- lm (dat$p ~ x)
summary (mod)
#>
#> Call:
#> lm(formula = dat$p ~ x)
#>
#> Residuals:
#>     Min      1Q  Median      3Q     Max
#> -4331.2  -774.9   -95.3   720.5  3438.5
#>
#> Coefficients:
#>               Estimate Std. Error t value Pr(>|t|)
#> (Intercept)  1.282e+03  3.212e+02   3.992 0.000123 ***
#> xsub-dis     1.092e+01  2.550e+00   4.283 4.15e-05 ***
#> xsub-tra     8.974e-01  1.087e-01   8.257 5.33e-13 ***
#> xent-dis     2.347e+04  1.798e+03  13.055  < 2e-16 ***
#> xsub-res    -7.732e-01  1.356e-01  -5.701 1.15e-07 ***
#> xsub-cen     6.607e-01  8.660e-02   7.630 1.23e-11 ***
#> xedu-ent     9.193e+02  1.480e+02   6.211 1.13e-08 ***
#> xtra-sus    -1.279e+02  2.475e+01  -5.166 1.18e-06 ***
#> xsub-ent    -9.099e-02  1.959e-02  -4.644 1.01e-05 ***
#> xedu-dis    -5.448e+04  1.479e+04  -3.684 0.000368 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 1327 on 103 degrees of freedom
#> Multiple R-squared: 0.8939, Adjusted R-squared: 0.8846
#> F-statistic: 96.41 on 9 and 103 DF, p-value: < 2.2e-16
# the terms in the model are the following categories:
categories <- c ("subway", "centrality", "residential", "transportation",
"sustenance", "entertainment", "education", "healthcare")
cbind (substring (categories, 1, 3), categories)
#>            categories
#> [1,] "sub" "subway"
#> [2,] "cen" "centrality"
#> [3,] "res" "residential"
#> [4,] "tra" "transportation"
#> [5,] "sus" "sustenance"
#> [6,] "ent" "entertainment"
#> [7,] "edu" "education"
#> [8,] "hea" "healthcare"
dat <- data.frame (model = fitted (mod), observed = dat$p)
r2 <- summary (mod)$adj.r.squared
library (ggplot2)
theme_set (theme_minimal ())
ggplot (dat, aes (x = model, y = observed)) +
    geom_point () +
    geom_smooth (method = "lm") +
    ggtitle (paste0 ("R2 = ", signif (r2, 3)))

Created on 2019-11-07 by the reprex package (v0.3.0)

One really interesting -- and hopefully very important -- thing that emerges from this is the identification of significant layers. For better readability, this is the above table via kable, and ordered in decreasing significance:

dat <- readRDS ("final-model.Rds")
x <- dat$flowvars
mod <- summary (lm (dat$p ~ x))
coeffs <- mod$coefficients [order (mod$coefficients [, 4]), ]
# strip only the leading "x" that lm() prepends, leaving "(Intercept)" intact
rownames (coeffs) <- sub ("^x", "", rownames (coeffs))
knitr::kable (coeffs)
Created on 2019-11-07 by the reprex package (v0.3.0)

And the three most significant layers are:
|
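As a quick consistency check, the adjusted R-squared in the summary output above can be reproduced from the printed multiple R-squared and degrees of freedom using the standard adjustment formula. This is only a sketch from the rounded values shown in the output, not from the model object itself; the variable names are illustrative.

```r
# values copied from the printed summary output
r2 <- 0.8939      # Multiple R-squared
df_resid <- 103   # residual degrees of freedom
p <- 9            # number of predictors, excluding the intercept

# number of observations implied by the degrees of freedom
n <- df_resid + p + 1

# standard adjustment: 1 - (1 - R2) * (n - 1) / (n - p - 1)
adj_r2 <- 1 - (1 - r2) * (n - 1) / df_resid
round (adj_r2, 4)
#> [1] 0.8846
```

With 113 observations and 9 predictors, the adjustment itself only accounts for the step from 0.8939 to 0.8846, so the remaining drop from the earlier 0.925 presumably comes from the revised variable selection.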
Spectacular work @mpadge, I plan to give you a call in a bit, sorry my side has been bogged down... |
For each mode.