
Traffic Prophet Development Plan #10

cczhu opened this issue Oct 23, 2019

Now that we've successfully run TEPs-I and mapped and documented its processes and functionality, we need to plan out the structure of bdit_traffic_prophet, the Python successor to TEPs-I. This plan is a work in progress and will be periodically updated to reflect discussions between @aharpalaniTO and me.

Traffic Prophet

Countfill

Pipeline

  • Fit a FARIMA model to counts with sufficient data, varying model weights using the iterative technique described in Ganji et al. 2001 (a rough sketch of the fitting step follows this list).
  • Fill data gaps (potentially extrapolate forward in time) for counts with sufficient data using the best-fit FARIMA.
  • Manually select spatial associations between "aggregated" counts (with more data) and "key" counts (with less data).
  • Fill data gaps and predict hourly traffic counts for key locations using Ganji et al. Eqn. 9.
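
As a very rough sketch of the FARIMA fitting and gap-filling steps above, one option is to fractionally difference the count series and fit an ARMA model to the result with statsmodels. Everything below (the synthetic counts, the differencing order `d`, and the ARMA order) is a placeholder, not a value from Ganji et al.:

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

def frac_diff(x, d, n_weights=100):
    """Fractionally difference a series using truncated binomial weights."""
    w = [1.0]
    for k in range(1, n_weights):
        w.append(-w[-1] * (d - k + 1) / k)
    w = np.array(w)
    xv = np.asarray(x, dtype=float)
    out = np.full_like(xv, np.nan)
    # Each differenced value is a weighted sum of the preceding n_weights points.
    for t in range(n_weights - 1, len(xv)):
        out[t] = w @ xv[t - n_weights + 1:t + 1][::-1]
    return pd.Series(out, index=x.index)

# Synthetic hourly counts standing in for a real permanent count.
idx = pd.date_range("2019-01-01", periods=24 * 60, freq="H")
counts = pd.Series(
    200. + 50. * np.sin(np.arange(len(idx)) * 2. * np.pi / 24.)
    + np.random.default_rng(0).normal(0., 10., len(idx)), index=idx)

# Illustrative hyperparameters; in practice d, p and q would be estimated iteratively.
d, p, q = 0.3, 2, 1
diffed = frac_diff(counts, d).dropna()
model = ARIMA(diffed, order=(p, 0, q)).fit()
fitted = model.fittedvalues          # in-sample fit, usable for gap filling
forecast = model.forecast(steps=24)  # extrapolation forward in time
```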

Countmatch

Pipeline

  • ETL process to load data from 15-minute count zip files or from Postgres. Separate count data into blocks by centreline ID and year, and identify whether each block is a permanent count (covering more than 3/4 of all days of the year, with all 12 months represented) or a short-term count.
  • Determine candidate MADT, AADT, day-of-week-of-month ADT (DoMADT), and scaling factors for permanent counts (see the sketch after this list). Determine year-to-year growth factors for permanent counts.
  • Determine nearest permanent count neighbours for each short-term count.
  • Estimate MADT, AADT, DoMADT, etc. for short-term counts using their nearest permanent count.
  • Determine which permanent count's DoMADT pattern most closely matches each short-term count, and use it to re-estimate AADT for short-term counts.
  • Validate by predicting AADTs at permanent stations using other permanent stations.
  • Consider unifying directional data in this module rather than after the next two.
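
A sketch of what the permanent-count classification and MADT/AADT calculation might look like, assuming daily volumes in a DataFrame with columns `centreline_id`, `date`, and `daily_count` (the column names and the simple unweighted AADT are placeholders for the real schema and method):

```python
import pandas as pd

def classify_and_average(daily, year):
    """Flag permanent counts and compute per-count MADT and AADT for one year."""
    df = daily[daily["date"].dt.year == year].copy()
    df["month"] = df["date"].dt.month
    grp = df.groupby("centreline_id")

    # Permanent count: > 3/4 of days in the year present and all 12 months represented.
    n_days = grp["date"].nunique()
    n_months = grp["month"].nunique()
    days_in_year = pd.Timestamp(year, 12, 31).dayofyear
    is_permanent = (n_days > 0.75 * days_in_year) & (n_months == 12)

    madt = (df.groupby(["centreline_id", "month"])["daily_count"]
              .mean().rename("MADT"))
    aadt = grp["daily_count"].mean().rename("AADT")
    return is_permanent, madt, aadt
```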

Issues

  • We may have to make significant changes to the ETL code to read from Postgres rather than Arman's zip files.
  • The number of permanent count locations is very small, and biased toward locations like the Gardiner Expressway. We can boost the number of permanent counts by augmenting our data using PECOUNT, or introducing additional data from Miovision or SCOOT.
  • Arman's code splits permanent and short-term counts into individual years. It is not obvious if this is necessary to compare DoMADT patterns, or if it was meant as a RAM-saving measure.
  • How growth factors are calculated is contentious, and we may need to explore alternative methods.
  • Investigate tuning permanent count and (KDTree-based) nearest neighbour search criteria as hyperparameters (see the nearest-neighbour sketch after this list).
  • Investigate whether we can significantly relax the criteria for including data in validation, and isolate a portion of this data as a holdout set. We technically don't need to consider only permanent stations; any station with sufficient data across multiple years would do.
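
For the KDTree-based nearest-neighbour search, scipy's `cKDTree` would be a natural fit; the coordinates below are made-up values assumed to be in a projected (metric) CRS, and `k` is one of the hyperparameters we could tune:

```python
import numpy as np
from scipy.spatial import cKDTree

# Projected coordinates of permanent and short-term count locations (placeholders).
xy_permanent = np.array([[630000., 4835000.], [632500., 4836200.], [628900., 4833100.]])
xy_short = np.array([[631000., 4835500.], [629500., 4834000.]])

tree = cKDTree(xy_permanent)
# k nearest permanent counts for each short-term count, with distances in metres.
dist, idx = tree.query(xy_short, k=2)
```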

RoadKrige

Pipeline

  • Prepare land use and road class data for arterial roads.
  • Associate AADT estimates with arterial road segments. Calculate distances between road segments and estimate locations.
  • Estimate variogram for kriging using OLS.
  • Iterative kriging regression to estimate AADT on arterials (see the sketch after this list). Could use PyKrige or SciKit-GStat.
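
A sketch of the kriging step using PyKrige's `OrdinaryKriging`; the coordinates, AADT values, log transform, and variogram parameters below are all placeholders (in the pipeline the OLS-estimated variogram would be supplied instead):

```python
import numpy as np
from pykrige.ok import OrdinaryKriging

# Known AADTs (e.g. Countmatch outputs) at projected arterial segment coordinates.
x_known = np.array([630000., 632500., 628900., 631800.])
y_known = np.array([4835000., 4836200., 4833100., 4834700.])
aadt_known = np.array([21000., 18000., 45000., 9000.])

ok = OrdinaryKriging(
    x_known, y_known, np.log10(aadt_known),
    variogram_model="spherical",
    # Stand-in parameters; these would come from the OLS variogram estimate.
    variogram_parameters={"sill": 0.1, "range": 2000., "nugget": 0.01})

# Predict at arterial segments that lack AADT estimates.
x_new = np.array([629700., 631200.])
y_new = np.array([4834500., 4835800.])
log_aadt_pred, krige_var = ok.execute("points", x_new, y_new)
aadt_pred = 10 ** log_aadt_pred
```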

Issues

  • Much of the input data to KCOUNT was manually prepared using code outside of TEPs. We'll need to reproduce these inputs ourselves, which might require disaggregating or interpolating neighbourhood- or census-tract-level data.

LocalSVR

Pipeline

  • Snap Countmatch outputs and land use and road data onto a grid.
  • Train a support vector regression model to determine AADTs on local roads.
  • Use the trained model to predict AADTs on local roads (see the sketch after this list).
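
A minimal sketch of the SVR training and prediction steps using scikit-learn; the feature matrix here is random stand-in data, whereas the real independent variables would be the gridded land use and road features:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Stand-in features (e.g. land use fractions, road class dummies) and log AADT
# targets at grid cells that already have AADT estimates.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.5 + 0.1 * (X @ rng.normal(size=5)) + rng.normal(0., 0.05, 200)

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10., epsilon=0.01))
model.fit(X, y)

# Predict log AADT on local-road grid cells that lack estimates.
X_local = rng.normal(size=(50, 5))
log_aadt_local = model.predict(X_local)
```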

Issues

  • We don't currently have the information needed to reproduce the grid that independent variables are snapped to.

Postprocessing

Pipeline

  • Unify RoadKrige and LocalSVR outputs into a single file of AADT estimates for all centreline IDs (see the sketch after this list).
  • Generate a version of this data and zip it together with other data (already prepared by Arman) for export to TEPs-II.
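
The unification step could be little more than a pandas concatenation keyed on centreline ID; the file names and the rule of preferring RoadKrige estimates on conflicts are placeholders:

```python
import pandas as pd

# Hypothetical module outputs, each with `centreline_id` and `aadt_est` columns.
krige = pd.read_csv("roadkrige_aadt.csv").assign(source="roadkrige")
svr = pd.read_csv("localsvr_aadt.csv").assign(source="localsvr")

# Prefer the RoadKrige (arterial) estimate when a centreline ID appears in both.
combined = (pd.concat([krige, svr])
              .drop_duplicates(subset="centreline_id", keep="first")
              .sort_values("centreline_id"))
combined.to_csv("traffic_prophet_aadt.csv", index=False)
```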