GTFS-ServiceChanges vs extending GTFS-TripUpdates #113

Closed
LeoFrachet opened this issue Oct 22, 2018 · 59 comments
Labels
GTFS Realtime Issues and Pull Requests that focus on GTFS Realtime Status: Stale Issues and Pull Requests that have remained inactive for 30 calendar days or more.

Comments

@LeoFrachet
Contributor

LeoFrachet commented Oct 22, 2018

Hi GTFS Community,

A decision has to be made regarding service changes, and everybody's feedback is much needed.

Currently, many things cannot be done in real-time, including:

  • Case A: Updating time of scheduled arrival/departure
  • Case B: Updating trip_headsign, trip_short_name, wheelchair_accessible or bikes_allowed
  • Case C: Adding a new stop, a new shape, a new route or a new transfer.
  • Case D: Having inactive trips in the GTFS that we trigger when needed.

What we currently have on the table are:

  • Existing GTFS-TripUpdates (allowing to provide real-time update for arrival & departure).
  • A proposal to extend GTFS-TripUpdates (Add prediction certainty #111 , including UPDATE_SCHEDULE) by @TransitApp, handling case A.
  • A proposal to define a new feed, called GTFS-ServiceChanges, using the CSV GTFS structure, handling cases A, B, C & D (bit.ly/gtfs-service-changes).

So what do we do?

Option 0: Use only the existing formats. Update static GTFS more often.

This will require an overhaul of GTFS consumers' pipelines so they can process GTFS every hour or more often.

Option 1: We beef up GTFS-TripUpdates to support the cases A, B, C & D (Cc @Stefan)

We already have a proposal for case A (#111 ), we could easily extend the TripUpdate object to handle case B, but no proposal so far to handle case C.

Advantages:

  • Small change to existing pipeline for case A & B.

Disadvantages:

  • Handling cases C & D forces us to redefine almost every GTFS concept slightly differently, therefore duplicating the spec.

Option 2: We keep GTFS-TripUpdates for real-time update only, and use GTFS-ServiceChanges to change schedule data (aka cases A, B, C & D)

This is what I had in mind when I drafted GTFS-ServiceChanges, and this is the current state of the GTFS-ServiceChanges proposal.

Advantage:

  • It makes a clear distinction between schedule updates and real-time updates
  • It keeps TripUpdates implementations as they are
  • As branding, it pushes agencies to adopt this new feed as a new feature for their riders
  • It doesn’t increase the size of the TripUpdates feed (but that may not be an issue @Stefan @gcamp)

Disadvantage

  • Some overlap with GTFS-TripUpdates
  • It’s a distinct feed, which forces producers to open a new endpoint, with a new URL

Option 3: Middle ground proposed by Transit (Cc @gcamp & @juanborre)

  • Extend GTFS-TripUpdate for case A
  • Use GTFS-ServiceChanges for cases C & D.

=> What about case B (trip headsigns, short names…)? Should we extend TripUpdate also for this or not?

(Link to the GTFS-ServiceChanges proposal: bit.ly/gtfs-service-changes)

@LeoFrachet LeoFrachet changed the title GTFS-ServiceChanges vs prediction_type=UPDATED_SCHEDULE GTFS-ServiceChanges vs extending GTFS-TripUpdates Oct 22, 2018
@juanborre
Contributor

Hi GTFS Community,

What agencies need most often is to update the schedules or the predictions of a trip and to choose how to display those times to the user: either saying it is real time and the vehicle is being tracked, or saying it is an updated schedule and the vehicle is not being tracked.

That information can be provided easily in the TripUpdates feed. The ServiceChanges can be used less often to convey bigger changes in the data.

Changing headsigns or other trip specific data would add too many things in the TripUpdates feed. Although it is conceivable, it can be better to split that to a different feed that aims at bigger modifications of the data.

It is true that all three options are valid, but option 3 solves the problem of updating schedules and deciding which predictions are real time or not relatively quickly, while being scalable and easier to implement for all the agencies/AVL that already have a TripUpdates feed.

@abyrd

abyrd commented Oct 26, 2018

I approach this proposal with caution, because it will increase the complexity of the GTFS specification, creating multiple ways to define the same objects, and therefore demand an increase in the complexity of software consuming the GTFS format. I acknowledge the point of view that this is necessary to provide up-to-date information to passengers. But time should be taken to consider the impact carefully, because once this change happens and some people adopt it there will be no going back to the simpler world where a single file contains a complete snapshot of a transit system's schedules, stops, etc.

The proposal linked to above (http://bit.ly/gtfs-service-changes) begins with the following statements:

  1. GTFS cannot be processed more than once an hour, hardly more than once a day.
  2. GTFS-RT feeds cannot contain a huge amount of information since they are fetched every 30s (and also they do not allow all kinds of service changes).

GTFS Service Changes are then presented as a solution to these issues, but I do not see them as self-evident axioms.

  1. Is there a fundamental reason why GTFS cannot be processed multiple times in a day? For many applications, and for many feeds, it takes at most a few minutes to completely handle a feed. I acknowledge that for very large feeds like all of the Netherlands or the New York City region, fully processing and integrating a whole new data set can be more on the hour-long time scale. You probably wouldn't want to re-publish feeds every hour of every day, but for the occasional schedule revision I don't see an inherent problem with just re-publishing a new feed.

  2. As I posted on the service changes proposal, this second statement assumes that a polling (pull) method is used to receive the GTFS-RT messages. In bigger systems I believe only a streaming (differential push) method makes sense. This is done in the Netherlands, Norway, and Finland and consumed by OpenTripPlanner. A very large number of service patches could be pushed out using such a system without encountering any bandwidth constraints.
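To illustrate the difference between the two delivery models, here is a minimal consumer-side sketch in Python. It assumes entities are keyed by id and carry an `is_deleted` flag, as GTFS-realtime FeedEntity does; the `Entity` and `RealtimeState` classes and the dictionary payloads are illustrative stand-ins, not the protobuf bindings.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Entity:
    """Illustrative stand-in for a GTFS-realtime FeedEntity."""
    id: str
    payload: Optional[dict] = None  # e.g. a TripUpdate rendered as a dict
    is_deleted: bool = False        # mirrors FeedEntity.is_deleted


@dataclass
class RealtimeState:
    """Consumer-side view of all currently valid entities, keyed by id."""
    entities: Dict[str, Entity] = field(default_factory=dict)

    def apply_full_dataset(self, batch: List[Entity]) -> None:
        # Polling model: every fetch replaces everything known so far,
        # so the whole dataset travels over the wire each time.
        self.entities = {e.id: e for e in batch if not e.is_deleted}

    def apply_differential(self, batch: List[Entity]) -> None:
        # Push model: only changed entities arrive; the rest stay untouched.
        for e in batch:
            if e.is_deleted:
                self.entities.pop(e.id, None)
            else:
                self.entities[e.id] = e


state = RealtimeState()
state.apply_full_dataset([Entity("t1", {"delay": 60}), Entity("t2", {"delay": 0})])
state.apply_differential([Entity("t2", is_deleted=True), Entity("t3", {"delay": 120})])
print(sorted(state.entities))  # ['t1', 't3']
```

With the differential path, the cost of a message depends on how much changed rather than on the total number of patches in force, which is the point being made about bandwidth.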

I don't rule out the possibility that a third time scale is needed between the static and real-time feeds, but it seems questionable. We should be absolutely sure that effective operation can't be achieved with two time scales before sacrificing simplicity.

@abyrd

abyrd commented Oct 26, 2018

I initially lean toward Option 1: revise GTFS-RT TripUpdates to support all cases A, B, C, and D.

You list as disadvantages the fact that handling cases C & D "forces us to redefine almost every GTFS concepts slightly differently, therefore duplicating the spec."

Can you clarify this? Why would all GTFS concepts need to be redefined, and why would the new definitions be different?

It seems quite healthy to me to continue work on GTFS-RT to make it capable of a wide range of revisions to the scheduled data.

@barbeau
Collaborator

barbeau commented Oct 30, 2018

What I like about Option 2 (as @LeoFrachet says) is that it's a clear separation of functionality between the feeds, where TripUpdates would be used only for real-time predictions, and ServiceChanges would be used for any static modifications to the network. There are backwards compatibility issues for consumers that aren't aware of new enums when you start piling new functionality on top of TripUpdates. Separating these network/schedule updates into a new channel with a clear purpose ensures that unaware consumers don't start interpreting schedule updates as real-time predictions.

Additionally, if consumers want to consume just VehiclePositions and Service Changes and generate their own predictions, they can do this without needing to parse the TripUpdates as well.
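The backwards-compatibility concern can be sketched as follows. The four known values are the TripDescriptor ScheduleRelationship values as of this discussion; the value 99 standing for an "updated schedule" is purely hypothetical, and `classify_trip_update` is an illustrative helper, not part of any library. Depending on the protobuf runtime, an unrecognized enum value may even surface as the field's default (SCHEDULED), which is exactly the misinterpretation described above.

```python
# TripDescriptor.ScheduleRelationship values known to an existing consumer.
KNOWN_SCHEDULE_RELATIONSHIPS = {
    0: "SCHEDULED",
    1: "ADDED",
    2: "UNSCHEDULED",
    3: "CANCELED",
}

# Hypothetical number for a future "this is an updated schedule" value.
HYPOTHETICAL_UPDATED_SCHEDULE = 99


def classify_trip_update(schedule_relationship: int) -> str:
    """Decide how a cautious consumer treats a TripUpdate."""
    name = KNOWN_SCHEDULE_RELATIONSHIPS.get(schedule_relationship)
    if name is None:
        # Unknown value: do not present it to riders as a real-time prediction.
        return "ignore"
    if name == "CANCELED":
        return "cancel"
    return "prediction"


print(classify_trip_update(0))                              # prediction
print(classify_trip_update(3))                              # cancel
print(classify_trip_update(HYPOTHETICAL_UPDATED_SCHEDULE))  # ignore
```

A consumer that skips the `None` check would fall through to "prediction" for every new value, which is the failure mode a separate feed avoids by construction.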

@abyrd

abyrd commented Oct 31, 2018 via email

@LeoFrachet LeoFrachet assigned LeoFrachet and unassigned LeoFrachet Oct 31, 2018
@harringtonp

Having read both this and proposal #111 the following questions arise:

  1. How often would an operator need to update a schedule ?
  2. How long in advance are they aware of the new schedule ?

The answer to both these questions is presumably along the lines of "how long is a piece of string", but if the general feeling were that the answer to 1) is "Not too often" and to 2) "Usually more than a day", then would the existing mechanism (as mentioned above) of just publishing a new schedule be best?

If this works in over 90% of cases, is there really a need to introduce a new layer? I've seen cases where the schedule is updated on several consecutive nights, and the new layer may well be meant to cater for cases like this.

As regards user experience, I think if you can show reliably where a bus or train is on a map at a given time AND the user can see the scheduled or predicted times of arrival for the preceding and following vehicle stop, they will form an opinion themselves as to when the vehicle will arrive at their stop regardless of the schedule or predictions. The point I am getting at is if extra complexity is introduced, confidence would be needed that it can improve the user experience.

@abyrd

abyrd commented Nov 1, 2018

"Changing headsigns or other trip specific data would add too many things in the TripUpdates feed. Although it is conceivable, it can be better to split that to a different feed that aims at bigger modifications of the data."

I acknowledge that this opinion is shared by several people, but before making a spec change it will be important to justify that opinion. When you say this is "too many things" what is the threshold for "too many"? What specific technical or conceptual restrictions make it excessive to include this information in real-time updates?

Why would it be considered excessive to augment an existing spec, but not excessive to define another separate spec containing the same information, requiring significant software development and additional complexity in every GTFS consumer and data pipeline?

@abyrd

abyrd commented Nov 1, 2018

As someone working on producing and consuming large GTFS static and realtime feeds, @skinkie I think your opinion would be valuable here. Do you find it advantageous to add another time-scale of patches with a new format between GTFS-static and GTFS-RT? I see that option 1 above says CC @Stefan and I wonder if that is supposed to be you, because the Github user by that name seems inactive.

@skinkie
Contributor

skinkie commented Nov 1, 2018

I discussed this with Leo and did some software development for exchanging a day's worth of GTFS data inside tripUpdates. In my opinion tripUpdates should be extended with functionality for adding all possible GTFS static fields, as opposed to defining a new format. Because no matter what the outcome is with ServiceChanges, there are cases where we must update some parts of existing trips in realtime, which is currently not supported.

But in general my opinion is: this can be exchanged with SIRI-PT, so what justifies making a GTFS-RT alternative?

@abyrd

abyrd commented Nov 1, 2018

@skinkie my sense is: many people are bothered by the huge catch-all nature of the Transmodel/SIRI ontology and the verbosity of its fetch/subscription mechanism, and GTFS-RT is a chance to complete a more compact spec that covers the 95% of common passenger information cases.

SIRI is more of an alternative to using GTFS-RT at all. As long as people see benefit to staying in the GTFS-static + GTFS-RT world, I would say it makes sense to complete the functionality available in that pair to cover some high percentage of common use cases.

@LeoFrachet
Contributor Author

LeoFrachet commented Nov 1, 2018

I'm thrilled to see this conversation moving forward! Thanks @harringtonp, @skinkie & @abyrd for your contributions!

A few answers here.

"How often would an operator need to update a schedule?"

For years now we have had producers updating their schedules every day (e.g. WMATA in US-DC). We started to talk about ServiceChanges when some big producers started to talk about updating their (big) GTFS every hour. From what I've heard from the GTFS consumer side of the industry, nobody is ready for that. But I fully agree that "The existing pipeline cannot take it" does not imply "We need a new format". It's an open question. What I want to stress is that the industry is moving from a seasonal update of their GTFS (i.e. 4 times per year) to an every-day and even every-hour update (which is great!), but how to address that is an unresolved question.

"Do you find it advantageous to add another time-scale of patches with a new format between GTFS-static and GTFS-RT?"

@abyrd I assume you haven't actually read the GTFS-ServiceChanges proposal, which is fair. The goal of ServiceChanges is to stick to the CSV GTFS format. It works by declaring what type of change you want to make (deletion, addition, modification), then selecting a row (table name + id), then if needed providing the field names and values you want to change. So the whole goal is to not add another format, but to stick to CSV GTFS. The only exception is that you're allowed to specify a day for your changes, because otherwise editing the service_id by hand is a huge mess. So GTFS-ServiceChanges aims to be a kind of GTFS-delta if you want. Not another format.
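A minimal sketch of that delta idea in Python, assuming the CSV tables have already been loaded into dictionaries keyed by row id. The `apply_change` function, its argument names, and the in-memory layout are illustrative assumptions and not the wire format of the proposal; only the three change types and the "table name + id + fields" selection come from the description above.

```python
import copy
from typing import Dict, Optional

Row = Dict[str, str]
Tables = Dict[str, Dict[str, Row]]  # table name -> row id -> row


def apply_change(tables: Tables, op: str, table: str, row_id: str,
                 fields: Optional[Row] = None) -> Tables:
    """Return a patched copy of the tables; the baseline is left untouched."""
    patched = copy.deepcopy(tables)
    rows = patched.setdefault(table, {})
    if op == "deletion":
        rows.pop(row_id, None)
    elif op == "addition":
        if row_id in rows:
            raise ValueError(f"{table}/{row_id} already exists")
        rows[row_id] = dict(fields or {})
    elif op == "modification":
        if row_id not in rows:
            raise KeyError(f"{table}/{row_id} not found")
        rows[row_id].update(fields or {})
    else:
        raise ValueError(f"unknown change type: {op}")
    return patched


baseline = {"trips": {"t1": {"trip_id": "t1", "trip_headsign": "Downtown"}}}
patched = apply_change(baseline, "modification", "trips", "t1",
                       {"trip_headsign": "Downtown (snow route)"})
print(patched["trips"]["t1"]["trip_headsign"])   # Downtown (snow route)
print(baseline["trips"]["t1"]["trip_headsign"])  # Downtown
```

Returning a copy keeps the published schedule available as a baseline next to the patched view, so a consumer can still tell riders which rows were changed.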

The reason why I said extending TripUpdates "forces us to redefine almost every GTFS concepts slightly differently, therefore duplicating the spec" is that, e.g., the GTFS-rt StopTimeUpdate object is pretty different from the CSV GTFS stop_time object. Its arrival value can be given either by a "delay" or by a "time", with time defined as an absolute POSIX time... which is completely different from the HH:MM:SS format used in CSV GTFS, which allows hours above 24 and which defines 01 as noon minus eleven hours.
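The gap between the two time representations can be made concrete with a small conversion sketch in Python. The function name `gtfs_time_to_posix` and the fixed UTC-4 offset in the example are assumptions for illustration; a real consumer would pass the agency's time zone (for instance a `zoneinfo.ZoneInfo`), which is where the "noon minus 12h" reference matters on days with a daylight saving change.

```python
from datetime import date, datetime, time, timedelta, timezone, tzinfo


def gtfs_time_to_posix(gtfs_time: str, service_date: date, tz: tzinfo) -> int:
    """Convert a CSV GTFS HH:MM:SS value (hours may exceed 24) to POSIX time."""
    hours, minutes, seconds = (int(part) for part in gtfs_time.split(":"))
    local_noon = datetime.combine(service_date, time(12, 0), tzinfo=tz)
    # The reference point is "noon minus 12h", computed on the absolute
    # timeline (in UTC) so that a DST shift before noon is respected.
    reference = local_noon.astimezone(timezone.utc) - timedelta(hours=12)
    moment = reference + timedelta(hours=hours, minutes=minutes, seconds=seconds)
    return int(moment.timestamp())


eastern_daylight = timezone(timedelta(hours=-4))  # fixed offset, for the example only
posix = gtfs_time_to_posix("25:30:00", date(2018, 10, 22), eastern_daylight)
print(datetime.fromtimestamp(posix, eastern_daylight).isoformat())
# 2018-10-23T01:30:00-04:00
```

The reverse direction needs the service date as extra context, since one POSIX instant can correspond to 01:30:00 on one service day or 25:30:00 on the previous one.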

Another example: the TripUpdate object doesn't contain the trip_id. Its child, the TripDescriptor, does. So if we want to add the ability to add or alter a route, should we replicate the same pattern and define a RouteUpdate containing a RouteDescriptor containing the route_id? Or should we simplify it and define a RouteUpdate which will do both? Whichever you pick, you'll define specific objects that people will have to memorize. With ServiceChanges, you have nothing new to memorize. Routes are in the routes table and contain route_id, route_short_name, etc...

====

That being said, I agree with what you guys said: in all cases, we will have a backward-compatibility issue, since there will be new data to ingest. Either because GTFS will have to be updated every hour, or because TripUpdates will have changed, or because there will be a new feed.

The decision which has to be made here is a practical decision. From a theoretical point of view, there is no problem: you can just output a new CSV GTFS as often as you want. But with today's implementations, it's not practically possible to consume such updates. Regarding expanding TripUpdates, Stefan & Guillaume both think we won't have size issues. So the decision is really an industrial decision, of whether this industry:

  • A. Is ready to change its pipelines to produce & ingest GTFS every hour, and even every 10 minutes if there is, e.g., a huge snow storm.
  • B. Says it won't ever make that change, and wants another way to get updates on the CSV GTFS:
    • B1. Either by changing the existing TripUpdates format, which might require merging data from different streams if they are hosted by different tools, but maybe not,
    • B2. Or by adding another feed, with the extra cost that it will also generate.

It would be nice if everybody gave their point of view, and then we'll do whatever the industry agrees on. @slai & @dbabramov among others.

But I agree with Andrew: there is a need which has to be addressed.

@harringtonp

Where is the GTFS service changes proposal, Leo? It's not showing up in searches for me.

@barbeau
Collaborator

barbeau commented Nov 5, 2018

@harringtonp GTFS-ServiceChanges - http://bit.ly/gtfs-service-changes

@LeoFrachet
Contributor Author

@harringtonp My bad sorry. I added it in the original doc.
@barbeau Thanks.

@barbeau
Collaborator

barbeau commented Nov 6, 2018

Wrapped up in this debate is also how we handle trip.schedule_relationship=ADDED going forward. Currently it's not well defined in the spec and it seems to have a small set of producers using it for different things. See #106 and https://groups.google.com/forum/#!topic/gtfs-realtime/W6bm2Xj3p-Q for agencies that are producing it as well as examples and explanations.

@barbeau barbeau added the GTFS Realtime Issues and Pull Requests that focus on GTFS Realtime label Nov 6, 2018
@abyrd

abyrd commented Nov 7, 2018

"Wrapped up in this debate is also how we handle trip.schedule_relationship=ADDED going forward."

There's also a connection with the UPDATED_SCHEDULE value that was proposed in PR #111 (which has been removed in the current version of the proposal). The ScheduleRelationship enum and field trip.schedule_relationship make more sense to me as a place to indicate that a TripUpdate is an updated schedule, rather than real-time information for a vehicle that is already operating.

@abyrd

abyrd commented Nov 9, 2018

@LeoFrachet I'm reading your longer response above and thinking it through. I think there are some deeper questions emerging here. There are mentions of producers updating feeds every day, every hour, or even hypothetically every 10 minutes in snowstorms. It's understandable that operations staff might be rethinking schedules on an hour-by-hour basis, and they may want to publish those changes immediately to give their customers the best information they can. But I would expect all such changes to those feeds to be at least one or two days in the future from the publication date (ideally weeks in the future).

If they are publishing updated GTFS that includes changes to the upcoming 24-48 hour period, this is problematic. The GTFS static feeds represent schedules. They communicate to riders / customers the service the operator intends (and in many cases is legally obligated) to provide in the future. The rider counts on this data for planning journeys in advance.

Anything that is changed in the near future is not a schedule or an update to a schedule. From the rider or data consumer's perspective it is an unpredictable and unexpected disruption of planned service. If I check how to make a journey to the airport tomorrow morning, but the data producer's bus breaks down overnight, the data producer should not publish new "schedules" tomorrow morning saying that they had no planned service to the airport.

The data they publish, and the data presented to the rider who re-checks the journey planning system in these circumstances, should be the same planned service that was seen the day before, with an additional layer showing that the service is severely delayed or cancelled.

I'm all for dynamically adapting service and routing in real-time, but that doesn't change the fact that the high-capacity backbone of most transit systems is composed of predictable fixed routes. It is my position that this predictable baseline can be expressed with GTFS static at least several days in advance, and everything else can be handled with streaming real-time messages.


Point 2, about "redefining GTFS concepts": this was just a misunderstanding. I interpreted "redefining concepts" to mean changing the meaning of the terms/concepts themselves in different places, e.g. Trip or Route does not mean the same thing in RT as in static. Fortunately this is not the case, and the concepts have consistent meanings. I see now what you mean: that the same entity would be described using a different syntax or format in the GTFS static vs. GTFS-RT layer, and that format would have to be designed and documented for the RT layer. This is a legitimate concern.

I did in fact read the service changes proposal early on. I just think the issue of using a different format (GTFS CSV mapped into Protobuf messages) is a distinct issue from the introduction of an entire new layer of patching.

Perhaps it was inevitable that GTFS-RT would grow to encompass changes to most entities in the static GTFS. It's unfortunate then that the representation initially chosen in GTFS-RT is so different from GTFS-static. Adding more layers can't really compensate for that past decision though. The fact remains that an additional layer obligates every GTFS consumer in the world to modify their pipeline, and any that don't will silently begin receiving an incorrect picture of the network.

Service changes seem focused on editing or replacing, on rewriting history as if scheduled services never existed. This might be a convenient perspective for data producers who are often very worried about the public perception of service delays and disruptions, and the associated regulatory penalties. But the reality is that in most places with GTFS data, service is planned months in advance. Almost everything else is a disruption and should be represented as such.

@abyrd

abyrd commented Nov 9, 2018

By the way the link above to the services changes document does not work anymore. I believe this is the document: https://docs.google.com/document/d/1bpNGrQTXbkyImwRO3VZeQdbxwMzeJnDgxj08WXak4i0/edit#heading=h.c2kju5nsoemr

@slai

slai commented Nov 14, 2018

We're still pretty early in our journey with changes other than delay or cancellation in our systems, so I don't have any strong opinions right now, but I will follow along.

It seems like we're in agreement that the GTFS/-RT model is currently failing in the ability to provide all the information riders need during times of abnormal operation, e.g. strikes, equipment failures or emergency operations, and these are the times that riders need accurate information the most.

Option 0 - while this seems simplest and producers are probably pushing for this because it's familiar and also widely consumed, it removes the ability for the consumer to identify which trips were replaced, which is useful for alerting the rider. This is a big deficiency that often makes riders think the app is broken because trips are missing, when this is actually reality.

I don't have any strong opinions on options 1 or 2, but there isn't really an example on what option 1 would look like, especially with the inconsistencies raised here. Maybe an example of a potential proposal to the same level as in ServiceChanges would help.

I think the discussion about whether ServiceChanges is changing schedules and TripUpdates are disruptions is a bit of a side issue and a matter of framing. More importantly, if shoehorning all the capabilities of ServiceChanges into TripUpdate-style messages produces something that's inconsistent or difficult to use, then that's probably an argument for going with something like ServiceChanges.

One last thing - it's worth noting that if streaming GTFS-RT is the only way to efficiently deliver large schedule changes via TripUpdates, then that's going to be a significant change for many consumers I suspect, who are now just periodically downloading blobs from a server. If operators are going to be serving up small blobs, and all of a sudden when an incident occurs, start serving up significantly larger blobs, then that's not going to work well.


EDIT: I've discussed this further internally and philosophically, I believe there is a class of changes between long-term planned schedules in GTFS, and last-minute operational changes in GTFS-RT, that cannot be communicated with the existing mechanisms.

Given the case of maintenance work overrun or an unconfirmed strike, there's no way to communicate the difference between -

a) trips the operator plans to run tomorrow,
b) a new trip that's been added in the next hour due to operational changes

For a), as a rider, I'd consider these with the same reliability as regular schedules with additional consideration given to the alert attached. I would make plans based on them, but know they are subject to change and will likely need to check again closer to the expected departure time.

For b), as a rider I'd consider these to be much more reliable as it's essentially 'real-time' information that the operator has communicated based on the current situation.

Regardless of whether the current design of ServiceChanges is what we want, I believe trips in point a) are what ServiceChanges is trying to solve and there's value in providing that kind of information to the user. Indeed in the UK, there are 3 levels of changes - long term plan (LTP), short term plan (STP) and very short term plan (VSTP) - that match this.

It could be argued that the time difference between the creation of the message and the trip could be used to infer the reliability of information, but I think there is a subtle difference that's worth explicitly communicating.

@lauramatson

My background is in customer information at Metro Transit in Minneapolis-St. Paul, more on the UX side than the technical backend, but I like Option 3 for its clarity about how the feeds should be used and what they represent.

I disagree with this characterization: "Service changes seem focused on editing or replacing, on rewriting history as if scheduled services never existed." What appeals to me about the ServiceChanges format is that it makes clear that it is a deviation from normal scheduled service, represented in the static GTFS.
We can’t control how regularly consumers consume our GTFS feed, which currently is updated weekly. It would be preferable to direct people to a default base schedule that should be used if it can’t be consumed as frequently as it’s produced. Lots of detours and disruptions have uncertain or imprecise start and end dates and times. I worry that adding all detours and disruptions to the GTFS would result in some consumers reflecting detours that have expired and lead to more confusion.

We integrate some detour routing and stop closures into the static GTFS (if long-term, significant, and predictable enough), but this doesn’t distinguish permanent routing and stops from temporary detours. Ideally, we’d want to be able to flag these detour differences (stop closure, new temporary stop, routing change, etc.) so riders know that service is disrupted and they shouldn’t go to their regular stop or to be aware that they may be looking for a zip-tied temporary bus stop sign instead of permanent stop/station infrastructure.

Providing information about where service is actually operating (what stops are open, where buses will go) so riders can plan trips and get accurate service information rather than relying on just Alert messages to convey these disruptions offers accessibility benefits for riders with limited English literacy and those using assistive technologies for navigation.

Even for riders who can read alert messages, it’s less useful if they can’t get a trip plan or accurate service information and they have to parse messages in order to figure out what to do. In the case that a train isn’t operating to the airport, for instance, we would want to make clear that the train is canceled (not just disappeared from the schedule) and would want riders to see that replacement shuttle service is operating and how that option works.

My experience standing on closed train platforms in a Transit uniform is that many customers trust their phones and trip plan results and service data more than alert messages or any instruction staff can provide. We’re failing to meet riders’ expectations if we aren’t providing the information about how to complete their trip based on service as it’s actually operating.

@botanize
Contributor

I think Laura points out a critical distinction in how we as planners, operators and analysts think about transit service. We have a schedule that we publish. This is what we tell people they can depend on, and we do our best to operate that schedule. Deviations from the long-term schedule should be shared as deviations, not new schedules. They are unexpected or short-term changes to operations.

As someone who uses GTFS feeds in a wide range of analyses, now at Metro Transit (Minneapolis/St Paul) and formerly at JWA, I think updating the GTFS more frequently would exacerbate problems we already have with schedules updated every few weeks. There has to be some kind of baseline for analysis and planning activities. If you publish a GTFS feed every 30 minutes, I don't have anything I can use for analysis.

When the GTFS changes every 30 minutes or more, how do I:

  • Compare service scenarios (which 30 minute GTFS is the baseline)?
  • Calculate the average number of jobs accessible to residents in the city?
  • Even describe the basic service?

The standard GTFS feed should be used for planned service for a reasonable period of time, like a seasonal pick.

I'm not sure I have a strong opinion on which deviations from planned services belong in TripUpdates vs ServiceChanges.

@LeoFrachet
Contributor Author

LeoFrachet commented Nov 30, 2018

Thanks @slai, @lauramatson & @botanize for the feedback! Sorry for the silence on our side. We are still working on this subject through more 1-to-1 discussions with both producers and consumers, to better understand the needs and the scope and to see which proposal would be the best fit.

Once the smoke has cleared, we'll come back with a proposal to discuss.

@harringtonp

Having read through this discussion again and looked at the service changes proposal document, I would be very much in line with Andrew's ( @abyrd ) thoughts. I would find extending the current TripUpdates mechanism preferable and think this really needs to be looked at properly before contemplating the introduction of a whole new layer.

What I would recommend is taking some definite examples which cover the most common cases and seeing how they could be modeled in TripUpdate extensions. And I don't believe this would have much effect on the overall size of the TripUpdate protocol buffer file.

If we take the first example in the service changes proposal document (MBTA snow routes in Boston) all trips for route 62 for the snowy day would have a ScheduleRelationship of CANCELED. Each trip added for route 62 for that day would have a ScheduleRelationship of ADDED. The trip_id in the TripDescriptor is meaningless in this case as it does not reference the schedule so the language in the spec would need to be relaxed. The route name can be derived from the TripDescriptor route_id using the schedule and a trip_headsign field could be added in to the TripDescriptor giving the destination.

Moving on with this example to each stop in the route and the existing StopTimeUpdate messages covering these, the arrival/departure times would be specified as absolute times rather than delays and the ScheduleRelationship could again be ADDED (so added is used for both the trip and each stop). If the stop exists in the current schedule then the stop_id is sufficient.

If it is a newly added stop then there could be additional fields such as stop_name, stop_lat and stop_lon whose names mirror those in the schedule. This requires spec changes but I would imagine it could be done in a backwards compatible and tidy way.

If a stop has moved but is still viewed as the same stop (it hasn't moved far) then the stop_id could be used in combination with new fields stop_lat_moved, stop_lon_moved which give the new temporary location.
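As a rough sketch of how a consumer might resolve a stop's position under these suggestions: the update is represented as a plain dict, and `stop_lat_moved`, `stop_lon_moved`, and the inline `stop_lat`/`stop_lon` are the proposed fields, not part of the current spec. The function name and the order of precedence are assumptions.

```python
def resolve_stop_location(update, static_stops):
    """Return (lat, lon) for the stop referenced by a StopTimeUpdate-like dict.

    `static_stops` maps stop_id -> (lat, lon) as loaded from stops.txt.
    """
    # Temporarily moved stop: the new location overrides the scheduled one.
    if "stop_lat_moved" in update and "stop_lon_moved" in update:
        return (update["stop_lat_moved"], update["stop_lon_moved"])

    # Existing stop: the stop_id alone is sufficient.
    stop_id = update["stop_id"]
    if stop_id in static_stops:
        return static_stops[stop_id]

    # Newly added stop: the update must carry its own coordinates.
    if "stop_lat" in update and "stop_lon" in update:
        return (update["stop_lat"], update["stop_lon"])

    raise KeyError(f"stop {stop_id!r} is not in the schedule and has no inline location")
```

Consumers that do not know the new fields would ignore them and fall back to the scheduled location, which is what would keep such an extension backwards compatible.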

From what I can see this largely covers the snow route case and it would also cover "Adding a new stop" in case C above. The case C "Adding a new route" could be done in a similar fashion by specifying a route_short_name and route_type in the TripDescriptor (trying to keep field names the same as in the schedule). And with regards to Case D, I'm not sure why this would be needed if Case C is fleshed out and can be activated quickly by a producer.

Finally, there are strong cases made by others for not having overly frequent schedule changes. I would largely agree with this and appreciate how the schedule anchors a system and provides a point of reference. I have seen daily updates to schedules on a few consecutive days but suspect this is largely due to errors. On a technical point however, I would venture a guess that most GTFS consumers should be able to update a schedule on a nightly basis without difficulty. If a consumer is checking for a new schedule once a week on a Sunday night/early Monday morning then they could just as easily check at the same time each night. After all, is there likely to be that much more system activity in the early hours of a Saturday morning than there is on a Monday morning... Moving to hourly checks however would be an altogether different story, as you could potentially end up having to do updates at very busy times.

@abyrd

abyrd commented Dec 11, 2018

Thanks @lauramatson and @botanize for your commentary - it's helpful to have additional input and points of view.

@lauramatson and @botanize seem to be emphasizing that GTFS-static should not be published very often, certainly not every 30 minutes or every day. I am not sure if their comments were in response to things I have written above, so I should clarify my position, especially where we are in complete agreement: I think there is an important distinction between schedule data and updates (deviations from planned service). And I am not suggesting that GTFS-static should be published more often - I only made the formal observation that there is currently no technical restriction or clear limit on how often it could be published.

I agree with every point made about keeping riders updated with very recent information, ensuring that information is distinct from baseline schedules, and the pitfalls of publishing new GTFS-static "schedules" every hour or every day.

However I don't see any direct line from these ideas to the proposal to introduce an additional layer of updates, rather than attempting to express all updates in a single layer. That proposal seems to be more driven by the inconvenience of protocol buffers and the "impedance mismatch" between Protobuf-based GTFS-RT and CSV-based GTFS-static.

If the core problem is that it's messy to express all the different kinds of schedule updates in Protobuf-based GTFS-RT, an additional option has yet to be voiced: such a new text-based update format could completely replace Protobuf based GTFS-RT instead of layering with it. I realize there are many reasons this might be a bad idea, but from a maintenance, maintainability, and approachability point of view, completely replacing GTFS-RT seems less problematic to me than adding layers. See also comments on #109 about the difficulty of extending and maintaining Protobuf specifications for an evolving spec.

I'm not necessarily advocating replacement of protobuf-based GTFS-RT, but pointing out that such a replacement may be no worse (indeed may be better) than adding layers.

@tleboulenge

@nathan-reynolds This is valuable information, thanks.

One question about modelling detours: When a section of a route is replaced, do you map cancelled stops to their replacement stops, or do you simply close all replaced stops and assign replacement stops on the detour route, without any relation to the former ones?

Is it possible (or does it even happen routinely) that stop sequence numbers on the section of the route that is not affected by the detour, but comes after it, are all shifted if the numbers of cancelled and replacement stops don't match?
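To illustrate the shift I have in mind, here is a small sketch. It assumes stop sequences are renumbered consecutively from 1 after the detour, which is only one possible convention, and the function name is hypothetical.

```python
def apply_detour(stop_ids, replaced, replacements):
    """Swap the contiguous run `replaced` for `replacements` and renumber.

    Returns a list of (stop_sequence, stop_id) pairs, numbered from 1.
    """
    start = stop_ids.index(replaced[0])
    end = start + len(replaced)
    if stop_ids[start:end] != list(replaced):
        raise ValueError("replaced stops must form a contiguous run of the trip")
    new_stops = stop_ids[:start] + list(replacements) + stop_ids[end:]
    return list(enumerate(new_stops, start=1))
```

With stops A, B, C, D, E, replacing B and C by a single stop X moves D from sequence 4 to sequence 3, so any update keyed only on stop_sequence would point at the wrong stop downstream of the detour.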

@nathan-reynolds

@tleboulenge Mapping alternate stops to their replacements is something we plan to add as part of our roadmap. We want to be able to handle the use case of providing passengers 'travel instructions' from the old stop to the new stop, as well as including the mapping in the TripUpdates feed should that become available to us.

I think it's a valid use case that you could end up with more alternate stops than scheduled stops, but it's probably the exception rather than the norm. We haven't worked through how exactly we'll handle building these into the 'new trip' since it's still on the roadmap, but it's definitely a valid use case.

@LeoFrachet
Contributor Author

Hi everybody,

We're still working on GTFS-ServiceChanges, now with a v3.1, which makes it possible to:

  • change any value on an existing trip (e.g. trip headsign, trip short name) and change its shape, either by using one already defined in GTFS or by defining a new one
  • create a new stop on the fly
  • create a new route on the fly
  • create a new trip on the fly

The proposal is here: http://bit.ly/gtfs-service-changes-v3_1

We know that in the short term, likely only a subset of it will be implemented, since adding full new stops and full new routes may be tricky, but we wanted to provide the mid-term vision, so that we could see which sections we want (or need) to adopt in the short term.

If you're interested in diving into the proposal, please let me know. Once you have read it, it may be worth having a one-to-one meeting to answer your questions and gather your feedback. You can email me or contact me on LinkedIn if we haven't been in touch already.

Thanks!

@scrudden

scrudden commented Jan 23, 2020

I have been looking at the possibility of adding support for GTFS-Service Changes v3.1 to TheTransitClock. Is there a .proto file available and matching bindings?

For those who do not know me, I am the maintainer of TheTransitClock OSS project.

@lionel-nj

lionel-nj commented Jan 28, 2020

@scrudden: MobilityData will be working on this matter

An update to gtfs-realtime.proto has been committed on this PR to reflect the changes introduced by v3.1 of ServiceChanges: MobilityData#47 (comment)

@darylweinberg

darylweinberg commented Jan 29, 2020 via email

@scrudden

scrudden commented Feb 25, 2020

Here's hoping. Has anyone created the bindings in Java?

@barbeau
Collaborator

barbeau commented Feb 25, 2020

@scrudden Here's a draft version of Service Changes bindings based on the draft .proto - MobilityData/gtfs-realtime-bindings#58. Please note that the .proto is still subject to change so these bindings may change as well. So, they aren't suitable for production use yet, but should work for prototyping.

@scrudden

@barbeau Thanks for doing that. I see you added the repeat on shape_point.

@scrudden

@darylweinberg Nice to meet you! Do you have anything you can share yet? This is a happy coincidence as I have been using Capital Metro data recently for testing.

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the Status: Stale Issues and Pull Requests that have remained inactive for 30 calendar days or more. label Nov 21, 2021
@skinkie
Contributor

skinkie commented Nov 21, 2021

keep open

@github-actions github-actions bot removed the Status: Stale Issues and Pull Requests that have remained inactive for 30 calendar days or more. label Nov 22, 2021
@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the Status: Stale Issues and Pull Requests that have remained inactive for 30 calendar days or more. label Nov 22, 2022
@github-actions

github-actions bot commented Dec 7, 2022

This issue has been closed due to inactivity. Issues can always be reopened after they have been closed.

@mpaine-act

What is the current proposal for deltas between GTFS and GTFS-RT? For example, GTFS-RT Trip Updates might contain manually created trip_ids that are not found in the GTFS trips.txt file (ad-hoc trips).
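For context, this is roughly how such ad-hoc trips show up on the consumer side. It is a minimal sketch that assumes the realtime trip_ids have already been extracted from the feed, and the function name is hypothetical.

```python
import csv
import io


def find_adhoc_trip_ids(trips_txt, realtime_trip_ids):
    """Return realtime trip_ids that have no matching row in trips.txt."""
    reader = csv.DictReader(io.StringIO(trips_txt))
    static_ids = {row["trip_id"] for row in reader}
    return sorted(set(realtime_trip_ids) - static_ids)
```

Every trip_id returned here has no schedule to fall back on, so the consumer needs the stop list and times to come from the realtime feed itself.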

@isabelle-dr
Collaborator

Should we re-open this?

@miklcct

miklcct commented Aug 6, 2024

Does the extension implemented in opentripplanner/OpenTripPlanner#4667 satisfy what we want to do here? It is used to dynamically add new carpools as GTFS-RT trips.
