Add prediction certainty #111

Closed
wants to merge 20 commits into from

gcamp
Contributor

@gcamp gcamp commented Oct 19, 2018

We have seen that uncertainty values in TripUpdates are sometimes used to convey whether a prediction is precise enough to be shown as real time in consumer applications. Different producers use different values, so uncertainty means different things in different feeds.

This causes problems because consumers have to interpret the value differently for each producer. It would be better to have a clear value for each prediction that defines whether it is precise enough to be shown as real time.

This is a proposal to add prediction_type to TripUpdates as an experimental field. Predictions can be one of:

  • REALTIME (current default)
  • IMPRECISE_REALTIME where the vehicle is followed in real time but the prediction isn't precise enough to be shown as real time to the user. The threshold for switching between REALTIME and IMPRECISE_REALTIME should be the same threshold that the producer uses to hide the real time icon in other interfaces like bus displays.
  • UPDATED_SCHEDULE where a vehicle is not yet followed in real time but some changes to the schedule were made after the latest GTFS export, very likely by the control center.
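
A minimal sketch of how a consumer might branch on the proposed field, written in Python rather than protobuf. The enum mirrors the three values listed above; the numeric values, the `display_style` function, and its return labels are hypothetical and are not part of the proposal.

```python
from enum import Enum


class PredictionType(Enum):
    """Values proposed in this PR (experimental). Numbers are illustrative only."""
    REALTIME = 0            # current default
    IMPRECISE_REALTIME = 1  # tracked, but not precise enough for a real-time icon
    UPDATED_SCHEDULE = 2    # not tracked yet; schedule changed since the last GTFS export


def display_style(prediction_type: PredictionType) -> str:
    """Hypothetical consumer-side mapping from prediction type to a UI treatment."""
    if prediction_type is PredictionType.REALTIME:
        return "realtime_icon"
    # Both remaining values are shown without the real-time icon in this sketch.
    return "no_icon"
```

A consumer could later give UPDATED_SCHEDULE its own treatment without changing how REALTIME is handled.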

I would also be curious to know if anybody uses the uncertainty value in GTFS-rt without having some magic threshold value. If that's not the case and usage of prediction_type is accepted and successful, I would propose to deprecate uncertainty.

Note : I might be unresponsive in the next few weeks. @juanborre has my full confidence and can respond in my stead and represent Transit.

@mike-swiftly

I certainly think it is a good idea to add a prediction_type attribute to the GTFS-realtime spec. This will allow systems, like Swiftly's Transitime system, to clearly indicate whether predictions in the user interfaces should be displayed with a real-time icon or not.

I think there are two issues that should be dealt with first: clarifying the possible values and determining whether any other related additions are needed.

With respect to clarifying the possible values, I'm not clear on UPDATED_SCHEDULE. My current opinion is that there should simply be a value SCHEDULED to indicate that there is currently no location information for the vehicle for that trip and that the predictions are just based on the schedule. It could be that the standard schedule is used or that an improved schedule based on historical travel times is used. Either way, I think it would be good to simply indicate that the predictions are schedule based. For such predictions a real-time icon should of course not be shown in the user interface.

With respect to other possible additions, I think another key one is to indicate if a prediction is based on when the driver is supposed to leave a terminal or a wait stop. Predictions for after such a stop are directly affected by driver behavior, when the driver actually leaves the terminal. Such predictions are inherently not as accurate. Some user interfaces could indicate such. A possible way to deal with this is to have an additional possible value of REALTIME_DRIVER_AFFECTED.

One last thing, I suggest that "IMPRECISE_REALTIME" be changed to "REALTIME_IMPRECISE" so that it is obvious that the main goal of this attribute is to indicate whether or not the predictions are real time or schedule based.

@LeoFrachet LeoFrachet requested a review from barbeau October 22, 2018 11:06
@gcamp
Contributor Author

gcamp commented Oct 22, 2018

@mike-swiftly

The reason I put UPDATED_SCHEDULE instead of simply SCHEDULE was because the role of simply showing scheduled is usually reserved for the static GTFS. UPDATED_SCHEDULE was to make clear it wasn't a regular scheduled case.

Re REALTIME_DRIVER_AFFECTED, I'm not strongly against it, but for a consumer application it's not something that's useful. What's useful is to make sure that we display the same information in all interfaces. For example, if Swiftly shows departures dependent on driver behavior as real time in the SMS interface, the departure should be tagged as REALTIME. If Swiftly doesn't show them as real time there, then it should be REALTIME_IMPRECISE.

Agreed on the REALTIME_IMPRECISE change, will do the modifications.

@abyrd

abyrd commented Oct 22, 2018

I would agree that if uncertainty is not used in a consistent way (or barely used at all) then perhaps it should be deprecated. I'm not sure though why we would need a flag for imprecise data. Considering this is primarily a consumer-facing format, if the real-time information is not precise enough to be used by passengers, shouldn't the producer just not send it?

I also don't understand why there would be a value (SCHEDULED) for "predictions are a copy of the schedule because we have no real-time information". Again it seems to me that real-time data should simply not be provided in this case.

As for UPDATED_SCHEDULE, from the passenger's point of view this is still a deviation from the published schedule. How would this information be used by a consumer application?

@gcamp
Contributor Author

gcamp commented Oct 22, 2018

@abyrd

Sometimes a real time prediction can have low accuracy, so users shouldn't rely on it the same way they do on other real time predictions, but it is still more precise than the static GTFS schedule data, so it's worth sending.

Re: how a consumer application would use UPDATED_SCHEDULE. At Transit we plan to show these initially as regular schedule information, but we do want to eventually change that and show that it's planned information that has changed from the original schedule.

In a few words, we want producers to send the latest information that they have, even if it's not up to regular real time standards.

@gcamp
Contributor Author

gcamp commented Oct 29, 2018

@mike-swiftly @abyrd Any responses to the earlier follow-ups? I would like to call for a vote soon. If you think the answers were satisfactory, let me know :)

## _enum_ PredictionType

Experimental field, subject to change.
PredictionType represent the source of data of the prediction. This lets consumer adjust their display of the information depending on the source (for example, with a real time symbol).
Collaborator


PredictionType doesn't really seem like a source of data based on the enumerated values. It's more like a qualifier of the type of data.

@barbeau
Collaborator

barbeau commented Oct 30, 2018

I'm not sure though why we would need a flag for imprecise data. Considering this is primarily a consumer-facing format, if the real-time information is not precise enough to be used by passengers, shouldn't the producer just not send it?

I tend to agree here. @gcamp Can you give an example of how a consumer would use data of type REALTIME_IMPRECISE?

@skinkie
Contributor

skinkie commented Oct 30, 2018

You want to convey that the vehicle is driving (hence it is not cancelled) but that you are not aware of its whereabouts, as opposed to "there is no realtime status".

@barbeau
Collaborator

barbeau commented Oct 30, 2018

@gcamp I agree that uncertainty doesn't seem very useful as it exists today, and I agree that if producers can differentiate between the two levels of predictions we should give them the ability to communicate that info.

However, to me it seems like there are two very different uses of prediction_type here:

  1. Communicate whether a stop_time_update should be shown in the consumer's UI to riders (REALTIME and REALTIME_IMPRECISE)
  2. Communicate an updated schedule from GTFS (UPDATED_SCHEDULE)

It seems unnatural to me to bundle these two very different concepts into the same field.

For the first two values (REALTIME and REALTIME_IMPRECISE), I'd suggest directly addressing the end goal in the name of the field and values. I think a relatively vague name like REALTIME_IMPRECISE could suffer from the same varying interpretations as uncertainty. IMHO having a boolean flag that clearly indicates whether or not a prediction should be shown to a rider would be preferred, such as show_to_user, if that's the primary goal of the first two field values.

I'd be interested to hear from producers to know whether they can actually differentiate between the two quality levels of predictions internally.

Given that UPDATED_SCHEDULE has a completely different purpose from the first two values, what are your thoughts on removing UPDATED_SCHEDULE from this proposal and instead using the GTFS-ServiceChanges spec as mentioned by @LeoFrachet in #113?

One motivation for using a separate channel for schedule updates is that UPDATED_SCHEDULE as proposed here isn't backwards-compatible from a consumer's perspective. If producers start publishing updated schedules as stop_time_updates, and a consumer isn't aware of the new UPDATED_SCHEDULE enum, they will by default show this info as real-time to riders.

(FWIW, REALTIME_IMPRECISE really isn't backwards compatible either in the same way, although the effects may be hidden from consumers if agencies just label the existing predictions they are already pushing out in stop_time_updates as REALTIME_IMPRECISE, as opposed to publishing new stop_time_updates they aren't currently producing).
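
The backwards-compatibility concern can be illustrated with a small Python sketch. It assumes a simplified dictionary stand-in for a stop_time_update; the keys and both function names are hypothetical and do not reflect the actual protobuf bindings.

```python
def legacy_shows_realtime(stop_time_update: dict) -> bool:
    """A consumer written before the new field existed: it never reads prediction_type."""
    return "arrival_time" in stop_time_update


def updated_shows_realtime(stop_time_update: dict) -> bool:
    """A consumer aware of the proposed field; a missing value falls back to REALTIME."""
    prediction_type = stop_time_update.get("prediction_type", "REALTIME")
    return "arrival_time" in stop_time_update and prediction_type == "REALTIME"


# Hypothetical update carrying a schedule change rather than a tracked prediction.
update = {
    "stop_id": "S1",
    "arrival_time": 1540922100,
    "prediction_type": "UPDATED_SCHEDULE",
}
```

With this input the legacy consumer would present the update as real time, while the updated consumer would not, which is the divergence described above.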

@barbeau barbeau requested review from barbeau and removed request for barbeau October 30, 2018 18:55
@mike-swiftly

I think that there is a lack of agreement on how to move forward because we are trying to put too much information into a single field. I think a key goal should be to clearly specify whether a prediction is based on GPS from a vehicle or on something else, such as the schedule.

A secondary goal could be to specify more information about the source and the quality of the prediction.

Therefore I think we should have a boolean parameter called something like gps_based. The default would be true. It is easy to understand and to implement. We can also have a second parameter called prediction_type that could specify any additional information.

I think splitting things into these two parameters would allow us to move ahead pretty quickly with gps_based and achieve a lot of what we are looking to accomplish.

@skinkie
Contributor

skinkie commented Oct 30, 2018

gps_based is something that would not work for me, especially since most vehicles use multiple sources like odometers etc.

@mike-swiftly

Stefan makes a good point about the name "gps_based". Other examples of real-time position systems include train control systems for when vehicles are underground. So a different name would be an improvement. Perhaps "real_time_information" or "real_time_location" or ???

@barbeau
Collaborator

barbeau commented Oct 31, 2018

@mike-swiftly What are your thoughts in terms of naming the field after the action producers want the consumer to take (e.g., show_to_user or hide_from_user, or something better...) instead of qualifying the type of data?

@mike-swiftly

I have to vote -1 on the current pull request because I think it does not address the goal of clarifying when the real-time icon should be displayed in a client application. It does address uncertainty, but it still assumes that predictions should always be based on GPS. The RealTimeCertainty value of LOW is described as:

Update is based on information from a vehicle that is or was followed in real time but is now low certainty. Lower certainty can come from things like a bus losing connection after the start of the run or when the prediction is dependant on driver behaviour.

I think the above definition of LOW is good for handling those specific situations. But there are other situations that we were trying to handle, such as when there is no GPS for a vehicle because no vehicle has been assigned to a particular block/route. For schedule based predictions the certainty would be LOW, and no real-time icon should be displayed as part of the associated predictions. But there are other situations where the certainty would be LOW, such as the prediction being real-time GPS based but also dependent on driver behavior. For the driver behavior situation we would still want to display the real-time icon since the predictions are based on real-time GPS info. This means that a certainty of LOW is not adequate to specify whether or not the real-time icon should be displayed.

But perhaps there is a very simple solution. I propose we add another possible value called something like NOT_REAL_TIME. Before I was proposing a separate field, but I'm trying to keep things simple here so that we might be able to move forward. If the certainty is HIGH or LOW then the real-time icon should be displayed. If it is NOT_REAL_TIME then the icon should not be displayed. If the value is UNKNOWN then most likely the icon should be displayed.
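
The icon rule proposed in the paragraph above can be sketched in Python as follows. The enum name `Certainty` and the function name are hypothetical, and treating UNKNOWN as "show the icon" is one reading of "most likely the icon should be displayed".

```python
from enum import Enum


class Certainty(Enum):
    """HIGH, LOW and UNKNOWN come from the PR; NOT_REAL_TIME is the value proposed here."""
    UNKNOWN = "UNKNOWN"
    HIGH = "HIGH"
    LOW = "LOW"
    NOT_REAL_TIME = "NOT_REAL_TIME"


def show_realtime_icon(certainty: Certainty) -> bool:
    """Show the icon for every value except NOT_REAL_TIME (UNKNOWN defaults to shown)."""
    return certainty is not Certainty.NOT_REAL_TIME
```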

@gcamp
Contributor Author

gcamp commented Jan 14, 2019

@mike-swiftly The goal of LOW was specifically to not show the real time indicator. I'm not sure I get what the difference between LOW and NOT_REAL_TIME adds if it ends up being the same behaviour.

  But there are other situations where the certainty would be LOW, such as the prediction being real-time GPS based but also dependent on driver behavior.

The point of certainty was exactly to know if it's certain enough to show the prediction icon, not if the prediction is based on GPS or not. If an app should show the real time icon, it should be HIGH; if not, it should be LOW.

For changes to the schedule, that is not real time at all, then further discussion should be in #113 since it was removed from this PR.

@abyrd

abyrd commented Jan 16, 2019

@mike-swiftly said:

I have to vote -1 on the current pull request because I think it does not address the goal of clarifying when the real-time icon should be displayed in a client application.

This reaction is understandable, considering the title of this ticket. I believe some commentators' understanding of the goal of this ticket has shifted considerably since that title was written. Maybe this fact hasn't been addressed clearly enough, but much of the discussion has shifted to communicating how accurate a prediction is, sidestepping the original goal of controlling a real-time icon in an end-user app.

To me, the details of when a particular icon is displayed, and even the definition of "real-time" being used in the comments are specific to particular consumers and may not be relevant to a general-purpose firehose data specification like GTFS-RT. We should be careful not to assume that symbols and features in particular end-user apps are shared by all other consumers of GTFS-RT.

It does address uncertainty, but it still assumes that predictions should always be based on GPS.

Nothing here assumes any predictions are "always based on GPS". As already stated above in response to your proposed gps_based field, many systems for tracking vehicles do not use GPS. Many systems use different forms of continuous location tracking (odometers, GPS position matched to route or street topology) or discontinuous location cues (rail signaling systems, vehicle operator actions). They may integrate these with historical data on speed and variability.

By "based on GPS", do you instead mean "not solely based on schedules"? Do you consider predictions sent every few minutes as GTFS-RT messages, but based entirely or largely on schedules and historical data to be "not real-time"?

For schedule based predictions the certainty would be LOW, and no real-time icon should be displayed as part of the associated predictions.

I think this is an example of a consumer-specific concept of "real time". Rather than discussing how producers can directly control this particular app-specific UI element, it may be more productive to talk about which pieces of raw information you need from the feed to control this UI element.

@mike-swiftly, as I understand it you want information about the data sources used to produce the prediction, including whether GPS location or schedules were considered in the prediction. Is this correct?

But perhaps there is a very simple solution. I propose we add another possible value called something like NOT_REAL_TIME.

It is not clear to me what it means for a real-time message to be "not real-time". We will need a full definition to discuss clearly. But to me, this description of the prediction seems mostly orthogonal to the proposed accuracy enum.

Ideally we can include all the necessary information in GTFS-RT messages to allow you to control your app following your definitions and design. But I would prefer to include general-purpose pieces of information in the GTFS-RT messages, rather than items which allow controlling UI elements of one particular app according to its own definitions.

@gcamp gcamp changed the title from "Ability to control display of real time icon by GTFS-rt producers with prediction_type" to "Add prediction certainty" Jan 17, 2019
@gcamp
Contributor Author

gcamp commented Jan 17, 2019

Vote is now closed. Votes are +2, -1 and thus the proposition doesn't pass.

@mike-swiftly can you detail your thinking a bit after the responses?

@gcamp
Contributor Author

gcamp commented Jan 24, 2019

@mike-swiftly Can you please answer? This is blocking the current PR. I'll start another vote next week if there's no response.

@mike-swiftly

My one single issue, as I have tried to previously state, is that the current proposal does not address the issue at hand for real-time information systems to indicate if predictions are based on real-time location information or not. I am completely in agreement with also specifying uncertainty values. But the proposed values are not adequate to also indicate whether predictions are based on real-time location info.

Side note: we are at a disadvantage of not receiving feedback from a significant number of other producers or consumers of GTFS-realtime data. Hey Google, CityMapper, Moovit, NextBus, Clever Devices, Init, etc. are you out there???

Changing the title of the pull request does not change our initial goal. I think it only tries to circumvent it. So let's figure out how to move this forward in a concrete way.

But first, a bit of explanation to the questions that @abyrd asked. I see real-time information as being based on real-time vehicle location information. Of course there are also additional possible information sources, such as historical data. But the key thing is whether the system actually knows that there is a vehicle in service. The real-time location is usually GPS based, though as @abyrd pointed out, it can of course be from other sources, such as a train control system.

Now let me try to define my previously used term NOT_REAL_TIME (if people have better suggestions for terms, that would be great. I'm just interested in the meaning). A prediction indicated to be NOT_REAL_TIME would mean that the system has not received location information for a vehicle that will serve the corresponding trip or block. The prediction information therefore is based on the expectation that a vehicle will be put into service. For some agencies this can be a good assumption, for others, a bad one. It depends on how many trips an agency simply misses. I can tell you with certainty that some agencies like SFMTA miss quite a few. We don't really know the uncertainty. But we do know that it is not as certain as if there was real-time location information, and that can be useful information for the passenger.

I expect that most clients would want to show a real-time icon only if real-time location information is available. Otherwise I think it would be very misleading to the passengers. Of course clients might also want to somehow convey other forms of uncertainty as well. But I don't think that a radio wave icon is understood to indicate certainty. Of course clients can design any kind of app they want to. It is their app. But information providers, like myself, should provide the necessary information so that the app designers can do what they think is right.

Therefore, if we either add NOT_REAL_TIME as a possible value to the proposed uncertainty field, or add a second true/false field to indicate whether or not real-time location information was used for the prediction, I think we would cover all situations and accomplish our goal. Can we do so?

@gcamp
Contributor Author

gcamp commented Jan 28, 2019

@mike-swiftly here's how I see it and the different cases.

  • Vehicle is running and is followed in real time : high certainty
  • Vehicle is running but is not followed in real time : low certainty
  • Vehicle has GPS at the start of run but is dependent on driver behaviour : low certainty
  • No location information and best information we have is the schedule information : no prediction should be given, consumer will display schedule information.
  • No location information but we have improved schedule information : not possible with this PR, should be done in another PR with discussions from GTFS-ServiceChanges vs extending GTFS-TripUpdates #113 taken into account. It wasn't possible before this PR and it won't be possible after, so it's not a regression.
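
The cases listed above can be sketched as a single producer-side decision function. This is one possible reading in Python; the function name and its three boolean parameters are hypothetical, and returning `None` stands for "publish no prediction and let the consumer fall back to the schedule".

```python
from typing import Optional


def certainty_for(vehicle_known_running: bool,
                  followed_in_real_time: bool,
                  driver_dependent: bool) -> Optional[str]:
    """Map the cases in the list above to HIGH, LOW, or no prediction at all."""
    if not vehicle_known_running:
        # Schedule-only and improved-schedule cases are out of scope for this PR.
        return None
    if followed_in_real_time and not driver_dependent:
        return "HIGH"
    # Running but not tracked, or tracked but dependent on driver behaviour.
    return "LOW"
```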

I agree that the PR reduced in scope compared with the start, but it doesn't mean what we have now is not valid.

@mike-swiftly

Dividing predictions into just two groups, high and low certainty, is simply not adequate when it comes to describing whether predictions are real-time based. When real-time location information is not available there are some situations where there is a somewhat but not completely high certainty (there isn't yet a live vehicle for the assignment, but perhaps this is common and there is a good chance that a vehicle will be assigned and leave on time) and situations where there is a low certainty (who knows what is going on??).

At least users can understand that a real-time icon indicates there is a tracked vehicle and a lack of the icon indicates there isn't one.

@gcamp , what is your objection to making the change more descriptive by separating out the indication of whether the predictions are real-time or not?

@barbeau
Collaborator

barbeau commented Jan 28, 2019

@gcamp can we change to PredictionCertainty instead of RealTimeCertainty? I think this is more accurate, as predictions don't need to be based on real-time data.

When real-time location information is not available there are some situations where there is a somewhat but not completely high certainty (there isn't yet a live vehicle for the assignment, but perhaps this is common and there is a good chance that a vehicle will be assigned and leave on time) and situations where there is a low certainty (who knows what is going on??).

So is this MEDIUM uncertainty? :)

what is your objection to making the change more descriptive by separating out the indication of whether the predictions are real-time or not?

I'd prefer to see a more explicit indicator of when a vehicle isn't assigned to a trip, instead of abstracting this to a level of including "not real-time" information in the GTFS-realtime feed.

In theory vehicle assignment to a trip can already be represented in TripUpdate.VehicleDescriptor (and VehiclePosition.TripDescriptor), but the problem is that the consumer can't tell if the field was left blank because the producer didn't have this information or if the trip is known to be unassigned.

A related concept here is "TripUpdate.delay", which is (emphasis mine) "the current schedule deviation for the trip. Delay should only be specified when the prediction is given relative to some existing schedule in GTFS."

I think the intention here was to make this delay an objective data point based on an observed past vehicle location at a stop or timepoint - in other words, we know this bus passed this stop at noon - as opposed to stop_time_updates, which are predictive and forward-looking. However, this intention isn't clearly defined in the spec and I don't think it's being used that way in practice by those that have adopted it, although it's still an experimental field that's open to change/clarification. TripUpdate.delay relates to the vehicle assignment question because a TripUpdate with an assigned vehicle should have a schedule deviation, while trips without an assigned vehicle wouldn't have one. But again, an explicit flag stating that the schedule deviation isn't known would be needed.

VehiclePosition has VehicleStopStatus, which includes INCOMING_AT, STOPPED_AT, and IN_TRANSIT_TO. Adding another enum value to this and sticking it in TripUpdate seems like a stretch to me.

So, we would need another enumeration to indicate vehicle assignment in TripUpdates.

We could add VehicleAssignmentStatus, with UNKNOWN, ASSIGNED, and UNASSIGNED. Would it be useful to have additional enum values like SERVING too, indicating that not only has a vehicle been assigned to the trip, but it's actively serving the trip?
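
A rough Python sketch of the suggested enumeration, showing how it would separate "the producer did not report an assignment" from "the trip is known to be unassigned". The numeric values, the `describe_assignment` helper, and its output strings are hypothetical; SERVING is included only because it is floated as a possible extra value.

```python
from enum import Enum
from typing import Optional


class VehicleAssignmentStatus(Enum):
    """Values suggested in the comment above; numbers are illustrative only."""
    UNKNOWN = 0
    ASSIGNED = 1
    UNASSIGNED = 2
    SERVING = 3  # optional extra value: assigned and actively serving the trip


def describe_assignment(status: VehicleAssignmentStatus,
                        vehicle_id: Optional[str] = None) -> str:
    """Hypothetical consumer helper turning the status into a human-readable note."""
    if status is VehicleAssignmentStatus.UNASSIGNED:
        return "trip is known to have no vehicle assigned"
    if status is VehicleAssignmentStatus.ASSIGNED:
        return f"vehicle {vehicle_id or 'unspecified'} is assigned"
    if status is VehicleAssignmentStatus.SERVING:
        return f"vehicle {vehicle_id or 'unspecified'} is serving the trip"
    # UNKNOWN: a blank vehicle field carries no information either way.
    return "producer did not report an assignment"
```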

@gcamp
Contributor Author

gcamp commented Jan 30, 2019

@gcamp can we change to PredictionCertainty instead of RealTimeCertainty? I think this is more accurate, as predictions don't need to be based on real-time data.

@barbeau agree, changed that

 @gcamp , what is your objection to making the change more descriptive by separating out the indication of whether the predictions are real-time or not?

The proposition of NOT_REALTIME is to me the same as changing the schedule. At this point the difference between updated schedule information or low certainty real time information is fairly thin except for naming.

And for adding the update of schedule as a feature, I wanted #113 to be resolved before changing anything. For what it's worth, I agree that updated schedule should be done in the trip update format, but that's for the community to decide.

@mike-swiftly would you agree to this current PR if service change would be able to update the schedule?

Mobility Data is leading the service change proposition, maybe @LeoFrachet @barbeau have something to add into the current discussion regarding service changes.

@gcamp
Contributor Author

gcamp commented Feb 11, 2019

Bump 😬

@LeoFrachet
Contributor

LeoFrachet commented Feb 12, 2019

[Disclaimer: I usually re-read the whole message thread before posting, but I didn’t because this thread is way too long. Guillaume (@gcamp), since you proposed it in the first place, could you update your first message with a summary of the current discussion, scope, and proposals? That would help newcomers (and me) dare to answer without being off topic.]

My understanding is that we need a replicable way to measure the « precision » (or « uncertainty ») of a prediction, so that an uncertainty of x in NY MTA would have the same meaning as an uncertainty of x in PATH… and so that an app can decide to flag the value as « real-time » when the uncertainty is less than y.

From the conversations I had with other stakeholders around GTFS-ServiceChanges and GTFS-Flex, I realized we need two values: the prediction time, but also the spread. It can be a range: 10AM +/- 2 minutes (with the definition that 99% of the time, the bus will pass between 9:58AM and 10:02). It would allow us to compare the two following predictions: 10AM +/- 2min vs 10AM +/- 15min, allowing the data consumer to decide when they want to flag the value as « real-time ».

Two remarks on that:

First, time ranges are not symmetrical in public transit, since passing early is the worst. It should never happen. I would be interested to see statistics of real passing times for a train scheduled at 10:00, for example. I’m expecting no trains early, most trains on time, then a decreasing probability of being late. So instead of a symmetrical time frame, we should provide just the likely maximum delay. For example: 10AM +max 12min, defined as « the train will most likely have arrived at 10:00, but there is a 99% probability that it will have arrived by 10:12 ». We could call such a value prediction_spread, defined as « There is a likelihood of 99% that the vehicle will be passed at PredictedTime + prediction_spread », for example. The unit could be the minute or the second; in both cases it would be replicable and well defined.

Secondly, how can we calculate it? Good question. But, like, how can we calculate a prediction? There is a naive but good way to start: just use historical data, and see retrospectively what those values should have been. 24 hours before, your prediction_spread is usually 15min, then 1h earlier it’s usually 10 min, and 10 min earlier it’s only 1min. Good. Just use that. It’s enough to be able to compare them between agencies. And yes, sometimes it will be wrong, just like the predictions are sometimes wrong. But at the end of the month, we’ll be able to see how accurate your prediction_spread values were. And then, smart developers will work on improving those values, just like they work on improving the prediction time.
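
A naive version of the historical approach described above might look like the following Python sketch. It assumes the producer has logged, for past predictions made at a given lead time, how many seconds after the predicted time the vehicle actually passed; the function name, its parameters, and the nearest-rank percentile method are illustrative choices, not part of any proposal.

```python
import math
from typing import Sequence


def prediction_spread(lateness_seconds: Sequence[float],
                      likelihood: float = 0.99) -> float:
    """Nearest-rank percentile of historical lateness (actual minus predicted).

    Returns the smallest observed lateness such that at least `likelihood`
    of past vehicles had passed by predicted time plus that value.
    Early passings (negative lateness) are clamped to a spread of zero.
    """
    if not lateness_seconds:
        raise ValueError("need at least one historical observation")
    ordered = sorted(lateness_seconds)
    rank = math.ceil(likelihood * len(ordered))  # 1-based nearest rank
    return max(0.0, ordered[rank - 1])
```

Running this separately for each lead-time bucket (24 hours ahead, 1 hour ahead, 10 minutes ahead) would give the kind of shrinking spread described in the comment.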

@abyrd

abyrd commented Feb 13, 2019

@LeoFrachet I agree with your comments on the fact that departure time distributions are skewed later than the scheduled departure time, and that it's complicated to calculate and define them. Even a value like "9:00 am + up to 5 minutes" is not really clear, because there's always some tail of the distribution out past 5 minutes, so what percentile is being reported?

This PR is basically a response to the fact that no one is providing this kind of detailed precision information, and we'd be better off with a discrete, qualitative description of accuracy: either it is accurate enough to trust or it's not.

This subject was found to be more complex because there's a second dimension that some people would like to fold into the same field: the source of the data (whether it is based on recently reported position or guessed from other sources). The conversation got long because some people just want a high-level judgement on whether the prediction is "good" or not, and others want to specifically know whether recently reported position data was used to calculate the prediction because they have a UI element that indicates this specific bit of information. The proposal was blocked by @mike-swiftly who it seems would really like a way to receive information about the data sources used to compute the prediction.

@abyrd

abyrd commented Feb 13, 2019

@gcamp said:

@mike-swiftly would you agree to this current PR if service change would be able to update the schedule?

I think it's inadvisable to ask people to agree to a proposal based on the assurance that another (as yet unapproved) proposal is going to cover their use case. This has the potential to create a larger problem further down the line, a situation where the other proposal "must" be accepted because people have predicated past votes on its acceptance.

I know some organizations and people are counting on the service-changes proposal, but as I've commented elsewhere I have serious doubts that we will see widespread adoption of multi-layer GTFS patching by many consumer applications. This is asking every consumer in the world to carry out significant additional software development to understand a few feeds. This is unlikely to happen uniformly, and would basically fork the specification since any consumers that don't support service-changes could badly (and probably silently) misinterpret those feeds.

@LeoFrachet
Contributor

LeoFrachet commented Feb 13, 2019

Andrew (@abyrd) said:

Even a value like "9:00 am + up to 5 minutes" is not really clear, because there's always some tail of the distribution out past 5 minutes, so what percentile is being reported?

I did define which percentile we should use. Defining the percentile is the key to the replicability of my proposal. The definition I gave was "There is a likelihood of 99% that the vehicle will have passed by PredictedTime + prediction_spread". So the percentile I was offering to use is 99% (it could be 95%; it doesn't matter, what matters is that everybody uses the same one).
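To make the percentile definition concrete, here is a minimal sketch of how a producer might derive such a spread from past prediction errors. It assumes a nearest-rank percentile and a list of historical errors in seconds (actual minus predicted); the function name `prediction_spread` and the sample values are hypothetical and not part of any spec.

```python
import math


def prediction_spread(errors_s, percentile=99.0):
    """Return the smallest spread (seconds) such that at least `percentile`
    percent of historical errors (actual - predicted) are <= spread.

    Uses the nearest-rank method; this is an assumption, the thread does
    not prescribe how the percentile should be estimated.
    """
    if not errors_s:
        raise ValueError("need at least one historical error")
    ordered = sorted(errors_s)
    # 1-based nearest rank, clamped so very small percentiles still map to rank 1
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical errors in seconds for one stop (negative means the vehicle was early)
errors = [-30, -10, 0, 5, 10, 20, 45, 60, 90, 240]
spread_99 = prediction_spread(errors, 99.0)  # 240
spread_90 = prediction_spread(errors, 90.0)  # 90
```

With only ten samples the 99th percentile is simply the worst observed error, which illustrates why the choice of percentile matters less than everybody agreeing on the same one.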

we'd be better off with a discrete, qualitative description of accuracy

Up to you. I have no strong opinion. But I do think that without a strict definition, everybody will use a different definition, and the "LOW" of one agency will be the "HIGH" of another. But if producers and consumers see value in it, I'm not against it.

@abyrd

abyrd commented Feb 13, 2019

@LeoFrachet said:

I did define which percentile we should use. Defining the percentile is the key to the replicability of my proposal.

Sorry, I know that was in your proposal - I should have been more clear. I was partly replying to your post and partly musing about the problems with the existing GTFS uncertainty concept. A while ago there was a discussion elsewhere about the fact that you need to specify a distribution and the parameters to that distribution. We could freeze some of those parameters but it's all quite arbitrary and I do wonder how much it would really be used, and how correctly.

I'm guessing these broader "certain/uncertain" categories would be more heavily and correctly used.

@gcamp
Contributor Author

gcamp commented Feb 14, 2019

@mike-swiftly would you agree to this current PR if service change would be able to update the schedule?

I think it's inadvisable to ask people to agree to a proposal based on the assurance that another (as yet unapproved) proposal is going to cover their use case. This has the potential to create a larger problem further down the line, a situation where the other proposal "must" be accepted because people have predicated past votes on its acceptance.

@abyrd That's probably right, but we could still eventually add schedule information to the enum proposed whatever happens with the service change. My point was more that we could still improve the GTFS-rt spec right now instead of getting stuck on adding more to this proposal.

@LeoFrachet Sorry for pulling you into the discussion; my point was more whether there had been any progress on service changes that could inform the discussion of whether we should add a NOT_REAL_TIME value or not. I think @abyrd summarized the discussion well; I'll try to update the first post and call a new vote.

@abyrd

abyrd commented Feb 15, 2019

@LeoFrachet said:

I would be interested to see statistics of real passing time for a train scheduled at 10:00 for example. I’m expecting no train early, most trains on time, then a decreasing probability of being late.

I just did a quick informal literature review. As we'd expect, the distributions are skewed later than the scheduled arrival time, but up to 30-40% of buses arrive well before the scheduled time. Of course this is horrible for riders, but it's a fact visible in the data. And of course it depends heavily on mode of transport and location.

If we accept that this variability is just a fact of life in bus operations, rider experience can be greatly improved by showing them both the expectation of arrival time and a safe time to reach the stop. For example, showing an expected arrival time of 12:30 but also saying you must arrive at 12:25 to have a 95% chance of catching this bus, and have a 95% chance of being on the bus by 12:35. Providing the whole distribution to RT consumers (based on historical data and/or current conditions) would allow this. It would also allow providing a much more nuanced time window for arrival times and transfers downstream. Perhaps a sort of empirical CDF with N evenly spaced percentiles. But again I'm not sure who will produce such data (correctly). We might be able to make it happen in the Netherlands, and if it was done with open source components maybe others would follow.

But of course this would be a separate proposal: instead of the current ill-defined uncertainty field in GTFS-RT, we would have one very simple enum field (as described in this PR) and a separate very detailed arrival probability field.
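The "safe time to reach the stop" idea above can be sketched as follows. This is only an illustration under stated assumptions: the deviations are hypothetical historical values (actual arrival minus predicted arrival, in seconds), percentiles use the nearest-rank method, and `rider_window` is a made-up helper name.

```python
import math
from datetime import datetime, timedelta


def nearest_rank(sorted_values, percentile):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(percentile * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def rider_window(expected_arrival, deviations_s, coverage=95):
    """Return (reach_stop_by, on_bus_by) for the given coverage percentage.

    reach_stop_by: at least `coverage` percent of observed arrivals were at
                   or after this time.
    on_bus_by:     at least `coverage` percent of observed arrivals were at
                   or before this time.
    """
    if not deviations_s:
        raise ValueError("need at least one historical deviation")
    ordered = sorted(deviations_s)
    earliest = nearest_rank(ordered, 100 - coverage)
    latest = nearest_rank(ordered, coverage)
    return (expected_arrival + timedelta(seconds=earliest),
            expected_arrival + timedelta(seconds=latest))


# Hypothetical deviations in seconds (negative means the bus came early)
deviations = [-300, -240, -180, -120, -90, -60, -30, -15, 0, 0,
              15, 30, 60, 90, 120, 180, 210, 240, 300, 600]
expected = datetime(2019, 2, 15, 12, 30)
reach_stop_by, on_bus_by = rider_window(expected, deviations)
# reach_stop_by is 12:25 and on_bus_by is 12:35 for this sample
```

Publishing the full distribution would let each consumer choose its own coverage level instead of the 95% assumed here.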

@barbeau
Collaborator

barbeau commented Feb 18, 2019

Providing the whole distribution to RT consumers (based on historical data and/or current conditions) would allow this.

If provided, I think it needs to be clear whether the agency is reporting a distribution based on historical data, or one created for the prediction given to the rider (which could be a mix of current and historical conditions). The first is relatively trivial to implement for producers if you have archived data, but less meaningful to riders if the current trip is an outlier - and these are the trips riders really want accurate information about. The second can be much harder to do accurately, at least from my experience dealing with location accuracy on mobile devices. We found widespread inaccuracies in the "accuracy" value provided to apps by GPS hardware, and that was after producers had pushed to have the threshold be the 68% confidence level - it turns out they couldn't even meet that accuracy level. I'm not sure if this experience directly translates to arrival/departure predictions yet, but so far it feels similar to me. The other challenge is how the CDF is converted to something that riders understand - although that's outside the scope of the GTFS-rt exchange. This is a very similar problem to travel time reliability on the highway side of things, which I understand has been fairly heavily studied.
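As a rough illustration of how a claimed confidence level could be checked against archived data, here is a minimal sketch. It assumes matched pairs of a reported uncertainty and the error actually observed for the same prediction, both in seconds, and treats the uncertainty as a symmetric bound; the function name and sample values are hypothetical.

```python
def observed_coverage(reported_uncertainty_s, actual_error_s):
    """Fraction of predictions whose absolute error fell within the
    uncertainty reported for that prediction."""
    if len(reported_uncertainty_s) != len(actual_error_s):
        raise ValueError("inputs must have the same length")
    if not reported_uncertainty_s:
        raise ValueError("need at least one prediction")
    hits = sum(
        1
        for bound, error in zip(reported_uncertainty_s, actual_error_s)
        if abs(error) <= bound
    )
    return hits / len(reported_uncertainty_s)


# Hypothetical feed that always reports 30 s of uncertainty
reported = [30, 30, 30, 30, 30]
errors = [10, -20, 45, 90, -5]
coverage = observed_coverage(reported, errors)  # 0.6
meets_68_percent = coverage >= 0.68             # False
```

A producer claiming a 68% confidence level would want this fraction to sit at or above 0.68 over a large archive, not the five samples used here.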

One last set of facts to drive home that the current uncertainty value isn't working. We did a quick analysis of feeds from TransitFeeds.com, and found the following:

  • 6 agencies use the existing TripUpdate uncertainty value (same for all stop_time_updates)
  • 3 agencies are publishing uncertainty=0 for all predictions 😱

| Feed | uncertainty value |
| --- | --- |
| OCTA Trip Updates | 0 |
| Thunder Bay Transit Trip Updates | 0 |
| HART Trip Updates | 30 |
| BART Trip Updates | 30 |
| Kingston Transit Trip Updates | 0 |
| MBTA Trip Updates | 300 |

@lenaschimmel

We are currently developing a sophisticated delay prediction for the Dystonse project that will generate arbitrarily detailed probability distributions for delays based on historical data and real-time information. (Our repository is documented in English, but much of the extra information about our project is available only in German at this time.)

The main purpose of those predictions will be in our own routing algorithm that takes the full distribution into account. We are looking into alternative ways to make our predictions available via APIs and/or feeds, so that other routing services or real time information displays can use them.

The current spec of the uncertainty field is not optimal for the task, so we've been looking for alternate proposals. This discussion is the best match we found so far.

As @LeoFrachet and @abyrd discussed over a year ago, defining specific percentiles is mandatory to have data that is comparable across data providers.

Given our current data and algorithms, we could in theory return a value for any percentile. For most routes, though, we don't have enough historical records to be really confident about the 1st or 99th percentile, but the 5th and 95th should be fine. Even if detailed data were available for any given percentile, different use cases might benefit from using different percentiles. So from our viewpoint, an ideal proposal would allow attaching any number of delay/percentile pairs to a stop_time_update. On the other hand, we realize that hardly anybody has used the current uncertainty field, so maybe that proposal would be overly complex for everybody but us. Any thoughts on that?
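For discussion, a minimal sketch of what delay/percentile pairs attached to a stop_time_update might look like on the consumer side. Nothing here exists in GTFS-realtime: the class names, fields and the linear interpolation between published points are all assumptions, and percentiles are assumed to be distinct.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DelayPercentile:
    percentile: float  # 0-100
    delay_s: int       # delay in seconds at that percentile (negative means early)


@dataclass
class StopTimeDistribution:
    stop_id: str
    points: List[DelayPercentile]  # any number of pairs, distinct percentiles

    def delay_at(self, percentile: float) -> float:
        """Interpolate linearly between published pairs; clamp outside them."""
        pts = sorted(self.points, key=lambda p: p.percentile)
        if not pts:
            raise ValueError("no delay/percentile pairs published")
        if percentile <= pts[0].percentile:
            return float(pts[0].delay_s)
        if percentile >= pts[-1].percentile:
            return float(pts[-1].delay_s)
        for lo, hi in zip(pts, pts[1:]):
            if lo.percentile <= percentile <= hi.percentile:
                frac = (percentile - lo.percentile) / (hi.percentile - lo.percentile)
                return lo.delay_s + frac * (hi.delay_s - lo.delay_s)
        raise AssertionError("unreachable for sorted, distinct percentiles")


# Hypothetical pairs for one stop: 5th, 50th and 95th percentile
dist = StopTimeDistribution(
    stop_id="hypothetical-stop",
    points=[DelayPercentile(5, -60), DelayPercentile(50, 30), DelayPercentile(95, 240)],
)
median_delay = dist.delay_at(50)     # 30.0
between = dist.delay_at(72.5)        # 135.0
clamped = dist.delay_at(99)          # 240.0
```

Clamping outside the published range reflects the point above that the 1st and 99th percentiles may not be reliable; a consumer could equally refuse to answer for percentiles nobody published.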

Also, if you know of any other place where similar ideas are discussed, please leave a hint.


Here's a sample of our data, which coincidentally has very little skewing, although I know other examples that do have the heavy skewing that was mentioned before:

[Image: screenshot, 2020-06-08 20:04:10]

@skinkie
Contributor

skinkie commented Jun 8, 2020 via email

@barbeau
Collaborator

barbeau commented Jun 9, 2020

We could assist you regarding data. For the Netherlands we have a multi-year historical real-time dataset available for research purposes.

@lenaschimmel Same here - we've been collecting GTFS-RT data from 8-9 agencies over the last two years for related work going on at USF, with the goal of comparing multiple prediction techniques across different agencies. If you're interested in collaborating please contact me via LinkedIn - https://www.linkedin.com/in/seanbarbeau/.

@barbeau
Collaborator

barbeau commented Jun 9, 2020

So I wonder, shouldn't the schedule first be reverse engineered, and then remodelled based on a historical estimate, given that the input schedule does affect how drivers execute it?

@skinkie I think you're referring to something like busbuzzard?

@skinkie
Contributor

skinkie commented Jun 9, 2020

@skinkie I think you're referring to something like busbuzzard?

Not that specific code, but it looks fun. I have asked some people if they could come up with the original paper.

@gcamp
Contributor Author

gcamp commented Oct 16, 2020

Unless there is major disagreement, I'm going to close this PR. Schedule changes should probably be handled in a potential Service Change spec. uncertainty is still not very well defined. If anyone wants to clarify it, I would be open to discussing it further.

@gcamp gcamp closed this Oct 16, 2020
@gcamp gcamp deleted the prediction_type branch October 16, 2020 15:23
Labels
GTFS Realtime Issues and Pull Requests that focus on GTFS Realtime