Issues observed in production #185

iantei · 2025-01-11T21:09:44Z

Documenting the issues observed in production below:

Timestamp	Program/Study name	Study or Program	Chart Issues
2025-01-11 07:15:13	nc-transit-equity (NC Transit Equity Mode Shift Program Evaluation Public Dashboard)	Study	All charts fail to generate correctly.
2025-01-11 03:00:23	cosa-ebike-project (Low-Income E-bike Rebate Pilot Program Public Dashboard)	Program	These charts show a tabular error display: "'e-bike' specific number of trips by replaced mode", "'e-bike' specific trip length (miles) by replaced mode", "Sketch of 'e-bike' specific energy impact", "Average miles for each replaced mode (w/Other)". "Sketch of 'e-bike' specific emission impact" seems to be missing its file altogether and shows null.
2025-01-11 06:05:39	sm-bike (Santa Monica Income-Qualified E-Bike Voucher Program Public Dashboard)	Program	In the stacked bar charts, the sensed bar is generated properly but the other bars have issues. All charts have issues except three: "Average trip length (miles) (sensed)", "Trip frequency (weekend, sensed)", "Trip Frequency (sensed)". In short, the sensed information is fine and everything else has issues.
2025-01-11 03:22:00	uue (Property Taxpayer Education Study)	Study	Same as sm-bike
2024-12-19 03:42:51	washingtoncommons	Survey	TripConfirmSurvey shows the stacked bar charts properly, but they still carry an "Insufficient data" representation as well.

Program/Study Name	Chart
NC Transit
Cosa Ebike
Santa Monica
UUE
Washingtoncommons

shankari · 2025-01-12T00:59:13Z

nc-transit-equity: this is expected behavior; there is no data!
cosa-ebike-project: KeyError below
sm-ebike: KeyError below
uue: this is expected behavior, there are no labeled trips!
washingtoncommons: KeyError below

`KeyError`

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[2], line 18
     15 label_units, short_label, label_units_lower, distance_col, weight_unit = scaffolding.get_units(use_imperial)
     17 # get color mappings
---> 18 colors_mode, colors_replaced, colors_purpose, colors_sensed, colors_ble  = scaffolding.mapping_color_labels() #just need sensed
File /usr/src/app/saved-notebooks/scaffolding.py:319, in mapping_color_labels(labels, unique_keys)
    316         values.append("other")
    318 # Mapping between mode values and base_mode OR baseMode (backwards compatibility)
--> 319 value_to_basemode = {mode["value"]: mode.get("base_mode", mode.get("baseMode", "UNKNOWN")) for mode in labels["MODE"]}
    320 # Assign colors to mode, replaced, purpose, and sensed values
    321 colors_mode = emcdb.dedupe_colors([
    322     [mode, emcdb.BASE_MODES[value_to_basemode.get(mode, "UNKNOWN")]['color']]
    323     for mode in set(mode_values)
    324 ], adjustment_range=[1,1.8])
KeyError: 'MODE'

shankari · 2025-01-12T01:13:56Z

For the record, the errors in running the notebooks show up as colorized output - e.g.
https://stackoverflow.com/questions/71796687/what-does-031m-means

^[[0;31m---------------------------------------------------------------------------^[[0m
^[[0;31mKeyError^[[0m                                  Traceback (most recent call last)
Cell ^[[0;32mIn[2], line 18^[[0m
^[[1;32m     15^[[0m label_units, short_label, label_units_lower, distance_col, weight_unit ^[[38;5;241m=^[[39m scaffolding^[[38;5;241m.^[[39mget_units(use_imperial)
^[[1;32m     17^[[0m ^[[38;5;66;03m# get color mappings^[[39;00m
^[[0;32m---> 18^[[0m colors_mode, colors_replaced, colors_purpose, colors_sensed, colors_ble  ^[[38;5;241m=^[[39m ^[[43mscaffolding^[[49m^[[38;5;241;43m.^[[39;49m^[[43mmapping_color_labels^[[49m^[[43m(^[[49m^[[43m)^[[49m ^[[38;5;66;03m#just need sensed^[[39;00m
File ^[[0;32m/usr/src/app/saved-notebooks/scaffolding.py:319^[[0m, in ^[[0;36mmapping_color_labels^[[0;34m(labels, unique_keys)^[[0m
^[[1;32m    316^[[0m         values^[[38;5;241m.^[[39mappend(^[[38;5;124m"^[[39m^[[38;5;124mother^[[39m^[[38;5;124m"^[[39m)
^[[1;32m    318^[[0m ^[[38;5;66;03m# Mapping between mode values and base_mode OR baseMode (backwards compatibility)^[[39;00m
^[[0;32m--> 319^[[0m value_to_basemode ^[[38;5;241m=^[[39m {mode[^[[38;5;124m"^[[39m^[[38;5;124mvalue^[[39m^[[38;5;124m"^[[39m]: mode^[[38;5;241m.^[[39mget(^[[38;5;124m"^[[39m^[[38;5;124mbase_mode^[[39m^[[38;5;124m"^[[39m, mode^[[38;5;241m.^[[39mget(^[[38;5;124m"^[[39m^[[38;5;124mbaseMode^[[39m^[[38;5;124m"^[[39m, ^[[38;5;124m"^[[39m^[[38;5;124mUNKNOWN^[[39m^[[38;5;124m"^[[39m)) ^[[38;5;28;01mfor^[[39;00m mode ^[[38;5;129;01min^[[39;00m ^[[43mlabels^[[49m^[[43m[^[[49m^[[38;5;124;43m"^[[39;49m^[[38;5;124;43mMODE^[[39;49m^[[38;5;124;43m"^[[39;49m^[[43m]^[[49m}
^[[1;32m    320^[[0m ^[[38;5;66;03m# Assign colors to mode, replaced, purpose, and sensed values^[[39;00m
^[[1;32m    321^[[0m colors_mode ^[[38;5;241m=^[[39m emcdb^[[38;5;241m.^[[39mdedupe_colors([
^[[1;32m    322^[[0m     [mode, emcdb^[[38;5;241m.^[[39mBASE_MODES[value_to_basemode^[[38;5;241m.^[[39mget(mode, ^[[38;5;124m"^[[39m^[[38;5;124mUNKNOWN^[[39m^[[38;5;124m"^[[39m)][^[[38;5;124m'^[[39m^[[38;5;124mcolor^[[39m^[[38;5;124m'^[[39m]]
^[[1;32m    323^[[0m     ^[[38;5;28;01mfor^[[39;00m mode ^[[38;5;129;01min^[[39;00m ^[[38;5;28mset^[[39m(mode_values)
^[[1;32m    324^[[0m ], adjustment_range^[[38;5;241m=^[[39m[^[[38;5;241m1^[[39m,^[[38;5;241m1.8^[[39m])
^[[0;31mKeyError^[[0m: 'MODE'

To convert that to the meaningful text shown above, we can just save it to a file (e.g. /tmp/colorized_error) and then cat it:
cat /tmp/colorized_error
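
If the output needs to be pasted somewhere that does not interpret color codes, stripping them is another option. Below is a minimal sketch, not part of the dashboard code; `strip_ansi` is a hypothetical helper name. It assumes the sequences are either real ESC bytes or the caret notation (`^[`) shown above.

```python
import re

# Matches color/style sequences such as ESC[0;31m or ESC[38;5;241m.
# The ESC byte may also appear in caret notation ("^[") when copied from a pager.
ANSI_CSI = re.compile(r"(?:\x1b|\^\[)\[[0-9;]*[A-Za-z]")


def strip_ansi(text: str) -> str:
    """Return text with ANSI color/style escape sequences removed."""
    return ANSI_CSI.sub("", text)


raw_bytes_form = "\x1b[0;31mKeyError\x1b[0m: 'MODE'"
caret_form = "^[[0;31mKeyError^[[0m: 'MODE'"

print(strip_ansi(raw_bytes_form))
print(strip_ansi(caret_form))
```

Both calls print `KeyError: 'MODE'`.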

iantei · 2025-01-13T17:54:19Z

KeyError

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[2], line 18
     15 label_units, short_label, label_units_lower, distance_col, weight_unit = scaffolding.get_units(use_imperial)
     17 # get color mappings
---> 18 colors_mode, colors_replaced, colors_purpose, colors_sensed, colors_ble  = scaffolding.mapping_color_labels() #just need sensed
File /usr/src/app/saved-notebooks/scaffolding.py:319, in mapping_color_labels(labels, unique_keys)
    316         values.append("other")
    318 # Mapping between mode values and base_mode OR baseMode (backwards compatibility)
--> 319 value_to_basemode = {mode["value"]: mode.get("base_mode", mode.get("baseMode", "UNKNOWN")) for mode in labels["MODE"]}
    320 # Assign colors to mode, replaced, purpose, and sensed values
    321 colors_mode = emcdb.dedupe_colors([
    322     [mode, emcdb.BASE_MODES[value_to_basemode.get(mode, "UNKNOWN")]['color']]
    323     for mode in set(mode_values)
    324 ], adjustment_range=[1,1.8])
KeyError: 'MODE'

@shankari I investigated the issue using the cosa-ebike-project dataset I have access to.
The cause is that no replaced modes were found.
The behavior is therefore expected: only the charts that depend on replaced modes failed to generate.

I got the log below when running the notebook manually against that dataset.

This is a program, but no replaced modes found. Likely cold start case. Ignoring replaced mode mapping

Printing the columns for expanded_ct:

Index(['source', 'end_ts', 'end_fmt_time', 'end_loc', 'raw_trip', 'start_ts',
       'start_fmt_time', 'start_loc', 'duration', 'distance', 'start_place',
       'end_place', 'cleaned_trip', 'inferred_labels', 'inferred_trip',
       'expectation', 'confidence_threshold', 'expected_trip',
       'inferred_section_summary', 'cleaned_section_summary', 'user_input',
       'additions', 'start_local_dt_year', 'start_local_dt_month',
       'start_local_dt_day', 'start_local_dt_hour', 'start_local_dt_minute',
       'start_local_dt_second', 'start_local_dt_weekday',
       'start_local_dt_timezone', 'end_local_dt_year', 'end_local_dt_month',
       'end_local_dt_day', 'end_local_dt_hour', 'end_local_dt_minute',
       'end_local_dt_second', 'end_local_dt_weekday', 'end_local_dt_timezone',
       '_id', 'user_id', 'metadata_write_ts', 'ble_sensed_summary',
       'mode_confirm', 'purpose_confirm', 'distance_miles', 'distance_kms',
       'Mode_confirm', 'mode_confirm_w_other', 'Trip_purpose',
       'purpose_confirm_w_other'],
      dtype='object')

Printing the columns for expanded_ct_inferred:

Index(['source', 'end_ts', 'end_fmt_time', 'end_loc', 'raw_trip', 'start_ts',
       'start_fmt_time', 'start_loc', 'duration', 'distance', 'start_place',
       'end_place', 'cleaned_trip', 'inferred_labels', 'inferred_trip',
       'expectation', 'confidence_threshold', 'expected_trip',
       'inferred_section_summary', 'cleaned_section_summary', 'user_input',
       'additions', 'start_local_dt_year', 'start_local_dt_month',
       'start_local_dt_day', 'start_local_dt_hour', 'start_local_dt_minute',
       'start_local_dt_second', 'start_local_dt_weekday',
       'start_local_dt_timezone', 'end_local_dt_year', 'end_local_dt_month',
       'end_local_dt_day', 'end_local_dt_hour', 'end_local_dt_minute',
       'end_local_dt_second', 'end_local_dt_weekday', 'end_local_dt_timezone',
       '_id', 'user_id', 'metadata_write_ts', 'ble_sensed_summary',
       'mode_confirm', 'purpose_confirm', 'distance_miles', 'distance_kms',
       'Mode_confirm', 'mode_confirm_w_other', 'Trip_purpose',
       'purpose_confirm_w_other'],
      dtype='object')

In both the labeled and the inferred data frames, the replaced_mode column is missing, which caused the current production issue for cosa-ebike-project.

For e-bike specific number of trips by replaced mode

plot_and_text_stacked_bar_chart(data_eb, lambda df: df.groupby("replaced_mode_w_other").agg({distance_col: 'count'}).sort_values(by=distance_col, ascending=False), 
                                    f"Labeled `{mode_of_interest}` by user\n"+stacked_bar_quality_text, ax[0], text_results[0], colors_replaced, debug_df, value_to_translations_replaced)

For e-bike specific trip length (miles) by replaced mode

plot_and_text_stacked_bar_chart(data_eb, lambda df: df.groupby("replaced_mode_w_other").agg({distance_col: 'sum'}).sort_values(by=distance_col, ascending=False),
                                    "Labeled by user\n (Trip distance)\n"+stacked_bar_quality_text, ax[0], text_results[0], colors_replaced, debug_df, value_to_translations_replaced)

For Average Miles for each replaced mode

dg=data_eb.groupby('Replaced_mode').agg({distance_col: ['sum', 'count' , 'mean']},)

For Sketch of e-bike specific energy impact

ebei=data_eb.groupby('Replaced_mode').agg({'Energy_Impact(kWH)': ['sum', 'mean']},)

For Sketch of e-bike specific emission impact

ebco2=data_eb.groupby('Replaced_mode').agg({f'CO2_Impact({weight_unit})': ['sum', 'mean']},)

All of the charts with issues fail because replaced_mode is missing.
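
As a rough illustration of how a groupby like the ones above could report why it cannot run instead of raising, here is a minimal sketch. `summarize_by_column` and the sample data are hypothetical and are not taken from the notebooks; only the column names follow the snippets above.

```python
import pandas as pd


def summarize_by_column(df: pd.DataFrame, group_col: str, value_col: str):
    """Return (summary, reason); summary is None when the chart cannot be built."""
    if group_col not in df.columns:
        # Same situation as the cold start case: nobody has labeled this yet.
        return None, f"No '{group_col}' column found; no trips carry this label yet"
    if df.empty:
        return None, "No trips to summarize"
    summary = df.groupby(group_col).agg({value_col: ["sum", "count", "mean"]})
    return summary, None


# Hypothetical trips with no replaced mode labels at all.
trips = pd.DataFrame({"mode_confirm": ["e-bike", "e-bike"],
                      "distance_miles": [1.5, 2.5]})

summary, reason = summarize_by_column(trips, "Replaced_mode", "distance_miles")
print(reason)
```

The caller can then show `reason` in place of the chart, rather than an error table or a blank image.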

Why is it not the above KeyError: 'MODE' error?
cosa-ebike-project, sm-ebike and washingtoncommons all use the default labels from emcommon/resources/label-options.default.json, since none of them has custom labels.
The error complains that labels["MODE"] cannot be found, which should not happen since that JSON file has a MODE key.

How could we have ended up with this KeyError: 'MODE' error?
Earlier, when running the Jupyter notebook manually (without running generate_plots.py), we only needed to pass dynamic_labels for the specific program/study. Now, to run the notebook manually, we need to populate labels from the default label options file above before running the notebook.

iantei · 2025-01-13T18:05:01Z

sm-ebike: KeyError below
uue: this is expected behavior, there are no labeled trips!

Like uue, sm-ebike also seems to have no labeled trips, so this is expected behavior as well.

iantei · 2025-01-13T18:53:49Z

KeyError

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[2], line 18
     15 label_units, short_label, label_units_lower, distance_col, weight_unit = scaffolding.get_units(use_imperial)
     17 # get color mappings
---> 18 colors_mode, colors_replaced, colors_purpose, colors_sensed, colors_ble  = scaffolding.mapping_color_labels() #just need sensed
File /usr/src/app/saved-notebooks/scaffolding.py:319, in mapping_color_labels(labels, unique_keys)
    316         values.append("other")
    318 # Mapping between mode values and base_mode OR baseMode (backwards compatibility)
--> 319 value_to_basemode = {mode["value"]: mode.get("base_mode", mode.get("baseMode", "UNKNOWN")) for mode in labels["MODE"]}
    320 # Assign colors to mode, replaced, purpose, and sensed values
    321 colors_mode = emcdb.dedupe_colors([
    322     [mode, emcdb.BASE_MODES[value_to_basemode.get(mode, "UNKNOWN")]['color']]
    323     for mode in set(mode_values)
    324 ], adjustment_range=[1,1.8])
KeyError: 'MODE'

For washingtoncommons, this is the cause of the issue: the call on line 18 passes no arguments.
It should have been scaffolding.mapping_color_labels(labels)
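
To make the failure mode concrete, here is a minimal self-contained sketch. It is not the actual scaffolding code: `value_to_basemode` is a hypothetical stand-in that reuses the dict comprehension from the traceback, and the JSON is a trimmed, made-up sample rather than the real label options file.

```python
import json

# Hypothetical, trimmed stand-in for a label options file.
LABEL_OPTIONS_JSON = """
{"MODE": [{"value": "e-bike", "baseMode": "E_BIKE"},
          {"value": "walk", "base_mode": "WALKING"},
          {"value": "teleport"}]}
"""


def value_to_basemode(labels=None):
    """Map each mode value to its base mode, accepting base_mode or baseMode."""
    labels = labels or {}
    if "MODE" not in labels:
        # Fail with a message that points at the likely cause instead of a bare KeyError.
        raise ValueError("labels has no 'MODE' key; was the labels dict passed in?")
    return {
        mode["value"]: mode.get("base_mode", mode.get("baseMode", "UNKNOWN"))
        for mode in labels["MODE"]
    }


labels = json.loads(LABEL_OPTIONS_JSON)
print(value_to_basemode(labels))

try:
    value_to_basemode()  # mirrors calling the function without passing labels
except ValueError as e:
    print(e)
```

The first call succeeds because the loaded labels contain MODE; the second shows what happens when nothing is passed in.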

iantei · 2025-01-13T23:14:06Z

I have fixed an issue with the survey-related charts in #188

I couldn't validate the fix against washingtoncommons, so I used dfc-fermata to validate it instead.
The fact that the washingtoncommons and dfc-fermata charts were last generated back in 2024 was a good hint that something was going wrong.

I am not sure whether this entirely fixes the issue reported for washingtoncommons, though. Once I have access to the recent washingtoncommons dataset in prod, I can validate it and fix the issue if it still persists.

shankari · 2025-01-14T00:02:04Z

@shankari I investigated the issue using the cosa-ebike-project dataset I have access to.
The cause is that no replaced modes were found.
The behavior is therefore expected: only the charts that depend on replaced modes failed to generate.

Good catch! However, if this is expected behavior, we should not fail silently, and should, instead, report the reason to the user. The tables that we show for the others ("Registered participants", "Participants with one trip...") help us see what is going on (0 participants, 0 trips, etc). We should show similar data instead of having a blank energy/emissions impact chart.

Please incorporate this into #188

iantei · 2025-01-14T18:43:55Z

if this is expected behavior, we should not fail silently, and should, instead, report the reason to the user. ... We should show similar data instead of having a blank energy/emissions impact chart.

I agree.

The blank emissions impact chart occurred because, when an exception was raised, file_name and plot_title_no_quality still held the values assigned in the previously executed cell, which led to the missing chart.

This has been handled here: 32623a1
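
For readers unfamiliar with this kind of bug, the idea can be sketched as follows. This is not the change in 32623a1; `render_chart`, `failing_build` and the chart names are hypothetical. The point is only that the chart's identifiers are bound before the code that may raise, so the error path cannot pick up stale values from an earlier cell.

```python
def render_chart(file_name, plot_title, build):
    """Run build(); on failure, return an explanatory record for the same chart."""
    try:
        return {"file": file_name, "title": plot_title, "body": build()}
    except Exception as e:
        # file_name and plot_title are this call's own arguments, never leftovers.
        return {"file": file_name, "title": plot_title,
                "body": f"Unable to generate plot: {e!r}"}


def failing_build():
    # Stands in for a groupby on a column that does not exist.
    raise KeyError("Replaced_mode")


energy = render_chart("energy_impact", "Energy impact", lambda: "ok")
emissions = render_chart("emissions_impact", "Emissions impact", failing_build)
print(emissions["file"], "->", emissions["body"])
```

Even though the second chart fails, its error record is filed under its own name instead of overwriting the previous chart.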

This also led me to look into the stacked bar charts, which showed no information when the trip data had no replaced_mode values.
This has been handled here: add11f1
