Issues observed in production #185

Open
iantei opened this issue Jan 11, 2025 · 8 comments

Comments

@iantei
Contributor

iantei commented Jan 11, 2025

Documenting the issues observed in production below:

| Timestamp | Program/Study name | Study or Program | Chart Issues |
| --- | --- | --- | --- |
| 2025-01-11 07:15:13 | nc-transit-equity (NC Transit Equity Mode Shift Program Evaluation Public Dashboard) | Study | Generation of all charts has issues. |
| 2025-01-11 03:00:23 | cosa-ebike-project (Low-Income E-bike Rebate Pilot Program Public Dashboard) | Program | 'e-bike' specific number of trips by replaced mode, 'e-bike' specific trip length (miles) by replaced mode, Sketch of 'e-bike' specific energy impact, and Average miles for each replaced mode (w/Other) show a tabular error display. "Sketch of 'e-bike' specific emission impact" seems to be missing its file altogether and shows null. |
| 2025-01-11 06:05:39 | sm-bike (Santa Monica Income-Qualified E-Bike Voucher Program Public Dashboard) | Program | The sensed bar of the stacked bar charts is generated properly; the other bars have issues. All charts have issues except three: "Average trip length (miles) (sensed)", "Trip frequency (weekend, sensed)", and "Trip Frequency (sensed)". Basically, sensed information is fine and everything else has issues. |
| 2025-01-11 03:22:00 | uue (Property Taxpayer Education Study) | Study | Same as sm-bike |
| 2024-12-19 03:42:51 | washingtoncommons | Survey | TripConfirmSurvey shows the stacked bar charts properly, but it also shows the "Insufficient data" representation. |

| Program/Study Name | Chart |
| --- | --- |
| NC Transit | NCTransit |
| Cosa Ebike | COSA_EBIKE |
| Santa Monica | Santa_Monica |
| UUE | UUE |
| Washingtoncommons | WashingtonCommons |

@shankari
Contributor

  1. nc-transit-equity: this is expected behavior; there is no data!
  2. cosa-ebike-project: KeyError below
  3. sm-ebike: KeyError below
  4. uue: this is expected behavior, there are no labeled trips!
  5. washingtoncommons: KeyError below

KeyError

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[2], line 18
     15 label_units, short_label, label_units_lower, distance_col, weight_unit = scaffolding.get_units(use_imperial)
     17 # get color mappings
---> 18 colors_mode, colors_replaced, colors_purpose, colors_sensed, colors_ble  = scaffolding.mapping_color_labels() #just need sensed
File /usr/src/app/saved-notebooks/scaffolding.py:319, in mapping_color_labels(labels, unique_keys)
    316         values.append("other")
    318 # Mapping between mode values and base_mode OR baseMode (backwards compatibility)
--> 319 value_to_basemode = {mode["value"]: mode.get("base_mode", mode.get("baseMode", "UNKNOWN")) for mode in labels["MODE"]}
    320 # Assign colors to mode, replaced, purpose, and sensed values
    321 colors_mode = emcdb.dedupe_colors([
    322     [mode, emcdb.BASE_MODES[value_to_basemode.get(mode, "UNKNOWN")]['color']]
    323     for mode in set(mode_values)
    324 ], adjustment_range=[1,1.8])
KeyError: 'MODE'

@shankari
Contributor

For the record, the errors in running the notebooks show up as colorized output - e.g.
https://stackoverflow.com/questions/71796687/what-does-031m-means

^[[0;31m---------------------------------------------------------------------------^[[0m
^[[0;31mKeyError^[[0m                                  Traceback (most recent call last)
Cell ^[[0;32mIn[2], line 18^[[0m
^[[1;32m     15^[[0m label_units, short_label, label_units_lower, distance_col, weight_unit ^[[38;5;241m=^[[39m scaffolding^[[38;5;241m.^[[39mget_units(use_imperial)
^[[1;32m     17^[[0m ^[[38;5;66;03m# get color mappings^[[39;00m
^[[0;32m---> 18^[[0m colors_mode, colors_replaced, colors_purpose, colors_sensed, colors_ble  ^[[38;5;241m=^[[39m ^[[43mscaffolding^[[49m^[[38;5;241;43m.^[[39;49m^[[43mmapping_color_labels^[[49m^[[43m(^[[49m^[[43m)^[[49m ^[[38;5;66;03m#just need sensed^[[39;00m
File ^[[0;32m/usr/src/app/saved-notebooks/scaffolding.py:319^[[0m, in ^[[0;36mmapping_color_labels^[[0;34m(labels, unique_keys)^[[0m
^[[1;32m    316^[[0m         values^[[38;5;241m.^[[39mappend(^[[38;5;124m"^[[39m^[[38;5;124mother^[[39m^[[38;5;124m"^[[39m)
^[[1;32m    318^[[0m ^[[38;5;66;03m# Mapping between mode values and base_mode OR baseMode (backwards compatibility)^[[39;00m
^[[0;32m--> 319^[[0m value_to_basemode ^[[38;5;241m=^[[39m {mode[^[[38;5;124m"^[[39m^[[38;5;124mvalue^[[39m^[[38;5;124m"^[[39m]: mode^[[38;5;241m.^[[39mget(^[[38;5;124m"^[[39m^[[38;5;124mbase_mode^[[39m^[[38;5;124m"^[[39m, mode^[[38;5;241m.^[[39mget(^[[38;5;124m"^[[39m^[[38;5;124mbaseMode^[[39m^[[38;5;124m"^[[39m, ^[[38;5;124m"^[[39m^[[38;5;124mUNKNOWN^[[39m^[[38;5;124m"^[[39m)) ^[[38;5;28;01mfor^[[39;00m mode ^[[38;5;129;01min^[[39;00m ^[[43mlabels^[[49m^[[43m[^[[49m^[[38;5;124;43m"^[[39;49m^[[38;5;124;43mMODE^[[39;49m^[[38;5;124;43m"^[[39;49m^[[43m]^[[49m}
^[[1;32m    320^[[0m ^[[38;5;66;03m# Assign colors to mode, replaced, purpose, and sensed values^[[39;00m
^[[1;32m    321^[[0m colors_mode ^[[38;5;241m=^[[39m emcdb^[[38;5;241m.^[[39mdedupe_colors([
^[[1;32m    322^[[0m     [mode, emcdb^[[38;5;241m.^[[39mBASE_MODES[value_to_basemode^[[38;5;241m.^[[39mget(mode, ^[[38;5;124m"^[[39m^[[38;5;124mUNKNOWN^[[39m^[[38;5;124m"^[[39m)][^[[38;5;124m'^[[39m^[[38;5;124mcolor^[[39m^[[38;5;124m'^[[39m]]
^[[1;32m    323^[[0m     ^[[38;5;28;01mfor^[[39;00m mode ^[[38;5;129;01min^[[39;00m ^[[38;5;28mset^[[39m(mode_values)
^[[1;32m    324^[[0m ], adjustment_range^[[38;5;241m=^[[39m[^[[38;5;241m1^[[39m,^[[38;5;241m1.8^[[39m])
^[[0;31mKeyError^[[0m: 'MODE'

To convert that to the meaningful text shown above, we can just save it to a file (e.g. /tmp/colorized_error) and then cat it
cat /tmp/colorized_error
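
When no terminal is handy (for example, when post-processing saved notebook logs), the same cleanup can be sketched in Python by stripping the color sequences with a regular expression. This is a swapped-in alternative to `cat`, not something the project does; `strip_ansi` is a hypothetical helper, and the sketch assumes the saved text contains the real ESC byte (rendered as `^[` in the output above).

```python
import re

# Matches ANSI SGR (color/style) sequences such as ESC[0;31m or ESC[38;5;241m.
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def strip_ansi(text: str) -> str:
    """Return text with ANSI color/style escape sequences removed."""
    return ANSI_SGR.sub("", text)


# The last line of the colorized traceback, with ESC written as \x1b.
colorized = "\x1b[0;31mKeyError\x1b[0m: 'MODE'"
plain = strip_ansi(colorized)
print(plain)
```

For a saved file, the same helper could be applied to the result of reading `/tmp/colorized_error` as text.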

@iantei
Contributor Author

iantei commented Jan 13, 2025

KeyError

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[2], line 18
     15 label_units, short_label, label_units_lower, distance_col, weight_unit = scaffolding.get_units(use_imperial)
     17 # get color mappings
---> 18 colors_mode, colors_replaced, colors_purpose, colors_sensed, colors_ble  = scaffolding.mapping_color_labels() #just need sensed
File /usr/src/app/saved-notebooks/scaffolding.py:319, in mapping_color_labels(labels, unique_keys)
    316         values.append("other")
    318 # Mapping between mode values and base_mode OR baseMode (backwards compatibility)
--> 319 value_to_basemode = {mode["value"]: mode.get("base_mode", mode.get("baseMode", "UNKNOWN")) for mode in labels["MODE"]}
    320 # Assign colors to mode, replaced, purpose, and sensed values
    321 colors_mode = emcdb.dedupe_colors([
    322     [mode, emcdb.BASE_MODES[value_to_basemode.get(mode, "UNKNOWN")]['color']]
    323     for mode in set(mode_values)
    324 ], adjustment_range=[1,1.8])
KeyError: 'MODE'

@shankari I investigated the issue using the cosa-ebike-project dataset I have access to.
The cause is that no replaced modes are found.
The behavior is therefore expected, since only the charts that depend on replaced modes failed to generate.

I got the log below while executing the notebook manually with the above dataset.

This is a program, but no replaced modes found. Likely cold start case. Ignoring replaced mode mapping

Printing the columns for expanded_ct:

Index(['source', 'end_ts', 'end_fmt_time', 'end_loc', 'raw_trip', 'start_ts',
       'start_fmt_time', 'start_loc', 'duration', 'distance', 'start_place',
       'end_place', 'cleaned_trip', 'inferred_labels', 'inferred_trip',
       'expectation', 'confidence_threshold', 'expected_trip',
       'inferred_section_summary', 'cleaned_section_summary', 'user_input',
       'additions', 'start_local_dt_year', 'start_local_dt_month',
       'start_local_dt_day', 'start_local_dt_hour', 'start_local_dt_minute',
       'start_local_dt_second', 'start_local_dt_weekday',
       'start_local_dt_timezone', 'end_local_dt_year', 'end_local_dt_month',
       'end_local_dt_day', 'end_local_dt_hour', 'end_local_dt_minute',
       'end_local_dt_second', 'end_local_dt_weekday', 'end_local_dt_timezone',
       '_id', 'user_id', 'metadata_write_ts', 'ble_sensed_summary',
       'mode_confirm', 'purpose_confirm', 'distance_miles', 'distance_kms',
       'Mode_confirm', 'mode_confirm_w_other', 'Trip_purpose',
       'purpose_confirm_w_other'],
      dtype='object')

Printing the columns for expanded_ct_inferred:

Index(['source', 'end_ts', 'end_fmt_time', 'end_loc', 'raw_trip', 'start_ts',
       'start_fmt_time', 'start_loc', 'duration', 'distance', 'start_place',
       'end_place', 'cleaned_trip', 'inferred_labels', 'inferred_trip',
       'expectation', 'confidence_threshold', 'expected_trip',
       'inferred_section_summary', 'cleaned_section_summary', 'user_input',
       'additions', 'start_local_dt_year', 'start_local_dt_month',
       'start_local_dt_day', 'start_local_dt_hour', 'start_local_dt_minute',
       'start_local_dt_second', 'start_local_dt_weekday',
       'start_local_dt_timezone', 'end_local_dt_year', 'end_local_dt_month',
       'end_local_dt_day', 'end_local_dt_hour', 'end_local_dt_minute',
       'end_local_dt_second', 'end_local_dt_weekday', 'end_local_dt_timezone',
       '_id', 'user_id', 'metadata_write_ts', 'ble_sensed_summary',
       'mode_confirm', 'purpose_confirm', 'distance_miles', 'distance_kms',
       'Mode_confirm', 'mode_confirm_w_other', 'Trip_purpose',
       'purpose_confirm_w_other'],
      dtype='object')

Both the labeled and the inferred data frames are missing the replaced_mode column, which resulted in the current production issue for cosa-ebike-project.

  • For e-bike specific number of trips by replaced mode
    plot_and_text_stacked_bar_chart(data_eb, lambda df: df.groupby("replaced_mode_w_other").agg({distance_col: 'count'}).sort_values(by=distance_col, ascending=False),
                                    f"Labeled `{mode_of_interest}` by user\n"+stacked_bar_quality_text, ax[0], text_results[0], colors_replaced, debug_df, value_to_translations_replaced)
  • For e-bike specific trip length (miles) by replaced mode
    plot_and_text_stacked_bar_chart(data_eb, lambda df: df.groupby("replaced_mode_w_other").agg({distance_col: 'sum'}).sort_values(by=distance_col, ascending=False),
                                    "Labeled by user\n (Trip distance)\n"+stacked_bar_quality_text, ax[0], text_results[0], colors_replaced, debug_df, value_to_translations_replaced)
  • For Average Miles for each replaced mode
    dg=data_eb.groupby('Replaced_mode').agg({distance_col: ['sum', 'count' , 'mean']},)
  • For Sketch of e-bike specific energy impact
    ebei=data_eb.groupby('Replaced_mode').agg({'Energy_Impact(kWH)': ['sum', 'mean']},)
  • For Sketch of e-bike specific emission impact
    ebco2=data_eb.groupby('Replaced_mode').agg({f'CO2_Impact({weight_unit})': ['sum', 'mean']},)

All of the charts with issues fail because replaced_mode is missing.
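
One way to make this failure mode visible before a `groupby` is attempted is to check the required columns up front. The sketch below is only an illustration under assumptions: `missing_columns` is a hypothetical helper that is not in the notebooks, and the column list is a subset of the `expanded_ct` columns printed above rather than a real data frame.

```python
def missing_columns(available, required):
    """Return the required column names that are absent from `available`.

    `available` can be any iterable of names, e.g. `df.columns` for a
    pandas DataFrame.
    """
    available = set(available)
    return [col for col in required if col not in available]


# Subset of the expanded_ct columns printed above (no replaced mode columns).
expanded_ct_columns = [
    "mode_confirm", "purpose_confirm", "distance_miles", "distance_kms",
    "Mode_confirm", "mode_confirm_w_other", "Trip_purpose",
    "purpose_confirm_w_other",
]

# Columns the "Average Miles for each replaced mode" aggregation relies on.
required = ["Replaced_mode", "distance_miles"]

missing = missing_columns(expanded_ct_columns, required)
if missing:
    reason = f"Cannot aggregate by replaced mode: missing columns {missing}"
else:
    reason = None
print(reason)
```

With such a check, a chart cell could skip the aggregation and surface `reason` instead of hitting an error partway through.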


Why is it not the above KeyError: 'MODE' error?
cosa-ebike-project, sm-ebike, and washingtoncommons all use the default labels from emcommon/resources/label-options.default.json, since they do not have custom labels.
The error says that labels["MODE"] cannot be found, which should not happen, since the JSON has a MODE key in it.

How could we have ended up with this KeyError: 'MODE' error?
Earlier, when running the Jupyter notebook manually (without running generate_plots.py), we only needed to pass dynamic_labels in the notebook for the specific program/study. Now, if we plan to run the notebook manually, we need to fill in labels from the emcommon/resources/label-options.default.json file mentioned above before running the notebook.
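
For a manual run, filling in `labels` could look roughly like the sketch below. It is an illustration under assumptions, not the project's loader: `load_labels` and `REQUIRED_LABEL_KEYS` are hypothetical names, only the `MODE` key is confirmed by the traceback, and the demo writes a tiny made-up file instead of reading the real default labels file.

```python
import json
import tempfile
from pathlib import Path

# Only "MODE" is confirmed by the traceback; other keys are not assumed here.
REQUIRED_LABEL_KEYS = ("MODE",)


def load_labels(path):
    """Load a labels JSON file and fail early if an expected key is absent."""
    labels = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_LABEL_KEYS if key not in labels]
    if missing:
        raise ValueError(f"{path} is missing label keys: {missing}")
    return labels


# Demo with a made-up file; the entry below is sample data, not real labels.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / "sample-labels.json"
    sample_path.write_text(
        json.dumps({"MODE": [{"value": "walk", "baseMode": "WALKING"}]}),
        encoding="utf-8",
    )
    labels = load_labels(sample_path)

print(sorted(labels))
```

Failing at load time gives a clearer message than a `KeyError` raised later inside the color-mapping code.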

@iantei
Contributor Author

iantei commented Jan 13, 2025

sm-ebike: KeyError below
uue: this is expected behavior, there are no labeled trips!

Like uue, sm-ebike also seems to have no labeled trips, which results in the expected behavior.

@iantei
Contributor Author

iantei commented Jan 13, 2025

KeyError

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[2], line 18
     15 label_units, short_label, label_units_lower, distance_col, weight_unit = scaffolding.get_units(use_imperial)
     17 # get color mappings
---> 18 colors_mode, colors_replaced, colors_purpose, colors_sensed, colors_ble  = scaffolding.mapping_color_labels() #just need sensed
File /usr/src/app/saved-notebooks/scaffolding.py:319, in mapping_color_labels(labels, unique_keys)
    316         values.append("other")
    318 # Mapping between mode values and base_mode OR baseMode (backwards compatibility)
--> 319 value_to_basemode = {mode["value"]: mode.get("base_mode", mode.get("baseMode", "UNKNOWN")) for mode in labels["MODE"]}
    320 # Assign colors to mode, replaced, purpose, and sensed values
    321 colors_mode = emcdb.dedupe_colors([
    322     [mode, emcdb.BASE_MODES[value_to_basemode.get(mode, "UNKNOWN")]['color']]
    323     for mode in set(mode_values)
    324 ], adjustment_range=[1,1.8])
KeyError: 'MODE'

For washingtoncommons, this is the cause of the issue.
The call should have been scaffolding.mapping_color_labels(labels)
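
A minimal reproduction of this failure, reusing the dictionary comprehension from line 319 of the traceback, is sketched below. `build_value_to_basemode` is a hypothetical stand-in for that one line, the empty-dict default for `labels` is an assumption (the traceback only shows that the function can be called without arguments), and the sample labels are made up.

```python
def build_value_to_basemode(labels=None):
    """Stand-in for the line 319 mapping shown in the traceback."""
    labels = {} if labels is None else labels  # assumed default: no labels
    return {
        mode["value"]: mode.get("base_mode", mode.get("baseMode", "UNKNOWN"))
        for mode in labels["MODE"]
    }


# Made-up sample labels covering both key spellings and the fallback.
sample_labels = {
    "MODE": [
        {"value": "e-bike", "base_mode": "E_BIKE"},
        {"value": "walk", "baseMode": "WALKING"},
        {"value": "teleport"},
    ]
}

# Calling without labels mirrors the failing notebook cell.
try:
    build_value_to_basemode()
    failed_without_labels = False
except KeyError:
    failed_without_labels = True

# Passing labels, as in the corrected call, builds the mapping.
mapping = build_value_to_basemode(sample_labels)
print(failed_without_labels, mapping)
```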

@iantei
Contributor Author

iantei commented Jan 13, 2025

I have fixed an issue with the survey-related charts in #188

I couldn't validate the washingtoncommons case, so I used dfc-fermata to validate the above issue.
The observation that the washingtoncommons and dfc-fermata charts were last generated around 2024 was a good hint that something was going wrong.

I am not sure whether this entirely fixes the issue reported for washingtoncommons, though. Once I have access to the recent washingtoncommons dataset in prod, I can validate and fix the issue if it still persists.

@shankari
Contributor

@shankari I investigated the issue using the cosa-ebike-project dataset I have access to.
The cause is that no replaced modes are found.
The behavior is therefore expected, since only the charts that depend on replaced modes failed to generate.

Good catch! However, if this is expected behavior, we should not fail silently, and should, instead, report the reason to the user. The tables that we show for the others ("Registered participants", "Participants with one trip...") help us see what is going on (0 participants, 0 trips, etc). We should show similar data instead of having a blank energy/emissions impact chart.

Please incorporate this into #188
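
As a rough illustration of this suggestion, a chart cell could fall back to a small text table of counts when it cannot draw anything. The sketch below is hypothetical: `fallback_table`, the row labels, and the counts are all made up for the example and are not taken from the dashboard code or from #188.

```python
def fallback_table(chart_title, counts):
    """Format a short text table explaining why a chart was not generated."""
    lines = [f"{chart_title}: not generated"]
    width = max((len(label) for label in counts), default=0)
    for label, value in counts.items():
        lines.append(f"{label.ljust(width)}  {value}")
    return "\n".join(lines)


# Made-up counts for illustration only.
text = fallback_table(
    "Sketch of 'e-bike' specific emission impact",
    {"Labeled trips": 120, "Trips with a replaced mode": 0},
)
print(text)
```

A zero in the second row tells the reader that the chart is empty because no replaced modes were reported, not because generation broke.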

@iantei
Contributor Author

iantei commented Jan 14, 2025

if this is expected behavior, we should not fail silently, and should, instead, report the reason to the user. ... We should show similar data instead of having a blank energy/emissions impact chart.

I agree.

The blank emissions impact chart occurred because, when an exception is raised, file_name and plot_title_no_quality still hold the values assigned in the previously executed cell, so the chart for the current cell ends up missing.
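
The pattern can be sketched as follows. This is one reading of the description above, not the code from the commit: the `written` dictionary stands in for files on disk, and `build_chart` and the file names are hypothetical.

```python
written = {}  # stands in for files written to disk, keyed by file name


def build_chart():
    """Simulates chart code failing because the replaced mode is missing."""
    raise KeyError("Replaced_mode")


# Earlier cell: names are assigned before the failing call.
file_name = "energy_impact"
plot_title_no_quality = "Sketch of energy impact"
try:
    build_chart()
except KeyError as err:
    written[file_name] = f"{plot_title_no_quality}: failed with {err!r}"

# Buggy later cell: the names are assigned only after the failing call,
# so the handler reuses the stale values from the earlier cell.
try:
    build_chart()
    file_name = "emission_impact"
    plot_title_no_quality = "Sketch of emission impact"
except KeyError as err:
    written[file_name] = f"{plot_title_no_quality}: failed with {err!r}"

buggy_files = sorted(written)  # the emission output was never written

# Fixed later cell: assign the names before anything that can raise.
file_name = "emission_impact"
plot_title_no_quality = "Sketch of emission impact"
try:
    build_chart()
except KeyError as err:
    written[file_name] = f"{plot_title_no_quality}: failed with {err!r}"

fixed_files = sorted(written)
print(buggy_files, fixed_files)
```

In the buggy cell the error output overwrites the earlier cell's entry, which matches the symptom of a chart file that is missing altogether.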

This has been handled here: 32623a1

This led me to look into the stacked bar charts as well, which showed no information when the trip data had no replaced_mode values.
This has been handled here: add11f1


2 participants