plot_animated throws lots of unnecessary warnings when data has many columns #37

Open
hraftery opened this issue Oct 9, 2021 · 0 comments
Labels
help wanted Extra attention is needed

hraftery commented Oct 9, 2021

Describe the bug
There are two issues here, but it turns out their cause and fix overlap, so I've included them both.

  • Issue 1

plot_animated outputs two warnings with text: "UserWarning: FixedFormatter should only be used together with FixedLocator". Turns out this is a known gotcha when using set_yticklabels or set_xticklabels. Although the warning is innocuous in this case (its purpose is described here), it's alarming for the user of pandas_alive, and is easy to suppress.

The "fix" is to call ax.set_xticks(ax.get_xticks()) before the call to set_xticklabels, and similarly for the y-axis.

More info here and here.
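
As a rough illustration of the fix described above, here is a minimal standalone sketch, independent of pandas_alive. The helper name relabel_x_ticks and the label format are made up for this example; only get_xticks, set_xticks and set_xticklabels are real matplotlib calls.

```python
import warnings

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt


def relabel_x_ticks(ax, fmt="{:g} units"):
    """Hypothetical helper: relabel the current x ticks without the tick-label warning."""
    ticks = ax.get_xticks()
    # Pinning the ticks installs a FixedLocator, which is what set_xticklabels expects.
    ax.set_xticks(ticks)
    # One label per pinned tick, so the label count always matches the tick count.
    ax.set_xticklabels([fmt.format(t) for t in ticks])
    return ticks


fig, ax = plt.subplots()
ax.barh(["a", "b", "c"], [3, 1, 2])

# Record every warning raised while relabelling, then keep the locator-related ones.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ticks = relabel_x_ticks(ax)

locator_warnings = [w for w in caught if "FixedLocator" in str(w.message)]
print(len(locator_warnings))  # no tick-label warning is expected once the ticks are pinned
plt.close(fig)
```

One side effect to be aware of: pinning the ticks this way may widen the axis limits slightly so that every pinned tick is visible.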

  • Issue 2

If the data to be plotted has many columns (more than about 60), then plot_animated outputs dozens of warnings with text like:

/usr/local/lib/python3.9/site-packages/matplotlib/backends/backend_agg.py:201: RuntimeWarning: Glyph 157 missing from current font.
font.set_text(s, 0, flags=flags)

where "157" varies from 128 to beyond 157.

This was really hard to pinpoint, but it turns out to be because a fake list of column headings is generated when the plot is first created, by stepping through character codes starting at 70 ("F"), one per column in the data. If there are too many columns, it runs right off the end of the ASCII range, and the standard matplotlib fonts do not have a glyph for character codes beyond 127.
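
To make the overflow concrete, here is a small pure-Python sketch of the same placeholder scheme. The function names are made up for illustration; the chr(i + 70) expression is taken from the patch further down.

```python
def fake_column_labels(n_columns, start=70):
    """Mimic the placeholder headings: one character per column, starting at chr(70) == 'F'."""
    return [chr(i + start) for i in range(n_columns)]


def first_non_ascii_index(labels):
    """Return the index of the first label past the ASCII range (code > 127), or None."""
    for index, label in enumerate(labels):
        if ord(label) > 127:
            return index
    return None


print(fake_column_labels(3))                              # ['F', 'G', 'H']
print(first_non_ascii_index(fake_column_labels(58)))      # None: the last code is 127
print(first_non_ascii_index(fake_column_labels(100)))     # 58: chr(128) is the 59th label
```

So under this scheme the 59th column is the first to get a placeholder outside ASCII, which lines up with the "more than about 60" threshold observed above.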

To Reproduce
This method is derived from the pandas_alive author's article here. In other words, this is pretty typical of a "my-first-pandas_alive-animation".

import pandas as pd
import matplotlib.pyplot as plt
import pandas_alive
from IPython.display import HTML
import urllib.request, json
from datetime import datetime

NSW_COVID_19_CASES_BY_LOCATION_URL = "https://data.nsw.gov.au/data/api/3/action/package_show?id=aefcde60-3b0c-4bc0-9af1-6fe652944ec2"
with urllib.request.urlopen(NSW_COVID_19_CASES_BY_LOCATION_URL) as url:
    data = json.loads(url.read().decode())

data_url = data["result"]["resources"][0]["url"]
df = pd.read_csv(data_url)

df['lga_name19'] = df['lga_name19'].fillna("Unknown")
df['notification_date'] = pd.to_datetime(df['notification_date'])
df_grouped = df.groupby(["notification_date", "lga_name19"]).size()

df_cases = pd.DataFrame(df_grouped).unstack()
df_cases.columns = df_cases.columns.droplevel().astype(str)
df_cases = df_cases.fillna(0)
df_cases.index = pd.to_datetime(df_cases.index)

animated_html = df_cases.plot_animated(n_visible=15)

Expected behavior
Following an introductory tutorial and using the library as intended should not produce dozens of warnings. Seeing so many warnings leaves the newbie feeling like they've done something wrong.

Additional context
Here is a patch for pandas_alive/charts.py that fixes the two issues. If this fix is suitable, I can create a PR, or two if you'd like to separate the issues.

214c214
<         fake_cols = [chr(i + 70) for i in range(self.df.shape[1])]
---
>         fake_cols = [chr(i + 70) for i in range(self.n_visible)]
218c218
<             ax.barh(fake_cols, [1] * self.df.shape[1])
---
>             ax.barh(fake_cols, np.ones(len(fake_cols)))
221c221,225
<             ax.set_yticklabels(self.df.columns)
---
>             # Before the labels are set, convince matplotlib not to throw user warning about FixedLocator and FixedFormatter
>             # Added by HR211009, inspired by https://github.com/matplotlib/matplotlib/issues/18848#issuecomment-817098738
>             ax.set_xticks(ax.get_xticks())
>             ax.set_yticks(ax.get_yticks())
>             ax.set_yticklabels(self.df.columns[:len(fake_cols)])
224c228
<             ax.bar(fake_cols, [1] * self.df.shape[1])
---
>             ax.bar(fake_cols, np.ones(len(fake_cols)))
227c231,235
<             ax.set_xticklabels(self.df.columns, ha="right")
---
>             # Before the labels are set, convince matplotlib not to throw user warning about FixedLocator and FixedFormatter
>             # Added by HR211009, inspired by https://github.com/matplotlib/matplotlib/issues/18848#issuecomment-817098738
>             ax.set_xticks(ax.get_xticks())
>             ax.set_yticks(ax.get_yticks())
>             ax.set_xticklabels(self.df.columns[:len(fake_cols)], ha="right")
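
For anyone who wants to try the idea outside of pandas_alive, here is a rough standalone sketch of the patched horizontal-bar branch. The function init_barh_placeholders is hypothetical and is not part of pandas_alive. The min(...) guard is an extra assumption on my part, covering the case where n_visible is larger than the number of columns; it is not in the patch above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def init_barh_placeholders(ax, columns, n_visible):
    """Hypothetical stand-in for the patched barh setup in charts.py."""
    # Assumption (not in the patch): never create more placeholders than there are columns.
    n_bars = min(n_visible, len(columns))
    # Only n_bars placeholder characters are generated, so small values stay within ASCII.
    fake_cols = [chr(i + 70) for i in range(n_bars)]
    ax.barh(fake_cols, np.ones(n_bars))
    # Pin the current ticks before relabelling, as in the patch.
    ax.set_xticks(ax.get_xticks())
    ax.set_yticks(ax.get_yticks())
    labels = list(columns[:n_bars])
    ax.set_yticklabels(labels)
    return fake_cols, labels


fig, ax = plt.subplots()
columns = [f"LGA {i}" for i in range(100)]  # made-up column names
fake_cols, labels = init_barh_placeholders(ax, columns, n_visible=15)
print(max(ord(c) for c in fake_cols))  # 84, well inside the ASCII range
print(labels[:3])                      # ['LGA 0', 'LGA 1', 'LGA 2']
plt.close(fig)
```

With 100 columns and n_visible=15, only 15 placeholders are created, so no character code gets anywhere near 128.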
hraftery added a commit to hraftery/pandas_alive that referenced this issue Oct 20, 2021
@JackMcKew JackMcKew added the help wanted Extra attention is needed label Mar 10, 2022