Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implemented option to plot pulls #219

Draft
wants to merge 1 commit into
base: dev
Choose a base branch
from
Draft

Implemented option to plot pulls #219

wants to merge 1 commit into from

Conversation

TimLdl
Copy link
Collaborator

@TimLdl TimLdl commented Jun 6, 2023

No description provided.

@cverstege cverstege self-requested a review June 6, 2023 11:41
Copy link
Member

@cverstege cverstege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some general questions and comments about this implementation.

Comment on lines +226 to +233
if self._fit.did_fit and (
self._fit.has_errors or not self._fit._cost_function.needs_errors):
_band_y = self.y_error_band
return target_axes.fill_between(
self.model_line_x,
-_band_y, _band_y,
**kwargs)
return None # don't plot error band if fitter input data has no errors...
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the exact same code as plot_pull_error_band below. Shouldn't the pull error band be normed to one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formatting changes should be in a separate commit or can be omitted and then be done as a full codebase refactor. As far as I can tell, the changes in this file are formatting only.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formatting changes should be in a separate commit or can be omitted and then be done as a full codebase refactor. As far as I can tell, the changes in this file are formatting only.

Comment on lines +179 to +181
def plot_pull(self, target_axes, error_contributions=('data',), **kwargs):
raise TypeError("Pull cannot be plotted for unbinned fits.")

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this function implemented here and not in the other PlotAdapters? Will it ever be called? I don't think so, because it inherits from base and not from XYPlotAdapter.
Correct me if I'm wrong.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The argument for pull plots is added to the base class. Without this addition you get The following error when trying to do an unbinned pull plot:

Traceback (most recent call last):
  File "/home/johannesg/Projects/kafe2/examples/010_unbinned_fit/qr_skwjYt.py", line 72, in <module>
    plot.plot(fit_info=True, asymmetric_parameter_errors=True, pull=True)  # plot the data and the fit
  File "/home/johannesg/Projects/kafe2/kafe2/fit/_base/plot.py", line 1380, in plot
    _plot_results = self._plot_and_get_results(
  File "/home/johannesg/Projects/kafe2/kafe2/fit/_base/plot.py", line 872, in _plot_and_get_results
    _artist = _pdc.call_plot_method(_pt,
  File "/home/johannesg/Projects/kafe2/kafe2/fit/_base/plot.py", line 389, in call_plot_method
    return _callable(
  File "/home/johannesg/Projects/kafe2/kafe2/fit/_base/plot.py", line 685, in plot_pull
    (self.data_y - self.model_y) / self._get_total_error(error_contributions),
  File "/home/johannesg/Projects/kafe2/kafe2/fit/unbinned/plot.py", line 59, in data_y
    raise TypeError("There's no y-data in the unbinned container")
TypeError: There's no y-data in the unbinned container```

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we shouln't remove code from existing examples. But we can add of course. If this is for testing only, then please remove it from the commit.
There should be an example showcasing the pulls of course.

@@ -659,6 +669,25 @@ def plot_residual(self, target_axes, error_contributions=('data',), **kwargs):
**kwargs
)

def plot_pull(self, target_axes, error_contributions=('data',), **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the default for the shift (y location) should be the model error contribution, and the default for the y_err of the errorbar should be the data error.

@@ -1398,6 +1439,27 @@ def plot(self, legend=True, fit_info=True, asymmetric_parameter_errors=False,
else:
_axis.set_ylim(residual_range)

if pull:
_axis = self._current_axes['pull']
_pull_label = kc('fit', 'plot', 'pull_label')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no default value implemented. This needs to be added here

residual_label: 'Residual'

Same as the residual label.

@GuenterQuast
Copy link
Member

Hallo Zusammen,

ich habe den Pull-Plot angeschaut und war auch etwas verwirrt.
Fehlerbalken oder Unsicherheitsband ergeben keinen Sinn, weil sie eine andere Skala benötigen
als die Pull-Größe, die in Einheiten der Unsicherheit gemessen wird.

Der Vorteil eines Echten Pull-Plots wären Einfachheit und Klarheit:
lediglich die Abweichung eines jeden Punktes von Null muss gezeigt werden.
Dazu reicht ein Balkendiagramm; man kann auch die 1- und 2- Sigmabereiche
z.b. grün und gelb einfärben (s. z.B. https://pyhf.github.io/pyhf-tutorial/PullPlot.html).

Grüße,
Günter

@JohannesGaessler
Copy link
Collaborator

When you make a PR please write at least one short sentence that describes the changes for bookkeeping. Also attaching a plot would be useful. This is the plot that I got when running the example that you modified:

Figure_1

I didn't look at the code yet but I can already tell that there is an issue. The label for the pull plot is the same as for the regular plot but that is incorrect since the values shown there are not the same. I think the label should simply be "Pull" in equivalence to ratio plots where it's simply "Ratio".

Copy link
Collaborator

@JohannesGaessler JohannesGaessler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As Cedric said, please don't reformat unrelated parts of the code when you make functional changes. It makes it more difficult for us to review your changes. Using my IDE I can filter these changes out but on Github you can only do this partially by hiding whitespace changes and those can have actual functional implications for Python so it's not ideal.


:param matplotlib.axes.Axes target_axes: The :py:obj:`matplotlib` axes used for plotting.
:param error_contributions: Which error contributions to include when plotting the data.
Can either be ``data``, ``'model'`` or both.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of quotation marks is inconsistent here. Since it seems to be the result of copy-pasting, please fix it for the other methods as well.

plot_width_share=0.5, font_scale=1.0, figsize=None):
pull=False, pull_range=None, pull_height_share=0.25,
plot_width_share=0.5, figsize=None,
font_scale=1.0):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is your reason for changing the order of figsize and font_scale?

@cverstege
Copy link
Member

cverstege commented Jun 9, 2023

ich habe den Pull-Plot angeschaut und war auch etwas verwirrt. Fehlerbalken oder Unsicherheitsband ergeben keinen Sinn, weil sie eine andere Skala benötigen als die Pull-Größe, die in Einheiten der Unsicherheit gemessen wird.

Genau das habe ich mit #219 (comment) gemeint. Das Fehlerband ist in einem Pull-Plot der Definition nach 1. Die Datenpunkte sind die Differenz der Daten mit dem Fit normiert mit der Unsicherheit des Fits (pull = (fit-data)/sigma_fit). Das hat zur Folge, dass das Fit-Fehlerband im Pull Plot dann bei +/-1 sein muss. Die Fehlerbalken der Datenpunkte werden mit der Unsicherheit des Fits skaliert (sigma_pull = sigma_data/sigma_fit). Einfache Fehlerfortpflanzung der Pull-Definition.

@TimLdl TimLdl marked this pull request as draft June 30, 2023 16:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants