Fix shifts in plot_fancy_dataspecs#2484

Open
enocera wants to merge 10 commits into master from fix_shifts

@enocera enocera commented Jun 14, 2026

Contributor

This PR ensures that the shift due to correlated systematic uncertainties is applied to all theoretical predictions when more than one prediction is compared to the data, for example when using plot_fancy_dataspecs. This PR fixes #2481.

Here are the same plots as in #2481, but with the fix applied.

https://vp.nnpdf.science/TCmkXajzSeq8TGsHVLE4Dw==

@enocera enocera requested a review from scarlehoff June 14, 2026 16:50
@enocera enocera added the bug Something isn't working label Jun 14, 2026

@scarlehoff scarlehoff left a comment

Member

Added a few comments. The general idea is that alpha is only needed for the data so it doesn't need to be part of the output.
And the data doesn't need the shift so it shouldn't go through the costly part of the equation.

If you check for data beforehand, you also avoid having to tell data and theory apart by relying on the order in which they enter the plot, because it is not a given that we won't change the order tomorrow for whatever reason (and the bug would completely escape me, as it did when I reviewed these changes, because it didn't occur to me to test with more than one prediction...)

Comment thread validphys2/src/validphys/dataplots.py Outdated
# For unknown reasons, `shifts_from_systematics` may randomly fail.
# If a LinAlgError is raised, shifts are not included in the final plot.
try:
    shifts, alpha = shifts_from_systematics(lcd_wc, theory_predictions)

Member

I think alpha should not be an output of this function. The complicated/costly part of this function (inverting a matrix, etc.) doesn't need to be done for the data part (which is the only reason you might want to enter this loop with the data), right? I think it would be worth computing alpha in a separate function that takes only lcd_wc (shifts_from_systematics can also call that same function).
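
A minimal sketch of such a split, under stated assumptions: the function names `uncorrelated_uncertainty` and `nuisance_shifts`, their array arguments, and the textbook nuisance-parameter formula used here are illustrative and are not taken from validphys. The sign convention of the real `shifts_from_systematics` may differ.

```python
import numpy as np


def uncorrelated_uncertainty(stat, uncorr_sys):
    """Cheap part: quadrature sum of statistical and uncorrelated systematic errors.

    stat has shape (ndata,), uncorr_sys has shape (ndata, nunc).
    """
    return np.sqrt(stat**2 + np.sum(uncorr_sys**2, axis=1))


def nuisance_shifts(data, theory, alpha, beta):
    """Costly part: solve a linear system for the nuisance parameters.

    beta has shape (ndata, nsys) and holds the correlated systematics.
    Returns the per-point shift sum_k lambda_k * beta_ik.
    """
    weights = 1.0 / alpha**2
    nsys = beta.shape[1]
    # A_kl = delta_kl + sum_i beta_ik * beta_il / alpha_i^2
    a_matrix = np.eye(nsys) + beta.T @ (beta * weights[:, None])
    # rho_k = sum_i (D_i - T_i) * beta_ik / alpha_i^2
    rho = beta.T @ ((data - theory) * weights)
    # np.linalg.solve raises LinAlgError if the matrix is singular
    lambdas = np.linalg.solve(a_matrix, rho)
    return beta @ lambdas


# Hypothetical one-point, one-systematic example
stat = np.array([1.0])
uncorr_sys = np.zeros((1, 1))
alpha = uncorrelated_uncertainty(stat, uncorr_sys)
shifts = nuisance_shifts(np.array([2.0]), np.array([0.0]), alpha, np.array([[1.0]]))
```

With this layout the data branch only ever needs `uncorrelated_uncertainty`, while each theory prediction calls `nuisance_shifts`. The weights diverge when alpha is zero, which is one reason the zero-alpha case needs separate handling.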

Btw, one thing that might reduce the report time is to make the result hashable (it depends only on the commondata and, for a theory prediction, on the theory number) so that the shifts needed for a report are not computed more than once... Just writing it down, but it is probably an issue to tackle later.
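
As a rough illustration of that idea, assuming the computation can be keyed on a dataset name and a theory number: `cached_shifts`, its arguments and the dummy return values below are hypothetical, and validphys may already offer its own mechanism for reusing results.

```python
from functools import lru_cache

calls = []  # records every real (non-cached) computation


@lru_cache(maxsize=None)
def cached_shifts(dataset_name, theory_id):
    """Stand-in for an expensive shift computation keyed on hashable inputs."""
    calls.append((dataset_name, theory_id))
    # A tuple is returned so that the cached value is immutable.
    return (0.1 * theory_id, 0.2 * theory_id)


first = cached_shifts("HYPOTHETICAL_DATASET", 1)
second = cached_shifts("HYPOTHETICAL_DATASET", 1)  # served from the cache
other = cached_shifts("HYPOTHETICAL_DATASET", 2)  # new key, computed again
```

The cache only helps if the key really captures everything the result depends on; cuts that differ between predictions would have to be part of the key as well.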

Contributor Author

I have separated the computation of the shifts from the extraction of the uncorrelated and correlated parts of the uncertainty.

Comment thread validphys2/src/validphys/dataplots.py
Comment thread validphys2/src/validphys/dataplots.py Outdated
Comment on lines +306 to +318
do_shift = False

# Shift theory predictions, but not data
if i >= 1 and do_shift:
    cv[mask] = result.central_value - shifts
else:
    cv[mask] = result.central_value

# Retain only the uncorrelated part of the data error if shifting the data
if i == 0 and do_shift:
    err[mask] = alpha
else:
    err[mask] = result.std_error

Member

Suggested change
- do_shift = False
- # Shift theory predictions, but not data
- if i >= 1 and do_shift:
-     cv[mask] = result.central_value - shifts
- else:
-     cv[mask] = result.central_value
- # Retain only the uncorrelated part of the data error if shifting the data
- if i == 0 and do_shift:
-     err[mask] = alpha
- else:
-     err[mask] = result.std_error
+ shifts = 0.0
+ cv[mask] = result.central_value - shifts
+ err[mask] = result.std_error

This is safe because, when you are dealing with the data, you never enter the if branch.

Contributor Author

Done.

Comment thread validphys2/src/validphys/dataplots.py
@enocera

enocera commented Jun 25, 2026

Contributor Author

@scarlehoff I hope to have correctly addressed your comments. Let me perhaps generate a vp comparefit report to check that everything is in order.

@enocera enocera requested a review from scarlehoff June 25, 2026 16:02
Comment thread validphys2/src/validphys/dataplots.py Outdated
shifts = 0.

cv[mask] = result.central_value - shifts
err[mask] = unco_unc(lcd_wc)

Member

The data/theory check with DataResult is missing here, right? Or does the same need to be done in all cases?

(but then maybe alpha will need to be defined in both paths, in the try and in the except)

Contributor Author

After some thought, I think that if the try fails here we simply don't want to apply the shifts, and we want to fall back to the unshifted case by default (which is not what happens now). Perhaps we should also fall back to this case when alpha is zero (that is, when there is no uncorrelated part of the uncertainty because, e.g., the experimentalists provide only a covariance matrix). Right now we just set the uncertainty displayed in data/theory plots to zero and do not shift the data (while the title says that the data is shifted).

@scarlehoff scarlehoff Jun 26, 2026

Member

I think the way of making sure there is no shift applied is to do

try:
    shifts = shift_calculation()
    alpha = alpha_calculation()
except:
    shifts = 0.0
    alpha = result.std_error

so that it is equivalent to not applying the shift, no?

Contributor Author

Not exactly, because I fear that the label printed on the plot will still be "shifted", since the try sits inside the if on with_shift. I'm checking right now.

Contributor Author

Simple fix: set with_shift to False when the computation fails.
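
A minimal sketch of this fallback, assuming the failure surfaces as a `LinAlgError` as the code comment in the diff suggests. The helper `shifts_with_fallback`, its callable arguments and the label strings are hypothetical and not part of validphys.

```python
import numpy as np


def shifts_with_fallback(compute_shifts, compute_alpha, std_error, with_shift):
    """Return (shifts, data_error, with_shift), falling back to the unshifted case."""
    if not with_shift:
        return 0.0, std_error, False
    try:
        return compute_shifts(), compute_alpha(), True
    except np.linalg.LinAlgError:
        # Equivalent to not applying the shift; the flag is also switched off
        # so that the plot is not labelled as shifted.
        return 0.0, std_error, False


def failing_shifts():
    # Forced failure, used only to exercise the fallback path
    raise np.linalg.LinAlgError("forced failure")


std_error = np.array([0.5, 0.7])
shifts, err, shifted = shifts_with_fallback(
    failing_shifts, lambda: np.array([0.1, 0.2]), std_error, with_shift=True
)
label = "shifted" if shifted else "unshifted"
```

Returning the flag together with the numbers keeps the plot label and the plotted values from drifting apart.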

Comment thread validphys2/src/validphys/dataplots.py Outdated
Comment on lines +288 to +293
shifts = None
alpha = None
do_shift = with_shift

# Compute shifts due to the correlated part on the experiemntal ucnertainty
if do_shift:

Member

Suggested change
- shifts = None
- alpha = None
- do_shift = with_shift
- # Compute shifts due to the correlated part on the experiemntal ucnertainty
- if do_shift:
+ # Compute shifts due to the correlated part on the experimental uncertainty
+ if with_shift:

Contributor Author

Done.

@enocera

enocera commented Jun 26, 2026

Contributor Author

@scarlehoff I have rethought how we apply shifts, based on your comments, which I didn't completely understand at first but hope to have understood now. What I now try to do is separate the uncertainty on the data from the shift to the theory, and identify four cases more clearly: (1) with shift; (2) no shift (default); (3) no shift because, even though the user requests a shift, it cannot be computed since the uncorrelated uncertainties are zero; (4) no shift because, even though the user requests a shift, its computation fails.
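
The four cases can be made explicit in a small sketch. The enum `ShiftCase` and the function `classify` are illustrative names, not code from the PR, and the order of the checks is an assumption.

```python
from enum import Enum

import numpy as np


class ShiftCase(Enum):
    SHIFTED = "with shift"
    NOT_REQUESTED = "no shift (default)"
    ZERO_UNCORRELATED = "no shift: uncorrelated uncertainties are zero"
    COMPUTATION_FAILED = "no shift: shift computation failed"


def classify(with_shift, alpha, failed):
    """Map the user request and the outcome of the computation to one case."""
    if not with_shift:
        return ShiftCase.NOT_REQUESTED
    if np.all(np.asarray(alpha) == 0.0):
        return ShiftCase.ZERO_UNCORRELATED
    if failed:
        return ShiftCase.COMPUTATION_FAILED
    return ShiftCase.SHIFTED
```

Only `ShiftCase.SHIFTED` should lead to shifted theory predictions, a data error equal to alpha, and a "shifted" label; the other three cases all produce the same unshifted plot.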

@scarlehoff

Member

My comments were ill-informed in that I didn't appreciate the fact that two different predictions might have different cuts.

I'll look at the new version of the changes with that in mind. I like the 4 cases though.

# Determine data uncertainty
else:
    alpha = unco_unc(lcd_wc)
    if alpha.all() == 0. or fail:
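
A side note on the condition `alpha.all() == 0.` quoted above, assuming alpha is a NumPy array: `ndarray.all()` is true only when every entry is nonzero, so the comparison is true as soon as at least one entry is zero. The thread does not state which behaviour is intended; the sketch below only contrasts the two readings on made-up arrays.

```python
import numpy as np

all_zero = np.array([0.0, 0.0])
one_zero = np.array([0.0, 0.3])
no_zero = np.array([0.2, 0.3])

# alpha.all() is True only when every entry is nonzero, so
# `alpha.all() == 0.` is True as soon as at least one entry is zero.
at_least_one_zero = [bool(a.all() == 0.0) for a in (all_zero, one_zero, no_zero)]

# Explicit test for "every entry is zero".
every_entry_zero = [bool(np.all(a == 0.0)) for a in (all_zero, one_zero, no_zero)]
```

The two lists differ only for the array with a single zero entry.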

Member

Question: if it fails, it should fail for everyone, right?

Contributor Author

What do you mean?

Member

In a plot we have data, theory1, theory2. Any failure should remove the shifts for everyone, right?

@enocera enocera Jun 29, 2026

Contributor Author

Ah! In that sense. So yes, if it fails for one theory, but not for another, then the plot must collapse to the unshifted version for everything. I hadn't thought about this case which, I think, is not encompassed by the current structure.

Member

Because right now, if I understand it correctly, if the data comes first the fail flag will never be activated?

@enocera enocera Jun 29, 2026

Contributor Author

I think that now theory comes first. The way I understand your comment is the following: suppose you are comparing theory 1 and theory 2. Shifts are set to 0 initially. Then suppose that theory 1 fails because of the error in computing shifts. Then everything falls back to the unshifted version of the plot. Then suppose that theory 2 also fails for the same reason. No problem, everything falls back to the unshifted version of the plot. I understand that you're saying that there is an inconsistency when only one of the two theories fails, irrespective of the order. In the current implementation, the plot never falls back to the unshifted version, because some theories will be shifted and some will not, and the error will be the total error or only the uncorrelated part of the error depending on whether the failing theory is the last (or not). Is this what you are thinking of?
If so, I guess that this can be fixed by initialising fail to False outside the loop on i and by putting try except in an if fail==False.

Contributor Author

BTW, do you remember for which data set the computation of the shifts was failing?

Member

> I understand that you're saying that there is an inconsistency when only one of the two theories fails, irrespective of the order. In the current implementation, the plot never falls back to the unshifted version, because some theories will be shifted and some will not, and the error will be the total error or only the uncorrelated part of the error depending on whether the failing theory is the last (or not). Is this what you are thinking of?

But more generally, the problem is that it depends on the order, and the order is not guaranteed.

> If so, I guess that this can be fixed by initialising fail to False outside the loop on i and by putting try except in an if fail==False.

Yes, probably

> BTW, do you remember for which data set the computation of the shifts was failing?

No ^^U
But you can force the failure.
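
One order-independent way to get the "any failure removes the shifts for everyone" behaviour is to compute all shifts in a first pass and only then fill the plot. This is a sketch under assumptions, not the PR's implementation: `resolve_shifts` and `flaky_shifts` are hypothetical, and the failure is forced by raising `LinAlgError` by hand.

```python
import numpy as np


def resolve_shifts(theories, compute_shifts):
    """First pass: compute shifts for every theory; drop them all on any failure."""
    try:
        shifts = [compute_shifts(theory) for theory in theories]
        return shifts, True
    except np.linalg.LinAlgError:
        return [0.0 for _ in theories], False


def flaky_shifts(theory):
    # Forced failure for the second prediction only, to exercise the fallback.
    if theory == "theory2":
        raise np.linalg.LinAlgError("forced failure")
    return np.array([0.1, -0.2])


shifts, shifted = resolve_shifts(["theory1", "theory2"], flaky_shifts)
shifts_rev, shifted_rev = resolve_shifts(["theory2", "theory1"], flaky_shifts)
```

The data error can then be chosen once, after the first pass: alpha if the returned flag is true, the total error otherwise. The outcome is the same whichever prediction fails and wherever it sits in the list.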

Development

Successfully merging this pull request may close these issues.

Shift in data/theory comparison plots in vp reports

2 participants