ATLAS_WCHARM_13TEV #2337
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| data_central: | ||
| - 12.27 | ||
| - 11.57 | ||
| - 10.41 | ||
| - 9.09 | ||
| - 6.85 | ||
| - 11.87 | ||
| - 11.55 | ||
| - 10.09 | ||
| - 8.6 | ||
| - 6.25 | ||
| - 12.18 | ||
| - 11.77 | ||
| - 10.61 | ||
| - 8.85 | ||
| - 7.22 | ||
| - 12.52 | ||
| - 12.14 | ||
| - 10.29 | ||
| - 8.38 | ||
| - 6.55 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,35 @@ | ||
| """ | ||
| When running `python filter.py` the relevant data and kinematics yaml | ||
| files will be created for the `nnpdf_data/commondata/ATLAS_WCHARM_13TEV` directory. | ||
| """ | ||
|
|
||
| import yaml | ||
| from filter_utils import get_data_values, get_kinematics | ||
| from nnpdf_data.filter_utils.utils import prettify_float | ||
|
|
||
| yaml.add_representer(float, prettify_float) | ||
|
|
||
|
|
||
| def filter_ATLAS_WCHARM_13TEV_data_kinematic(): | ||
| """ | ||
| This function writes the central values and the kinematics to yaml files. | ||
| """ | ||
|
|
||
| central_values = get_data_values() | ||
|
|
||
| kin = get_kinematics() | ||
|
|
||
| data_central_yaml = {"data_central": central_values} | ||
|
|
||
| kinematics_yaml = {"bins": kin} | ||
|
|
||
| # write central values and kinematics to yaml file | ||
| with open("data.yaml", "w") as file: | ||
| yaml.dump(data_central_yaml, file, sort_keys=False) | ||
|
|
||
| with open("kinematics.yaml", "w") as file: | ||
| yaml.dump(kinematics_yaml, file, sort_keys=False) | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| filter_ATLAS_WCHARM_13TEV_data_kinematic() |
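
The `yaml.add_representer(float, prettify_float)` call in the script above controls how floats are written to the yaml files. The sketch below illustrates the mechanism with a hypothetical stand-in, `represent_short_float`. It is not the actual `prettify_float` from `nnpdf_data`, whose formatting rules may differ, and it assumes finite values only. The names `SketchDumper` and `dump_central_values` are also invented for this illustration.

```python
import yaml


def represent_short_float(dumper, value):
    """Hypothetical stand-in for prettify_float.

    Writes fixed-point text with trailing zeros stripped.
    Assumes finite values; anything below 1e-6 in magnitude is rounded to zero.
    """
    text = f"{value:.6f}".rstrip("0")
    if text.endswith("."):
        # keep one decimal digit so the scalar is still read back as a float
        text += "0"
    return dumper.represent_scalar("tag:yaml.org,2002:float", text)


class SketchDumper(yaml.SafeDumper):
    """Private dumper, so the representer does not leak into other yaml.dump calls."""


SketchDumper.add_representer(float, represent_short_float)


def dump_central_values(values):
    """Return the yaml text for a data_central block."""
    return yaml.dump({"data_central": values}, Dumper=SketchDumper, sort_keys=False)


if __name__ == "__main__":
    print(dump_central_values([12.270000000001, 8.6, 13000.0]))
```

Registering the representer on a subclass rather than globally is a design choice of this sketch only; the script above registers it globally, which is fine for a standalone filter.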
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,84 @@ | ||
| """ | ||
| This module contains helper functions that are used to extract the data values | ||
| from the rawdata files. | ||
| """ | ||
|
|
||
| import yaml | ||
| import pandas as pd | ||
| import numpy as np | ||
|
|
||
|
|
||
| def get_data_values(): | ||
| """ | ||
| returns the central data values in the form of a list. | ||
| """ | ||
|
|
||
| data_central = [] | ||
|
|
||
| for i in range(19, 23): | ||
| hepdata_table = f"rawdata/HEPData-ins2628732-v1-Table_{i}.yaml" | ||
|
|
||
| with open(hepdata_table, 'r') as file: | ||
| table = yaml.safe_load(file) | ||
|
|
||
| values = table['dependent_variables'][0]['values'] | ||
|
|
||
| for value in values: | ||
| # store the central value exactly as given in the HEPData table (no unit conversion is applied) | ||
| data_central.append(value['value']) | ||
|
|
||
| return data_central | ||
|
|
||
|
|
||
| def get_kinematics(): | ||
| """ | ||
| returns the kinematics in the form of a list of dictionaries. | ||
| """ | ||
| kin = [] | ||
|
|
||
| for i in range(19, 23): | ||
| hepdata_table = f"rawdata/HEPData-ins2628732-v1-Table_{i}.yaml" | ||
|
|
||
| with open(hepdata_table, 'r') as file: | ||
| table = yaml.safe_load(file) | ||
|
|
||
| for M in table["independent_variables"][0]['values']: | ||
| kin_value = { | ||
| 'abs_eta': {'min': None, 'mid': (0.5 * (M['low'] + M['high'])), 'max': None}, | ||
| 'm_W2': {'min': None, 'mid': 6.46046213e03, 'max': None}, | ||
| 'sqrts': {'min': None, 'mid': 13000.0, 'max': None}, | ||
| } | ||
| kin.append(kin_value) | ||
|
|
||
| return kin | ||
|
|
||
| def decompose_covmat(covmat): | ||
| """Given a covmat, return an array sys with shape (ndat, ndat) | ||
| giving ndat correlated systematics for each of the ndat points. | ||
| The original covmat is recovered as sys @ sys.T""" | ||
|
|
||
| # the covmat is symmetric, so eigh is used: it returns real eigenvalues and eigenvectors | ||
| lamb, mat = np.linalg.eigh(covmat) | ||
| sys = np.multiply(np.sqrt(lamb), mat) | ||
| return sys | ||
|
|
||
| def get_uncertainties(): | ||
| """ | ||
| returns the uncertainties. | ||
| """ | ||
|
|
||
| ndat = 5 | ||
| # Produce covmat of form [[W-/W+],[0], | ||
|
Collaborator
Author
@enocera We are given the covariance matrices for W-/W+ and for W-( We are given the systematics for each point but I am not sure if these include the correlations from this covmat. The covariance matrices are also combined statistical and systematic uncertainty covariance matrices so does this mean that decomposing this
Contributor
Dear @ecole41, after looking into this data set a little, I suggest implementing two variants as far as the uncertainties are concerned. This means that you have to generate two
Collaborator
Author
Thank you, that makes a lot of sense. I just wanted to check the function used to symmetrise the errors. The function in nnpdf_data/nnpdf_data/filter_utils/uncertainties.py sets se_delta to the average of the two errors and relates se_sigma to their difference. Should this be the other way around?
Collaborator
Author
I also wanted to check how we should treat the
Contributor
You are totally right, that function looks wrong, it should be the other way around to be consistent with Eqs. (23)-(24) and (27) of https://arxiv.org/pdf/physics/0403086. We should check how many data sets have been affected by that typo. Thanks.
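
A minimal sketch of the convention this exchange converges on, assuming both errors are passed as positive magnitudes: the shift of the central value is the semi-difference of the two errors, and the symmetric uncertainty is built from their average plus a correction from the semi-difference. The function name `symmetrise_errors` and the exact form of the correction term are assumptions of this sketch and should be checked against the equations cited above; this is not the code in `uncertainties.py`.

```python
import math


def symmetrise_errors(delta_plus, delta_minus):
    """Symmetrise an asymmetric uncertainty +delta_plus / -delta_minus.

    Assumption: both arguments are positive magnitudes.
    Returns (shift, sigma): the shift to add to the central value and
    the symmetric uncertainty to quote afterwards.
    """
    semi_difference = 0.5 * (delta_plus - delta_minus)
    average = 0.5 * (delta_plus + delta_minus)
    shift = semi_difference
    # assumed form: average in quadrature with twice the squared semi-difference
    sigma = math.sqrt(average**2 + 2.0 * semi_difference**2)
    return shift, sigma


if __name__ == "__main__":
    print(symmetrise_errors(0.3, 0.1))
    print(symmetrise_errors(0.2, 0.2))
```

For a symmetric input the shift vanishes and the returned uncertainty equals the input error, which is a quick sanity check on either ordering of the two quantities.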
Contributor
You have to compute |
||
| # [0],[W-*/W+*]] | ||
| covmat = np.zeros((4*ndat, 4*ndat)) # Multiply by 4 because of W+/- and */not * | ||
|
|
||
| def edit_covmat(filename, offset): | ||
| with open(filename) as f: | ||
| data = yaml.safe_load(f) | ||
| flat_values = [v["value"] for v in data["dependent_variables"][0]["values"]] | ||
| matrix = np.array(flat_values).reshape((2 * ndat, 2 * ndat)) | ||
| covmat[offset:offset + 2 * ndat, offset:offset + 2 * ndat] = matrix | ||
|
|
||
| edit_covmat("rawdata/HEPData-ins2628732-v1-Table_16.yaml", offset=0) | ||
| edit_covmat("rawdata/HEPData-ins2628732-v1-Table_18.yaml", offset=2 * ndat) | ||
|
|
||
| sys = decompose_covmat(covmat) | ||
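
The block-diagonal assembly and the decomposition used in `get_uncertainties` can be checked in isolation. The sketch below is a self-contained illustration with toy matrices standing in for the HEPData covariance tables. The helper names `block_diagonal` and `decompose_symmetric` are hypothetical, and the use of `np.linalg.eigh` with clipping of small negative eigenvalues is an assumption about what is appropriate for a symmetric covariance matrix, not a statement about the final implementation.

```python
import numpy as np


def block_diagonal(blocks):
    """Place square blocks along the diagonal of an otherwise zero matrix."""
    size = sum(b.shape[0] for b in blocks)
    full = np.zeros((size, size))
    offset = 0
    for b in blocks:
        n = b.shape[0]
        full[offset:offset + n, offset:offset + n] = b
        offset += n
    return full


def decompose_symmetric(covmat):
    """Return sys_matrix such that sys_matrix @ sys_matrix.T reproduces covmat."""
    eigenvalues, eigenvectors = np.linalg.eigh(covmat)
    # clip tiny negative eigenvalues caused by rounding before taking the square root
    eigenvalues = np.clip(eigenvalues, 0.0, None)
    # scale column j of the eigenvector matrix by sqrt(eigenvalue j)
    return eigenvectors * np.sqrt(eigenvalues)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ndat = 5
    # toy positive semi-definite blocks standing in for the two covariance tables
    blocks = []
    for _ in range(2):
        a = rng.normal(size=(2 * ndat, 2 * ndat))
        blocks.append(a @ a.T)
    covmat = block_diagonal(blocks)
    sys_matrix = decompose_symmetric(covmat)
    print(np.allclose(sys_matrix @ sys_matrix.T, covmat))
```

Each column of the returned array plays the role of one fully correlated systematic across all data points, so the number of columns equals the number of data points.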
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,49 @@ | ||
| setname: ATLAS_WCHARM_13TEV | ||
|
|
||
| nnpdf_metadata: | ||
| nnpdf31_process: DY CC | ||
| experiment: ATLAS | ||
|
|
||
| arXiv: | ||
| url: https://arxiv.org/abs/2302.00336 | ||
| journal: Phys. Rev. D 108 (2023) 032012 | ||
| iNSPIRE: | ||
| url: https://inspirehep.net/literature/2628732 | ||
| hepdata: | ||
| url: https://www.hepdata.net/record/ins2628732 | ||
| version: 1 | ||
|
|
||
| version: 1 | ||
| version_comment: Implementation | ||
|
|
||
| implemented_observables: | ||
| - observable_name: | ||
| observable: | ||
| description: | ||
| label: ATLAS $W^-+c$ 13 TeV | ||
| units: '[pb]' | ||
| process_type: | ||
| tables: [19,20,21,22] # 5/19 (W−+D+), 6/20(W++D−), 9/21(W−+D∗+), 10/22(W++D∗−) | ||
|
Collaborator
Author
@enocera Should these four ((W−+D+), (W++D−), (W−+D∗+), (W++D∗−)) be split into separate observables or be kept as one?
Contributor
These should be four different observables of the same data set, as, e.g., different differential distributions for top pair production.
Collaborator
Author
Ok, and should each of these have separate data, uncertainties and kinematics yaml files? So four of each? |
||
| ndata: | ||
| plotting: | ||
| dataset_label: ATLAS $W^-+c$ 13 TeV | ||
| y_label: 'Differential fiducial cross-section times the single-lepton-flavor W boson branching ratio' #In Latex terms? | ||
|
Collaborator
Author
I'm not sure how to put this in LaTeX terms |
||
| x_label: $|\eta^\ell|$ | ||
| plot_x: abs_eta | ||
| kinematic_coverage: | ||
| kinematics: | ||
| variables: | ||
| abs_eta: | ||
| description: | ||
| label: | ||
| units: '' | ||
| m_W2: | ||
| description: | ||
| label: | ||
| units: | ||
| file: | ||
| data_uncertainties: | ||
| data_central: | ||
| variants: | ||
| legacy: | ||
| data_uncertainties: | ||
@enocera Do we want to have all of the channels ((W−+D+), (W++D−), (W−+D∗+), (W++D∗−)) put together into one dataset like this?
@ecole41 Yes we want a single data set with all the channels. Two remarks.