-
Notifications
You must be signed in to change notification settings - Fork 280
DL3 Event List with EventPreprocessor #3004
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
94a929a
5c305ae
5f0e28c
8f4b48e
8d61836
2905316
d4b0881
add36d6
dd07d55
5985cbe
b7a66e5
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,10 @@ | ||
| Adds a new ``FeatureSet`` to `~ctapipe.io.EventPreprocessor` for DL2 to DL3 | ||
| processing to generate GADF-like event lists. This adds the ability to transform | ||
| coordinates from Alt/AZ to ICRS, and supports applying pre-optimized | ||
| gamma-hadron cuts. | ||
|
|
||
| Also added a ``mode=MARK`` option that keeps all events, but adds boolean | ||
| columns per ``QualityCriterion`` showing which ones pass. This is useful for | ||
| debugging. The default mode is still ``DROP``, which filters events. A | ||
| corresponding API function `~ctapipe.core.QualityQuery.add_table_mask_columns` | ||
| was also added to support this. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,8 +1,10 @@ | ||
| """Module containing classes related to event loading and preprocessing""" | ||
|
|
||
| from enum import Enum, auto | ||
|
|
||
| from astropy.coordinates import angular_separation | ||
|
|
||
| from ..coordinates import altaz_to_nominal | ||
| from ..coordinates import altaz_to_icrs, altaz_to_nominal | ||
| from ..core import ( | ||
| Component, | ||
| FeatureGenerator, | ||
|
|
@@ -123,6 +125,68 @@ def _dl2_irf_config(preprocessor): | |
| } | ||
|
|
||
|
|
||
| @FeatureSetRegistry.register("dl2_to_dl3") | ||
| def _dl2_to_dl3_config(preprocessor: "EventPreprocessor"): | ||
| """Creates a DL3/Event table that conforms to GADF/VODF column naming.""" | ||
| return { | ||
| "features_to_generate": [ | ||
| ("EVENT_ID", "event_id"), | ||
| ("TIME", "time"), | ||
| ("ALT", f"{preprocessor.geometry_reconstructor}_alt"), | ||
| ("AZ", f"{preprocessor.geometry_reconstructor}_az"), | ||
| ( | ||
| "_reco_fov_coord", | ||
| "altaz_to_nominal(AZ, ALT, subarray_pointing_lon, subarray_pointing_lat)", | ||
| ), | ||
| ("FOV_LON", "_reco_fov_coord[:,0]"), | ||
| ("FOV_LAT", "_reco_fov_coord[:,1]"), | ||
| ( | ||
| "_reco_icrs_coord", | ||
| "altaz_to_icrs(AZ, ALT, TIME, subarray.reference_location)", | ||
| ), | ||
| ("RA", "_reco_icrs_coord[:,0]"), | ||
| ("DEC", "_reco_icrs_coord[:,1]"), | ||
| ("ENERGY", f"{preprocessor.energy_reconstructor}_energy"), | ||
| ("GAMMANESS", f"{preprocessor.gammaness_reconstructor}_prediction"), | ||
| ( | ||
| "MULTIP", | ||
| f"subarray.multiplicity({preprocessor.geometry_reconstructor}_telescopes)", | ||
| ), | ||
| ("HMAX", f"{preprocessor.geometry_reconstructor}_h_max"), | ||
| ("_passed_gh", "gammaness_cut_function(GAMMANESS, ENERGY)"), | ||
| ("_passed_multip", "multiplicity_cut_function(MULTIP, ENERGY)"), | ||
| ], | ||
| "quality_criteria": [ | ||
| ("VALID_RECO", f"{preprocessor.geometry_reconstructor}_is_valid"), | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As quality criteria are independant for each reconstructor, I think it would be safer to perform an overall "and" on reconstructor quality cut.
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I will add criteria also for energy and gammaness. It's true they could be different, and it's useful to track
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I also realized I forgot the MULTIP cut, so I added a similar function hook where we can add a function of
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thinking more about it, but perhaps for the future after we refactor the optimization code to use this as well, would be to just copy over the I have to think what would be the cleanest way to do that, while keeping this class a bit generic. I think one option is to expand the In that scenario, I would just have: |
||
| ("VALID_ENERGY", f"{preprocessor.energy_reconstructor}_is_valid"), | ||
| ("VALID_GAMMANESS", f"{preprocessor.gammaness_reconstructor}_is_valid"), | ||
| ("PASSED_GH", "_passed_gh"), | ||
| ("PASSED_MULTIP", "_passed_multip"), | ||
| ], | ||
| "output_features": [ | ||
| "EVENT_ID", | ||
| "TIME", | ||
| "RA", | ||
| "DEC", | ||
| "ENERGY", | ||
| "ALT", | ||
| "AZ", | ||
| "FOV_LON", | ||
| "FOV_LAT", | ||
| "GAMMANESS", | ||
| "MULTIP", | ||
| "HMAX", | ||
| ], | ||
| } | ||
|
|
||
|
|
||
| class EventPreprocessorMode(Enum): | ||
| """Mode of output of EventPreprocessor.""" | ||
|
|
||
| DROP = auto() #: drop events that do not pass | ||
| MARK = auto() #: only mark evens as not passing, adding boolean columns | ||
|
|
||
|
|
||
| class EventPreprocessor(Component): | ||
| """ | ||
| Selects or generates features and filters tables of events. | ||
|
|
@@ -140,6 +204,16 @@ class with the columns you to retain in the output table. | |
| - `~ctapipe.coordinates.altaz_to_nominal` | ||
| """ | ||
|
|
||
| mode = traits.UseEnum( | ||
| EventPreprocessorMode, | ||
| default_value=EventPreprocessorMode.DROP, | ||
| help=( | ||
| "If 'DROP', removes events that do not pass quality cuts. " | ||
| "If 'MARK', generates a new boolean column for each quality criteria, " | ||
| "but keeps all events." | ||
| ), | ||
| ).tag(config=True) | ||
|
|
||
| energy_reconstructor = traits.Unicode( | ||
| default_value="RandomForestRegressor", | ||
| help="Prefix of the reco `_energy` column", | ||
|
|
@@ -176,8 +250,9 @@ class with the columns you to retain in the output table. | |
| ), | ||
| ).tag(config=True) | ||
|
|
||
| def __init__(self, config=None, parent=None, **kwargs): | ||
| def __init__(self, config=None, parent=None, subarray=None, **kwargs): | ||
| super().__init__(config=config, parent=parent, **kwargs) | ||
| self.subarray = subarray | ||
| if self.feature_set == "custom": | ||
| self.feature_generator = FeatureGenerator(parent=self) | ||
| self.quality_query = QualityQuery(parent=self) | ||
|
|
@@ -198,20 +273,38 @@ def __init__(self, config=None, parent=None, **kwargs): | |
| "of features in the configuration (DL2EventPreprocessor.features)." | ||
| ) | ||
|
|
||
| def __call__(self, table): | ||
| """Return new table with only the columns in features.""" | ||
| def __call__(self, table, **other_attributes): | ||
| """ | ||
| Return new table with only the columns in features. | ||
|
|
||
| Parameters | ||
| ---------- | ||
| table: Table | ||
| Table to process | ||
| **other_attributes: Any | ||
| Other functions or objects that the FeatureGenerator should have | ||
| access to, in addition to the default ones. | ||
| """ | ||
|
|
||
| # generate new features, which includes renaming columns: | ||
| generated = self.feature_generator( | ||
| table, | ||
| angular_separation=angular_separation, | ||
| altaz_to_nominal=altaz_to_nominal, | ||
| altaz_to_icrs=altaz_to_icrs, | ||
| subarray=self.subarray, | ||
| **other_attributes, | ||
| ) | ||
|
|
||
| # apply event selection on the resulting table | ||
|
|
||
| selected_mask = self.quality_query.get_table_mask(generated) | ||
|
|
||
| # return only the columns specified in `self.features`, and rows in | ||
| # `selected_mask` | ||
| return generated[self.features][selected_mask] | ||
| if self.mode == EventPreprocessorMode.DROP: | ||
| # return only the columns specified in `self.features`, and rows in | ||
| # `selected_mask` | ||
| selected_mask = self.quality_query.get_table_mask(generated) | ||
| return generated[self.features][selected_mask] | ||
| elif self.mode == EventPreprocessorMode.MARK: | ||
| generated = self.quality_query.add_table_mask_columns(generated) | ||
| return generated[self.features + self.quality_query.criteria_names] | ||
| else: | ||
| raise ValueError("Unsupported mode: {self.mode}") | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be better to return the
SkyCoordobject and then docoord.ra.to(u.deg)in the transform isntead of filling a 2d array and then indexing into that?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was just to output a simple 2D array, not a SkyCoord: astropy.table supports writing columns of SkyCoords, but our
io.write_tablefunction doesn't.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But looking at the examples and the DL3 config, the goal seems to anyway be to get two separate columns? The 2d array is never actually used.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is used: it's a bit confusing maybe, but there is a 2D feature genereated with this function, and then the following two features extract the single columns. THat was the most efficient way to do this without calculating twice.
e.g.:
( "reco_fov_coord", "altaz_to_nominal(AZ, ALT, subarray_pointing_lon, subarray_pointing_lat)", ), # makes the 2D coordinate ("FOV_LON", "reco_fov_coord[:,0]"), # makes single column ("FOV_LAT", "reco_fov_coord[:,1]"), # makes single columnThe 2D column (
reco_fov_coord) is then not included in theEventPreprocessoroutput, so it's temporary only.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what I wrote above, right?
You can just leave the skycoord and then in the feature generator you access its fields instead of indexing into the array.