ENH: extract nominal ray angle by default for ODIM backend#383
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #383 +/- ##
==========================================
+ Coverage 94.20% 94.22% +0.02%
==========================================
Files 29 29
Lines 6417 6447 +30
==========================================
+ Hits 6045 6075 +30
Misses 372 372
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
I would say that your PR message needs to be human-generated. LLMs tend to be extremely verbose in their messages, to the point of confusion. Then, I think it shows that you understand your PR and helps us as maintainers understand it better. |
|
Thank you for the comment. I updated the PR for clarity. I stressed this is team effort with AI. I take responsibility for the text. The relatively long review, which seems valid, is actually not part of the PR. It is for discussion purposes related to AI usage in review. |
kmuehlbauer
left a comment
There was a problem hiding this comment.
Thanks @egouden for this PR.
I'll just add my reasonings here a second time:
_This actually looks like a fast path of reindexing, which should be implemented anyway. How we could achieve this:
Use reindex_angle=True and return nominal angles IF the suggested fast path (extracting nominal rays from startazA/astart) would work. If the fast path does not work (missing/duplicate rays etc. pp) we go the usual way returning measured angles + reindexing.
If we want to switch behaviour and return nominal angles by default, we need to switch the default to "reindex_angle=True"_
| return np.round(np.nanmean(angle_diff_wanted), decimals=2) | ||
|
|
||
|
|
||
| def _get_azimuth_how(how): |
There was a problem hiding this comment.
Please leave these two functions here, instead of copying around.
| except (AttributeError, KeyError, TypeError): | ||
| azimuth = _get_azimuth_where(self.where) | ||
| nrays = self.where["nrays"] | ||
| if self._azimuth_mode == "nominal" and nrays in [180, 360, 720]: |
There was a problem hiding this comment.
Can we move this into a dedicated function, like _get_azimuth_nominal().
I'd also move the check for [180, 360, 720] inside that function. Why do we want to restrict this to these numbers? ODIM_H5 doesn't mention anything related.
It looks like this is implementation detail and only works if all rays of a certain layout are available. Otherwise it will fall back.
There was a problem hiding this comment.
A dedicated function has been created. The check for standard nrays is to catch malformed scans, which by definition are not nominal. There might be exotic well formed scans outside these numbers. I see no easy other way to identify malformed scan.
| astart = self.how["startazA"][0] | ||
| astart = min(astart, astart - 360, key=abs) | ||
| astart = np.round(astart / (ascale / 2)) * ascale / 2 |
There was a problem hiding this comment.
Good, this gets us a value what astart is described in ODIM_H5:
Azimuthal offset from 0◦ of the start of the first ray in the sweep. This value is positive where the gate starts clockwise after 0◦, and it will be negative if it starts before 0◦. In either case, the value must be no larger than half a ray’s width.
There was a problem hiding this comment.
Yes and the issue is that astart is complex and can be wrong, hence the ODIM solution of mandatory startazA.
| astart = np.round(astart / (ascale / 2)) * ascale / 2 | ||
| except (KeyError, TypeError, AttributeError): | ||
| try: | ||
| astart = self.how["astart"] |
There was a problem hiding this comment.
So, why falling back to this here? According to the standard astart is optional whereas startazA is mandatory.
I've skimmed the available standards (https://www.eumetnet.eu/observations/weather-radar-network/, and click on "Read More" on OPERA Data Information Model) and startazA is recommended since 2.1 and mandatory since 2.4. astart was first mentioned in 2.2.
Maybe check for astart first (easy) and fallback to startAzA?
| "xradar: No startazA or astart found, using 0 as default." | ||
| ) | ||
| astart = 0 | ||
| azimuth = np.arange(astart + ascale / 2, 360, ascale) |
There was a problem hiding this comment.
Do we need to check if the correct number of azimuth angles is created here? Or could it happen, that len(azimuth) != nrays for some reason?
There was a problem hiding this comment.
No, it should always be correct, considering startzaA is within (lazy) specifications, of course, i.e. not below -0.75 or above 0.5.
| astart = 0 | ||
| azimuth = np.arange(astart + ascale / 2, 360, ascale) | ||
| else: | ||
| if nrays not in [180, 360, 720]: |
There was a problem hiding this comment.
Move that nrays check into above suggested function.
| try: | ||
| azimuth = _get_azimuth_how(self.how) | ||
| except (AttributeError, KeyError, TypeError): | ||
| azimuth = _get_azimuth_where(self.where) |
There was a problem hiding this comment.
In light of the new nominal azimuth approach it might be time to refactor this try/except. Can be done in a later PR.
| If True, fixes erroneous second angle data. Defaults to ``False``. | ||
| site_as_coords : bool | ||
| Attach radar site-coordinates to Dataset, defaults to ``True``. | ||
| azimuth_mode : str |
There was a problem hiding this comment.
We need to preserve backwards compatibility. So "nominal" can't be the default.
There was a problem hiding this comment.
The API is the same. I do not think this will break code. I will repeat my rationale. Most users are expecting nominal angles. The radar data should be nominal by design. This is in my opinion enforced in the FM301 specification. This issue can be considered a flaw in ODIM format design (i.e. not having nominal astart mandatory). The old behavior is preserved as an option. This is one of the thing i would like to see fixed before 1.0. I would like to hear the opinion of other team members on this. Regarding your main comment, this is in my view not really about reindexing angles which is more a post processing approach (i used it too later in my workflow).
There was a problem hiding this comment.
It's not about what most users expect, it's about what current users get when running their workflows. Currently users get measured angles. If we want to return nominal angles by default instead of measured angles we need to properly announce that change or better go through a deprecation cycle. One could claim that we are still in version 0.12 as of now, so we could change anything anytime. But I would really avoid that.
One more thing, would we want to have nominal angles for RHI, too?
There was a problem hiding this comment.
I understand the concern for stability. I think fixing this properly would have a positive impact on both old and new users.
To be consistent with FM301, all angles should be nominal. Are there other readers returning measured angles?
There was a problem hiding this comment.
Most readers return measured angles if they are available. I'm on board with returning nominal angles by default. But we should keep behaviour consistent and introduce the change via deprecation cycle. This would also have the benefit of properly announcing this new scheme.
We could start announcing it like this:
reindex_angle=False: DeprecationWarning("Default angle output will change in a future version, nominal angles (reindex_angle="nominal") will become the default. To silence this warning and keep current measured angles, set reindex_angle="raw" now.")"
Then we implement the fast path along your implementation for reindex_angle="nominal". The error message for non-conforming sweeps should state that the user has to either provide a fitting reindex_dict or use reindex_angle="raw" to return raw measured angles.
I think this is the easiest and fastest way without cluttering the API with yet another keyword.
Let's assume "nominal_angles" as our new default. Any reindex operation which repairs/fixes issues with the data will also return nominal angles. So, we could just alias the current reindex_angle with eg. just angle.
angle="nominal" # get nominal angle
angle="raw" # get raw (measured) angle
angle=dict(direction=1, angle_res=1., start_angle=0, stop_angle=360) # get reindexed nominal angles
WDYT?
There was a problem hiding this comment.
This sounds like a great plan! I am always in favor to keep it simple. I guess it is more efficient if you take over from this point in a new integrated PR? I can help of course.
There was a problem hiding this comment.
I'll try to re-use as much of your implementation as possible and have a PR ready soon.
| invalid_netcdf=invalid_netcdf, | ||
| phony_dims=phony_dims, | ||
| decode_vlen_strings=decode_vlen_strings, | ||
| azimuth_mode=azimuth_mode, |
There was a problem hiding this comment.
So, after digging through this, we might do this a bit different.
What this PR aims to:
If nrays in [180, 360, 720] and azimuth_mode="nominal" return nominal azimuth angles.
This actually looks like a fast path of reindexing, which should be implemented anyway. How we could achieve this:
-
Use
reindex_angle=Trueand return nominal angles IF the suggested fast path (extracting nominal rays fromstartazA/astart) would work. If the fast path does not work (missing/duplicate rays etc. pp) we go the usual way returning measured angles + reindexing. -
If we want to switch behaviour and return nominal angles by default, we need to switch the default to
"reindex_angle=True"
This would be a minimal change and would also speed up reading well-formed data when reindexing is requested.
There was a problem hiding this comment.
I was suggesting to return nominal angles per default (point 2 above) without adding the azimuth_mode kwarg at the backend level.
Example 1 (perfect file, 360 rays):
This currently returns measured angles (default reindex_angle=False). To return nominal angles users have to use reindex_angle=reindex_dict. I'm proposing to use reindex_angle=True as a means of "try to return nominal angle" and return nominal angles directly from the odim reader.
Example 2 (broken file, 361 rays):
This currently returns measured angles. To return a repaired version with nominal angles users have to use reindex_angle=reindex_dict. I'm proposing to use reindex_angle=True as above and return an error in this case (malformed data).
Then, in a deprecation cycle we would switch from the default reindex_angle=False to reindex_angle=True. Then users always get nominal angles directly from the odim reader (fast path) side stepping the reindexing.
I'm open to change reindex_angle kwarg to a more fitting name.
There was a problem hiding this comment.
I understand now. nominal_angle=true?
There was a problem hiding this comment.
I'm still considering the naming, reindex_angle does not fit any more, azimuth_mode does not cover for RHI, angle_mode does not reflect the possible mapping input. Pure angle might be a bit too implicit. I'm tending to use angle_spec.
My current idea is to move along those lines:
angle_spec = "nominal" # try to return nominal angles, raise if not possible
angle_spec = "raw" # just return mid point angles from whatever data is in the file
angle_spec = "auto" # `extract_angle_parameters`, reindex and return nominal angles
angle_spec = reindex_mapping # reindex with given mapping, return nominal angles
Still checking possible best data flow...
|
The changes and comments are mine only. There was already a test with nominal 361 rays. I added one test with nominal rays but a double ray (with the help of grok 4.3). |
EXPERIMENTAL PR
This PR was produced together with DeepSeek V4 Flash Free (OpenCode Zen). I provided guidance on the goal and expectations. The main implementation has been made by me 6 months ago without AI. I needed AI help for review and the new kwargs and tests. I did not check these as my experience is limited.
Our summary:
Adds azimuth_mode option (nominal/per_ray) to ODIM backend with fallback for non-standard nrays.
Our rationale:
Nominal azimuth (uniform sweep from startazA[0]) is the default because it's more robust: per-ray startazA/stopazA often have antenna jitter that isn't meaningful, and most downstream algorithms (gridding, interpolation) assume uniform spacing anyway. The old per-ray method is preserved as a fallback for non-standard nrays, and users can opt into it explicitly via azimuth_mode="per_ray" if they need the raw midpoints.
EXPERIMENTAL REVIEW
In depth review by mistral-medium, a bigger european model
In-Depth Review: Commit fe06f12
Summary: Adds azimuth_mode parameter to ODIM backend, defaulting to "nominal" mode which extracts uniform ray angles for standard sweep configurations (180/360/720 rays) instead of calculating per-ray midpoints.
The Problem Being Solved
ODIM files store azimuth information in two ways:
1.
Per-ray: startazA and stopazA arrays defining start/stop angles for each ray
2.
Nominal/uniform: Standard PPI sweeps with uniformly spaced rays (180, 360, 720 rays)
The previous implementation always calculated per-ray midpoints from startazA/stopazA, which could introduce small inconsistencies or unnecessary complexity for standard uniform sweeps. Many radar files, especially from operational networks, use standard sweep patterns where azimuth angles are perfectly uniform.
Implementation Assessment
⚠️ Considerations:
✅ Strengths:
1.
Backward Compatible: Default behavior now prefers nominal angles, but:
•
Falls back to per-ray calculation for non-standard ray counts
•
Explicit azimuth_mode="per_ray" restores old behavior
•
All existing code continues to work
2.
Well-Architected: The azimuth_mode parameter is properly threaded through:
•
_OdimH5NetCDFMetadata.init
•
OdimSubStore.init
•
OdimStore.init and .open()
•
OdimBackendEntrypoint.open_dataset()
3.
Sensible Defaults:
•
Standard ray counts (180, 360, 720) cover most operational radars
•
Starting angle calculated from first startazA value or astart attribute
•
Graceful fallback with clear warnings
4.
Good Error Handling:
•
Warnings for missing startazA/astart (defaults to 0)
•
Warnings for non-standard ray counts
•
try/except blocks for missing attributes
1.
Magic Numbers: The hardcoded [180, 360, 720] list is pragmatic but inflexible. Some radars use 450, 900, or other counts. Consider making this configurable.
2.
Angle Rounding Logic:
astart = np.round(astart / (ascale / 2)) * ascale / 2
This snaps the start angle to the nearest half-step. Good for handling minor discrepancies, but could be documented better.
3.
Warning Message: "Unexpected number of rays" might be confusing. Consider "Non-standard number of rays" or similar.
4.
Dtype: The nominal azimuth uses default numpy dtype (float64), while _get_azimuth_where used float32. Minor inconsistency.
Overall Verdict: Good Implementation (8.5/10)
The implementation is well-designed, backward-compatible, and solves a real problem. The choice of defaulting to nominal mode is reasonable for the target use case (ODIM operational radar data).
Minor improvements for future consideration:
•
Make standard ray counts configurable
•
Clarify warning messages
•
Document the angle snapping logic
•
Consider dtype consistency
The commit represents a thoughtful enhancement that improves data consistency for the most common use cases while maintaining full backward compatibility.