FIX: NEXRAD Level 2 LDM stride drops MSG_31 records (#376)#377
FIX: NEXRAD Level 2 LDM stride drops MSG_31 records (#376)#377aladinor wants to merge 7 commits into
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #377 +/- ##
==========================================
+ Coverage 94.18% 94.20% +0.02%
==========================================
Files 29 29
Lines 6416 6440 +24
==========================================
+ Hits 6043 6067 +24
Misses 373 373
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
736caa7 to
637928d
Compare
637928d to
610a5fb
Compare
|
@aladinor So for proper testing, we would need to have to import this specific file into open-radar-data? |
NEXRADRecordFile.init_record hard-coded a 120-message stride between LDM-compressed records. The NEXRAD ICD 2620010J §7.3.4 mandates a *variable* count: each LDM holds 120 radial messages (MSG_31) plus 0 or more RDA Status messages (MSG_2). When MSG_2 is interleaved with the 120 MSG_31s, xradar's mod-120 budget exhausts on the first 118 MSG_31 + 2 MSG_2, then jumps to the next LDM, silently dropping the trailing 2 MSG_31s. On KILX20230629_154426_V06, sweep_10 (elev_num=11) reports 358 rays on-wire-correct 360. The two missing radials sit at azimuth_numbers 119 and 120 in LDM 49 (122 messages = 120 MSG_31 + 2 MSG_2). Independent bzip2 byte-walks, danielway/nexrad 1.0.0-rc.7, and radish's internal decoder all read 6840 records on this fixture; xradar reports 6838. Replace the synthetic (recnum - 134) // 120 LDM lookup and the (recnum - 134) % 120 in-LDM offset with a true byte-walk: - _load_ldm(idx): decompress + cache LDM idx (extracted from init_record's previous inline block). - _resolve_compressed_data_position(recnum): compute (ldm, byte_offset) via cache-hit / first-compressed-call / sequential-advance branches. Cross-LDM advancement detects end-of-LDM by buffer length, not message count. - init_record: three-branch dispatch (metadata / uncompressed-data / compressed-data) with a shared trailer. The recnum-position cache is populated on every successful walk, so non-sequential init_record(stored_recnum) jumps from get_data's per-moment replay (line 775) restore from cache. Extend test_bz2_compressed_buffer_path_real to assert the cold-entry at recnum=134 transitions cleanly into a sequential walk — pins the byte-walker contract against future regressions.
610a5fb to
ecb1909
Compare
|
@kmuehlbauer Yes — to pin the actual symptom of #376 against the file that exposed it ( What this PR currently has:
What's missing (waiting for fixture):
I'll open a PR against |
|
@aladinor open-radar-data v0.8.0 release is out. Please add the necessary changes. Thanks! |
Address @kmuehlbauer's review on openradar#377. KILX20230629_154426_V06 has landed in open-radar-data 0.8.0 (openradar/open-radar-data#98), so we can now ship a targeted regression test against the exact file that exposed the bug. LDM 49 of this file contains 120 MSG_31 + 2 MSG_2 = 122 messages. Pre-fix the mod-120 stride dropped the trailing 2 MSG_31s, so sweep_10 reported 358 rays (on-wire-correct is 360) and the volume total was 6838 instead of 6840. - Add ``nexradlevel2_kilx_ldm_stride_file`` session fixture in ``tests/conftest.py``. - Add ``test_kilx_over_120_ldm_decodes_all_msg31`` asserting per-sweep ray counts == [720]*6 + [360]*7, total == 6840, and sweep_10 == 360 against the file. - Bump open-radar-data minimum version to 0.8.0 across ``requirements_dev.txt``, ``environment.yml``, ``ci/unittests.yml``, and ``ci/notebooktests.yml``.
|
@kmuehlbauer KILX20230629_154426_V06 just landed in open-radar-data 0.8.0 — added the targeted regression test in 21fa5fb:
LDM 49 of that file contains 120 MSG_31 + 2 MSG_2 = 122 messages, which is exactly the configuration that triggered the mod-120 stride bug. The test fails on the unfixed code ( |
|
@aladinor We still need to wait for conda forge. I'll add the new version now |
- Drop redundant nex.get_data_header() call — msg_31_header property auto-triggers it (xradar/io/backends/nexrad_level2.py:616-625) - Rename fixture/test to drop the kilx site prefix; the LDM stride is the salient property, not the ICAO - Trim redundant sum() and per_sweep[10] assertions — list-equality subsumes them and pytest's diff already pinpoints any regressed index - Drop "the formerly-truncated sweep" inline comment already covered by the docstring
The xarray-nightly job in .github/workflows/ci.yml hard-codes its own conda create-args list (rather than reading from ci/unittests.yml), so the earlier 0.7.0 → 0.8.0 bump didn't reach it. The job loaded an older open-radar-data without the KILX fixture, causing test_ldm_stride_decodes_all_msg31 to error in the registry lookup.
|
@kmuehlbauer, let me know if you have additional comments |
kmuehlbauer
left a comment
There was a problem hiding this comment.
@aladinor There is quite some code shuffle here ;-) I'll have another look next week.
One thing, this nexrad notebook doesn't work when the fix is applied: https://openradar-docs--377.org.readthedocs.build/projects/xradar/en/377/notebooks/nexrad_read_chunks.html.
Summary
Closes #376.
NEXRADRecordFile.init_recordhard-codes a 120-message stride between LDM-compressed records. NEXRAD ICD2620010J§7.3.4 mandates a variable count: each LDM Compressed Record contains "120 radial messages (type 31) plus 0 or more RDA Status messages (type 2)". When MSG_2 is interleaved alongside the 120 MSG_31s, the mod-120 budget exhausts on118 MSG_31 + 2 MSG_2, and xradar jumps to the next LDM, dropping the trailing MSG_31s.Reproducer
s3://unidata-nexrad-level2/2023/06/29/KILX/KILX20230629_154426_V06. LDM 49 of this file contains 122 messages (120 MSG_31 + 2 MSG_2). xradar reportssweep_10 = 358; the file actually has 360 (verified by directbz2.decompressbyte-walk and bydanielway/nexrad).danielway/nexrad1.0.0-rc.7bz2.decompressbyte-walkImplementation
Replace the synthetic mod-120 stride with a byte-walk that detects end-of-LDM by buffer length, not message count.
_load_ldm(idx)— decompress + lazy-cache LDMidx(extracted from the previous inline block, no behavior change)._resolve_compressed_data_position(recnum)— return(ldm_idx, byte_offset)via:get_dataper-moment replay),self.filepos + self.record_size, crossing into the next LDM when the current buffer can no longer hold a message header.init_record— three-branch dispatch (metadata / uncompressed-data / compressed-data) with a shared trailer.Test plan
tests/io/test_nexrad_level2.pypasssweep_10 = 360, total = 6840sweep_10.azimuth = 360, max azimuth gap 1.044°test_bz2_compressed_buffer_path_realto pin the cold-entry → sequential-walk transition (+2 lines)black --checkandruff checkcleanNote on regression test
None of the existing fixtures (KATX, KLIX, KLBB, KLBB.gz) trigger this bug — only metadata LDMs in those files have >120 messages, which goes through the (correct) metadata path. A targeted regression test against
KILX20230629_154426_V06is tracked as a follow-up once the file lands inopen-radar-data.Spec citations
2620010J§7.3.4 (LDM Compressed Record): the per-LDM payload is variable — 120 radials + 0 or more RDA Status messages.2620010J§7.3.6.1.1: 2432-byte fixed framing for non-MSG_31/29 messages — already handled byget_end(). The within-LDM walk was always correct; only the cross-LDM stride was wrong.