Skip to content

FIX: NEXRAD Level 2 LDM stride drops MSG_31 records (#376)#377

Open
aladinor wants to merge 7 commits into
openradar:mainfrom
aladinor:fix/376-nexrad-ldm-stride
Open

FIX: NEXRAD Level 2 LDM stride drops MSG_31 records (#376)#377
aladinor wants to merge 7 commits into
openradar:mainfrom
aladinor:fix/376-nexrad-ldm-stride

Conversation

@aladinor

@aladinor aladinor commented May 4, 2026

Copy link
Copy Markdown
Member

Summary

Closes #376.

NEXRADRecordFile.init_record hard-codes a 120-message stride between LDM-compressed records. NEXRAD ICD 2620010J §7.3.4 mandates a variable count: each LDM Compressed Record contains "120 radial messages (type 31) plus 0 or more RDA Status messages (type 2)". When MSG_2 is interleaved alongside the 120 MSG_31s, the mod-120 budget exhausts on 118 MSG_31 + 2 MSG_2, and xradar jumps to the next LDM, dropping the trailing MSG_31s.

-    def get_ldm(recnum):
-        if recnum < 134:
-            return 0
-        mod = ((recnum - 134) // 120) + 1   # bug: hard-codes 120 msgs/LDM
-        return mod
-    ...
-    rnum = (recnum - 134) % 120              # bug: same assumption
-    start = self.record_size + self.filepos if rnum else 0

Reproducer

s3://unidata-nexrad-level2/2023/06/29/KILX/KILX20230629_154426_V06. LDM 49 of this file contains 122 messages (120 MSG_31 + 2 MSG_2). xradar reports sweep_10 = 358; the file actually has 360 (verified by direct bz2.decompress byte-walk and by danielway/nexrad).

Total sweep_10
xradar 0.12.0 (today) 6838 358
danielway/nexrad 1.0.0-rc.7 6840 360
Direct bz2.decompress byte-walk 6840 360
xradar with this PR 6840 360

Implementation

Replace the synthetic mod-120 stride with a byte-walk that detects end-of-LDM by buffer length, not message count.

  • _load_ldm(idx) — decompress + lazy-cache LDM idx (extracted from the previous inline block, no behavior change).
  • _resolve_compressed_data_position(recnum) — return (ldm_idx, byte_offset) via:
    • cache hit (non-sequential jumps, e.g. get_data per-moment replay),
    • first compressed call (cold or post-metadata) → LDM 1 byte 0,
    • sequential advance from self.filepos + self.record_size, crossing into the next LDM when the current buffer can no longer hold a message header.
  • init_record — three-branch dispatch (metadata / uncompressed-data / compressed-data) with a shared trailer.

Test plan

  • All 104 tests in tests/io/test_nexrad_level2.py pass
  • Smoke test on KILX: sweep_10 = 360, total = 6840
  • DataTree end-to-end: sweep_10.azimuth = 360, max azimuth gap 1.044°
  • Extended test_bz2_compressed_buffer_path_real to pin the cold-entry → sequential-walk transition (+2 lines)
  • black --check and ruff check clean

Note on regression test

None of the existing fixtures (KATX, KLIX, KLBB, KLBB.gz) trigger this bug — only metadata LDMs in those files have >120 messages, which goes through the (correct) metadata path. A targeted regression test against KILX20230629_154426_V06 is tracked as a follow-up once the file lands in open-radar-data.

Spec citations

  • ICD 2620010J §7.3.4 (LDM Compressed Record): the per-LDM payload is variable — 120 radials + 0 or more RDA Status messages.
  • ICD 2620010J §7.3.6.1.1: 2432-byte fixed framing for non-MSG_31/29 messages — already handled by get_end(). The within-LDM walk was always correct; only the cross-LDM stride was wrong.

aladinor added a commit to aladinor/xradar that referenced this pull request May 4, 2026
@codecov

codecov Bot commented May 4, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 96.66667% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 94.20%. Comparing base (d723857) to head (d97f9b3).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
xradar/io/backends/nexrad_level2.py 96.66% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #377      +/-   ##
==========================================
+ Coverage   94.18%   94.20%   +0.02%     
==========================================
  Files          29       29              
  Lines        6416     6440      +24     
==========================================
+ Hits         6043     6067      +24     
  Misses        373      373              
Flag Coverage Δ
notebooktests 0.00% <0.00%> (ø)
unittests 94.20% <96.66%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

aladinor added a commit to aladinor/xradar that referenced this pull request May 4, 2026
@aladinor aladinor force-pushed the fix/376-nexrad-ldm-stride branch from 736caa7 to 637928d Compare May 4, 2026 15:59
aladinor added a commit to aladinor/xradar that referenced this pull request May 4, 2026
@aladinor aladinor force-pushed the fix/376-nexrad-ldm-stride branch from 637928d to 610a5fb Compare May 4, 2026 16:04
@aladinor aladinor requested a review from kmuehlbauer May 4, 2026 16:05
@kmuehlbauer

Copy link
Copy Markdown
Collaborator

@aladinor So for proper testing, we would need to have to import this specific file into open-radar-data?

aladinor added 4 commits May 8, 2026 07:44
NEXRADRecordFile.init_record hard-coded a 120-message stride between
LDM-compressed records. The NEXRAD ICD 2620010J §7.3.4 mandates a
*variable* count: each LDM holds 120 radial messages (MSG_31) plus 0
or more RDA Status messages (MSG_2). When MSG_2 is interleaved with
the 120 MSG_31s, xradar's mod-120 budget exhausts on the first 118
MSG_31 + 2 MSG_2, then jumps to the next LDM, silently dropping the
trailing 2 MSG_31s.

On KILX20230629_154426_V06, sweep_10 (elev_num=11) reports 358 rays
on-wire-correct 360. The two missing radials sit at azimuth_numbers
119 and 120 in LDM 49 (122 messages = 120 MSG_31 + 2 MSG_2).
Independent bzip2 byte-walks, danielway/nexrad 1.0.0-rc.7, and
radish's internal decoder all read 6840 records on this fixture;
xradar reports 6838.

Replace the synthetic (recnum - 134) // 120 LDM lookup and the
(recnum - 134) % 120 in-LDM offset with a true byte-walk:

- _load_ldm(idx): decompress + cache LDM idx (extracted from
  init_record's previous inline block).
- _resolve_compressed_data_position(recnum): compute (ldm, byte_offset)
  via cache-hit / first-compressed-call / sequential-advance branches.
  Cross-LDM advancement detects end-of-LDM by buffer length, not
  message count.
- init_record: three-branch dispatch (metadata / uncompressed-data /
  compressed-data) with a shared trailer.

The recnum-position cache is populated on every successful walk, so
non-sequential init_record(stored_recnum) jumps from get_data's
per-moment replay (line 775) restore from cache.

Extend test_bz2_compressed_buffer_path_real to assert the cold-entry
at recnum=134 transitions cleanly into a sequential walk — pins the
byte-walker contract against future regressions.
@aladinor aladinor force-pushed the fix/376-nexrad-ldm-stride branch from 610a5fb to ecb1909 Compare May 8, 2026 12:46
@aladinor

aladinor commented May 8, 2026

Copy link
Copy Markdown
Member Author

@kmuehlbauer Yes — to pin the actual symptom of #376 against the file that exposed it (KILX20230629_154426_V06, where LDM 49 has 120 MSG_31 + 2 MSG_2 = 122 messages and sweep_10 should be 360 rays), we need that fixture in open-radar-data.

What this PR currently has:

  • 6 unit tests covering all the new code paths in init_record / _load_ldm / _resolve_compressed_data_position (sequential walk, cache hit, first compressed call, EOF, invariant violations) — patch coverage 100% on the changed lines per Codecov
  • One contract pin in test_bz2_compressed_buffer_path_real that the cold-entry at recnum=134 transitions cleanly into a sequential walk

What's missing (waiting for fixture):

  • A test that loads the actual KILX file and asserts len(msg_31_header[10]) == 360 and total = 6840

I'll open a PR against openradar/open-radar-data to add the file (10 MB) and then send a follow-up here that adds the targeted regression assertion. None of the existing fixtures (KATX / KLIX / KLBB) trigger this bug — only metadata LDMs in those files have >120 messages, which goes through the (correct) metadata path. So the open-radar-data upload is the only realistic route to a true regression test.

@kmuehlbauer

Copy link
Copy Markdown
Collaborator

@aladinor open-radar-data v0.8.0 release is out. Please add the necessary changes. Thanks!

Address @kmuehlbauer's review on openradar#377. KILX20230629_154426_V06 has
landed in open-radar-data 0.8.0 (openradar/open-radar-data#98), so
we can now ship a targeted regression test against the exact file
that exposed the bug.

LDM 49 of this file contains 120 MSG_31 + 2 MSG_2 = 122 messages.
Pre-fix the mod-120 stride dropped the trailing 2 MSG_31s, so
sweep_10 reported 358 rays (on-wire-correct is 360) and the volume
total was 6838 instead of 6840.

- Add ``nexradlevel2_kilx_ldm_stride_file`` session fixture in
  ``tests/conftest.py``.
- Add ``test_kilx_over_120_ldm_decodes_all_msg31`` asserting
  per-sweep ray counts == [720]*6 + [360]*7, total == 6840, and
  sweep_10 == 360 against the file.
- Bump open-radar-data minimum version to 0.8.0 across
  ``requirements_dev.txt``, ``environment.yml``, ``ci/unittests.yml``,
  and ``ci/notebooktests.yml``.
@aladinor

aladinor commented May 8, 2026

Copy link
Copy Markdown
Member Author

@kmuehlbauer KILX20230629_154426_V06 just landed in open-radar-data 0.8.0 — added the targeted regression test in 21fa5fb:

  • New nexradlevel2_kilx_ldm_stride_file session fixture in tests/conftest.py
  • New test_kilx_over_120_ldm_decodes_all_msg31 asserting per-sweep ray counts [720]*6 + [360]*7, total = 6840, and sweep_10 == 360
  • Bumped open-radar-data >= 0.8.0 in requirements_dev.txt, environment.yml, ci/unittests.yml, and ci/notebooktests.yml

LDM 49 of that file contains 120 MSG_31 + 2 MSG_2 = 122 messages, which is exactly the configuration that triggered the mod-120 stride bug. The test fails on the unfixed code (sweep_10 == 358, total = 6838) and passes with the fix.

@kmuehlbauer

Copy link
Copy Markdown
Collaborator

@aladinor We still need to wait for conda forge. I'll add the new version now

aladinor added 2 commits May 8, 2026 10:48
- Drop redundant nex.get_data_header() call — msg_31_header property
  auto-triggers it (xradar/io/backends/nexrad_level2.py:616-625)
- Rename fixture/test to drop the kilx site prefix; the LDM stride is
  the salient property, not the ICAO
- Trim redundant sum() and per_sweep[10] assertions — list-equality
  subsumes them and pytest's diff already pinpoints any regressed index
- Drop "the formerly-truncated sweep" inline comment already covered
  by the docstring
The xarray-nightly job in .github/workflows/ci.yml hard-codes its
own conda create-args list (rather than reading from ci/unittests.yml),
so the earlier 0.7.0 → 0.8.0 bump didn't reach it. The job loaded an
older open-radar-data without the KILX fixture, causing
test_ldm_stride_decodes_all_msg31 to error in the registry lookup.
@aladinor

aladinor commented May 8, 2026

Copy link
Copy Markdown
Member Author

@kmuehlbauer, let me know if you have additional comments

@kmuehlbauer kmuehlbauer left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aladinor There is quite some code shuffle here ;-) I'll have another look next week.

One thing, this nexrad notebook doesn't work when the fix is applied: https://openradar-docs--377.org.readthedocs.build/projects/xradar/en/377/notebooks/nexrad_read_chunks.html.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

MSG_31 parser drops 2 radials from sweep_10 of KILX20230629_154426_V06 (358 instead of 360)

2 participants