diff --git a/.claude/sweep-accuracy-state.csv b/.claude/sweep-accuracy-state.csv index 0427c55e2..f2e9b7f9e 100644 --- a/.claude/sweep-accuracy-state.csv +++ b/.claude/sweep-accuracy-state.csv @@ -1,41 +1,41 @@ -module,last_inspected,issue,severity_max,categories_found,notes -aspect,2026-06-02,2827,MEDIUM,5,"Cat5 backend divergence: planar cupy _gpu snapped aspect>359.999 to 0 (no such clamp in numpy _cpu, whose range is [0,360) and never reaches 360), so cupy/dask+cupy disagreed with numpy by ~360 on near-degenerate gradients (gx~0+, gy>0). Removing the clamp exposed a 2nd divergence: GPU used coarse 57.29578 vs numpy 180/pi, flipping the >90 compass branch and yielding exact 360 vs 0 on uint32/uint64 random data. Fix #2827/PR #2833: GPU reuses RADIAN and wraps >=360 back to [0,360). Cats 1-4 clean; geodesic path canonicalizes consistently on CPU+GPU and was left untouched. CUDA available; cupy+dask+cupy verified (235 tests pass, numpy-vs-cupy max abs diff 0 over 360 rasters). Dedup: prior aspect fixes #2780 (cellsize)/#2774 (dask mem guard)/#2781 (oracle) all merged and unrelated. Note: PR review COMMENT could not be posted to GitHub (auto-mode permission denial); findings recorded in PR run instead." -balanced_allocation,2026-04-14T12:00:00Z,1203,,,float32 allocation array caused source ID mismatch for non-integer IDs. Fix in PR #1205. -bilateral,2026-05-01,,,,"No CRIT/HIGH/MEDIUM. Sigma underflow validated via sqrt(tiny) bound; oversize sigma clamped. float64 throughout numpy/cupy. NaN center returns NaN; NaN neighbors skipped (denom not incremented). w_sum>0 guard avoids div-by-zero. map_overlap depth==kernel radius. CUDA bounds correct. Inf input could yield 0*inf=NaN in v_sum but unvalidated input is general xrspatial pattern, not bilateral-specific." -contour,2026-05-01,,,,"Marching squares correct: NaN check uses self-inequality, loop bounds (ny-1,nx-1) cover all quads, dask overlap depth=1 matches 2x2 stencil, float64 cast consistent across backends, saddle disambiguation via center value. No CRIT/HIGH issues; minor LOW (Inf inputs not specifically rejected) not flagged." -corridor,2026-05-01,,LOW,1,"LOW: corridor inherits float32 from cost_distance; for very large accumulated costs, normalized = corridor - corridor_min loses precision near min (intrinsic to upstream dtype, not corridor itself). NaN handling correct (skipna min, np.isfinite check before normalize). All 4 backends route through pure xarray arithmetic; threshold uses dask/cupy/numpy where with try/except dispatch. No CRIT/HIGH issues." -cost_distance,2026-04-13T12:00:00Z,1191,,,CuPy Bellman-Ford max_iterations = h+w instead of h*w. Fix in PR #1192. -curvature,2026-03-30T15:00:00Z,,,,Formula matches ArcGIS reference. Backends consistent. No issues found. -dasymetric,2026-04-14T12:00:00Z,,,,Mass conservation correct. Weighted/binary/limiting_variable all verified. Pycnophylactic Tobler algorithm correct. -diffusion,2026-05-01,,LOW,1;2;5,"LOW: no Kahan summation across long iterations (drift over 100k steps, standard for explicit Euler); lap=n+s+w+e-4*val has catastrophic cancellation for nearly-uniform large values; res=0 in attrs causes div-by-zero (no guard); dask+cupy boundary='nan' relies on dask accepting cp.nan as fill. CPU/GPU NaN handling consistent (np.isnan vs val!=val). depth=1 matches stencil radius. Memory guards, CFL check, step cap all in place. No CRIT/HIGH." -edge_detection,2026-05-01,,,,Thin wrappers around convolve_2d with fixed Sobel/Prewitt/Laplacian kernels; no issues found -emerging_hotspots,2026-04-30,,MEDIUM,2;3,MEDIUM: threshold_90 uses int() (truncation) instead of ceil() so n_times=11 requires only 9/11 (81.8%) instead of 90%. MEDIUM: NaN time steps produce gi_bin=0 which classifier counts as 'non-significant' rather than missing; threshold_90 uses full n_times not valid count. LOW: 'global_std == 0' check does not catch NaN std for fully/mostly NaN inputs. -fire,2026-04-30,,,,All ops per-pixel (no accumulation/stencil/projected distance). NaN handled via x!=x; CUDA bounds use strict <; rdnbr and ros divisions guarded; CPU/GPU/dask paths algorithmically identical. No accuracy issues found. -flood,2026-04-30,,MEDIUM,2;5,"MEDIUM (not fixed): dask backend preserves float32 input dtype while numpy promotes to float64 in flood_depth and curve_number_runoff; DataArray inputs for curve_number, mannings_n bypass scalar > 0 (and CN <= 100) range validation, silently producing NaN/garbage." -focal,2026-06-10,3214,MEDIUM,1;5,"mean() dtype divergence: numpy/dask+numpy cast to float64 (astype(float)) while cupy/dask+cupy forced float32, so output dtype was backend-dependent and float64 rasters lost precision on GPU (offset 1e7: GPU error 0.58 > true spread 0.42, same class as fixed #2831). mean() was left out of the #2769 _promote_float contract that apply/focal_stats follow. Fix #3214: _promote_float in mean(), drop hardcoded cupy.float32 in _mean_cupy/_mean_dask_cupy, excludes cast to working dtype for cross-backend match parity. CUDA available; all 4 backends executed (245 focal tests pass incl new 3214 dtype tests). Cats 2-4 clean: GPU kernels two-pass std/var (#2831 fix verified), NaN checks via v!=v, map_overlap depths == kernel radius, Gi* validated against reference test. LOW (documented, not fixed): mean() excludes mask only the center pixel; excluded sentinel values (e.g. -9999) still contribute to neighboring cells' means on all backends -- docstring says 'left unchanged rather than averaged', backend-consistent." -geotiff,2026-06-12,3260,HIGH,1;2,"Pass 27 (2026-06-12, deep-sweep): HIGH fixed -- issue #3260. to_geotiff(pack=True) cast the packed values to the integer dtype from attrs['mask_and_scale_dtype'] with no range or finiteness check: finite values outside the dtype range wrapped in the astype (4000.0 with SCALE=0.1 on int16 packs to 40000 and landed on disk as -25536, unpacking to -2553.6) and +/-Inf cast to a platform-defined integer (0 on linux/x86), all silently on all four backends; dask deferred the cast into the write's compute so there was no warning at all. Internally inconsistent: the adjacent NaN-no-sentinel guard exists precisely because 'the astype below would silently wrap'. Fix adds _pack_guard_int_range after the round, before the cast (eager raises at call time, dask via map_blocks from the write's single compute, mirroring #3235); exclusive upper bound iinfo.max+1 stays exact in float64 so int64/uint64 reject exactly-2**63 instead of wrapping. Also fixed en route: eager cupy no-sentinel integer pack crashed with TypeError in bool(out.isnull().any()) (implicit cupy->numpy); now routed through the cupy-safe _pack_guard_no_nan. 15 new tests in tests/write/test_pack_range_guard_3260.py covering all four backends (CUDA available, gpu legs executed), boundary iinfo.min/max round trips, round-back-into-range, uint underflow, and the 2**63 float64 bound. Scope this pass: post-2026-06-09 commits only (pack/unpack #3174/#3175/#3239/#3240/#3241, VRT offsets #3135, GPU streaming writer) since Pass 26 covered the rest 3 days earlier; overview kernels, _coords transform math, _decode predictor/orientation/LERC fill, and _nodata lifecycle re-read with no new findings; GPU streaming writer reviewed (per-band NaN rewrite and tile-row alignment mirror the full-array path). cuda-available. | Pass 26 (2026-06-09, deep-sweep): MEDIUM fixed -- issue #3098. _apply_eager_nodata_mask in _attrs.py compared the integer nodata sentinel AFTER promoting the buffer to float64, so int64/uint64 sentinels above 2**53 (INT64_MAX/UINT64_MAX) swallowed up to 512/1024 nearby valid values into NaN on the numpy-eager and cupy-eager backends, while the dask per-chunk mask (_delayed_read_window), the GPU GDS chunk path (_apply_nodata_mask_gpu), and the VRT path all compare at native integer width and masked only exact hits. Reproduced end-to-end: int64 file with nodata=INT64_MAX and values INT64_MAX-1..-513 gave 4 NaN eager vs 1 NaN dask. Same function was internally inconsistent: the mask_nodata=False pixels_present scan already compared at native width. Fix computes the mask at source dtype width before the float64 promotion (promotion itself stays unconditional per #2990); one site fixes both numpy and cupy eager since GPU routes through the helper via duck typing. 5 regression tests in tests/read/test_nodata.py (int64 exact-hit, eager-vs-dask parity, uint64, near-sentinel-no-hit pixels_present, gpu eager); verified on all 4 backends with CUDA. Also audited this pass with no findings: overview reduce kernels CPU vs GPU (empirical parity run incl. float32 median midpoint analysis: RN(a+b)/2 == RN((a+b)/2) so no divergence), unpack/pack scale-offset paths (#3075/#3065, mask-before-scale ordering consistent eager/dask, dask+gpu reuses CPU dask graph), bbox-to-window floor/ceil (GDAL touched semantics), VRT nearest mapping floor((out+0.5)*src/out) and Int64 nodata native-width round-trip, predictor 2/3 GPU kernels (lossless), writer NaN-to-sentinel gates. cuda-available; GPU paths executed, not just reviewed. | Pass 25 (2026-05-15): HIGH fixed -- issue #1975. _block_reduce_2d's cubic branch in xrspatial/geotiff/_writer.py gated the sentinel-to-NaN mask on arr2d.dtype.kind=='f', so to_geotiff(cog=True, overview_resampling='cubic', nodata=) on an integer raster fell through to an unmasked zoom(arr2d, 0.5, order=3). The bicubic spline blended the sentinel (e.g. -9999) into neighbouring valid cells; cast back to the source integer dtype, the boundary pixels surfaced as silent garbage. Reproduction (1024x1024 int16 + 256x256 nodata corner + nodata=-9999): lvl1 boundary [128, 124:132] showed [1082, 1082, 1085, 1134, 5, 93, 100, 100] instead of [-9999/NaN, ..., 100, 100, 100, 100]; max poisoned value 1134 (11x the actual data value of 100) and min -11104 (below the sentinel -9999). Same root cause as #1623 (float cubic + nodata) but for the integer dtype branch. Both CPU and GPU writers affected because _block_reduce_2d_gpu's cubic path falls back to _block_reduce_2d on CPU. Fix mirrors the float branch: promote the cropped block to float64, mask sentinel to NaN via the integer-range guard (mirrors _int_nodata_in_range), run scipy.ndimage.zoom(prefilter=False), rewrite NaN back to the sentinel, then np.round(...).astype(source_int_dtype) so the integer cast is well-defined. 12 regression tests in test_cog_cubic_int_overview_nodata_1975.py: helper-level cubic per int dtype (int16, uint16, int32), no-nodata regression, out-of-range sentinel no-op, fractional sentinel no-op, all-sentinel block fallback, float cubic regression guard, end-to-end 1024x1024 round-trip, non-constant int regression, cubic-vs-mean sentinel-mask parity, and GPU/CPU byte parity. All 3186 non-stale geotiff tests still pass (2 pre-existing failures unrelated: test_predictor2_big_endian_gpu references the hidden read_to_array symbol, and test_size_param_validation_gpu_vrt_1776 asserts pre-#1767 tile_size=4 behaviour). Categories: Cat 1 (precision loss from cubic spline blending sentinel into valid cells) + Cat 2 (NaN-equivalent corruption: the read-side int-to-NaN mask only catches exact sentinel hits, so the poisoned values survive as legitimate measurements) + Cat 5 (backend parity: CPU and GPU writers shared the same wrong cubic path). | Pass 23 (2026-05-14): HIGH fixed -- issue #1847. extract_geo_info parsed GDAL_NODATA via float() unconditionally, which loses 1 ULP on uint64 max (2**64-1) and int64 max (2**63-1). The downstream integer-mask gate info.min <= int(nodata) <= info.max then rejects the cast because float-rounded sentinel is one above the dtype max; the sentinel pixel survives as a literal valid integer instead of NaN. Same float-only parse in _reader._resolve_masked_fill (LERC fill) and _reader._sparse_fill_value (SPARSE_OK fill). VRT _vrt._parse_band_nodata had already fixed this for the XML parse path (PR #1833) but TIFF source-of-truth was never updated, so write_vrt([uint64.tif]) stringified the float-parsed nodata as '1.8446744073709552e+19' into XML where the VRT reader then rejected it for being out of range. Fix: lift the int-first parse into shared helper _parse_nodata_str in _geotags.py and reuse across the three TIFF-side sites. The helper tries int(text) first to preserve full precision, falls back to float(text) for NaN/Inf/scientific/fractional. Downstream gates already handle int values transparently because np.isfinite(int) works and int(int) is a no-op. 25 regression tests in test_nodata_int64_precision_1847.py: unit-level _parse_nodata_str matrix (int vs float branches, edge cases), eager open_geotiff (uint64 max / int64 max / int64 min / uint16 / int32 / float regression guards), read_geotiff_dask (uint64 max, int64 max), write_vrt + read_vrt round-trip with XML literal assertion, and a GPU parity test. All 2434 non-stale geotiff tests still pass (1 pre-existing test_size_param_validation_gpu_vrt_1776 failure unrelated -- test asserts pre-#1767 tile_size=4 behaviour). Categories: Cat 2 (NaN propagation: sentinel pixel survived as literal valid number on all 4 backends) + Cat 5 (backend inconsistency: VRT XML parse path handled 64-bit sentinels via _parse_band_nodata but TIFF parse path did not, even though write_vrt fed the latter into the former). Audited but did not file: LOW silent kwarg drop -- to_geotiff(da, 'out.vrt', photometric='miniswhite') drops the photometric arg at _write_vrt_tiled call (per-tile files written as MinIsBlack). Data round-trips correctly because no inversion happens on either side; only the tile photometric tag disagrees with the user's request. Niche path + no data corruption + metadata-only drift = LOW, not filed. | Pass 22 (2026-05-13): HIGH fixed -- issue #1809. MinIsWhite (photometric=0) inversion ran before the sentinel-to-NaN nodata mask on all four backends (eager numpy in open_geotiff, dask chunk reader, eager GPU in read_geotiff_gpu, GPU stripped fallback). Because the inversion rewrites the original sentinel value (e.g. uint8 nodata=0 becomes 255, float32 nodata=-9999 becomes 9999), the post-inversion mask matched the wrong pixels: cells whose stored value happened to equal iinfo.max - sentinel were flagged NaN while real sentinel cells survived as inverted values. PR #1804 (a5d78e4) had refactored the helper but kept the original ordering. Fix: introduce _miniswhite_inverted_nodata in _reader.py and stash the inverted sentinel on geo_info._mask_nodata; route every backend mask through that field, keeping geo_info.nodata + attrs[nodata] at the original value for write-side round-trip. Dask path also re-inverts the closure nodata at graph-build time, picking up _ifd_photometric / _ifd_samples_per_pixel stashed in _read_geo_info. 9 regression tests in test_miniswhite_nodata_1809.py cover uint8 nodata=0, uint16 nodata=65535, float32 nodata=-9999 across numpy, dask, and GPU backends plus no-collision and no-nodata controls. All 2424 non-stale geotiff tests pass (4 pre-existing failures unrelated to this fix). Categories: Cat 2 (NaN propagation: real data became NaN while sentinel survived as inverted value) + Cat 5 (backend inconsistency: all four backends share the identical wrong result, so they agreed on the wrong answer rather than diverged). | Pass 21 (2026-05-13): MEDIUM fixed -- issue #1774. open_geotiff / read_geotiff_dask / _apply_nodata_mask_gpu crashed with ValueError: cannot convert float NaN to integer when reading an integer TIFF whose GDAL_NODATA tag was the string ""nan"" / ""inf"" / ""-inf"". Three sites in xrspatial/geotiff/__init__.py called int(nodata) on the integer-dtype branch without first checking np.isfinite. _geotags.py:extract_geo_info parses the GDAL_NODATA tag through float(nodata_str) so a ""nan"" tag surfaces as Python NaN; the integer mask code then explodes. Sibling helpers _resolve_masked_fill and _sparse_fill_value in _reader.py already gate on not math.isnan(v) and not math.isinf(v) (the unfinished pass of #1581). Fix: gate each int(nodata) cast on np.isfinite(nodata). A non-finite sentinel on an integer file cannot match any pixel, so the mask is a no-op and the file dtype is preserved; attrs['nodata'] still carries the raw NaN/Inf sentinel so a write round-trip keeps the original GDAL_NODATA tag. The read_geotiff_dask effective_dtype branch already used try/except and was safe in practice, but tightened with the same isfinite gate for readability. 15 regression tests in test_nodata_nan_int_1774.py covering eager numpy (3 NaN variants + 6 Inf variants), in-range finite still masks regression guard, dask (NaN + Inf), and GPU (NaN + Inf + finite). All pass; 2023 existing geotiff tests still pass (7 pre-existing test_predictor2_big_endian_gpu failures unrelated: they reference xrspatial.geotiff.read_to_array which was hidden from the public namespace in #1708, 3 pre-existing matplotlib palette failures in test_features.py unrelated). Categories: Cat 2 (NaN propagation: NaN nodata produced a crash instead of being treated as missing) + Cat 5 (backend inconsistency: _resolve_masked_fill / _sparse_fill_value already guarded; the three __init__.py sites did not). | Pass 20 (2026-05-12): HIGH fixed -- PR #1691 (no issue created; agent harness blocked gh issue create). Integer COG overview pyramid mixed sentinel into reduced pixels. _block_reduce_2d (_writer.py:258-264) and _block_reduce_2d_gpu (_gpu_decode.py:3027-3028) promoted integer blocks to float64 but never masked the sentinel to NaN before nanmean / nanmin / nanmax / nanmedian. The reduction averaged the sentinel into surrounding valid cells (e.g. (-9999 + 100 + 100 + 100)/4 = -2425 cast back to int16), producing overview pixels that the read-side int-to-NaN mask in open_geotiff couldn't recover because they didn't equal the sentinel. Silent garbage at every zoom above level 0 for to_geotiff(int_data, cog=True, nodata=N). Methods affected: mean, min, max, median; nearest/mode safe (no averaging). Fix: gate the sentinel-to-NaN mask on representability in the source integer dtype (mirrors _int_nodata_in_range in _reader.py) so uint16+GDAL_NODATA=""-9999"" stays a no-op; rewrite all-sentinel-block NaN back to sentinel before the integer dtype cast so the cast is well-defined (the caller's post-overview loop in write() only runs for floats). GPU mirror gets the same path with cupy.where + cupy.isnan for byte parity with CPU. 38 regression tests in test_cog_int_overview_nodata_2026_05_12.py: _block_reduce_2d per-dtype/per-method matrix (uint8/uint16/int16/int32 x mean/min/max/median), all-sentinel-block, no-nodata regression, out-of-range sentinel no-op, end-to-end uint16 + int16 round-trip, 3-band integer COG, GPU per-dtype/per-method matrix, CPU/GPU byte-match parity. All 1606 existing geotiff tests still pass. Categories: Cat 1 (precision/representation loss in nan-aware reduction) + Cat 2 (silent NaN-equivalent corruption from sentinel poisoning) + Cat 5 (backend parity between float and integer code paths within the same writer). Deferred LOW: HTTP COG path (_read_cog_http at _reader.py:1638) skips the band-range validation that local/dask/GPU added in #1673; band=-1 silently selects the last channel on HTTP while local raises IndexError. Cat 5, MEDIUM-leaning but separate concern from the overview fix; one-finding-per-PR per project policy. | Pass 19 (2026-05-12): MEDIUM fixed -- issue #1655. read_vrt silently dropped 0 on a SimpleSource because of src.nodata or nodata at _vrt.py:370. Python treats 0.0 as falsy, so the per-source sentinel fell through to the band-level (or None when missing) and pixels equal to 0.0 in the source file survived as valid data. The in-code comment acknowledged the quirk as backward compat, but the resulting behaviour silently biased every NaN-aware aggregation on VRT mosaics whose sources used 0 as a sentinel (a common convention for unsigned remote-sensing imagery). Fix: src_nodata = src.nodata if src.nodata is not None else nodata. Five regression tests in test_vrt_source_nodata_zero_1655.py covering source NODATA=0, integer XML literal, non-zero unchanged, band-level NoDataValue=0 still honoured, and source-overrides-band precedence. All 100 vrt-related geotiff tests still pass; 3 pre-existing test_features.py matplotlib palette failures unrelated. Categories: Cat 2 (NaN propagation) + Cat 5 (backend inconsistency: read_geotiff masks 0 correctly when GDAL_NODATA tag is set; only VRT path was broken). | Pass 18 (2026-05-11): MEDIUM fixed -- issue #1642. PR #1641 (issue #1640) inherited level-0 georef on overview reads but kept the level-0 origin_x/origin_y unchanged. That is correct for PixelIsArea (origin = upper-left corner of pixel (0,0)) but wrong for PixelIsPoint (origin = center of pixel (0,0), GeoKey 1025 = 2). For a 1024x1024 PixelIsPoint COG with 10 m pixels and origin (0, 0), open_geotiff(overview_level=1) returned x[:3]=[0,20,40] instead of [5,25,45] (level-1 pixel 0 covers level-0 pixels 0-1 whose centers are 0 and 10, centroid 5); same for y. Downstream sel/interp/reproject silently snaps to the wrong pixel for any DEM-style PixelIsPoint COG (USGS, OpenTopography, Copernicus DEM). Categories: Cat 3 (off-by-one / boundary handling) + Cat 5 (raster_type-dependent backend convention). Fix: in extract_geo_info_with_overview_inheritance (_geotags.py), pick the effective raster_type first (overview-declared if non-default, otherwise inherited from parent), then when it is PixelIsPoint apply origin_shift = (scale - 1) * 0.5 * pixel_size_lvl0 along each axis before building the new GeoTransform. PixelIsArea path is byte-equivalent. 13 regression tests in test_overview_pixel_is_point_1642.py: centroid identity across all 4 backends, transform tuple across all 4 backends, uniform grid step, unit-level helper tests for both raster_types via stubbed extract_geo_info, own-geokeys-not-clobbered path on PixelIsPoint, and a PixelIsArea regression check. All 1397 existing non-network geotiff tests still pass (3 pre-existing matplotlib palette failures unrelated). Deferred LOW: non-power-of-two overview dimensions cause scale = base_w/ov_w to diverge from the true 2^level reduction (writer drops the right/bottom strip via h2=(h//2)*2; for h=1023 a level-1 overview has 511 rows so scale=2.0019 not 2.0). Fix would need to either (a) emit explicit geo tags on overview IFDs from the writer or (b) pass the level number into the inheritance helper; neither is a one-line change and the resulting coord error is sub-pixel of level 0. | Pass 17 (2026-05-11): MEDIUM fixed -- issue #1634. open_geotiff eager path windowed read produced confusing CoordinateValidationError when window extended past source extent. read_to_array clamped the window internally and returned a smaller array, but the eager code path used unclamped window indices for y/x coord generation (xrspatial/geotiff/__init__.py lines 562-572), so the coord array length differed from the data and xarray refused to construct the DataArray. Same bug affected the windowed transform shift in _populate_attrs_from_geo_info. The dask path (read_geotiff_dask) already validated up front since #1561, raising a clear ValueError with the format 'window=... is outside the source extent (HxW) or has non-positive size.' so the two backends diverged on the contract. Fix: validate the window up front in open_geotiff's eager branch via _read_geo_info (metadata-only read, no extra pixel cost) using the exact same condition the dask path uses, raising the same ValueError message format. Reproduction: 10x10 raster + window=(5,5,15,15) on eager raised CoordinateValidationError('conflicting sizes ... length 5 ... length 10'); now raises ValueError('window=(5, 5, 15, 15) is outside the source extent (10x10) or has non-positive size.'). Categories: Cat 3 (off-by-one / boundary handling) + Cat 5 (backend inconsistency). 12 regression tests in test_window_out_of_bounds_1634.py: negative start, past-right-edge, past-bottom-edge, past-both-edges, zero-size, inverted window, full-extent ok, interior subset, edge-aligned, eager-vs-dask parity, message-format parity, issue reproducer. All 1286 existing non-network geotiff tests still pass. | Pass 16 (2026-05-11): HIGH fixed -- issue #1623. to_geotiff(cog=True, overview_resampling='cubic', nodata=) on a float raster with NaN regions produced overview pixels with severe ringing artefacts near nodata borders. Same class of bug as #1613 but for the cubic branch: writer rewrites NaN to the sentinel upstream, then _block_reduce_2d(method=cubic) handed the sentinel-poisoned array straight to scipy.ndimage.zoom(order=3). The cubic spline blended the sentinel (e.g. -9999) into neighbouring cells, producing values like 1133.44, -10290.08 where the data was a constant 100. Repro on 16x16 float32 with a 4x4 NaN corner showed 18 polluted pixels in the 8x8 overview. Fix: when nodata is supplied on a float dtype and the sentinel is found, mask sentinel to NaN, run cubic with prefilter=False so a single NaN cannot poison the entire row/column (default B-spline prefilter is global), then rewrite any NaN in the result back to the sentinel. prefilter=False only fires when a sentinel is present so the non-nodata cubic semantics are unchanged. GPU side: _block_reduce_2d_gpu previously raised on method='cubic'; added a CPU fallback (same pattern as 'mode') so GPU writer produces byte-equivalent overviews. GPU_OVERVIEW_METHODS now includes 'cubic'. 12 regression tests in test_cog_cubic_overview_nodata_1623.py (helper no-ringing, poisoning repro, no-nodata unchanged, end-to-end round-trip, GPU fallback, CPU/GPU byte-match, +/-inf nodata mask, NaN-sentinel no-op, GPU_OVERVIEW_METHODS contract). All 1256 existing geotiff tests still pass (3 pre-existing matplotlib failures unrelated). | Pass 15 (2026-05-11): HIGH fixed -- issue #1613. to_geotiff(cog=True, nodata=) on a float raster with NaN produced a corrupted overview pyramid. The NaN-to-sentinel rewrite in __init__.py:1202 (CPU) and :2852 (GPU write_geotiff_gpu) ran BEFORE _make_overview / make_overview_gpu, so the nan-aware aggregations (np.nanmean/min/max/median, cupy.nanmean/min/max/median) saw the sentinel as a real number and biased every overview pixel. Reproduction with -9999 sentinel produced [[-4998.75,-4997.75],..] where np.nanmean gives [[1.5,3.5],..]. Both CPU and GPU paths affected; backend results matched each other but were both wrong (CAT 2 NaN propagation + CAT 5 documents the parity). Fix: _block_reduce_2d / _block_reduce_2d_gpu accept a nodata kwarg that masks the sentinel back to NaN for float dtypes before the reduction; the writer's overview loop passes nodata in, then rewrites all-sentinel reductions (which surface as NaN from the reducer) back to the sentinel for the on-disk pyramid. 11 regression tests in test_cog_overview_nodata_1613.py (CPU mean / partial-block / min/max/median / no-nodata passthrough / helper kwarg / all-sentinel block / GPU mean / GPU helper / CPU-GPU agreement). All 235 nodata/overview/cog tests still pass. | Pass 14 (2026-05-11): HIGH fixed -- issue #1611. read_vrt(band=None) on a multi-band integer VRT with per-band tags only masks band 0's sentinel. __init__.py lines 2795-2809 in read_vrt apply vrt.bands[0].nodata to the full ndim==3 array; bands 1+ keep their integer sentinels as literal finite values (e.g. 65000 surfaces as 65000.0 after the dtype=float64 cast, not NaN). Float-VRT path masks per-band correctly in _vrt._read_data lines 296-297 + 347-351. PR #1602 fixed the single-band band=N case for issue #1598; the band=None multi-band case is the same class of bug. Repro: 2-band uint16 VRT with NoDataValue 65535 / 65000 returns r.values[1,1,1] == 65000.0 instead of NaN; r.values[1,1,0] is NaN (band 0 sentinel masked). Fix scope: in read_vrt, when band is None, iterate over vrt.bands and mask each arr[..., i] slice against its own (gated by the same _int_nodata_in_range guard PR #1583 introduced). Severity HIGH (Cat 2 NaN propagation + Cat 5 backend inconsistency: identical input semantics produce different masking outcomes based on dtype, with finite garbage values where NaN expected). Fix in PR #1612: walks vrt.bands when band is None and ndim==3, masks each arr[..., i] slice against its own via the refactored _sentinel_for_dtype helper (reuses PR #1583's range guard so out-of-range/non-finite/fractional sentinels are a no-op). attrs['nodata'] still carries band 0's sentinel for band=None reads (documented contract). 7 regression tests in test_vrt_multiband_int_nodata_1611.py: uint16 per-band, int32 negative, mixed presence, dtype preservation when no sentinel hit, out-of-range gating, band=N non-regression, attrs contract. 135 existing vrt/nodata geotiff tests still pass. | Pass 13 (2026-05-11): HIGH fixed -- issue #1599. write_geotiff_gpu (and to_geotiff gpu=True) emitted raw NaN bytes for missing pixels even when nodata= was supplied, while the CPU writer substituted NaN with the sentinel before encoding. xrspatial-only round-trips were unaffected (the reader masks both NaN and the sentinel), but external readers (rasterio/GDAL/QGIS) that mask only on the GDAL_NODATA tag saw NaN pixels as valid data -- rasterio reported 100% valid pixels on a 25-NaN file vs CPU's 25-invalid report. Root cause: __init__.py lines 2579-2587 jumped from shape/dtype resolution straight to compression, missing the equivalent of the CPU writer's NaN-to-sentinel rewrite at to_geotiff line ~1156. Fix: cupy.isnan + masked write on a defensive copy of arr, gated on np_dtype.kind=='f' and not np.isnan(float(nodata)). Caller's CuPy buffer preserved (copy before mutate). 7 regression tests in test_gpu_writer_nan_sentinel_1599.py: substitution lands as sentinel, CPU/GPU byte-equivalent, caller buffer not mutated, no-NaN no-op, NaN sentinel skips substitution, rasterio sees identical invalid count on CPU/GPU, multiband 3D path. All other GPU writer tests still pass (50 passed across band-first, attrs, nodata, dask+cupy, writer, nodata aliases). | Pass 12 (2026-05-11): HIGH fixed -- issue #1581. Reading a uint TIFF with a negative GDAL_NODATA sentinel (e.g. uint16 + -9999) raised OverflowError on every backend because the nodata-mask code did arr.dtype.type(int(nodata)) with no range check. Three identical cast sites in __init__.py (numpy eager, _apply_nodata_mask_gpu, _delayed_read_window) plus _resolve_masked_fill and _sparse_fill_value in _reader.py. Fix: _int_nodata_in_range helper gates the cast; out-of-range sentinels are a no-op for value matching (the file can never contain that value), file dtype is preserved, attrs['nodata'] still surfaces the original sentinel so write round-trips keep the GDAL_NODATA tag intact. Matches rasterio behavior. 8 regression tests in test_nodata_out_of_range_1581.py cover the helper, both eager and dask read paths, in-range sentinel non-regression, and GPU helper (cupy-gated). | Pass 11 (2026-05-10): CLEAN. Audited the one additional commit since pass 10 -- #1559 (PR 1548, Centralise GeoTIFF attrs population across all read backends). Refactor extracts _populate_attrs_from_geo_info helper and routes eager numpy, dask, GPU stripped, GPU tiled read paths through it; before the fix dask only emitted crs/transform/raster_type/nodata while numpy emitted the full attrs set including x/y_resolution, resolution_unit, image_description, extra_samples, GDAL metadata, and the CRS-description fields. No data-path arithmetic touched; only attrs dict population. Windowed origin math (origin_x + c0*pixel_width, origin_y + r0*pixel_height) verified to produce -98.0 / 48.75 origin for window=(10,20,50,70) on a (0.1,-0.125) pixel-size raster, with PixelIsArea half-pixel offset preserved on coord lookups (-97.95, 48.6875). Cross-backend attrs parity re-verified: numpy/dask/cupy all emit identical key set on deflate+predictor3+nodata round-trip (crs, crs_wkt, nodata, transform, x_resolution, y_resolution). Data bit-parity re-verified across numpy/dask/cupy on same payload (np.array_equal with equal_nan=True). test_attrs_parity_1548.py (5 tests), test_reader.py/test_writer.py/test_dask_cupy_combined.py (25 tests), GPU orientation/predictor2-BE/LERC-mask/nodata/byteswap suites (65 tests) all green. No accuracy or backend-divergence findings. | Pass 10 (2026-05-10): CLEAN. Audited 5 recent commits: #1558 drop-defensive-copies (frombuffer path still .copy()s before in-place predictor decode at _reader.py:778), #1556 fp-predictor ngjit (writer pre-ravels so 1-D slice arg is correct, float32/64 LE+BE bit-exact), #1552 batched D2H (OOM guard fires before cupy.concatenate, host_buf offsets correct), #1551 parallel-decode gate (>= vs > sends 256x256 default to parallel path, no value diff confirmed via partial-tile parity), #1549 nvjpeg constants (gray + RGB GPU JPEG decode pixel-identical to Pillow CPU, max diff = 0). Cross-backend parity re-verified clean: numpy/dask+numpy/cupy/dask+cupy equal .data/.dtype/.coords/nodata/NaN-mask on deflate+predictor3+nodata; orientations 1-8 numpy==GPU; partial edge tiles 100x150, 257x383, 512x257 numpy==GPU==dask; predictor2 LE/BE round-trip uint8/int16/uint16/int32/uint32 pass; predictor3 LE/BE float32/64 pass. Deferred LOW (pre-existing, not opened): float16 (bps=16, SampleFormat=3) absent from tiff_dtype_to_numpy map - writer never emits, asymmetric but unreachable. | Pass 9 (2026-05-09): TWO HIGH fixed -- (a) PR #1539 closes #1537: TIFF Orientation tag 2/3/4 (mirror flips) on georeferenced files left y/x coords computed from the un-flipped transform, so xarray label lookups returned the wrong pixel even though _apply_orientation flipped the buffer. PR #1521 only updated the transform for the 5-8 axis-swap branch. Fix updates origin and pixel-scale signs along whichever axes were flipped, for both PixelIsArea (origin shifts by N*step) and PixelIsPoint (shifts by (N-1)*step). 10 new tests in test_orientation.py. (b) PR #1546 closes #1540: read_geotiff_gpu ignored Orientation tag completely; CPU correctly applied 2-8 (PR #1521) but GPU returned the raw stored buffer. Cross-backend disagreement on every non-default orientation. Fix adds _apply_orientation_gpu (cupy slicing mirror of the CPU helper) and _apply_orientation_geo_info, threads them into the tiled GPU pipeline, reuses CPU-fallback geo_info for the stripped path to avoid double-applying. 28 new tests in test_orientation_gpu.py (every orientation, single-band tiled, single-band stripped, 3-band tiled, mirror-flip sel-fidelity, default no-tag passthrough). Re-confirmed clean: HTTP coalesce_ranges with overlapping ranges and zero-length ranges, parallel streaming write thread-safety (each tile gets independent buffer via copy or padded zeros), planar=2 + chunky GPU LERC mask propagation matches CPU, IFD chain cap MAX_IFDS=256, max_z_error round-trip on tiled write, _resolve_masked_fill float vs integer dtype semantics. Deferred LOW: per-sample LERC mask (3D mask (h,w,samples)) collapsed to per-pixel ""any sample invalid"" on GPU while CPU honours per-sample; LERC implementations rarely emit 3D masks (verified: lerc.encode with 2D mask on 3-band returns 2D mask). Documented planar=2 + LERC + GPU silently drops mask (rare in practice, source comment acknowledges). | Pass 8 (2026-05-07): HIGH fixed in fix-jpeg-tiff-disable -- to_geotiff(compression='jpeg') wrote files that no external reader can decode. The writer tags compression=7 (new-style JPEG) but emits a self-contained JFIF stream per tile/strip and never writes the JPEGTables tag (347) that the TIFF spec requires for that codec. libtiff/GDAL/rasterio all reject the file with TIFFReadEncodedStrip() failed; our reader round-trips because Pillow decodes the standalone JFIF, hiding the break. Pass-4 notes flagged the read side of the same JPEGTables gap and deferred it; pass-8 covers the write side. Fix: reject compression='jpeg' at the to_geotiff entry with a clear ValueError pointing at deflate/zstd/lzw. The internal _writer.write is untouched so the existing self-decoding tests still cover the codec; re-enabling the public path needs a JPEGTables-aware encoder. PR diffs reviewed but not merged: #1512 (BytesIO source) and #1513 (LERC max_z_error) -- both look correct; #1512 file-like read path goes through read_all() once so the per-call BytesIOSource lock is theoretical, and #1513 forwards max_z_error through every overview/tile/strip/streaming path including _write_vrt_tiled and _compress_block. No regressions found in either open PR. Other surfaces audited clean: predictor=3 with float16 (writer auto-promotes to float32 on both eager and streaming paths, value-exact round-trip); planar=2 multi-tile read uses band_idx*tiles_per_band offset so no cross-contamination between planes; _header.py multi-byte tag parsing uses bo (byte_order) consistently; Pillow YCbCr-vs-tagged-RGB photometric mismatch becomes moot once JPEG is disabled. Deferred (LOW/MEDIUM, not filed): JPEG2000 writer accepts arbitrary dtype with no validation (rare codec, narrow risk); float16 dtype not in tiff_dtype_to_numpy decode map (writer never emits it - asymmetric but unreachable); Orientation tag (274) still ignored on read (pass-4 deferral). | Pass 7 (2026-05-07): HIGH fixed in fix-mmap-cache-refcount-after-replace -- _MmapCache.release() looked up the cache entry by realpath, so a holder that acquired the OLD mmap before an os.replace and released it AFTER another caller had acquired the post-replace entry would decrement the new holder's refcount. Subsequent eviction (cache full, or another acquire) closed the still-in-use mmap, breaking reads with 'mmap closed or invalid'. Real exposure: any concurrent reader/writer pattern where to_geotiff replaces a file that another reader had just opened via open_geotiff with chunks= or via _FileSource. PR #1506 added stale-replacement detection but did not fix the refcount confusion across the pop. Fix: acquire returns an opaque entry token; release takes the token and decrements that exact entry, regardless of cache state. Orphaned (popped) entries close their fh+mmap when their own refcount hits zero. _FileSource updated to pass the token. Regression test test_release_after_path_replacement_does_not_clobber_new_holder added. All 665 geotiff tests pass; GPU path verified. | Pass 6 (2026-05-07) PR #1507: BE pred2 numba TypingError. | Pass 5 (2026-05-06) PR #1506: mmap cache stale after file replace. | Pass 4 (2026-05-06) PR #1501: sparse COG tiles. | Pass 3 (2026-05-06) PR #1500: predictor=3 byte order. | Pass 2 (2026-05-05) PR #1498: predictor=2 sample-wise. | Pass 1 (2026-04-23) PR #1247. Re-confirmed clean over passes 2-7: items 2 (writer always emits LE TIFFs - hardcoded b'II'), 3 (RowsPerStrip default = height when missing), 4 (StripByteCounts missing raises clear ValueError), 5 (TileWidth without TileLength caught by 'tw <= 0 or th <= 0' check at _reader.py:688), 9 (read determinism on compressed+tiled+multiband), 11 (predictor=2 with awkward sample stride round-trips), 18 (compression_level=99 raises ValueError 'out of range for deflate (valid: 1-9)'), 21 (concurrent writes serialize correctly via mkstemp+os.replace), 24 (uint16 dtype preserved on numpy backend, dask honors chunks param), 26 (chunks rounds correctly with remainder chunk for non-tile-aligned). Deferred: item 8 (BytesIO/file-like sources are not supported, source.lower() error) - documented as 'str' parameter, not a bug; item 19 (LERC max_z_error not user-exposed by to_geotiff) - missing feature, not a bug." -glcm,2026-05-01,1408,HIGH,2,"angle=None averaged NaN as 0, masking no-valid-pairs as zero texture; fixed via nanmean-style averaging" -hillshade,2026-04-10T12:00:00Z,,,,"Horn's method correct. All backends consistent. NaN propagation correct. float32 adequate for [0,1] output." -hydro,2026-04-30,,LOW,1,Only LOW: twi log(0)=-inf if fa=0 (out-of-contract); MFD weighted sum no Kahan (negligible). No CRIT/HIGH issues. -interpolate-kriging,2026-06-04,2915,MEDIUM,1,"Cat1 nugget-on-diagonal bug (MEDIUM): _build_kriging_matrix set K[:n,:n]=vario_func(D) where D has 0 diagonal, so vario_func(0)=nugget c0 landed on the matrix diagonal; semivariogram gamma(0)=0 by definition (nugget is the h->0+ limit). Forced exact interpolation of noisy data and biased kriging variance downward. Only bites when fitted nugget>0; existing trend-dominated test data fits ~0 nugget so tests passed. Fix #2915/PR #2922: np.fill_diagonal(G,0.0) in shared host code (all 4 backends consume same K_inv). Cats 2-5 clean: validate_points drops NaN/Inf rows; range floor 1e-12 prevents div blowup; dask map_blocks slices grid coords with correct half-open extents and returns matching block shape (kriging is global, no overlap needed); planar Euclidean distance is expected for kriging (Cat4 n/a); numpy/cupy/dask share one algorithm and parity tests pass rtol=1e-10. CUDA available; all 16 kriging tests pass incl cupy + dask+cupy. Singular-matrix path adds 1e-10*eye Tikhonov term (separate from nugget, unaffected, correct)." -kde,2026-04-13T12:00:00Z,1198,,,kde/line_density return zeros for descending-y templates. Fix in PR #1199. -mahalanobis,2026-05-01,,LOW,1,"LOW: np.linalg.inv (no pinv fallback) returns garbage for near-singular cov without raising. LOW: two-pass mean/cov instead of Welford could lose precision for inputs with very large mean/small variance. No CRIT/HIGH; all four backends use float64 throughout, NaN handled via isfinite, dist_sq clamped non-negative, singular case raises ValueError." -mcda,2026-06-10,3146,MEDIUM,5,"Cat5 backend failures, all raise loudly (no wrong numbers): owa raised on every dask backend (da.sort does not exist in dask.array; fixed via rechunk+map_blocks np.sort with explicit meta) and on cupy (numpy order-weight array * cupy stack); standardize piecewise raised on cupy (cupy.interp needs cupy bp/vl + C-contiguous input) and dask+cupy (np.asarray on cupy chunk), categorical raised on dask+cupy (same asarray); monte-carlo sensitivity raised on cupy/dask+cupy (.values implicit conversion; now Welford accumulates with matching array module). All fixed + GPU tests added (issue #3146). Cats 1-4 clean: Welford already used, AHP Perron eigenvector + Saaty RI table correct, NaN propagation verified across combine ops, no neighborhood/geodesic code. constrain on cupy raises cupy.astype AttributeError = known cupy 13.6 + xarray xr.where incompat (dependency pin, not mcda). CUDA available; cupy + dask+cupy executed for all probes and tests." -morphology,2026-04-30,"1397,1399",HIGH,2;5,HIGH fixed in #1397/PR #1398: morph_erode/dilate seeded centre cell into running min/max even when kernel[centre]==0 (all 4 backends). HIGH fixed in #1399/PR #1400: dask backends raised on 1xN/Nx1 kernels because empty-slice writeback (0:-0). -multispectral,2026-03-30T14:00:00Z,1094,,, -normalize,2026-05-01,,,,rescale and standardize across all 4 backends. NaN/inf filtered via isfinite mask before min/max/mean/std. Constant input handled (range=0 -> new_min; std=0 -> 0.0). Output dtype float64 consistently. Backend parity covered by test_matches_numpy. No accuracy issues found. -perlin,2026-04-10T12:00:00Z,,,,Improved Perlin noise implementation correct. Fade/gradient functions verified. Backend-consistent. Continuous at cell boundaries. -polygon_clip,2026-06-10,3186,HIGH,5,"Cat5 backend inconsistency: dask+cupy clip_polygon rasterizes the mask with a uniform chunk size from the raster's first chunk, then feeds raster+mask to da.map_blocks (positional block pairing). Non-uniform raster chunks gave the mask a different block layout -> IndexError/ValueError (or silent mis-stamp). Repro (8,6) rechunk ((3,5),(6,)) on dask+cupy raised ValueError Shapes do not align; dask+numpy was fine via xarray.where rechunk. Fix #3186/PR: rechunk cond to raster.data.chunks[-2:] before map_blocks; added non-uniform regression tests for dask+numpy and dask+cupy. use_cuda->gpu migration in that branch was already landed by #3089/#3122. CUDA available; cupy+dask+cupy verified, 25 tests pass. Cats 1-4 clean: numpy path uses raster.where, cupy path operates on raw arrays, NaN inputs preserved, no neighborhood ops/curvature. Prior fix #1197/#1200 (crop+all_touched) merged and unrelated." -polygonize,2026-05-29,2606,HIGH,5,"Cat 5 HIGH: dask connectivity=8 cross-chunk merge filled diagonal notch where same-value regions meet only at a corner across a chunk boundary; total area exceeded raster. Hole ring was dropped because containment tested hole[0] (on exterior at pinch). Fixed via _ring_interior_point in PR for #2606. numpy, dask+numpy, dask+cupy area parity now holds; 4-conn was already correct. cupy + dask+cupy paths validated on GPU host. Other cats clean: NaN masked on numpy/cupy float paths (tested), _is_close handles +/-inf via exact-equality short-circuit, atol/rtol/simplify_tolerance reject NaN/inf, integer GPU CCL matches numpy." -proximity,2026-06-09,3108,HIGH,4;5,"Cat5/Cat4: bounded GREAT_CIRCLE dask (numpy+cupy) missed targets across the +/-180 antimeridian seam: _halo_depth sized x-halo as linear parallel-arc sum, but haversine is periodic in lon and chords shorten near poles, so array-space adjacency is no lower bound on spherical distance; numpy/cupy (brute force) found the wrap target (~111 km), dask returned NaN. Fixed in #3108 via chord bound 2R asin(cos(lat_max)|sin(dlon/2)|) + x-axis fold when seam/180-deg chord within max_distance (covers over-pole too). CUDA host: cupy + dask+cupy executed, 417+ tests pass. Cat1-3 clean (float32 output documented; NaN via isfinite consistent; bounds guards correct; tie-break unified in #2881). LOW (not fixed): great_circle_distance uses WGS84 equatorial radius 6378137 as sphere radius (~0.1% vs mean-radius convention) but documented and exposed as param." -rasterize,2026-06-12,3304,HIGH,3,"Cat3 HIGH: merge='sum'/'count' burned the same geometry into a pixel 2-3x on all 4 backends (cross-backend parity probes never caught it because all backends shared the bug). Two sources: all_touched=True polygons overlap the scanline center-fill with the supercover boundary pass and re-burn shared ring-vertex cells (count up to 3 for ONE polygon; rasterio MergeAlg.add gives 1); lines re-burn the connecting vertex of consecutive Bresenham segments (count 2 per interior vertex; rasterio gives 1 even for self-crossing lines). Fix #3304/PR #3313: for sum/count only, enumerate line + boundary cells host-side, dedup per (geometry,row,col), drop boundary cells the geometry's own scanline covers via a crossing-parity replay of the fill's exact float arithmetic on the same edge table, burn survivors through the point kernels (numpy/cupy/dask+numpy/dask+cupy). Coverage unchanged (==merge='last'); points intentionally not deduped (GDAL adds once per point, dup MultiPoint=2 matches rasterio). CUDA available; cupy + dask+cupy executed. 790 rasterize tests + 48 new pass. Cats 1/2/4/5 clean this pass: GPU atomic min/max NaN masks (#2255) correct, GPU ceil emulation matches np.ceil for negatives, no curvature surface, backends bit-identical on mixed-geometry probes. Pre-existing known-OK: Bresenham vs GDAL line coverage tie-breaks (pinned in coverage_2026_06_09 tests); dead all_touched branch in _extract_edges_vectorized (callers never pass all_touched) noted, not removed." -reproject,2026-06-12,3274,HIGH,1;4,"3 confirmed bugs, all kernel-vs-PROJ parity: #3274 HIGH LAEA inverse spurious /rq (2.6 km err for 3035) + _authalic_apa inverse series wrong (4.8 m in AEA/CEA inverses; PROJ 3-term = 1.6 mm), CPU+CUDA kernels both; #3275 HIGH _is_wgs84_compatible_ellipsoid passes R-defined spheres (MODIS sinusoidal 18.9 km err) and _aea_params/_cea_params lack the guard entirely (23.8 km on spherical aea/cea); #3276 MEDIUM itrf helmert scale 1e-9 but PROJ +s is ppm (1e-6), ~23 mm err. Verified clean: merc/emerc/UTM/tmerc/LCC/polar stere (incl lat_ts akm1) forward+inverse <=1e-5 m vs pyproj; resampling kernels NaN handling and GDAL renorm match across numpy/cupy/dask (CUDA run, gpu-vs-cpu 1.3e-7); dask footprint chunk-skip bbox is a superset in all probed cases (no holes). LOW (documented only): _source_footprint_in_target probe array typo uses x-midpoint mx as a latitude in last 3 ys entries (bbox superset, correctness unaffected)." -resample,2026-05-29,2610,HIGH,3;5,"dask interp (nearest/bilinear) overlap depth=1 too small on downsample; block-centered source coord landed past chunk, map_coordinates clamped to edge -> wrong seam rows. Fixed PR #2627 via per-axis _downsample_radius. cupy+dask+cupy verified." -sieve,2026-04-13T12:00:00Z,,,,Union-find CCL correct. NaN excluded from labeling. All backends funnel through _sieve_numpy. -sky_view_factor,2026-05-01,1407,HIGH,4,Horizon angle ignored cell size; fixed by passing cellsize_x/cellsize_y into CPU+GPU kernels and using ground distance -terrain,2026-04-10T12:00:00Z,,,,Perlin/Worley/ridged noise correct. Dask chunk boundaries produce bit-identical results. No precision issues. -terrain_metrics,2026-04-30,,LOW,2;5,"LOW: Inf input not rejected, propagates as Inf (consistent across backends but undocumented). LOW: dask+cupy non-nan boundary path double-pads (wasted compute, central output values still correct). No CRIT/HIGH; tests cover NaN propagation, all 4 backends, all 4 boundary modes, dtype acceptance." -viewshed,2026-05-29,2691,HIGH,3;5,max_distance window sized from coarser axis clipped cells on anisotropic rasters (PR #2702). LOW unfixed: distance_sweep ring radius same max(res) pattern but max_distance arg always None; _calculate_event_row_col line 880 abs(x>1) precedence bug is a broken guard only. cuda+rtx paths validated. -visibility,2026-04-13T12:00:00Z,,,,"Bresenham line, LOS kernel, Fresnel zone all correct. All backends converge to numpy." -worley,2026-05-01,,MEDIUM,2;5,"MEDIUM: numpy backend uses np.empty_like(data) so integer input dtype produces integer output (distances truncated to 0); cupy/dask paths always produce float32. LOW: freq=inf produces 100000 sentinel (sqrt of initial min_dist=1e10), no validation of freq/seed for non-finite values." -zonal,2026-05-27,2528,MEDIUM,5,"Pass 2 (2026-05-27): MEDIUM fixed -- issue #2528. zonal_stats() on dask-backed inputs silently dropped 'majority' from the requested stats list. The mutable default stats_funcs included 'majority' (added in commit 7c8d5759), but the dask path filtered it out at xrspatial/zonal.py:459 (computed_stats = [s for s in stats_funcs.keys() if s in stats_dict]) because 'majority' is not in _DASK_BLOCK_STATS. Symptom: stats(zones=dask, values=dask) returned 7 columns instead of the 8 the docstring promises; stats(..., stats_funcs=['mean','majority']) returned only ['zone','mean'] with no error or warning. Both dask+numpy and dask+cupy were affected (dask+cupy delegates to dask+numpy). Fix: replaced the mutable list literal default with stats_funcs=None and resolved the default per backend inside the function -- numpy/cupy get the full 8-stat list, dask gets the 7-stat subset (no majority). Explicit majority on dask now raises ValueError with a clear supported-stats message instead of silently filtering. 4 regression tests in test_zonal.py: explicit majority raises on dask, bare default omits majority on dask, bare default keeps majority on numpy, default list is not mutated across calls (covers the historical mutable-default pitfall). All 129 test_zonal.py tests pass (125 pre-existing + 4 new); test_dasymetric.py 61 tests still pass (dasymetric uses zonal.stats internally). Categories: Cat 5 (backend inconsistency: numpy/cupy honoured majority; dask paths silently dropped it). | Pass 1 (2026-03-30T12:00:00Z): historical entry #1090." +module,last_inspected,issue,severity_max,categories_found,notes +aspect,2026-06-02,2827,MEDIUM,5,"Cat5 backend divergence: planar cupy _gpu snapped aspect>359.999 to 0 (no such clamp in numpy _cpu, whose range is [0,360) and never reaches 360), so cupy/dask+cupy disagreed with numpy by ~360 on near-degenerate gradients (gx~0+, gy>0). Removing the clamp exposed a 2nd divergence: GPU used coarse 57.29578 vs numpy 180/pi, flipping the >90 compass branch and yielding exact 360 vs 0 on uint32/uint64 random data. Fix #2827/PR #2833: GPU reuses RADIAN and wraps >=360 back to [0,360). Cats 1-4 clean; geodesic path canonicalizes consistently on CPU+GPU and was left untouched. CUDA available; cupy+dask+cupy verified (235 tests pass, numpy-vs-cupy max abs diff 0 over 360 rasters). Dedup: prior aspect fixes #2780 (cellsize)/#2774 (dask mem guard)/#2781 (oracle) all merged and unrelated. Note: PR review COMMENT could not be posted to GitHub (auto-mode permission denial); findings recorded in PR run instead." +balanced_allocation,2026-04-14T12:00:00Z,1203,,,float32 allocation array caused source ID mismatch for non-integer IDs. Fix in PR #1205. +bilateral,2026-05-01,,,,"No CRIT/HIGH/MEDIUM. Sigma underflow validated via sqrt(tiny) bound; oversize sigma clamped. float64 throughout numpy/cupy. NaN center returns NaN; NaN neighbors skipped (denom not incremented). w_sum>0 guard avoids div-by-zero. map_overlap depth==kernel radius. CUDA bounds correct. Inf input could yield 0*inf=NaN in v_sum but unvalidated input is general xrspatial pattern, not bilateral-specific." +contour,2026-05-01,,,,"Marching squares correct: NaN check uses self-inequality, loop bounds (ny-1,nx-1) cover all quads, dask overlap depth=1 matches 2x2 stencil, float64 cast consistent across backends, saddle disambiguation via center value. No CRIT/HIGH issues; minor LOW (Inf inputs not specifically rejected) not flagged." +corridor,2026-05-01,,LOW,1,"LOW: corridor inherits float32 from cost_distance; for very large accumulated costs, normalized = corridor - corridor_min loses precision near min (intrinsic to upstream dtype, not corridor itself). NaN handling correct (skipna min, np.isfinite check before normalize). All 4 backends route through pure xarray arithmetic; threshold uses dask/cupy/numpy where with try/except dispatch. No CRIT/HIGH issues." +cost_distance,2026-04-13T12:00:00Z,1191,,,CuPy Bellman-Ford max_iterations = h+w instead of h*w. Fix in PR #1192. +curvature,2026-03-30T15:00:00Z,,,,Formula matches ArcGIS reference. Backends consistent. No issues found. +dasymetric,2026-04-14T12:00:00Z,,,,Mass conservation correct. Weighted/binary/limiting_variable all verified. Pycnophylactic Tobler algorithm correct. +diffusion,2026-05-01,,LOW,1;2;5,"LOW: no Kahan summation across long iterations (drift over 100k steps, standard for explicit Euler); lap=n+s+w+e-4*val has catastrophic cancellation for nearly-uniform large values; res=0 in attrs causes div-by-zero (no guard); dask+cupy boundary='nan' relies on dask accepting cp.nan as fill. CPU/GPU NaN handling consistent (np.isnan vs val!=val). depth=1 matches stencil radius. Memory guards, CFL check, step cap all in place. No CRIT/HIGH." +edge_detection,2026-05-01,,,,Thin wrappers around convolve_2d with fixed Sobel/Prewitt/Laplacian kernels; no issues found +emerging_hotspots,2026-04-30,,MEDIUM,2;3,MEDIUM: threshold_90 uses int() (truncation) instead of ceil() so n_times=11 requires only 9/11 (81.8%) instead of 90%. MEDIUM: NaN time steps produce gi_bin=0 which classifier counts as 'non-significant' rather than missing; threshold_90 uses full n_times not valid count. LOW: 'global_std == 0' check does not catch NaN std for fully/mostly NaN inputs. +fire,2026-04-30,,,,All ops per-pixel (no accumulation/stencil/projected distance). NaN handled via x!=x; CUDA bounds use strict <; rdnbr and ros divisions guarded; CPU/GPU/dask paths algorithmically identical. No accuracy issues found. +flood,2026-04-30,,MEDIUM,2;5,"MEDIUM (not fixed): dask backend preserves float32 input dtype while numpy promotes to float64 in flood_depth and curve_number_runoff; DataArray inputs for curve_number, mannings_n bypass scalar > 0 (and CN <= 100) range validation, silently producing NaN/garbage." +focal,2026-06-10,3214,MEDIUM,1;5,"mean() dtype divergence: numpy/dask+numpy cast to float64 (astype(float)) while cupy/dask+cupy forced float32, so output dtype was backend-dependent and float64 rasters lost precision on GPU (offset 1e7: GPU error 0.58 > true spread 0.42, same class as fixed #2831). mean() was left out of the #2769 _promote_float contract that apply/focal_stats follow. Fix #3214: _promote_float in mean(), drop hardcoded cupy.float32 in _mean_cupy/_mean_dask_cupy, excludes cast to working dtype for cross-backend match parity. CUDA available; all 4 backends executed (245 focal tests pass incl new 3214 dtype tests). Cats 2-4 clean: GPU kernels two-pass std/var (#2831 fix verified), NaN checks via v!=v, map_overlap depths == kernel radius, Gi* validated against reference test. LOW (documented, not fixed): mean() excludes mask only the center pixel; excluded sentinel values (e.g. -9999) still contribute to neighboring cells' means on all backends -- docstring says 'left unchanged rather than averaged', backend-consistent." +geotiff,2026-06-14,3331,MEDIUM,2,"Pass 28 (2026-06-14, deep-sweep accuracy): MEDIUM fixed -- issue #3331 / PR #3332. Direct GeoTIFF read accepted a zero or non-finite ModelPixelScale (33550) / ModelTransformation (34264) diagonal: _geotags._extract_transform read sx=scale[0]/sy=scale[1] (and m[0]/m[5]) with no finite-nonzero check, then coords_from_pixel_geometry built arange(N)*pixel_width+origin, so a zero scale collapsed the whole axis onto the origin (constant, non-georeferenced coords) and a NaN/Inf scale produced an all-NaN/all-Inf axis -- a degenerate raster returned with no error on all four backends. Asymmetric: the VRT read path already rejected zero res_x/res_y (VRTUnsupportedError, _vrt_validation.py:202, comment notes the eager read 'would currently surface this as an opaque coord error' -- but it actually surfaced nothing) and the writer rejected zero-step coords (NonUniformCoordsError, _validation.py:1361). Fix adds DegeneratePixelSizeError (GeoTIFFAmbiguousMetadataError subclass, exported + docs table) and a _check_finite_nonzero_pixel_size guard in _extract_transform covering the scale-only, tiepoint+scale, and ModelTransformation paths; the unit-scale tiepoint fallback (literal 1.0) and the allow_rotated no-georef path are untouched. Guard sits in the shared extract path so numpy/dask/gpu/dask+gpu all reject identically -- verified all four locally on a degenerate fixture (CUDA available). NaN ModelTransformation diagonal still rejected: the rotation-tol check short-circuits on NaN (x>NaN is False) and falls through to the new guard. 29 tests in tests/unit/test_degenerate_pixel_size_3331.py (zero/NaN/+-Inf x 3 paths, subclass contract, positive controls, end-to-end open_geotiff). Existing geotiff unit+read suites: 2271 passed, 5 skipped; flake8+isort clean. Scope this pass: numerical core re-audit (overview kernels, coords transform math, decode predictor/nodata, dtypes) via 3 parallel readers -- overview float32-vs-float64 mean accumulator 'divergence' DISMISSED (the ngjit kernel float path and the numpy nanmean float path are mutually exclusive: float mean/min/max/median route to the kernel and return early, _overview.py:340-353); coords single-neighbor pixel-size recovery LOW (standard, negligible, bypassed by transform attr); decode/predictor/nodata clean (#3098 int-sentinel-before-float64-promote pattern not repeated). cuda-available. | Pass 27 (2026-06-12, deep-sweep): HIGH fixed -- issue #3260. to_geotiff(pack=True) cast the packed values to the integer dtype from attrs['mask_and_scale_dtype'] with no range or finiteness check: finite values outside the dtype range wrapped in the astype (4000.0 with SCALE=0.1 on int16 packs to 40000 and landed on disk as -25536, unpacking to -2553.6) and +/-Inf cast to a platform-defined integer (0 on linux/x86), all silently on all four backends; dask deferred the cast into the write's compute so there was no warning at all. Internally inconsistent: the adjacent NaN-no-sentinel guard exists precisely because 'the astype below would silently wrap'. Fix adds _pack_guard_int_range after the round, before the cast (eager raises at call time, dask via map_blocks from the write's single compute, mirroring #3235); exclusive upper bound iinfo.max+1 stays exact in float64 so int64/uint64 reject exactly-2**63 instead of wrapping. Also fixed en route: eager cupy no-sentinel integer pack crashed with TypeError in bool(out.isnull().any()) (implicit cupy->numpy); now routed through the cupy-safe _pack_guard_no_nan. 15 new tests in tests/write/test_pack_range_guard_3260.py covering all four backends (CUDA available, gpu legs executed), boundary iinfo.min/max round trips, round-back-into-range, uint underflow, and the 2**63 float64 bound. Scope this pass: post-2026-06-09 commits only (pack/unpack #3174/#3175/#3239/#3240/#3241, VRT offsets #3135, GPU streaming writer) since Pass 26 covered the rest 3 days earlier; overview kernels, _coords transform math, _decode predictor/orientation/LERC fill, and _nodata lifecycle re-read with no new findings; GPU streaming writer reviewed (per-band NaN rewrite and tile-row alignment mirror the full-array path). cuda-available. | Pass 26 (2026-06-09, deep-sweep): MEDIUM fixed -- issue #3098. _apply_eager_nodata_mask in _attrs.py compared the integer nodata sentinel AFTER promoting the buffer to float64, so int64/uint64 sentinels above 2**53 (INT64_MAX/UINT64_MAX) swallowed up to 512/1024 nearby valid values into NaN on the numpy-eager and cupy-eager backends, while the dask per-chunk mask (_delayed_read_window), the GPU GDS chunk path (_apply_nodata_mask_gpu), and the VRT path all compare at native integer width and masked only exact hits. Reproduced end-to-end: int64 file with nodata=INT64_MAX and values INT64_MAX-1..-513 gave 4 NaN eager vs 1 NaN dask. Same function was internally inconsistent: the mask_nodata=False pixels_present scan already compared at native width. Fix computes the mask at source dtype width before the float64 promotion (promotion itself stays unconditional per #2990); one site fixes both numpy and cupy eager since GPU routes through the helper via duck typing. 5 regression tests in tests/read/test_nodata.py (int64 exact-hit, eager-vs-dask parity, uint64, near-sentinel-no-hit pixels_present, gpu eager); verified on all 4 backends with CUDA. Also audited this pass with no findings: overview reduce kernels CPU vs GPU (empirical parity run incl. float32 median midpoint analysis: RN(a+b)/2 == RN((a+b)/2) so no divergence), unpack/pack scale-offset paths (#3075/#3065, mask-before-scale ordering consistent eager/dask, dask+gpu reuses CPU dask graph), bbox-to-window floor/ceil (GDAL touched semantics), VRT nearest mapping floor((out+0.5)*src/out) and Int64 nodata native-width round-trip, predictor 2/3 GPU kernels (lossless), writer NaN-to-sentinel gates. cuda-available; GPU paths executed, not just reviewed. | Pass 25 (2026-05-15): HIGH fixed -- issue #1975. _block_reduce_2d's cubic branch in xrspatial/geotiff/_writer.py gated the sentinel-to-NaN mask on arr2d.dtype.kind=='f', so to_geotiff(cog=True, overview_resampling='cubic', nodata=) on an integer raster fell through to an unmasked zoom(arr2d, 0.5, order=3). The bicubic spline blended the sentinel (e.g. -9999) into neighbouring valid cells; cast back to the source integer dtype, the boundary pixels surfaced as silent garbage. Reproduction (1024x1024 int16 + 256x256 nodata corner + nodata=-9999): lvl1 boundary [128, 124:132] showed [1082, 1082, 1085, 1134, 5, 93, 100, 100] instead of [-9999/NaN, ..., 100, 100, 100, 100]; max poisoned value 1134 (11x the actual data value of 100) and min -11104 (below the sentinel -9999). Same root cause as #1623 (float cubic + nodata) but for the integer dtype branch. Both CPU and GPU writers affected because _block_reduce_2d_gpu's cubic path falls back to _block_reduce_2d on CPU. Fix mirrors the float branch: promote the cropped block to float64, mask sentinel to NaN via the integer-range guard (mirrors _int_nodata_in_range), run scipy.ndimage.zoom(prefilter=False), rewrite NaN back to the sentinel, then np.round(...).astype(source_int_dtype) so the integer cast is well-defined. 12 regression tests in test_cog_cubic_int_overview_nodata_1975.py: helper-level cubic per int dtype (int16, uint16, int32), no-nodata regression, out-of-range sentinel no-op, fractional sentinel no-op, all-sentinel block fallback, float cubic regression guard, end-to-end 1024x1024 round-trip, non-constant int regression, cubic-vs-mean sentinel-mask parity, and GPU/CPU byte parity. All 3186 non-stale geotiff tests still pass (2 pre-existing failures unrelated: test_predictor2_big_endian_gpu references the hidden read_to_array symbol, and test_size_param_validation_gpu_vrt_1776 asserts pre-#1767 tile_size=4 behaviour). Categories: Cat 1 (precision loss from cubic spline blending sentinel into valid cells) + Cat 2 (NaN-equivalent corruption: the read-side int-to-NaN mask only catches exact sentinel hits, so the poisoned values survive as legitimate measurements) + Cat 5 (backend parity: CPU and GPU writers shared the same wrong cubic path). | Pass 23 (2026-05-14): HIGH fixed -- issue #1847. extract_geo_info parsed GDAL_NODATA via float() unconditionally, which loses 1 ULP on uint64 max (2**64-1) and int64 max (2**63-1). The downstream integer-mask gate info.min <= int(nodata) <= info.max then rejects the cast because float-rounded sentinel is one above the dtype max; the sentinel pixel survives as a literal valid integer instead of NaN. Same float-only parse in _reader._resolve_masked_fill (LERC fill) and _reader._sparse_fill_value (SPARSE_OK fill). VRT _vrt._parse_band_nodata had already fixed this for the XML parse path (PR #1833) but TIFF source-of-truth was never updated, so write_vrt([uint64.tif]) stringified the float-parsed nodata as '1.8446744073709552e+19' into XML where the VRT reader then rejected it for being out of range. Fix: lift the int-first parse into shared helper _parse_nodata_str in _geotags.py and reuse across the three TIFF-side sites. The helper tries int(text) first to preserve full precision, falls back to float(text) for NaN/Inf/scientific/fractional. Downstream gates already handle int values transparently because np.isfinite(int) works and int(int) is a no-op. 25 regression tests in test_nodata_int64_precision_1847.py: unit-level _parse_nodata_str matrix (int vs float branches, edge cases), eager open_geotiff (uint64 max / int64 max / int64 min / uint16 / int32 / float regression guards), read_geotiff_dask (uint64 max, int64 max), write_vrt + read_vrt round-trip with XML literal assertion, and a GPU parity test. All 2434 non-stale geotiff tests still pass (1 pre-existing test_size_param_validation_gpu_vrt_1776 failure unrelated -- test asserts pre-#1767 tile_size=4 behaviour). Categories: Cat 2 (NaN propagation: sentinel pixel survived as literal valid number on all 4 backends) + Cat 5 (backend inconsistency: VRT XML parse path handled 64-bit sentinels via _parse_band_nodata but TIFF parse path did not, even though write_vrt fed the latter into the former). Audited but did not file: LOW silent kwarg drop -- to_geotiff(da, 'out.vrt', photometric='miniswhite') drops the photometric arg at _write_vrt_tiled call (per-tile files written as MinIsBlack). Data round-trips correctly because no inversion happens on either side; only the tile photometric tag disagrees with the user's request. Niche path + no data corruption + metadata-only drift = LOW, not filed. | Pass 22 (2026-05-13): HIGH fixed -- issue #1809. MinIsWhite (photometric=0) inversion ran before the sentinel-to-NaN nodata mask on all four backends (eager numpy in open_geotiff, dask chunk reader, eager GPU in read_geotiff_gpu, GPU stripped fallback). Because the inversion rewrites the original sentinel value (e.g. uint8 nodata=0 becomes 255, float32 nodata=-9999 becomes 9999), the post-inversion mask matched the wrong pixels: cells whose stored value happened to equal iinfo.max - sentinel were flagged NaN while real sentinel cells survived as inverted values. PR #1804 (a5d78e4) had refactored the helper but kept the original ordering. Fix: introduce _miniswhite_inverted_nodata in _reader.py and stash the inverted sentinel on geo_info._mask_nodata; route every backend mask through that field, keeping geo_info.nodata + attrs[nodata] at the original value for write-side round-trip. Dask path also re-inverts the closure nodata at graph-build time, picking up _ifd_photometric / _ifd_samples_per_pixel stashed in _read_geo_info. 9 regression tests in test_miniswhite_nodata_1809.py cover uint8 nodata=0, uint16 nodata=65535, float32 nodata=-9999 across numpy, dask, and GPU backends plus no-collision and no-nodata controls. All 2424 non-stale geotiff tests pass (4 pre-existing failures unrelated to this fix). Categories: Cat 2 (NaN propagation: real data became NaN while sentinel survived as inverted value) + Cat 5 (backend inconsistency: all four backends share the identical wrong result, so they agreed on the wrong answer rather than diverged). | Pass 21 (2026-05-13): MEDIUM fixed -- issue #1774. open_geotiff / read_geotiff_dask / _apply_nodata_mask_gpu crashed with ValueError: cannot convert float NaN to integer when reading an integer TIFF whose GDAL_NODATA tag was the string ""nan"" / ""inf"" / ""-inf"". Three sites in xrspatial/geotiff/__init__.py called int(nodata) on the integer-dtype branch without first checking np.isfinite. _geotags.py:extract_geo_info parses the GDAL_NODATA tag through float(nodata_str) so a ""nan"" tag surfaces as Python NaN; the integer mask code then explodes. Sibling helpers _resolve_masked_fill and _sparse_fill_value in _reader.py already gate on not math.isnan(v) and not math.isinf(v) (the unfinished pass of #1581). Fix: gate each int(nodata) cast on np.isfinite(nodata). A non-finite sentinel on an integer file cannot match any pixel, so the mask is a no-op and the file dtype is preserved; attrs['nodata'] still carries the raw NaN/Inf sentinel so a write round-trip keeps the original GDAL_NODATA tag. The read_geotiff_dask effective_dtype branch already used try/except and was safe in practice, but tightened with the same isfinite gate for readability. 15 regression tests in test_nodata_nan_int_1774.py covering eager numpy (3 NaN variants + 6 Inf variants), in-range finite still masks regression guard, dask (NaN + Inf), and GPU (NaN + Inf + finite). All pass; 2023 existing geotiff tests still pass (7 pre-existing test_predictor2_big_endian_gpu failures unrelated: they reference xrspatial.geotiff.read_to_array which was hidden from the public namespace in #1708, 3 pre-existing matplotlib palette failures in test_features.py unrelated). Categories: Cat 2 (NaN propagation: NaN nodata produced a crash instead of being treated as missing) + Cat 5 (backend inconsistency: _resolve_masked_fill / _sparse_fill_value already guarded; the three __init__.py sites did not). | Pass 20 (2026-05-12): HIGH fixed -- PR #1691 (no issue created; agent harness blocked gh issue create). Integer COG overview pyramid mixed sentinel into reduced pixels. _block_reduce_2d (_writer.py:258-264) and _block_reduce_2d_gpu (_gpu_decode.py:3027-3028) promoted integer blocks to float64 but never masked the sentinel to NaN before nanmean / nanmin / nanmax / nanmedian. The reduction averaged the sentinel into surrounding valid cells (e.g. (-9999 + 100 + 100 + 100)/4 = -2425 cast back to int16), producing overview pixels that the read-side int-to-NaN mask in open_geotiff couldn't recover because they didn't equal the sentinel. Silent garbage at every zoom above level 0 for to_geotiff(int_data, cog=True, nodata=N). Methods affected: mean, min, max, median; nearest/mode safe (no averaging). Fix: gate the sentinel-to-NaN mask on representability in the source integer dtype (mirrors _int_nodata_in_range in _reader.py) so uint16+GDAL_NODATA=""-9999"" stays a no-op; rewrite all-sentinel-block NaN back to sentinel before the integer dtype cast so the cast is well-defined (the caller's post-overview loop in write() only runs for floats). GPU mirror gets the same path with cupy.where + cupy.isnan for byte parity with CPU. 38 regression tests in test_cog_int_overview_nodata_2026_05_12.py: _block_reduce_2d per-dtype/per-method matrix (uint8/uint16/int16/int32 x mean/min/max/median), all-sentinel-block, no-nodata regression, out-of-range sentinel no-op, end-to-end uint16 + int16 round-trip, 3-band integer COG, GPU per-dtype/per-method matrix, CPU/GPU byte-match parity. All 1606 existing geotiff tests still pass. Categories: Cat 1 (precision/representation loss in nan-aware reduction) + Cat 2 (silent NaN-equivalent corruption from sentinel poisoning) + Cat 5 (backend parity between float and integer code paths within the same writer). Deferred LOW: HTTP COG path (_read_cog_http at _reader.py:1638) skips the band-range validation that local/dask/GPU added in #1673; band=-1 silently selects the last channel on HTTP while local raises IndexError. Cat 5, MEDIUM-leaning but separate concern from the overview fix; one-finding-per-PR per project policy. | Pass 19 (2026-05-12): MEDIUM fixed -- issue #1655. read_vrt silently dropped 0 on a SimpleSource because of src.nodata or nodata at _vrt.py:370. Python treats 0.0 as falsy, so the per-source sentinel fell through to the band-level (or None when missing) and pixels equal to 0.0 in the source file survived as valid data. The in-code comment acknowledged the quirk as backward compat, but the resulting behaviour silently biased every NaN-aware aggregation on VRT mosaics whose sources used 0 as a sentinel (a common convention for unsigned remote-sensing imagery). Fix: src_nodata = src.nodata if src.nodata is not None else nodata. Five regression tests in test_vrt_source_nodata_zero_1655.py covering source NODATA=0, integer XML literal, non-zero unchanged, band-level NoDataValue=0 still honoured, and source-overrides-band precedence. All 100 vrt-related geotiff tests still pass; 3 pre-existing test_features.py matplotlib palette failures unrelated. Categories: Cat 2 (NaN propagation) + Cat 5 (backend inconsistency: read_geotiff masks 0 correctly when GDAL_NODATA tag is set; only VRT path was broken). | Pass 18 (2026-05-11): MEDIUM fixed -- issue #1642. PR #1641 (issue #1640) inherited level-0 georef on overview reads but kept the level-0 origin_x/origin_y unchanged. That is correct for PixelIsArea (origin = upper-left corner of pixel (0,0)) but wrong for PixelIsPoint (origin = center of pixel (0,0), GeoKey 1025 = 2). For a 1024x1024 PixelIsPoint COG with 10 m pixels and origin (0, 0), open_geotiff(overview_level=1) returned x[:3]=[0,20,40] instead of [5,25,45] (level-1 pixel 0 covers level-0 pixels 0-1 whose centers are 0 and 10, centroid 5); same for y. Downstream sel/interp/reproject silently snaps to the wrong pixel for any DEM-style PixelIsPoint COG (USGS, OpenTopography, Copernicus DEM). Categories: Cat 3 (off-by-one / boundary handling) + Cat 5 (raster_type-dependent backend convention). Fix: in extract_geo_info_with_overview_inheritance (_geotags.py), pick the effective raster_type first (overview-declared if non-default, otherwise inherited from parent), then when it is PixelIsPoint apply origin_shift = (scale - 1) * 0.5 * pixel_size_lvl0 along each axis before building the new GeoTransform. PixelIsArea path is byte-equivalent. 13 regression tests in test_overview_pixel_is_point_1642.py: centroid identity across all 4 backends, transform tuple across all 4 backends, uniform grid step, unit-level helper tests for both raster_types via stubbed extract_geo_info, own-geokeys-not-clobbered path on PixelIsPoint, and a PixelIsArea regression check. All 1397 existing non-network geotiff tests still pass (3 pre-existing matplotlib palette failures unrelated). Deferred LOW: non-power-of-two overview dimensions cause scale = base_w/ov_w to diverge from the true 2^level reduction (writer drops the right/bottom strip via h2=(h//2)*2; for h=1023 a level-1 overview has 511 rows so scale=2.0019 not 2.0). Fix would need to either (a) emit explicit geo tags on overview IFDs from the writer or (b) pass the level number into the inheritance helper; neither is a one-line change and the resulting coord error is sub-pixel of level 0. | Pass 17 (2026-05-11): MEDIUM fixed -- issue #1634. open_geotiff eager path windowed read produced confusing CoordinateValidationError when window extended past source extent. read_to_array clamped the window internally and returned a smaller array, but the eager code path used unclamped window indices for y/x coord generation (xrspatial/geotiff/__init__.py lines 562-572), so the coord array length differed from the data and xarray refused to construct the DataArray. Same bug affected the windowed transform shift in _populate_attrs_from_geo_info. The dask path (read_geotiff_dask) already validated up front since #1561, raising a clear ValueError with the format 'window=... is outside the source extent (HxW) or has non-positive size.' so the two backends diverged on the contract. Fix: validate the window up front in open_geotiff's eager branch via _read_geo_info (metadata-only read, no extra pixel cost) using the exact same condition the dask path uses, raising the same ValueError message format. Reproduction: 10x10 raster + window=(5,5,15,15) on eager raised CoordinateValidationError('conflicting sizes ... length 5 ... length 10'); now raises ValueError('window=(5, 5, 15, 15) is outside the source extent (10x10) or has non-positive size.'). Categories: Cat 3 (off-by-one / boundary handling) + Cat 5 (backend inconsistency). 12 regression tests in test_window_out_of_bounds_1634.py: negative start, past-right-edge, past-bottom-edge, past-both-edges, zero-size, inverted window, full-extent ok, interior subset, edge-aligned, eager-vs-dask parity, message-format parity, issue reproducer. All 1286 existing non-network geotiff tests still pass. | Pass 16 (2026-05-11): HIGH fixed -- issue #1623. to_geotiff(cog=True, overview_resampling='cubic', nodata=) on a float raster with NaN regions produced overview pixels with severe ringing artefacts near nodata borders. Same class of bug as #1613 but for the cubic branch: writer rewrites NaN to the sentinel upstream, then _block_reduce_2d(method=cubic) handed the sentinel-poisoned array straight to scipy.ndimage.zoom(order=3). The cubic spline blended the sentinel (e.g. -9999) into neighbouring cells, producing values like 1133.44, -10290.08 where the data was a constant 100. Repro on 16x16 float32 with a 4x4 NaN corner showed 18 polluted pixels in the 8x8 overview. Fix: when nodata is supplied on a float dtype and the sentinel is found, mask sentinel to NaN, run cubic with prefilter=False so a single NaN cannot poison the entire row/column (default B-spline prefilter is global), then rewrite any NaN in the result back to the sentinel. prefilter=False only fires when a sentinel is present so the non-nodata cubic semantics are unchanged. GPU side: _block_reduce_2d_gpu previously raised on method='cubic'; added a CPU fallback (same pattern as 'mode') so GPU writer produces byte-equivalent overviews. GPU_OVERVIEW_METHODS now includes 'cubic'. 12 regression tests in test_cog_cubic_overview_nodata_1623.py (helper no-ringing, poisoning repro, no-nodata unchanged, end-to-end round-trip, GPU fallback, CPU/GPU byte-match, +/-inf nodata mask, NaN-sentinel no-op, GPU_OVERVIEW_METHODS contract). All 1256 existing geotiff tests still pass (3 pre-existing matplotlib failures unrelated). | Pass 15 (2026-05-11): HIGH fixed -- issue #1613. to_geotiff(cog=True, nodata=) on a float raster with NaN produced a corrupted overview pyramid. The NaN-to-sentinel rewrite in __init__.py:1202 (CPU) and :2852 (GPU write_geotiff_gpu) ran BEFORE _make_overview / make_overview_gpu, so the nan-aware aggregations (np.nanmean/min/max/median, cupy.nanmean/min/max/median) saw the sentinel as a real number and biased every overview pixel. Reproduction with -9999 sentinel produced [[-4998.75,-4997.75],..] where np.nanmean gives [[1.5,3.5],..]. Both CPU and GPU paths affected; backend results matched each other but were both wrong (CAT 2 NaN propagation + CAT 5 documents the parity). Fix: _block_reduce_2d / _block_reduce_2d_gpu accept a nodata kwarg that masks the sentinel back to NaN for float dtypes before the reduction; the writer's overview loop passes nodata in, then rewrites all-sentinel reductions (which surface as NaN from the reducer) back to the sentinel for the on-disk pyramid. 11 regression tests in test_cog_overview_nodata_1613.py (CPU mean / partial-block / min/max/median / no-nodata passthrough / helper kwarg / all-sentinel block / GPU mean / GPU helper / CPU-GPU agreement). All 235 nodata/overview/cog tests still pass. | Pass 14 (2026-05-11): HIGH fixed -- issue #1611. read_vrt(band=None) on a multi-band integer VRT with per-band tags only masks band 0's sentinel. __init__.py lines 2795-2809 in read_vrt apply vrt.bands[0].nodata to the full ndim==3 array; bands 1+ keep their integer sentinels as literal finite values (e.g. 65000 surfaces as 65000.0 after the dtype=float64 cast, not NaN). Float-VRT path masks per-band correctly in _vrt._read_data lines 296-297 + 347-351. PR #1602 fixed the single-band band=N case for issue #1598; the band=None multi-band case is the same class of bug. Repro: 2-band uint16 VRT with NoDataValue 65535 / 65000 returns r.values[1,1,1] == 65000.0 instead of NaN; r.values[1,1,0] is NaN (band 0 sentinel masked). Fix scope: in read_vrt, when band is None, iterate over vrt.bands and mask each arr[..., i] slice against its own (gated by the same _int_nodata_in_range guard PR #1583 introduced). Severity HIGH (Cat 2 NaN propagation + Cat 5 backend inconsistency: identical input semantics produce different masking outcomes based on dtype, with finite garbage values where NaN expected). Fix in PR #1612: walks vrt.bands when band is None and ndim==3, masks each arr[..., i] slice against its own via the refactored _sentinel_for_dtype helper (reuses PR #1583's range guard so out-of-range/non-finite/fractional sentinels are a no-op). attrs['nodata'] still carries band 0's sentinel for band=None reads (documented contract). 7 regression tests in test_vrt_multiband_int_nodata_1611.py: uint16 per-band, int32 negative, mixed presence, dtype preservation when no sentinel hit, out-of-range gating, band=N non-regression, attrs contract. 135 existing vrt/nodata geotiff tests still pass. | Pass 13 (2026-05-11): HIGH fixed -- issue #1599. write_geotiff_gpu (and to_geotiff gpu=True) emitted raw NaN bytes for missing pixels even when nodata= was supplied, while the CPU writer substituted NaN with the sentinel before encoding. xrspatial-only round-trips were unaffected (the reader masks both NaN and the sentinel), but external readers (rasterio/GDAL/QGIS) that mask only on the GDAL_NODATA tag saw NaN pixels as valid data -- rasterio reported 100% valid pixels on a 25-NaN file vs CPU's 25-invalid report. Root cause: __init__.py lines 2579-2587 jumped from shape/dtype resolution straight to compression, missing the equivalent of the CPU writer's NaN-to-sentinel rewrite at to_geotiff line ~1156. Fix: cupy.isnan + masked write on a defensive copy of arr, gated on np_dtype.kind=='f' and not np.isnan(float(nodata)). Caller's CuPy buffer preserved (copy before mutate). 7 regression tests in test_gpu_writer_nan_sentinel_1599.py: substitution lands as sentinel, CPU/GPU byte-equivalent, caller buffer not mutated, no-NaN no-op, NaN sentinel skips substitution, rasterio sees identical invalid count on CPU/GPU, multiband 3D path. All other GPU writer tests still pass (50 passed across band-first, attrs, nodata, dask+cupy, writer, nodata aliases). | Pass 12 (2026-05-11): HIGH fixed -- issue #1581. Reading a uint TIFF with a negative GDAL_NODATA sentinel (e.g. uint16 + -9999) raised OverflowError on every backend because the nodata-mask code did arr.dtype.type(int(nodata)) with no range check. Three identical cast sites in __init__.py (numpy eager, _apply_nodata_mask_gpu, _delayed_read_window) plus _resolve_masked_fill and _sparse_fill_value in _reader.py. Fix: _int_nodata_in_range helper gates the cast; out-of-range sentinels are a no-op for value matching (the file can never contain that value), file dtype is preserved, attrs['nodata'] still surfaces the original sentinel so write round-trips keep the GDAL_NODATA tag intact. Matches rasterio behavior. 8 regression tests in test_nodata_out_of_range_1581.py cover the helper, both eager and dask read paths, in-range sentinel non-regression, and GPU helper (cupy-gated). | Pass 11 (2026-05-10): CLEAN. Audited the one additional commit since pass 10 -- #1559 (PR 1548, Centralise GeoTIFF attrs population across all read backends). Refactor extracts _populate_attrs_from_geo_info helper and routes eager numpy, dask, GPU stripped, GPU tiled read paths through it; before the fix dask only emitted crs/transform/raster_type/nodata while numpy emitted the full attrs set including x/y_resolution, resolution_unit, image_description, extra_samples, GDAL metadata, and the CRS-description fields. No data-path arithmetic touched; only attrs dict population. Windowed origin math (origin_x + c0*pixel_width, origin_y + r0*pixel_height) verified to produce -98.0 / 48.75 origin for window=(10,20,50,70) on a (0.1,-0.125) pixel-size raster, with PixelIsArea half-pixel offset preserved on coord lookups (-97.95, 48.6875). Cross-backend attrs parity re-verified: numpy/dask/cupy all emit identical key set on deflate+predictor3+nodata round-trip (crs, crs_wkt, nodata, transform, x_resolution, y_resolution). Data bit-parity re-verified across numpy/dask/cupy on same payload (np.array_equal with equal_nan=True). test_attrs_parity_1548.py (5 tests), test_reader.py/test_writer.py/test_dask_cupy_combined.py (25 tests), GPU orientation/predictor2-BE/LERC-mask/nodata/byteswap suites (65 tests) all green. No accuracy or backend-divergence findings. | Pass 10 (2026-05-10): CLEAN. Audited 5 recent commits: #1558 drop-defensive-copies (frombuffer path still .copy()s before in-place predictor decode at _reader.py:778), #1556 fp-predictor ngjit (writer pre-ravels so 1-D slice arg is correct, float32/64 LE+BE bit-exact), #1552 batched D2H (OOM guard fires before cupy.concatenate, host_buf offsets correct), #1551 parallel-decode gate (>= vs > sends 256x256 default to parallel path, no value diff confirmed via partial-tile parity), #1549 nvjpeg constants (gray + RGB GPU JPEG decode pixel-identical to Pillow CPU, max diff = 0). Cross-backend parity re-verified clean: numpy/dask+numpy/cupy/dask+cupy equal .data/.dtype/.coords/nodata/NaN-mask on deflate+predictor3+nodata; orientations 1-8 numpy==GPU; partial edge tiles 100x150, 257x383, 512x257 numpy==GPU==dask; predictor2 LE/BE round-trip uint8/int16/uint16/int32/uint32 pass; predictor3 LE/BE float32/64 pass. Deferred LOW (pre-existing, not opened): float16 (bps=16, SampleFormat=3) absent from tiff_dtype_to_numpy map - writer never emits, asymmetric but unreachable. | Pass 9 (2026-05-09): TWO HIGH fixed -- (a) PR #1539 closes #1537: TIFF Orientation tag 2/3/4 (mirror flips) on georeferenced files left y/x coords computed from the un-flipped transform, so xarray label lookups returned the wrong pixel even though _apply_orientation flipped the buffer. PR #1521 only updated the transform for the 5-8 axis-swap branch. Fix updates origin and pixel-scale signs along whichever axes were flipped, for both PixelIsArea (origin shifts by N*step) and PixelIsPoint (shifts by (N-1)*step). 10 new tests in test_orientation.py. (b) PR #1546 closes #1540: read_geotiff_gpu ignored Orientation tag completely; CPU correctly applied 2-8 (PR #1521) but GPU returned the raw stored buffer. Cross-backend disagreement on every non-default orientation. Fix adds _apply_orientation_gpu (cupy slicing mirror of the CPU helper) and _apply_orientation_geo_info, threads them into the tiled GPU pipeline, reuses CPU-fallback geo_info for the stripped path to avoid double-applying. 28 new tests in test_orientation_gpu.py (every orientation, single-band tiled, single-band stripped, 3-band tiled, mirror-flip sel-fidelity, default no-tag passthrough). Re-confirmed clean: HTTP coalesce_ranges with overlapping ranges and zero-length ranges, parallel streaming write thread-safety (each tile gets independent buffer via copy or padded zeros), planar=2 + chunky GPU LERC mask propagation matches CPU, IFD chain cap MAX_IFDS=256, max_z_error round-trip on tiled write, _resolve_masked_fill float vs integer dtype semantics. Deferred LOW: per-sample LERC mask (3D mask (h,w,samples)) collapsed to per-pixel ""any sample invalid"" on GPU while CPU honours per-sample; LERC implementations rarely emit 3D masks (verified: lerc.encode with 2D mask on 3-band returns 2D mask). Documented planar=2 + LERC + GPU silently drops mask (rare in practice, source comment acknowledges). | Pass 8 (2026-05-07): HIGH fixed in fix-jpeg-tiff-disable -- to_geotiff(compression='jpeg') wrote files that no external reader can decode. The writer tags compression=7 (new-style JPEG) but emits a self-contained JFIF stream per tile/strip and never writes the JPEGTables tag (347) that the TIFF spec requires for that codec. libtiff/GDAL/rasterio all reject the file with TIFFReadEncodedStrip() failed; our reader round-trips because Pillow decodes the standalone JFIF, hiding the break. Pass-4 notes flagged the read side of the same JPEGTables gap and deferred it; pass-8 covers the write side. Fix: reject compression='jpeg' at the to_geotiff entry with a clear ValueError pointing at deflate/zstd/lzw. The internal _writer.write is untouched so the existing self-decoding tests still cover the codec; re-enabling the public path needs a JPEGTables-aware encoder. PR diffs reviewed but not merged: #1512 (BytesIO source) and #1513 (LERC max_z_error) -- both look correct; #1512 file-like read path goes through read_all() once so the per-call BytesIOSource lock is theoretical, and #1513 forwards max_z_error through every overview/tile/strip/streaming path including _write_vrt_tiled and _compress_block. No regressions found in either open PR. Other surfaces audited clean: predictor=3 with float16 (writer auto-promotes to float32 on both eager and streaming paths, value-exact round-trip); planar=2 multi-tile read uses band_idx*tiles_per_band offset so no cross-contamination between planes; _header.py multi-byte tag parsing uses bo (byte_order) consistently; Pillow YCbCr-vs-tagged-RGB photometric mismatch becomes moot once JPEG is disabled. Deferred (LOW/MEDIUM, not filed): JPEG2000 writer accepts arbitrary dtype with no validation (rare codec, narrow risk); float16 dtype not in tiff_dtype_to_numpy decode map (writer never emits it - asymmetric but unreachable); Orientation tag (274) still ignored on read (pass-4 deferral). | Pass 7 (2026-05-07): HIGH fixed in fix-mmap-cache-refcount-after-replace -- _MmapCache.release() looked up the cache entry by realpath, so a holder that acquired the OLD mmap before an os.replace and released it AFTER another caller had acquired the post-replace entry would decrement the new holder's refcount. Subsequent eviction (cache full, or another acquire) closed the still-in-use mmap, breaking reads with 'mmap closed or invalid'. Real exposure: any concurrent reader/writer pattern where to_geotiff replaces a file that another reader had just opened via open_geotiff with chunks= or via _FileSource. PR #1506 added stale-replacement detection but did not fix the refcount confusion across the pop. Fix: acquire returns an opaque entry token; release takes the token and decrements that exact entry, regardless of cache state. Orphaned (popped) entries close their fh+mmap when their own refcount hits zero. _FileSource updated to pass the token. Regression test test_release_after_path_replacement_does_not_clobber_new_holder added. All 665 geotiff tests pass; GPU path verified. | Pass 6 (2026-05-07) PR #1507: BE pred2 numba TypingError. | Pass 5 (2026-05-06) PR #1506: mmap cache stale after file replace. | Pass 4 (2026-05-06) PR #1501: sparse COG tiles. | Pass 3 (2026-05-06) PR #1500: predictor=3 byte order. | Pass 2 (2026-05-05) PR #1498: predictor=2 sample-wise. | Pass 1 (2026-04-23) PR #1247. Re-confirmed clean over passes 2-7: items 2 (writer always emits LE TIFFs - hardcoded b'II'), 3 (RowsPerStrip default = height when missing), 4 (StripByteCounts missing raises clear ValueError), 5 (TileWidth without TileLength caught by 'tw <= 0 or th <= 0' check at _reader.py:688), 9 (read determinism on compressed+tiled+multiband), 11 (predictor=2 with awkward sample stride round-trips), 18 (compression_level=99 raises ValueError 'out of range for deflate (valid: 1-9)'), 21 (concurrent writes serialize correctly via mkstemp+os.replace), 24 (uint16 dtype preserved on numpy backend, dask honors chunks param), 26 (chunks rounds correctly with remainder chunk for non-tile-aligned). Deferred: item 8 (BytesIO/file-like sources are not supported, source.lower() error) - documented as 'str' parameter, not a bug; item 19 (LERC max_z_error not user-exposed by to_geotiff) - missing feature, not a bug." +glcm,2026-05-01,1408,HIGH,2,"angle=None averaged NaN as 0, masking no-valid-pairs as zero texture; fixed via nanmean-style averaging" +hillshade,2026-04-10T12:00:00Z,,,,"Horn's method correct. All backends consistent. NaN propagation correct. float32 adequate for [0,1] output." +hydro,2026-04-30,,LOW,1,Only LOW: twi log(0)=-inf if fa=0 (out-of-contract); MFD weighted sum no Kahan (negligible). No CRIT/HIGH issues. +interpolate-kriging,2026-06-04,2915,MEDIUM,1,"Cat1 nugget-on-diagonal bug (MEDIUM): _build_kriging_matrix set K[:n,:n]=vario_func(D) where D has 0 diagonal, so vario_func(0)=nugget c0 landed on the matrix diagonal; semivariogram gamma(0)=0 by definition (nugget is the h->0+ limit). Forced exact interpolation of noisy data and biased kriging variance downward. Only bites when fitted nugget>0; existing trend-dominated test data fits ~0 nugget so tests passed. Fix #2915/PR #2922: np.fill_diagonal(G,0.0) in shared host code (all 4 backends consume same K_inv). Cats 2-5 clean: validate_points drops NaN/Inf rows; range floor 1e-12 prevents div blowup; dask map_blocks slices grid coords with correct half-open extents and returns matching block shape (kriging is global, no overlap needed); planar Euclidean distance is expected for kriging (Cat4 n/a); numpy/cupy/dask share one algorithm and parity tests pass rtol=1e-10. CUDA available; all 16 kriging tests pass incl cupy + dask+cupy. Singular-matrix path adds 1e-10*eye Tikhonov term (separate from nugget, unaffected, correct)." +kde,2026-04-13T12:00:00Z,1198,,,kde/line_density return zeros for descending-y templates. Fix in PR #1199. +mahalanobis,2026-05-01,,LOW,1,"LOW: np.linalg.inv (no pinv fallback) returns garbage for near-singular cov without raising. LOW: two-pass mean/cov instead of Welford could lose precision for inputs with very large mean/small variance. No CRIT/HIGH; all four backends use float64 throughout, NaN handled via isfinite, dist_sq clamped non-negative, singular case raises ValueError." +mcda,2026-06-10,3146,MEDIUM,5,"Cat5 backend failures, all raise loudly (no wrong numbers): owa raised on every dask backend (da.sort does not exist in dask.array; fixed via rechunk+map_blocks np.sort with explicit meta) and on cupy (numpy order-weight array * cupy stack); standardize piecewise raised on cupy (cupy.interp needs cupy bp/vl + C-contiguous input) and dask+cupy (np.asarray on cupy chunk), categorical raised on dask+cupy (same asarray); monte-carlo sensitivity raised on cupy/dask+cupy (.values implicit conversion; now Welford accumulates with matching array module). All fixed + GPU tests added (issue #3146). Cats 1-4 clean: Welford already used, AHP Perron eigenvector + Saaty RI table correct, NaN propagation verified across combine ops, no neighborhood/geodesic code. constrain on cupy raises cupy.astype AttributeError = known cupy 13.6 + xarray xr.where incompat (dependency pin, not mcda). CUDA available; cupy + dask+cupy executed for all probes and tests." +morphology,2026-04-30,"1397,1399",HIGH,2;5,HIGH fixed in #1397/PR #1398: morph_erode/dilate seeded centre cell into running min/max even when kernel[centre]==0 (all 4 backends). HIGH fixed in #1399/PR #1400: dask backends raised on 1xN/Nx1 kernels because empty-slice writeback (0:-0). +multispectral,2026-03-30T14:00:00Z,1094,,, +normalize,2026-05-01,,,,rescale and standardize across all 4 backends. NaN/inf filtered via isfinite mask before min/max/mean/std. Constant input handled (range=0 -> new_min; std=0 -> 0.0). Output dtype float64 consistently. Backend parity covered by test_matches_numpy. No accuracy issues found. +perlin,2026-04-10T12:00:00Z,,,,Improved Perlin noise implementation correct. Fade/gradient functions verified. Backend-consistent. Continuous at cell boundaries. +polygon_clip,2026-06-10,3186,HIGH,5,"Cat5 backend inconsistency: dask+cupy clip_polygon rasterizes the mask with a uniform chunk size from the raster's first chunk, then feeds raster+mask to da.map_blocks (positional block pairing). Non-uniform raster chunks gave the mask a different block layout -> IndexError/ValueError (or silent mis-stamp). Repro (8,6) rechunk ((3,5),(6,)) on dask+cupy raised ValueError Shapes do not align; dask+numpy was fine via xarray.where rechunk. Fix #3186/PR: rechunk cond to raster.data.chunks[-2:] before map_blocks; added non-uniform regression tests for dask+numpy and dask+cupy. use_cuda->gpu migration in that branch was already landed by #3089/#3122. CUDA available; cupy+dask+cupy verified, 25 tests pass. Cats 1-4 clean: numpy path uses raster.where, cupy path operates on raw arrays, NaN inputs preserved, no neighborhood ops/curvature. Prior fix #1197/#1200 (crop+all_touched) merged and unrelated." +polygonize,2026-05-29,2606,HIGH,5,"Cat 5 HIGH: dask connectivity=8 cross-chunk merge filled diagonal notch where same-value regions meet only at a corner across a chunk boundary; total area exceeded raster. Hole ring was dropped because containment tested hole[0] (on exterior at pinch). Fixed via _ring_interior_point in PR for #2606. numpy, dask+numpy, dask+cupy area parity now holds; 4-conn was already correct. cupy + dask+cupy paths validated on GPU host. Other cats clean: NaN masked on numpy/cupy float paths (tested), _is_close handles +/-inf via exact-equality short-circuit, atol/rtol/simplify_tolerance reject NaN/inf, integer GPU CCL matches numpy." +proximity,2026-06-09,3108,HIGH,4;5,"Cat5/Cat4: bounded GREAT_CIRCLE dask (numpy+cupy) missed targets across the +/-180 antimeridian seam: _halo_depth sized x-halo as linear parallel-arc sum, but haversine is periodic in lon and chords shorten near poles, so array-space adjacency is no lower bound on spherical distance; numpy/cupy (brute force) found the wrap target (~111 km), dask returned NaN. Fixed in #3108 via chord bound 2R asin(cos(lat_max)|sin(dlon/2)|) + x-axis fold when seam/180-deg chord within max_distance (covers over-pole too). CUDA host: cupy + dask+cupy executed, 417+ tests pass. Cat1-3 clean (float32 output documented; NaN via isfinite consistent; bounds guards correct; tie-break unified in #2881). LOW (not fixed): great_circle_distance uses WGS84 equatorial radius 6378137 as sphere radius (~0.1% vs mean-radius convention) but documented and exposed as param." +rasterize,2026-06-12,3304,HIGH,3,"Cat3 HIGH: merge='sum'/'count' burned the same geometry into a pixel 2-3x on all 4 backends (cross-backend parity probes never caught it because all backends shared the bug). Two sources: all_touched=True polygons overlap the scanline center-fill with the supercover boundary pass and re-burn shared ring-vertex cells (count up to 3 for ONE polygon; rasterio MergeAlg.add gives 1); lines re-burn the connecting vertex of consecutive Bresenham segments (count 2 per interior vertex; rasterio gives 1 even for self-crossing lines). Fix #3304/PR #3313: for sum/count only, enumerate line + boundary cells host-side, dedup per (geometry,row,col), drop boundary cells the geometry's own scanline covers via a crossing-parity replay of the fill's exact float arithmetic on the same edge table, burn survivors through the point kernels (numpy/cupy/dask+numpy/dask+cupy). Coverage unchanged (==merge='last'); points intentionally not deduped (GDAL adds once per point, dup MultiPoint=2 matches rasterio). CUDA available; cupy + dask+cupy executed. 790 rasterize tests + 48 new pass. Cats 1/2/4/5 clean this pass: GPU atomic min/max NaN masks (#2255) correct, GPU ceil emulation matches np.ceil for negatives, no curvature surface, backends bit-identical on mixed-geometry probes. Pre-existing known-OK: Bresenham vs GDAL line coverage tie-breaks (pinned in coverage_2026_06_09 tests); dead all_touched branch in _extract_edges_vectorized (callers never pass all_touched) noted, not removed." +reproject,2026-06-12,3274,HIGH,1;4,"3 confirmed bugs, all kernel-vs-PROJ parity: #3274 HIGH LAEA inverse spurious /rq (2.6 km err for 3035) + _authalic_apa inverse series wrong (4.8 m in AEA/CEA inverses; PROJ 3-term = 1.6 mm), CPU+CUDA kernels both; #3275 HIGH _is_wgs84_compatible_ellipsoid passes R-defined spheres (MODIS sinusoidal 18.9 km err) and _aea_params/_cea_params lack the guard entirely (23.8 km on spherical aea/cea); #3276 MEDIUM itrf helmert scale 1e-9 but PROJ +s is ppm (1e-6), ~23 mm err. Verified clean: merc/emerc/UTM/tmerc/LCC/polar stere (incl lat_ts akm1) forward+inverse <=1e-5 m vs pyproj; resampling kernels NaN handling and GDAL renorm match across numpy/cupy/dask (CUDA run, gpu-vs-cpu 1.3e-7); dask footprint chunk-skip bbox is a superset in all probed cases (no holes). LOW (documented only): _source_footprint_in_target probe array typo uses x-midpoint mx as a latitude in last 3 ys entries (bbox superset, correctness unaffected)." +resample,2026-05-29,2610,HIGH,3;5,"dask interp (nearest/bilinear) overlap depth=1 too small on downsample; block-centered source coord landed past chunk, map_coordinates clamped to edge -> wrong seam rows. Fixed PR #2627 via per-axis _downsample_radius. cupy+dask+cupy verified." +sieve,2026-04-13T12:00:00Z,,,,Union-find CCL correct. NaN excluded from labeling. All backends funnel through _sieve_numpy. +sky_view_factor,2026-05-01,1407,HIGH,4,Horizon angle ignored cell size; fixed by passing cellsize_x/cellsize_y into CPU+GPU kernels and using ground distance +terrain,2026-04-10T12:00:00Z,,,,Perlin/Worley/ridged noise correct. Dask chunk boundaries produce bit-identical results. No precision issues. +terrain_metrics,2026-04-30,,LOW,2;5,"LOW: Inf input not rejected, propagates as Inf (consistent across backends but undocumented). LOW: dask+cupy non-nan boundary path double-pads (wasted compute, central output values still correct). No CRIT/HIGH; tests cover NaN propagation, all 4 backends, all 4 boundary modes, dtype acceptance." +viewshed,2026-05-29,2691,HIGH,3;5,max_distance window sized from coarser axis clipped cells on anisotropic rasters (PR #2702). LOW unfixed: distance_sweep ring radius same max(res) pattern but max_distance arg always None; _calculate_event_row_col line 880 abs(x>1) precedence bug is a broken guard only. cuda+rtx paths validated. +visibility,2026-04-13T12:00:00Z,,,,"Bresenham line, LOS kernel, Fresnel zone all correct. All backends converge to numpy." +worley,2026-05-01,,MEDIUM,2;5,"MEDIUM: numpy backend uses np.empty_like(data) so integer input dtype produces integer output (distances truncated to 0); cupy/dask paths always produce float32. LOW: freq=inf produces 100000 sentinel (sqrt of initial min_dist=1e10), no validation of freq/seed for non-finite values." +zonal,2026-05-27,2528,MEDIUM,5,"Pass 2 (2026-05-27): MEDIUM fixed -- issue #2528. zonal_stats() on dask-backed inputs silently dropped 'majority' from the requested stats list. The mutable default stats_funcs included 'majority' (added in commit 7c8d5759), but the dask path filtered it out at xrspatial/zonal.py:459 (computed_stats = [s for s in stats_funcs.keys() if s in stats_dict]) because 'majority' is not in _DASK_BLOCK_STATS. Symptom: stats(zones=dask, values=dask) returned 7 columns instead of the 8 the docstring promises; stats(..., stats_funcs=['mean','majority']) returned only ['zone','mean'] with no error or warning. Both dask+numpy and dask+cupy were affected (dask+cupy delegates to dask+numpy). Fix: replaced the mutable list literal default with stats_funcs=None and resolved the default per backend inside the function -- numpy/cupy get the full 8-stat list, dask gets the 7-stat subset (no majority). Explicit majority on dask now raises ValueError with a clear supported-stats message instead of silently filtering. 4 regression tests in test_zonal.py: explicit majority raises on dask, bare default omits majority on dask, bare default keeps majority on numpy, default list is not mutated across calls (covers the historical mutable-default pitfall). All 129 test_zonal.py tests pass (125 pre-existing + 4 new); test_dasymetric.py 61 tests still pass (dasymetric uses zonal.stats internally). Categories: Cat 5 (backend inconsistency: numpy/cupy honoured majority; dask paths silently dropped it). | Pass 1 (2026-03-30T12:00:00Z): historical entry #1090." diff --git a/docs/source/user_guide/geotiff_safe_io.rst b/docs/source/user_guide/geotiff_safe_io.rst index e711dc687..a755411f0 100644 --- a/docs/source/user_guide/geotiff_safe_io.rst +++ b/docs/source/user_guide/geotiff_safe_io.rst @@ -204,6 +204,11 @@ the ambiguous-metadata family at once. - The affine transform has non-zero rotation / shear terms. - ``allow_rotated=True`` (experimental). The opt-in returns the pixel grid without the geospatial assumption. + * - :class:`~xrspatial.geotiff.DegeneratePixelSizeError` + - The ``ModelPixelScale`` (or ``ModelTransformation`` diagonal) + declares a zero or non-finite pixel size, which would build a + constant or all-NaN coordinate axis. + - No opt-in. Re-export the file with a non-zero, finite pixel size. * - :class:`~xrspatial.geotiff.NonUniformCoordsError` - The DataArray coords on write imply a non-uniform pixel grid. - Regrid the array to uniform spacing first. diff --git a/xrspatial/geotiff/__init__.py b/xrspatial/geotiff/__init__.py index af8087ac8..5764875be 100644 --- a/xrspatial/geotiff/__init__.py +++ b/xrspatial/geotiff/__init__.py @@ -65,10 +65,10 @@ from ._coords import \ transform_tuple_from_pixel_geometry as _transform_tuple_from_pixel_geometry # noqa: F401 from ._crs import _resolve_crs_to_wkt, _wkt_to_epsg # noqa: F401 -from ._errors import (ConflictingCRSError, ConflictingNodataError, DuplicateIFDTagError, - GeoTIFFAmbiguousMetadataError, InconsistentGeoKeysError, InvalidCRSCodeError, - InvalidIntegerNodataError, MalformedScaleOffsetError, MixedBandMetadataError, - NonRepresentableEPSGCRSError, NonUniformCoordsError, +from ._errors import (ConflictingCRSError, ConflictingNodataError, DegeneratePixelSizeError, + DuplicateIFDTagError, GeoTIFFAmbiguousMetadataError, InconsistentGeoKeysError, + InvalidCRSCodeError, InvalidIntegerNodataError, MalformedScaleOffsetError, + MixedBandMetadataError, NonRepresentableEPSGCRSError, NonUniformCoordsError, RemoteStableSourcesOnlyError, RotatedTransformError, UnknownCRSModelTypeError, UnparseableCRSError, UnsupportedGeoTIFFFeatureError, VRTStableSourcesOnlyError, VRTUnsupportedError) @@ -110,6 +110,7 @@ 'CloudSizeLimitError', 'ConflictingCRSError', 'ConflictingNodataError', + 'DegeneratePixelSizeError', 'DuplicateIFDTagError', 'GeoTIFFAmbiguousMetadataError', 'GeoTIFFFallbackWarning', diff --git a/xrspatial/geotiff/_errors.py b/xrspatial/geotiff/_errors.py index fe067a07c..a2e75d60c 100644 --- a/xrspatial/geotiff/_errors.py +++ b/xrspatial/geotiff/_errors.py @@ -18,6 +18,7 @@ ├── InvalidCRSCodeError ├── UnparseableCRSError ├── RotatedTransformError + ├── DegeneratePixelSizeError ├── NonUniformCoordsError ├── MixedBandMetadataError ├── ConflictingCRSError @@ -66,6 +67,29 @@ class RotatedTransformError(GeoTIFFAmbiguousMetadataError): """ +class DegeneratePixelSizeError(GeoTIFFAmbiguousMetadataError): + """A georeferenced transform declares a zero or non-finite pixel size. + + Raised on read when the axis-aligned ``ModelPixelScale`` (or the + ``ModelTransformation`` diagonal) carries a zero, NaN, or +/-Inf + ``pixel_width`` / ``pixel_height``. The reader builds coordinate + arrays as ``arange(N) * pixel_width + origin``, so a zero pixel size + collapses the whole axis onto the origin (a constant, non- + georeferenced coordinate array) and a non-finite pixel size produces + an all-NaN / all-Inf axis. Either way a downstream spatial op would + silently run on coordinates that do not describe the data. + + The VRT read path already rejects a zero ``res_x`` / ``res_y`` with + :class:`VRTUnsupportedError`, and the writer rejects a zero-step + coordinate axis with :class:`NonUniformCoordsError`; this extends the + same fail-closed contract to the direct-TIFF read path. + + Subclasses :class:`GeoTIFFAmbiguousMetadataError` (and therefore + ``ValueError``) so existing ``except ValueError`` callers keep + catching the case. + """ + + class NonUniformCoordsError(GeoTIFFAmbiguousMetadataError): """DataArray coords disagree with the implied transform on write. @@ -340,6 +364,7 @@ class UnsupportedGeoTIFFFeatureError(ValueError): __all__ = [ "ConflictingCRSError", "ConflictingNodataError", + "DegeneratePixelSizeError", "DuplicateIFDTagError", "GeoTIFFAmbiguousMetadataError", "InconsistentGeoKeysError", diff --git a/xrspatial/geotiff/_geotags.py b/xrspatial/geotiff/_geotags.py index 0a6189571..d7ef73630 100644 --- a/xrspatial/geotiff/_geotags.py +++ b/xrspatial/geotiff/_geotags.py @@ -1,10 +1,12 @@ """GeoTIFF tag interpretation: CRS, affine transform, GeoKeys.""" from __future__ import annotations +import math from dataclasses import dataclass, field from ._dtypes import resolve_bits_per_sample, tiff_dtype_to_numpy -from ._errors import NonRepresentableEPSGCRSError, RotatedTransformError, UnknownCRSModelTypeError +from ._errors import (DegeneratePixelSizeError, NonRepresentableEPSGCRSError, RotatedTransformError, + UnknownCRSModelTypeError) from ._header import (IFD, TAG_BITS_PER_SAMPLE, TAG_COMPRESSION, TAG_EXTRA_SAMPLES, TAG_GDAL_METADATA, TAG_GDAL_NODATA, TAG_GEO_ASCII_PARAMS, TAG_GEO_DOUBLE_PARAMS, TAG_GEO_KEY_DIRECTORY, TAG_IMAGE_LENGTH, @@ -631,6 +633,36 @@ def _validate_tiepoint_consistency(tiepoint: tuple, raise NotImplementedError(f"{primary}\n{cause}\n{hint}") +def _check_finite_nonzero_pixel_size(pixel_width: float, + pixel_height: float, + source: str) -> None: + """Reject a zero or non-finite axis-aligned pixel size. + + The reader builds coordinate arrays as ``arange(N) * pixel_width + + origin``, so a zero ``pixel_width`` / ``pixel_height`` collapses the + whole axis onto the origin and a NaN / +/-Inf one fills the axis with + NaN / Inf -- a degenerate raster the rest of the pipeline would + silently consume. The VRT read path already raises for this in + ``_vrt_validation`` and the writer raises ``NonUniformCoordsError`` + for a zero-step axis; this brings the direct-TIFF read path in line. + + ``source`` names the tag the pixel size came from so the message + points at the offending bytes. + """ + for axis, value in (("pixel_width", pixel_width), + ("pixel_height", pixel_height)): + if value == 0.0 or not math.isfinite(value): + raise DegeneratePixelSizeError( + f"{source} yields a degenerate {axis}={value!r}. The " + f"reader builds pixel-to-world coordinates as " + f"arange(N) * {axis} + origin, so a zero size collapses " + f"the axis onto the origin and a non-finite size fills " + f"it with NaN / Inf. Re-export the file with a non-zero, " + f"finite ModelPixelScale (or ModelTransformation " + f"diagonal)." + ) + + def _extract_transform(ifd: IFD, allow_rotated: bool = False ) -> tuple[GeoTransform, bool]: @@ -723,6 +755,8 @@ def _extract_transform(ifd: IFD, return GeoTransform( rotated_affine=(m[0], m[1], m[3], m[4], m[5], m[7]), ), False + _check_finite_nonzero_pixel_size( + m[0], m[5], "ModelTransformationTag (34264) diagonal") return GeoTransform( origin_x=m[3], origin_y=m[7], @@ -741,6 +775,11 @@ def _extract_transform(ifd: IFD, sx = scale[0] if len(scale) > 0 else 1.0 sy = scale[1] if len(scale) > 1 else 1.0 + # ``pixel_height`` is stored as ``-sy``; a zero / non-finite ``sy`` + # is degenerate either way, so check the magnitudes here, before + # both the tiepoint+scale and the scale-only returns below. + _check_finite_nonzero_pixel_size(sx, sy, "ModelPixelScaleTag (33550)") + if tiepoint is not None: if not isinstance(tiepoint, tuple): tiepoint = (tiepoint,) diff --git a/xrspatial/geotiff/tests/release_gates/test_features.py b/xrspatial/geotiff/tests/release_gates/test_features.py index 1e471e5e6..00d9f98d6 100644 --- a/xrspatial/geotiff/tests/release_gates/test_features.py +++ b/xrspatial/geotiff/tests/release_gates/test_features.py @@ -2819,6 +2819,10 @@ def test_all_lists_supported_functions(self): # importing from the private ``_errors`` module. 'ConflictingCRSError', 'ConflictingNodataError', + # Read-side fail-closed on a zero or non-finite ModelPixelScale + # / ModelTransformation diagonal (issue #3331), replacing the + # legacy silent build of a constant or all-NaN coordinate axis. + 'DegeneratePixelSizeError', # Issue #2483: read-side fail-closed on TIFF directories that # repeat a tag, replacing the legacy silent last-wins parse. 'DuplicateIFDTagError', diff --git a/xrspatial/geotiff/tests/unit/test_degenerate_pixel_size_3331.py b/xrspatial/geotiff/tests/unit/test_degenerate_pixel_size_3331.py new file mode 100644 index 000000000..5bef9dbde --- /dev/null +++ b/xrspatial/geotiff/tests/unit/test_degenerate_pixel_size_3331.py @@ -0,0 +1,186 @@ +"""Reject a zero or non-finite ModelPixelScale on read (issue #3331). + +The direct-TIFF read path built coordinate arrays as +``arange(N) * pixel_width + origin`` straight from the +``ModelPixelScale`` / ``ModelTransformation`` diagonal, with no +finite-nonzero check. A zero pixel size collapsed the whole axis onto +the origin; a NaN / Inf one filled it with NaN / Inf. Either way the +read returned a degenerate raster with no error. + +The VRT read path already rejected a zero ``res_x`` / ``res_y`` +(``VRTUnsupportedError``) and the writer rejected a zero-step coord axis +(``NonUniformCoordsError``). This test pins the matching rejection on the +direct-TIFF read path: ``DegeneratePixelSizeError``. +""" +from __future__ import annotations + +import struct + +import numpy as np +import pytest + +from xrspatial.geotiff._errors import DegeneratePixelSizeError, GeoTIFFAmbiguousMetadataError +from xrspatial.geotiff._geotags import _extract_transform +from xrspatial.geotiff._header import parse_all_ifds, parse_header + +_BO = '<' + + +def _assemble_tiff(extra_geo_tags: list[tuple]) -> bytes: + """Build a 2x2 single-strip TIFF with the given geo tags. + + ``extra_geo_tags`` is a list of ``(tag, doubles_tuple)`` entries + appended as TIFF DOUBLE (type 12) arrays. + """ + width, height = 2, 2 + pixels = np.zeros((height, width), dtype=np.uint8) + + tag_list = [] + + def add_short(tag, val): + tag_list.append((tag, 3, 1, struct.pack(f'{_BO}H', val))) + + def add_long(tag, val): + tag_list.append((tag, 4, 1, struct.pack(f'{_BO}I', val))) + + def add_doubles(tag, vals): + tag_list.append( + (tag, 12, len(vals), struct.pack(f'{_BO}{len(vals)}d', *vals))) + + add_short(256, width) # ImageWidth + add_short(257, height) # ImageLength + add_short(258, 8) # BitsPerSample + add_short(259, 1) # Compression: none + add_short(262, 1) # PhotometricInterpretation + add_short(277, 1) # SamplesPerPixel + add_short(278, height) # RowsPerStrip + add_long(273, 0) # StripOffsets (placeholder) + add_long(279, len(pixels.tobytes())) # StripByteCounts + add_short(339, 1) # SampleFormat + for tag, vals in extra_geo_tags: + add_doubles(tag, list(vals)) + + tag_list.sort(key=lambda t: t[0]) + + num_entries = len(tag_list) + ifd_start = 8 + ifd_size = 2 + 12 * num_entries + 4 + + overflow = bytearray() + overflow_offsets = {} + for tag, _typ, _count, raw in tag_list: + if len(raw) > 4: + overflow_offsets[tag] = ifd_start + ifd_size + len(overflow) + overflow.extend(raw) + if len(overflow) % 2: + overflow.append(0) + + pixel_start = ifd_start + ifd_size + len(overflow) + + patched = [] + for tag, typ, count, raw in tag_list: + if tag == 273: + patched.append((tag, typ, count, struct.pack(f'{_BO}I', pixel_start))) + else: + patched.append((tag, typ, count, raw)) + tag_list = patched + + out = bytearray() + out.extend(b'II') + out.extend(struct.pack(f'{_BO}H', 42)) + out.extend(struct.pack(f'{_BO}I', ifd_start)) + out.extend(struct.pack(f'{_BO}H', num_entries)) + for tag, typ, count, raw in tag_list: + out.extend(struct.pack(f'{_BO}HHI', tag, typ, count)) + if len(raw) <= 4: + out.extend(raw.ljust(4, b'\x00')) + else: + out.extend(struct.pack(f'{_BO}I', overflow_offsets[tag])) + out.extend(struct.pack(f'{_BO}I', 0)) + out.extend(overflow) + out.extend(pixels.tobytes()) + return bytes(out) + + +def _tiff_with_pixel_scale(sx: float, sy: float, with_tiepoint: bool) -> bytes: + """TIFF with ModelPixelScale (33550), optionally + ModelTiepoint (33922).""" + tags = [(33550, (sx, sy, 0.0))] + if with_tiepoint: + tags.append((33922, (0.0, 0.0, 0.0, 500000.0, 4500000.0, 0.0))) + return _assemble_tiff(tags) + + +def _tiff_with_transformation(sx: float, sy: float) -> bytes: + """TIFF with an axis-aligned ModelTransformation (34264) diagonal.""" + matrix = ( + sx, 0.0, 0.0, 500000.0, + 0.0, sy, 0.0, 4500000.0, + 0.0, 0.0, 1.0, 0.0, + 0.0, 0.0, 0.0, 1.0, + ) + return _assemble_tiff([(34264, matrix)]) + + +def _extract(data: bytes): + header = parse_header(data) + ifds = parse_all_ifds(data, header) + return _extract_transform(ifds[0]) + + +_BAD = [ + pytest.param(0.0, 30.0, id="zero_width"), + pytest.param(30.0, 0.0, id="zero_height"), + pytest.param(float('nan'), 30.0, id="nan_width"), + pytest.param(30.0, float('nan'), id="nan_height"), + pytest.param(float('inf'), 30.0, id="inf_width"), + pytest.param(30.0, float('-inf'), id="neg_inf_height"), +] + + +@pytest.mark.parametrize("sx,sy", _BAD) +def test_pixel_scale_only_rejected(sx, sy): + with pytest.raises(DegeneratePixelSizeError) as exc: + _extract(_tiff_with_pixel_scale(sx, sy, with_tiepoint=False)) + assert 'ModelPixelScaleTag' in str(exc.value) + + +@pytest.mark.parametrize("sx,sy", _BAD) +def test_pixel_scale_with_tiepoint_rejected(sx, sy): + with pytest.raises(DegeneratePixelSizeError) as exc: + _extract(_tiff_with_pixel_scale(sx, sy, with_tiepoint=True)) + assert 'ModelPixelScaleTag' in str(exc.value) + + +@pytest.mark.parametrize("sx,sy", _BAD) +def test_transformation_diagonal_rejected(sx, sy): + with pytest.raises(DegeneratePixelSizeError) as exc: + _extract(_tiff_with_transformation(sx, sy)) + assert 'ModelTransformationTag' in str(exc.value) + + +def test_degenerate_pixel_size_is_ambiguous_metadata_subclass(): + # Existing ``except ValueError`` / ``except + # GeoTIFFAmbiguousMetadataError`` callers must keep catching this. + assert issubclass(DegeneratePixelSizeError, GeoTIFFAmbiguousMetadataError) + assert issubclass(DegeneratePixelSizeError, ValueError) + + +@pytest.mark.parametrize("builder", [ + lambda: _tiff_with_pixel_scale(30.0, 30.0, with_tiepoint=False), + lambda: _tiff_with_pixel_scale(30.0, 30.0, with_tiepoint=True), + lambda: _tiff_with_transformation(30.0, -30.0), +]) +def test_finite_nonzero_pixel_size_still_accepted(builder): + transform, has_georef = _extract(builder()) + assert has_georef is True + assert transform.pixel_width == pytest.approx(30.0) + assert abs(transform.pixel_height) == pytest.approx(30.0) + + +def test_open_geotiff_rejects_zero_pixel_scale(tmp_path): + # End-to-end through the public read entry point. + path = tmp_path / 'degenerate_scale_3331.tif' + path.write_bytes(_tiff_with_pixel_scale(0.0, 30.0, with_tiepoint=True)) + from xrspatial.geotiff import open_geotiff + with pytest.raises(DegeneratePixelSizeError): + open_geotiff(str(path))