
Introduce StorageChainBlockImport #12242

Draft
skunert wants to merge 18 commits into master from warp-with-RAPI

@skunert (Contributor) commented May 31, 2026

Storage chains rely on Renew extrinsics to keep indexed transaction payloads alive past their original retention window. A node that tip-syncs past a Renew without ever having executed the original store ends up with an empty TRANSACTION column locally, and apply_index_ops fails. This PR adds StorageChainBlockImport, a block-import wrapper that intercepts this case during tip-sync, fetches the missing bytes over bitswap, and hands them to the backend through the PrefetchedIndexedTransactions carrier introduced in #12086.

Gap-sync (the same problem after warp-sync) is wired up but stays disabled in production, because the sync layer doesn't yet fetch bodies inside the pruning window. The dispatch and its tests are already in place behind the existing origin filter, so flipping the gate is a one-line change once the sync-layer work lands.
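
To illustrate the origin filter mentioned above, here is a minimal sketch of how such a gate might look. The `Origin` enum, the `Gate` struct and its `intercept_gap_sync` field are local stand-ins invented for this example; only the name `should_intercept` and the pass-through cases (warp sync, gap sync, missing body) come from this PR.

```rust
/// Local stand-in for the node's block-origin type; only the variants
/// relevant to this sketch are listed (assumption, not the real enum).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Origin {
    NetworkInitialSync,
    NetworkBroadcast,
    WarpSync,
    GapSync,
}

/// Hypothetical gate state: `intercept_gap_sync` is `false` in production
/// and would only be flipped by a test-only override.
pub struct Gate {
    pub intercept_gap_sync: bool,
}

impl Gate {
    /// Decide whether the wrapper should look at a block at all.
    pub fn should_intercept(&self, origin: Origin, has_body: bool) -> bool {
        // Without a body there is nothing to classify or fetch for.
        if !has_body {
            return false;
        }
        match origin {
            // Warp-synced blocks always pass straight through.
            Origin::WarpSync => false,
            // Gap sync is wired up but gated off by default.
            Origin::GapSync => self.intercept_gap_sync,
            // Tip-sync origins are intercepted.
            Origin::NetworkInitialSync | Origin::NetworkBroadcast => true,
        }
    }
}
```

With this shape, enabling gap-sync interception amounts to changing the single `Origin::GapSync` arm (or the flag behind it), which is the kind of one-line change the description refers to.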

skunert added the T0-node (This PR/Issue is related to the topic "node") and A5-run-CI (Run CI on draft PR) labels May 31, 2026
skunert and others added 15 commits June 1, 2026 09:16
Introduces `cumulus-client-storage-chain-sync`, the parachain-side wrapper
that fetches indexed-transaction bytes via bitswap and feeds them into the
backend through the dedicated `BlockImportParams::prefetched_indexed_transactions`
field added in the parent low-level PR.

Wrapper:
- `StorageChainBlockImport` wraps an inner `BlockImport` and intercepts blocks
  carrying `IndexOperation::Renew` ops.
- Case A (`StateAction::ApplyChanges`): query `TransactionStorageApi` against
  the parent state plus the incoming `StorageChanges` overlay so renew metadata
  reflects state visible to the just-executed block, then fetch missing payloads
  over bitswap and attach them to `params.prefetched_indexed_transactions`.
- Case B (`StateAction::Execute*`): execute the block once via runtime API on a
  rollback-only transaction; capture the produced `StorageChanges`; reuse the
  same `ApiRef` for the metadata query so no second execution is required;
  forward the captured changes downstream as `ApplyChanges` so the backend
  imports a single set of state changes per block.
- Bitswap fetcher (`fetcher.rs`) rotates across peers from a `BitswapPeerSource`
  handle, hashes each returned blob against the algorithm declared by the
  runtime metadata, and rejects unverifiable blobs before they reach the
  backend.
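
The fetcher's rotate-and-verify behaviour described in the last bullet can be sketched roughly as follows. This is not the code in `fetcher.rs`: `fetch_verified`, the `request` closure (standing in for one bitswap round-trip) and the `hash_fn` closure (standing in for the runtime-declared hashing algorithm) are hypothetical names chosen for the example.

```rust
/// 32-byte content hash; a local stand-in for the real `ContentHash` type.
pub type ContentHash = [u8; 32];

/// Try each peer in turn and return the first blob whose hash matches
/// `expected`. Returns `None` if no peer delivers a verifiable blob.
///
/// `request` and `hash_fn` are hypothetical hooks (assumptions of this
/// sketch): one bitswap request to a single peer, and the hashing
/// algorithm the runtime metadata declares for this payload.
pub fn fetch_verified<P>(
    peers: &[P],
    expected: &ContentHash,
    mut request: impl FnMut(&P) -> Option<Vec<u8>>,
    hash_fn: impl Fn(&[u8]) -> ContentHash,
) -> Option<Vec<u8>> {
    for peer in peers {
        // A peer may time out or simply not hold the payload.
        let Some(blob) = request(peer) else { continue };
        // Only bytes that hash to the expected value may reach the backend.
        if &hash_fn(&blob) == expected {
            return Some(blob);
        }
        // Mismatch: drop the blob and rotate to the next peer.
    }
    None
}
```

The point of verifying inside the loop is that a single misbehaving peer costs one extra round-trip rather than a failed import.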

Integration tests (`tests/it.rs`) cover:
- pass-through paths (warp sync, gap sync, body-none),
- Case A no-renews, partial-fetcher error, already-present-hash skip,
- Case A attaches prefetched bytes for each supported `HashingAlgorithm`
  (Blake2b256/Sha2_256/Keccak256) via rstest cases,
- Case B executes exactly once and indexes on the same overlay, with a
  rollback marker proving the indexed-transactions query is rolled back.

Omni-node wiring:
- `polkadot-omni-node-lib` builds the wrapper inside `spec.rs` and threads the
  bitswap network + syncing handles through `aura.rs` so omni-node operators
  can opt into storage-chain sync without bespoke node code.

Gates the bitswap protobuf schema module and the CID Prefix builder behind a
new `test-helpers` feature so downstream test crates (cumulus-client-storage-chain-sync
integration tests) can hand-roll bitswap responses without depending on
production-internal types. Default builds are unchanged.

Extracted from earlier branch commit a862cda to keep upstream PR #12086
clean of cumulus-test-only changes.

Adds:
- `#[derive(Hash)]` on `HashingAlgorithm` so consumers can key it in `HashSet`s.
- `HashingAlgorithm::multihash_code(self) -> u64` returning the IANA-assigned
  IPFS multihash code for CID construction.
- `HashingAlgorithm::hash(self, &[u8]) -> ContentHash` dispatching to the
  matching `sp_crypto_hashing` primitive.
- Two unit tests verifying multihash codes and hash dispatch.
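
As a rough illustration of the multihash mapping, the sketch below re-declares a local `HashingAlgorithm` enum (not the real type from this PR) and pairs it with the multicodec-table codes for the three algorithms. The `unsigned_varint` and `distinct_algorithms` helpers are hypothetical additions for the example; the varint step reflects how a multihash code is serialized inside a CID.

```rust
use std::collections::HashSet;

/// Local stand-in mirroring the three algorithms named in this PR.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum HashingAlgorithm {
    Blake2b256,
    Sha2_256,
    Keccak256,
}

impl HashingAlgorithm {
    /// Multihash code from the multicodec table.
    pub fn multihash_code(self) -> u64 {
        match self {
            Self::Blake2b256 => 0xb220,
            Self::Sha2_256 => 0x12,
            Self::Keccak256 => 0x1b,
        }
    }
}

/// Hypothetical helper: encode a code as an unsigned varint (7 bits per
/// byte, least-significant group first, high bit set on all but the last).
pub fn unsigned_varint(mut value: u64) -> Vec<u8> {
    let mut out = Vec::new();
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return out;
        }
        out.push(byte | 0x80);
    }
}

/// Hypothetical helper showing why `#[derive(Hash)]` is useful:
/// deduplicate the algorithms a batch of payloads needs.
pub fn distinct_algorithms(algos: &[HashingAlgorithm]) -> HashSet<HashingAlgorithm> {
    algos.iter().copied().collect()
}
```

For example, `unsigned_varint(HashingAlgorithm::Blake2b256.multihash_code())` yields the bytes `a0 e4 02`, while the two single-byte codes encode as themselves.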

Required by the downstream cumulus-client-storage-chain-sync block-import
wrapper to build per-algorithm CIDs and verify fetched bytes. Additive,
no behavior change for existing consumers.

Extracted from earlier branch commit a862cda to keep upstream PR #12086
clean of cumulus-consumer-only additions.

…carrier shape

The substrate-side `PrefetchedIndexedTransactions` carrier in PR #12086 uses
`renew_payloads: HashMap<H256, Vec<u8>>` (not the `Vec<([u8;32], Vec<u8>)>`
shape carried during early branch iteration). Adapts:

- `attach_prefetched` builds the struct directly with empty `ops` and a
  `HashMap` of renew payloads keyed on `H256`.
- Three integration-test assertion sites updated to inspect the struct
  fields (`ops` + `renew_payloads`) instead of treating it as a `Vec`.
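
A minimal sketch of the shape adaptation, assuming simplified local types: `H256` and `IndexOp` below are placeholders defined only for this example, and the struct mirrors just the two field names (`ops`, `renew_payloads`) given above, not the real definition in PR #12086.

```rust
use std::collections::HashMap;

/// Placeholder for the real 256-bit hash type.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct H256(pub [u8; 32]);

/// Placeholder for the index-operation type carried in `ops`.
#[derive(Debug)]
pub struct IndexOp;

/// Simplified mirror of the carrier: only the field names come from the PR.
#[derive(Debug, Default)]
pub struct PrefetchedIndexedTransactions {
    pub ops: Vec<IndexOp>,
    pub renew_payloads: HashMap<H256, Vec<u8>>,
}

/// Build the carrier from the older `Vec<([u8; 32], Vec<u8>)>` shape:
/// `ops` stays empty and the payloads are keyed on `H256`.
pub fn attach_prefetched(fetched: Vec<([u8; 32], Vec<u8>)>) -> PrefetchedIndexedTransactions {
    PrefetchedIndexedTransactions {
        ops: Vec::new(),
        renew_payloads: fetched.into_iter().map(|(h, bytes)| (H256(h), bytes)).collect(),
    }
}
```

One consequence of the map shape is that duplicate hashes collapse to a single entry, so assertions that previously counted `Vec` elements need to inspect the map instead.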

Compiles; 9 lib and 10 integration tests are green against PR-head substrate.

Adds the zombienet-sdk storage-chain test module that exercises the full
storage-chain sync path against a pre-generated 100-block chain snapshot.

- `parachain_generate_db`: utility binary that submits 30 `store` extrinsics
  to a freshly-spawned parachain via the `Bob` authorization budget and cuts
  a `tip-sync-100.tgz` snapshot once both nodes finalize the target height.
- `parachain_tip_sync_with_renewals`: the main test; loads the snapshot,
  spins up a fresh full node, replays each store as a `renew_content_hash`
  extrinsic, and asserts the new node receives every renew payload over
  bitswap and matches storage roots with the producer.
- `fixture.rs`, `common.rs`: deterministic payload/algorithm/content_hash
  helpers and zombienet network setup shared by both binaries.
- `generate-snapshots.sh`: regenerates the 75MB `tip-sync-100.tgz` blob; the
  blob itself is downloaded from GCS at test time (not committed to git).
- `fixtures/*.json`: chain specs and snapshot metadata recorded alongside
  the snapshot.
- `full_node_warp_sync.rs`: small refactor to share the bitswap setup hook
  with the new storage-chain test.

`cumulus-zombienet-sdk-tests` gains the test as a binary target via the
`Cargo.toml` workspace entry.

Adds the gap-sync dispatch path to `StorageChainBlockImport`, gated in
production behind the existing `should_intercept` filter for `BlockOrigin::GapSync`.
Reachable only via the test-helpers `intercept_gap_sync_for_test` override while
the sync layer's body-fetch-inside-pruning-window work is still pending.

When enabled, the wrapper for a gap-synced block:
1. Queries `TransactionStorageApi::indexed_transactions(block_number)` at the
   local `finalized_hash` (always >= the gap block by construction; that state
   has `Transactions::<T>::get(block_number)` populated by the block's own
   `on_finalize`, within retention).
2. Tail-hashes each entry against the body to split it into synthetic
   `IndexOperation::Insert` ops (store data already in body) and synthetic
   `IndexOperation::Renew` ops (data missing, must be fetched).
3. Bitswap-fetches the missing renew payloads.
4. Attaches both halves to `BlockImportParams::prefetched_indexed_transactions`
   via the widened `PrefetchedIndexedTransactions { ops, renew_payloads }`
   carrier introduced in the parent substrate PR.
5. Forwards to the inner import with `state_action = Skip` unchanged.
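
Step 2 can be sketched as below. This is an illustration under stated assumptions, not `body_classify_to_ops` itself: `IndexedEntry`, `SyntheticOp`, `classify_entries` and the `hash_fn` closure are hypothetical, and treating an entry whose declared size exceeds the extrinsic as a renew is a guess made for this example (the real function has dedicated skip cases that are not modelled here).

```rust
/// 32-byte content hash; a local stand-in for the real type.
pub type ContentHash = [u8; 32];

/// Hypothetical view of one runtime-reported indexed transaction.
pub struct IndexedEntry {
    pub extrinsic: u32,
    pub content_hash: ContentHash,
    pub size: u32,
}

/// Hypothetical synthetic ops mirroring the Insert/Renew split.
#[derive(Debug, PartialEq, Eq)]
pub enum SyntheticOp {
    Insert { extrinsic: u32, hash: ContentHash, size: u32 },
    Renew { extrinsic: u32, hash: ContentHash },
}

/// Split entries into synthetic ops plus the set of hashes to fetch.
/// An entry is an Insert when the last `size` bytes of its extrinsic
/// hash to `content_hash`; otherwise it is treated as a Renew.
pub fn classify_entries(
    body: &[Vec<u8>],
    entries: &[IndexedEntry],
    hash_fn: impl Fn(&[u8]) -> ContentHash,
) -> (Vec<SyntheticOp>, Vec<ContentHash>) {
    let mut ops = Vec::new();
    let mut to_fetch = Vec::new();
    for entry in entries {
        // Out-of-range extrinsic index: skip the entry entirely.
        let Some(extrinsic) = body.get(entry.extrinsic as usize) else { continue };
        let size = entry.size as usize;
        // Tail-hash check: is the payload already present in the body?
        let in_body = size <= extrinsic.len()
            && hash_fn(&extrinsic[extrinsic.len() - size..]) == entry.content_hash;
        if in_body {
            ops.push(SyntheticOp::Insert {
                extrinsic: entry.extrinsic,
                hash: entry.content_hash,
                size: entry.size,
            });
        } else {
            // Assumption of this sketch: anything not found in the body
            // is a renew whose bytes must be fetched over bitswap.
            ops.push(SyntheticOp::Renew { extrinsic: entry.extrinsic, hash: entry.content_hash });
            to_fetch.push(entry.content_hash);
        }
    }
    (ops, to_fetch)
}
```

Returning both halves from one pass matches the refactor described below, where the classification yields synthetic ops and the renew-fetch set together.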

The backend then routes via the synthetic-ops fallback (runtime `index_ops`
empty for gap-synced blocks) and writes both store tails and renew payloads
into the TRANSACTION column exactly as if the runtime had executed.

Refactors `body_classify_renews` into `body_classify_to_ops` returning both
halves of the classification (synthetic ops + renew-fetch set);
`body_classify_renews` becomes a `#[cfg(test)]` delegate-regression target.

Tests:
- 9 new unit tests for `body_classify_to_ops` (W12): pure stores / pure renews
  / mixed / per-hashing / oversized / `u32::MAX` skip / non-RAW skip /
  out-of-range skip / delegate regression.
- 9 new integration tests for the gap-sync dispatch (W13, behind
  `test-helpers`): pure renews / pure stores no fetch / mixed split /
  state_action stays Skip / below-retention pass-through / finalized-hash
  state-context assertion / already-present filter / fetcher partial failure
  propagation / production-gate regression guard
  (`import_gap_sync_disabled_by_default_passes_through`).

Test infrastructure:
- `cumulus-client-storage-chain-sync` gains a `test-helpers` feature that
  exposes `StorageChainBlockImport::intercept_gap_sync_for_test`.
- Mock client extensions: `set_finalized_hash`, `last_indexed_transactions_state`,
  `MockNetworkRequest::call_count` (all feature-gated).
- Default-features build remains clean; W13 tests skip when `test-helpers`
  is not enabled.

Also updates the existing `attach_prefetched` and tip-block test assertions
to use the widened struct (`renew_payloads` field) instead of the bare
`Vec<(_,_)>` shape. No production behaviour change on the tip-block path.

… onto one line

The two map/collect chains that adapt `Vec<(ContentHash, Vec<u8>)>` to the
PR #12086 `HashMap<H256, Vec<u8>>` carrier shape fit comfortably in 100 cols
on a single line. Matches nightly-fmt conventions used elsewhere in the file
(stable-fmt would also accept the multi-line form, but it disagrees with
nightly on import grouping in the same file anyway).

No behavior change.

…ayed_changes

Simplify indexed_transactions queries by using the typed runtime API
instead of manual call_api_at boilerplate. This leverages PR #12084's
set_overlayed_changes primitive.

Key changes:
- indexed_transactions_with_storage_changes: 60 lines → 10 lines
- indexed_transactions_at_finalized: 50 lines → 6 lines, removed finalized_hash parameter
- Removed: ProofRecorder, ProofSizeExt, CallApiAtParams, RefCell wrappers,
  manual decode, has_api_with checks (already gated by should_intercept)
- Removed: INDEXED_TRANSACTIONS_API constant, call_api_at_count/overlay_marker_seen
  test helpers (replaced with overlayed_changes inspection)

Also updated mock_impl_runtime_apis! to support set_overlayed_changes
and fixed all affected mock structs in other crates.