Introduce StorageChainBlockImport #12242
Draft
skunert wants to merge 18 commits into
Introduces `cumulus-client-storage-chain-sync`, the parachain-side wrapper that fetches indexed-transaction bytes via bitswap and feeds them into the backend through the dedicated `BlockImportParams::prefetched_indexed_transactions` field added in the parent low-level PR.
Wrapper:
- `StorageChainBlockImport` wraps an inner `BlockImport` and intercepts blocks carrying `IndexOperation::Renew` ops.
- Case A (`StateAction::ApplyChanges`): query `TransactionStorageApi` against the parent state plus the incoming `StorageChanges` overlay so renew metadata reflects state visible to the just-executed block, then fetch missing payloads over bitswap and attach them to `params.prefetched_indexed_transactions`.
- Case B (`StateAction::Execute*`): execute the block once via runtime API on a rollback-only transaction; capture the produced `StorageChanges`; reuse the same `ApiRef` for the metadata query so no second execution is required; forward the captured changes downstream as `ApplyChanges` so the backend imports a single set of state changes per block.
- Bitswap fetcher (`fetcher.rs`) rotates across peers from a `BitswapPeerSource` handle, hashes each returned blob against the algorithm declared by the runtime metadata, and rejects unverifiable blobs before they reach the backend.
Integration tests (`tests/it.rs`) cover:
- pass-through paths (warp sync, gap sync, body-none),
- Case A no-renews, partial-fetcher error, already-present-hash skip,
- Case A attaches prefetched bytes for each supported `HashingAlgorithm` (Blake2b256/Sha2_256/Keccak256) via rstest cases,
- Case B executes exactly once and indexes on the same overlay, with a rollback marker proving the indexed-transactions query is rolled back.
Omni-node wiring:
- `polkadot-omni-node-lib` builds the wrapper inside `spec.rs` and threads the bitswap network + syncing handles through `aura.rs` so omni-node operators can opt into storage-chain sync without bespoke node code.
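The fetcher's rotate-and-verify behaviour can be illustrated with a minimal sketch. Everything here is a simplified stand-in rather than the code in `fetcher.rs`: `PeerId`, `fetch_verified`, the `request` callback and `toy_hash` are hypothetical names, and `toy_hash` is a non-cryptographic placeholder for the runtime-declared hashing algorithm.

```rust
type ContentHash = [u8; 32];

/// Hypothetical peer identifier; the real fetcher obtains peers from a
/// `BitswapPeerSource` handle.
type PeerId = u32;

/// Ask each peer in turn, starting at index `start`, and return the first blob
/// whose hash equals `want`. Blobs that fail verification are dropped and the
/// next peer is tried, so unverified bytes are never returned to the caller.
fn fetch_verified(
    peers: &[PeerId],
    start: usize,
    want: &ContentHash,
    mut request: impl FnMut(PeerId, &ContentHash) -> Option<Vec<u8>>,
    hash: impl Fn(&[u8]) -> ContentHash,
) -> Option<Vec<u8>> {
    for offset in 0..peers.len() {
        let peer = peers[(start + offset) % peers.len()];
        match request(peer, want) {
            Some(blob) if hash(&blob) == *want => return Some(blob),
            // No answer, or an answer that does not hash to `want`.
            _ => continue,
        }
    }
    None
}

/// Toy, NON-cryptographic hash used only to keep the sketch dependency-free.
fn toy_hash(data: &[u8]) -> ContentHash {
    let mut out = [0u8; 32];
    for (i, b) in data.iter().enumerate() {
        out[i % 32] ^= b.wrapping_add(i as u8);
    }
    out
}
```

Passing the hasher as a parameter keeps the verification logic independent of which `HashingAlgorithm` the runtime metadata declares for a given entry.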
Gates the bitswap protobuf schema module and the CID Prefix builder behind a new `test-helpers` feature so downstream test crates (cumulus-client-storage-chain-sync integration tests) can hand-roll bitswap responses without depending on production-internal types. Default builds are unchanged. Extracted from earlier branch commit a862cda to keep upstream PR #12086 clean of cumulus-test-only changes.
Adds:
- `#[derive(Hash)]` on `HashingAlgorithm` so consumers can key it in `HashSet`s.
- `HashingAlgorithm::multihash_code(self) -> u64` returning the IANA-assigned IPFS multihash code for CID construction.
- `HashingAlgorithm::hash(self, &[u8]) -> ContentHash` dispatching to the matching `sp_crypto_hashing` primitive.
- Two unit tests verifying multihash codes and hash dispatch.
Required by the downstream cumulus-client-storage-chain-sync block-import wrapper to build per-algorithm CIDs and verify fetched bytes. Additive, no behavior change for existing consumers. Extracted from earlier branch commit a862cda to keep upstream PR #12086 clean of cumulus-consumer-only additions.
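As an illustration of how a multihash code feeds into CID construction, here is a self-contained sketch. The enum mirrors the three algorithms named in this PR, but the code values are taken from the public multicodec table and have not been checked against the PR's implementation. `push_uvarint` and `cid_v1_raw_prefix` are hypothetical helper names, and the CIDv1 / raw-codec / 32-byte-digest layout is an assumption about what the prefix builder emits.

```rust
#[allow(non_camel_case_types)]
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum HashingAlgorithm {
    Blake2b256,
    Sha2_256,
    Keccak256,
}

impl HashingAlgorithm {
    /// Multihash codes as listed in the multicodec table.
    fn multihash_code(self) -> u64 {
        match self {
            HashingAlgorithm::Blake2b256 => 0xb220,
            HashingAlgorithm::Sha2_256 => 0x12,
            HashingAlgorithm::Keccak256 => 0x1b,
        }
    }
}

/// Append `value` as an unsigned LEB128 varint, the encoding CIDs use.
fn push_uvarint(mut value: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (value & 0x7f) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return;
        }
        // More bytes follow: set the continuation bit.
        out.push(byte | 0x80);
    }
}

/// Hypothetical helper: CIDv1 prefix for a raw-codec block with a 32-byte digest.
fn cid_v1_raw_prefix(algo: HashingAlgorithm) -> Vec<u8> {
    const CID_V1: u64 = 1;
    const RAW_CODEC: u64 = 0x55;
    const DIGEST_LEN: u64 = 32;
    let mut out = Vec::new();
    for field in [CID_V1, RAW_CODEC, algo.multihash_code(), DIGEST_LEN] {
        push_uvarint(field, &mut out);
    }
    out
}
```

Appending the 32-byte digest to this prefix would give the full binary CID under the stated assumptions.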
…carrier shape
The substrate-side `PrefetchedIndexedTransactions` carrier in PR #12086 uses `renew_payloads: HashMap<H256, Vec<u8>>` (not the `Vec<([u8;32], Vec<u8>)>` shape carried during early branch iteration).
Adapts:
- `attach_prefetched` builds the struct directly with empty `ops` and a `HashMap` of renew payloads keyed on `H256`.
- Three integration-test assertion sites updated to inspect the struct fields (`ops` + `renew_payloads`) instead of treating it as a `Vec`.
Compiles, and 9 lib + 10 integration tests are green against PR-head substrate.
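The shape change can be sketched with local stand-ins. `H256`, `IndexOperation` and `PrefetchedIndexedTransactions` below are simplified placeholders for the real substrate types, and `build_carrier` is a hypothetical free function rather than the actual `attach_prefetched`.

```rust
use std::collections::HashMap;

/// Stand-in for the real `H256` type.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct H256([u8; 32]);

/// Placeholder for the real `IndexOperation` enum.
#[derive(Debug, PartialEq)]
struct IndexOperation;

/// Simplified stand-in for the carrier struct described above.
#[derive(Debug)]
struct PrefetchedIndexedTransactions {
    ops: Vec<IndexOperation>,
    renew_payloads: HashMap<H256, Vec<u8>>,
}

/// Adapt the early `Vec<([u8; 32], Vec<u8>)>` shape to the map-based carrier:
/// `ops` stays empty and each payload is keyed on its content hash.
fn build_carrier(fetched: Vec<([u8; 32], Vec<u8>)>) -> PrefetchedIndexedTransactions {
    PrefetchedIndexedTransactions {
        ops: Vec::new(),
        renew_payloads: fetched
            .into_iter()
            .map(|(hash, bytes)| (H256(hash), bytes))
            .collect(),
    }
}
```

In this sketch a duplicate hash in the input keeps only the last payload, which is harmless when payloads are content-addressed.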
Adds the zombienet-sdk storage-chain test module that exercises the full storage-chain sync path against a pre-generated 100-block chain snapshot.
- `parachain_generate_db`: utility binary that submits 30 `store` extrinsics to a freshly-spawned parachain via the `Bob` authorization budget and cuts a `tip-sync-100.tgz` snapshot once both nodes finalize the target height.
- `parachain_tip_sync_with_renewals`: the main test; loads the snapshot, spins up a fresh full node, replays each store as a `renew_content_hash` extrinsic, and asserts the new node receives every renew payload over bitswap and matches storage roots with the producer.
- `fixture.rs`, `common.rs`: deterministic payload/algorithm/content_hash helpers and zombienet network setup shared by both binaries.
- `generate-snapshots.sh`: regenerates the 75MB `tip-sync-100.tgz` blob; the blob itself is downloaded from GCS at test time (not committed to git).
- `fixtures/*.json`: chain specs and snapshot metadata recorded alongside the snapshot.
- `full_node_warp_sync.rs`: small refactor to share the bitswap setup hook with the new storage-chain test.
`cumulus-zombienet-sdk-tests` gains the test as a binary target via the `Cargo.toml` workspace entry.
Adds the gap-sync dispatch path to `StorageChainBlockImport`, gated in
production behind the existing `should_intercept` filter for `BlockOrigin::GapSync`.
Reachable only via the test-helpers `intercept_gap_sync_for_test` override while
the sync layer's body-fetch-inside-pruning-window work is still pending.
When enabled, the wrapper for a gap-synced block:
1. Queries `TransactionStorageApi::indexed_transactions(block_number)` at the
local `finalized_hash` (always >= the gap block by construction; that state
has `Transactions::<T>::get(block_number)` populated by the block's own
`on_finalize`, within retention).
2. Tail-hashes each entry against the body to split it into synthetic
`IndexOperation::Insert` ops (store data already in body) and synthetic
`IndexOperation::Renew` ops (data missing, must be fetched).
3. Bitswap-fetches the missing renew payloads.
4. Attaches both halves to `BlockImportParams::prefetched_indexed_transactions`
via the widened `PrefetchedIndexedTransactions { ops, renew_payloads }`
carrier introduced in the parent substrate PR.
5. Forwards to the inner import with `state_action = Skip` unchanged.
The backend then routes via the synthetic-ops fallback (runtime `index_ops`
empty for gap-synced blocks) and writes both store tails and renew payloads
into the TRANSACTION column exactly as if the runtime had executed.
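Step 2 above (tail-hashing entries against the body) can be sketched roughly as follows. This is not the PR's `body_classify_to_ops`: `IndexedEntry`, `SyntheticOp`, `classify_to_ops` and `toy_hash` are hypothetical names, the skip rule for out-of-range extrinsic indices is an assumption, and `toy_hash` is a non-cryptographic placeholder for the per-entry hashing algorithm.

```rust
type ContentHash = [u8; 32];

/// Simplified stand-in for the synthetic index operations.
#[derive(Debug, PartialEq, Eq)]
enum SyntheticOp {
    Insert { extrinsic: u32, hash: ContentHash, size: u32 },
    Renew { extrinsic: u32, hash: ContentHash },
}

/// Hypothetical, reduced view of one `indexed_transactions` entry.
struct IndexedEntry {
    extrinsic: u32,
    content_hash: ContentHash,
    size: u32,
}

/// Split entries into synthetic ops plus the set of hashes that must be fetched.
/// An entry is an `Insert` when the last `size` bytes of its extrinsic hash to
/// `content_hash` (the data is already in the body); otherwise it is a `Renew`.
fn classify_to_ops(
    body: &[Vec<u8>],
    entries: &[IndexedEntry],
    hash: impl Fn(&[u8]) -> ContentHash,
) -> (Vec<SyntheticOp>, Vec<ContentHash>) {
    let mut ops = Vec::new();
    let mut to_fetch = Vec::new();
    for entry in entries {
        // Assumption: entries pointing outside the body are skipped.
        let Some(xt) = body.get(entry.extrinsic as usize) else { continue };
        let size = entry.size as usize;
        let tail_matches =
            size <= xt.len() && hash(&xt[xt.len() - size..]) == entry.content_hash;
        if tail_matches {
            ops.push(SyntheticOp::Insert {
                extrinsic: entry.extrinsic,
                hash: entry.content_hash,
                size: entry.size,
            });
        } else {
            ops.push(SyntheticOp::Renew { extrinsic: entry.extrinsic, hash: entry.content_hash });
            to_fetch.push(entry.content_hash);
        }
    }
    (ops, to_fetch)
}

/// Toy, NON-cryptographic hash used only to keep the sketch dependency-free.
fn toy_hash(data: &[u8]) -> ContentHash {
    let mut out = [0u8; 32];
    for (i, b) in data.iter().enumerate() {
        out[i % 32] ^= b.wrapping_add(i as u8);
    }
    out
}
```

Returning both halves from one pass mirrors the refactor described below, where the classification yields the synthetic ops and the renew-fetch set together.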
Refactors `body_classify_renews` into `body_classify_to_ops` returning both
halves of the classification (synthetic ops + renew-fetch set);
`body_classify_renews` becomes a `#[cfg(test)]` delegate-regression target.
Tests:
- 9 new unit tests for `body_classify_to_ops` (W12): pure stores / pure renews
/ mixed / per-hashing / oversized / `u32::MAX` skip / non-RAW skip /
out-of-range skip / delegate regression.
- 9 new integration tests for the gap-sync dispatch (W13, behind
`test-helpers`): pure renews / pure stores no fetch / mixed split /
state_action stays Skip / below-retention pass-through / finalized-hash
state-context assertion / already-present filter / fetcher partial failure
propagation / production-gate regression guard
(`import_gap_sync_disabled_by_default_passes_through`).
Test infrastructure:
- `cumulus-client-storage-chain-sync` gains a `test-helpers` feature that
exposes `StorageChainBlockImport::intercept_gap_sync_for_test`.
- Mock client extensions: `set_finalized_hash`, `last_indexed_transactions_state`,
`MockNetworkRequest::call_count` (all feature-gated).
- Default-features build remains clean; W13 tests skip when `test-helpers`
is not enabled.
Also updates the existing `attach_prefetched` and tip-block test assertions
to use the widened struct (`renew_payloads` field) instead of the bare
`Vec<(_,_)>` shape. No production behaviour change on the tip-block path.
… onto one line
The two map/collect chains that adapt `Vec<(ContentHash, Vec<u8>)>` to the PR #12086 `HashMap<H256, Vec<u8>>` carrier shape fit comfortably in 100 cols on a single line. Matches nightly-fmt conventions used elsewhere in the file (stable-fmt would also accept the multi-line form, but it disagrees with nightly on import grouping in the same file anyway). No behavior change.
…ayed_changes
Simplify `indexed_transactions` queries by using the typed runtime API instead of manual `call_api_at` boilerplate. This leverages PR #12084's `set_overlayed_changes` primitive.
Key changes:
- `indexed_transactions_with_storage_changes`: 60 lines → 10 lines
- `indexed_transactions_at_finalized`: 50 lines → 6 lines, removed `finalized_hash` parameter
- Removed: `ProofRecorder`, `ProofSizeExt`, `CallApiAtParams`, `RefCell` wrappers, manual decode, `has_api_with` checks (already gated by `should_intercept`)
- Removed: `INDEXED_TRANSACTIONS_API` constant, `call_api_at_count`/`overlay_marker_seen` test helpers (replaced with `overlayed_changes` inspection)
Also updated `mock_impl_runtime_apis!` to support `set_overlayed_changes` and fixed all affected mock structs in other crates.
…e_operator node_dev --bump minor'
Storage chains rely on `Renew` extrinsics to keep indexed transaction payloads alive past their original retention window. A node that tip-syncs past a `Renew` without ever having executed the original store ends up with an empty `TRANSACTION` column locally, and `apply_index_ops` fails. This PR adds `StorageChainBlockImport`, a block-import wrapper that intercepts this case during tip-sync, fetches the missing bytes over bitswap, and hands them to the backend through the `PrefetchedIndexedTransactions` carrier introduced in #12086.
Gap-sync (the same problem after warp-sync) is wired up but stays disabled in production, because the sync layer doesn't yet fetch bodies inside the pruning window. The dispatch and its tests are already in place behind the existing origin filter, so flipping the gate is a one-line change once the sync-layer work lands.
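A rough sketch of what such an origin gate could look like, assuming a boolean override in the spirit of `intercept_gap_sync_for_test`. `InterceptFilter`, its field, and the reduced `BlockOrigin` enum are hypothetical stand-ins; the real `should_intercept` may check more conditions than shown here.

```rust
/// Reduced stand-in for the real `BlockOrigin` enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum BlockOrigin {
    NetworkInitialSync,
    NetworkBroadcast,
    WarpSync,
    GapSync,
}

/// Hypothetical filter holding the gap-sync gate.
struct InterceptFilter {
    intercept_gap_sync: bool,
}

impl InterceptFilter {
    /// Production default in this sketch: gap-synced blocks pass through untouched.
    fn production() -> Self {
        Self { intercept_gap_sync: false }
    }

    /// Decide whether the wrapper handles a block or forwards it unchanged.
    fn should_intercept(&self, origin: BlockOrigin, has_body: bool) -> bool {
        if !has_body {
            // Nothing to classify or fetch without a body.
            return false;
        }
        match origin {
            BlockOrigin::WarpSync => false,
            // The gate: a single flag decides whether gap sync is dispatched.
            BlockOrigin::GapSync => self.intercept_gap_sync,
            _ => true,
        }
    }
}
```

Under this shape, enabling gap-sync interception in production amounts to changing the default of one flag.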