
WIP: Event driven coordinators #81

Draft
bkontur wants to merge 10 commits into dev from bko-event-driven-coordinators

Conversation

Collaborator

bkontur commented May 18, 2026

No description provided.

bkontur added 10 commits May 15, 2026 09:12
Replace the 6s polling loop (which iterated all `AgreementRequests`
storage entries on every tick) with a finalized-block subscription
that reacts to `AgreementRequested` events targeted at this provider.

On startup, a one-time storage snapshot at the first finalized block
covers requests that existed before the subscription began.

`poll_interval` is retained on the config (and CLI flag) but now
governs the reconnect delay after a subscription failure.

ChainStream wraps subxt's finalized-block subscription with automatic
reconnect and surfaces it as an `mpsc::Receiver`-backed `next()` call.
Each yield carries an `is_first_after_reconnect` flag so consumers can
perform a one-time storage snapshot to recover anything that happened
while disconnected.
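
The reconnect flag described above can be modelled with the standard library alone. The following is a minimal sketch, not the code in this PR: it swaps the async subxt subscription for a worker thread and `std::sync::mpsc`, and `Tick`, `Session`, and `ChainStream::spawn` are hypothetical names introduced only for illustration. Each `Session` stands in for the blocks delivered by one subscription before it drops.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread;
use std::time::Duration;

/// One yield from the stream: a finalized block number plus the reconnect flag.
#[derive(Debug, PartialEq)]
pub struct Tick {
    pub block_number: u64,
    pub is_first_after_reconnect: bool,
}

/// Hypothetical stand-in for one subscription session: the finalized blocks
/// it delivers before the connection is lost.
pub type Session = Vec<u64>;

/// Receiver-backed stream, assumed shape only.
pub struct ChainStream {
    rx: Receiver<Tick>,
}

impl ChainStream {
    /// Spawns a worker that replays `sessions` in order and sleeps for
    /// `reconnect_delay` between them to model the reconnect back-off.
    pub fn spawn(sessions: Vec<Session>, reconnect_delay: Duration) -> Self {
        let (tx, rx) = mpsc::channel();
        thread::spawn(move || {
            for (index, session) in sessions.into_iter().enumerate() {
                if index > 0 {
                    thread::sleep(reconnect_delay);
                }
                // The first block of every session carries the flag.
                let mut first = true;
                for block_number in session {
                    let tick = Tick { block_number, is_first_after_reconnect: first };
                    first = false;
                    if tx.send(tick).is_err() {
                        return; // the consumer dropped the stream
                    }
                }
            }
        });
        Self { rx }
    }

    /// Blocks until the next tick; returns `None` once the worker has finished.
    pub fn next(&mut self) -> Option<Tick> {
        self.rx.recv().ok()
    }
}
```

A consumer would loop on `next()` and take its storage snapshot whenever `is_first_after_reconnect` is set, which covers both the initial connection and every later reconnect.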

Also exposes a small set of SCALE-value field decoders (field_u64,
field_account, field_h256) — a subset of the helpers in
storage_client::scale_decode, inlined to avoid a cyclic dependency
(storage-client already depends on provider-node).

The subsequent commits port each coordinator off interval polling to
use this helper.

Replace the inline subscribe + reconnect loop and the local SCALE-decode
helpers with the new ChainStream abstraction. Behaviour is unchanged:
still runs an initial storage snapshot on first tick after (re)connect,
then reacts to AgreementRequested events on every subsequent finalized
block.
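
The snapshot-then-events behaviour can be sketched as a pure function. This is an illustrative assumption about the shape of the logic, not the coordinator's actual code: `AgreementRequest` and `requests_for_tick` are hypothetical names, and provider identities are plain strings here rather than account ids.

```rust
use std::collections::BTreeSet;

/// Hypothetical decoded form of an `AgreementRequested` event or of an
/// `AgreementRequests` storage entry.
#[derive(Debug, Clone, PartialEq)]
pub struct AgreementRequest {
    pub request_id: u64,
    pub provider: String,
}

/// Returns the request ids this provider should act on for one finalized block.
///
/// On the first tick after a (re)connect the storage snapshot is the source of
/// truth; on every later tick only the events of that block are considered.
pub fn requests_for_tick(
    is_first_after_reconnect: bool,
    storage_snapshot: &[AgreementRequest],
    block_events: &[AgreementRequest],
    this_provider: &str,
) -> BTreeSet<u64> {
    let source = if is_first_after_reconnect {
        storage_snapshot
    } else {
        block_events
    };
    source
        .iter()
        // Ignore requests addressed to other providers.
        .filter(|request| request.provider == this_provider)
        .map(|request| request.request_id)
        .collect()
}
```

Keeping the selection free of I/O makes it easy to unit-test the two paths without a running chain.
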
Swap the interval-driven polling tick for a ChainStream subscription.
`get_active_checkpoint_duties` is still a stub on this branch — the
migration is structural; behaviour is unchanged (still runs once per
finalized block, ~6s on this parachain).

Swap the interval-driven polling tick for a ChainStream subscription.
The doc comment that called this out as a placeholder
("How often to poll for challenges (if not using subscriptions)") is
updated to reflect the new meaning of the knob (reconnect delay).

Swap the interval-driven polling tick for a ChainStream subscription.
The module-level doc claimed "subscribes to checkpoint events on-chain"
but the implementation was polling every 12s; update the doc to reflect
that duties are now re-evaluated each finalized block.

Add `MembershipCache::invalidate` / `clear_all` and a free
`spawn_membership_invalidator` that uses ChainStream to drop cache
entries when `MemberSet` / `MemberRemoved` events fire. On every
(re)connect the cache is cleared whole — we may have missed events
while disconnected.

The TTL passed to `MembershipCache::new` becomes a defensive backstop:
the cache stays fresh through events, and TTL only matters if the
invalidator couldn't start (chain unreachable at boot).
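
A rough sketch of how event-driven invalidation and the TTL backstop fit together, using only the standard library. The method names `new`, `invalidate`, and `clear_all` come from the commit message; everything else (`insert`, `get`, the string key, `CacheSignal`, `apply_signal`) is a hypothetical stand-in, and the real cache may be keyed and synchronised differently.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Assumed shape: membership answers keyed by a string id, stamped on insert.
pub struct MembershipCache {
    ttl: Duration,
    entries: HashMap<String, (bool, Instant)>,
}

impl MembershipCache {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    pub fn insert(&mut self, key: &str, is_member: bool) {
        self.entries.insert(key.to_string(), (is_member, Instant::now()));
    }

    /// Returns the cached answer unless it is missing or older than the TTL.
    pub fn get(&self, key: &str) -> Option<bool> {
        self.entries
            .get(key)
            .filter(|(_, stored_at)| stored_at.elapsed() < self.ttl)
            .map(|(is_member, _)| *is_member)
    }

    /// Drops a single entry, e.g. after `MemberSet` or `MemberRemoved`.
    pub fn invalidate(&mut self, key: &str) {
        self.entries.remove(key);
    }

    /// Drops everything, e.g. after a reconnect that may have missed events.
    pub fn clear_all(&mut self) {
        self.entries.clear();
    }
}

/// Hypothetical signals an invalidator task could derive from the chain stream.
pub enum CacheSignal {
    Reconnected,
    MemberSet(String),
    MemberRemoved(String),
}

pub fn apply_signal(cache: &mut MembershipCache, signal: CacheSignal) {
    match signal {
        CacheSignal::Reconnected => cache.clear_all(),
        CacheSignal::MemberSet(key) | CacheSignal::MemberRemoved(key) => cache.invalidate(&key),
    }
}
```

With the invalidator running, entries are normally removed by events long before the TTL expires; the TTL check in `get` only decides the outcome when no signal ever arrives.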

While here: deduplicate the auth setup blocks in `command.rs` into a
single `configure_auth` helper.

Each coordinator previously opened its own subxt connection via a
two-step `new(...)` + `connect()` flow. Switch to constructing the
`OnlineClient<PolkadotConfig>` once in `command::run` and passing it
(cloned — it's Arc-backed) into each coordinator's `new(...)`.

`connect()` is removed; the api becomes a required field rather than
`Option<OnlineClient<_>>`. The auth invalidator and the multiaddr
sync also reuse the same client.

This is the prerequisite for the next commit's optional light-client
transport: a smoldot-backed light client must be shared across all
chain access (one parachain RPC, one relay-chain subscription) and
that's only possible with a single construction point.
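
The single-construction-point idea can be illustrated without subxt. In this sketch `ChainClient` is a hypothetical stand-in for `OnlineClient<PolkadotConfig>` (modelled as an `Arc` around a connection string), and the two coordinator types and `build_node` are invented names used only to show the wiring.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for an Arc-backed chain client: cloning copies a
/// pointer, not the underlying connection.
#[derive(Clone)]
pub struct ChainClient {
    connection: Arc<String>,
}

impl ChainClient {
    pub fn connect(url: &str) -> Self {
        Self { connection: Arc::new(url.to_string()) }
    }

    /// True when both handles point at the same underlying connection.
    pub fn same_connection(&self, other: &ChainClient) -> bool {
        Arc::ptr_eq(&self.connection, &other.connection)
    }
}

/// The client is a required field, so there is no separate `connect()` step
/// and no `Option` to unwrap.
pub struct AgreementCoordinator {
    pub api: ChainClient,
}

impl AgreementCoordinator {
    pub fn new(api: ChainClient) -> Self {
        Self { api }
    }
}

pub struct ChallengeResponder {
    pub api: ChainClient,
}

impl ChallengeResponder {
    pub fn new(api: ChainClient) -> Self {
        Self { api }
    }
}

/// Builds the client once and hands a clone to each consumer.
pub fn build_node(url: &str) -> (AgreementCoordinator, ChallengeResponder) {
    let api = ChainClient::connect(url);
    (AgreementCoordinator::new(api.clone()), ChallengeResponder::new(api))
}
```

Because every consumer receives a clone of the same handle, swapping the transport later only touches the one place where the client is built.
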
Add `chain_client::ChainTransport` with `WebSocket` and (feature-gated)
`LightClient` variants, plus a single `connect()` that hands back an
`OnlineClient<PolkadotConfig>` regardless of backend. `command::run`
calls it once at boot, so the entire node — coordinators, auth
invalidator, multiaddr sync — runs over one chain client.

CLI:
  --chain-light-client          enable smoldot light client
  --chain-relay-spec FILE       relay chain spec JSON
  --chain-parachain-spec FILE   parachain spec JSON

Defaults are unchanged (WebSocket RPC at ws://127.0.0.1:2222).

The smoldot backend is gated behind the `light-client` Cargo feature
because it adds ~5–10 MB of code and pulls in a relay-chain sync;
default builds aren't affected. Using `--chain-light-client` without
the feature returns a clear error.

  cargo build --release -p storage-provider-node                       # WS only (default)
  cargo build --release -p storage-provider-node --features light-client
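
The transport selection and the "flag without feature" error can be sketched as below. This is an assumption about structure, not the PR's code: `ChainArgs`, `select_transport`, `DEFAULT_WS_URL`, and the error strings are hypothetical, requiring both spec files is also an assumption, and the sketch checks the feature with `cfg!` at run time rather than gating the enum variant itself.

```rust
use std::path::PathBuf;

/// Default RPC endpoint quoted in the commit message.
pub const DEFAULT_WS_URL: &str = "ws://127.0.0.1:2222";

#[derive(Debug, PartialEq)]
pub enum ChainTransport {
    WebSocket { url: String },
    LightClient { relay_spec: PathBuf, parachain_spec: PathBuf },
}

/// Hypothetical parsed form of the three `--chain-*` flags.
pub struct ChainArgs {
    pub light_client: bool,
    pub relay_spec: Option<PathBuf>,
    pub parachain_spec: Option<PathBuf>,
}

/// Picks a transport, failing early when the light client was requested in a
/// build that does not include it.
pub fn select_transport(args: &ChainArgs) -> Result<ChainTransport, String> {
    if !args.light_client {
        return Ok(ChainTransport::WebSocket { url: DEFAULT_WS_URL.to_string() });
    }
    if !cfg!(feature = "light-client") {
        return Err(
            "--chain-light-client requires a build with `--features light-client`".to_string(),
        );
    }
    match (&args.relay_spec, &args.parachain_spec) {
        (Some(relay), Some(parachain)) => Ok(ChainTransport::LightClient {
            relay_spec: relay.clone(),
            parachain_spec: parachain.clone(),
        }),
        // Assumption: both chain specs are mandatory for the light client.
        _ => Err("light client needs --chain-relay-spec and --chain-parachain-spec".to_string()),
    }
}
```

Compiled without the feature, a request for the light client yields the error branch, which mirrors the behaviour described for default builds.
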