Git commit rules:
- NEVER add Co-Authored-By lines to commits
- NEVER use git rebase
- NEVER use git push --force or git push -f
Automatic formatting:
- ALWAYS run `/format` after generating or modifying Rust code
- ALWAYS run `/format` before creating any git commit
- This ensures all code follows project formatting standards (Rust, TOML, feature propagation) and passes clippy
Scalable Web3 Storage is a decentralized storage system built on Substrate with game-theoretic guarantees. Storage providers lock stake and face slashing for data loss, while the chain acts as a credible threat rather than the hot path.
Architecture: Two-node system where blockchain handles accountability and provider nodes handle actual storage:
- Parachain Node: On-chain logic for stake, agreements, checkpoints, and challenges
- Provider Node: Off-chain HTTP server for data upload, download, and MMR commitment
Key Purpose: Enable trustless storage where normal operations (reads, writes) happen off-chain via HTTP, and the chain is only touched for setup, checkpoints, and disputes.
# Build everything (release)
cargo build --release
# Build specific components
cargo build --release -p storage-parachain-runtime
cargo build --release -p storage-provider-pallet
cargo build --release -p storage-provider-node
cargo build --release -p storage-client
# Build with runtime benchmarks
cargo build --release --features runtime-benchmarks
# Using just (recommended)
just build

# Run all tests
cargo test
# Run pallet tests
cargo test -p storage-provider-pallet
# Run provider node tests
cargo test -p storage-provider-node
# Run client SDK tests
cargo test -p storage-client
# Run file system tests (Layer 1)
cargo test -p file-system-primitives
cargo test -p pallet-drive-registry
cargo test -p file-system-client
# Or test all file system components at once
just fs-test-all
# Run integration tests (require chain + provider already running)
just start-chain # Terminal 1
just start-provider # Terminal 2
just demo # Terminal 3 — Layer-0 PAPI flow
just fs-demo-ci # Terminal 3 — Layer-1 file-system flow
just s3-demo-ci # Terminal 3 — Layer-1 S3 flow
# Clippy linting
cargo clippy --all-targets --all-features --workspace -- -D warnings

# Rust formatting (requires nightly)
cargo +nightly fmt --all
# TOML formatting
taplo format --check --config .config/taplo.toml
# Feature propagation lint (checks Cargo.toml feature gates)
zepter run --config .config/zepter.yaml

# One-time setup (downloads binaries, builds project)
just setup
# Start blockchain
just start-chain
# Start provider node manually
just start-provider
# Check provider health
just health
# Check chain health (relay + parachain + current block)
bash scripts/check-chain.sh
# Run end-to-end PAPI demo (setup, upload, 2 challenges)
just demo

When the user says "run locally" (or "run the UIs", "start the UIs", "spin up the UIs"), invoke the run-local-uis project skill. It starts all four user-interfaces/ apps on their canonical ports with Vite HMR, including the landing page (which needs a custom dev config to substitute its build-time placeholders and rewrite card links). Canonical ports: landing 5176, console-ui 5173, drive-ui 5174, provider 5175.
The File System Interface provides a high-level abstraction over Layer 0's raw blob storage.
# Test all file system components (primitives + pallet + client)
just fs-test-all
# Run integration example against a running chain + provider node
just fs-demo-ci
# Manually run the basic_usage example
cargo run -p file-system-client --example basic_usage

Quick Start Guide: FILE_SYSTEM_QUICKSTART.md
Complete Documentation: docs/filesystems/README.md
For any JavaScript or TypeScript code in this repo (demos, scripts, tooling, future SDKs), talk to the chain through polkadot-api (PAPI).
Do NOT introduce @polkadot/keyring, @polkadot/util-crypto, @polkadot/util, @polkadot/api, or any other @polkadot/* package.
They duplicate functionality PAPI already provides, drag in 20+ transitive deps, and force cryptoWaitReady() awaits everywhere. Use these instead:
| Need | Use |
|---|---|
| Chain client + typed API | polkadot-api (createClient, getWsProvider from polkadot-api/ws-provider) |
| Signer wrapper | getPolkadotSigner from polkadot-api/signer |
| SCALE / Binary / Enum | @polkadot-api/substrate-bindings |
| Sr25519 key derivation (//Alice) | sr25519CreateDerive from @polkadot-labs/hdkd + DEV_PHRASE + entropyToMiniSecret + mnemonicToEntropy from @polkadot-labs/hdkd-helpers |
| SS58 encode / decode | ss58Address / ss58Decode from @polkadot-labs/hdkd-helpers |
| blake2-256 hashing | blake2b256 from @polkadot-labs/hdkd-helpers |
| cryptoWaitReady() | Not needed: hdkd is synchronous; delete the import and the await |
Canonical signer/derive pattern — set up the derive function once at module load, then call makeSigner("//Alice") etc.:
import { createClient } from "polkadot-api";
import { getWsProvider } from "polkadot-api/ws-provider";
import { getPolkadotSigner } from "polkadot-api/signer";
import { sr25519CreateDerive } from "@polkadot-labs/hdkd";
import {
DEV_PHRASE,
entropyToMiniSecret,
mnemonicToEntropy,
ss58Address,
ss58Decode,
} from "@polkadot-labs/hdkd-helpers";
const devMiniSecret = entropyToMiniSecret(mnemonicToEntropy(DEV_PHRASE));
const deriveSr25519 = sr25519CreateDerive(devMiniSecret);
export function makeSigner(seed) {
const keyPair = deriveSr25519(seed); // seed is a SURI path like "//Alice"
return {
signer: getPolkadotSigner(keyPair.publicKey, "Sr25519", keyPair.sign),
address: ss58Address(keyPair.publicKey), // prefix 42 (`5…`), same as @polkadot/keyring default
publicKey: keyPair.publicKey,
seed,
};
}

ss58Address defaults to Substrate prefix 42 (5…) while PAPI surfaces accounts with the runtime SS58 prefix (Polkadot-style 1… on this parachain). The key is the same but the string differs, so string equality fails. Compare raw bytes via ss58Decode:
// ss58Decode(addr) → [bytes, prefix]
export function sameAddress(a, b) {
try {
const [aBytes] = ss58Decode(a);
const [bBytes] = ss58Decode(b);
if (aBytes.length !== bBytes.length) return false;
for (let i = 0; i < aBytes.length; i++) {
if (aBytes[i] !== bBytes[i]) return false;
}
return true;
} catch {
return false;
}
}

web3-storage/
├── pallet/ # Substrate pallet (on-chain logic - Layer 0)
│ ├── src/lib.rs # Core pallet implementation
│ └── Cargo.toml # Pallet dependencies
├── runtime/ # Parachain runtime
│ ├── src/lib.rs # Runtime configuration
│ └── Cargo.toml # Runtime dependencies
├── provider-node/ # Off-chain HTTP storage server
│ ├── src/ # Provider implementation
│ │ ├── main.rs # Server entry point
│ │ ├── storage.rs # Storage layer
│ │ └── mmr.rs # MMR commitment logic
│ └── Cargo.toml # Provider dependencies
├── client/ # Layer 0 Client SDK
│ ├── src/ # SDK implementation
│ │ ├── lib.rs # Main client API
│ │ └── types.rs # Client types
│ ├── examples/ # Usage examples
│ └── README.md # SDK documentation
├── primitives/ # Layer 0 shared types and utilities
│ ├── src/lib.rs # Common types
│ └── Cargo.toml # Primitive dependencies
├── storage-interfaces/ # Layer 1 - High-level interfaces
│ └── file-system/ # File System Interface
│ ├── primitives/ # File system types (DriveInfo, CommitStrategy, etc.)
│ ├── pallet-registry/ # Drive Registry pallet (on-chain)
│ └── client/ # File System Client SDK
│ ├── src/
│ │ ├── lib.rs # Main file system client
│ │ └── substrate.rs # Blockchain integration (subxt)
│ ├── examples/
│ │ └── basic_usage.rs # Complete workflow example
│ └── README.md # File system client docs
├── scripts/ # Helper scripts
│ ├── build-chain-spec.sh # Build runtime + emit chain spec (used by `just generate-chain-spec`)
│ ├── check-chain.sh # Relay + parachain health probe
│ └── quick-test.sh # Curl-based smoke test of provider HTTP API
├── chain-specs/ # Chain specification files
├── docs/ # Documentation
│ ├── README.md # Documentation index
│ ├── getting-started/ # Quick start guides
│ ├── testing/ # Testing procedures
│ ├── reference/ # API references
│ ├── design/ # Architecture docs
│ └── filesystems/ # Layer 1 File System docs
│ ├── README.md # File system overview
│ ├── ARCHITECTURE.md # Encoding, security, chain integration
│ ├── USER_GUIDE.md # User guide
│ ├── API_REFERENCE.md # API documentation
│ └── ADMIN_GUIDE.md # Admin guide
├── FILE_SYSTEM_QUICKSTART.md # Quick start for file system
└── justfile # Development commands
Pallet (pallet/): On-chain logic for provider registration, bucket creation, storage agreements, checkpoints, and challenge/slashing mechanism.
Runtime (runtime/): Parachain runtime that includes the storage provider pallet and configures its parameters (stake requirements, challenge periods, etc.).
Provider Node (provider-node/): Off-chain HTTP server that:
- Stores data chunks locally
- Builds MMR commitments
- Serves data via HTTP API
- Signs checkpoints for on-chain submission
Client SDK (client/): Rust library for applications to:
- Create buckets and agreements (on-chain)
- Upload/download data (off-chain HTTP)
- Submit checkpoints (on-chain)
- Challenge providers (on-chain)
Primitives (primitives/): Shared types used across pallet, provider node, and client.
File System Primitives (storage-interfaces/file-system/primitives/): High-level types for file system:
- `DriveInfo`: Drive metadata and configuration
- `DirectoryNode`: Protobuf-based directory structure
- `FileManifest`: File metadata with chunk tracking
- `CommitStrategy`: Checkpoint strategies (Immediate, Batched, Manual)
- Helper functions for CID computation and path handling
Drive Registry Pallet (storage-interfaces/file-system/pallet-registry/): On-chain drive management:
- Drive creation with automatic infrastructure setup
- Root CID tracking for drive state
- User-to-drive mapping
- Bucket-to-drive mapping
- Drive lifecycle (create, update, clear, delete)
File System Client (storage-interfaces/file-system/client/): High-level SDK providing:
- Familiar file/folder interface over Layer 0 blob storage
- Automatic drive creation and provider selection
- Directory operations (create, list, navigate)
- File operations (upload, download, delete)
- Real blockchain integration using `subxt`
- Content-addressed storage with CID verification
- Flexible commit strategies
Example: storage-interfaces/file-system/client/examples/basic_usage.rs
- Complete workflow: drive creation → directories → file uploads/downloads
- Real blockchain integration with event extraction
- Demonstrates the full Layer 1 capabilities
- Setup: `just setup` (one-time, downloads binaries and builds)
- Start: `just start-chain` then `just start-provider` (in separate terminals)
- Configure: with chain + provider running, `just demo` registers the provider, opens an agreement, and exercises challenges end-to-end (it does not start the chain or provider for you)
- Test: `just demo`
- Format code: `cargo fmt --all`
- Run clippy: `cargo clippy --all-targets --all-features --workspace`
- Run tests: `cargo test`
- Build: `cargo build --release` or `just build`
The project uses Zombienet for local relay chain + parachain testing:
# Start network (relay chain + parachain)
just start-chain
# Or manually:
.bin/zombienet spawn zombienet.toml

Network URLs:
- Relay chain: `ws://127.0.0.1:9900`
- Parachain: `ws://127.0.0.1:2222`
- Provider HTTP: `http://localhost:3333`
Web UI:
- Relay chain: https://polkadot.js.org/apps/?rpc=ws://127.0.0.1:9900
- Parachain: https://polkadot.js.org/apps/?rpc=ws://127.0.0.1:2222
This project is built on the Polkadot SDK (formerly Substrate). For deeper understanding of FRAME pallets, runtime macros, and consensus:
- Repository: https://github.com/paritytech/polkadot-sdk
- Documentation: https://paritytech.github.io/polkadot-sdk/
The Polkadot SDK provides:
- FRAME pallet system and runtime macros
- Parachain consensus (Cumulus)
- Networking (libp2p)
- RPC infrastructure
- XCM (Cross-Consensus Messaging)
- Polkadot SDK: See `Cargo.toml` workspace dependencies
- Rust: 1.74+ with `wasm32-unknown-unknown` target
- Just: Command runner (`cargo install just`)
- Zombienet: Network spawner (auto-downloaded by `just setup`)
- Polkadot: Relay chain binary (auto-downloaded)
- Polkadot Omni Node: Parachain node (auto-downloaded)
// Token decimals
pub const UNIT: Balance = 1_000_000_000_000; // 12 decimals
// Minimum provider stake: 1000 tokens
pub const MinProviderStake: Balance = 1_000 * UNIT;
// 1 token (1e12) per 1 GB (1e9 bytes) = 1000 per byte
pub const MinStakePerByte: Balance = 1_000;
// Challenge response deadline (provider must respond within this many blocks)
pub const ChallengeTimeout: BlockNumber = 48 * HOURS;
pub const SettlementTimeout: BlockNumber = 24 * HOURS;
pub const RequestTimeout: BlockNumber = 6 * HOURS;
// Provider-initiated checkpoint config
pub const DefaultCheckpointInterval: BlockNumber = 100;
pub const DefaultCheckpointGrace: BlockNumber = 20;
pub const CheckpointReward: Balance = 1_000_000_000_000; // 1 token
pub const CheckpointMissPenalty: Balance = 500_000_000_000; // 0.5 token

pub struct ProviderSettings {
min_duration: BlockNumber, // Minimum agreement duration
max_duration: BlockNumber, // Maximum agreement duration
price_per_byte: Balance, // Price per byte per block
accepting_primary: bool, // Accepting new agreements
replica_sync_price: Option<Balance>, // Price for replica sync
accepting_extensions: bool, // Accepting agreement extensions
max_capacity: u64, // Maximum storage capacity (0 = unlimited)
}

Providers must stake tokens proportional to their declared capacity:
// Minimum stake per byte of declared capacity
pub const MinStakePerByte: Balance = 1_000_000; // 1 unit per MB
// Required stake calculation
required_stake = max_capacity * MinStakePerByte
// Example: 1 TB capacity requires 1,000,000,000,000 units stake

- Setup (on-chain):
  - Provider registers with stake
  - Client creates bucket
  - Agreement established
- Storage (off-chain):
  - Client uploads chunks via HTTP to provider
  - Provider stores and builds MMR commitment
- Checkpoint (on-chain):
  - Provider signs MMR root
  - Client submits checkpoint
  - Provider now liable for data
- Verification (off-chain):
  - Client spot-checks chunks
  - Client can download anytime
- Dispute (on-chain, rare):
  - Client submits challenge
  - Provider must provide proof or get slashed
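The dispute step can be read as a small decision rule. The sketch below is only an illustration of that rule, not the pallet's implementation: the enum, the function name, and the exact treatment of late or invalid proofs are assumptions, and `timeout_blocks` stands in for whatever challenge timeout the runtime configures.

```rust
/// Hypothetical outcome of a storage challenge (names are illustrative).
#[derive(Debug, PartialEq)]
enum ChallengeOutcome {
    /// Deadline not reached and the provider has not responded yet.
    Pending,
    /// Provider answered in time with a proof that verified.
    ProviderCleared,
    /// Provider missed the deadline, or the submitted proof did not verify.
    ProviderSlashed,
}

/// Evaluates a challenge at block `now`.
/// `proof_valid` is `None` while the provider has not responded.
fn resolve_challenge(
    created_at: u32,
    now: u32,
    timeout_blocks: u32,
    proof_valid: Option<bool>,
) -> ChallengeOutcome {
    // Saturating add so a huge timeout cannot overflow the block number.
    let deadline = created_at.saturating_add(timeout_blocks);
    match proof_valid {
        Some(true) if now <= deadline => ChallengeOutcome::ProviderCleared,
        // Assumption: an invalid or late proof counts as a failure.
        Some(_) => ChallengeOutcome::ProviderSlashed,
        None if now > deadline => ChallengeOutcome::ProviderSlashed,
        None => ChallengeOutcome::Pending,
    }
}
```

For instance, a challenge opened at block 100 with a 100-block timeout stays pending through block 200 and resolves against the provider from block 201 if no proof arrived.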
The provider builds an MMR over stored chunks:
- Each upload adds a leaf to the MMR
- MMR root represents commitment to all data
- Efficient proofs for individual chunks
- Enables challenge mechanism
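To make the append-and-commit idea concrete, here is a minimal MMR sketch. It is not the code in `provider-node/src/mmr.rs`: the `Mmr` type and its methods are hypothetical, proofs are omitted, and the standard library's `DefaultHasher` is used as a non-cryptographic stand-in for a real hash function purely to keep the example dependency-free.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in 64-bit digest; a real node would use a cryptographic hash.
type Digest = u64;

fn hash_leaf(data: &[u8]) -> Digest {
    let mut h = DefaultHasher::new();
    0u8.hash(&mut h); // domain tag for leaves
    data.hash(&mut h);
    h.finish()
}

fn hash_pair(left: Digest, right: Digest) -> Digest {
    let mut h = DefaultHasher::new();
    1u8.hash(&mut h); // domain tag for inner nodes
    left.hash(&mut h);
    right.hash(&mut h);
    h.finish()
}

/// Only the peaks are kept; that is enough to append leaves and compute the root.
#[derive(Default)]
struct Mmr {
    peaks: Vec<(u32, Digest)>, // (height, hash), tallest peak first
    leaf_count: u64,
}

impl Mmr {
    /// Each upload adds one leaf; equal-height peaks are merged upward.
    fn append(&mut self, chunk: &[u8]) {
        let mut node = (0u32, hash_leaf(chunk));
        while let Some((height, peak)) = self.peaks.last().copied() {
            if height != node.0 {
                break;
            }
            self.peaks.pop();
            node = (height + 1, hash_pair(peak, node.1));
        }
        self.peaks.push(node);
        self.leaf_count += 1;
    }

    /// "Bags" the peaks right-to-left into a single root; `None` when empty.
    fn root(&self) -> Option<Digest> {
        let mut hashes = self.peaks.iter().rev().map(|&(_, h)| h);
        let first = hashes.next()?;
        Some(hashes.fold(first, |acc, peak| hash_pair(peak, acc)))
    }
}
```

After `n` appends the structure holds one peak per set bit of `n`, which is why appends stay cheap while the root still commits to every chunk.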
payment = price_per_byte × max_bytes × duration
Example:
price_per_byte = 1,000,000
max_bytes = 1,073,741,824 (1 GB)
duration = 500 blocks
payment = 536,870,912,000,000,000
Set maxPayment with 10-20% buffer to account for price changes.
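The same calculation as a small Rust sketch, using checked arithmetic so that oversized inputs yield `None` instead of wrapping. The function names are hypothetical and not part of the SDK; `u128` is assumed as the balance type.

```rust
/// payment = price_per_byte × max_bytes × duration, or `None` on overflow.
fn agreement_payment(price_per_byte: u128, max_bytes: u128, duration_blocks: u128) -> Option<u128> {
    price_per_byte
        .checked_mul(max_bytes)?
        .checked_mul(duration_blocks)
}

/// Adds a percentage buffer on top of the payment (for example 10 for 10%).
fn max_payment_with_buffer(payment: u128, buffer_percent: u128) -> Option<u128> {
    let factor = 100u128.checked_add(buffer_percent)?;
    Some(payment.checked_mul(factor)? / 100)
}
```

With the example values above, `agreement_payment(1_000_000, 1_073_741_824, 500)` returns `Some(536_870_912_000_000_000)`, and a 10% buffer raises the cap to `590_558_003_200_000_000`.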
The SDK provides automatic provider discovery based on storage requirements:
use storage_client::{DiscoveryClient, StorageRequirements};
let mut client = DiscoveryClient::with_defaults()?;
client.connect().await?;
// Define requirements
let requirements = StorageRequirements {
bytes_needed: 10 * 1024 * 1024 * 1024, // 10 GB
min_duration: 100_000,
max_price_per_byte: 1_000_000,
primary_only: true,
};
// Find matching providers (sorted by score)
let providers = client.find_providers(requirements, 10).await?;
// Or get recommendations with cost estimates
let recommendations = client.suggest_providers(bytes, duration, budget).await?;

Matching Algorithm: Providers are scored 0-100 based on:
- Accepting status (not accepting = 0)
- Capacity (insufficient = -50 points)
- Price (too high = -30 points)
- Duration (mismatch = -20 points)
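One way to read the scoring rules above is as penalties subtracted from a starting score of 100. The sketch below assumes exactly that; the struct and field names are hypothetical, and the SDK's actual scoring may weigh or combine the factors differently.

```rust
/// Hypothetical, simplified view of what a provider advertises.
#[derive(Clone, Copy)]
struct ProviderOffer {
    accepting_primary: bool,
    available_bytes: u64,
    price_per_byte: u128,
    min_duration: u32,
    max_duration: u32,
}

/// Hypothetical, simplified view of what the client needs.
struct Requirements {
    bytes_needed: u64,
    duration: u32,
    max_price_per_byte: u128,
}

/// Starts at 100 and subtracts the listed penalties; not accepting scores 0.
fn score_provider(offer: &ProviderOffer, req: &Requirements) -> u8 {
    if !offer.accepting_primary {
        return 0;
    }
    let mut penalty: u8 = 0;
    if offer.available_bytes < req.bytes_needed {
        penalty += 50; // insufficient capacity
    }
    if offer.price_per_byte > req.max_price_per_byte {
        penalty += 30; // price too high
    }
    if req.duration < offer.min_duration || req.duration > offer.max_duration {
        penalty += 20; // duration outside the provider's accepted range
    }
    100u8.saturating_sub(penalty)
}
```

Under this reading a provider that fails all three checks lands at 0, the same score as one that is not accepting agreements at all.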
See Storage Marketplace Design for details.
The client SDK provides comprehensive checkpoint management:
use storage_client::{BatchedCheckpointConfig, BatchedInterval, CheckpointConfig, CheckpointManager};
// Create checkpoint manager
let manager = CheckpointManager::new(chain_endpoint, CheckpointConfig::default()).await?;
let manager = manager.with_providers(provider_endpoints);
// Manual checkpoint submission
let result = manager.submit_checkpoint(bucket_id).await;
// Or enable automatic checkpoints
let config = BatchedCheckpointConfig {
interval: BatchedInterval::Blocks(100),
..Default::default()
};
let handle = manager.start_checkpoint_loop(bucket_id, config, callback).await?;
// Control the loop
handle.submit_now().await?; // Force immediate checkpoint
handle.stop().await?; // Stop background loop

Key Components:
- `CheckpointManager`: Coordinates multi-provider checkpoint collection and consensus
- `CheckpointPersistence`: Persists checkpoint state to disk with backup rotation
- `EventSubscriber`: Real-time blockchain event monitoring (checkpoints, challenges)
- `ProviderHealthHistory`: Tracks provider reliability and response times
See Checkpoint Protocol Design for details.
Subscribe to real-time blockchain events:
use storage_client::{EventSubscriber, EventFilter, StorageEvent};
let subscriber = EventSubscriber::connect(chain_endpoint).await?;
// Subscribe to specific events
let filter = EventFilter::bucket(bucket_id);
let mut stream = subscriber.subscribe(filter).await?;
while let Some(event) = stream.next().await {
match event {
StorageEvent::BucketCheckpointed { bucket_id, mmr_root, .. } => { /* ... */ }
StorageEvent::ChallengeCreated { challenge_id, .. } => { /* ... */ }
StorageEvent::ProviderSlashed { provider, amount, .. } => { /* ... */ }
_ => {}
}
}

For the full review criteria (Parity Standards), see the /review skill. The review bot and all contributors follow those guidelines.
- Error Handling: Use `Result` types with meaningful error enums. Avoid `unwrap()` and `expect()` in production code; they are acceptable in tests.
- Arithmetic Safety: Use `checked_*`, `saturating_*`, or `wrapping_*` arithmetic to prevent overflow. Never use raw arithmetic operators on user-provided values.
- Naming: Follow Rust naming conventions (snake_case for functions/variables, CamelCase for types).
- Complexity: Prefer simple, readable code. Avoid over-engineering and premature abstractions.
- No useless comments: Comments should mostly explain why things are done, not how. The code should be readable enough to explain the how.
- Storage: Use appropriate storage types (`StorageValue`, `StorageMap`, `StorageDoubleMap`, `CountedStorageMap`).
- Events: Emit events for all state changes that external observers need to track.
- Errors: Define descriptive error types in the pallet's `Error` enum.
- Weights: All extrinsics must have accurate weight annotations. Update benchmarks when logic changes.
- Origins: Use the principle of least privilege for origin checks.
- Hooks: Be cautious with `on_initialize` and `on_finalize`; they affect block production time. Never panic or do unbounded iteration in them. Always benchmark them properly.
- No Panics in Runtime: Runtime code must never panic. Use defensive programming with `defensive_*` macros.
- Bounded Collections: Use `BoundedVec`, `BoundedBTreeMap` etc. to prevent unbounded storage growth.
- Input Validation: Validate all user inputs at the entry point.
- Storage Deposits: Consider requiring deposits for user-created storage items.
- Arithmetic: Always use checked arithmetic for financial calculations.
- Access Control: Verify origin permissions before state changes.
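As an illustration of the error-handling and arithmetic rules above, the following sketch returns a typed error instead of panicking or wrapping. The error enum and function names are hypothetical and use plain `u128` rather than the runtime's `Balance` type.

```rust
#[derive(Debug, PartialEq)]
enum PaymentError {
    Overflow,
    InsufficientBalance,
}

/// Debits `price_per_byte * bytes` from `balance` without ever panicking.
fn debit_storage_fee(balance: u128, price_per_byte: u128, bytes: u128) -> Result<u128, PaymentError> {
    let fee = price_per_byte
        .checked_mul(bytes)
        .ok_or(PaymentError::Overflow)?;
    balance
        .checked_sub(fee)
        .ok_or(PaymentError::InsufficientBalance)
}

/// A statistics counter may saturate instead of failing.
/// Saturation silently loses information, so it is a poor fit for balances.
fn bump_counter(count: u32) -> u32 {
    count.saturating_add(1)
}
```

The split reflects the two rules: financial values fail loudly through `checked_*` and a `Result`, while a value whose exact magnitude is not critical can use `saturating_*`.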
- Unit Tests: All new functionality requires unit tests.
- Edge Cases: Test boundary conditions, error paths, and malicious inputs.
- Integration Tests: Complex features should have integration tests.
- Mock Tests: Use `mock.rs` and `TestExternalities` for pallet tests.
- Provider Node Tests: Test HTTP API endpoints and storage layer.
- Client SDK Tests: Test all public SDK methods.
- Single Responsibility: Each PR should address one concern.
- Tests Pass: All CI checks must pass (`cargo test`, `cargo clippy`, `cargo fmt`).
- No Warnings: Code should compile without warnings.
- Documentation: Public APIs require rustdoc comments.
- Changelog: Update changelog for user-facing changes.
📚 Complete Documentation - Full documentation index
| Document | Description |
|---|---|
| Layer 1 Quick Start | Three-terminal setup + SDK examples |
| Extrinsics Reference | Complete blockchain API |
| Payment Calculator | Calculate agreement costs |
| Architecture Design | System design, economics, common concerns |
| Implementation Details | Technical specs |
| Execution Flows | Sequence diagrams for all extrinsics |
| Storage Marketplace | Provider capacity & discovery |
| Checkpoint Protocol | Automated checkpoint management |
| File System Architecture | Layer 1 encoding, security, blockchain details |
- Minimum required: 1000 tokens = `1000000000000000` (12 decimals)
- Check Alice's balance in Accounts tab
- Calculate payment: `price_per_byte × max_bytes × duration`
- Set `maxPayment` with 10-20% buffer
- See Payment Calculator
- Complete on-chain setup first: register provider, create bucket, establish agreement
- With chain + provider already running, `just demo` performs that setup
- For chain health, run `bash scripts/check-chain.sh` (relay + parachain probe)
- Call `updateProviderSettings` after registration
- Set `acceptingPrimary: true`
- Provider's `max_capacity` is too low for the agreement
- Or provider's stake doesn't cover their declared capacity
- Required: `stake >= max_capacity * MinStakePerByte`
- Use `DiscoveryClient.find_providers()` to find providers with sufficient capacity
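The stake requirement above can be checked locally before submitting a settings update. This is a minimal sketch with hypothetical function names; `min_stake_per_byte` is passed in as a parameter because the value depends on the runtime configuration, and the "0 = unlimited" capacity case is not modelled.

```rust
/// required_stake = max_capacity * min_stake_per_byte, or `None` on overflow.
fn required_stake(max_capacity_bytes: u64, min_stake_per_byte: u128) -> Option<u128> {
    u128::from(max_capacity_bytes).checked_mul(min_stake_per_byte)
}

/// True when `stake` covers the declared capacity; overflow counts as not covered.
fn stake_covers_capacity(stake: u128, max_capacity_bytes: u64, min_stake_per_byte: u128) -> bool {
    match required_stake(max_capacity_bytes, min_stake_per_byte) {
        Some(required) => stake >= required,
        None => false,
    }
}
```

Assuming a rate of 1,000 base units per byte and 12 token decimals, 1 GB (1e9 bytes) of declared capacity needs 1e12 base units, which is one token.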
- `runtime-benchmarks` - Enable weight generation
- `try-runtime` - Runtime migration testing
- `std` - Standard library features (default)
- Token decimals: 12 (like Polkadot)
- Minimum stake: 1000 tokens
- Challenge period: 100 blocks
- Data is content-addressed with blake2-256
- All data operations happen off-chain via HTTP
- Chain is only for accountability and disputes
- @claude - Mention in any comment to ask questions or request help
- Assign to claude[bot] - Assign an issue to have Claude analyze and propose solutions
- Label with `claude` - Add the `claude` label to an issue for Claude to investigate