Local-first AI agent runtime. Your machine. Your memory. Your mesh.
NULLA is a local-first agent runtime — runs on your hardware, remembers everything across sessions, uses tools to do real engineering work, and coordinates trusted helpers over a peer mesh when a task needs more reach. Nothing leaves your box unless you say so.
It's also a node in Web0 — the direction where tasks decompose, agents bid, compute gets rented, and work settles over the x402 payment rail.
Current state: Alpha — core runtime, memory, and tool loop working on main. Mesh economics hardening. See docs/STATUS.md.
local NULLA agent → memory + tools → optional trusted helpers → mesh task market → results
NULLA runs a real agent loop: call LLM → parse tool intent → execute (read files, run tests, write code, search web) → feed result back → repeat until done. It doesn't hand you a one-shot guess and call it a day.
Benchmark on real engineering tasks (5 tasks requiring tool use):
| Score | Notes | |
|---|---|---|
| NULLA (14b + tools + loop) | 5/5 | Iterates, fixes, verifies |
| Ollama 14b single-shot | 4/5 | Fails cross-file rename — no iteration |
Tasks were specifically designed to be impossible without tool use: bugs only visible at runtime, multi-file changes that require reading before editing. The benchmark is in tests/benchmarks/agent_capability_bench.py — run it yourself.
Most local LLM setups either blow up the context window or chop off the beginning and lose everything. NULLA compresses without forgetting.
Memory benchmark (30-turn conversation, 5 facts planted early):
| Mode | Recall | Peak tokens |
|---|---|---|
| Raw (no compression) | 5/5 (100%) | 528 |
| Sliding window (10) | 0/5 (0%) | 362 (-31%) |
| NULLA ContextWindow | 5/5 (100%) | 335 (-36%) |
Sliding window is the naive approach every other local stack uses. It cuts tokens by just forgetting everything old — including your passwords, deadlines, and API keys. NULLA cuts 36% of tokens and remembers everything.
The three tiers:
- L1 — recent turns verbatim (always in context)
- L2 — LLM-compressed structured summary of older turns (Key Facts / Decisions / Open Questions / Context — exact values preserved word-for-word)
- L3 — semantic memory nodes in SQLite, retrieved by embedding similarity with
nomic-embed-text
Smart retrieval: before injecting L3 nodes, NULLA checks whether the content is already covered in L2. No token bloat from re-injecting facts the summary already has.
Every turn gets scored before being stored in L3:
password / API key → 0.6–0.95
port / date → 0.45–0.50
decision / deadline → 0.40–0.45
generic explanation → 0.20
High-importance turns are prioritised during retrieval. Your sk-prod-xxxx stays findable. "Can you explain async/await?" does not crowd it out.
Plugs into nomic-embed-text via Ollama (274MB, 768-dim). Falls back to a hash bag-of-words if not installed. The same embedding service backs L3 retrieval across sessions — ask something in session 2, get a relevant fact from session 1.
GET /api/runtime/capabilities reports, per feature, whether it is implemented, simulated, or disabled — so payments show as simulated, WAN mesh as experimental, and live web lookup as opt-in and off in the local-only profile (enable it on a non-local-only profile). /healthz reports commit + dirty bit. The runtime surfaces its own status.
Parad0x Labs builds Web0 on Solana — money and agents that settle themselves. You are here: 🧠 Local AI (the runtime that consumes every layer).
| Layer | Repo | Does |
|---|---|---|
| 💸 Payments | dna-x402 | x402 rail: quote → pay → verify → receipt → anchor |
| 🛠️ Build | dna-x402-builders | Hosted kit: turn any API/bot into a paid agent |
| 🕶️ Privacy | Dark-Null-Protocol | Groth16 privacy settlement, published proofs |
| 🗜️ Data | liquefy | Columnar compression that beats Zstd |
| 🛡️ Audit | liquefy-openclaw-integration | Flight recorder: 24 engines + Solana-anchored audit trails |
| 🎬 Media | nebula-media | Proof-carrying media compression — scene-aware + on-chain receipts |
| 🧠 Local AI | nulla-local (this repo) | Local-first agent runtime — your machine, your memory |
See it live: parad0xlabs.com
Bootstrap install script:
curl -fsSLo bootstrap_nulla.sh https://raw.githubusercontent.com/Parad0x-Labs/nulla-hive-mind/main/installer/bootstrap_nulla.sh
bash bootstrap_nulla.shWindows PowerShell:
Invoke-WebRequest https://raw.githubusercontent.com/Parad0x-Labs/nulla-hive-mind/main/installer/bootstrap_nulla.ps1 -OutFile bootstrap_nulla.ps1
powershell -ExecutionPolicy Bypass -File .\bootstrap_nulla.ps1Profiles:
# Safest — smaller machines, zero remote dependency
bash bootstrap_nulla.sh --install-profile ollama-only
# Full local power — 24 GiB+ unified memory or equivalent
bash bootstrap_nulla.sh --install-profile ollama-max
# Max performance — Ollama + native llama.cpp (local_plus_llamacpp)
bash bootstrap_nulla.sh --install-profile local_plus_llamacppAfter install, set your profile:
cd ~/nulla-hive-mind && .venv/bin/python -m apps.nulla_cli install-profile --set ollama-only
cd ~/nulla-hive-mind && .venv/bin/python -m apps.nulla_cli install-profile --set ollama-maxFull install docs: docs/INSTALL.md
| OS | Inference | Job sandbox | Launchers |
|---|---|---|---|
| macOS (Apple Silicon) | Metal GPU via Ollama | kernel-enforced (sandbox-exec) |
.command |
| Linux | Ollama + native llama.cpp | kernel-enforced (bwrap/unshare/firejail) |
.sh |
| Windows (native host) | Ollama (CPU; consumer-GPU lane coming) | static command guard only (no kernel backend) | .bat / PowerShell |
| Windows + WSL2/Linux | Ollama + native llama.cpp | kernel-enforced (bwrap/unshare/firejail) |
.sh inside WSL2 |
Apple Silicon is the primary development target. Temp paths, signal handling, chat, three-tier memory, the OpenClaw UI bridge, local Ollama inference, and the workspace tools all run on a native Windows host today.
Full capability — kernel-enforced no-network job sandbox and live web lookup — wants WSL2/Linux plus a non-local-only profile:
- Kernel sandbox: native Windows has no kernel network-namespace backend, so a
no-network job fails closed by default. Run under WSL2/Linux for
bwrap/unshare/firejailkernel enforcement, or setnetwork_isolation_mode="heuristic_only"on a native host as an explicit, informed override (static command guard only, no kernel isolation). - Live web lookup: opt-in and OFF in the local-only profile. Enable it on a non-local-only
profile (and/or
NULLA_ENABLE_WEB=1when not local-only). See Web Access. - Remote
null://dial: opt-in and OFF by default. Anull://request runs locally unless dial is enabled withNULLA_ENABLE_NULL_DIAL=1, at which point it can reach the named.nullagent's x402 endpoint and return that agent's result. Payment is separately gated by--allow-spendwithin a cap. An SSRF guard rejects internal/loopback endpoints. See Remote dial.
- Agent loop — LLM → tool call → execute → iterate → done. Not a single-shot wrapper.
- Three-tier memory — L1 verbatim + L2 structured compression + L3 semantic SQLite. 36% fewer tokens, 100% recall.
- Embedding service — nomic-embed-text (768-dim) with hash-BoW fallback. Cross-session retrieval.
- Importance scoring — passwords, keys, dates, decisions tagged and prioritised in memory.
- Stress-tested at scale — benchmark supports
--turns 100and--turns 200scenarios. - Persistent memory across sessions — NullaMemory SQLite backend.
- Bounded coding/operator flow — search → read → patch → validate → rollback if broken.
- Append-only task/proof spine — every repair and orchestration step is inspectable, not locked inside the executor.
- Mesh task market — decompose → escrow → offer → claim → execute → review → reward. Ed25519-signed credit settlement. Single-node and loopback verified end-to-end.
- 3-layer anti-cheat proof-of-work credits — challenge-response, staking, ZK-proof path. The stake-before-work / slash-on-cheat guard is built and self-tests green (wrong, late, and cheating workers are slashed).
- Multi-result consensus validator — cross-validates worker answers, spawns a verification job on disagreement (built + tested).
- Capability-token authorization — signed, scoped, single-use, expiring task tokens gate who may run what (built + tested).
- Contribution-proof receipt chain + proof-of-execution + proof manifest — hash-canonical contribution receipts and a git-source proof manifest, distinct from the task/proof event spine (built + tested).
- Compute-rental market — prices your real hardware, welds x402 receipt hash into tamper-evident
WorkProof. Therent()→_pay_x402()pay-upfront path is wired and integration-tested (315-LOC end-to-end test; stub/devnet modes — live mainnet anchor pending). - DNA x402 payment bridge + wallet manager — simulated USDC→credit purchase path (1 USDC = 1000 credits) against a local SQLite wallet; settlement is simulated today (stub/devnet), live on-chain settlement coming. A local on-chain receipt verifier gates real settlement so reputation can't be self-claimed (built + tested).
- Directory-less peer discovery — Kademlia DHT routing table (k=20 buckets) + verified-endpoint liveness index; nodes find each other without a central directory (built + tested).
- Role-aware provider routing — local drone lanes vs synthesis lanes, local llama.cpp, vLLM, and Kimi lanes when configured.
- Capability-reporting API —
GET /api/runtime/capabilities— implemented / simulated / disabled, per feature. - CI — sharded local regression + GitHub Actions + fast LLM acceptance suite.
Built core, still partial (real implementations with tests, but thinner coverage):
- Credit DEX / order book — P2P, cheapest-first marketplace for compute credits (built core; thin).
- Sybil / collusion fraud detection — reputation graph + closed-loop collusion detection + score decay (built core, covered by
test_fraud_and_timeouts). - Encrypted P2P transport + NAT traversal — TLS streams plus STUN, hole-punching, and relay fallback (built core, covered by transport test suite; some helpers thin).
- Sandboxed helper-worker isolation — runs untrusted mesh jobs behind filesystem, network, and resource guards (built core).
After install, start the API and chat:
python3 -m apps.nulla_api_server # local API on :11435
python3 -m apps.nulla_agent --interactive
curl http://127.0.0.1:11435/api/runtime/capabilitiesFull install docs: docs/INSTALL.md
# Agent capability: NULLA tool loop vs Ollama single-shot
python -m tests.benchmarks.agent_capability_bench
# Memory compression: recall vs token budget at 30 / 100 / 200 turns
python -m tests.benchmarks.memory_compression_bench
python -m tests.benchmarks.memory_compression_bench --turns 100
python -m tests.benchmarks.memory_compression_bench --turns 200
# Provider comparison across 4 models × 4 task categories
python -m tests.benchmarks.nulla_vs_standardcore/— agent runtime, memory, tools, mesh, credits, compute, Hive, webcore/context_window.py— three-tier memory managercore/conversation_summarizer.py— structured LLM compressioncore/embedding_service.py— nomic-embed-text + hash-BoW fallbackcore/nulla_memory.py— SQLite-backed persistent memorycore/agent_runtime/— turn loop, fast paths, research loop
apps/— API server, CLI, agent entrypointstests/— regression coverage + benchmarksinstaller/— one-click setupdocs/— architecture, status, trust, runbooks
Full map: REPO_MAP.md
git clone https://github.com/Parad0x-Labs/nulla-hive-mind.git
cd nulla-hive-mind
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev,runtime]"
python3 -m apps.nulla_api_serverUseful entry points:
python3 -m apps.nulla_api_server # local API on :11435
python3 -m apps.nulla_agent --interactive
curl http://127.0.0.1:11435/api/runtime/capabilitiesProof path for skeptics: docs/PROOF_PATH.md
Architecture: docs/SYSTEM_SPINE.md · docs/CONTROL_PLANE.md · docs/STATUS.md
NULLA is alpha. The core runtime and memory system are real and working on main. Payments are simulated, WAN mesh is experimental, and live settlement is still hardening. GET /api/runtime/capabilities reports the current per-feature status at any moment.

