NULLA

Local-first AI agent runtime. Your machine. Your memory. Your mesh.

NULLA is a local-first agent runtime — runs on your hardware, remembers everything across sessions, uses tools to do real engineering work, and coordinates trusted helpers over a peer mesh when a task needs more reach. Nothing leaves your box unless you say so.

It's also a node in Web0 — the direction where tasks decompose, agents bid, compute gets rented, and work settles over the x402 payment rail.

Current state: Alpha — core runtime, memory, and tool loop working on main. Mesh economics hardening. See docs/STATUS.md.

local NULLA agent → memory + tools → optional trusted helpers → mesh task market → results

What makes NULLA different

Tool-use agent loop — not prompt theater

NULLA runs a real agent loop: call LLM → parse tool intent → execute (read files, run tests, write code, search web) → feed result back → repeat until done. It doesn't hand you a one-shot guess and call it a day.

Benchmark on real engineering tasks (5 tasks requiring tool use):

	Score	Notes
NULLA (14b + tools + loop)	5/5	Iterates, fixes, verifies
Ollama 14b single-shot	4/5	Fails cross-file rename — no iteration

Tasks were specifically designed to be impossible without tool use: bugs only visible at runtime, multi-file changes that require reading before editing. The benchmark is in tests/benchmarks/agent_capability_bench.py — run it yourself.

Three-tier memory that actually works

Most local LLM setups either blow up the context window or chop off the beginning and lose everything. NULLA compresses without forgetting.

Memory benchmark (30-turn conversation, 5 facts planted early):

Mode	Recall	Peak tokens
Raw (no compression)	5/5 (100%)	528
Sliding window (10)	0/5 (0%)	362 (-31%)
NULLA ContextWindow	5/5 (100%)	335 (-36%)

Sliding window is the naive approach every other local stack uses. It cuts tokens by just forgetting everything old — including your passwords, deadlines, and API keys. NULLA cuts 36% of tokens and remembers everything.

The three tiers:

L1 — recent turns verbatim (always in context)
L2 — LLM-compressed structured summary of older turns (Key Facts / Decisions / Open Questions / Context — exact values preserved word-for-word)
L3 — semantic memory nodes in SQLite, retrieved by embedding similarity with nomic-embed-text

Smart retrieval: before injecting L3 nodes, NULLA checks whether the content is already covered in L2. No token bloat from re-injecting facts the summary already has.

Importance scoring

Every turn gets scored before being stored in L3:

password / API key  → 0.6–0.95
port / date         → 0.45–0.50
decision / deadline → 0.40–0.45
generic explanation → 0.20

High-importance turns are prioritised during retrieval. Your sk-prod-xxxx stays findable. "Can you explain async/await?" does not crowd it out.

Semantic search with real embeddings

Plugs into nomic-embed-text via Ollama (274MB, 768-dim). Falls back to a hash bag-of-words if not installed. The same embedding service backs L3 retrieval across sessions — ask something in session 2, get a relevant fact from session 1.

Capability reporting

GET /api/runtime/capabilities reports, per feature, whether it is implemented, simulated, or disabled — so payments show as simulated, WAN mesh as experimental, and live web lookup as opt-in and off in the local-only profile (enable it on a non-local-only profile). /healthz reports commit + dirty bit. The runtime surfaces its own status.

How this fits the Parad0x stack

Parad0x Labs builds Web0 on Solana — money and agents that settle themselves. You are here: 🧠 Local AI (the runtime that consumes every layer).

Layer	Repo	Does
💸 Payments	dna-x402	x402 rail: quote → pay → verify → receipt → anchor
🛠️ Build	dna-x402-builders	Hosted kit: turn any API/bot into a paid agent
🕶️ Privacy	Dark-Null-Protocol	Groth16 privacy settlement, published proofs
🗜️ Data	liquefy	Columnar compression that beats Zstd
🛡️ Audit	liquefy-openclaw-integration	Flight recorder: 24 engines + Solana-anchored audit trails
🎬 Media	nebula-media	Proof-carrying media compression — scene-aware + on-chain receipts
🧠 Local AI	nulla-local (this repo)	Local-first agent runtime — your machine, your memory

See it live: parad0xlabs.com

Install

Bootstrap install script:

curl -fsSLo bootstrap_nulla.sh https://raw.githubusercontent.com/Parad0x-Labs/nulla-hive-mind/main/installer/bootstrap_nulla.sh
bash bootstrap_nulla.sh

Windows PowerShell:

Invoke-WebRequest https://raw.githubusercontent.com/Parad0x-Labs/nulla-hive-mind/main/installer/bootstrap_nulla.ps1 -OutFile bootstrap_nulla.ps1
powershell -ExecutionPolicy Bypass -File .\bootstrap_nulla.ps1

Profiles:

# Safest — smaller machines, zero remote dependency
bash bootstrap_nulla.sh --install-profile ollama-only

# Full local power — 24 GiB+ unified memory or equivalent
bash bootstrap_nulla.sh --install-profile ollama-max

# Max performance — Ollama + native llama.cpp (local_plus_llamacpp)
bash bootstrap_nulla.sh --install-profile local_plus_llamacpp

After install, set your profile:

cd ~/nulla-hive-mind && .venv/bin/python -m apps.nulla_cli install-profile --set ollama-only
cd ~/nulla-hive-mind && .venv/bin/python -m apps.nulla_cli install-profile --set ollama-max

Full install docs: docs/INSTALL.md

Platform support

OS	Inference	Job sandbox	Launchers
macOS (Apple Silicon)	Metal GPU via Ollama	kernel-enforced (`sandbox-exec`)	`.command`
Linux	Ollama + native llama.cpp	kernel-enforced (`bwrap`/`unshare`/`firejail`)	`.sh`
Windows (native host)	Ollama (CPU; consumer-GPU lane coming)	static command guard only (no kernel backend)	`.bat` / PowerShell
Windows + WSL2/Linux	Ollama + native llama.cpp	kernel-enforced (`bwrap`/`unshare`/`firejail`)	`.sh` inside WSL2

Apple Silicon is the primary development target. Temp paths, signal handling, chat, three-tier memory, the OpenClaw UI bridge, local Ollama inference, and the workspace tools all run on a native Windows host today.

Full capability — kernel-enforced no-network job sandbox and live web lookup — wants WSL2/Linux plus a non-local-only profile:

Kernel sandbox: native Windows has no kernel network-namespace backend, so a no-network job fails closed by default. Run under WSL2/Linux for bwrap/unshare/firejail kernel enforcement, or set network_isolation_mode="heuristic_only" on a native host as an explicit, informed override (static command guard only, no kernel isolation).
Live web lookup: opt-in and OFF in the local-only profile. Enable it on a non-local-only profile (and/or NULLA_ENABLE_WEB=1 when not local-only). See Web Access.
Remote null:// dial: opt-in and OFF by default. A null:// request runs locally unless dial is enabled with NULLA_ENABLE_NULL_DIAL=1, at which point it can reach the named .null agent's x402 endpoint and return that agent's result. Payment is separately gated by --allow-spend within a cap. An SSRF guard rejects internal/loopback endpoints. See Remote dial.

What works right now

Agent loop — LLM → tool call → execute → iterate → done. Not a single-shot wrapper.
Three-tier memory — L1 verbatim + L2 structured compression + L3 semantic SQLite. 36% fewer tokens, 100% recall.
Embedding service — nomic-embed-text (768-dim) with hash-BoW fallback. Cross-session retrieval.
Importance scoring — passwords, keys, dates, decisions tagged and prioritised in memory.
Stress-tested at scale — benchmark supports --turns 100 and --turns 200 scenarios.
Persistent memory across sessions — NullaMemory SQLite backend.
Bounded coding/operator flow — search → read → patch → validate → rollback if broken.
Append-only task/proof spine — every repair and orchestration step is inspectable, not locked inside the executor.
Mesh task market — decompose → escrow → offer → claim → execute → review → reward. Ed25519-signed credit settlement. Single-node and loopback verified end-to-end.
3-layer anti-cheat proof-of-work credits — challenge-response, staking, ZK-proof path. The stake-before-work / slash-on-cheat guard is built and self-tests green (wrong, late, and cheating workers are slashed).
Multi-result consensus validator — cross-validates worker answers, spawns a verification job on disagreement (built + tested).
Capability-token authorization — signed, scoped, single-use, expiring task tokens gate who may run what (built + tested).
Contribution-proof receipt chain + proof-of-execution + proof manifest — hash-canonical contribution receipts and a git-source proof manifest, distinct from the task/proof event spine (built + tested).
Compute-rental market — prices your real hardware, welds x402 receipt hash into tamper-evident WorkProof. The rent() → _pay_x402() pay-upfront path is wired and integration-tested (315-LOC end-to-end test; stub/devnet modes — live mainnet anchor pending).
DNA x402 payment bridge + wallet manager — simulated USDC→credit purchase path (1 USDC = 1000 credits) against a local SQLite wallet; settlement is simulated today (stub/devnet), live on-chain settlement coming. A local on-chain receipt verifier gates real settlement so reputation can't be self-claimed (built + tested).
Directory-less peer discovery — Kademlia DHT routing table (k=20 buckets) + verified-endpoint liveness index; nodes find each other without a central directory (built + tested).
Role-aware provider routing — local drone lanes vs synthesis lanes, local llama.cpp, vLLM, and Kimi lanes when configured.
Capability-reporting API — GET /api/runtime/capabilities — implemented / simulated / disabled, per feature.
CI — sharded local regression + GitHub Actions + fast LLM acceptance suite.

Built core, still partial (real implementations with tests, but thinner coverage):

Credit DEX / order book — P2P, cheapest-first marketplace for compute credits (built core; thin).
Sybil / collusion fraud detection — reputation graph + closed-loop collusion detection + score decay (built core, covered by test_fraud_and_timeouts).
Encrypted P2P transport + NAT traversal — TLS streams plus STUN, hole-punching, and relay fallback (built core, covered by transport test suite; some helpers thin).
Sandboxed helper-worker isolation — runs untrusted mesh jobs behind filesystem, network, and resource guards (built core).

Try It

After install, start the API and chat:

python3 -m apps.nulla_api_server        # local API on :11435
python3 -m apps.nulla_agent --interactive
curl http://127.0.0.1:11435/api/runtime/capabilities

Full install docs: docs/INSTALL.md

Run the benchmarks

# Agent capability: NULLA tool loop vs Ollama single-shot
python -m tests.benchmarks.agent_capability_bench

# Memory compression: recall vs token budget at 30 / 100 / 200 turns
python -m tests.benchmarks.memory_compression_bench
python -m tests.benchmarks.memory_compression_bench --turns 100
python -m tests.benchmarks.memory_compression_bench --turns 200

# Provider comparison across 4 models × 4 task categories
python -m tests.benchmarks.nulla_vs_standard

Repo map

core/ — agent runtime, memory, tools, mesh, credits, compute, Hive, web
- core/context_window.py — three-tier memory manager
- core/conversation_summarizer.py — structured LLM compression
- core/embedding_service.py — nomic-embed-text + hash-BoW fallback
- core/nulla_memory.py — SQLite-backed persistent memory
- core/agent_runtime/ — turn loop, fast paths, research loop
apps/ — API server, CLI, agent entrypoints
tests/ — regression coverage + benchmarks
installer/ — one-click setup
docs/ — architecture, status, trust, runbooks

Full map: REPO_MAP.md

For developers

git clone https://github.com/Parad0x-Labs/nulla-hive-mind.git
cd nulla-hive-mind
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev,runtime]"
python3 -m apps.nulla_api_server

Useful entry points:

python3 -m apps.nulla_api_server        # local API on :11435
python3 -m apps.nulla_agent --interactive
curl http://127.0.0.1:11435/api/runtime/capabilities

Proof path for skeptics: docs/PROOF_PATH.md

Architecture: docs/SYSTEM_SPINE.md · docs/CONTROL_PLANE.md · docs/STATUS.md

NULLA is alpha. The core runtime and memory system are real and working on main. Payments are simulated, WAN mesh is experimental, and live settlement is still hardening. GET /api/runtime/capabilities reports the current per-feature status at any moment.

Name		Name	Last commit message	Last commit date
Latest commit History 554 Commits
.github		.github
Beta2_Website		Beta2_Website
LICENSES		LICENSES
adapters		adapters
apps		apps
artifacts/ui		artifacts/ui
bootstrap		bootstrap
channels		channels
config		config
core		core
data/trainable_models/sshleifer-tiny-gpt2		data/trainable_models/sshleifer-tiny-gpt2
docs		docs
infra/searxng		infra/searxng
installer		installer
network		network
ops		ops
proofs		proofs
relay		relay
reports		reports
retrieval		retrieval
sandbox		sandbox
scripts		scripts
skills/nulla-hive-mind		skills/nulla-hive-mind
storage		storage
test_data		test_data
tests		tests
third_party		third_party
tools		tools
.dockerignore		.dockerignore
.gitattributes		.gitattributes
.gitignore		.gitignore
AGENT_HANDOVER.md		AGENT_HANDOVER.md
CONTRIBUTING.md		CONTRIBUTING.md
Dockerfile		Dockerfile
Install_And_Run_NULLA.bat		Install_And_Run_NULLA.bat
Install_And_Run_NULLA.command		Install_And_Run_NULLA.command
Install_And_Run_NULLA.sh		Install_And_Run_NULLA.sh
Install_NULLA.bat		Install_NULLA.bat
Install_NULLA.command		Install_NULLA.command
Install_NULLA.sh		Install_NULLA.sh
LICENSE		LICENSE
NULLA_STARTER_KIT.md		NULLA_STARTER_KIT.md
OpenClaw_NULLA.bat		OpenClaw_NULLA.bat
Probe_NULLA_Stack.bat		Probe_NULLA_Stack.bat
Probe_NULLA_Stack.sh		Probe_NULLA_Stack.sh
README.md		README.md
REPO_MAP.md		REPO_MAP.md
SECURITY.md		SECURITY.md
Stage_Trainable_Base.bat		Stage_Trainable_Base.bat
Stage_Trainable_Base.command		Stage_Trainable_Base.command
Stage_Trainable_Base.sh		Stage_Trainable_Base.sh
Start_NULLA.bat		Start_NULLA.bat
Talk_To_NULLA.bat		Talk_To_NULLA.bat
conftest.py		conftest.py
docker-compose.yml		docker-compose.yml
nulla_background.vbs		nulla_background.vbs
pyproject.toml		pyproject.toml
requirements-runtime.txt		requirements-runtime.txt
requirements.txt		requirements.txt
run.bat		run.bat
run.sh		run.sh
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

NULLA

What makes NULLA different

Tool-use agent loop — not prompt theater

Three-tier memory that actually works

Importance scoring

Semantic search with real embeddings

Capability reporting

How this fits the Parad0x stack

Install

Platform support

What works right now

Try It

Run the benchmarks

Repo map

For developers

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

NULLA

What makes NULLA different

Tool-use agent loop — not prompt theater

Three-tier memory that actually works

Importance scoring

Semantic search with real embeddings

Capability reporting

How this fits the Parad0x stack

Install

Platform support

What works right now

Try It

Run the benchmarks

Repo map

For developers

About

Topics

Resources

License

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages