Skip to content

Parad0x-Labs/nulla-local

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

554 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

NULLA

Local-first AI agent runtime. Your machine. Your memory. Your mesh.

NULLA is a local-first agent runtime — runs on your hardware, remembers everything across sessions, uses tools to do real engineering work, and coordinates trusted helpers over a peer mesh when a task needs more reach. Nothing leaves your box unless you say so.

It's also a node in Web0 — the direction where tasks decompose, agents bid, compute gets rented, and work settles over the x402 payment rail.

Current state: Alpha — core runtime, memory, and tool loop working on main. Mesh economics hardening. See docs/STATUS.md.

License: MIT Status: Alpha Python: 3.10+ CI

Parad0x Labs

local NULLA agent → memory + tools → optional trusted helpers → mesh task market → results

What makes NULLA different

Tool-use agent loop — not prompt theater

NULLA runs a real agent loop: call LLM → parse tool intent → execute (read files, run tests, write code, search web) → feed result back → repeat until done. It doesn't hand you a one-shot guess and call it a day.

Benchmark on real engineering tasks (5 tasks requiring tool use):

Score Notes
NULLA (14b + tools + loop) 5/5 Iterates, fixes, verifies
Ollama 14b single-shot 4/5 Fails cross-file rename — no iteration

Tasks were specifically designed to be impossible without tool use: bugs only visible at runtime, multi-file changes that require reading before editing. The benchmark is in tests/benchmarks/agent_capability_bench.py — run it yourself.

Three-tier memory that actually works

Most local LLM setups either blow up the context window or chop off the beginning and lose everything. NULLA compresses without forgetting.

Memory benchmark (30-turn conversation, 5 facts planted early):

Mode Recall Peak tokens
Raw (no compression) 5/5 (100%) 528
Sliding window (10) 0/5 (0%) 362 (-31%)
NULLA ContextWindow 5/5 (100%) 335 (-36%)

Sliding window is the naive approach every other local stack uses. It cuts tokens by just forgetting everything old — including your passwords, deadlines, and API keys. NULLA cuts 36% of tokens and remembers everything.

The three tiers:

  • L1 — recent turns verbatim (always in context)
  • L2 — LLM-compressed structured summary of older turns (Key Facts / Decisions / Open Questions / Context — exact values preserved word-for-word)
  • L3 — semantic memory nodes in SQLite, retrieved by embedding similarity with nomic-embed-text

Smart retrieval: before injecting L3 nodes, NULLA checks whether the content is already covered in L2. No token bloat from re-injecting facts the summary already has.

Importance scoring

Every turn gets scored before being stored in L3:

password / API key  → 0.6–0.95
port / date         → 0.45–0.50
decision / deadline → 0.40–0.45
generic explanation → 0.20

High-importance turns are prioritised during retrieval. Your sk-prod-xxxx stays findable. "Can you explain async/await?" does not crowd it out.

Semantic search with real embeddings

Plugs into nomic-embed-text via Ollama (274MB, 768-dim). Falls back to a hash bag-of-words if not installed. The same embedding service backs L3 retrieval across sessions — ask something in session 2, get a relevant fact from session 1.

Capability reporting

GET /api/runtime/capabilities reports, per feature, whether it is implemented, simulated, or disabled — so payments show as simulated, WAN mesh as experimental, and live web lookup as opt-in and off in the local-only profile (enable it on a non-local-only profile). /healthz reports commit + dirty bit. The runtime surfaces its own status.


How this fits the Parad0x stack

Parad0x Labs builds Web0 on Solana — money and agents that settle themselves. You are here: 🧠 Local AI (the runtime that consumes every layer).

Layer Repo Does
💸 Payments dna-x402 x402 rail: quote → pay → verify → receipt → anchor
🛠️ Build dna-x402-builders Hosted kit: turn any API/bot into a paid agent
🕶️ Privacy Dark-Null-Protocol Groth16 privacy settlement, published proofs
🗜️ Data liquefy Columnar compression that beats Zstd
🛡️ Audit liquefy-openclaw-integration Flight recorder: 24 engines + Solana-anchored audit trails
🎬 Media nebula-media Proof-carrying media compression — scene-aware + on-chain receipts
🧠 Local AI nulla-local (this repo) Local-first agent runtime — your machine, your memory

See it live: parad0xlabs.com


Install

Bootstrap install script:

curl -fsSLo bootstrap_nulla.sh https://raw.githubusercontent.com/Parad0x-Labs/nulla-hive-mind/main/installer/bootstrap_nulla.sh
bash bootstrap_nulla.sh

Windows PowerShell:

Invoke-WebRequest https://raw.githubusercontent.com/Parad0x-Labs/nulla-hive-mind/main/installer/bootstrap_nulla.ps1 -OutFile bootstrap_nulla.ps1
powershell -ExecutionPolicy Bypass -File .\bootstrap_nulla.ps1

Profiles:

# Safest — smaller machines, zero remote dependency
bash bootstrap_nulla.sh --install-profile ollama-only

# Full local power — 24 GiB+ unified memory or equivalent
bash bootstrap_nulla.sh --install-profile ollama-max

# Max performance — Ollama + native llama.cpp (local_plus_llamacpp)
bash bootstrap_nulla.sh --install-profile local_plus_llamacpp

After install, set your profile:

cd ~/nulla-hive-mind && .venv/bin/python -m apps.nulla_cli install-profile --set ollama-only
cd ~/nulla-hive-mind && .venv/bin/python -m apps.nulla_cli install-profile --set ollama-max

Full install docs: docs/INSTALL.md

Platform support

OS Inference Job sandbox Launchers
macOS (Apple Silicon) Metal GPU via Ollama kernel-enforced (sandbox-exec) .command
Linux Ollama + native llama.cpp kernel-enforced (bwrap/unshare/firejail) .sh
Windows (native host) Ollama (CPU; consumer-GPU lane coming) static command guard only (no kernel backend) .bat / PowerShell
Windows + WSL2/Linux Ollama + native llama.cpp kernel-enforced (bwrap/unshare/firejail) .sh inside WSL2

Apple Silicon is the primary development target. Temp paths, signal handling, chat, three-tier memory, the OpenClaw UI bridge, local Ollama inference, and the workspace tools all run on a native Windows host today.

Full capability — kernel-enforced no-network job sandbox and live web lookup — wants WSL2/Linux plus a non-local-only profile:

  • Kernel sandbox: native Windows has no kernel network-namespace backend, so a no-network job fails closed by default. Run under WSL2/Linux for bwrap/unshare/firejail kernel enforcement, or set network_isolation_mode="heuristic_only" on a native host as an explicit, informed override (static command guard only, no kernel isolation).
  • Live web lookup: opt-in and OFF in the local-only profile. Enable it on a non-local-only profile (and/or NULLA_ENABLE_WEB=1 when not local-only). See Web Access.
  • Remote null:// dial: opt-in and OFF by default. A null:// request runs locally unless dial is enabled with NULLA_ENABLE_NULL_DIAL=1, at which point it can reach the named .null agent's x402 endpoint and return that agent's result. Payment is separately gated by --allow-spend within a cap. An SSRF guard rejects internal/loopback endpoints. See Remote dial.

What works right now

  • Agent loop — LLM → tool call → execute → iterate → done. Not a single-shot wrapper.
  • Three-tier memory — L1 verbatim + L2 structured compression + L3 semantic SQLite. 36% fewer tokens, 100% recall.
  • Embedding service — nomic-embed-text (768-dim) with hash-BoW fallback. Cross-session retrieval.
  • Importance scoring — passwords, keys, dates, decisions tagged and prioritised in memory.
  • Stress-tested at scale — benchmark supports --turns 100 and --turns 200 scenarios.
  • Persistent memory across sessions — NullaMemory SQLite backend.
  • Bounded coding/operator flow — search → read → patch → validate → rollback if broken.
  • Append-only task/proof spine — every repair and orchestration step is inspectable, not locked inside the executor.
  • Mesh task market — decompose → escrow → offer → claim → execute → review → reward. Ed25519-signed credit settlement. Single-node and loopback verified end-to-end.
  • 3-layer anti-cheat proof-of-work credits — challenge-response, staking, ZK-proof path. The stake-before-work / slash-on-cheat guard is built and self-tests green (wrong, late, and cheating workers are slashed).
  • Multi-result consensus validator — cross-validates worker answers, spawns a verification job on disagreement (built + tested).
  • Capability-token authorization — signed, scoped, single-use, expiring task tokens gate who may run what (built + tested).
  • Contribution-proof receipt chain + proof-of-execution + proof manifest — hash-canonical contribution receipts and a git-source proof manifest, distinct from the task/proof event spine (built + tested).
  • Compute-rental market — prices your real hardware, welds x402 receipt hash into tamper-evident WorkProof. The rent()_pay_x402() pay-upfront path is wired and integration-tested (315-LOC end-to-end test; stub/devnet modes — live mainnet anchor pending).
  • DNA x402 payment bridge + wallet manager — simulated USDC→credit purchase path (1 USDC = 1000 credits) against a local SQLite wallet; settlement is simulated today (stub/devnet), live on-chain settlement coming. A local on-chain receipt verifier gates real settlement so reputation can't be self-claimed (built + tested).
  • Directory-less peer discovery — Kademlia DHT routing table (k=20 buckets) + verified-endpoint liveness index; nodes find each other without a central directory (built + tested).
  • Role-aware provider routing — local drone lanes vs synthesis lanes, local llama.cpp, vLLM, and Kimi lanes when configured.
  • Capability-reporting APIGET /api/runtime/capabilities — implemented / simulated / disabled, per feature.
  • CI — sharded local regression + GitHub Actions + fast LLM acceptance suite.

Built core, still partial (real implementations with tests, but thinner coverage):

  • Credit DEX / order book — P2P, cheapest-first marketplace for compute credits (built core; thin).
  • Sybil / collusion fraud detection — reputation graph + closed-loop collusion detection + score decay (built core, covered by test_fraud_and_timeouts).
  • Encrypted P2P transport + NAT traversal — TLS streams plus STUN, hole-punching, and relay fallback (built core, covered by transport test suite; some helpers thin).
  • Sandboxed helper-worker isolation — runs untrusted mesh jobs behind filesystem, network, and resource guards (built core).

Try It

After install, start the API and chat:

python3 -m apps.nulla_api_server        # local API on :11435
python3 -m apps.nulla_agent --interactive
curl http://127.0.0.1:11435/api/runtime/capabilities

Full install docs: docs/INSTALL.md


Run the benchmarks

# Agent capability: NULLA tool loop vs Ollama single-shot
python -m tests.benchmarks.agent_capability_bench

# Memory compression: recall vs token budget at 30 / 100 / 200 turns
python -m tests.benchmarks.memory_compression_bench
python -m tests.benchmarks.memory_compression_bench --turns 100
python -m tests.benchmarks.memory_compression_bench --turns 200

# Provider comparison across 4 models × 4 task categories
python -m tests.benchmarks.nulla_vs_standard

Repo map

  • core/ — agent runtime, memory, tools, mesh, credits, compute, Hive, web
    • core/context_window.py — three-tier memory manager
    • core/conversation_summarizer.py — structured LLM compression
    • core/embedding_service.py — nomic-embed-text + hash-BoW fallback
    • core/nulla_memory.py — SQLite-backed persistent memory
    • core/agent_runtime/ — turn loop, fast paths, research loop
  • apps/ — API server, CLI, agent entrypoints
  • tests/ — regression coverage + benchmarks
  • installer/ — one-click setup
  • docs/ — architecture, status, trust, runbooks

Full map: REPO_MAP.md


For developers

git clone https://github.com/Parad0x-Labs/nulla-hive-mind.git
cd nulla-hive-mind
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev,runtime]"
python3 -m apps.nulla_api_server

Useful entry points:

python3 -m apps.nulla_api_server        # local API on :11435
python3 -m apps.nulla_agent --interactive
curl http://127.0.0.1:11435/api/runtime/capabilities

Proof path for skeptics: docs/PROOF_PATH.md

Architecture: docs/SYSTEM_SPINE.md · docs/CONTROL_PLANE.md · docs/STATUS.md


NULLA is alpha. The core runtime and memory system are real and working on main. Payments are simulated, WAN mesh is experimental, and live settlement is still hardening. GET /api/runtime/capabilities reports the current per-feature status at any moment.

NULLA — Parad0x Labs open source systems

Packages

 
 
 

Contributors

Languages