Development
FIH Blackboard: Universal Interface for Multi-Agent Coordination
Abstract
neXus implements the FIH (Fact / Intent / Hint) Blackboard paradigm. The core uses a DualStorage composition of hot (core storage engine, layered coordinate model) and cold (DuckDB or CompositeColdStorage, Parquet-backed) backends, with optional Cypher queries and platform bindings for Cloudflare Workers, native servers, and (as a placeholder) blockchain targets. The Blackboard is a shared multi-modal storage space (three-tier record types across a temporal dimension) that every module reads from and writes to through a single interface. The hot storage is a native Rust coordinate-based index with a temporal ordering layer; PetgraphStorage is an optional 2D projection for legacy graph queries. Storage backends implement fine-grained capability traits instead of a monolithic Storage interface.
Executive Summary
neXus is a modular research infrastructure built on the FIH Blackboard paradigm. Every module reads from and writes to a shared layered coordinate model through exactly three primitives: Fact (validated result), Intent (exploration direction), and Hint (governance rule). There is no fixed pipeline and no direct module-to-module communication. The proven lifecycle — submit, claim, heartbeat, conclude — governs all interaction.
neXus models a self-describing paradigm: the Blackboard is the Scheme, Intents and Hints are the Field, and the act of reading, computing, and writing back is the Observation. The core compiles to WASM (edge) and native (server) from a single codebase. Contract.nex is the single governance surface.
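The three primitives and the submit, claim, heartbeat, conclude lifecycle can be sketched as a small state machine. This is a minimal illustration, not the neXus data model: the type names (`Primitive`, `Stage`, `Work`), the field layout, and the assumption that concluding a claimed Intent yields a Fact are all hypothetical.

```rust
/// The three Blackboard primitives. Field names are illustrative assumptions.
#[derive(Debug, Clone, PartialEq)]
pub enum Primitive {
    Fact { origin: String, body: String },
    Intent { goal: String },
    Hint { rule: String },
}

/// Hypothetical lifecycle stages for one unit of work.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Stage {
    Submitted,
    Claimed { last_beat: u64 },
    Concluded,
}

#[derive(Debug, PartialEq)]
pub enum LifecycleError {
    NotClaimable,
    NotClaimed,
}

pub struct Work {
    pub intent: Primitive,
    pub stage: Stage,
}

impl Work {
    /// submit: a new Intent enters the Blackboard.
    pub fn submit(goal: &str) -> Self {
        Work {
            intent: Primitive::Intent { goal: goal.to_string() },
            stage: Stage::Submitted,
        }
    }

    /// claim: only a submitted item can be claimed.
    pub fn claim(&mut self, now: u64) -> Result<(), LifecycleError> {
        match self.stage {
            Stage::Submitted => {
                self.stage = Stage::Claimed { last_beat: now };
                Ok(())
            }
            _ => Err(LifecycleError::NotClaimable),
        }
    }

    /// heartbeat: refreshes the claim timestamp while work is in progress.
    pub fn heartbeat(&mut self, now: u64) -> Result<(), LifecycleError> {
        match self.stage {
            Stage::Claimed { .. } => {
                self.stage = Stage::Claimed { last_beat: now };
                Ok(())
            }
            _ => Err(LifecycleError::NotClaimed),
        }
    }

    /// conclude: the claimed work ends by producing a Fact (an assumption here).
    pub fn conclude(&mut self, origin: &str, body: &str) -> Result<Primitive, LifecycleError> {
        match self.stage {
            Stage::Claimed { .. } => {
                self.stage = Stage::Concluded;
                Ok(Primitive::Fact { origin: origin.to_string(), body: body.to_string() })
            }
            _ => Err(LifecycleError::NotClaimed),
        }
    }
}
```

Encoding the stages as an enum means an out-of-order call (a heartbeat before a claim, or a second claim) is rejected by the type's own transition rules rather than by caller discipline.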
Unified Architecture
The architecture has three layers. The Core Blackboard is the logical layer: a shared multi-modal storage space of Fact, Intent, and Hint primitives that every module reads from and writes to through a single interface. The storage architecture follows the FIH paradigm: the Blackboard holds a DualStorage composition of hot (core storage runtime with coordinate-based indexing and temporal ordering) and cold (DuckDB or CompositeColdStorage, Parquet-backed) backends. PetgraphStorage is an optional 2D projection for legacy graph queries. cyrs translates Cypher queries against the 2D projection. Capability traits replace the monolithic Storage trait, letting each backend implement only the interfaces it provides. Platform bindings expose the core to different deployment targets.
A third storage layer sits alongside the identity-based stores: a plug-in semantic similarity index. Where an identity store provides key-value access by record identifier, a semantic store provides retrieval by meaning. It follows the exact same USB-hub pattern as every other storage backend: the core defines only a thin trait, and external crates provide implementations. The core itself never references any specific methodology such as “vector”; it provides only a lookup handle that each external implementation uses to retrieve exactly the data it needs (feature vectors, raw text, origin strings, etc.). This lets radically different retrieval strategies — HNSW vector search, BM25 string matching, ngram fuzzy search, LLM reranking — all plug into the same index slot without any core changes.
Multi-Dimensional Blackboard Composition
The Blackboard is a layered coordinate model where the three primary storage domains and a temporal dimension together form a multi-dimensional record space. Relations between primitives are not edges but coordinate differences (distances between points in the coordinate space).
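One way to picture a relation as a coordinate difference rather than a stored edge is the sketch below. The coordinate layout (`domain`, `slot`, `time`) and the `Delta` type are assumptions made for illustration; the actual coordinate scheme is not specified in this document.

```rust
/// The three primary storage domains.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Domain {
    Fact,
    Intent,
    Hint,
}

/// Hypothetical coordinate of one record: a domain, a slot within that
/// domain, and a position on the temporal axis.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Coord {
    pub domain: Domain,
    pub slot: u64,
    pub time: u64,
}

/// A relation expressed as the difference between two coordinates.
#[derive(Debug, PartialEq, Eq)]
pub struct Delta {
    pub crosses_domain: bool,
    pub slot_gap: u64,
    pub time_gap: i64,
}

/// Computes the relation on demand; nothing is stored as an edge.
pub fn delta(from: Coord, to: Coord) -> Delta {
    Delta {
        crosses_domain: from.domain != to.domain,
        slot_gap: from.slot.abs_diff(to.slot),
        time_gap: to.time as i64 - from.time as i64,
    }
}
```

Because the relation is derived from the two points, adding a record never requires updating adjacency data for its neighbours.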
A Blackboard can contain other Blackboards. A Fact at Dimension N becomes a Scheme at Dimension N+1. An Observation at Dimension N+1 becomes a Hint at Dimension N. This mirrors the bootstrap recursion.
| Dimension | Role | Content |
|---|---|---|
| Infrastructure | DualStorage (HotStorage + ColdStorage) + SemanticStore, capability traits | Pluggable storage backends; each dimension can select its own backend composition, plus a semantic similarity index |
| Domain | Fact / Intent / Hint nodes | Research documents, experiment results, governance rules |
| Meta | Blackboard composition rules | Rules for how Facts at Domain become Schemes at Meta |
| Research | Gaps, hypotheses, validations | Domain-specific knowledge exploration |
Storage backends are swappable per dimension via capability traits. A dimension requiring filtered reads uses a ColdStorage backend; a dimension requiring low-latency access uses a HotStorage backend. Each dimension can compose hot and cold backends independently. The Queue layer enables stigmergic visibility across dimensions. A write at any dimension is visible to all dimensions. The type conversion (Fact to Scheme, Observation to Hint) happens at the Queue consumer, not at the write stage.
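The conversion at the Queue consumer can be sketched as a pure function over queued entries. The `Record` and `Entry` types and the numeric `dimension` field are hypothetical stand-ins; only the two conversion rules (Fact at N to Scheme at N+1, Observation at N+1 to Hint at N) come from the text above.

```rust
/// Hypothetical record kinds as they appear on the queue.
#[derive(Debug, Clone, PartialEq)]
pub enum Record {
    Fact(String),
    Observation(String),
    Scheme(String),
    Hint(String),
}

/// A queued write tagged with the dimension it was written at.
#[derive(Debug, Clone, PartialEq)]
pub struct Entry {
    pub dimension: u32,
    pub record: Record,
}

/// Runs at the queue consumer, not at the write stage: the original entry
/// is stored unchanged, and the consumer derives the cross-dimension view.
pub fn consume(entry: &Entry) -> Option<Entry> {
    match &entry.record {
        // A Fact at dimension N becomes a Scheme at dimension N + 1.
        Record::Fact(body) => Some(Entry {
            dimension: entry.dimension + 1,
            record: Record::Scheme(body.clone()),
        }),
        // An Observation at dimension N + 1 becomes a Hint at dimension N.
        Record::Observation(body) if entry.dimension > 0 => Some(Entry {
            dimension: entry.dimension - 1,
            record: Record::Hint(body.clone()),
        }),
        // Schemes, Hints, and dimension-0 Observations are not converted.
        _ => None,
    }
}
```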
Design Decisions
Why WebAssembly
The decision to compile the core to WASM is not about web browsers. It is about a set of constraints that WASM uniquely satisfies and that together form a non-negotiable foundation.
Sandbox isolation. Every Agent, every projector, and every verification module runs in its own WASM instance. Each instance has its own linear memory, and the runtime bounds-checks every access against it. A compromised projector cannot read another projector’s Fact cache. This is not configurable discipline; it is a structural impossibility.
Deterministic execution. The WASM specification confines nondeterminism to a small, well-defined set of cases (NaN bit patterns, resource exhaustion, host-provided imports, and shared-memory threads). Once those cases are controlled, the same bytecode with the same input produces the same output across hosts and platforms. A vIP asset’s value is that its verification result is reproducible by any party. WASM determinism is the mechanism that makes reproducibility a mathematical guarantee rather than a procedural hope.
Cold start under 1 millisecond. The OODA loop spawns and despawns agents on every tick. An agent that takes 100ms to start cannot participate in sub-second iteration. WASM instances start in microseconds and consume megabytes, not gigabytes. Docker is not an alternative here — it is a different category of tool for a different problem.
Portable binary. The same compiled WASM module runs on Cloudflare Workers (edge), AWS Lambda (Graviton), a researcher’s laptop (ARM), and a future on-chain contract environment (WASM blockchain host). The core never needs recompilation for a new deployment target. Platform bindings are thin adapters, not architectural commitments.
Double sandbox. WASM isolates execution at the machine level. Field isolates operation at the paradigm level. An agent that escapes its WASM sandbox still cannot execute a constraint combination that its Field does not permit. An agent whose Field permits a combination still cannot read host memory. The two sandboxes are orthogonal and cumulative. Forging a verification proof requires breaking both.
Together, these properties are not conveniences. They are the minimal set of constraints that make a cross-reality verification economy possible. No other runtime satisfies all of them.
SemanticStore: The Flashlight Pattern
Alongside identity-based stores (record identifiers to values, temporal ordering), a third storage pattern exists: a plug-in semantic similarity index. Where an identity store maps record identifiers to values, a semantic store maps semantic features to record identifiers for similarity retrieval.
The core design follows the same USB-hub pattern as every other storage backend. The core defines only a thin trait; external crates provide implementations. The trait itself uses a flashlight pattern: instead of receiving fixed-format data (such as numeric vectors), each method receives a lookup handle. The implementation uses this handle to request exactly the data it needs.
The lookup handle exposes accessors for content bytes, decoded text, numeric feature vectors, origin strings, and creator strings. Each implementation calls only the accessors it actually requires. A vector-based store requests feature vectors; a text-similarity store requests raw text; an ngram store requests origin strings. The core never determines which methodology is in use.
The coordinate index exposes a semantic slot alongside its existing indexes (origin, time, status). Each semantic backend plugs into this slot without any changes to the core index logic.
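A minimal sketch of the flashlight pattern follows. `FihLoad` and `SemanticStore` are the names used in this document, but every method signature here is an assumption, and `TokenOverlapStore` and `Record` are hypothetical stand-ins for an external backend and a core record.

```rust
use std::collections::{HashMap, HashSet};

pub type RecordId = u64;

/// Lookup handle. The accessor list follows the text above; the method
/// names and return types are assumptions.
pub trait FihLoad {
    fn content(&self) -> &[u8];
    fn text(&self) -> Option<&str>;
    fn features(&self) -> Option<&[f32]>;
    fn origin(&self) -> &str;
    fn creator(&self) -> &str;
}

/// The thin trait the core would define. It never names a methodology.
pub trait SemanticStore {
    fn insert(&mut self, id: RecordId, load: &dyn FihLoad);
    fn search(&self, probe: &dyn FihLoad, limit: usize) -> Vec<RecordId>;
}

/// A hypothetical external backend that only ever asks for decoded text.
#[derive(Default)]
pub struct TokenOverlapStore {
    tokens: HashMap<RecordId, HashSet<String>>,
}

fn tokenize(load: &dyn FihLoad) -> HashSet<String> {
    load.text()
        .unwrap_or("")
        .split_whitespace()
        .map(|t| t.to_lowercase())
        .collect()
}

impl SemanticStore for TokenOverlapStore {
    fn insert(&mut self, id: RecordId, load: &dyn FihLoad) {
        self.tokens.insert(id, tokenize(load));
    }

    fn search(&self, probe: &dyn FihLoad, limit: usize) -> Vec<RecordId> {
        let query = tokenize(probe);
        let mut scored: Vec<(usize, RecordId)> = self
            .tokens
            .iter()
            .map(|(id, toks)| (toks.intersection(&query).count(), *id))
            .filter(|(score, _)| *score > 0)
            .collect();
        // Highest overlap first; ties broken by lower id for stable output.
        scored.sort_by(|a, b| b.0.cmp(&a.0).then(a.1.cmp(&b.1)));
        scored.into_iter().take(limit).map(|(_, id)| id).collect()
    }
}

/// A plain record standing in for whatever the core hands out.
pub struct Record {
    pub body: String,
    pub origin: String,
    pub creator: String,
}

impl FihLoad for Record {
    fn content(&self) -> &[u8] { self.body.as_bytes() }
    fn text(&self) -> Option<&str> { Some(&self.body) }
    fn features(&self) -> Option<&[f32]> { None }
    fn origin(&self) -> &str { &self.origin }
    fn creator(&self) -> &str { &self.creator }
}
```

A vector-based backend would implement the same two methods but call `features()` instead of `text()`, so swapping methodologies touches no code on the core side of the trait.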
Execution Unit Model: nex Is Not a Library
nex is not a general-purpose library. Each storage instance is an execution unit: a single-threaded, self-contained runtime that owns its memory state and I/O channel exclusively. There is no shared mutable state between instances. No internal thread pool. No locking primitives in the hot path.
This design arises not from any specific platform constraint but from the nature of distributed blackboard coordination. Instances communicate exclusively through the FIH protocol — writing facts, intents, and hints to a shared external storage layer (object store, filesystem, network). They never share internal indices or entity stores with another instance. Coordination is stigmergic, not direct.
Scaling happens through physical instance replication, not internal sharding:
coordinator (process manager)
├── storage instance A (single thread, independent I/O channel)
├── storage instance B (single thread, independent I/O channel)
├── storage instance C (single thread, independent I/O channel)
└── ...
│
└── Shared blackboard (external storage / FIH protocol)
Each instance is an atomic unit. Adding an instance adds capacity linearly. Failure of one instance does not affect others. Every instance can run in a sandboxed environment, a native process, or a lightweight VM — the binary and the execution contract are identical.
Interior Mutability Without Locking
Internal mutability uses runtime borrow-checked cells, not OS locking primitives. This is not a platform concession. It is the simplest correct implementation for a single-owner execution unit. These cells are cheap at runtime: each access checks a non-atomic borrow flag, with no atomic operations and no OS futex. If an external caller needs thread-safe access, it wraps the instance in a standard synchronization primitive externally. That is an external composition, not an internal requirement.
Internal locking would be a design error. It would introduce contention, deadlock risk, and complexity — all for the false promise of internal parallelism. These instances do not parallelize internally; they replicate externally.
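The split between internal borrow-checked cells and external synchronization can be sketched with `RefCell` inside the instance and a `Mutex` added only by a caller that needs cross-thread access. The `Instance` type and its methods are hypothetical; the standard library types are used as documented.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// A single-owner execution unit: interior mutability through RefCell,
/// with no lock anywhere inside the instance.
#[derive(Default)]
pub struct Instance {
    entries: RefCell<HashMap<u64, String>>,
}

impl Instance {
    pub fn put(&self, id: u64, value: &str) {
        // borrow_mut panics on a conflicting borrow instead of blocking.
        self.entries.borrow_mut().insert(id, value.to_string());
    }

    pub fn get(&self, id: u64) -> Option<String> {
        self.entries.borrow().get(&id).cloned()
    }

    pub fn len(&self) -> usize {
        self.entries.borrow().len()
    }
}

/// External composition: the caller, not the instance, adds the Mutex.
/// Instance is Send but not Sync, so Mutex<Instance> is what makes sharing legal.
pub fn fill_from_threads(workers: u64) -> usize {
    let shared = Arc::new(Mutex::new(Instance::default()));
    let handles: Vec<_> = (0..workers)
        .map(|n| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                shared.lock().unwrap().put(n, "fact");
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let count = shared.lock().unwrap().len();
    count
}
```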
Async-Only Storage Interface
Storage instances do not expose synchronous blocking interfaces. All public storage methods are asynchronous. This is not negotiable:
- Storage is inherently I/O-bound. Blocking on I/O in a single-threaded execution unit stalls all pending operations.
- A synchronous interface on a single-threaded storage engine would be a lie: the implementation uses runtime borrow-checked cells (not thread-safe) and relies on cooperative multitasking (not preemptive scheduling).
- Sync callers use an async runner (`block_on`) externally. They do not require sync trait implementations on the storage engine itself.
Async-first is not motivated by any specific runtime. That some edge deployments happen to require async is coincidental. The reason is that storage is asynchronous, and this is a single-threaded storage execution unit.
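The points above can be sketched with the standard library alone: an engine whose public methods are all `async`, and a small `block_on` runner that lives outside the engine. The `Engine` type and its method names are hypothetical, and the runner is a deliberately minimal one built on `std::task::Wake`, not a replacement for a production executor.

```rust
use std::cell::RefCell;
use std::collections::HashMap;
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Storage engine exposing only async methods (names are hypothetical).
#[derive(Default)]
pub struct Engine {
    entries: RefCell<HashMap<u64, String>>,
}

impl Engine {
    pub async fn put(&self, id: u64, value: String) {
        self.entries.borrow_mut().insert(id, value);
    }

    pub async fn get(&self, id: u64) -> Option<String> {
        self.entries.borrow().get(&id).cloned()
    }
}

/// Wakes the blocked thread when the future can make progress.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal async runner kept outside the engine.
pub fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = pin!(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}

/// A synchronous caller composes the two; the engine gains no sync API.
pub fn roundtrip(engine: &Engine, id: u64, value: &str) -> Option<String> {
    block_on(async {
        engine.put(id, value.to_string()).await;
        engine.get(id).await
    })
}
```

Note that neither `RefCell` borrow is held across an `.await`, which is what keeps cooperative scheduling safe in a single-threaded unit.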
No Global Mutable State
A storage instance carries no global mutable state except fixed constants. Every resource (I/O handle, index, buffer) is owned by the instance. This guarantees that:
- Two instances never accidentally share state.
- Spawning a new instance is purely a construction operation with no global side effects.
- Each instance can be sandboxed independently (linear memory, container, process).
Storage IO Design
Storage is inherently asynchronous. At the hardware level, every I/O operation involves pipelining, interrupts, or completion queues — whether it is a DRAM read, a DMA transfer, an NVMe submission queue entry, or a network packet. Synchronous I/O is a programmer convenience abstraction layered on top of fundamentally async hardware.
Because of this, nex makes async the design center rather than an adapter bolted on afterward. The IO trait is async at the trait level. This is not an adapter layer: async is the primitive, and sync is the extension.
This design aligns naturally with every target platform. Object stores support await directly with no blocking needed. On native runtimes, the pattern is spawn plus await on async filesystem and network operations. Each platform uses the same async trait; only the executor changes.
All writes go through a pending buffer and are committed in a single batch call. Rather than each operation traversing the I/O boundary independently, individual calls are amortized across the flush cycle. The caller controls durability by choosing when to flush.
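The pending-buffer behaviour can be sketched as follows. `Backend` and `BufferedWriter` are hypothetical names, and the sketch is synchronous for brevity even though the real IO trait is described as async; the point is only that many writes cross the I/O boundary in one batch call.

```rust
/// Hypothetical backend that counts how often the I/O boundary is crossed.
#[derive(Default)]
pub struct Backend {
    pub committed: Vec<(u64, String)>,
    pub batch_calls: usize,
}

impl Backend {
    fn commit_batch(&mut self, batch: Vec<(u64, String)>) {
        self.batch_calls += 1;
        self.committed.extend(batch);
    }
}

#[derive(Default)]
pub struct BufferedWriter {
    pending: Vec<(u64, String)>,
    pub backend: Backend,
}

impl BufferedWriter {
    /// A write only touches the in-memory pending buffer.
    pub fn write(&mut self, id: u64, value: &str) {
        self.pending.push((id, value.to_string()));
    }

    /// The caller decides when data becomes durable.
    /// Returns the number of writes committed by this flush.
    pub fn flush(&mut self) -> usize {
        if self.pending.is_empty() {
            return 0;
        }
        let batch = std::mem::take(&mut self.pending);
        let count = batch.len();
        self.backend.commit_batch(batch);
        count
    }

    pub fn pending_len(&self) -> usize {
        self.pending.len()
    }
}
```

Three writes followed by one `flush()` produce a single `commit_batch` call, and a second `flush()` with nothing pending does not touch the backend at all.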
Optional Sync Wrapper (Native Only)
For consumers that require a synchronous interface on native platforms, a blocking wrapper exists that uses an async runner internally. It is not the recommended interface for new code.
Why Layered Coordinate Model, Not Docker Containers
The initial neXus architecture used Docker containers (Memgraph + proxy + LightRAG) orchestrated by shell scripts. The transition to a native codebase eliminated runtime dependencies and revealed that the underlying model is not a 2D graph but a multi-dimensional coordinate model:
| Reason | Impact |
|---|---|
| Single codebase | cargo build produces WASM (CF Worker) or native binary (server) from same source |
| Zero runtime deps | No Docker, no Python, no bolt proxy, wrangler deploy only |
| Multi-dimensional model | FIH is not a 2D edge graph. It operates along multiple independent record domains where facts, intents, and hints each occupy their own coordinate set, with temporal ordering as an additional axis |
| Relations as coordinate differences | Relations between primitives are not graph edges but distances between points in the coordinate space |
| 2D projection optional | Traditional knowledge graphs are 2D node-edge models. FIH can project into 2D for interop, but the native representation is multi-dimensional |
This is why the WASM build of petgraph failed: a 2D graph library was being forced to model a multi-dimensional structure. The core storage runtime operates directly in the native coordinate model.
Why Cypher
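We chose Cypher because, in our assessment, it is the only graph query language with sufficient LLM training data for reliable code generation. Our cypher/ crate uses cyrs to parse Cypher into a typed Plan IR and then translates that plan into petgraph traversals. Because the Plan IR decouples input languages from execution, other query languages can be supported through the same path.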
Cypher is optional syntactic sugar over the native coordinate API (FilterCapable + temporal ordering layer + from_facts). The layered coordinate model query interface is the primary access method; Cypher translation is a secondary concern. Consumers that only need native FIH queries do not depend on the cypher crate.
Graph Storage Approaches (Reference Implementations)
The FIH Blackboard can be backed by multiple storage approaches. The primary implementation is the native layered coordinate model; other approaches exist for legacy compatibility.
| Approach | Storage | Adoption |
|---|---|---|
| Core storage runtime | Coordinate-based index with temporal ordering (layered coordinate model, native) | Primary (Phase 3, #86) |
| SemanticStore | Plug-in semantic similarity index via FihLoad flashlight pattern | Adopted (trait in core, implementations external) |
| PetgraphStorage | Petgraph (2D graph projection, optional) | Legacy compatibility, optional |
| DuckDB / Parquet | Cold storage for analytical queries | Adopted (cold backend) |
| Memgraph | In-memory LPG + RocksDB WAL | Patterns extracted (supplementary) |
The core storage runtime implements the layered coordinate model directly with coordinate-based indexing for O(1) lookups and a temporal ordering layer for time-range queries. It replaces PetgraphStorage as the default hot storage. PetgraphStorage remains as an optional 2D projection for legacy graph queries (community detection, PageRank, shortest path) that operate on a 2D slice of the coordinate space. DualStorage composes the core storage runtime (hot) with DuckDB (cold) as the default configuration; PetgraphStorage can be composed as an additional 2D view.
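The combination of identifier lookup and time-range queries can be sketched with two standard collections: a `HashMap` for expected O(1) access by identifier and a `BTreeSet` keyed by `(time, id)` as the temporal ordering layer. `HotIndex` and `Entry` are hypothetical names; this is not the core storage runtime itself.

```rust
use std::collections::{BTreeSet, HashMap};

#[derive(Debug, Clone, PartialEq)]
pub struct Entry {
    pub id: u64,
    pub time: u64,
    pub body: String,
}

#[derive(Default)]
pub struct HotIndex {
    /// Identifier lookup: expected O(1).
    by_id: HashMap<u64, Entry>,
    /// Temporal ordering layer: (time, id) pairs kept sorted.
    by_time: BTreeSet<(u64, u64)>,
}

impl HotIndex {
    pub fn insert(&mut self, entry: Entry) {
        // On overwrite, drop the stale position on the time axis first.
        if let Some(old) = self.by_id.insert(entry.id, entry.clone()) {
            self.by_time.remove(&(old.time, old.id));
        }
        self.by_time.insert((entry.time, entry.id));
    }

    pub fn get(&self, id: u64) -> Option<&Entry> {
        self.by_id.get(&id)
    }

    /// Identifiers with from <= time < to, in ascending time order.
    pub fn range(&self, from: u64, to: u64) -> Vec<u64> {
        if from >= to {
            return Vec::new();
        }
        self.by_time
            .range((from, u64::MIN)..(to, u64::MIN))
            .map(|&(_, id)| id)
            .collect()
    }
}
```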
The supplementary Memgraph analysis informed several specific capabilities. The table below maps each Memgraph pattern to our adaptation for traceability.
Memgraph Pattern Mapping
| Capability | Memgraph Approach | Our Adaptation |
|---|---|---|
| Graph storage | In-memory LPG + RocksDB WAL | Core storage runtime (layered coordinate model) + duckdb/Parquet |
| Vector search | USearch (C++), Single Store Vector Index | ndarray cosine (optional, on 2D projection) |
| Community detection | Louvain + Leiden (C++ MAGE) | community-detection crate (on 2D projection) |
| PageRank | Custom C++ | petgraph built-in (optional, on 2D projection) |
| Module isolation | C API (mg_procedure.h) | Rust trait in modules/ crate |
| Atomic GraphRAG | Single query = search + expand + rank + prompt | FilterCapable + temporal ordering layer native query |
Architecture Stack
The Blackboard is assembled from native Rust modules and optional third-party crates. The core storage runtime implements the layered coordinate model using only standard collection types and a custom temporal ordering layer. PetgraphStorage is an optional 2D projection for legacy graph queries.
External Dependencies: Candidates
Where possible, we depend on stable Rust crates rather than implementing from scratch. We treat each crate dependency as a permanent commitment, so candidates are chosen for stability.
Graph Storage & Query
| Concern | Candidate | Justification | Status |
|---|---|---|---|
| Native layered coordinate model | Core storage runtime (coordinate-based index + temporal ordering layer) | Primary hot storage. No external crate needed for the core data model. | Adopted (primary) |
| 2D graph projection | petgraph (0.6) | Optional. Standard Rust graph lib, StableGraph, NodeIndex, built-in PageRank and Dijkstra. For legacy graph queries on a 2D slice of the coordinate space. | Optional dependency |
| Cold storage / analytical | duckdb (1.105) | Parquet-backed, bundled, vector/JSON/CTE/window. Native only. | Adopted |
| Cypher parsing | cyrs | Parses Cypher to typed Plan IR in one step. A unified query IR decouples input languages from execution. | Optional (for Cypher path) |
| Vector similarity | ndarray (0.15) + cosine | Our data volumes do not require HNSW. When they do, USearch has Rust bindings. Memgraph uses USearch internally. | Adopted |
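At small data volumes, vector similarity reduces to a brute-force scan with cosine scoring. The sketch below uses plain slices from the standard library rather than ndarray, so it illustrates the technique only; the function names and the `(id, vector)` item layout are assumptions.

```rust
/// Cosine similarity of two vectors. Returns None when the lengths differ,
/// the vectors are empty, or either vector has zero magnitude.
pub fn cosine(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|y| y * y).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Brute-force scan: score every item, keep the k most similar identifiers.
pub fn top_k(query: &[f32], items: &[(u64, Vec<f32>)], k: usize) -> Vec<u64> {
    let mut scored: Vec<(f32, u64)> = items
        .iter()
        .filter_map(|(id, vector)| cosine(query, vector).map(|score| (score, *id)))
        .collect();
    // Highest similarity first.
    scored.sort_by(|a, b| b.0.total_cmp(&a.0));
    scored.into_iter().take(k).map(|(_, id)| id).collect()
}
```

The scan is linear in the number of stored vectors, which is the cost an HNSW index would later remove.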
Graph Algorithms
| Algorithm | Candidate | Status | Provided By |
|---|---|---|---|
| Louvain community | community-detection crate | Adopted | petgraph compatible |
| Leiden community | community-detection crate | Adopted | Same crate, variant feature flag |
| PageRank | petgraph::algo::page_rank | Adopted | Built into petgraph |
| Dijkstra / shortest path | petgraph::algo::dijkstra | Adopted | Built into petgraph |
| Betweenness centrality | petgraph + custom | Candidate | Minimal implementation on top of petgraph |
| Cosine similarity | ndarray | Adopted | Used by vector index; no separate crate needed |
Platform Bindings
| Target | Candidate | Deploy Command |
|---|---|---|
| Cloudflare Worker | worker crate (worker-rs) | wrangler deploy |
| AWS Lambda | lambda_runtime crate | cargo lambda deploy |
| Native server | axum + clap | cargo run |
| On-chain (blockchain) | (placeholder) | Future decision |
| Zed ACP bridge | nex-zed-agent (standalone binary) | cargo build; register in Zed settings |
Deployment Topology: Two Operational Modes
neXus supports any number of deployment modes that share the same FIH interface. Two representative cases:
| Mode | Runtime | Storage | Subscribe | Use Case |
|---|---|---|---|---|
| Real-time Blackboard | Persistent daemon | In-memory (+ optional WAL) | Yes (live peer notification) | Agent coordination, interactive sessions (Zed, ev) |
| Storage / Fact Store | Serverless worker | R2 / DuckDB / object store | No (poll or query) | Batch ingestion, archival, cross-session retrieval |
A real-time Blackboard can use a storage variant as its durable backend, flushing accumulated blocks to R2 on graceful shutdown. Both expose the same BlockStore trait — only delivery semantics differ.
Design Principle: Implement Strategically, Depend Widely
Where differentiation matters (Cypher-to-petgraph translation, gap detection heuristics, platform bindings), we write code. Where stability matters (graph algorithms, persistence, parsing), we depend on verified Rust crates. The Blackboard is the thinnest possible layer that turns a collection of independent modules into a coherent multi-dimensional analysis platform.
Development Model: Three Hard Layers
The codebase stratifies into three layers with different change tolerances. This is not an architectural abstraction but a development workflow enforced by how each layer is validated.
| Layer | Change Tolerance | Validation | Examples |
|---|---|---|---|
| Consumption Scenarios | Immutable | nexus-sim scenario definitions capture the domain invariants; cannot be compromised without invalidating the ecosystem’s purpose | The 7 attack scenarios in Proof by Structure; exchange contract requirements; ev verification flows |
| Orchestration Layer | Flexible | nexus-sim runs scenario definitions against the foundation; failures force orchestration changes | exchange_fact() overlay (defined in Proof by Structure); API gateway handlers; C2PA verification wrapper; session lifecycle |
| Foundation (IO Layer) | Stable | Existing comprehensive test suite; core storage runtime/DuckDB contract; never modified without a nexus-sim scenario proving the capability gap | AsyncStorageRead, AsyncFactCapable, DualStorage; nex core storage runtime / R2 bindings |
Consumption scenarios are the hardest layer. They express what the system must guarantee — “no entity can read a Fact without contributing one” is not negotiable because it derives from the ecosystem’s game-theoretic requirements. The foundation is also hard: the core storage runtime’s indexing and ordering contracts, DuckDB’s ACID guarantees, and the FIH capability trait contracts are fixed by their respective libraries.
The orchestration layer between them is the only soft layer. It evolves continuously as nexus-sim validates scenarios against the foundation. When a scenario fails, the orchestration layer is updated first; only when a scenario demands a capability the foundation cannot provide (e.g., atomic compare-and-swap that the storage backend lacks) is the foundation modified. This is the scenario-driven reverse development pattern: the consumption scenario drives the orchestration, which in turn pressures the foundation only when necessary.
A concrete example: exchange_fact() (defined in Proof by Structure) is an orchestration-layer function. It uses StorageRead and FactCapable from the foundation. If nexus-sim proves that a caller can bypass exchange_fact() by calling StorageRead::get() directly through a public API, the fix is in the orchestration layer — the public API endpoint must only expose exchange_fact(), not the raw storage trait. The foundation remains untouched. Only if nexus-sim proves that StorageRead itself cannot enforce the required ordering (e.g., because it lacks CAS semantics that the exchange contract needs) would the foundation be modified.
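The orchestration-layer fix described above can be sketched with module privacy: the raw store is a private field, and the only public entry point requires a contribution before it returns anything. This is an illustration of the layering only. It is not the `exchange_fact()` defined in Proof by Structure, and the `Store`, `Exchange`, signature, and error type are all hypothetical.

```rust
mod foundation {
    use std::collections::HashMap;

    /// Stand-in for the foundation's raw storage traits.
    #[derive(Default)]
    pub struct Store {
        facts: HashMap<u64, String>,
        next_id: u64,
    }

    impl Store {
        pub fn put(&mut self, body: String) -> u64 {
            let id = self.next_id;
            self.next_id += 1;
            self.facts.insert(id, body);
            id
        }

        pub fn get(&self, id: u64) -> Option<&String> {
            self.facts.get(&id)
        }
    }
}

pub mod orchestration {
    use super::foundation::Store;

    /// The store is a private field, so callers cannot reach Store::get.
    #[derive(Default)]
    pub struct Exchange {
        store: Store,
    }

    impl Exchange {
        /// Hypothetical gate: a read is only served together with a write.
        /// Returns the new Fact's id and the requested Fact, if it exists.
        pub fn exchange_fact(
            &mut self,
            contribution: &str,
            wanted: u64,
        ) -> Result<(u64, Option<String>), &'static str> {
            if contribution.trim().is_empty() {
                return Err("a contribution is required");
            }
            let requested = self.store.get(wanted).cloned();
            let new_id = self.store.put(contribution.to_string());
            Ok((new_id, requested))
        }
    }
}
```

The foundation type is unchanged by this gate; closing the bypass is entirely a matter of what the orchestration layer chooses to expose.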
This three-layer validation pattern mirrors the IoBuffer+StoreSession architecture at a different scale: there, the foundation provides sync storage traits, the IoBufferSession overlay orchestrates hydration and flush, and the Worker request lifecycle (the consumption scenario) drives the entire cycle.
Core Storage Engine
The core is a Cargo workspace with clean separation of interfaces and implementations. The data model follows a layered coordinate model (three-tier record types across a temporal dimension), not a 2D edge graph. The monolithic Storage trait has been replaced by fine-grained capability traits. Each backend implements only the capabilities it provides. The core storage runtime (coordinate-based index with temporal ordering) is the primary hot storage; the 2D graph projection is optional.
Capability traits come in pairs: sync variants for multi-thread-safe backends (in-memory graph projection, composite), and async variants for the execution-unit storage engine. Each backend implements only the variant that matches its execution model. Aggregate aliases compose the traits needed for common backend roles.
CypherCapable (QueryCapable) is no longer part of ColdStorage, because the query interface is independent of storage. DuckDbStorage implements QueryCapable directly; CompositeColdStorage does not.
DualStorage composes a hot and cold backend:
- Writes delegate to both hot and cold (dual-write)
- Reads go to hot (sub-ms latency for edge computing)
- Filtered reads delegate to cold (hot has no SQL/filter capability)
- Commit channel: `CompositeColdStorage` holds a separate `commit_kv` / `commit_blob` pair used exclusively by `flush_since()`. The commit channel replaces dirty tracking: `flush_since` writes its output (blob archives, cursor state) through the commit channel, never polluting the general read/write path. The flush cursor stored in `commit_kv` serves as the deterministic flush boundary, making `dirty` tracking unnecessary. The consumer reads the cursor via `read_cursor()` to know which data has been flushed.
DualStorage is generic over <H: HotStorage, C: ColdStorage> rather than Box<dyn HotStorage> / Box<dyn ColdStorage>. This preserves AFIT (async fn in trait) compatibility for async trait migration without boxing overhead at the call site.
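The routing rules for DualStorage can be sketched with generic parameters in place of boxed trait objects. `FilterCapable`, `HotStorage`, `ColdStorage`, and `DualStorage` are names from this document, but the signatures below are assumptions, `ReadCapable`, `WriteCapable`, `MemHot`, and `ScanCold` are hypothetical, and the sketch uses sync traits for brevity.

```rust
use std::collections::HashMap;

// Fine-grained capability traits (signatures are assumptions).
pub trait ReadCapable {
    fn get(&self, id: u64) -> Option<String>;
}
pub trait WriteCapable {
    fn put(&mut self, id: u64, value: &str);
}
pub trait FilterCapable {
    fn filter(&self, needle: &str) -> Vec<u64>;
}

// Aggregate aliases: any type with the right capabilities qualifies.
pub trait HotStorage: ReadCapable + WriteCapable {}
impl<T: ReadCapable + WriteCapable> HotStorage for T {}
pub trait ColdStorage: WriteCapable + FilterCapable {}
impl<T: WriteCapable + FilterCapable> ColdStorage for T {}

/// Hypothetical hot backend: keyed lookup only, no filtering.
#[derive(Default)]
pub struct MemHot(HashMap<u64, String>);
impl ReadCapable for MemHot {
    fn get(&self, id: u64) -> Option<String> {
        self.0.get(&id).cloned()
    }
}
impl WriteCapable for MemHot {
    fn put(&mut self, id: u64, value: &str) {
        self.0.insert(id, value.to_string());
    }
}

/// Hypothetical cold backend: append-only rows, scanned on filter.
#[derive(Default)]
pub struct ScanCold(Vec<(u64, String)>);
impl WriteCapable for ScanCold {
    fn put(&mut self, id: u64, value: &str) {
        self.0.push((id, value.to_string()));
    }
}
impl FilterCapable for ScanCold {
    fn filter(&self, needle: &str) -> Vec<u64> {
        self.0.iter().filter(|(_, v)| v.contains(needle)).map(|(id, _)| *id).collect()
    }
}

/// Generic over both backends, so calls are statically dispatched.
pub struct DualStorage<H: HotStorage, C: ColdStorage> {
    hot: H,
    cold: C,
}

impl<H: HotStorage, C: ColdStorage> DualStorage<H, C> {
    pub fn new(hot: H, cold: C) -> Self {
        DualStorage { hot, cold }
    }
    /// Writes delegate to both backends (dual-write).
    pub fn put(&mut self, id: u64, value: &str) {
        self.hot.put(id, value);
        self.cold.put(id, value);
    }
    /// Reads go to hot.
    pub fn get(&self, id: u64) -> Option<String> {
        self.hot.get(id)
    }
    /// Filtered reads delegate to cold.
    pub fn filter(&self, needle: &str) -> Vec<u64> {
        self.cold.filter(needle)
    }
}
```

`MemHot` never implements `FilterCapable`, so it cannot be passed as the cold backend: the capability split is checked at compile time.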
Crate structure:
- `model/` - Data model crate: FIH lifecycle interface, capability traits (sync and async variants), DualStorage composition, blob/meta/object store traits, clock abstraction, detection trait hierarchy.
- `interface/query/` - Backend-agnostic tabular query specification (filter, ordering, aggregation).
- `interface/cypher/` - Optional Cypher parser and executor. Translates Cypher to native queries for the 2D graph projection.
- `nex/` - Core storage engine: async-only execution unit, coordinate-based index with temporal ordering, optional sync wrapper for native platforms, optional 2D graph projection, durable cold storage, OODA scheduler, all detectors.
- `storage/duckdb/` - Parquet-backed analytical cold storage.
- `storage/sim/` - In-memory I/O backends (test doubles + filesystem) and scenario-driven verification runner.
- `storage/ve-composite/` - HTTP server exposing blob/meta/object store endpoints for session-backed composite storage.
- `gateway/api/` - HTTP REST server exposing the FIH lifecycle.
- `gateway/nex-cf/` - Edge-deployed gateway with object-store-backed storage and semantic search.
- `gateway/nex-cf/mock/` - Local simulation of the edge gateway for offline development.
Cypher Translation
The cypher/ crate translates Cypher to petgraph operations on the optional 2D projection. Our gap-detector needs only three patterns on the 2D view:
| Cypher Pattern | petgraph Translation |
|---|---|
| `MATCH (c:Concept) WHERE... RETURN c` | `node_indices().filter(label).filter(condition)` |
| `OPTIONAL MATCH (c)-[r]-() WITH c, count(r) AS rc WHERE rc = 0` | `neighbors()` + count + filter |
| `MATCH (a)-[r1]->(b) MATCH (a)-[r2]->(b) WHERE type(r1) <> type(r2)` | `edges()` cartesian product + filter |
We do not implement full Cypher. We implement the subset our modules actually need. The primary query interface is the native layered coordinate model API (FilterCapable + temporal ordering layer + from_facts).
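The three patterns can be sketched as plain functions over a small in-memory graph. To stay self-contained, the sketch uses standard collections instead of petgraph, so `Graph`, `Edge`, and the function names are hypothetical, and the third function groups edges by endpoint pair rather than forming a literal cartesian product.

```rust
use std::collections::{HashMap, HashSet};

pub struct Edge {
    pub from: usize,
    pub to: usize,
    pub kind: &'static str,
}

/// Minimal 2D view: node labels by index plus a typed edge list.
pub struct Graph {
    pub labels: Vec<&'static str>,
    pub edges: Vec<Edge>,
}

/// Pattern 1: nodes carrying a given label.
pub fn nodes_with_label(g: &Graph, label: &str) -> Vec<usize> {
    (0..g.labels.len()).filter(|&n| g.labels[n] == label).collect()
}

/// Pattern 2: nodes with no incident edge in either direction.
pub fn isolated_nodes(g: &Graph) -> Vec<usize> {
    let mut degree = vec![0usize; g.labels.len()];
    for e in &g.edges {
        degree[e.from] += 1;
        degree[e.to] += 1;
    }
    (0..degree.len()).filter(|&n| degree[n] == 0).collect()
}

/// Pattern 3: ordered pairs (a, b) joined by edges of at least two
/// different types. Each pair is reported once, in sorted order.
pub fn conflicting_pairs(g: &Graph) -> Vec<(usize, usize)> {
    let mut kinds: HashMap<(usize, usize), HashSet<&str>> = HashMap::new();
    for e in &g.edges {
        kinds.entry((e.from, e.to)).or_default().insert(e.kind);
    }
    let mut pairs: Vec<(usize, usize)> = kinds
        .into_iter()
        .filter(|(_, set)| set.len() > 1)
        .map(|(pair, _)| pair)
        .collect();
    pairs.sort();
    pairs
}
```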
Graph Algorithms (Optional 2D Projection)
Graph algorithms operate on the optional 2D projection (PetgraphStorage). They follow the MAGE module pattern: each algorithm is a standalone function that takes a graph reference and returns results. The primary layered coordinate model does not require these algorithms.
| Algorithm | Implementation | Notes |
|---|---|---|
| Louvain | community-detection crate | On 2D projection only |
| Leiden | community-detection crate | Same crate, variant feature flag |
| PageRank | petgraph::algo::page_rank | Built into petgraph |
| Dijkstra | petgraph::algo::dijkstra | Built into petgraph |
| Cosine similarity | ndarray | Used by vector index |
Analysis Modules
Modules follow Blackboard semantics: read Facts and Hints, emit Facts (detectors) or Intents (agents). Every module shares the same layered coordinate model interface. Internal implementation (native, LLM-driven, heuristic) is invisible to other modules. Detectors observe patterns and record them as immutable Facts; agents read detector Facts and decide which to act on by creating Intents. This separation — observe as Fact, act as Intent — is enforced architecturally through the DetectionCapable trait hierarchy.
| Module | Input | Output | Status |
|---|---|---|---|
| Gap Detector | Fact graph | Fact (gap pattern) | Rust (origin + cross-origin topic levels) |
| Contradiction Detector | Fact graph | Fact (contradiction) | Rust (same-topic/different-position) |
| State Change Detector | Fact graph | Fact (state transition) | Rust (State ReasonCheckpoint) |
| New Document Analyzer | Fact graph | Fact (+factor/-factor/gap) | Rust (baseline-aware) |
| OODA Loop (in-process) | Fact + Intent + Hint | Scheduler tick | Rust (in-process, detection traits) |
| Hypothesis Generator | Gap Fact + Fact evidence | Intent (hypothesis) | Future |
| Concept Validator | Hypothesis Intent + experiment | Fact (validated/rejected) | Future |
| Entity extraction | Raw document | Fact (entity) | Future |
| Flow-GRPO Planner | Successful Intent histories | Updated Planner weights | Future |
No module calls another module. All communication passes through the Blackboard. The queue layer serializes writes; coordination is emergent from agents observing and responding to Blackboard state. Detectors implement fine-grained capability traits (GapDetection, ContradictionDetection, StateChangeDetection) mirroring the storage trait architecture — each detector provides only the capabilities it supports, and the Scheduler composes them via Vec<Box<dyn DetectionCapable>>.
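The Scheduler's composition of detectors can be sketched as follows. `DetectionCapable` and the `Vec<Box<dyn DetectionCapable>>` composition come from the text above; the trait's methods, the `Fact` and `Finding` shapes, and both detector bodies are assumptions. The contradiction rule follows the same-topic/different-position description in the module table, while the gap rule is a simplified placeholder.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Fact {
    pub topic: String,
    pub position: String,
}

/// What a detector observed. In the real system this would be recorded
/// as an immutable Fact on the Blackboard.
#[derive(Debug, Clone, PartialEq)]
pub struct Finding {
    pub detector: &'static str,
    pub note: String,
}

pub trait DetectionCapable {
    fn name(&self) -> &'static str;
    fn detect(&self, facts: &[Fact]) -> Vec<Finding>;
}

/// Same topic, different position.
pub struct ContradictionDetector;

impl DetectionCapable for ContradictionDetector {
    fn name(&self) -> &'static str {
        "contradiction"
    }
    fn detect(&self, facts: &[Fact]) -> Vec<Finding> {
        let mut out = Vec::new();
        for (i, a) in facts.iter().enumerate() {
            for b in &facts[i + 1..] {
                if a.topic == b.topic && a.position != b.position {
                    out.push(Finding {
                        detector: self.name(),
                        note: format!("{}: {} vs {}", a.topic, a.position, b.position),
                    });
                }
            }
        }
        out
    }
}

/// Placeholder gap rule: an expected topic with no Fact at all.
pub struct GapDetector {
    pub expected_topics: Vec<String>,
}

impl DetectionCapable for GapDetector {
    fn name(&self) -> &'static str {
        "gap"
    }
    fn detect(&self, facts: &[Fact]) -> Vec<Finding> {
        self.expected_topics
            .iter()
            .filter(|t| !facts.iter().any(|f| &f.topic == *t))
            .map(|t| Finding { detector: self.name(), note: format!("no fact for topic {t}") })
            .collect()
    }
}

pub struct Scheduler {
    detectors: Vec<Box<dyn DetectionCapable>>,
}

impl Scheduler {
    pub fn new(detectors: Vec<Box<dyn DetectionCapable>>) -> Self {
        Scheduler { detectors }
    }
    /// One tick: every detector sees the same Facts; none sees another detector.
    pub fn tick(&self, facts: &[Fact]) -> Vec<Finding> {
        self.detectors.iter().flat_map(|d| d.detect(facts)).collect()
    }
}
```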
Artifact Storage & Ingestion
R2 (bucket: ssccs-nexus-af) is the single source of truth. The af-sync worker performs incremental sync between R2 and LightRAG via Queue-based processing with drift detection. Already deployed; no change required.
Platform Bindings
| Binding | Crate | Compilation Target | Status |
|---|---|---|---|
| Cloudflare Worker | gateway/ (planned: rs-worker) | wasm32-unknown-unknown | Future (SessionExecute ready) |
| Native Server | gateway/api/ | host | Active (axum server) |
| On-chain (blockchain) | Future | wasm32-wasi | Placeholder |
TypeScript Orchestration
| Worker | Role | Protocol |
|---|---|---|
| af-sync | R2 artifact sync → RAG engines | HTTP, writes Facts to RAG Blackboard |
The gap-detector and other analysis modules have been migrated to Rust in the nexus crate, where they run as in-process DetectionCapable implementations composed by the Scheduler. The TypeScript layer retains only the sync worker for R2 artifact ingestion.
Module Communication: Stigmergy Through the Blackboard
Modules do not call each other. Each module reads from the Core Blackboard and writes back to it. Coordination is indirect, inspired by stigmergy patterns: agents leave traces in a shared environment, other agents perceive those traces and adapt their behavior.
The Blackboard stores three primitives. Facts are what the system has learned — including detector observations about the knowledge state. Intents are what agents want to explore. Hints are governance rules and human guidance. Detectors observe the layered coordinate model and record patterns as Facts; agents read detector Facts and decide which to act on by creating Intents. The Scheduler drives the OODA cycle, calling each detector every tick.
Key principles. Every method converges on the same three primitives regardless of internal complexity. There is no fixed pipeline. The Blackboard’s current state determines which module acts next.
Contract Governance & Future DeSci
Contract.nex defines research rules: evidence thresholds, novelty minimums, report structure, and reward schedules. Every module evaluates contract.nex before writing to the Blackboard. The governance surface is a single file, not distributed across modules.
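The "evaluate before write" rule can be illustrated with a small guard function. The contract.nex format is not shown in this document, so the `Contract` fields below (an evidence threshold and a novelty minimum, two of the rule kinds named above) and the `Candidate` and `Rejection` types are assumptions made purely for the sketch.

```rust
/// Hypothetical in-memory view of two contract.nex rules.
pub struct Contract {
    pub min_evidence: usize,
    pub min_novelty: f64,
}

/// A result a module wants to write (illustrative shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub claim: String,
    pub evidence_count: usize,
    pub novelty: f64,
}

#[derive(Debug, PartialEq)]
pub enum Rejection {
    InsufficientEvidence { have: usize, need: usize },
    BelowNoveltyMinimum,
}

impl Contract {
    /// Checks a candidate against the contract without side effects.
    pub fn evaluate(&self, c: &Candidate) -> Result<(), Rejection> {
        if c.evidence_count < self.min_evidence {
            return Err(Rejection::InsufficientEvidence {
                have: c.evidence_count,
                need: self.min_evidence,
            });
        }
        if c.novelty < self.min_novelty {
            return Err(Rejection::BelowNoveltyMinimum);
        }
        Ok(())
    }
}

/// The only write path: the contract is consulted first, so a rejected
/// candidate never reaches the board.
pub fn guarded_write(
    contract: &Contract,
    board: &mut Vec<Candidate>,
    candidate: Candidate,
) -> Result<(), Rejection> {
    contract.evaluate(&candidate)?;
    board.push(candidate);
    Ok(())
}
```

Keeping `evaluate` free of side effects means every module can share one contract value, which matches the idea of a single governance surface.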
Future: Token-incentivized research. Contract.nex can execute on a blockchain. Research contributions become automatically verifiable and rewardable:
Contract.nex → Smart contract (blockchain)
├── Gap discovery → Token reward
├── Hypothesis validation → Staking + reward
├── Experiment replication → Replication reward
└── Concept drift detection → Drift token
Every (origin, intent, result) tuple on the Blackboard has a content-addressable hash. This hash serves as a verifiable proof of contribution, recordable on-chain without storing the full payload.
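To make the idea of a content address concrete, the sketch below derives a deterministic identifier from the three fields. It deliberately uses 64-bit FNV-1a, a simple non-cryptographic hash, so that the example needs nothing beyond the standard library; a verifiable on-chain proof would need a cryptographic hash instead, and which one neXus uses is not stated here. The function name and the length-prefixed encoding are assumptions.

```rust
// FNV-1a 64-bit parameters. This is a stand-in for a cryptographic hash,
// chosen only so the example stays dependency-free.
const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Folds bytes into an FNV-1a state: xor the byte, then multiply by the prime.
fn fnv1a(state: u64, bytes: &[u8]) -> u64 {
    bytes
        .iter()
        .fold(state, |h, b| (h ^ u64::from(*b)).wrapping_mul(FNV_PRIME))
}

/// Hypothetical content address for an (origin, intent, result) tuple,
/// rendered as 16 lowercase hex digits.
pub fn content_address(origin: &str, intent: &str, result: &str) -> String {
    let mut h = FNV_OFFSET;
    for field in [origin, intent, result] {
        // Length prefix: keeps ("ab", "c") and ("a", "bc") from feeding
        // the same byte stream into the hash.
        h = fnv1a(h, &(field.len() as u64).to_le_bytes());
        h = fnv1a(h, field.as_bytes());
    }
    format!("{h:016x}")
}
```

Because the address depends only on the tuple's content, two parties holding the same tuple can compute and compare the identifier independently, and only this short value has to be recorded elsewhere.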
Architecture Inspirations
Current Layer (Multi-Dimensional Storage Core)
The storage layer is now a multi-dimensional coordinate model, not a graph core. The core storage runtime (coordinate-based index with temporal ordering) is the primary hot storage. PetgraphStorage is an optional 2D projection for legacy graph queries.
| Source | Pattern | What We Adopted |
|---|---|---|
| Memgraph (C++, production) | Atomic GraphRAG (single query = search + expand + rank + prompt) | Iterator chain translation in cypher/translate.rs |
| Memgraph | MAGE module isolation (C API, standalone algorithms) | Rust trait isolation in modules/ crate |
| Memgraph | Single Store Vector Index (embedding as node property) | SemanticStore + FihLoad (methodology-agnostic; vector is one implementation) |
| Memgraph | WAL + in-memory dual storage | duckdb/Parquet + core storage runtime memory store |
| cyrs / cypher-rs ecosystem | Cypher → typed Plan IR | cyrs as parser dependency (optional, Cypher path) |
| Core storage runtime | Coordinate-based index + temporal ordering | Primary hot storage for the layered coordinate model |
| SemanticStore (nex core) | Flashlight pattern (FihLoad) for methodology-agnostic semantic search | Plug-in semantic similarity index alongside EntityStore and OrderedIndex |
| PetgraphStorage (optional) | StableGraph, NodeIndex, built-in PageRank/Dijkstra | Optional 2D projection for legacy graph algorithms |
| Capability-based traits | Fine-grained Rust traits replacing monolithic interfaces | StorageRead, FactCapable, FilterCapable, EvictCapable, etc. — also DetectionCapable, GapDetection, ContradictionDetection, StateChangeDetection for the detection layer |
| DualStorage pattern | Hot + cold composition for edge-cloud routing | DualStorage { hot: HotStorage, cold: ColdStorage } |
| CQRS-inspired commit channel | Command/query separation for flush output | CompositeColdStorage with commit_kv/commit_blob (dirty tracking OFF), breaking self-referential dirty in flush_since() |
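The DualStorage row can be read as a small routing rule. The sketch below assumes the behaviour listed in the status table (dual-write to both backends, point reads from hot, filtered reads delegated to cold) and invents everything else: the two tiny traits, their method names, and the in-memory backends are illustrative stand-ins, not the real capability traits.

```rust
use std::collections::{BTreeMap, HashMap};

/// Minimal stand-in for the hot-side capabilities (hypothetical).
pub trait HotStorage {
    fn put(&mut self, key: &str, value: &str);
    fn get(&self, key: &str) -> Option<String>;
}

/// Minimal stand-in for the cold-side capabilities (hypothetical).
pub trait ColdStorage {
    fn put(&mut self, key: &str, value: &str);
    fn scan_prefix(&self, prefix: &str) -> Vec<(String, String)>;
}

#[derive(Default)]
pub struct MemoryHot(HashMap<String, String>);

#[derive(Default)]
pub struct MemoryCold(BTreeMap<String, String>);

impl HotStorage for MemoryHot {
    fn put(&mut self, key: &str, value: &str) {
        self.0.insert(key.to_owned(), value.to_owned());
    }
    fn get(&self, key: &str) -> Option<String> {
        self.0.get(key).cloned()
    }
}

impl ColdStorage for MemoryCold {
    fn put(&mut self, key: &str, value: &str) {
        self.0.insert(key.to_owned(), value.to_owned());
    }
    fn scan_prefix(&self, prefix: &str) -> Vec<(String, String)> {
        // BTreeMap iterates in key order, so results come back sorted.
        self.0
            .iter()
            .filter(|(k, _)| k.starts_with(prefix))
            .map(|(k, v)| (k.clone(), v.clone()))
            .collect()
    }
}

pub struct DualStorage<H: HotStorage, C: ColdStorage> {
    pub hot: H,
    pub cold: C,
}

impl<H: HotStorage, C: ColdStorage> DualStorage<H, C> {
    /// Dual-write: every record lands in both backends.
    pub fn write(&mut self, key: &str, value: &str) {
        self.hot.put(key, value);
        self.cold.put(key, value);
    }

    /// Point reads are served by the hot backend.
    pub fn read(&self, key: &str) -> Option<String> {
        self.hot.get(key)
    }

    /// Filtered reads are delegated to the cold backend.
    pub fn filtered_read(&self, prefix: &str) -> Vec<(String, String)> {
        self.cold.scan_prefix(prefix)
    }
}
```

Note that each backend implements only the trait it can honour: the hot store has no scan method and the cold store has no point lookup, which is the capability-trait idea in miniature.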
A key architectural insight is that a unified query IR decouples input languages from execution. Future agent modules (Planner, Verifier, Hypothesis Generator) will query the layered coordinate model through this same IR. They do not need to know whether the backend is the core storage runtime in memory, DuckDB on disk, or a remote CF Worker. The query interface is the plug, the core is the socket. Adding a new query language (GQL, SPARQL, or a future agent DSL) requires only a new parser adapter, not a core change.
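A compact way to picture the plug-and-socket split is two traits that meet at a plan type. The sketch below is an assumption-heavy illustration: `Plan`, `QueryFrontend`, `PlanExecutor`, the toy `origin=<name> limit=<n>` language, and the in-memory backend are all invented for the example and do not reflect the actual IR produced by cyrs.

```rust
/// Toy plan IR: the only thing frontends and backends share.
#[derive(Debug, Clone, PartialEq)]
pub enum Plan {
    Scan,
    FilterOrigin { input: Box<Plan>, origin: String },
    Limit { input: Box<Plan>, n: usize },
}

/// The "plug": one implementation per input language.
pub trait QueryFrontend {
    fn parse(&self, text: &str) -> Result<Plan, String>;
}

/// The "socket": one implementation per storage backend.
pub trait PlanExecutor {
    fn execute(&self, plan: &Plan) -> Vec<(String, String)>;
}

/// Hypothetical mini-language: `origin=<name> limit=<n>`, both optional.
pub struct KeyValueDsl;

impl QueryFrontend for KeyValueDsl {
    fn parse(&self, text: &str) -> Result<Plan, String> {
        let mut plan = Plan::Scan;
        for token in text.split_whitespace() {
            match token.split_once('=') {
                Some(("origin", v)) => {
                    plan = Plan::FilterOrigin { input: Box::new(plan), origin: v.to_string() };
                }
                Some(("limit", v)) => {
                    let n = v
                        .parse::<usize>()
                        .map_err(|e| format!("bad limit '{v}': {e}"))?;
                    plan = Plan::Limit { input: Box::new(plan), n };
                }
                _ => return Err(format!("unknown token '{token}'")),
            }
        }
        Ok(plan)
    }
}

/// In-memory backend holding (origin, content) rows.
pub struct MemoryBackend {
    pub rows: Vec<(String, String)>,
}

impl PlanExecutor for MemoryBackend {
    fn execute(&self, plan: &Plan) -> Vec<(String, String)> {
        match plan {
            Plan::Scan => self.rows.clone(),
            Plan::FilterOrigin { input, origin } => self
                .execute(input)
                .into_iter()
                .filter(|(o, _)| o == origin)
                .collect(),
            Plan::Limit { input, n } => self.execute(input).into_iter().take(*n).collect(),
        }
    }
}
```

In this arrangement a second frontend (for Cypher, say) would only need to emit `Plan` values, and a second backend would only need to interpret them; neither side has to know the other exists.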
Stigmergy Layer (Implemented)
The Blackboard architecture is validated by proven stigmergic search: a minimal OODA loop that reads from and writes to a shared blackboard of Fact, Intent, and Hint primitives.
The stigmergy layer is implemented in the nexus crate. The Scheduler drives the OODA loop, calling detectors every tick. Detectors (Gap, Contradiction, State Change, New Document) observe the layered coordinate model and record patterns as Facts. Agents read these detector Facts and create Intents. The ReasonCheckpoint pattern — simple count-based state change detection — is built into StateChangeDetector. All detectors implement the DetectionCapable trait hierarchy, which mirrors the storage capability trait architecture.
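A count-based checkpoint of the kind mentioned above might look roughly like the following. Only the idea (compare current counts with the counts at the last checkpoint) and the name `StateChangeDetector` come from the text; keying the counts by origin, the `observe` method, and the decision not to report removed origins are assumptions for illustration.

```rust
use std::collections::HashMap;

/// Remembers per-origin record counts as of the last checkpoint.
#[derive(Default)]
pub struct StateChangeDetector {
    checkpoint: HashMap<String, usize>,
}

impl StateChangeDetector {
    pub fn new() -> Self {
        Self::default()
    }

    /// Returns the origins whose count differs from the checkpoint
    /// (sorted for stable output), then advances the checkpoint.
    /// Origins that vanished entirely are not reported in this sketch.
    pub fn observe(&mut self, counts: &HashMap<String, usize>) -> Vec<String> {
        let mut changed: Vec<String> = counts
            .iter()
            .filter(|(origin, n)| self.checkpoint.get(*origin).copied().unwrap_or(0) != **n)
            .map(|(origin, _)| origin.clone())
            .collect();
        changed.sort();
        self.checkpoint = counts.clone();
        changed
    }
}
```

Because the detector receives counts rather than live references, it can work from a snapshot taken at the start of a tick and never needs to hold a borrow on storage while it compares.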
Beyond the current implementation, future research layers will attach to the same Blackboard through the same interface:
| Paradigm | Source | Core Insight | Blackboard Role |
|---|---|---|---|
| Stigmergic Search | Stigmergic Search (validated) | Indirect coordination through shared traces | Core coordination mechanism; Queue + Blackboard separation |
| In-the-Flow Agentic Optimization | AgentFlow arXiv:2510.05592 | Trainable Planner, Flow-GRPO learning | Reads successful Intent histories, updates Planner weights |
| Hypothesis-Driven Discovery | HypoChainer arXiv:2507.17209 | LLMs + KG + humans build hypothesis chains | Writes Hypothesis Intents, reads the layered coordinate model for evidence |
| Contract-Governed Generation | Story2Proposal arXiv:2601.20833 | Shared contract enforces structural obligations | contract.nex evaluated before every Blackboard write |
Agent Oracle: The Purest Observer
The Agent Oracle is a module that reads from the Blackboard and writes back predictions and simulation results. It has no special privileges. It uses the same Fact / Intent / Hint interface as every other module.
Oracle input: accumulated (origin, intent, result) histories on the Blackboard. Oracle output: new Facts (predicted outcomes) and new Intents (simulation branches to explore). The Oracle does not need direct communication with any other module. It reads the traces left by real-world experiments, market data ingestors, or hardware simulations, and writes its projections back to the same space.
This applies to any domain. Business simulation: Fact = market data, Intent = strategy proposal, Oracle output = projected outcome. Hardware emulation: Fact = RTL trace, Intent = optimization, Oracle output = utilization prediction. The Blackboard does not distinguish between the two. The primitives are the same.
Strategic Value
- Blackboard is the single interface. Every module reads and writes the same three types in a layered coordinate model. Internal complexity is irrelevant.
- Stigmergy over orchestration. Modules coordinate indirectly through the Blackboard. No module calls another module. No pipeline dependency chain.
- LLM-optional. A stigmergic search system proved a full suite of problems without any LLM. LLMs are accelerators, not requirements.
- Iteration over planning. Interface design improves through repeated use, not upfront specification. Virtual responses enable continuous iteration.
- DeSci-ready. Contract.nex enables token-incentivized research. Content-addressable hashes provide verifiable proofs of contribution.
- Portable core. Same codebase compiles for WASM (edge) and native (server). Zero platform lock-in.
Current Status (2026-06)
Phase 3 (native layered coordinate model storage) is complete, with a comprehensive test suite covering the core storage runtime, incremental persistence, the ColdStorage trait, and temporal query validation. The core storage runtime (nex crate, coordinate-based index with temporal ordering) replaces PetgraphStorage as the primary hot storage.
| Component | Status / Description |
|---|---|
| Semantic search trait (flashlight pattern) | Implemented (core storage module) |
| Semantic index slot in coordinate index | Implemented (plug-in slot alongside origin, time, status indexes) |
| In-memory BM25 search | Implemented with tests |
| Vectorize cloud backend | Implemented: pluggable embedder trait, offline local embedder for development |
| External semantic backends (HNSW, ngram, LLM reranker) | Future (external crates) |
| af-sync worker (R2 to engines) | Deployed |
| RAG engines (LightRAG) | Reference implementation; sync pipeline active |
| CI/CD | Deployed; WASM + native dual-path workflows |
| Core data model crate | Blackboard trait, FIH lifecycle, capability traits (sync + async variants), DualStorage, blob/meta/object store traits |
| Backend-agnostic query interface | Tabular query specification, filter, ordering, aggregation |
| Cypher parser and executor | Optional: translates Cypher to native queries for the 2D graph projection |
| Core storage engine | Async-only execution unit; coordinate-based index with temporal ordering; no sync blocking interfaces |
| DuckDB cold storage | Parquet-backed analytical backend with CTE, window functions, JSON, vector search |
| In-memory IO backends | Test doubles and filesystem IO for development and verification |
| Cloudflare Worker gateway | R2-backed storage with Durable Object, BM25 + Vectorize semantic search |
| HTTP API gateway | Axum REST server exposing the FIH lifecycle |
| Serialization proxy | Serde validation layer over storage traits |
| Local simulation server | In-memory mock of the full Cloudflare pipeline for offline development |
| Async-only storage execution unit | All public methods async; no synchronous trait implementations; runtime borrow-checked cells; single-thread; no global mutable state |
| In-memory 2D graph projection (optional) | Legacy graph queries (community detection, PageRank, shortest path) |
| Durable cold storage | Blob + metadata + object store for long-term persistence |
| FIH lifecycle | submit, claim, heartbeat, release, conclude |
| Sync storage capability traits | Implemented by multi-thread-safe backends (PetgraphStorage, HybridBlackboard); NOT implemented by the async-only storage engine |
| Async storage capability traits | Implemented by the async-only storage engine |
| Hot + cold composition | Dual-write to hot and cold backends; reads from hot; filtered reads delegate to cold |
| OODA scheduler | Tick-based polling, heartbeat TTL, eviction trigger, stale intent cleanup |
| Gap detection | Origin-based + cross-origin topic gap analysis |
| Contradiction detection | Same-topic/different-position analysis |
| State change detection | Pattern-based checkpoint detection, snapshot-safe |
| New document analysis | Plus-factor/minus-factor/gap analysis |
| Detection capability traits | Unified trait hierarchy for all detectors |
| Cypher translation | cyrs pipeline, dual-path executor, cold query routing |
| Document ingestion pipeline | Read markdown files from object store, chunk, submit as facts, auto-index semantically |
| Deferred-write IO | Batch multiple writes into a single storage call |
| Scenario tests | 60+ tests across storage engine, IO backends, and gateway |
| Core storage tests | Comprehensive suite (Phase 3 complete) |
| Incremental persistence | Delta aggregation with cursor tracking |
| Cold storage interface | Unified trait for cold backends |
| Temporal query validation | Replay and consistency checks across time |
| Agentic loop | Full FIH lifecycle (detector to fact to intent to conclusion) |
| CQRS commit channel | Cursor-based flush boundary, separate commit path |
| Session-backed IO | Generic serialized queue for async I/O emulation |
| Process coordinator | Planned: physical instance replication, sandboxed lifecycle |
| On-chain DeSci | Placeholder |
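The FIH lifecycle and OODA scheduler rows in the table above suggest a small state machine. The sketch below is one plausible reading, not the implemented rules: the five operation names (submit, claim, heartbeat, release, conclude) and the heartbeat TTL come from the table, while the state names, the allowed transitions, and the choice to return stale claims to the submitted state are assumptions.

```rust
/// Lifecycle state of one Intent (state names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum IntentState {
    Submitted,
    Claimed { last_heartbeat: u64 },
    Concluded,
}

/// Returned when an operation is not valid in the current state.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct InvalidTransition;

impl IntentState {
    pub fn submit() -> Self {
        IntentState::Submitted
    }

    /// Only an unclaimed Intent can be claimed.
    pub fn claim(self, now: u64) -> Result<Self, InvalidTransition> {
        match self {
            IntentState::Submitted => Ok(IntentState::Claimed { last_heartbeat: now }),
            _ => Err(InvalidTransition),
        }
    }

    /// Refreshes the liveness timestamp of a claimed Intent.
    pub fn heartbeat(self, now: u64) -> Result<Self, InvalidTransition> {
        match self {
            IntentState::Claimed { .. } => Ok(IntentState::Claimed { last_heartbeat: now }),
            _ => Err(InvalidTransition),
        }
    }

    /// Voluntarily gives a claimed Intent back.
    pub fn release(self) -> Result<Self, InvalidTransition> {
        match self {
            IntentState::Claimed { .. } => Ok(IntentState::Submitted),
            _ => Err(InvalidTransition),
        }
    }

    /// Finishes a claimed Intent.
    pub fn conclude(self) -> Result<Self, InvalidTransition> {
        match self {
            IntentState::Claimed { .. } => Ok(IntentState::Concluded),
            _ => Err(InvalidTransition),
        }
    }

    /// Scheduler-side cleanup: a claim whose heartbeat is older than `ttl`
    /// is treated as abandoned and becomes claimable again.
    pub fn sweep(self, now: u64, ttl: u64) -> Self {
        match self {
            IntentState::Claimed { last_heartbeat }
                if now.saturating_sub(last_heartbeat) > ttl =>
            {
                IntentState::Submitted
            }
            other => other,
        }
    }
}
```

Modelling each operation as a function from state to `Result` keeps invalid sequences, such as concluding an Intent nobody claimed, out of the Blackboard by construction.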
Edge-to-Cloud Portability & AWS Integration
Principle: Bindings Are Deployment Targets, Not Architectural Commitments
The platform bindings layer (bindings/) isolates deployment-specific code. Adding a new deployment target requires only a new binding crate. The core (nexus-core/) never changes. Available targets: Cloudflare Workers (wasm32-unknown-unknown), AWS Lambda + Graviton (aarch64-unknown-linux-gnu), Native server (axum, host architecture), On-chain (wasm32-wasi, placeholder).
| Environment | Binding | Hardware | Latency Profile | Best For |
|---|---|---|---|---|
| Cloudflare Workers | bindings/cf/ | Edge (300+ locations) | < 50ms cold, sub-ms warm | Real-time queries, API gateway, gap detection on document arrival |
| AWS Lambda (Graviton) | bindings/aws/ | ARM64 (Graviton3/4) | < 100ms cold, ms warm | Batch analysis, large-scale community detection, model training |
| AWS EC2 / HPC | bindings/aws/ | x86_64 / GPU | Sub-ms (warm resident) | Hardware simulation, compiler optimization loops, pre-silicon emulation |
| Local / VPS | bindings/server/ | Any | 0ms | Development, offline research, private data |
Why This Matters for SSCCS
SSCCS’s research agenda spans multiple compute domains that no single platform can satisfy:
| Research Activity | Compute Profile | Optimal Platform |
|---|---|---|
| Document ingestion & entity extraction | I/O-bound, bursty | CF Workers (edge proximity to R2) |
| Knowledge graph query & gap detection | CPU-bound, graph traversal | CF Workers or Lambda |
| Community detection (Louvain/Leiden) | CPU-bound, iterative | Lambda (Graviton, longer timeout) |
| Hypothesis generation (LLM) | Memory-bound, GPU-accelerated | EC2 with GPU or Workers AI |
| Compiler optimization (SSCCS ↔︎ llvm-project) | CPU-intensive, iterative | EC2 / HPC |
| Hardware emulation (pre-silicon RTL) | CPU + memory intensive, long-running | EC2 / HPC |
| Training (Flow-GRPO for Planner) | GPU-bound, hours-long | SageMaker / EC2 GPU |
AWS Integrations: Concrete Paths
AWS Lambda + Graviton (Immediate)
Rust compiles natively to aarch64-unknown-linux-gnu. The same nexus-gap crate that runs as a CF Worker also runs as a Lambda function on Graviton with no code changes.
Amazon S3 ↔︎ R2 Bridge (Data Gravity)
SSCCS documents live in R2. AWS compute can access them through R2's S3-compatible API. Alternatively, hot data can be migrated to S3 for lower latency from AWS compute, while cold data stays in R2.
AWS Nitro Enclaves for Contract Governance
contract.nex verification can run inside an AWS Nitro Enclave for cryptographic attestation: the verification result comes with a signed proof that the contract was executed faithfully, without relying on blockchain.
SageMaker for Flow-GRPO Training
The Learning Loop (Layer 4) trains the Planner via Flow-GRPO. Training trajectories collected from CF Workers are stored in R2, then training runs on SageMaker with GPU instances:
CF Workers → R2 (trajectory JSONL)
↓
SageMaker Training Job (ml.g5.xlarge)
↓ policy update
New Planner checkpoint → R2
↓
CF Workers pick up new checkpoint
Future: Hardware Simulation Integration
SSCCS’s core research (Segment, Scheme, Field, Observation, Projection) is about rewriting computing’s ontology. This extends naturally to hardware:
Pre-Silicon Emulation Pipeline
R2 (RTL designs, benchmark configs)
↓
EC2 HPC (Graviton4, 64 vCPU, 256GB RAM)
↓
nexus-core compiled as native binary
├── layered coordinate model ingests simulation traces as temporal coordinates
├── gap detector finds inefficiencies in pipeline utilization
├── community detection groups related hardware modules
└── contract governs which optimizations are valid
↓
Findings → R2 → LLM analysis → Compiler patch proposals → llvm-project PRs
Why Same Core for Hardware Analysis
The gap-detector that today finds orphaned concepts in documentation will tomorrow find underutilized functional units in RTL simulations. Same layered coordinate model structure, different data:
| Today (Documents) | Tomorrow (Hardware) |
|---|---|
| Fact = validated research result | Fact = RTL simulation trace |
| Intent = exploration hypothesis | Intent = optimization candidate |
| Coordinate = (three-tier record types, temporal) | Coordinate = (three-tier record types, temporal) |
| Gap = orphaned concept | Gap = idle functional unit |
Strategic Position (Example Narrative)
The position is not “CF or AWS.” It is “CF for what CF does best, AWS for what AWS does best, with the same core either way.” A meeting-ready version of the narrative: SSCCS is building a portable multi-dimensional analysis platform whose deployment surface spans edge to cloud, with zero architectural commitment to any single vendor.
Nexus-SIM: Virtual Emulation Test Suite
Every async I/O boundary (CF Workers KV, R2, Durable Object; ROS2 topic pub/sub; blockchain validators; distributed transaction coordinators; edge AI inference) can be emulated through the same SessionExecute trait. The virtual emulation test suite is not a convenience. It is the primary development surface. A backend that passes the nexus-sim test suite will work with any real binding that implements the same trait.
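The claim that one suite can validate any binding is easiest to see when the suite is written against the trait rather than a concrete backend. The real `SessionExecute` signature is not shown in this document, so the sketch below invents a synchronous, key-value-only version of it: the `Op` and `Reply` types, the `execute` method, and the `roundtrip_scenario` function are all hypothetical, and the real trait sits at an async boundary.

```rust
use std::collections::HashMap;

/// Operations a session can carry out (hypothetical, KV-only subset).
#[derive(Debug, Clone, PartialEq)]
pub enum Op {
    Put { key: String, value: Vec<u8> },
    Get { key: String },
}

#[derive(Debug, Clone, PartialEq)]
pub enum Reply {
    Stored,
    Value(Option<Vec<u8>>),
}

/// Simplified, synchronous stand-in for the SessionExecute trait.
pub trait SessionExecute {
    fn execute(&mut self, op: Op) -> Result<Reply, String>;
}

/// Emulated backend: plays the role of a KV binding during tests.
#[derive(Default)]
pub struct InMemoryKv {
    data: HashMap<String, Vec<u8>>,
}

impl SessionExecute for InMemoryKv {
    fn execute(&mut self, op: Op) -> Result<Reply, String> {
        match op {
            Op::Put { key, value } => {
                self.data.insert(key, value);
                Ok(Reply::Stored)
            }
            Op::Get { key } => Ok(Reply::Value(self.data.get(&key).cloned())),
        }
    }
}

/// One scenario from a would-be suite. It is generic over the trait, so the
/// same check applies unchanged to an emulator or to a real binding.
pub fn roundtrip_scenario<S: SessionExecute>(session: &mut S) -> Result<(), String> {
    let miss = session.execute(Op::Get { key: "k".to_string() })?;
    if miss != Reply::Value(None) {
        return Err(format!("expected a miss before any write, got {miss:?}"));
    }
    session.execute(Op::Put { key: "k".to_string(), value: b"v".to_vec() })?;
    let hit = session.execute(Op::Get { key: "k".to_string() })?;
    if hit != Reply::Value(Some(b"v".to_vec())) {
        return Err(format!("expected the stored value, got {hit:?}"));
    }
    Ok(())
}
```

Under these assumptions, validating a new binding amounts to passing a value of its session type to the same scenario functions that already run against the in-memory emulator.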
nexus-sim sits between consumption scenarios and the foundation. It proves two things: that the orchestration layer enforces scenario requirements, and that the foundation provides the capabilities the orchestration layer needs. This is the three-layer development model (see Development Model) in action: nexus-sim is the validation engine that closes the loop from scenario to orchestration to foundation.
Scenario-Driven Reverse Development
nexus-sim inverts the conventional development flow. Instead of implementing first and testing afterwards, the consumption scenario drives the entire stack: define the scenario, express it as a test case, run it against the orchestration layer, update the orchestration layer until the test passes, and commit the scenario as a permanent regression test. The full process, including agent models, depths, CI integration, and implementation phases, is described in Simulation Suite.
nexus-sim’s test suite is always ahead of the implementation. See issue #69 for current status and roadmap.