Development Roadmap

Unified Knowledge Pipeline with Learnable Agentic Research

Author
Affiliation

SSCCS Foundation

Published

May 10, 2026

Abstract

Project Nexus unifies documentation, code symbols, and external sources into a single, queryable knowledge pipeline, and transforms that pipeline into a self‑improving agentic research system. At its core is GraphRAG — a class of graph‑native retrieval‑augmented generation engines that decompose documents into structured knowledge and support multiple query modes. The initial phase implements and evaluates multiple GraphRAG backends (e.g., LightRAG, Qdrant‑based, Neo4j‑based) behind a common engine‑agnostic interface, allowing the system to benchmark, compare, and select the optimal backend per workload. An artifact ingestion pipeline ensures strong consistency and decouples CI/CD from the KG. The agentic loop (Planner, Executor, Verifier, Generator) orchestrates research activity, grounded in KG evidence and governed by a contract schema. Finally, a Flow‑GRPO learning loop refines the Planner based on research outcomes and human feedback.

Executive Summary

Project Nexus unifies documentation, code symbols, and external sources into a single, queryable knowledge pipeline, and transforms that pipeline into a self‑improving agentic research system.

Unified Architecture Diagram

Figure 1: Unified Nexus Architecture

More than a passive knowledge base, it is a self‑improving agentic research infrastructure that:

  • Ingests documentation, code symbols, and external sources into a unified knowledge graph (KG).
  • Detects conceptual gaps and contradictions across these domains.
  • Constructs and validates hypothesis chains, each step grounded in graph‑based evidence.
  • Generates contract‑compliant research artifacts (reports, experiment proposals, structured narratives).
  • Learns from research outcomes, refining its planning strategy through on‑policy reinforcement learning.

Research Inspirations

Three complementary research advances underpin this vision:

| Paradigm | Source | Core Insight | Role in Nexus |
|---|---|---|---|
| In‑the‑Flow Agentic Optimization | AgentFlow (arXiv:2510.05592) | A trainable Planner inside a multi‑turn loop, optimized via Flow‑GRPO, learns long‑horizon tool‑use strategies from sparse outcome rewards. | The Nexus Planner learns to decompose research questions, select appropriate tools, and assemble hypothesis chains that improve over time. |
| Hypothesis‑Driven Discovery | HypoChainer (arXiv:2507.17209) | LLMs, knowledge graphs, and human experts collaboratively build and validate hypothesis chains anchored in KG entities. | The Planner and Verifier embody this cycle: chains are generated, then each step is grounded against entities and relationships retrieved from GraphRAG. |
| Contract‑Governed Multi‑Agent Generation | Story2Proposal (arXiv:2601.20833) | Specialized agents (Architect, Writer, Refiner, Renderer) operate under a persistent shared contract that enforces structural obligations and visual‑artifact alignment. | The Generator and Reporter agents enforce contract.json to guarantee that every generated report meets required sections, citations, and evidence thresholds. |

Architecture Stack

The Nexus architecture is assembled from distinct, purpose‑aligned layers. Each layer addresses a specific concern while maintaining clean interfaces to the others.

Layer 1: Knowledge Graph Engine — GraphRAG

At the center of Nexus sits a GraphRAG layer — a class of graph‑native retrieval‑augmented generation engines that decompose documents into structured knowledge: entities, typed relationships, and community clusters. Rather than committing to a single implementation, Nexus adopts an engine‑agnostic design: multiple GraphRAG backends operate behind a common interface, allowing the system to benchmark, compare, and route queries to the optimal backend per workload.

Engine‑Agnostic Interface

All GraphRAG backends conform to a common EngineHandler interface that abstracts the core operations Nexus requires:

| Operation | Interface Method | Description |
|---|---|---|
| Upload | upload(document) → doc_id | Ingest a document, extract entities/relationships, return a handle |
| Delete | delete(doc_id) | Remove a document and all associated artifacts |
| Query | query(text, mode) → results | Execute retrieval across supported query modes |
| List | list(filter) → documents | Enumerate ingested documents with metadata |
| Inject | inject(terms, workspace) | Inject domain glossary and entity definitions |

This abstraction isolates the rest of Nexus from any particular backend’s storage architecture, query syntax, or API conventions. Adding a new GraphRAG implementation requires only implementing this interface and registering a route in the sync worker (e.g., /sync/lightrag, /sync/qd).
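The interface above can be sketched in TypeScript as follows. The method names come from the operations table; the argument and return type shapes, and the registry helper, are assumptions for illustration:

```typescript
// Query modes from the EngineHandler specification.
type QueryMode = "naive" | "local" | "global" | "hybrid" | "mix" | "bypass";

// Sketch of the engine-agnostic interface; type shapes are assumptions.
interface EngineHandler {
  upload(document: { key: string; content: string }): Promise<string>; // returns doc_id
  delete(docId: string): Promise<void>;
  query(text: string, mode: QueryMode): Promise<unknown[]>;
  list(filter?: Record<string, string>): Promise<Array<{ docId: string; key: string }>>;
  inject(terms: Record<string, string>, workspace: string): Promise<void>;
}

// Hypothetical registry mirroring the sync worker's engines map:
// adding a backend means implementing EngineHandler and registering a key.
const engines = new Map<string, EngineHandler>();

function registerEngine(key: string, handler: EngineHandler): void {
  engines.set(key, handler);
}
```

A new backend is then reachable at its routing key (e.g., /sync/lightrag) without any other component knowing its storage architecture.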

Initial Phase: Multi‑Backend Benchmarking

The initial phase deploys and evaluates multiple GraphRAG implementations to inform the long‑term backend strategy:

| Backend | Routing Key | Architecture | Strengths | Evaluation Focus |
|---|---|---|---|---|
| LightRAG | /sync/lightrag | PostgreSQL + pgvector + Apache AGE | Single‑DB ACID consistency, Louvain community detection, six query modes | Baseline: entity extraction quality, query latency, operational simplicity |
| Qdrant‑based | /sync/qd | Qdrant vector DB + external graph store | High‑performance vector search, horizontal scaling, cloud‑native | Vector recall vs. graph‑aware recall, ingestion throughput |
| Neo4j‑based | /sync/neo4j | Neo4j native graph + separate vector index | Native graph traversal (Cypher), mature ecosystem, property graph model | Multi‑hop reasoning quality, graph query expressiveness |

Each backend ingests the same corpus through the same sync pipeline. The Verifier benchmarks query quality across modes (Local, Global, Hybrid), and the Planner learns to route questions to the most appropriate backend based on question type and observed performance.

Common Entity Extraction Pipeline

Regardless of backend, all GraphRAG implementations share a common conceptual ingestion pipeline:

  1. Chunking — documents are split into semantically coherent segments with configurable overlap.
  2. Entity Extraction — an LLM parses each chunk into (entity, type, description) triples and (source, target, keywords, description) relationship tuples.
  3. Gleaning — an optional second pass catches entities missed in the first extraction.
  4. Normalization — case normalization and description merging reduce duplicate entities.
  5. Embedding — document chunks, entity descriptions, and relationship descriptions each receive vector embeddings.
  6. Community Detection — graph clustering (e.g., Louvain, Leiden) groups related entities for thematic, high‑level queries.

Each backend implements this pipeline according to its own storage and compute architecture. The common interface ensures that the Planner, Verifier, and Generator need not know which backend produced a given result.
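Steps 2 and 4 of the pipeline above can be sketched as follows. The record shapes follow the (entity, type, description) and (source, target, keywords, description) tuples in the text; the case‑folding and description‑merging strategy is an assumption:

```typescript
// Record shapes from the extraction step (step 2 above).
interface EntityRecord { name: string; type: string; description: string; }
interface RelationRecord { source: string; target: string; keywords: string[]; description: string; }

// Normalization (step 4): case-fold entity names and merge descriptions
// of duplicates so the graph does not accumulate near-identical nodes.
function normalizeEntities(records: EntityRecord[]): EntityRecord[] {
  const merged = new Map<string, EntityRecord>();
  for (const r of records) {
    const key = r.name.toLowerCase();
    const existing = merged.get(key);
    if (existing) {
      // Append only genuinely new description text.
      if (!existing.description.includes(r.description)) {
        existing.description += " " + r.description;
      }
    } else {
      merged.set(key, { ...r, name: key });
    }
  }
  return [...merged.values()];
}
```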

Query Modes

The EngineHandler interface exposes a standard set of retrieval strategies that all backends must support:

| Mode | Strategy | Best For |
|---|---|---|
| Naive | Pure vector similarity on document chunks | Simple keyword‑like lookups |
| Local | Entity‑centric: keywords → entities → local graph neighborhood | Specific entity questions |
| Global | Relationship‑centric: keywords → relationships → global graph context | Thematic, high‑level surveys |
| Hybrid | Parallel Local + Global, merged context | Complex cross‑domain reasoning |
| Mix | Weighted combination of all retrieval sources | Multi‑faceted queries |
| Bypass | Direct LLM call without retrieval | General questions |

The Planner selects the appropriate mode based on question type. The multi‑backend architecture allows mode performance to be compared across implementations: a Hybrid query might be faster on the Qdrant‑based backend while a Global query yields richer results on the Neo4j‑based one.
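Before any learned routing policy exists, the Planner's mode selection could start from a simple heuristic like the sketch below. The mode names match the table above; the keyword patterns are assumptions, not part of the design:

```typescript
type QueryMode = "naive" | "local" | "global" | "hybrid" | "mix" | "bypass";

// Hypothetical baseline heuristic: route by surface cues in the question.
// A trained Planner would replace this with a learned policy.
function selectMode(question: string): QueryMode {
  const q = question.toLowerCase();
  if (/\b(overview|themes?|survey|landscape)\b/.test(q)) return "global"; // thematic surveys
  if (/\b(what is|who is|define)\b/.test(q)) return "local";              // entity lookups
  if (/\b(compare|relate|impact|across)\b/.test(q)) return "hybrid";      // cross-domain reasoning
  return "mix";                                                            // default: blend sources
}
```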

Performance Benchmarking Framework

The initial phase includes a benchmarking framework that evaluates each backend on:

| Metric | Description | Target |
|---|---|---|
| Entity extraction quality | Precision/recall against gold‑standard annotations | F1 ≥ 0.85 |
| Query latency (Hybrid) | End‑to‑end response time for complex queries | < 1 s |
| Ingestion throughput | Documents processed per minute | > 10 docs/min |
| Storage efficiency | Disk footprint per document | < 5 MB/doc |
| Concurrent query capacity | Simultaneous queries before degradation | > 50 |

Results inform not only backend selection but also the Planner’s learned routing policy — a question type that consistently yields better results from a specific backend will bias the Planner toward that backend.
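The entity‑extraction quality metric from the table can be computed as below. This is a minimal sketch assuming extracted and gold entities are compared as normalized string sets; the actual benchmarking framework may use fuzzier matching:

```typescript
// F1 of predicted entity names against gold-standard annotations.
// Set-equality matching is an assumption; real harnesses often allow
// fuzzy or type-aware matches.
function extractionF1(predicted: Set<string>, gold: Set<string>): number {
  if (predicted.size === 0 || gold.size === 0) return 0;
  let tp = 0;
  for (const e of predicted) if (gold.has(e)) tp++;
  const precision = tp / predicted.size;
  const recall = tp / gold.size;
  if (precision + recall === 0) return 0;
  return (2 * precision * recall) / (precision + recall);
}
```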

Layer 2: Artifact Storage & Ingestion Pipeline — Cloudflare R2 + Worker

Rather than uploading documents directly to a GraphRAG engine from CI/CD (which would couple build pipelines to a specific running instance), Nexus introduces an intermediate artifact store and a synchronization worker.

Why a Decoupled Pipeline?

Direct uploads create several problems:

  1. CI coupling: every docs or code build must reach a specific GraphRAG instance. If the instance is unavailable, the build fails.
  2. No change detection: the GraphRAG engine cannot tell whether a re‑uploaded document has actually changed; every upload triggers LLM extraction.
  3. Multiple sources are hard to merge: core documentation, POC code artifacts, and user‑injected documents come from different places and must converge on one KG — potentially across multiple backends.

The decoupled pipeline addresses all three by establishing a single source of truth (R2) and a dedicated sync process (Worker).

R2: Strongly Consistent Object Store

Cloudflare R2 (bucket: ssccs-nexus-af) serves as the artifact repository. Key properties for Nexus:

  • Strong consistency on the S3 API — unlike AWS S3 before 2020, R2 guarantees read‑after‑write consistency, meaning that once PutObject returns, the object is immediately visible to the sync worker. This eliminates race conditions during incremental updates.
  • S3‑compatible API — supports GetObject, PutObject, DeleteObject, and ListObjectsV2, enabling direct use of aws s3 sync from CI without custom tooling.
  • Zero egress fees — no cost for transferring data from R2 to the Worker or GraphRAG backends, even across regions.
  • Free tier: 10 GB storage, 1 million write operations/month, 10 million read operations/month — well within SSCCS’s expected usage.

The Sync Worker

The sync worker (nexus-sync-worker, TypeScript) exposes a single endpoint, POST /sync/:engine, and performs incremental synchronization between R2 and the target GraphRAG engine.

Internal Architecture:

  1. Authentication — the request must carry Authorization: Bearer <SYNC_API_KEY>. The API key is stored as a Cloudflare Secret, never appearing in source code or configuration files.
  2. Engine Routing — the URLPattern API matches /:engine from the path. Each registered engine key maps to an EngineHandler implementation:
    • lightrag → LightRAGHandler — PostgreSQL‑based GraphRAG backend
    • qd → QdrantHandler — Qdrant vector DB + external graph backend
    • neo4j → Neo4jHandler — Neo4j native graph backend
    Adding a new engine requires implementing the EngineHandler interface and registering it in the engines map. The same R2 artifact can be synced to multiple engines simultaneously by issuing POST /sync/lightrag and POST /sync/qd.
  3. Diff Computation — the producer compares two data sets:
    • R2 inventory: all objects in the bucket, each with an ETag (MD5 hash of content).
    • KV mapping: a persistent record of {engine}/{r2_key: {doc_id, etag}} stored in Cloudflare KV, partitioned by engine.
    The diff produces three task lists per engine: deletions (in KV but not in R2), uploads (in R2 but not in KV, or ETag mismatch), and no‑ops (ETag match).
  4. Queue‑Based Processing — tasks are chunked (≤10 items per message) and pushed to a Cloudflare Queue (nexus-sync-queue). This is the critical design element that avoids the free‑plan subrequest limit (50 external fetch calls per worker invocation).
  5. Consumer Processing — queue consumers receive chunks and execute the actual GraphRAG engine API calls (DELETE then POST for modified files) against the target backend. After each successful upload, the KV mapping is updated with the new document ID and ETag for that engine.

Queue Consumer Concurrency: Cloudflare Queues support automatic horizontal scaling. Consumers are invoked in parallel when the backlog grows, up to the platform maximum — typically 250 concurrent invocations. This means a large sync (hundreds of changed documents) will scale out automatically without any configuration changes.
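The diff computation in step 3 can be sketched as follows. The three task lists match the text (deletions, uploads, no‑ops); the record shapes are assumptions:

```typescript
// R2 inventory entry: object key plus ETag (MD5 of content).
interface R2Object { key: string; etag: string; }
// KV mapping value: GraphRAG document ID plus the ETag last synced.
interface KvEntry { docId: string; etag: string; }

// Step 3: compare R2 inventory against the per-engine KV mapping.
function computeDiff(
  r2: R2Object[],
  kv: Map<string, KvEntry>, // r2_key -> { doc_id, etag }
): { deletions: string[]; uploads: string[]; noops: string[] } {
  const r2Keys = new Map(r2.map((o) => [o.key, o.etag]));
  // In KV but no longer in R2: delete from the engine.
  const deletions = [...kv.keys()].filter((k) => !r2Keys.has(k));
  const uploads: string[] = [];
  const noops: string[] = [];
  for (const [key, etag] of r2Keys) {
    const entry = kv.get(key);
    if (entry && entry.etag === etag) noops.push(key); // unchanged: skip LLM extraction
    else uploads.push(key); // new or modified: re-ingest
  }
  return { deletions, uploads, noops };
}
```

The resulting task lists are then chunked (≤10 items) and pushed to the queue, as described in step 4.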

Figure 2: Artifact ingestion pipeline

Design Decision: KV for Mapping

Workers KV is eventually consistent: changes made in one edge location may take up to 60 seconds to propagate globally. However, all sync worker invocations for our architecture are triggered from a single entry point per engine (e.g., /sync/lightrag), and the producer writes the mapping and immediately uses it within the same invocation — read‑your‑own‑writes consistency is guaranteed. The eventual consistency of KV is therefore not a practical concern for this design: we never rely on multiple concurrent producers writing the same mapping keys.

Layer 3: Agentic Research Loop — Planner, Executor, Verifier, Generator

The agentic loop is the operational brain of Nexus. It is modeled after AgentFlow’s four‑component architecture but adapted for the research domain with KG‑grounded verification.

Agent Flow

Figure 3: Nexus as the realization of SSCCS Organic Growth

The architecture compresses the Organic Growth model into a concrete, implementable pipeline while preserving its foundational insight: knowledge grows organically when governed by explicit, evolvable contracts — and when the system that processes it can learn from its own discoveries.

Planner: The Only Trainable Component

The Planner is an LLM — initially a frozen instruct model (Qwen2.5‑7B or equivalent), later fine‑tuned via Flow‑GRPO. Its responsibilities:

  • Decompose a research question into sub‑goals.
  • Select the appropriate GraphRAG query mode and backend (hybrid for complex reasoning, local for entity lookup, global for thematic surveys).
  • Decide when to crawl external sources or execute Python code.
  • Evaluate intermediate results and decide whether evidence is sufficient to terminate.

The Planner is trained on trajectories collected from actual research sessions — successful hypothesis chains, tool sequences, and human feedback ratings all become training signals.

Executor: Tool Orchestration

The Executor invokes registered tools on the Planner’s behalf and returns structured results to the Evolving Memory. Tool examples:

  • search_kg(query, mode, backend?) — calls the GraphRAG engine’s /query with the specified mode, optionally routing to a specific backend.
  • crawl_external(url) — fetches and uploads external documents to R2 (and thus into the KG).
  • run_python(code) — executes computational snippets.
  • generate_hypothesis_step(context) — uses a frozen LLM to propose candidate hypothesis links.
  • feedback(score, comment) — prompts the researcher for evaluation, which feeds into the reward signal.
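A minimal sketch of the Executor's dispatch table, assuming tools are registered by name and return structured results for the Evolving Memory. The tool names match the list above; the stub implementations are placeholders:

```typescript
// A tool takes structured arguments and returns a structured result.
type Tool = (args: Record<string, unknown>) => Promise<unknown>;

// Hypothetical registry; real tools would call GraphRAG, R2, etc.
const tools = new Map<string, Tool>();
tools.set("search_kg", async ({ query, mode }) => ({ query, mode, hits: [] }));
tools.set("run_python", async ({ code }) => ({ code, stdout: "" }));

// The Executor invokes a tool on the Planner's behalf and returns the
// structured result, which is appended to the Evolving Memory.
async function execute(name: string, args: Record<string, unknown>): Promise<unknown> {
  const tool = tools.get(name);
  if (!tool) throw new Error(`unknown tool: ${name}`);
  return tool(args);
}
```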

Verifier: KG‑Grounded Evaluation

The Verifier acts as both the HypoChainer‑style validation engine and the Story2Proposal evaluation agent. For each hypothesis step, it:

  1. Queries the GraphRAG layer (Hybrid or Local mode) for entities and relationships semantically aligned with the claim.
  2. Computes a support score: fraction of steps that have at least one grounded KG path.
  3. Computes a novelty score: counts how many new KG edges would be created if the hypothesis were validated — bridging previously disconnected entity clusters earns higher novelty.
  4. Checks the contract.json rules (required sections, citation formats, novelty thresholds).
  5. Outputs a structured feedback signal (weak steps, missing evidence) and a binary termination flag.
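Steps 2 and 3 above can be sketched directly: support is the fraction of steps with at least one grounded KG path, and novelty counts edges the chain would add that the graph does not yet contain. The record shapes and edge encoding are assumptions:

```typescript
// One step of a hypothesis chain, as the Verifier sees it.
interface HypothesisStep {
  claim: string;
  groundedPaths: number;                    // KG paths supporting the claim
  proposedEdges: Array<[string, string]>;   // edges created if validated
}

// Step 2: fraction of steps with at least one grounded KG path.
function supportScore(chain: HypothesisStep[]): number {
  if (chain.length === 0) return 0;
  return chain.filter((s) => s.groundedPaths > 0).length / chain.length;
}

// Step 3: count proposed edges not already present in the graph.
// Bridging previously disconnected clusters yields more novel edges.
function noveltyScore(chain: HypothesisStep[], existingEdges: Set<string>): number {
  let novel = 0;
  for (const s of chain)
    for (const [a, b] of s.proposedEdges)
      if (!existingEdges.has(`${a}->${b}`)) novel++;
  return novel;
}
```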

Generator: Contract‑Compliant Output

When the Verifier signals termination, the Generator produces:

  • A hypothesis chain diagram with confidence scores.
  • An evidence table (KG‑grounded sources per step).
  • Gap analysis and proposed experiments.
  • A Quarto‑ready .qmd file.

A Reporter agent performs a final cross‑check against contract.json before the file is committed. If contract rules fail, the Generator is re‑invoked with the error report.

Evolving Memory

All research activity is recorded in an append‑only Evolving Memory (JSONL files). Each session logs the query, every planning turn (tool choice, arguments, results, verifier feedback), the final output, and any human feedback. These trajectories are the raw material for RL training.

Layer 4: Learning Loop — Flow‑GRPO

The collected trajectories feed a Flow‑GRPO pipeline that refines the Planner:

  1. Rollout Collection: each completed session is stored as memory/session_<timestamp>.jsonl.
  2. Reward Computation: the final reward blends KG‑support score, novelty score, contract compliance (binary), and human feedback (score normalized to [0,1]).
  3. Group Sampling: trajectories are batched (typically 8 per group).
  4. Advantage Calculation: group‑normalized advantages (as in GRPO) stabilize training.
  5. Policy Update: PPO‑style clipped objective with KL penalty toward a frozen reference model. Because the reward is broadcast to all steps of a trajectory, the multi‑turn credit assignment problem is decomposed into single‑turn updates.

Training can run on a consumer GPU (7B model) or be deferred to cloud resources. The Planner checkpoint is versioned and stored alongside the repository.
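Steps 2 and 4 of the pipeline can be sketched as below. The blend weights are assumptions (the text specifies the components but not their weighting), and novelty is assumed pre‑normalized to [0, 1]:

```typescript
// Step 2: blend KG-support, novelty, contract compliance (binary), and
// human feedback (already in [0,1]) into one scalar reward.
// The 0.4/0.2/0.2/0.2 weights are illustrative assumptions.
function blendReward(support: number, novelty: number, contractOk: boolean, human: number): number {
  return 0.4 * support + 0.2 * novelty + 0.2 * (contractOk ? 1 : 0) + 0.2 * human;
}

// Step 4: GRPO-style group-normalized advantages. Each trajectory's
// advantage is its reward standardized against its sampling group,
// which stabilizes training without a learned value function.
function groupAdvantages(rewards: number[]): number[] {
  const mean = rewards.reduce((a, b) => a + b, 0) / rewards.length;
  const variance = rewards.reduce((a, r) => a + (r - mean) ** 2, 0) / rewards.length;
  const std = Math.sqrt(variance) || 1; // guard against zero-variance groups
  return rewards.map((r) => (r - mean) / std);
}
```

Because the blended reward is broadcast to every step of a trajectory, each turn receives the same advantage, which is what decomposes the multi‑turn credit assignment into single‑turn updates.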

Layer 5: Contract Governance (Future)

The SSCCS Organic Growth diagram in index.qmd envisions a system where Contract‑Governed Ingestion feeds a Unified Knowledge Graph, driving Hypothesis Generation and Validation. Nexus currently implements this as contract.json — a JSON schema that defines required hypothesis steps, evidence thresholds, novelty minimums, and report section requirements. As the project matures, the contract can expand to enforce:

  • Cryptographic provenance chains (C2PA manifests for generated artifacts).
  • Mandatory code‑documentation traceability for every new hypothesis.
  • Formal verification of experimental proposals against KG evidence.

Because the contract is versioned, machine‑readable, and enforced at both the Verifier and Reporter layers, it provides an evolvable governance surface without requiring architectural changes.
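A hypothetical contract.json fragment illustrating the fields named above (required sections, evidence thresholds, novelty minimums). All field names and values here are assumptions for illustration, not the actual schema:

```json
{
  "version": "0.1.0",
  "required_sections": ["hypothesis_chain", "evidence_table", "gap_analysis", "proposed_experiments"],
  "evidence": { "min_support_score": 0.8, "min_grounded_paths_per_step": 1 },
  "novelty": { "min_new_edges": 1 },
  "citations": { "format": "quarto", "require_kg_source": true }
}
```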

Component Interaction Matrix

| Component | GraphRAG | R2 | Sync Worker | Planner | Verifier | Generator |
|---|---|---|---|---|---|---|
| GraphRAG | — | | ← syncs | ← queries | ← grounds | |
| R2 | | — | ← reads | | | |
| Sync Worker | → DELETE/POST | → list/get | — | | | |
| Planner | → queries | | | — | → delegates | |
| Verifier | → hybrid queries | | | ← receives | — | → signals |
| Generator | | | | | ← triggered | — |

Strategic Value & Open‑Source Alignment

  • No lock‑in: every GraphRAG backend is replaceable; the engine‑agnostic interface isolates the system from any particular implementation. R2 and Workers are replaceable with any S3‑compatible store and any serverless function platform.
  • Free tier viability: the entire stack operates within Cloudflare’s free tier (R2: 10 GB, 1M writes/month; Workers: 100k requests/day; Queues: included) and OCI’s Always Free tier.
  • Research‑first design: optimized for the academic exploration cycle — hypothesize, validate, publish — rather than for commercial RAG benchmarks.
  • Autonomous learning: among the first applications of on‑policy reinforcement learning to scientific knowledge management.

Integration with AWS Ecosystem as a Strategic Collaboration

Project Nexus serves as the knowledge backbone for the broader SSCCS × AWS Strategic Collaboration Roadmap, which outlines autonomous compiler optimization and pre‑silicon hardware emulation workstreams. The Nexus agentic loop and Graph‑RAG infrastructure are directly applicable to the experiments proposed in that roadmap:

  • AI‑Driven Compiler Optimization on Graviton: Autonomous agents, powered by the Planner and Verifier, can explore compiler pass sequences, evaluate metrics (CEI, FGR, buffer reduction), and store experimental results in the GraphRAG knowledge graph. This accelerates the closed‑loop tuning cycle described in the roadmap.
  • Pre‑Silicon Hardware Emulation on AWS HPC: The same hypothesis‑generation and validation pipeline can manage the massive parameter space of structural observations across simulated HPC environments, tracking determinism, energy efficiency, and latency. Results from these emulations feed back into the knowledge graph, enabling cross‑domain discovery between software and hardware research.

The ingestion pipeline ensures that benchmark data, compiler configurations, and emulation traces are continuously integrated with Nexus, turning every AWS experiment into a reusable knowledge asset.

Reference Papers and Solutions

Below is a comprehensive list of key papers and open‑source solutions related to knowledge graph construction, incremental KG building, autonomous research, document processing, RAG systems, and community‑driven curation platforms.


Knowledge Graph Construction & Processing

| Item | Type | Description |
|---|---|---|
| AutoSchemaKG / ATLAS (arXiv:2505.23628) | Paper / Solution | Fully autonomous KG construction without predefined schemas. Two‑stage pipeline: triple extraction + schema induction via conceptualization. Produces ATLAS family of billion‑node KGs (Wiki, academic papers, Common Crawl). |
| Wikontic (EACL 2026) | Paper | Wikidata‑aligned, ontology‑aware KG construction from open‑domain texts. Extracts triplets with qualifiers, enforces type/relation constraints, normalises entities. State‑of‑the‑art information retention (86% on MINE‑1), compact build (<1k tokens). |
| Darth Vecdor (arXiv:2512.15906) | Paper / Solution | LLM‑based system that extracts structured, terminology‑mapped knowledge into SQL databases (KGs). Built for healthcare and high‑volume operations where cost, speed, and safety matter. Browser‑based GUI included. |
| Graphify (also hub.baai.ac.cn) | Solution | Zero‑config, all‑modal knowledge graph compiler. Local AST parsing (tree‑sitter) + parallel LLM sub‑agents for semantic extraction. 71.5× token savings vs raw files. Outputs interactive HTML + analysis report. Inspired by Karpathy's LLM Wiki pattern. |
| Graphiti by Zep AI | Solution | Python library for building temporal KGs that evolve over time. Tracks fact/relationship lifecycles, episodic processing, hybrid search (semantic + BM25). Designed for dynamic data like conversation histories. |
| Docling‑Graph | Solution | Turns PDFs, images, Markdown, Office files into validated Pydantic objects and directed KGs. Supports local VLM or LLM extraction (LiteLLM). High precision for chemistry, finance, legal domains. |
| Ontology2Graph | Solution | Generates synthetic KGs from ontological schemas using LLMs. Includes quality assurance, interactive visualisation, graph merging, and key performance indicators. BSD 4‑Clause licence. |
| sift‑kg | Solution | CLI‑first tool: drop documents → browsable KG in minutes. Schema‑free, any LLM provider (incl. Ollama), human‑in‑the‑loop entity deduplication, interactive viewer with community regions. |
| llm4s Knowledge Graph Module | Specification | Schema‑guided extraction and incremental multi‑document graph building. Entity deduplication, source tracking, coreference resolution. Planned for Scala 2.13/3.x. |
| Frugal KG Construction with Local LLMs (arXiv:2604.11104) | Paper | Zero‑shot, multi‑model pipeline running entirely on consumer hardware (RTX 3090). Achieves F1 0.70 on DocRED, 0.80 text‑to‑query accuracy, 0.96 faithfulness on multi‑hop QA. |
| m-flow | Solution | GraphRAG system that replaces similarity‑based retrieval with graph‑path reasoning. Propagates evidence along chains of relevance, scoring by connectedness. |
| Open-Knowledge-Base | Solution | AI workflow to structure and atomize scientific knowledge (math, biochemistry). Uses PaddleOCR for figures/tables, AI rewriting to avoid copyright issues. |
| CollabNext | Solution | Researcher collaboration recommender system using OpenAlex + Neo4j. Matches by topic, institution, community. Contributes to NSF Proto‑OKN. |
| omop-graph | Solution | Lightweight KG layer on top of OMOP medical data standard. Graph navigation, path finding, ranking, explainability. Optimised for Jupyter and reproducible research. |

Autonomous Research & Hypothesis Generation

| Item | Type | Description |
|---|---|---|
| AgentFlow: In-the-Flow Agentic System Optimization for Effective Planning and Tool Use (arXiv:2510.05592) | Paper / Solution | Trains a Planner inside a modular multi‑turn agentic loop via Flow‑GRPO, learning long‑horizon tool‑use strategies from outcome rewards; avoids the monolithic single‑policy approach, which scales poorly with long horizons and diverse tools and generalizes weakly to new scenarios. |
| Deep Researcher Agent (arXiv:2604.05854) | Paper / Solution | End‑to‑end framework for 24/7 autonomous deep learning experiments: hypothesis formation → code implementation → training → analysis → refinement. Zero‑cost monitoring, capped memory, 73% token reduction. |
| Karpathy Autoresearch | Solution | Minimal (630 lines) Python framework that lets an AI agent autonomously design, run, and evaluate ML experiments overnight on a single GPU. 5‑minute fixed budget per experiment, 100+ experiments per night. |
| InternAgent 1.5 | Solution | Unified agentic framework for long‑horizon autonomous scientific discovery across physical, biological, earth, and life sciences. Three subsystems: Generation (hypothesis), Verification (evaluation), Evolution (memory‑driven refinement). Top performance on GAIA, GPQA, FrontierScience. |
| MiroThinker‑1.7 / H1 (arXiv:2603.15726) | Paper / Model | Research agents for complex long‑horizon reasoning. H1 incorporates local/global verification into reasoning trajectories. Open‑source models (1.7 and mini) with strong efficiency. |
| Firecrawl Web Agent | Solution | Open‑source foundation for building autonomous web research agents. Based on LangChain Deep Agents, includes skills, sub‑agents, structured output, and streaming. |
| Sirchmunk | Solution | Embedding‑free, agentic search engine. Indexless retrieval (ripgrep‑all), self‑evolving knowledge clusters, Monte Carlo evidence sampling, ReAct agent fallback. Real‑time streaming chat and knowledge cluster browsing. |
| Idea2Story (arXiv:2601.20833) | Paper | Pre‑computation based autonomous scientific discovery. Extracts methodology units into a methodological KG, then navigates at runtime to formulate research plans. |
| HypoChainer (arXiv:2507.17209) | Paper | LLM + KG + human expert collaborative hypothesis generation. Three‑stage: context exploration → hypothesis chain → validation prioritisation. |
| MIND (arXiv:2604.13699) | Paper | AI co‑scientist for materials research. Multi‑agent closed‑loop: hypothesis refinement → MLIP simulation → discussion‑based verification. |
| Theorizer (Ai2) | Solution | Automatically generates theories in ⟨law, scope, evidence⟩ structure from thousands of papers. Uses multiple LLMs. |
| ResearchEVO (arXiv:2604.05587) | Paper | End‑to‑end automatic scientific discovery (discovery→explanation). LLM‑based code exploration + RAG paper generation. Zero‑shot citation. |
| Super Research (arXiv:2603.03623) | Paper | Autonomous research framework for complex questions. Structured decomposition, broad search, iterative querying. |
| HypoExplore (arXiv:2604.08528) | Paper | Agent framework modelling neural architecture discovery as hypothesis‑driven exploration. |

Graph‑Enhanced RAG & Question Answering

| Item | Type | Description |
|---|---|---|
| LightRAG (EMNLP 2025) | Solution | Simple and fast GraphRAG framework. Integrates Neo4j, OpenSearch, MongoDB, PostgreSQL. Supports reranking, multimodal processing (RAG‑Anything), document deletion with automatic KG regeneration. Web UI included. |
| Rce-KGQA (arXiv:2110.12679) | Paper | Improves embedded knowledge graph multi‑hop question answering by introducing relational chain reasoning. |
| HiGraAgent (ACL 2026) | Paper | Hierarchical Knowledge Graph (HiGra) with entity alignment (34.5% redundancy reduction). Hybrid graph‑semantic retriever + dual‑agent adaptive reasoning (Seeker/Librarian). 85.3% average accuracy on HotpotQA, 2WikiMultihopQA, MuSiQue. |
| GraphRAG spec (llm4s) | Specification | Community detection (Leiden), hierarchical summaries, global + local search, hybrid vector‑graph retrieval. Integration plan for existing RAG pipelines. |
| CDRAG (Clustered Dynamic RAG) | Solution | LLM‑guided cluster‑aware retrieval for legal QA. Pre‑clusters corpus, extracts keywords per cluster, routes queries to relevant clusters. Outperforms top‑K RAG on faithfulness (+0.51) and overall quality (+0.34). |
| BioGraphletQA | Solution | Graphlet‑anchored generation of complex, factually grounded QA data from KGs. Controls complexity and ensures grounding of LLM‑generated questions. |

Incremental & Temporal Knowledge Graphs

| Item | Type | Description |
|---|---|---|
| Graphiti (Zep AI) | Solution | (listed above) Temporal KG with episodic processing, fact lifecycle tracking, hybrid search. |
| llm4s incremental building | Specification | Schema‑guided extraction and incremental multi‑document graph building. Entity deduplication, source tracking. |

Lightweight RAG & Document Processing

| Item | Type | Description |
|---|---|---|
| QuantumRAG | Solution | Zero‑config RAG engine. Deeply indexes documents via multiple lenses (semantic, hypothetical questions, keywords, entity relationships). 176 scenario tests, multi‑hop reasoning, cross‑document verification. Korean support. |
| VerifAI (arXiv:2604.08549) | Paper / Solution | Biomedical QA with post‑hoc claim verification. Decomposes answers into atomic claims, validates via fine‑tuned NLI engine. Outperforms GPT‑4 on HealthVer, reduces hallucinated citations. |
| Mentat | Solution | Strategic retrieval: extracts logical structure (ToC, hierarchy, metadata) without LLM overhead. Two‑step protocol (find → read) saves 85% tokens vs naive RAG. Pluggable architecture for vector stores, embedding models, lifecycle hooks. |
| haldy | Solution | GraphRAG for git history. Builds KG from commits, authors, files. MCP tools for finding experts, tracing decisions, measuring coupling. Querying always free: the agent does reasoning using its own tokens. |

Knowledge Graph Visualization

| Item | Type | Description |
|---|---|---|
| Context‑KG (arXiv:2604.10384) | Paper | Context‑aware KG visualisation using LLMs to extract user preferences. Ontology‑guided, semantics‑aware layout. Generates high‑level insights beyond traditional methods. |
| papergraph | Solution | CLI tool that discovers academic papers, traces citation networks, computes text similarity, runs graph algorithms, and produces explorable visualisations. |
| graphwiki | Solution | TypeScript‑based KG with persistent wiki compilation, dual‑transport MCP (stdio + HTTP), AST + embedding deduplication. Context Loading Protocol for token‑efficient retrieval. |

Curation & Community‑Driven Knowledge Hubs

| Item | Type | Description |
|---|---|---|
| ML+X Nexus | Curation Platform | Community‑curated hub for ML/AI resources. Four sections: Learn, Toolbox, Stories, Projects. Crowdsourcing‑based, evolves continuously. (GitHub) |
| Paper Circle | Solution | Multi‑agent academic literature discovery and analysis. Paper search → KG construction → graph‑based QA. ACL 2026 Oral. (GitHub) |
| OpenScholar (https://openscholar.ai/) | Solution | Searches 45M open‑access papers, generates answers with citations. Nature 2026. Human‑level citation accuracy. Fully open source. (GitHub) |
| Knoll | Paper / Solution | End‑user knowledge module creation for LLMs. Imports from web clippings, Google Docs, GitHub. Public deployment with 200+ users. (GitHub) |
| browzy.ai | Solution | Terminal‑based LLM personal knowledge base. Compiles articles, PDFs, images, web links into an interconnected KB. Local, no API keys required. |
| Memex | Solution | LLM‑maintained filesystem wiki. Knowledge accumulates and cross‑references in Markdown without RAG. Containerised execution. Inspired by Vannevar Bush. |
| Wikidata Embedding Project | Infrastructure | Public embeddings of Wikidata (119M items) for direct LLM use. MCP support. |

Knowledge Graph Evaluation & Validation

| Item | Type | Description |
|---|---|---|
| GPTKB v1.5 | Knowledge Base | Large‑scale KB fully extracted from GPT‑4.1: 100M triples, 6.1M entities. 10× cheaper than previous KBC projects. AAAI 2026. |
| MMKG-RDS | Paper / Solution | Multi‑modal KG reasoning data synthesis framework. 5 domains, 17 tasks, 14,950 samples. Fine‑tuned Qwen3 improves reasoning accuracy by 9.2%. |
| BioGraphletQA | Solution | Fact‑based QA generation anchored in KG graphlets. Controls complexity and ensures grounding. ECIR 2026. |
| SHAPR | Paper | Human‑AI collaborative research framework using Structured Knowledge Units (SKUs). Iterative cycle: Explore‑Build‑Use‑Evaluate‑Learn. |

RAG Tutorials & Learning Resources

| Item | Type | Description |
|---|---|---|
| HERE AND NOW AI 2026 Lab | Tutorial | Project‑based tutorials from foundational chatbots to advanced RAG. Local‑only stack (Ollama + LangChain), multimodal support, vector databases (FAISS). |
| Milvus + RustFS RAG Chatbot | Tutorial | Step‑by‑step build of a lightweight RAG chatbot using Milvus vector DB and RustFS object storage. FastAPI backend, Next.js frontend. |

Summary by Category

| Category | Paper Count | Solution Count | Key Keywords |
|---|---|---|---|
| Knowledge Graph Construction & Processing | 5 | 9 | AutoSchemaKG, Graphify, Graphiti, Docling‑Graph, sift‑kg, m-flow, Open‑Knowledge‑Base, CollabNext, omop‑graph |
| Autonomous Research & Hypothesis Generation | 10 | 6 | Deep Researcher, Autoresearch, InternAgent, Idea2Story, HypoChainer, MIND, Theorizer, ResearchEVO, Super Research, HypoExplore, MiroThinker, Firecrawl, Sirchmunk |
| Graph‑Enhanced RAG & QA | 3 | 4 | LightRAG, ProgRAG, HiGraAgent, CDRAG, BioGraphletQA |
| Incremental & Temporal KG | 0 | 2 | Graphiti, llm4s spec |
| Lightweight RAG & Document Processing | 1 | 4 | QuantumRAG, VerifAI, Mentat, haldy |
| KG Visualization | 1 | 3 | Context‑KG, papergraph, graphwiki |
| Curation & Community Hubs | 1 | 6 | ML+X Nexus, Paper Circle, OpenScholar, Knoll, browzy, Memex, Wikidata Embedding |
| KG Evaluation & Validation | 3 | 3 | GPTKB, MMKG-RDS, BioGraphletQA, SHAPR |
| Tutorials & Learning Resources | 0 | 2 | HERE AND NOW AI Lab, Milvus+RustFS |

Total unique entries (approximate): 24 papers + 39 solutions/tools = 63+ items.


© 2026 SSCCS Foundation — Open-source computing systems initiative building a computing model, software compiler infrastructure, and open hardware architecture.