Product Requirements Doc

The public PRD for Conduit: scope, rationale, and roadmap.

Simpleflo Conduit
Expanded title: Private Knowledge Base for AI Coding Tools
Status: Implemented (V1 launch)
Last updated: January 2026
Primary platform: macOS (Windows + Ubuntu supported)
Owner: AD

Conduit is the system I wanted to exist the first time I tried to “bring my docs to my AI tools” and realized I was about to build an accidental platform instead of doing my actual work.

This page is the public-facing PRD: it explains what Conduit is, what V1 actually ships, why the design looks the way it does, and what I’m intentionally not doing (yet).


Executive summary

Conduit V1 delivers a local-first private knowledge base that turns your documents into searchable, AI-accessible context—exposed to AI coding tools via MCP (Model Context Protocol).

The core thesis is simple:

Your docs + your AI tools + zero cloud dependency.

V1 prioritizes retrieval quality (hybrid search, optional knowledge graphs) and multi-client consistency over the original “connector marketplace” vision. That pivot was deliberate: the private KB use case is the highest-signal wedge, and it’s the foundation the broader “AI Intelligence Hub” can grow from.


The problem (as it shows up in real workflows)

AI coding tools are powerful, but they’re context-poor by default. Developers operate inside a private universe:

  • Architecture docs, runbooks, internal wikis
  • ADRs, design notes, meeting decisions
  • Conventions that matter (“we don’t do it that way here”)
  • A long tail of historical context (“we tried this already; it broke for reason X”)

Today, bringing that context to AI tools usually means one of these bad options:

  1. Manual uploads and copy/paste (fragile, repetitive, and easy to overshare)
  2. One-off RAG scripts (hard to maintain, not integrated into daily tools)
  3. “Just dump more context in the prompt” (context bloat → worse answers)

Conduit exists to make private context usable without turning the user into a retrieval engineer or a configuration janitor.


What Conduit is (V1)

Conduit is a local application that helps you:

  • Ingest private documents into a knowledge base (KB)
  • Retrieve with high quality using hybrid search (keyword + semantic + optional graph)
  • Expose that KB to AI coding tools via an MCP server
  • Configure multiple clients without hand-editing config files
  • Stay local-first (privacy by architecture, not by policy)

Evolution from the original vision (and why the pivot was correct)

The original Conduit vision had two big pillars:

  • A “connectivity hub” for third-party MCP servers (discover, install, sandbox, lifecycle-manage)
  • A private knowledge base that feeds the right context to AI tools on demand

V1 intentionally shipped the second pillar first.

Why? Because the private KB is the compounding asset:

  • It’s universal (everyone has docs).
  • It’s immediately useful (even before any marketplace exists).
  • It’s the “answer engine” layer that makes AI tools feel smarter without context bloat.

The connectivity hub remains on the roadmap—but it’s built on top of the same primitives: isolation, policies, adapters, and a user-owned local control plane.

If you’re curious about the original north-star vision, see:


Vision

Vision statement

Become the standard way developers bring private knowledge to AI coding tools: easy ingestion, sophisticated retrieval, secure local operation.

Product thesis

“Your docs + your AI tools + zero cloud.”


Goals and non-goals

V1 goals (achieved)

  1. Fast time-to-value: go from documents → working KB in minutes
  2. Sophisticated retrieval: hybrid search that beats naive “vector-only RAG”
  3. Multi-client support: configure once, use across 4+ AI clients
  4. Security by default: local-only operation + container isolation primitives
  5. CLI-first: power users get full control via the command line

V1 non-goals (explicitly deferred)

  • Connector marketplace / discovery
  • Third-party MCP server lifecycle management
  • Remote client support (e.g., ChatGPT Secure Link)
  • Enterprise RBAC / org-wide policy engines
  • Cloud sync or cloud storage of documents
  • Community trust scoring and auditing signals

Target users

Primary persona: developer using AI coding tools

Top needs:

  • Ingest docs without manual uploads
  • Query private docs inside the coding workflow
  • Consistent behavior across tools (Claude Code / Cursor / VS Code / Gemini CLI)
  • No cloud dependency for sensitive docs

Secondary persona: tech lead / architect

Top needs:

  • Bulk ingestion of doc repos (standards, runbooks, ADRs)
  • High-quality retrieval with citations
  • Minimal maintenance overhead
  • Works across the team’s tool choices

Core use cases (V1)

  1. Connect private docs → AI coding tool
    “Use my architecture docs when I ask Claude Code questions.”

  2. Cross-client parity
    “If my KB works in Claude Code, it should also work in Cursor and VS Code.”

  3. High-quality retrieval
    “Find relevant context even if my query uses different terminology than the docs.”

  4. Knowledge graph queries (optional)
    “Show me dependencies of AuthService and how they connect.”


End-to-end UX flows (how V1 is meant to feel)

Loop A — documents → private KB → AI tools

```shell
# 1) Install
conduit install

# 2) Add sources
conduit kb add ~/docs/project

# 3) Build KB (vectors + indexes)
conduit kb sync

# 4) Configure your AI client
conduit mcp configure --client claude-code

# 5) Ask questions in the AI tool
# Example: "What's the auth flow for service X?"
```

The point of this loop is that Conduit becomes an always-available context layer. You shouldn’t have to “re-explain your world” every time you open a tool.

Loop B — advanced retrieval (what’s happening under the hood)

Conduit doesn’t bet on a single retrieval technique. It composes several:

  • Full-text search for exact matches
  • Semantic search for conceptual similarity
  • Optional graph search for entity relationships
  • Fusion + reranking so the final context is tight and high-signal
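The fusion step above can be sketched in a few lines. This is an illustrative implementation of Reciprocal Rank Fusion (RRF), the technique the capability table names; the function name, document IDs, and the conventional constant `k=60` are assumptions for the example, not Conduit's actual API:

```python
# Sketch of Reciprocal Rank Fusion: merge several ranked result lists
# into one. Docs ranking well in multiple lists rise to the top.
def rrf_fuse(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) to the doc's fused score.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits  = ["auth.md", "runbook.md", "adr-007.md"]   # exact-match results
semantic_hits = ["design.md", "auth.md", "runbook.md"]    # embedding results
fused = rrf_fuse([keyword_hits, semantic_hits])
# "auth.md" wins: it ranks highly in both lists.
```

The useful property is that RRF needs only rank positions, never raw scores, so keyword and vector results can be merged without score normalization.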

Loop C — maintenance (should feel boring, in a good way)

```shell
# Add more docs anytime
conduit kb add ~/docs/new-feature

# Incremental update
conduit kb sync

# Health + stats
conduit doctor
conduit kb stats
conduit kb sources
```

What ships in V1 (scope)

Supported AI clients

| Client | Transport |
| --- | --- |
| Claude Code | stdio |
| Cursor | stdio |
| VS Code (Copilot/Cline) | stdio |
| Gemini CLI | stdio |

Supported document formats

30+ formats including Markdown, PDF, Word, Excel, PowerPoint, HTML, JSON/YAML, source code, and more.

Search capabilities

| Capability | Tech |
| --- | --- |
| Full-text search | SQLite FTS5 |
| Semantic search | Qdrant (768-dim) |
| Graph search | FalkorDB |
| Hybrid fusion | RRF |
| Diversity filtering | MMR |
| Semantic reranking | Top-30 → Top-10 |
| Query classification | 5 query types |

Why Conduit uses hybrid search (and not “just vector RAG”)

Vector search is great—right up until it isn’t.

The failure mode isn’t always “no results.” Often it’s worse: results that are plausible but not the right thing. That’s why Conduit blends:

  • Keyword precision (FTS5 catches exact strings and identifiers)
  • Semantic recall (Qdrant catches conceptual matches)
  • Graph structure (optional) (FalkorDB captures relationships when structure matters)

Then Conduit fuses and reranks so the AI tool receives the few chunks that matter, not a novel-length context blob.
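One piece of that "tight, high-signal" filtering is Maximal Marginal Relevance (MMR), the diversity filter listed in the capability table. Here is a toy sketch of the idea; the relevance scores and pairwise similarities are invented for illustration (Conduit's real implementation works over embedding vectors):

```python
# Toy MMR: pick n docs trading off relevance against redundancy.
# lam=0.7 weights relevance; (1 - lam) penalizes similarity to
# docs already selected. All numbers below are made up.
def mmr_select(relevance, similarity, n, lam=0.7):
    selected = []
    candidates = set(relevance)
    while candidates and len(selected) < n:
        def mmr_score(d):
            # Redundancy = similarity to the closest already-selected doc.
            redundancy = max(
                (similarity[frozenset((d, s))] for s in selected), default=0.0)
            return lam * relevance[d] - (1 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected

relevance  = {"a": 0.9, "b": 0.85, "c": 0.5}
similarity = {frozenset(("a", "b")): 0.95,   # a and b are near-duplicates
              frozenset(("a", "c")): 0.1,
              frozenset(("b", "c")): 0.1}
picked = mmr_select(relevance, similarity, n=2)
# "c" beats the more relevant "b" because "b" duplicates "a".
```

The payoff: the AI tool gets two distinct chunks instead of the same fact twice, which matters when the context budget is small.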

KAG (knowledge-augmented generation): when it’s worth it

KAG is not “RAG but fancier.” It’s a different bet: you’re investing in structure so multi-hop reasoning becomes auditable and repeatable.

Conduit supports building a domain knowledge graph by extracting entities and relationships from your documents. This is powerful—but it costs time, compute, and storage.

Rule of thumb: Use KAG only when you truly need structured, multi-hop, constraint-heavy reasoning over a stable domain.

Situations where KAG is justifiable:

  • Strong structure and stable ontology

    • Domain naturally expressed as entities/relations (drugs–conditions–contraindications, instruments–positions–counterparties, services–dependencies–SLOs)
    • You care about canonical IDs and consistency, and you’re willing to invest in that structure
  • Complex, multi-hop, constraint-heavy queries

    • Questions require chaining facts and constraints (joins/filters/constraints)
    • The path matters—not just “semantic similarity”
  • High demands for accuracy, explainability, auditability

    • Safety-critical domains where hallucinations are unacceptable
    • You need deterministic-ish reasoning traces that can be inspected, tested, and versioned
  • Entity-centric factual workloads (vs open-ended chat)

    • Most questions are about specific entities and relationships
    • You want professional-grade factuality (policy, guidelines, internal standards)
  • You can amortize graph cost across many uses

    • Same graph powers Q&A, monitoring, analytics, rule checks
    • You already have partial graphs (data catalogs, lineage graphs, product graphs)

Under these conditions, KAG’s up-front cost buys reduced hallucinations, better logical coherence, and stronger multi-hop reasoning.
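To make "multi-hop" concrete, here is a minimal sketch of the kind of query a knowledge graph answers well, echoing the "dependencies of AuthService" use case above. The entity and relation names are invented examples; Conduit stores the real extracted graph in FalkorDB:

```python
# Invented (entity, relation, entity) triples of the kind entity
# extraction produces from architecture docs.
edges = [
    ("AuthService", "DEPENDS_ON", "TokenService"),
    ("AuthService", "DEPENDS_ON", "UserDB"),
    ("TokenService", "DEPENDS_ON", "KeyVault"),
]

def transitive_deps(graph, start):
    """Walk DEPENDS_ON edges, returning every hop as an inspectable path."""
    paths, frontier = [], [(start, [start])]
    while frontier:
        node, path = frontier.pop()
        for src, rel, dst in graph:
            if src == node and dst not in path:   # skip cycles
                paths.append(path + [dst])
                frontier.append((dst, path + [dst]))
    return paths

paths = transitive_deps(edges, "AuthService")
# Includes the two-hop path AuthService -> TokenService -> KeyVault,
# a connection pure semantic similarity would likely miss.
```

Each returned path is a reasoning trace that can be inspected and tested, which is exactly the auditability argument made above.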

For the hands-on workflow and commands, see:

Requirements (dependency model)

Conduit is local-first, but it still needs a few components to do serious retrieval. By default, Conduit prefers Podman over Docker for a more secure-by-default posture, but Docker works fine too.

At a minimum you should expect:

  • A supported OS (macOS primary; Windows via WSL2; Ubuntu supported)
  • A container runtime: Podman preferred, Docker supported
  • Local services used by Conduit:
    • Qdrant (vector DB)
    • FalkorDB (graph DB for KAG)
    • Ollama (local models for embeddings + entity extraction)

This doc intentionally doesn’t duplicate installation steps. Use:

Success metrics

North star metric

Time-to-First-Query (TTFQ): time from conduit install to first successful KB query inside an AI client.

Target: < 10 minutes for a developer with existing documents.

Supporting metrics

  • Install success rate
  • Client configuration success rate
  • Query success rate + latency
  • Retrieval quality (relevance of returned results)
  • KB utilization (queries/day)
  • Document coverage (indexed vs failed)

Release history (how the product actually evolved)

  • V0 (Dec 2025): Core daemon + CLI, RuntimeProvider (Podman/Docker), KB ingestion, FTS5, client adapters, policy engine
  • V0.5 (Dec 2025): Qdrant + embeddings (Ollama), hybrid search (RRF), MMR diversity filtering
  • V1.0 (Jan 2026): FalkorDB + entity extraction (Mistral via Ollama), KAG queries, semantic reranking, query classification, desktop GUI (experimental)

Roadmap (what’s next)

V1.x — polish and stability

  • Consent ledger integration
  • OS keychain secrets manager
  • KB export/import
  • Better errors and diagnostics

V2.0 — connector ecosystem

  • Connector marketplace + discovery
  • Third-party MCP server management
  • Trust signals and community scoring
  • Secure Link for remote clients (e.g., ChatGPT)
  • Auditor / security scanning

V3.0 — enterprise

  • Team/org KB sharing
  • RBAC + admin controls
  • Optional cloud backup
  • Audit trails

Risks and mitigations (practical reality)

| Risk | Why it matters | Mitigation |
| --- | --- | --- |
| Large doc sets slow indexing | Users abandon before value | Incremental indexing, parallelization, clear progress |
| Ollama resource usage | Laptops aren't servers | Keep models optional, tune defaults, document tradeoffs |
| Container setup friction | Runtime issues block adoption | "doctor" checks + guided fixes + Podman-first defaults |
| Search quality variance | Trust dies fast | Hybrid retrieval + reranking + query classification |
| Client config formats change | Adapters break | Adapter abstraction + version detection + quick patches |

Glossary

  • MCP: Model Context Protocol (standard for AI tool integration)
  • KB: Knowledge Base (indexed document collection)
  • FTS5: SQLite full-text search extension
  • Qdrant: vector database used for semantic retrieval
  • FalkorDB: graph database used for KAG
  • RRF: Reciprocal Rank Fusion (combines result rankings)
  • MMR: Maximal Marginal Relevance (diversity filtering)
  • KAG: Knowledge-Augmented Generation (graph-enhanced retrieval)
  • RuntimeProvider: container runtime abstraction layer (Podman/Docker)

Deep dives (if you want the engineering details)