Evidence-first AI reliability, not autonomous remediation

HARP — Homelab & Hybrid AI Reliability Platform

A human-in-the-loop reliability platform that turns alerts, runbooks, RCAs, session logs, and monitoring context into cited evidence packs. HARP, Nero-Camp, OpenRAG, read-only MCP, LiteLLM, and n8n work together so AI tools can explain incidents while humans keep approval authority.

BM25 + hybrid search · Evidence packs with citations · Read-only MCP agent surface · OpenRAG semantic retrieval · LiteLLM model routing · Nero-Camp workflow & approval state · n8n workflow glue · Human approval before remediation · Eval harness + golden queries

Control-room architecture (read-only first)

HARP: BM25 · hybrid search · evidence packs · MCP
AI clients: Codex, Cursor, Claude Desktop
Signals: Prometheus, Alertmanager, Uptime Kuma
LiteLLM: model routing gateway
OpenRAG: semantic retrieval
Nero-Camp: workflow and approval state
Git + runbooks: RCAs, docs, sessions, evals

74 tests passing (Stage A–B test suite)
29 golden queries (eval harness corpus)
Current stage: D+H (OpenRAG hybrid + MCP)
MCP version: 0.5.0 (read-only tools live)

The reliability boundary is the product.

HARP is built around a clear split: evidence, workflow state, retrieval, model routing, event glue, and approvals each have one owner.

HARP is evidence

HARP turns Git-tracked runbooks, RCAs, session logs, alerts, metrics, and generated evidence packs into searchable reliability memory with citations and confidence signals.

MCP is read-only

AI tools can ask HARP for context through MCP tools and resources. They do not get shell access, Git writes, kubectl mutation, or remediation power.

Nero owns approval

Nero-Camp owns workflow state, approval, audit timeline, and the human decision record. HARP provides evidence; Nero records the operational decision.

What HARP is not.

The deliberate scope is as important as the feature set. These are hard boundaries, not roadmap gaps.

Not autonomous remediation

HARP does not restart services, apply patches, or execute kubectl mutations. Evidence-only means evidence-only.

Not a general chatbot

HARP has no broad operational credentials. Every answer must cite a repo-relative file path or live evidence.

Not a SaaS product

HARP is self-hosted, content-agnostic, and profile-driven. Point it at any Git runbook repo and it indexes that repo.

Not a replacement for monitoring

HARP does not replace Prometheus, Grafana, Alertmanager, or Argo CD. It is a reliability knowledge layer that sits alongside them.
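
The citation rule above ("every answer must cite a repo-relative file path") lends itself to a mechanical check. The following is a minimal sketch, not HARP's actual code: the function names and the example paths are hypothetical, and it covers only the file-path half of the rule, not live evidence.

```python
from pathlib import PurePosixPath


def is_repo_relative(path: str) -> bool:
    """True if the path is non-empty, not absolute, and never climbs out of the repo."""
    p = PurePosixPath(path)
    return bool(path) and not p.is_absolute() and ".." not in p.parts


def invalid_citations(citations: list[str], indexed_paths: set[str]) -> list[str]:
    """Return citations that are not repo-relative or do not point at an indexed file."""
    return [
        c for c in citations
        if not is_repo_relative(c) or c not in indexed_paths
    ]


# Hypothetical index and answer citations, for illustration only.
indexed = {"runbooks/dns.md", "rcas/2024-dns-outage.md"}
bad = invalid_citations(
    ["runbooks/dns.md", "/etc/passwd", "../secrets.md", "runbooks/missing.md"],
    indexed,
)
```

An answer would only be returned when `bad` is empty; anything else is treated as ungrounded.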

How to explain HARP.

01

The problem

Incident context is scattered across alerts, logs, runbooks, dashboards, and old RCAs. Pasting it by hand into AI chats gives the model weak, untraceable grounding.

02

The architecture

Signals enter HARP. HARP searches local BM25 and optional OpenRAG, merges results, and returns cited evidence instead of invented remediation.

03

The agent interface

MCP standardizes how Codex, Cursor, Claude Desktop, or reviewer agents ask HARP for tools, resources, and prompt templates.

04

The workflow layer

Nero-Camp consumes HARP evidence, updates task state, keeps approval history, and separates operator decisions from model output.

05

The enterprise lesson

The homelab shape maps to enterprise platform engineering: retrieval, model routing, agent contracts, workflow orchestration, governance, and evals.
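
The merge in step 02 (local BM25 plus optional OpenRAG results) can be illustrated with reciprocal rank fusion, which the component table names as HARP's hybrid merge. This is a minimal sketch under stated assumptions: the function name, the constant `k=60` (a common default for RRF), and the file paths are illustrative and not taken from `harp/core/`.

```python
from collections import defaultdict


def rrf_merge(rankings: list[list[str]], k: int = 60, top_n: int = 5) -> list[str]:
    """Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank of d)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; the doc id is a deterministic tie-break.
    return sorted(scores, key=lambda d: (-scores[d], d))[:top_n]


# Hypothetical ranked hits from each retriever.
bm25_hits = ["runbooks/dns.md", "rcas/2024-dns-outage.md", "runbooks/proxy.md"]
semantic_hits = ["rcas/2024-dns-outage.md", "sessions/dns-debug.md", "runbooks/dns.md"]

merged = rrf_merge([bm25_hits, semantic_hits])

# With the semantic backend unavailable, the BM25 order passes through unchanged.
bm25_only = rrf_merge([bm25_hits])
```

Because RRF works on ranks rather than raw scores, the BM25 and semantic lists need no score normalization, and the semantic list can simply be omitted when OpenRAG is not configured.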

Explain each part one by one.

HARP

Evidence

Search API, evidence renderer, MCP server, policy labels, eval harness, and OpenRAG connector.

OpenRAG

Semantic

Milvus-backed semantic retrieval for symptom-language discovery beyond exact keyword matches.

MCP

Agent API

A constrained tool/resource/prompt surface for AI clients that need context, not production power.

LiteLLM

Routing

Model gateway for local models or gated cloud providers such as AWS Bedrock when policy allows.

Nero-Camp

Approval

Task state, source records, timelines, approval gates, and human closeout decisions.

What I built vs. what I integrated.

A clear split between components I authored from scratch and components I selected, configured, and integrated into the architecture.

| Component | Role | Evidence |
| --- | --- | --- |
| HARP search engine | Built | Pure-Python BM25 + RRF hybrid merge in harp/core/ |
| Evidence pack generator | Built | Alert payload → cited Markdown with action tiers |
| MCP server | Built | FastMCP 3.x, 5 tools, 4 resources, 3 prompts |
| Eval harness | Built | 29 golden queries, MRR scoring, harp/evals/ |
| RAG export connector | Built | Provider-neutral interface with redaction layer |
| OpenRAG | Integrated | Self-hosted Milvus + BGE-small embeddings on OpenRAG VM |
| LiteLLM gateway | Integrated | OpenAI-compatible proxy; policy-gated cloud routing |
| Nero-Camp | Integrated | Workflow/approval state layer; HARP feeds evidence only |
| n8n | Integrated | Webhook fan-out, alert routing, low-risk event glue |
| Prometheus + Alertmanager | Integrated | Alert signal source; HARP reads but does not write |
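
The eval harness row mentions MRR scoring over golden queries. As a rough sketch of what that metric computes, assuming each golden query has exactly one expected document (an assumption, not something the page states), mean reciprocal rank looks like this. The queries and paths are hypothetical and the code is not taken from `harp/evals/`.

```python
def mean_reciprocal_rank(results: dict[str, list[str]], expected: dict[str, str]) -> float:
    """Average of 1 / rank of the expected document; a miss contributes 0."""
    if not expected:
        return 0.0
    total = 0.0
    for query, want in expected.items():
        for rank, path in enumerate(results.get(query, []), start=1):
            if path == want:
                total += 1.0 / rank
                break
    return total / len(expected)


# Hypothetical golden queries and search results.
golden = {
    "dns resolution failing": "runbooks/dns.md",
    "proxy returns 502": "runbooks/proxy.md",
    "disk full on nas": "runbooks/storage.md",
}
search_results = {
    "dns resolution failing": ["runbooks/dns.md", "rcas/2024-dns-outage.md"],  # rank 1
    "proxy returns 502": ["runbooks/dns.md", "runbooks/proxy.md"],             # rank 2
    "disk full on nas": ["runbooks/dns.md"],                                   # miss
}

mrr = mean_reciprocal_rank(search_results, golden)  # (1 + 0.5 + 0) / 3 = 0.5
```

Tracking this number across commits turns retrieval quality into a regression test: a change that pushes expected runbooks down the ranking lowers the score.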

How this maps to enterprise platform engineering.

Each homelab component maps to a recognised enterprise capability. The homelab is the proof of concept; the concepts are production-grade.

| HARP Component | Enterprise Equivalent | Value |
| --- | --- | --- |
| HARP search (BM25 + RRF) | Enterprise RAG retrieval layer | Grounded context for AI agents and on-call engineers |
| MCP server (read-only) | API gateway / agent contract | Safe, scoped access surface for AI clients |
| Evidence packs + citations | Audit-grade answer grounding | Replaces ungrounded LLM output with traceable docs |
| Action tiers (R/P/X/D) | Policy-as-code guardrails | Maps to enterprise change-management tiers |
| Nero-Camp approval loop | Workflow orchestration + ITSM | Human-in-the-loop control plane for risky actions |
| Eval harness + golden corpus | Model/system evaluation | Regression testing for retrieval quality |
| OpenRAG connector | Vector store integration | Pluggable semantic backend behind HARP API |
| Redaction layer | Data-loss prevention (DLP) | Strips secrets before export to external RAG backends |
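
The redaction layer in the last row strips secrets before content leaves for an external RAG backend. The sketch below shows one simple way to do that with regular expressions. The patterns and placeholder strings are assumptions chosen for illustration, not HARP's actual rules, and a real deployment would need a broader and better-tested pattern set.

```python
import re

# Illustrative patterns only: key/value secrets, AWS access key ids, PEM private keys.
REDACTION_PATTERNS = [
    (
        re.compile(r"(?i)\b(password|token|api[_-]?key|secret)\b(\s*[:=]\s*)\S+"),
        r"\1\2[REDACTED]",
    ),
    (re.compile(r"AKIA[0-9A-Z]{16}"), "[REDACTED_AWS_KEY]"),
    (
        re.compile(
            r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
            re.S,
        ),
        "[REDACTED_PRIVATE_KEY]",
    ),
]


def redact(text: str) -> str:
    """Apply every pattern in order and return the scrubbed text."""
    for pattern, replacement in REDACTION_PATTERNS:
        text = pattern.sub(replacement, text)
    return text


sample = "Restart the proxy.\npassword: hunter2\napi_key=abc123 ok"
clean = redact(sample)
```

Running redaction at the export boundary, rather than at index time, keeps the local corpus intact while ensuring nothing sensitive reaches a backend outside HARP's control.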