Under the hood
Built for how incidents actually work.
A stateful investigation graph. Parallel evidence collection. Human-gated actions. Every decision logged, traced, and auditable.
How the investigation graph works
ingest
entry point
triage
classify · dedupe · drop noise
synthesiser
Pass 1: primary_role · Pass 2: deep_rca_role if confidence < 0.70
embed_rca
stores RCA in vector store · silent side-effect
human_review
graph_interrupt · approve / reject / re-investigate
create_ticket
Jira ticket · full RCA evidence attached
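The two-pass synthesiser step can be pictured with a minimal sketch. Only the 0.70 threshold and the role names come from the description above. The `RCA` type, the `synthesise` function, and the choice to let the deep pass replace the first result are assumptions made for illustration, not the product's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable

# Threshold taken from the description: Pass 2 runs if confidence < 0.70.
CONFIDENCE_THRESHOLD = 0.70


@dataclass
class RCA:
    """Hypothetical result type: a summary, a confidence score, and the role that produced it."""
    summary: str
    confidence: float
    produced_by: str


def synthesise(
    evidence: dict,
    primary: Callable[[dict], RCA],
    deep_rca: Callable[[dict], RCA],
) -> RCA:
    """Run the primary role first; escalate to the deep RCA role only when confidence is low."""
    first = primary(evidence)
    if first.confidence < CONFIDENCE_THRESHOLD:
        # Assumption: the deep pass replaces the first result rather than merging with it.
        return deep_rca(evidence)
    return first


# Stand-in callables so the sketch runs without any LLM backend.
def fake_primary(evidence: dict) -> RCA:
    return RCA("disk pressure on one node", 0.55, "primary")


def fake_deep_rca(evidence: dict) -> RCA:
    return RCA("log volume filled after a retention change", 0.88, "deep_rca")


result = synthesise({"logs": []}, fake_primary, fake_deep_rca)
print(result.produced_by, result.confidence)
```

Passing the roles in as callables keeps the escalation rule testable on its own, independent of which model sits behind each role.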
Every signal. In parallel.
Each fetch node queries one category of evidence. All 15 run concurrently.
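As a rough illustration of concurrent evidence collection, the sketch below fans out one coroutine per category with `asyncio.gather`. The category names, the `fetch` stub, and the error-handling policy are placeholders: the page states that there are 15 fetch nodes but does not list them or say how failures are handled.

```python
import asyncio

# Placeholder names: the page says there are 15 categories but does not enumerate them.
CATEGORIES = [f"category_{i}" for i in range(1, 16)]


async def fetch(category: str) -> dict:
    """Stand-in for a fetch node; a real one would query a logging, metrics, or deploy backend."""
    await asyncio.sleep(0.01)  # simulates network latency
    return {"category": category, "items": []}


async def collect_evidence(categories: list[str]) -> dict[str, dict]:
    """Run every fetch concurrently; one failing backend does not cancel the others."""
    results = await asyncio.gather(
        *(fetch(c) for c in categories),
        return_exceptions=True,
    )
    evidence: dict[str, dict] = {}
    for category, result in zip(categories, results):
        if isinstance(result, BaseException):
            # Assumption: failures are recorded as evidence gaps rather than aborting the run.
            evidence[category] = {"category": category, "error": repr(result)}
        else:
            evidence[category] = result
    return evidence


evidence = asyncio.run(collect_evidence(CATEGORIES))
print(len(evidence), "categories collected")
```

With `return_exceptions=True`, total wall-clock time is bounded by the slowest fetch rather than the sum of all of them, and a single unreachable backend shows up as a marked gap instead of a failed investigation.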
Proactive detection. Always running.
All intervals configurable via aqweth.yaml. Scheduled jobs use deterministic math — no LLM cost on routine reports.
No black box.
Every decision is logged, traced, and inspectable.
Structured logs
JSON · correlation IDs
Every log line carries incident_id, run_id, node, duration. Middleware stamps a correlation ID across the whole investigation.
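A log line of this shape could be produced with Python's standard `logging` module, as in the hedged sketch below. The field names `incident_id`, `run_id`, `node`, and `duration` come from the description above. The `JsonFormatter` class, the `build_logger` helper, the use of a context variable for the correlation ID, and the assumption that `duration` is in seconds are all illustrative.

```python
import contextvars
import json
import logging
import sys
import uuid

# Assumption: the correlation ID lives in a context variable set once per investigation.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object carrying the fields named on the page."""

    FIELDS = ("incident_id", "run_id", "node", "duration")

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
            "correlation_id": correlation_id.get(),
        }
        for field in self.FIELDS:
            # Fields arrive through logging's `extra=` mechanism; missing ones become null.
            payload[field] = getattr(record, field, None)
        return json.dumps(payload)


def build_logger(name: str, stream) -> logging.Logger:
    """Hypothetical helper: a logger that writes JSON lines to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False
    return logger


logger = build_logger("aqweth.sketch", sys.stderr)
correlation_id.set(str(uuid.uuid4()))  # what a middleware might do at the start of a run
logger.info(
    "node finished",
    extra={"incident_id": "INC-1", "run_id": "run-1", "node": "triage", "duration": 0.42},
)
```

Because the correlation ID is read inside the formatter, every node that logs during the same investigation picks it up without passing it explicitly.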
Distributed tracing
OTel · per-node spans
An aqweth.node.<name> span per graph node. Exportable to any OTLP backend you already run.
Prometheus metrics
cost · latency · by model
Investigation duration, per-node timing, backend error rates, LLM token spend per model. Grafana dashboards included.
Prompt audit trail
snapshot per investigation
Each LangGraph checkpoint stores the exact prompt config that produced its RCA. Auditable record of which prompt produced which conclusion.
Per-role LLM assignment.
Every agent role is independently configurable. No code changes.
llm:
  gateway_url: http://litellm.internal:4000
  roles:
    triage: qwen3-4b-instruct
    primary: claude-sonnet-4-6
    deep_rca: claude-opus-4-7
    coder: qwen3-coder-next
    embedder: bge-m3
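To show how a role-to-model mapping like this might be consumed, here is a small sketch that mirrors the config as a Python dictionary and resolves a model name per role. The values are copied from the config above. The `model_for` helper and the fail-fast behaviour on unknown roles are assumptions, and a real deployment would read the mapping from the YAML file rather than hard-code it.

```python
# Mirrors the llm section of the config shown above.
CONFIG = {
    "llm": {
        "gateway_url": "http://litellm.internal:4000",
        "roles": {
            "triage": "qwen3-4b-instruct",
            "primary": "claude-sonnet-4-6",
            "deep_rca": "claude-opus-4-7",
            "coder": "qwen3-coder-next",
            "embedder": "bge-m3",
        },
    }
}


def model_for(role: str, config: dict = CONFIG) -> str:
    """Hypothetical helper: look up the model assigned to a role, failing loudly on typos."""
    roles = config["llm"]["roles"]
    if role not in roles:
        raise KeyError(f"no model configured for role {role!r}; known roles: {sorted(roles)}")
    return roles[role]


print(model_for("triage"))
print(model_for("deep_rca"))
```

Failing on an unknown role, rather than silently falling back to a default model, keeps a misspelled role name from quietly routing work to the wrong model.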
One Helm chart. Online in under a day.
Interactive wizard
Live connection validation. Profile presets: aws · gcp · k8s · local
Single config file
Per-role LLM, per-service backends, deployment context.
Own namespace
Deploys into its own namespace. Never touches production namespaces.
First investigation
RCA card returns within ~30s.
aqweth validate re-checks every backend connection before deployment. No "deploy and pray."
Exit is a namespace delete.
You delete a namespace, revoke an IAM role, and you are done.
Let us run an investigation on one of your incidents.
No deployment required. You nominate an incident from your retro doc — we run the analysis together.
Request access
Or email us at hello@aqweth.ai · No commitment