
Running HOMEBASE

Streamlit UI

uv run streamlit run app.py

UI Features

Core Run Flow

| Feature | Description |
| --- | --- |
| Prompt library sidebar | 10 pre-built trigger phrases, one-click load into command field |
| Unified command field | Single input handles run triggers, registry commands, and chart requests via hybrid intent routing |
| Live agent log | Color-coded by agent type, streams in real time during a run |
| Quadrant summary | Metric counters with post-run deferred count |
| Charts | Scatter plot, category breakdown, stale donut, score distribution |
| Classification table | Sortable by quadrant; deferred items dimmed post-run |
| Recommendation cards | Tabbed HU/HI and LU/HI views with confidence scoring |
| HITL checkpoint panel | Approve or defer HU/HI and LU/HI items; add notes before finalizing |
| Final report | Claude Sonnet or Groq-generated narrative (runtime provider selection); highlighted item IDs and provider attribution footer |
| Export PDF | Print-ready light-theme PDF download of the final report |
| Run history tab | Audit trail of all runs; expandable cards with quadrant breakdown, HITL decisions, deferred items, and full report |
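
The hybrid intent routing behind the unified command field can be pictured as a tiered lookup. The sketch below is an illustration only, not HOMEBASE's actual router: it assumes that "hybrid" means deterministic rules first with a model-based classifier as fallback, and the names `Intent`, `route`, `COMMAND_PREFIXES`, and `RUN_TRIGGERS` are hypothetical. Only the `chart`, `rca`, and `5 whys` prefixes and the trigger phrases come from this page.

```python
from dataclasses import dataclass

# Command prefixes documented on this page; the internal kind names are hypothetical.
COMMAND_PREFIXES = {
    "chart": "chart",
    "rca": "rca",
    "5 whys": "five_whys",
}

# A few trigger phrases from the prompt library.
RUN_TRIGGERS = {
    "morning briefing",
    "weekly home review",
    "full home assessment",
}


@dataclass
class Intent:
    kind: str     # "chart", "rca", "five_whys", "run", or "unknown"
    payload: str  # text after the prefix, or the full normalized phrase


def route(command: str) -> Intent:
    """Classify a command with cheap rules before any model is consulted."""
    text = command.strip().lower()
    # Tier 1: deterministic prefix match for agent commands.
    for prefix, kind in COMMAND_PREFIXES.items():
        if text == prefix or text.startswith(prefix + " "):
            return Intent(kind, text[len(prefix):].strip())
    # Tier 2: exact match against known run triggers.
    if text in RUN_TRIGGERS:
        return Intent("run", text)
    # Tier 3 (assumption): a real hybrid router would hand the remaining
    # free text to an LLM classifier; this sketch just reports "unknown".
    return Intent("unknown", text)
```

For example, `route("chart stale items by category")` yields a `chart` intent whose payload is the rest of the request, while `route("Morning Briefing")` normalizes to a `run` intent.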

AI Agents (Dashboard Expanders)

| Agent | Access | Description |
| --- | --- | --- |
| Predictive Quadrant Preview | Below command field | Predicts HU/HI, HU/LI, LU/HI, or LU/LI from free-text description before any run |
| Completeness Scorer | Inline with Quadrant Preview | Scores description against a per-category rubric; surfaces numbered follow-up questions for missing fields |
| AI Chart Generation | Command field (chart ...) | Plain language chart requests; two-tier LLM pipeline (simple spec or full Plotly figure dict) |
| Cross-item RCA | Command field (rca ...) | Root cause analysis across full registry or scoped to a category; pattern clusters, systemic narrative, recommendations |
| 5 Whys Agent | Command field (5 whys ...) | Category-based causal chain; 5-level structured chain per category; auto-triggers RCA synthesis when 2+ categories analyzed |
| Document Intake | ⬡ Document Intake expander | Upload PDF/image; Gemini extracts structured fields; HITL required before registry write |
| Spreadsheet Analytics | 📊 Spreadsheet Analytics expander | Upload CSV/XLSX/ODS; pandas profiling + Gemini findings; HITL registry correlation |
| Schema Metric Discovery | 🔬 Schema Metric Discovery expander | Upload tabular schema or paste Mermaid ERD; Gemini surfaces metrics, derived fields, gaps, quality observations |
| Guided Intake (Submit New Issue) | 📋 Submit New Issue expander | 5-step HITL intake flow: Describe → Duplicate Check → Triage → Review & Approve → Done; mirrors RMA submitter checklist |
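
The five-step Guided Intake flow can be read as a small linear state machine with one human gate. The following is a minimal sketch under that assumption; `IntakeStep` and `next_step` are hypothetical names, and the real flow may track more state than a single approval flag.

```python
from enum import Enum


class IntakeStep(Enum):
    DESCRIBE = "Describe"
    DUPLICATE_CHECK = "Duplicate Check"
    TRIAGE = "Triage"
    REVIEW_APPROVE = "Review & Approve"
    DONE = "Done"


# Enum members iterate in definition order, which matches the documented flow.
STEP_ORDER = list(IntakeStep)


def next_step(current: IntakeStep, approved: bool = False) -> IntakeStep:
    """Advance one step; Review & Approve only advances with human approval."""
    if current is IntakeStep.DONE:
        return current
    if current is IntakeStep.REVIEW_APPROVE and not approved:
        # Stay at the checkpoint until a human approves the item.
        return current
    return STEP_ORDER[STEP_ORDER.index(current) + 1]
```

Keeping the approval check inside the transition function means no caller can reach `DONE` without passing `approved=True` at the review step.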

Prompt Library

| Trigger | Behavior |
| --- | --- |
| "what needs immediate attention" | Full registry, HU/HI items only |
| "weekly home review" | Full registry, all quadrants |
| "morning briefing" | Full registry, all quadrants |
| "fire and safety inspection" | Electrical + appliance + general only |
| "plumbing systems audit" | Plumbing category only |
| "electrical systems inspection" | Electrical category only |
| "hvac seasonal maintenance check" | HVAC category only |
| "appliance status review" | Appliance category only |
| "exterior and grounds walkthrough" | General category only |
| "full home assessment" | Full registry, all quadrants |

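The trigger-to-scope mapping in the prompt library can be expressed as a lookup table plus a filter. This is a sketch, not the project's code: `Scope`, `PROMPT_LIBRARY`, and `select_items` are hypothetical names, and the item fields `category` and `quadrant` (with lowercase category values) are assumed for illustration.

```python
from typing import NamedTuple, Optional


class Scope(NamedTuple):
    categories: Optional[frozenset]  # None means the full registry
    quadrants: Optional[frozenset]   # None means all quadrants


PROMPT_LIBRARY = {
    "what needs immediate attention": Scope(None, frozenset({"HU/HI"})),
    "weekly home review": Scope(None, None),
    "morning briefing": Scope(None, None),
    "fire and safety inspection": Scope(
        frozenset({"electrical", "appliance", "general"}), None
    ),
    "plumbing systems audit": Scope(frozenset({"plumbing"}), None),
    "electrical systems inspection": Scope(frozenset({"electrical"}), None),
    "hvac seasonal maintenance check": Scope(frozenset({"hvac"}), None),
    "appliance status review": Scope(frozenset({"appliance"}), None),
    "exterior and grounds walkthrough": Scope(frozenset({"general"}), None),
    "full home assessment": Scope(None, None),
}


def select_items(items, trigger):
    """Return the registry items that fall inside a trigger's scope."""
    scope = PROMPT_LIBRARY[trigger.strip().lower()]
    return [
        item
        for item in items
        if (scope.categories is None or item["category"] in scope.categories)
        and (scope.quadrants is None or item["quadrant"] in scope.quadrants)
    ]
```

With this layout, adding a new pre-built trigger is a one-line change to the table rather than a new branch in the run logic.
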
CLI — Interactive HITL Mode

uv run python main.py
uv run python main.py --trigger "morning briefing"

Pauses at the HITL checkpoint for terminal input. Useful for testing graph behavior without the Streamlit UI.


CLI — Non-Interactive Mode

uv run python main.py --no-hitl
uv run python main.py --no-hitl --trigger "plumbing audit"

Skips the HITL checkpoint and runs through to completion automatically.
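
The two CLI modes differ only in the `--no-hitl` flag, with `--trigger` optional in both. A minimal `argparse` sketch of such an interface is shown below; how `main.py` actually parses its arguments is not shown on this page, so `build_parser` and the `None` default for `--trigger` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser accepting the two flags used in the commands above."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument(
        "--trigger",
        default=None,  # assumption: the real default trigger is not documented here
        help="trigger phrase that starts the run",
    )
    parser.add_argument(
        "--no-hitl",
        action="store_true",
        help="skip the HITL checkpoint and run to completion",
    )
    return parser


def hitl_enabled(argv) -> bool:
    """True when the run should pause at the HITL checkpoint."""
    return not build_parser().parse_args(argv).no_hitl
```

Here `hitl_enabled([])` is `True` (interactive mode) and `hitl_enabled(["--no-hitl"])` is `False`, which mirrors the two modes described above.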


Tests

No API key required — all LLM calls are mocked via conftest.py. All DB calls use an isolated in-memory SQLite instance per test.

uv run pytest
uv run pytest -v
uv run pytest tests/test_hitl.py -v
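
The isolation described above (mocked LLM calls, a fresh in-memory SQLite database per test) can be sketched with the standard library alone. The helpers below are illustrative and not taken from the project's `conftest.py`: `make_test_db`, `make_fake_llm`, and the `items` schema are hypothetical, and in a real suite they would typically be wrapped in pytest fixtures.

```python
import sqlite3
from unittest.mock import MagicMock


def make_test_db() -> sqlite3.Connection:
    """Fresh in-memory database; each connection gets its own empty store."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema, for illustration only.
    conn.execute("CREATE TABLE items (id TEXT PRIMARY KEY, category TEXT)")
    return conn


def make_fake_llm(reply: str) -> MagicMock:
    """Stand-in chat model: invoke() returns a canned reply, no network call."""
    llm = MagicMock()
    llm.invoke.return_value = MagicMock(content=reply)
    return llm
```

Because every `sqlite3.connect(":memory:")` call opens a separate private database, rows written in one test are not visible from another, and the mock removes any need for an API key.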

619 passing tests across 18 files.

| File | Tests | Covers |
| --- | --- | --- |
| conftest.py |  | Global LLM mock + in-memory SQLite fixture (no API key needed) |
| test_registry_tools.py | 34 | Classification logic, boundary conditions, stale detection |
| test_orchestrator.py | 20 | Orchestrator node, report formatting, delegation |
| test_graph.py | 10 | Graph structure, full invocation, state key completeness |
| test_subagent_tools.py | 33 | Domain recommendation functions, category router |
| test_subagents.py | 19 | Subagent nodes, category filtering, result structure |
| test_hitl.py | 31 | HITL briefing, deferral filtering, interrupt/resume behavior |
| test_update_agent.py | 16 | NL interpretation, field validation, clamping, apply_update path |
| test_chart_agent.py | 38 | Chart spec building, complex figure dict, intent routing, analytics data |
| test_rca_agent.py | 45 | Data loaders, RCA output structure, confidence scoring, category scoping |
| test_whys_agent.py | 34 | Causal chain structure, auto-category resolution, safety keyword routing |
| test_quadrant_preview.py | 47 | Input guards, confidence normalization, LLM output validation, error handling |
| test_completeness_agent.py | 60 | Input guards, score normalization, all five rubrics, category inference |
| test_intake_agent.py | 53 | Input guards, doc types, confidence, field sanitization, item ID validation |
| test_analytics_agent.py | 55 | File dispatch, pandas profiling, LLM normalization, registry correlation |
| test_schema_agent.py | 54 | is_mermaid, Mermaid parsing, tabular profiling, pandas 2.x StringDtype |
| test_llm_providers.py | 29 | Provider detection, model factory (ChatAnthropic vs ChatGroq), subagent model, provider metadata |
| test_duplicate_detector.py | 36 | TF-IDF dual-channel scoring, threshold behavior, status filter, has_duplicates, top_match, edge cases |
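
For readers unfamiliar with the TF-IDF scoring that `test_duplicate_detector.py` exercises, here is a single-channel sketch in pure Python. It is not the project's detector: the page does not say what the two channels are, so only one word-level channel is shown, and the function signatures, the smoothed IDF formula, and the 0.5 threshold are assumptions. Only the name `top_match` is borrowed from the table above.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (term -> weight) for whitespace tokens."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF so terms present in every document keep a nonzero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    return [
        {term: freq * idf[term] for term, freq in Counter(tokens).items()}
        for tokens in tokenized
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def top_match(new_text, existing, threshold=0.5):
    """Return (text, score) of the best match at or above threshold, else None."""
    if not existing:
        return None
    vectors = tfidf_vectors([new_text] + existing)
    scores = [cosine(vectors[0], v) for v in vectors[1:]]
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] >= threshold:
        return existing[best], scores[best]
    return None
```

A description that shares most of its words with an existing item scores above the threshold and is returned as the top match, while one with no shared terms scores 0.0 and returns `None`.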