Technical Notes
Rate Limits
The free Groq tier has per-minute rate limits. If you see a 429 error, wait 60 seconds
and retry. Check current usage at https://console.groq.com.
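The wait-and-retry advice above can be sketched as a small wrapper. This is a minimal illustration, not HOMEBASE code: `RateLimitError` and `call_with_retry` are hypothetical names, and the real exception type depends on the client library that surfaces the 429.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for whatever the client raises on HTTP 429."""


def call_with_retry(fn, max_retries=3, wait_seconds=60, sleep=time.sleep):
    """Call fn(); on a rate-limit error, wait and try again.

    The 60-second default mirrors the per-minute window described above.
    `sleep` is injectable so the wrapper can be exercised without waiting.
    """
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # give up after the last allowed retry
            sleep(wait_seconds)
```

Wrapping only the provider call, rather than a whole graph run, keeps a single 429 from repeating work that already succeeded.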
Registry Behavior
- data/registry.json is read once during initial DB seed. After homebase.db exists, all registry reads/writes go through SQLite only.
- days_since_update is computed on read from the updated_at timestamp column via julianday() difference.
- Every registry write (add, update, close) sets updated_at = datetime('now').
- Item IDs are assigned sequentially per category prefix: HV-001, HV-002, etc. They are generated by a counter query at insert time: no UUID, no gaps in normal operation.
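The registry rules above can be illustrated against an in-memory SQLite database. This is a sketch under assumptions: the `registry` table layout, the helper names, and the use of `COUNT(*)` as the counter query are all hypothetical, since the notes do not show the actual schema or query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed, simplified schema; the real homebase.db layout may differ.
conn.execute(
    "CREATE TABLE registry ("
    "id TEXT PRIMARY KEY, title TEXT, status TEXT, updated_at TEXT)"
)


def next_item_id(conn, prefix):
    """Return the next sequential ID for a category prefix, e.g. HV-003."""
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM registry WHERE id LIKE ?", (f"{prefix}-%",)
    ).fetchone()
    return f"{prefix}-{count + 1:03d}"


def add_item(conn, prefix, title):
    """Insert a new open item; every write stamps updated_at."""
    item_id = next_item_id(conn, prefix)
    conn.execute(
        "INSERT INTO registry (id, title, status, updated_at) "
        "VALUES (?, ?, 'open', datetime('now'))",
        (item_id, title),
    )
    return item_id


def days_since_update(conn, item_id):
    """Compute whole days since the last write at read time."""
    (days,) = conn.execute(
        "SELECT CAST(julianday('now') - julianday(updated_at) AS INTEGER) "
        "FROM registry WHERE id = ?",
        (item_id,),
    ).fetchone()
    return days
```

A count-based counter stays gap-free only while rows are closed rather than deleted, which is consistent with the "no gaps in normal operation" wording above.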
Rule-Based Fallback
tools/subagent_tools.py contains the original deterministic recommendation logic
predating the LLM-backed subagents. It is retained as a reference implementation and
fallback. It is not called in the default run path.
Test Isolation
tests/conftest.py provides two fixtures used across the entire test suite:
- Global LLM mock — patches all Groq and Gemini calls; no API key required to run tests
- In-memory SQLite fixture — isolated DB per test; no file I/O; no shared state between tests
This means uv run pytest works in any environment without .env configuration.
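The two fixtures could look roughly like the sketch below. It is not the contents of tests/conftest.py: `FakeLLM`, `make_memory_db`, the fixture names, and the table layout are hypothetical, and the real suite patches the actual Groq and Gemini call sites rather than returning a bare stand-in.

```python
import sqlite3

import pytest


class FakeLLM:
    """Hypothetical canned-response stand-in for a Groq or Gemini client."""

    def __init__(self, reply="mocked response"):
        self.reply = reply
        self.calls = []

    def invoke(self, prompt):
        self.calls.append(prompt)  # record prompts so tests can assert on them
        return self.reply


def make_memory_db():
    """Create a fresh in-memory database; nothing touches the filesystem."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE registry (id TEXT PRIMARY KEY, title TEXT, updated_at TEXT)"
    )
    return conn


@pytest.fixture
def db():
    """One isolated database per test, closed on teardown."""
    conn = make_memory_db()
    yield conn
    conn.close()


@pytest.fixture
def mock_llm():
    """Fake model for a test. In a real conftest, the provider factories
    would be patched (for example with monkeypatch) to return this object."""
    return FakeLLM()
```

Because each `sqlite3.connect(":memory:")` call opens a separate database, rows written in one test are never visible to another.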
Multi-Provider Notes
HOMEBASE runs three active LLM providers coordinated by the same LangGraph runtime:
Groq (Llama 3.3 70B) — handles all five specialist subagents (HVAC, Plumbing, Electrical, Appliance, General), orchestration, RCA, 5 Whys, chart generation, registry commands, quadrant preview, and completeness scoring. Low latency, high throughput.
Anthropic (Claude Sonnet) — handles the synthesizer node when ANTHROPIC_API_KEY is set.
Selected at runtime by tools/llm_providers.get_synthesizer_model(); falls back to Groq
transparently when the key is absent. Model string: claude-sonnet-4-20250514.
Gemini (2.5 Flash-Lite) — handles Document Intake, Spreadsheet Analytics, and Schema
Metric Discovery agents via the google.genai SDK (from google import genai).
Chosen for native multimodal support and strong data extraction performance.
This demonstrates a provider-agnostic multi-model architecture where each model is used where it performs best — and where swapping any provider requires only a new node-level model binding, not changes to the graph topology, state schema, or HITL flow.
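The key-based fallback for the synthesizer node can be sketched as follows. This is an illustration only, not the body of `get_synthesizer_model()`: the function name `choose_synthesizer`, the returned tuple shape, and the Groq model ID are assumptions, while the Claude model string comes from the notes above.

```python
import os

SYNTH_CLAUDE_MODEL = "claude-sonnet-4-20250514"  # model string from the notes
SYNTH_GROQ_MODEL = "llama-3.3-70b-versatile"     # assumed Groq model ID


def choose_synthesizer(env=None):
    """Return (provider, model) for the synthesizer node.

    Anthropic is used when ANTHROPIC_API_KEY is set and non-empty;
    otherwise the node falls back to Groq.
    """
    env = os.environ if env is None else env
    if env.get("ANTHROPIC_API_KEY"):
        return ("anthropic", SYNTH_CLAUDE_MODEL)
    return ("groq", SYNTH_GROQ_MODEL)
```

Resolving the model inside one function is what keeps the swap local to the node binding: the graph topology and state schema never see which provider was chosen.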
Duplicate Detection Notes
tools/duplicate_detector.py uses dual-channel TF-IDF cosine similarity:
- Channel 1 — full text (title + description) vs existing registry full texts
- Channel 2 — title only vs existing registry titles
- Final score: max(channel1, channel2) per item, which catches paraphrased titles even when descriptions differ
Default threshold is 0.55. Closed items are excluded from comparison by default (status_filter=["open", "in_progress"]). sklearn is a required dependency (scikit-learn>=1.3.0).
The detector is called by execute_add() before any DB write. If sklearn cannot be imported, the detector fails silently and the add proceeds without a duplicate check.
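The dual-channel scoring described above can be sketched with scikit-learn. This is a minimal illustration rather than the code in tools/duplicate_detector.py: the function names, the dict-based item shape, and the choice to fit the vectorizer on the existing items plus the candidate are assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def _channel_scores(candidate, existing):
    """Cosine similarity of one candidate text against each existing text."""
    matrix = TfidfVectorizer().fit_transform(existing + [candidate])
    return cosine_similarity(matrix[-1], matrix[:-1])[0]


def find_duplicates(new_item, registry, threshold=0.55,
                    status_filter=("open", "in_progress")):
    """Return (id, score) pairs for active items at or above the threshold."""
    active = [r for r in registry if r["status"] in status_filter]
    if not active:
        return []
    # Channel 1: title + description against existing full texts.
    full = _channel_scores(
        f"{new_item['title']} {new_item['description']}",
        [f"{r['title']} {r['description']}" for r in active],
    )
    # Channel 2: title only against existing titles.
    title = _channel_scores(new_item["title"], [r["title"] for r in active])
    matches = []
    for item, s_full, s_title in zip(active, full, title):
        score = max(float(s_full), float(s_title))  # final score per item
        if score >= threshold:
            matches.append((item["id"], round(score, 3)))
    return matches
```

Taking the maximum means a near-identical title is enough to flag a match even when the descriptions share few terms.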
Auto-Migration
On startup, HOMEBASE checks for databases created before v1.10.0 (which stored
days_since_update as a static integer column) and back-fills updated_at timestamps
automatically. No manual migration step is required.
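A startup check of this kind might look like the sketch below. It is an assumption-heavy illustration: the table name, the function name, and the idea of deriving updated_at by subtracting the stored days_since_update from the current time are hypothetical, since the notes do not describe how the back-fill is computed.

```python
import sqlite3


def migrate_legacy_registry(conn):
    """Add and back-fill updated_at on a legacy database.

    Returns True if a migration ran, False if the column already exists.
    """
    columns = {row[1] for row in conn.execute("PRAGMA table_info(registry)")}
    if "updated_at" in columns:
        return False  # already on the newer schema
    conn.execute("ALTER TABLE registry ADD COLUMN updated_at TEXT")
    # Assumed back-fill rule: now minus the stored static day count.
    conn.execute(
        "UPDATE registry "
        "SET updated_at = datetime('now', '-' || days_since_update || ' days')"
    )
    conn.commit()
    return True
```

Checking for the column before altering the table makes the step idempotent, so running it on every startup is safe.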