Reference
The ReAct agent
The agent runs a plan → reason → act loop. Given a goal it first emits a short numbered plan, then narrates a one-line thought before each tool call, gathers structured evidence, and finishes with a plain-English summary.
The plan and per-step reasoning are surfaced in every front-end — the one-shot CLI, the interactive REPL, and the web UI. Planning degrades gracefully: if a small local model doesn’t produce a clean plan, the agent synthesises one from the goal and continues. The loop never aborts on formatting.
sentinel agent \
--model llama3.2 \
--goal "Audit my Downloads folder and report anything suspicious." How findings flow
- The CLI invokes a tool (directly, or via the agent).
-
The tool emits a structured
ToolOutputcontaining:findings— strongly-typedFindingobjects fed into the final report.payload— JSON metadata that’s safe to show the LLM.
- The agent loop sends the payload back as a
toolrole message and asks the model to either call another tool or produce a plain-English summary. - The CLI renders the resulting
Reportin the chosen format.
The chat session
The chat UI and REPL keep a single conversation per session: every user
message extends the same ChatSession, so prior tool results and
the model’s earlier answers stay in context. Each assistant turn shows:
- the plan it produced,
- the per-turn reasoning,
- the tool calls it made (args, success/error, result excerpt), and
- any structured findings produced.
Reliability with local models
Smaller local models (7B–8B via Ollama) are far less reliable at structured tool calling than cloud models. The ReAct loop is built to be robust against malformed output: plan extraction degrades gracefully, with retry logic and an evaluation harness measuring tool-call fidelity across models as they change. Hardening this further — plus real-time token streaming with progress and cancellation — is on the roadmap.
The agent loop is tested against an in-process mock LLM, so its core logic is verifiable with no Ollama and no network access at all.