event-sourced agent runtime in Go
replay past runs. resume crashed ones. stop runaway costs.
go get github.com/jerkeyray/starling

- #01 RunStarted - model · tools · system prompt pinned
- #02 TurnStarted - turn 1 · prompt hash committed
- #03 AssistantMessageCompleted - tool plan: search · fetch
- #04 ToolCallScheduled - search · attempt 1
- #05 ToolCallScheduled - fetch · attempt 1
- #06 ToolCallCompleted - search · 28ms
- #07 ToolCallCompleted - fetch · 312ms
- #08 TurnStarted - turn 2
- #09 AssistantMessageCompleted - final answer
- #10 RunCompleted - Merkle root committed
what you get without writing it yourself.
Replay, audit, multi-provider, MCP, budgets, and operator tooling - all in the box, with one Go import.
Replay any past run
Re-run a recorded run against your current code. The first step that behaves differently shows up as a test failure.
Audit-grade history
Each run is hash-signed end to end. If anyone edits a past event, validation breaks and you know.
Use any model
OpenAI, Anthropic, Gemini, Bedrock, OpenRouter, and any OpenAI-compatible endpoint. Swap models without touching agent code.
Tools and MCP
Write tools as plain Go functions, or mount any MCP server. Both behave the same in live runs and in replay.
Hard cost limits
Cap tokens, dollars, and wall-clock per run. The runtime stops when a cap trips - not after the bill arrives.
Production basics, included
Postgres or SQLite storage, schema migrations, Prometheus metrics, structured logs, and a built-in web inspector.
five phases, start to finish.
From a goal in to a verified answer out - every step recorded in order, replayable later.
- 01 define
  Wire up the agent. Pick a model, give it tools, set a budget. No I/O yet.
  model · tools · budget
- 02 run
  Call Run with a goal. The runtime mints a run id and starts recording from the first byte.
  fresh run id · recording started
- 03 loop
  The model thinks, calls tools, reads the results, thinks again. Every step lands in the recording as it happens.
  turns · tool calls · tokens
- 04 finish
  When the run ends, the recording is sealed and signed. You can prove later that nothing was edited.
  final answer · signed history
- 05 replay
  Re-run the recording against your current code. Any difference shows up as a typed error pointing at the exact step.
  same wiring · diff at exact step
every meaningful runtime action is an event.
Starling treats the event log as the source of truth. The runtime, the inspector, and replay verification all read the same shape.
Every event is hash-chained on append. The terminal event commits a Merkle root over all priors. Mutate any prior event and eventlog.Validate fails.
Replay re-executes the agent against the same wiring. The first event that does not byte-match surfaces as a typed replay.Divergence carrying seq, kind, expected kind, class, and reason.
in-tree adapters
OpenAI-compatible endpoints (Groq, Together, Ollama, vLLM, LM Studio, Azure OpenAI, …) plug in via openai.WithBaseURL.
contributions welcome.
issues, pull requests, and design discussions are genuinely appreciated. star the repo to follow along.