Reference
Per-package API reference: types, signatures, and short examples for the Starling Go runtime.
starling (root)
The agent loop, run lifecycle, and replay surface.
Agent
type Agent struct {
Provider provider.Provider
Tools []tool.Tool
Log eventlog.EventLog
Config Config
Budget *Budget
Metrics *Metrics
Namespace string // optional run-id prefix
}

Agent holds no per-run state. Two instances pointing at the same log
are interchangeable.
Config
| Field | Default | Notes |
|---|---|---|
| Model | required | Provider-specific model id, e.g. "gpt-4o-mini". |
| MaxTurns | 0 = ∞ | Caps the ReAct loop. 0 is allowed but not recommended. |
| SystemPrompt | "" | Prepended to every conversation. Captured into RunStarted. |
| Params | nil | Provider-specific param blob (CBOR). Hashed into RunStarted.ParamsHash. |
| RequireRawResponseHash | false | Fail any turn whose ChunkEnd lacks a 32-byte raw-response digest. |
| AppVersion | "" | Stamped into RunStarted alongside the Starling library version. |
| EmitTimeout | 0 = ∞ | Bounds each event-log Append under context.WithoutCancel. |
| SkipSchemaCheck | false | Disables eventlog.Preflight on Run / Resume. Tests only. |
| Logger | slog.Default() | Structured slog records for run lifecycle. |
Run / Resume / Replay
- Run(ctx, goal) (*RunResult, error): live entry. Mints a fresh run id (namespaced when Namespace != ""), emits RunStarted, runs the loop, returns the terminal *RunResult.
- Resume(ctx, runID, extraMessage) (*RunResult, error): re-enters a run from its last seq. Pending tool calls reissue under fresh CallIDs; the orphan stays for audit. ResumeWith(...opts) adds WithReissueTools(false) for manual recovery.
- Replay(ctx, log, runID, agent, opts...) error: re-executes against the same wiring. Returns nil on a clean replay, wraps a *replay.Divergence with ErrNonDeterminism on the first mismatch, or returns ErrProviderModelMismatch when the agent's Provider.ID / APIVersion / Config.Model disagree with the recording. WithForceProvider() disables the identity check.
- RunStream(ctx, goal) (string, <-chan AgentEvent, error): typed event stream layered over Stream. Variants: TextDelta, ToolCallStarted, ToolCallEnded, Done. Channel closes after a single Done.
RunResult carries RunID, FinalText, totals (TurnCount,
ToolCallCount, TotalCostUSD, InputTokens, OutputTokens,
Duration), TerminalKind, MerkleRoot, and CacheStats
(Hits, Misses, ReadTokens, CreateTokens). All recoverable
from the log; the struct is a convenience.
Sentinel errors
| Error | Meaning |
|---|---|
| ErrNonDeterminism | Replay diverged from the recording. Wraps *replay.Divergence. |
| ErrPartialToolCall | Resume saw pending tool calls and WithReissueTools(false) was set. |
| ErrRunNotFound | Resume target run id is absent from the log. |
| ErrRunAlreadyTerminal | Resume target ended with a terminal event. |
| ErrRunInUse | Another writer already advanced the chain. |
| ErrSchemaVersionMismatch | The recording's schema version is unsupported by this binary. |
| ErrProviderModelMismatch | Replay agent's Provider.ID / APIVersion / Config.Model disagrees with RunStarted. |
budget
Budget has four axes; a zero value on any field disables that axis.
A trip emits BudgetExceeded{Limit, Cap, Actual, Where} and unwinds
with RunFailed{ErrorType:"budget"}.
| Axis (field) | Type | When |
|---|---|---|
| MaxInputTokens | int64 | Pre-call, before every step.LLMCall. |
| MaxOutputTokens | int64 | Mid-stream on every ChunkUsage. |
| MaxUSD | float64 | Mid-stream using budget/prices.go per-model rates. |
| MaxWallClock | time.Duration | context.WithDeadline wrapping the run. |
budget.RegisterPricing(model, inPerMtok, outPerMtok) registers
or overrides per-model USD pricing at runtime; resets the
unknown-model warn-once memo so a stale warning doesn't outlive
the call. Built-in rates ship for major-vendor models in
budget/prices.go.
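The USD axis can be pictured as per-million-token arithmetic followed by a cap check. The sketch below is an illustration only, not the code in budget/prices.go: the `rate` type, `costUSD`, `exceeds`, and the prices are hypothetical, and it assumes rates are quoted in USD per million tokens, as the inPerMtok / outPerMtok parameter names suggest.

```go
package main

import "fmt"

// rate is a hypothetical per-model price, in USD per million tokens.
type rate struct {
	inPerMtok  float64
	outPerMtok float64
}

// costUSD converts token counts into dollars at the given rate.
func costUSD(r rate, inputTokens, outputTokens int64) float64 {
	return float64(inputTokens)/1e6*r.inPerMtok + float64(outputTokens)/1e6*r.outPerMtok
}

// exceeds reports whether actual has passed limit. A zero limit disables
// the axis, mirroring the "zero disables" rule described above.
func exceeds(limit, actual float64) bool {
	return limit > 0 && actual > limit
}

func main() {
	r := rate{inPerMtok: 0.5, outPerMtok: 2.0} // made-up prices
	cost := costUSD(r, 2_000_000, 500_000)
	fmt.Printf("cost=%.2f tripped=%v\n", cost, exceeds(1.50, cost))
}
```

Whether the real check trips at "greater than" or "greater than or equal to" the cap is not stated in this reference; the sketch picks strict inequality arbitrarily.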
event
The wire format. Every event carries:
type Event struct {
RunID string
Seq uint64
PrevHash []byte // BLAKE3 of canonical CBOR of prev event
Timestamp int64 // Unix nanoseconds
Kind Kind
Payload cborenc.RawMessage // kind-specific struct, CBOR-encoded
}

The full schema with payload definitions, the kinds the runtime emits, the reserved kinds, and the invariants live on the Events page.
Encoding helpers: Marshal, Unmarshal, Hash, ToJSON.
event.HashSize is 32. Each typed payload has an EncodePayload[T]
helper; each kind has a matching accessor (AsRunStarted,
AsToolCallCompleted, …).
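To make the role of PrevHash concrete, here is a minimal hash-chain sketch. It is not Starling's encoding: SHA-256 from the standard library stands in for BLAKE3, a fixed field order stands in for canonical CBOR, and the `record` type, the helper names, and the choice to start Seq at 1 are all assumptions made for illustration.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// record is a hypothetical, trimmed-down event: just enough to chain.
type record struct {
	Seq      uint64
	PrevHash []byte
	Payload  []byte
}

// hash digests a record. SHA-256 over a fixed field order stands in for
// BLAKE3 over canonical CBOR; both digests are 32 bytes long.
func hash(r record) []byte {
	h := sha256.New()
	var seq [8]byte
	binary.BigEndian.PutUint64(seq[:], r.Seq)
	h.Write(seq[:])
	h.Write(r.PrevHash)
	h.Write(r.Payload)
	return h.Sum(nil)
}

// appendRecord links a new record to the current tail of the chain.
func appendRecord(chain []record, payload []byte) []record {
	r := record{Seq: uint64(len(chain)) + 1, Payload: payload}
	if n := len(chain); n > 0 {
		r.PrevHash = hash(chain[n-1])
	}
	return append(chain, r)
}

// verify returns the index of the first record whose Seq or PrevHash is
// inconsistent with its predecessor, or -1 if the chain is intact.
func verify(chain []record) int {
	for i, r := range chain {
		if r.Seq != uint64(i)+1 {
			return i
		}
		if i > 0 && !bytes.Equal(r.PrevHash, hash(chain[i-1])) {
			return i
		}
	}
	return -1
}

func main() {
	var chain []record
	for _, p := range []string{"RunStarted", "TurnStarted", "RunFailed"} {
		chain = appendRecord(chain, []byte(p))
	}
	fmt.Println("intact:", verify(chain)) // intact: -1
	chain[1].Payload = []byte("tampered")
	fmt.Println("after tampering:", verify(chain)) // after tampering: 2
}
```

Editing record 1 is detected at record 2, because that is the record whose PrevHash commits to the edited bytes.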
eventlog
type EventLog interface {
Append(ctx, runID, ev) error
Read(ctx, runID) ([]Event, error)
Stream(ctx, runID) (<-chan Event, error)
Close() error
}

RunLister adds ListRuns(ctx) ([]RunSummary, error). RunPageLister
adds ListRunsPage(ctx, opts) (RunPage, error) for filtered,
server-side pagination. RunPruner adds explicit whole-run retention
cleanup with PruneRuns(ctx, opts) (PruneReport, error). All three
built-in backends implement these optional interfaces.
RunSummary carries per-run aggregates (TurnCount,
ToolCallCount, InputTokens, OutputTokens, CostUSD,
DurationMs) so dashboards don't have to re-aggregate event streams.
Helpers: eventlog.AggregateRun(events) returns the same totals
over a chained event slice (single source of truth for the
inspector and the MCP server). eventlog.ForkSQLite(ctx, src, dst, runID, beforeSeq) is a WAL-safe SQLite branch via
VACUUM INTO, truncating one run's events at a sequence
boundary. The BLAKE3 chain helpers used by Agent.Run are public
at github.com/jerkeyray/starling/merkle.
Backends
| Constructor | Use when |
|---|---|
| NewInMemory() | Tests, demos, ephemeral CLI tools. |
| NewSQLite(path, opts...) | Single-host services, edge nodes. |
| NewPostgres(db, opts...) | Multi-host services. Per-run advisory locks serialize appenders. |
Options: WithReadOnly() / WithReadOnlyPG() for inspector mode,
WithAutoMigratePG() to run migrations on connect.
Validation, migrations, preflight
- Validate(events): seq monotonicity, hash chain, terminal placement, Merkle root, and the semantic pairing rules from the Event schema.
- SchemaVersion(ctx, log) / Migrate(ctx, log, opts...): forward-only migration API. Migrate returns a MigrationReport.
- Preflight(ctx, log): fails fast with ErrSchemaOutdated or ErrSchemaTooNew. Agent.Run, Agent.Resume, and the inspector all call it unless Config.SkipSchemaCheck = true.
- WithMetrics(log, obs): wraps any EventLog so direct Append callers see the same latency histograms step.emit records.
Sentinel errors: ErrLogClosed, ErrLogCorrupt, ErrInvalidAppend,
ErrReadOnly, ErrSchemaOutdated, ErrSchemaTooNew.
step
The determinism layer. Anything non-deterministic in the agent loop must
go through step so replay can reproduce it byte-for-byte.
Helpers
func Now(ctx context.Context) time.Time
func Random(ctx context.Context) int64
func SideEffect[T any](ctx context.Context, name string, fn func() (T, error)) (T, error)

Live mode runs fn and emits a SideEffectRecorded event. Replay reads
the recorded value back without invoking fn. T must be CBOR-serializable.
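The record-then-replay behaviour described above can be sketched in a few lines. This is a toy model under stated assumptions, not the step package: the `recorder` type and `sideEffect` function are hypothetical, values are keyed by name alone rather than by position in an event log, and encoding/json stands in for CBOR.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// recorder is a hypothetical stand-in for the event log: it keeps one
// encoded value per named side effect.
type recorder struct {
	replaying bool
	values    map[string][]byte
}

// sideEffect runs fn and records its result in live mode. In replay mode
// it decodes the recorded value and never calls fn.
func sideEffect[T any](r *recorder, name string, fn func() (T, error)) (T, error) {
	var zero T
	if r.replaying {
		raw, ok := r.values[name]
		if !ok {
			return zero, errors.New("no recording for " + name)
		}
		var v T
		if err := json.Unmarshal(raw, &v); err != nil {
			return zero, err
		}
		return v, nil
	}
	v, err := fn()
	if err != nil {
		return zero, err
	}
	raw, err := json.Marshal(v) // JSON here; the text above says CBOR
	if err != nil {
		return zero, err
	}
	r.values[name] = raw
	return v, nil
}

func main() {
	r := &recorder{values: map[string][]byte{}}
	calls := 0
	fetch := func() (int, error) { calls++; return 40 + calls, nil }

	live, _ := sideEffect(r, "lookup", fetch)
	r.replaying = true
	replayed, _ := sideEffect(r, "lookup", fetch)
	fmt.Println(live, replayed, calls) // 41 41 1
}
```

The serializability requirement on T falls out of this shape: the value has to survive a round trip through the log, so anything the encoder cannot represent cannot be replayed.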
LLM calls
LLMCall(ctx, req) drives a streaming completion through the configured
provider. Emits TurnStarted, optional ReasoningEmitted, and
AssistantMessageCompleted. Enforces input/output/USD budgets inline.
Validates the chunk state machine (no EOF before ChunkEnd, no
duplicate ChunkToolUseStart, no chunks after ChunkEnd).
Tool dispatch
type ToolCall struct {
CallID, TurnID, Name string
Args json.RawMessage
Idempotent bool
MaxAttempts int
Backoff func(attempt int) time.Duration
}

- CallTool(ctx, c): sequential dispatch.
- CallTools(ctx, calls): fan-out with a semaphore (cap is step.DefaultMaxParallelTools, 8).
- Retries kick in on tool.ErrTransient when Idempotent and MaxAttempts > 1.
- NewCallID() mints fresh IDs.
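Semaphore-bounded fan-out of the kind described for CallTools is commonly written in Go with a buffered channel. The sketch below shows that general pattern, not Starling's implementation; `fanOut` and its string-returning jobs are hypothetical, and the limit of 8 merely echoes the default quoted above.

```go
package main

import (
	"fmt"
	"sync"
)

// fanOut runs every job concurrently, with at most limit in flight, and
// returns the results in input order.
func fanOut(limit int, jobs []func() string) []string {
	if limit < 1 {
		limit = 1 // a zero-capacity semaphore would never admit a job
	}
	results := make([]string, len(jobs))
	sem := make(chan struct{}, limit)
	var wg sync.WaitGroup
	for i, job := range jobs {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit jobs are already running
		go func(i int, job func() string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			results[i] = job()       // each goroutine writes its own index
		}(i, job)
	}
	wg.Wait()
	return results
}

func main() {
	var jobs []func() string
	for i := 0; i < 20; i++ {
		name := fmt.Sprintf("tool-%d", i)
		jobs = append(jobs, func() string { return name + ": done" })
	}
	out := fanOut(8, jobs)
	fmt.Println(len(out), out[0], out[19])
}
```

Writing each result to its own slice index keeps output order stable regardless of which goroutine finishes first, and avoids the need for a mutex.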
Replay errors
MismatchError carries Seq, Kind, ExpectedKind, Class
("exhausted" | "kind" | "payload" | "turn_id"), and Reason. It
satisfies errors.Is(ErrReplayMismatch). Use errors.As for the
structured fields. Other sentinels: ErrInvalidStream,
ErrMissingRawResponseHash. The replay package lifts these into
replay.Divergence (next section).
tool
type Tool interface {
Name() string
Description() string
Schema() json.RawMessage // JSON Schema for input
Execute(ctx, in) (json.RawMessage, error)
}

tool.Typed[In, Out](name, description, fn) derives the JSON Schema
from In via reflection. Errors wrapping tool.ErrTransient opt the
call into retry under step.ToolCall{Idempotent: true, MaxAttempts: N}.
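The retry opt-in can be read as: retry only when the call is marked idempotent, MaxAttempts allows more than one try, and the failure wraps the transient sentinel. A minimal sketch of that rule follows; `call`, `execute`, and `errTransient` are local, hypothetical names, and the exact backoff semantics of the real dispatcher are not specified in this reference.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient is a local stand-in for a "safe to retry" sentinel.
var errTransient = errors.New("transient")

// call is a hypothetical subset of the retry-related fields.
type call struct {
	Idempotent  bool
	MaxAttempts int
	Backoff     func(attempt int) time.Duration
}

// execute runs fn once, or up to MaxAttempts times when the call is
// idempotent and the failure wraps errTransient.
func execute(c call, fn func() (string, error)) (string, error) {
	attempts := 1
	if c.Idempotent && c.MaxAttempts > 1 {
		attempts = c.MaxAttempts
	}
	var lastErr error
	for attempt := 1; attempt <= attempts; attempt++ {
		out, err := fn()
		if err == nil {
			return out, nil
		}
		lastErr = err
		if !errors.Is(err, errTransient) {
			break // permanent failure: do not retry
		}
		if attempt < attempts && c.Backoff != nil {
			time.Sleep(c.Backoff(attempt))
		}
	}
	return "", lastErr
}

func main() {
	n := 0
	flaky := func() (string, error) {
		n++
		if n < 3 {
			return "", fmt.Errorf("upstream hiccup: %w", errTransient)
		}
		return "ok", nil
	}
	c := call{
		Idempotent:  true,
		MaxAttempts: 3,
		Backoff:     func(attempt int) time.Duration { return time.Duration(attempt) * time.Millisecond },
	}
	out, err := execute(c, flaky)
	fmt.Println(out, err, n) // ok <nil> 3
}
```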
tool.Wrap(t Tool, mw ...Middleware) Tool composes middleware
around Execute while passing Name, Description, and Schema
through unchanged. Last middleware passed runs first
(net/http.Handler ordering); short-circuiting middleware can skip
the inner call entirely. Useful for logging, timing, span
injection, request authentication, output redaction.
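The ordering rule is easiest to see in a toy composer. The sketch below uses plain function types rather than the tool package's interfaces, so `execFn`, `middleware`, `wrap`, `tag`, and `denyEmpty` are all hypothetical; it only demonstrates the behaviour stated above, where the last middleware passed becomes the outermost layer and can skip the inner call.

```go
package main

import (
	"fmt"
	"strings"
)

type execFn func(in string) (string, error)
type middleware func(next execFn) execFn

// wrap applies mw in order, so each middleware encloses the ones before
// it and the last one passed ends up outermost, i.e. it runs first.
func wrap(fn execFn, mw ...middleware) execFn {
	for _, m := range mw {
		fn = m(fn)
	}
	return fn
}

// tag records its name in trace before delegating to the next layer.
func tag(name string, trace *[]string) middleware {
	return func(next execFn) execFn {
		return func(in string) (string, error) {
			*trace = append(*trace, name)
			return next(in)
		}
	}
}

// denyEmpty short-circuits: on empty input the inner layers never run.
func denyEmpty(next execFn) execFn {
	return func(in string) (string, error) {
		if in == "" {
			return "", fmt.Errorf("empty input")
		}
		return next(in)
	}
}

func main() {
	var trace []string
	upper := func(in string) (string, error) {
		trace = append(trace, "tool")
		return strings.ToUpper(in), nil
	}

	wrapped := wrap(upper, tag("logging", &trace), tag("auth", &trace))
	out, _ := wrapped("hi")
	fmt.Println(out, trace) // HI [auth logging tool]

	trace = nil
	guarded := wrap(upper, tag("logging", &trace), denyEmpty)
	_, err := guarded("")
	fmt.Println(err, len(trace)) // empty input 0
}
```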
Test scaffolding (starlingtest/)
ScriptedProvider is a deterministic provider.Provider driven
by a slice of canned chunks per turn. Helpers NewStream,
AppendRunStarted, AssertReplayMatches, and
AssertReplayDiverges cover the common test shapes without
contacting an LLM.
MCP adapter (tool/mcp)
Three constructors mount remote MCP tools as ordinary Starling tools:
- New(ctx, transport, opts...): any mcp.Transport.
- NewCommand(ctx, exec.Cmd, opts...): stdio subprocess.
- NewHTTP(ctx, endpoint, client, opts...): streamable HTTP.
Each connects, lists remote tools, and exposes them via
client.Tools(ctx). Calls route through step.SideEffect so replay
never re-contacts the server. Full options table on the
MCP tools page. The inbound counterpart - a
read-only MCP server that exposes a recorded log to AI clients -
lives at MCP server.
HTTP daemon (starlingd)
starlingd.Command(factory) builds a CLI entrypoint for serving your
own agent over HTTP. starlingd.New(config) returns an http.Handler
for apps that already own server setup. The daemon exposes async run
creation, bounded in-process queueing, SSE streams, read APIs,
Prometheus metrics, bearer auth, and an optional inspector mount. Full
reference lives at HTTP daemon.
Built-in tools
tool/builtin/ ships Fetch() (public http/https only, 15s
timeout, 1 MiB cap, local/private-address and unsafe redirect
rejection) and ReadFile(baseDir) (path-escape rejection). Use
directly or as templates.
provider
The streaming-completion abstraction.
type Provider interface {
Info() Info
Stream(ctx, req) (EventStream, error)
}

Optional Capabler exposes Capabilities() so the conformance suite
can skip what the adapter doesn't support. A Request carries Model,
SystemPrompt, Messages, Tools, ToolChoice ("" | "auto" |
"any" | "none" | tool name), StopSequences, TopK,
MaxOutputTokens, and a vendor-specific Params blob.
EventStream yields StreamChunk values: ChunkText,
ChunkReasoning, ChunkRedactedThinking,
ChunkToolUseStart/Delta/End, ChunkUsage, ChunkEnd. The state
machine is enforced by step.LLMCall.
Adapters
| Package | Use when |
|---|---|
| provider/openai | OpenAI, Groq, Together, Ollama, vLLM, LM Studio, Azure, anything else OpenAI-compatible (set WithBaseURL). |
| provider/anthropic | Messages API. Tool use, extended thinking with signature, prompt caching. |
| provider/gemini | Native Google Gemini. |
| provider/bedrock | Amazon Bedrock via native ConverseStream (AWS SDK v2). |
| provider/openrouter | OpenRouter: thin wrapper over the OpenAI adapter with attribution headers. |
| provider/conformance | The contract test every adapter passes. |
Each adapter advertises its support set via
provider.Capabler.Capabilities(). The conformance suite skips
capability-gated assertions when the adapter reports false.
Error classification
Adapters wrap underlying SDK / HTTP errors with one of four
sentinels for retry policy via errors.Is:
| Sentinel | When |
|---|---|
| provider.ErrRateLimit | 429 / quota |
| provider.ErrAuth | 401 / 403 |
| provider.ErrServer | 5xx |
| provider.ErrNetwork | DNS / dial / TLS / broken stream |
Helpers: provider.WrapHTTPStatus(err, status) annotates by HTTP
status (delegates to ClassifyTransport when status == 0);
provider.ClassifyTransport(err) wraps net.Error and
*url.Error with ErrNetwork. 4xx errors that are neither auth
nor rate-limit pass through unwrapped on purpose - they reflect
caller bugs, not transient conditions.
replay
- Verify(ctx, log, runID, agent): headless check. Returns nil on a clean replay or wraps *Divergence with ErrNonDeterminism on the first mismatch. starling.Replay is a thin wrapper that takes *Agent directly.
- Stream(ctx, factory, log, runID): inspector path. Yields a ReplayStep per emitted event so the UI can render recorded vs produced side-by-side. The final step has Diverged: true when the replay didn't reach the recorded terminal.
Divergence carries RunID, Seq, Kind, ExpectedKind, Class,
Reason. Factory is func(ctx) (Agent, error).
inspect
Embedded HTTP handler. Serves the runs list, per-run timeline, event
detail, live tail (SSE), and replay. The standalone binary at
cmd/starling-inspect opens any SQLite log read-only.
inspect.New(log, opts...) (*Server, error). Options:
- WithAuth(authenticator): protect every endpoint.
- BearerAuth(token): convenience Authenticator.
- WithReplayer(factory): enable replay re-execution.
- WithDBPath(path): show the DB basename in the topbar context chip (full path on hover).
Read-only by construction, CSRF-protected on the replay POST endpoints. Front it with TLS in production: see Operations.
CLI (cmd/starling)
starling validate <db> [<runID>] # hash chain + Merkle check
starling export <db> <runID> # NDJSON event dump
starling prune [flags] <db> # dry-run-first retention deletion
starling inspect [flags] <db> # local web inspector (read-only)
starling mcp <db> # read-only MCP server over stdio for AI clients
starling replay <db> <runID> # headless replay (dual-mode binary only)
starling migrate [-dry-run] <db> # apply pending schema migrations
starling schema-version <db> # print the current schema version
starling doctor # quick health check: version, env vars, schema, validation
starling version # print the binary's Starling version (also -v / --version)

The stock binary is SQLite-only. Building a dual-mode binary that
links your agent factory enables starling replay and
starling inspect with replay re-execution.
Examples
| Path | What it shows |
|---|---|
| examples/m1_hello | Minimal hello agent, dual-mode inspector, OTel stdout exporter. |
| examples/incident_triage | Multi-tool agent, budgets, Resume, replay regression test, Postgres, Prometheus, OTel. |