Persist runs
Pick a backend. In-memory for tests, SQLite for single-host, Postgres for multi-host. Migrations, preflight, read-only inspectors.
eventlog.EventLog is the persistence interface. Three built-in
backends share it; your code only sees the interface, so swapping is a
constructor change.
The interface
type EventLog interface {
	Append(ctx context.Context, runID string, ev event.Event) error
	Read(ctx context.Context, runID string) ([]event.Event, error)
	Stream(ctx context.Context, runID string) (<-chan event.Event, error)
	Close() error
}
type RunLister interface {
	ListRuns(ctx context.Context) ([]RunSummary, error)
}
type RunPageLister interface {
	ListRunsPage(ctx context.Context, opts RunPageOptions) (RunPage, error)
}
type RunPruner interface {
	PruneRuns(ctx context.Context, opts PruneOptions) (PruneReport, error)
}
type RunPageOptions struct {
	Limit            int
	Offset           int
	Status           string
	Query            string
	StartedAfter     time.Time
	RequireToolCalls bool
}
type RunPage struct {
	Runs          []RunSummary
	TotalMatching int
	Limit         int
	Offset        int
}
type RunSummary struct {
	RunID        string
	StartedAt    time.Time
	LastSeq      uint64
	TerminalKind event.Kind
	// Aggregates over the run's events. Computed by every backend's
	// ListRuns implementation; zero values are valid for runs that
	// haven't produced an AssistantMessageCompleted yet.
	TurnCount     int
	ToolCallCount int
	InputTokens   int64
	OutputTokens  int64
	CostUSD       float64
	DurationMs    int64 // wall time from RunStarted to last event
}
All three built-in backends satisfy RunLister, RunPageLister, and
RunPruner. The inspector uses RunPageLister when available, and
falls back to RunLister for custom backends that only implement the
older listing interface. The aggregate fields on RunSummary are
computed at list time so dashboards don't have to re-aggregate event
streams.
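The fallback described above is an ordinary Go type assertion. A minimal sketch, assuming trimmed stand-in types: the context parameter and option struct are dropped for brevity, and legacyLog and firstPage are hypothetical names, not part of eventlog.

```go
package main

import "fmt"

// RunSummary, RunPage, RunLister, and RunPageLister are trimmed stand-ins
// for the eventlog types above, so this file compiles on its own.
type RunSummary struct{ RunID string }

type RunPage struct {
	Runs          []RunSummary
	TotalMatching int
}

type RunLister interface {
	ListRuns() ([]RunSummary, error)
}

type RunPageLister interface {
	ListRunsPage(limit, offset int) (RunPage, error)
}

// legacyLog is a hypothetical custom backend that only supports ListRuns.
type legacyLog struct{ runs []RunSummary }

func (l legacyLog) ListRuns() ([]RunSummary, error) { return l.runs, nil }

// firstPage prefers the paged interface and falls back to slicing the
// full listing when the backend only implements RunLister.
func firstPage(log any, limit int) (RunPage, error) {
	if pl, ok := log.(RunPageLister); ok {
		return pl.ListRunsPage(limit, 0)
	}
	rl, ok := log.(RunLister)
	if !ok {
		return RunPage{}, fmt.Errorf("backend %T cannot list runs", log)
	}
	all, err := rl.ListRuns()
	if err != nil {
		return RunPage{}, err
	}
	page := all
	if limit > 0 && len(page) > limit {
		page = page[:limit]
	}
	return RunPage{Runs: page, TotalMatching: len(all)}, nil
}

func main() {
	log := legacyLog{runs: []RunSummary{{"a"}, {"b"}, {"c"}}}
	page, err := firstPage(log, 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(page.Runs), page.TotalMatching) // prints: 2 3
}
```

The fallback path loads every run before slicing, which is the cost the paged interface avoids.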
Picking a backend
| Backend | Use when | Avoid when |
|---|---|---|
| NewInMemory() | Tests, demos, ephemeral CLIs. | Anything you want to replay later. |
| NewSQLite(path, ...) | Single-host services, edge nodes. WAL mode, one writer, many readers. | Multi-host writers (no cross-host locking). |
| NewPostgres(db, ...) | Multi-host services, regulated workloads, anything wanting PITR. | Workloads where the DB is unavailable for stretches. |
In-memory
log := eventlog.NewInMemory()
No migration, no schema check, no persistence. The whole log is gone
when the process exits. Useful for go test and one-shot CLIs.
SQLite
log, err := eventlog.NewSQLite("starling.db")
if err != nil { return err }
defer log.Close()
What you get:
- WAL mode + synchronous=NORMAL: fast appends, fsync on commit.
- Auto-migration on open: first open installs the schema; later opens migrate forward to the binary's schema version.
- Per-run _txlock=immediate: one writer, many readers.
- File permissions: 0600, owned by the agent user.
Options:
| Option | Purpose |
|---|---|
| WithReadOnly() | Open with mode=ro. Append returns ErrReadOnly. Inspector mode. |
Read-only example (e.g., a separate inspector binary against the same file):
log, err := eventlog.NewSQLite("starling.db", eventlog.WithReadOnly())
You can back up a live SQLite log without stopping the agent:
sqlite3 starling.db ".backup /tmp/starling-backup.db"
Postgres
import (
	"database/sql"
	"os"

	_ "github.com/jackc/pgx/v5/stdlib"
	"github.com/jerkeyray/starling/eventlog"
)
db, err := sql.Open("pgx", os.Getenv("DATABASE_URL"))
if err != nil { return err }
db.SetMaxOpenConns(8)
log, err := eventlog.NewPostgres(db, eventlog.WithAutoMigratePG())
if err != nil { return err }
defer log.Close()
What you get:
- Per-run pg_advisory_xact_lock on the run id hash: appenders to the same run serialize; different runs are independent.
- Multi-host safe: any number of writers across hosts.
- PITR / replication: standard Postgres tooling works.
- Postgres ≥ 11 required (uses hashtextextended).
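To make the per-run lock concrete: a transaction-scoped advisory lock takes a 64-bit key, so each run ID has to be reduced to one. A rough sketch, assuming FNV-1a as a client-side stand-in for the server-side hashtextextended (the real key values will differ); lockKey and lockSQL are hypothetical names.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// lockSQL takes a transaction-scoped advisory lock. Postgres releases it
// automatically at COMMIT or ROLLBACK.
const lockSQL = "SELECT pg_advisory_xact_lock($1)"

// lockKey maps a run ID to a signed 64-bit advisory-lock key.
// Assumption: FNV-1a stands in for Postgres's hashtextextended.
func lockKey(runID string) int64 {
	h := fnv.New64a()
	h.Write([]byte(runID)) // hash.Hash.Write never returns an error
	return int64(h.Sum64())
}

func main() {
	// The same run always maps to the same key, so its appenders queue up;
	// a different run gets a different key and proceeds independently.
	for _, id := range []string{"run-a", "run-a", "run-b"} {
		fmt.Printf("%s -> %s with key %d\n", id, lockSQL, lockKey(id))
	}
}
```

Because the key is a hash, two distinct runs could in principle collide and serialize against each other. That costs some throughput but not correctness.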
Options:
| Option | Purpose |
|---|---|
| WithAutoMigratePG() | Run InstallSchema at open. Without it, run migrations explicitly. |
| WithReadOnlyPG() | Append returns ErrReadOnly. Inspector mode. |
Use a Postgres role with the minimum privileges you need:
-- writer
GRANT SELECT, INSERT ON eventlog_events TO starling_writer;
-- reader (inspector)
GRANT SELECT ON eventlog_events TO starling_reader;
Migrations
import "github.com/jerkeyray/starling/eventlog"
// Print current version.
v, err := eventlog.SchemaVersion(ctx, log)
// Apply pending migrations (forward-only).
report, err := eventlog.Migrate(ctx, log)
// Dry-run for CI.
report, err = eventlog.Migrate(ctx, log, eventlog.WithDryRun())
CLI equivalents:
starling schema-version /var/lib/starling/log.db
starling migrate /var/lib/starling/log.db
starling migrate --dry-run /var/lib/starling/log.db
Preflight
Agent.Run and Agent.Resume call eventlog.Preflight(ctx, log) on
startup. It returns:
- nil if the schema matches.
- ErrSchemaOutdated if the database is older than the binary (run Migrate).
- ErrSchemaTooNew if the database is newer than the binary (deploy a newer binary or roll back the schema).
In-memory backends skip the check (return nil). Disable with
Config.SkipSchemaCheck = true in tests only.
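The three outcomes reduce to a version comparison. A minimal sketch, assuming integer schema versions; preflight and advice here are hypothetical local functions, and the sentinels are local copies rather than the eventlog ones.

```go
package main

import (
	"errors"
	"fmt"
)

// Local copies of the two schema sentinels so the sketch is self-contained.
var (
	ErrSchemaOutdated = errors.New("eventlog: schema outdated; run migrate")
	ErrSchemaTooNew   = errors.New("eventlog: schema too new for this binary")
)

// preflight compares the version stored in the database with the version
// the binary was built for. Both parameters are hypothetical inputs.
func preflight(dbVersion, binaryVersion int) error {
	switch {
	case dbVersion < binaryVersion:
		return fmt.Errorf("db at v%d, binary wants v%d: %w", dbVersion, binaryVersion, ErrSchemaOutdated)
	case dbVersion > binaryVersion:
		return fmt.Errorf("db at v%d, binary wants v%d: %w", dbVersion, binaryVersion, ErrSchemaTooNew)
	default:
		return nil
	}
}

// advice maps a preflight result to an operator action. errors.Is sees
// through the wrapping added by %w.
func advice(err error) string {
	switch {
	case err == nil:
		return "schema ok"
	case errors.Is(err, ErrSchemaOutdated):
		return "run migrate"
	case errors.Is(err, ErrSchemaTooNew):
		return "deploy a newer binary"
	default:
		return "unexpected: " + err.Error()
	}
}

func main() {
	for _, db := range []int{2, 3, 4} {
		fmt.Printf("db v%d: %s\n", db, advice(preflight(db, 3)))
	}
}
```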
Validation
eventlog.Validate(events) re-checks an entire run end to end. Use it
in CI to verify a recorded fixture hasn't drifted:
events, err := log.Read(ctx, runID)
if err != nil { return err }
if err := eventlog.Validate(events); err != nil {
	// wraps ErrLogCorrupt with a diagnostic.
}
Validate checks:
- Slice non-empty, events[0].Seq == 1, monotonic seq with no gaps.
- RunID consistent across all events.
- Hash chain unbroken.
- Exactly one terminal event, last in the slice.
- First event is RunStarted with a supported SchemaVersion.
- TurnStarted paired with a same-turn terminal.
- ToolCallScheduled paired with ToolCallCompleted or ToolCallFailed under the same (CallID, Attempt).
- Merkle root matches over every pre-terminal event.
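The first three checks can be sketched in a few lines. This is an illustration only: the Event shape and field encoding are invented, and SHA-256 stands in for BLAKE3 because the standard library has no BLAKE3. The terminal, pairing, and Merkle checks are left out.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/binary"
	"errors"
	"fmt"
)

var ErrLogCorrupt = errors.New("eventlog: log is corrupt")

// Event is a hypothetical, trimmed event record for this sketch.
type Event struct {
	RunID    string
	Seq      uint64
	PrevHash []byte // hash of the previous event; nil for the first
	Payload  []byte
}

// hashEvent hashes the fields the chain commits to.
// Assumption: SHA-256 replaces BLAKE3 and the encoding is made up.
func hashEvent(ev Event) []byte {
	h := sha256.New()
	h.Write([]byte(ev.RunID))
	var seq [8]byte
	binary.BigEndian.PutUint64(seq[:], ev.Seq)
	h.Write(seq[:])
	h.Write(ev.PrevHash)
	h.Write(ev.Payload)
	return h.Sum(nil)
}

// validate checks non-emptiness, gap-free seq starting at 1, a consistent
// RunID, and an unbroken hash chain.
func validate(events []Event) error {
	if len(events) == 0 {
		return fmt.Errorf("empty run: %w", ErrLogCorrupt)
	}
	var prev []byte
	for i, ev := range events {
		if ev.Seq != uint64(i+1) {
			return fmt.Errorf("seq %d at index %d: %w", ev.Seq, i, ErrLogCorrupt)
		}
		if ev.RunID != events[0].RunID {
			return fmt.Errorf("run id changed at seq %d: %w", ev.Seq, ErrLogCorrupt)
		}
		if !bytes.Equal(ev.PrevHash, prev) {
			return fmt.Errorf("chain broken at seq %d: %w", ev.Seq, ErrLogCorrupt)
		}
		prev = hashEvent(ev)
	}
	return nil
}

// appendEvent builds a well-formed next event for the sketch.
func appendEvent(events []Event, runID, payload string) []Event {
	ev := Event{RunID: runID, Seq: uint64(len(events) + 1), Payload: []byte(payload)}
	if n := len(events); n > 0 {
		ev.PrevHash = hashEvent(events[n-1])
	}
	return append(events, ev)
}

func main() {
	var run []Event
	for _, p := range []string{"RunStarted", "TurnStarted", "RunCompleted"} {
		run = appendEvent(run, "run-1", p)
	}
	fmt.Println(validate(run)) // a well-formed chain validates

	run[1].Payload = []byte("tampered") // breaks the link into event 3
	fmt.Println(validate(run))
}
```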
Reading and streaming
// One-shot read of a finished run.
events, err := log.Read(ctx, runID)
// Stream as the run unfolds (historical + live).
ch, err := log.Stream(ctx, runID)
for ev := range ch {
	// ...
}
Stream delivers historical events first, then live events. The
channel closes on context cancel, log close, or buffer overflow
(internal buffer is 256 events).
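Closing on overflow is a common way to keep a slow reader from stalling writers. The sketch below shows one way such a policy could work, using a non-blocking send; it is an assumption about the general technique, not the backends' actual code. The subscriber type is hypothetical and is not safe for concurrent use.

```go
package main

import "fmt"

// subscriber is a hypothetical fan-out target with a bounded buffer.
type subscriber struct {
	ch     chan string
	closed bool
}

func newSubscriber(buf int) *subscriber {
	return &subscriber{ch: make(chan string, buf)}
}

// publish tries a non-blocking send. If the buffer is full the subscriber
// is dropped by closing its channel, so the appender never waits on a
// slow reader. It reports whether the event was queued.
func (s *subscriber) publish(ev string) bool {
	if s.closed {
		return false
	}
	select {
	case s.ch <- ev:
		return true
	default:
		close(s.ch)
		s.closed = true
		return false
	}
}

// drain reads until the channel closes and returns what was delivered.
func drain(ch <-chan string) []string {
	var got []string
	for ev := range ch {
		got = append(got, ev)
	}
	return got
}

func main() {
	sub := newSubscriber(2) // tiny buffer for the demo; the text above says 256
	for _, ev := range []string{"e1", "e2", "e3"} {
		if !sub.publish(ev) {
			fmt.Println("subscriber dropped at", ev)
		}
	}
	fmt.Println(drain(sub.ch)) // prints: [e1 e2]
}
```

A consumer that sees the channel close while its own context is still live has probably fallen behind. One plausible recovery is to call Read for the run and resume from the last seq it processed.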
Paged listings
Use ListRunsPage for UI or API surfaces that browse many runs:
page, err := log.ListRunsPage(ctx, eventlog.RunPageOptions{
	Limit:  50,
	Offset: 0,
	Status: "completed",
	Query:  "support-ticket",
})
Limit <= 0 uses the backend default. SQLite and Postgres apply
filters, ordering, and pagination in SQL before materializing run
summaries, so large logs do not need to load every run just to render
the first page. No schema migration is required.
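Walking every page means advancing Offset until TotalMatching is reached. A minimal sketch against a hypothetical in-memory lister; fakeLog, defaultLimit, and allRuns are illustration-only names, and the context parameter is omitted.

```go
package main

import "fmt"

type RunSummary struct{ RunID string }

type RunPageOptions struct {
	Limit  int
	Offset int
}

type RunPage struct {
	Runs          []RunSummary
	TotalMatching int
	Limit         int
	Offset        int
}

// fakeLog is a hypothetical in-memory lister used only to drive the loop.
type fakeLog struct{ runs []RunSummary }

const defaultLimit = 2 // assumed backend default for Limit <= 0

func (f fakeLog) ListRunsPage(opts RunPageOptions) (RunPage, error) {
	limit := opts.Limit
	if limit <= 0 {
		limit = defaultLimit
	}
	start := opts.Offset
	if start < 0 {
		start = 0
	}
	if start > len(f.runs) {
		start = len(f.runs)
	}
	end := start + limit
	if end > len(f.runs) {
		end = len(f.runs)
	}
	return RunPage{Runs: f.runs[start:end], TotalMatching: len(f.runs), Limit: limit, Offset: start}, nil
}

// allRuns collects every run by requesting pages until the offset reaches
// TotalMatching or a page comes back empty.
func allRuns(log fakeLog, limit int) ([]RunSummary, error) {
	var out []RunSummary
	offset := 0
	for {
		page, err := log.ListRunsPage(RunPageOptions{Limit: limit, Offset: offset})
		if err != nil {
			return nil, err
		}
		out = append(out, page.Runs...)
		offset += len(page.Runs)
		if len(page.Runs) == 0 || offset >= page.TotalMatching {
			return out, nil
		}
	}
}

func main() {
	log := fakeLog{runs: []RunSummary{{"r1"}, {"r2"}, {"r3"}, {"r4"}, {"r5"}}}
	runs, err := allRuns(log, 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(runs)) // prints: 5
}
```

Offset pagination can skip or repeat rows if runs are added or pruned between requests. That is usually acceptable for a browsing UI, less so for an export.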
Retention pruning
Pruning is an explicit operator action outside the append-only
EventLog contract. It deletes whole runs only; it never removes a
single event or suffix from a run.
starling prune --older-than 720h /var/lib/starling/log.db # dry run
starling prune --older-than 720h --confirm /var/lib/starling/log.db
starling prune --before 2026-01-01T00:00:00Z --status completed /var/lib/starling/log.db
The default selection is terminal runs (completed, failed, and
cancelled) older than the cutoff. In-progress runs are kept unless
you pass --status "in progress" or --include-in-progress.
For Postgres, wire the same retention policy as a maintenance job with
a role that has SELECT and DELETE on eventlog_events:
// Compute the cutoff once so the dry run and the real run match.
cutoff := time.Now().Add(-90 * 24 * time.Hour)
// Dry run first: report what would be deleted without deleting it.
report, err := log.(eventlog.RunPruner).PruneRuns(ctx, eventlog.PruneOptions{
	Before: cutoff,
	DryRun: true,
})
if err != nil { return err }
fmt.Printf("would delete %d runs\n", report.MatchedRuns)
// Same cutoff without DryRun: delete the matched runs.
_, err = log.(eventlog.RunPruner).PruneRuns(ctx, eventlog.PruneOptions{
	Before: cutoff,
})
Keep inspector roles read-only (SELECT only).
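The default selection rule can be sketched as a filter. It is an assumption that the cutoff is compared against the run's start time; the run type, selectForPrune, and the sample data are all hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// run is a hypothetical, trimmed run record.
type run struct {
	ID        string
	StartedAt time.Time
	Status    string // "completed", "failed", "cancelled", or "in progress"
}

const cutoffAge = 720 * time.Hour

// fixedNow keeps the example deterministic.
var fixedNow = time.Date(2026, 3, 1, 0, 0, 0, 0, time.UTC)

// selectForPrune returns the IDs of whole runs to delete: runs started
// before the cutoff, skipping in-progress runs unless asked to include them.
func selectForPrune(runs []run, before time.Time, includeInProgress bool) []string {
	var ids []string
	for _, r := range runs {
		if !r.StartedAt.Before(before) {
			continue // too recent
		}
		if r.Status == "in progress" && !includeInProgress {
			continue // keep unfinished runs by default
		}
		ids = append(ids, r.ID)
	}
	return ids
}

// sampleRuns builds three runs relative to now.
func sampleRuns(now time.Time) []run {
	return []run{
		{ID: "old-done", StartedAt: now.Add(-1000 * time.Hour), Status: "completed"},
		{ID: "old-live", StartedAt: now.Add(-1000 * time.Hour), Status: "in progress"},
		{ID: "new-done", StartedAt: now.Add(-10 * time.Hour), Status: "completed"},
	}
}

func main() {
	cutoff := fixedNow.Add(-cutoffAge)
	fmt.Println(selectForPrune(sampleRuns(fixedNow), cutoff, false)) // prints: [old-done]
	fmt.Println(selectForPrune(sampleRuns(fixedNow), cutoff, true))  // prints: [old-done old-live]
}
```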
Helpers
turns, tools, inTok, outTok, cost, durNs :=
eventlog.AggregateRun(events)
AggregateRun is the single source of truth for per-run totals
across the runtime: the inspector's totals strip, the MCP server's
summarize_run tool, and RunSummary's aggregate fields all share
this implementation. An event whose payload fails to decode is
skipped rather than failing the whole aggregation, since callers
are typically presentation surfaces where one broken row should not
blank the dashboard.
err := eventlog.ForkSQLite(ctx, srcPath, dstPath, runID, beforeSeq)
ForkSQLite provides WAL-safe SQLite branching. It copies the source
via VACUUM INTO, which snapshots a live WAL-mode database into a single
file without leaking the .db-wal and .db-shm sidecars, and truncates
runID's events to those with seq < beforeSeq. Other runs are preserved
verbatim.
beforeSeq=0 keeps every event for runID (forks the run as-is);
returns ErrForkNotFound when nothing matches in the source. See
docs/cookbook/branching.md
in the runtime repo for a worked example.
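The truncation rule is easy to state as a filter over events. A minimal in-memory sketch of the semantics described above; forkEvents and the Event shape are hypothetical, and the real function works on SQLite files and can return ErrForkNotFound, which is not modelled here.

```go
package main

import "fmt"

// Event is a hypothetical, trimmed event record.
type Event struct {
	RunID string
	Seq   uint64
}

// forkEvents keeps every event from other runs and, for runID, only the
// events with Seq < beforeSeq. beforeSeq == 0 keeps the whole run.
func forkEvents(src []Event, runID string, beforeSeq uint64) []Event {
	var dst []Event
	for _, ev := range src {
		if ev.RunID == runID && beforeSeq != 0 && ev.Seq >= beforeSeq {
			continue // at or past the fork point: dropped from the copy
		}
		dst = append(dst, ev)
	}
	return dst
}

// sampleLog holds run "a" with seq 1..3 and run "b" with seq 1..2.
func sampleLog() []Event {
	return []Event{
		{"a", 1}, {"a", 2}, {"a", 3},
		{"b", 1}, {"b", 2},
	}
}

func main() {
	fmt.Println(len(forkEvents(sampleLog(), "a", 3))) // prints: 4 (a1, a2, b1, b2)
	fmt.Println(len(forkEvents(sampleLog(), "a", 0))) // prints: 5 (run forked as-is)
}
```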
Public merkle package
import "github.com/jerkeyray/starling/merkle"
The BLAKE3 hash-chain helpers used by Agent.Run are exposed as a
public package. Third parties writing their own event producers can
reuse the chain implementation rather than copying it - useful for
non-Agent.Run recorders that need to write into an EventLog and
maintain compatible chain output (e.g. importers, replay
harnesses, or custom dashboards that want to rebuild a Merkle root).
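For readers unfamiliar with the technique, rebuilding a Merkle root means hashing leaves pairwise until one hash remains. The sketch below does not use the merkle package and makes no claim about its API or output: SHA-256 replaces BLAKE3, and the domain prefixes and odd-node rule are assumptions, so its roots will not match a real log.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// merkleRoot folds leaf hashes pairwise until one root remains.
// Assumptions: SHA-256 replaces BLAKE3, leaves and interior nodes get
// distinct prefixes, an unpaired node is promoted unchanged, and an empty
// input yields the hash of the empty string.
func merkleRoot(leaves [][]byte) []byte {
	if len(leaves) == 0 {
		sum := sha256.Sum256(nil)
		return sum[:]
	}
	level := make([][]byte, len(leaves))
	for i, leaf := range leaves {
		sum := sha256.Sum256(append([]byte{0x00}, leaf...)) // 0x00 marks a leaf
		level[i] = sum[:]
	}
	for len(level) > 1 {
		var next [][]byte
		for i := 0; i < len(level); i += 2 {
			if i+1 == len(level) {
				next = append(next, level[i]) // odd node moves up as-is
				continue
			}
			node := append([]byte{0x01}, level[i]...) // 0x01 marks an interior node
			node = append(node, level[i+1]...)
			sum := sha256.Sum256(node)
			next = append(next, sum[:])
		}
		level = next
	}
	return level[0]
}

// rootHex is a convenience wrapper for string leaves.
func rootHex(leaves ...string) string {
	raw := make([][]byte, len(leaves))
	for i, l := range leaves {
		raw[i] = []byte(l)
	}
	return hex.EncodeToString(merkleRoot(raw))
}

func main() {
	fmt.Println(rootHex("RunStarted", "TurnStarted", "ToolCallScheduled"))
	// Changing any leaf changes the root.
	fmt.Println(rootHex("RunStarted", "tampered", "ToolCallScheduled"))
}
```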
Sentinel errors
var (
	ErrLogClosed      = errors.New("eventlog: log is closed")
	ErrLogCorrupt     = errors.New("eventlog: log is corrupt")
	ErrInvalidAppend  = errors.New("eventlog: invalid append")
	ErrReadOnly       = errors.New("eventlog: log is read-only")
	ErrSchemaOutdated = errors.New("eventlog: schema outdated; run migrate")
	ErrSchemaTooNew   = errors.New("eventlog: schema too new for this binary")
)
Wrapping a backend with metrics
If you call Append directly outside step.emit, wrap the log to
capture the same latency histograms:
import (
	"github.com/jerkeyray/starling"
	"github.com/jerkeyray/starling/eventlog"
)
obs := starling.NewMetrics(reg).EventLogObserver()
log = eventlog.WithMetrics(log, obs)
Anti-patterns
- Multiple processes writing to one SQLite file. Use Postgres.
- SkipSchemaCheck: true in production. Hides migrations you forgot to run.
- Calling Migrate on every process start without coordination. It's idempotent but wastes a transaction. Run it from your release pipeline; let the binary preflight on startup.
- Reusing a runID across runs. Once recorded, ids are retired. The agent mints fresh ULIDs; don't pass synthetic ids.