open-source. self-hostable.

Measure agentic engineering output.

See the spend. See the work. Scale what ships. Analytics across Claude Code, Codex, and the rest of your dev-AI stack.

The instrument

One surface for the whole stack.

What your agents cost, what they shipped, and which prompts actually ship code. Built for engineering leaders who were handed an AI bill and a pile of session logs and asked to make sense of both.

01 / The product
01

Track every dollar across the stack

The collector auto-detects coding agents on the engineer's machine, parses native session files locally, and normalizes spend across models. Pricing pinned at capture, so model-price shifts don't silently rewrite history.
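Price pinning at capture time boils down to storing a price snapshot alongside each event. A minimal sketch, assuming an illustrative price table and event shape (not Pella's actual schema or current model prices):

```typescript
// Illustrative per-million-token prices; real values change over time.
const PRICE_TABLE: Record<string, { inputPerM: number; outputPerM: number }> = {
  "claude-sonnet": { inputPerM: 3.0, outputPerM: 15.0 },
  "gpt-codex": { inputPerM: 2.0, outputPerM: 8.0 },
};

interface SessionEvent {
  model: string;
  inputTokens: number;
  outputTokens: number;
}

interface NormalizedEvent extends SessionEvent {
  // Price snapshot stored with the event, so later price-table
  // changes never rewrite historical spend.
  pinnedInputPerM: number;
  pinnedOutputPerM: number;
  costUsd: number;
}

export function normalize(ev: SessionEvent): NormalizedEvent {
  const p = PRICE_TABLE[ev.model];
  if (!p) throw new Error(`unknown model: ${ev.model}`);
  const costUsd =
    (ev.inputTokens / 1e6) * p.inputPerM +
    (ev.outputTokens / 1e6) * p.outputPerM;
  return { ...ev, pinnedInputPerM: p.inputPerM, pinnedOutputPerM: p.outputPerM, costUsd };
}
```

Because the pinned rates travel with the event, a later price cut shows up only in new events; history stays as it was billed.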

02

See what AI is actually shipping

The GitHub App joins sessions to merged PRs through accepted-edit events, AI-Assisted commit trailers, and a git-log fallback. Cost per merged PR with the commits that earned it. Sessions that shipped vs sessions that burned.
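One of those join paths, reading an AI-Assisted trailer out of a commit message, can be sketched like this. The trailer key and session-id format are assumptions for illustration, not the App's exact contract:

```typescript
// Extract the session id from an "AI-Assisted:" commit trailer, if present.
// Git trailers live in the last paragraph of the message, one per line.
export function parseAiTrailer(commitMessage: string): string | null {
  const paragraphs = commitMessage.trimEnd().split(/\n\s*\n/);
  const lastBlock = paragraphs[paragraphs.length - 1];
  for (const line of lastBlock.split("\n")) {
    const m = line.match(/^AI-Assisted:\s*(\S+)\s*$/);
    if (m) return m[1];
  }
  return null;
}
```

Commits with no trailer fall through to the accepted-edit events or the git-log fallback described above.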

03

Replicate the workflows that work

Cluster the team's prompts, find the cohort that solved the same problem cheaper, and promote winning workflows as playbooks. The pattern becomes the asset.
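The "cheaper cohort" comparison reduces to a per-cluster minimum over sessions that shipped. A toy sketch, assuming sessions already carry a cluster label (the clustering itself, e.g. over prompt embeddings, is out of scope here):

```typescript
interface Session {
  cluster: string;  // label from prompt clustering (assumed precomputed)
  engineer: string;
  costUsd: number;
  merged: boolean;  // did the session end in merged code?
}

// For each cluster, find the cheapest session that actually shipped:
// the candidate workflow to promote as a playbook.
export function cheapestShipped(sessions: Session[]): Map<string, Session> {
  const best = new Map<string, Session>();
  for (const s of sessions) {
    if (!s.merged) continue;
    const cur = best.get(s.cluster);
    if (!cur || s.costUsd < cur.costUsd) best.set(s.cluster, s);
  }
  return best;
}
```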

02 / See the spend · every agent, every token
Adapter · Status · What it captures
Claude Code · Full · Sessions, input/output/cache tokens, models, tool calls, accepted edits
Codex · Full · Sessions, per-turn token diffs, tool executions, dollar cost
Cursor · In dev · Messages, lines suggested, accept rate. Subscription-billed; no per-request cost exposed.
Continue.dev · In dev · Chat turns, token generation, edit outcomes, tool usage
OpenCode · In dev · Sessions, tokens, model routing (SQLite, post-v1.2)
VS Code (generic SDK) · In dev · Pluggable handlers via SDK; community adapters supported
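The pluggable-handler idea might look roughly like this in a community adapter. The interface names are hypothetical, not the actual @pella/collector SDK:

```typescript
// Hypothetical adapter contract: detect the agent on this machine,
// then parse its native session files into normalized events.
export interface UsageEvent {
  sessionId: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  timestamp: string; // ISO 8601
}

export interface AgentAdapter {
  name: string;
  detect(): Promise<boolean>;       // is this agent installed here?
  collect(): Promise<UsageEvent[]>; // parse local session files
}

// Registry the collector could iterate on each run.
const adapters: AgentAdapter[] = [];

export function registerAdapter(a: AgentAdapter): void {
  adapters.push(a);
}

export async function collectAll(): Promise<UsageEvent[]> {
  const out: UsageEvent[] = [];
  for (const a of adapters) {
    if (await a.detect()) out.push(...(await a.collect()));
  }
  return out;
}
```

A community adapter would implement `AgentAdapter` for its editor and register it; the collector skips anything `detect()` doesn't find on the machine.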
03 / See the work · spend tied to merged code
OUTCOME METRIC
14.2x
accepted edits per dollar
GitHub App joins sessions to merged PRs. Pricing pinned at capture. Reverts subtract.
WHAT YOU SEE
  • Spend per engineer, by model, project, day
  • Cost per merged PR, with the commits that earned it
  • Shipped vs burned: sessions, workflows, repos
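The headline ratio, with reverts subtracting, reduces to a net count over total spend. A sketch with illustrative field names:

```typescript
interface PrOutcome {
  acceptedEdits: number;
  revertedEdits: number; // edits later reverted; these subtract
  costUsd: number;       // pinned-at-capture spend for the sessions behind this PR
}

// Net accepted edits per dollar across a set of merged PRs.
export function editsPerDollar(prs: PrOutcome[]): number {
  const net = prs.reduce((n, p) => n + p.acceptedEdits - p.revertedEdits, 0);
  const spend = prs.reduce((s, p) => s + p.costUsd, 0);
  return spend > 0 ? net / spend : 0;
}
```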
04 / Scale what ships · AI Leverage Score v1
35%

Outcome quality

Sessions that end in merged code.

25%

Efficiency

Accepted edits per dollar, peer-normalized.

20%

Autonomy

How often a session ships without a hand-hold.

10%

Adoption depth

How many of your agents the engineer actually uses.

10%

Team impact

Playbooks this engineer promoted that others adopted.
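The five weights above combine as a straightforward weighted sum. A sketch assuming each component arrives already normalized to the 0..1 range (that normalization happens upstream):

```typescript
// v1 weights from the section above.
const WEIGHTS = {
  outcomeQuality: 0.35,
  efficiency: 0.25,
  autonomy: 0.2,
  adoptionDepth: 0.1,
  teamImpact: 0.1,
} as const;

export type Components = { [K in keyof typeof WEIGHTS]: number };

// Each component is expected in [0, 1]; the score lands in [0, 1] too.
export function leverageScore(c: Components): number {
  return (Object.keys(WEIGHTS) as (keyof typeof WEIGHTS)[]).reduce(
    (sum, k) => sum + WEIGHTS[k] * c[k],
    0,
  );
}
```

Because the weights sum to 1, a perfect engineer scores 1.0 and an idle seat scores 0.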

05 / Install
# Five minutes to first event — signed binary, no proxy, no API keys.
$ docker compose up -d                    # backend
$ pnpm --filter @pella/collector start    # collector

The most expensive system your engineering org has ever bought may be the one you understand the least.

Pella Metrics measures it. Spend across every agent. Outcomes tied to merged code. The prompts that ship, surfaced and shareable. Open-source, self-hostable, runs against your local sessions on day one. The data was always yours — now it's an instrument.

06 / License
Apache 2.0 for the collector, dashboard, adapters, schemas, and CLI.