Agents that compound
their own intelligence.

Reasoning traces that carry forward: shorter runs, sharper answers on repeat work, and less token spend on already-solved paths, all through recall, inject, and store.

Overview

What you get

TraceBase captures every solved problem as a reasoning trace and feeds it back into future runs. Your agents don't just execute — they accumulate expertise. Every run is built on every run before it.

IDEs

MCP tools for recall before a task and store after, without changing the flow teams already use.

SDK

Wrap OpenAI or Anthropic clients once; middleware runs recall, then your completion, then store.

Repeated work

Similar incidents surface prior traces instead of forcing the model to reason from zero each time.

Storage

Local SQLite by default, with optional embeddings when you want broader semantic retrieval.

TraceBase is included in the Daytona Startup Grid — Daytona's program backing early teams that ship AI-native developer infrastructure.

Before / After

Same model.
Different run profile.

TraceBase does not change the model choice. It changes what the model starts with and what the next run gets back after a fix has already shipped.

Same model, no layer

Every incident starts cold: no ranked priors in the prompt.

Cold path every time

No recall path means every repeat case still pays for fresh exploration. The model can solve it, but it does not start from what already worked.

No prior context is injected before the call.

With TraceBase

Recall pulls ranked traces before tokens hit the model.

Priors before the answer

Recall surfaces strong matches before tokens hit the model, then successful runs are written back so the next repeat case starts grounded instead of blank.

Benchmarks and methodology in the whitepaper.

How it works

Recall before the call.

Store after the fix.

The middleware sits between your code and the LLM. Before each call it checks memory. After each call it stores the result. No manual work.

01

Recall

Check memory for similar problems solved before.

02

Inject

Add prior solution to system prompt as a hint.

03

Call

LLM solves faster with context. Fewer tokens.

04

Store

New trace captured. Memory grows automatically.
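The four steps above can be sketched end to end. This is an illustrative sketch, not the TraceBase API: `TraceStore`, `solve`, and the `llm` callback are hypothetical names, and recall here is simple word overlap standing in for the real multi-signal ranking.

```typescript
type Trace = { problem: string; solution: string };

class TraceStore {
  private traces: Trace[] = [];

  // Recall: return the stored trace whose problem shares the most words.
  recall(problem: string): Trace | undefined {
    const words = new Set(problem.toLowerCase().split(/\s+/));
    let best: Trace | undefined;
    let bestOverlap = 0;
    for (const t of this.traces) {
      const overlap = [...new Set(t.problem.toLowerCase().split(/\s+/))]
        .filter((w) => words.has(w)).length;
      if (overlap > bestOverlap) {
        bestOverlap = overlap;
        best = t;
      }
    }
    return best;
  }

  store(trace: Trace): void {
    this.traces.push(trace);
  }
}

// One pass through the loop: 01 recall, 02 inject, 03 call, 04 store.
async function solve(
  problem: string,
  store: TraceStore,
  llm: (system: string, user: string) => Promise<string>,
): Promise<string> {
  const prior = store.recall(problem);                          // 01 recall
  const system = prior
    ? `A similar problem was solved before: ${prior.solution}`  // 02 inject
    : "No prior context.";
  const solution = await llm(system, problem);                  // 03 call
  store.store({ problem, solution });                           // 04 store
  return solution;
}
```

The first call on a problem runs cold; the second similar call starts from the stored trace.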

Integrations

Works with any agent.

Fits the tools teams already use.

MCP for IDE-native agents, middleware for wrapped SDKs, and a clean path for custom runtimes when retrieval needs to stay under your control.

Claude Code

Cursor

Codex

Under the hood

Multi-signal ranking

Fingerprint, BM25, Jaccard, structural, cosine. Two-stage retrieval.
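As an illustration of two-stage retrieval, here is a minimal sketch using just two of the listed signals, an exact-match fingerprint and Jaccard token overlap; the real signal set, weights, and stages are TraceBase internals, not shown here.

```typescript
type Weights = { jaccard: number; fingerprint: number };

function tokens(s: string): Set<string> {
  return new Set(s.toLowerCase().split(/\W+/).filter(Boolean));
}

// Jaccard token overlap: |intersection| / |union|.
function jaccard(a: Set<string>, b: Set<string>): number {
  const inter = [...a].filter((t) => b.has(t)).length;
  const union = new Set([...a, ...b]).size;
  return union === 0 ? 0 : inter / union;
}

function rank(query: string, candidates: string[], w: Weights): string[] {
  const q = tokens(query);
  return candidates
    .map((c) => ({ c, j: jaccard(q, tokens(c)) }))
    .filter(({ j }) => j > 0) // stage 1: cheap filter drops unrelated traces
    .map(({ c, j }) => ({
      c,
      // stage 2: weighted blend of signals over the surviving candidates
      score:
        w.jaccard * j +
        w.fingerprint * (c.toLowerCase() === query.toLowerCase() ? 1 : 0),
    }))
    .sort((x, y) => y.score - x.score)
    .map(({ c }) => c);
}
```
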

Adaptive weights

Thompson Sampling learns optimal signal weights from your feedback.
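Thompson Sampling can be sketched as a Beta posterior per signal: helpful feedback bumps α, unhelpful feedback bumps β, and each retrieval samples its weights from the posteriors. `SignalBandit` and the Bernoulli feedback model are assumptions for illustration, not TraceBase's actual learner.

```typescript
// For integer a, b: the a-th smallest of (a + b - 1) uniforms ~ Beta(a, b).
function sampleBeta(a: number, b: number): number {
  const u = Array.from({ length: a + b - 1 }, Math.random).sort((x, y) => x - y);
  return u[a - 1];
}

class SignalBandit {
  // Beta(α, β) posterior per signal over "did this signal pick a helpful trace?"
  private posts = new Map<string, { alpha: number; beta: number }>();

  constructor(signals: string[]) {
    for (const s of signals) this.posts.set(s, { alpha: 1, beta: 1 });
  }

  // Thompson step: draw one weight per signal from its posterior.
  sampleWeights(): Record<string, number> {
    const w: Record<string, number> = {};
    for (const [s, p] of this.posts) w[s] = sampleBeta(p.alpha, p.beta);
    return w;
  }

  // Feedback: success bumps α, failure bumps β.
  update(signal: string, helpful: boolean): void {
    const p = this.posts.get(signal)!;
    if (helpful) p.alpha += 1;
    else p.beta += 1;
  }
}
```

Signals that keep surfacing helpful traces end up with posteriors concentrated near 1, so their sampled weights dominate.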

Recall-before-call

Middleware recalls and injects prior solutions automatically.

Streaming

Full stream: true support. Traces are captured after the stream completes.
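The capture-after-completion pattern can be sketched as a pass-through async generator; `captureStream` is a hypothetical name, not the middleware's real interface.

```typescript
// Chunks pass through untouched; the full text is handed to a callback
// only once the stream ends, so capture never delays the caller.
async function* captureStream(
  chunks: AsyncIterable<string>,
  onComplete: (full: string) => void,
): AsyncGenerator<string> {
  const parts: string[] = [];
  for await (const chunk of chunks) {
    parts.push(chunk);
    yield chunk;              // caller still sees tokens as they arrive
  }
  onComplete(parts.join("")); // trace captured only after the stream ends
}
```
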

Local-first

SQLite with WAL. Sub-millisecond recall. Data stays on your machine.

Embeddings

Optional cosine similarity via OpenAI text-embedding-3-small.
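Cosine similarity itself is small enough to show inline; the vectors below are toy values standing in for text-embedding-3-small outputs.

```typescript
// Cosine similarity: dot product over the product of vector norms.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}
```
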

Setup

Three ways.

To use TraceBase.

Pick the layer that fits your stack: wrapped SDKs, one-command IDE rollout, or direct control inside custom agents.

OpenAI / Anthropic

recall → inject → call → store

SDK Middleware

Wrap the client once and let recall run before each call while store captures the resolved path after completion.

agent.ts
import OpenAI from "openai"
import { ReasoningLayer, wrapOpenAI } from "tracebase" // package name assumed
const layer = new ReasoningLayer()
const openai = wrapOpenAI(new OpenAI(), layer) // recall before, store after each call
await openai.chat.completions.create(...)

Custom integrations

Need TraceBase inside a custom runtime?

When your models already sit inside a larger workflow, the right move is usually not another generic widget. It's a reasoning layer shaped around the way your pipeline actually behaves.

We can plug retrieval, injection, and memory capture into existing agent loops, internal tools, or product surfaces without forcing a rewrite.

01

We map your current pipeline

We identify where your agents repeat work, where prompts drift, and where cost spikes come from avoidable re-exploration.

02

We fit TraceBase into the stack

SDK middleware, MCP tools, or a custom orchestration layer — whatever matches the way your product already routes model calls.

03

Your system compounds from there

Resolved paths get stored, recall quality improves, and repeat workflows start grounded instead of blank.

Pricing

Open source now.

Managed launch pricing.

Self-hosted is available today. The Startup and Enterprise tiers below are planned managed packaging, shown as draft launch pricing rather than live checkout.

Choose billing cadence

Annual keeps pricing a little lower for teams committing to a production rollout.

Open Source

Available now
$0/mo

Self-hosted memory, MCP, SDK middleware, and local SQLite storage.

  • Local SQLite memory
  • MCP / HTTP / SDK access
  • Adaptive weight learning
  • Embeddings with your own API key

Startup

Managed tier
$159/mo

Managed traces, analytics, and team access for small product teams shipping weekly.

  • 50,000 injections / mo
  • Unlimited API keys
  • Unlimited team members
  • Hosted traces + analytics

Enterprise

Custom rollout
Custom

Private deployment, SSO, and support for higher-volume or regulated environments.

  • Unlimited injections
  • SSO + SAML
  • Private deployment options
  • Custom retention + support

Stop paying for the same reasoning twice.

One install. Agents that get better with every run.