Attova records the decision behind every action: the evidence the agent had, the alternatives it rejected, its confidence, and the outcome that followed.
Refund the $42,000 disputed charge, or escalate to human review?
Execution runs the agent. Observation records the trace. Cognition preserves why the agent decided as it did and whether that reasoning held up. Today only the first two exist.
Enough to see the steps. Not enough to defend the call.
Enough to reconstruct the reasoning and answer for it.
A billing-recovery agent, instrumented with Attova. Open a decision to see the evidence it had and what it rejected, then replay the run.
Sandbox data is illustrative and runs entirely in your browser. No backend, no login.
| Decision | Confidence | Outcome | Divergence | Replay |
|---|---|---|---|---|
Nothing learns silently. Candidate experiences queue for approval before they become active knowledge, each one grounded in the run, decision, and outcome it came from.
Do not auto-refund disputed charges above $25,000 without a fraud-ring check. Escalate to human review instead.
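One way to picture the approval gate described above is a small queue in which every candidate starts as pending and only a reviewer can promote it. This is a minimal sketch, not the Attova API: the `ApprovalQueue` class, the `Experience` and `Provenance` types, and the identifiers in the usage lines are all hypothetical names chosen for illustration.

```typescript
// Hypothetical types: none of these names come from the Attova SDK.
type Provenance = { runId: string; decisionId: string; outcomeId: string };

type Experience = {
  id: string;
  rule: string;
  provenance: Provenance; // the run, decision, and outcome it came from
  status: 'pending' | 'active' | 'rejected';
};

class ApprovalQueue {
  private items: Record<string, Experience> = {};

  // New candidates always start as pending; nothing is active by default.
  propose(id: string, rule: string, provenance: Provenance): Experience {
    const candidate: Experience = { id, rule, provenance, status: 'pending' };
    this.items[id] = candidate;
    return candidate;
  }

  // A reviewer promotes or rejects a pending candidate exactly once.
  review(id: string, approved: boolean): Experience {
    const item = this.items[id];
    if (!item) throw new Error(`unknown experience: ${id}`);
    if (item.status !== 'pending') throw new Error(`already reviewed: ${id}`);
    item.status = approved ? 'active' : 'rejected';
    return item;
  }

  // Only approved experiences are visible to the agent.
  active(): Experience[] {
    return Object.keys(this.items)
      .map((key) => this.items[key])
      .filter((e) => e.status === 'active');
  }
}

// Usage with made-up identifiers.
const queue = new ApprovalQueue();
queue.propose(
  'exp_1',
  'Do not auto-refund disputed charges above $25,000 without a fraud-ring check.',
  { runId: 'run_1', decisionId: 'dec_1', outcomeId: 'out_1' },
);
queue.review('exp_1', true);
```

Keeping provenance on the record itself means a reviewer can trace any rule back to the decision that produced it before approving it.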
Inspect which experiences a runtime scenario would retrieve, why they matched, and where they land in the agent's context.
Disputed charges above $25,000 need a fraud-ring check before any auto-refund.
issuer_code lost_card can lag a customer card update by up to an hour.
High-value accounts: prefer escalation over write-off on ambiguous disputes.
Injected ahead of tool selection, highest-confidence first. Two lower-ranked experiences fell below the recall threshold and were dropped.
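The retrieval step above can be sketched as a filter followed by a sort. This is an illustrative sketch under stated assumptions rather than Attova's implementation: the `ScoredExperience` type, the `selectForContext` function, the match scores, and the 0.5 threshold are all invented for the example.

```typescript
// Hypothetical shape and scoring: illustrative only, not the Attova API.
type ScoredExperience = { rule: string; score: number };

// Keep matches at or above the recall threshold, highest score first.
function selectForContext(
  candidates: ScoredExperience[],
  recallThreshold: number,
): ScoredExperience[] {
  return candidates
    .filter((c) => c.score >= recallThreshold) // drop weak matches
    .sort((a, b) => b.score - a.score); // highest-confidence first
}

// Made-up scores for the three experiences shown above plus two weak matches.
const matches: ScoredExperience[] = [
  { rule: 'High-value accounts: prefer escalation over write-off.', score: 0.66 },
  { rule: 'Unrelated experience A (placeholder)', score: 0.41 },
  { rule: 'Disputed charges above $25,000 need a fraud-ring check.', score: 0.91 },
  { rule: 'issuer_code lost_card can lag a card update.', score: 0.78 },
  { rule: 'Unrelated experience B (placeholder)', score: 0.35 },
];

const selected = selectForContext(matches, 0.5);

// Render the block that would sit ahead of tool selection in the prompt.
const contextBlock = selected.map((e, i) => `${i + 1}. ${e.rule}`).join('\n');
console.log(contextBlock);
```

Because `filter` returns a new array, the sort does not reorder the caller's `matches` list.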
Most agents log what they did. The decision call records why: the question, what it chose, what it rejected, the evidence, the confidence. Dependency-free, in TypeScript and Python.
const run = await attova.startRun({ trigger: 'invoice.payment_failed' });
const action = await run.action({ actionType: 'tool_call', toolName: 'stripe.refund' });
// the call that makes it defensible
await run.decision({
  actionId: action.id,
  question: 'Refund or escalate?',
  chosen: 'refund',
  confidence: 0.95,
  alternatives: [{ option: 'escalate', rejectedReason: 'under auto-approve threshold' }],
  evidence: [{ type: 'tool_output', summary: 'charge 4d old, issuer code lost_card' }],
});
await run.outcome({ outcomeType: 'success', signal: 'succeeded' });
await run.finish('succeeded');
Work with us to instrument one production decision and see where confidence, evidence, and outcomes diverge. We do the setup. You tell us the truth.