purchase-to-pay replay oracle

Catch AP agent failures before payment

A small replay gate for invoice agents: duplicate invoice, vendor mismatch, missing receipt, missing approval, active holds, and consignment leakage.

Metric	Result	Detail
clean traces	12	2-way, 3-way, invoice-before-GR, consignment
seeded defects	48	unique replay scenarios, all expectation-checked
critical caught	36/36	duplicate, vendor, early payment, blocked payment
false holds	0	unnecessary holds emitted on clean traces
agent gate	block	unsafe clear_payment sample exits before execution

External Log Path

Streaming CSV and XES importers let you bring your own event logs, so the public repo stays usable without redistributing proprietary or benchmark data. Generated policy templates treat missing amount data as blocking by default.

p2p-replay-gate import-csv --input events.csv --output imported/events.jsonl --report imported/adapter_report.json
p2p-replay-gate import-xes --input events.xes --output imported/events.jsonl --strict
p2p-replay-gate policy-template --events imported/events.jsonl --output imported/policy.json --flow-type three_way_gr_based
p2p-replay-gate audit --events imported/events.jsonl --policy imported/policy.json --output imported/audit.json

BPIC2019 Pack

Bundled activity mapping for the public purchase-order handling log. The source dataset is not redistributed; the pack maps accounting evidence into replay events and infers 2-way, 3-way, invoice-before-GR, and consignment policies.

p2p-replay-gate pack-info bpic2019
p2p-replay-gate import-xes --pack bpic2019 --input BPI_Challenge_2019.xes --output imported/bpic2019.jsonl --report imported/bpic2019_adapter.json --max-cases 1000
p2p-replay-gate policy-template --events imported/bpic2019.jsonl --output imported/bpic2019_policy.json --flow-type auto --approval-limit 1000000000
p2p-replay-gate audit --events imported/bpic2019.jsonl --policy imported/bpic2019_policy.json --output imported/bpic2019_audit.json

Real smoke run: 1,000 BPIC2019 cases, 15,351 source rows, 10,627 replay events, 0 row errors, 0.970 audit trace coverage.

Check the adapter report before drawing conclusions from an audit. Activities outside the replay oracle's scope should stay visible as unmapped or be handled by a custom activity map. BPIC2019 does not expose the full approval workflow, so the pack keeps approval checks disabled unless custom evidence is supplied.

Agent Action Gate

Before an AP agent clears payment, the proposed action is converted into a replay event and checked against the same policy oracle.

p2p-replay-gate gate-action --events data/scenarios/base_events.jsonl --policy data/scenarios/policy.json --action clear_payment --case-id C004 --output reports/agent_action_gate_sample.json
decision: block (blocked by PAYMENT_BEFORE_APPROVAL)

The sample report records baseline state, proposed event, after-action violations, new codes, resolved codes, and an exit code for CI or agent tool wrappers.
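The gate idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tool's implementation: the `Event` shape, the `violations` rule, the activity names, and the exit-code convention (non-zero on block) are all hypothetical. Only the `PAYMENT_BEFORE_APPROVAL` code and the `clear_payment` action name come from the sample above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    case_id: str
    activity: str
    seq: int  # hypothetical ordering key within a case


def violations(events):
    """Replay one case in order and return the violation codes it produces."""
    codes = set()
    approved = False
    for ev in sorted(events, key=lambda e: e.seq):
        if ev.activity == "approval":
            approved = True
        elif ev.activity == "clear_payment" and not approved:
            codes.add("PAYMENT_BEFORE_APPROVAL")
    return codes


def gate_action(events, case_id, action):
    """Turn the proposed action into an event and diff violations before/after."""
    case_events = [e for e in events if e.case_id == case_id]
    baseline = violations(case_events)
    next_seq = max((e.seq for e in case_events), default=0) + 1
    proposed = Event(case_id, action, next_seq)
    after = violations(case_events + [proposed])
    new_codes = sorted(after - baseline)
    return {
        "decision": "block" if new_codes else "allow",
        "new_codes": new_codes,
        "resolved_codes": sorted(baseline - after),
        "exit_code": 1 if new_codes else 0,  # assumed convention for CI wrappers
    }


# Hypothetical case: invoice and goods receipt recorded, no approval yet.
log = [Event("C004", "invoice", 1), Event("C004", "goods_receipt", 2)]
report = gate_action(log, "C004", "clear_payment")

# Same case with an approval on record.
approved_log = log + [Event("C004", "approval", 3)]
approved_report = gate_action(approved_log, "C004", "clear_payment")
```

Diffing against a baseline means the gate blocks only on violations the proposed action would introduce, which keeps pre-existing findings from masking or inflating the decision.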

Operational Readiness Check

The ops report checks replay hygiene over the supplied log. It does not claim production SLA coverage.

Check	Fixture result	Why it matters
idempotency	0 duplicate event_id rows	safe retry and replay behavior starts with stable event identity
ordering	0 input inversions	late or out-of-order events remain visible before policy replay
schema	51 / 51 events versioned	event contract can evolve without silent parser drift
parallel replay	serial output matches 4-worker output	case-level partitioning stays deterministic under concurrent evaluation
digest	SHA-256 replay fingerprint	same event and policy payload can be compared across runs
resume	last case/timestamp/event cursor	batch replay can expose a restart point

p2p-replay-gate ops-report --events data/scenarios/base_events.jsonl --policy data/scenarios/policy.json --output reports/ops_readiness.json --iterations 20
summary: status=pass events=51 cases=12
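Three of the checks above (idempotency, ordering, digest) are easy to sketch. The field names `event_id` and `timestamp`, the adjacent-row definition of an inversion, the canonical JSON encoding, and the `flow_type` policy key are assumptions made for illustration; the actual report may compute these differently.

```python
import hashlib
import json


def hygiene(events):
    """Count duplicate event_id rows and adjacent timestamp inversions."""
    ids = [e["event_id"] for e in events]
    duplicate_ids = len(ids) - len(set(ids))
    # Assumes ISO-8601 timestamps in one format, so string order is time order.
    inversions = sum(
        1 for prev, cur in zip(events, events[1:])
        if cur["timestamp"] < prev["timestamp"]
    )
    return {"duplicate_event_ids": duplicate_ids, "input_inversions": inversions}


def replay_digest(events, policy):
    """SHA-256 over a canonical JSON encoding of the event and policy payload."""
    payload = json.dumps(
        {"events": events, "policy": policy},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical two-event log and policy.
sample = [
    {"event_id": "e1", "timestamp": "2024-01-01T09:00:00Z"},
    {"event_id": "e2", "timestamp": "2024-01-02T09:00:00Z"},
]
policy = {"flow_type": "three_way_gr_based"}

clean = hygiene(sample)
reversed_report = hygiene(list(reversed(sample)))
fingerprint = replay_digest(sample, policy)
```

Sorting keys before hashing makes the fingerprint independent of dictionary key order, so two runs over the same payload compare equal even if a serializer reorders fields.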

Persistent Replay Store

The store replay path adds a local SQLite event store, idempotent append, bounded ingest flushes, partition checkpoints, and simulated interruption recovery.

Check	Fixture result	Model
append pressure	255 attempted appends	5 repeated passes over 51 unique events
idempotency	204 duplicate retries	event_id primary key, at-least-once append
backpressure	peak queue depth 16	bounded local flush threshold
partition replay	4 partitions	stable sha256(case_id) modulo partition count
recovery	2 checkpoints before interrupt, 4 after recovery	partition cursor after successful replay
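The append and idempotency rows follow from simple arithmetic: 5 passes over 51 events is 255 attempts, of which 204 hit an existing key. A minimal sketch with Python's standard `sqlite3` module reproduces those counts. The table layout, the `append_batch` helper, and the synthetic events are assumptions, not the tool's schema or fixture.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open a store whose primary key makes repeated appends harmless."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "event_id TEXT PRIMARY KEY, case_id TEXT NOT NULL, payload TEXT NOT NULL)"
    )
    return conn


def append_batch(conn, batch):
    """Append a batch at-least-once; return (inserted, duplicate_retries)."""
    before = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "INSERT OR IGNORE INTO events (event_id, case_id, payload) "
            "VALUES (?, ?, ?)",
            [(e["event_id"], e["case_id"], e["payload"]) for e in batch],
        )
    after = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    inserted = after - before
    return inserted, len(batch) - inserted


# Synthetic stand-in for the fixture: 51 unique events across 12 cases.
events = [
    {"event_id": f"E{i:03d}", "case_id": f"C{i % 12:03d}", "payload": "{}"}
    for i in range(51)
]

conn = open_store()
attempted = 0
duplicate_retries = 0
for _ in range(5):  # 5 repeated passes
    inserted, dup = append_batch(conn, events)
    attempted += len(events)
    duplicate_retries += dup

stored = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

Counting rows before and after the batch avoids relying on per-statement row counts, at the cost of two extra queries per flush.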

p2p-replay-gate store-replay --events data/scenarios/base_events.jsonl --policy data/scenarios/policy.json --db reports/replay_store.sqlite --output reports/store_replay_report.json --partitions 4 --repeats 5 --queue-limit 16 --simulate-crash-after 2
summary: status=recovered events=51 checkpoints=4 duplicate_retries=204
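Partition assignment and checkpoint recovery can be sketched as follows. How the SHA-256 digest is turned into an integer (here, the full digest read big-endian) and the shape of `replay_partitions` are assumptions; the source states only that partitioning is a stable sha256(case_id) modulo the partition count.

```python
import hashlib


def partition_for(case_id, partitions):
    """Stable partition index: sha256(case_id) modulo the partition count."""
    digest = hashlib.sha256(case_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % partitions


def replay_partitions(case_ids, partitions, done, crash_after=None):
    """Replay partitions not yet checkpointed in `done`.

    Returns False if a simulated interruption stopped the run early.
    """
    buckets = {p: [] for p in range(partitions)}
    for cid in case_ids:
        buckets[partition_for(cid, partitions)].append(cid)

    completed = 0
    for p in range(partitions):
        if p in done:
            continue  # already checkpointed by an earlier run
        if crash_after is not None and completed == crash_after:
            return False  # simulated crash before this partition
        # A real replay of buckets[p] would run here.
        done.add(p)  # checkpoint only after a successful replay
        completed += 1
    return True


# Hypothetical case ids C001..C012.
case_ids = [f"C{i:03d}" for i in range(1, 13)]
checkpoints = set()

first_run_ok = replay_partitions(case_ids, 4, checkpoints, crash_after=2)
checkpoints_before_recovery = len(checkpoints)
second_run_ok = replay_partitions(case_ids, 4, checkpoints)
```

Unlike Python's built-in `hash`, a SHA-256-based index is stable across processes and restarts, so a recovered run sends each case to the same partition it had before the interruption.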

Replay Shape

The gate tests workflow state, not only final labels.

Case

policy, vendor, amount, quantity, approval threshold

Evidence

goods receipt, invoice, hold, release, approval, payment

Mutation

seeded defect inserted into a clean trace or replay replacement

Oracle

stateful rule evaluation with event-level evidence

Scorecard

JSON report with summary, failure queue, and metric glossary
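To make the mutation and oracle steps concrete, here is a minimal sketch for one defect type. The event fields, the `mutate_duplicate_invoice` helper, and the `DUPLICATE_INVOICE` code name are hypothetical; the real oracle covers more rules and its report format is not shown here.

```python
import copy

# Hypothetical clean 3-way trace for a single case.
CLEAN_TRACE = [
    {"event_id": "e1", "activity": "goods_receipt", "vendor": "V1"},
    {"event_id": "e2", "activity": "invoice", "vendor": "V1",
     "invoice_no": "INV-1", "amount": 100.0},
    {"event_id": "e3", "activity": "approval"},
    {"event_id": "e4", "activity": "payment"},
]


def mutate_duplicate_invoice(trace):
    """Seed a defect: re-insert the first invoice under a new event_id."""
    mutated = copy.deepcopy(trace)
    idx = next(i for i, e in enumerate(mutated) if e["activity"] == "invoice")
    duplicate = dict(mutated[idx], event_id=mutated[idx]["event_id"] + "-dup")
    mutated.insert(idx + 1, duplicate)
    return mutated


def oracle(trace):
    """Stateful replay: remember invoices seen so far, cite both events on a hit."""
    findings = []
    seen = {}  # (vendor, invoice_no) -> event_id of the first occurrence
    for ev in trace:
        if ev["activity"] != "invoice":
            continue
        key = (ev["vendor"], ev["invoice_no"])
        if key in seen:
            findings.append({
                "code": "DUPLICATE_INVOICE",
                "evidence": [seen[key], ev["event_id"]],
            })
        else:
            seen[key] = ev["event_id"]
    return findings


clean_findings = oracle(CLEAN_TRACE)
seeded_findings = oracle(mutate_duplicate_invoice(CLEAN_TRACE))
```

Because the finding carries the ids of both invoice events, a reviewer can trace the decision back to the exact rows rather than to a case-level label.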

Scenario Pack

Mutation	Count	Policy signal
duplicate_invoice	12	critical
vendor_mismatch	12	critical
amount_overbill	6	value mismatch
quantity_mismatch	3	quantity mismatch
payment_before_gr	3	critical
payment_before_approval	3	critical
payment_while_blocked	3	critical
consignment_invoice	3	PO-level invoice leak
consignment_duplicate_invoice	3	critical
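The table totals can be cross-checked against the headline numbers. The snippet below copies the counts and signals from the table; the `SCENARIO_PACK` name is only for illustration.

```python
# Counts and policy signals copied from the scenario pack table.
SCENARIO_PACK = {
    "duplicate_invoice": (12, "critical"),
    "vendor_mismatch": (12, "critical"),
    "amount_overbill": (6, "value mismatch"),
    "quantity_mismatch": (3, "quantity mismatch"),
    "payment_before_gr": (3, "critical"),
    "payment_before_approval": (3, "critical"),
    "payment_while_blocked": (3, "critical"),
    "consignment_invoice": (3, "PO-level invoice leak"),
    "consignment_duplicate_invoice": (3, "critical"),
}

total_scenarios = sum(count for count, _ in SCENARIO_PACK.values())
critical_scenarios = sum(
    count for count, signal in SCENARIO_PACK.values() if signal == "critical"
)
```

The sums come to 48 scenarios overall and 36 critical ones, matching the seeded-defect count and the 36/36 critical-caught denominator.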

Metric Notes

Metric	Meaning
critical caught	seeded critical rows where every expected code was emitted
duplicate recall	duplicate invoice scenarios caught by duplicate rule
false holds	clean traces that produced an unnecessary hold finding
audit coverage	runs containing the events needed to audit the policy decision
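Two of these metrics reduce to short set operations. The row shape (`critical`, `expected`, `emitted`) is an assumption, and the sketch treats any finding on a clean trace as an unnecessary hold, which may be broader than the tool's own definition.

```python
def critical_caught(rows):
    """Return (caught, total) over seeded critical rows.

    A row counts as caught only if every expected code was emitted.
    """
    critical = [r for r in rows if r["critical"]]
    caught = sum(
        1 for r in critical if set(r["expected"]) <= set(r["emitted"])
    )
    return caught, len(critical)


def false_holds(clean_rows):
    """Count clean traces that produced at least one finding."""
    return sum(1 for r in clean_rows if r["emitted"])


# Hypothetical scorecard rows.
seeded = [
    {"critical": True, "expected": ["DUPLICATE_INVOICE"],
     "emitted": ["DUPLICATE_INVOICE"]},
    {"critical": True, "expected": ["PAYMENT_BEFORE_APPROVAL"],
     "emitted": []},
    {"critical": False, "expected": ["AMOUNT_OVERBILL"],
     "emitted": ["AMOUNT_OVERBILL"]},
]
clean = [{"emitted": []}, {"emitted": []}]

caught, total = critical_caught(seeded)
holds = false_holds(clean)
```

Requiring the full expected set, rather than any overlap, keeps a scenario with two expected codes from counting as caught when only one fires.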

Fixture evidence only. No real vendor, invoice, payment, or company data is included.