Payments fintech, ~200 employees (anonymized) · AP cycle from 9 days to 2 days, zero vendor-bank-change incidents in 6 months

Outcomes

AP cycle: 9 days → 2 days
800+ invoices/month processed; 87% auto-coded by the LLM
Zero vendor-bank-change fraud incidents in 6 months (2 attempts, both blocked)
External audit firm walked the AP SoD chain and signed off in one session

The CFO had a specific problem: 800+ vendor invoices per month flowing through an AP team of 4, with two vendor-bank-change attempts in the prior quarter (both caught, but barely). The team was keeping up with invoice processing but had no real-time fraud detection on the vendor banking side.

The starting point

800+ invoices/month, 70% recurring SaaS / cloud / payment-processing vendors, 30% one-time
4-person AP team, working nights on AP during close week
9-day average AP cycle (invoice received → payment submitted)
Existing AP tool: a custom Python pipeline on top of QuickBooks + Mercury
Two recent vendor-bank-change attempts (one caught by the AP manager’s pattern recognition; one caught by sheer luck — a payment delay that gave the AP coordinator time to call the vendor)

The CFO’s brief: “We need an open chokepoint we can defend to our regulators, plus AI-assisted coding to reduce the cycle time.”

The architecture

Engine: closegate-engine on Kubernetes (3-node deploy for resilience)
Agent: closegate-agent + Claude Sonnet 4.6 + custom invoice-OCR via AWS Textract
Workspace: closegate’s Vue UI deployed at ap.internal.<company>.com
Auth: OIDC via Microsoft Entra ID; per-AP-team-member actor identity
Materiality: $25K threshold, with per-vendor-category overrides
Sensitive accounts: cash, intercompany clearing, vendor bank-change events (the key one), and tax provisions; all routed to human-in-the-loop (HITL) review regardless of materiality
Dual-HITL T3: payment-run submission enforces requestor ≠ approver ≠ payer across three distinct actor identities
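
As an illustration of how the materiality and sensitive-account settings could combine into a routing decision, here is a minimal Python sketch. It is not closegate's actual implementation: the `Policy` class, the `needs_human_review` function, the account labels other than VENDOR_BANK_CHANGE, and the "at or above the threshold goes to HITL" rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Policy:
    # Default materiality threshold from the case study ($25K).
    materiality_usd: float = 25_000.0
    # Hypothetical per-vendor-category overrides, e.g. {"one_time": 5_000.0}.
    category_overrides: dict = field(default_factory=dict)
    # Account labels are illustrative; only VENDOR_BANK_CHANGE is named in the text.
    always_human_accounts: frozenset = frozenset(
        {"CASH", "INTERCOMPANY_CLEARING", "VENDOR_BANK_CHANGE", "TAX_PROVISION"}
    )


def needs_human_review(amount_usd, vendor_category, accounts_touched, policy):
    """Return True when an item should be routed to the HITL queue."""
    # Sensitive accounts always go to a human, whatever the amount.
    if policy.always_human_accounts & set(accounts_touched):
        return True
    # Otherwise compare against the category override, falling back to the default.
    threshold = policy.category_overrides.get(vendor_category, policy.materiality_usd)
    # Assumption: amounts at or above the threshold require review.
    return amount_usd >= threshold
```

Checking the sensitive-account set before the threshold keeps the "regardless of materiality" rule from being bypassed by a small amount.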

The fraud-prevention pattern

The vendor-bank-change defense was the highest-priority requirement. Three controls layered:

always_human_accounts includes VENDOR_BANK_CHANGE — any state change touching vendor banking metadata fires HITL regardless of materiality.
First-payment-to-changed-bank lockout — any payment to a bank account changed in the prior 14 days requires HITL even if the match is otherwise clean. Policy field: bank_change_lockout_days: 14.
Audit log carries the bank-change timestamp on every match event — auditors can sample for “match against bank account changed in last 14 days” specifically.
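
The second and third controls can be sketched roughly as follows. This is a hedged illustration, not closegate's code: `in_bank_change_lockout` and `match_event` are hypothetical names, the event fields are invented for the example, and treating a change exactly 14 days old as still locked out is an assumption.

```python
from datetime import datetime, timedelta
from typing import Optional

# Mirrors the policy field bank_change_lockout_days: 14 from the text.
BANK_CHANGE_LOCKOUT_DAYS = 14


def in_bank_change_lockout(
    bank_changed_at: Optional[datetime],
    payment_at: datetime,
    lockout_days: int = BANK_CHANGE_LOCKOUT_DAYS,
) -> bool:
    """True if the payee bank account changed within the lockout window."""
    if bank_changed_at is None:
        return False  # no recorded bank change for this vendor
    # Assumption: the boundary day itself still counts as inside the window.
    return payment_at - bank_changed_at <= timedelta(days=lockout_days)


def match_event(invoice_id: str, bank_changed_at: Optional[datetime], payment_at: datetime) -> dict:
    """Build a hypothetical audit record that carries the bank-change timestamp."""
    return {
        "invoice_id": invoice_id,
        "bank_changed_at": bank_changed_at.isoformat() if bank_changed_at else None,
        "requires_hitl": in_bank_change_lockout(bank_changed_at, payment_at),
    }
```

Because every event carries `bank_changed_at`, an auditor's sample for "match against bank account changed in last 14 days" becomes a filter over the log rather than a join against vendor history.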

In 6 months of production, two vendor-bank-change attempts arrived (both via spear-phishing emails). Both routed to HITL. Both caught by the AP manager during the human-review step. Both reported as fraud attempts; no money moved.

The dual-HITL chain

Three named actors for every payment run:

Requestor (typically AP coordinator) — proposes the payment run
Approver (typically AP manager) — reviews + approves the run
Payer (typically controller) — actually releases to bank

closegate’s gate denies any reuse: requestor ≠ approver ≠ payer, with all three identities pairwise distinct (so the requestor cannot also act as payer). The runtime check is server-side; the LLM cannot fake any of the three identities. The audit log records all three.
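
A server-side check of this kind might look like the sketch below. The `PaymentRun` type, `SoDViolation` exception, and `enforce_sod` function are hypothetical names for illustration; the sketch assumes each actor field holds an identity taken from the authenticated session (for example an OIDC subject), never a value supplied by the LLM.

```python
from dataclasses import dataclass


class SoDViolation(Exception):
    """Raised when one identity fills more than one role on a payment run."""


@dataclass(frozen=True)
class PaymentRun:
    run_id: str
    requestor: str  # authenticated identity that proposed the run
    approver: str   # authenticated identity that approved it
    payer: str      # authenticated identity that releases to the bank


def enforce_sod(run: PaymentRun) -> None:
    """Reject the run unless all three roles are held by different identities."""
    actors = (run.requestor, run.approver, run.payer)
    # A set collapses duplicates, so fewer than 3 members means some reuse,
    # including the requestor == payer case that a chained comparison would miss.
    if len(set(actors)) != len(actors):
        raise SoDViolation(f"run {run.run_id}: roles must be three distinct actors")
```

Comparing the set size rather than chaining `!=` matters: in Python, `a != b != c` does not compare `a` with `c`.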

The external audit firm walked this chain in one session. The auditor’s quote, paraphrased: “This is the cleanest SoD walkthrough I’ve done on an AI-touched system.”

TCO

Cost line	Cost
Kubernetes infrastructure (3-node)	~$4,800/yr
AWS Textract (invoice OCR)	~$2,400/yr at this volume
Anthropic API (Sonnet 4.6)	~$28,000/yr at this volume (high)
One-time implementation (in-house, ~120 hours)	$24,000
Year 1 total	~$59,200
Year 2+	~$35,200/yr

Comparable Vic.ai quote for the same volume was $140K+/yr (per-invoice pricing). The savings funded the AP team’s tooling improvements for the rest of the year.
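
For readers who want to check the totals, the arithmetic behind the table can be reproduced in a few lines of Python. The figures are the ones cited above; the variable names are illustrative, and the savings figures are lower bounds because the comparison quote was "$140K+".

```python
# Recurring annual costs from the TCO table (USD).
recurring = {
    "kubernetes": 4_800,
    "textract": 2_400,
    "anthropic_api": 28_000,
}
one_time_implementation = 24_000  # ~120 in-house hours

year_2_plus = sum(recurring.values())             # 35_200
year_1 = year_2_plus + one_time_implementation    # 59_200
implied_hourly_rate = one_time_implementation / 120  # 200.0

vendor_quote_floor = 140_000  # "$140K+/yr", so treat as a minimum
min_savings_year_1 = vendor_quote_floor - year_1            # 80_800
min_savings_year_2_plus = vendor_quote_floor - year_2_plus  # 104_800
```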

What worked unexpectedly well

The LLM-proposed coding accuracy was higher than the team expected — 87% on first attempt, with the misses concentrated in a handful of niche vendor types that needed manual rule-tuning. After 3 months of policy.yaml tuning, the accuracy was 93% steady-state.

The MCP server surface meant the controller could ask Claude Desktop questions like “what’s our exposure to vendor X?” or “find me invoices we received but never paid” and get correct answers via the query_audit and recon://exceptions tools. The CFO uses this for monthly board prep.

What didn’t work initially

First materiality threshold was too loose. It started at $50K, which kept the HITL queue too small: too few invoices reached the AP manager for review. Lowered to $25K in week 4.
OCR-to-coding handoff had latency issues. The first architecture ran OCR and coding serially, so each invoice in a batch waited on the one before it. Refactored to parallel coding per invoice; cycle time dropped further.
The custom invoice-OCR pre-processor needed two iterations. AWS Textract handles 90% of invoices cleanly; ~10% needed a fallback to Mathpix for handwritten/scanned ones. Took 3 weeks to get the fallback chain right.
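
The last two fixes, per-invoice parallelism and an OCR fallback chain, can be sketched together with asyncio. This is an assumed shape, not the team's pipeline: `ocr_with_fallback`, `process_invoice`, `process_batch`, and the `max_concurrency` parameter are hypothetical, the OCR engines are passed in as plain async callables (no real Textract or Mathpix calls are shown), and the sketch runs the whole per-invoice pipeline concurrently rather than only the coding step.

```python
import asyncio
from typing import Awaitable, Callable, Optional

# An OCR engine takes document bytes and returns text, or None if it failed.
OcrFn = Callable[[bytes], Awaitable[Optional[str]]]
# The coding step takes OCR text and returns a proposed coding record.
CodeFn = Callable[[str], Awaitable[dict]]


async def ocr_with_fallback(doc: bytes, engines: list[OcrFn]) -> str:
    """Try each engine in order (primary first) until one yields text."""
    for engine in engines:
        text = await engine(doc)
        if text:
            return text
    raise ValueError("no OCR engine produced text for this document")


async def process_invoice(doc: bytes, engines: list[OcrFn], code_fn: CodeFn) -> dict:
    """OCR one invoice, then hand the text to the coding step."""
    text = await ocr_with_fallback(doc, engines)
    return await code_fn(text)


async def process_batch(
    docs: list[bytes],
    engines: list[OcrFn],
    code_fn: CodeFn,
    max_concurrency: int = 8,
) -> list[dict]:
    """Process invoices concurrently so none waits on its batch neighbours."""
    sem = asyncio.Semaphore(max_concurrency)  # cap in-flight API calls

    async def bounded(doc: bytes) -> dict:
        async with sem:
            return await process_invoice(doc, engines, code_fn)

    # gather returns results in the same order as the input documents.
    return await asyncio.gather(*(bounded(d) for d in docs))
```

The semaphore is a design choice for the sketch: it keeps a large batch from exceeding upstream rate limits while still removing the serial wait.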

What’s next

Multi-entity support for the upcoming acquisitions (currently US-only; adding UK + EU)
Slack-bot integration for the HITL queue (currently web UI only)
Migration of the close + recon workflows to closegate (currently still in a separate tool)

This case study is published with the design partner’s permission. Company name and revealing details anonymized; numbers cited are real.
