Benchmark · 13 min read

We benchmarked OCR + LLM extraction on 1,200 commercial invoices

Six vendors, one dataset, five failure modes. Where Floowed sits on the accuracy/latency/cost triangle, and where it falls short.

Six vendors. 1,200 commercial invoices. One human-reviewed gold set. This is the benchmark we wish had existed before we built Floowed's extraction engine.

What we measured

For each vendor we measured five things: accuracy (line-item match against the gold set), median extraction latency, P95 latency, cost per document, and the distribution of failure modes.
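As a rough illustration of how these metrics can be computed, here is a minimal sketch. It assumes line-item accuracy means the fraction of gold line items reproduced exactly, and that P95 uses the nearest-rank method; the benchmark does not spell out either definition, and the function names and sample values are hypothetical.

```python
import math
from statistics import median


def line_item_accuracy(predicted, gold):
    """Fraction of gold line items matched exactly by a predicted item.

    Assumption: a "match" is exact equality of the whole line-item tuple.
    Each predicted item can satisfy at most one gold item.
    """
    if not gold:
        return 1.0 if not predicted else 0.0
    remaining = list(predicted)
    matched = 0
    for item in gold:
        if item in remaining:
            remaining.remove(item)
            matched += 1
    return matched / len(gold)


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical sample data, not taken from the benchmark.
gold = [("Widget", 2, "9.50"), ("Freight", 1, "40.00")]
predicted = [("Widget", 2, "9.50"), ("Freight", 1, "4.00")]
latencies_s = [1.0, 2.0, 3.0, 4.0, 10.0]

accuracy = line_item_accuracy(predicted, gold)
median_latency = median(latencies_s)
p95_latency = percentile_nearest_rank(latencies_s, 95)
```

Reporting the median and P95 together is useful because a single slow outlier, like the 10-second document in this sample, barely moves the median but dominates the tail.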

Where Floowed sits

Median extraction time: 3.2 seconds. P95: 8.1 seconds. Line-item accuracy: 97.3% on standard invoices. Cost at scale: $0.004 per document.
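To put the per-document price in context, the arithmetic can be sketched as follows. The $0.004 figure and the 1,200-invoice dataset size come from the text above; the monthly volume is a hypothetical number chosen only for illustration, and the helper name is an assumption.

```python
from decimal import Decimal

COST_PER_DOCUMENT = Decimal("0.004")  # USD, the "cost at scale" figure quoted above


def extraction_cost(documents, cost_per_document=COST_PER_DOCUMENT):
    """Total extraction cost in USD for a given number of documents."""
    if documents < 0:
        raise ValueError("documents must be non-negative")
    return cost_per_document * documents


benchmark_cost = extraction_cost(1_200)    # the benchmark dataset size
monthly_cost = extraction_cost(250_000)    # hypothetical monthly volume
```

At the quoted rate, processing the whole benchmark set costs $4.80, and the hypothetical 250,000 documents a month would cost $1,000. Decimal is used instead of float so the totals stay exact.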

The five failure modes

Handwritten annotations, multi-currency line items, non-standard date formats, merged cells, and vendor abbreviations. Every vendor hits all five. We handle the first three better than average.
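A failure mode distribution like the one described can be tallied with a simple counter. The sketch below assumes each failed extraction has been labeled by a reviewer with exactly one of the five modes; the label strings and the sample labels are hypothetical, not benchmark data.

```python
from collections import Counter

# The five failure modes named above, as hypothetical label strings.
FAILURE_MODES = [
    "handwritten_annotation",
    "multi_currency_line_item",
    "non_standard_date_format",
    "merged_cells",
    "vendor_abbreviation",
]


def failure_distribution(labels):
    """Share of failures attributed to each mode (sums to 1.0 when there are failures)."""
    counts = Counter(labels)
    unknown = set(counts) - set(FAILURE_MODES)
    if unknown:
        raise ValueError(f"unknown failure modes: {sorted(unknown)}")
    total = sum(counts.values())
    return {mode: (counts[mode] / total if total else 0.0) for mode in FAILURE_MODES}


# Hypothetical labels for four failed documents from one vendor.
labels = [
    "merged_cells",
    "merged_cells",
    "vendor_abbreviation",
    "handwritten_annotation",
]
distribution = failure_distribution(labels)
```

Keeping every mode in the output, even at 0.0, makes distributions from different vendors directly comparable.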

Floowed's AP and BPO document automation handles three-way match, GL coding, and ERP posting with preset flows that go live in days.
