
We benchmarked OCR + LLM extraction on 1,200 commercial invoices

Six vendors, one dataset, five failure modes. Where Floowed sits on the accuracy/latency/cost triangle — and where we don't.

Mira Reyes
April 19, 2026

Six vendors. 1,200 commercial invoices. One human-reviewed gold set. This is the benchmark we wish had existed before we built Floowed's extraction engine.

What we measured

For each vendor we measured five things: accuracy (line-item match against the gold set), median extraction latency, P95 latency, cost per document, and the distribution of failure modes.
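As a rough illustration of how metrics like these can be computed, here is a minimal Python sketch. It is not the benchmark harness itself. It assumes a line item is a (description, quantity, amount) tuple, that a "match" means exact equality with a gold item, and that P95 uses the nearest-rank method. The function names and the sample data are hypothetical.

```python
import math
from collections import Counter
from statistics import median


def line_item_accuracy(predicted, gold):
    """Fraction of gold line items found in the prediction.

    Assumption: exact-match scoring; each predicted item can
    satisfy at most one gold item (multiset matching).
    """
    if not gold:
        return 1.0
    remaining = Counter(predicted)
    matched = 0
    for item in gold:
        if remaining[item] > 0:
            remaining[item] -= 1
            matched += 1
    return matched / len(gold)


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def failure_mode_distribution(labels):
    """Share of each failure-mode label among all labelled failures."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {mode: n / total for mode, n in counts.items()}


# Hypothetical sample data, for illustration only.
gold = [("Widget", 2, 19.98), ("Freight", 1, 12.50)]
predicted = [("Widget", 2, 19.98), ("Freight", 1, 21.50)]
latencies_s = [1.0, 2.0, 3.0, 4.0]

accuracy = line_item_accuracy(predicted, gold)
median_latency = median(latencies_s)
p95_latency = percentile_nearest_rank(latencies_s, 95)
```

Exact-match scoring is the strictest choice: a single wrong digit in an amount counts the whole line item as a miss. A tolerance-based or field-level score would produce different numbers.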

Where Floowed sits

Floowed's median extraction time was 3.2 seconds, with a P95 of 8.1 seconds. Line-item accuracy was 97.3% on standard invoices. Cost at scale was $0.004 per document.

The five failure modes

The five are handwritten annotations, multi-currency line items, non-standard date formats, merged cells, and vendor abbreviations. Every vendor we tested runs into all five. We handle the first three better than average.
