Daloopa benchmark puts structured-data grounding 71 points ahead on AI finance retrieval

A $47M Series C and a published accuracy gap sharpen the argument that the data layer, not the model, is now the binding constraint on agent-driven research.

INFLXD Research · 5 min read
Daloopa announced a $47M Series C on May 28, 2026, led by Brighton Park Capital with participation from Squarepoint Capital, Touring Capital, and Nexus Venture Partners. Alongside the round, the company published benchmark results showing that AI agent accuracy on financial-retrieval tasks improved by up to 71 percentage points when the agent was grounded in structured, auditable data rather than open web retrieval.

The round and the benchmark together frame a category argument: as buy-side research migrates into agents and chat surfaces, the reliability ceiling is set by the data layer underneath, not by the foundation model on top.

What an auditable data layer actually means

In an investment research context, "structured, auditable data" carries a specific load. Structured means each figure (revenue by segment, capex by quarter, a reconciled non-GAAP bridge) sits in a defined field with a known schema, not as free text inside a PDF. Auditable means each figure traces back to a primary source: a 10-K line, an 8-K table, a slide on page 14 of the Q1 2026 earnings deck, with the page, the filing, and the timestamp preserved.
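As a rough illustration of the two properties described above, the sketch below models a single figure as a typed record with an optional pointer back to its primary source. Every class, field name, and value here is hypothetical and chosen only for illustration; none of it reflects Daloopa's actual schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class SourceRef:
    """Where a figure came from. Field names are illustrative assumptions."""
    filing_type: str      # e.g. "10-K", "8-K", "earnings deck"
    document_id: str      # hypothetical identifier for the filing
    page: int             # page within the filing
    retrieved_on: date    # timestamp preserved with the figure


@dataclass(frozen=True)
class DataPoint:
    """One figure in a defined field, optionally tied to a primary source."""
    issuer: str
    metric: str           # e.g. "segment_revenue"
    period: str           # e.g. "Q1 2026"
    value: float
    unit: str
    source: Optional[SourceRef] = None

    def is_auditable(self) -> bool:
        # "Auditable" in this sketch: a source exists and names a real page.
        return self.source is not None and self.source.page > 0

    def citation(self) -> str:
        # Refuse to produce a citation for a figure with no provenance.
        if not self.is_auditable():
            raise ValueError("no primary source attached")
        s = self.source
        return (f"{s.filing_type} {s.document_id}, p. {s.page} "
                f"(retrieved {s.retrieved_on.isoformat()})")


# Made-up example values, not real financial data.
ref = SourceRef("earnings deck", "example-q1-2026", 14, date(2026, 5, 1))
sourced = DataPoint("EXAMPLE CO", "segment_revenue", "Q1 2026",
                    123.4, "USD millions", ref)
unsourced = DataPoint("EXAMPLE CO", "segment_revenue", "Q1 2026",
                      123.4, "USD millions")
print(sourced.citation())
print(unsourced.is_auditable())
```

The design point is that the unsourced record carries the same number but cannot produce a citation, which is the distinction the audit trail is meant to enforce.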

That audit trail is what makes a number citable to an investment committee. An analyst memo that says "segment operating margin compressed 180 bps YoY" has to survive the question: where did that number come from, and can I click through to the filing? A model that hallucinates a plausible-looking 180 bps fails that test silently. A retrieval system that pulls from a structured table tied to the source filing passes it.

Why the data layer became the constraint

The first wave of generative AI in finance focused on the model: which LLM, which context window, which fine-tune. The accuracy gap Daloopa is pointing at suggests a different bottleneck. When the same agent is pointed at open web results versus a structured, reconciled financial dataset, the structured-data version answers correctly far more often. The model is held constant in that comparison; the retrieval substrate is the variable.

This tracks with how research desks already think. A sell-side analyst's first-read note is only as good as the figures in the comp table; an investment-committee memo is only as good as the citations behind each claim. Moving those workflows into an agent does not change the underlying requirement. It raises the stakes, because the agent will produce an answer either way.

Three implications follow for firms moving AI from pilots into production:

  1. Vendor selection shifts. Evaluation criteria move from "which model" toward "which data layer, with what coverage, at what update latency, with what provenance per field."
  2. Compliance review gets a new surface. If an agent's answer is going to be cited internally, the audit trail behind each number has to be inspectable. Structured retrieval makes that possible; unstructured web retrieval generally does not.
  3. The build-versus-buy line moves. Maintaining a reconciled, primary-source-linked financial dataset across thousands of issuers is a heavy lift. Firms that previously planned to wire LLMs directly into filings are reconsidering the middle layer.

MCP as the distribution mechanism

The other half of the announcement is distribution. Daloopa is exposing its data layer through Model Context Protocol connectors with OpenAI's ChatGPT, Anthropic's Claude, Perplexity, and Rogo, and has introduced a Partner API for developers building on the data directly.

MCP, the open protocol for connecting LLM-based agents to external tools and data sources, matters here because it standardizes how an agent reaches into a structured dataset mid-conversation. An analyst inside Claude or ChatGPT can, in principle, ask a question about a specific issuer's segment trajectory and have the agent pull the underlying figures from a vetted source rather than guess from training data or scrape the open web. The protocol does not improve the data; it makes the good data reachable from where the analyst is already working.
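To make "reaches into a structured dataset mid-conversation" concrete: MCP messages are JSON-RPC 2.0, and a client invokes a server-side tool with a `tools/call` request. The sketch below only builds such a request. The tool name `get_segment_financials` and its arguments are hypothetical, invented for illustration; they are not taken from Daloopa's connector or any published tool list.

```python
import json
from typing import Any


def build_tool_call(request_id: int, tool_name: str,
                    arguments: dict[str, Any]) -> str:
    """Serialize an MCP-style `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call(
    1,
    "get_segment_financials",
    {"issuer": "EXAMPLE CO", "metric": "segment_revenue", "period": "Q1 2026"},
)
print(request)
```

In practice the agent host (the chat client) would send a message like this to the connector and pass the structured result back to the model, so the analyst never sees the request itself.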

For the category, that is the meaningful pattern. The accuracy gains do not require the analyst to leave their tool of choice. They require the tool of choice to be wired into a data layer that can answer with provenance.

What the 71-point gap implies for production

A 71-percentage-point spread is large enough to change the calculus on what AI workflows can be put in front of an analyst at all. A pilot that answers correctly 25% of the time is a demo; one that answers correctly 90%+ of the time is a tool. The published gap is on retrieval tasks specifically, not on judgment, scenario construction, or write-up. Those layers remain analyst work. But retrieval is the substrate everything else sits on.
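A short worked example may help separate percentage points from relative improvement. The counts below are assumptions built from the illustrative 25% baseline in the paragraph above plus the headline 71-point gap; they are not Daloopa's published per-task results.

```python
def accuracy(correct: int, total: int) -> float:
    """Fraction of retrieval questions answered correctly."""
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total


def percentage_point_gap(grounded: float, ungrounded: float) -> float:
    """Absolute difference between two accuracies, in percentage points."""
    return (grounded - ungrounded) * 100


def relative_improvement(grounded: float, ungrounded: float) -> float:
    """How many times more often the grounded agent is correct."""
    if ungrounded <= 0:
        raise ValueError("ungrounded accuracy must be positive")
    return grounded / ungrounded


# Illustrative counts only: 25 of 100 correct on open web retrieval,
# 96 of 100 correct when grounded (25 + 71 percentage points).
ungrounded = accuracy(25, 100)
grounded = accuracy(96, 100)

gap = percentage_point_gap(grounded, ungrounded)
ratio = relative_improvement(grounded, ungrounded)
print(f"gap: {gap:.0f} percentage points, ratio: {ratio:.2f}x")
```

Under these assumed counts the same result can be quoted as a 71-point gap or as a nearly fourfold improvement, which is why the unit matters when comparing vendor benchmarks.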

What to watch next: how quickly other financial data providers publish comparable grounded-versus-ungrounded benchmarks, whether MCP coverage expands to the major terminal and OMS surfaces, and how compliance teams adapt review processes when an agent's answer arrives with a clickable citation chain attached.
