Why AI Agents Need Adaptive Data Infrastructure
AI agents will not fit neatly into fixed APIs or index-everything data platforms. They need adaptive data infrastructure that can interpret intent, compose datasets on demand, and optimize around real usage.
I’ve argued that composability wins, first for tools, then for data. Relational databases and OLAP cubes gave us standalone data components, but they still lived inside rigid, pre‑defined spaces. To escape that constraint, we need to harness intent.
We know how to index and expose crypto data. What we are still building are the tools that let agents express what they want, and the data infrastructure that can respond to those wants.
Index everything vs. guess everything
Today, we have two models for serving blockchain data.
On one side, we have “index everything” platforms like Dune and Allium. They ingest raw blocks, decode logs and traces using ABIs, and store petabytes of normalized data across 100+ chains. They’re doing heroic work: they save everyone else from having to run nodes, manage decoders, or build bespoke pipelines just to answer basic questions.
The trade‑off is brutal. Indexing everything means:
- Paying upfront to process data from every chain, protocol, and event, whether anyone ever queries the data or not.
- Maintaining and updating thousands of decoding strategies as protocols evolve.
- Carrying the operational cost of big‑iron warehouses that can keep up with ever-growing data volumes.
That approach may be sustainable for a slow chain like Bitcoin, where a new block is produced roughly every 10 minutes. It’s a different story for L2s and high‑performance chains pushing blocks every few hundred milliseconds and adding new protocols weekly.
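To put rough numbers on that gap, here is a back-of-the-envelope sketch. The 600-second interval reflects Bitcoin’s roughly 10-minute block time; the 250 ms interval is an assumed stand-in for “a few hundred milliseconds,” not a measurement of any particular chain.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def blocks_per_day(block_interval_seconds: float) -> float:
    """Blocks produced per day at a given average block interval."""
    return SECONDS_PER_DAY / block_interval_seconds


bitcoin_blocks = blocks_per_day(600)      # ~10-minute blocks
fast_chain_blocks = blocks_per_day(0.25)  # assumed 250 ms blocks, illustrative only

print(f"10-minute chain: {bitcoin_blocks:.0f} blocks/day")
print(f"250 ms chain:    {fast_chain_blocks:.0f} blocks/day")
print(f"Ratio:           {fast_chain_blocks / bitcoin_blocks:.0f}x")
```

Under those assumptions the fast chain produces 2,400 times as many blocks per day, and every one of them has to be ingested and decoded whether or not anyone queries it.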
On the other side, we have API providers like Moralis, Alchemy, and QuickNode. They do something much narrower: instead of indexing everything, they pre‑package a small set of high‑value answers.
You want balances for a wallet? Hit one endpoint and get token balances, metadata, maybe even prices in a single call. You want recent transactions for an address? There’s an endpoint for that too, with a fixed time window and fixed shape.
This is far more tractable. They don’t need to anticipate every query; they just need to support the top n patterns that humans actually use.
In practice, that means:
- Defining a handful of standard data bundles like balances, positions, and prices.
- Optimizing the hell out of those endpoints.
- Accepting that if the user wants something off the menu, they’re on their own.
Index‑everything is over‑provisioned for today; API‑everything is under‑provisioned for the agents that are coming.
The hidden cost of “everything”
Index‑everything sounds ideal on paper. The reality is messier.
First, decoding is never finished. Protocols upgrade, ABIs change, and weird edge cases surface months after millions of blocks have been processed. When a decoding bug is discovered, you don’t just fix a function; you often have to re‑index large swaths of historical data. That can take days or weeks, during which “known bad” data is still being served to dashboards, models, and downstream pipelines.
Second, derivative datasets multiply the problem. Customers rarely want raw events; they want higher‑level constructs like balances, positions, TVL, user cohorts, or protocol‑specific metrics. So indexers build transformation layers on top of their raw data. If a bug slips through at the decoding layer, it propagates into every derivative dataset that depends on it.
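One way to picture that propagation is as a walk over a lineage graph. The sketch below is illustrative only: the dataset names and the LINEAGE mapping are hypothetical, not any provider’s actual schema.

```python
from collections import deque

# Hypothetical lineage: each dataset maps to the datasets built directly on it.
LINEAGE = {
    "raw_logs": ["decoded_swaps", "decoded_transfers"],
    "decoded_swaps": ["dex_volume"],
    "decoded_transfers": ["balances"],
    "balances": ["positions", "tvl"],
    "dex_volume": [],
    "positions": ["user_cohorts"],
    "tvl": [],
    "user_cohorts": [],
}


def affected_datasets(lineage, buggy):
    """Breadth-first walk returning every dataset downstream of `buggy`."""
    affected = []
    visited = {buggy}
    queue = deque([buggy])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in visited:
                visited.add(child)
                affected.append(child)
                queue.append(child)
    return affected


# A bug in transfer decoding taints everything derived from it.
print(affected_datasets(LINEAGE, "decoded_transfers"))
# ['balances', 'positions', 'tvl', 'user_cohorts']
```

A single decoder bug here invalidates four derived datasets, while a bug in a leaf dataset such as tvl affects nothing else.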
This is exactly what we see in practice: discrepancies between providers, and sometimes even between different datasets exposed by the same provider. These teams are not careless. The problem is hard enough that “perfect everywhere, all the time” just isn’t realistic at this scale.
The more chains we add, the more protocols we support, the more transformations we maintain, the worse this gets.
APIs: designed for humans, not agents
API‑centric providers escape a lot of that complexity by narrowing their scope.
If your job is “give me the latest balance and a 14‑day transaction history for this address,” you can engineer a highly tuned pipeline just for that use case. If your Token API returns balances plus metadata and prices in one response, you’re explicitly optimizing for a human developer who wants to get something on screen with a single call.
The constraint is hidden in plain sight: APIs anticipate human needs. They encode a bet about what questions people will ask. That’s why you get endpoints like:
- /address-latest with a fixed history window.
- /wallet/{address}/token-balances with predefined enrichments.
This is great for typical human workflows that produce dashboards, portfolio trackers, tax tools, or scanners. It’s far from ideal for agents.
Agents don’t browse. They don’t scroll through dashboards looking for patterns. They typically start with a much sharper intent:
- “Simulate the PnL impact of unwinding this set of positions across three chains.”
- “Identify wallets whose behavior matches this pattern over the last n epochs.”
- “Combine on-chain activity with off‑chain telemetry and internal risk signals to surface anomalies.”
You can’t pre‑bake endpoints for every one of those. Trying to guess what an AI agent will want is a mug’s game.
Intent is the missing layer
So we’re stuck between two imperfect worlds:
- Index‑everything platforms that try to be ready for any question, at the cost of high spend and fragility.
- APIs that pre‑optimize for a few questions, at the cost of flexibility.
Both are missing the same thing: a first‑class notion of intent.
An ideal data service for the agentic world would do three things:
- Let agents specify what they want in rich, declarative terms, not just pick from a fixed menu of endpoints.
- Use that intent to adapt its internal pipelines – indexing, decoding, and materialization – dynamically over time.
- Align costs with actual demand instead of indexing or pre‑computing everything “just in case.”
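As a minimal sketch of what “rich, declarative terms” could look like, the snippet below models an intent as plain data and checks it against a fixed menu. The Intent fields, the menu entries, and the chain names are all assumptions made for illustration, not an existing API.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class Intent:
    """A declarative description of what an agent wants (hypothetical schema)."""
    metric: str                          # what to compute, e.g. "pnl_impact"
    chains: Tuple[str, ...]              # which chains to cover
    filters: Dict[str, object] = field(default_factory=dict)
    max_latency_ms: int = 5_000          # how long the agent is willing to wait
    expected_calls_per_day: int = 1      # hint about future demand


# What a fixed-endpoint API can answer out of the box (illustrative).
FIXED_MENU = {"token_balances", "recent_transactions"}


def fits_fixed_menu(intent: Intent) -> bool:
    """True only for single-chain, unfiltered requests that are already on the menu."""
    return (
        intent.metric in FIXED_MENU
        and not intent.filters
        and len(intent.chains) == 1
    )


unwind = Intent(
    metric="pnl_impact",
    chains=("ethereum", "arbitrum", "base"),
    filters={"positions": ["pos-1", "pos-2"]},
    max_latency_ms=2_000,
    expected_calls_per_day=24,
)
balances = Intent(metric="token_balances", chains=("ethereum",))

print(fits_fixed_menu(unwind))    # False
print(fits_fixed_menu(balances))  # True
```

The point of the sketch is that the agent’s request carries its own latency and demand expectations, which is the information a platform would need in order to adapt.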
In that model, an agent doesn’t care whether the data is pre‑materialized or constructed on the fly. It just cares that:
- The semantics are consistent.
- The latency is acceptable for the task.
- Future queries get cheaper and faster as it clarifies its needs.
The data platform’s job is to negotiate that contract: “What do you need now? What are you likely to need later? What guarantees do you want me to uphold?”
Letting intent shape the dataset
If you squint, you can see the outline of this in today’s systems.
Index‑everything platforms are effectively pre‑building the entire “data space” in case anyone wanders into any corner of it. API providers pre‑build a few well‑lit paths through that space for human tourists. The missing piece is an adaptive layer that can turn agent intent into just‑in‑time answers.
Concretely, that looks like:
- Lazy completeness: the agent can behave as if the source “contains everything,” even though, under the hood, some views are being built on demand the first time they’re requested.
- Intent‑driven optimization: the first query might be slow, but the agent can communicate its future retrieval patterns – frequency, latency sensitivity, and accuracy requirements – so the system knows what to cache, index, or materialize.
- Feedback into decoding and quality: when agents detect anomalies or conflicting results, that signal feeds back into decoders and transformations, triggering targeted re‑indexing instead of yet another full rebuild.
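The three behaviors above can be sketched in a few dozen lines. This is a toy model under stated assumptions, not a real system: AdaptiveStore, its thresholds, and the wallet_cohorts view are hypothetical names invented for the example.

```python
class AdaptiveStore:
    """Toy adaptive layer: builds views lazily and keeps the popular ones warm."""

    def __init__(self, builders, promote_after=2, hot_calls_per_day=100):
        self.builders = builders                    # view name -> function that builds it
        self.promote_after = promote_after          # observed hits before a view is kept warm
        self.hot_calls_per_day = hot_calls_per_day  # declared demand that justifies eager work
        self.materialized = {}                      # views currently kept warm
        self.hits = {}                              # observed demand per view
        self.builds = {}                            # how often each view was (re)built

    def _build(self, view):
        self.builds[view] = self.builds.get(view, 0) + 1
        return self.builders[view]()

    def hint(self, view, calls_per_day):
        """Intent-driven optimization: pre-build when declared demand is high."""
        if calls_per_day >= self.hot_calls_per_day and view not in self.materialized:
            self.materialized[view] = self._build(view)

    def query(self, view):
        """Lazy completeness: any known view is answerable, built on use if needed."""
        self.hits[view] = self.hits.get(view, 0) + 1
        if view in self.materialized:
            return self.materialized[view]
        result = self._build(view)
        if self.hits[view] >= self.promote_after:
            self.materialized[view] = result
        return result

    def report_anomaly(self, view):
        """Quality feedback: invalidate only the suspect view, not everything."""
        self.materialized.pop(view, None)


store = AdaptiveStore({"wallet_cohorts": lambda: ["0xabc", "0xdef"]})
for _ in range(3):
    store.query("wallet_cohorts")
print(store.builds["wallet_cohorts"])  # 2: built twice, then served warm

store.report_anomaly("wallet_cohorts")
store.query("wallet_cohorts")
print(store.builds["wallet_cohorts"])  # 3: only this view was rebuilt
```

Cost follows demand in this model: a view nobody asks for is never built, and an anomaly report triggers one targeted rebuild instead of a full re-index.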
In this model, the economics become saner: you pay heavily where demand and confidence requirements are real, and you keep everything else closer to “raw, but reachable.” Revenue and cost are tied together by actual usage, not by the theoretical possibility that someone might someday care about a given piece of data.
This also changes how we think about “data products.” Instead of shipping a fixed set of tables or endpoints, a data provider:
- Prepares a set of composable primitives (raw events, curated entities, well‑defined metrics, etc.).
- Curates semantics and lineage.
- Exposes a negotiation protocol for intent: “tell me what you’re trying to do, and I’ll rearrange myself to serve you better over time.”
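A negotiation step of this kind might be as simple as a request and an offer. The sketch below is one hypothetical shape for it; the Request and Offer types, the strategy names, and the latency estimates are assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical latency estimates for the two ways of serving a view.
EST_LATENCY_MS = {"materialized": 50, "on_demand": 4000}


@dataclass(frozen=True)
class Request:
    view: str
    max_latency_ms: int


@dataclass(frozen=True)
class Offer:
    view: str
    strategy: str        # "materialized", "on_demand", or "materialize_first"
    est_latency_ms: int


def negotiate(request, materialized_views):
    """Prefer work already done; avoid pre-building when on-demand is fast enough."""
    if request.view in materialized_views:
        return Offer(request.view, "materialized", EST_LATENCY_MS["materialized"])
    if request.max_latency_ms >= EST_LATENCY_MS["on_demand"]:
        return Offer(request.view, "on_demand", EST_LATENCY_MS["on_demand"])
    # On-demand is too slow for this agent, so the provider commits to pre-building.
    return Offer(request.view, "materialize_first", EST_LATENCY_MS["materialized"])


warm = {"token_balances"}
print(negotiate(Request("token_balances", 200), warm).strategy)     # materialized
print(negotiate(Request("wallet_cohorts", 10_000), warm).strategy)  # on_demand
print(negotiate(Request("wallet_cohorts", 500), warm).strategy)     # materialize_first
```

The provider only takes on pre-computation when an agent’s stated requirement makes it necessary.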
Back to Unix
The original Unix designers assumed they didn’t know what users would want to do, so they optimized for composition over prediction. They gave us a small set of primitives – files, text streams, processes, and pipes – and then got out of the way.
Most data systems, especially in crypto, still assume the opposite: that we can predict the important questions and bake them into schemas, cubes, endpoints, and dashboards. That assumption is already under strain in a human‑driven world. In an agent‑driven world, it breaks completely.
An intent‑aware, adaptive data service is the Unix shell for data:
- Indexers and decoders are small, focused utilities.
- Raw and curated datasets are the files.
- Materialized views and transformations are the scripts.
- The agent’s intent is the command line.
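To make the analogy concrete, here is a small sketch of shell-style composition over data. The event format and the decode, only, and net_flow stages are invented for the example; the pipe helper plays the role of the Unix pipe.

```python
from functools import reduce

# Hypothetical raw events; "data" packs sender, receiver, and amount.
RAW_EVENTS = [
    {"topic": "Transfer", "data": "0xaaa:0xbbb:10"},
    {"topic": "Swap", "data": "0xaaa:0xccc:5"},
    {"topic": "Transfer", "data": "0xbbb:0xaaa:4"},
]


def decode(raw_events):
    """Small, focused utility: turn raw events into structured records."""
    for event in raw_events:
        sender, receiver, amount = event["data"].split(":")
        yield {"kind": event["topic"], "from": sender, "to": receiver, "amount": int(amount)}


def only(kind):
    """Build a filter stage, similar in spirit to grep."""
    return lambda events: (e for e in events if e["kind"] == kind)


def net_flow(events):
    """Aggregate stage: net amount received per address."""
    totals = {}
    for e in events:
        totals[e["from"]] = totals.get(e["from"], 0) - e["amount"]
        totals[e["to"]] = totals.get(e["to"], 0) + e["amount"]
    return totals


def pipe(source, *stages):
    """Feed the output of each stage into the next, like a shell pipeline."""
    return reduce(lambda data, stage: stage(data), stages, source)


result = pipe(RAW_EVENTS, decode, only("Transfer"), net_flow)
print(result)  # {'0xaaa': -6, '0xbbb': 6}
```

None of the stages knows what question will be asked; the question is expressed entirely by how they are composed.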
Composability won for tools. It will win again for data. The next step is to let intent drive how data is composed, so we stop pretending we can predict the future and start building systems that can adapt to it.
About Ormi
Ormi is the next-generation data layer for Web3, purpose-built for real-time, high-throughput applications like DeFi, gaming, wallets, and on-chain infrastructure. Its hybrid architecture ensures sub-30ms latency and up to 4,000 RPS for live subgraph indexing.
With 99.9% uptime and deployments across ecosystems representing $50B+ in TVL and $100B+ in annual transaction volume, Ormi is trusted to power the most demanding production environments without throttling or delay.