How to Access Crypto Data Using Subgraphs

Crypto data isn’t designed to be queried. This guide explains how to access, structure, and query on-chain data, comparing RPC nodes, APIs, data warehouses, and subgraphs, and what actually works in production.

If you've spent any time trying to pull structured data from a blockchain, you've likely run into the same problem.

Raw on-chain data isn’t designed to be queried. It’s append-only, sequential, and spread across thousands of blocks.

Getting a clean answer to something as simple as “what’s the total trading volume for this token over the last 30 days?” usually means running your own node, using a data warehouse, or stitching together multiple API calls and decoding everything yourself.

There’s a better way to access crypto data. Subgraphs solve a big part of that problem, but they come with tradeoffs. This guide walks through how they work and why they’ve become the standard.

The crypto data problem

Every on-chain action is recorded as a log inside a transaction, inside a block. That structure works well for consensus, but isn't efficient for querying.

Answering even a basic question, like the 30-day volume example above, requires reconstructing state from raw logs rather than querying a database.

In practice, that leaves you with a few options:

Run your own node

This gives you full access to raw blockchain data, but you're responsible for syncing, storage, and building your own indexing layer. Archive nodes for Ethereum alone require 15+ TB of storage.

Use a commercial RPC provider

Services like Alchemy or QuickNode give you node access without the infrastructure overhead. But you're still querying raw data. Every request requires ABI decoding, block iteration, and custom logic to transform event logs into something usable.
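To make that concrete, here is a sketch of the decoding work you take on when querying raw data. The log below is a hand-built example in the shape `eth_getLogs` returns for an ERC-20 Transfer event; the addresses and amount are illustrative, not real on-chain data.

```python
# Decoding a single ERC-20 Transfer event log by hand, the way you would
# after fetching it from an RPC node with eth_getLogs. The log below is a
# made-up example; real logs come back in exactly this shape.

# keccak256("Transfer(address,address,uint256)") identifies the event type
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"

raw_log = {
    "topics": [
        TRANSFER_TOPIC,
        "0x000000000000000000000000a0b86991c6218b36c1d19d4a2e9eb0ce3606eb48",  # from (padded)
        "0x0000000000000000000000001111111111111111111111111111111111111111",  # to (padded)
    ],
    "data": "0x00000000000000000000000000000000000000000000000000000000000f4240",  # value
}

def decode_transfer(log):
    """Turn a raw Transfer log into structured fields."""
    assert log["topics"][0] == TRANSFER_TOPIC, "not a Transfer event"
    # Indexed address params are left-padded to 32 bytes; keep the last 20 bytes.
    sender = "0x" + log["topics"][1][-40:]
    recipient = "0x" + log["topics"][2][-40:]
    value = int(log["data"], 16)
    return {"from": sender, "to": recipient, "value": value}

decoded = decode_transfer(raw_log)
print(decoded["value"])  # 1000000, i.e. one token unit at 6 decimals
```

And this is only one event type from one contract; multiply it by every event signature your application cares about, plus the block iteration to find the logs in the first place.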

Pay for a data warehouse

Platforms like Dune or Flipside offer pre-indexed blockchain data with SQL interfaces. They're powerful for analytics, but expensive at scale and not designed for real-time application data.

Use a crypto data API

Third-party APIs abstract away the complexity, but you're locked into their data models, rate limits, and pricing. Coverage is often incomplete for newer protocols.

Each approach has tradeoffs between cost, flexibility, latency, and maintenance burden. Subgraphs offer a different balance.

What is a Subgraph?

A subgraph is an indexed data layer that sits on top of a blockchain. It listens for events emitted by smart contracts, maps that data to a schema you define, and exposes it through a GraphQL API.

The core idea is borrowed from data engineering. ETL pipelines have done this for decades: extract raw data, transform it into something structured, and load it somewhere queryable. A subgraph does the same for blockchain data, with the added complexity that the source of truth is a decentralized, immutable ledger rather than a relational database.

In practical terms: instead of looping through event logs block by block to calculate a user's total swap volume on a DEX, you query a subgraph and get the answer in milliseconds.

Why Subgraphs are the standard for crypto data

The Graph Protocol popularized the subgraph model around 2020-2021, and the timing was right. DeFi was exploding. Every new protocol needed a way to surface its on-chain data to dashboards, analytics tools, and third-party integrations. Running a full archive node for every protocol wasn't viable. Centralized data providers were slow and expensive. Subgraphs filled the gap.

Today, subgraphs power a significant chunk of the data layer behind crypto analytics, covering everything from DEX trading dashboards to NFT marketplaces to lending protocol monitors. If you've used Uniswap's analytics page, you've been hitting a subgraph.

The reasons they became the default:

  • GraphQL interface: Developers already knew GraphQL. The query language is expressive and self-documenting.
  • Open schema: Subgraph schemas are public, so you can inspect exactly what data is available and how it's structured.
  • Historical and real-time: A well-indexed subgraph gives you both current state and historical queries through the same endpoint.
  • Protocol-specific: Each protocol can define a data model that matches its own logic, which is far more useful than generic block explorers.

How Subgraph indexing works

Understanding the mechanics helps when something goes wrong. And something will go wrong.

When you deploy a subgraph, you provide three things:

  1. A manifest: which contracts to watch, on which chains, from which block
  2. A schema: the GraphQL types that define your data model
  3. Mapping handlers: AssemblyScript functions that translate raw on-chain events into schema entities

The indexer listens for events emitted by the specified contracts (like Swap, Transfer, Mint), runs the relevant handler for each event, and writes the result to a Postgres-backed store. Queries hit that store through a GraphQL layer.
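Concretely, the manifest for a simple DEX subgraph might look like the following. The contract address, start block, and file paths are illustrative, not from any real deployment.

```yaml
# subgraph.yaml — the manifest: which contract to watch, from which block,
# and which handler runs for each tracked event.
specVersion: 0.0.5
schema:
  file: ./schema.graphql
dataSources:
  - kind: ethereum
    name: Pair
    network: mainnet
    source:
      address: "0x0000000000000000000000000000000000000000"  # illustrative
      abi: Pair
      startBlock: 10000000
    mapping:
      kind: ethereum/events
      apiVersion: 0.0.7
      language: wasm/assemblyscript
      entities:
        - Swap
      abis:
        - name: Pair
          file: ./abis/Pair.json
      eventHandlers:
        - event: Swap(indexed address,uint256,uint256,uint256,uint256,indexed address)
          handler: handleSwap
      file: ./src/mapping.ts
```

Everything the indexer does is driven by this file: it will not see events from contracts or signatures that aren't declared here.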

The key thing to understand: Subgraphs primarily index event logs rather than raw storage. While mappings can read contract state via calls, most subgraphs rely on events as their primary data source.

This is a meaningful limitation. Some on-chain state changes never emit an event at all. When one contract triggers a function in another without emitting a tracked event, you get gaps in your data. Knowing this upfront saves a lot of debugging time.

Related reading: 4 best practices to optimize your subgraph indexing

Accessing crypto data with Subgraphs

Once a subgraph is indexed, accessing data is straightforward. You send a GraphQL query to the endpoint and get back JSON.

A basic query to get recent swaps from a DEX subgraph might look like this:

{
  swaps(
    first: 100,
    orderBy: timestamp,
    orderDirection: desc,
    where: { token0: "0xabc..." }
  ) {
    id
    timestamp
    amount0In
    amount0Out
    amountUSD
    sender
  }
}

This is far cleaner than querying raw event logs from an RPC node. No ABI decoding or block iteration. The data is already structured neatly and ready to use.

For more complex analytics, including time-series aggregations, cross-entity joins, and filtered historical windows, you build out more elaborate queries using GraphQL's filtering and pagination capabilities. Most well-designed subgraphs support where filters, time-range queries, and nested entity resolution.
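One pagination pattern worth knowing: Graph-style endpoints cap the number of results per query (commonly at 1,000), so pulling a complete history means cursor-style pagination, typically an `id_gt` filter ordered by `id`. Here is a sketch in Python with the network call stubbed out; `fetch_page` stands in for an HTTP POST to your endpoint, and the in-memory data is fabricated to show the loop terminating.

```python
# Cursor pagination over a subgraph entity using id_gt, a common workaround
# for the per-query result cap on Graph-style endpoints.

PAGE_QUERY = """
{
  swaps(first: %d, orderBy: id, orderDirection: asc, where: { id_gt: "%s" }) {
    id
    amountUSD
  }
}
"""

def fetch_all_swaps(fetch_page, page_size=1000):
    """fetch_page(query) -> list of swap dicts; stands in for an HTTP POST."""
    results, last_id = [], ""
    while True:
        page = fetch_page(PAGE_QUERY % (page_size, last_id))
        if not page:
            break
        results.extend(page)
        last_id = page[-1]["id"]  # advance the cursor past the last row seen
    return results

# A fake endpoint backed by an in-memory list, so the example is self-contained.
DATA = [{"id": f"swap-{i:04d}", "amountUSD": i * 10.0} for i in range(25)]

def fake_fetch(query):
    # crude parse of the id_gt cursor out of the query string
    cursor = query.split('id_gt: "')[1].split('"')[0]
    matching = [d for d in DATA if d["id"] > cursor]
    return matching[:10]  # pretend the cap is 10 rows per page

swaps = fetch_all_swaps(fake_fetch, page_size=10)
print(len(swaps))  # 25
```

Cursoring on `id` rather than using `skip` keeps each page query cheap for the indexer, since deep `skip` offsets get progressively slower.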

Subgraphs vs. other crypto data solutions

  • RPC nodes: best for raw, unfiltered access. Limitations: requires custom indexing logic; slow for complex queries.
  • Data warehouses: best for historical analytics and research. Limitations: expensive at scale; not real-time; SQL-based.
  • Third-party APIs: best for quick integrations. Limitations: limited coverage; rate limits; locked into their schema.
  • Subgraphs: best for real-time app data and protocol-specific queries. Limitations: performance varies widely by infrastructure provider.

Subgraphs hit a sweet spot for applications that need structured, queryable, real-time blockchain data without the overhead of building custom infrastructure.

They're particularly strong when you need protocol-specific data models that match your application's logic.

Subgraph limitations

As with any data product, subgraphs are powerful but not a complete solution on their own:

Indexing lag: During peak network activity, subgraph indexing can fall behind the chain tip. If your application demands real-time data for trading or liquidation monitoring, a performant hosted subgraph on dedicated infrastructure is essential.

Re-org sensitivity: Blockchains reorganize. Blocks get dropped or replaced. A subgraph that doesn't handle re-orgs properly will end up with "ghost" data that never actually happened on the final chain.

Event dependency: As mentioned, subgraphs index events. If the smart contract data you need isn't emitted as an event, you'll have to work around that with contract calls or supplementary data sources.

AssemblyScript learning curve: Subgraph mappings are written in AssemblyScript, which looks like TypeScript but has quirks around memory management and BigInt handling. Expect some friction if you're coming from a pure TypeScript background.

Deployment and maintenance: Someone has to deploy, monitor, and maintain the subgraph. Schema changes require reindexing. Bugs in mapping logic require redeployment.

For many teams, the solution is using a hosted subgraph provider like Ormi that handles infrastructure, uptime, and performance optimization.

How to get started with subgraphs

Getting started with subgraphs is straightforward. Getting them right takes a bit more thought.

At a minimum, you define three things:

  • A schema: this is your data model. What entities exist, how they relate, and what you’ll query later.
  • A manifest: this tells the indexer which contracts to watch, which events matter, and where to start syncing from.
  • Mappings: these are the handlers that take raw on-chain events and turn them into structured data.
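Of the three, the mappings are where most of the work happens. A handler for a Swap event might look roughly like this; the imports and entity are illustrative (in a real project they are generated by `graph codegen` from your ABI and schema), and the code compiles with the AssemblyScript toolchain, not plain TypeScript:

```typescript
// src/mapping.ts — illustrative AssemblyScript handler for a Swap event.
// The generated imports below are assumptions about your project layout.
import { Swap as SwapEvent } from "../generated/Pair/Pair"
import { Swap } from "../generated/schema"

export function handleSwap(event: SwapEvent): void {
  // Entity IDs must be unique; tx hash + log index is the usual choice.
  let id = event.transaction.hash.toHex() + "-" + event.logIndex.toString()
  let swap = new Swap(id)
  swap.timestamp = event.block.timestamp
  swap.sender = event.params.sender
  swap.amount0In = event.params.amount0In
  swap.save()
}
```

The handler runs once per matching event, in block order, and whatever it saves is exactly what your GraphQL queries can later return.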

Once deployed, the indexer does the rest. It backfills historical data, stays in sync with new blocks, and writes everything into a database that you query through GraphQL.

That’s the optimistic scenario.

Where teams usually get stuck is not in getting a subgraph running, but in making it usable in production.

  • Schema decisions affect query speed.
  • Mapping logic affects indexing performance.
  • Event design determines what data you can even access.

A subgraph shapes how the blockchain data is stored and accessed.

If you get that part right early, everything downstream becomes easier.

So why Ormi?

Subgraphs give you structure, but they don’t guarantee performance, freshness, or reliability.

That part is entirely dependent on the infrastructure running them. In practice, this is where most issues show up.

  • Indexing falls behind the chain head during bursts of high transaction throughput
  • Queries slow down as the dataset grows
  • Endpoints fail under sustained traffic
  • Re-orgs introduce inconsistencies that are hard to debug

None of these are problems with subgraphs themselves; they are problems with the architecture and infrastructure they run on.

Ormi is built to handle those conditions.

  • Subgraphs stay in sync with the chain tip, even under load
  • Query performance remains consistent as usage scales
  • Data stays accurate, even through re-orgs
  • You don’t have to think about nodes, failover, or scaling

The goal is simple: you set it and forget it. That’s the gap Ormi is designed to close.

Frequently asked questions

What's the difference between a subgraph and a blockchain API?

An RPC API exposes raw blockchain data: blocks, transactions, and contract storage. A subgraph is an indexed, structured layer built on top of that. Use subgraphs for querying patterns across events; use RPC for direct reads and writes.

Can subgraphs provide real-time data?

They index new blocks within seconds in most cases. Actual latency depends on infrastructure, network conditions, and subgraph complexity. Production systems need indexers that stay close to the chain head under sustained load.

How do you query a subgraph?

Send a POST request containing a GraphQL query to the subgraph endpoint. Client libraries make this easier, but they're not required.

How do you query stablecoin data?

Stablecoin data can be queried using indexed blockchain data through subgraphs or APIs. This includes tracking transfers, supply changes, mint and burn events, and cross-chain activity.

Are subgraphs free to use?

The Graph charges per query, and Ormi offers a free developer plan for testing and prototyping.

What chains do subgraphs support?

It depends on the indexing provider. Most platforms cover major EVM chains: Ethereum, Arbitrum, Optimism, Polygon, and Base. Coverage for newer or high-throughput chains varies.

What is Ormi's 0xGraph?

0xGraph is Ormi's managed subgraph infrastructure. It provides a GraphQL interface for indexed blockchain data while handling indexing reliability, synchronization, reorg management, and multi-chain coverage.

About Ormi

Ormi is the next-generation data layer for Web3, purpose-built for real-time, high-throughput applications like DeFi, gaming, wallets, and on-chain infrastructure. Its hybrid architecture ensures sub-30ms latency and up to 4,000 RPS for live subgraph indexing.

With 99.9% uptime and deployments across ecosystems representing $50B+ in TVL and $100B+ in annual transaction volume, Ormi is trusted to power the most demanding production environments without throttling or delay.