Indexing DEX liquidity across EVM and Substrate chains

Official DEX subgraphs are built for dashboards. They produce one state snapshot per block, with no record of what happened inside it. For anyone who needs to reconstruct the exact sequence of price changes within a block, that resolution is a hard limit.

This article describes xchain-dex-indexer, an open-source TypeScript framework for indexing DEX liquidity pools on EVM-compatible and Substrate-based blockchains. The design priorities are: per-block and intra-block snapshot granularity, a unified schema across protocol types, and a chain-agnostic architecture that separates the indexing engine from chain-specific implementations.

The granularity problem

A standard subgraph snapshot tells you the state of a pool at the end of a block. If ten transactions touched that pool within the block, you see only the result of the last one. The intermediate states, the ordering of those transactions, and the priority fees attached to each are not visible.

The sequence matters because identical end-of-block states can result from very different intra-block dynamics. Consider three scenarios that produce the same final pool price:

A large swap followed by an arbitrage that restores the price.
A sandwich: a frontrun, the victim's swap, a backrun.
Several independent swaps that happen to net out.

With one snapshot per block, these are indistinguishable. With intra-block snapshots ordered by transactionIndex, the sequence is recoverable and each scenario produces a distinct signature in the data.

This granularity enables two classes of analysis that are not possible with standard subgraph data.

The first is strategy simulation based on intra-block positioning. Given a pool and a historical block range, you can simulate what a strategy would have observed and executed at a given position within each block, using the actual sequence of price states rather than the single end-of-block approximation.

The second is block producer analysis. For an operator who controls block production, and can therefore place transactions before or after others, intra-block data provides the price states needed to estimate the value that such positioning could have captured across a historical block range.

xchain-dex-indexer addresses this at two levels.

Per-block gap filling. Every tracked pool receives an end-of-block snapshot at every block, regardless of whether any event occurred. This guarantees a continuous time series with no missing blocks, which is a prerequisite for any statistical analysis across a block range.
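
The gap-filling rule can be sketched as a pure function. This is an illustration under stated assumptions, not the framework's code: the PoolState shape, its reserve0 and reserve1 fields, and the name fillGaps are hypothetical.

```typescript
// Hypothetical end-of-block state for one pool; field names are illustrative.
interface PoolState {
  poolId: string;
  blockNumber: number;
  reserve0: string; // decimal strings, as large integers are often serialized
  reserve1: string;
}

// Carry the last known state forward so that every block in
// (last.blockNumber, upTo] gets exactly one end-of-block snapshot.
function fillGaps(last: PoolState, upTo: number): PoolState[] {
  const filled: PoolState[] = [];
  for (let block = last.blockNumber + 1; block <= upTo; block++) {
    filled.push({ ...last, blockNumber: block });
  }
  return filled;
}
```

Called with a state recorded at block 100 and upTo = 103, the function returns three snapshots, for blocks 101 to 103, each with unchanged reserves.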

Intra-block snapshots. When multiple transactions modify the same pool within a single block, a snapshot is saved after each one, ordered by transactionIndex. Each snapshot carries an afterTxId field (null for end-of-block snapshots, the transaction hash otherwise) and an index field for ordering within the block.
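
A minimal sketch of how such a sequence could be assembled for one pool and one block. Only transactionIndex, afterTxId, and index are names from the schema described here. The PoolEvent shape, the stateAfter placeholder, and the choice to give the end-of-block snapshot the last index are assumptions made for illustration.

```typescript
// Hypothetical input: one decoded pool event.
interface PoolEvent {
  txHash: string;
  transactionIndex: number;
  logIndex: number;
  stateAfter: string; // stand-in for the pool state once this event is applied
}

interface Snapshot {
  blockNumber: number;
  afterTxId: string | null; // null marks the end-of-block snapshot
  index: number;            // position within the block
  state: string;
}

function buildBlockSnapshots(
  blockNumber: number,
  stateBefore: string,
  events: PoolEvent[],
): Snapshot[] {
  // Order events by their position in the block.
  const ordered = events.slice().sort(
    (a, b) => a.transactionIndex - b.transactionIndex || a.logIndex - b.logIndex,
  );

  const snapshots: Snapshot[] = [];
  let state = stateBefore;
  ordered.forEach((event, i) => {
    state = event.stateAfter;
    const next = ordered[i + 1];
    // Per-transaction dedup: emit a snapshot only after a transaction's last event.
    if (next !== undefined && next.txHash === event.txHash) return;
    snapshots.push({ blockNumber, afterTxId: event.txHash, index: snapshots.length, state });
  });

  // Assumption: the end-of-block snapshot takes the last index in the block.
  snapshots.push({ blockNumber, afterTxId: null, index: snapshots.length, state });
  return snapshots;
}
```

A block with no events for the pool yields a single end-of-block snapshot that repeats the prior state, which is the same output the gap-filling path produces.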

Each snapshot also records priorityInclusionFeePerUnit, which is baseFee + priorityFee on EVM chains and tip + weightFee on Substrate. This field does not identify MEV actors, but it lets you correlate a transaction's position in the block with the fee its sender was willing to pay for that position, which any ordering simulation needs as input.
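
For the EVM side, one way to derive a per-gas value of this kind from raw transaction fields is sketched below. It follows EIP-1559 semantics, where the price actually paid is baseFee + priorityFee capped at maxFeePerGas, and falls back to gasPrice for legacy transactions. Whether the indexer applies the same cap is not stated in this article, and the type and function names are hypothetical.

```typescript
// Hypothetical subset of the fee fields found on an EVM transaction.
interface EvmTxFeeFields {
  gasPrice?: bigint;             // legacy transactions
  maxFeePerGas?: bigint;         // EIP-1559 transactions
  maxPriorityFeePerGas?: bigint; // EIP-1559 transactions
}

// Price per unit of gas paid for inclusion, in wei.
function evmInclusionFeePerGas(tx: EvmTxFeeFields, baseFeePerGas: bigint): bigint {
  if (tx.maxFeePerGas !== undefined && tx.maxPriorityFeePerGas !== undefined) {
    // EIP-1559: base fee plus tip, never more than the sender's declared maximum.
    const uncapped = baseFeePerGas + tx.maxPriorityFeePerGas;
    return uncapped < tx.maxFeePerGas ? uncapped : tx.maxFeePerGas;
  }
  if (tx.gasPrice !== undefined) {
    // Legacy: the sender pays gasPrice, which already includes the base fee.
    return tx.gasPrice;
  }
  throw new Error("transaction carries no fee fields");
}
```

The sketch uses bigint so that wei amounts are never rounded.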

To query only end-of-block snapshots, equivalent to a standard subgraph, filter with afterTxId_isNull: true. To reconstruct the full intra-block sequence, query by blockNumber and sort by index_ASC.
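
Both access patterns can be sketched as query builders. The entity name is passed in because snapshot entity names are not listed in this article. The blockNumber_gte, blockNumber_lte, and blockNumber_eq filters are assumed to follow the same naming convention as afterTxId_isNull, and a real query would normally also filter by pool.

```typescript
// Ordering fields shared by all snapshot entities, per the schema description.
const SNAPSHOT_FIELDS = "blockNumber afterTxId index priorityInclusionFeePerUnit";

// End-of-block snapshots only: the view a standard subgraph would give.
function endOfBlockQuery(entity: string, fromBlock: number, toBlock: number): string {
  return `{
  ${entity}(
    where: { afterTxId_isNull: true, blockNumber_gte: ${fromBlock}, blockNumber_lte: ${toBlock} }
    orderBy: blockNumber_ASC
  ) { ${SNAPSHOT_FIELDS} }
}`;
}

// Full intra-block sequence for one block, in execution order.
function intraBlockQuery(entity: string, blockNumber: number): string {
  return `{
  ${entity}(where: { blockNumber_eq: ${blockNumber} }, orderBy: index_ASC) { ${SNAPSHOT_FIELDS} }
}`;
}
```

The resulting strings can be sent as the query field of a POST request to the indexer's GraphQL endpoint.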

Intra-block snapshots are opt-in per DEX via intraBlockSnapshots: true in the DEX config, so the additional storage cost applies only where sub-block granularity is needed.

Cross-chain architecture

The framework is designed around a separation between the indexing engine and chain-specific implementations. The core is chain-agnostic. EVM and Substrate processors are implemented as distinct abstractions at the base layer, both feeding into the same snapshot schema and the same PostgreSQL database.

The class hierarchy reflects this separation:

AbstractDataImporter                  ← block loop, gap filling, commit
  └── AbstractDexDataImporter         ← pool registry, snapshot tracker, per-tx dedup
        ├── V2EvmDexDataImporter      ← Uniswap V2 forks (EVM)
        ├── V3EvmDataImporter         ← Uniswap V3 forks (EVM)
        │     └── V3EvmAlgebraDataImporter   ← adapter: Algebra ABI + event processors
        ├── V4EvmAlgebraDataImporter  ← Algebra V4 (EVM)
        └── [SubstrateDexDataImporter]  ← Substrate DEX pallet indexer (planned)

AbstractDataImporter owns the block loop, tracks the last processed height, and fills end-of-block snapshots for any skipped range between batches. AbstractDexDataImporter manages the pool registry, the per-pool snapshot tracker used for gap filling, and the per-transaction deduplication map. The EVM implementations, and the planned Substrate one, extend this layer and handle only chain-specific concerns: decoding events, reading logs, and extracting fee data in the format each runtime exposes.

The consequence is that adding a new chain requires implementing a chain-specific processor config, not modifying the core. Gap filling, snapshot tracking, intra-block ordering, and GraphQL exposure are all inherited.
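
The division of labor can be illustrated with a small template-method sketch. The classes below are stand-ins written for this article, not the repository's classes: BlockLoopSketch plays the role of the chain-agnostic base, and FakeEvmImporter shows how little a chain-specific subclass has to supply.

```typescript
interface BlockLike {
  height: number;
}

// Stand-in for the chain-agnostic layer: owns the loop, the last processed
// height, and the gap filling between non-consecutive blocks.
abstract class BlockLoopSketch<TBlock extends BlockLike> {
  readonly committed: number[] = [];
  private lastProcessed: number;

  constructor(startHeight: number) {
    this.lastProcessed = startHeight - 1;
  }

  // The only chain-specific hook: decode whatever the runtime exposes.
  protected abstract applyBlock(block: TBlock): void;

  processBatch(blocks: TBlock[]): void {
    for (const block of blocks) {
      // Fill every skipped height with an end-of-block commit.
      for (let h = this.lastProcessed + 1; h < block.height; h++) {
        this.commit(h);
      }
      this.applyBlock(block);
      this.commit(block.height);
      this.lastProcessed = block.height;
    }
  }

  private commit(height: number): void {
    this.committed.push(height);
  }
}

interface FakeEvmBlock extends BlockLike {
  logs: string[];
}

// Stand-in for a chain-specific importer: it only decodes logs.
class FakeEvmImporter extends BlockLoopSketch<FakeEvmBlock> {
  readonly decoded: string[] = [];

  protected applyBlock(block: FakeEvmBlock): void {
    for (const log of block.logs) this.decoded.push(log);
  }
}
```

Feeding this importer blocks 10 and 13 commits heights 10, 11, 12, and 13, even though the subclass contains no gap-filling code.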

Schema unification across protocol types. V2 (Uniswap forks), V3 (Uniswap and Algebra forks), V4 (Algebra), and custom vault types are stored in a single PostgreSQL database per chain under a consistent snapshot structure. All snapshot entities share the same ordering fields (blockNumber, afterTxId, index, priorityInclusionFeePerUnit), so cross-protocol queries use the same filter patterns.

Per-chain schema extensions. The schema system has two layers. common.graphql defines entities shared across all chains: pool types and their snapshot counterparts. Each chain can extend this with a chain-specific file. For example, moonbeam.graphql adds StDotVaultSnapshot for the Nimbus liquid staking vault. At build time, a codegen script merges the two layers, generates TypeORM entities, and applies migrations to the chain's isolated database. Each chain gets its own PostgreSQL database, whose connection URL is constructed at runtime as DB_URL_PREFIX + chain_name. Running codegen for one chain never touches another chain's database.
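
The database isolation rule is simple enough to show directly. The helper below is a sketch: the function name is hypothetical, and the environment is passed in as a plain object so the example does not depend on any runtime.

```typescript
// Build the per-chain database URL as DB_URL_PREFIX + chain_name.
function databaseUrlFor(
  chainName: string,
  env: Record<string, string | undefined>,
): string {
  const prefix = env["DB_URL_PREFIX"];
  if (!prefix) {
    throw new Error("DB_URL_PREFIX is not set");
  }
  return prefix + chainName;
}
```

With a placeholder prefix such as postgres://localhost:5432/dex_, the chain name moonbeam resolves to postgres://localhost:5432/dex_moonbeam, and any other chain resolves to a different database.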

Pool filtering via token whitelist. On public DEXs anyone can deploy a pool with any token pair, including tokens designed to mimic legitimate assets. Indexing all pools indiscriminately would pollute the dataset. Each chain defines a WhiteListTokensManager with a static list of trusted token addresses. A pool is registered only if both tokens are present in the whitelist. On Moonbeam, the whitelist covers native assets, XCM cross-chain assets, major stablecoins, wrapped assets, and liquid staking tokens, approximately 35 tokens total.
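
The registration rule reduces to a set lookup on both tokens. The class below is a sketch of that rule only. Its name and methods are hypothetical rather than the API of WhiteListTokensManager, and lowercasing addresses before comparison is an assumption made to sidestep checksum-case differences.

```typescript
class WhiteListSketch {
  private readonly allowed: Set<string>;

  constructor(trustedAddresses: string[]) {
    // Normalize once so lookups are case-insensitive.
    this.allowed = new Set(trustedAddresses.map((a) => a.toLowerCase()));
  }

  isTrusted(address: string): boolean {
    return this.allowed.has(address.toLowerCase());
  }

  // A pool is registered only if both of its tokens are whitelisted.
  shouldRegisterPool(token0: string, token1: string): boolean {
    return this.isTrusted(token0) && this.isTrusted(token1);
  }
}
```

A pool that pairs a trusted token with an unknown one is rejected, which is the case that matters for tokens imitating legitimate assets.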

Extending the framework

The extension model is designed to minimize the surface area of a new integration.

Adding a Uniswap V2 fork requires four steps: defining the pool list filtered through the chain whitelist, adding a DexType enum entry to the schema, registering the DEX config with factory address and ABI, and enabling the importer in the chain entry point. The V2 ABI is reused across forks. The result is full gap-filling, intra-block ordering, and GraphQL exposure with no changes to the core engine.

Adding a Uniswap V3 fork uses an adapter pattern. V3EvmDataImporter implements all snapshot logic, tick tracking, and gap filling against the standard Uniswap V3 event interface. A fork adapter overrides only the methods whose event signatures differ. For most forks this means one EventChecker and one EventsProcessor, each overriding one or two decode methods. The Algebra adapter (V3EvmAlgebraDataImporter) is the reference implementation: it injects a custom AlgebraEventsProcessor and AlgebraEventChecker into the standard V3 importer, covering the differences in swap event structure without touching the parent class.

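// Adapter for a placeholder fork ("AcmeDex"): the standard V3 importer is reused
// and only the fork-specific event handling is injected.
export class AcmeDexV3DataImporter extends V3EvmDataImporter {
    constructor(dexType: DexType, parachain: ParachainInfo) {
        super(
            dexType,
            parachain,
            // Fork-specific event decoding, built from the fork's pool ABI.
            new AcmeDexV3EventsProcessor(DexConfig.getConfig(dexType, parachain)!.poolAbi),
            // Fork-specific event matching, built from the same ABI.
            new AcmeDexV3EventChecker(DexConfig.getConfig(dexType, parachain)!.poolAbi),
        );
    }
}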

As with a V2 fork, gap filling, snapshot tracking, intra-block ordering, and GraphQL exposure all come from the parent classes. A new V3 fork integration adds roughly two adapter files and thirty lines of code.

Adding a Substrate chain follows the same pattern at a higher level. AbstractDexDataImporter is chain-agnostic. A Substrate DEX importer would extend it with a @subsquid/substrate-processor, decode the relevant pallet events, and map them to the shared pool snapshot schema. The gap-filling and snapshot-tracking logic is inherited unchanged.

Data integrity verification. The framework includes a test suite that validates indexed data against official DEX subgraphs. It samples random blocks within a configured range, queries both the local GraphQL server and the official endpoint, and diffs the results. This covers V2, V3, and V4. It was used during development on Moonbeam against StellaSwap and Beamswap subgraphs and caught several edge cases in intra-block deduplication.
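
The sampling and diffing steps can be sketched as two small functions. This is not the test suite's code: the function names are hypothetical, values are compared as strings for simplicity, and the comparison presumably runs against end-of-block snapshots, since that is the only resolution the official subgraphs expose.

```typescript
// Pick up to `count` distinct block numbers in [from, to], sorted ascending.
// `random` is injectable so that runs can be made reproducible.
function sampleBlocks(
  from: number,
  to: number,
  count: number,
  random: () => number = Math.random,
): number[] {
  const size = to - from + 1;
  if (size <= 0) throw new Error("empty block range");
  const target = Math.min(count, size);
  const picked = new Set<number>();
  while (picked.size < target) {
    picked.add(from + Math.floor(random() * size));
  }
  return Array.from(picked).sort((a, b) => a - b);
}

interface FieldDiff {
  field: string;
  local: unknown;
  reference: unknown;
}

// Report every listed field whose value differs between the two sources.
function diffRecords(
  local: Record<string, unknown>,
  reference: Record<string, unknown>,
  fields: string[],
): FieldDiff[] {
  return fields
    .filter((f) => String(local[f]) !== String(reference[f]))
    .map((f) => ({ field: f, local: local[f], reference: reference[f] }));
}
```

String comparison treats "1" and "1.0" as different, so a real comparison would need to normalize numeric formats before diffing.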

Current state and roadmap

The framework is deployed and tested on Moonbeam, indexing StellaSwap (V2, V3, V4), Beamswap (V2, V3), and the Nimbus stDOT vault. The Stable AMM indexer (Curve-style pools) has schema and entity models defined but the indexer implementation is not yet complete.

Open items on the roadmap:

Bootstrap pool state from on-chain data at indexer start block, to avoid depending on historical event replay from genesis
Stable AMM indexer implementation
Substrate DEX pallet indexer
Docker Compose setup for one-command startup

The repository is at https://github.com/xchain-mev-research/xchain-dex-indexer.

Contributions and questions are welcome, particularly around Substrate chain integrations and additional EVM DEX support.

This article is part of the series Building a Cross-Chain MEV Bot. Next up: once you have the data, you need to know what to do with it. The next article covers the simulation engine: exact AMM math, price impact across sequential hops, and how to compute the real PnL of a route before committing capital.