Geth Pruning Benchmarks and Quiet API Tradeoffs
Two pull requests against the geth repository were closed in the same window today. One was an experiment in parallel pruning of the transaction index that ran into a database wall. The other was a defensive limit on a rarely used HTTP API that maintainers decided was not worth the maintenance load. Together they show how Ethereum execution client decisions actually get made: by benchmarks and actual usage, not narratives.
Parallel pruning hit a database ceiling
Ethereum nodes carry a transaction index that maps every transaction hash to its block location. After years of mainnet traffic, that index holds hundreds of millions of entries. When operators prune old history below a cutoff block, the node has to scan that index and delete the stale rows. Serial scans take a long time.
A recent attempt rewrote that scan to run in parallel. The author split the index keyspace into 256 ranges by the first byte after the prefix, handed each range to a worker, and coordinated completion with an errgroup. Each worker read keys, decoded the block number, and deleted entries below the prune cutoff. Clean design on paper.
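The partitioning described above can be sketched in a few lines. This is a toy illustration in Python rather than the PR's Go code, and it makes several assumptions: the index is an in-memory dict, the key prefix `PREFIX` is hypothetical, each value is an 8-byte big-endian block number, and a thread pool stands in for the goroutines and errgroup.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

PREFIX = b"l"  # hypothetical key prefix, chosen for illustration only


def prune_range(index, lock, first_byte, cutoff):
    """Prune one of the 256 ranges: keys whose first byte after PREFIX matches."""
    range_start = PREFIX + bytes([first_byte])
    # A real store would seek an iterator to range_start; the toy dict is
    # filtered instead. The lock guards the snapshot against concurrent deletes.
    with lock:
        candidates = [(k, v) for k, v in index.items() if k.startswith(range_start)]
    removed = 0
    for key, value in candidates:
        block_number = int.from_bytes(value, "big")  # decode the block number
        if block_number < cutoff:
            with lock:
                del index[key]
            removed += 1
    return removed


def prune_parallel(index, cutoff, workers=8):
    """Hand each of the 256 first-byte ranges to a worker and sum the deletions."""
    lock = threading.Lock()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Iterating the map results re-raises the first worker exception,
        # loosely mirroring how an errgroup surfaces the first error.
        counts = pool.map(lambda b: prune_range(index, lock, b, cutoff), range(256))
        return sum(counts)
```

The single lock is only a stand-in for a shared storage engine. The sketch shows the shape of the design and is not meant to reproduce the benchmark numbers.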
The benchmarks told a sharper story. At small scale, with about one million transactions in the index, parallel pruning ran twice as fast as the serial version. By three million transactions the speedup dropped to about 1.45x. From ten million transactions and up, the speedup flattened to roughly 1.0x. Sixty million transactions took 67.9 seconds parallel against 68.4 seconds serial. Effectively no gain.
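To make the sixty-million figure concrete, the speedup is just serial time divided by parallel time. The short calculation below uses only the two timings quoted above; the function names are illustrative.

```python
def speedup(serial_seconds, parallel_seconds):
    """Speedup factor: how many times faster the parallel run is."""
    return serial_seconds / parallel_seconds


def percent_saved(serial_seconds, parallel_seconds):
    """Share of the serial wall-clock time that the parallel run saved."""
    return (serial_seconds - parallel_seconds) / serial_seconds * 100


# Timings reported for the sixty-million-transaction run.
print(round(speedup(68.4, 67.9), 3))        # about 1.007
print(round(percent_saved(68.4, 67.9), 2))  # about 0.73 percent
```

Half a second saved out of more than a minute is well within what run-to-run noise could plausibly explain.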
The likely cause was Pebble write stalls. When many goroutines issue reads and writes against the same key-value store at once, the storage engine pushes back: compaction cannot keep up, the write path stalls, and the benefit of parallelism evaporates. The author noted the stall, tried to work around it, and then closed the PR.
This is what good engineering looks like when nobody is watching. A clean theoretical win, real measurements that contradicted the theory at the size that actually matters, and an honest close.
Why this matters for history expiry
Pruning speed is not an academic question. The Ethereum roadmap has a history expiry track that aims to let nodes drop old chain data after a cutoff. Operators get smaller disk footprints and faster syncs. The tradeoff is that historical data has to move to other layers: archive providers, Portal Network nodes, indexing services.
For history expiry to be operationally smooth, the pruning step needs to be fast and predictable. If a node spends a full day pruning before it can serve traffic again, operators delay upgrades. If pruning takes minutes, the upgrade is routine. The parallel pruning experiment was a bet that the database layer could absorb more concurrency. The benchmark said it cannot, at least not yet and not with Pebble's current behavior.
The next attempt likely needs to control concurrency before the work reaches the database, rather than relying on the storage engine to absorb it. Candidates include smaller batches, fewer concurrent writers, or staging deletions through a write-ahead queue. None of that is obvious, which is why the work matters.
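One way to picture the "fewer concurrent writers" idea is a single writer fed by a bounded queue. The sketch below is a hypothetical design, not something proposed in the PR: several scanner threads find stale keys in parallel, while one writer thread applies deletions in small batches. The store is a plain dict mapping keys to block numbers, and all names and sizes are assumptions.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the stream for the writer


def scanner(items, cutoff, outbox):
    """Scan one partition and queue keys whose block number is below cutoff."""
    for key, block_number in items:
        if block_number < cutoff:
            outbox.put(key)  # blocks when the queue is full: backpressure


def writer(store, inbox, batch_size, batch_sizes):
    """Sole writer: drain the queue and delete keys in bounded batches."""
    batch = []
    while True:
        key = inbox.get()
        if key is SENTINEL:
            break
        batch.append(key)
        if len(batch) >= batch_size:
            for k in batch:
                del store[k]
            batch_sizes.append(len(batch))
            batch = []
    if batch:  # flush the final partial batch
        for k in batch:
            del store[k]
        batch_sizes.append(len(batch))


def prune_with_single_writer(store, cutoff, scanners=4, batch_size=100, queue_size=1000):
    items = list(store.items())  # snapshot so scanners never touch the live dict
    parts = [items[i::scanners] for i in range(scanners)]
    inbox = queue.Queue(maxsize=queue_size)
    batch_sizes = []
    w = threading.Thread(target=writer, args=(store, inbox, batch_size, batch_sizes))
    w.start()
    threads = [threading.Thread(target=scanner, args=(p, cutoff, inbox)) for p in parts]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    inbox.put(SENTINEL)  # all scanners are done, so the writer can stop
    w.join()
    return batch_sizes
```

The bounded queue is the point of the design: when the writer falls behind, scanners block instead of piling more deletions onto the store. Whether this avoids the stalls seen in the benchmark is an open question that only measurement can answer.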
The GraphQL surface gets the cold shoulder
The second PR proposed a five-megabyte cap on GraphQL request bodies, with a 413 response when the cap is exceeded and a requirement that the request end at the JSON object boundary. The reasoning was simple. The JSON-RPC handler already enforces a five-megabyte cap. The GraphQL handler did not. A maliciously oversized body could chew memory inside the decoder, which is a small but real denial-of-service surface.
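The two checks described above, a size cap and a strict end at the JSON object boundary, can be sketched as follows. This is an illustrative Python sketch, not the PR's Go code. The exact byte value of the limit and the 400 status for malformed or trailing input are assumptions; only the 413 for oversized bodies comes from the description.

```python
import io
import json

MAX_BODY_BYTES = 5 * 1024 * 1024  # assumed byte value for "five megabytes"


def read_capped(stream, limit=MAX_BODY_BYTES):
    """Read at most limit + 1 bytes so an oversized body is never fully buffered."""
    data = stream.read(limit + 1)
    return data, len(data) > limit


def check_graphql_body(raw, limit=MAX_BODY_BYTES):
    """Return (status, parsed_object) for a request body given as bytes."""
    if len(raw) > limit:
        return 413, None  # payload too large
    try:
        text = raw.decode("utf-8").lstrip()
        obj, end = json.JSONDecoder().raw_decode(text)
    except (UnicodeDecodeError, json.JSONDecodeError):
        return 400, None  # assumed status for malformed input
    # Require the body to end where the JSON object ends (whitespace aside).
    if not isinstance(obj, dict) or text[end:].strip():
        return 400, None  # assumed status for trailing data or non-object bodies
    return 200, obj


body, too_big = read_capped(io.BytesIO(b'{"query": "{ block { number } }"}'))
print(too_big, check_graphql_body(body)[0])
```

Reading `limit + 1` bytes is what lets the handler detect an oversized body without buffering all of it, which is the memory concern the cap is meant to address.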
A maintainer pushed back. The argument: GraphQL on the execution client is very little used, so the marginal cost of a missing limit is small, and review time is rationed. The contributor agreed and closed the PR.
This is a useful tell about Ethereum client policy. The GraphQL endpoint exists, but the API war on the execution layer was won by JSON-RPC and WebSocket event streams. Wallets, dapps, indexers, and RPC providers all standardized on JSON-RPC. GraphQL stayed as a niche option that most operators do not expose and most clients do not query. Defending it is real work for very few users.
Closing a defensive fix sounds reckless on paper. In practice it reflects threat modeling under finite reviewer attention. If almost nobody runs the surface, the blast radius is small.
What the two decisions signal together
State management work is absorbing reviewer time. Surface area on minor endpoints is not. Read those two signals together and you see where Ethereum execution clients are putting their engineering budget through the rest of the year.
The investment is in making nodes lighter, faster to maintain, and easier to operate at scale. The deprioritization is on interfaces that did not get adoption. This is normal lifecycle behavior in mature open source projects. Early years collect features. Middle years cut and consolidate.
For builders, the practical implications are straightforward. Plan for history expiry to land in real client releases. Build on JSON-RPC and WebSocket event streams, not GraphQL. Source older chain data from archive providers if your application needs it. Do not assume your local full node will retain everything forever.
What to watch
Three things are worth tracking over the rest of the year.
First, the next pruning attempt. Once someone figures out how to keep Pebble from stalling under parallel writes, the speedup curve will look different. Watch the benchmark at ten million transactions and above. That is where the speedup in the current attempt flattened to roughly 1.0x.
Second, the history expiry timeline. The specification has been on the roadmap for years. The engineering work happening now is what turns the spec into client releases that operators actually run.
Third, client diversity. If geth keeps a steady pace on this work, alternative execution clients have to keep up on similar pruning paths. The long tail of features and edge cases is heavy. A single dominant client is bad for the network. The maintenance burden is what keeps the others honest.