Summary
When trace_filter (or any trace API covering a block range) is called, reth spawns one trace_block_until_with_inspector blocking task per block in the range, all running concurrently. Each task holds an MDBX read transaction open for the full EVM replay duration.
Under concurrent trace load from multiple clients, this creates hundreds of simultaneous MDBX read transactions that:
- Saturate NVMe I/O bandwidth to 100%
- Starve the sync pipeline's MerkleExecute stage of I/O
- Cause the node to fall progressively behind chain head
- Trigger continuous "WARN storage::db::mdbx: A database read transaction has been open for too long" spam
The stall is self-reinforcing: as the node falls behind, it receives more forkchoice_updated payloads, which enqueue more sync work, while trace calls continue consuming all available I/O.
The concurrency ceiling is controlled by --rpc-cache.max-concurrent-db-requests (set to 1024 in this configuration). With multiple concurrent clients each requesting 100-block ranges, the theoretical number of concurrent MDBX readers is clients × blocks_per_call, up to that ceiling.
The effect is especially severe on Ethereum mainnet: each mainnet block trace replay takes 60–120 seconds (versus 5–15 seconds on less complex chains) and generates thousands of random IOPS per concurrent task. Just 4–5 simultaneous mainnet trace replays are enough to saturate a high-end NVMe RAID array.
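The reader-count bound described in this summary can be written out as a small calculation. This is only a sketch of the arithmetic: the function name is hypothetical, and it assumes every block in a requested range gets its own read transaction, as described above.

```python
def max_concurrent_readers(clients: int, blocks_per_call: int, ceiling: int) -> int:
    """Upper bound on simultaneously open MDBX read transactions.

    Assumes one read transaction per traced block, capped by the value of
    --rpc-cache.max-concurrent-db-requests (passed here as `ceiling`).
    """
    return min(clients * blocks_per_call, ceiling)


# 5 clients with 100-block ranges stay below the 1024 ceiling used in this report.
print(max_concurrent_readers(clients=5, blocks_per_call=100, ceiling=1024))   # 500
# 20 such clients would exceed it, so the ceiling applies.
print(max_concurrent_readers(clients=20, blocks_per_call=100, ceiling=1024))  # 1024
```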
Environment
| Field | Value |
|---|---|
| OS | Ubuntu 25.10, kernel 6.17.0-29-generic |
| CPU | AMD EPYC 7J13, 64 cores / 128 threads |
| RAM | 500 GiB |
| Storage device | 6× Micron 7450 7 TiB NVMe in RAID-0 (md0), XFS |
Reth Configuration (relevant flags)
Docker Container:
node
--chain=mainnet
--storage.v2
--db.max-size=4TB
--db.max-readers=1024
--http
--http.api=all
--rpc-cache.max-concurrent-db-requests=1024
--rpc.max-trace-filter-blocks=2000
--rpc.max-response-size=200
--rpc.gascap=100000000
--engine.cross-block-cache-size=4096
--engine.memory-block-buffer-target=128
--engine.persistence-threshold=4
--engine.storage-worker-count=48
--engine.account-worker-count=48
--engine.prewarming-threads=32
--rpc-cache.max-blocks=10000
--rpc-cache.max-receipts=10000
--rpc-cache.max-headers=5000
--mem-limit 32g
Expected Behavior
- trace_filter over a block range should not starve the sync/consensus pipeline of I/O
- The node should maintain sync progress regardless of RPC trace load
Additional Notes
- The "WARN storage::db::mdbx: A database read transaction has been open for too long" message itself is not a bug; it is a correct diagnostic. The bug is the lack of back-pressure between RPC trace operations and the sync pipeline.
- The node does not crash or corrupt data. It eventually completes each MerkleExecute run (after 23 minutes in our case) and catches up between trace bursts. The risk is falling so far behind that Lighthouse considers the execution client unresponsive.
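To make "back-pressure" concrete, here is one possible shape of it: a cap on how many block replays may run at once, regardless of how many blocks a request covers. This is an illustration in Python, not reth code and not a proposed patch. TraceLimiter, trace_range and the replay callback are all hypothetical names, and a real fix would also need to account for the sync pipeline's own I/O priority.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class TraceLimiter:
    """Hypothetical limiter: caps the number of block replays in flight."""

    def __init__(self, max_in_flight: int) -> None:
        self._slots = threading.BoundedSemaphore(max_in_flight)
        self._lock = threading.Lock()
        self.in_flight = 0
        self.peak = 0  # highest concurrency actually observed

    def run(self, replay, block_number):
        # Waits here while max_in_flight replays are already running.
        with self._slots:
            with self._lock:
                self.in_flight += 1
                self.peak = max(self.peak, self.in_flight)
            try:
                return replay(block_number)
            finally:
                with self._lock:
                    self.in_flight -= 1


def trace_range(limiter, replay, first_block, last_block, workers=64):
    """Submit one task per block, as described in the summary, but let the
    limiter decide how many of them replay at the same time."""
    blocks = range(first_block, last_block + 1)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda n: limiter.run(replay, n), blocks))
```

With a limiter of 4, a 100-block request still spawns 100 tasks, but at most 4 of them hold a read transaction open at any moment; the rest wait in the queue instead of competing for I/O.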
Steps to reproduce
- Run reth on Ethereum mainnet as an archive node with --http.api=all (trace APIs enabled) and --rpc-cache.max-concurrent-db-requests set to a large value (≥ 64).
- Connect one or more RPC clients that issue trace_filter calls over block ranges (e.g., 50–100 blocks per call). The clients should be catching up from a point significantly behind chain head so requests are continuous rather than one-per-block.
- Run multiple such clients concurrently (3–5 simultaneous connections, each with its own trace_filter loop).
- Observe: "WARN storage::db::mdbx: A database read transaction has been open for too long" begins appearing repeatedly; the MerkleExecute stage checkpoint stops advancing; NVMe utilization reaches 100%.
Minimum reproduction (single client):
- Start a reth mainnet archive node with --rpc-cache.max-concurrent-db-requests=1024.
- Send a single trace_filter RPC call covering 50+ consecutive mainnet blocks from a historically complex range (e.g., blocks 18000000–18000050, high DeFi activity).
- While the call is in flight, observe iostat -x 1 on the NVMe device and reth sync status: I/O will spike and pipeline stages will stall.
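The reproduction steps above can be scripted roughly as follows. This is a sketch rather than the exact client used for this report: the endpoint URL, the helper names and the timeout are assumptions, and the request body uses only the fromBlock and toBlock fields of trace_filter.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

RPC_URL = "http://127.0.0.1:8545"  # assumption: reth's HTTP RPC endpoint


def trace_filter_payload(from_block: int, to_block: int, request_id: int = 1) -> dict:
    """Build a JSON-RPC trace_filter request for an inclusive block range."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "trace_filter",
        "params": [{"fromBlock": hex(from_block), "toBlock": hex(to_block)}],
    }


def send(payload: dict, url: str = RPC_URL, timeout: float = 3600.0) -> dict:
    """POST one request and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def block_ranges(start: int, end: int, span: int):
    """Yield inclusive (first, last) ranges of at most `span` blocks."""
    first = start
    while first <= end:
        last = min(first + span - 1, end)
        yield first, last
        first = last + 1


def client_loop(start: int, end: int, span: int = 100, sender=send) -> int:
    """One catching-up client: back-to-back trace_filter calls; returns call count."""
    calls = 0
    for first, last in block_ranges(start, end, span):
        sender(trace_filter_payload(first, last))
        calls += 1
    return calls


def run_clients(starts, blocks_per_client: int = 1000, span: int = 100, sender=send):
    """Run one client_loop per start block concurrently (3-5 starts in the steps above)."""
    with ThreadPoolExecutor(max_workers=len(starts)) as pool:
        futures = [
            pool.submit(client_loop, s, s + blocks_per_client - 1, span, sender)
            for s in starts
        ]
        return [f.result() for f in futures]
```

For the single-client case, `send(trace_filter_payload(18_000_000, 18_000_050))` issues the one request described in the minimum reproduction. For the multi-client case, something like `run_clients([18_000_000, 18_100_000, 18_200_000])` starts three independent loops; the start blocks here are arbitrary examples.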
Node logs
Continuous WARN spam (135 occurrences in 10 minutes)
2026-05-23T08:56:45.061395Z WARN storage::db::mdbx: A database read transaction has been open for too long
open_duration=60.000748474s id=9274075 backtrace=
0: reth_db::implementation::mdbx::tx::MetricsHandler<K>::log_backtrace_on_long_read_transaction
at ./crates/storage/db/src/implementation/mdbx/tx.rs:259:32
1: reth_db::implementation::mdbx::tx::Tx<K>::execute_with_operation_metric
at ./crates/storage/db/src/implementation/mdbx/tx.rs:165:29
2: <reth_db::implementation::mdbx::tx::Tx<K> as reth_db_api::transaction::DbTx>::get_by_encoded_key
at ./crates/storage/db/src/implementation/mdbx/tx.rs:297:14
3: <reth_provider::providers::database::provider::DatabaseProvider<TX,N> as reth_storage_api::stage_checkpoint::StageCheckpointReader>::get_stage_checkpoint
at ./crates/storage/provider/src/providers/database/provider.rs:2236:21
4: <reth_provider::providers::database::provider::DatabaseProvider<TX,N> as reth_storage_api::block_id::BlockNumReader>::best_block_number
at ./crates/storage/provider/src/providers/database/provider.rs:1798:14
5: reth_provider::providers::state::historical::HistoricalStateProviderRef<Provider,N>::storage_history_lookup
at ./crates/storage/provider/src/providers/state/historical.rs:221:41
6: reth_provider::providers::state::historical::HistoricalStateProviderRef<Provider,N>::storage_by_lookup_key
at ./crates/storage/provider/src/providers/state/historical.rs:247:20
7: <reth_provider::providers::state::historical::HistoricalStateProviderRef<Provider,N> as reth_storage_api::state::StateProvider>::storage
at ./crates/storage/provider/src/providers/state/historical.rs:671:14
...
11: <T as reth_revm::database::EvmStateProvider>::storage
at ./crates/revm/src/database.rs:58:9
12: <reth_revm::database::StateProviderDatabase<DB> as revm_database_interface::DatabaseRef>::storage_ref
at ./crates/revm/src/database.rs:161:19
...
24: revm_interpreter::instructions::host::sload
at /usr/local/cargo/registry/.../revm-interpreter-35.0.1/src/instructions/host.rs:205:32
25: revm_interpreter::instructions::Instruction<W,H>::execute
26: revm_interpreter::interpreter::Interpreter<IW>::step
27: revm_inspector::handler::inspect_instructions
28: revm_inspector::traits::InspectorEvmTr::inspect_frame_run
...
33: revm_inspector::mainnet_inspect::<impl revm_inspector::inspect::InspectEvm for ...>::inspect_one_tx
34: revm_inspector::inspect::InspectEvm::inspect_tx
35: <alloy_evm::eth::EthEvm<DB,I,PRECOMPILE> as alloy_evm::evm::Evm>::transact_raw
36: alloy_evm::evm::Evm::transact
37: <alloy_evm::tracing::TracerIter<E,Txs,F> as core::iter::traits::iterator::Iterator>::next
...
52: reth_rpc_eth_api::helpers::trace::Trace::trace_block_until_with_inspector::{{closure}}::{{closure}}
53: reth_rpc_eth_api::helpers::call::Call::spawn_with_state_at_block::{{closure}}::{{closure}}
54: reth_rpc_eth_api::helpers::blocking_task::SpawnBlocking::spawn_blocking_io_fut::{{closure}}
...
70: reth_tasks::runtime::Runtime::spawn_on_rt::{{closure}}
71: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
...
88: tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}}
All 135 WARNs trace to trace_block_until_with_inspector. Multiple simultaneous warnings with identical tx IDs confirm concurrent open transactions.
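Counts like the one above can be taken from a saved log with a short script. The sketch below assumes the layout shown in the excerpt, where the open_duration and id fields follow the warning message and the duration is printed in seconds with an "s" suffix; the function name is hypothetical.

```python
import re
from collections import Counter

WARN_MARKER = "A database read transaction has been open for too long"
FIELDS = re.compile(r"open_duration=(?P<seconds>[0-9.]+)s\s+id=(?P<tx_id>\d+)")


def summarize_long_reads(log_text: str):
    """Return (warning count, Counter of transaction ids, longest duration in seconds)."""
    warnings = log_text.count(WARN_MARKER)
    ids = Counter()
    longest = 0.0
    for match in FIELDS.finditer(log_text):
        ids[match["tx_id"]] += 1
        longest = max(longest, float(match["seconds"]))
    return warnings, ids, longest


sample = (
    "2026-05-23T08:56:45.061395Z WARN storage::db::mdbx: "
    "A database read transaction has been open for too long\n"
    "open_duration=60.000748474s id=9274075 backtrace=\n"
)
count, ids, longest = summarize_long_reads(sample)
print(count, dict(ids), longest)
```

Grouping by id makes it easy to see whether repeated warnings belong to the same transaction or to different ones.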
Sync pipeline stall — 23 minutes on 64 blocks
2026-05-23T08:51:49Z INFO reth_node_events::node: Executing stage pipeline_stages=8/13 stage=MerkleExecute checkpoint=25156717 target=25156781
2026-05-23T08:52:12Z INFO reth::cli: Status connected_peers=61 stage=MerkleExecute checkpoint=25156717 target=25156781
2026-05-23T08:53:52Z INFO reth::cli: Status connected_peers=68 stage=MerkleExecute checkpoint=25156717 target=25156781
2026-05-23T08:55:32Z INFO reth::cli: Status connected_peers=69 stage=MerkleExecute checkpoint=25156717 target=25156781
2026-05-23T08:57:12Z INFO reth::cli: Status connected_peers=80 stage=MerkleExecute checkpoint=25156717 target=25156781
2026-05-23T08:58:52Z INFO reth::cli: Status connected_peers=82 stage=MerkleExecute checkpoint=25156717 target=25156781
# ... checkpoint=25156717 unchanged for entire period ...
2026-05-23T09:15:07Z INFO reth::cli: Status connected_peers=130 stage=MerkleExecute checkpoint=25156781 target=25156877
# ^ Finally advanced after 23 minutes; chain was 226 blocks ahead
MerkleExecute for 64 blocks normally completes in under 1 second. It took 23 minutes because all I/O was consumed by concurrent trace replays.
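The stall length can be derived directly from the Status lines. The following is a minimal sketch that assumes the whole-second UTC timestamps and the checkpoint=/target= fields exactly as they appear in the excerpt above; the helper names are hypothetical.

```python
import re
from datetime import datetime

STATUS = re.compile(
    r"^(?P<ts>\S+)\s+INFO\s+.*?\bstage=(?P<stage>\w+)"
    r"\s+checkpoint=(?P<checkpoint>\d+)\s+target=(?P<target>\d+)"
)


def parse_status(line: str):
    """Return (timestamp, checkpoint, target), or None for non-status lines."""
    match = STATUS.match(line)
    if match is None:
        return None
    ts = datetime.strptime(match["ts"], "%Y-%m-%dT%H:%M:%SZ")
    return ts, int(match["checkpoint"]), int(match["target"])


def stall_seconds(lines):
    """Seconds from the first status line until the checkpoint first changes."""
    first = None
    for line in lines:
        parsed = parse_status(line)
        if parsed is None:
            continue
        ts, checkpoint, _target = parsed
        if first is None:
            first = (ts, checkpoint)
        elif checkpoint != first[1]:
            return (ts - first[0]).total_seconds()
    return None  # checkpoint never advanced in the given lines


log = [
    "2026-05-23T08:51:49Z INFO reth_node_events::node: Executing stage pipeline_stages=8/13 stage=MerkleExecute checkpoint=25156717 target=25156781",
    "2026-05-23T08:58:52Z INFO reth::cli: Status connected_peers=82 stage=MerkleExecute checkpoint=25156717 target=25156781",
    "2026-05-23T09:15:07Z INFO reth::cli: Status connected_peers=130 stage=MerkleExecute checkpoint=25156781 target=25156877",
]
print(stall_seconds(log) / 60)  # minutes the checkpoint stayed unchanged
```

On the three lines above this gives 1398 seconds, which is the 23 minutes quoted in this report, and target minus checkpoint on the first line gives the 64 blocks.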
System resource state during incident
System load average (128-thread machine):
reth-eth container stats:
CONTAINER CPU % MEM USAGE / LIMIT PIDS
reth-eth 4152.78% 23.6GiB / 32GiB 1181
NVMe RAID-0 I/O — fully saturated:
Device r/s rkB/s r_await w/s wkB/s %util
md0 67083.90 1796190.84 0.22 13448.08 86983.82 91.39
md0 63756.00 2038088.00 0.72 25958.00 105352.00 100.10
md0 61955.45 2131053.47 0.74 6347.52 100479.21 99.01
Platform(s)
Linux (x86)
Container Type
Docker
What version/commit are you on?
Reth Version: 2.2.0
Commit SHA: 88505c7fcbfdebfd3b56d88c86b62e950043c6c4
Build Timestamp: 2026-04-29T19:53:57.473810535Z
Build Features: asm_keccak,jemalloc,keccak_cache_global,min_debug_logs,otlp,otlp_logs
Build Profile: maxperf-symbols
What database version are you on?
Current database version: 2
Local database version: 2
Which chain / network are you on?
--chain mainnet (Ethereum mainnet, archive node, --storage.v2)
What type of node are you running?
Archive (default)