
[Diagnosis] RPC Responsiveness and Observability Under Computation-Intensive Trace Workloads #24105

@Cripto5588

Description

This diagnostic details the impact of concurrent, computationally intensive trace requests on the responsiveness of a lightweight RPC method (eth_blockNumber) on a reth node (main branch).

The primary objective is to observe how standard RPC endpoints behave when the node is subjected to high-cost trace workloads and to assess whether existing telemetry provides sufficient visibility into potential resource contention or scheduling pressure.

Sandbox:

  • Platform: GitHub Codespaces (4-core instance)
  • Node Configuration: --dev mode
  • Tools: hey (HTTP benchmarking) and curl
  • Workload: trace of Sepolia block #6000000 (hash: 0x1ec318985958569724128038843c08971f1e31d4590373264177d4c7849e7b99), which contains multiple contract interactions and transfers, ensuring realistic execution pressure on the EVM.
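
JSON-RPC block parameters are hex-encoded quantities, so block #6000000 appears as `0x5B8D80` in the `debug_traceBlockByNumber` request used in this report. A minimal sketch of the conversion, where `block_to_hex` is an illustrative helper name and not part of reth or any tool used here:

```bash
# Convert a decimal block number to the hex quantity used in JSON-RPC params.
# block_to_hex is a hypothetical helper name introduced for illustration.
block_to_hex() {
  printf '0x%X\n' "$1"
}

block_to_hex 6000000   # prints 0x5B8D80
```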

Reproduction:

  1. Start the node in a controlled environment.

  2. Establish a performance baseline for a lightweight read method:

    ```bash
    hey -n 200 -c 8 -m POST -H "Content-Type: application/json" -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' http://localhost:8545
    ```

  3. Start concurrent trace requests for the specified Sepolia block:

    ```bash
    while true; do curl -s -X POST -H "Content-Type: application/json" -d '{"jsonrpc":"2.0","method":"debug_traceBlockByNumber","params":["0x5B8D80", {"tracer": "callTracer"}],"id":1}' http://localhost:8545 > /dev/null; done
    ```

  4. Run the baseline benchmark again while the tracing workload is active and compare the metrics.

Diagnostic Results (Comparative Table):

| Metric | Baseline (Idle) | Tracing Workload | Variation |
| --- | --- | --- | --- |
| Average Latency | 0.0033 s | 0.0043 s | +30.3% |
| Requests/sec | 2295.4 | 1737.6 | -24.3% |
| p95 Latency | 0.0061 s | 0.0087 s | +42.6% |
| p99 Latency | 0.0073 s | 0.0128 s | +75.3% |
| Slowest Request | 0.0093 s | 0.0138 s | +48.3% |

Observability Findings and Gaps:

The measurements show that concurrent tracing workloads measurably degrade the responsiveness of lightweight RPC methods on the same node instance: throughput dropped by about 24% and p99 latency rose by about 75%. However, the following points remained unclear during testing:

  • Attribution: It is difficult to determine whether the observed slowdown originates from executor contention, trace replay cost, RPC scheduling pressure, or shared resource saturation.
  • Limitations: It is unclear whether this level of degradation is considered expected behavior in the current RPC architecture, where computationally intensive requests share a contention domain with standard queries.
  • Telemetry sufficiency: It is uncertain whether existing RPC metrics are granular enough to attribute resource scarcity or identify trace-induced contention in production environments.

Conclusion:

This diagnosis shows that higher-cost trace requests affect the tail latency and throughput of standard API endpoints. Its objectives are (1) to provide a reproducible basis for further analysis of RPC scheduling, and (2) to highlight the current difficulty of differentiating execution costs from network-level contention.
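
The tracing workload and the comparison step can be sketched as two small helpers. As written in the reproduction, a single `while` loop keeps only one trace request in flight at a time, so the first helper launches several loops in the background to produce overlapping traces. The second recomputes the Variation column as (loaded − baseline) / baseline. The names `RPC_URL`, `TRACE_BODY`, `start_trace_workers`, `stop_trace_workers`, and `pct_change` are illustrative assumptions, not part of reth, hey, or curl.

```bash
# Sketch only: helper names and defaults below are assumptions for illustration.
RPC_URL="${RPC_URL:-http://localhost:8545}"
TRACE_BODY='{"jsonrpc":"2.0","method":"debug_traceBlockByNumber","params":["0x5B8D80", {"tracer": "callTracer"}],"id":1}'
TRACE_PIDS=""

# One endless trace loop, identical to the loop in the reproduction.
trace_loop() {
  while true; do
    curl -s -X POST -H "Content-Type: application/json" \
      -d "$TRACE_BODY" "$RPC_URL" > /dev/null
  done
}

# Launch N background trace loops (default 4) and remember their PIDs.
start_trace_workers() {
  n="${1:-4}"
  i=0
  while [ "$i" -lt "$n" ]; do
    trace_loop &
    TRACE_PIDS="$TRACE_PIDS $!"
    i=$((i + 1))
  done
}

# Stop all background loops started above.
stop_trace_workers() {
  if [ -n "$TRACE_PIDS" ]; then
    # Word splitting of the PID list is intended here.
    kill $TRACE_PIDS 2>/dev/null || true
    wait $TRACE_PIDS 2>/dev/null || true
  fi
  TRACE_PIDS=""
}

# Relative change between a baseline value and a value measured under load.
pct_change() {
  awk -v base="$1" -v loaded="$2" \
    'BEGIN { printf "%+.1f%%\n", (loaded - base) / base * 100 }'
}

pct_change 0.0033 0.0043   # average latency: prints +30.3%
pct_change 0.0073 0.0128   # p99 latency:     prints +75.3%
```

A run might call `start_trace_workers 4`, repeat the `hey` baseline command, and then call `stop_trace_workers`. The worker count of 4 is an arbitrary choice matching the 4-core instance, not a value taken from the original test, and results will differ from the table if the level of trace concurrency differs.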

Labels: C-enhancement (new feature or request), S-needs-triage (this issue needs to be labelled), S-stale (this issue/PR is stale and will close with no further activity)
