[Diagnosis] RPC Responsiveness and Observability Under Computation-Intensive Trace Workloads #24105

This diagnostic details the impact of simultaneous, computationally intensive trace requests on the responsiveness of lightweight RPC methods (`eth_blockNumber`) on a `reth` node (main branch).

The primary objective is to observe how standard RPC endpoints behave when the node is subjected to high-cost trace workloads and to assess whether existing telemetry provides sufficient visibility into potential resource contention or scheduling pressure.

### Sandbox:
- **Platform:** GitHub Codespaces (4-core instance)
- **Node Configuration:** `--dev` mode
- **Tools:** `hey` (HTTP benchmarking) and `curl`
- **Workload:** Trace of Sepolia block **#6000000** (Hash: `0x1ec318985958569724128038843c08971f1e31d4590373264177d4c7849e7b99`), which contains multiple contract interactions and transfers and therefore puts realistic execution pressure on the EVM.
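
The trace request in the playback steps addresses this block by its hexadecimal number, `0x5B8D80`. As a quick sanity check, the conversion from the decimal block number can be reproduced with the shell's `printf`. This is a minimal sketch; the helper name `to_hex_block` is hypothetical and not part of any tool used here.

```bash
# Hypothetical helper: format a decimal block number as the 0x-prefixed
# hex string used for the block parameter in the JSON-RPC request.
to_hex_block() {
  printf '0x%X\n' "$1"
}

to_hex_block 6000000   # prints 0x5B8D80
```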

### Playback:
1. Start the node in a controlled environment.

2. Establish a performance baseline for lightweight reads:
   ```bash
   hey -n 200 -c 8 -m POST -H "Content-Type: application/json" -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' http://localhost:8545
   ```

3. Start concurrent trace requests for the specified Sepolia block:
   ```bash
   while true; do curl -s -X POST -H "Content-Type: application/json" -d '{"jsonrpc":"2.0","method":"debug_traceBlockByNumber","params":["0x5B8D80", {"tracer": "callTracer"}],"id":1}' http://localhost:8545 > /dev/null; done
   ```

4. Run the baseline benchmark again during the tracing workload and compare the metrics.

### Diagnostic Results (Comparative Table):

| Metric | Baseline (Idle) | Tracing Workload | Variation |
| --- | --- | --- | --- |
| Average Latency | 0.0033s | 0.0043s | +30.3% |
| Requests/sec | 2295.417 | 37.6 | -24.3% |
| p95 Latency | 0.0061s | 0.0087s | +42.6% |
| p99 Latency | 0.0073s | 0.0128s | +75.3% |
| Slowest Request | 0.0093s | 0.0138s | +48.3% |

### Observability Findings and Gaps:

The measurements show that concurrent tracing workloads measurably degrade the responsiveness of lightweight RPC methods within the same node instance. However, the following points remained unclear during testing:

- **Attribution:** It is difficult to determine whether the observed slowdown originates from executor contention, trace replay cost, RPC scheduling pressure, or shared resource saturation.
- **Limitations:** It is unclear whether this level of degradation is expected behavior in the current RPC architecture, where computationally intensive requests share a contention domain with standard queries.
- **Telemetry sufficiency:** It is uncertain whether existing RPC metrics are granular enough to attribute resource saturation or to identify trace-induced contention in production environments.

### Conclusion:

This diagnosis confirms that higher-cost trace requests increase the latency and reduce the throughput of standard API endpoints served by the same node.

Objectives:

1. Provide a reproducible basis for further analysis of RPC scheduling.
2. Highlight the current difficulty in differentiating execution costs from network-level contention.
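
The Variation column in the results is the relative change from the idle baseline to the value measured under the tracing workload, `(loaded - baseline) / baseline * 100`. A minimal sketch of that arithmetic, assuming `awk` is available; the helper name `pct_change` is hypothetical:

```bash
# Hypothetical helper: signed percentage change from a baseline value ($1)
# to the value measured under load ($2), rounded to one decimal place.
pct_change() {
  awk -v base="$1" -v cur="$2" 'BEGIN { printf "%+.1f%%\n", (cur - base) / base * 100 }'
}

pct_change 0.0033 0.0043   # average latency: prints +30.3%
pct_change 0.0073 0.0128   # p99 latency:     prints +75.3%
```

Computing the variation from the raw `hey` summaries, rather than by hand, makes it easier to repeat the comparison across runs with different concurrency settings.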
