Add OpenTelemetry tracing across the backbeat pipeline #2733
Codecov Report
... and 6 files with indirect coverage changes
@@ Coverage Diff @@
## development/9.4 #2733 +/- ##
===================================================
- Coverage 74.73% 74.66% -0.07%
===================================================
Files 199 201 +2
Lines 13650 13741 +91
===================================================
+ Hits 10201 10260 +59
- Misses 3439 3471 +32
Partials 10 10
Pin arsenal at the ARSN-586 branch (shared tracing module + W3C trace-context stamping on MongoDB metadata writes). Drop the SDK-core packages now that arsenal carries them as optionalDependencies, and keep the four instrumentation packages (http, ioredis, mongodb, aws-sdk) here — the consumer owns and configures them. Issue: BB-764
lib/tracing/index.js becomes a thin shim over arsenal's shared module: it carries backbeat's config (serviceName, the http/ioredis/mongodb/aws-sdk instrumentations, outbound-only HTTP via makeHttpInstrumentationConfig + disableIncomingRequestInstrumentation) so the 8 entry points keep calling init() with no args. kafkaTraceContext.js re-exports arsenal's kafka helpers so the existing require sites are unchanged. The trust-boundary filter and SDK bootstrap now live in arsenal. Issue: BB-764
Wire arsenal's tracing into the replication, lifecycle, GC, notification, and oplog-populator pods: init() at each entry point, per-pod spans, and trace-context propagation across the Kafka pipeline (producers stamp traceparent via the kafka helpers; consumers start linked spans from it). Out-of-process Kafka hops use span links, not parent/child, so traces stay bounded. Issue: BB-764
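To illustrate the link-versus-parent choice described in this commit, here is a minimal sketch of the propagation in plain JavaScript. It is not arsenal's actual helper API: the function names (stampTraceparent, linkFromHeaders, startConsumerSpan) are hypothetical, the span is a plain-object stand-in rather than a real OpenTelemetry span, and only version 00 of the W3C traceparent format is parsed. The link shape (an object with a context field) follows the OpenTelemetry JS Link interface.

```javascript
'use strict';

// W3C traceparent, version 00: 00-<32 hex trace id>-<16 hex span id>-<2 hex flags>
const TRACEPARENT_RE = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Producer side (hypothetical helper): copy the headers and stamp the
// current span context onto them as a traceparent value.
function stampTraceparent(headers, spanContext) {
    const flags = spanContext.traceFlags.toString(16).padStart(2, '0');
    return Object.assign({}, headers, {
        traceparent: `00-${spanContext.traceId}-${spanContext.spanId}-${flags}`,
    });
}

// Consumer side (hypothetical helper): turn the header into a span link.
// Kafka clients may deliver header values as Buffers, so accept both forms.
function linkFromHeaders(headers) {
    const raw = headers && headers.traceparent;
    if (raw === undefined || raw === null) {
        return null;
    }
    const text = Buffer.isBuffer(raw) ? raw.toString('utf8') : String(raw);
    const match = TRACEPARENT_RE.exec(text);
    if (!match) {
        return null;
    }
    const [, traceId, spanId, flags] = match;
    // All-zero ids are invalid in W3C Trace Context.
    if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) {
        return null;
    }
    return {
        context: { traceId, spanId, traceFlags: parseInt(flags, 16), isRemote: true },
    };
}

// The consumer span is a new root (parent is null); the upstream span is
// attached only as a link, so a late Kafka hop does not extend the
// original trace.
function startConsumerSpan(name, headers) {
    const link = linkFromHeaders(headers);
    return { name, parent: null, links: link ? [link] : [] };
}
```

A message with a missing or malformed traceparent still yields a consumer span, just with an empty links array, so a bad header never blocks processing.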
Summary
Add OpenTelemetry tracing across the backbeat pipeline, gated behind ENABLE_OTEL=true. When the flag is unset, no @opentelemetry/* package is loaded, so there is zero overhead off the OTEL path.
The SDK bootstrap, trust-boundary host filter, and Kafka trace-context helpers now live in arsenal's shared lib/tracing module (scality/Arsenal#2632, ARSN-586); backbeat consumes it through a thin shim instead of carrying its own copy. Companion to the cloudserver (#6140, CLDSRV-884) and vault (#203, VAULT-708) PRs, so all four services share one implementation.
Commits
chore: depend on arsenal OTEL tracing module. Pins arsenal at the ARSN-586 branch (shared tracing module plus the W3C trace-context stamping on MongoDB metadata writes, ARSN-572, which the Kafka pipeline relies on to continue traces across the oplog boundary). Drops the SDK-core packages now that arsenal carries them as optionalDependencies, and keeps the four instrumentation packages backbeat configures itself: instrumentation-http, -ioredis, -mongodb, and -aws-sdk.
feat: replace in-tree tracing with arsenal shim. lib/tracing/index.js becomes a thin shim over require('arsenal/build/lib/tracing') carrying backbeat's config in one place: serviceName: 'backbeat', the four instrumentations, and outbound-only HTTP (...makeHttpInstrumentationConfig() for the trust-boundary requestHook, plus disableIncomingRequestInstrumentation: true since backbeat pods serve no application HTTP). lib/tracing/kafkaTraceContext.js re-exports arsenal's kafka helpers so the existing require sites are unchanged. The trust-boundary filter and SDK bootstrap that used to live here are now arsenal's.
feat: instrument backbeat pods and the Kafka pipeline. Wires tracing into the replication, lifecycle, GC, notification, and oplog-populator pods: init() at each of the 8 entry points, per-pod spans, and trace-context propagation across the Kafka pipeline. Producers stamp traceparent onto message headers via the kafka helpers; consumers start a span linked to (not a child of) the upstream span. Out-of-process Kafka hops can fire long after the original request, so links keep traces bounded.
Why a shim (vs cloudserver/vault's direct calls)
backbeat has 8 init() entry points and 6 Kafka-helper require sites. The shim keeps all 14 call sites untouched and the backbeat-specific config in one file; cloudserver and vault have a single entry point each, so they deep-require arsenal directly.
Configuration
OpenTelemetry environment variables are documented in the arsenal module.
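As a rough illustration of the ENABLE_OTEL gate, the sketch below shows one way to keep the tracing module unloaded when the flag is off: the require is wrapped in a loader function that is only called after the flag check. The names initTracing and defaultLoader are hypothetical, and the sketch assumes arsenal's module exposes an init function that accepts a config object, which this PR description does not spell out. Only the require path and the serviceName value come from the description above.

```javascript
'use strict';

// Hypothetical gate: returns null without loading anything when the flag
// is not exactly 'true'; otherwise loads the tracing module and starts it.
function initTracing(env, loadTracing) {
    if (env.ENABLE_OTEL !== 'true') {
        // Flag unset: return before any require() of tracing code runs.
        return null;
    }
    const tracing = loadTracing();
    // Assumption: the shared module's init takes a config object.
    return tracing.init({ serviceName: 'backbeat' });
}

// The require sits inside the arrow function, so the module (and the
// @opentelemetry/* packages behind it) is resolved only when it is called.
const defaultLoader = () => require('arsenal/build/lib/tracing');
```

An entry point would call initTracing(process.env, defaultLoader). Passing the loader as an argument also lets a test confirm that nothing is loaded when the flag is off.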
Out of scope (follow-ups)
The trust-boundary enforcement (read OTEL_TRUSTED_HOSTS, strip traceparent on untrusted outbound requests) lives entirely in arsenal and is wired automatically. Populating that env var per deployment is routine operator config and needs no backbeat-side work.
Related tickets
Issue: BB-764