Skip to content

[Fix][Zeta] Fix job detail metrics NaN with 2.3.13 backend#10637

Closed
davidzollo wants to merge 2 commits into
apache:devfrom
davidzollo:fix/ui-detail-metrics-compat-2313
Closed

[Fix][Zeta] Fix job detail metrics NaN with 2.3.13 backend#10637
davidzollo wants to merge 2 commits into
apache:devfrom
davidzollo:fix/ui-detail-metrics-compat-2313

Conversation

@davidzollo
Copy link
Copy Markdown
Contributor

What does this PR do?

Fixes the job detail page metrics lookup so the SeaTunnel UI can read the prefixed table metric keys returned by the 2.3.13 backend and avoid rendering NaN.

This change only targets the 2.3.13 backend contract.

Closes #10636

Why is this needed?

The 2.3.13 backend returns table metrics keyed by Source[n].<tablePath> / Sink[n].<tablePath>, while the UI detail page still looked up metrics by raw table path. That mismatch made Number(undefined) show up as NaN in the job detail metrics table.

How was it tested?

  • cd seatunnel-engine/seatunnel-engine-ui && npm run type-check
  • cd seatunnel-engine/seatunnel-engine-ui && npm run test:unit -- --run src/tests/detail-metrics.spec.ts
  • ./mvnw spotless:apply
  • ./mvnw -q -DskipTests verify (running locally while opening this PR)

@github-actions github-actions Bot added the Zeta label Mar 23, 2026
@DanielCarter-stack
Copy link
Copy Markdown

Issue 1: Missing test coverage for null values and edge cases

Location: seatunnel-engine/seatunnel-engine-ui/src/tests/detail-metrics.spec.ts

Problem Description:
The test file covers main scenarios, but lacks tests for the following important edge cases:

  1. When metricMap is undefined, whether readVertexMetricValue correctly returns 0
  2. Behavior when vertexName does not contain identifier (e.g., vertexName: 'simple-name')
  3. Behavior when there is no matching key in metricMap
  4. Priority validation when metricMap contains both prefixed keys and original keys
  5. Handling of Number.isFinite when metric value is a non-numeric string (e.g., 'abc')

Potential Risks:

  • Edge cases are not covered by tests, unexpected behavior may occur in production environments
  • When vertexName format does not match expectations, metric lookup may fail
  • Unable to ensure code robustness under various abnormal inputs

Impact Scope:

  • Direct impact: resolveMetricKey, readVertexMetricValue, collectVertexMetrics functions in detail-metrics.ts
  • Indirect impact: Metric display logic in detail.tsx
  • Impact area: Only affects frontend UI metric display, does not affect backend Job execution

Severity: MINOR

Improvement Suggestions:

test('returns 0 when metricMap is undefined', () => {
  expect(readVertexMetricValue(undefined, sourceVertex, 'fake.user_table')).toBe(0)
})

test('handles vertexName without identifier', () => {
  const vertexWithoutIdentifier = {
    vertexId: 1,
    type: 'source',
    vertexName: 'simple-source-name',
    tablePaths: ['fake.user_table']
  } as Vertex
  
  const metricMap = { 'fake.user_table': '15' }
  expect(readVertexMetricValue(metricMap, vertexWithoutIdentifier, 'fake.user_table')).toBe(15)
})

test('returns 0 when no matching key found', () => {
  const metricMap = { 'other.table': '100' }
  expect(readVertexMetricValue(metricMap, sourceVertex, 'fake.user_table')).toBe(0)
})

test('prefers prefixed key over raw key when both exist', () => {
  const metricMap = {
    'Source[0].fake.user_table': '10',
    'fake.user_table': '999'
  }
  expect(readVertexMetricValue(metricMap, sourceVertex, 'fake.user_table')).toBe(10)
})

test('handles non-numeric metric values', () => {
  const metricMap = { 'fake.user_table': 'invalid' }
  expect(readVertexMetricValue(metricMap, sourceVertex, 'fake.user_table')).toBe(0)
})

Issue 2: Key names returned by collectVertexMetrics may not match user expectations

Location: seatunnel-engine/seatunnel-engine-ui/src/views/jobs/detail-metrics.ts:81-100

Problem Description:
The key name format returned by the collectVertexMetrics function is ${metricName}.${path}, for example TableSinkWriteCount.fake.user_table. However, when the actual key used is the prefixed Sink[1].fake.user_table, the key name users see in the Configuration component is inconsistent with the actual metricMap key name, which may cause confusion.

Although this does not affect functionality (Configuration component only displays key-value pairs), it may make debugging difficult because the key name users see does not match the actual key name returned by the backend.

Potential Risks:

  • Users may be confused when viewing Job details why the displayed key name is inconsistent with the one returned by the backend
  • May not quickly locate the actual metricMap key when debugging issues
  • Cannot distinguish which vertex the metric belongs to when multiple vertices share the same table path

Impact Scope:

  • Direct impact: focusedVertex computed property in detail.tsx
  • Indirect impact: Configuration component display
  • Impact area: Only affects UI display, does not affect functionality

Severity: MINOR

Improvement Suggestions:
Consider displaying both the original key name and the actual used key name in the Configuration component, or keep the current status and add comments for explanation.

Alternatively, if clearer display is needed, you can modify collectVertexMetrics to return an object containing more information:

export const collectVertexMetrics = (
  metricName: string,
  metricMap: MetricMap,
  vertex: MetricVertex
): Record<string, { value: string; actualKey: string }> => {
  const metrics: Record<string, { value: string; actualKey: string }> = {}

  if (!metricMap) {
    return metrics
  }

  vertex.tablePaths.forEach((path) => {
    const metricKey = resolveMetricKey(metricMap, vertex, path)
    if (metricKey !== undefined) {
      metrics[`${metricName}.${path}`] = {
        value: metricMap[metricKey],
        actualKey: metricKey
      }
    }
  })

  return metrics
}

But this requires synchronized changes to the Configuration component, which may be beyond the scope of this PR.


Issue 3: Fallback logic of resolveMetricKey may lead to ambiguity

Location: seatunnel-engine/seatunnel-engine-ui/src/views/jobs/detail-metrics.ts:50-61

Problem Description:
In the resolveMetricKey function, when an exact matching key cannot be found, it attempts to fall back to suffix matching (lines 50-60). Specifically, the fallback logic on line 60:

return suffixedKeys.length === 1 ? suffixedKeys[0] : undefined

This logic may cause problems in certain situations:

  1. When multiple vertices share the same table path but metrics are only reported from one vertex, this fallback logic will incorrectly return the only existing key
  2. When multiple table paths with the same name belong to different vertices, if there is no identifier, it may return the wrong metric value

Potential Risks:

  • In scenarios where multiple vertices share table paths, incorrect metric values may be returned
  • When a vertex's metrics have not yet been reported, another vertex's metric values may be used
  • Causes users to see inaccurate metric data, misleading operations decisions

Impact Scope:

  • Direct impact: sourceCell and sinkCell functions in detail.tsx
  • Indirect impact: Metric display on Job details page
  • Impact area: Metric display in multi-vertex scenarios

Severity: MAJOR

Improvement Suggestions:
In the fallback logic on line 60, you should be more cautious and only use fallback when it is confirmed that there will be no confusion:

// Only fallback when there is no identifier and only one matching key exists
if (!identifier && suffixedKeys.length === 1) {
  return suffixedKeys[0]
}

// If there is an identifier but no matching prefix key, do not blindly fallback
// In this case, undefined should be returned to let the caller handle it
if (identifier && suffixedKeys.length > 0) {
  const sameVertexKey = suffixedKeys.find((key) => key.startsWith(`${identifier}.`))
  if (sameVertexKey) {
    return sameVertexKey
  }
  // If a table with the same name is found but does not belong to the current vertex, return undefined instead of using another vertex's data
  return undefined
}

return undefined

But this requires more careful consideration of various scenarios and may need to add more test cases for verification.


Issue 4: Performance Consideration - Multiple traversals of Object.keys

Location: seatunnel-engine/seatunnel-engine-ui/src/views/jobs/detail-metrics.ts:50-51

Problem Description:
In the resolveMetricKey function, each call executes Object.keys(metricMap).filter((key) => key.endsWith(suffix)), which traverses all metricMap keys.

Considering:

  1. sourceCell and sinkCell functions call readVertexMetricValue once for each tablePath of each vertex
  2. If a Job has 10 vertices and each vertex has 5 table paths, resolveMetricKey will be called 50 times
  3. Each call traverses all metricMap keys (possibly dozens to hundreds)

This may have performance issues when the number of metrics is large.

Potential Risks:

  • In large Jobs with large amounts of metric data, page rendering performance may decline
  • Frequent array traversal and filtering operations may affect user experience
  • In real-time update scenarios (refreshing every 5 seconds), performance issues become more obvious

Impact Scope:

  • Direct impact: Metric table rendering in detail.tsx
  • Indirect impact: Performance of Job details page
  • Impact area: Details pages of large Jobs

Severity: MINOR

Improvement Suggestions:
Consider pre-building an index to speed up lookup, or use more efficient data structures. However, since this is a frontend optimization and metrics are usually not too many, the priority of this issue is lower.

If optimization is indeed needed, you can consider:

// Pre-build a Map grouped by table path
const buildMetricPathIndex = (metricMap: MetricMap) => {
  if (!metricMap) return new Map()
  
  const index = new Map<string, string[]>()
  Object.keys(metricMap).forEach(key => {
    const parts = key.split('.')
    if (parts.length >= 2) {
      const path = parts.slice(1).join('.')
      if (!index.has(path)) {
        index.set(path, [])
      }
      index.get(path)!.push(key)
    }
  })
  return index
}

But this increases code complexity and requires weighing performance against maintainability.


Issue 5: Missing metric handling for Transform type vertices

Location: seatunnel-engine/seatunnel-engine-ui/src/views/jobs/detail.tsx:95-126

Problem Description:
Although the VERTEX_IDENTIFIER_PATTERN regular expression includes Transform, the sourceCell and sinkCell functions only handle source and sink type vertices, not transform type vertices.

This may lead to the following problems:

  1. If Transform type vertices also have table metrics (although not very common), these metrics will not be displayed
  2. In the computed property of focusedVertex, there is also no handling for transform type

Potential Risks:

  • If Transform type vertices support table metrics in the future, these metrics will not be able to be displayed in the UI
  • Incomplete code structure may cause confusion for maintainers

Impact Scope:

  • Direct impact: sourceCell, sinkCell and focusedVertex functions in detail.tsx
  • Indirect impact: Metric display of Transform type vertices
  • Impact area: Transform type vertices

Severity: MINOR

Improvement Suggestions:
If it is certain that Transform type will not have table metrics, you can add comments to explain in the code. Alternatively, for code completeness, you can add support for Transform:

const transformCell = (
  row: Vertex,
  key: string
) => {
  if (row.type === 'transform') {
    // If the Transform type also has table metrics, processing logic can be added here
    return row.tablePaths.reduce(
      (s, path) => s + readVertexMetricValue(job.metrics?.[key], row, path),
      0
    )
  }
  return 0
}

But this requires first confirming whether the backend returns table metrics for Transform.


@DanielLeens
Copy link
Copy Markdown
Contributor

Hi @davidzollo, I pulled this PR locally and reviewed the job-detail metrics path against the current dev baseline.

Local branch: seatunnel-review-10637
Head reviewed: 40cc831d83fe8eacd686b4a319aba0638fd799ef

What this PR fixes:

  • User pain: with the 2.3.13 backend, table-level metrics in the job detail page can be returned under vertex-prefixed keys such as Sink[0].fake.user_table instead of only the raw table name fake.user_table. The previous UI summed job.metrics?.[key][path], so the normal table row could render NaN when only prefixed keys were present.
  • Fix approach: the UI now resolves a metric key by first trying the focused vertex identifier plus table path, then falling back to the legacy raw table-path key, and finally to an unambiguous suffix match.
  • In one sentence: this keeps the job detail metric table compatible with both the newer prefixed REST response and the older raw-table response.

Runtime path checked:

User opens the Zeta job detail page
  -> detail.tsx fetches getJobInfo(jobId)
      -> backend BaseService.getJobInfoJson()/convertToJson() adds metrics from getJobMetrics()
          -> BaseService.processMetric() can emit keys like Sink[0].fake.user_table when several vertices share the same table
  -> UI detail.tsx tableData renders source/sink rows
      -> sourceCell()/sinkCell() call readVertexMetricValue()
          -> resolveMetricKey() tries ${vertexIdentifier}.${tablePath}
          -> falls back to raw tablePath for older backend responses
  -> drawer focusedVertex uses collectVertexMetrics()
      -> only metrics belonging to the focused vertex are shown

Key local evidence:

  • Before this PR, seatunnel-engine/seatunnel-engine-ui/src/views/jobs/detail.tsx read Number(job.metrics?.[key][path]) directly, so missing raw-table keys could become NaN.
  • After this PR, detail.tsx:103-123 uses readVertexMetricValue(), and detail-metrics.ts:29-79 resolves prefixed, raw, and unambiguous suffixed keys.
  • The backend path already supports vertex-prefixed table metrics: BaseService.processMetric() writes identifier + "." + tableName when attribution is available, and JobDAGInfo.toJsonObject() exposes the vertex names/table paths consumed by the UI.
  • The added Vitest coverage checks prefixed key preference, raw-key fallback, and focused-vertex metric collection.

Local checks performed:

  • git fetch upstream pull/10637/head:seatunnel-review-10637 --force: passed.
  • git checkout seatunnel-review-10637: passed.
  • git diff --stat upstream/dev...seatunnel-review-10637: 3 files, +219/-30.
  • Source-chain inspection covered detail.tsx, detail-metrics.ts, JobDAGInfo.toJsonObject(), and BaseService.processMetric().
  • Local build/tests were not run; this was a static/source review and the PR already adds focused UI helper tests.
  • GitHub checks observed: Build is failing; Notify test workflow and labeler are passing.

Merge conclusion

Conclusion: can merge after fixes

Blocking items:

  1. The PR should not be merged while the GitHub Build check is failing. The code-level change looks focused and correct, but CI must be green before merge.

Non-blocking suggestions:

  • If possible, include the failed Build log in the PR discussion or rerun CI, because the current GitHub status only shows the top-level Build failure from the review metadata.

Overall evaluation:

This is a precise front-end compatibility fix for the prefixed table-metric keys produced by the current backend. I did not find a source-level blocker in the implementation itself; once CI is green, this looks mergeable from the code path reviewed here.

Copy link
Copy Markdown
Contributor

@DanielLeens DanielLeens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution. I reviewed the latest head against the real job-detail metrics path from the backend metric map to the UI rendering.

What This PR Fixes

  • User pain: when the backend returns table metrics under vertex-prefixed keys like Source[0].table or Sink[1].table, the old job detail page still reads only the raw table name key, which can turn into NaN or mis-attribute values.
  • Fix approach: this PR introduces detail-metrics.ts as a compatibility layer so the UI first tries the prefixed key for the current vertex, then falls back to the legacy raw table key, and only then to an unambiguous suffix match.
  • One-line summary: this is a focused UI compatibility fix for mixed old/new metric-key shapes.

Runtime Chain Rechecked

backend BaseService.processMetric()
  -> may emit `${vertexIdentifier}.${tableName}` keys

job detail page
  -> detail.tsx sourceCell()/sinkCell()
      -> readVertexMetricValue(metricMap, row, tablePath)
          -> resolveMetricKey()
              -> prefixed key for this vertex
              -> raw table key
              -> unique suffix fallback

detail drawer
  -> collectVertexMetrics()
      -> keep only metrics belonging to the focused vertex

Findings

Issue 1: the PR is still blocked by a red Build

  • Location:
    • GitHub Build
    • fork run davidzollo/seatunnel#26333762529
  • Current visible failure:
    • Run / Code style
  • Why this matters:
    • this PR only changes frontend TS files, so the code-style lane needs to be green before merge
  • Severity: Medium

Test Stability Assessment

  • Rating: Stable
  • Basis:
    • detail-metrics.spec.ts is a pure-function test with no timing, network, or container dependency
    • I did not see any new flaky-test pattern in the diff

Merge Conclusion

Conclusion: can merge after fixes

  1. Blocking items
  • Issue 1: please get Run / Code style green first.
  1. Non-blocking suggestion
  • If more metric-key variants appear later, I would keep expanding the centralized helper instead of spreading string rules back into the page component.

Overall, I do not see a code-level blocker in the compatibility logic itself. The remaining gate is CI, not the direction of the implementation.

@davidzollo
Copy link
Copy Markdown
Contributor Author

Happy to see #10965 , I will close my PR

@davidzollo davidzollo closed this May 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug][UI] Job detail metrics render as NaN against 2.3.13 backend

3 participants