fix(voice): skip end-of-turn metrics on stale/out-of-order speaking anchor by anshulkulhari7 · Pull Request #6098 · livekit/agents

anshulkulhari7 · 2026-06-14T13:01:24Z

Summary

For some user turns, the reported transcription_delay / end_of_turn_delay metrics on ChatMessage are extremely large (often >200s) even though the session recordings show no real delay, and stopped_speaking_at can precede started_speaking_at. In other turns the fields are missing entirely.

Root cause

The metrics are computed in _bounce_eou_task (audio_recognition.py) from three captured anchors:

started_speaking_at  = speech_start_time
stopped_speaking_at  = last_speaking_time           # the internal _last_speaking_time anchor
transcription_delay  = max(last_final_transcript_time - last_speaking_time, 0)
end_of_turn_delay    = time.time() - last_speaking_time

The guard around this block only checked that the three values are not None. When the turn detector commits a user turn whose _last_speaking_time was never refreshed for that segment — e.g. consecutive same-role turns split from one continuous utterance, with no VAD speech-stop/start cycle between them — the anchor is left over from an earlier point in the session and can predate the start of the current turn.

In that case the not-None guard still passes, so end_of_turn_delay = now - last_speaking_time becomes ~200s and stopped_speaking_at ends up before started_speaking_at, exactly the payload reported in the issue.
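The failure mode can be reproduced with plain numbers. The following is a minimal sketch, not the code from audio_recognition.py: the function name `naive_end_of_turn_delay`, the explicit `now` parameter, and all timestamps are assumptions made up for illustration. It only shows that a guard checking for None lets a stale anchor through.

```python
from typing import Optional


def naive_end_of_turn_delay(
    speech_start_time: Optional[float],
    last_speaking_time: Optional[float],
    last_final_transcript_time: Optional[float],
    now: float,
) -> Optional[float]:
    """Hypothetical stand-in for the old logic: checks presence, not ordering."""
    if (
        speech_start_time is None
        or last_speaking_time is None
        or last_final_transcript_time is None
    ):
        return None
    # No check that last_speaking_time belongs to the current turn.
    return now - last_speaking_time


# Made-up timestamps: the anchor is left over from 220 s before this turn started.
turn_start = 1000.0
stale_anchor = turn_start - 220.0
final_transcript = turn_start + 1.2
now = turn_start + 1.5

delay = naive_end_of_turn_delay(turn_start, stale_anchor, final_transcript, now)
print(delay)                      # 221.5, despite no real delay in the turn
print(stale_anchor < turn_start)  # True: "stopped speaking" precedes "started speaking"
```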

This is the same class of bug noted in #2361 / #5669 / #4388 (stale/0 anchor), now manifesting as an out-of-order anchor on adjacent turns within one long utterance.

Fix

An anchor that predates the start of the turn (last_speaking_time < speech_start_time) is logically impossible — you cannot stop speaking before the turn started. The existing code already has a policy for unreliable timing (see the in-code comment): skip the calculation and report the metrics as None, because that is better than emitting a likely-wrong value. This change extends that same policy to the out-of-order case.

The computation is extracted into a small pure helper, _compute_end_of_turn_metrics, which:

returns None for all four metrics when any anchor is missing or when last_speaking_time < speech_start_time (stale/out-of-order), and
otherwise returns the same values as before (with end_of_turn_delay now clamped to >= 0, consistent with the existing transcription_delay clamp).

This makes the behaviour directly unit-testable without audio/STT/VAD.
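As a rough illustration of the described behaviour, a helper of this shape could look like the sketch below. It is not the PR's actual `_compute_end_of_turn_metrics`: the names `EndOfTurnMetricsSketch` and `compute_end_of_turn_metrics_sketch`, the signature, and the injectable `now` argument are assumptions chosen so the example runs on its own.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class EndOfTurnMetricsSketch:
    # All four metrics default to None, i.e. "calculation skipped".
    started_speaking_at: Optional[float] = None
    stopped_speaking_at: Optional[float] = None
    transcription_delay: Optional[float] = None
    end_of_turn_delay: Optional[float] = None


def compute_end_of_turn_metrics_sketch(
    speech_start_time: Optional[float],
    last_speaking_time: Optional[float],
    last_final_transcript_time: Optional[float],
    now: Optional[float] = None,  # assumption: injectable clock for testing
) -> EndOfTurnMetricsSketch:
    if now is None:
        now = time.time()
    # Missing anchor: skip the calculation.
    if (
        speech_start_time is None
        or last_speaking_time is None
        or last_final_transcript_time is None
    ):
        return EndOfTurnMetricsSketch()
    # Stale/out-of-order anchor: speaking cannot stop before the turn started.
    if last_speaking_time < speech_start_time:
        return EndOfTurnMetricsSketch()
    return EndOfTurnMetricsSketch(
        started_speaking_at=speech_start_time,
        stopped_speaking_at=last_speaking_time,
        transcription_delay=max(last_final_transcript_time - last_speaking_time, 0.0),
        end_of_turn_delay=max(now - last_speaking_time, 0.0),
    )


# A well-ordered turn keeps its values; a stale anchor yields all-None metrics.
ok = compute_end_of_turn_metrics_sketch(100.0, 102.0, 102.25, now=102.5)
stale = compute_end_of_turn_metrics_sketch(1000.0, 780.0, 1001.2, now=1001.5)
print(ok.end_of_turn_delay, stale.end_of_turn_delay)
```

Because the function depends only on its arguments, the ordering guard can be checked with crafted timestamps and no audio pipeline.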

Testing

New unit test module tests/test_end_of_turn_metrics.py exercises the pure helper with crafted timestamps (no audio):

test_normal_turn_produces_small_bounded_delays — well-ordered turn yields the expected sub-second delays.
test_stale_anchor_predating_turn_start_is_skipped — regression for this issue, using the exact ~220s numbers from the reported payload; all four metrics must be None.
test_anchor_equal_to_start_is_accepted — boundary (last_speaking_time == speech_start_time) stays valid.
test_missing_anchor_is_skipped — any missing anchor skips the calculation.

$ uv run pytest tests/test_end_of_turn_metrics.py --unit -q
......                                                                   [100%]
6 passed in 0.02s

Confirmed RED before the fix (reverting the ordering guard): test_stale_anchor_predating_turn_start_is_skipped failed with started_speaking_at=1781342804.815377, end_of_turn_delay=220.28458189964294 — i.e. the bogus >200s value. The existing tests/test_speech_start_time_persistence.py still passes.

ruff check, ruff format --check, and mypy are clean on the changed files.

AI disclosure

This change was AI-assisted; all logic, tests, and verification were reviewed by the author.

…nchor

When the turn detector commits a user turn whose _last_speaking_time anchor was never refreshed for that segment (e.g. consecutive same-role turns split from one continuous utterance), the anchor can be left over from an earlier point in the session and predate the start of the current turn. The metric computation only guarded against None values, so it still produced transcription_delay / end_of_turn_delay on the order of hundreds of seconds and a stopped_speaking_at that precedes started_speaking_at.

Treat an out-of-order anchor (last_speaking_time < speech_start_time) the same as unreliable VAD timing: skip the calculation and report the metrics as None rather than emitting a likely-wrong value. Extract the computation into a pure _compute_end_of_turn_metrics helper and add unit tests covering the normal, boundary, stale-anchor, and missing-anchor cases.

Fixes livekit#6093

CLAassistant · 2026-06-14T13:01:30Z

All committers have signed the CLA.

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no bugs or issues to report.

chenghao-mou

Thanks for the PR! I have one small comment.

chenghao-mou · 2026-06-14T20:06:31Z



+@dataclass
+class _EndOfTurnMetrics:


since _EndOfTurnInfo is internal, I would go one step further to replace the four variables in _EndOfTurnInfo with this directly so we don't have to unpack or pass around those values individually.

anshulkulhari7 · Jun 15, 2026

Done: folded the four metric fields on _EndOfTurnInfo into a single metrics: _EndOfTurnMetrics field. The computed object is now passed straight through; _user_turn_completed_task, _init_metrics_from_end_of_turn, and the turn span read info.metrics.*. mypy strict (593 files) and the unit tests pass.
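The shape of that refactor might look roughly like the sketch below. The class names carry a `Sketch` suffix because they are stand-ins, not the real _EndOfTurnInfo / _EndOfTurnMetrics definitions; the `new_transcript` field and the `describe_turn` reader are hypothetical and exist only to show a consumer going through `info.metrics.*`.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EndOfTurnMetricsSketch:
    started_speaking_at: Optional[float] = None
    stopped_speaking_at: Optional[float] = None
    transcription_delay: Optional[float] = None
    end_of_turn_delay: Optional[float] = None


@dataclass
class EndOfTurnInfoSketch:
    new_transcript: str  # hypothetical non-metric field
    # One nested object instead of four duplicated metric fields.
    metrics: EndOfTurnMetricsSketch = field(default_factory=EndOfTurnMetricsSketch)


def describe_turn(info: EndOfTurnInfoSketch) -> str:
    """Hypothetical reader: accesses metrics via info.metrics, no unpacking."""
    delay = info.metrics.end_of_turn_delay
    if delay is None:
        return "end_of_turn_delay unavailable"
    return f"end_of_turn_delay={delay:.3f}s"


skipped = EndOfTurnInfoSketch(new_transcript="hello")
measured = EndOfTurnInfoSketch(
    new_transcript="hello",
    metrics=EndOfTurnMetricsSketch(
        started_speaking_at=100.0,
        stopped_speaking_at=102.0,
        transcription_delay=0.25,
        end_of_turn_delay=0.5,
    ),
)
print(describe_turn(skipped))
print(describe_turn(measured))
```

With this layout, whatever the helper returns can be stored on the info object as-is, so a skipped calculation and a measured one travel through the same field.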

… duplicated fields

Per review on livekit#6098: _EndOfTurnInfo (internal) carried the same four metric fields as _EndOfTurnMetrics. Replace them with a single metrics field so the computed value is passed through directly instead of unpacked and repacked. Readers (_user_turn_completed_task, _init_metrics_from_end_of_turn) and the turn span now read info.metrics.*.

anshulkulhari7 requested a review from a team as a code owner June 14, 2026 13:01

anshulkulhari7 wants to merge 2 commits into livekit:main from anshulkulhari7:fix/eou-metrics-stale-speaking-anchor
