[build] fix: restore ModelOpt-compatible TransformerEngine revision by yaoyu-33 · Pull Request #4615 · NVIDIA-NeMo/Megatron-Bridge

yaoyu-33 · 2026-07-01T15:12:42Z

Summary

Mirrors the automated release bump from #4613 and restores the TransformerEngine override to the ModelOpt-compatible d64bc14dc87eb658ab98839e4b7687595ee53e2d revision.

Target: release-r0.5.0 (base branch r0.5.0)
Classification: Bridge broke itself
Guards: none added or removed

Root cause

On 2026-06-26, release PR #4535 updated the Bridge override to TransformerEngine b9d690e0 while retaining nvidia-modelopt==0.44.0rc5. That TransformerEngine revision passes m_splits as an explicit grouped-linear argument, but ModelOpt 0.44.0rc5 still reads the first non_tensor_args item as the split sequence, so it calls len() on a bool and raises TypeError: object of type 'bool' has no len().

The 2026-07-01 automated MCore bump #4613 exposes the already-present incompatibility in both H100 and GB200 Qwen3 MoE quantization jobs.
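The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the actual TransformerEngine or ModelOpt code: `legacy_reader`, `old_layout`, and `new_layout` are hypothetical names, and the tuple contents are assumed purely for illustration. It only shows why reading item 0 as the split sequence breaks once a boolean occupies that slot.

```python
# Hypothetical stand-ins; not the real TransformerEngine or ModelOpt code.


def legacy_reader(non_tensor_args):
    """Mimic a consumer that assumes the split sequence is the first item."""
    m_splits = non_tensor_args[0]
    # len() only works if item 0 is a sized sequence such as a list of splits.
    return len(m_splits)


# Assumed older layout: the split sequence is packed as the first item.
old_layout = ([2, 3, 5], True)

# Assumed newer layout: splits are passed separately, so a flag comes first.
new_layout = (True, False)

print(legacy_reader(old_layout))  # prints 3

try:
    legacy_reader(new_layout)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)  # prints: object of type 'bool' has no len()
```

The message matches the error quoted in the root cause, which is consistent with a positional shift in the arguments rather than a bad value coming from the model configuration.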

Fix

Pin the TransformerEngine override to d64bc14dc87eb658ab98839e4b7687595ee53e2d, which predates the incompatible grouped-linear argument change.
Regenerate the corresponding TransformerEngine entries in uv.lock.
Leave the automated MCore pointer at the exact commit from #4613 (chore(beep boop 🤖): Bump uv.lock (r0.5.0, mcore-core_r0.18.0) (2026-07-01)): 458c8d0ecafdf6d9e36771600d62ade27f2a67b7.

This is the release-line counterpart of #4600, whose H100 and GB200 Qwen quantization jobs both pass with the same TransformerEngine revision.

Validation

uv run pre-commit run --all-files — passed on 2026-07-01.
CW interactive job 13313466 on 2026-07-01:
- uv lock --check — passed (344 packages).
- NVTE_CUDA_ARCHS=90 uv sync --locked --group dev --group test --extra te — passed and installed TransformerEngine 2.16.0+d64bc14d.
- uv run --no-sync python -m pytest tests/unit_tests/models/gpt/test_gpt_builder.py tests/unit_tests/models/test_gpt_provider.py -v — 81 passed.
- Installed grouped-linear compatibility smoke — passed; signature is (ctx, inp, non_tensor_args, *weights_and_biases).
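The last two checks in this list (installed revision and grouped-linear signature) could be scripted roughly as follows. This is a sketch under stated assumptions: `param_names`, `local_revision`, and `fake_forward` are hypothetical helpers, and `fake_forward` merely stands in for the installed grouped-linear forward function, whose import path is not given in this PR. Only the version string and the signature are taken from the validation notes.

```python
import inspect


def param_names(func):
    """Return the parameter names of a callable, in declaration order."""
    return [p.name for p in inspect.signature(func).parameters.values()]


def local_revision(version):
    """Return the local label after '+' in a version string, or None."""
    _, sep, local = version.partition("+")
    return local if sep else None


# Hypothetical stand-in carrying the signature reported in the validation notes.
def fake_forward(ctx, inp, non_tensor_args, *weights_and_biases):
    return None


EXPECTED_PARAMS = ["ctx", "inp", "non_tensor_args", "weights_and_biases"]
EXPECTED_COMMIT = "d64bc14dc87eb658ab98839e4b7687595ee53e2d"

# Signature smoke: the ModelOpt-compatible layout has no explicit m_splits.
assert param_names(fake_forward) == EXPECTED_PARAMS

# Revision smoke: the local label should be a prefix of the pinned commit.
revision = local_revision("2.16.0+d64bc14d")
assert revision is not None and EXPECTED_COMMIT.startswith(revision)
print("compatibility smoke passed")
```

In a real environment the stand-in would be replaced by the installed function, and the version string would come from `importlib.metadata.version` with the distribution name used by the lockfile. Neither name is stated in this PR, so both are left out of the sketch.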

copy-pr-bot · 2026-07-01T15:12:47Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

yaoyu-33 · 2026-07-01T15:12:57Z

/ok to test 7ea9c2f

claude · 2026-07-01T15:14:32Z

LGTM — clean, surgical dependency fix.

The PR correctly reverts the TE override to the ModelOpt-compatible revision (d64bc14d) while keeping the automated MCore bump from #4613. The lockfile delta is consistent: only the intended TE change plus expected transitive movements (e.g. ast-serialize 0.5.0→0.6.0, bracex 2.6→3.0) from floating CVE floors.

The full-test-suite label is applied, which is appropriate for a TE+MCore bump on a release branch.

Suggested test cases: No perf tests impacted.

dimapihtar and others added 2 commits July 1, 2026 06:46

chore(beep boop 🤖): Bump uv.lock (r0.5.0, mcore-core_r0.18.0) (2026…

c719d5e

…-07-01) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

fix: restore ModelOpt-compatible TransformerEngine revision

7ea9c2f

Signed-off-by: Yu Yao <yaoyu.094@gmail.com>

yaoyu-33 requested a review from a team as a code owner July 1, 2026 15:12

yaoyu-33 mentioned this pull request Jul 1, 2026

chore(beep boop 🤖): Bump uv.lock (r0.5.0, mcore-core_r0.18.0) (2026-07-01) #4613

Closed

copy-pr-bot Bot temporarily deployed to public July 1, 2026 15:13 Inactive

yaoyu-33 added the full-test-suite label Jul 1, 2026

copy-pr-bot Bot temporarily deployed to test July 1, 2026 15:13 Inactive

copy-pr-bot Bot temporarily deployed to public July 1, 2026 15:23 Inactive

copy-pr-bot Bot temporarily deployed to public July 1, 2026 15:44 Inactive

yaoyu-33 closed this Jul 1, 2026

yaoyu-33 mentioned this pull request Jul 2, 2026

chore(beep boop 🤖): Bump uv.lock (r0.5.0, mcore-core_r0.18.0) (2026-07-02) #4627

Open

yaoyu-33 wants to merge 2 commits into r0.5.0 from yuya/mcore-release-r0.5.0-autofix-20260701-pr4613
