[Dev] Numerical fix for moe single grouped weight with fp8 fp4 primary weight and grad norm spikes #5464

Open
zhongbozhu wants to merge 19 commits into NVIDIA:dev from zhongbozhu:dev_fix_single_weight

fix grouped tensor remap bug, improve UT

812f72d
DCO / DCO succeeded Jun 30, 2026 in 1s

DCO

All commits are signed off!