Improve Nemotron3 Super B200 BF16 Config #4474
Closed
zuriz-nv wants to merge 1 commit into
Contributor
LGTM. The refactoring is correct. BF16 is intentionally changed to match the GB200 config (tp=2, CUDA graphs enabled, no recompute). FP8_MX and NVFP4 keep identical effective configs. Removing expert_model_parallel_size=64 from BASE_B200 is correct because BASE_NEMOTRON_3_SUPER_CONFIG already sets it to 64. Suggested test cases: test_nemotron_3_super_perf_config_instantiation and test_nemotron_3_super_perf_config_nvfp4. Both already exist, but they only exercise GB300 and do not cover B200. No perf tests are impacted.
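The redundancy argument in the comment above (an override that restates a value the base already sets can be dropped) can be checked mechanically. The following is a minimal sketch, assuming the configs can be viewed as plain dictionaries; the helper `redundant_overrides` and the key name `tensor_model_parallel_size` are hypothetical and are not taken from the repository.

```python
def redundant_overrides(base: dict, overrides: dict) -> list[str]:
    """Return the override keys whose value equals what the base already sets.

    Hypothetical helper for illustration; the real configs may not be dicts.
    """
    return sorted(
        key for key, value in overrides.items()
        if key in base and base[key] == value
    )


# Illustrative stand-ins for the base config and a platform-level override set.
base_config = {"expert_model_parallel_size": 64}
b200_overrides = {
    "expert_model_parallel_size": 64,   # restates the base value, so it is redundant
    "tensor_model_parallel_size": 2,    # assumed key name for "tp=2"
}

redundant = redundant_overrides(base_config, b200_overrides)
```

A check like this could back a B200-specific test, since the two existing tests named above reportedly exercise only GB300.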
fec1633 to b2d5de5
malay-nagda (Contributor) requested changes, Jun 24, 2026
@zuriz-nv can you post the TFLOPs and step time you get for all 3 precisions with these changes?
Signed-off-by: Zuri Zheng <zuriz@nvidia.com>
b2d5de5 to 8a39cf8
Author
Closing this PR as it has been superseded by PR #4621.
Update Nemotron3 Super B200 BF16 Config to use GB200 Config
NEMOTRON_3_SUPER_PRETRAIN_CONFIG_B200_NVFP4_V1 already follows BASE_NEMOTRON_3_SUPER_CONFIG_GB200, so changing BASE_NEMOTRON_3_SUPER_CONFIG_B200 lets BF16 and NVFP4 use it directly; only NEMOTRON_3_SUPER_PRETRAIN_CONFIG_B200_FP8_MX_V1 needs to change. BASE_NEMOTRON_3_SUPER_CONFIG_B200's expert_model_parallel_size=64 is removed because it is already set in BASE_NEMOTRON_3_SUPER_CONFIG.
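The layering described above can be sketched as follows. This is an illustration only: `PerfConfig`, its field names, and its default values are hypothetical placeholders, and the real config classes in the repository may look quite different. The values tp=2, CUDA graphs enabled, no recompute, and expert_model_parallel_size=64 are the ones quoted in this PR's review.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PerfConfig:
    # Hypothetical stand-in for the real config type; defaults are placeholders.
    tensor_model_parallel_size: int = 1
    expert_model_parallel_size: int = 1
    cuda_graphs: bool = False
    recompute: bool = True


# Shared base: already sets expert parallelism to 64.
BASE = PerfConfig(expert_model_parallel_size=64)

# GB200-style settings as described in the review comment.
BASE_GB200 = replace(
    BASE, tensor_model_parallel_size=2, cuda_graphs=True, recompute=False
)

# B200 base aligned with GB200, with and without restating the inherited value.
B200_WITH_OVERRIDE = replace(BASE_GB200, expert_model_parallel_size=64)
B200_WITHOUT_OVERRIDE = replace(BASE_GB200)

# Both variants are equal, which is why the explicit override can be dropped.
same = B200_WITH_OVERRIDE == B200_WITHOUT_OVERRIDE
```

Because the derived config inherits every field it does not override, dropping the repeated expert_model_parallel_size=64 leaves the effective config unchanged in this sketch.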