Pull requests: NVIDIA/Megatron-LM


Pull requests list

bump emerging optimizers to v0.3.0 [deepseekv4]
#5321 opened Jun 12, 2026 by FDecaYed Contributor Draft
6 tasks
[dev] bump emerging optimizers to v0.3.0
#5320 opened Jun 12, 2026 by FDecaYed Contributor
6 tasks
Shifang/mtp loss split
#5318 opened Jun 12, 2026 by shifangx Contributor
5 tasks
Enable dbias_dprob triton kernel in TE GroupedLinear
#5315 opened Jun 12, 2026 by vasunvidia Contributor Draft
6 tasks
chore: nightly sync main into dev (12_06_2026) [Run functional tests] [Run MBridge tests]
#5314 opened Jun 12, 2026 by svcnvidia-nemo-ci Draft
Account for reasoning token stripping [complexity: low] [Final Review]
#5313 opened Jun 12, 2026 by tdene Contributor
1 of 6 tasks
Add support for non-Gym multi-turn environments
#5312 opened Jun 12, 2026 by tdene Contributor Draft
6 tasks
Update SFT dataset and loss calculation
#5311 opened Jun 12, 2026 by parthmannan Contributor Draft
1 of 6 tasks
Support fused MLA QKV checkpoint reload [complexity: low]
#5310 opened Jun 11, 2026 by sraman-rgb Contributor
6 tasks
Add --mamba-training-ssm-states-dtype argument
#5309 opened Jun 11, 2026 by tdene Contributor Draft
6 tasks
Add RL rollout submission and consumption granularity controls
#5306 opened Jun 11, 2026 by lauradang
6 tasks done
Fix crash due to tool call at sequence length [complexity: low]
#5302 opened Jun 11, 2026 by tdene Contributor
1 of 6 tasks
Allow for pre-bound socket to be passed in server [complexity: low] [Final Review]
#5301 opened Jun 11, 2026 by tdene Contributor
1 of 6 tasks
Handle None values in sampling parameters [complexity: low] [Final Review]
#5300 opened Jun 11, 2026 by tdene Contributor
1 of 6 tasks
Avoid X11 master port default [complexity: low]
#5299 opened Jun 11, 2026 by guihong-nv Contributor
Add code owners for optimizer-related files [complexity: low]
#5297 opened Jun 11, 2026 by janEbert Contributor
1 task done
feat(fusions): fused mRoPE for Qwen3.5-VL [complexity: high]
#5294 opened Jun 11, 2026 by wplf Member