RaymondLi0
Collaborator

No description provided.

@RaymondLi0 RaymondLi0 changed the base branch from multi-query-attention to before-merge June 20, 2023 20:12
@RaymondLi0 RaymondLi0 changed the base branch from before-merge to multi-query-attention June 20, 2023 20:12
ko3n1g and others added 28 commits August 16, 2025 11:36
ci: Add copy-pr-bot

See merge request ADLR/megatron-lm!3829
…model grads refactor

Co-authored-by: Pranav Prashant Thombre <[email protected]>
Co-authored-by: yaoyu-33 <[email protected]>
Co-authored-by: Mcore Bot <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: Hongbin Liu <[email protected]>
Co-authored-by: Yu Yao <[email protected]>
M4 p2p communication, schedules and finalize model grads refactor

See merge request ADLR/megatron-lm!3378
feat(moe): Add MoE router fusion

See merge request ADLR/megatron-lm!3809
Apex.contrib.nccl_allocator migration

See merge request ADLR/megatron-lm!3814
Signed-off-by: oliver könig <[email protected]>
Signed-off-by: oliver könig <[email protected]>
perf(MoE): Support recomputation for FP8 layernorm/moe_act/shared_experts

See merge request ADLR/megatron-lm!3465
…rallel inference

Co-authored-by: Mcore Bot <[email protected]>
Co-authored-by: Oliver Koenig <[email protected]>
Co-authored-by: Youngeun Kwon <[email protected]>
Co-authored-by: Helen Ngo <[email protected]>
Co-authored-by: Shifang Xu <[email protected]>
Co-authored-by: James Shen <[email protected]>
Co-authored-by: Kunlun Li <[email protected]>
Co-authored-by: Slawek Kierat <[email protected]>
Co-authored-by: Zijie Yan <[email protected]>
Co-authored-by: Li Tao <[email protected]>
Co-authored-by: Mikolaj Blaz <[email protected]>
Co-authored-by: Charlie Truong <[email protected]>
Co-authored-by: Dong Hyuk Chang <[email protected]>
Co-authored-by: Chenjie Luo <[email protected]>
ZMQ based communication of requests during parallel inference

See merge request ADLR/megatron-lm!3757
Add is_cg_capturable flag to CrossEntropyLoss to support full CUDA graph capture

See merge request ADLR/megatron-lm!3815
…ndently installable

Co-authored-by: jianbinc <[email protected]>
Co-authored-by: Youngeun Kwon <[email protected]>
Co-authored-by: Cory Ye <[email protected]>
Co-authored-by: Boxiang Wang <[email protected]>
[FSDP] Decouple Custom FSDP to make it independently installable

See merge request ADLR/megatron-lm!3443
…ation gives an error about missing eval_iters
This fixes the bug where not using full_validation gives an error about missing eval_iters

See merge request ADLR/megatron-lm!3842
Fix cuda graph when VPP is used

See merge request ADLR/megatron-lm!3824
chore: Upgrade dependencies (2025-08-18)

See merge request ADLR/megatron-lm!3834
gdengk and others added 30 commits September 19, 2025 11:31
… independently parallel modules.

Co-authored-by: Mcore Bot <[email protected]>
Co-authored-by: Pranav Prashant Thombre <[email protected]>
Author: Robin Zhang <[email protected]>
Signed-off-by: oliver könig <[email protected]>
…gradient existence assertion to fully_shard tests.
… engine case with decode-only graphs"

This reverts commit 4cf968c.