pytorch / FBGEMM Public

Fork 684
Star 1.5k

Pull requests: pytorch/FBGEMM

Labels 46 Milestones 0

587 Open 4,335 Closed

Pull requests list

Update to batch processing delta update for kvzch and fix table idx issue. cla signed fb-exported meta-exported

#5148 opened Nov 18, 2025 by EddyLXJ

shortcut for merge_pooled_embedding cla signed fb-exported meta-exported

#5147 opened Nov 18, 2025 by garroud

Add support rowwise_adagrad_with_counter on CPU cla signed fb-exported meta-exported

#5146 opened Nov 18, 2025 by gchalump

[fbgemm_gpu] Enable CUDA 13 builds in OSS cla signed

#5143 opened Nov 17, 2025 by q10

Adding support for top_k=8 to index_shuffling moe kernel cla signed fb-exported meta-exported

#5142 opened Nov 17, 2025 by metastableB

cutlass-fa3 new mask interface cla signed fb-exported meta-exported

#5141 opened Nov 17, 2025 by arsatis

Bump setuptools from 75.1.0 to 78.1.1 in /fbgemm_gpu cla signed dependencies python

#5139 opened Nov 16, 2025 by dependabot bot

Updated asmjit & adapted to its latest changes cla signed

#5137 opened Nov 15, 2025 by kobalicek

Slightly improve requantize_ AVX2 code performance cla signed fb-exported meta-exported

#5135 opened Nov 14, 2025 by mcfi

Vectorize requantize_ for Arm64 with NEON intrinsics cla signed fb-exported meta-exported

#5130 opened Nov 14, 2025 by mcfi

Allow specifying the use of persistent kernel cla signed fb-exported meta-exported

#5129 opened Nov 14, 2025 by peiying779

Extend raw_id_tracker to track ShardedManagedCollisionEmbeddingCollection cla signed fb-exported meta-exported

#5128 opened Nov 14, 2025 by FriedCosey

Enable arm64 convolution for fbgemm through the reference convolution APIs cla signed fb-exported meta-exported

#5126 opened Nov 13, 2025 by mcfi

minimize gpuAtomicAdd overhead in bounds_check_indices_kernel_v2 cla signed

#5124 opened Nov 13, 2025 by liligwu

Backward optimization for group_index_select_or_add_2d_kernel cla signed

#5123 opened Nov 13, 2025 by shbiswas834

backward performance optimization for MI350 (#4925) cla signed fb-exported meta-exported module: rocm

#5121 opened Nov 12, 2025 by spcyppt

embedding forward optimization for rocm cla signed module: rocm

#5120 opened Nov 12, 2025 by JaxChen29

add group_index_select_or_add_2d_kernel optimizations cla signed module: rocm

#5119 opened Nov 12, 2025 by shbiswas834 • Draft

Add NEON implementation of FloatOrHalfToFusedNBitRowwiseQuantizedSBHalf cla signed fb-exported meta-exported

#5115 opened Nov 11, 2025 by Nicoshev

Add support of 64 headDim cla signed fb-exported meta-exported

#5114 opened Nov 11, 2025 by Aya-ZIbra

upgrade cutlass cla signed fb-exported meta-exported

#5112 opened Nov 11, 2025 by jianyuh

accelerate permute_1D_data_kernel cla signed fb-exported meta-exported

#5110 opened Nov 10, 2025 by royren622

Bm/cuda 13e cla signed

#5090 opened Nov 5, 2025 by q10

Fix NAN for the prediction (#2096) cla signed fb-exported meta-exported

#5088 opened Nov 4, 2025 by quhang

Several kernel optimization from aiter team cla signed module: rocm

#5074 opened Oct 31, 2025 by Bernard-Liu • Draft
