@royren622
Contributor

Summary: Accelerate permute_1D_data_kernel by about 3x using vectorization, from 18.20 ms to 6.1616 ms (see benchmark below).

Reviewed By: arsatis

Differential Revision: D86492866
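
For readers unfamiliar with the technique, the following is a minimal CPU-side C++ sketch of the idea behind a vectorized segment copy. It is not the code in this diff. The function names (copy_segment_vec4, permute_1d_data), the float element type, the 4-element chunk width, and the assumed semantics (output segment i is input segment permute[i]) are all illustrative assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: copy n floats, moving 4 elements (16 bytes) per step
// where possible and finishing with a scalar loop for the remaining tail.
inline void copy_segment_vec4(const float* src, float* dst, std::size_t n) {
  std::size_t i = 0;
  for (; i + 4 <= n; i += 4) {
    // One 16-byte move; memcpy avoids any alignment requirement on the host.
    std::memcpy(dst + i, src + i, 4 * sizeof(float));
  }
  for (; i < n; ++i) {
    dst[i] = src[i];  // scalar tail for the last n % 4 elements
  }
}

// Assumed reference semantics: `data` is a concatenation of segments whose
// sizes are given by `lengths`; output segment i is input segment permute[i].
std::vector<float> permute_1d_data(const std::vector<std::int32_t>& permute,
                                   const std::vector<std::int32_t>& lengths,
                                   const std::vector<float>& data) {
  // Exclusive prefix sum of the input lengths gives each segment's start.
  std::vector<std::size_t> in_offsets(lengths.size() + 1, 0);
  for (std::size_t i = 0; i < lengths.size(); ++i) {
    in_offsets[i + 1] = in_offsets[i] + static_cast<std::size_t>(lengths[i]);
  }

  // Total output size is the sum of the permuted segment lengths.
  std::size_t total = 0;
  for (std::int32_t p : permute) {
    total += static_cast<std::size_t>(lengths[p]);
  }

  std::vector<float> out(total);
  std::size_t out_offset = 0;
  for (std::int32_t p : permute) {
    const std::size_t n = static_cast<std::size_t>(lengths[p]);
    copy_segment_vec4(data.data() + in_offsets[p], out.data() + out_offset, n);
    out_offset += n;
  }
  return out;
}
```

On a GPU the wide move would typically be a 16-byte load/store such as float4, which requires 16-byte-aligned addresses, so a real kernel also needs an alignment check or a scalar prologue. The memcpy in this sketch sidesteps that concern.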

@netlify

netlify bot commented Nov 10, 2025

Deploy Preview for pytorch-fbgemm-docs failed.

Name Link
🔨 Latest commit f4d797a
🔍 Latest deploy log https://app.netlify.com/projects/pytorch-fbgemm-docs/deploys/69126eac44b8dd000850ac8e

meta-cla bot added the "cla signed" label Nov 10, 2025
royren622 added a commit to royren622/FBGEMM that referenced this pull request Nov 10, 2025
Summary:

Accelerate permute_1D_data_kernel by about 3x using vectorization, from 18.20 ms to 6.1616 ms (see benchmark below).

Reviewed By: arsatis

Differential Revision: D86492866
royren622 added a commit to royren622/FBGEMM that referenced this pull request Nov 18, 2025
Summary:
X-link: facebookresearch/FBGEMM#2115


Accelerate permute_1D_data_kernel by about 3x using vectorization, from 18.20 ms to 6.1616 ms (see benchmark below).

Reviewed By: arsatis, francomomo

Differential Revision: D86492866