Commit 278e302
committed: 3 fixes from team
1 parent: cc0e47e

File tree: 1 file changed (+3 -3 lines)


Renamed: _posts/2024-09-25-pytorch-native-architecture-optimizaion.md → _posts/2024-09-25-pytorch-native-architecture-optimization.md

Lines changed: 3 additions & 3 deletions
@@ -4,7 +4,7 @@ title: "CUDA-Free Inference for LLMs"
 author: Team PyTorch
 ---
 
-# PyTorch Native Architecture Optimization: torchao
+# PyTorch Native Architecture Optimization: torchao
 
 By Team PyTorch
 
@@ -63,7 +63,7 @@ from torchao.quantization import (
 
 ![](/assets/images/Figure_1.png){:style="width:100%"}
 
-We also have extensive benchmarks on diffusion models in collaboration with the HuggingFace diffusers team in [diffusers-torchao](https://github.com/sayakpaul/diffusers-torchao.) where we demonstrated 53.88% speedup on Flux.1-Dev and 27.33% speedup on CogVideoX-5b
+We also have extensive benchmarks on diffusion models in collaboration with the HuggingFace diffusers team in [diffusers-torchao](https://github.com/sayakpaul/diffusers-torchao) where we demonstrated 53.88% speedup on Flux.1-Dev and 27.33% speedup on CogVideoX-5b
 
 Our APIs are composable so we’ve for example composed sparsity and quantization to bring 5% [speedup for ViT-H inference](https://github.com/pytorch/ao/tree/main/torchao/sparsity)
 
@@ -119,7 +119,7 @@ We’ve been actively working on making sure torchao works well in some of the m
 
 #
 
-# Conclusion
+## Conclusion
 
 If you’re interested in making your models faster and smaller for training or inference, we hope you’ll find torchao useful and easy to integrate.
 