Commit b195f11
(wip) gemlite integration and llama batchsize>1
Summary:
compile isn't working with gemlite yet; the kernel wrapper probably needs to be
rewritten in a more compatible way. Added batch size > 1 to the llama model.
See benchmark_results.txt for numbers.
new gemlite integration using pip install
tests ran
fixing gemlite to do int4 matmul instead of fp16 fp16
running tests
more testing
AQT integration wip
Wip
testing on gemlite a100_int8_tuning branch
gemlite subclass testing bitpacking 8 bits
bug fixing stuff
hicham fixes
new benchmarks
testing gemlite 8 bit
WIP
1 parent 039cef4 commit b195f11
File tree
9 files changed, +900 -149 lines changed
- torchao
  - _models/llama
  - dtypes
    - uintx
  - quantization
Large diffs are not rendered by default.