Commit 620b4aa
kv-cache : avoid modifying recurrent cells when setting inputs (ggml-org#13834)
* kv-cache : avoid modifying recurrent cells when setting inputs
* kv-cache : remove inp_s_mask
It was replaced with equivalent and simpler functionality
with rs_z (the first zeroed state) and the already-existing inp_s_copy.
* kv-cache : fix non-consecutive token pos warning for recurrent models
The problem was apparently caused by how the tail cells were swapped.
* graph : simplify logic for recurrent state copies
* kv-cache : use cell without src refs for rs_z in recurrent cache
* llama-graph : fix recurrent state copy
The `state_copy` shuffle assumes everything is moved at once,
which is not true when `states_extra` is copied back to the cache
before copying the range of states between `head` and `head + n_seqs`.
This is only a problem if any of the cells in [`head`, `head + n_seqs`)
have an `src` in [`head + n_seqs`, `head + n_kv`),
which does happen when `n_ubatch > 1` in the `llama-parallel` example.
Changing the order of the operations avoids the potential overwrite
before use, although when copies are avoided (like with Mamba2),
this will require further changes.
* llama-graph : rename n_state to state_size in build_recurrent_state
This naming should reduce confusion between the state size
and the number of states.
1 parent 144209a commit 620b4aa
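The overwrite-before-use hazard described in the `llama-graph : fix recurrent state copy` item can be illustrated with a small standalone sketch. This is not the code from this commit: each state is reduced to a single `int`, `src` plays the role of the `state_copy` indices (assumed here to be absolute cell indices), and the function names `gather_after_writeback` and `gather_before_writeback` are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Toy model of a recurrent cache: one int per cell instead of a state tensor.
// src[i] is the cell that cell i takes its state from (the role of `state_copy`).
// Cells in [head, head + n_seqs) are gathered for use by the graph; the "extra"
// cells in [head + n_seqs, head + n_kv) are copied back into the cache.

// Problematic order: write the extra states back first, then gather the used range.
std::vector<int> gather_after_writeback(std::vector<int> & cache, const std::vector<int> & src,
                                        int head, int n_seqs, int n_kv) {
    assert(head + n_kv <= (int) cache.size());
    const std::vector<int> before = cache; // the extras themselves move "all at once"
    for (int i = head + n_seqs; i < head + n_kv; ++i) {
        cache[i] = before[src[i]];
    }
    std::vector<int> used;
    for (int i = head; i < head + n_seqs; ++i) {
        used.push_back(cache[src[i]]); // may read an extra cell that was just overwritten
    }
    return used;
}

// Safer order: gather the used range from the untouched cache, then write back the extras.
std::vector<int> gather_before_writeback(std::vector<int> & cache, const std::vector<int> & src,
                                         int head, int n_seqs, int n_kv) {
    assert(head + n_kv <= (int) cache.size());
    std::vector<int> used;
    for (int i = head; i < head + n_seqs; ++i) {
        used.push_back(cache[src[i]]); // reads the original contents
    }
    const std::vector<int> before = cache;
    for (int i = head + n_seqs; i < head + n_kv; ++i) {
        cache[i] = before[src[i]];
    }
    return used;
}
```

With `cache = {10, 20, 30, 40}`, `src = {2, 1, 3, 3}`, `head = 0`, `n_seqs = 2`, and `n_kv = 4`, cell 0 has its `src` in the extra range. The first function returns `{40, 20}` because cell 2 was overwritten before it was read, while the second returns `{30, 20}`.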
File tree: src
4 files changed, +810 -2133 lines changed