Commit 3ba2cff
committed
PERF: use StringHasTable for strings
xref pandas-dev#13745
provides a modest speedup for all string hashing. The
key thing is, it will release the GIL on more operations where this is
possible (mainly factorize).
can be easily extended to value_counts() and .duplicated() (for strings)
Author: Jeff Reback <[email protected]>
Closes pandas-dev#14859 from jreback/string and squashes the following commits:
98f46c2 [Jeff Reback] PERF: use StringHashTable for strings in factorizing1 parent 033d345 commit 3ba2cff
File tree
7 files changed
+330
-66
lines changed- asv_bench/benchmarks
- doc/source/whatsnew
- pandas
- core
- src
7 files changed
+330
-66
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
8 | 8 | | |
9 | 9 | | |
10 | 10 | | |
| 11 | + | |
11 | 12 | | |
12 | 13 | | |
13 | 14 | | |
| |||
23 | 24 | | |
24 | 25 | | |
25 | 26 | | |
| 27 | + | |
26 | 28 | | |
27 | 29 | | |
28 | 30 | | |
29 | 31 | | |
30 | 32 | | |
| 33 | + | |
| 34 | + | |
| 35 | + | |
31 | 36 | | |
32 | 37 | | |
33 | 38 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
379 | 379 | | |
380 | 380 | | |
381 | 381 | | |
| 382 | + | |
| 383 | + | |
| 384 | + | |
| 385 | + | |
| 386 | + | |
| 387 | + | |
| 388 | + | |
| 389 | + | |
| 390 | + | |
| 391 | + | |
| 392 | + | |
| 393 | + | |
| 394 | + | |
| 395 | + | |
| 396 | + | |
| 397 | + | |
| 398 | + | |
| 399 | + | |
| 400 | + | |
| 401 | + | |
| 402 | + | |
| 403 | + | |
| 404 | + | |
| 405 | + | |
| 406 | + | |
| 407 | + | |
| 408 | + | |
| 409 | + | |
| 410 | + | |
| 411 | + | |
| 412 | + | |
| 413 | + | |
| 414 | + | |
| 415 | + | |
| 416 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
135 | 135 | | |
136 | 136 | | |
137 | 137 | | |
138 | | - | |
| 138 | + | |
139 | 139 | | |
140 | 140 | | |
141 | 141 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
65 | 65 | | |
66 | 66 | | |
67 | 67 | | |
68 | | - | |
| 68 | + | |
69 | 69 | | |
70 | 70 | | |
71 | 71 | | |
| |||
102 | 102 | | |
103 | 103 | | |
104 | 104 | | |
105 | | - | |
| 105 | + | |
106 | 106 | | |
107 | 107 | | |
108 | 108 | | |
| |||
759 | 759 | | |
760 | 760 | | |
761 | 761 | | |
762 | | - | |
| 762 | + | |
763 | 763 | | |
764 | 764 | | |
765 | 765 | | |
| 766 | + | |
| 767 | + | |
766 | 768 | | |
767 | 769 | | |
768 | 770 | | |
| |||
773 | 775 | | |
774 | 776 | | |
775 | 777 | | |
776 | | - | |
777 | | - | |
| 778 | + | |
| 779 | + | |
| 780 | + | |
| 781 | + | |
| 782 | + | |
| 783 | + | |
| 784 | + | |
778 | 785 | | |
779 | 786 | | |
780 | 787 | | |
781 | 788 | | |
| 789 | + | |
782 | 790 | | |
783 | 791 | | |
784 | 792 | | |
785 | 793 | | |
786 | 794 | | |
| 795 | + | |
| 796 | + | |
787 | 797 | | |
788 | 798 | | |
789 | 799 | | |
| |||
796 | 806 | | |
797 | 807 | | |
798 | 808 | | |
799 | | - | |
| 809 | + | |
800 | 810 | | |
| 811 | + | |
| 812 | + | |
| 813 | + | |
| 814 | + | |
| 815 | + | |
| 816 | + | |
| 817 | + | |
| 818 | + | |
| 819 | + | |
| 820 | + | |
| 821 | + | |
801 | 822 | | |
802 | 823 | | |
803 | 824 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1 | | - | |
| 1 | + | |
2 | 2 | | |
3 | 3 | | |
4 | 4 | | |
| |||
22 | 22 | | |
23 | 23 | | |
24 | 24 | | |
| 25 | + | |
| 26 | + | |
| 27 | + | |
| 28 | + | |
| 29 | + | |
| 30 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
4 | 4 | | |
5 | 5 | | |
6 | 6 | | |
7 | | - | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
| 11 | + | |
8 | 12 | | |
9 | 13 | | |
10 | 14 | | |
| |||
33 | 37 | | |
34 | 38 | | |
35 | 39 | | |
36 | | - | |
| 40 | + | |
37 | 41 | | |
38 | 42 | | |
39 | 43 | | |
| |||
0 commit comments