Commit af22336
perf: improve performance of `vectorized_equal_to` for `PrimitiveGroupValueBuilder` in multi group by aggregation (#17977)
## Which issue does this PR close?
N/A
## Rationale for this change
Make multi-column aggregation even faster.
## What changes are included in this PR?
In `PrimitiveGroupValueBuilder.vectorized_equal_to`, always evaluate the
comparison and use unchecked access, since both of these changes are
what make the code compile to SIMD.
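
To illustrate the two ideas above, here is a minimal sketch rather than the code from this PR. The function name, the fixed `u32` element type, and the up-front index validation are all assumptions made for the example. It shows a loop with no early exit on `equal_to_results` and no per-element bounds check, which is the shape that gives the compiler a chance to vectorize. Whether it actually does depends on the target and compiler version.

```rust
/// Hypothetical, simplified stand-in for the non-nullable path: compares
/// `group_values[lhs_rows[i]]` with `array_values[rhs_rows[i]]` for every `i`
/// and ANDs the outcome into `equal_to_results[i]`.
///
/// Panics if the slices differ in length or any row index is out of bounds.
pub fn vectorized_equal_to_non_null(
    group_values: &[u32],
    array_values: &[u32],
    lhs_rows: &[usize],
    rhs_rows: &[usize],
    equal_to_results: &mut [bool],
) {
    assert_eq!(lhs_rows.len(), rhs_rows.len());
    assert_eq!(lhs_rows.len(), equal_to_results.len());
    // Assumption for this sketch: validate all indices once up front so the
    // hot loop can skip bounds checks soundly.
    assert!(lhs_rows.iter().all(|&i| i < group_values.len()));
    assert!(rhs_rows.iter().all(|&i| i < array_values.len()));

    let rows = lhs_rows.iter().zip(rhs_rows);
    for ((&lhs, &rhs), result) in rows.zip(equal_to_results.iter_mut()) {
        // SAFETY: every index was checked against its slice length above.
        let (left, right) = unsafe {
            (
                *group_values.get_unchecked(lhs),
                *array_values.get_unchecked(rhs),
            )
        };
        // Always evaluate: there is no `if !*result { continue; }` branch.
        *result &= left == right;
    }
}
```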
## Are these changes tested?
Existing tests
## Are there any user-facing changes?
Nope
-----
I tried a LOT of variations ([GodBolt](https://godbolt.org/z/Kc8ze6E9n)),
from splitting into fixed-size chunks, to trying to get auto-vectorization
to use gather and create a bitmask, to even testing portable SIMD (just
to see what it would generate).
This version only optimizes the non-null path for the moment, as it is the
easiest.
If and when we change from `&mut [bool]` to mutable packed bits, we
could:
1. Evaluate in chunks of `64` items (I tried different variations to see
what is best; you can tweak the godbolt above with different types and
sizes to check for yourself). 64 is not necessarily the best, but I think
it will be the fastest for doing the AND with the `equal_to_results`
boolean buffer.
2. Add an optimization for the nullable path as well, by doing bitwise
operations on 64 items at a time and avoiding the cost of getting each
bit manually.
3. Skip 64 items right away if the `equal_to_results` word equals
`0x00` (i.e. all false).
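
The packed-bits idea in the list above could look roughly like the following sketch. It is not part of this PR: the function name, the `u32` element type, the bit layout (bit `j` of word `w` is row `w * 64 + j`), and the use of contiguous inputs instead of gathered row indices are all assumptions made to keep the example small. It covers items 1 and 3 for the non-null case. A nullable variant (item 2) would presumably AND the validity words in the same per-word step.

```rust
/// Hypothetical sketch: `results` holds one bit per row, 64 rows per word.
/// ANDs `left[i] == right[i]` into bit `i` of `results`. Assumes no nulls.
pub fn and_equal_packed(left: &[u32], right: &[u32], results: &mut [u64]) {
    assert_eq!(left.len(), right.len());
    assert_eq!(results.len(), (left.len() + 63) / 64);

    let chunks = left.chunks(64).zip(right.chunks(64));
    for ((l, r), word) in chunks.zip(results.iter_mut()) {
        // Item 3: all 64 rows are already false, so skip the comparison.
        if *word == 0 {
            continue;
        }
        // Item 1: build an equality mask for this chunk of up to 64 rows.
        let mut mask = 0u64;
        for (bit, (a, b)) in l.iter().zip(r).enumerate() {
            mask |= ((a == b) as u64) << bit;
        }
        // A single AND updates up to 64 rows at once.
        *word &= mask;
    }
}
```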
---------
Co-authored-by: Andrew Lamb <[email protected]>

PR #17977, 1 parent 8b8bdc4, commit af22336
File tree: 2 files changed (+96, -26), both under
datafusion/physical-plan/src/aggregates/group_values/multi_group_by

Lines changed: 88 additions & 26 deletions
(Diff body not captured. Added: new lines 59-137 and 181-189. Removed: original lines 102-127.)

Lines changed: 8 additions & 0 deletions
(Diff body not captured. Added: new lines 92-99.)