perf: Optimize `array_sort()` by neilconway · Pull Request #21083 · apache/datafusion

neilconway · 2026-03-20T19:45:24Z

Which issue does this PR close?

Closes #21005 (Optimize array_sort).
Closes #21041 (Optimize array_sort with nested-list-aware sort kernel).

Rationale for this change

The previous array_sort implementation called the Arrow sort kernel for every row, and then used concat to produce the final results. This was quite inefficient. Instead, we employ three different techniques depending on the input:

(1) For arrays of primitive types without null elements, we copy all values into a single Vec, sort each row's slice of the Vec in-place, and then wrap the Vec in a GenericListArray.

(2) For arrays of primitive types with null elements, we use a similar approach, but we need some additional bookkeeping to put null elements in the right position and construct the null buffer.

(3) For arrays of non-primitive types, we use RowConverter to convert the entire input into the row format in one call, sort row indices by comparing the encoded row values, and then use a single take() to construct the result of the sort.
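
A minimal sketch of techniques (1) and (2), using plain `Vec`s and `usize` offsets as stand-ins for Arrow buffers. The function names `sort_rows_in_place` and `sort_rows_with_nulls` are hypothetical and this is not the code from the PR; it only illustrates sorting each row's slice of one flat buffer rather than invoking a sort kernel per row.

```rust
/// Technique (1): sort every row of a flattened list column in place.
/// `offsets` has one more entry than there are rows; row `i` spans
/// `values[offsets[i]..offsets[i + 1]]`.
fn sort_rows_in_place(values: &mut [i32], offsets: &[usize], descending: bool) {
    for w in offsets.windows(2) {
        let row = &mut values[w[0]..w[1]];
        if descending {
            row.sort_unstable_by(|a, b| b.cmp(a));
        } else {
            row.sort_unstable();
        }
    }
}

/// Technique (2): compact the non-null values of each row, sort them, and
/// rebuild a validity mask with the nulls grouped at the front or the back.
/// Returns the flat values (null slots hold a placeholder 0) and the mask.
fn sort_rows_with_nulls(
    values: &[Option<i32>],
    offsets: &[usize],
    nulls_first: bool,
) -> (Vec<i32>, Vec<bool>) {
    let mut out: Vec<i32> = Vec::with_capacity(values.len());
    let mut valid: Vec<bool> = Vec::with_capacity(values.len());
    let mut scratch: Vec<i32> = Vec::new();
    for w in offsets.windows(2) {
        let row = &values[w[0]..w[1]];
        // Collect and sort only the non-null values of this row.
        scratch.clear();
        scratch.extend(row.iter().flatten().copied());
        scratch.sort_unstable();
        let null_count = row.len() - scratch.len();
        if nulls_first {
            out.resize(out.len() + null_count, 0);
            valid.resize(valid.len() + null_count, false);
        }
        out.extend_from_slice(&scratch);
        valid.resize(valid.len() + scratch.len(), true);
        if !nulls_first {
            out.resize(out.len() + null_count, 0);
            valid.resize(valid.len() + null_count, false);
        }
    }
    (out, valid)
}
```

In both functions the offsets are unchanged by the sort, so a result list array could reuse the input offsets and only needs the new values (and, for the nullable case, the new validity mask).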

Benchmarks (8192 rows, vs main):

int32/5 elements:          886 µs →  57 µs  (-94%)
int32/20 elements:        1.64 ms → 846 µs  (-48%)
int32/100 elements:       4.03 ms → 3.22 ms (-20%)
int32_null_elements/5:    1.17 ms → 168 µs  (-86%)
int32_null_elements/1000: 47.2 ms → 44.1 ms  (-7%)
string/5 elements:        2.12 ms → 727 µs  (-66%)
string/1000 elements:      405 ms → 293 ms  (-28%)

What changes are included in this PR?

New array_sort benchmark
Extended unit test coverage
Improve docs
Implement optimizations as described above

Are these changes tested?

Yes. This PR adds a benchmark and extends unit test coverage.

Are there any user-facing changes?

No.

neilconway · 2026-03-20T19:47:12Z

FYI @Dandandan -- thanks for the suggestion about avoiding the per-row sort kernel, seems quite effective.

Dandandan · 2026-03-20T21:42:01Z

datafusion/functions-nested/src/sort.rs

+    let values_start = offsets[0].as_usize();
+    let total_values = offsets[row_count].as_usize() - values_start;
+
+    let converter = RowConverter::new(vec![SortField::new_with_options(


Why using a RowConverter for this one? I think the path in arrow-rs is something like:

partition on nulls

create the indices (begin...end)

sort by the strings (using the indices)

Add the nulls if it starts with nulls

Add the values (using take)

Add the nulls if it ends with nulls
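
The steps listed above can be sketched for a single row roughly as follows. This is only an illustration over `Option<&str>` slices, not the arrow-rs implementation: `sort_indices` and `take` are hypothetical helper names, and `take` here is a simple gather standing in for a real take kernel.

```rust
/// Returns the permutation that sorts `row`, with nulls grouped first or last.
fn sort_indices(row: &[Option<&str>], nulls_first: bool) -> Vec<usize> {
    // Partition the positions into non-null and null.
    let (mut valid, nulls): (Vec<usize>, Vec<usize>) =
        (0..row.len()).partition(|&i| row[i].is_some());
    // Sort the non-null positions by the strings they point at.
    valid.sort_unstable_by(|&a, &b| row[a].cmp(&row[b]));
    // Assemble: leading nulls, sorted values, trailing nulls.
    let mut indices = Vec::with_capacity(row.len());
    if nulls_first {
        indices.extend_from_slice(&nulls);
    }
    indices.extend_from_slice(&valid);
    if !nulls_first {
        indices.extend_from_slice(&nulls);
    }
    indices
}

/// Gather the elements of `row` in the order given by `indices`.
fn take<'a>(row: &[Option<&'a str>], indices: &[usize]) -> Vec<Option<&'a str>> {
    indices.iter().map(|&i| row[i]).collect()
}
```

Each comparison dereferences two indices to reach the string data, which is the indirection the discussion weighs against the upfront encoding cost of a row format.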

neilconway · 2026-03-21

I briefly considered something like that, but I figured that all the pointer chasing would be pretty expensive. You're right that it's worth comparing though.

Here's a quick Claude-generated version -- lmk if you had something else in mind.

Benchmarking it against the RowConverter approach: RowConverter wins for medium-sized arrays (20 elements) and larger, and loses to the index-based comparison approach for small arrays:

┌─────────────┬──────────┬─────────────────┬─────────────────┐
│ Benchmark   │ main     │ RowConverter    │ make_comparator │
├─────────────┼──────────┼─────────────────┼─────────────────┤
│ string/5    │ 2.12 ms  │ 727 µs (-66%)   │ 608 µs (-71%)   │
├─────────────┼──────────┼─────────────────┼─────────────────┤
│ string/20   │ 5.94 ms  │ 4.42 ms (-26%)  │ 4.76 ms (-20%)  │
├─────────────┼──────────┼─────────────────┼─────────────────┤
│ string/100  │ 26.8 ms  │ 22.6 ms (-16%)  │ 25.1 ms (-6%)   │
├─────────────┼──────────┼─────────────────┼─────────────────┤
│ string/1000 │ 404.9 ms │ 293.1 ms (-28%) │ 403.9 ms (~0%)  │
└─────────────┴──────────┴─────────────────┴─────────────────┘

Not sure offhand what typical real-world workloads look like; lmk if you have a view.

Dandandan · 2026-03-21

The Claude code looks close to what I had in mind except for the make_comparator and some non-null specialization / Vec::push which cannot be inlined and generate slow / branchy code.

I guess for strings you could also do the same tricks as used in the Arrow kernels: create some inlined key / small string as well for fast comparisons.
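
One way to read the "inlined key" idea is sketched below, under the assumption that the key is the first four bytes of each string packed into a big-endian `u32`. The names `prefix_key` and `sort_with_prefix` are hypothetical, and this is not taken from the Arrow kernels or from this PR. Most comparisons are then settled by one integer compare, and the full strings are only compared when the prefixes tie.

```rust
/// Pack up to the first four bytes of `s` into a big-endian u32, zero-padded.
/// If two keys differ, their order matches the byte-wise order of the strings.
fn prefix_key(s: &str) -> u32 {
    let mut buf = [0u8; 4];
    let n = s.len().min(4);
    buf[..n].copy_from_slice(&s.as_bytes()[..n]);
    u32::from_be_bytes(buf)
}

/// Sort strings by (prefix key, full string). The full comparison only runs
/// when two prefix keys are equal.
fn sort_with_prefix<'a>(row: &[&'a str]) -> Vec<&'a str> {
    let mut keyed: Vec<(u32, &'a str)> =
        row.iter().map(|&s| (prefix_key(s), s)).collect();
    keyed.sort_unstable_by(|a, b| a.0.cmp(&b.0).then_with(|| a.1.cmp(b.1)));
    keyed.into_iter().map(|(_, s)| s).collect()
}
```

Zero padding is safe here because equal keys always fall back to the full comparison, so a short string and a longer string sharing its prefix are still ordered correctly.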

RowConverter is fine of course, but it has some higher fixed overhead upfront and higher space usage as well, so for single columns I think type specialization always wins.

Dandandan · 2026-03-20T21:47:26Z

Woah, great!

alamb · 2026-03-21T12:59:29Z

Merged up from main to get the fix for cargo audit

neilconway added 2 commits March 20, 2026 13:59

Add benchmark for array_sort()

7396c4c

Optimize array_sort

294d769

github-actions bot added the documentation, sqllogictest, and functions labels Mar 20, 2026

neilconway mentioned this pull request Mar 20, 2026

perf: Optimize array_sort for arrays of non-primitive types #21006

Closed

Merge branch 'main' into neilc/array-sort-custom-kernel

8326cb1

Dandandan reviewed Mar 20, 2026

Dandandan reviewed Mar 20, 2026

Dandandan approved these changes Mar 21, 2026

Merge branch 'main' into neilc/array-sort-custom-kernel

430f315

neilconway added 2 commits March 21, 2026 11:24

Avoid making array_sort_inner public for benchmarks

d92f38e

cargo fmt

2ee21e1

comphead added this pull request to the merge queue Mar 22, 2026

Merged via the queue into apache:main with commit e5c69a4 Mar 22, 2026
33 checks passed

comphead merged 6 commits into apache:main from neilconway:neilc/array-sort-custom-kernel
