CNDB-12962: Change the vector-specific config to only apply different defaults… (#1575)
... as different option names do not really make sense in per-table
options (the user will know if the table is a vector one).
Drop validation checks that may depend on -D options (which may differ
between replica and coordinator) and adjust invalid min vs target size
when -D options create nonsensical combinations.
src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md
21 additions & 7 deletions
@@ -475,20 +475,21 @@ UCS accepts these compaction strategy parameters:
 expense of making reads more difficult.
 N is the middle ground that has the features of levelled (one sstable run per level) as well as tiered (one
 compaction to be promoted to the next level) and a fan factor of 2. This can also be specified as T2 or L2.
-The default value is T4, matching the default STCS behaviour with threshold 4. To select an equivalent of LCS
-with its default fan factor 10, use L10.
+The default value is T4, matching the default STCS behaviour with threshold 4. The default value in vector mode (see
+paragraph below) is L10, equivalent to LCS with its default fan factor 10.
 *`target_sstable_size` The target sstable size $t$, specified as a human-friendly size in bytes (e.g. 100 MiB =
 $100\cdot 2^{20}$ B or (10 MB = 10,000,000 B)). The strategy will split data in shards that aim to produce sstables
 of size between $t / \sqrt 2$ and $t \cdot \sqrt 2$.
 Smaller sstables improve streaming and repair, and make compactions shorter. On the other hand, each sstable
 on disk has a non-trivial in-memory footprint that also affects garbage collection times.
 Increase this if the memory pressure from the number of sstables in the system becomes too high. Also see
 `sstable_growth` below.
-The default value is 1 GiB.
+The default value is 1 GiB. The default value in vector mode is 5GiB.
 *`base_shard_count` The minimum number of shards $b$, used for levels with the smallest density. This gives the
 minimum compaction concurrency for the lowest levels. A low number would result in larger L0 sstables but may limit
-the overall maximum write throughput (as every piece of data has to go through L0). The base shard count only applies after `min_sstable_size` is reached.
-The default value is 4 for all tables.
+the overall maximum write throughput (as every piece of data has to go through L0). The base shard count only applies
+after `min_sstable_size` is reached.
+The default value is 4. The default value in vector mode is 1.
 *`sstable_growth` The sstable growth component $\lambda$, applied as a factor in the shard exponent calculation.
 This is a number between 0 and 1 that controls what part of the density growth should apply to individual sstable
 size and what part should increase the number of shards. Using a value of 1 has the effect of fixing the shard
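To make the size window documented for `target_sstable_size` concrete, the following is a minimal Java sketch that evaluates $t / \sqrt 2$ and $t \cdot \sqrt 2$ for the two defaults named in the hunk above (1 GiB, and 5 GiB in vector mode). The class and method names are hypothetical and do not come from the Cassandra code base; the sketch only performs the arithmetic stated in the documentation and says nothing about how UCS chooses shard boundaries.

```java
/** Illustrative only: hypothetical helper, not part of UnifiedCompactionStrategy. */
final class TargetSizeRangeSketch
{
    static final long MIB = 1L << 20;
    static final long GIB = 1L << 30;

    /** Lower end of the documented size window, t / sqrt(2), in bytes. */
    static double lowerBound(long targetBytes)
    {
        return targetBytes / Math.sqrt(2);
    }

    /** Upper end of the documented size window, t * sqrt(2), in bytes. */
    static double upperBound(long targetBytes)
    {
        return targetBytes * Math.sqrt(2);
    }

    public static void main(String[] args)
    {
        // 1 GiB is the documented default, 5 GiB the documented vector-mode default.
        for (long t : new long[] { GIB, 5 * GIB })
        {
            System.out.printf("t = %d MiB -> between %.0f MiB and %.0f MiB%n",
                              t / MIB,
                              lowerBound(t) / MIB,
                              upperBound(t) / MIB);
        }
    }
}
```

For the 1 GiB default this works out to roughly 724 MiB to 1448 MiB, and for the 5 GiB vector-mode default to roughly 3620 MiB to 7241 MiB. The upper bound is always twice the lower bound, so the window spans exactly one doubling of sstable size.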
@@ -503,10 +504,12 @@ UCS accepts these compaction strategy parameters:
 two can be further tweaked by increasing $\lambda$ to get fewer but bigger sstables on the top level, and decreasing
 it to favour a higher count of smaller sstables.
 The default value is 0.333 meaning the sstable size grows with the square root of the growth of the shard count.
+The default value in vector mode is 1 which means the shard count will be fixed to the base value.
 *`min_sstable_size` The minimum sstable size $m$, applicable when the base shard count will result is sstables
 that are considered too small. If set, the strategy will split the space into fewer than the base count shards, to
-make the estimated sstables size at least as large as this value. A value of 0 disables this feature. A value of `auto` sets the minimum sstable size to the size
-of sstables resulting from flushes. The default value is 100MiB.
+make the estimated sstables size at least as large as this value. A value of 0 disables this feature.
+A value of `auto` sets the minimum sstable size to the size of sstables resulting from flushes.
+The default value is 100MiB. The default value in vector mode is 1GiB.
 *`reserved_threads` Specifies the number of threads to reserve per level. Any remaining threads will take
 work according to the prioritization mechanism (i.e. higher overlap first). Higher reservations mean better
 responsiveness of the compaction strategy to new work, or smoother performance, at the expense of reducing the
@@ -531,6 +534,17 @@ UCS accepts these compaction strategy parameters:
 Disabled by default and cannot be used in combination with `base_shard_count`, `target_sstable_size` or
 `sstable_growth`.

+All UCS options can also be supplied as system properties, using the prefix `unified_compaction.`, e.g.
+`-Dunified_compaction.sstable_growth=0.5` sets the default `sstable_growth` to 0.5.
+
+In addition to this, the strategy permits different defaults to be applied to tables that have a vector column when the
+system property `unified_compaction.override_ucs_config_for_vector_tables` is set to `true`. If this is enabled and the
+table has a column of type `vector`, the "vector mode" defaults in the list above apply. These vector defaults can be
+altered using the prefix `unified_compaction.vector_`, e.g.
+`-Dunified_compaction.vector_sstable_growth=1` in combination with
+`-Dunified_compaction.override_ucs_config_for_vector_tables=true` sets the growth to 1 only for tables with a vector
+column.
+
 In `cassandra.yaml`:

 *`concurrent_compactors` The number of compaction threads available. Higher values increase compaction performance
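The way the system-property defaults described in the added paragraph fit together can be sketched in Java as follows. This is an illustration under stated assumptions, not the code from this commit: the class `UcsDefaultsSketch` and its methods are hypothetical, the properties are passed in explicitly rather than read from the JVM, an explicit per-table option is assumed to take precedence over any default, and an unset `unified_compaction.vector_` property is assumed to fall back to the built-in vector-mode default. Only the property names and the default values are taken from the documentation text above.

```java
import java.util.Map;
import java.util.Properties;

/** Illustrative only: hypothetical resolution of UCS defaults, not the actual implementation. */
final class UcsDefaultsSketch
{
    static final String PREFIX = "unified_compaction.";
    static final String VECTOR_PREFIX = "unified_compaction.vector_";
    static final String OVERRIDE_FLAG = "unified_compaction.override_ucs_config_for_vector_tables";

    // Built-in defaults as listed in the documentation.
    static final Map<String, String> DEFAULTS = Map.of(
        "target_sstable_size", "1GiB",
        "base_shard_count", "4",
        "sstable_growth", "0.333",
        "min_sstable_size", "100MiB");

    // "Vector mode" defaults as listed in the documentation.
    static final Map<String, String> VECTOR_DEFAULTS = Map.of(
        "target_sstable_size", "5GiB",
        "base_shard_count", "1",
        "sstable_growth", "1",
        "min_sstable_size", "1GiB");

    /** Default for an option, given the system properties and whether the table has a vector column. */
    static String effectiveDefault(String option, boolean tableHasVectorColumn, Properties props)
    {
        boolean vectorMode = tableHasVectorColumn
                             && Boolean.parseBoolean(props.getProperty(OVERRIDE_FLAG, "false"));
        if (vectorMode)
        {
            // Assumption: without a vector_ property, the built-in vector default is used.
            return props.getProperty(VECTOR_PREFIX + option, VECTOR_DEFAULTS.get(option));
        }
        return props.getProperty(PREFIX + option, DEFAULTS.get(option));
    }

    /** Assumption: a value given in the table's compaction options overrides any default. */
    static String effectiveValue(Map<String, String> tableOptions, String option,
                                 boolean tableHasVectorColumn, Properties props)
    {
        String explicit = tableOptions.get(option);
        return explicit != null ? explicit : effectiveDefault(option, tableHasVectorColumn, props);
    }

    public static void main(String[] args)
    {
        // Equivalent of -Dunified_compaction.sstable_growth=0.5
        // and -Dunified_compaction.override_ucs_config_for_vector_tables=true
        Properties props = new Properties();
        props.setProperty(PREFIX + "sstable_growth", "0.5");
        props.setProperty(OVERRIDE_FLAG, "true");

        System.out.println(effectiveDefault("sstable_growth", false, props)); // 0.5
        System.out.println(effectiveDefault("sstable_growth", true, props));  // 1
    }
}
```

Because the option names are the same for vector and non-vector tables, the only thing that changes in this sketch is which default is consulted when a table does not set the option itself.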