Commit 68653f2

CNDB-12962: Change the vector-specific config to only apply different defaults… (#1575)
... as different option names do not really make sense in per-table options (the user will know if the table is a vector one). Drop validation checks that may depend on -D options (which may differ between replica and coordinator) and adjust invalid min vs target size when -D options create nonsensical combinations.
1 parent c794818 commit 68653f2

11 files changed: +298 -70 lines changed

src/java/org/apache/cassandra/db/compaction/UnifiedCompactionContainer.java

Lines changed: 10 additions & 0 deletions
@@ -50,6 +50,7 @@ public class UnifiedCompactionContainer implements CompactionStrategyContainer
     private final CompactionParams metadataParams;
     private final UnifiedCompactionStrategy strategy;
     private final boolean enableAutoCompaction;
+    private final boolean hasVector;
 
     AtomicBoolean enabled;
 
@@ -66,6 +67,7 @@ public class UnifiedCompactionContainer implements CompactionStrategyContainer
         this.strategy = new UnifiedCompactionStrategy(factory, backgroundCompactions, params.options());
         this.enabled = new AtomicBoolean(enabled);
         this.enableAutoCompaction = enableAutoCompaction;
+        this.hasVector = strategy.getController().hasVectorType();
 
         factory.getCompactionLogger().strategyCreated(this.strategy);
 
@@ -142,6 +144,14 @@ public CompactionStrategyContainer reload(@Nonnull CompactionStrategyContainer p
         return create(previous, factory, compactionParams, reason, enableAutoCompaction);
     }
 
+    @Override
+    public boolean shouldReload(CompactionParams params, ReloadReason reason)
+    {
+        return reason != CompactionStrategyContainer.ReloadReason.METADATA_CHANGE
+               || !params.equals(getMetadataCompactionParams())
+               || hasVector != factory.getRealm().metadata().hasVectorType();
+    }
+
     private static CompactionParams createMetadataParams(@Nullable CompactionStrategyContainer previous,
                                                          CompactionParams compactionParams,
                                                          ReloadReason reason)
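
The reload condition added above can be exercised in isolation. The following is a minimal sketch, not the container's actual code: `ShouldReloadSketch`, its two-value `ReloadReason` enum and the boolean parameters are hypothetical stand-ins for the real `CompactionParams` comparison and the table metadata lookup. It only illustrates that a metadata change with unchanged compaction parameters still triggers a reload when the table gains or loses a vector column.

```java
public class ShouldReloadSketch
{
    // Hypothetical two-value enum; the real ReloadReason has its own set of values.
    enum ReloadReason { METADATA_CHANGE, OTHER }

    // Mirrors the shape of the condition in the diff above:
    // reload unless this is a metadata change that altered neither the
    // compaction params nor the "table has a vector column" flag.
    static boolean shouldReload(ReloadReason reason,
                                boolean paramsEqual,
                                boolean hadVector,
                                boolean hasVectorNow)
    {
        return reason != ReloadReason.METADATA_CHANGE
               || !paramsEqual
               || hadVector != hasVectorNow;
    }

    public static void main(String[] args)
    {
        // Same params, but a vector column was added: reload.
        System.out.println(shouldReload(ReloadReason.METADATA_CHANGE, true, false, true)); // true
        // Same params, vector flag unchanged: no reload.
        System.out.println(shouldReload(ReloadReason.METADATA_CHANGE, true, true, true));  // false
        // Any other reason always reloads.
        System.out.println(shouldReload(ReloadReason.OTHER, true, true, true));            // true
    }
}
```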

src/java/org/apache/cassandra/db/compaction/UnifiedCompactionStrategy.md

Lines changed: 21 additions & 7 deletions
@@ -475,20 +475,21 @@ UCS accepts these compaction strategy parameters:
   expense of making reads more difficult.
   N is the middle ground that has the features of levelled (one sstable run per level) as well as tiered (one
   compaction to be promoted to the next level) and a fan factor of 2. This can also be specified as T2 or L2.
-  The default value is T4, matching the default STCS behaviour with threshold 4. To select an equivalent of LCS
-  with its default fan factor 10, use L10.
+  The default value is T4, matching the default STCS behaviour with threshold 4. The default value in vector mode (see
+  paragraph below) is L10, equivalent to LCS with its default fan factor 10.
 * `target_sstable_size` The target sstable size $t$, specified as a human-friendly size in bytes (e.g. 100 MiB =
   $100\cdot 2^{20}$ B or (10 MB = 10,000,000 B)). The strategy will split data in shards that aim to produce sstables
   of size between $t / \sqrt 2$ and $t \cdot \sqrt 2$.
   Smaller sstables improve streaming and repair, and make compactions shorter. On the other hand, each sstable
   on disk has a non-trivial in-memory footprint that also affects garbage collection times.
   Increase this if the memory pressure from the number of sstables in the system becomes too high. Also see
   `sstable_growth` below.
-  The default value is 1 GiB.
+  The default value is 1 GiB. The default value in vector mode is 5GiB.
 * `base_shard_count` The minimum number of shards $b$, used for levels with the smallest density. This gives the
   minimum compaction concurrency for the lowest levels. A low number would result in larger L0 sstables but may limit
-  the overall maximum write throughput (as every piece of data has to go through L0). The base shard count only applies after `min_sstable_size` is reached.
-  The default value is 4 for all tables.
+  the overall maximum write throughput (as every piece of data has to go through L0). The base shard count only applies
+  after `min_sstable_size` is reached.
+  The default value is 4. The default value in vector mode is 1.
 * `sstable_growth` The sstable growth component $\lambda$, applied as a factor in the shard exponent calculation.
   This is a number between 0 and 1 that controls what part of the density growth should apply to individual sstable
   size and what part should increase the number of shards. Using a value of 1 has the effect of fixing the shard
@@ -503,10 +504,12 @@ UCS accepts these compaction strategy parameters:
   two can be further tweaked by increasing $\lambda$ to get fewer but bigger sstables on the top level, and decreasing
   it to favour a higher count of smaller sstables.
   The default value is 0.333 meaning the sstable size grows with the square root of the growth of the shard count.
+  The default value in vector mode is 1 which means the shard count will be fixed to the base value.
 * `min_sstable_size` The minimum sstable size $m$, applicable when the base shard count will result is sstables
   that are considered too small. If set, the strategy will split the space into fewer than the base count shards, to
-  make the estimated sstables size at least as large as this value. A value of 0 disables this feature. A value of `auto` sets the minimum sstable size to the size
-  of sstables resulting from flushes. The default value is 100MiB.
+  make the estimated sstables size at least as large as this value. A value of 0 disables this feature.
+  A value of `auto` sets the minimum sstable size to the size of sstables resulting from flushes.
+  The default value is 100MiB. The default value in vector mode is 1GiB.
 * `reserved_threads` Specifies the number of threads to reserve per level. Any remaining threads will take
   work according to the prioritization mechanism (i.e. higher overlap first). Higher reservations mean better
   responsiveness of the compaction strategy to new work, or smoother performance, at the expense of reducing the
@@ -531,6 +534,17 @@ UCS accepts these compaction strategy parameters:
   Disabled by default and cannot be used in combination with `base_shard_count`, `target_sstable_size` or
   `sstable_growth`.
 
+All UCS options can also be supplied as system properties, using the prefix `unified_compaction.`, e.g.
+`-Dunified_compaction.sstable_growth=0.5` sets the default `sstable_growth` to 0.5.
+
+In addition to this, the strategy permits different defaults to be applied to tables that have a vector column when the
+system property `unified_compaction.override_ucs_config_for_vector_tables` is set to `true`. If this is enabled and the
+table has a column of type `vector`, the "vector mode" defaults in the list above apply. These vector defaults can be
+altered using the prefix `unified_compaction.vector_`, e.g.
+`-Dunified_compaction.vector_sstable_growth=1` in combination with
+`-Dunified_compaction.override_ucs_config_for_vector_tables=true` sets the growth to 1 only for tables with a vector
+column.
+
 In `cassandra.yaml`:
 
 * `concurrent_compactors` The number of compaction threads available. Higher values increase compaction performance
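
The default-resolution rules described in the documentation hunks above can be sketched as follows. This is an illustration under stated assumptions, not the code in `Controller`: the class `UcsDefaultsSketch` and its `resolve` method are hypothetical, option values are kept as plain strings, and properties are passed in as a `Properties` object instead of being read from the JVM. It also assumes that a per-table option always wins over any default, and that in vector mode an unset `unified_compaction.vector_*` property falls through to the built-in vector default and not to the non-vector property; the source does not spell out that last point.

```java
import java.util.Map;
import java.util.Properties;

/** Hypothetical illustration of UCS default resolution; not the actual Controller code. */
public class UcsDefaultsSketch
{
    static final String PREFIX = "unified_compaction.";
    static final String VECTOR_PREFIX = PREFIX + "vector_";
    static final String OVERRIDE_FLAG = PREFIX + "override_ucs_config_for_vector_tables";

    // Built-in defaults as listed in the documentation above.
    static final Map<String, String> DEFAULTS = Map.of(
        "scaling_parameters", "T4",
        "target_sstable_size", "1GiB",
        "base_shard_count", "4",
        "sstable_growth", "0.333",
        "min_sstable_size", "100MiB");

    // "Vector mode" defaults as listed in the documentation above.
    static final Map<String, String> VECTOR_DEFAULTS = Map.of(
        "scaling_parameters", "L10",
        "target_sstable_size", "5GiB",
        "base_shard_count", "1",
        "sstable_growth", "1",
        "min_sstable_size", "1GiB");

    static String resolve(String option,
                          Map<String, String> tableOptions,
                          Properties props,
                          boolean hasVectorColumn)
    {
        // Assumption: an explicit per-table option overrides every default.
        String explicit = tableOptions.get(option);
        if (explicit != null)
            return explicit;

        boolean vectorMode = hasVectorColumn
                             && Boolean.parseBoolean(props.getProperty(OVERRIDE_FLAG, "false"));
        if (vectorMode)
            return props.getProperty(VECTOR_PREFIX + option, VECTOR_DEFAULTS.get(option));
        return props.getProperty(PREFIX + option, DEFAULTS.get(option));
    }

    public static void main(String[] args)
    {
        Properties props = new Properties();
        props.setProperty(OVERRIDE_FLAG, "true");
        props.setProperty(VECTOR_PREFIX + "sstable_growth", "1");
        props.setProperty(PREFIX + "sstable_growth", "0.5");

        System.out.println(resolve("sstable_growth", Map.of(), props, true));  // 1
        System.out.println(resolve("sstable_growth", Map.of(), props, false)); // 0.5
        System.out.println(resolve("sstable_growth", Map.of("sstable_growth", "0.25"), props, true)); // 0.25
    }
}
```

Because the vector settings only change defaults, the per-table option names stay the same for vector and non-vector tables, which is the point made in the commit message.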

src/java/org/apache/cassandra/db/compaction/unified/AdaptiveController.java

Lines changed: 6 additions & 2 deletions
@@ -119,6 +119,7 @@ public AdaptiveController(MonotonicClock clock,
                               Reservations.Type reservationsType,
                               Overlaps.InclusionMethod overlapInclusionMethod,
                               boolean parallelizeOutputShards,
+                              boolean hasVectorType,
                               int intervalSec,
                               int minScalingParameter,
                               int maxScalingParameter,
@@ -146,7 +147,8 @@ public AdaptiveController(MonotonicClock clock,
               reservedThreadsPerLevel,
               reservationsType,
               overlapInclusionMethod,
-              parallelizeOutputShards);
+              parallelizeOutputShards,
+              hasVectorType);
 
         this.scalingParameters = scalingParameters;
         this.previousScalingParameters = previousScalingParameters;
@@ -175,8 +177,9 @@ static Controller fromOptions(Environment env,
                                   double sstableGrowthModifier,
                                   int reservedThreadsPerLevel,
                                   Reservations.Type reservationsType,
-                                  boolean parallelizeOutputShards,
                                   Overlaps.InclusionMethod overlapInclusionMethod,
+                                  boolean parallelizeOutputShards,
+                                  boolean hasVectorType,
                                   String keyspaceName,
                                   String tableName,
                                   Map<String, String> options)
@@ -276,6 +279,7 @@ else if (staticScalingFactors != null)
                                       reservationsType,
                                       overlapInclusionMethod,
                                       parallelizeOutputShards,
+                                      hasVectorType,
                                       intervalSec,
                                       minScalingParameter,
                                       maxScalingParameter,
