Conversation

jasonstack

What is the issue

https://github.com/riptano/cndb/issues/15253: a large shard count jump caused HCD-130

What does this PR fix and why was it fixed

Replace power-of-two shard progression with factorization-based smooth growth when num_shards is explicitly configured. This prevents the problematic large jumps that can cause data loss (e.g., HCD-130). For example, with num_shards = 1000, which is 5^3 * 2^3 (a sketch of the sequence construction follows the list below):

  • Before: 1 → 2 → 8 → 1000 (the 125x jump causes data loss due to HCD-130)
  • After: 1 → 5 → 25 → 125 → 250 → 500 → 1000 (max 5x jump)
  • The new behavior can be disabled via -Dunified_compaction.use_factorization_shard_count_growth=false
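The change is easiest to see as code. Below is a minimal sketch of the idea under discussion, not the merged implementation: factor the target into primes, then build the chain as cumulative products with the largest factors applied first. The method names (primeFactors, factorizedSmoothShardSequence) follow identifiers that appear later in this review; the wrapper class is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ShardSequenceSketch
{
    /** Trial division; the overflow-safe loop bound is discussed further down in the review. */
    static List<Integer> primeFactors(int num)
    {
        List<Integer> factors = new ArrayList<>();
        while (num % 2 == 0)
        {
            factors.add(2);
            num /= 2;
        }
        for (int factor = 3; factor <= num / factor; factor += 2)
        {
            while (num % factor == 0)
            {
                factors.add(factor);
                num /= factor;
            }
        }
        if (num > 1)
            factors.add(num); // whatever remains is prime
        return factors;
    }

    /** 1 followed by the cumulative products of the prime factors, largest first. */
    static int[] factorizedSmoothShardSequence(int target)
    {
        List<Integer> primes = primeFactors(target); // ascending order
        int[] sequence = new int[primes.size() + 1];
        sequence[0] = 1;
        // Walk the factors by decreasing index (largest first) so the steps
        // closest to the target use the smallest multipliers.
        for (int i = primes.size() - 1, out = 1; i >= 0; i--, out++)
            sequence[out] = sequence[out - 1] * primes.get(i);
        return sequence;
    }
}
```

For target = 1000 this yields {1, 5, 25, 125, 250, 500, 1000}, matching the progression above; the largest step in the chain is the largest prime factor (here 5x).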

@jasonstack jasonstack requested a review from blambov October 1, 2025 03:26

github-actions bot commented Oct 1, 2025

Checklist before you submit for review

  • This PR adheres to the Definition of Done
  • Make sure there is a PR in the CNDB project updating the Converged Cassandra version
  • Use NoSpamLogger for log lines that may appear frequently in the logs
  • Verify test results on Butler
  • Test coverage for new/modified code is > 80%
  • Proper code formatting
  • Proper title for each commit starting with the project-issue number, like CNDB-1234
  • Each commit has a meaningful description
  • Each commit is not very long and contains related changes
  • Renames, moves and reformatting are in distinct commits
  • All new files should contain the DataStax copyright header instead of the Apache License one

static final boolean USE_FACTORIZATION_SHARD_COUNT_GROWTH = Boolean.parseBoolean(getSystemProperty("use_factorization_shard_count_growth", "true"));

/** Cache for factorized shard sequences to avoid repeated calculations for the same target values */
private static final Map<Integer, List<Integer>> shardSequenceCache = new ConcurrentHashMap<>();

@blambov blambov Oct 1, 2025


Instead of making this a static cache, we can calculate it for the given baseShardCount during initialization (the count is final, so we can make the calculation final as well).

The cost of calculating the factors is not negligible (i.e. we could not do it on every getNumShards call), but it is not so high as to prohibit doing it on every Controller construction, and it will save the map lookup on every getNumShards.

Author

Good idea; updated to private final Optional<int[]> factorizedShardSequence.
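A hypothetical condensation of what that looks like, reusing factorizedSmoothShardSequence from the sketch above; the real Controller has many more fields, and the guard shown is the one the thread settles on below:

```java
import java.util.Optional;

class Controller
{
    static final boolean USE_FACTORIZATION_SHARD_COUNT_GROWTH = true; // stands in for the system property

    private final int baseShardCount;
    private final Optional<int[]> factorizedShardSequence;

    Controller(int baseShardCount)
    {
        this.baseShardCount = baseShardCount;
        // Computed once per Controller construction, so getNumShards pays
        // neither for the factorization nor for a cache lookup.
        this.factorizedShardSequence = useFactorizationShardCountGrowth()
                                       ? Optional.of(ShardSequenceSketch.factorizedSmoothShardSequence(baseShardCount))
                                       : Optional.empty();
    }

    boolean useFactorizationShardCountGrowth()
    {
        // bitCount(x) == 1 is true exactly for powers of two, which the
        // existing power-of-two formula already handles (and faster).
        return USE_FACTORIZATION_SHARD_COUNT_GROWTH
               && baseShardCount > 0
               && Integer.bitCount(baseShardCount) != 1;
    }
}
```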

@VisibleForTesting
boolean useFactorizationShardCountGrowth()
{
return USE_FACTORIZATION_SHARD_COUNT_GROWTH && sstableGrowthModifier == 1.0 && baseShardCount > 0;


We should always use this calculation whenever baseShardCount is not a power of two, regardless of sstableGrowthModifier.

The only reason to not use it for powers of two is that the existing formula provides the same result quicker.

Author

Changed to `USE_FACTORIZATION_SHARD_COUNT_GROWTH && baseShardCount > 0 && Integer.bitCount(baseShardCount) != 1`.

}
else
{
// Keep original power-of-two logic for dynamic growth mode


In dynamic growth mode, we also respect baseShardCount and it makes no sense to not apply the smoother logic there as well.

Author

Removed this comment.

int result = 1;
for (int shards : shardSequence)
{
if (shards <= currentCountBasedOnDensity)


Why not a binary search?

Author

@jasonstack jasonstack Oct 2, 2025

Updated to binary search. Initially I thought iterating the array would be just as fast because of CPU cache locality.


// Check odd factors up to sqrt(num)
int sqrtNum = (int) Math.sqrt(num);
for (int factor = 3; factor <= sqrtNum; factor += 2)

We can do
for (int factor = 3; factor*factor <= num; factor += 2)
to avoid the square root and also finish quicker by taking into account the reduction of num.

Author

@jasonstack jasonstack Oct 2, 2025

Updated to `for (int factor = 3; factor <= num / factor; factor += 2)`; this is safer against overflow.


You are right, the multiplication will overflow when the limit is e.g. Integer.MAX_VALUE.

I'd personally still use 1L * factor * factor <= num which does not involve division, but it's a moot point since we are performing the modulus for the same division on the next line anyway, and a compiler could optimize one of them away.
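To make the overflow concrete, here is a small standalone demonstration of the three loop bounds discussed in this thread, for num = Integer.MAX_VALUE (the class name is made up for the example):

```java
public class OverflowDemo
{
    public static void main(String[] args)
    {
        int num = Integer.MAX_VALUE;   // 2^31 - 1
        int factor = 46341;            // first odd candidate past sqrt(num)

        System.out.println(factor * factor);             // negative: int multiplication overflowed
        System.out.println(factor * factor <= num);      // true, so this bound fails to stop the loop
        System.out.println(factor <= num / factor);      // false: the division-based bound is safe
        System.out.println(1L * factor * factor <= num); // false: widening to long is also safe
    }
}
```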

List<Integer> primes = primeFactors(target);

// 2) Sort factors in descending order
primes.sort(Comparator.reverseOrder());

You can use primes.reversed() or just do the loop by decreasing index.

Author

Updated to loop by decreasing index.

primes.sort(Comparator.reverseOrder());

// 3) Cumulative product to form the chain.
List<Integer> divisors = new ArrayList<>();

I would use int[] (we know the size at this point).

Author

Updated to int[].


// insertion point
int insertionPoint = -idx - 1;
int candidateIndex = Math.max(0, insertionPoint - 1);

Nit: max is not needed here (-(-1) -1 is 0). Generally it could result in sequence.length, but that will happen only if the count is larger than the base shard count, which won't end up in this function.

Author

Right, -idx - 1 is always non-negative. The code searches for the value before the insertion point via insertionPoint - 1, so it uses max to guard the case where insertionPoint is 0.
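Putting this thread together, a sketch of the binary-search lookup; the method name and parameter are assumptions, not the merged code:

```java
import java.util.Arrays;

final class ShardLookupSketch
{
    /** Largest element of the precomputed chain that does not exceed the density-based count. */
    static int pickShardCount(int[] sequence, int currentCountBasedOnDensity)
    {
        int idx = Arrays.binarySearch(sequence, currentCountBasedOnDensity);
        if (idx >= 0)
            return sequence[idx]; // exact element of the chain

        // A negative result encodes the insertion point as -(insertionPoint) - 1.
        int insertionPoint = -idx - 1;
        // Step back to the value before the insertion point; as noted in the
        // nit above, the max() only matters when insertionPoint is 0.
        return sequence[Math.max(0, insertionPoint - 1)];
    }
}
```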

// Test 3200 = 5^2 * 2^7
int[] expected3200 = new int[]{ 1, 5, 25, 50, 100, 200, 400, 800, 1600, 3200 };
assertArrayEquals(expected3200, Controller.factorizedSmoothShardSequence(3200));
}

Let's add a test with Integer.MAX_VALUE as well.
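Since Integer.MAX_VALUE = 2^31 - 1 is prime, the factorization chain for it degenerates to a single unavoidable jump. A hypothetical test along those lines, mirroring the assertion style above:

```java
import org.junit.Test;
import static org.junit.Assert.assertArrayEquals;

public class MaxValueShardSequenceTest
{
    @Test
    public void testIntegerMaxValue()
    {
        // 2^31 - 1 is prime, so the only divisor chain is 1 -> Integer.MAX_VALUE.
        int[] expected = new int[]{ 1, Integer.MAX_VALUE };
        assertArrayEquals(expected, Controller.factorizedSmoothShardSequence(Integer.MAX_VALUE));
    }
}
```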


* use prime factorization to create smooth sequences like 1→5→25→125→250→500→1000 for num_shards=1000.
* This prevents the large jumps that can cause data loss issues (e.g., HCD-130).
* <p>
* Only applies when num_shards is explicitly configured

This line is no longer correct, and maybe the first line of the doc should also be changed to base_shard_count not being a power of two.
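A possible rewording along the lines of this suggestion (a sketch, not the committed text):

```java
/**
 * Use prime factorization to create smooth sequences like
 * 1→5→25→125→250→500→1000 for num_shards = 1000. This prevents the large
 * jumps that can cause data loss issues (e.g., HCD-130).
 * <p>
 * Only applies when base_shard_count is not a power of two.
 */
```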

…nt growth when num_shards and min sstable size are configured

* Replace power-of-two shard progression with factorization-based smooth growth when num_shards is explicitly configured. This prevents problematic large jumps that can cause data loss (e.g., HCD-130). For example, with num_shards = 1000, which is 5^3 * 2^3:

  Before: 1 → 2 → 8 → 1000 (125x jump causes data loss)
  After:  1 → 5 → 25 → 125 → 250 → 500 → 1000 (max 5x jump)

* The new behavior can be disabled via `-Dunified_compaction.use_factorization_shard_count_growth=false`

sonarqubecloud bot commented Oct 3, 2025

@cassci-bot

❌ Build ds-cassandra-pr-gate/PR-2031 rejected by Butler


2 regressions found
See build details here


Found 2 new test failures

Test Explanation Runs Upstream
o.a.c.distributed.test.FullRepairCoordinatorTimeoutTest.prepareRPCTimeout[SEQUENTIAL/false] REGRESSION 🔴🔵 0 / 7
o.a.c.index.sai.cql.VectorKeyRestrictedOnPartitionTest.partitionRestrictedWidePartitionBqCompressedTest[ec] (compression) REGRESSION 🔴🔵 0 / 7

Found 1 known test failures

@jasonstack jasonstack merged commit 48eb797 into main Oct 3, 2025
487 of 495 checks passed
@jasonstack jasonstack deleted the cndb-15253-main branch October 3, 2025 08:21
michaelsembwever pushed a commit that referenced this pull request Oct 9, 2025
…shards and min sstable size are configured (#2031)

riptano/cndb#15253: a large shard count jump was involved in HCD-130

Replace power-of-two shard progression with factorization-based smooth
growth when the base shard count is not a power of 2. This prevents
problematic large jumps. For example, with num_shards = 1000, which is 5^3 * 2^3:

- Before: 1 → 2 → 8 → 1000 (the 125x jump causes data loss due to HCD-130)
- After:  1 → 5 → 25 → 125 → 250 → 500 → 1000 (max 5x jump)

* The new behavior can be disabled via
`-Dunified_compaction.use_factorization_shard_count_growth=false`
michaelsembwever pushed a commit that referenced this pull request Oct 10, 2025
…shards and min sstable size are configured (#2031)

riptano/cndb#15253: a large shard count jump was involved in HCD-130

Replace power-of-two shard progression with factorization-based smooth
growth when the base shard count is not a power of 2. This prevents
problematic large jumps. For example, with num_shards = 1000, which is 5^3 * 2^3:

- Before: 1 → 2 → 8 → 1000 (the 125x jump causes data loss due to HCD-130)
- After:  1 → 5 → 25 → 125 → 250 → 500 → 1000 (max 5x jump)

* The new behavior can be disabled via
`-Dunified_compaction.use_factorization_shard_count_growth=false`