Skip to content

Conversation

@michaeljmarshall
Copy link
Member

Fixes: https://github.com/riptano/cndb/issues/13196

What is the issue

We were incorrectly sharing a file reader across threads in BM25 queries, which can lead to invalid results, as I reproduced in the test.

What does this PR fix and why was it fixed

The PR fixes the issue by creating a DocLengthsReader object per query.

We were incorrectly sharing a file reader
across threads, which can lead to invalid
results, as the test shows.
@michaeljmarshall michaeljmarshall added the bug Something isn't working label Mar 7, 2025
@michaeljmarshall michaeljmarshall self-assigned this Mar 7, 2025
@github-actions
Copy link

github-actions bot commented Mar 7, 2025

Checklist before you submit for review

  • Make sure there is a PR in the CNDB project updating the Converged Cassandra version
  • Use NoSpamLogger for log lines that may appear frequently in the logs
  • Verify test results on Butler
  • Test coverage for new/modified code is > 80%
  • Proper code formatting
  • Proper title for each commit staring with the project-issue number, like CNDB-1234
  • Each commit has a meaningful description
  • Each commit is not very long and contains related changes
  • Renames, moves and reformatting are in distinct commits

Copy link

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link

@adelapena adelapena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I have only left a couple of suggestions that can be addressed on commit.

The primary goal here is to increase the
probability of hitting the race. It seems
it still isn't perfect, but anecdotally,
we hit it frequently on my machine. Further,
this kind of bug is not hard to see, so
I don't think it is worth further time
trying to make a perfect test.
@sonarqubecloud
Copy link

sonarqubecloud bot commented Mar 7, 2025

@cassci-bot
Copy link

✔️ Build ds-cassandra-pr-gate/PR-1622 approved by Butler


Approved by Butler
See build details here

@michaeljmarshall michaeljmarshall merged commit 840d5af into main Mar 7, 2025
473 of 478 checks passed
@michaeljmarshall michaeljmarshall deleted the cndb-13196 branch March 7, 2025 17:25
djatnieks pushed a commit that referenced this pull request Mar 11, 2025
Fixes: riptano/cndb#13196

We were incorrectly sharing a file reader across threads in BM25
queries, which can lead to invalid results, as I reproduced in the test.

The PR fixes the issue by creating a `DocLengthsReader` object per
query.
djatnieks pushed a commit that referenced this pull request May 18, 2025
Fixes: riptano/cndb#13196

We were incorrectly sharing a file reader across threads in BM25
queries, which can lead to invalid results, as I reproduced in the test.

The PR fixes the issue by creating a `DocLengthsReader` object per
query.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants