-
Notifications
You must be signed in to change notification settings - Fork 21
CNDB-13075: Reuse the analyzed tokens of the right operand of an analyzed expression #1620
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Checklist before you submit for review
|
fca2d4f to
d608a33
Compare
a7fb9a8 to
92d1ecf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really extremely minor nit: I feel a bit uncomfortable when both of those overloads have the same name, yet one is meant to be called on a non-analyzed value and another on an analyzed. Like... there seems to be a subtle semantic difference between them (and you often can only call one of them but not both). Maybe they should have a different name and the one that takes the lists should say "analyzed" explicitly?
Anyway, feel free to dismiss this comment. This is more my gut feeling than an objective thing. But I simply wondered if you think the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've renamed it to isSatisfiedByAnalyzed. I guess it makes it a bit easier to identify the calls for RowFilter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, why did we need separate queryAnalyzer / indexAnalyzer before and now we don't?
Is it because of caching you introduced in queriedTokens?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new Index.Analyzer has two separate methods for index and query time analysis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it mean we're using this only at query time?
How do we analyze the values at index build time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before Index.Analyzer had a single analyzing method, and Index had two methods returning the read and write analyzers. Now Index.Analyzer has two analyzing methods for reads and writes, and Index only has one method returning the new multipurpose Index.Analyzer.
The reasons for encapsulating the two types of internal SAI analyzers into a single Index.Analyzer object are:
- Make caching of the queried term a bit easier.
- Try to make confusing both analyzers a bit harder, since there is only one analyzer in the filtering expressions.
- Get rid of some assertions around checking consistency between both analyzer parameters. The two types of analyzers were moved around together as a pack, so I guess it makes sense to encapsulate them together.
92d1ecf to
a3acff8
Compare
ed9bd02 to
ef15d96
Compare
|
ef15d96 to
d860ad6
Compare
|
❌ Build ds-cassandra-pr-gate/PR-1620 rejected by Butler2 new test failure(s) in 9 builds Found 2 new test failures
Found 665 known test failures |
…yzed expression (#1620) Memoize the analyzed tokens of the right operand of an analyzed expression. Also add some refactoring and simplifications around how analysis is dealt with in `RowFilter` and `Operator`.
…yzed expression (#1620) Memoize the analyzed tokens of the right operand of an analyzed expression. Also add some refactoring and simplifications around how analysis is dealt with in `RowFilter` and `Operator`.



CNDB-10731 added support for index analyzers to
RowFilter. Currently, both the left and right operands of an expression are analyzed on every evaluation. However, the right operand has the same fixed value over the life of the expression, so we are doing the same analysis once and again.This PR memoizes the analyzed tokens of the right operand. It also adds some refactoring and simplifications around how analysis is dealt with in
RowFilterandOperator.