-
Notifications
You must be signed in to change notification settings - Fork 311
HPCC-34657 Hybrid index improve how fpos are stored #20650
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
|
Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-34657 Jirabot Action Result: |
ghalliday
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Generally looks clean. A few comments/edge cases/optimizations.
system/jhtree/jhblockcompressed.cpp
Outdated
| //fields start at the beginning of the row. | ||
| memcpy(dst+keyCompareLen, p+keyCompareLen+sizeof(offset_t), keyLen-keyCompareLen); | ||
| memcpy(dst+keyLen, p, sizeof(offset_t)); | ||
| memcpy(dst+keyCompareLen, p+keyCompareLen, keyLen-keyCompareLen); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
optimization: If hasFilePosition this can now to a single memcpy, and the false case can use memset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made this change
system/jhtree/jhblockcompressed.cpp
Outdated
| if (0xffff == hdr.numKeys || 0 == compressor.writekey(pos, (const char *)indata, insize)) | ||
| unsigned writeOptions = KeyCompressor::TrailingFilePosition | (context.zeroFilePos ? KeyCompressor::NoFilePosition : 0); | ||
| int written = compressor.writekey(pos, (const char *)indata, insize, writeOptions); | ||
| if (0xffff == hdr.numKeys || written == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test on numKeys is no longer short-circuiting the call to writekey.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
system/jhtree/jhblockcompressed.cpp
Outdated
| CompressionMethod compressionMethod = *(CompressionMethod*) keys; | ||
| keys += sizeof(CompressionMethod); | ||
|
|
||
| hasFilePosition = *(bool*) keys; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A future PR should generalize this to a byte bitfield so that other information can be added. Not for this though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could / should this be added to the key header instead of the node header?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, quite possibly, that can be looked at later.
system/jhtree/jhblockcompressed.cpp
Outdated
| hdr.keyBytes += sizeof(context.compressionMethod); | ||
|
|
||
| bool hasFilepos = !context.zeroFilePos; | ||
| memcpy(keyPtr, &hasFilepos, sizeof(bool)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trivial: More normal style for a single bool would be
*(bool *)keyPtr = hasFilepos;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
system/jhtree/jhblockcompressed.cpp
Outdated
| { | ||
| memcpy(dst+keyCompareLen, p+keyCompareLen+sizeof(offset_t), reclen-keyCompareLen); | ||
| memcpy(dst+reclen, p, sizeof(offset_t)); | ||
| memcpy(dst+keyCompareLen, p+keyCompareLen, reclen-keyCompareLen); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar optimization to the fixed length above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
| if (!context.compressionHandler) | ||
| throw MakeStringException(0, "Unknown compression method %d", (int)compressionMethod); | ||
|
|
||
| if (helper && (helper->getFlags() & TIWzerofilepos)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this also need to take into account whether this is a TLK. TLKs implicitly have no payload, but do have a fileposition.
Add
//version multiPart=true,variant='hybrid'
to stress text and check it still works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added to the test, and it passed
ghalliday
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jpmcmu please squash.
|
@ghalliday squashed |
ghalliday
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies, a couple of final comments.
system/jhtree/jhblockcompressed.cpp
Outdated
| if (hasFilePosition) | ||
| return keyLen + sizeof(offset_t); | ||
| else | ||
| return keyLen; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this change is wrong - this is the size as seen by the consumer, which should include the zeroed fileposition
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, that makes sense, fixed.
system/jhtree/jhblockcompressed.cpp
Outdated
| CompressionMethod compressionMethod = *(CompressionMethod*) keys; | ||
| keys += sizeof(CompressionMethod); | ||
|
|
||
| hasFilePosition = *(bool*) keys; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hasFilePosition is a confusing name - because there is already an option
keyHdr->hasSpecialFileposition()
This should be called zeroFilePosition or something similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do
|
@ghalliday Implemented code review changes, will squash if it looks good |
ghalliday
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jpmcmu thanks. Please squash and I will mrge.
- Moved file positions to end of record - Optional removed file positions if they will always be zero Signed-off-by: James McMullan [email protected]
|
@ghalliday Squashed |
faf91c3
into
hpcc-systems:candidate-9.14.x
|
Jirabot Action Result: |
Signed-off-by: James McMullan [email protected]
Type of change:
Checklist:
Smoketest:
Testing: