Contributor

@jpmcmu jpmcmu commented Nov 14, 2025

  • Moved file positions to end of record
  • Optionally removed file positions if they will always be zero

Signed-off-by: James McMullan [email protected]

Type of change:

  • This change is a bug fix (non-breaking change which fixes an issue).
  • This change is a new feature (non-breaking change which adds functionality).
  • This change improves the code (refactor or other change that does not change the functionality)
  • This change fixes warnings (the fix does not alter the functionality or the generated code)
  • This change is a breaking change (fix or feature that will cause existing behavior to change).
  • This change alters the query API (existing queries will have to be recompiled)

Checklist:

  • My code follows the code style of this project.
    • My code does not create any new warnings from compiler, build system, or lint.
  • The commit message is properly formatted and free of typos.
    • The commit message title makes sense in a changelog, by itself.
    • The commit is signed.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly, or...
    • I have created a JIRA ticket to update the documentation.
    • Any new interfaces or exported functions are appropriately commented.
  • I have read the CONTRIBUTORS document.
  • The change has been fully tested:
    • I have added tests to cover my changes.
    • All new and existing tests passed.
    • I have checked that this change does not introduce memory leaks.
    • I have used Valgrind or similar tools to check for potential issues.
  • I have given due consideration to all of the following potential concerns:
    • Scalability
    • Performance
    • Security
    • Thread-safety
    • Cloud-compatibility
    • Premature optimization
    • Existing deployed queries will not be broken
    • This change fixes the problem, not just the symptom
    • The target branch of this pull request is appropriate for such a change.
  • There are no similar instances of the same problem that should be addressed
    • I have addressed them here
    • I have raised JIRA issues to address them separately
  • This is a user interface / front-end modification
    • I have tested my changes in multiple modern browsers
    • The component(s) render as expected

Smoketest:

  • Send notifications about my Pull Request position in Smoketest queue.
  • Test my draft Pull Request.

Testing:

Copilot AI review requested due to automatic review settings November 14, 2025 20:50
Contributor

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@jpmcmu jpmcmu requested a review from Copilot November 14, 2025 21:53
Contributor

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@github-actions

Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-34657

Jirabot Action Result:
Assigning user: [email protected]
Workflow Transition To: Merge Pending
Updated PR

Copilot finished reviewing on behalf of jpmcmu November 15, 2025 06:13
Copilot finished reviewing on behalf of jpmcmu November 15, 2025 06:15
@jpmcmu jpmcmu requested a review from ghalliday November 17, 2025 18:38
Member

@ghalliday ghalliday left a comment

Thanks. Generally looks clean. A few comments/edge cases/optimizations.

//fields start at the beginning of the row.
- memcpy(dst+keyCompareLen, p+keyCompareLen+sizeof(offset_t), keyLen-keyCompareLen);
- memcpy(dst+keyLen, p, sizeof(offset_t));
+ memcpy(dst+keyCompareLen, p+keyCompareLen, keyLen-keyCompareLen);
Member

optimization: If hasFilePosition is true this can now be a single memcpy, and the false case can use memset.

Contributor Author

Made this change
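
A minimal sketch of the suggested optimization, not the merged code. It assumes the stored record is laid out as key bytes followed by an optional trailing file position, and that the consumer always sees a trailing file position. The helper name `expandKeyTail` and the 64-bit `offset_t` alias are hypothetical, introduced only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

using offset_t = std::uint64_t; // assumption: file positions are 64-bit

// Hypothetical helper: copy the part of the key after the shared prefix
// (keyCompareLen bytes, already present in dst) plus the file position.
// Assumed stored layout:  [key (keyLen)][file position, if hasFilePosition]
// Output layout in dst:   [key (keyLen)][file position, always present]
inline void expandKeyTail(char *dst, const char *p, std::size_t keyCompareLen,
                          std::size_t keyLen, bool hasFilePosition)
{
    const std::size_t tailLen = keyLen - keyCompareLen;
    if (hasFilePosition)
    {
        // Key tail and file position are contiguous in the source,
        // so a single copy covers both.
        std::memcpy(dst + keyCompareLen, p + keyCompareLen, tailLen + sizeof(offset_t));
    }
    else
    {
        // No file position was stored: copy the key tail and zero the slot.
        std::memcpy(dst + keyCompareLen, p + keyCompareLen, tailLen);
        std::memset(dst + keyLen, 0, sizeof(offset_t));
    }
}
```

The single copy is only valid because the file position now sits directly after the key bytes; with the file position at the start of the record, two copies were required.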

- if (0xffff == hdr.numKeys || 0 == compressor.writekey(pos, (const char *)indata, insize))
+ unsigned writeOptions = KeyCompressor::TrailingFilePosition | (context.zeroFilePos ? KeyCompressor::NoFilePosition : 0);
+ int written = compressor.writekey(pos, (const char *)indata, insize, writeOptions);
+ if (0xffff == hdr.numKeys || written == 0)
Member

The test on numKeys is no longer short-circuiting the call to writekey.

Contributor Author

Fixed
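
One way the short-circuit could be restored, sketched with stand-in types. `FakeCompressor` and `tryAddKey` are hypothetical and exist only to make the ordering observable; the sketch assumes the concern is that `writekey` should not run at all once the node has reached its key limit.

```cpp
#include <cstdint>

// Hypothetical stand-in for the real compressor: it only counts calls.
struct FakeCompressor
{
    int calls = 0;
    int writekey(std::uint64_t /*pos*/, const char * /*data*/, unsigned size, unsigned /*options*/)
    {
        ++calls;
        return static_cast<int>(size); // pretend the key always fits
    }
};

// Returns true if the key was written. The numKeys test comes first,
// so writekey is never invoked for a node that is already full.
inline bool tryAddKey(FakeCompressor &compressor, std::uint16_t numKeys,
                      std::uint64_t pos, const char *data, unsigned size,
                      unsigned writeOptions)
{
    if (0xffff == numKeys)
        return false;
    return compressor.writekey(pos, data, size, writeOptions) != 0;
}
```

Keeping the options computation separate from the call preserves the readability gain of the new code without changing when `writekey` is invoked.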

CompressionMethod compressionMethod = *(CompressionMethod*) keys;
keys += sizeof(CompressionMethod);

hasFilePosition = *(bool*) keys;
Member

A future PR should generalize this to a byte bitfield so that other information can be added. Not for this though.

Contributor Author

Could / should this be added to the key header instead of the node header?

Member

Yes, quite possibly, that can be looked at later.
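
A rough sketch of what a byte-sized bitfield could look like if this is generalized later. The flag names and bit values are hypothetical and are not taken from the HPCC source; the point is only that a single byte can carry the file position flag while leaving seven bits free.

```cpp
#include <cstdint>

// Hypothetical flag values for a one-byte header field.
enum NodeFlags : std::uint8_t
{
    NodeFlagNone            = 0x00,
    NodeFlagHasFilePosition = 0x01,
    // bits 0x02 .. 0x80 remain available for future information
};

// Writer side: build the flags byte.
inline std::uint8_t encodeNodeFlags(bool hasFilePosition)
{
    return hasFilePosition ? NodeFlagHasFilePosition : NodeFlagNone;
}

// Reader side: test one bit and ignore any bits it does not understand.
inline bool nodeHasFilePosition(std::uint8_t flags)
{
    return (flags & NodeFlagHasFilePosition) != 0;
}
```

Because the reader masks a single bit, a later writer could set additional bits without breaking this check.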

hdr.keyBytes += sizeof(context.compressionMethod);

bool hasFilepos = !context.zeroFilePos;
memcpy(keyPtr, &hasFilepos, sizeof(bool));
Member

trivial: More normal style for a single bool would be
*(bool *)keyPtr = hasFilepos;

Contributor Author

Fixed

{
- memcpy(dst+keyCompareLen, p+keyCompareLen+sizeof(offset_t), reclen-keyCompareLen);
- memcpy(dst+reclen, p, sizeof(offset_t));
+ memcpy(dst+keyCompareLen, p+keyCompareLen, reclen-keyCompareLen);
Member

similar optimization to the fixed length above.

Contributor Author

Fixed

if (!context.compressionHandler)
throw MakeStringException(0, "Unknown compression method %d", (int)compressionMethod);

if (helper && (helper->getFlags() & TIWzerofilepos))
Member

Does this also need to take into account whether this is a TLK? TLKs implicitly have no payload, but do have a fileposition.

Add
//version multiPart=true,variant='hybrid'
to stress text and check it still works.

Contributor Author

Added to the test, and it passed

Member

@ghalliday ghalliday left a comment

@jpmcmu please squash.

@jpmcmu
Contributor Author

jpmcmu commented Nov 24, 2025

@ghalliday squashed

Member

@ghalliday ghalliday left a comment

Apologies, a couple of final comments.

if (hasFilePosition)
return keyLen + sizeof(offset_t);
else
return keyLen;
Member

I think this change is wrong - this is the size as seen by the consumer, which should include the zeroed fileposition

Contributor Author

Ah, that makes sense, fixed.
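
The distinction raised here can be sketched as two separate sizes. The function names are hypothetical and `offset_t` is assumed to be 64-bit; the sketch only illustrates that the stored size may shrink while the size reported to the consumer does not.

```cpp
#include <cstddef>
#include <cstdint>

using offset_t = std::uint64_t; // assumption: file positions are 64-bit

// Bytes physically stored per record: the file position is omitted
// when it is known to always be zero.
inline std::size_t storedRecordSize(std::size_t keyLen, bool zeroFilePosition)
{
    return zeroFilePosition ? keyLen : keyLen + sizeof(offset_t);
}

// Bytes seen by the consumer: the expanded row always carries a
// file position slot, zero-filled if none was stored.
inline std::size_t expandedRecordSize(std::size_t keyLen)
{
    return keyLen + sizeof(offset_t);
}
```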

CompressionMethod compressionMethod = *(CompressionMethod*) keys;
keys += sizeof(CompressionMethod);

hasFilePosition = *(bool*) keys;
Member

hasFilePosition is a confusing name - because there is already an option
keyHdr->hasSpecialFileposition()
This should be called zeroFilePosition or something similar.

Contributor Author

Will do

@jpmcmu jpmcmu requested a review from ghalliday December 1, 2025 16:28
@jpmcmu
Contributor Author

jpmcmu commented Dec 1, 2025

@ghalliday Implemented code review changes, will squash if it looks good

Member

@ghalliday ghalliday left a comment

@jpmcmu thanks. Please squash and I will merge.

- Moved file positions to end of record
- Optionally removed file positions if they will always be zero

Signed-off-by: James McMullan [email protected]
@jpmcmu
Contributor Author

jpmcmu commented Dec 2, 2025

@ghalliday Squashed

@ghalliday ghalliday merged commit faf91c3 into hpcc-systems:candidate-9.14.x Dec 3, 2025
50 of 51 checks passed
@github-actions

github-actions bot commented Dec 3, 2025

Jirabot Action Result:
Fix versions already added.
Workflow Transition: 'Resolve issue'
