
fix(tee): prevent panic on UTF-8 multi-byte truncation boundary #1072

Open
pszymkowiak wants to merge 1 commit into develop from fix/tee-utf8-panic

@pszymkowiak
Collaborator

Summary

  • &raw[..max_file_size] in write_tee_file panics if the byte offset falls inside a multi-byte UTF-8 character (Japanese, emoji, etc.)
  • Now uses char_indices() to find the nearest safe boundary before slicing
  • 2 new tests: Japanese 3-byte chars + emoji 4-byte chars at truncation boundary
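
The approach described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper name `truncate_at_char_boundary` and its signature are assumptions, and the real `write_tee_file` may structure the check differently. It walks `char_indices()` and keeps the end offset of the last character that fits entirely within the byte limit, so the final slice always lands on a character boundary.

```rust
/// Hypothetical helper (name and signature are assumptions, not taken from the PR).
/// Returns the longest prefix of `raw` that is at most `max_bytes` bytes long
/// and ends on a UTF-8 character boundary.
fn truncate_at_char_boundary(raw: &str, max_bytes: usize) -> &str {
    // Nothing to cut: the whole string already fits.
    if raw.len() <= max_bytes {
        return raw;
    }

    // `char_indices()` yields (byte_offset, char) pairs. Track the end offset
    // of the last character that fits completely within `max_bytes`.
    let mut end = 0;
    for (idx, ch) in raw.char_indices() {
        let next = idx + ch.len_utf8();
        if next > max_bytes {
            break;
        }
        end = next;
    }

    // `end` is always a character boundary, so this slice cannot panic.
    &raw[..end]
}
```

For example, with `"あいう"` (three 3-byte characters) and a limit of 4 bytes, `&raw[..4]` would panic, while the sketch returns `"あ"`.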

Ref: #640 (L-3 finding from security audit)

Test plan

  • 1360 tests pass, 6 ignored
  • test_write_tee_file_truncation_utf8_boundary — 3-byte Japanese chars, cut at 998/999
  • test_write_tee_file_truncation_emoji — 4-byte emoji, cut at 201/400
  • Python proof: the old code would hit a decode error at bytes 996-997
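
The property these tests exercise can also be checked with a different technique, shown here as a sketch: scanning backward from the requested cut point with the standard `str::is_char_boundary`. The function name `safe_cut_index` is hypothetical, and this is not the `char_indices()` approach the PR uses; it is an alternative way to compute the same safe offset.

```rust
/// Hypothetical helper (not from the PR): returns the largest byte index
/// `<= max_bytes` at which `raw` can be sliced without panicking.
fn safe_cut_index(raw: &str, max_bytes: usize) -> usize {
    // The whole string fits, so its length is the cut point.
    if max_bytes >= raw.len() {
        return raw.len();
    }

    // Step back until we land on a character boundary. Index 0 is always
    // a boundary, so the loop is guaranteed to terminate.
    let mut end = max_bytes;
    while !raw.is_char_boundary(end) {
        end -= 1;
    }
    end
}
```

With 3-byte characters such as `"あいう"`, a requested cut at byte 4 moves back to 3; with 4-byte emoji such as `"😀😀"`, a requested cut at byte 5 moves back to 4.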

&raw[..max_file_size] panics if the byte offset falls inside a
multi-byte UTF-8 character (e.g. Japanese, emoji). Now finds the
nearest char boundary before slicing.

Ref: issue #640 (L-3 finding)

Signed-off-by: Patrick Szymkowiak <patrick@rtk-ai.app>
Signed-off-by: Patrick szymkowiak <patrick.szymkowiak@innovtech.eu>
