@egmontkob
Contributor

Proposed changes

Don't assume that the prompt is valid UTF-8

Resolves: #4801

Checklist

👉 Our coding style can be found here: https://midnight-commander.org/coding-style/ 👈

  • I have referenced the issue(s) resolved by this PR (if any)
  • I have signed-off my contribution with git commit --amend -s
  • Lint and unit tests pass locally with my changes (make indent && make check)
  • I have added tests that prove my fix is effective or that my feature works
  • I have added the necessary documentation (if appropriate)

@github-actions bot added the labels "needs triage" (Needs triage by maintainers) and "prio: medium" (Has the potential to affect progress) on Oct 19, 2025
@github-actions bot added this to the Future Releases milestone on Oct 19, 2025
@egmontkob force-pushed the 4801_invalid_utf8_in_prompt branch from 6ae71e3 to 1b3bb9b on October 19, 2025 20:47
@zyv added the label "area: core" (Issues not related to a specific subsystem) and removed the label "needs triage" on Oct 20, 2025
@zyv modified the milestones: Future Releases, 4.8.34 on Oct 20, 2025
Contributor

@ossilator ossilator left a comment

i don't know the entire context, so i can't possibly approve this, but from what i do see it looks sane.

same applies to the followup patch, which i'd squash here.

the autotest doesn't cover combining ctrl codes with botched utf8, which seems like a relevant omission. as you noted, the exact semantics probably aren't all too important, but the boundary conditions should be exercised for robustness.

note that afaict, you're fixing a security hole here: root may be browsing a user's folders, which are prepared to exploit the heap overflow. say hello to a CVE process ...

…the prompt, part 1/2

Properly resynchronize immediately after some invalid UTF-8 segment,
instead of swallowing some subsequent letters or possibly even walking
beyond the end of the string.

Signed-off-by: Egmont Koblinger <[email protected]>
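
The resynchronization described in this commit message can be illustrated with a minimal sketch. This is not mc's actual code: the helper names `utf8_seq_len` and `scan_prompt` are hypothetical, and the validity rules are the generic UTF-8 ones. The point it shows is that a sequence length is only trusted after every byte of the sequence has been checked against the remaining buffer, and that an invalid byte advances the scan by exactly one byte, so the letters that follow are neither swallowed nor read past the end of the string.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from mc): return the byte length (1..4) of the
   well-formed UTF-8 sequence starting at s, or 0 if there is none.
   Never reads beyond s[remaining - 1]. */
static size_t
utf8_seq_len (const unsigned char *s, size_t remaining)
{
    size_t len, i;
    unsigned long cp;

    assert (s != NULL);

    if (remaining == 0)
        return 0;
    if (s[0] < 0x80)
        return 1;

    if ((s[0] & 0xE0) == 0xC0)
        len = 2, cp = s[0] & 0x1F;
    else if ((s[0] & 0xF0) == 0xE0)
        len = 3, cp = s[0] & 0x0F;
    else if ((s[0] & 0xF8) == 0xF0)
        len = 4, cp = s[0] & 0x07;
    else
        return 0;               /* stray continuation byte or invalid lead byte */

    if (len > remaining)
        return 0;               /* sequence is cut off by the end of the string */

    for (i = 1; i < len; i++)
    {
        if ((s[i] & 0xC0) != 0x80)
            return 0;           /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }

    /* Reject overlong encodings, surrogates and values above U+10FFFF. */
    if ((len == 2 && cp < 0x80) || (len == 3 && cp < 0x800) || (len == 4 && cp < 0x10000))
        return 0;
    if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
        return 0;

    return len;
}

/* Walk n bytes, counting well-formed characters and invalid bytes.
   After an invalid byte the scan resumes at the very next byte. */
static void
scan_prompt (const unsigned char *s, size_t n, size_t *chars, size_t *bad_bytes)
{
    size_t pos = 0;

    *chars = 0;
    *bad_bytes = 0;

    while (pos < n)
    {
        size_t len = utf8_seq_len (s + pos, n - pos);

        if (len == 0)
        {
            (*bad_bytes)++;
            pos++;              /* resynchronize immediately */
        }
        else
        {
            (*chars)++;
            pos += len;
        }
    }
}
```

With the bytes `0xC3 'A' 'B'`, for example, the lead byte `0xC3` is counted as one invalid byte and both `A` and `B` are still seen as characters. A truncated sequence at the very end of the buffer is reported as invalid bytes instead of being read past.
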
…the prompt, part 2/2

Do not filter out invalid UTF-8 components, instead let them pass through
as-is. This will result in mc converting them to a replacement symbol.

Signed-off-by: Egmont Koblinger <[email protected]>
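
As a rough illustration of why passing invalid bytes through is harmless, here is a sketch of a display-side step that substitutes a replacement symbol. It is an assumption-laden example, not mc's rendering code: the names `seq_len` and `render_with_replacement` are hypothetical, U+FFFD is assumed as the replacement symbol (mc's actual symbol may differ), and the validity check is simplified to lead-byte ranges, continuation bytes and bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified check (not from mc): length of the UTF-8 sequence
   at s, or 0 if the lead byte, a continuation byte or the bounds are wrong.
   Overlong three- and four-byte forms and surrogates are not rejected here. */
static size_t
seq_len (const unsigned char *s, size_t remaining)
{
    size_t len, i;

    if (remaining == 0)
        return 0;
    if (s[0] < 0x80)
        return 1;

    if (s[0] >= 0xC2 && s[0] <= 0xDF)
        len = 2;
    else if ((s[0] & 0xF0) == 0xE0)
        len = 3;
    else if (s[0] >= 0xF0 && s[0] <= 0xF4)
        len = 4;
    else
        return 0;

    if (len > remaining)
        return 0;
    for (i = 1; i < len; i++)
        if ((s[i] & 0xC0) != 0x80)
            return 0;

    return len;
}

/* Copy src to dst, writing U+FFFD (EF BF BD) in place of each invalid byte.
   Returns the number of bytes written, or (size_t) -1 if dst is too small. */
static size_t
render_with_replacement (const unsigned char *src, size_t n, unsigned char *dst, size_t cap)
{
    static const unsigned char repl[] = { 0xEF, 0xBF, 0xBD };
    size_t in = 0, out = 0;

    assert (src != NULL && dst != NULL);

    while (in < n)
    {
        size_t len = seq_len (src + in, n - in);
        const unsigned char *from = (len != 0) ? src + in : repl;
        size_t count = (len != 0) ? len : sizeof repl;

        if (count > cap - out)
            return (size_t) -1;

        memcpy (dst + out, from, count);
        out += count;
        in += (len != 0) ? len : 1;     /* an invalid byte consumes one byte only */
    }

    return out;
}
```

Under these assumptions, the input bytes `'a' 0xFF 'b'` come out as `a`, the three bytes of U+FFFD, then `b`, so the user sees that something was wrong instead of the byte silently disappearing.
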
@egmontkob force-pushed the 4801_invalid_utf8_in_prompt branch from 1b3bb9b to cb5080d on November 2, 2025 12:42
@egmontkob
Contributor Author

Please review.

I decided to go with two separate commits, without squashing them. If for some reason the second commit turns out to be undesired, we can easily revert but we'll still keep the important bugfix of the first.

@zyv changed the title from "4801 invalid utf8 in prompt" to "Ticket #4801: invalid utf-8 in prompt" on Nov 3, 2025
Labels

area: core Issues not related to a specific subsystem prio: medium Has the potential to affect progress

Development

Successfully merging this pull request may close these issues.

Prompt resyncronizes too late after invalid UTF-8

3 participants