fix: add surrogate pair support for font substitution and text encoding #3133
Overview
This PR adds support for handling surrogate pairs (characters outside the Basic Multilingual Plane) in font substitution and text encoding, so that such characters are processed as single code points rather than as two separate UTF-16 code units.
Context
In our project, we encountered issues when rendering mathematical symbols and special characters. For example, when attempting to display mathematical italic characters like 𝑥, the text rendering would break because these characters are represented as surrogate pairs in UTF-16, the encoding used by JavaScript strings. The existing implementation didn't handle these cases properly, leading to incorrect rendering: each half of the pair was treated as its own character and rendered as a different glyph.
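To make the problem concrete, here is a minimal sketch using only standard JavaScript string methods. It shows how 𝑥 (U+1D465) looks like two "characters" when inspected per code unit, and like one when read with codePointAt(). The variable names are illustrative and are not taken from the library.

```typescript
// U+1D465 MATHEMATICAL ITALIC SMALL X, written as its two UTF-16 code units.
const mathX: string = "\uD835\uDC65";

// length and charCodeAt() operate on UTF-16 code units,
// so the single visible character is seen as two separate units.
const unitCount: number = mathX.length;
const highSurrogate: number = mathX.charCodeAt(0);
const lowSurrogate: number = mathX.charCodeAt(1);

// codePointAt() combines a valid surrogate pair into one code point.
const codePoint: number = mathX.codePointAt(0) ?? -1;

console.log(unitCount); // 2
console.log(highSurrogate.toString(16), lowSurrogate.toString(16)); // d835 dc65
console.log(codePoint.toString(16)); // 1d465
```

Looking up a glyph for 0xD835 and 0xDC65 separately is what produces two wrong glyphs instead of one correct one.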
Changes
Font Substitution Engine
Uses codePointAt() instead of character-by-character iteration.
AFM Font Encoding
Updated the encodeText method in AFMFont to handle surrogate pairs correctly.
Updated glyphsForString to use codePointAt() for proper Unicode character handling.
Testing
Added comprehensive test coverage for surrogate pair handling:
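As a rough illustration of the code-point-aware iteration described in the changes above, the sketch below walks a string with codePointAt() and advances by two code units whenever it meets a character outside the Basic Multilingual Plane. The helper name toCodePoints is hypothetical; it is not the library's API, and the actual implementation in this PR may be structured differently.

```typescript
// Hypothetical helper (not part of the library): split a string into
// Unicode code points instead of UTF-16 code units.
function toCodePoints(text: string): number[] {
  const codePoints: number[] = [];
  let index = 0;
  while (index < text.length) {
    const codePoint = text.codePointAt(index);
    // Cannot be undefined while index < text.length; this check only
    // narrows the type for the compiler.
    if (codePoint === undefined) break;
    codePoints.push(codePoint);
    // Code points above U+FFFF occupy two UTF-16 code units, so skip both.
    index += codePoint > 0xffff ? 2 : 1;
  }
  return codePoints;
}

// "a𝑥b": three characters, four UTF-16 code units.
const points = toCodePoints("a\uD835\uDC65b");
console.log(points.map((cp) => "U+" + cp.toString(16).toUpperCase()));
```

An unpaired surrogate is returned by codePointAt() as its own value, so the loop still advances by one unit and terminates on malformed input.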
Testing
The changes include extensive test coverage in:
packages/textkit/tests/engines/fontSubstitution.test.ts
packages/textkit/tests/utils/stringFromCodePoints.test.ts
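The second test file name suggests a helper that builds a string from code points. Its real signature is not shown in this description, so the following is only a hypothetical stand-in built on the standard String.fromCodePoint(), together with the kind of check such a test might make. The name stringFromCodePointsSketch is an assumption made for illustration.

```typescript
// Hypothetical stand-in, NOT the library's stringFromCodePoints:
// build a string from a list of Unicode code points.
function stringFromCodePointsSketch(codePoints: number[]): string {
  // String.fromCodePoint emits a surrogate pair for values above U+FFFF.
  return String.fromCodePoint(...codePoints);
}

// 'a', U+1D465 (𝑥), 'b'
const rebuilt = stringFromCodePointsSketch([0x61, 0x1d465, 0x62]);

// The astral character must come back as a high/low surrogate pair.
const roundTripOk: boolean = rebuilt === "a\uD835\uDC65b";
console.log(roundTripOk, rebuilt.length);
```

The resulting string has four UTF-16 code units for three characters, which is the case a surrogate pair test needs to exercise.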
Test cases cover various scenarios, including:
Impact
This change improves the library's ability to handle modern Unicode text, particularly:
Breaking Changes
None. This is a backward-compatible enhancement that improves existing functionality.