Skip to content

Conversation

@mcdurdin
Copy link
Member

@mcdurdin mcdurdin commented Jan 24, 2026

Ensure that when a single backspace decomposes the last NFC character in the app context, the remainder of the 'cluster' is preserved, matching the implication of the CLDR keyboard specification.

This addresses the behavior in #15487 where the cached context became out of sync with the app context after deleting an entire NFC cluster such as ê, which caused a loop ending up with the entire context being deleted, at which point the loop exited with a fail-safe.

Note that the LDML keyboard tests (ldml.cpp) do not currently exercise the normalization code; this is a gap that should be addressed to ensure that we are testing final application behavior.

Fixes: #15487

User Testing

Preparation: install the keyboard bksp_ldml.kmp from the archive below.

For these tests, always type the following key sequences:

  • E-ACUTE: abc     e/
  • D-ACUTE: abc     d/
  • A-UMLAUT: abc     '

GROUP_WINDOWS: Test on Windows 11
GROUP_MAC: Test on macOS
GROUP_LINUX: Test on Linux

  • TEST_COMPLIANT_E_ACUTE_VERIFY_CONTEXT: In a compliant app, type the E-ACUTE sequence. Select the text, copy and paste it into a character viewer and verify that the last character is U+00E9.

  • TEST_COMPLIANT_E_ACUTE: In a compliant app, type the E-ACUTE sequence. Then press bksp. The result should be abc e.

  • TEST_NONCOMPLIANT_E_ACUTE_VERIFY_CONTEXT: In a non-compliant app, type the E-ACUTE sequence. Select the text, copy and paste it into a character viewer and verify that the last character is U+00E9.

  • TEST_NONCOMPLIANT_E_ACUTE: In a non-compliant app, type the E-ACUTE sequence. Then press bksp. The result should be abc e.


  • TEST_COMPLIANT_D_ACUTE_VERIFY_CONTEXT: In a compliant app, type the D-ACUTE sequence. Select the text, copy and paste it into a character viewer and verify that the last two characters are U+0064 U+0301.

  • TEST_COMPLIANT_D_ACUTE: In a compliant app, type the D-ACUTE sequence. Then press bksp. The result should be abc d.

  • TEST_NONCOMPLIANT_D_ACUTE_VERIFY_CONTEXT: In a non-compliant app, type the D-ACUTE sequence. Select the text, copy and paste it into a character viewer and verify that the last two characters are U+0064 U+0301.

  • TEST_NONCOMPLIANT_D_ACUTE: In a non-compliant app, type the D-ACUTE sequence. Then press bksp. The result should be abc d.


  • TEST_COMPLIANT_A_UMLAUT_VERIFY_CONTEXT: In a compliant app, type the A-UMLAUT sequence. Select the text, copy and paste it into a character viewer and verify that the last character is U+00E4.

  • TEST_COMPLIANT_A_UMLAUT: In a compliant app, type the A-UMLAUT sequence. Then press bksp. The result should be abc a.

  • TEST_NONCOMPLIANT_A_UMLAUT_VERIFY_CONTEXT: In a non-compliant app, type the A-UMLAUT sequence. Select the text, copy and paste it into a character viewer and verify that the last character is U+00E4.

  • TEST_NONCOMPLIANT_A_UMLAUT: In a non-compliant app, type the A-UMLAUT sequence. Then press bksp. The result should be abc a.

Note on compliant and non-compliant apps

  • Windows 11: Some compliant apps are Notepad, Word, Firefox. Some non-compliant apps are Text Editor (artifact on this PR), Chrome.
  • macOS: One compliant app is TextEdit. One non-compliant app is Chrome.
  • Linux: One compliant app is gEdit. One non-compliant app is Chrome.

Ensure that when a single backspace decomposes the last NFC character in
the app context, the remainder of the 'cluster' is preserved, matching
the implication of the CLDR keyboard specification.

This addresses the behavior in #15487 where the cached context became
out of sync with the app context after deleting an entire NFC cluster
such as ê, which caused a loop ending up with the entire context being
deleted, at which point the loop exited with a fail-safe.

Note that the LDML keyboard tests (ldml.cpp) do not currently exercise
the normalization code; this is a gap that should be addressed to ensure
that we are testing final application behavior.

Fixes: #15487
@github-project-automation github-project-automation bot moved this to Todo in Keyman Jan 24, 2026
@keymanapp-test-bot keymanapp-test-bot bot added the user-test-missing User tests have not yet been defined for the PR label Jan 24, 2026
@keymanapp-test-bot
Copy link

keymanapp-test-bot bot commented Jan 24, 2026

User Test Results

Test specification and instructions

  • ⬜ GROUP_WINDOWS: Test on Windows 11

    • TEST_COMPLIANT_E_ACUTE_VERIFY_CONTEXT (OPEN)
    • TEST_COMPLIANT_E_ACUTE (OPEN)
    • TEST_NONCOMPLIANT_E_ACUTE_VERIFY_CONTEXT (OPEN)
    • TEST_NONCOMPLIANT_E_ACUTE (OPEN)
    • TEST_COMPLIANT_D_ACUTE_VERIFY_CONTEXT (OPEN)
    • TEST_COMPLIANT_D_ACUTE (OPEN)
    • TEST_NONCOMPLIANT_D_ACUTE_VERIFY_CONTEXT (OPEN)
    • TEST_NONCOMPLIANT_D_ACUTE (OPEN)
    • TEST_COMPLIANT_A_UMLAUT_VERIFY_CONTEXT (OPEN)
    • TEST_COMPLIANT_A_UMLAUT (OPEN)
    • TEST_NONCOMPLIANT_A_UMLAUT_VERIFY_CONTEXT (OPEN)
    • TEST_NONCOMPLIANT_A_UMLAUT (OPEN)
  • ⬜ GROUP_MAC: Test on macOS

    • TEST_COMPLIANT_E_ACUTE_VERIFY_CONTEXT (OPEN)
    • TEST_COMPLIANT_E_ACUTE (OPEN)
    • TEST_NONCOMPLIANT_E_ACUTE_VERIFY_CONTEXT (OPEN)
    • TEST_NONCOMPLIANT_E_ACUTE (OPEN)
    • TEST_COMPLIANT_D_ACUTE_VERIFY_CONTEXT (OPEN)
    • TEST_COMPLIANT_D_ACUTE (OPEN)
    • TEST_NONCOMPLIANT_D_ACUTE_VERIFY_CONTEXT (OPEN)
    • TEST_NONCOMPLIANT_D_ACUTE (OPEN)
    • TEST_COMPLIANT_A_UMLAUT_VERIFY_CONTEXT (OPEN)
    • TEST_COMPLIANT_A_UMLAUT (OPEN)
    • TEST_NONCOMPLIANT_A_UMLAUT_VERIFY_CONTEXT (OPEN)
    • TEST_NONCOMPLIANT_A_UMLAUT (OPEN)
  • ⬜ GROUP_LINUX: Test on Linux

    • TEST_COMPLIANT_E_ACUTE_VERIFY_CONTEXT (OPEN)
    • TEST_COMPLIANT_E_ACUTE (OPEN)
    • TEST_NONCOMPLIANT_E_ACUTE_VERIFY_CONTEXT (OPEN)
    • TEST_NONCOMPLIANT_E_ACUTE (OPEN)
    • TEST_COMPLIANT_D_ACUTE_VERIFY_CONTEXT (OPEN)
    • TEST_COMPLIANT_D_ACUTE (OPEN)
    • TEST_NONCOMPLIANT_D_ACUTE_VERIFY_CONTEXT (OPEN)
    • TEST_NONCOMPLIANT_D_ACUTE (OPEN)
    • TEST_COMPLIANT_A_UMLAUT_VERIFY_CONTEXT (OPEN)
    • TEST_COMPLIANT_A_UMLAUT (OPEN)
    • TEST_NONCOMPLIANT_A_UMLAUT_VERIFY_CONTEXT (OPEN)
    • TEST_NONCOMPLIANT_A_UMLAUT (OPEN)
Results Template
# Test Results

### GROUP_WINDOWS: Test on Windows 11

* **TEST_COMPLIANT_E_ACUTE_VERIFY_CONTEXT (OPEN):** notes
* **TEST_COMPLIANT_E_ACUTE (OPEN):** notes
* **TEST_NONCOMPLIANT_E_ACUTE_VERIFY_CONTEXT (OPEN):** notes
* **TEST_NONCOMPLIANT_E_ACUTE (OPEN):** notes
* **TEST_COMPLIANT_D_ACUTE_VERIFY_CONTEXT (OPEN):** notes
* **TEST_COMPLIANT_D_ACUTE (OPEN):** notes
* **TEST_NONCOMPLIANT_D_ACUTE_VERIFY_CONTEXT (OPEN):** notes
* **TEST_NONCOMPLIANT_D_ACUTE (OPEN):** notes
* **TEST_COMPLIANT_A_UMLAUT_VERIFY_CONTEXT (OPEN):** notes
* **TEST_COMPLIANT_A_UMLAUT (OPEN):** notes
* **TEST_NONCOMPLIANT_A_UMLAUT_VERIFY_CONTEXT (OPEN):** notes
* **TEST_NONCOMPLIANT_A_UMLAUT (OPEN):** notes

### GROUP_MAC: Test on macOS

* **TEST_COMPLIANT_E_ACUTE_VERIFY_CONTEXT (OPEN):** notes
* **TEST_COMPLIANT_E_ACUTE (OPEN):** notes
* **TEST_NONCOMPLIANT_E_ACUTE_VERIFY_CONTEXT (OPEN):** notes
* **TEST_NONCOMPLIANT_E_ACUTE (OPEN):** notes
* **TEST_COMPLIANT_D_ACUTE_VERIFY_CONTEXT (OPEN):** notes
* **TEST_COMPLIANT_D_ACUTE (OPEN):** notes
* **TEST_NONCOMPLIANT_D_ACUTE_VERIFY_CONTEXT (OPEN):** notes
* **TEST_NONCOMPLIANT_D_ACUTE (OPEN):** notes
* **TEST_COMPLIANT_A_UMLAUT_VERIFY_CONTEXT (OPEN):** notes
* **TEST_COMPLIANT_A_UMLAUT (OPEN):** notes
* **TEST_NONCOMPLIANT_A_UMLAUT_VERIFY_CONTEXT (OPEN):** notes
* **TEST_NONCOMPLIANT_A_UMLAUT (OPEN):** notes

### GROUP_LINUX: Test on Linux

* **TEST_COMPLIANT_E_ACUTE_VERIFY_CONTEXT (OPEN):** notes
* **TEST_COMPLIANT_E_ACUTE (OPEN):** notes
* **TEST_NONCOMPLIANT_E_ACUTE_VERIFY_CONTEXT (OPEN):** notes
* **TEST_NONCOMPLIANT_E_ACUTE (OPEN):** notes
* **TEST_COMPLIANT_D_ACUTE_VERIFY_CONTEXT (OPEN):** notes
* **TEST_COMPLIANT_D_ACUTE (OPEN):** notes
* **TEST_NONCOMPLIANT_D_ACUTE_VERIFY_CONTEXT (OPEN):** notes
* **TEST_NONCOMPLIANT_D_ACUTE (OPEN):** notes
* **TEST_COMPLIANT_A_UMLAUT_VERIFY_CONTEXT (OPEN):** notes
* **TEST_COMPLIANT_A_UMLAUT (OPEN):** notes
* **TEST_NONCOMPLIANT_A_UMLAUT_VERIFY_CONTEXT (OPEN):** notes
* **TEST_NONCOMPLIANT_A_UMLAUT (OPEN):** notes

Test Artifacts

@keymanapp-test-bot keymanapp-test-bot bot added this to the A19S21 milestone Jan 24, 2026
@github-actions github-actions bot added core/ Keyman Core fix labels Jan 24, 2026
@keymanapp-test-bot keymanapp-test-bot bot added has-user-test user-test-required User tests have not been completed and removed user-test-missing User tests have not yet been defined for the PR labels Jan 24, 2026
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix is entirely in this file; other changes are additional unit tests and enabling correct logging of keystrokes in the unit test log files.

@mcdurdin mcdurdin marked this pull request as ready for review January 25, 2026 20:17
@mcdurdin
Copy link
Member Author

@rc-swag @sgschantz @ermshiperete given this change is a significant fix to Keyman Core, I would like to request review from all of you. @srl295, feel free to ignore; just as you have capacity to take a look.

@@expected: abce
Comment: #15487: bksp will delete just acute (even though it will have combined to NFC, internally we do NFD bksp)

TODO: this seems to be working NFD on app context side also, do we need to set a flag? this may break other tests so care is needed
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: this seems to be working NFD on app context side also, do we need to set a flag? this may break other tests so care is needed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #15491

mcdurdin added a commit that referenced this pull request Jan 26, 2026
Ensure that when a single backspace decomposes the last NFC character in
the app context, the remainder of the 'cluster' is preserved, matching
the implication of the CLDR keyboard specification.

This addresses the behavior in #15487 where the cached context became
out of sync with the app context after deleting an entire NFC cluster
such as ê, which caused a loop ending up with the entire context being
deleted, at which point the loop exited with a fail-safe.

Note that the LDML keyboard tests (ldml.cpp) do not currently exercise
the normalization code; this is a gap that should be addressed to ensure
that we are testing final application behavior.

This cherry-pick only adds the direct normalization patch and unit test
and skips the additional changes for logging.

Fixes: #15487
Cherry-pick-of: #15488
@Meng-Heng Meng-Heng self-assigned this Jan 26, 2026
Comment on lines +145 to +146
if(app_context_nfd == cached_context_string.substr(0, app_context_nfd.length())) {
context_final = cached_context_string.substr(app_context_nfd.length());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe want some more comments here.
What we're trying to say is, if the app context matches the beginning of the cache,
break but remember if there was any content that was added after the cache.

So context_final will now have any 'suffix' that is NOT reflected in the app context.

*/

auto output_nfc = output;
auto output_nfc = context_final + output;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, any of that 'final' suffix that was NOT already in the app context, needs to be prepended to the output.

so the full picture before the insertion point is:

<app_context><context_final><output>|

@srl295
Copy link
Member

srl295 commented Jan 26, 2026

looks good! i think it could be good to add some comments and even put the case into the comments (the example case) to document what context_final is doing. but that could be a later step.

@sgschantz
Copy link
Contributor

@mcdurdin For the E-ACUTE test cases, shouldn't the final unicode value be U+00E9?

@mcdurdin
Copy link
Member Author

@mcdurdin For the E-ACUTE test cases, shouldn't the final unicode value be U+00E9?

Thank you, good catch -- I had started with A-ACUTE... but then wanted to differentiate.

mcdurdin added a commit that referenced this pull request Jan 28, 2026
When normalizing, we need to stop processing on an NFC boundary, not an
NFD boundary, to support normalizations such as in Bengali, where
appending `U+09D7` to a context of `U+0995 U+09C7` should result in
`U+0995 U+09CC`.

The specification is unclear on this; see https://unicode-org.atlassian.net/browse/CLDR-19218

This also updates the ldml keyboard unit test suite to support running
in full NFC mode (used in all Engine implementations) as well retaining
the NFD mode (now only used by the debugger).

Side note: the Bengali normalization failure case was picked up by the
improvements to the unit test suite, proving once again that good tests
are so valuable.

Fixes: #15491
Fixes: #15505
Follows: #15488
Relates-to: CLDR-19218
mcdurdin added a commit that referenced this pull request Jan 28, 2026
When normalizing, we need to stop processing on an NFC boundary, not an
NFD boundary, to support normalizations such as in Bengali, where
appending `U+09D7` to a context of `U+0995 U+09C7` should result in
`U+0995 U+09CC`.

The specification is unclear on this; see https://unicode-org.atlassian.net/browse/CLDR-19218

This also updates the ldml keyboard unit test suite to support running
in full NFC mode (used in all Engine implementations) as well retaining
the NFD mode (now only used by the debugger).

Side note: the Bengali normalization failure case was picked up by the
improvements to the unit test suite, proving once again that good tests
are so valuable.

Fixes: #15491
Fixes: #15505
Follows: #15488
Relates-to: CLDR-19218
mcdurdin added a commit that referenced this pull request Jan 28, 2026
When normalizing, we need to stop processing on an NFC boundary, not an
NFD boundary, to support normalizations such as in Bengali, where
appending `U+09D7` to a context of `U+0995 U+09C7` should result in
`U+0995 U+09CC`.

The specification is unclear on this; see https://unicode-org.atlassian.net/browse/CLDR-19218

This also updates the ldml keyboard unit test suite to support running
in full NFC mode (used in all Engine implementations) as well retaining
the NFD mode (now only used by the debugger).

Side note: the Bengali normalization failure case was picked up by the
improvements to the unit test suite, proving once again that good tests
are so valuable.

Fixes: #15491
Fixes: #15505
Follows: #15488
Relates-to: CLDR-19218
test_actions_normalize(
"One backspace to delete last NFD character (#15487)",
/* app context pre transform: */ u"abcê", // NFC
/* cached context post transform: */ u"abce",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the app context pre transform is the same as l.327 (both have NFC \u00EA),
shouldn't cached context post transform match l.328? (u"abce\u0323\u0302")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core/ Keyman Core fix has-user-test user-test-required User tests have not been completed

Projects

Status: Todo

Development

Successfully merging this pull request may close these issues.

bug(core): backspace with LDML keyboard deletes entire context

7 participants