@alexrutar
Contributor

This PR implements three parsing fixes.

  1. Fixes (Potentially) incorrect handling of trailing \ in MultiPattern::reparse #66: we now check for a trailing backslash \ when running 'reparse' to correctly force a complete rescore (rather than just an update).
  2. Updates the ASCII parsing code to use the ASCII is_whitespace checks (same behaviour as with the Unicode handling).
  3. Fixes a bug in the Unicode parsing code, where a \ push had to be deferred until the beginning of the subsequent loop iteration. This caused inputs such as foö\ bar to be handled incorrectly.
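
To illustrate fixes 1 and 3, here is a minimal standalone sketch. It is not the code from this PR: `is_pure_extension` and `split_atoms` are hypothetical helpers, and the escaping rule (a \ followed by whitespace yields literal whitespace, any other \ is kept as-is) is an assumption made for the example. The first function shows why a trailing \ presumably rules out an incremental update: appending a space would turn that \ into an escape and change the meaning of the already-parsed text. The second shows the deferred push: the \ is only emitted at the start of the next iteration, once we know what follows it.

```rust
/// Hypothetical check (not the PR's code): can `new` be treated as `old`
/// with characters appended, so that an incremental update is enough?
fn is_pure_extension(old: &str, new: &str) -> bool {
    // A trailing backslash may combine with the next appended character,
    // so the old parse can no longer be reused.
    new.starts_with(old) && !old.ends_with('\\')
}

/// Hypothetical splitter (not the PR's code): splits on unescaped
/// whitespace; `\` followed by whitespace becomes literal whitespace.
fn split_atoms(pattern: &str) -> Vec<String> {
    let mut atoms = Vec::new();
    let mut current = String::new();
    let mut saw_backslash = false;

    for c in pattern.chars() {
        if saw_backslash {
            saw_backslash = false;
            if c.is_whitespace() {
                // Escaped whitespace: keep it and drop the backslash.
                current.push(c);
                continue;
            }
            // Not an escape: emit the deferred backslash first,
            // then handle `c` normally below.
            current.push('\\');
        }
        match c {
            // Defer the push until we have seen the next character.
            '\\' => saw_backslash = true,
            c if c.is_whitespace() => {
                if !current.is_empty() {
                    atoms.push(std::mem::take(&mut current));
                }
            }
            c => current.push(c),
        }
    }
    // A trailing backslash has nothing to escape, so keep it literally.
    if saw_backslash {
        current.push('\\');
    }
    if !current.is_empty() {
        atoms.push(current);
    }
    atoms
}
```

Because the loop iterates over `chars()` rather than bytes, the multi-byte ö in foö\ bar is handled the same way as an ASCII character would be.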

I've also added quite a few tests to check that the parsing code behaves as expected.

This was previously #67, but the implementation is now slightly better. I'm opening a new PR because the other one is filled with unnecessary details.

This also fixes an unrelated bug when parsing needles which contain non-ASCII Unicode.
Development

Successfully merging this pull request may close these issues.

(Potentially) incorrect handling of trailing \ in MultiPattern::reparse

1 participant