@Drodt Drodt commented May 6, 2025

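Previously, Hayagriva did not compare UTF-8 strings correctly (#193). The cause is that Rust's default string comparison orders strings by their UTF-8 bytes, which does not match locale-aware alphabetical order for non-ASCII characters. This PR addresses the issue by using rust_icu_ucol for comparison, with the current locale.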
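For illustration, here is a minimal sketch of the underlying behavior using only the standard library. The function name `sorted_bytewise` and the sample names are made up for this example and are not taken from Hayagriva or from #193.

```rust
/// Sorts names with Rust's default `Ord` for strings, which compares
/// the UTF-8 bytes and knows nothing about locales.
/// (Hypothetical helper, for illustration only.)
fn sorted_bytewise(names: &[&str]) -> Vec<String> {
    let mut out: Vec<String> = names.iter().map(|s| s.to_string()).collect();
    out.sort();
    out
}

fn main() {
    let names = ["Zimmer", "Äpfel", "Apfel"];
    // 'Ä' is encoded as the bytes 0xC3 0x84, which compare greater than
    // any ASCII letter, so "Äpfel" ends up after "Zimmer".
    println!("{:?}", sorted_bytewise(&names));
    // prints ["Apfel", "Zimmer", "Äpfel"]
}
```

A German reader would expect "Äpfel" next to "Apfel", which is what a locale-aware collator provides.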

Fixes #193 and adds one test to citeproc-pass.

Adds rust_icu_ucol as a dependency. Problem: icu must be installed on the user's system. How do we handle this?

I might have missed some comparison in csl_cmp.

@Drodt Drodt marked this pull request as draft May 6, 2025 11:40

Drodt commented May 14, 2025

BTW: This is only a draft because I don't think it should be merged without discussing the added dependency.

Any opinion on this, @reknih @PgBiel?

PgBiel commented May 23, 2025

> Problem: icu must be installed on the user's system. How do we handle this?

It's a non-starter for us. It won't be installed on the Typst web app for example.

icu4x [1] is an alternative. However, there are concerns about the size of the collation data it pulls in, and how this could affect the size of, e.g., the Typst binary and the Typst web app.

One alternative that was considered is to delegate this job to a trait for localized sorting and let downstream users (i.e. Typst) implement it. We're not fully decided on this, but it's at least better than the status quo.
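A rough sketch of what such a trait-based design could look like is given here. All names (`StringCollator`, `BytewiseCollator`, `CaseInsensitiveCollator`, `sort_keys`) are hypothetical and not part of Hayagriva's API; the case-insensitive implementation is only a stand-in for a real locale-aware collator supplied by the caller.

```rust
use std::cmp::Ordering;

/// Hypothetical trait: the library asks the caller how to order two strings.
pub trait StringCollator {
    fn compare(&self, a: &str, b: &str) -> Ordering;
}

/// Fallback that keeps plain byte-wise comparison.
pub struct BytewiseCollator;

impl StringCollator for BytewiseCollator {
    fn compare(&self, a: &str, b: &str) -> Ordering {
        a.cmp(b)
    }
}

/// Toy caller-side implementation that ignores case. A real one would
/// wrap a locale-aware collator instead.
pub struct CaseInsensitiveCollator;

impl StringCollator for CaseInsensitiveCollator {
    fn compare(&self, a: &str, b: &str) -> Ordering {
        a.to_lowercase().cmp(&b.to_lowercase())
    }
}

/// Hypothetical library-side helper: sorts keys with whatever collator
/// the caller passes in.
pub fn sort_keys(keys: &mut [String], collator: &dyn StringCollator) {
    keys.sort_by(|a, b| collator.compare(a, b));
}

fn main() {
    let mut keys = vec!["apple".to_string(), "Banana".to_string()];

    sort_keys(&mut keys, &BytewiseCollator);
    println!("{:?}", keys); // uppercase 'B' sorts before lowercase 'a'

    sort_keys(&mut keys, &CaseInsensitiveCollator);
    println!("{:?}", keys); // "apple" now comes first
}
```

With this shape, the library itself would not need to ship collation data; the cost of locale support would fall on whichever application chooses to provide a collator.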

Footnotes

  1. https://github.com/unicode-org/icu4x

Drodt commented May 26, 2025

For now, I switched to icu4x. Let me know if the Typst team wants another solution.

@Drodt Drodt marked this pull request as ready for review June 26, 2025 08:22

Development

Successfully merging this pull request may close these issues.

Hayagriva sorts Biblatex UTF8 entries incorrectly