String function behavior depends on target (Unicode support) #659

@ov7a

The C target treats strings as ASCII:

t[i] = kk_ascii_tolower(s[i]);

while the JS target treats them as Unicode:
https://tc39.es/ecma262/multipage/text-processing.html#sec-string.prototype.tolowercase

Simple reproducer:

$ echo 'println("Добар Дан".to-lower)' | koka --target c
...
Добар Дан
$ echo 'println("Добар Дан".to-lower)' | koka --target js
...
добар дан
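
The C-target output is what byte-wise ASCII lowercasing of a UTF-8 string would produce. Below is a minimal sketch, not the actual runtime code: `ascii_tolower` is a hypothetical stand-in for `kk_ascii_tolower`, assumed to map only `'A'..'Z'`, and `ascii_lower_copy` is a made-up wrapper for illustration. Every byte of a Cyrillic letter in UTF-8 is at or above 0x80, so none of them falls in the ASCII uppercase range and the string comes back unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for kk_ascii_tolower (assumption):
   lowercases only the ASCII letters 'A'..'Z', leaves every other byte alone. */
static char ascii_tolower(char c) {
    return (c >= 'A' && c <= 'Z') ? (char)(c - 'A' + 'a') : c;
}

/* Copies src into dst, lowercasing byte by byte.
   dst must have room for strlen(src) + 1 bytes. */
static void ascii_lower_copy(char *dst, const char *src) {
    assert(dst != NULL && src != NULL);
    size_t i = 0;
    for (; src[i] != '\0'; i++) {
        dst[i] = ascii_tolower(src[i]);
    }
    dst[i] = '\0';
}
```

Calling `ascii_lower_copy(out, "Добар Дан")` on a UTF-8 encoded literal leaves `out` byte-identical to the input, while `"Hello"` becomes `"hello"`.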

The trim operation is also affected: the JS target trims Unicode whitespace, while the C target does not:

$ echo 'println("a\xA0".trim ++ ".")' | koka --target c
...
a .
$ echo 'println("a\xA0".trim ++ ".")' | koka --target js
...
a.
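
The same mechanism would explain the trim difference. U+00A0 (no-break space) is encoded in UTF-8 as the two bytes 0xC2 0xA0, and neither is an ASCII whitespace byte. The sketch below assumes the C target only recognises ASCII whitespace, which the output above suggests but which I have not verified in the runtime source. `is_ascii_space` and `ascii_trim_end_len` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true only for ASCII whitespace
   (space, \t, \n, \v, \f, \r). */
static int is_ascii_space(unsigned char c) {
    return c == ' ' || (c >= '\t' && c <= '\r');
}

/* Returns the length of s after dropping trailing ASCII whitespace.
   Bytes >= 0x80, such as the UTF-8 encoding of U+00A0, are never dropped. */
static size_t ascii_trim_end_len(const char *s) {
    assert(s != NULL);
    size_t n = strlen(s);
    while (n > 0 && is_ascii_space((unsigned char)s[n - 1])) {
        n--;
    }
    return n;
}
```

With this helper, `"a  "` trims to length 1, but `"a\xC2\xA0"` keeps all 3 bytes, matching the `a .` output from the C target. A Unicode-aware trim would need to decode code points and test them against the Unicode whitespace set instead.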

As a user, I'd expect that all targets support Unicode.
