Contributor

@zetanumbers zetanumbers commented Feb 27, 2022

Previously, the character was converted to a UTF-16 code unit, so Unicode characters above U+FFFF would have been truncated. Now it's possible to pass any Unicode character.
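
To illustrate the difference, here is a minimal Rust sketch, not the runtime's actual code. It assumes the old path kept only the low 16 bits of the argument (one UTF-16 code unit), while the new path treats the whole value as a Unicode code point. Both function names are made up for this example.

```rust
/// Assumed old behavior: keep only the low 16 bits (a single UTF-16 code unit).
fn truncate_to_utf16_unit(value: u32) -> u16 {
    value as u16
}

/// Assumed new behavior: interpret the full value as a Unicode code point.
/// Returns None for values that are not valid scalar values (e.g. surrogates).
fn as_code_point(value: u32) -> Option<char> {
    char::from_u32(value)
}

fn main() {
    let grinning_face: u32 = 0x1F600; // a code point above U+FFFF

    // The upper bits are lost, leaving an unrelated private-use code unit.
    println!("truncated: {:#X}", truncate_to_utf16_unit(grinning_face));

    // The full code point survives.
    println!("code point: {:?}", as_code_point(grinning_face));
}
```

Characters at or below U+FFFF (other than surrogates) come out the same either way, so only the higher planes are affected.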

@aduros
Owner

aduros commented Mar 1, 2022

We should probably match the same behavior as C's printf, which I think truncates to 8 bits for %c.

For me this program:

printf("Hello %c\n", 12345678);

Prints Hello N.
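
The same arithmetic can be checked outside of C. The sketch below is only an illustration in Rust of the conversion that `%c` applies (the `int` argument is converted to `unsigned char`); the function name `printf_c_byte` is hypothetical.

```rust
/// Mirrors what C's printf does for %c: keep only the low 8 bits of the int.
fn printf_c_byte(value: i32) -> u8 {
    value as u8
}

fn main() {
    // 12345678 is 0xBC614E, so the low byte is 0x4E, which is ASCII 'N'.
    let byte = printf_c_byte(12345678);
    println!("Hello {}", byte as char);
}
```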

@zetanumbers
Contributor Author

> We should probably match the same behavior as C's printf, which I think truncates to 8 bits for %c.

But why? It's not like we are trying to implement libc. With this PR we would be able to pass Rust's char, for example.

@zetanumbers
Contributor Author

Btw, if we truncate, should we truncate to 7 bits for ASCII, or truncate to 8 bits and allow some UTF-16 char codes? Aren't non-ASCII characters in printf OS-dependent?
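
For concreteness, the two options could look like this. It is a rough sketch with hypothetical helper names, not a proposal for the runtime code.

```rust
/// Option A: keep 7 bits, so every result is plain ASCII.
fn truncate_7_bits(value: u32) -> u8 {
    (value & 0x7F) as u8
}

/// Option B: keep 8 bits, so values 0x80..=0xFF pass through and their
/// meaning depends on how the consumer interprets them.
fn truncate_8_bits(value: u32) -> u8 {
    (value & 0xFF) as u8
}

fn main() {
    let e_acute = 'é' as u32; // U+00E9

    // 7 bits turns 0xE9 into 0x69, which is ASCII 'i'.
    println!("7 bits: {:#X}", truncate_7_bits(e_acute));

    // 8 bits keeps 0xE9 unchanged.
    println!("8 bits: {:#X}", truncate_8_bits(e_acute));
}
```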

@aduros
Owner

aduros commented Mar 2, 2022

Could we truncate to 8 bits? libc printf semantics aren't perfect, but at least they're well-defined and we don't need to document our own special handling of certain features.

For printing unicode characters, isn't it possible to use %s instead of %c? Or just format the string directly in Rust.

@zetanumbers
Contributor Author

zetanumbers commented Mar 4, 2022

> Could we truncate to 8 bits? libc printf semantics aren't perfect, but at least they're well-defined and we don't need to document our own special handling of certain features.

Until then, and even if we do truncate to 8 bits, could we handle non-ASCII chars as Unicode code points instead of UTF-16 char codes?

@zetanumbers
Contributor Author

> For printing unicode characters, isn't it possible to use %s instead of %c? Or just format the string directly in Rust.

The current %s implementation only works on null-terminated ASCII strings.

https://github.com/aduros/wasm4/blob/main/runtimes/web/src/runtime.ts#L272
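
To show why that matters for non-ASCII text, here is a small Rust sketch. It assumes a reader that scans memory up to the first zero byte and turns each byte into one character. That is a simplification for illustration, and `read_c_string` and `decode_bytewise` are hypothetical names rather than functions from the runtime.

```rust
/// Returns the bytes from `start` up to (not including) the first zero byte.
fn read_c_string(memory: &[u8], start: usize) -> &[u8] {
    let tail = &memory[start..];
    let len = tail.iter().position(|&b| b == 0).unwrap_or(tail.len());
    &tail[..len]
}

/// Treats every byte as its own character, which is only correct for ASCII.
fn decode_bytewise(bytes: &[u8]) -> String {
    bytes.iter().map(|&b| b as char).collect()
}

fn main() {
    // "café" encoded as UTF-8, followed by a terminator and unrelated bytes.
    let memory = b"caf\xC3\xA9\0ignored";
    let raw = read_c_string(memory, 0);

    // Byte-by-byte decoding splits the two-byte 'é' into two wrong characters.
    println!("bytewise: {}", decode_bytewise(raw));

    // Decoding the same bytes as UTF-8 recovers the original text.
    println!("utf-8:    {}", String::from_utf8_lossy(raw));
}
```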

To do the equivalent of tracef manually in Rust, you would:

  1. Create an empty string;
  2. Write the other substrings, numbers, etc. to this string piece by piece. Meanwhile the String grows (reallocates), gradually increasing its capacity;
  3. Flush the whole string onto a single line via traceUtf8;
  4. Deallocate the string.
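
Roughly, the steps above look like this. The sketch uses `std` so that it runs on a desktop target, and `trace_utf8` is a hypothetical stand-in for the runtime's traceUtf8 import; `build_line` is likewise just an illustrative helper.

```rust
use std::fmt::Write;

// Hypothetical stand-in for the runtime's traceUtf8 import; a real cartridge
// would call the import instead of printing to stdout.
fn trace_utf8(line: &str) {
    println!("{}", line);
}

// Steps 1 and 2: start with an empty String and append to it piece by piece.
// Each append may reallocate as the capacity grows.
fn build_line(name: &str, score: u32) -> String {
    let mut line = String::new();
    line.push_str("player ");
    line.push_str(name);
    write!(line, " scored {}", score).expect("writing to a String cannot fail");
    line
}

fn main() {
    let line = build_line("zoë", 7);
    trace_utf8(&line); // Step 3: flush the whole line at once.
    drop(line); // Step 4: free the heap buffer (made explicit here).
}
```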

This pulls some runtime (~7 KiB with all code optimizations enabled) into the binary. It could have been better (now only ~2 KiB) if there were a way to flush the line in parts, requiring no allocations.
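
As a sketch of what flushing in parts could look like: a sink that implements `fmt::Write` receives each formatted fragment as it is produced, so no intermediate String is built. The partial-flush call does not exist in the runtime today, so `PartialTracer` and what it does with each fragment are purely hypothetical. Here it prints to stdout and counts bytes so the example can run anywhere; a cartridge would likely use `core::fmt::Write` instead.

```rust
use std::fmt::{self, Write};

/// Hypothetical sink that forwards each fragment instead of buffering it.
struct PartialTracer {
    bytes_flushed: usize,
}

impl Write for PartialTracer {
    fn write_str(&mut self, fragment: &str) -> fmt::Result {
        // Assumption: a partial-trace import would take the fragment's
        // pointer and length here. Printing stands in for that call.
        print!("{}", fragment);
        self.bytes_flushed += fragment.len();
        Ok(())
    }
}

fn main() {
    let mut out = PartialTracer { bytes_flushed: 0 };

    // The formatting machinery calls write_str once per fragment.
    write!(out, "x = {}, c = {}", 42, 'é').unwrap();
    println!();

    println!("bytes flushed: {}", out.bytes_flushed);
}
```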

2 participants