Skip to content

Conversation

@sternenseemann
Copy link
Member

No description provided.

The representation with Strings is tricky of course since identifiers in
Nix are technically byte strings (if you allow quoting). For our
purposes, it is practical to assume that all identifiers are ASCII or
UTF-8 encoded. With quoting, we can represent any Unicode character
except '\0'.
Instead of properties, give some examples that are hopefully clearer. We
can keep the properties and actually execute them as part of a test
suite.
It doesn't make sense to prescribe some kind of “correct” behavior with
NUL bytes, so it's best not to test it.
Currently this is practically > 95% which isn't great. Presumably, the
examples for simple identifiers are very short as well.
With this, we can get below 70% generated identifiers that don't need
quoting.
@sternenseemann sternenseemann force-pushed the language-nix-identifier-wrangling branch from c99580b to 80b555b Compare September 12, 2025 17:44
@sternenseemann sternenseemann marked this pull request as ready for review September 12, 2025 17:46
@sternenseemann
Copy link
Member Author

Some comments:

  • illegalSimpleIdentifiers is exposed since toNixName will need to know about these in order to solve How to deal with packages who's name is not a valid Nix identifier? #164 (stay tuned).
  • I'll work on Identifier handling a bit more in the future, e.g. fixing escaping behavior. However, this doesn't really affect cabal2nix/hackage2nix, so I'm going to do that separately.
  • Major version bump is due to a change in behavior in quote, needsQuoting and Pretty. Maybe a bit extreme. (What is a change in behavior, what is a bug fix?!)
  • Documentation is a bit of a mess now. I can offer to rework/restructure it. Presumably we should give a high level overview in the beginning to avoid spreading out related information between needsQuoting, illegalSimpleIdentifiers, parseSimpleIdentifier etc.
  • This doesn't really do anything, but already creates a huge diff, so I would like to review/merge this separately.

Copy link
Contributor

@wolfgangwalther wolfgangwalther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nixpkgs diff looks good.

Also looked through the PR, from what I understand, it looks good to me. Never used QuickCheck, so it's more learning that anything else here.

Copy link
Member

@maralorn maralorn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it!

@sternenseemann sternenseemann force-pushed the language-nix-identifier-wrangling branch from 80b555b to 96cb931 Compare September 15, 2025 13:18
This relates to #164, as there
are packages like `assert` which clash with Nix keywords. However, this
does not actually provide us with a solution since quoting isn't
possible in some contexts (e.g. function arguments).
This increases correctness as toNixName is already used to generate the
package expression arguments. If toNixName and attr were to diverge,
packages would fail to resolve their dependencies correctly.

A side effect of this change is that `pPrint attr` only quotes
attribute names as needed, resulting in a huge diff for
hackage-packages.nix.
@sternenseemann sternenseemann force-pushed the language-nix-identifier-wrangling branch from 96cb931 to 17586e3 Compare September 15, 2025 13:26
@sternenseemann sternenseemann merged commit bfacb0b into master Sep 15, 2025
8 checks passed
@sternenseemann sternenseemann deleted the language-nix-identifier-wrangling branch September 15, 2025 13:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants