Contributor

@jnm2 jnm2 commented Jul 31, 2025

Proposed replacement for @Nigel-Ecma's #1287 and @gafter's #1297

Will fix #1385

I was trying not to end up creating new names like non_array_non_nullable_reference_type and non_array_nullable_reference_type which don't sound like core concepts we'd want to be referencing elsewhere. (Cf #1287)

TODO:

  • Add samples for the grammar tester. (Getting help from @Nigel-Ecma offline. I would like to create tests that show that string?[] and int[]?[] were disallowed prior to the grammar changes in this PR.)

Comment on lines 56 to +58
 array_type
-    : non_array_type rank_specifier+
+    : array_type nullable_type_annotation rank_specifier+
+    | non_array_type rank_specifier+
Contributor Author

This has the property that when parsing string?[][,]?[,,][,,,]? we end up with two array_type nodes: one being string?[][,]?[,,][,,,], and one being the inner string?[][,].

@Nigel-Ecma It's not mutual left recursion! 😁
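
To make the nesting concrete, here is a minimal sketch of how the revised array_type rule groups rank specifiers. It is written in Python purely for brevity. The function name `parse_array_type` and the dictionary node shape are hypothetical; they are not part of the PR or of the grammar tester. The sketch assumes the base type contains no brackets (so no generic type arguments) and treats a `?` on the base type as part of non_array_type.

```python
import re

# Matches one rank specifier, such as [] or [,,], at the end of the text.
RANK_AT_END = re.compile(r"\[,*\]$")

def parse_array_type(text):
    """Parse `text` as an array_type under the revised rule:

        array_type
            : array_type nullable_type_annotation rank_specifier+
            | non_array_type rank_specifier+

    Returns a nested dict (a hypothetical node shape) and raises
    ValueError if `text` does not end in at least one rank specifier.
    """
    # Peel the trailing rank_specifier+ run off the right-hand end.
    ranks = []
    rest = text
    while True:
        match = RANK_AT_END.search(rest)
        if match is None:
            break
        ranks.insert(0, match.group())
        rest = rest[: match.start()]
    if not ranks:
        raise ValueError(f"not an array_type: {text!r}")

    # First alternative: what remains is `array_type ?`.
    if rest.endswith("?") and rest[:-1].endswith("]"):
        return {
            "ranks": ranks,
            "element": parse_array_type(rest[:-1]),
            "element_nullable": True,
        }
    # Second alternative: what remains is a non_array_type, e.g. `string?`.
    return {"ranks": ranks, "element": rest, "element_nullable": False}

# The type from the comment above, without its final `?`
# (that last annotation applies to the outer array_type as a whole).
tree = parse_array_type("string?[][,]?[,,][,,,]")
```

Under these assumptions, `tree` is the outer node with ranks `[,,][,,,]`, and `tree["element"]` is the inner node with ranks `[][,]` and element `string?`, giving the two array_type nodes described above.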

@jnm2 jnm2 force-pushed the jnm2/arrays_of_nrt branch from 8b0a578 to aa5e4b5 Compare July 31, 2025 01:29
Contributor

@Nigel-Ecma Nigel-Ecma left a comment

Nothing wrong with being concise; unfortunately, some of the conciseness is in the wrong places, and there are concerns that this isn’t complete coverage of the feature.

@jnm2 jnm2 marked this pull request as ready for review October 31, 2025 00:52
@jnm2 jnm2 requested a review from Nigel-Ecma October 31, 2025 00:53
@jnm2 jnm2 added the meeting: discuss This issue should be discussed at the next TC49-TG2 meeting label Oct 31, 2025
Contributor Author

jnm2 commented Oct 31, 2025

I have hit my time limit for the week but will try next week to understand the tools to set up grammar tests and run them.

Contributor

@Nigel-Ecma Nigel-Ecma left a comment

Unfortunately I think there is quite a way to go yet.

Overall, reading this I get a sense that too much weight is being placed on the number of occurrences of array_type and the change in shape of the array. These are really illusory changes (whether we think they should be or not) due to the erasure of nullability: meaning is not changing, even though adding or removing nullable annotations has the very non-illusory impact of generating compile-time errors!

Trying to think of an illustration the best I came up with is language translation:

The new road avoids the swamp.

In Irish is:

Seachnaíonn an bóthar nua an portach.

These have the same meaning even if the structure has changed – translate the Irish word-by-word and you get:

Avoids the road new the swamp.

Arrays of nullable arrays don’t change the meaning of the code; what they change is how the code must be written (e.g. rearranging indices, as in “avoids the road new the swamp”) to achieve the same meaning.

Is this PR getting this across? Should it?

The nullable annotation `?` may be placed on an array type, as in `T[R]?`. Such an array type may be used as the element type of another array type, as in `T[R]?[R₂]`.
The intervening nullable annotation `?` separates the grammar into multiple *array_types*. `T[R₁][R₂]?[R₃][R₄]` is not a single *array_type* with four ranks. Rather, it is two *array_type*s, each of which has two ranks. The outer *array_type* has ranks `[R₃][R₄]`, read left to right, and its element type is `T[R₁][R₂]?`. The element type is another *array_type* with a nullable annotation, and this inner *array_type* has ranks `[R₁][R₂]`, read left to right.
Contributor

Saying the nullable annotation “separates the grammar” is incorrect; from a grammar perspective, the rule allows an array_type to contain another array_type.

> *Note*: This is the sole exception to the general rule that the meaning of a program remains the same when nullable reference types annotations are removed. *end note*
Contributor

This is a normative part of the spec so cannot be a Note (which is just informative).

You also need to change wherever this “general rule” is relied upon or mentioned (e.g. §8.9.5.3) because the “exception” invalidates the rule – it can in general never be relied upon.

Every reference type which contains nullable annotations has a corresponding unannotated type with no semantic difference (§8.9.1). The corresponding unannotated type for an array of nullable arrays is a single array type which recursively collects all the ranks of all the nested *array_type*s.
Contributor

With array types there is not in general a single “corresponding nullable type” (§8.9.1) – there are multiple, e.g.:

static void ArrayEquivalentTypes(int[][,]?[,,]?[,,,]?[,,,,] a1)
{
    int[,,][][,]?[,,,]?[,,,,] a2 = a1;

    int[,,,][,,][][,]?[,,,,] a3_1 = a1;
    int[,,,][,,][][,]?[,,,,] a3_2 = a2;

    int[,,,,][,,,][,,][][,] a4_1 = a1;
    int[,,,,][,,,][,,][][,] a4_2 = a2;
    int[,,,,][,,,][,,][][,] a4_3_1 = a3_1;
    int[,,,,][,,,][,,][][,] a4_3_2 = a3_2;

    int[,,][][,]?[,,,]?[,,,,] b3_1 = a3_1;
    int[,,][][,]?[,,,]?[,,,,] b3_2 = a3_2;
    int[,,][][,]?[,,,]?[,,,,] b4_1 = a4_1;
    int[,,][][,]?[,,,]?[,,,,] b4_2 = a4_2;
    int[,,][][,]?[,,,]?[,,,,] b4_3_1 = a4_3_1;
    int[,,][][,]?[,,,]?[,,,,] b4_3_2 = a4_3_2;

    int[,,,][,,][][,]?[,,,,] c4_1 = a4_1;
    int[,,,][,,][][,]?[,,,,] c4_2 = a4_2;
    int[,,,][,,][][,]?[,,,,] c4_3_1 = a4_3_1;
    int[,,,][,,][][,]?[,,,,] c4_3_2 = a4_3_2;

    // etc...
}

This bundle of joy ;-) has four array types that are semantically equivalent to each other, and that’s not the limit by a long shot.

It needs to be specified that any two types (which need not be distinct) selected from a semantically equivalent set are implicitly convertible to each other. (Depending on implementation some, but not all, of the conversions may elicit a nullable warning.)
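
One way to picture a “semantically equivalent set” is as the set of spellings that share the same outermost-first rank sequence. The sketch below is an assumption about how that could be checked, not wording proposed for the spec. It is in Python only for brevity, and `rank_sequence` is a hypothetical helper. It assumes the element type contains no brackets and leaves any annotation on the element type itself untouched.

```python
import re

def rank_sequence(type_text):
    """Return (element type, rank specifiers ordered outermost first),
    ignoring nullable annotations placed on array types."""
    first = type_text.index("[")
    element = type_text[:first]
    # Drop a trailing `?` that annotates the whole type.
    tail = type_text[first:].rstrip("?")
    # A `?` between rank specifiers ends one array_type's run of ranks;
    # each run belongs to an array type enclosing the runs to its left,
    # so the runs are visited right to left.
    ranks = []
    for run in reversed(tail.split("?")):
        ranks.extend(re.findall(r"\[,*\]", run))
    return element, ranks

# The four spellings used in ArrayEquivalentTypes above.
spellings = [
    "int[][,]?[,,]?[,,,]?[,,,,]",
    "int[,,][][,]?[,,,]?[,,,,]",
    "int[,,,][,,][][,]?[,,,,]",
    "int[,,,,][,,,][,,][][,]",
]
sequences = {tuple(rank_sequence(s)[1]) for s in spellings}
print(len(sequences))  # 1: all four spellings share one rank sequence
```

Two spellings would then belong to the same set exactly when `rank_sequence` returns the same pair for both. Whether the spec should define equivalence this way is a question for the PR, not something this sketch settles.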

And while we are here, consider:

static void ArrayDifferentTypesSameLiteral()
{
    int[]?[,] a1 =
        {
            {   new int[] { 1, 2 },
                new int[] { 3, 4 }
            }
        };
    int[,][] a2 =
        {
            {   new int[] { 1, 2 },
                new int[] { 3, 4 }
            }
        };
}

Different array types which are semantically equivalent can be initialized using the same array literal. That will need to be specified.

The unannotated array type of an array of nullable arrays cannot be found by simply removing the nullable annotations `?` from the grammar and reparsing. This is because array ranks are read left to right while nested *array_type* productions are read outside-in, with outer array type ranks to the right, inner array type ranks to the left. Thus, the type `T[R₁][R₂]?[R₃][R₄]` has an unannotated array type of `T[R₃][R₄][R₁][R₂]`. To obtain the unannotated array type of an array of nullable arrays, first take the ranks on the outermost array type in order from left to right, then move to the array type inside the nullable element type and take its ranks in order from left to right. Repeat until the element type is no longer a nullable array type. Then take this remaining element type and place on it all the collected ranks in order from first to last to obtain the unannotated array type.
Contributor

I’m not yet convinced regarding the algorithm presentation; maybe bullets, numbered steps, or pseudo-code would help?

Regardless, at minimum I think the example should show the result of each iteration.
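
As one possible shape for such a presentation, the following sketch walks the procedure from the quoted paragraph and records the result of each iteration. It is in Python only for brevity. `unannotate` is a hypothetical name, and `R1`..`R4` stand in for the diff’s R₁..R₄. It assumes the input is an array type with no trailing `?` and that the innermost element type contains no brackets; an annotation on that innermost element type is left as-is.

```python
import re

# A run of one or more rank specifiers at the very end of the text.
TRAILING_RANKS = re.compile(r"((?:\[[^\[\]]*\])+)$")

def unannotate(type_text):
    """Return (steps, unannotated type).

    Each step is (element type at this iteration, ranks collected so far).
    """
    remaining = type_text
    collected = ""
    steps = []
    while True:
        match = TRAILING_RANKS.search(remaining)
        if match is None:
            break
        # Take this array type's ranks in order, left to right.
        collected += match.group(1)
        remaining = remaining[: match.start()]
        steps.append((remaining, collected))
        # Continue only while the element type is a nullable array type.
        if not remaining.endswith("]?"):
            break
        remaining = remaining[:-1]
    # Place all collected ranks on the remaining element type.
    return steps, remaining + collected

steps, result = unannotate("T[R1][R2]?[R3][R4]")
for element, ranks in steps:
    print(f"element type: {element:<12} collected ranks: {ranks}")
print(result)  # T[R3][R4][R1][R2]
```

For the type in the quoted paragraph this records two iterations: first element type `T[R1][R2]?` with collected ranks `[R3][R4]`, then element type `T` with collected ranks `[R3][R4][R1][R2]`.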

> ```
>
> *end example*
The syntactic distinction between a *nullable reference type* and its corresponding *non-nullable reference type* enables a compiler to generate diagnostics. A compiler must allow the *nullable_type_annotation* as defined in [§8.2.1](types.md#821-general). The diagnostics must be limited to warnings. Neither the presence or absence of nullable annotations, nor the state of the nullable context can change the compile time or runtime behavior of a program except for changes in any diagnostic messages generated at compile time, with one exception: arrays of nullable arrays are not parsed as a single *array_type*, but rather as multiple nested *array_type*s. The corresponding *non-nullable reference type* of an array of nullable arrays is not the single array type that would be parsed if the nullable annotations were removed; see §arrays-of-nullable-arrays.
Contributor

I will surprise everybody ;-) and say “compiler” -> “implementation”.

Not changing the compile-time behavior is no longer correct: adding or removing a nullable annotation on an array type without changing all associated index operations will produce compile-time errors.

I’m not sure you can say changing the number of array_type productions in the parse is an exception per se: the annotations do not change the described array shape in any way (one might argue that the existing description of arrays is less than clear on the shape; if so, adding nullable arrays is the time to fix that).

Development

Successfully merging this pull request may close these issues.

Arrays of nullable references

3 participants