@kaby76 kaby76 commented Dec 1, 2025

This PR fixes #4694. Both the build and the Eiffel grammar are broken.

Background

Windows does not reliably support symbolic links in Git checkouts. I did a comprehensive search to find symbolic links in the repo so that they could be removed (for f in `find . -type f | fgrep -v '.git'`; do v=`git ls-files --stage $f | awk '{print $1}'`; if [ "$v" == 120000 ]; then echo $f; fi; done, or alternatively on Linux, find . -type l). Currently, two grammars contain symbolic links: vhdl2008 and eiffel. The vhdl2008 grammar is being fixed in PR #4693. The Eiffel grammar is corrected by this PR.
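
The mode test in the one-liner above can be sketched in Python as follows. This is a minimal illustration, not the check used in the repo: the function names `symlink_paths` and `symlinks_in_repo` are hypothetical, and the sketch assumes the usual `git ls-files --stage` line layout of mode, object name, stage number, a tab, then the path. Git records symbolic links with mode 120000.

```python
import subprocess

SYMLINK_MODE = "120000"  # mode git stores in the index for symbolic links


def symlink_paths(stage_output: str) -> list[str]:
    """Return paths whose index mode marks them as symbolic links."""
    paths = []
    for line in stage_output.splitlines():
        if "\t" not in line:
            continue  # skip blank or malformed lines
        meta, path = line.split("\t", 1)
        fields = meta.split()
        if fields and fields[0] == SYMLINK_MODE:
            paths.append(path)
    return paths


def symlinks_in_repo(repo_dir: str = ".") -> list[str]:
    """Run git in repo_dir and list the symbolic links in its index."""
    result = subprocess.run(
        ["git", "ls-files", "--stage"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return symlink_paths(result.stdout)
```

Because this reads the index rather than the working tree, it gives the same answer on a Windows checkout where the links were materialized as ordinary files.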

Changes to the build

The script _scripts/test-static-checks.sh was modified to check for symbolically linked files in any future PRs. If any are found, the build fails. The check runs only on Ubuntu and covers what both Bash find and git report.
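
One way to picture the filesystem half of such a check is the sketch below. It is an assumption about the shape of the check, not the contents of _scripts/test-static-checks.sh: the function names are made up for illustration, and the real script is written in Bash.

```python
import os
import sys


def find_symlinks(root: str) -> list[str]:
    """Walk root without following links and collect every symbolic link."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into git's own metadata.
        if ".git" in dirnames:
            dirnames.remove(".git")
        # Links to directories appear in dirnames, links to files in filenames.
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                found.append(os.path.relpath(full, root))
    return sorted(found)


def check_no_symlinks(root: str) -> int:
    """Return a process exit status: 0 if clean, 1 if any link exists."""
    links = find_symlinks(root)
    for link in links:
        print(f"symbolic link not allowed: {link}", file=sys.stderr)
    return 1 if links else 0
```

A caller would pass the result to `sys.exit`, so that a non-zero status fails the build step.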

The Eiffel grammar did contain a symbolic link, but because the grammar was never integrated into the build correctly, it did not matter. This PR removes the file. Conveniently, the "no-symbolic-links" test added in this PR validates that the grammar has no symbolic links.

To support the changes to the Eiffel grammar, I needed to update the version of the Trash Toolkit to 0.23.28. This version fixes some problems with the analysis of "top-level" .g4 files.

Changes to the Eiffel grammar

I removed the specialized parser driver program from the example/ test files directory. The program does not function as a regression tester. In addition, the app includes another Antlr grammar that overrides a couple of lexer rules: EiffelGrammar.g4 overrides the WhiteSpace and Comment lexer rules with channels. I moved the specialized grammars to the main directory and updated the Trash trgen tool to test both grammar pairs.

The examples and the directory structure for each Eiffel application were kept, but I removed the .png files because they are not really useful for regression testing. It is better to use Trash to display the parse trees (trgen -t CSharp; cd Generated-CSharp; make; trparse ../examples/application.e | trtree) and test them for invariant properties. I added a few more examples and a readme.md.

A pom.xml was added for testing the Eiffel grammar using Maven, but it only tests the EiffelParser.g4/EiffelLexer.g4 pair.

The Eiffel grammar was modified as per comment.

The Eiffel grammar is now target agnostic, and the CSharp and Java ports are implemented. The whole point of having a grammar in target-agnostic form is so that the grammar can be ported to other targets. With the CSharp target, it can be tested using the Trash Toolkit. See the next section for such an analysis.

Ambiguity

The Eiffel grammar is ambiguous and has large max-k's. For example, for input containing the substring "i := 0", here are two parse trees that show ambiguity.

../examples/prog_args.e.d=73.a=3: (class_declaration (class_header (CLASS "class") (class_name (Identifier "PROG_ARGS"))) (inheritance (inherit_clause (INHERIT "inherit") (parent_list (parent (class_type (class_name (Identifier "ARGUMENTS"))))))) (creators (creation_clause (CREATE "create") (creation_procedure_list (creation_procedure (feature_name (Identifier "main")))))) (features (feature_clause (FEATURE "feature") (feature_declaration_list (feature_declaration (new_feature_list (new_feature (extended_feature_name (feature_name (Identifier "main"))))) (declaration_body (feature_value (attribute_or_routine (local_declarations (LOCAL "local") (entity_declaration_list (entity_declaration_group (identifier_list (Identifier "i")) (type_mark (COLON ":") (type (class_or_tuple_type (class_type (class_name (Identifier "INTEGER"))))))))) (feature_body (effective_routine (internal (routine_mark (DO "do")) (compound (instruction (loop (initialization (FROM "from") (compound (instruction (assignment (variable (variable_attribute (feature_name (Identifier "i")))) (COLON_EQUAL ":=") (expression (special_expression (manifest_constant (manifest_value (integer_constant (Integer "0")))))))))) (exit_condition (UNTIL "until") (expression (expression (Identifier "i")) (GT ">") (expression (Identifier "argument_count")))) (loop_body (LOOP "loop") (compound (instruction (expression (expression (Identifier "io")) (DOT ".") (expression (Identifier "put_string")))) (instruction (expression (OPEN_PAREN "(") (expression (expression (expression (expression (expression (special_expression (manifest_constant (manifest_value (manifest_string (Basic_manifest_string "\"Argument \"")))))) (PLUS "+") (expression (expression (Identifier "i")) (DOT ".") (expression (Identifier "out")))) (PLUS "+") (expression (special_expression (manifest_constant (manifest_value (manifest_string (Basic_manifest_string "\": \""))))))) (PLUS "+") (expression (unqualified_call (feature_name (Identifier "argument")) 
(actuals (OPEN_PAREN "(") (actual_list (expression (Identifier "i"))) (CLOSE_PAREN ")"))))) (PLUS "+") (expression (special_expression (manifest_constant (manifest_value (manifest_string (Basic_manifest_string "\"%N\""))))))) (CLOSE_PAREN ")"))) (SEMI_COLON ";") (instruction (assignment (variable (variable_attribute (feature_name (Identifier "i")))) (COLON_EQUAL ":=") (expression (expression (Identifier "i")) (PLUS "+") (expression (special_expression (manifest_constant (manifest_value (integer_constant (Integer "1"))))))))))) (END "end"))))))) (END "end")))))))) (END "end") (EOF ""))
../examples/prog_args.e.d=73.a=4: (class_declaration (class_header (CLASS "class") (class_name (Identifier "PROG_ARGS"))) (inheritance (inherit_clause (INHERIT "inherit") (parent_list (parent (class_type (class_name (Identifier "ARGUMENTS"))))))) (creators (creation_clause (CREATE "create") (creation_procedure_list (creation_procedure (feature_name (Identifier "main")))))) (features (feature_clause (FEATURE "feature") (feature_declaration_list (feature_declaration (new_feature_list (new_feature (extended_feature_name (feature_name (Identifier "main"))))) (declaration_body (feature_value (attribute_or_routine (local_declarations (LOCAL "local") (entity_declaration_list (entity_declaration_group (identifier_list (Identifier "i")) (type_mark (COLON ":") (type (class_or_tuple_type (class_type (class_name (Identifier "INTEGER"))))))))) (feature_body (effective_routine (internal (routine_mark (DO "do")) (compound (instruction (loop (initialization (FROM "from") (compound (instruction (assigner_call (expression (Identifier "i")) (COLON_EQUAL ":=") (expression (special_expression (manifest_constant (manifest_value (integer_constant (Integer "0")))))))))) (exit_condition (UNTIL "until") (expression (expression (Identifier "i")) (GT ">") (expression (Identifier "argument_count")))) (loop_body (LOOP "loop") (compound (instruction (expression (expression (Identifier "io")) (DOT ".") (expression (Identifier "put_string")))) (instruction (expression (OPEN_PAREN "(") (expression (expression (expression (expression (expression (special_expression (manifest_constant (manifest_value (manifest_string (Basic_manifest_string "\"Argument \"")))))) (PLUS "+") (expression (expression (Identifier "i")) (DOT ".") (expression (Identifier "out")))) (PLUS "+") (expression (special_expression (manifest_constant (manifest_value (manifest_string (Basic_manifest_string "\": \""))))))) (PLUS "+") (expression (unqualified_call (feature_name (Identifier "argument")) (actuals (OPEN_PAREN "(") 
(actual_list (expression (Identifier "i"))) (CLOSE_PAREN ")"))))) (PLUS "+") (expression (special_expression (manifest_constant (manifest_value (manifest_string (Basic_manifest_string "\"%N\""))))))) (CLOSE_PAREN ")"))) (SEMI_COLON ";") (instruction (assignment (variable (variable_attribute (feature_name (Identifier "i")))) (COLON_EQUAL ":=") (expression (expression (Identifier "i")) (PLUS "+") (expression (special_expression (manifest_constant (manifest_value (integer_constant (Integer "1"))))))))))) (END "end"))))))) (END "end")))))))) (END "end") (EOF ""))

In this grammar, the parser cannot distinguish between an assignment and an assigner_call within instruction for the input string "i := 0". According to the spec, assigner_call is chosen over assignment when: "[t]he Equivalent Dot Form of target is a qualified Object_call whose feature has an assigner command." In order to make this distinction, a symbol table must be added to the grammar.
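
To locate where two such trees diverge without reading them by eye, the parenthesized output can be parsed and compared node by node. The sketch below is a hypothetical helper, not part of the Trash Toolkit. It assumes the tree format shown above, where every node is written `(name child ...)` and token text is a double-quoted string with backslash escapes.

```python
import re

# A token is a parenthesis, a double-quoted string, or a bare name.
TOKEN_RE = re.compile(r'\(|\)|"(?:\\.|[^"\\])*"|[^\s()"]+')


def parse_tree(text: str):
    """Parse '(name child ...)' text into nested lists; leaves are strings."""
    tokens = TOKEN_RE.findall(text)
    pos = 0

    def node():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok != "(":
            return tok  # a rule name, token name, or quoted token text
        children = []
        while tokens[pos] != ")":
            children.append(node())
        pos += 1  # consume ")"
        return children

    return node()


def first_difference(a, b, path=()):
    """Return (path, subtree_a, subtree_b) at the first divergence, or None."""
    if isinstance(a, str) or isinstance(b, str):
        return None if a == b else (path, a, b)
    if a[0] != b[0] or len(a) != len(b):
        return (path, a, b)
    for x, y in zip(a[1:], b[1:]):
        diff = first_difference(x, y, path + (a[0],))
        if diff:
            return diff
    return None
```

Given the two trees above with the leading file-name prefix stripped, the comparison would be expected to stop under an instruction node, with an assignment subtree on one side and an assigner_call subtree on the other.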

Ambiguity like this is a sign of a poorly designed grammar because the parse tree depends on the semantics of the language. Rule assigner_call subsumes assignment, so one could just remove assignment and the input would still parse, and more efficiently. A pass after parsing would still be needed to distinguish between the two interpretations, but the parse tree would be the same regardless.
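
A follow-up pass of that kind could look roughly like the sketch below. Everything in it is hypothetical: the `Target` record, the `features_with_assigner` set standing in for a symbol table, and the simplification that a target is either a bare name or a single `qualifier.feature` pair. It only illustrates the rule quoted from the spec, that an assigner call requires a qualified call whose feature has an assigner command.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Target:
    """Left-hand side of ':=' in a simplified, hypothetical form."""
    feature: str
    qualifier: Optional[str] = None  # None for an unqualified name such as 'i'


def classify(target: Target, features_with_assigner: set[str]) -> str:
    """Decide after parsing which interpretation applies to 'target := expr'."""
    if target.qualifier is None:
        # Unqualified target, as in 'i := 0': treated as a plain assignment.
        return "assignment"
    if target.feature in features_with_assigner:
        # Qualified call whose feature has an assigner command.
        return "assigner_call"
    # Not covered by the quoted rule; left for the caller to report.
    return "neither"
```

With this split, the grammar itself needs only one rule for `target := expr`, and the symbol-table lookup happens on the finished tree.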

Add check to make sure there are no symbolic links in checkin.
@kaby76 kaby76 changed the title from "[build, eiffel] Fix for $4694." to "[build, eiffel] Fix for #4694." Dec 2, 2025
kaby76 added 10 commits December 1, 2025 19:50
Unknown what 2nd grammar was for.
* Make tests simple. Do not include png's of the parse tree--completely untestable, unverifiable. Remove all other files of unknow purpose. Move Eiffel tests to examples/ directory. Update testing script to correct for extra popd.
Added many explicit lexer rules for split grammar string literals.
Parse should not be dependent on the number keyword counts.
@kaby76 kaby76 marked this pull request as ready for review December 2, 2025 04:08
@kaby76 kaby76 marked this pull request as draft December 3, 2025 13:43
kaby76 commented Dec 3, 2025

I'm going to add in a new version of trgen so that I can preserve the EiffelGrammar.g4 (no skip tokens) example.

@kaby76 kaby76 mentioned this pull request Dec 3, 2025
@kaby76 kaby76 marked this pull request as ready for review December 4, 2025 01:04
Successfully merging this pull request may close these issues.

[eiffel] Examples contains eiffel-2-eiffel/ driver app, and symbolic link, which isn't supported on Windows!
