[build, eiffel] Fix for #4694. #4695

kaby76 · 2025-12-01T22:20:35Z

This PR fixes #4694. Both the build and the Eiffel grammar are broken.

Background

Symbolic links do not work reliably in Git checkouts on Windows. I searched the repo comprehensively for symbolic links in order to remove them (for f in $(find . -type f | fgrep -v '.git'); do v=$(git ls-files --stage "$f" | awk '{print $1}'); if [ "$v" == 120000 ]; then echo "$f"; fi; done, or alternatively on Linux find . -type l). Currently, two grammars contain symbolic links: vhdl2008 and eiffel. The vhdl2008 grammar is being fixed in PR #4693. The other is the Eiffel grammar, which this PR corrects.
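The git-based search above relies on git recording symbolic links with file mode 120000. As a minimal sketch of the same idea (not the script used in this PR), the following Python parses the output of git ls-files --stage, where each entry has the form "mode object stage", a tab, then the path. The function names are hypothetical, and paths that git quotes because of unusual characters are not handled.

```python
import subprocess

SYMLINK_MODE = "120000"  # git's file mode for a symbolic link


def symlinks_in_stage_output(stage_output):
    """Return the paths recorded with mode 120000 in `git ls-files --stage` output."""
    links = []
    for line in stage_output.splitlines():
        # Each entry looks like: "<mode> <object> <stage>\t<path>"
        meta, sep, path = line.partition("\t")
        fields = meta.split()
        if sep and fields and fields[0] == SYMLINK_MODE:
            links.append(path)
    return links


def symlinks_in_index(repo_dir="."):
    """Run git in repo_dir and list the symbolic links recorded in its index."""
    result = subprocess.run(
        ["git", "ls-files", "--stage"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return symlinks_in_stage_output(result.stdout)
```

Because this reads the index rather than the working tree, it also works on a Windows checkout where the links were materialized as ordinary files.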

Changes to the build

I modified the script _scripts/test-static-checks.sh to check for symbolically linked files in any future PR. If it finds any, the build fails. The check runs only on Ubuntu and consults both what Bash find and git report.

The Eiffel grammar did contain a symbolic link file, but the grammar was never integrated into the build correctly, so the link had no effect on the build. This PR removes the file. Conveniently, the "no-symbolic-links" test added in this PR validates that the grammar has no symbolic links.
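The filesystem half of such a check can be sketched in Python as follows. This is an illustration under assumptions, not the contents of _scripts/test-static-checks.sh: the function names find_symlinks and check are hypothetical, and the .git directory is skipped in the same spirit as the fgrep filter in the search command.

```python
import os
import sys


def find_symlinks(root="."):
    """Return relative paths of all symbolic links under root, skipping .git."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        if ".git" in dirnames:
            dirnames.remove(".git")  # prune: do not descend into git metadata
        # os.walk lists links to directories in dirnames and all other links in filenames
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                found.append(os.path.relpath(path, root))
    return sorted(found)


def check(root="."):
    """Return 1 (failure) if any symbolic link exists under root, else 0."""
    links = find_symlinks(root)
    for link in links:
        print(f"symbolic link not allowed: {link}", file=sys.stderr)
    return 1 if links else 0
```

A build step could call sys.exit(check(".")) so that any symbolic link makes the step fail.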

In order to support the changes to the Eiffel grammar, I needed to update the version of the Trash Toolkit to 0.23.28. This version fixes some problems with the analysis of "top-level" .g4's.

Changes to the Eiffel grammar

I removed the specialized parser driver program from the example/ test files directory. The program does not function as a regression tester. In addition, the app includes another ANTLR grammar that overrides a couple of lexer rules: EiffelGrammar.g4 overrides the WhiteSpace and Comment lexer rules with channels. I moved the specialized grammars to the main directory and updated the Trash trgen tool to test both grammar pairs.

The examples and the directory structure for each Eiffel application were kept, but I removed the .png files because they are not really useful for regression testing. It is better to use Trash to display the parse trees (trgen -t CSharp; cd Generated-CSharp; make; trparse ../examples/application.e | trtree) and to test them for invariant properties. I added a few more examples and a readme.md.

A pom.xml was added for testing the Eiffel grammar using Maven, but it only tests the EiffelParser.g4/EiffelLexer.g4 pair.

The Eiffel grammar was modified as per comment.

The Eiffel grammar is now target agnostic, and the CSharp and Java ports are implemented. The whole point of having a grammar in target-agnostic form is so that the grammar can be ported to other targets. With the CSharp target, it can be tested using the Trash Toolkit. See the next section for such an analysis.

Ambiguity

The Eiffel grammar is ambiguous and has large max-k's. For example, for input containing the substring "i := 0", here are two parse trees that show ambiguity.

../examples/prog_args.e.d=73.a=3: (class_declaration (class_header (CLASS "class") (class_name (Identifier "PROG_ARGS"))) (inheritance (inherit_clause (INHERIT "inherit") (parent_list (parent (class_type (class_name (Identifier "ARGUMENTS"))))))) (creators (creation_clause (CREATE "create") (creation_procedure_list (creation_procedure (feature_name (Identifier "main")))))) (features (feature_clause (FEATURE "feature") (feature_declaration_list (feature_declaration (new_feature_list (new_feature (extended_feature_name (feature_name (Identifier "main"))))) (declaration_body (feature_value (attribute_or_routine (local_declarations (LOCAL "local") (entity_declaration_list (entity_declaration_group (identifier_list (Identifier "i")) (type_mark (COLON ":") (type (class_or_tuple_type (class_type (class_name (Identifier "INTEGER"))))))))) (feature_body (effective_routine (internal (routine_mark (DO "do")) (compound (instruction (loop (initialization (FROM "from") (compound (instruction (assignment (variable (variable_attribute (feature_name (Identifier "i")))) (COLON_EQUAL ":=") (expression (special_expression (manifest_constant (manifest_value (integer_constant (Integer "0")))))))))) (exit_condition (UNTIL "until") (expression (expression (Identifier "i")) (GT ">") (expression (Identifier "argument_count")))) (loop_body (LOOP "loop") (compound (instruction (expression (expression (Identifier "io")) (DOT ".") (expression (Identifier "put_string")))) (instruction (expression (OPEN_PAREN "(") (expression (expression (expression (expression (expression (special_expression (manifest_constant (manifest_value (manifest_string (Basic_manifest_string "\"Argument \"")))))) (PLUS "+") (expression (expression (Identifier "i")) (DOT ".") (expression (Identifier "out")))) (PLUS "+") (expression (special_expression (manifest_constant (manifest_value (manifest_string (Basic_manifest_string "\": \""))))))) (PLUS "+") (expression (unqualified_call (feature_name (Identifier "argument")) 
(actuals (OPEN_PAREN "(") (actual_list (expression (Identifier "i"))) (CLOSE_PAREN ")"))))) (PLUS "+") (expression (special_expression (manifest_constant (manifest_value (manifest_string (Basic_manifest_string "\"%N\""))))))) (CLOSE_PAREN ")"))) (SEMI_COLON ";") (instruction (assignment (variable (variable_attribute (feature_name (Identifier "i")))) (COLON_EQUAL ":=") (expression (expression (Identifier "i")) (PLUS "+") (expression (special_expression (manifest_constant (manifest_value (integer_constant (Integer "1"))))))))))) (END "end"))))))) (END "end")))))))) (END "end") (EOF ""))
../examples/prog_args.e.d=73.a=4: (class_declaration (class_header (CLASS "class") (class_name (Identifier "PROG_ARGS"))) (inheritance (inherit_clause (INHERIT "inherit") (parent_list (parent (class_type (class_name (Identifier "ARGUMENTS"))))))) (creators (creation_clause (CREATE "create") (creation_procedure_list (creation_procedure (feature_name (Identifier "main")))))) (features (feature_clause (FEATURE "feature") (feature_declaration_list (feature_declaration (new_feature_list (new_feature (extended_feature_name (feature_name (Identifier "main"))))) (declaration_body (feature_value (attribute_or_routine (local_declarations (LOCAL "local") (entity_declaration_list (entity_declaration_group (identifier_list (Identifier "i")) (type_mark (COLON ":") (type (class_or_tuple_type (class_type (class_name (Identifier "INTEGER"))))))))) (feature_body (effective_routine (internal (routine_mark (DO "do")) (compound (instruction (loop (initialization (FROM "from") (compound (instruction (assigner_call (expression (Identifier "i")) (COLON_EQUAL ":=") (expression (special_expression (manifest_constant (manifest_value (integer_constant (Integer "0")))))))))) (exit_condition (UNTIL "until") (expression (expression (Identifier "i")) (GT ">") (expression (Identifier "argument_count")))) (loop_body (LOOP "loop") (compound (instruction (expression (expression (Identifier "io")) (DOT ".") (expression (Identifier "put_string")))) (instruction (expression (OPEN_PAREN "(") (expression (expression (expression (expression (expression (special_expression (manifest_constant (manifest_value (manifest_string (Basic_manifest_string "\"Argument \"")))))) (PLUS "+") (expression (expression (Identifier "i")) (DOT ".") (expression (Identifier "out")))) (PLUS "+") (expression (special_expression (manifest_constant (manifest_value (manifest_string (Basic_manifest_string "\": \""))))))) (PLUS "+") (expression (unqualified_call (feature_name (Identifier "argument")) (actuals (OPEN_PAREN "(") 
(actual_list (expression (Identifier "i"))) (CLOSE_PAREN ")"))))) (PLUS "+") (expression (special_expression (manifest_constant (manifest_value (manifest_string (Basic_manifest_string "\"%N\""))))))) (CLOSE_PAREN ")"))) (SEMI_COLON ";") (instruction (assignment (variable (variable_attribute (feature_name (Identifier "i")))) (COLON_EQUAL ":=") (expression (expression (Identifier "i")) (PLUS "+") (expression (special_expression (manifest_constant (manifest_value (integer_constant (Integer "1"))))))))))) (END "end"))))))) (END "end")))))))) (END "end") (EOF ""))
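Trees of this size are hard to compare by eye. As a rough sketch (not a Trash tool), the Python below tokenizes two parenthesized trees and reports the first token where they diverge. The function names are hypothetical, and the tokenizer assumes the printed format shown above: parentheses, bare symbols, and double-quoted strings with backslash escapes.

```python
import re

# A token is a parenthesis, a double-quoted string (backslash escapes allowed),
# or a bare symbol such as a rule or token name.
TOKEN = re.compile(r'\(|\)|"(?:\\.|[^"\\])*"|[^\s()"]+')


def tokens(tree):
    """Split a parenthesized parse tree into a flat list of tokens."""
    return TOKEN.findall(tree)


def first_difference(tree_a, tree_b):
    """Return (index, token_a, token_b) at the first divergence, or None if equal."""
    ta, tb = tokens(tree_a), tokens(tree_b)
    for i, (x, y) in enumerate(zip(ta, tb)):
        if x != y:
            return i, x, y
    if len(ta) != len(tb):
        # One tree is a prefix of the other; report the first extra token.
        n = min(len(ta), len(tb))
        return n, (ta[n] if n < len(ta) else None), (tb[n] if n < len(tb) else None)
    return None
```

Applied to the subtrees for "i := 0", for example '(instruction (assignment (variable ...' against '(instruction (assigner_call (expression ...', the first differing pair is assignment versus assigner_call.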

In this grammar, the parser cannot distinguish between an assignment and an assigner_call within instruction for the input string "i := 0". According to the spec, assigner_call is chosen over assignment when: "[t]he Equivalent Dot Form of target is a qualified Object_call whose feature has an assigner command." In order to make this distinction, a symbol table must be added to the grammar.

Ambiguity is a sign of a poorly designed grammar because the choice of parse tree then depends on the semantics of the language. Rule assigner_call subsumes assignment, so one could simply remove assignment and the input would still parse, and more efficiently. A pass after parsing would then be needed to distinguish between the two interpretations, but the parse tree would be the same regardless.
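A follow-up pass of this kind could look roughly like the sketch below. It is a deliberately simplified illustration of the quoted rule, not part of this PR: the Target class, the classify function, and the set of features with an assigner command are all hypothetical stand-ins for a real symbol table, and the lookup ignores the class of the qualifier.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class Target:
    """Left-hand side of ':=' in Equivalent Dot Form (hypothetical model)."""
    qualifier: Optional[str]  # "a" in "a.item"; None for a bare name such as "i"
    feature: str


def classify(target: Target, features_with_assigner: Set[str]) -> str:
    """Apply the quoted rule: a qualified call whose feature has an assigner command."""
    if target.qualifier is not None and target.feature in features_with_assigner:
        return "assigner_call"
    # Everything else is treated as a plain assignment here; validity checking
    # of the remaining cases is out of scope for this sketch.
    return "assignment"
```

With this model, the bare target i in "i := 0" is classified as an assignment, while a qualified target such as a.item is classified as an assigner_call when item is known to have an assigner command.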

kaby76 · 2025-12-03T13:44:52Z

I'm going to add in a new version of trgen so that I can preserve the EiffelGrammar.g4 (no skip tokens) example.

kaby76 added 2 commits December 1, 2025 17:19

Fix for $4694.

ef99d20

Add check to make sure there are no symbolic links in checkin.

Update test-static-checks.sh

6904f76

kaby76 changed the title ~~[build, eiffel] Fix for $4694.~~ [build, eiffel] Fix for #4694. Dec 2, 2025

kaby76 added 10 commits December 1, 2025 19:50

Update test-static-checks.sh

1ee5017

Remove testing app, other "Eiffel" grammar.

8bc5ca9

Unknown what 2nd grammar was for.

Clean up Eiffel tests.

fb1cdbf

* Make tests simple. Do not include png's of the parse tree--completely untestable, unverifiable. Remove all other files of unknown purpose. Move Eiffel tests to examples/ directory. Update testing script to correct for extra popd.

Fix testing script.

843610f

Rewrite lexer rules with case insensitive; already requires 4.10 or better Antlr.

a3158d5

Split grammar and made target agnostic.

8c0b3b8

Added many explicit lexer rules for split grammar string literals.

Remove bad implementation of semantic constraint.

92d46cd

Parse should not be dependent on the number keyword counts.

Fix Java and CSharp ports.

7d3f484

Fix C# parser.

8dd98eb

Add eiffel to Maven tester.

c98208e

kaby76 marked this pull request as ready for review December 2, 2025 04:08

kaby76 added 2 commits December 3, 2025 06:50

Added examples and readme.md.

85850a0

Moved examples back into subdirectories, and added back .ecf Eiffel config build files.

6dee70c

kaby76 marked this pull request as draft December 3, 2025 13:43

Add back extension grammar and test it.

feabe22

kaby76 mentioned this pull request Dec 3, 2025

0.23.28 kaby76/Trash#573

Merged

Update to latest Trash, required to process Eiffel grammar.

b2f2edc

kaby76 marked this pull request as ready for review December 4, 2025 01:04

Add another example.

8b52cfd
