The state system and support for the actions and states in the PEG parser notation #131
Also, I'd like to add that many things can be moved to the state class in the future, and that the state class, rather than the parser itself, should be passed to each parser expression's parse function.
In a recent commit, I added support for the suppress action. This feature is crucial in the PEG parser notation; without it, the semantic layer ends up with overly complex solutions.
Hi @andr-dots. Thank you for your substantial work in this PR, and sorry it took me so long to respond. While the explicit state handling you've implemented offers powerful capabilities for context-sensitive languages, it also introduces significant complexity and performance considerations, as you've noted. Given Arpeggio's design philosophy of remaining a simple PEG parser, I don't believe this rework aligns with the project's direction. Furthermore, I don't feel comfortable having to maintain this rework. I'd suggest forking this work into a separate library where you could maintain full control over its development. This would allow users needing more advanced parsing capabilities to benefit from your work, while Arpeggio continues serving those who need a straightforward PEG parser.
The state system allows programmers to write parsers for complex grammars, including languages with unusual features.
The PEG action system makes it possible to handle features such as templates/generics in languages like Python and Golang, or labels in languages like C (labels are linked automatically).
The parsing state system makes it possible to handle non-linear parsing scenarios, such as cases where syntactic constructions overlap and depend on constructions found earlier in the parsing process.
Together, the state and action systems let programmers solve many known and potential issues that the original parser could not solve. Previously, such problems were most likely solved in the semantic layer of the program, with much greater complexity and much less readable code.
The downside of the new system is that parsing becomes a little slower when the state system is in use, because a state snapshot has to be taken before every match that can fail.
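To make the snapshot cost concrete, here is a minimal toy sketch of the idea. It is not the code from this PR and does not use Arpeggio's API: `ParseState`, `ToyParser`, and `attempt` are hypothetical names invented for illustration, and the deep copy is just one assumed way of taking a snapshot.

```python
import copy


class ParseState:
    """Hypothetical user-visible state (not Arpeggio's actual class)."""

    def __init__(self):
        self.labels = set()


class ToyParser:
    """Toy backtracking parser used only to illustrate state snapshots."""

    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.state = ParseState()

    def literal(self, s):
        # Consume the literal `s` at the current position if it is there.
        if self.text.startswith(s, self.pos):
            self.pos += len(s)
            return True
        return False

    def attempt(self, rule):
        # Snapshot position and state before a match that can fail.
        saved_pos = self.pos
        saved_state = copy.deepcopy(self.state)  # this copy is the extra cost
        if rule():
            return True
        # The match failed: roll back both the position and the state.
        self.pos, self.state = saved_pos, saved_state
        return False


def label_then_colon(p):
    # Records the label in the state before knowing whether the rule matches.
    if not p.literal("a"):
        return False
    p.state.labels.add("a")
    return p.literal(":")


p = ToyParser("a;")
matched = p.attempt(lambda: label_then_colon(p))
```

In this sketch the rule fails on `;` after it has already recorded the label, so without the rollback the state would keep a label from a construction that never matched. The price is one copy per attempted match.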
Updated on 2025-07-20:
To make the tests smoother, the PEG grammar notation was improved: a separator operator, `%`, is now available, which makes repetitions with a separator much easier to write. I considered using `,` instead, but that made the grammar look too odd.
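For readers unfamiliar with separated repetitions, the sketch below shows the general semantics such an operator usually has, namely `item (sep item)*`. It is only an illustration under that assumption: it does not reproduce the exact syntax or behavior of the `%` operator in this PR, and `sep_repeat`, `NUMBER`, and `COMMA` are hypothetical names.

```python
import re


def sep_repeat(text, pos, item, sep):
    """Match `item (sep item)*` from `pos`; return (values, new_pos) or None."""
    m = item.match(text, pos)
    if m is None:
        return None  # at least one item is required
    values, pos = [m.group()], m.end()
    while True:
        s = sep.match(text, pos)
        if s is None:
            break
        m = item.match(text, s.end())
        if m is None:
            break  # a trailing separator is left unconsumed
        values.append(m.group())
        pos = m.end()
    return values, pos


NUMBER = re.compile(r"\d+")
COMMA = re.compile(r"\s*,\s*")

result = sep_repeat("1, 2,3, x", 0, NUMBER, COMMA)
```

Note that the separator is only consumed when another item follows it, so parsing stops right after `3` and the trailing `, x` is left for whatever rule comes next.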