- 
          
- 
                Notifications
    You must be signed in to change notification settings 
- Fork 1.3k
Add information about node's original location in the parsed string #3557
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
Prior to this commit it was not possible to trace back the parsed node location in the original string. This information could be used to highlight the syntax of a math expression displayed in a rich editor (e.g. a spreadsheet or a calculator). This commit is based on the PR josdejong#2796 that has been abandoned for 2 years. I rebased the task branch and fixed the issues reported in the original PR. In the nutshell, each parsed node stores an array of sources (`SourceMapping[]`) that are set during parsing (see `tokenSource` and its usages for details). The node's constructor and clone method are adjusted to take an optional `MetaOptions` object containing the source mappings. In the future `MetaOptions` could be extended to store more information. Benchmarks showed no change in `evaluate`. `parse` became slower, from 3.68µs to 5.17µs with the changes from this commit. Closes josdejong#2795
In some cases it's important to parse data as quickly as possible. As source tracing impacts the `parse` performance I've decided to add a new config option that allows disabling the feature. I set `traceSources` to `false` and ran `expression_parser.js` benchmark and got the following result: ``` (plain js) evaluate 0.04 µs ±0.01% (mathjs) evaluate 0.25 µs ±0.30% (mathjs) parse, compile, evaluate 7.22 µs ±0.79% (mathjs) parse, compile 7.19 µs ±5.51% (mathjs) parse 6.05 µs ±12.34% ``` If `traceSources` is set to `true` I'm getting these times: ``` (plain js) evaluate 0.04 µs ±0.00% (mathjs) evaluate 0.25 µs ±0.19% (mathjs) parse, compile, evaluate 7.25 µs ±0.59% (mathjs) parse, compile 7.59 µs ±10.22% (mathjs) parse 6.42 µs ±18.91% ``` The control times (prior to the changes in this PR): ``` (plain js) evaluate 0.04 µs ±0.00% (mathjs) evaluate 0.25 µs ±0.27% (mathjs) parse, compile, evaluate 6.48 µs ±15.36% (mathjs) parse, compile 6.42 µs ±14.74% (mathjs) parse 4.94 µs ±13.23% ```
It looks like the hotspot was `tokenSource` method and the object
allocation followed by the method call: `{ sources: [tokenSource(state)] }`.
Instead of using `tokenSource` directly let's add a new method `tokenSourceMetaOptions`
and use it. This method returns a preallocated `defaultMetaOptions` if
source tracing is disabled.
`expression_parser.js` benchmark BEFORE this change:
```
(plain js) evaluate                   0.04 µs   ±0.00%
(mathjs) evaluate                     0.26 µs  ±31.65%
(mathjs) parse, compile, evaluate     6.23 µs  ±13.74%
(mathjs) parse, compile               6.01 µs  ±10.45%
(mathjs) parse                        5.31 µs  ±27.41%
```
AFTER this change:
```
(plain js) evaluate                   0.04 µs   ±0.00%
(mathjs) evaluate                     0.25 µs  ±15.79%
(mathjs) parse, compile, evaluate     6.00 µs   ±0.41%
(mathjs) parse, compile               5.89 µs   ±6.29%
(mathjs) parse                        4.66 µs   ±5.90%
```
BEFORE all the changes in the PR (control group):
```
(plain js) evaluate                   0.04 µs   ±0.00%
(mathjs) evaluate                     0.24 µs   ±0.09%
(mathjs) parse, compile, evaluate     4.89 µs   ±0.28%
(mathjs) parse, compile               4.97 µs  ±10.20%
(mathjs) parse                        3.61 µs   ±6.10%
```
    | @josdejong any chance you or someone else can look into this PR? Feedback is highly appreciated 🙏 | 
| Hi Sergey, thanks a lot for picking this up! I think I don't have enough time today to review this (large) PR, I hope to look into it coming Friday. | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again for picking this up Sergey! It's quite a job but it looks solid. it will be a very useful addition.
I made quite some inline comments (nothing very big though). Can you have a look at those? Thanks!
| All nodes have the following methods: | ||
|  | ||
| - `clone() : Node` | ||
| - `clone(options: MetaOptions) : Node` | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The options argument is optional, and it is named meta in the source code (consistent with the constructor arguments of the Node classes). Can you document this like clone(meta?: MetaOptions) : Node?
And similarly, the argument meta is optional in the constructors of each of the Node classes, can you udpate that too in this documentation page?
| Properties: | ||
|  | ||
| - `text: string` the token representing this node in the parsed string | ||
| - `index; number` the index of the token in the parsed string | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small typo, I think the semicolon ; sould be a colon :.
All is all well documented BTW 👍
|  | ||
| const name = 'Node' | ||
| const dependencies = ['mathWithTransform'] | ||
| export const defaultMetaOptions = { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It makes sense to define a defaultMetaOptions. However, would it make sense to create a new object and array every time to prevent people from being able to mess up by mutating this shared/reused object? Maybe we can use Object.freeze here to solve this. (Thinking aloud here)
| */ | ||
| constructor (object, index) { | ||
| super() | ||
| constructor (object, index, meta = defaultMetaOptions) { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking aloud here: technically it is not needed to define a default value for meta in each of the Node classes since it will be set in the root Node.js class. What do you think? Would it be OK to leave meta as it is in these constructors and only set it in the constructor of Node.js? It would save a bit of overhead.
| const name = 'Node' | ||
| const dependencies = ['mathWithTransform'] | ||
| export const defaultMetaOptions = { | ||
| sources: [] | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking about what would best communicate when sources are not available: an empty array like it is now? Or setting sources: null for example? In usage, it may be easier if you can know for sure that sources is always an array so you don't have to check whether it exists, so an empty array can make sense. But it can also be misleading and giving the impression that sources are available. Any thoughts?
| assert.deepStrictEqual(parsedNaN.sources, [{ index: 0, text: 'NaN' }]) | ||
| }) | ||
|  | ||
| it('adds sources for function calls', function () { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add tests for defining a function (uses the FunctionAssignmentNode)?
| }) | ||
|  | ||
| it('adds sources for function calls', function () { | ||
| const parsed = math.parse('foo(1, 2)') | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a test for a function call with more than two arguments?
| assert.deepStrictEqual(doubleQuote.sources, doubleSources) | ||
| }) | ||
|  | ||
| it('adds sources for objects', function () { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a test for an empty object?
| * {string} randomSeed | ||
| * Random seed for seeded pseudo random number generator. | ||
| * Set to null to randomly seed. | ||
| * {boolean} traceSources | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you document this new configuration option traceSources on the page docs/core/configuration.md?
| } | ||
|  | ||
| // Maps a parsed node back to its place in the original source string | ||
| export interface SourceMapping { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you update the type definitions of the constructors of all Node classes and also their .clone method describing the new meta option?
| Does this also supersede #2615 ? | 
| 
 Ah, yes. I'll close that PR in favor of this one. Thanks for pointing this out. | 
Prior to this commit it was not possible to trace back the parsed node location in the original string. This information could be used to highlight the syntax of a math expression displayed in a rich editor (e.g. a spreadsheet or a calculator).
This commit is based on the PR #2796 that has been abandoned for 2 years. I rebased the task branch and fixed the issues reported in the original PR.
In the nutshell, each parsed node stores an array of sources (
SourceMapping[]) that are set during parsing (seetokenSourceand its usages for details). The node's constructor and clone method are adjusted to take an optionalMetaOptionsobject containing the source mappings. In the futureMetaOptionscould be extended to store more information.Benchmarks showed no change in
evaluate.parsebecame slower, from 3.68µs to 5.17µs with the changes from this commit.Closes #2795