fix(testing): Optimize fill #1804

marioevz · 2025-11-21T03:31:51Z

🗒️ Description

This PR optimizes the fill process by removing virtually all Python bottlenecks, leaving t8n execution as the only significant time consumer.

It does so via several optimizations. First, though, here are the profiling results, so we can break down the differences.

Profiling Results

These are the profiling results for the transition tool call during the filling of a benchmark engine-x blockchain test with the following properties:

Pre-allocation groupings generated for all benchmark tests in a single group.
All accounts from all tests are contained in a ~250MB allocation passed to the t8n in each block.

Profiling before this PR:

Profiling file (use SnakeViz to visualize): profile.before.out.tar.gz

Main time consumers:

The model_validate pydantic method, 1st orange box, mainly used to load the result from the t8n tool, which includes the full modified pre-alloc (30.4s, 58.8% of the total).

A lot of the time spent in this box is spent in our to_bytes conversion method.

The write_json_files method, 2nd orange box, which writes the pre-allocation, environment and transactions to files for the t8n to read during state transition execution. This included dumping JSON objects (already dumped by pydantic) into files (8.27s, 15.81% of the total).
The evmone-t8n process run, light blue box, which is the actual call that spawns the t8n process (7.73s, 14.77% of the total).
The model_dump method, which dumps in-memory objects into JSON objects that are eventually passed to the t8n tool (4.07s, 7.77% of the total).

Profiling after this PR:

Profiling file: profile.after.out.tar.gz

Main time consumers:

Evmone-t8n process run, light blue box (7.61 seconds, 95.13% of the total).
The new to_files method (0.389s, 4.86% of the total).

Optimizations

Pydantic bytes optimization

The original method we used to validate hexadecimal bytes in pydantic was too flexible: it even stripped whitespace (using a regex) in order to parse the string. Because it was called on every instantiation of a Bytes type, it was extremely slow.

We removed this flexibility and introduced CoerceBytes, which should be used only where the user wants to define a hexadecimal string in a more readable form (for example, with whitespace).

Some other minor optimizations were also included; for details, check commit 509d17c541.
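To illustrate the trade-off, here is a minimal sketch using only the standard library rather than the project's pydantic types. The function names `strict_hex_to_bytes` and `coerce_hex_to_bytes` are hypothetical and are not the names used in the commit; only the whitespace-stripping regex mirrors what the PR keeps for the coercing path.

```python
import binascii
import re


def strict_hex_to_bytes(value: str) -> bytes:
    """Fast path: accept only a contiguous hex string, optionally 0x-prefixed."""
    if value[:2] in ("0x", "0X"):
        value = value[2:]
    # unhexlify rejects whitespace and odd-length input with binascii.Error,
    # which is a subclass of ValueError.
    return binascii.unhexlify(value)


def coerce_hex_to_bytes(value: str) -> bytes:
    """Slow path: strip all whitespace first, then parse strictly."""
    return strict_hex_to_bytes(re.sub(r"\s+", "", value))
```

The strict parser is what every Bytes instantiation would pay for; the regex pass is only paid where readability of the literal matters.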

Remove Unnecessary State Root Calculations

During the filling of a pre-allocation grouping test, we used the alloc.state_root() method to compute the state root placed in the genesis header of the filled test.

This method calculates the state root in Python and is extremely expensive. Because the pre-allocation group contains all accounts of all tests and is therefore very large, this was very slow.

This PR creates GroupPreAlloc, a subclass of Alloc that overrides state_root to return the pre-calculated value, since the state root hash never changes.

A significant refactor was necessary for this; for more details, check f59397e81d.
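The idea can be sketched as follows. This is a simplified stand-in, not the code from the commit: the `Alloc` class here is a toy, its `state_root` uses a SHA-256 of the JSON encoding as a placeholder for the real (expensive) trie computation, and the `precomputed_root` constructor argument is an assumption made for illustration.

```python
import hashlib
import json


class Alloc:
    """Toy allocation: maps address strings to account dicts."""

    def __init__(self, accounts: dict):
        self.accounts = accounts

    def state_root(self) -> bytes:
        # Placeholder for the expensive computation (NOT a real trie root).
        encoded = json.dumps(self.accounts, sort_keys=True).encode()
        return hashlib.sha256(encoded).digest()


class GroupPreAlloc(Alloc):
    """Allocation of a pre-alloc group whose root was computed once, up front."""

    def __init__(self, accounts: dict, precomputed_root: bytes):
        super().__init__(accounts)
        self._precomputed_root = precomputed_root

    def state_root(self) -> bytes:
        # The group never changes, so the stored value is always valid.
        return self._precomputed_root
```

Callers keep using `state_root()` unchanged; only the group subclass skips the recomputation.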

`LazyAlloc` class to defer model validation

This PR introduces a new class called LazyAlloc, which defers validation of the pre-alloc until it is needed in decoded format.

Most of the time, and especially for tests with many blocks, the decoded allocation is not needed between calls; the only allocation we actually use for post-state comparison is the one returned by the last block (with a few exceptions).

With this in mind, and since this information is always validated and then dumped in the same format, LazyAlloc keeps the allocation as-is when we receive it as a string from the t8n. When we pass the pre-alloc of the next block to the next t8n instance, we simply pass along the string exactly as we got it, skipping validation and dumping entirely.

If any test requires the allocation in validated format, LazyAlloc.get_alloc() is called, which validates, caches, and returns the Alloc object.

See 164de2dee8 for details.
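A rough sketch of the pattern, under simplifying assumptions: `json.loads` stands in for pydantic validation, `fake_t8n` is a hypothetical stand-in for the external t8n process, and the `raw` and `validations` attributes are illustrative names rather than the ones in the commit.

```python
import json
from typing import Optional


class LazyAlloc:
    """Holds the allocation as the raw string returned by the t8n."""

    def __init__(self, raw: str):
        self.raw = raw
        self._alloc: Optional[dict] = None
        self.validations = 0  # Counts how often we actually decode.

    def get_alloc(self) -> dict:
        # Decode on first use only, then serve the cached object.
        if self._alloc is None:
            self._alloc = json.loads(self.raw)
            self.validations += 1
        return self._alloc


def fake_t8n(alloc_json: str) -> str:
    """Stand-in for the t8n process: bumps one nonce, returns JSON text."""
    state = json.loads(alloc_json)
    state["0xaa"]["nonce"] += 1
    return json.dumps(state)


# Three "blocks": the string is forwarded untouched between t8n calls.
lazy = LazyAlloc('{"0xaa": {"nonce": 0}}')
for _ in range(3):
    lazy = LazyAlloc(fake_t8n(lazy.raw))

# Only the final allocation is decoded, for the post-state comparison.
final_alloc = lazy.get_alloc()
```

In this sketch the Python side decodes the allocation once for the whole chain of blocks, instead of once per block.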

Cache Pydantic Model Dump of the Pre-Alloc Groups

The last optimization is to cache the result of model_dump or model_dump_json for the genesis pre-allocation groups. Since the result is always the same and always required in the same format, only the first test in a pytest worker has to compute it; it is reused for the rest of the worker's runtime.

See fa3be8e061 for details.
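As a simplified illustration of the caching idea (not the commit's implementation): `json.dumps` stands in for pydantic's `model_dump_json`, and the class, method, and attribute names below are hypothetical. The cache is only safe under the assumption stated above, namely that the group's pre-allocation is never mutated after the first dump.

```python
import json
from typing import Optional


class PreAllocGroup:
    """Toy pre-alloc group that serializes its genesis allocation once."""

    def __init__(self, pre: dict):
        self.pre = pre
        self._dump_cache: Optional[str] = None
        self.dump_calls = 0  # Counts real serializations, for illustration.

    def dump_json(self) -> str:
        # Serialize on the first request; later requests reuse the string.
        if self._dump_cache is None:
            self.dump_calls += 1
            self._dump_cache = json.dumps(self.pre, sort_keys=True)
        return self._dump_cache
```

Every test in the same worker that shares the group object then receives the identical cached string.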

The profiler

Introduces a profiler which dumps profiling information when --evm-dump-dir is passed.

This feature was used to debug and fix all the issues in this PR.
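For readers who want to reproduce a similar setup, here is a minimal sketch of conditional profiling with the standard library's cProfile. The `maybe_profile` helper is hypothetical and is not the PR's implementation; the output file name `profile.out` is taken from the visualization command used later in this thread.

```python
import cProfile
import pstats
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator, Optional


@contextmanager
def maybe_profile(dump_dir: Optional[Path]) -> Iterator[None]:
    """Profile the wrapped block only when a dump directory is given."""
    if dump_dir is None:
        yield
        return
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        yield
    finally:
        profiler.disable()
        dump_dir.mkdir(parents=True, exist_ok=True)
        # The resulting file can be opened with SnakeViz or pstats.
        profiler.dump_stats(str(dump_dir / "profile.out"))


with tempfile.TemporaryDirectory() as tmp:
    with maybe_profile(Path(tmp)):
        sum(i * i for i in range(10_000))
    stats = pstats.Stats(str(Path(tmp) / "profile.out"))
```

Passing `None` makes the context manager a no-op, so the profiling cost is only paid when a dump directory is requested.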

🔗 Related Issues or PRs

N/A.

✅ Checklist

All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
```
uvx tox -e static
```
All: PR title adheres to the repo standard - it will be used as the squash commit message and should start with type(scope):.
All: Considered adding an entry to CHANGELOG.md.
All: Considered updating the online docs in the ./docs/ directory.
All: Set appropriate labels for the changes (only maintainers can apply labels).
Tests: Ran mkdocs serve locally and verified the auto-generated docs for new tests in the Test Case Reference are correctly formatted.
Tests: For PRs implementing a missed test case, update the post-mortem document to add an entry to the list.
Ported Tests: All converted JSON/YML tests from ethereum/tests or tests/static have been assigned @ported_from marker.

Cute Animal Picture

packages/testing/src/execution_testing/base_types/base_types.py

fselmo

Lgtm, just left a nit 👍🏼

marioevz · 2025-11-24T17:24:06Z

@fselmo I've pushed some other fixes to this branch. I think there are still some optimizations we can do, but those do not seem as easy to pin down as the ones in this PR, so I think we can try to merge this one and leave the rest for a follow-up PR.

marioevz · 2025-11-24T21:59:39Z

@fselmo I think I had a breakthrough. I'm implementing it now and I'll try to push tomorrow, so please hold off on reviewing 🙏

fix(base_types): unnecessary cast

danceratopz · 2025-11-25T21:34:03Z

If you can wait until tomorrow, I'd love to review. I just ran into a state_root recalculation issue while working on enginex and want to check this addresses it. Brain's a bit frazzled to check it now.

fselmo

This is amazing, really nice finds @marioevz 🔥. I left a comment on a bug in a comparison check.

Something to think about... should the LazyAlloc cache its own state root so we don't decouple the two? I don't think there's a ton of risk in separating them, but what if we use an underscored _state_root and define something like get_state_root that pulls from the cached _state_root and computes it if it's not there? Something like this instead of tracking them separately?
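A rough sketch of what that could look like. This is only an illustration of the suggestion, not code from the PR: the constructor signature is an assumption, and the SHA-256 of the raw string is a placeholder for the real state root computation.

```python
import hashlib
from typing import Optional


class LazyAlloc:
    """Keeps the raw allocation and its state root together."""

    def __init__(self, raw: str, state_root: Optional[bytes] = None):
        self.raw = raw
        # May be supplied by the t8n result; computed lazily otherwise.
        self._state_root = state_root

    def get_state_root(self) -> bytes:
        if self._state_root is None:
            # Placeholder hash (NOT a real trie root), computed at most once.
            self._state_root = hashlib.sha256(self.raw.encode()).digest()
        return self._state_root
```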

Should we also add some basic unit tests or is it enough that these are always used to check if anything breaks or doesn't?

Looks great though, excited for this to get in 🚀

fselmo · 2025-11-26T22:13:41Z

packages/testing/src/execution_testing/specs/state.py

        if (
-            modified_tool_output.alloc.root.keys()
-            != modified_tool_output.alloc.root.keys()
+            modified_tool_output.alloc.get_alloc().root.keys()


These are the same. I think we need a base_tool_alloc.root.keys() here? It looks like they were the same before too... maybe this was already broken before?
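To make the problem concrete, here is a tiny illustration with plain dicts standing in for the alloc roots (an assumption; the variable names are hypothetical apart from the `base_tool_alloc` idea mentioned above):

```python
base_tool_alloc = {"0xaa": {}, "0xbb": {}}
modified_tool_alloc = {"0xaa": {}, "0xcc": {}}

# Comparing an object's keys with themselves can never detect a difference.
self_check = modified_tool_alloc.keys() != modified_tool_alloc.keys()

# Comparing against the base allocation is what detects the mismatch.
real_check = modified_tool_alloc.keys() != base_tool_alloc.keys()
```

Here `self_check` is always false regardless of the contents, so the guarded branch would never run.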

danceratopz

The difference in these profiling charts is incredible! I'm taking a deeper look now.

Did you generate the pre-alloc group for all tests (fill stage 1) and then profile filling single test (fill stage 2)?

Here's my process - did you follow a similar one?

Generate the full pre-alloc group for all tests:

uv run pytest -c packages/testing/src/execution_testing/cli/pytest_commands/pytest_ini_files/pytest-fill.ini \
    --rootdir . \
    --generate-pre-alloc-groups \
    --evm-bin=evmone-t8n \
    -m "benchmark and blockchain_test_engine_x" \
    --output=/tmp/fixtures \
    --evm-dump-dir=/tmp/evm-debug \
    tests/benchmark/ \
    --clean

Profile via a single test:

rm -rf /tmp/evm-debug/;
uv run pytest -c packages/testing/src/execution_testing/cli/pytest_commands/pytest_ini_files/pytest-fill.ini \
    --rootdir . \
    --generate-all-formats \
    --use-pre-alloc-groups \
    --evm-bin=evmone-t8n \
    -m "benchmark and blockchain_test_engine_x" \
    --output=/tmp/fixtures \
    --evm-dump-dir=/tmp/evm-debug \
    tests/benchmark/compute/instruction/test_arithmetic.py::test_arithmetic[fork_Prague-blockchain_test_engine_x-opcode_ADD-]

Visualize:

uvx snakeviz "$(find /tmp/evm-debug -name 'profile.out' | head -1)"

It seems a bit over the top to enable the profiler for anyone requesting traces from fill via --evm-dump-dir. How about a --profile flag for this? This is more about noise in the output directory than performance; the profile adds <10% perf overhead.

danceratopz · 2025-11-27T13:22:43Z

The optimized CI fail should be fixed by #1813

danceratopz

The speed-up is really amazing.

This PR does resolve the performance issues I saw in the enginex PR regarding pre-alloc access and state root calculation 🚀

Small suggestion in marioevz#2

danceratopz · 2025-11-27T13:51:07Z

packages/testing/src/execution_testing/base_types/base_types.py

+        if isinstance(input_bytes, str):
+            input_bytes = sub(r"\s+", "", input_bytes)


Just for my understanding: the use of CoerceBytes should typically be discouraged, as it is less efficient for string inputs, but it's a convenient way to keep the EOF container definitions with whitespace as they are.

danceratopz · 2025-11-27T15:03:27Z

packages/testing/src/execution_testing/fixtures/pre_alloc_groups.py

+
+    _pre_alloc_group: "PreAllocGroup" = PrivateAttr(init=False)
+    _model_dump_cache: ModelDumpCache | None = PrivateAttr(None)
+    _cache_miss_count: int = PrivateAttr(0)


Is this used anywhere?

Removed in marioevz#2

danceratopz · 2025-11-27T15:12:55Z

packages/testing/src/execution_testing/fixtures/pre_alloc_groups.py

+        Model post init method to set GroupPreAlloc reference.
+        """
+        super().model_post_init(__context)
+        self.pre._pre_alloc_group = self


I got it after a mo, but it's kinda obfuscated. Suggestion here marioevz#2

spencer-tb

Amazing optimization. Let's get this merged!
@danceratopz's changes look good to me as well.

One small comment from my side. We should consider adding some basic tests for our custom classes in the future (LazyAlloc/CoerceBytes/GroupPreAlloc).

spencer-tb · 2025-12-01T15:46:34Z

packages/testing/src/execution_testing/client_clis/cli_types.py

+def write_json_file(data: str, file_path: str) -> None:
+    """Write a JSON file to the given path."""
+    with open(file_path, "w") as f:
+        f.write(data)


As far as I can tell we don't use or need this.

Suggested change: delete the write_json_file function shown above.

spencer-tb · 2025-12-01T15:59:05Z

We should add this to the changelog as well. I think a rebase is needed for CI to pass.

fselmo reviewed Nov 21, 2025

fselmo approved these changes Nov 21, 2025

fselmo self-requested a review November 21, 2025 17:01

marioevz force-pushed the optimize-fill branch from 864b444 to 2091245 Compare November 24, 2025 17:23

marioevz force-pushed the optimize-fill branch from 089aa02 to 976c290 Compare November 25, 2025 19:43

marioevz added 7 commits November 25, 2025 20:15

fix(t8n): introduce profiler

4480599

profiler pauses

6a653d8

fix(testing/base_types): Speed up to_bytes

509d17c

fix(base_types): unnecessary cast

fix(testing/base_types): CoerceBytes

56c9ec4

fix(testing): Optimize pre-alloc grouping state hash

f59397e

feat(t8n): introduce LazyAlloc

164de2d

feat(fixtures): Cached genesis model dump

fa3be8e

marioevz force-pushed the optimize-fill branch from 976c290 to fa3be8e Compare November 25, 2025 21:01

danceratopz self-requested a review November 25, 2025 21:15

danceratopz mentioned this pull request Nov 26, 2025

feat(consume): initial implementation of enginex danceratopz/execution-specs#1

Open

4 tasks

fselmo reviewed Nov 26, 2025

danceratopz reviewed Nov 27, 2025

This was referenced Nov 28, 2025

fix(fill): don't split slow tests into separate pre-alloc groups #1819

Draft

enhance(ci): add fast benchmark artifact workflow #1827

Open

spencer-tb approved these changes Dec 1, 2025

spencer-tb added the C-enhance, P-high, A-test-cli-fill, and A-test-client-clis labels Dec 1, 2025
