core/state: set-based journalling #30660

holiman · 2024-10-23T07:02:57Z

This is a second attempt at #30500 .

This PR introduces set-based journalling, where the journal events are stored in per-scope maps.

Whenever we enter a new call scope, we create a new scoped journal. Changes within the same scoped journal overwrite each other. For example: if a scope updates a balance from 6 to 5, then 4, then 3, there will only be one preimage in the journal. The old journal (linear_journal), by contrast, would have multiple entries: [prev: 6, prev: 5, prev: 4].

The linear / appending journal is wasteful on memory, and also slow on rollbacks, since each change is rolled back individually.
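To make the difference concrete, here is a minimal sketch of the two strategies. It is not the code in this PR: the types `linearJournal` and `scopedJournal`, the helper `journalSizes`, and the string address `"0xaa"` are hypothetical names chosen for illustration, and the sketch assumes that "one preimage per scope" is achieved by keeping only the value seen before the first change.

```go
package main

import "fmt"

// linearJournal (hypothetical) appends one entry for every change.
type linearJournal struct {
	entries []uint64 // previous balances, in the order the changes happened
}

func (j *linearJournal) record(prev uint64) {
	j.entries = append(j.entries, prev)
}

// scopedJournal (hypothetical) keeps at most one preimage per address.
type scopedJournal struct {
	balances map[string]uint64 // address -> balance before the first change in this scope
}

func newScopedJournal() *scopedJournal {
	return &scopedJournal{balances: make(map[string]uint64)}
}

// record stores prev only if the address has not been touched in this scope.
func (j *scopedJournal) record(addr string, prev uint64) {
	if _, ok := j.balances[addr]; !ok {
		j.balances[addr] = prev
	}
}

// journalSizes applies the same balance updates to both journals and
// returns how many entries each one ends up holding.
func journalSizes(start uint64, updates []uint64) (linear, set int) {
	lj := &linearJournal{}
	sj := newScopedJournal()
	cur := start
	for _, next := range updates {
		lj.record(cur)
		sj.record("0xaa", cur)
		cur = next
	}
	return len(lj.entries), len(sj.balances)
}

func main() {
	linear, set := journalSizes(6, []uint64{5, 4, 3})
	fmt.Println(linear, set) // prints "3 1"
}
```

With the 6 -> 5 -> 4 -> 3 example above, the linear journal holds three entries while the scoped one holds a single preimage (6), which is all a rollback of the scope needs.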

@karalabe reminded me of the burntpix benchmark, on which this PR excels, so I thought it was worth another chance.

master:

[user@work go-ethereum]$ ./evm_master run --prestate ./burntpix.json --receiver 0x49206861766520746F6F206D7563682074696D65  --input 0xa4de9ab4000000000000000000000000000000000000000000000000000000000F1FD58E000000000000000000000000000000000000000000000000000000000007A120 --bench | tail -c +131 | sed 's/[0]*$//' | xxd -r -p > output.svg

EVM gas used:    5642735088
execution time:  49.349209526s
allocations:     915684
allocated bytes: 175333368

This PR:

[user@work go-ethereum]$ ./evm_setjournal run --prestate ./burntpix.json --receiver 0x49206861766520746F6F206D7563682074696D65  --input 0xa4de9ab4000000000000000000000000000000000000000000000000000000000F1FD58E000000000000000000000000000000000000000000000000000000000007A120 --bench | tail -c +131 | sed 's/[0]*$//' | xxd -r -p > output.svg

EVM gas used:    5642735088
execution time:  48.740463584s
allocations:     30198
allocated bytes: 30308272

Allocations: 915K -> 30K (about 30x fewer)
Allocated bytes: 175M -> 30M (about 5.8x less)
Execution time is roughly unchanged (49.3s -> 48.7s).

core/genesis.go

core/state/setjournal.go

core/state/journal.go

karalabe

Random nits for now

core/state/setjournal.go

core/state/journal_api.go

holiman · 2024-11-06T13:01:55Z

Lifting this to top level

revertToSnapshot and discardSnapshot seem asymmetric (one discards something, the other reverts up to something).

Would it make sense to unify these? I.e. have discard also take the parent until which to discard (if revert knows where to revert, discard should also know? Shouldn't it be called in the same spot?).

Also, discard doesn't really need an id, does it, you will only ever discard the head?

So, now I have made (actually, still working on it) the api cleaner,

journal:

	// snapshot starts a new journal scope which can be reverted or discarded.
	// The lifecycle of journalling is as follows:
	// - snapshot() starts a 'scope'.
	// - The method snapshot() may be called any number of times.
	// - For each call to snapshot, there should be a corresponding call to end
	//  the scope via either of:
	//   - revertSnapshot, which undoes the changes in the scope, or
	//   - discardSnapshot, which discards the ability to revert the changes in the scope.
	snapshot()

	// revertSnapshot reverts all state changes made since the last call to snapshot().
	revertSnapshot(s *StateDB)

	// discardSnapshot removes the latest snapshot; after calling this
	// method, it is no longer possible to revert to that particular snapshot, the
	// changes are considered part of the parent scope.
	discardSnapshot()

StateDB

	// Snapshot starts a new journalled scope.
	Snapshot()
	// RevertSnapshot reverts all state changes made in the most recent journalled scope.
	RevertSnapshot()
	// DiscardSnapshot removes the ability to roll back the changes in the most
	// recent journalled scope. After calling this method, the changes are considered
	// part of the parent scope.
	DiscardSnapshot()
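The lifecycle described in these comments can be sketched with a stack of scopes. This is only an illustration under stated assumptions, not the implementation in this PR: the `state` and `scope` types, `SetBalance`, and `run` are hypothetical, balances are the only journalled data, and a missing account is treated the same as a zero balance. Like the PR, the sketch does no bounds checking, so reverting or discarding with no open scope panics.

```go
package main

import "fmt"

// scope (hypothetical) maps an address to its balance before the first
// change made inside that scope.
type scope map[string]uint64

// state (hypothetical) is a tiny stand-in for a journalled state.
type state struct {
	balances map[string]uint64
	scopes   []scope
}

func newState() *state { return &state{balances: map[string]uint64{}} }

// Snapshot starts a new journalled scope.
func (s *state) Snapshot() { s.scopes = append(s.scopes, scope{}) }

// SetBalance records the preimage in the innermost scope (once), then writes.
func (s *state) SetBalance(addr string, v uint64) {
	if n := len(s.scopes); n > 0 {
		top := s.scopes[n-1]
		if _, ok := top[addr]; !ok {
			top[addr] = s.balances[addr]
		}
	}
	s.balances[addr] = v
}

// RevertSnapshot undoes all changes in the most recent scope and ends it.
func (s *state) RevertSnapshot() {
	n := len(s.scopes)
	for addr, prev := range s.scopes[n-1] {
		s.balances[addr] = prev
	}
	s.scopes = s.scopes[:n-1]
}

// DiscardSnapshot ends the most recent scope and hands its preimages to the
// parent, unless the parent already holds an older preimage for that address.
func (s *state) DiscardSnapshot() {
	n := len(s.scopes)
	child := s.scopes[n-1]
	s.scopes = s.scopes[:n-1]
	if n < 2 {
		return // no parent scope: the changes simply become permanent
	}
	parent := s.scopes[n-2]
	for addr, prev := range child {
		if _, ok := parent[addr]; !ok {
			parent[addr] = prev
		}
	}
}

// run exercises nested scopes and returns the balance after the inner
// revert and after the final outer revert.
func run() (inner, outer uint64) {
	s := newState()
	s.SetBalance("a", 6) // no scope open, so nothing is journalled
	s.Snapshot()         // outer scope
	s.SetBalance("a", 5)
	s.Snapshot() // inner scope
	s.SetBalance("a", 4)
	s.RevertSnapshot() // undo inner scope
	inner = s.balances["a"]
	s.Snapshot()
	s.SetBalance("a", 3)
	s.DiscardSnapshot() // change now belongs to the outer scope
	s.RevertSnapshot()  // undo outer scope
	return inner, s.balances["a"]
}

func main() {
	inner, outer := run()
	fmt.Println(inner, outer) // prints "5 6"
}
```

Every `Snapshot` is matched by exactly one `RevertSnapshot` or `DiscardSnapshot`, and neither needs an id, since both always act on the innermost scope.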

This had quite a large fallout, and also made for some large changes (simplifications) inside the existing linear journal. The result is that after these changes, I am not 100% confident that it won't blow up in some corner-case where the journal is empty / reset, and something calls revert or discard. I intentionally did not check for out of bounds, because I want to locate the source of such flaws.

One such source is when the state processor applies a tx. If the tx is finalized, the journal is reset, since cross-tx reverting is not allowed. However, if it fails, then the caller (outside) needs to revert it. So that's one point where, although the API is symmetric, the call pattern is not.

So, my plan now: let the tests run, then put it on benchmarkers for a spin.

holiman · 2024-11-07T12:29:20Z

I tried to deploy this on bench06 without wiping it. cc @rjl493456442 I guess you were running a future db format on it? Hope I didn't break any important run?

Nov 07 13:25:36 bench06.ethdevops.io geth INFO [11-07|12:25:36.509] Using pebble as the backing database
Nov 07 13:25:36 bench06.ethdevops.io geth INFO [11-07|12:25:36.509] Allocated cache and file handles database=/datadir/geth/geth/chaindata cache=2.00GiB handles=524,288
Nov 07 13:25:39 bench06.ethdevops.io geth Chain metadata
Nov 07 13:25:39 bench06.ethdevops.io geth databaseVersion: 9 (0x9)
Nov 07 13:25:39 bench06.ethdevops.io geth headBlockHash: 0xe637caa29a2821cffeafd8db1906e765b0be7e4d8350123e156d757d3db62b2f
Nov 07 13:25:39 bench06.ethdevops.io geth headFastBlockHash: 0xe637caa29a2821cffeafd8db1906e765b0be7e4d8350123e156d757d3db62b2f
Nov 07 13:25:39 bench06.ethdevops.io geth headHeaderHash: 0xe637caa29a2821cffeafd8db1906e765b0be7e4d8350123e156d757d3db62b2f
Nov 07 13:25:39 bench06.ethdevops.io geth lastPivotNumber: 20977634 (0x14017e2)
Nov 07 13:25:39 bench06.ethdevops.io geth len(snapshotSyncStatus): 285 bytes
Nov 07 13:25:39 bench06.ethdevops.io geth snapshotDisabled: false
Nov 07 13:25:39 bench06.ethdevops.io geth snapshotJournal: 7938463 bytes
Nov 07 13:25:39 bench06.ethdevops.io geth snapshotRecoveryNumber: <nil>
Nov 07 13:25:39 bench06.ethdevops.io geth snapshotRoot: 0x047f51ef93470331dee55a068d094b27ccf97dc6e5ce104ae99587bfefb8ff12
Nov 07 13:25:39 bench06.ethdevops.io geth txIndexTail: 18785755 (0x11ea5db)
Nov 07 13:25:39 bench06.ethdevops.io geth
SkeletonSyncStatus: {"Subchains":[{"Head":21135734,"Tail":21135734,"Next":"0x36399eb652517a6a8618352d09c8d1cdfdf63a5bda68e3dc4a20c5eb290a27a8"}],"Finalized":21050420}
Nov 07 13:25:40 bench06.ethdevops.io geth Fatal: Failed to register the Ethereum service: rlp: input list has too many elements for rawdb.freezerTableMeta
Nov 07 13:25:40 bench06.ethdevops.io geth Fatal: Failed to register the Ethereum service: rlp: input list has too many elements for rawdb.freezerTableMeta

holiman · 2024-11-07T12:30:56Z

Deploying now, so

bench05: this PR, but using linear_journal
bench06: this PR, but using setjournal

karalabe · 2024-11-07T17:00:25Z

I was running Gary’s pr


holiman · 2024-11-13T09:57:54Z

Deploying now, so

* `bench05`: this PR, but using `linear_journal`

* `bench06`: this PR, but using `setjournal`

Has been running fine for ~5 days. Nothing noteworthy with regards to differences in charts on the two machines.

rjl493456442 · 2024-11-13T12:00:50Z

I tried to deploy this on bench06 without wiping it. cc @rjl493456442 I guess you were running a future db format on it? Hope I didn't break any important run?

Nov 07 13:25:36 bench06.ethdevops.io geth INFO [11-07|12:25:36.509] Using pebble as the backing database
Nov 07 13:25:36 bench06.ethdevops.io geth INFO [11-07|12:25:36.509] Allocated cache and file handles database=/datadir/geth/geth/chaindata cache=2.00GiB handles=524,288
Nov 07 13:25:39 bench06.ethdevops.io geth Chain metadata
Nov 07 13:25:39 bench06.ethdevops.io geth databaseVersion: 9 (0x9)
Nov 07 13:25:39 bench06.ethdevops.io geth headBlockHash: 0xe637caa29a2821cffeafd8db1906e765b0be7e4d8350123e156d757d3db62b2f
Nov 07 13:25:39 bench06.ethdevops.io geth headFastBlockHash: 0xe637caa29a2821cffeafd8db1906e765b0be7e4d8350123e156d757d3db62b2f
Nov 07 13:25:39 bench06.ethdevops.io geth headHeaderHash: 0xe637caa29a2821cffeafd8db1906e765b0be7e4d8350123e156d757d3db62b2f
Nov 07 13:25:39 bench06.ethdevops.io geth lastPivotNumber: 20977634 (0x14017e2)
Nov 07 13:25:39 bench06.ethdevops.io geth len(snapshotSyncStatus): 285 bytes
Nov 07 13:25:39 bench06.ethdevops.io geth snapshotDisabled: false
Nov 07 13:25:39 bench06.ethdevops.io geth snapshotJournal: 7938463 bytes
Nov 07 13:25:39 bench06.ethdevops.io geth snapshotRecoveryNumber: <nil>
Nov 07 13:25:39 bench06.ethdevops.io geth snapshotRoot: 0x047f51ef93470331dee55a068d094b27ccf97dc6e5ce104ae99587bfefb8ff12
Nov 07 13:25:39 bench06.ethdevops.io geth txIndexTail: 18785755 (0x11ea5db)
Nov 07 13:25:39 bench06.ethdevops.io geth
SkeletonSyncStatus: {"Subchains":[{"Head":21135734,"Tail":21135734,"Next":"0x36399eb652517a6a8618352d09c8d1cdfdf63a5bda68e3dc4a20c5eb290a27a8"}],"Finalized":21050420}
Nov 07 13:25:40 bench06.ethdevops.io geth Fatal: Failed to register the Ethereum service: rlp: input list has too many elements for rawdb.freezerTableMeta
Nov 07 13:25:40 bench06.ethdevops.io geth Fatal: Failed to register the Ethereum service: rlp: input list has too many elements for rawdb.freezerTableMeta

Yes, the db version is 9, so it's confirmed!

core/state/journal.go

core/state/journal_linear.go

core/state/journal_set.go

MariusVanDerWijden · 2024-11-21T10:10:09Z

I built a little fuzzer for differential fuzzing the two journals against each other here: https://github.com/MariusVanDerWijden/go-ethereum/tree/journal_on_heapfix_fuzz. It's in the first commit. I also added some things on top, so that the fuzzer doesn't panic.

holiman · 2024-11-21T12:10:45Z

I built a little fuzzer for differential fuzzing

Nice! We could add it to this PR, but please make it so that it's deterministic from the fuzzer input. So don't use rand, use the input as the source (and read zeroes if it's depleted, which should be detected in the next loop).
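One possible shape for such an input-driven source, as a sketch only: the `inputSource` type, `decodeOps`, and the operation names are hypothetical and not taken from the fuzzer in the linked branch. Every decision is read from the fuzzer input, reads past the end return zero, and the decode loop checks for depletion at the top of each iteration.

```go
package main

import "fmt"

// inputSource (hypothetical) hands out bytes from the fuzzer input and
// returns zeroes once the input is used up.
type inputSource struct {
	data []byte
	pos  int
}

func (s *inputSource) readByte() byte {
	if s.pos >= len(s.data) {
		return 0 // depleted: read zeroes instead of falling back to rand
	}
	b := s.data[s.pos]
	s.pos++
	return b
}

func (s *inputSource) depleted() bool { return s.pos >= len(s.data) }

// decodeOps turns fuzzer input into a sequence of journal operations.
// The same input always yields the same sequence.
func decodeOps(input []byte) []string {
	names := []string{"snapshot", "setBalance", "revert", "discard"}
	src := &inputSource{data: input}
	var ops []string
	for !src.depleted() { // depletion is detected at the start of each loop
		op := names[int(src.readByte())%len(names)]
		ops = append(ops, op)
		if op == "setBalance" {
			// The argument may be read after the input ran out, giving 0.
			ops = append(ops, fmt.Sprintf("amount=%d", src.readByte()))
		}
	}
	return ops
}

func main() {
	fmt.Println(decodeOps([]byte{0, 1, 7, 2, 5}))
	// prints "[snapshot setBalance amount=7 revert setBalance amount=0]"
}
```

Because nothing depends on a random source, a crashing input found by the fuzzer replays to exactly the same sequence of journal calls.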

holiman · 2024-11-22T13:54:17Z

We could add it to this PR, but please make it so that it's deterministic

I've done this now

holiman · 2024-12-03T10:24:55Z

Deploying this again now, so

bench05: this PR (using setjournal)
bench06: this PR, but using linear_journal

holiman · 2024-12-16T14:23:09Z

Rebased on master, which changed with EIP-7702, so the journal API no longer implicitly requires the pre-code to be nil.

This has not yet been implemented in the set-based journal, but I have added it to the fuzzer, so the failure is detected. It was harder than I thought to detect, because it's a bit sneaky: if a journal sets back the codehash but not the code, then the changes pass fine: because code is only an in-memory byteslice, the codehash is the consensus object.

Hence, a broken journal would subtly corrupt the in-memory stateobject. This can be detected by checking the codehash against the code.
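A sketch of what such a consistency check might look like. Assumptions, stated loudly: the `stateObject` here is a hypothetical two-field stand-in rather than the real type, SHA-256 from the standard library is used in place of Keccak-256 purely to keep the example self-contained, and nil code is taken to mean "not loaded" and is skipped.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
)

// stateObject (hypothetical) holds only the two fields relevant here.
type stateObject struct {
	codeHash []byte // the consensus-relevant value
	code     []byte // in-memory bytecode, nil when not loaded
}

// hashCode is a stand-in hash. ASSUMPTION: SHA-256 replaces Keccak-256
// so that the sketch needs only the standard library.
func hashCode(code []byte) []byte {
	h := sha256.Sum256(code)
	return h[:]
}

// codeConsistent reports whether the in-memory code matches the codehash.
// Code that has not been loaded (nil) cannot disagree, so it passes.
func codeConsistent(o *stateObject) bool {
	if o.code == nil {
		return true
	}
	return bytes.Equal(hashCode(o.code), o.codeHash)
}

func main() {
	oldCode, newCode := []byte{0x60, 0x00}, []byte{0x60, 0x01}
	obj := &stateObject{code: newCode, codeHash: hashCode(newCode)}

	// A broken revert: the codehash is set back, the code is not.
	obj.codeHash = hashCode(oldCode)
	fmt.Println(codeConsistent(obj)) // prints "false"

	// A correct revert restores both.
	obj.code = oldCode
	fmt.Println(codeConsistent(obj)) // prints "true"
}
```

A fuzzer can run a check of this kind after every revert; the mismatch is invisible to anything that only looks at the codehash.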

This is not yet fixed in set-journal, because I want to think a bit on how to best handle that.

holiman · 2024-12-16T15:40:40Z

if a journal sets back the codehash but not the code, then the changes pass fine: because code is only an in-memory byteslice, the codehash is the consensus object.

Actually, there may be nothing wrong with this.

	code []byte // contract bytecode, which gets set when code is loaded

It should be fine to set the code to nil while restoring the previous codeHash, under the assumptions that

* Code will be loaded on-demand when needed

* Code already exists in the code-database.

The latter assumption is interesting. Is there any case where we can undo a setCode, and revert to an older code, and the older code is not stored in the db?

holiman requested review from MariusVanDerWijden, karalabe, lightclient and rjl493456442 as code owners October 23, 2024 07:02

jwasinger reviewed Oct 24, 2024: core/genesis.go (outdated)

holiman commented Oct 24, 2024: core/genesis.go (outdated)

jwasinger reviewed Oct 24, 2024: core/state/setjournal.go (outdated)

holiman force-pushed the journal_on_heapfix branch from dcbf781 to 696edfa on November 4, 2024 08:20

MariusVanDerWijden reviewed Nov 4, 2024: core/state/journal.go (outdated)

karalabe suggested changes Nov 5, 2024

holiman force-pushed the journal_on_heapfix branch from 6ca0931 to 32cd8a4 on November 5, 2024 17:25

holiman requested a review from s1na as a code owner November 6, 2024 12:48

holiman force-pushed the journal_on_heapfix branch from 936622f to e896dac on November 6, 2024 12:51

holiman force-pushed the journal_on_heapfix branch from e896dac to e6f3517 on November 6, 2024 13:12

holiman force-pushed the journal_on_heapfix branch 2 times, most recently from f5d8372 to 79b00d1 on November 20, 2024 12:54

MariusVanDerWijden reviewed Nov 21, 2024: core/state/journal.go

MariusVanDerWijden reviewed Nov 21, 2024: core/state/journal_linear.go

MariusVanDerWijden reviewed Nov 21, 2024: core/state/journal_linear.go

MariusVanDerWijden reviewed Nov 21, 2024: core/state/journal_set.go

holiman force-pushed the journal_on_heapfix branch from 79b00d1 to d4072de on November 22, 2024 13:02

holiman force-pushed the journal_on_heapfix branch 2 times, most recently from fa13bd6 to 2dee224 on November 30, 2024 19:08

holiman force-pushed the journal_on_heapfix branch from 2dee224 to e303ae9 on December 3, 2024 08:07

rjl493456442 self-assigned this Dec 4, 2024

holiman mentioned this pull request Dec 10, 2024: core/tracing: state journal wrapper #30441 (merged)

holiman force-pushed the journal_on_heapfix branch from e303ae9 to 1960545 on December 10, 2024 12:59

holiman requested a review from fjl as a code owner December 10, 2024 12:59

holiman force-pushed the journal_on_heapfix branch from 1960545 to e7ba44b on December 16, 2024 14:19

rjl493456442 assigned MariusVanDerWijden Aug 28, 2025

Commit 69b6195: core/state: separate journal-implementation behind interface, impleme… …nt createaccount

MariusVanDerWijden force-pushed the journal_on_heapfix branch from 9d9265e to 69b6195 on September 16, 2025 15:40
