change db schema, allow TXO-spender lookups, protocol 1.5 #80
Conversation
Supposedly it makes a difference (see e.g. [0]), and depending on how batching works it makes sense it would, but during a few full syncs of testnet I've done, it was within measurement error. Still, existing code was already doing this. [0]: https://stackoverflow.com/q/54941342
with the pending db changes, an upgrade is ~as fast as a resync from genesis
now that we have our own txindex
This will allow looking up which tx spent an outpoint.
In Bitcoin consensus, a txout index is stored as a uint32_t. However, in practice, an output in a tx uses at least 10 bytes (for an OP_TRUE output), so:
- to exhaust a 2-byte namespace, a tx would need to have a size of at least 2 ** 16 * 10 = 655 KB,
- to exhaust a 3-byte namespace, a tx would need to have a size of at least 2 ** 24 * 10 = 167 MB.
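As a quick sanity check of those numbers, in plain Python (the 10-byte figure is the minimal OP_TRUE output from the text above; this snippet is illustrative, not from the PR):

```python
MIN_OUTPUT_SIZE = 10  # 8-byte amount + 1-byte script length + 1-byte OP_TRUE script

for nbytes in (2, 3):
    min_tx_size = 2 ** (8 * nbytes) * MIN_OUTPUT_SIZE
    print(f"exhausting a {nbytes}-byte txout_idx namespace needs a tx of "
          f">= {min_tx_size:,} bytes")

# exhausting a 2-byte txout_idx namespace needs a tx of >= 655,360 bytes (~655 KB)
# exhausting a 3-byte txout_idx namespace needs a tx of >= 167,772,160 bytes (~167 MB)
```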
Change schema of b'H' records from:

```
# Key:   b'H' + address_hashX + tx_num
# Value: <null>
# "address -> all txs that touch it"
```

to:

```
# Key:   b'H' + address_hashX + tx_num + txout_idx
# Value: <null>
# "address -> funding outputs" (no spends!)
```

Spending txs can be calculated from b's' records.
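For illustration only, a minimal sketch of how such a key could be packed and unpacked. The 11-byte hashX and 5-byte tx_num widths come from the PR description, the 3-byte txout_idx follows the sizing argument above; none of this is the PR's actual code:

```python
import struct

def pack_hist_key(hashX: bytes, tx_num: int, txout_idx: int) -> bytes:
    # b'H' + 11-byte hashX + 5-byte big-endian tx_num + 3-byte big-endian txout_idx
    assert len(hashX) == 11
    return (b'H' + hashX
            + struct.pack('>Q', tx_num)[3:]      # keep the low 5 bytes
            + struct.pack('>I', txout_idx)[1:])  # keep the low 3 bytes

def unpack_hist_key(key: bytes) -> tuple[bytes, int, int]:
    assert key[:1] == b'H' and len(key) == 20
    return (key[1:12],
            int.from_bytes(key[12:17], 'big'),
            int.from_bytes(key[17:20], 'big'))
```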
notifications not implemented yet
(force-pushed from a6ae71a to 9351173)
History.get_txnums and History.backup depend on ordering of tx_nums, so we want the lexicographical order (used by leveldb comparator) to match the numerical order.
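A tiny demonstration of that point: big-endian fixed-width encodings sort lexicographically in the same order as the integers they encode, while little-endian ones do not (assuming the 5-byte tx_num width described in the first post):

```python
import random

be = lambda n: n.to_bytes(5, 'big')     # lexicographic order == numeric order
le = lambda n: n.to_bytes(5, 'little')  # the two orders diverge

nums = random.sample(range(2 ** 40), 1000)
assert sorted(nums, key=be) == sorted(nums)  # always holds
assert sorted(nums, key=le) != sorted(nums)  # almost surely differs
```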
I've also explored an alternative scheme for the `tx_hash` -> `tx_num` table. This alternative approach would save around ~12 GiB of disk space, but based on my tests it would also almost double the initial sync time :(
(force-pushed from 9351173 to 532cacb)
pinging @romanz, @chris-belcher, @shesek, @cculianu, I think this is pretty much ready, and I am considering merging it.
Hmm. I also implement an Electrum server called Fulcrum, which runs on BCH and BTC and is right now 100% compatible with E-X. I will have to review what you did here. What is the purpose exactly of this change? I'm not entirely sure how useful the output.subscribe stuff is but .. can be implemented. Can you elaborate on why this output-specific subscription stuff is needed? And why a client would want to see which tx spent an outpoint?

It doubles the db size and will probably slow down full sync by the same factor. It will require server admins to do a full resync. This is a hugely inconvenient change -- so some exposition on why it's needed would be welcome here. Note that E-X is not the only server implementing this protocol. There's Esplora from Blockstream, jelectrum, electrs, etc. All will need to be updated if you want Electrum clients to talk to them. I consider this a pretty big deal so.. why?!

EDIT: Also, aside from the storage and sync costs associated with this change, what is the runtime performance impact of this during mempool sync? Have you tried seeing how this scales? You would need to subscribe to a ton of txos and spend them in mempool and see how it copes: memory usage, delays in mempool processing, etc. Also, how is the time to process new blocks affected by this? E-X can already sometimes take tens of seconds on a new 2k-tx block. What is the impact of this change on that time?

What about DoS concerns? How is this limited on a per-client or per-IP basis? What's to stop a client from subscribing to the entire UTXO set, just for the "lulz"? What happens then? Can the server cope with that? Does it have strategies to mitigate or limit that in some way?
Electrum will use this for Lightning. For Lightning, in almost all cases the client just wants to watch an outpoint, get notified if it got created and if it got spent, and, when spent, inspect the spender tx. This is currently implemented in the client via watching scripthashes; however, that is actually vulnerable to several attacks. Most of these attacks cost more than what an attacker can gain with the ~small-capacity Lightning channels of today, but if channels become larger it will be worth it to exploit them. There are multiple different attacks involved, the most trivial being exploiting the max history length for a given address by spamming it with dust outputs.
Yes, that's why I pinged the people above. :P I had already messaged some of them privately before; and discussed this on IRC.
How is that at all different from what's already possible today for a client to do? A client subscribing to thousands of addresses for DoS purposes is the same as one subscribing to thousands of outpoints. It is handled by session cost accounting, which is not perfect of course, but again, it's already how it works.
Not all of the increase is due to the protocol changes. There are actually multiple changes bundled in this PR; since they all change the db schema, I decided to bundle them together. One change, which removes the need to do "db compaction" periodically - a nuisance for the operator and a cause of endless confusion - increases the db size from ~55 GiB to ~64 GiB (see #65 (comment)). But yes, the rest of the increase is pretty much all to do with indexing spenders of outpoints. So it's a 64->90 GiB increase: ~40%.
It's a constant factor difference compared to existing code. We already need to process each tx/block; we just do a few more operations for each now. The constant factor is not too large, much less than 2x, although I don't have an exact number. But yes, I have done several full syncs from genesis, and have already been running a server with this for a month on both mainnet and testnet.
Also note that bitcoind txindex would no longer be required - only optional (used if enabled).
True, but this saving comes with some cost. In fact, the cost may be such that you may want to recommend people still continue to use txindex, and warn of that fact on startup. Without txindex, performance is not identical to with txindex. In particular you will see a huge slowdown when many clients do tx requests at once.

Passing a blockhash to getrawtransaction works without txindex, but it is slow. Not sure if you normally read C++, but the branch where a blockhash is supplied ends up reading the whole block from disk and scanning it for the tx. With the txindex in place, it loads just the tx itself exactly from the correct position in the disk file. Note that this entire code path is executing with the global lock (cs_main) held.

I investigated doing this some time ago in Fulcrum and opted against it -- it led to extremely degraded performance. I recommend you try and hit your server with lots of requests for transactions -- both a version of it that uses txindex and one that does not. Note that passing a block hash to getrawtransaction takes the slow path even if the node has txindex enabled. To really test this, do it in parallel, simulating a number of clients, and observe the performance difference... even single-threaded the degradation is significant, but if you do multiple requests at once it is even more apparent.
Thanks -- I really do appreciate the ping.
Yes, glad you finally fixed that. That was -- a strange decision on Neil's part. :) The fix is pretty logical. In Fulcrum we had no such limitation because we just store the data similarly to how you do it post-fix. Nice one.
True. I am glad to hear that already E-X has some mitigation in place for that. Was just checking.
I see. Yeah, I can see why you'd need this for lightning, esp. in light of the potential attack you described. Well, from my point of view, I see this change as potentially just reducing performance, with some cost that only helps lightning users. I have no choice though -- you guys have the #1 client on BTC, and my server, Fulcrum, is primarily designed for BCH. I support BTC just to challenge myself and as a way to test my server software "at scale", since BTC gets a LOT more activity than BCH does currently. You ultimately call the shots here, and if I wish to continue to support future versions of Electrum on BTC with Fulcrum, I will have no choice but to implement this mechanism as you describe it -- namely the ability to look up spends by txo and to subscribe to txos. My only reservation with this is the subscription mechanism -- I just feel that its lifecycle could use some refinement...
Also not sure -- do you have a spec for the new RPC? Like -- what is the exact contract with the client here? Can you link me to documentation for this? Or should I just read the python? What do you recommend I read? This last point in particular is very critical for me, in case you would like some review and feedback.... Thanks @SomberNight for keeping me in the loop.
@cculianu see also spesmilo/electrum#6778
@ecdsa I appreciate the link, but that discussion seems specific to LN. I .. in particular just want to see the exact contract of this new subscription spelled out. Or what happens on reorg.. or... those corner cases I would like described. Or.. I can just read the python, I guess, if you don't have this documented yet.

EDIT: @SomberNight can you clarify what happens on reorg or RBF or tx mempool drop, when the txo that was previously spent now gets "unspent"? Does the event fire with empty results again? (It should!). Thanks.
I was also considering this, yes. I think we can leave this out of the spec; a server implementation might want to clear txo subs that are deeply confirmed. This is not implemented in this PR atm.
Blockchain reorgs, mempool replacement, mempool eviction should all be considered and trigger notifications.
This PR also includes documentation for the spec changes; see the protocol docs in the diff.
Ok cool. That's what I was looking for. Thanks for the clarification.
Aha. Yeah I may end up triggering expiry on very deeply confirmed txo spends.
Ah, didn't see that, thanks. Do let me know your thoughts on performance with txindex vs without -- like I said, I still think you should recommend server admins keep it enabled, for the reasons outlined in my expository essay.
Thanks for the ping. For completeness, here's a link to some discussion about the topic on the EPS github: chris-belcher/electrum-personal-server#174 (comment). I read through the docs of the PR and the changes seem great to me. I'm somewhat excited, as it makes supporting LN in EPS much easier.
Your analysis and conclusions make sense and sound right, unfortunately. I too had had a look at how bitcoind handles getrawtransaction without txindex, out of curiosity, and saw that it looks a lot less efficient than the with-txindex variant. I had not realised the implications of taking that lock; that makes it a lot worse IMO. :/ Indeed, for public-facing servers, we should still strongly recommend enabling txindex.
Awesome. Yeah, I think that's good. :) By the way, you don't need to have the user specify whether bitcoind has txindex=1 or not as a conf var -- it's 1 extra little thing the user might get wrong, and then performance will suffer -- you can detect it at runtime. Just ask bitcoind for tx_num=1 (that is, the txid of the first tx after the genesis tx). If it responds with a real tx, the daemon has txindex, and you can set a boolean to that effect and never send it a block_hash (so that it always takes the fast path on getrawtransaction).
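A sketch of that probe, as a hypothetical helper using plain JSON-RPC over HTTP via `requests` (the block-1 coinbase txid is well-known; without txindex, getrawtransaction cannot find a confirmed non-mempool tx unless a blockhash is supplied):

```python
import requests

# txid of the coinbase of mainnet block 1, i.e. the first tx after the genesis
# tx (the genesis coinbase itself is special-cased and never retrievable)
PROBE_TXID = "0e3e2357e806b6cdb1f70b54c3a3a17b6714ee1f0e68bebb44a74b1efd512098"

def daemon_has_txindex(url: str, auth: tuple[str, str]) -> bool:
    """A successful probe without a blockhash implies txindex=1."""
    resp = requests.post(url, auth=auth, json={
        "id": 0, "method": "getrawtransaction", "params": [PROBE_TXID]})
    return resp.json().get("error") is None

# usage: has_txindex = daemon_has_txindex("http://127.0.0.1:8332", ("user", "pass"))
```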
Hey @SomberNight, 1 more thing occurred to me. If you wanted to save space and not have to store the txhash twice (once in the txnum -> txhash table and once in the txhash -> txnum table), there is a bit of a "shortcut" you can perform that may turn what would be a 20GB table into maybe as little as a 4-5GB table. You could take just the low-order 5 or 6 or 8 bytes of the txhash and use that as the key. The value for that key would then be a list of txnums that match that 40/48/64-bit value (on average we don't expect a collision until there are billions of tx's in Bitcoin if using a 64-bit key). E.g.:

Your proposed scheme:

    Key: b't' + tx_hash (33 bytes), Value: tx_num (5 bytes)

^ Eats up about 38 bytes per key/value pair. Becomes:

    Key: b't' + tx_hash[:8] (9 bytes), Value: one or more tx_nums (5 bytes each)

^ Eats up 14 bytes per key/value pair (less if you accept more chance of collisions). Note that in the more compact scheme you would have to further resolve each tx_num in the list to an actual tx_hash (by consulting the tx_num -> tx_hash table), in order to figure out which one was the real one you were looking for. This would save space at the cost of at least 1 additional db lookup for each tx_hash -> tx_num query. It's something to think about if you wanted to save space.
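A sketch of the compact variant's read path. Assumed names throughout: `db.get` is a LevelDB-style point lookup, `fs_tx_hash` stands in for the existing tx_num -> tx_hash filesystem lookup, and the 8-byte truncation is one of the widths suggested above:

```python
TRUNC = 8  # truncation width: smaller saves more space but collides more often

def lookup_tx_num(db, tx_hash: bytes, fs_tx_hash) -> int | None:
    bucket = db.get(b't' + tx_hash[:TRUNC])  # value: concatenated 5-byte tx_nums
    if bucket is None:
        return None
    for i in range(0, len(bucket), 5):
        tx_num = int.from_bytes(bucket[i:i + 5], 'big')
        # resolve each candidate back to its full hash to reject collisions
        if fs_tx_hash(tx_num) == tx_hash:
            return tx_num
    return None
```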
Have you seen #80 (comment)?

from first post:

> Due to removing […]

Although note that what I have tried (see linked commit) always does the extra reverse lookup: txid -> txid[:11] -> tx_num -> txid, even if there is no collision of truncated txids (which is a good sanity check, but most likely makes it a lot slower...)
Oh sorry I must have missed that -- there was a lot to read here initially and maybe my mind didn't absorb that the first time. Oh ok good so you did experiment with it!
:'(
Yeah, I would have initially been paranoid too and done the final sanity check (extra lookup).. but if you think about it -- maybe it can be skipped in that case. I guess technically, if there really is only 1 tx_num in the bucket, you can just "believe it" and the extra lookup is not necessary. In which case it should be just as fast and still save you a ton of space... Note that if you take 11 bytes from a 32-byte txid, that's 88 bits. The chances of a collision for 88 bits only start to get high when we have on the order of 2^44 (~17.6 trillion) items. Ninja EDIT: Oh wait, never mind. It really depends on your usage. If your question is "do I know about this txid??" -- you really do need the final lookup! The chance of collision there, no matter how small.. is not worth taking. Never mind. :)
All worth thinking about. I too will be experimenting with these questions in Fulcrum, should I get to implementing this feature to continue to support BTC. Final EDIT: Yeah, so if you want to opt for faster sync time, go for it by all means. It definitely is faster to not do additional lookups if it's a hot path. 👍 I still will be experimenting with this myself, since Fulcrum is a different app with a different data model and different logic... but all worth thinking about. Thanks for the discussion.
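For reference, the collision arithmetic behind both truncation widths discussed above (plain Python; the total tx count is a rough order-of-magnitude assumption, not an exact figure):

```python
import math

N_TXS = 600_000_000  # assumed rough total number of Bitcoin txs to date

for bits in (64, 88):  # 8-byte and 11-byte truncated txids
    # birthday bound: expected colliding pairs ~ n^2 / 2^(bits+1)
    expected_colliding_pairs = N_TXS ** 2 / 2 ** (bits + 1)
    threshold = math.sqrt(2 ** bits)  # collisions become likely around here
    print(f"{bits}-bit keys: ~{expected_colliding_pairs:.2g} expected collisions; "
          f"likely only beyond ~{threshold:.2g} txs")

# 64-bit keys: ~0.0098 expected collisions; likely only beyond ~4.3e+09 txs
# 88-bit keys: ~5.8e-10 expected collisions; likely only beyond ~1.8e+13 txs
```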
(force-pushed from 0519ef8 to a7ea627)
Note that this is a soft fork: the server can apply it even for past protocol versions.

Previously, with the order being undefined, if an address had multiple mempool transactions touching it, switching between different servers could result in a change in address status, simply as a result of these servers ordering mempool txs differently. This would result in the client re-requesting the whole history of the address.

```
D/i | interface.[electrum.blockstream.info:60002] | <-- ('blockchain.scripthash.subscribe', ['660b44502503064f9d5feee48726287c0973e25bc531b4b8a072f57f143d5cd0']) {} (id: 12)
D/i | interface.[electrum.blockstream.info:60002] | --> 9da27f9df91e3f860212f65b736fa20a539ba6e3d509f6370367ee7f10a4d5b0 (id: 12)
D/i | interface.[electrum.blockstream.info:60002] | <-- ('blockchain.scripthash.get_history', ['660b44502503064f9d5feee48726287c0973e25bc531b4b8a072f57f143d5cd0']) {} (id: 13)
D/i | interface.[electrum.blockstream.info:60002] | --> [{'fee': 200, 'height': 0, 'tx_hash': '3ee6d6e26291ce360127fe039b816470fce6eeea19b5c9d10829a1e4efc2d0c7'}, {'fee': 239, 'height': 0, 'tx_hash': '9e050f09b676b9b0ee26aa02ccee623fae585a85d6a5e24ecedd6f8d6d2d3b1d'}, {'fee': 178, 'height': 0, 'tx_hash': 'fb80adbf8274190418cb3fb0385d82fe9d47a844d9913684fa5fb3d48094b35a'}, {'fee': 200, 'height': 0, 'tx_hash': '713933c50b7c43f606dad5749ea46e3bc6622657e9b13ace9d639697da266e8b'}] (id: 13)
D/i | interface.[testnet.hsmiths.com:53012] | <-- ('blockchain.scripthash.subscribe', ['660b44502503064f9d5feee48726287c0973e25bc531b4b8a072f57f143d5cd0']) {} (id: 12)
D/i | interface.[testnet.hsmiths.com:53012] | --> f7ef7237d2d62a3280acae05616200b96ad9dd85fd0473c29152a4a41e05686c (id: 12)
D/i | interface.[testnet.hsmiths.com:53012] | <-- ('blockchain.scripthash.get_history', ['660b44502503064f9d5feee48726287c0973e25bc531b4b8a072f57f143d5cd0']) {} (id: 13)
D/i | interface.[testnet.hsmiths.com:53012] | --> [{'tx_hash': '9e050f09b676b9b0ee26aa02ccee623fae585a85d6a5e24ecedd6f8d6d2d3b1d', 'height': 0, 'fee': 239}, {'tx_hash': 'fb80adbf8274190418cb3fb0385d82fe9d47a844d9913684fa5fb3d48094b35a', 'height': 0, 'fee': 178}, {'tx_hash': '3ee6d6e26291ce360127fe039b816470fce6eeea19b5c9d10829a1e4efc2d0c7', 'height': 0, 'fee': 200}, {'tx_hash': '713933c50b7c43f606dad5749ea46e3bc6622657e9b13ace9d639697da266e8b', 'height': 0, 'fee': 200}] (id: 13)
```
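Why ordering leaks into the address status: per the Electrum protocol, the status is a hash over the concatenated history, so the same set of mempool txs in a different order yields a different status. A sketch of the documented algorithm:

```python
import hashlib

def scripthash_status(history: list[tuple[str, int]]) -> str:
    """history: (tx_hash_hex, height) pairs in the order the server returns them;
    mempool txs use height 0, or -1 if they have unconfirmed parents."""
    concat = ''.join(f'{tx_hash}:{height}:' for tx_hash, height in history)
    return hashlib.sha256(concat.encode('ascii')).hexdigest()

# The two servers in the log above return the same four mempool txs in different
# orders, so their statuses differ and the client refetches the whole history.
```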
(force-pushed from a7ea627 to 49d589a)
I have even more proposed changes for protocol version 1.5 in #90, so marking this as draft now.
see #101 instead
This PR changes the LevelDB/RocksDB db schema, introduces a new RPC for clients to look up the spender of a transaction outpoint (including notifications for it), and adds documentation for that RPC and other protocol changes - introducing electrum protocol version 1.5.
The db schema changed for both the history db and the utxo db.
First, some terminology:
- `scripthash` is `sha256(scriptPubKey)` - our address-encoding-independent representation of a bitcoin address
- `hashX` is a truncated 11-byte form of a `scripthash`, used in the db to save space
- `tx_num` is a 5-byte integer, the global index of a confirmed transaction in the chain, i.e. the index of a confirmed transaction among all confirmed transactions in blockchain order

A new "table" is added to the history db - the b's' records - that allows looking up which tx spent a particular TXO.
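The exact record layout is in the diff; as an illustrative sketch only (assumed layout: key = b's' + funding tx_num + txout_idx, value = spending tx_num, with the widths given above plus the 3-byte txout_idx used elsewhere in this PR), a spender lookup could look like:

```python
def lookup_spender(db, tx_num: int, txout_idx: int) -> int | None:
    """Return the tx_num of the tx spending outpoint (tx_num, txout_idx),
    or None if the output is unspent. db.get: LevelDB-style point lookup."""
    key = b's' + tx_num.to_bytes(5, 'big') + txout_idx.to_bytes(3, 'big')
    value = db.get(key)
    return int.from_bytes(value, 'big') if value is not None else None
```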
Note that on the file system we already store a `tx_num` -> `tx_hash` mapping, but to be able to utilise this new b's' table, we would also need the inverse mapping. Such an inverse mapping is added here, as another table in the history db: the b't' records, mapping `tx_hash` -> `tx_num`.
In the history db, the history for a hashX used to be stored as: Key: b'H' + hashX + flush_id, Value: a list of tx_nums.

i.e. for any given `hashX`, we used to store a list of `tx_num`s that touch it. These records used to be written out in batches: in a given batch, all txs that touch a `hashX` went into a single record, but different batches went into different records. To avoid key collisions, every time a batch was flushed, it had a global `flush_id`, which was also put into the key. Note that the global `flush_id` got incremented by one every time anything was flushed to the history db. In particular, for a given `hashX`, many flushes don't contain any txs that touch it, hence the `flush_id`s for that `hashX` will have huge gaps. As the `flush_id` was a 2-byte integer, and apart from initial sync there is a flush on every new block, the `flush_id` used to overflow quite frequently (about once every 15 months for bitcoin). There was a compaction script that could be run at any time (assuming e-x itself was not running) - and needed to be run once the `flush_id` overflowed - which would iterate the whole db and merge together history records concerned with the same `hashX`.

At an intermediate commit, the above was changed instead to: Key: b'H' + hashX + tx_num, Value: <null>.
That is, for a given `hashX` we still store a list of `tx_num`s, but this is split into multiple records, one per `tx_num`. Instead of storing a list of `tx_num`s in the value, we store exactly one as part of the key. LevelDB/RocksDB then allows iterating by key prefix and finding all `tx_num`s efficiently. This allows similar performance at the cost of a slightly larger db size (as the `hashX` is now duplicated across records), but it completely does away with `flush_id` and compaction.

(note: this change is based on previous PR "db: change history db schema" #65)
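With this layout, reading the history of a hashX becomes a prefix scan. For example, with the plyvel LevelDB binding (a sketch against the intermediate schema described above, not the PR's code):

```python
import plyvel

def get_txnums(db: plyvel.DB, hashX: bytes) -> list[int]:
    """All tx_nums touching hashX; big-endian keys make the lexicographic
    iteration order coincide with numeric tx_num order."""
    prefix = b'H' + hashX
    return [int.from_bytes(key[len(prefix):], 'big')
            for key, _ in db.iterator(prefix=prefix)]
```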
However, that is not the final form. Instead of the above, we now store: Key: b'H' + hashX + tx_num + txout_idx, Value: <null>.

So instead of storing each tx that touches a `hashX` (both funding and spending), we only store funding txs - funding tx outpoints specifically. The idea is that using the b's' table, we can look up the corresponding spending txs (if any).

In the utxo db, b'h' keys changed from: Key: b'h' + compressed_tx_hash + txout_idx + tx_num, Value: hashX

to: Key: b'h' + tx_num + txout_idx, Value: hashX
`compressed_tx_hash` was a 4-byte truncated txid. As we now have a `tx_hash` -> `tx_num` map (the history db b't' records), there is no need for this anymore.

`txout_idx` in bitcoin consensus is a 4-byte unsigned integer, and that is what we used to use in the db too. However, to save space, we now use 3-byte unsigned integers in the db instead. In Bitcoin, I estimate it would need a 160 MB transaction with all OP_TRUE outputs to exhaust the 3-byte namespace.

bitcoind txindex is no longer required - it is now optional (used if enabled).
There is a new ENV variable `DAEMON_HAS_TXINDEX` (defaults to True) signalling whether bitcoind txindex is enabled. txindex is no longer required, as e-x now effectively reimplements it: the b't' table in the history db (`tx_hash` -> `tx_num`), in conjunction with existing on-disk dbs e-x has, allows looking up the hash of the block that includes a tx (and that is sufficient to use the bitcoind `getrawtransaction` RPC; see "[rpc] Allow fetching tx directly from specified block in getrawtransaction" bitcoin/bitcoin#10275).

A new electrum protocol version, `1.5`, is introduced, along with documentation:

- The `server.version` message must be the first message sent by the client. That is, version negotiation must happen before any other messages.
- The `height` argument for `blockchain.transaction.get_merkle` is now optional. (related: "Compressed block headers & merkle proofs over low bandwidth communications" #43)
- A `mode` argument is added to `blockchain.estimatefee`. (see "allow passing estimate_mode to estimatefee" kyuupichan/electrumx#1001)
- `blockchain.outpoint.subscribe`: subscribe to a transaction outpoint, and get a notification when it gets spent.
- `blockchain.outpoint.unsubscribe`: unsubscribe from a TXO.

Overall, for Bitcoin mainnet, with these changes the e-x db size went from ~55 GiB to ~90 GiB for me, using LevelDB. Though note that the old size depends quite a bit on how recent the last compaction was.