
Conversation

@jannotti
Contributor

@jannotti jannotti commented Jul 31, 2025

Currently, congestion is monitored locally by each algod and is not part of consensus. So while individual algods may require higher fees, transaction processing is unaware of that requirement. Therefore, the extra fee can end up being "double spent" to execute inner transactions. This PR introduces two header fields: Load, which indicates how full a particular block is, and CongestionFee, which grows when the Load exceeds 50% and returns toward MinFee with less full blocks. The long-term plan is that transactions must pay MinFee + CongestionFee (which we'll call BaseFee) to send a transaction. A group indicates its willingness to be charged BaseFee per transaction by including a new transaction field, CostIncrement, in the transaction header. This is an amount, in MicroAlgos, that the group is willing to be charged for each transaction, including inners. It is not added to the transaction's Fee, though. Instead, the group is stating that the fees it has included will be sufficient if, during evaluation, each transaction is charged at the higher rate.
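A minimal sketch of how the two new header fields combine into the effective per-transaction rate. The type, field, and function names below are illustrative only, not the PR's actual identifiers:

```go
package main

import "fmt"

// Illustrative constant; the real value lives in consensus parameters.
const minFee = 1000 // MicroAlgos

// blockHeader sketches the two new fields described above: Load is how
// full the block is, and CongestionFee is the surcharge above MinFee.
type blockHeader struct {
	Load          uint64 // e.g. percent of block capacity used
	CongestionFee uint64 // MicroAlgos; 0 when uncongested
}

// baseFee is the effective per-transaction rate: MinFee + CongestionFee.
func baseFee(h blockHeader) uint64 {
	return minFee + h.CongestionFee
}

func main() {
	quiet := blockHeader{Load: 10, CongestionFee: 0}
	busy := blockHeader{Load: 90, CongestionFee: 250}
	fmt.Println(baseFee(quiet)) // 1000: just MinFee
	fmt.Println(baseFee(busy))  // 1250: MinFee plus the surcharge
}
```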

For now, concessions are being made to roll out congestion pricing over multiple consensus releases. The concession is this: transactions will be dropped when they arrive at algod if the current block's CongestionFee is higher than the submitted group's CostIncrement, but groups will not be dropped at evaluation time unless they fail to include enough Fee for their stated CostIncrement. The result of this policy is that once a group is admitted to the transaction queue, it will not be dropped simply because congestion increased before the group got into a block. However, groups that were only admitted because they included a sufficient CostIncrement will still be held to that standard.
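The two-phase policy might be pictured like this. It is a sketch only: the struct, function names, and exact fee arithmetic are illustrative, with CostIncrement treated as a MicroAlgo amount per the description above:

```go
package main

import (
	"errors"
	"fmt"
)

const minFee = 1000 // MicroAlgos; illustrative

type group struct {
	TotalFee      uint64 // sum of Fee across the group's transactions
	CostIncrement uint64 // per-txn surcharge the group agreed to, MicroAlgos
	NumTxns       uint64 // transactions in the group, including inners
}

// admit mirrors the ingress rule: drop the group if the current block's
// CongestionFee exceeds what the group said it is willing to pay.
func admit(g group, currentCongestionFee uint64) error {
	if currentCongestionFee > g.CostIncrement {
		return errors.New("congestion fee exceeds group's CostIncrement")
	}
	return nil
}

// evaluate mirrors the evaluation-time rule: once admitted, the group is
// held only to its own stated CostIncrement, not the (possibly higher)
// congestion level at evaluation time.
func evaluate(g group) error {
	required := (minFee + g.CostIncrement) * g.NumTxns
	if g.TotalFee < required {
		return fmt.Errorf("fees %d below required %d", g.TotalFee, required)
	}
	return nil
}

func main() {
	g := group{TotalFee: 2400, CostIncrement: 200, NumTxns: 2}
	fmt.Println(admit(g, 150)) // <nil>: 150 <= 200, admitted
	fmt.Println(admit(g, 300)) // dropped at ingress
	fmt.Println(evaluate(g))   // <nil>: 2400 >= (1000+200)*2
}
```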

This means that congestion is still outside the protocol, but the groundwork has been laid to first track congestion in protocol, and eventually enforce it at evaluation time.

Why don't we enforce it with this update? Because it would lead to an ugly pathology. Suppose there is no congestion. Lots of transactions arrive with MinFee fees and no CostIncrement. As soon as one block is created that's more than half full, the next block would demand a CostIncrement, so all those sent transactions would be dropped. We'd have an empty block, congestion would go down, and then a bunch of MinFee transactions would be sent again.
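A toy simulation makes the flip-flop visible. It assumes immediate enforcement, the 50% per-round step the current code uses, and all traffic at MinFee with no CostIncrement; the `step` function and everything else here is illustrative, not this PR's code:

```go
package main

import "fmt"

// step applies one round of the 50% adjustment: up 50% after a full
// block, down 50% (floored at MinFee) after an empty one.
func step(baseFee, minFee float64, full bool) float64 {
	if full {
		return baseFee * 1.5
	}
	if next := baseFee / 1.5; next > minFee {
		return next
	}
	return minFee
}

func main() {
	const minFee = 1000.0
	baseFee := minFee
	// Demand is all MinFee-only traffic with no CostIncrement, so blocks
	// fill only when the demanded fee is back down at MinFee.
	for round := 0; round < 6; round++ {
		full := baseFee <= minFee
		baseFee = step(baseFee, minFee, full)
		fmt.Printf("round %d: full=%v baseFee=%.0f\n", round, full, baseFee)
	}
}
```

The printed rounds alternate between full blocks at 1000 and empty blocks at 1500, which is exactly the oscillation described above.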

Avoiding that problem will require some sort of "pay-as-you-go" or "refund" model. Transactions will need to specify that they are willing to pay more, but only if there's congestion that demands it. If congestion is less than they have indicated their willingness to pay, they will keep that part of their fee. To avoid backward-incompatible changes, we will leave Fee to mean "I always pay this fee". There will need to be a new field, perhaps RefundableFee, which does not necessarily go to the fee sink. There are several issues with that change, so prudence leads us to put it off a bit: which transactions receive the refund if the fee is split among transactions? How can logicsigs be protected from unintentional spend? Do apps need similar protection?

So: the short term solution for this PR and the next consensus release is to only enforce the changing congestion value at ingress, then lock that choice in for checking during evaluation even if the congestion level changes. This gives the community time to understand the new congestion model, since it will appear in the block. Wallets and such can begin to offer recommended fees.

While in this code, EnableFeePooling is eliminated as a consensus parameter. Since it is strictly more permissive, we can simply assume it has always been true.

A worthwhile discussion: how quickly should the congestion level change? The current code causes the BaseFee (the sum of the congestion fee and MinFee) to go up by 50% when the block is full, or down by 50% when the block is empty. As a comparison, Ethereum only allows its congestion fee to change by 12.5% per block. I lean toward adjusting this PR to use a 10% limit each way.
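The effect of the cap is easy to compare concretely. A sketch of a bounded multiplicative update, where the fee moves toward its extreme linearly with how far the load is from a 50% target; the function and its shape are illustrative, not the PR's actual formula:

```go
package main

import "fmt"

// nextBaseFee sketches a bounded multiplicative update: load is the
// fraction of the block used (0..1), rate is the maximum per-round move
// (0.5 for this PR's current 50%, 0.125 for Ethereum's EIP-1559 cap,
// 0.1 for the proposed 10% limit). At load=1.0 the fee rises by the full
// rate; at load=0.0 it falls by the full rate, floored at minFee.
func nextBaseFee(base, load, rate, minFee float64) float64 {
	next := base * (1 + rate*2*(load-0.5))
	if next < minFee {
		next = minFee
	}
	return next
}

func main() {
	// One completely full block, starting from MinFee = 1000:
	for _, rate := range []float64{0.5, 0.125, 0.1} {
		fmt.Printf("rate %.3f -> %.0f\n", rate, nextBaseFee(1000, 1.0, rate, 1000))
	}
}
```

After a single full block the 50% cap jumps the fee to 1500, while the 12.5% and 10% caps reach only 1125 and 1100, so the tighter caps trade responsiveness for stability.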

Summary

Test Plan

@jannotti jannotti self-assigned this Jul 31, 2025
@jannotti jannotti force-pushed the on-chain-congestion branch 2 times, most recently from dd908fe to f17cee2 Compare July 31, 2025 20:02
@jannotti jannotti changed the title Eval: Add, maintain, and validate new Load and BaseFee header fields Eval: Move congestion monitoring and fee escalation on-chain Aug 1, 2025
@jannotti jannotti force-pushed the on-chain-congestion branch 10 times, most recently from b5f1d18 to 68c0b54 Compare August 6, 2025 14:16
@codecov

codecov bot commented Sep 29, 2025

❌ 1 Tests Failed:

Tests completed: 106893 (Failed: 1, Passed: 106892, Skipped: 2547)
View the top 3 failed test(s) by shortest run time
github.com/algorand/go-algorand/ledger::TestArchivalFromNonArchival
Stack Traces | 0.12s run time
=== RUN   TestArchivalFromNonArchival
    testingLogger.go:40: time="2025-10-21T21:00:52.498043 +0000" level=info msg="trackerDBInitialize upgrading database schema from version 0 to version 11" file=trackerdbV2.go function=github..../store/trackerdb/sqlitedriver.RunMigrations line=72
    testingLogger.go:40: time="2025-10-21T21:00:52.498228 +0000" level=info msg="trackerDBInitialize performing upgrade from version 0" file=trackerdbV2.go function=github..../store/trackerdb/sqlitedriver.RunMigrations line=76
    testingLogger.go:40: time="2025-10-21T21:00:52.498318 +0000" level=info msg="upgradeDatabaseSchema0 initializing schema" file=trackerdbV2.go function="github..../store/trackerdb/sqlitedriver.(*trackerDBSchemaInitializer).upgradeDatabaseSchema0" line=189
    testingLogger.go:40: time="2025-10-21T21:00:52.500794 +0000" level=info msg="trackerDBInitialize performing upgrade from version 1" file=trackerdbV2.go function=github..../store/trackerdb/sqlitedriver.RunMigrations line=76
    testingLogger.go:40: time="2025-10-21T21:00:52.501039 +0000" level=info msg="trackerDBInitialize performing upgrade from version 2" file=trackerdbV2.go function=github..../store/trackerdb/sqlitedriver.RunMigrations line=76
    testingLogger.go:40: time="2025-10-21T21:00:52.501217 +0000" level=info msg="trackerDBInitialize performing upgrade from version 3" file=trackerdbV2.go function=github..../store/trackerdb/sqlitedriver.RunMigrations line=76
    testingLogger.go:40: time="2025-10-21T21:00:52.502236 +0000" level=info msg="trackerDBInitialize performing upgrade from version 4" file=trackerdbV2.go function=github..../store/trackerdb/sqlitedriver.RunMigrations line=76
    testingLogger.go:40: time="2025-10-21T21:00:52.502341 +0000" level=info msg="upgradeDatabaseSchema4: deleted 0 rows" file=trackerdbV2.go function="github..../store/trackerdb/sqlitedriver.(*trackerDBSchemaInitializer).upgradeDatabaseSchema4" line=338
    testingLogger.go:40: time="2025-10-21T21:00:52.502537 +0000" level=info msg="trackerDBInitialize performing upgrade from version 5" file=trackerdbV2.go function=github..../store/trackerdb/sqlitedriver.RunMigrations line=76
    testingLogger.go:40: time="2025-10-21T21:00:52.503302 +0000" level=info msg="upgradeDatabaseSchema5 upgraded 1 out of 52 accounts [ 1.9% ]" file=trackerdbV2.go function="github..../store/trackerdb/sqlitedriver.(*trackerDBSchemaInitializer).upgradeDatabaseSchema5.func1" line=365
    testingLogger.go:40: time="2025-10-21T21:00:52.505907 +0000" level=info msg="trackerDBInitialize performing upgrade from version 6" file=trackerdbV2.go function=github..../store/trackerdb/sqlitedriver.RunMigrations line=76
    testingLogger.go:40: time="2025-10-21T21:00:52.506589 +0000" level=info msg="upgradeDatabaseSchema6 upgraded 1 out of 52 accounts [ 1.9% ]" file=trackerdbV2.go function="github..../store/trackerdb/sqlitedriver.(*trackerDBSchemaInitializer).upgradeDatabaseSchema6.func1" line=432
    testingLogger.go:40: time="2025-10-21T21:00:52.507808 +0000" level=info msg="trackerDBInitialize performing upgrade from version 7" file=trackerdbV2.go function=github..../store/trackerdb/sqlitedriver.RunMigrations line=76
    testingLogger.go:40: time="2025-10-21T21:00:52.508163 +0000" level=info msg="trackerDBInitialize performing upgrade from version 8" file=trackerdbV2.go function=github..../store/trackerdb/sqlitedriver.RunMigrations line=76
    testingLogger.go:40: time="2025-10-21T21:00:52.508361 +0000" level=info msg="trackerDBInitialize performing upgrade from version 9" file=trackerdbV2.go function=github..../store/trackerdb/sqlitedriver.RunMigrations line=76
    testingLogger.go:40: time="2025-10-21T21:00:52.508682 +0000" level=info msg="trackerDBInitialize performing upgrade from version 10" file=trackerdbV2.go function=github..../store/trackerdb/sqlitedriver.RunMigrations line=76
    testingLogger.go:40: time="2025-10-21T21:00:52.509263 +0000" level=info msg="trackerDBInitialize database schema upgrade complete" file=trackerdbV2.go function=github..../store/trackerdb/sqlitedriver.RunMigrations line=149
    testingLogger.go:40: time="2025-10-21T21:00:52.535180 +0000" level=info msg="initializeHashes rebuilding merkle trie for round 0" file=catchpointtracker.go function="github..../algorand/go-algorand/ledger.(*catchpointTracker).initializeHashes" line=1584
    testingLogger.go:40: time="2025-10-21T21:00:52.539920 +0000" level=info msg="initializeHashes rebuilt the merkle trie with 52 entries in 4.550029ms" file=catchpointtracker.go function="github..../algorand/go-algorand/ledger.(*catchpointTracker).initializeHashes" line=1701
    archival_test.go:725: 
        	Error Trace:	.../go-algorand/ledger/archival_test.go:725
        	Error:      	Received unexpected error:
        	            	txgroup with 0.0A fees is less than 1mA (usage=1.000000 * base=1mA)
        	Test:       	TestArchivalFromNonArchival
--- FAIL: TestArchivalFromNonArchival (0.12s)
github.com/algorand/go-algorand/data::TestTxHandlerRememberReportErrorsWithTxPool
Stack Traces | 0.14s run time
=== RUN   TestTxHandlerRememberReportErrorsWithTxPool
    txHandler_test.go:2195: 
        	Error Trace:	.../go-algorand/data/txHandler_test.go:2195
        	Error:      	Not equal: 
        	            	expected: 1
        	            	actual  : 0
        	Test:       	TestTxHandlerRememberReportErrorsWithTxPool
--- FAIL: TestTxHandlerRememberReportErrorsWithTxPool (0.14s)
github.com/algorand/go-algorand/ledger::TestArchivalCreatables
Stack Traces | 0.46s run time
=== RUN   TestArchivalCreatables
    archival_test.go:426: 
        	Error Trace:	.../go-algorand/ledger/archival_test.go:426
        	Error:      	Received unexpected error:
        	            	txgroup with 0.0A fees is less than 1mA (usage=1.000000 * base=1mA)
        	Test:       	TestArchivalCreatables
--- FAIL: TestArchivalCreatables (0.46s)


@jannotti jannotti force-pushed the on-chain-congestion branch 5 times, most recently from 87fa07d to 8b6d6ee Compare September 29, 2025 17:54
Comment on lines -105 to -109
// EnableFeePooling specifies that the sum of the fees in a
// group must exceed one MinTxnFee per Txn, rather than check that
// each Txn has a MinFee.
EnableFeePooling bool

Contributor Author

We don't need this any more, so I removed it for clarity. It enabled strictly more permissive fee handling, so we can pretend it was always true.
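The fee-pooling rule that remains, per the removed comment's own description, could be sketched like this (illustrative names; not the actual go-algorand code):

```go
package main

import "fmt"

const minTxnFee = 1000 // MicroAlgos; illustrative

// pooledFeeOK implements the fee-pooling rule described above: the sum
// of the group's fees must cover one MinTxnFee per transaction, with no
// per-transaction minimum.
func pooledFeeOK(fees []uint64) bool {
	var total uint64
	for _, f := range fees {
		total += f
	}
	return total >= minTxnFee*uint64(len(fees))
}

func main() {
	fmt.Println(pooledFeeOK([]uint64{2000, 0}))   // true: 2000 covers both txns
	fmt.Println(pooledFeeOK([]uint64{500, 1000})) // false: 1500 < 2000
}
```

Because this is strictly more permissive than a per-transaction MinFee check (any group passing the old check also passes this one), assuming it was always true cannot invalidate previously valid groups.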


api:
make -C daemon/algod/api
$(MAKE) -j7 -C daemon/algod/api
Contributor Author

doubles the speed of make api

}
}

func TestNextBaseFee(t *testing.T) {
Contributor Author

BaseFee can move by 50% up or down each round. We should decide whether we want it to be able to move that fast.

In comparison, Ethereum can only change 12.5% per block (and their blocks are also slower).

Comment on lines -369 to -373
if len(txgroup) == 1 {
	t := txgroup[0].Txn
	if t.Type == protocol.StateProofTx && t.Sender == transactions.StateProofSender && t.Fee.IsZero() {
		return nil
	}
Contributor Author

This is now handled by having the stateproof transaction not contribute in SummarizeTxnFees.

Comment on lines +404 to +407
err := pool.checkSufficientFee(txgroup)
if err != nil {
	return err
}
Contributor Author

JJ's note to self. checkSufficientFee probably should not have moved down. Investigate.

func (pool *TransactionPool) addToPendingBlockEvaluator(txgroup []transactions.SignedTxn, recomputing bool, stats *telemetryspec.AssembleBlockMetrics) error {
	err := pool.addToPendingBlockEvaluatorOnce(txgroup, recomputing, stats)
	if err == ledgercore.ErrNoSpace {
		pool.numPendingWholeBlocks++
Contributor Author

Removing this in the name of taking out old-style congestion code may be a mistake. It means we can't remove a transaction that has been in the pool too long from here, because we're not keeping track of a sort of "virtual round".

But, who cares? It'll get kicked out by eval before it gets into an actual block.

Comment on lines -792 to -795
case *transactions.MinFeeError:
	asmStats.MinFeeErrorCount++
	stats.RemovedInvalidCount++
	pool.log.Infof("Pending transaction in pool no longer valid: %v", err)
Contributor Author

This error is no longer possible (and has not been for a while), because FeePooling means we don't check fees on individual transactions.

@jannotti jannotti force-pushed the on-chain-congestion branch from 8b6d6ee to 498d8b1 Compare October 2, 2025 19:26
@jannotti jannotti force-pushed the on-chain-congestion branch 2 times, most recently from 043bc35 to 04d6e98 Compare October 10, 2025 16:48
@jannotti jannotti force-pushed the on-chain-congestion branch from 04d6e98 to 98ce027 Compare October 21, 2025 18:34
@jannotti jannotti force-pushed the on-chain-congestion branch from 98ce027 to 76b307d Compare October 21, 2025 20:54
@joe-p
Contributor

joe-p commented Oct 23, 2025

The long-term plan is that transactions must pay MinFee + CongestionFee (which we'll call BaseFee) to send a transaction. A group indicates its willingness to be charged BaseFee per transaction by including a new transaction field, CostIncrement, in the transaction header.

I think that we should think about swapping MinFee and BaseFee. So MinFee = BaseFee + CongestionFee. This allows us to preserve the semantics of global MinTxnFee which is used to determine how much fee an itxn needs to succeed.

I also think CostIncrement is a bit confusing. This field is what the sender expects the BaseFee/MinFee to be, so why not just call it BaseFee/MinFee?

@jannotti
Contributor Author

The long-term plan is that transactions must pay MinFee + CongestionFee (which we'll call BaseFee) to send a transaction. A group indicates its willingness to be charged BaseFee per transaction by including a new transaction field, CostIncrement, in the transaction header.

I think that we should think about swapping MinFee and BaseFee. So MinFee = BaseFee + CongestionFee. This allows us to preserve the semantics of global MinTxnFee which is used to determine how much fee an itxn needs to succeed.

If a program says (today) that it only wishes to run when the tx.Fee <= global.MinFee, it is saying "I'm ok with never being able to run when there's congestion. I just don't want to ever spend more than the minimum." We need to preserve that intent, even if we might imagine we know better. There have always been potential congestion charges. So writing that code was a choice to not allow using the logicsig during congestion.

I also think CostIncrement is a bit confusing. This field is what the sender expects the BaseFee/MinFee to be, so why not just call it BaseFee/MinFee?

It only includes the amount over the minfee that the txn is willing to pay; that's why I called it an "increment". I defined it this way so that when it is 0, it can be omitted, and when it is small, the encoding is small, rather than forcing every transaction to include an encoding of the minfee.

It encodes a small number x, such that the group is willing to pay (1+x) * minFee for each basic transaction. The x is expressed in millionths, though I don't know that we need that much precision.
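Under the millionths encoding just described, the committed per-transaction rate works out like this. A sketch only: the function name is hypothetical, and note the PR description at top instead calls CostIncrement a plain MicroAlgo amount, so this follows the factor form from this comment:

```go
package main

import "fmt"

const minFee uint64 = 1000 // MicroAlgos; illustrative

// requiredPerTxn computes the per-transaction rate a group commits to
// when x is expressed in millionths: (1 + x/1e6) * minFee, done in
// integer arithmetic to avoid rounding surprises.
func requiredPerTxn(xMillionths uint64) uint64 {
	return minFee + minFee*xMillionths/1_000_000
}

func main() {
	fmt.Println(requiredPerTxn(0))       // 1000: no surcharge
	fmt.Println(requiredPerTxn(500_000)) // 1500: willing to pay 1.5x MinFee
	fmt.Println(requiredPerTxn(60_000))  // 1060: like a 6% "tax"
}
```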

@joe-p
Contributor

joe-p commented Oct 23, 2025

If a program says (today) that it only wishes to run when the tx.Fee <= global.MinFee

I'm not talking about logic sigs but apps sending inner txns with global MinTxnFee; itxn_field Fee. Right now that is always guaranteed to be enough fee for an inner. If CostIncrement applies to inners, MinFee might not be sufficient. Of course we could have MinTxnFee return BaseFee but I think that'd be confusing (and we'd need to think about if/how we expose MinFee)

To be fair I think most apps just defer fee to outers but I do recall some PyTeal contract that used MinTxnFee.

It only includes the amount over the minfee that the txn is willing to pay, that's why I called it an "increment".

Ah ok that makes sense.

@jannotti
Contributor Author

To be fair I think most apps just defer fee to outers but I do recall some PyTeal contract that used MinTxnFee.

That's what I'm hoping. Recall that there is a (perhaps misguided) feature that prepopulates an inner txn's fee with the amount needed. So if an app wants to pay the fee, they should not set it at all.

As for the prepopulation feature, I am considering intentionally not letting that populate with anything more than MinFee as I don't want apps that use it to suddenly find themselves on the hook for congestion charges.

It only includes the amount over the minfee that the txn is willing to pay, that's why I called it an "increment".

Ah ok that makes sense.

I wonder if the field in the block should be defined the same way. Right now, it's a MicroAlgos amount, to be added to the MinFee. But perhaps it should instead express a factor that multiplies the MinFee. (And express that factor as just the amount over 1.0, so it stays small.)

Perhaps instead of "Increment" I should call these things "Tax" since that is something people are familiar with that is defined the same way. 6% sales tax means multiply the cost by 1.06.

@pbennett
Contributor

I'm not talking about logic sigs but apps sending inner txns with global MinTxnFee; itxn_field Fee. Right now that is always guaranteed to be enough fee for an inner. If CostIncrement applies to inners, MinFee might not be sufficient. Of course we could have MinTxnFee return BaseFee but I think that'd be confusing (and we'd need to think about if/how we expose MinFee)

When he said it applies to inners, I assume he just meant in the same way fees work now: paying for unknown numbers of opups, or c2c calls. But inner transactions right now are always 1000 microAlgo. It seems like this is all about the outer "get in the door" fee, but also how much may still be applied to inners (I'm still assuming 1000 microAlgo here). Changing what itxns cost seems like a large change and would most certainly break contracts.

@jannotti
Contributor Author

I expect that inners will need to pay more during congestion, just like any other transaction.
I can certainly see how that could be disruptive, as any change can be. But there are a few points to consider.

  1. congestion is likely to be rare
  2. most apps don't pay the inner fee explicitly. They set it to zero, and expect their callers to supply a surplus. That will continue to work.

If apps DO want to pay the inner fee themselves, how do they do it today? Do most of them allow the prefilled default fee to be used? Or do they hardcode 1000 (or global MinFee?)

Either way, during congestion, the caller might need to supplement from the top level. That doesn't seem out of keeping with the way most things will need to work: during congestion, pay more at the top. The mechanism to do so cleanly is non-trivial, but we already know simulate and wallets need to improve in that regard.

@joe-p
Contributor

joe-p commented Oct 26, 2025

That's what I'm hoping. Recall that there is (perhaps misguided) feature that prepopulates an inner txn's fee with the amount needed. So if an app wants to pay the fee, they should not set it at all.

Right, I suppose if there is some app that's using it, the clients could be updated to cover the extra cost via the outer txns.

As for the prepopulation feature, I am considering intentionally not letting that populate with anything more than MinFee as I don't want apps that use it to suddenly find themselves on the hook for congestion charges.

Yeah I think that makes sense. An app that fails unexpectedly under congestion seems preferable to an app that spends more than you'd expect during congestion.

If apps DO want to pay the inner fee themselves, how do they do it today?

Fwiw most apps I've seen just defer to the outer. TEALScript and Puya both actually do int 0; itxn_field Fee by default thus enforcing this behavior unless the user explicitly sets a fee.

@pbennett
Contributor

Fwiw most apps I've seen just defer to the outer. TEALScript and Puya both actually do int 0; itxn_field Fee by default thus enforcing this behavior unless the user explicitly sets a fee.

This has been my pattern, but I think what might happen (in the same way we have to handle MBR) is that apps might convey to callers how much is needed to call them. They wouldn't ever be able to give that info here; instead, simulate becomes a hard necessity, and contracts can't tell callers (whether another contract or a user/automation) how much extra they might need. In some of my code this might be a known fixed quantity, but recently I've also coded with the mindset of: it costs what it costs; do simulate and it figures it out. So maybe it's not an issue?

One part of me is ok with "a transaction", whether outer or inner, scaling appropriately, but they are different internally and have different costs, so the "entry fee" vs. extra opcode cost always seemed logically fine to me. If we want the entry fee to pass onto the inners, then it's a different model altogether: it's not per byte, and it's also not "per byte + x per y opcodes". If it just gets explained as "fee per transaction", whether outer or inner (but what does it mean when you just need to buy up opcodes? what's that unit?), that's fine, but it obviously is a big change from a fixed 1000 microAlgo cost per inner.


3 participants