-
Notifications
You must be signed in to change notification settings - Fork 90
feat: Add Redis lock strategy #4617
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: feat/nonce-ordering-with-locks
Are you sure you want to change the base?
Conversation
Signed-off-by: Simeon Nakov <[email protected]>
Codecov Report❌ Patch coverage is
❌ Your patch check has failed because the patch coverage (65.91%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage. @@ Coverage Diff @@
## feat/nonce-ordering-with-locks #4617 +/- ##
=================================================================
Coverage ? 94.11%
=================================================================
Files ? 133
Lines ? 21232
Branches ? 1719
=================================================================
Hits ? 19982
Misses ? 1229
Partials ? 21
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
| | `TX_DEFAULT_GAS` | "400000" | Default gas for transactions that do not specify gas. | | ||
| | `TXPOOL_API_ENABLED` | "false" | Enables all txpool related methods. | | ||
| | `USE_ASYNC_TX_PROCESSING` | "true" | Set to `true` to enable `eth_sendRawTransaction` to return the transaction hash immediately after passing all prechecks, while processing the transaction asynchronously in the background. | | ||
| | `LOCK_MAX_HOLD_MS` | "30000" | Maximum time in milliseconds a lock can be held before automatic force-release. This TTL prevents deadlocks when transaction processing hangs or crashes. Default is 30 seconds. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder, should we use the same variable as used in local lock strategy?
| * @throws Error if acquisition times out or Redis connection fails. | ||
| */ | ||
| async acquireLock(address: string): Promise<string> { | ||
| const normalizedAddress = this.normalizeAddress(address); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be a utils function and be reused in LocalLockStrategy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also add it as a static method on LockService or something and reuse it in LocalLockStrategy as well. This would keep it within the Lock module and reduce indirection.
| await this.cleanupFromQueue(queueKey, sessionKey, normalizedAddress); | ||
| } | ||
|
|
||
| const redisError = new RedisCacheError(error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should use another error, RedisCache would be confusing, or rename the error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. Probably just use the original error thrown without needing to wrap it into RedisCacheError it can cause confusion
|
|
||
| try { | ||
| await redisLockStrategy.acquireLock(testAddress); | ||
| expect.fail('Should have thrown error'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't assert with try/catch, let's try and use unified approaches in tests
https://github.com/hiero-ledger/hiero-json-rpc-relay/blob/main/packages/relay/tests/lib/mirrorNodeClient.spec.ts#L216
|
|
||
| try { | ||
| await redisLockStrategy.acquireLock(testAddress); | ||
| expect.fail('Should have thrown error'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, don't use try/catch
|
|
||
| try { | ||
| await redisLockStrategy.acquireLock(testAddress); | ||
| expect.fail('Should have thrown error'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edit assertion here ass well to not use try/catch
quiet-node
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some concerns and comments
| | `USE_ASYNC_TX_PROCESSING` | "true" | Set to `true` to enable `eth_sendRawTransaction` to return the transaction hash immediately after passing all prechecks, while processing the transaction asynchronously in the background. | | ||
| | `LOCK_MAX_HOLD_MS` | "30000" | Maximum time in milliseconds a lock can be held before automatic force-release. This TTL prevents deadlocks when transaction processing hangs or crashes. Default is 30 seconds. | | ||
| | `LOCK_QUEUE_POLL_INTERVAL_MS` | "50" | Polling interval in milliseconds for Redis lock queue checks. Lower values reduce latency but increase Redis load. Default is 50ms. | | ||
| | `LOCK_REDIS_PREFIX` | "lock" | Redis key prefix for lock-related keys. Useful for namespace isolation in shared Redis instances. Default is "lock". | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this as a configuration? I think a hardcoded value inside Redis Lock strategy is good enough, right?
| * @throws Error if acquisition times out or Redis connection fails. | ||
| */ | ||
| async acquireLock(address: string): Promise<string> { | ||
| const normalizedAddress = this.normalizeAddress(address); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also add it as a static method on LockService or something and reuse it in LocalLockStrategy as well. This would keep it within the Lock module and reduce indirection.
| async acquireLock(address: string): Promise<string> { | ||
| const normalizedAddress = this.normalizeAddress(address); | ||
| const sessionKey = randomUUID(); | ||
| const lockKey = this.getLockKey(normalizedAddress); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like we’ve had this conversation before, right? Not sure when but anywho, requiring the key to be normalized before being passed in adds unnecessary complexity and risks producing different types of keys for different operations, since it allows unnormalized parameters to be passed. It also leads to repeated calls to this.normalizeAddress() at the call sites.
Would it make sense to handle normalization inside getLockKey? This way, whenever we call this.getLockKey() with any type of parameter, it would automatically be normalized, ensuring that getLockKey() reliably produces a lock key that is always consistent and accurate, regardless of the case sensitivity of the input.
| const normalizedAddress = this.normalizeAddress(address); | ||
| const sessionKey = randomUUID(); | ||
| const lockKey = this.getLockKey(normalizedAddress); | ||
| const queueKey = this.getQueueKey(normalizedAddress); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for getQueueKey, I think it would be more intuitive to have normalization inside getQueueKey.
| await this.cleanupFromQueue(queueKey, sessionKey, normalizedAddress); | ||
| } | ||
|
|
||
| const redisError = new RedisCacheError(error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. Probably just use the original error thrown without needing to wrap it into RedisCacheError it can cause confusion
| }); | ||
|
|
||
| if (result === 1) { | ||
| this.logger.info(`Lock released: address=${normalizedAddress}, sessionKey=${sessionKey}`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for this log would it make more sense to use debug?
|
|
||
| const redisError = new RedisCacheError(error); | ||
| this.logger.error(redisError, `Failed to acquire lock: address=${normalizedAddress}, sessionKey=${sessionKey}`); | ||
| throw redisError; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also on Tuesday the team decided to go with Fail Open approach so we don't need to throw here but rather return null and ignore.
| } | ||
| } | ||
| } catch (error) { | ||
| const redisError = new RedisCacheError(error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's not wrap it into RedisCacheError so we can avoid confusion
| * @param sessionKey - The session key to remove. | ||
| * @param address - The address (for logging). | ||
| */ | ||
| private async cleanupFromQueue(queueKey: string, sessionKey: string, address: string): Promise<void> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we repurpose this and make it usable for both removing it after successful acquisition or fail acquisition? rPop(queueKey) and lRem(queueKey, 1, sessionKey) technically work the same, right? If it makes sense to repurpose for both successful and fail cases we can have it triggered in the finally() block of the acquireLock() method, it'll be a little cleaner.
| this.maxLockHoldMs = ConfigService.get('LOCK_MAX_HOLD_MS' as any) as number; | ||
| this.pollIntervalMs = ConfigService.get('LOCK_QUEUE_POLL_INTERVAL_MS' as any) as number; | ||
| this.keyPrefix = ConfigService.get('LOCK_REDIS_PREFIX' as any) as string; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to add as any to these calls? ConfigService should automatically define the type based on what we put for these configs in GlobalConfig object, right?
Description
This PR introduces the class that will implement locks in Redis in order to support the new Nonce Ordering feature
Related issue(s)
Fixes #4594
Changes from original design (optional)
N/A
Additional work needed (optional)
N/A
Checklist