Summary
We experienced persistent RAFT corruption in our 3-node JetStream cluster (NATS v2.9.8) that caused streams to lose quorum and JetStream to shut down with
Environment
NATS Server:
JetStream Configuration:
Kubernetes:
Server-Side Errors
NATS logs on
Also seeing this:
Other pods in the cluster reported:
Additional Context
Recovery Attempts
Attempt 1: Restart Pod
Attempt 2: Delete Stream & Pod
Attempt 3: Delete PVC for Pod 0
Now when I restart, I don't even see the "restored" log line for the affected stream, like this one:
Any suggestions on how to debug, fix, or mitigate such issues going forward would be very welcome.
Replies: 1 comment 1 reply
2.9.x is now very old, unsupported, and hundreds of bug fixes behind; we have invested a lot of time in improving the areas you mention. You need to upgrade to 2.12.x; we can't help with such old versions.