-
-
Notifications
You must be signed in to change notification settings - Fork 68
Closed
sebadob/hiqlite
#260Labels
bugSomething isn't workingSomething isn't working
Description
Describe the bug
I've setup a cluster with 3 nodes, what I did was stopping, waiting and starting nodes. There was always at least one node online. Restarting some nodes is corrupting the database/state of one node. It's now not able to start the entire database is corrupted on that node.
To Reproduce
- Setup a cluster with 3 nodes.
- Stop Node 2
- Wait a few seconds
- Stop node 3
- Wait a few seconds
- Node 1 was Running. Start Node 3
- Stop Node 3
- Wait a few seconds
- Start Node 2
- Wait a few seconds
- Start Node 3
- Then the Node 3 was corrupted
Expected behavior
A cluster should be more bullet proof in regards to clean restarts.
Additional context
After restarting the leader node, everything worked as intended.
rauthy |
rauthy |
rauthy | 88
rauthy | ,d 88
rauthy | 88 88
rauthy | 8b,dPPYba, ,adPPYYba, 88 88 MM88MMM 88,dPPYba, 8b d8
rauthy | 88P' "Y8 "" `Y8 88 88 88 88P' "8a `8b d8'
rauthy | 88 ,adPPPPP88 88 88 88 88 88 `8b d8'
rauthy | 88 88, ,88 "8a, ,a88 88, 88 88 `8b,d8'
rauthy | 88 `"8bbdP"Y8 `"YbbdP'Y8 "Y888 88 88 Y88'
rauthy | d8'
rauthy | d8'
rauthy |
rauthy | 2025-09-22T10:47:00.068382Z INFO rauthy: Starting Rauthy v0.32.3-20250915 (2025-09-15 08:10:42 UTC)
rauthy | 2025-09-22T10:47:00.068415Z INFO rauthy: Log Level set to 'INFO'
rauthy | 2025-09-22T10:47:00.068422Z INFO rauthy_data::rauthy_config: Public URL: auth.ihate.tech
rauthy | 2025-09-22T10:47:00.068427Z INFO rauthy_data::rauthy_config: Issuer: https://auth.ihate.tech/auth/v1
rauthy | 2025-09-22T10:47:00.068574Z WARN hiqlite_wal::log_store: LockFile data/logs exists already - this is not a clean start!
rauthy | 2025-09-22T10:47:00.068989Z INFO check_integrity:check_repair_data_integrity: hiqlite_wal::wal: Starting detailed WalFile integrity check for data/logs/0000000000000001.wal
rauthy | 2025-09-22T10:47:00.069171Z INFO check_integrity:check_repair_data_integrity: hiqlite_wal::wal: No unexpected additional data found in data/logs/0000000000000001.wal
rauthy | 2025-09-22T10:47:00.069340Z INFO hiqlite::store::state_machine::sqlite::state_machine: db_exists in state_machine::new(): true
rauthy | 2025-09-22T10:47:00.069501Z WARN hiqlite::store::state_machine::sqlite::state_machine: Lock file already exists: data/state_machine/lock
rauthy | Node did not shut down gracefully - auto-rebuilding State Machine
rauthy | 2025-09-22T10:47:00.150448Z INFO hiqlite::store::state_machine::sqlite::writer: Starting snapshot restore from data/state_machine/snapshots/019970fb-9b48-7952-ac10-bc05fcb07e74
rauthy | Database restore remaining: 50
rauthy | Database restore remaining: 0
rauthy | 2025-09-22T10:47:00.154905Z INFO hiqlite::store::state_machine::sqlite::writer: Snapshot restore finished after 4 ms
rauthy | 2025-09-22T10:47:00.155753Z INFO hiqlite::store::state_machine::sqlite::state_machine: applied_state: StateMachineData { last_applied_log_id: Some(LogId { leader_id: LeaderId { term: 1, node_id: 1 }, index: 11 }), last_membership: StoredMembership { log_id: Some(LogId { leader_id: LeaderId { term: 0, node_id: 0 }, index: 0 }), membership: Membership { configs: [{1}], nodes: {1: Node { id: 1, addr_raft: "192.168.1.12:8100", addr_api: "192.168.1.12:8200" }} } }, last_snapshot_id: Some("019970fb-9b48-7952-ac10-bc05fcb07e74") }
rauthy | 2025-09-22T10:47:00.155810Z INFO openraft::storage::helper: get_initial_state vote=T4-N3:committed last_purged_log_id=None last_applied=T1-N1-11 committed=None last_log_id=T4-N3-241
rauthy | 2025-09-22T10:47:00.156070Z INFO hiqlite::store::state_machine::sqlite::state_machine: applied_state: StateMachineData { last_applied_log_id: Some(LogId { leader_id: LeaderId { term: 1, node_id: 1 }, index: 11 }), last_membership: StoredMembership { log_id: Some(LogId { leader_id: LeaderId { term: 0, node_id: 0 }, index: 0 }), membership: Membership { configs: [{1}], nodes: {1: Node { id: 1, addr_raft: "192.168.1.12:8100", addr_api: "192.168.1.12:8200" }} } }, last_snapshot_id: Some("019970fb-9b48-7952-ac10-bc05fcb07e74") }
rauthy | 2025-09-22T10:47:00.156104Z INFO openraft::storage::helper: load membership from log: [12..242)
rauthy | 2025-09-22T10:47:00.156576Z INFO openraft::storage::helper: load key log ids from (None,T4-N3-241]
rauthy | 2025-09-22T10:47:00.156681Z ERROR hiqlite_wal::reader: Error reading logs: Generic("`id_from` is below threshold - id_from: 0 / self.id_from: 12")
rauthy | 2025-09-22T10:47:00.156721Z INFO openraft::core::tick: Timer shutdown signal sent: Ok(())
rauthy |
rauthy |
rauthy | thread 'main' panicked at /usr/local/cargo/registry/src/index.crates.io-1949cf8c6b5b557f/hiqlite-0.10.1/src/store/mod.rs:92:6:
rauthy | Raft create failed: StorageError(Defensive { source: DefensiveError { subject: LogIndex(0), violation: LogIndexNotFound { want: 0, got: None }, backtrace: None } })
rauthy | note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
rauthy | 2025-09-22T10:47:00.157246Z INFO openraft::core::tick: TickLoop received cancel signal, quit
Metadata
Metadata
Assignees
Labels
bugSomething isn't workingSomething isn't working