-
Notifications
You must be signed in to change notification settings - Fork 93
fix: Use the correct db options in db-tools #2565
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Centralized RocksDB configuration in db_options module to ensure consistent settings across all column families and improve maintainability.
Added centralized column family configuration and improved database tools with proper options handling.
…opening - Reverted dbtool.rs to main branch (no changes) - Added mandatory db type validation for each command - Optimized read-only operations to only open required column families - Removed create-if-missing for all read-only operations - Added open_db_cf_readonly() to open only specific column families
Critical fix: When opening column families in read-only mode, we must use the same options including merge operators that were used when writing. Without merge operators, reads could fail or return incorrect data if there are uncompacted merge operations in the database. - Updated open_db_cf_readonly to get proper options for each CF - Added merge operators for blob_info and per_object_blob_info CFs - Properly handle all known column family types - Use DB::open_cf_descriptors_read_only to pass CF-specific options
- Create DbColumnFamily enum to encapsulate column family names and options - Ensure read-only operations use proper merge operators for uncompacted data - Optimize read-only operations to open only required column families - Add db type sanity checks for all commands in dbtool_v2 - Update CLAUDE.md with formatting rules - Add documentation for all enum variants Co-Authored-By: Claude <[email protected]>
- Remove DbType enum from db_options.rs - Use DbTypeArg directly in dbtool_v2.rs without conversion - Remove unused imports (PathBuf, DatabaseDef) - Simplify code by eliminating unnecessary type conversion Co-Authored-By: Claude <[email protected]>
- Update RepairDb to open database with all column families and proper options after repair - Update DropColumnFamilies to use open_db_for_write with proper CF options - Add get_all_possible_column_families to detect DB type and get all expected CFs - Add open_db_for_write to open database with write access and proper CF options - Ensure both commands work with any database type (Main, EventProcessor, EventBlobWriter) - Create missing column families with correct options during repair Co-Authored-By: Claude <[email protected]>
- Move DbTypeArg from dbtool_v2 to db_options and rename to DbType - Add DbType::detect_from_path() to auto-detect database type - Update get_all_possible_column_families to take DbType parameter - Update open_db_for_write to take DbType parameter - RepairDb and DropColumnFamilies now auto-detect DB type - Cleaner API with explicit database type specification Co-Authored-By: Claude <[email protected]>
…ations Remove auto-detection of database type and make it a mandatory parameter for RepairDb and DropColumnFamilies commands to ensure users explicitly specify which database they're operating on. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
sadhansood
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks He, I have some comments. Also, should we move db_column_family.rs and db_options.rs into /node/storage. I also feel like we should have both db_tool.rs and dbtool_v2.rs under /node/tools or something similar.
| } else if let Some(shard_str) = name.strip_prefix("primary_sliver_") { | ||
| if let Ok(shard) = shard_str.parse::<u16>() { | ||
| Self::PrimarySliver(shard) | ||
| } else { | ||
| Self::Custom(name.to_string()) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think walrus uses "shard-<id>/primary-slivers" names for primary slivers.
| } else if let Some(shard_str) = name.strip_prefix("secondary_sliver_") { | ||
| if let Ok(shard) = shard_str.parse::<u16>() { | ||
| Self::SecondarySliver(shard) | ||
| } else { | ||
| Self::Custom(name.to_string()) | ||
| } | ||
| } else if let Some(shard_str) = name.strip_prefix("shard_status_") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, "shard-/secondary-slivers"` for secondary nodes
| Self::Custom(name.to_string()) | ||
| } | ||
| } else if let Some(shard_str) = name.strip_prefix("shard_status_") { | ||
| if let Ok(shard) = shard_str.parse::<u16>() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, if we need to parse the shard index we can reuse id_from_column_family_name if we needed to.
| } | ||
| Self::PrimarySliver(_) | Self::SecondarySliver(_) | Self::PendingRecoverSlivers(_) => { | ||
| // Sliver column families use the same shard options. | ||
| db_config.shard().to_options() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a different db_config.pending_recover_slivers in the file, why not use it?
| // These don't have special options defined. | ||
| let mut opts = Options::default(); | ||
| opts.set_allow_mmap_reads(true); | ||
| opts | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is also different db_config.shard_status()
| opts.set_allow_mmap_reads(true); | ||
| opts | ||
| } | ||
| Self::EventStore | Self::InitState => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, and db_config.shard_sync_progress()
| // Event blob writer CFs don't have special merge operators. | ||
| let mut opts = Options::default(); | ||
| opts.set_allow_mmap_reads(true); | ||
| opts |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
db_config.certified()/attested()/pending()/failed_to_attest() exists too
| // Default options for unknown column families. | ||
| let mut opts = Options::default(); | ||
| opts.set_allow_mmap_reads(true); | ||
| opts | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is existing db options for bunch of other column families, i think we should use those for consistency
| } | ||
| Self::ShardStatus(_) | Self::ShardSyncProgress(_) => { | ||
| // These don't have special options defined. | ||
| let mut opts = Options::default(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, instead of Options::default() i wonder if we need to use Options::from(&db_config.global()) everywhere
|
This PR is stale because it has been open 14 days with no activity. It will be closed in 7 days unless you remove the |
|
This PR is stale because it has been open 14 days with no activity. It will be closed in 7 days unless you remove the |
|
This PR was closed because it has not seen any activity since being marked as stale. |
Description
Extract the db cf options, so that we can use the same options in db-tool.
The current one is kept until we are sure the new one is 100% correct.
Contributes to WAL-940
Test plan
How did you test the new or updated feature?
Release notes
Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.
For each box you select, include information after the relevant heading that describes the impact of your changes that
a user might notice and any actions they must take to implement updates. (Add release notes after the colon for each item)