MDEV-38814 Skip root in opposite-intention check#4887
MDEV-38814 Skip root in opposite-intention check#4887iMineLink wants to merge 1 commit intoMariaDB:11.8from
Conversation
|
|
storage/innobase/btr/btr0cur.cc
Outdated
| /* Under SX, the root page must already be latched from the | ||
| original search_leaf() descent. Needing to freshly acquire it | ||
| here means a cascade escaped the space checks — only permissible | ||
| under X where any latch order is allowed (WL#6326 rule 3). */ | ||
| ut_ad(page_id.page_no() != index->page || | ||
| mtr->memo_contains_flagged(&index->lock, MTR_MEMO_X_LOCK)); |
There was a problem hiding this comment.
We get here if mtr->get_already_latched(page_id, rw_latch) failed. This function btr_cur_search_to_nth_level() is only being used for searching to non-leaf levels.
It would seem to me that WL#6326 rule (2.2) is more relevant here. I would suggest to write down the rules somewhere separately, because the original source could disappear at any time.
Why do we only care about holding exclusive index->lock when we are accessing the root page? I think that the answer is that this function is traversing from the root to the leaf, and it suffices to check the condition only on the root page? After all, we must be holding a latch on the parent page(s) when we goto search_loop.
Holding an exclusive root page latch along with a index->lock.u_lock() would certainly allow us to acquire any non-leaf page latch:
(2.2) When acquiring a node pointer page latch at level L, we must hold the left sibling page latch (at level L) or some ancestor latch (at level>L).
I did not think it through if the same would hold for any rw_latch mode that we are using here.
The first part of the assertion could be simplified. I care a little about debug build performance as well:
ut_ad(height != ULINT_UNDEFINED ||
mtr->memo_contains_flagged(&index->lock, MTR_MEMO_X_LOCK));There was a problem hiding this comment.
I think a partial answer is that the root is special in the sense that, being ancestor to all the other nodes in the tree, it makes rule (2.2) valid for every other non-leaf node if you have it X-latched under SX-lock of the index.
Otherwise, given that there is no sibling or ancestor to the root, you need for sure X-lock of the index to latch the root here as per rule (3), after the search_leaf() initial descent, which shall have retained at least some latches.
The only possibility under SX-lock that the root is latched here without violating (2.2) is if it's a brand new descent and no other latches are held, but I don't think this is possible after search_leaf() is executed (I'll see if assertions can help supporting this claim).
About rw_latch mode, it seems the only S-latch request comes from btr_page_get_father_node_ptr_for_validate() which requires X-lock on the index, therefore it should be currently safe to exclude rw_latch from the assertion.
btr_cur_need_opposite_intention() checks whether a structural change at a page (e.g., a page split inserting a node pointer into the parent, a merge with a sibling removing one, or a boundary change updating one) could cascade into its parent. Handling such a structural change would also require latching sibling pages at that level, which cannot be safely acquired under SX without risking deadlock with concurrent readers; hence the index lock must be upgraded from SX to X. The root has no parent and no siblings, so such cascades are impossible and no sibling latches are needed; return false early, avoiding unnecessary upgrades. A root page split (btr_root_raise_and_insert()) is a special case that changes the tree height, but it never cascades upward into a parent. Add a debug assertion in btr_cur_search_to_nth_level() that freshly acquiring the root page is only permitted under X-lock or when no page latch is held (brand-new descent). Under SX with existing page latches, the root must already be latched in the mtr, retained from the original search_leaf() descent. Add debug-only counters (Innodb_btr_need_opposite_intention_root and Innodb_btr_index_lock_upgrades) to track how often the root early return is taken and how often SX-to-X upgrades occur, exposed via SHOW GLOBAL STATUS in debug builds. Add a test that inserts and updates 1000 rows in a 4K-page table, verifying that the root early return fires and that index lock upgrades are reduced.
|
I added |
btr_cur_need_opposite_intention()checks whether a structural change at a page (e.g., a page split inserting a node pointer into the parent, a merge with a sibling removing one, or a boundary change updating one) could cascade into its parent. Handling such a structural change would also require latching sibling pages at that level, which cannot be safely acquired under SX without risking deadlock with concurrent readers; hence the index lock must be upgraded from SX to X. The root has no parent and no siblings, so such cascades are impossible and no sibling latches are needed; return false early, avoiding unnecessary upgrades.A root page split (
btr_root_raise_and_insert()) is a special case that changes the tree height, but it never cascades upward into a parent.Add a debug assertion in
btr_cur_search_to_nth_level()that freshly acquiring the root page is only permitted under X-lock or when no page latch is held (brand-new descent). Under SX with existing page latches, the root must already be latched in the mtr, retained from the originalsearch_leaf()descent.Add debug-only counters (
Innodb_btr_need_opposite_intention_rootandInnodb_btr_index_lock_upgrades) to track how often the root early return is taken and how often SX-to-X upgrades occur, exposed viaSHOW GLOBAL STATUSin debug builds.Add a test that inserts and updates 1000 rows in a 4K-page table, verifying that the root early return fires and that index lock upgrades are reduced.
Notes:
btr_cur_need_opposite_intention()calls returningtruewhen evaluated at the root level.btr_cur_search_to_nth_level()is added as an extra safety net to validate a pre-existing invariant.innodb.index_lock_upgradetest, whybtr_cur_optimistic_update()returnsDB_UNDERFLOWeven if the record "grows" in theUPDATE? To be checked