MDEV-39013: Fix flaky main.tmp_space_usage test#4765
MDEV-39013: Fix flaky main.tmp_space_usage test#4765varundeepsaini wants to merge 1 commit intoMariaDB:11.8from
Conversation
gkodinov
left a comment
There was a problem hiding this comment.
Thank you for your contribution! This is a preliminary review.
Mostly LGTM, but please rebase on 11.8
76f7e7f to
feca392
Compare
gkodinov
left a comment
There was a problem hiding this comment.
LGTM. Please stand by for the final review.
|
@varundeepsaini have you been able to reproduce this locally? It appears that 11.8 may be unaffected, buildbot is unaware of such failures in 11.8. Do you have an explanation why only MSAN builds are affacted?
It must've get deflated at the end of the statement, before switch to connection InnoDB must not be relevant here. FWICS it never updates
I doubt the same pattern can be applied here: compared to So far I was unable to reproduce it locally, can't tell what's going wrong yet. |
|
@svoj No, I haven't been able to reproduce it locally either. My original explanation was wrong. I looked at the buildbot cross-reference (https://buildbot.mariadb.org/cr/?branch=&revision=&platform=&dt=&bbnum=&typ=&info=&test_name=main.tmp_space_usage&test_variant=&info_text=&failure_text=&limit=50) and found 3 failures, all on
Build 9332 is the interesting one. MSAN says the read at if (thd->status_var.tmp_space_used + size_change >
thd->variables.max_tmp_space_usage && !no_error &&
thd->variables.max_tmp_space_usage)The MSAN origin trace points to
My fix avoids the symptom but doesn't fix this root cause. This probably needs someone with an MSAN build to look at the Can you reproduce with MSAN locally? |
|
@varundeepsaini no, I haven't been able to reproduce it locally either. I use MSAN docker image as described in https://mariadb.com/docs/server/server-management/install-and-upgrade-mariadb/compiling-mariadb-from-source/legacy-guides/compile-and-using-mariadb-with-sanitizers-asan-ubsan-tsan-msan. The mysqld.cc:3830 line of c5f6fd3 you're referring to is different: Line 3830 in c5f6fd3 |
|
@varundeepsaini I was able to reproduce it locally and posted my analysis to MDEV-39013. You were right that these 16KB come from I_S internal temporary table. However they're reset at the end of the statement. With MDEV-38019 client is released earlier, before Your approach is viable and we may use it, but there're other ways to get it fixed. We need to devise some general way to handle issues cause by MDEV-38019. This is to be discussed with MariaDB developers team. If you're willing to keep this pull request as an option, commit message has to be updated such that it gives proper explanation of the problem. Optionally retarget for 12.3 (we can do it ourselves). |
|
@svoj I wonder if wait_until_count_sessions.inc would help in this case. When the control is returned to the client it looks like the session is still counted in Threads_connected. |
|
This pull request is now waiting for feedback to #4684 (comment). Specifically for mtr framework that'd allow to fix such failures. |
gkodinov
left a comment
There was a problem hiding this comment.
While waiting for the final review, please address the licensing check.
3598859 to
99cce15
Compare
Reading information_schema.session_status with tmp_memory_table_size=0 forces the query's internal temp table to disk, which changes tmp_space_used as a side effect of reading it. By the time connection c1 reads the value from processlist, it sees the inflated value. Fix by temporarily setting tmp_memory_table_size high before the session_status read, same pattern already used earlier in the test (lines 72-77) for the same reason. Signed-off-by: Varun Deep Saini <varun.23bcs10048@ms.sst.scaler.com> Signed-off-by: Varun Deep Saini <deepsainivarun@gmail.com>
99cce15 to
7df6aca
Compare
Summary
information_schema.session_statuswithtmp_memory_table_size=0forces the query's internal temp table to disk, which changestmp_space_usedas a side effect of reading it. By the time connection c1 reads the value fromprocesslist, it sees the inflated value (observed diff: exactly 16384 bytes = one InnoDB page).tmp_memory_table_sizehigh before thesession_statusread — same pattern already used earlier in the test (lines 72-77) for the same reason.Test plan
main.tmp_space_usagepasses locally (500/500 runs)