-
Notifications
You must be signed in to change notification settings - Fork 742
Include SubPlans when checking that a query is still distributed. #8388
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is ❌ Your patch check has failed because the patch coverage (61.90%) is below the target coverage (75.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #8388 +/- ##
==========================================
- Coverage 88.79% 88.78% -0.01%
==========================================
Files 287 287
Lines 63237 63256 +19
Branches 7928 7932 +4
==========================================
+ Hits 56149 56163 +14
- Misses 4756 4759 +3
- Partials 2332 2334 +2 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR fixes issue #8313 where queries with Citus tables that were reduced to read_intermediate_result() calls after recursive planning of CTEs and subqueries were incorrectly flagged as not requiring distributed planning. The fix enhances the distributed planning check in distributed_planner() hook to inspect both the range table and the plan's subplan list for distributed relations.
- Enhanced distributed planning check to include subplan inspection for
read_intermediate_result()function calls - Made
IsReadIntermediateResultFunction()publicly accessible for reuse - Added comprehensive regression tests covering various scenarios with distributed subplans
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/backend/distributed/planner/distributed_planner.c | Added PlanContainsDistributedSubPlanRTE() function to check for distributed subplans and updated CheckPostPlanDistribution() to use it |
| src/backend/distributed/planner/multi_logical_planner.c | Changed IsReadIntermediateResultFunction() from static to public visibility |
| src/include/distributed/multi_logical_planner.h | Added declaration for the newly exported IsReadIntermediateResultFunction() |
| src/test/regress/sql/subquery_in_where.sql | Added test cases for queries with distributed subplans including minimal repro and variants with redundant WHERE clauses |
| src/test/regress/expected/subquery_in_where.out | Added expected output for the new test cases verifying correct results |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
DESCRIPTION: tighten distributed plan check to cover subplans. The `distributed_planner()` hook checks, after calling `standard_planner()`, if the query still requires distributed planning - this was necessitated by issue #7782, #7783 where the citus tables in a query are optimized away by `standard_planner()`. However, issue #8313 exposed a case where we incorrectly flag a query as no long needing distributed planning; in this case, the citus relation(s) had been reduced to a read_intermediate_result() call after recursive planning of CTEs and subqueries. The fix involves including the plan's subplan list when checking the plan for a distributed relation - if a distributed subplan is found, the check can conclude the query requires distributed planning. Fixes issue: #8313
3d57c82 to
81c19f1
Compare
| FROM | ||
| users_table); | ||
| users_table) | ||
| ORDER BY id; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to reviewer: this change is not relevant to the fix, but this regress failed flakyness tests because of differences in the order of the results of this query. The ORDER BY stabilizes that.
|
@colm-mchugh looks like this is also an related issue: #8312 |
Yes it does, and the PR also fixes #8312, from testing locally. Thanks for pointing out! |
DESCRIPTION: tighten distributed plan check to cover subplans.
The
distributed_planner()hook checks, after callingstandard_planner(), if the query still requires distributed planning - this was necessitated by issue #7782, #7783 where the citus tables in a query are optimized away bystandard_planner().However, issue #8313 exposed a case where we incorrectly flag a query as no long needing distributed planning; in this case, the citus relation(s) had been reduced to a read_intermediate_result() call after recursive planning of CTEs and subqueries. The fix involves including the plan's subplan list when checking the plan for a distributed relation - if a distributed subplan is found, the check can conclude the query requires distributed planning.
Fixes issue: #8313