-
Notifications
You must be signed in to change notification settings - Fork 742
Harden columnar public views against cross-session temp relations #8252
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## m3hm3t/pg18_beta_confs #8252 +/- ##
==========================================================
+ Coverage 86.65% 88.99% +2.34%
==========================================================
Files 287 287
Lines 63189 63200 +11
Branches 7949 7951 +2
==========================================================
+ Hits 54756 56245 +1489
+ Misses 5931 4635 -1296
+ Partials 2502 2320 -182 🚀 New features to boost your workflow:
|
c332560 to
d8e5b54
Compare
d8e5b54 to
91b0f6b
Compare
|
Instead, let's fix the columnar views -the views inside columnar schema, not the tables in columnar_internal- in a way that they skip other sessions' temp relations always. For two reasons;
I can imagine a few tests directly executing get_storage_id() and those tests should be updated in a way that they stop directly calling the UDF but they should query columnar.storage as in |
91b0f6b to
bc8ab44
Compare
@onurctirtir thank you for feedback I updated the PR according to your suggestion. If you have time, could you review the PR? |
ced7e44 to
e0ccab3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I dropped one comment and one question below.
And also wondering if the test failures are related to this PR?
Also a note to myself: check the changes made in view definitions once again before approving.
| CREATE OR REPLACE FUNCTION get_storage_id_if_visible(rel regclass) | ||
| RETURNS bigint | ||
| LANGUAGE sql STABLE AS $$ | ||
| SELECT CASE | ||
| WHEN c.relpersistence = 't' | ||
| AND c.relnamespace <> pg_catalog.pg_my_temp_schema() | ||
| THEN NULL -- other session’s temp → don’t touch | ||
| ELSE columnar.get_storage_id(c.oid) | ||
| END | ||
| FROM pg_catalog.pg_class c | ||
| WHERE c.oid = $1::oid | ||
| $$; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems get_storage_id() was a function we expose to users like it's the case for columnar views. So, rather than having a new test helper function, I believe we should fix get_storage_id() itself as in what we have done for columnar views.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_storage_id() updated
test helper removed
| @@ -1,2 +1,44 @@ | |||
| -- citus_columnar--13.2-1--14.0-1 | |||
| -- bump version to 14.0-1 | |||
|
|
|||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder should create older versions of these views in downgrade path as we usually do, or, as we did in some of the similar bugfixes, should those stay as is even after a downgrade?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose if we keep this view, we’ll only “lose” rows that came from other sessions’ temp columnar tables which nobody should rely on, and which is exactly what bit on PG18.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, I can add a downgrade script to this PR and open a separate issue to decide later whether to keep or remove it.
What do you think on this @onurctirtir ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after discuss will add downgrade path
Pg18 beta conf file updated (cherry picked from commit c36410c) Update image suffix in build and test workflow Update image suffix in build configuration Update image suffix in build configuration Update image suffix in build configuration (cherry picked from commit 7dbb946) Update image suffix in build_and_test.yml to reflect latest development version Update PostgreSQL version to 18beta3 in Dockerfile and CI workflow
… WHERE … IS NOT NULL (PG15–PG18) (#8139) DESCRIPTION: Stabilize multi_insert_select expected: accept unqualified columns in WHERE … IS NOT NULL fixes #8133 **Context** * With PG18, ruleutils adds a GROUP RTE and improves column-name dedup. As a side-effect, Vars that point at the GROUP RTE print as unqualified column names even when `varprefix` is true. * In Citus’ vendored `ruleutils_18.c` we already flattened GROUP Vars in `targetList` and `havingQual`, but not in `jointree->quals`. * For queries like `INSERT … SELECT … GROUP BY …`, Citus injects an implicit null-guard on the group key in the WHERE clause. Because that Var was still referencing the GROUP RTE, the deparser emitted `WHERE (user_id IS NOT NULL)` instead of `WHERE (raw_events_first.user_id IS NOT NULL)`, causing regress diffs only in grouped SELECTs. * Related upstream change: PostgreSQL commit `52c707483ce4d0161127e4958d981d1b5655865e` (ruleutils column-name de-dup / GROUP RTE exposure). **What changed** * Added an alternative expected file `src/test/regress/expected/multi_insert_select_0.out` to keep CI green across mixed environments where the qualified form may still be produced.
The change in `merge_planner.c` fixes _unrecognized range table entry_ diffs in merge regress tests (category 2 diffs in #7992), the change in `multi_router_planner.c` fixes _column reference ... is ambiguous_ diffs in `multi_insert_select` and `multi_insert_select_window` (category 3 diffs in #7992). Edit to `common.py` enables standalone regress tests with pg18 (e..g `citus_tests/run_test.py merge`).
5f5da94 to
15ecb4b
Compare
15ecb4b to
5565920
Compare
Fixes #8235 PG18 and PG latest minors ignore temporary relations in `RelidByRelfilenumber` (`RelidByRelfilenode` in PG15) Relevant PG commit: postgres/postgres@86831952 Here we are keeping temp reloids instead of getting it with RelidByRelfilenumber, for example, in some cases, we can directly get reloid from relations, in other cases we keep it in some structures. Note: there is still an outstanding issue with columnar temp tables in concurrent sessions, that will be fixed in PR #8252
Backporting citusdata#8235 PG18 and PG latest minors ignore temporary relations in `RelidByRelfilenumber` (`RelidByRelfilenode` in PG15) Relevant PG commit: postgres/postgres@86831952 Here we are keeping temp reloids instead of getting it with RelidByRelfilenumber, for example, in some cases, we can directly get reloid from relations, in other cases we keep it in some structures. Note: there is still an outstanding issue with columnar temp tables in concurrent sessions, that will be fixed in PR citusdata#8252
Backporting citusdata#8235 PG18 and PG latest minors ignore temporary relations in `RelidByRelfilenumber` (`RelidByRelfilenode` in PG15) Relevant PG commit: postgres/postgres@86831952 Here we are keeping temp reloids instead of getting it with RelidByRelfilenumber, for example, in some cases, we can directly get reloid from relations, in other cases we keep it in some structures. Note: there is still an outstanding issue with columnar temp tables in concurrent sessions, that will be fixed in PR citusdata#8252 (cherry picked from commit daa69be)
Backporting citusdata#8235 PG18 and PG latest minors ignore temporary relations in `RelidByRelfilenumber` (`RelidByRelfilenode` in PG15) Relevant PG commit: postgres/postgres@86831952 Here we are keeping temp reloids instead of getting it with RelidByRelfilenumber, for example, in some cases, we can directly get reloid from relations, in other cases we keep it in some structures. Note: there is still an outstanding issue with columnar temp tables in concurrent sessions, that will be fixed in PR citusdata#8252 (cherry picked from commit daa69be)
Backporting citusdata#8235 PG18 and PG latest minors ignore temporary relations in `RelidByRelfilenumber` (`RelidByRelfilenode` in PG15) Relevant PG commit: postgres/postgres@86831952 Here we are keeping temp reloids instead of getting it with RelidByRelfilenumber, for example, in some cases, we can directly get reloid from relations, in other cases we keep it in some structures. Note: there is still an outstanding issue with columnar temp tables in concurrent sessions, that will be fixed in PR citusdata#8252 (cherry picked from commit daa69be)
e302318 to
42d95c8
Compare
8c37f62 to
d4fcdcb
Compare
bdd4c24 to
206fc6e
Compare
b515be7 to
141f8ff
Compare
667e83a to
8056bff
Compare
fixes #8249
On PostgreSQL 18, opening another backend’s temporary relation is rejected earlier. Our public columnar views (
columnar.storage, and those that join through it) could traverse other sessions’ temp tables and error withcannot access temporary tables of other sessions. Also, direct callers ofcolumnar.get_storage_id(regclass)could hit the same path on PG18.What changed
SQL migration:
citus_columnar--13.2-1--14.0-1.sqlRecreate
columnar.storagewith a temp-safe filter:Re-emit
columnar.stripe,columnar.chunk_group, andcolumnar.chunkwithOR REPLACEso they inherit the saferstorageand keepsecurity_barrier.C code:
columnar_metadata.ccolumnar.get_storage_id(regclass)to skip other sessions’ temp relations beforerelation_open(). For those, the function returnsNULL(SQL isSTRICT, so callers naturally drop such rows). Behavior on PG15–PG17 is unchanged.Downgrade note
We intentionally keep the hardened view definitions after downgrade as a bugfix (safer on all supported PGs).
I couldn't find related commit but found these discussions:
https://www.postgresql.org/message-id/2736425.1758475979%40sss.pgh.pa.us
https://www.postgresql.org/message-id/CAMEv5_syU0ZopE-2Wr8A8QksqrCyYT2hW06Rgw4RSPdyJO-%3Dfw%40mail.gmail.com
https://www.postgresql.org/message-id/flat/CAJDiXghdFcZ8=nh4G69te7iRr3Q0uFyXxb3ZdG09_GTNZXwH0g@mail.gmail.com