Support commit ts in TiFlash#10723
Conversation
Review Completed ✅Status: 4 issues identified and posted as PR review comments Findings Summary:
Review Links:
Issues Found:
Recommendation: Address the P0 and P1 issues before merging to ensure the code builds successfully. |
📝 WalkthroughWalkthroughAdds support for TiDB's hidden extra commit_ts column (ColumnID -5) by aliasing it to the internal version column across storage, filter, remote/schema generation, and expression analysis; includes casts after table-scan when requested types differ and adds tests covering aliasing and casting. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Coprocessor as DAG/Coprocessor
participant Storage as DeltaMerge/Storage
participant Filter as FilterParser/RSOperator
Client->>Coprocessor: Request scan referencing extra_commit_ts (col -5)
Coprocessor->>Storage: Resolve columns for table scan (map -5 -> version)
Storage-->>Coprocessor: Return schema with version column
Coprocessor->>Coprocessor: genNamesAndTypes/GenSchema (emit version name)
Coprocessor->>Coprocessor: DAGExpressionAnalyzer (if requested type != actual) add cast after table-scan
Client->>Filter: Push-down filter uses col -5
Filter->>Storage: Remap -5 -> version_col_id for evaluation
Filter-->>Client: Filter applied using version column
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@dbms/src/Flash/Coprocessor/GenSchemaAndColumn.cpp`:
- Around line 136-138: The throw in the switch case for
MutSup::extra_commit_ts_col_id in GenSchemaAndColumn.cpp omits an ErrorCodes
value; update the throw to include a DB::ErrorCodes enum (e.g.,
ErrorCodes::NOT_IMPLEMENTED) so it follows the pattern throw Exception("Not
supported in disaggregated read now", ErrorCodes::NOT_IMPLEMENTED); — locate the
case handling MutSup::extra_commit_ts_col_id and replace the current throw
Exception(...) with one that passes the appropriate ErrorCodes::... constant.
In `@dbms/src/Storages/MutableSupport.h`:
- Line 46: Rename the snake_case constant extra_commit_ts_col_id to camelCase
extraCommitTsColId and update every usage and reference accordingly; locate the
declaration inline static constexpr ColumnID extra_commit_ts_col_id in
MutableSupport.h and change the identifier, then update all call sites, static
casts, comments, and any tests or documentation that use extra_commit_ts_col_id
so they reference extraCommitTsColId to comply with the project's naming
guideline.
In `@dbms/src/Storages/tests/gtest_hidden_commit_ts_column.cpp`:
- Line 46: Update the inline comment that reads "Maybe another test has already
registed, ignore exception here." to correct the typo: change "registed" to
"registered" so it reads "Maybe another test has already registered, ignore
exception here." — edit this comment in gtest_hidden_commit_ts_column.cpp (the
comment string above the exception-handling block) to fix the spelling only.
| case MutSup::extra_commit_ts_col_id: | ||
| throw Exception("Not supported in disaggregated read now"); | ||
| default: |
There was a problem hiding this comment.
Add an ErrorCodes value to this Exception.
The current throw omits an error code, which makes error handling inconsistent.
🔧 Suggested fix
- case MutSup::extra_commit_ts_col_id:
- throw Exception("Not supported in disaggregated read now");
+ case MutSup::extra_commit_ts_col_id:
+ throw Exception("Not supported in disaggregated read now", ErrorCodes::NOT_IMPLEMENTED);🤖 Prompt for AI Agents
In `@dbms/src/Flash/Coprocessor/GenSchemaAndColumn.cpp` around lines 136 - 138,
The throw in the switch case for MutSup::extra_commit_ts_col_id in
GenSchemaAndColumn.cpp omits an ErrorCodes value; update the throw to include a
DB::ErrorCodes enum (e.g., ErrorCodes::NOT_IMPLEMENTED) so it follows the
pattern throw Exception("Not supported in disaggregated read now",
ErrorCodes::NOT_IMPLEMENTED); — locate the case handling
MutSup::extra_commit_ts_col_id and replace the current throw Exception(...) with
one that passes the appropriate ErrorCodes::... constant.
|
|
||
| inline static constexpr ColumnID extra_handle_id = -1; | ||
| inline static constexpr ColumnID extra_table_id_col_id = -3; | ||
| inline static constexpr ColumnID extra_commit_ts_col_id = -5; |
There was a problem hiding this comment.
Rename to camelCase to match naming guideline.
extra_commit_ts_col_id introduces a new snake_case identifier.
🔧 Suggested rename (plus update all usages)
- inline static constexpr ColumnID extra_commit_ts_col_id = -5;
+ inline static constexpr ColumnID extraCommitTsColId = -5;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| inline static constexpr ColumnID extra_commit_ts_col_id = -5; | |
| inline static constexpr ColumnID extraCommitTsColId = -5; |
🤖 Prompt for AI Agents
In `@dbms/src/Storages/MutableSupport.h` at line 46, Rename the snake_case
constant extra_commit_ts_col_id to camelCase extraCommitTsColId and update every
usage and reference accordingly; locate the declaration inline static constexpr
ColumnID extra_commit_ts_col_id in MutableSupport.h and change the identifier,
then update all call sites, static casts, comments, and any tests or
documentation that use extra_commit_ts_col_id so they reference
extraCommitTsColId to comply with the project's naming guideline.
| } | ||
| catch (DB::Exception &) | ||
| { | ||
| // Maybe another test has already registed, ignore exception here. |
There was a problem hiding this comment.
Typo: "registed" → "registered".
- // Maybe another test has already registed, ignore exception here.
+ // Maybe another test has already registered, ignore exception here.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Maybe another test has already registed, ignore exception here. | |
| // Maybe another test has already registered, ignore exception here. |
🤖 Prompt for AI Agents
In `@dbms/src/Storages/tests/gtest_hidden_commit_ts_column.cpp` at line 46, Update
the inline comment that reads "Maybe another test has already registed, ignore
exception here." to correct the typo: change "registed" to "registered" so it
reads "Maybe another test has already registered, ignore exception here." — edit
this comment in gtest_hidden_commit_ts_column.cpp (the comment string above the
exception-handling block) to fix the spelling only.
| has_cast = true; | ||
| } | ||
|
|
||
| if (may_need_add_cast_column[i] && table_scan_columns[i].id == ExtraCommitTSColumnID) |
There was a problem hiding this comment.
P0 - Undefined ExtraCommitTSColumnID breaks build
ExtraCommitTSColumnID is referenced but not defined anywhere in the codebase, causing build failure.
Evidence: This file is part of the default flash_service library build. A repo-wide search shows only uses (no definition) at line 1261 and in test file.
Likely fix: Use MutSup::extra_commit_ts_col_id (defined in Storages/MutableSupport.h:46).
| ColumnID col_id = col_info.id; | ||
| // TiDB may request a hidden commit_ts column in TableScan with a special ColumnID. | ||
| // In TiFlash it is stored in `_INTERNAL_VERSION` (VersionColumnID), so create an alias mapping. | ||
| if (unlikely(col_id == MutSup::extra_commit_ts_col_id)) |
There was a problem hiding this comment.
P2 - Rough set filter commit_ts alias uses wrong map key
RSOperator::build intends to alias TiDB's hidden commit_ts ColumnID (-5) to TiFlash's internal version ColumnID (-1024), but only populates the map under key -1024. Later, FilterParser resolves ColumnRef to the original ColumnID (-5) and calls id_to_attr.at(-5), which will throw std::out_of_range if a filter references commit_ts.
Evidence: Line 76 rewrites col_id from -5 to -1024, but map insertion at line 89 uses column_id_to_attr[cd.id], creating key -1024 only. Later lookup at FilterParser.cpp:168 uses the original -5.
Suggested fix: Populate the map under both keys, or adjust the lookup to use the aliased key.
| #include <IO/Buffer/WriteBufferFromString.h> | ||
| #include <Interpreters/Context.h> | ||
| #include <Storages/DeltaMerge/DeltaMergeDefines.h> | ||
| #include <Storages/DeltaMerge/Filter/PushDownFilter.h> |
There was a problem hiding this comment.
P1 - New gtest does not compile (multiple issues)
This test file has several compilation errors that block the gtests_dbms build:
- Line 23: Includes non-existent header
<Storages/DeltaMerge/Filter/PushDownFilter.h>(directory containsPushDownExecutor.*instead) - Lines 63, 92, 94, 225: References undefined constants
ExtraCommitTSColumnID,VERSION_COLUMN_ID,VERSION_COLUMN_NAME,VERSION_COLUMN_TYPE, and non-existentDM::PushDownFilter::build(...)API - Line 215: Calls
DAGQueryInfoconstructor with wrong signature (missingfts_query_infoparameter)
Evidence: Current code uses DB::MutSup::{extra_commit_ts_col_id, version_col_id, version_column_name, getVersionColumnType()} and DM::getVersionColumnDefine() instead.
|
@pantheon-ai[bot]: adding LGTM is restricted to approvers and reviewers in OWNERS files. DetailsIn response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
| break; | ||
| } | ||
| case MutSup::extra_commit_ts_col_id: | ||
| throw Exception("Not supported in disaggregated read now"); |
There was a problem hiding this comment.
P2 - Disaggregated read rejects commit_ts
Disaggregated compute mode's TableScan path unconditionally throws when it encounters the hidden commit_ts ColumnID (MutSup::extra_commit_ts_col_id, -5), so any DAG request that includes that column will fail in disaggregated mode.
Evidence:
- Line 137:
case MutSup::extra_commit_ts_col_id: throw Exception(\"Not supported in disaggregated read now\"); - Called from
StorageDisaggregatedRemote.cpp:759and:833 - Non-disaggregated paths map -5 to
_INTERNAL_VERSIONinstead of rejecting it
Suggested fix: Implement the same aliasing behavior (-5 → version column) for disaggregated mode instead of throwing.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
dbms/src/Storages/tests/gtest_hidden_commit_ts_column.cpp (1)
62-245: Use camelCase for C++ local variables in this file.Several new locals are snake_case (for example
commit_ts_ci,table_scan_column_info,pushed_down_filters,runtime_filter_ids). Please align with repo C++ naming convention for consistency.As per coding guidelines: “Method and variable names in C++ must use camelCase (e.g.,
readBlock,totalBytes).”🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dbms/src/Storages/tests/gtest_hidden_commit_ts_column.cpp` around lines 62 - 245, The test uses snake_case local variable names; rename them to camelCase everywhere to follow C++ conventions—e.g., change commit_ts_ci -> commitTsCi, table_scan_column_info -> tableScanColumnInfo, pushed_down_filters -> pushedDownFilters, may_need_add_cast_column -> mayNeedAddCastColumn, casted_columns -> castedColumns, extra_cast -> extraCast, scan_column_infos -> scanColumnInfos, runtime_filter_ids -> runtimeFilterIds, table_column_defines -> tableColumnDefines (and any other snake_case locals) and update all references (calls to getDataTypeByColumnInfoForComputingLayer, PushDownExecutor::build, DM::RSOperator::build, DAGQueryInfo construction, analyzer.buildExtraCastsAfterTS, etc.) so identifiers and their usages remain consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@dbms/src/Storages/tests/gtest_hidden_commit_ts_column.cpp`:
- Around line 62-245: The test uses snake_case local variable names; rename them
to camelCase everywhere to follow C++ conventions—e.g., change commit_ts_ci ->
commitTsCi, table_scan_column_info -> tableScanColumnInfo, pushed_down_filters
-> pushedDownFilters, may_need_add_cast_column -> mayNeedAddCastColumn,
casted_columns -> castedColumns, extra_cast -> extraCast, scan_column_infos ->
scanColumnInfos, runtime_filter_ids -> runtimeFilterIds, table_column_defines ->
tableColumnDefines (and any other snake_case locals) and update all references
(calls to getDataTypeByColumnInfoForComputingLayer, PushDownExecutor::build,
DM::RSOperator::build, DAGQueryInfo construction,
analyzer.buildExtraCastsAfterTS, etc.) so identifiers and their usages remain
consistent.
ℹ️ Review info
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
dbms/src/Flash/Coprocessor/DAGExpressionAnalyzer.cppdbms/src/Flash/Coprocessor/GenSchemaAndColumn.cppdbms/src/Storages/DeltaMerge/Filter/RSOperator.cppdbms/src/Storages/tests/gtest_hidden_commit_ts_column.cpp
🚧 Files skipped from review as they are similar to previous changes (2)
- dbms/src/Flash/Coprocessor/DAGExpressionAnalyzer.cpp
- dbms/src/Storages/DeltaMerge/Filter/RSOperator.cpp
|
/cc @windtalker |
|
/cc @windtalker |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: solotzg, windtalker The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
[LGTM Timeline notifier]Timeline:
|
What problem does this PR solve?
Issue Number: close #10733
Problem Summary:
What is changed and how it works?
Check List
Tests
Side effects
Documentation
Release note
Summary by CodeRabbit
New Features
Tests