Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. Needs approval from an approver in each of the changed files.
Summary of Changes: This pull request enhances the workload testing tool by adding a new workload type designed to simulate large-row upsert scenarios. The new `staging_forward_index` workload will be useful for performance testing and benchmarking systems that frequently handle updates to substantial binary data fields.
📝 Walkthrough: Adds two new workload types — `staging_forward_index` and a BIS metadata workload — for simulating large-row upsert scenarios.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant App as WorkloadApp
    participant Builder as WorkloadBuilder
    participant DB as MySQL
    Client->>App: start workload run (type, params)
    App->>Builder: BuildInsertSqlWithValues(tableN, batchSize)
    Builder-->>App: SQL + []values
    App->>DB: ExecPrepared(SQL, values)
    DB-->>App: Exec result
    App-->>Client: report result/metrics
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers
🚥 Pre-merge checks: ❌ 3 failed checks (2 warnings, 1 inconclusive)
Actionable comments posted: 3
🧹 Nitpick comments (4)
tools/workload/schema/forwardindex/staging_forward_index.go (3)
190-194: Same `putRand` inconsistency in `BuildInsertSqlWithValues`. Apply `defer w.putRand(r)` here too (line 194 currently calls it eagerly), matching the `defer` pattern used in `BuildDeleteSql`.

♻️ Proposed fix
```diff
 func (w *StagingForwardIndexWorkload) BuildInsertSqlWithValues(tableN int, batchSize int) (string, []interface{}) {
 	tableName := getTableName(tableN)
 	r := w.getRand()
-	debugInfo, debugInfoHistory, adDoc := w.newPayload(r)
-	w.putRand(r)
+	defer w.putRand(r)
+	debugInfo, debugInfoHistory, adDoc := w.newPayload(r)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/workload/schema/forwardindex/staging_forward_index.go` around lines 190 - 194, The call to w.putRand(r) in BuildInsertSqlWithValues is done eagerly; change it to defer to match the pattern used in BuildDeleteSql so the random source is returned even if the function panics or returns early—specifically, after obtaining r from w.getRand() in BuildInsertSqlWithValues, replace the immediate w.putRand(r) call with defer w.putRand(r) so the resource is always released; reference the BuildInsertSqlWithValues function and the putRand/getRand helpers when making this change.
222-249: Inconsistent `putRand` pattern — use `defer` here as well. `BuildDeleteSql` (line 254) uses `defer w.putRand(r)`, but `BuildUpdateSqlWithValues` calls `w.putRand(r)` manually at line 244. If a panic occurs between `getRand` and `putRand`, the `*rand.Rand` leaks from the pool. For consistency and safety, prefer `defer`.

♻️ Proposed fix

```diff
 func (w *StagingForwardIndexWorkload) BuildUpdateSqlWithValues(opt schema.UpdateOption) (string, []interface{}) {
 	tableName := getTableName(opt.TableIndex)
 	r := w.getRand()
+	defer w.putRand(r)
 	debugInfo, debugInfoHistory, adDoc := w.newPayload(r)
 	var sb strings.Builder
@@ ...
 		values = append(values, debugInfo, debugInfoHistory, id, adDoc)
 	}
-	w.putRand(r)
 	sb.WriteString(strings.Join(placeholders, ","))
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/workload/schema/forwardindex/staging_forward_index.go` around lines 222 - 249, In BuildUpdateSqlWithValues, after acquiring r with getRand() defer return to the pool to avoid leaking the *rand.Rand on panic: replace the existing manual w.putRand(r) call with an immediate defer w.putRand(r) right after r := w.getRand(), ensuring the rest of the function (including calls to w.newPayload(r) and use of r) remains unchanged.
190-216: All rows in a batch share the same payload byte slices.
`newPayload` is called once (line 193) and the same `debugInfo`, `debugInfoHistory`, and `adDoc` slices are appended for every row in the batch (line 210). If the SQL driver reads values lazily or if `util.RandomBytes` is later changed to mutate in place for reuse, all rows would silently share or corrupt data. For a workload generator this is likely acceptable (it reduces allocations), but it is worth a brief comment so future maintainers know it's intentional.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/workload/schema/forwardindex/staging_forward_index.go` around lines 190 - 216, BuildInsertSqlWithValues currently calls newPayload once and reuses the same debugInfo, debugInfoHistory, and adDoc byte slices for every row in the batch, which can cause silent corruption if drivers read values lazily or if util.RandomBytes mutates buffers; either add a clear comment next to the newPayload call explaining that the reuse is intentional and safe for this workload, or make the loop create per-row copies (e.g., id stays generated per row, and use append([]byte(nil), debugInfo...), append([]byte(nil), debugInfoHistory...), append([]byte(nil), adDoc...) before appending to values) to avoid shared-mutable-slice issues.

tools/workload/schema/forwardindex/staging_forward_index_test.go (1)
9-89: Tests use `t.Fatalf` instead of `testify/require`. Per coding guidelines, test files should use `testify/require` for assertions. This also improves readability by reducing boilerplate.

♻️ Example refactor for the first test

```diff
+import (
+	"encoding/binary"
+	"strings"
+	"testing"
+
+	"github.com/stretchr/testify/require"
+)
+
 func TestStagingForwardIndexWorkloadRowSizing512K(t *testing.T) {
 	w := NewStagingForwardIndexWorkload(512*1024, 1000).(*StagingForwardIndexWorkload)
-
-	if w.debugInfoSize != maxBlobSize {
-		t.Fatalf("debugInfoSize mismatch: got %d, want %d", w.debugInfoSize, maxBlobSize)
-	}
-	if w.debugInfoHistorySize != maxBlobSize {
-		t.Fatalf("debugInfoHistorySize mismatch: got %d, want %d", w.debugInfoHistorySize, maxBlobSize)
-	}
-
-	wantAdDocSize := 512*1024 - 2*maxBlobSize
-	if w.adDocSize != wantAdDocSize {
-		t.Fatalf("adDocSize mismatch: got %d, want %d", w.adDocSize, wantAdDocSize)
-	}
+	require.Equal(t, maxBlobSize, w.debugInfoSize, "debugInfoSize")
+	require.Equal(t, maxBlobSize, w.debugInfoHistorySize, "debugInfoHistorySize")
+	require.Equal(t, 512*1024-2*maxBlobSize, w.adDocSize, "adDocSize")
 }
```

Apply the same pattern to the other two tests. As per coding guidelines, "Use unit test files named `*_test.go` in Go; favor deterministic tests and use `testify/require`".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/workload/schema/forwardindex/staging_forward_index_test.go` around lines 9 - 89, Replace t.Fatalf assertions with testify/require calls in TestStagingForwardIndexWorkloadRowSizing512K, TestStagingForwardIndexWorkloadRowSizingClamped, and TestStagingForwardIndexWorkloadBuildInsertSqlWithValues: add import "github.com/stretchr/testify/require" and use require.Equal/require.Len/require.Contains/require.IsType/require.True (and require.Equal for the binary seq check) instead of t.Fatalf, converting each failure case to the appropriate require assertion for clarity and brevity while preserving the same checked values (e.g., compare w.debugInfoSize to maxBlobSize, w.adDocSize to expected sizes, len(values) to expected count, type and length checks for []byte fields, and the BigEndian sequence check).
```go
case stagingForwardIndex:
	workload = pforwardindex.NewStagingForwardIndexWorkload(app.Config.RowSize, app.Config.TotalRowCount)
```
TotalRowCount is passed as updateKeySpace — semantic mismatch.
NewStagingForwardIndexWorkload expects (rowSize int, updateKeySpace uint64), but here app.Config.TotalRowCount (the total number of rows to insert) is passed as updateKeySpace (the key range for generating update IDs). This overloads an existing config field with a different meaning specific to this workload.
Consider adding a dedicated config field (e.g., UpdateKeySpace) or at minimum documenting this reuse clearly, so future maintainers don't accidentally change TotalRowCount semantics and break the update key distribution.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/workload/app.go` around lines 139 - 140, The code passes
app.Config.TotalRowCount into NewStagingForwardIndexWorkload as the
updateKeySpace, creating a semantic mismatch; add a dedicated config field
(e.g., UpdateKeySpace uint64) to the config struct and use that when
constructing the workload (replace app.Config.TotalRowCount with
app.Config.UpdateKeySpace in the call to NewStagingForwardIndexWorkload), or if
you cannot add a field now, document this intentional reuse clearly in the
config declaration and the call site (comment next to
NewStagingForwardIndexWorkload) to avoid future accidental changes; update any
config initialization/parsing to populate UpdateKeySpace accordingly and adjust
tests/usage that assumed TotalRowCount semantics.
```go
package forwardindex

import (
	"encoding/binary"
	"strings"
	"testing"
)
```
Missing copyright header — pipeline failure.
The CI check-copyright step is failing because this file lacks the standard Apache 2.0 license header. All other Go files in this PR include it (e.g., staging_forward_index.go lines 1–12).
🔧 Proposed fix — add copyright header
```diff
+// Copyright 2026 PingCAP, Inc.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//	http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
 package forwardindex
```
🧰 Tools
🪛 GitHub Actions: PR Build and Unit Test
[error] 1-1: check-copyright: The copyright information of following files is incorrect.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/workload/schema/forwardindex/staging_forward_index_test.go` around
lines 1 - 7, Add the standard Apache 2.0 license header to the top of
staging_forward_index_test.go (above the "package forwardindex" declaration)
matching the header used in staging_forward_index.go (lines 1–12) so the CI
check-copyright step passes; keep the existing imports (encoding/binary,
strings, testing) and code unchanged.
```go
func min(a, b int) int {
	if a < b {
		return a
	}
	return b
}
```
🧩 Analysis chain

🏁 Scripts executed (repository: pingcap/ticdc):

```shell
find . -name "go.mod" -type f | head -5
cat go.mod | head -20
cat -n tools/workload/schema/forwardindex/staging_forward_index.go | sed -n '200,210p;269,274p'
rg "func min" tools/workload/schema/forwardindex/
rg "\bmin\(" tools/workload/schema/forwardindex/ -A 2 -B 2
```
Remove custom min function and use the Go 1.21+ built-in.
The module targets Go 1.25.5 and already uses `for range batchSize` (line 204), a Go 1.22+ feature. The custom `min` function (lines 269–274) shadows the built-in `min` introduced in Go 1.21. Replace the two usages at lines 87–88 with the built-in.
♻️ Proposed fix
```diff
-func min(a, b int) int {
-	if a < b {
-		return a
-	}
-	return b
-}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/workload/schema/forwardindex/staging_forward_index.go` around lines 269
- 274, The repository defines a custom helper function named min(a, b int) int
that shadows Go's built-in min (Go 1.21+); remove this custom min implementation
and update its call sites (the two usages currently calling min at the forward
index batching logic) to use the built-in min instead, then delete the min
function declaration from staging_forward_index.go so the code uses the standard
library implementation.
Code Review
This pull request introduces a new staging_forward_index workload, which is a valuable addition for performance testing of large-row upserts. The implementation is generally well-structured and integrates cleanly with the existing workload framework. I've identified a couple of areas for improvement: a bug in the generation of DELETE statements that could lead to incorrect behavior, and a misleading implementation of an interface method that should be corrected for clarity and future maintainability. Addressing these points will enhance the robustness of the new workload.
```go
func (w *StagingForwardIndexWorkload) BuildDeleteSql(opts schema.DeleteOption) string {
	tableName := getTableName(opts.TableIndex)
	r := w.getRand()
	defer w.putRand(r)

	var sb strings.Builder
	for i := 0; i < opts.Batch; i++ {
		if i > 0 {
			sb.WriteString(";")
		}
		seq := (uint64(r.Int63()) % w.updateKeySpace) + 1
		id := make([]byte, pinPromotionIDSize)
		w.fillPromotionID(id, seq)
		sb.WriteString(fmt.Sprintf("DELETE FROM %s WHERE g_pin_promotion_id = 0x%x", tableName, id))
	}
	return sb.String()
}
```
The current implementation of BuildDeleteSql generates multiple DELETE statements separated by semicolons. This will likely not work as expected, because database drivers typically only execute the first statement in a multi-statement string unless a special mode is enabled. This could lead to only one row being deleted per batch instead of the intended number.
To fix this and improve efficiency, you should generate a single DELETE statement with an IN clause.
```go
func (w *StagingForwardIndexWorkload) BuildDeleteSql(opts schema.DeleteOption) string {
	tableName := getTableName(opts.TableIndex)
	r := w.getRand()
	defer w.putRand(r)
	if opts.Batch == 0 {
		return ""
	}
	var sb strings.Builder
	sb.WriteString(fmt.Sprintf("DELETE FROM %s WHERE g_pin_promotion_id IN (", tableName))
	for i := 0; i < opts.Batch; i++ {
		if i > 0 {
			sb.WriteString(",")
		}
		seq := (uint64(r.Int63()) % w.updateKeySpace) + 1
		id := make([]byte, pinPromotionIDSize)
		w.fillPromotionID(id, seq)
		sb.WriteString(fmt.Sprintf("0x%x", id))
	}
	sb.WriteString(")")
	return sb.String()
}
```

```go
func (w *StagingForwardIndexWorkload) BuildUpdateSql(opt schema.UpdateOption) string {
	return w.BuildInsertSql(opt.TableIndex, opt.Batch)
}
```
The BuildUpdateSql method is implemented to call BuildInsertSql, which generates new random UUIDs for the primary key. This means it will always perform inserts rather than updates, which is misleading for a method named BuildUpdateSql. While this code path may not be used for the staging_forward_index workload (which uses BuildUpdateSqlWithValues), it's part of the Workload interface and should be implemented correctly or explicitly marked as unsupported. To avoid confusion and potential bugs if this method is called in the future, I suggest making it panic.
Current:

```go
func (w *StagingForwardIndexWorkload) BuildUpdateSql(opt schema.UpdateOption) string {
	return w.BuildInsertSql(opt.TableIndex, opt.Batch)
}
```

Suggested:

```go
func (w *StagingForwardIndexWorkload) BuildUpdateSql(opt schema.UpdateOption) string {
	panic("BuildUpdateSql is not supported for staging_forward_index. Use BuildUpdateSqlWithValues.")
}
```
Actionable comments posted: 2
🧹 Nitpick comments (4)
tools/workload/schema/bis/bis_metadata.go (3)
499-508: `newConstBytes` can be replaced with `bytes.Repeat`. Minor simplification — `bytes.Repeat([]byte{value}, size)` achieves the same result in one line using the standard library.

♻️ Proposed simplification

```diff
+import "bytes"
+
-func newConstBytes(size int, value byte) []byte {
-	if size <= 0 {
-		return nil
-	}
-	buf := make([]byte, size)
-	for i := range buf {
-		buf[i] = value
-	}
-	return buf
-}
+func newConstBytes(size int, value byte) []byte {
+	if size <= 0 {
+		return nil
+	}
+	return bytes.Repeat([]byte{value}, size)
+}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/workload/schema/bis/bis_metadata.go` around lines 499 - 508, The helper newConstBytes duplicates functionality in the standard library; replace its implementation and all callers to use bytes.Repeat (from the bytes package) to create a slice filled with a single byte, remove the custom newConstBytes function, and add the bytes import where needed; ensure any callers expecting nil for non-positive sizes handle that case by using a conditional or by passing zero to bytes.Repeat which returns an empty slice.
459-487: `BuildDeleteSql` uses string interpolation instead of parameterized queries — inconsistent with insert/update builders. All other SQL-with-values builders (`BuildInsertSqlWithValues`, `BuildUpdateSqlWithValues`) use `?` placeholders and return value slices. `BuildDeleteSql` inlines values via `fmt.Sprintf` with `'%s'` and `%d`. While the values are internally generated (UUIDs, formatted strings, ints), this is still a fragile pattern and inconsistent with the rest of the file. Consider adding a `BuildDeleteSqlWithValues` variant (if the `Workload` interface allows it) or at least using parameterized queries here.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/workload/schema/bis/bis_metadata.go` around lines 459 - 487, BuildDeleteSql currently inlines values using fmt.Sprintf while other builders (BuildInsertSqlWithValues, BuildUpdateSqlWithValues) use parameterized SQL and return value slices; change BuildDeleteSql to produce parameterized SQL and a values slice (or add BuildDeleteSqlWithValues if interface permits). Concretely, replace fmt.Sprintf calls that build "DELETE FROM %s WHERE (`id` = '%s')" and the 3-column variant with statements using "?" placeholders (e.g. "DELETE FROM %s WHERE (`id` = ?)" and "DELETE FROM %s WHERE (`entity_id` = ?) AND (`entity_set_id` = ?) AND (`user_id` = ?)"), accumulate corresponding args (id or entityID, entitySetID, userID) in a []interface{} and return both the SQL string and the args slice (or add a new method BuildDeleteSqlWithValues that mirrors BuildInsertSqlWithValues/BuildUpdateSqlWithValues); refer to functions/vars randSeq, newID, userID, entitySetID, entityID, tableName, deleteByID to locate where to collect values.
510-535: Replace custom `min`/`max`/`minUint64`/`maxUint64` helpers with Go 1.21+ builtins. This project targets Go 1.25.5, which includes the generic built-in `min` and `max` functions. These custom helpers are redundant and shadow the builtins. Replace all usages with the built-in versions and remove these function definitions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/workload/schema/bis/bis_metadata.go` around lines 510 - 535, The custom helper functions min, max, minUint64, and maxUint64 are redundant because Go 1.21+ provides generic built-in min and max; remove these four function definitions, replace any calls with the corresponding builtins, and run `go vet`/`go test` to ensure no shadowing or import issues remain.

tools/workload/schema/bis/bis_metadata_test.go (1)
12-38: Prefer `testify/require` over raw `t.Fatalf` for assertions. All test functions in this file use `t.Fatalf` for assertions. Per coding guidelines, tests should use `testify/require` for clearer, more idiomatic assertions. For example:

```go
require.Equal(t, maxEntityMediaMetadataSize, w.entityMediaSize, "entityMediaSize mismatch")
```

This applies throughout the file. As per coding guidelines, "Use unit test files named `*_test.go` in Go; favor deterministic tests and use `testify/require`".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/workload/schema/bis/bis_metadata_test.go` around lines 12 - 38, Replace all t.Fatalf assertions in TestBISMetadataWorkloadRowSizingAndClamping with testify/require assertions: import "github.com/stretchr/testify/require" and use require.Equal(t, expected, actual, "message") (or require.Equalf) for comparisons of w and w2 fields (entityMediaSize, batchAuxSize, batchCjSize, batchMetaSize, perTableUpdateKeySpace) and the clamping checks against maxEntityMediaMetadataSize and maxBatchMetadataSize; keep the same messages but use require so tests are idiomatic and the expected/actual argument order matches require's signature.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tools/workload/schema/bis/bis_metadata.go`:
- Around line 406-422: The bug comes from calling r.Float64() multiple times in
the switch, which skews the intended probabilities; change the switch to sample
p := r.Float64() once (e.g., at the top of the branch where the switch currently
is) and then use cumulative comparisons like "case p <
entityUpdateByPinIDWeight:", "case p <
entityUpdateByPinIDWeight+entityUpdateByCompositeKeyWeight:" to decide between
the pinID, composite-key, and default update branches (matching the approach
used in buildBatchUpdateWithValues); keep the existing SQL strings and returned
parameter lists (pinID, userID/entityID/entitySetID, id, etc.) unchanged.
---
Nitpick comments:
In `@tools/workload/schema/bis/bis_metadata_test.go`:
- Around line 12-38: Replace all t.Fatalf assertions in
TestBISMetadataWorkloadRowSizingAndClamping with testify/require assertions:
import "github.com/stretchr/testify/require" and use require.Equal(t, expected,
actual, "message") (or require.Equalf) for comparisons of w and w2 fields
(entityMediaSize, batchAuxSize, batchCjSize, batchMetaSize,
perTableUpdateKeySpace) and the clamping checks against
maxEntityMediaMetadataSize and maxBatchMetadataSize; keep the same messages but
use require so tests are idiomatic and the expected/actual argument order
matches require's signature.
In `@tools/workload/schema/bis/bis_metadata.go`:
- Around line 499-508: The helper newConstBytes duplicates functionality in the
standard library; replace its implementation and all callers to use bytes.Repeat
(from the bytes package) to create a slice filled with a single byte, remove the
custom newConstBytes function, and add the bytes import where needed; ensure any
callers expecting nil for non-positive sizes handle that case by using a
conditional or by passing zero to bytes.Repeat which returns an empty slice.
- Around line 459-487: BuildDeleteSql currently inlines values using fmt.Sprintf
while other builders (BuildInsertSqlWithValues, BuildUpdateSqlWithValues) use
parameterized SQL and return value slices; change BuildDeleteSql to produce
parameterized SQL and a values slice (or add BuildDeleteSqlWithValues if
interface permits). Concretely, replace fmt.Sprintf calls that build "DELETE
FROM %s WHERE (`id` = '%s')" and the 3-column variant with statements using "?"
placeholders (e.g. "DELETE FROM %s WHERE (`id` = ?)" and "DELETE FROM %s WHERE
(`entity_id` = ?) AND (`entity_set_id` = ?) AND (`user_id` = ?)"), accumulate
corresponding args (id or entityID, entitySetID, userID) in a []interface{} and
return both the SQL string and the args slice (or add a new method
BuildDeleteSqlWithValues that mirrors
BuildInsertSqlWithValues/BuildUpdateSqlWithValues); refer to functions/vars
randSeq, newID, userID, entitySetID, entityID, tableName, deleteByID to locate
where to collect values.
- Around line 510-535: The custom helper functions min, max, minUint64, and
maxUint64 are redundant because Go 1.21+ provides generic min and max builtins;
remove these four function definitions, rewrite calls to minUint64/maxUint64 as
the builtin min/max (e.g., minUint64(a, b) becomes min(a, b)), let existing
min/max calls resolve to the builtins once the shadowing definitions are gone,
and run `go vet`/`go test` to ensure no shadowing or import issues remain.
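For reference, a small sketch of the Go 1.21+ builtins the review refers to:

```go
package main

import "fmt"

func main() {
	// Since Go 1.21, min and max are generic builtins that work for any
	// ordered type, so per-type helpers like minUint64/maxUint64 are
	// unnecessary.
	var a, b uint64 = 3, 7
	fmt.Println(min(a, b)) // prints 3
	fmt.Println(max(a, b)) // prints 7

	// They are also variadic.
	fmt.Println(min(2.5, 1.5, 0.5)) // prints 0.5
}
```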
```go
package bis

import (
	"math/rand"
	"strings"
	"sync"
	"testing"

	"workload/schema"
)
```
Fix missing copyright header — CI is failing.
The pipeline reports: "Copyright information is incorrect in this file as part of 'check-copyright' step." Add the standard Apache 2.0 copyright header at the top, matching the format in bis_metadata.go.
🐛 Proposed fix

```diff
+// Copyright 2026 PingCAP, Inc.
+//
+// Licensed under the Apache License, Version 2.0 (the "License");
+// you may not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
 package bis
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
// Copyright 2026 PingCAP, Inc.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// See the License for the specific language governing permissions and
// limitations under the License.

package bis

import (
	"math/rand"
	"strings"
	"sync"
	"testing"

	"workload/schema"
)
```
🧰 Tools
🪛 GitHub Actions: PR Build and Unit Test
[error] 1-1: Copyright information is incorrect in this file as part of 'check-copyright' step. Please fix copyright headers.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/workload/schema/bis/bis_metadata_test.go` around lines 1 - 10, The file
bis_metadata_test.go is missing the required Apache 2.0 copyright header causing
CI to fail; add the same multi-line copyright header used in bis_metadata.go to
the very top of bis_metadata_test.go (above the "package bis" line) so the
header format, year, and owner match the project's standard.
```go
	switch {
	case r.Float64() < entityUpdateByPinIDWeight:
		pinID := w.pinID(tableIndex, seq)
		sql := fmt.Sprintf("UPDATE %s SET `delete_after` = ?, `updated_at` = ?, `pin_id` = ? WHERE (`pin_id` = ?) AND `delete_after` IS NULL", tableName)
		return sql, []interface{}{nil, now, pinID, pinID}
	case r.Float64() < entityUpdateByPinIDWeight+entityUpdateByCompositeKeyWeight:
		userID := w.userID(tableIndex, seq)
		entitySetID := w.entitySetID(seq)
		entityID := w.entityID(seq)
		sql := fmt.Sprintf("UPDATE %s SET `updated_at` = ?, `user_id` = ?, `content_hash` = ?, `entity_id` = ?, `entity_set_id` = ? WHERE (`entity_id` = ?) AND (`entity_set_id` = ?) AND (`user_id` = ?) AND `delete_after` IS NULL", tableName)
		contentHash := w.contentHash(tableIndex, uint64(now.UnixNano()))
		return sql, []interface{}{now, userID, contentHash, entityID, entitySetID, entityID, entitySetID, userID}
	default:
		id := w.newID('e', tableIndex, seq)
		sql := fmt.Sprintf("UPDATE %s SET `id` = ?, `delete_after` = ?, `updated_at` = ? WHERE (`id` = ?) AND `delete_after` IS NULL", tableName)
		return sql, []interface{}{id, nil, now, id}
	}
```
Bug: probability distribution for entity update types is incorrect due to multiple r.Float64() draws.
Lines 407 and 411 each call r.Float64() independently, meaning the second case evaluates a new random value against a cumulative threshold. This skews the distribution away from what the named weights suggest (0.66 / 0.25 / 0.09):
- P(pinID) = 0.66
- P(composite) ≈ (1−0.66) × 0.91 ≈ 0.31
- P(default) ≈ (1−0.66) × 0.09 ≈ 0.03
Contrast with buildBatchUpdateWithValues (line 436) which correctly draws p := r.Float64() once and uses cumulative thresholds. Apply the same pattern here.
🐛 Proposed fix

```diff
 func (w *BISMetadataWorkload) buildEntityUpdateWithValues(tableIndex int, r *rand.Rand) (string, []interface{}) {
 	tableName := getEntityTableName(tableIndex)
 	now := time.Now()
 	slot := w.slot(tableIndex)
 	upper := minUint64(w.perTableUpdateKeySpace, w.entitySeq[slot].Load())
 	seq := randSeq(r, upper)
+	p := r.Float64()
 	switch {
-	case r.Float64() < entityUpdateByPinIDWeight:
+	case p < entityUpdateByPinIDWeight:
 		pinID := w.pinID(tableIndex, seq)
 		sql := fmt.Sprintf("UPDATE %s SET `delete_after` = ?, `updated_at` = ?, `pin_id` = ? WHERE (`pin_id` = ?) AND `delete_after` IS NULL", tableName)
 		return sql, []interface{}{nil, now, pinID, pinID}
-	case r.Float64() < entityUpdateByPinIDWeight+entityUpdateByCompositeKeyWeight:
+	case p < entityUpdateByPinIDWeight+entityUpdateByCompositeKeyWeight:
 		userID := w.userID(tableIndex, seq)
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```go
	p := r.Float64()
	switch {
	case p < entityUpdateByPinIDWeight:
		pinID := w.pinID(tableIndex, seq)
		sql := fmt.Sprintf("UPDATE %s SET `delete_after` = ?, `updated_at` = ?, `pin_id` = ? WHERE (`pin_id` = ?) AND `delete_after` IS NULL", tableName)
		return sql, []interface{}{nil, now, pinID, pinID}
	case p < entityUpdateByPinIDWeight+entityUpdateByCompositeKeyWeight:
		userID := w.userID(tableIndex, seq)
		entitySetID := w.entitySetID(seq)
		entityID := w.entityID(seq)
		sql := fmt.Sprintf("UPDATE %s SET `updated_at` = ?, `user_id` = ?, `content_hash` = ?, `entity_id` = ?, `entity_set_id` = ? WHERE (`entity_id` = ?) AND (`entity_set_id` = ?) AND (`user_id` = ?) AND `delete_after` IS NULL", tableName)
		contentHash := w.contentHash(tableIndex, uint64(now.UnixNano()))
		return sql, []interface{}{now, userID, contentHash, entityID, entitySetID, entityID, entitySetID, userID}
	default:
		id := w.newID('e', tableIndex, seq)
		sql := fmt.Sprintf("UPDATE %s SET `id` = ?, `delete_after` = ?, `updated_at` = ? WHERE (`id` = ?) AND `delete_after` IS NULL", tableName)
		return sql, []interface{}{id, nil, now, id}
	}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/workload/schema/bis/bis_metadata.go` around lines 406 - 422, The bug
comes from calling r.Float64() multiple times in the switch, which skews the
intended probabilities; change the switch to sample p := r.Float64() once (e.g.,
at the top of the branch where the switch currently is) and then use cumulative
comparisons like "case p < entityUpdateByPinIDWeight:", "case p <
entityUpdateByPinIDWeight+entityUpdateByCompositeKeyWeight:" to decide between
the pinID, composite-key, and default update branches (matching the approach
used in buildBatchUpdateWithValues); keep the existing SQL strings and returned
parameter lists (pinID, userID/entityID/entitySetID, id, etc.) unchanged.
What problem does this PR solve?
Issue Number: close #xxx
What is changed and how it works?
Check List
Tests
Questions
Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?
Release note
Summary by CodeRabbit
New Features
Documentation
Tests