fix: prevent response file overwrite when -sd flag is used #2226
base: dev
Conversation
Walkthrough

The code introduces explicit handling for duplicate targets by tracking their counts and processing each occurrence accordingly, especially when deduplication is disabled. It also updates the file writing logic to avoid overwriting response files by generating unique filenames. Test cases are updated to reflect the new duplicate target error handling.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Runner
    participant FileSystem
    User->>Runner: Provide target list with duplicates, set -skip-dedupe
    Runner->>Runner: Parse targets, count duplicates
    loop For each duplicate count
        Runner->>Runner: Process target
        Runner->>FileSystem: Write response file (with unique name if needed)
    end
```
Actionable comments posted: 1
🧹 Nitpick comments (1)

runner/runner.go (1)

2152-2176: Consider adding a safety limit to prevent potential infinite loops.

While the file creation logic correctly handles duplicates, there's no upper limit on the suffix counter. In edge cases or under concurrent load, this could theoretically loop indefinitely.
Consider adding a reasonable upper limit:

```diff
 finalPath := responsePath
 idx := 0
+const maxSuffixAttempts = 1000
 for {
+	if idx >= maxSuffixAttempts {
+		gologger.Error().Msgf("Exceeded maximum attempts to create unique file for '%s'", responsePath)
+		break
+	}
 	targetPath := finalPath
 	if idx > 0 {
 		basePath := strings.TrimSuffix(responsePath, ".txt")
 		targetPath = fmt.Sprintf("%s_%d.txt", basePath, idx)
 	}
```
📜 Review details
📒 Files selected for processing (2)

- runner/runner.go (6 hunks)
- runner/runner_test.go (3 hunks)
🧰 Additional context used

🧬 Code Graph Analysis (1)

runner/runner_test.go (1)

- runner/options.go (1): Options (173-351)
🔇 Additional comments (5)

runner/runner.go (4)

448-449: LGTM! Good use of a sentinel error for explicit duplicate detection.

The error declaration follows Go's best practices for sentinel errors.

642-645: Good improvement to error handling!

Making duplicate detection explicit with an error return is better than silently returning 0. This allows callers to handle duplicates appropriately based on their needs.
1245-1267: Well-structured refactoring for repeated target processing!

The runProcess helper function cleanly encapsulates the logic for processing targets multiple times based on their count. Good handling of edge cases with the default to 1 when parsing fails.
455-465: Consider initializing counts consistently.

The current implementation always sets the initial count to "1" for new targets (line 458), but this might not accurately reflect the actual number of times a target appears in the input. Additionally, the error from strconv.Atoi at line 461 is ignored, which could lead to unexpected behavior if the stored value is corrupted.

Consider this more robust implementation:

```diff
-	r.hm.Set(target, []byte("1")) //nolint
+	r.hm.Set(target, []byte("1"))
 } else if r.options.SkipDedupe && errors.Is(err, duplicateTargetErr) {
 	if v, ok := r.hm.Get(target); ok {
-		cnt, _ := strconv.Atoi(string(v))
+		cnt, err := strconv.Atoi(string(v))
+		if err != nil {
+			// Handle corrupted count, default to 1
+			cnt = 1
+		}
 		r.hm.Set(target, []byte(strconv.Itoa(cnt+1)))
 		numHosts += 1
 	}
```

Likely an incorrect or invalid review comment.
runner/runner_test.go (1)

10-10: Test updates correctly reflect the new duplicate handling behavior!

Good changes:

- Explicitly setting SkipDedupe: false makes the test's intent clear
- Using errors.Is is the correct way to check for sentinel errors in Go
- The test properly validates that duplicates are detected and return the expected error

Also applies to: 128-130, 145-145
```diff
 		expandedTarget, err := r.countTargetFromRawTarget(target)
 		if err == nil && expandedTarget > 0 {
 			numTargets += expandedTarget
-			r.hm.Set(target, nil) //nolint
+			r.hm.Set(target, []byte("1")) //nolint
+		} else if r.options.SkipDedupe && errors.Is(err, duplicateTargetErr) {
+			if v, ok := r.hm.Get(target); ok {
+				cnt, _ := strconv.Atoi(string(v))
+				r.hm.Set(target, []byte(strconv.Itoa(cnt+1)))
+				numTargets += 1
+			}
+		}
```
🛠️ Refactor suggestion

Extract duplicate counting logic to reduce code duplication.

This logic is duplicated from prepareInput. Consider extracting it into a helper method to maintain DRY principles and ensure consistent behavior.

Consider creating a helper method:
```go
func (r *Runner) incrementTargetCount(target string) {
	if v, ok := r.hm.Get(target); ok {
		cnt, err := strconv.Atoi(string(v))
		if err != nil {
			cnt = 1
		}
		r.hm.Set(target, []byte(strconv.Itoa(cnt+1)))
	}
}
```
Then use it in both locations:

```diff
 } else if r.options.SkipDedupe && errors.Is(err, duplicateTargetErr) {
-	if v, ok := r.hm.Get(target); ok {
-		cnt, _ := strconv.Atoi(string(v))
-		r.hm.Set(target, []byte(strconv.Itoa(cnt+1)))
-		numTargets += 1
-	}
+	r.incrementTargetCount(target)
+	numTargets += 1
 }
```
🤖 Prompt for AI Agents
In runner/runner.go around lines 622 to 632, the logic for incrementing the
count of duplicate targets is duplicated from the prepareInput method. Extract
this duplicate counting logic into a new helper method on the Runner struct, for
example incrementTargetCount, which safely retrieves the current count, handles
conversion errors by defaulting to 1, increments the count, and updates the map.
Replace the duplicated code in both places with calls to this new helper method
to adhere to DRY principles and maintain consistent behavior.
Summary

This PR addresses issue #2152, which causes response files to be overwritten even when the -sd (SkipDedupe) flag is used.

What was the issue?

When the -sd flag is enabled, input targets are processed without deduplication. However, the responses are still written to the same file path based on the SHA1 hash of the URL, leading to overwritten output files.

What’s changed?
- Response filenames now receive numeric suffixes (_1, _2, ...) when a file for the same URL already exists, instead of overwriting it.
- Duplicate target counts are tracked in HybridMap, and the number of processing iterations is determined based on this count.
- Updated countTargetFromRawTarget to return a known duplicateTargetErr when deduplication is disabled.
- Updated analyze and process to ensure unique file writes under concurrency.

Why is this useful?
This change ensures:

- Response files are no longer overwritten when duplicate targets are scanned with the -sd flag.

Result
Related issue
Closes #2152