🐛 fix(duplicate-ja-doc-bug): centralize filename sanitization logic #1319
Description of Changes
This PR fixes the duplicate document name error that occurred when uploading multiple Japanese files with the same character length (#1143, #1250).
Root Cause
The previous implementation replaced all non-ASCII characters with 'X', causing different Japanese filenames with the same length to become identical:
県名リスト.xlsx → XXXXX
会員リスト.xlsx → XXXXX
This caused the AWS Bedrock API to reject the requests with a duplicate document name error.
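The collision is easy to reproduce. The following is a minimal sketch of the old replacement behavior, not the project's actual code: the function name `replaceNonAsciiWithX` is hypothetical, and keeping the `.xlsx` extension in the output is an assumption.

```typescript
// Hypothetical reconstruction of the previous behavior (illustrative name).
function replaceNonAsciiWithX(name: string): string {
  // Replace every character outside the ASCII range with 'X'.
  return name.replace(/[^\x00-\x7F]/g, "X");
}

const oldA = replaceNonAsciiWithX("県名リスト.xlsx");
const oldB = replaceNonAsciiWithX("会員リスト.xlsx");

// Both names have five Japanese characters, so they collapse to the same string.
console.log(oldA, oldB, oldA === oldB);
```

Because the replacement keeps only the length of the non-ASCII part, any two names with the same length and the same ASCII characters become indistinguishable.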
Solution
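As a rough illustration of the idea behind this fix (mask non-ASCII characters, then append a short hash of the original name), here is a sketch under stated assumptions. It is not the `convertToSafeFilename()` implementation from this PR: `sketchSafeFilename` and `fnv1aStyleHex` are hypothetical names, the FNV-1a-style hash is a stand-in for whatever hash the PR uses, and keeping the extension is an assumption. The hash values it produces will not match the ones listed in this description.

```typescript
// Hypothetical 32-bit FNV-1a-style hash over UTF-16 code units, as 8 hex digits.
// Stand-in only; the hash used in the PR is not stated in this description.
function fnv1aStyleHex(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply mod 2^32, keep unsigned
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Hypothetical sanitizer: mask non-ASCII characters with '_' and append a hash
// of the original stem so that different names stay different.
function sketchSafeFilename(name: string): string {
  const dot = name.lastIndexOf(".");
  const stem = dot > 0 ? name.slice(0, dot) : name;
  const ext = dot > 0 ? name.slice(dot) : "";
  const masked = stem.replace(/[^\x00-\x7F]/g, "_");
  // Assumption: ASCII-only names need no disambiguation and pass through as-is.
  if (masked === stem) return name;
  return masked + fnv1aStyleHex(stem) + ext;
}

const safeA = sketchSafeFilename("県名リスト.xlsx");
const safeB = sketchSafeFilename("会員リスト.xlsx");
console.log(safeA, safeB, safeA !== safeB);
```

The hash is computed from the original, unmasked name, so two names that mask to the same underscores still get different suffixes.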
fileNameUtils.ts with convertToSafeFilename() function
県名リスト.xlsx → _____46a890b2
会員リスト.xlsx → _____5c4aa342
Changes
fileNameUtils.ts
bedrockAgentApi.ts and models.ts
Impact on Existing Users
No breaking changes or compatibility issues.
Checklist
npm run cdk:test and if there are snapshot differences, execute npm run cdk:test:update-snapshot to update snapshots
Related Issues
Closes #1143
Closes #1250