@mattsu2020
Contributor

Summary

Reworked base_common streaming so base64/basenc no longer slurp stdin into RAM; both encoder/decoder now read from any Read stream with bounded buffers, keep line-wrapping support, and reuse the SIMD wrapper.
Tightened the Base64SimdWrapper integration and added regression tests that cover streaming encode/decode behaviors plus chunked readers.
Fixed Clippy’s elidable lifetime warning and ensured Base58 handling no longer triggers gigantic buffer allocations.
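
As a rough illustration of the bounded-buffer idea described above (not the actual `base_common` code), the sketch below encodes from any `Read` into any `Write` with a fixed-size buffer. The names `encode_stream`, `encode_groups`, and `CHUNK` are hypothetical, and it uses a plain scalar encoder instead of the SIMD wrapper. Leftover bytes that do not fill a 3-byte group are carried into the next read, so padding only appears at the very end.

```rust
use std::io::{self, Read, Write};

const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Hypothetical buffer size; a multiple of 3 so full chunks never need padding.
const CHUNK: usize = 3 * 1024;

/// Encode `input` in 3-byte groups and append the result to `out`.
/// A trailing group of 1 or 2 bytes is padded with '='.
fn encode_groups(input: &[u8], out: &mut Vec<u8>) {
    for group in input.chunks(3) {
        let b0 = group[0];
        let b1 = group.get(1).copied().unwrap_or(0);
        let b2 = group.get(2).copied().unwrap_or(0);
        out.push(ALPHABET[(b0 >> 2) as usize]);
        out.push(ALPHABET[(((b0 & 0x03) << 4) | (b1 >> 4)) as usize]);
        out.push(if group.len() > 1 {
            ALPHABET[(((b1 & 0x0f) << 2) | (b2 >> 6)) as usize]
        } else {
            b'='
        });
        out.push(if group.len() > 2 {
            ALPHABET[(b2 & 0x3f) as usize]
        } else {
            b'='
        });
    }
}

/// Stream base64 from `reader` to `writer`; memory use is independent of input size.
fn encode_stream<R: Read, W: Write>(mut reader: R, mut writer: W) -> io::Result<()> {
    let mut buf = [0u8; CHUNK];
    let mut filled = 0; // bytes currently held in `buf` (at most 2 between reads)
    let mut out = Vec::with_capacity(CHUNK / 3 * 4);
    loop {
        let n = match reader.read(&mut buf[filled..]) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        filled += n;
        // Encode only whole 3-byte groups; keep the remainder for the next read.
        let whole = filled - filled % 3;
        out.clear();
        encode_groups(&buf[..whole], &mut out);
        writer.write_all(&out)?;
        buf.copy_within(whole..filled, 0);
        filled -= whole;
    }
    // Flush the final partial group (if any) with padding.
    out.clear();
    encode_groups(&buf[..filled], &mut out);
    writer.write_all(&out)
}
```

Calling `encode_stream(std::io::stdin().lock(), std::io::stdout().lock())` would wire this to the standard streams; line wrapping is left out to keep the sketch short.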

Testing

cargo test -p uu_base32
cargo test -p uu_base64
cargo test -p uu_basenc
/usr/bin/time -l sh -c "cat benchmarks/bench_500mb.bin | /usr/bin/base64 > /dev/null" → max RSS 1.85 MB
/usr/bin/time -l sh -c "cat benchmarks/bench_500mb.bin | ./target/release/base64 > /dev/null" → max RSS 2.38 MB
/usr/bin/time -l sh -c "cat benchmarks/bench_500mb.b64 | /usr/bin/base64 -D > /dev/null" → max RSS 1.85 MB
/usr/bin/time -l sh -c "cat benchmarks/bench_500mb.b64 | ./target/release/base64 -d > /dev/null" → max RSS 2.33 MB

- Remove unused ReadSeek trait and rely solely on Read for inputs
- Stream stdin directly instead of buffering into memory
- Simplify fast decode/encode capability detection to be format-only
- Always use padded base64 alphabet for consistent SIMD-based handling
- Replace explicit lifetime parameter with anonymous lifetime to reduce noise
- Keeps semantics unchanged while aligning with modern Rust idioms
- Introduce READ_BUFFER_MAX to limit read buffer size during encoding
- Clamp buffer size to avoid excessively large allocations while preserving behavior
- Add missing semicolons on USimpleError returns for consistent style
- Reformat encode_to_vec_deque call to improve readability without logic changes
- Add "hellohello" to ignored words to prevent false positives
- Keeps spell-checker from flagging known test data in base_common.rs
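
The `READ_BUFFER_MAX` clamp mentioned in the commits above could look roughly like the following. This is a sketch under assumptions: the thread does not give the constant's value, so 64 KiB is made up here, and both the helper name `clamped_buffer_len` and the rounding to a multiple of 3 (so full chunks encode without padding) are illustrative choices rather than the PR's code.

```rust
/// Hypothetical cap; the real constant's value is not stated in this thread.
const READ_BUFFER_MAX: usize = 64 * 1024;

/// Pick a read-buffer length for encoding: honor the requested size,
/// never exceed the cap, and keep the result a non-zero multiple of 3.
fn clamped_buffer_len(requested: usize) -> usize {
    // `clamp` bounds the value to the inclusive range [3, READ_BUFFER_MAX].
    let capped = requested.clamp(3, READ_BUFFER_MAX);
    // Round down to a whole number of 3-byte groups.
    capped - capped % 3
}
```

With a cap like this, an oversized request (for example one derived from a very large wrap width) cannot translate into a very large allocation.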
@sylvestre
Contributor

i wonder how we could check that in the CI to make sure we don't regress
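
One possible way to check this deterministically in CI, offered as a sketch and not something this PR implements, is a test reader that records the largest buffer the consumer ever hands to `read`. `ProbeReader` and `drain_bounded` are hypothetical names; `drain_bounded` stands in for the encoder or decoder under test.

```rust
use std::io::{self, Read};

/// Hypothetical test helper: serves at most `step` bytes per call and
/// records the largest buffer length the consumer ever offered.
struct ProbeReader<R> {
    inner: R,
    step: usize,
    max_requested: usize,
}

impl<R: Read> Read for ProbeReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.max_requested = self.max_requested.max(buf.len());
        // Serve short reads to mimic a pipe; a step of 0 is treated as 1
        // so a non-empty request never looks like end of input.
        let n = buf.len().min(self.step.max(1));
        self.inner.read(&mut buf[..n])
    }
}

/// Stand-in for the code under test: drains a reader through a fixed buffer
/// and returns the number of bytes consumed.
fn drain_bounded<R: Read>(reader: &mut R) -> io::Result<u64> {
    let mut buf = [0u8; 8192];
    let mut total = 0u64;
    loop {
        match reader.read(&mut buf) {
            Ok(0) => return Ok(total),
            Ok(n) => total += n as u64,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
}
```

A test can then assert that `max_requested` stays below a fixed bound no matter how large the input is. This catches a return to read-everything behavior without measuring RSS, although it does not see allocations made outside the read path.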

@github-actions

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)

@sylvestre
Contributor

sorry, small conflict

- Clarify trait alias comment for ReadSeek in English
- Refine padding detection doc to describe scanning including trailing whitespace
@codspeed-hq

codspeed-hq bot commented Nov 10, 2025

CodSpeed Performance Report

Merging #9215 will degrade performance by 5.77%

Comparing mattsu2020:base64_performance (3079380) with main (997d956)

Summary

❌ 2 regressions
✅ 121 untouched
⏩ 5 skipped [1]

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| Benchmark | BASE | HEAD | Change |
|---|---|---|---|
| b64_decode_ignore_garbage_synthetic | 149.2 µs | 157.5 µs | -5.31% |
| b64_decode_synthetic | 148.9 µs | 158.1 µs | -5.77% |

Footnotes

  1. 5 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them on CodSpeed to remove them from the performance reports.

@github-actions

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)

@mattsu2020
Contributor Author

#9213
