Comprehensive batching tests and critical fix #110
c191e25 to 8630e22
Pull Request Overview
This PR adds comprehensive testing coverage for CloudWatch Logs byte limit scenarios and includes a critical fix to the batching logic. The changes focus on testing edge cases around the 26-byte overhead per event and ensuring proper batch splitting when approaching the 1MB size limit.
- Added extensive test suite covering small event overhead scenarios, large event batching, and mixed-size patterns
- Created a payload analysis utility to help understand CloudWatch batching efficiency
- Fixed critical batching logic to check size limits before adding events rather than after
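The check-before-add ordering described in the last bullet can be sketched as follows. This is a minimal illustration, not the code in container/log_processor.py: the names `event_size` and `split_into_batches` are hypothetical, and the 10,000-event count limit is the documented PutLogEvents limit rather than something stated in this PR.

```python
from typing import Iterable, Iterator, List

MAX_BATCH_BYTES = 1_048_576   # 1MB PutLogEvents batch size limit
MAX_BATCH_EVENTS = 10_000     # PutLogEvents per-batch event count limit
EVENT_OVERHEAD_BYTES = 26     # fixed overhead counted for every event


def event_size(message: str) -> int:
    """UTF-8 length of the message plus the fixed per-event overhead."""
    return len(message.encode("utf-8")) + EVENT_OVERHEAD_BYTES


def split_into_batches(messages: Iterable[str]) -> Iterator[List[str]]:
    """Yield batches, checking the limits BEFORE an event is appended."""
    batch: List[str] = []
    batch_bytes = 0
    for message in messages:
        size = event_size(message)
        would_overflow = (
            batch_bytes + size > MAX_BATCH_BYTES
            or len(batch) >= MAX_BATCH_EVENTS
        )
        # Flush first, so the current batch never grows past a limit.
        # (A single event larger than the limit is still emitted alone;
        # handling that case is out of scope for this sketch.)
        if batch and would_overflow:
            yield batch
            batch, batch_bytes = [], 0
        batch.append(message)
        batch_bytes += size
    if batch:
        yield batch
```

Checking after the append instead would let the event that crosses the limit land in the batch that is about to be sent, which is the oversized-batch failure this PR describes.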
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| tests/utils/payload_analyzer.py | New utility module for analyzing CloudWatch payload sizes and batch efficiency |
| tests/utils/__init__.py | Package initialization exposing payload analysis utilities |
| tests/unit/test_cloudwatch_byte_limits.py | Comprehensive test suite covering CloudWatch byte limit edge cases and scenarios |
| tests/unit/test_log_processor.py | Updated timeout test to reflect corrected batching behavior |
| container/log_processor.py | Critical fix to batching logic and enhanced logging for size calculations |
… testing

CRITICAL BUG FIX:
- Fixed batching logic that was adding events BEFORE checking size limits
- This was causing oversized batches (>1MB) to be sent to CloudWatch API
- Changed to check limits BEFORE adding events to prevent API failures

ENHANCED LOGGING & MONITORING:
- Added detailed byte calculations for each event (message + 26-byte overhead)
- Enhanced batch logging with size breakdowns and overhead percentages
- Added debug logging for batch triggers and size verification
- Provides production visibility into batching efficiency

COMPREHENSIVE TEST SUITE (33 tests):
- Small events: High overhead scenarios (up to 96.3% overhead impact)
- Large events: Near 1MB events and multi-batch scenarios
- Mixed distributions: Realistic application log patterns
- Edge cases: Unicode, JSON, boundary conditions, timeout handling
- Real-world validation: Burst patterns, interleaved sizes

NEW UTILITIES:
- PayloadAnalyzer: Analyzes batch sizes and identifies problematic scenarios
- Enhanced test utilities for CloudWatch batch analysis and debugging

VALIDATION:
- Worst case: 38,802 minimal events = exactly 1MB batch handled correctly
- Realistic patterns: Mixed distributions show appropriate overhead (2-50%)
- All CloudWatch limits respected: size, count, and timeout boundaries

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
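The "message + 26-byte overhead" calculation and the 96.3% figure for minimal events can be reproduced with a short sketch. The function name `overhead_percent` is hypothetical and is not claimed to match PayloadAnalyzer's interface.

```python
EVENT_OVERHEAD_BYTES = 26  # fixed overhead counted for every event


def overhead_percent(message: str) -> float:
    """Share of an event's counted size that is fixed overhead, in percent."""
    payload_bytes = len(message.encode("utf-8"))  # size is measured in UTF-8 bytes
    total_bytes = payload_bytes + EVENT_OVERHEAD_BYTES
    return 100.0 * EVENT_OVERHEAD_BYTES / total_bytes


# A 1-byte message counts as 27 bytes, 26 of which are overhead.
print(round(overhead_percent("x"), 1))         # 96.3
# A 1,000-byte message carries far less relative overhead.
print(round(overhead_percent("x" * 1000), 1))  # 2.5
```

Because the size is measured on the UTF-8 encoding, a non-ASCII character such as "é" contributes two bytes rather than one, which is why the Unicode edge cases matter for the byte limit.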
CRITICAL TESTING ENHANCEMENT:
- Added TestMessageContentPreservation class with 2 comprehensive tests
- Verifies exact message content and chronological order across size-triggered batches
- Goes beyond count/size verification to ensure actual message integrity

TEST 1: test_exact_message_preservation_across_multiple_batches
- Creates 12 large events (~300KB each) forcing 4+ batches due to 1MB size limits
- Uses sequential timestamps (1 second apart) for clear chronological ordering
- Embeds unique identifiers in message content for tracking across batches
- Performs byte-for-byte comparison of original vs batched messages
- Verifies no duplicates, no missing messages, perfect timestamp preservation

TEST 2: test_message_content_with_special_characters_across_batches
- Tests Unicode, JSON, multi-line, and special characters across batches
- Uses hash-based verification for cryptographic content integrity
- Ensures no encoding corruption during size-triggered batch splitting

VALIDATION COVERAGE:
✓ Exact message content preservation (byte-for-byte)
✓ Chronological order within and across batches
✓ Unique identifier tracking through message content
✓ No duplicate or missing messages
✓ Unicode and special character handling
✓ Hash-based content verification
✓ Sequential timestamp preservation

This addresses the critical need to verify that "Hello World" going in comes out exactly as "Hello World" in the correct chronological position, not just that the count and size are correct.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
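The kind of check these tests describe can be sketched as below. This is an illustration only: `is_preserved` is a hypothetical helper, SHA-256 is an assumed choice of hash (the commit message says only "hash-based"), and events are assumed to be dicts with `timestamp` and `message` keys.

```python
import hashlib
from typing import Dict, List

Event = Dict[str, object]  # assumed shape: {"timestamp": int, "message": str}


def digest(message: str) -> str:
    """SHA-256 of the UTF-8 bytes, so encoding corruption changes the hash."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


def is_preserved(original: List[Event], batches: List[List[Event]]) -> bool:
    """True if the batches, read in order, reproduce the original events."""
    flattened = [event for batch in batches for event in batch]
    if len(flattened) != len(original):
        return False  # an event was dropped or duplicated
    same_order = [e["timestamp"] for e in flattened] == [
        e["timestamp"] for e in original
    ]
    same_content = [digest(str(e["message"])) for e in flattened] == [
        digest(str(e["message"])) for e in original
    ]
    return same_order and same_content
```

Comparing the flattened sequence position by position covers count, order, and content together, so a batch split that reorders or rewrites even one message fails the check.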
The test was incorrectly checking only the last batch instead of verifying that all events were sent across multiple batches.

The timeout logic works correctly - it sends events in 2 batches when timeout is reached:
1. First batch with Event 1 (due to timeout)
2. Final batch with Event 2

Updated test to properly verify both batches and all events are sent.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
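The difference between checking the last batch and checking every batch can be shown with `unittest.mock`. This sketch assumes the test replaces the send function with a mock; `send_batch` is a hypothetical name, and the two calls stand in for what the processor would do on timeout and on final flush.

```python
from unittest.mock import Mock

send_batch = Mock()  # hypothetical stand-in for the function that sends a batch

# Simulate the behavior described above: two separate sends.
send_batch([{"timestamp": 1, "message": "Event 1"}])  # flushed due to timeout
send_batch([{"timestamp": 2, "message": "Event 2"}])  # final batch

# call_args holds only the most recent call, so Event 1 is invisible here.
last_only = [event["message"] for event in send_batch.call_args.args[0]]

# call_args_list holds every call, in order.
all_sent = [
    event["message"]
    for call in send_batch.call_args_list
    for event in call.args[0]
]

print(last_only)  # ['Event 2']
print(all_sent)   # ['Event 1', 'Event 2']
```

Asserting on `call_args_list` (and on `call_count`) verifies both the number of batches and the full set of events, which is what the updated test is described as doing.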
8630e22 to 5e9ac2c
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #110 +/- ##
==========================================
- Coverage 68.14% 67.54% -0.61%
==========================================
Files 9 9
Lines 1510 1593 +83
==========================================
+ Hits 1029 1076 +47
- Misses 481 517 +36
Now in #174
No description provided.