@anthonymq anthonymq commented Jan 14, 2026

Summary

This PR fixes memory spikes that can cause OOM (Out of Memory) errors in the _interactions layer when handling file uploads and inline data.

Problem

Several functions in the _interactions layer were loading entire files into memory before processing, causing OOM errors for large files.

Note: The main aio.files.upload() API (files.py / _api_client.py) is already memory-efficient: it streams in 8MB chunks via anyio.Path().open('rb'). The OOM issues occur in the experimental _interactions layer.

Root Causes Identified

  1. _interactions/_files.py: Used path.read_bytes() to load entire files before passing to httpx
  2. _interactions/_utils/_transform.py: Base64 encoding loaded entire files with read_bytes() before encoding
  3. _interactions/_utils/_utils.py: file_from_path() loaded entire files with read_bytes()

Solution

1. File Handle Streaming (_interactions/_files.py)

  • Return open file handles (open(path, 'rb')) instead of loading bytes with read_bytes()
  • httpx natively supports IO[bytes] file handles, so this is a drop-in fix
  • Applies to both sync and async code paths
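The handle-based approach can be sketched as follows (the helper name `to_httpx_file` is hypothetical, not the PR's actual internal function; it only illustrates the bytes-vs-handle swap):

```python
from pathlib import Path
from typing import IO


def to_httpx_file(path: str) -> IO[bytes]:
    # Return an open binary handle instead of the file's bytes.
    # httpx accepts IO[bytes] in its `files=`/content arguments and
    # streams from the handle, so the whole file never needs to sit
    # in memory at once.
    return Path(path).open("rb")


# Before (peak memory ~ file size):
#   data = Path(path).read_bytes()
# After (peak memory ~ httpx's internal read chunk size):
#   handle = to_httpx_file(path)
```

Because httpx duck-types on `.read()`, swapping bytes for a handle is transparent to the request-building code.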

2. Chunked Base64 Encoding (_interactions/_utils/_transform.py)

  • Implement chunked reading with 3MB chunks for base64 encoding
  • Chunk size is a multiple of 3 (required for correct base64 encoding without padding issues)
  • Reduces peak memory from O(file_size) to O(chunk_size)
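A minimal sketch of the chunked-encoding idea (the function name `b64_encode_chunked` is illustrative, not the PR's actual helper):

```python
import base64
from pathlib import Path
from typing import Iterator

# Chunk size must be a multiple of 3: base64 maps every 3 input bytes
# to 4 output characters, so a multiple-of-3 chunk encodes with no
# internal '=' padding, and concatenating per-chunk outputs equals
# encoding the whole file at once.
_CHUNK_SIZE = 3 * 1024 * 1024  # 3MB, divisible by 3


def b64_encode_chunked(path: str) -> Iterator[bytes]:
    # Yield base64-encoded chunks; peak memory is O(chunk_size),
    # not O(file_size).
    with Path(path).open("rb") as f:
        while chunk := f.read(_CHUNK_SIZE):
            yield base64.b64encode(chunk)
```

Joining the yielded chunks (`b"".join(...)`) reproduces `base64.b64encode()` of the full file byte-for-byte.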

3. File Handle in Utility (_interactions/_utils/_utils.py)

  • file_from_path() now returns a file handle instead of loaded bytes

Memory Flow (Before vs After)

Before:

File path → read_bytes() (🔴 entire file in memory) → process

After:

File path → open() → file handle → stream in chunks (✅ memory-efficient)

Testing

  • Verified that the modified files compile without syntax errors
  • Verified that httpx accepts the file-handle types returned by the new code paths
  • Verified that chunked base64 encoding produces output identical to the original implementation

Backwards Compatibility

This is a fully backwards-compatible change:

  • Public API remains unchanged
  • Return types are compatible (httpx accepts both bytes and IO[bytes])
  • Base64 output is identical (chunked encoding with multiples of 3 produces same result)
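The multiple-of-3 claim is easy to check directly: a chunk whose length is a multiple of 3 encodes with no internal `=` padding, so concatenated per-chunk outputs match one-shot encoding, while any other chunk size injects padding mid-stream. A small self-contained demonstration (names are illustrative):

```python
import base64


def encode_in_chunks(data: bytes, chunk_size: int) -> bytes:
    # Encode each slice independently and concatenate the results.
    return b"".join(
        base64.b64encode(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    )


data = b"x" * 10

# Chunk size that is a multiple of 3: identical to one-shot encoding.
assert encode_in_chunks(data, 6) == base64.b64encode(data)

# Chunk size NOT a multiple of 3: '=' padding is injected mid-stream,
# corrupting the combined output.
assert encode_in_chunks(data, 4) != base64.b64encode(data)
```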

@janasangeetha janasangeetha self-assigned this Jan 16, 2026
@janasangeetha janasangeetha added the size:XL Code changes > 100 lines label Jan 16, 2026
@janasangeetha
Collaborator

Hey @anthonymq
Thanks for contributing!
The branch is out of date. Kindly update it.

This change addresses memory spikes that can cause OOM errors when
uploading large files to the Gemini File API.

Changes:
1. _interactions/_files.py: Return open file handles instead of loading
   entire files into memory with read_bytes(). httpx supports IO[bytes]
   directly, so there's no need to pre-load file contents.

2. _interactions/_utils/_transform.py: Implement chunked base64 encoding
   using 3MB chunks (must be multiple of 3 for base64 correctness) to
   reduce peak memory usage when encoding files for inline data.

The existing chunked upload mechanism in _api_client.py (8MB chunks)
was already correct, but files were being loaded into memory before
reaching that code path. This fix ensures memory-efficient handling
from the start of the upload flow.
Additional fix for the _interactions layer - file_from_path was loading
entire files with read_bytes() when it can return a file handle instead.
@anthonymq anthonymq force-pushed the fix/memory-efficient-file-uploads branch from bf6914f to ccf8a52 on January 16, 2026 08:41
@anthonymq
Author

Branch updated,
Cheers
