Skip to content

Conversation

@bmigette
Copy link

@bmigette bmigette commented Oct 6, 2025

Pull Request: Add Chunking and Persistent Storage to FinancialSituationMemory

Overview

This PR adds two critical improvements to the FinancialSituationMemory class:

  1. Text Chunking for Long Inputs - Handles texts exceeding embedding model limits using RecursiveCharacterTextSplitter
  2. Persistent ChromaDB Storage - Enables disk-based storage for memory persistence across sessions

Motivation

Problem 1: Long Text Handling

The OpenAI embedding model text-embedding-3-small has a maximum context length of 8,192 tokens. When financial analysis texts exceed this limit (e.g., comprehensive market analyses, long research reports), the embedding API fails with an error.

Before:

# Would fail with texts > 24,000 characters (~8000 tokens)
embedding = get_embedding(very_long_market_analysis)  # ❌ API Error

After:

# Automatically chunks and processes long texts
embeddings = get_embedding(very_long_market_analysis)  # ✅ Returns list of embeddings

Problem 2: Memory Persistence

The original implementation used ChromaDB's in-memory client, meaning all memories were lost when the process ended. For production trading systems, persisting historical knowledge is essential.

Before:

# In-memory only - lost on restart
memory = FinancialSituationMemory("trading", config)

After:

# Persistent storage - survives restarts
memory = FinancialSituationMemory("trading", config, persistent_dir="./data/chromadb")

Changes Made

1. Updated requirements.txt

Added langchain for RecursiveCharacterTextSplitter:

 typing-extensions
+langchain
 langchain-openai
 langchain-experimental

2. Enhanced FinancialSituationMemory.__init__()

Added Parameters:

  • symbol (optional): For symbol-specific collection naming
  • persistent_dir (optional): Path for persistent storage

Key Improvements:

  • Persistent ChromaDB client when persistent_dir is provided
  • Backward-compatible in-memory client when not provided
  • Symbol-based collection naming for multi-symbol support
  • Collection name sanitization for ChromaDB compatibility
  • Automatic directory creation for persistent storage
  • Fallback error handling for ChromaDB compatibility issues
def __init__(self, name, config, symbol=None, persistent_dir=None):
    # ... (see full implementation in memory.py)

3. Reimplemented get_embedding()

Returns Changed:

  • Old: Single embedding (float list)
  • New: List of embeddings (list of float lists)

Key Features:

  • Automatically detects long texts (> 24,000 characters)
  • Uses RecursiveCharacterTextSplitter for intelligent chunking
  • Chunks at natural boundaries (paragraphs, sentences, words)
  • 500-character overlap to preserve context between chunks
  • Robust error handling with per-chunk try-catch
  • Returns single-item list for short texts (backward compatible)

Algorithm:

if text_length <= 24,000 chars:
    return [single_embedding]
else:
    1. Split text into ~23,000 char chunks with 500 char overlap
    2. Get embedding for each chunk
    3. Return list of all chunk embeddings

4. Updated add_situations()

Changed to handle chunked embeddings:

  • Processes list of embeddings instead of single embedding
  • Creates separate document for each chunk
  • Associates full situation text with each chunk
  • Maintains unique IDs for all chunks

Benefit: Even if query matches only one chunk of a long situation, the full situation is returned.

5. Enhanced get_memories()

Added embedding averaging for multi-chunk queries:

if len(query_embeddings) > 1:
    query_embedding = np.mean(query_embeddings, axis=0).tolist()
else:
    query_embedding = query_embeddings[0]

Benefit: Long queries are represented by their average embedding, improving semantic search accuracy.

Backward Compatibility

Fully backward compatible - existing code continues to work without changes:

# Old usage - still works!
memory = FinancialSituationMemory("trading", config)

New features are opt-in via optional parameters:

# New usage - persistent storage
memory = FinancialSituationMemory(
    "trading", 
    config, 
    symbol="AAPL",
    persistent_dir="./chromadb_storage"
)

Testing

Created comprehensive test suite in test_memory_chunking.py:

  1. Short Text Backward Compatibility - Verifies existing functionality
  2. Long Text Chunking - Tests 24,000+ character texts
  3. Persistent Storage - Verifies data survives process restart
  4. Symbol Collection Naming - Tests multi-symbol support

To run tests:

cd TradingAgents
export OPENAI_API_KEY="your-key"
python test_memory_chunking.py

Benefits

For Users

  1. No More Embedding Errors - Handles texts of any length
  2. Persistent Memory - Trading insights survive restarts
  3. Better Organization - Symbol-specific memory collections
  4. No Breaking Changes - Existing code works as-is

For Developers

  1. Cleaner API - Consistent return type (list of embeddings)
  2. Better Error Handling - Graceful fallbacks
  3. Extensible - Easy to add more chunking strategies
  4. Well-Documented - Clear docstrings and comments

Performance Impact

Memory Usage

  • In-memory mode: Same as before
  • Persistent mode: Minimal overhead (ChromaDB uses SQLite)

Processing Time

  • Short texts (<24K chars): No change
  • Long texts: Linear increase with text length
    • ~1-2 seconds per 24K chars chunk (API latency)
    • Parallel processing possible (future optimization)

Storage

  • Disk usage: ~1KB per embedding (persistent mode)
  • Query speed: Same as before (ChromaDB vector search)

Migration Guide

For Existing Users

No changes required! Your existing code continues to work:

memory = FinancialSituationMemory("my_memory", config)

For New Features

Add persistent storage:

memory = FinancialSituationMemory(
    "my_memory", 
    config,
    persistent_dir="./chromadb_data"
)

Add symbol-specific collections:

memory = FinancialSituationMemory(
    "stock_analysis",
    config,
    symbol="AAPL",
    persistent_dir="./chromadb_data"
)

Example Usage

Before (Old API)

config = {"backend_url": "https://api.openai.com/v1"}
memory = FinancialSituationMemory("trading", config)

# Short text only
memory.add_situations([
    ("Tech stocks volatile", "Reduce exposure")
])

results = memory.get_memories("Tech volatility", n_matches=1)

After (New API with Long Texts)

config = {"backend_url": "https://api.openai.com/v1"}
memory = FinancialSituationMemory(
    "trading",
    config,
    symbol="AAPL",
    persistent_dir="./chromadb_storage"
)

# Long comprehensive analysis (would have failed before)
long_analysis = """
[5000+ words of detailed market analysis covering:
- Macroeconomic conditions
- Sector rotation trends
- Technical analysis
- Fundamental metrics
- Risk factors
... etc]
"""

memory.add_situations([
    (long_analysis, "Maintain position with trailing stop")
])

# Works with long queries too
long_query = """[Another long market situation...]"""
results = memory.get_memories(long_query, n_matches=3)

Files Changed

  1. requirements.txt - Added langchain dependency
  2. tradingagents/agents/utils/memory.py - Enhanced with chunking and persistence
  3. test_memory_chunking.py (new) - Comprehensive test suite

Code Quality

  • Type hints preserved where applicable
  • Docstrings updated with new behavior
  • Error handling added for robustness
  • Comments added for complex logic
  • Minimal code changes for easy review
  • No breaking changes to existing API

Checklist

  • Code follows project style guidelines
  • Self-review completed
  • Comments added for complex areas
  • Documentation updated (this PR description)
  • Backward compatibility maintained
  • Tests added/updated
  • All tests pass locally
  • No breaking changes
  • Dependencies documented in requirements.txt

Future Enhancements

Potential follow-ups (not in this PR):

  1. Parallel chunk embedding for faster processing
  2. Configurable chunk size and overlap
  3. Alternative chunking strategies (semantic, sentence-based)
  4. Embedding caching for repeated texts
  5. Compression for large persistent collections

Questions & Answers

Q: Why return a list instead of single embedding?
A: Consistency - both short and long texts now return lists. Makes API more predictable and easier to handle.

Q: Why average embeddings for multi-chunk queries?
A: Common approach in semantic search - represents overall meaning while avoiding bias toward any single chunk.

Q: Is persistent_dir required?
A: No - optional parameter. Defaults to in-memory storage for backward compatibility.

Q: Can I migrate existing in-memory data to persistent storage?
A: Not directly - would need to re-add situations with persistent_dir specified.

Q: Performance impact?
A: Negligible for short texts. Linear increase for long texts based on chunk count.

Acknowledgments

This implementation is based on patterns from:

  • BA2TradePlatform integration testing
  • LangChain best practices for text chunking
  • ChromaDB documentation for persistent storage

Ready for Review

Please let me know if you have any questions or suggestions!

@gemini-code-assist
Copy link

Summary of Changes

Hello @bmigette, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the FinancialSituationMemory class by addressing two key limitations: the inability to process long texts due to embedding model token limits and the lack of memory persistence. It introduces intelligent text chunking to handle arbitrarily long inputs and enables disk-based storage for ChromaDB, ensuring that valuable financial insights are retained across sessions. These improvements are fully backward compatible and include comprehensive testing.

Highlights

  • Text Chunking for Long Inputs: Implemented automatic text chunking using RecursiveCharacterTextSplitter to handle inputs exceeding the 8,192 token limit of OpenAI's embedding models, preventing API errors for long financial analysis texts.
  • Persistent ChromaDB Storage: Added an option to configure persistent local storage for ChromaDB, allowing FinancialSituationMemory to retain historical knowledge across process restarts, crucial for production trading systems.
  • Backward Compatibility: Ensured full backward compatibility, meaning existing code using FinancialSituationMemory will continue to function without changes, with new features being opt-in via optional parameters.
  • Enhanced Memory Management: Introduced symbol-specific collection naming for better organization and isolation of memories, and updated add_situations and get_memories to correctly handle and query chunked embeddings, including averaging for multi-chunk queries.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces significant improvements by adding text chunking to handle large inputs and enabling persistent storage for FinancialSituationMemory. The changes are well-structured and include comprehensive tests. My review focuses on improving code clarity, robustness, and maintainability by suggesting the use of more idiomatic library features, better exception handling, and code style enhancements. I've also noted the addition of documentation files that might be redundant.

Comment on lines +58 to +61
try:
self.situation_collection = self.chroma_client.get_collection(name=collection_name)
except:
self.situation_collection = self.chroma_client.create_collection(name=collection_name)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This try-except block uses a bare except:, which is dangerous as it catches system-exiting exceptions like SystemExit and KeyboardInterrupt. This logic can be made safer and more concise by using the idiomatic get_or_create_collection method provided by ChromaDB.

        self.situation_collection = self.chroma_client.get_or_create_collection(name=collection_name)

model=self.embedding, input=chunk
)
chunk_embeddings.append(response.data[0].embedding)
except Exception as e:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Catching a broad Exception can hide specific API errors that could be handled differently (e.g., rate limit errors could be retried). It's better to catch more specific exceptions from the openai library, such as openai.APIError or its subclasses, to make the error handling more robust.

Comment on lines +41 to +43
except Exception:
# Fallback: try without settings if there are compatibility issues
self.chroma_client = chromadb.PersistentClient(path=persistent_dir)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Catching a broad Exception can hide unexpected errors and make debugging harder. While the fallback is intended for compatibility issues, it would be more robust to at least log the exception that was caught so it's not silently ignored.

Suggested change
except Exception:
# Fallback: try without settings if there are compatibility issues
self.chroma_client = chromadb.PersistentClient(path=persistent_dir)
except Exception as e:
# Fallback: try without settings if there are compatibility issues
print(f"Warning: Failed to initialize PersistentClient with settings, falling back. Error: {e}")
self.chroma_client = chromadb.PersistentClient(path=persistent_dir)

"""
# text-embedding-3-small has a max context length of 8192 tokens
# Conservative estimate: ~3 characters per token for safety margin
max_chars = 24000 # ~8000 tokens * 3 chars/token

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The value 24000 is a "magic number". It's better to define it as a constant at the class or module level with a descriptive name (e.g., _MAX_EMBEDDING_CHARS). This improves readability and makes it easier to change if the model limits are updated in the future. This value is also used to calculate chunk_size on line 85.

return [response.data[0].embedding]

# Text is too long, use RecursiveCharacterTextSplitter
print(f"Text length {len(text)} exceeds limit, splitting into chunks for embedding")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using print for logging is not ideal for production code. It's better to use Python's logging module. This provides more control over log levels, formatting, and output streams (e.g., writing to a file, stderr, etc.). The PR description mentions this was a conscious choice to simplify, but it's a key point for maintainability. Similar print statements are on lines 92 and 103.

situation_embeddings = self.get_embedding(situation)

# Add each chunk as a separate document
for chunk_idx, embedding in enumerate(situation_embeddings):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The variable chunk_idx is defined but never used. It's a good practice to replace unused loop variables with an underscore (_) to signal that they are intentionally ignored.

            for _, embedding in enumerate(situation_embeddings):

@@ -0,0 +1,325 @@
# Pull Request: Add Chunking and Persistent Storage to FinancialSituationMemory

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This file, along with CHANGES_SUMMARY.md, appears to be a copy of the pull request description. While detailed documentation is great, checking it into the repository for each PR can lead to a lot of one-off documentation files that may become outdated. Consider whether this information would be better placed in a more central location like a project wiki, a CHANGELOG.md, or within the code's docstrings, to make it more maintainable in the long run.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant