Skip to content

Conversation

@abrookins
Copy link
Collaborator

In #429, we found and fixed an issue with LangCache attribute values, which is that some characters are not allowed (commas) and some are accepted but silently fail during retrieval (slashes). Our solution was to encode these characters when we persist them and at query time. This PR decodes the values in retrieved data as well.

Copilot AI review requested due to automatic review settings November 21, 2025 16:55
Copilot finished reviewing on behalf of abrookins November 21, 2025 16:57
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR completes the round-trip encoding/decoding implementation for LangCache attribute values by adding decoding on retrieval. Previously (#429), encoding was added to handle LangCache service restrictions on commas and slashes in attribute values, but the encoded values were being exposed to callers. This PR ensures callers receive the original, unencoded values.

Key changes:

  • Added decoding functions to translate fullwidth commas and division slashes back to ASCII
  • Modified _convert_to_cache_hit to decode attributes before exposing them in metadata
  • Enhanced unit and integration tests to verify round-trip behavior

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
redisvl/extensions/cache/llm/langcache.py Added decode translation table and functions; integrated decoding into _convert_to_cache_hit method
tests/unit/test_langcache_semantic_cache.py Added test assertions for round-trip decode behavior and updated test data with encoded values
tests/integration/test_langcache_semantic_cache_integration.py Added assertion to verify metadata contains original decoded values
Comments suppressed due to low confidence (2)

tests/unit/test_langcache_semantic_cache.py:340

  • This method definition is incorrectly indented inside the test_check_with_attributes method (starting at line 254). It should be at the same indentation level as test_check_with_attributes to be a separate test method in the TestLangCacheSemanticCache class. Remove 4 spaces of indentation from lines 314-340.
        def test_store_with_empty_metadata_does_not_send_attributes(
            self, mock_langcache_client
        ):
            """Empty metadata {} should not be forwarded as attributes to the SDK."""
            _, mock_client = mock_langcache_client

            mock_response = MagicMock()
            mock_response.entry_id = "entry-empty"
            mock_client.set.return_value = mock_response

            cache = LangCacheSemanticCache(
                name="test",
                server_url="https://api.example.com",
                cache_id="test-cache",
                api_key="test-key",
            )

            entry_id = cache.store(
                prompt="Q?",
                response="A",
                metadata={},  # should be ignored
            )

            assert entry_id == "entry-empty"
            # Ensure attributes kwarg was NOT sent when metadata is {}
            _, call_kwargs = mock_client.set.call_args
            assert "attributes" not in call_kwargs

tests/unit/test_langcache_semantic_cache.py:374

  • This method definition is also incorrectly indented inside the test_check_with_attributes method (starting at line 254). It should be at the same indentation level as test_check_with_attributes to be a separate test method in the TestLangCacheSemanticCache class. Remove 4 spaces of indentation from lines 342-374.
        def test_check_with_empty_attributes_does_not_send_attributes(
            self, mock_langcache_client
        ):
            """Empty attributes {} should not be forwarded to the SDK search call."""
            _, mock_client = mock_langcache_client

            mock_entry = MagicMock()
            mock_entry.model_dump.return_value = {
                "id": "e1",
                "prompt": "Q?",
                "response": "A",
                "similarity": 1.0,
                "created_at": 0.0,
                "updated_at": 0.0,
                "attributes": {},
            }
            mock_response = MagicMock()
            mock_response.data = [mock_entry]
            mock_client.search.return_value = mock_response

            cache = LangCacheSemanticCache(
                name="test",
                server_url="https://api.example.com",
                cache_id="test-cache",
                api_key="test-key",
            )

            results = cache.check(prompt="Q?", attributes={})  # should be ignored
            assert results and results[0]["entry_id"] == "e1"

            # Ensure attributes kwarg was NOT sent when attributes is {}
            _, call_kwargs = mock_client.search.call_args
            assert "attributes" not in call_kwargs

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@bsbodden bsbodden self-assigned this Nov 21, 2025
Copy link
Collaborator

@bsbodden bsbodden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Send it!

@abrookins abrookins merged commit a1b6a04 into main Nov 21, 2025
44 checks passed
abrookins added a commit that referenced this pull request Nov 21, 2025
Following on #437, we find and encode/decode more problematic characters
used within attribute values in LangCache. This PR also adds a
comprehensive punctuation test for other characters.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants