Add cross-chunk context awareness to prevent information loss during document chunking

## **Description:**

> **⚠️ AI-Generated Issue Disclaimer**: This issue was identified and generated using generative AI tools. The problem analysis and proposed solutions have not been manually tested or verified. Please validate the issue description and proposed solutions before implementation.

### **Problem Statement**
The current `_annotate_documents_single_pass` implementation processes each document chunk independently, with no context from previous chunks. Anything that depends on an earlier chunk is lost, which degrades extraction quality, particularly for:

- **Coreference resolution** (pronouns like "she", "he", "it")
- **Entity disambiguation** (partial names in later chunks)
- **Cross-chunk relationships** (entities and relationships spanning multiple chunks)
- **Context-dependent extractions** (entities that only make sense with full context)

### **Current Behavior**
```python
# Each chunk is processed in isolation
for text_chunk in batch:
    batch_prompts.append(
        self._prompt_generator.render(
            question=text_chunk.chunk_text,  # Only current chunk
            additional_context=text_chunk.additional_context,  # Only doc-level context
        )
    )
```

### **Example Problem**
**Document**: `"Dr. Sarah Johnson is a cardiologist at Mayo Clinic. She specializes in heart surgery. Dr. Johnson has 15 years of experience."`

**Chunk 1**: `"Dr. Sarah Johnson is a cardiologist at Mayo Clinic."`
- Extracts: `{"name": "Dr. Sarah Johnson", "profession": "cardiologist", "hospital": "Mayo Clinic"}`

**Chunk 2**: `"She specializes in heart surgery. Dr. Johnson has 15 years of experience."`
- Extracts: `{"specialization": "heart surgery", "experience": "15 years"}`
- **Lost**: Connection between "She"/"Dr. Johnson" and "Dr. Sarah Johnson"

### **Proposed Solutions**

#### **Option 1: Sliding Window Context**
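This option is not spelled out above, so the following is only a minimal sketch of one possible reading: carry the tail of the previous chunk forward as extra context for the current one. The function name `with_previous_context` and the `window_chars` parameter are hypothetical and not part of the existing code.

```python
def with_previous_context(chunks, window_chars=200):
    """Pair each chunk with the trailing text of the chunk before it.

    Hypothetical helper for illustration; not part of the library.
    Returns a list of (context, chunk) tuples. The first chunk has
    an empty context.
    """
    previous = ""
    paired = []
    for chunk in chunks:
        # Guard against window_chars == 0: previous[-0:] would return
        # the whole string rather than an empty one.
        context = previous[-window_chars:] if window_chars > 0 else ""
        paired.append((context, chunk))
        previous = chunk
    return paired


chunks = [
    "Dr. Sarah Johnson is a cardiologist at Mayo Clinic.",
    "She specializes in heart surgery. Dr. Johnson has 15 years of experience.",
]
for context, chunk in with_previous_context(chunks):
    print(repr(context), "->", repr(chunk))
```

The context string could then be supplied alongside the chunk when rendering the prompt, for example through `additional_context`, though whether that field is the right place for it would need to be checked against the prompt template.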
#### **Option 2: Entity Tracking**
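No details are given for this option, so this is a rough sketch under assumptions: keep a registry of full entity names seen in earlier chunks and map partial mentions in later chunks back to them. `EntityTracker` and its methods are hypothetical names, and the word-overlap matching rule is a deliberately simple stand-in for real entity linking.

```python
class EntityTracker:
    """Remember full names from earlier chunks (hypothetical helper)."""

    def __init__(self):
        self._full_names = []

    def register(self, name):
        """Record a full name the first time it is seen."""
        if name not in self._full_names:
            self._full_names.append(name)

    def resolve(self, mention):
        """Return the first registered name containing every word of
        the mention, or the mention unchanged if nothing matches."""
        mention_tokens = mention.split()
        if not mention_tokens:
            return mention
        for full_name in self._full_names:
            full_tokens = full_name.split()
            if all(token in full_tokens for token in mention_tokens):
                return full_name
        return mention


tracker = EntityTracker()
tracker.register("Dr. Sarah Johnson")      # seen in chunk 1
print(tracker.resolve("Dr. Johnson"))      # partial name from chunk 2
```

This only covers partial names. Pronouns such as "She" would still need a separate mechanism.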
#### **Option 3: Overlapping Chunks**
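As an illustration only, since the option is not detailed here: chunks could share one or more sentences with their neighbors so that the antecedent of a pronoun is more likely to appear in the same chunk. The function `overlapping_chunks` and its `size` and `overlap` parameters are hypothetical, and the sketch assumes the text has already been split into sentences.

```python
def overlapping_chunks(sentences, size=2, overlap=1):
    """Group sentences into chunks of `size`, where consecutive chunks
    share `overlap` sentences. Hypothetical helper for illustration."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(sentences), step):
        chunks.append(" ".join(sentences[start:start + size]))
        # Stop once a chunk has reached the final sentence, so no
        # trailing chunk made only of repeated sentences is emitted.
        if start + size >= len(sentences):
            break
    return chunks


sentences = [
    "Dr. Sarah Johnson is a cardiologist at Mayo Clinic.",
    "She specializes in heart surgery.",
    "Dr. Johnson has 15 years of experience.",
]
for chunk in overlapping_chunks(sentences):
    print(chunk)
```

The trade-off is that overlapping text is processed twice, so duplicate extractions from the shared sentences would have to be deduplicated afterwards.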
#### **Option 4: Post-Processing Coreference Resolution**
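Actual coreference resolution would need a dedicated model or library, which this issue does not specify. The sketch below only shows where a post-processing pass could sit, using a deliberately naive rule that is not real coreference resolution: an extraction without a `name` field is folded into the record that precedes it. The function name and the record shape (plain dicts, as in the example above) are assumptions.

```python
def merge_orphan_extractions(extractions):
    """Fold records lacking a "name" into the preceding record.

    Naive heuristic for illustration only. Existing keys in the
    preceding record are never overwritten.
    """
    merged = []
    for record in extractions:
        if "name" in record or not merged:
            # Start a new record (copy so the input is not mutated).
            merged.append(dict(record))
        else:
            for key, value in record.items():
                merged[-1].setdefault(key, value)
    return merged


per_chunk = [
    {"name": "Dr. Sarah Johnson", "profession": "cardiologist", "hospital": "Mayo Clinic"},
    {"specialization": "heart surgery", "experience": "15 years"},
]
print(merge_orphan_extractions(per_chunk))
```

This rule would mis-attribute fields whenever a chunk refers to a different entity than the one before it, so it should be read as a placeholder for a proper resolver.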



Add cross-chunk context awareness to prevent information loss during document chunking #230
