# Querying Multiple Documents Returns Error #113

## Issue Summary
When I ask a question across multiple uploaded documents through the PageIndex chat completions API, the response says the documents are not accessible instead of answering from them.

## Setup Information

### API Key
```
PageIndex-API-KEY
```

### Uploaded Documents
I have successfully uploaded 5 documents to PageIndex:
1. `pi-cmlqm8gpx00690io97n36u5ji`
2. `pi-cmlqm8a0900670io9iol2h3x5`
3. `pi-cmlqm84aa010h0fo98sy9a8ls`
4. `pi-cmlqm7c9500fd0lo90o3ot5tm`
5. `pi-cmlqm71i800xd08o9ma5a57o0`

## Code Used to Upload Documents 
```python
import requests

api_key = "PageIndex-API-KEY"
file_path = "./documents/Western Power Distribution RIIO-ED2 Business Plan 2023-2028: Strategic Vision, Commitments, and Investment for a Net Zero Energy Future.pdf"

with open(file_path, "rb") as file:
    response = requests.post(
        "https://api.pageindex.ai/doc/",
        headers={"api_key": api_key},
        files={"file": file}
    )
print(f"Status Code: {response.status_code}")
print(f"Response: {response.text}")
```

## Code Used to Verify Uploaded Documents 
```python
import re

import requests

api_key = "PageIndex-API-KEY"

response = requests.get(
    "https://api.pageindex.ai/docs",
    headers={"api_key": api_key},
    params={"limit": 10, "offset": 0}
)
print("Status Code:", response.status_code)

data = response.json()
print("\nAll Documents:")
for doc in data.get("documents", []):
    print(f"{doc['id']} - {doc['name']}")

# Find duplicates by base filename (without version suffix like _5.pdf)

base_names = {}
for doc in data.get("documents", []):
    name = doc['name']
    # Remove version suffix (e.g., "_5.pdf" -> ".pdf")
    base_name = re.sub(r'_\d+\.pdf$', '.pdf', name)
    if base_name not in base_names:
        base_names[base_name] = []
    base_names[base_name].append((doc['id'], name))

# Print duplicates
print("\nDuplicates (by base filename):")
duplicates_found = False
duplicate_ids = []

for base_name, docs in base_names.items():
    if len(docs) > 1:
        duplicates_found = True
        print(f"\n{base_name}")
        for doc_id, full_name in docs:
            print(f"  - {doc_id} ({full_name})")
        # Keep the first ID in the group (listing order; not necessarily the highest version)
        duplicate_ids.append(docs[0][0])

if not duplicates_found:
    print("No duplicates found")
else:
    print("\n\nDuplicate IDs to keep (one from each group):")
    print(duplicate_ids)
```
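The verification script only requests the first 10 entries (`limit=10`, `offset=0`), so with several re-uploads of the same file some documents may not show up in a single page. Below is a minimal sketch of walking the listing page by page. `list_all_documents` and `fetch_page` are hypothetical names, and the stopping rule (a page shorter than `limit` is the last one) is an assumption about the endpoint, not documented behavior.

```python
def list_all_documents(fetch_page, limit=10):
    """Collect every document by walking the listing with limit/offset.

    fetch_page(limit, offset) is a caller-supplied (hypothetical) callable
    that returns the parsed JSON of one listing request.
    Assumption: a page with fewer than `limit` entries is the last one.
    """
    documents = []
    offset = 0
    while True:
        page = fetch_page(limit, offset).get("documents", [])
        documents.extend(page)
        if len(page) < limit:
            break
        offset += limit
    return documents
```

With the script above, `fetch_page` could be `lambda limit, offset: requests.get("https://api.pageindex.ai/docs", headers={"api_key": api_key}, params={"limit": limit, "offset": offset}).json()`.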

## Code Used to Query Documents (Tokens.py)
```python
import requests

# Pass several document IDs through the doc_id field (modelled on common RAG API patterns)
response = requests.post(
    "https://api.pageindex.ai/chat/completions",
    headers={
        "api_key": "PageIndex-API-KEY",
        "Content-Type": "application/json"
    },
    json={
        "messages": [
            {"role": "user", "content": "According to Electricity North West's 2023-2028 Business Plan, how much investment will be made to help customers connect low-carbon technologies?"}
        ],
        "doc_id": ['pi-cmlqm8gpx00690io97n36u5ji', 'pi-cmlqm8a0900670io9iol2h3x5', 'pi-cmlqm84aa010h0fo98sy9a8ls', 'pi-cmlqm7c9500fd0lo90o3ot5tm', 'pi-cmlqm71i800xd08o9ma5a57o0']
    }
)

# Print the status first so it is visible even if the body is not valid JSON
print("Status code:", response.status_code)
result = response.json()
print("Full response:", result)

# Check if the response contains 'choices'
if "choices" in result:
    print(result["choices"][0]["message"]["content"])
else:
    print("Error: Response does not contain 'choices' field")
    if "error" in result:
        print("Error details:", result["error"])
```
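To narrow down whether one particular document triggers the problem, the same question could be sent once per document instead of once for the whole list. This is only a sketch: `build_single_doc_payloads`, `query_each`, and `post` are hypothetical names, and it assumes the endpoint also accepts a single string for `doc_id`, which the code above does not confirm.

```python
def build_single_doc_payloads(question, doc_ids):
    """Return one request body per document ID.

    Assumption: "doc_id" may be a single string; the query above
    only shows the list form.
    """
    return [
        {
            "messages": [{"role": "user", "content": question}],
            "doc_id": doc_id,
        }
        for doc_id in doc_ids
    ]


def query_each(post, question, doc_ids):
    """Send one request per document and map each ID to (status, body text).

    post(payload) is a caller-supplied (hypothetical) callable that wraps
    requests.post with the URL and headers used in the query above.
    """
    results = {}
    for payload in build_single_doc_payloads(question, doc_ids):
        response = post(payload)
        results[payload["doc_id"]] = (response.status_code, response.text)
    return results
```

Comparing the five per-document results with the combined query would show whether the "not accessible" message is tied to a specific ID or to the list form of `doc_id`.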

## Actual Response Received

I'll help you find the investment amount for connecting low-carbon technologies in Electricity North West's business plan. {"doc_name": "Electricity North West's 2023-2028 Business Plan to Lead the North West to Net Zero_6.pdf"} It seems the documents are not currently accessible in the system. The Electricity North West document you're asking about appears to have been removed or is no longer available.

To answer your question, I would need you to re-upload the document using a PDF URL. Do you have access to the Electricity North West 2023-2028 Business Plan PDF that I can process?
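The response names a file ending in `_6.pdf`, so one possible check is whether that name and the five queried IDs actually appear in the `/docs` listing. The helpers below are a sketch with hypothetical names; they only assume the `id` and `name` fields already used in the verification script.

```python
def split_by_presence(listed_documents, queried_ids):
    """Split queried IDs into those present in the listing and those absent."""
    listed_ids = {doc["id"] for doc in listed_documents}
    present = [doc_id for doc_id in queried_ids if doc_id in listed_ids]
    missing = [doc_id for doc_id in queried_ids if doc_id not in listed_ids]
    return present, missing


def ids_for_name(listed_documents, doc_name):
    """Return the IDs of every listed document with exactly this name."""
    return [doc["id"] for doc in listed_documents if doc["name"] == doc_name]
```

If `ids_for_name` returns an ID that is not among the five passed in `doc_id`, the model may be looking up a different (possibly older) upload than the ones queried.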

