I'm trying to benchmark the latency of self-hosted PageIndex, focusing specifically on the tree generation stage.
I'm using a local fork that adds concurrent request handling, but I'd like to understand the baseline performance of the upstream self-hosted version for comparison, particularly for tree generation.
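For context, the concurrency change in my fork is essentially a bounded fan-out over the per-node LLM calls, along these lines (a simplified sketch; `call_llm` and `summarize_nodes` are illustrative names, not upstream PageIndex APIs):

```python
import asyncio

CONCURRENCY = 128  # matches the setting used in the numbers below

async def call_llm(prompt: str) -> str:
    # Placeholder for the actual model request (see Environment below).
    await asyncio.sleep(0.1)
    return "summary"

async def summarize_nodes(prompts: list[str]) -> list[str]:
    # Fan out all per-node calls, never keeping more than CONCURRENCY in flight.
    sem = asyncio.Semaphore(CONCURRENCY)

    async def bounded(prompt: str) -> str:
        async with sem:
            return await call_llm(prompt)

    return await asyncio.gather(*(bounded(p) for p in prompts))
```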
Environment
- Self-hosted deployment with concurrent LLM calls
- OCR input: scanned/image-based PDF
- Document length: 128 pages
- Model backend: Snowflake Cortex (Claude Sonnet 4.5)
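For the backend, each request goes through Snowflake Cortex; a minimal version of that call looks roughly like this (the model identifier string is an assumption — check which models your account exposes). Since `Complete` is synchronous, the async wrapper in the sketch above would run it via `asyncio.to_thread`:

```python
from snowflake.cortex import Complete  # from the snowflake-ml-python package

def cortex_call(prompt: str) -> str:
    # Model identifier is an assumption; list the Cortex models available
    # in your account to confirm the exact string for Claude Sonnet 4.5.
    return Complete("claude-sonnet-4-5", prompt)
```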
Observed Latency
- Tree generation with concurrent requests (concurrency set to 128): approximately 125–140 seconds
- Tree generation without concurrent requests (original implementation): approximately 400 seconds
I have also run the same document through the official chat page, which completed in around 60 seconds end-to-end. I understand that the official hosted version may implement many more optimizations, but I'm still curious whether the numbers above are in the expected range.
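For reference, I'm timing just the tree generation stage, roughly like this (`build_tree` is a stand-in for the actual entry point, which differs between upstream and my fork):

```python
import time
from typing import Callable

def benchmark_stage(stage: Callable[[], object], runs: int = 3) -> list[float]:
    # Wall-clock a single pipeline stage in isolation across several runs.
    timings = []
    for i in range(runs):
        start = time.perf_counter()
        stage()
        timings.append(time.perf_counter() - start)
        print(f"run {i + 1}: {timings[-1]:.1f}s")
    return timings

# Usage, with build_tree as a stand-in for the tree generation entry point:
# benchmark_stage(lambda: build_tree("scanned_128_pages.pdf"))
```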