Self-hosted latency benchmark request: tree generation stage is very slow on scanned PDFs #106

@Taooo-habitus

Description

I'm trying to benchmark the latency of self-hosted PageIndex, focusing specifically on the tree generation stage.

I'm using a local fork that incorporates concurrent request handling, but I would like to understand the baseline performance of the upstream self-hosted version for comparison, particularly regarding tree generation.
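For reference, the concurrency change in my fork follows roughly the pattern below. This is a simplified sketch only: `call_llm` and `process_pages` are illustrative names, not the actual PageIndex or Snowflake Cortex API.

```python
import asyncio

# Illustrative stand-in for the real LLM call (e.g. a Snowflake Cortex
# completion request); not the actual PageIndex code.
async def call_llm(prompt: str) -> str:
    await asyncio.sleep(0.1)  # simulate network/inference latency
    return f"summary of: {prompt[:30]}"

async def process_pages(prompts: list[str], max_concurrency: int = 128) -> list[str]:
    # Bound the number of in-flight requests with a semaphore instead of
    # issuing the page-level calls strictly one after another.
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded(prompt: str) -> str:
        async with sem:
            return await call_llm(prompt)

    return await asyncio.gather(*(bounded(p) for p in prompts))

# Example: 128 page-level prompts, up to 128 concurrent requests.
# results = asyncio.run(process_pages([f"page {i}" for i in range(128)]))
```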

Environment

  • Self-hosted deployment with concurrent LLM calls
  • OCR input: scanned/image-based PDF
  • Document length: 128 pages
  • Model backend: Snowflake Cortex (Claude Sonnet 4.5)

Observed Latency

  • Tree generation with concurrent requests (concurrency set to 128): approximately 125–140 seconds
  • Tree generation without concurrent requests (original implementation): approximately 400 seconds

I have also tested the same document through the official chat page, which completed in around 60 seconds end-to-end. I understand that the official framework may have implemented many more optimizations, but I'm still curious whether the stats above are expected.
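In case the measurement method matters: the numbers above are plain wall-clock timings around the tree-generation call, roughly like the sketch below (`generate_tree` is a placeholder name, not the actual PageIndex entry point).

```python
import time

def time_stage(label, fn, *args, **kwargs):
    # Wall-clock timer around a single pipeline stage.
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    print(f"{label}: {time.perf_counter() - start:.1f}s")
    return result

# Usage (generate_tree is hypothetical):
# tree = time_stage("tree generation", generate_tree, "scanned_doc.pdf")
```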
