What is the performance limit? #56
Replies: 2 comments
Yeah, good question actually. Traditional RAG struggles at scale mainly because of arbitrary chunking and semantic drift in large embedding spaces. PageIndex should handle this better thanks to its hierarchical structure: it maintains page-level context and has more explainable retrieval paths. Instead of searching through thousands of unrelated chunks, it navigates the document structure. That said, I haven't seen specific benchmarks at 10K+ docs yet. It would be interesting to test:
- Retrieval accuracy vs document count
- Query latency at scale
- Memory footprint
The architecture suggests it should scale better than traditional RAG, but only real-world testing would confirm that. Has anyone tried it with large document sets?
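For anyone who wants to run that comparison, here is a minimal sketch of a harness that records recall@k, average query latency, and peak index-build memory as the corpus grows. Everything in it is an assumption for illustration: `build_keyword_retriever` is a toy keyword-overlap stand-in, not PageIndex or any real RAG stack, and the `build(docs)` / `retrieve(query, k)` interface is hypothetical. You would swap in a wrapper around whichever system you are testing.

```python
import time
import tracemalloc
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: a retriever maps (query, k) to a ranked list of doc ids.
Retriever = Callable[[str, int], List[int]]


def build_keyword_retriever(docs: List[str]) -> Retriever:
    """Toy stand-in (NOT PageIndex): rank documents by query-term overlap."""
    index = [set(doc.lower().split()) for doc in docs]

    def retrieve(query: str, k: int) -> List[int]:
        terms = set(query.lower().split())
        ranked = sorted(range(len(index)), key=lambda i: -len(terms & index[i]))
        return ranked[:k]

    return retrieve


def benchmark(
    build: Callable[[List[str]], Retriever],
    docs: List[str],
    queries: List[Tuple[str, int]],
    k: int = 5,
) -> Dict[str, float]:
    """Measure recall@k, mean query latency, and peak memory while building."""
    tracemalloc.start()
    retriever = build(docs)
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    hits = 0
    start = time.perf_counter()
    for query, expected_id in queries:
        if expected_id in retriever(query, k):
            hits += 1
    elapsed = time.perf_counter() - start

    return {
        "docs": len(docs),
        "recall_at_k": hits / len(queries),
        "avg_latency_ms": 1000 * elapsed / len(queries),
        "peak_build_bytes": peak_bytes,
    }


def make_corpus(n: int) -> List[str]:
    """Synthetic corpus in which document i is the only one containing 'topic<i>'."""
    return [f"document {i} topic{i} filler text" for i in range(n)]


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        docs = make_corpus(n)
        queries = [(f"topic{i}", i) for i in range(0, n, max(1, n // 20))]
        print(benchmark(build_keyword_retriever, docs, queries))
```

The synthetic corpus only checks that the harness works. A meaningful comparison needs real documents and labelled query/answer pairs, with the same queries run against each system at each corpus size.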
UNSUBSCRIBE
…On Thu, Feb 12, 2026 at 08:46 Valdemar Stamm ***@***.***> wrote:
Yea good question actually. Traditional RAG struggles at scale mainly
because of arbitrary chunking and semantic drift in large embedding spaces.
PageIndex should handle this better due to its hierarchical structure - it
maintains page-level context and has more explainable retrieval paths.
Instead of searching through thousands of random chunks, it navigates the
document structure.
That said, I haven't seen specific benchmarks at 10K+ docs yet. Would be
interesting to test:
- Retrieval accuracy vs document count
- Query latency at scale
- Memory footprint
The architecture suggests it should scale better than traditional RAG, but
real-world testing would confirm. Has anyone tried it with large document
sets?
Traditional RAG systems see a drop in accuracy beyond 10,000 documents. Are there any such benchmark limits for PageIndex?