CNDB-15640: Determine if vectors are unit length at insert (#2059)
Fixes: riptano/cndb#15640
To lay the groundwork for Fused ADC, I want to refactor some of the
PQ/BQ logic. The unit-length computation needs to move, so I decided to
pull it out into its own PR.
The core idea is that:
* some models are documented to provide unit length vectors, and in
those cases, we should skip the computational check
* otherwise, we should check at runtime until we hit a non-unit-length
vector, at which point we can skip further checks and configure the
`writePQ` method accordingly (see the sketch after this list)
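A minimal sketch of the "sticky" runtime check, assuming a standalone tracker class; the names (`UnitLengthTracker`, `observe`, `unitVectors`) and the epsilon tolerance are illustrative, not the actual CNDB code:

```java
/**
 * Tracks whether every vector seen so far is unit length. Once a non-unit
 * vector shows up, the flag flips permanently and the per-vector check is
 * skipped from then on. (Illustrative only; names do not match the real code.)
 */
final class UnitLengthTracker
{
    private static final float EPSILON = 1e-4f; // assumed tolerance for float rounding

    private final boolean documentedUnitLength; // model is documented as normalized
    private volatile boolean allUnitLength = true; // true until proven otherwise

    UnitLengthTracker(boolean documentedUnitLength)
    {
        this.documentedUnitLength = documentedUnitLength;
    }

    /** Call on every inserted vector; cheap no-op once a non-unit vector has been seen. */
    void observe(float[] vector)
    {
        if (documentedUnitLength || !allUnitLength)
            return; // nothing left to learn

        double sumSq = 0;
        for (float v : vector)
            sumSq += (double) v * v;

        if (Math.abs(Math.sqrt(sumSq) - 1.0) > EPSILON)
            allUnitLength = false; // sticky: never re-checked
    }

    /** Whether the PQ write path can assume unit vectors (e.g. cosine == dot product). */
    boolean unitVectors()
    {
        return documentedUnitLength || allUnitLength;
    }
}
```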
(I asked ChatGPT to provide supporting evidence for the config changes
proposed in this PR. Here is its generated description.)
Quick rundown of which models spit out normalized vectors (so cosine ==
dot product, etc.):
* **OpenAI (ada-002, v3-small, v3-large)** → already normalized. [OpenAI
FAQ](https://platform.openai.com/docs/guides/embeddings/what-are-embeddings)
literally says embeddings are unit-length.
* **BERT** → depends. The SBERT “-cos-” models add a [`Normalize`
layer](https://www.sbert.net/docs/package_reference/layers.html#normalize)
so they’re fine; vanilla BERT doesn’t.
* **Google Gecko** → normalized out of the box per [Vertex AI
docs](https://cloud.google.com/vertex-ai/docs/generative-ai/embeddings/get-text-embeddings).
* **NVIDIA QA-4** → nothing in the [NVIDIA NIM model
card](https://docs.api.nvidia.com/nim/reference/nvidia-embed-qa-4) about
normalization, so assume *not* normalized and handle it yourself.
* **Cohere v3** → normalization is not explicitly stated in their [API
docs](https://docs.cohere.com/docs/cohere-embed), so assume *not* normalized.
TL;DR: OpenAI + Gecko are definitely safe to treat as unit length;
Cohere and NVIDIA lack documentation, and vanilla BERT isn't normalized,
so those need the runtime check or manual normalization.
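For the "normalize it yourself" case, a hedged sketch of what manual L2 normalization looks like (a hypothetical helper, not part of this PR):

```java
/** Returns an L2-normalized copy of the input vector. Illustrative helper only. */
static float[] normalize(float[] vector)
{
    double sumSq = 0;
    for (float v : vector)
        sumSq += (double) v * v;
    double norm = Math.sqrt(sumSq);

    float[] out = new float[vector.length];
    if (norm == 0)
        return out; // zero vector: return zeros rather than divide by zero

    for (int i = 0; i < vector.length; i++)
        out[i] = (float) (vector[i] / norm);
    return out; // once vectors are unit length, cosine similarity reduces to dot product
}
```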