The ScaNN index uses tree-quantization-based indexing, in which indexes learn a search tree together with a quantization (or hashing) function. When you run a query, the search tree is used to prune the search space, while quantization is used to compress the index size. This pruning speeds up the scoring of the similarity—in other words, the distance—between the query vector and the database vectors.
To achieve both a high query-per-second rate (QPS) and a high recall with your nearest-neighbor queries, you must partition the tree of your ScaNN index in a way that is most appropriate to your data and your queries.
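As an illustration, the following sketch shows how you might create a ScaNN index and control the number of tree partitions. The table and column names are hypothetical, and the num_leaves index option is assumed from typical ScaNN index syntax; check the index reference for the exact options that your AlloyDB version supports.

```sql
-- Sketch: create a ScaNN index on a hypothetical products table.
-- The alloydb_scann extension provides the scann access method.
CREATE EXTENSION IF NOT EXISTS alloydb_scann;

-- num_leaves sets how many partitions (leaves) the search tree has.
-- More leaves gives finer pruning, but queries must probe enough
-- leaves to keep recall high.
CREATE INDEX product_embedding_idx
  ON products
  USING scann (embedding cosine)
  WITH (num_leaves = 1000);
```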
High-dimensional embedding models can retain much of their information at much lower dimensionality. For example, you can retain 90% of the information with only 20% of the embedding's dimensions. To speed up vector search on such datasets, the AlloyDB AI ScaNN index automatically performs dimension reduction on the indexed vectors using Principal Component Analysis (PCA), which further reduces the CPU and memory usage of the vector search. For more information, see scann.enable_pca.
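Because PCA is applied automatically, you typically only need to check or change the scann.enable_pca setting. The following sketch assumes the parameter can be read and set like a standard PostgreSQL parameter; depending on your configuration, it might instead need to be set as a database flag.

```sql
-- Inspect whether PCA-based dimension reduction is enabled.
SHOW scann.enable_pca;

-- Sketch: turn PCA off if you prefer to index and score the original,
-- full-dimensional vectors (at the cost of higher CPU and memory use).
SET scann.enable_pca = off;
```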
Because dimension reduction causes a minor recall loss in the index, the AlloyDB AI ScaNN index compensates by first performing a ranking step over a larger number of PCA-reduced vector candidates from the index. ScaNN then re-ranks those candidates by their original, full-dimensional vectors. For more information, see scann.pre_reordering_num_neighbors.
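As a sketch of how this re-ranking is tuned, the following example raises the number of candidates fetched before the re-ranking step; the value and the query are illustrative only.

```sql
-- Fetch more PCA-reduced candidates from the index before ScaNN
-- re-ranks them by the original vectors; larger values generally
-- raise recall at the cost of latency.
SET scann.pre_reordering_num_neighbors = 200;

-- Illustrative nearest-neighbor query (cosine distance operator
-- from pgvector); the vector literal is a placeholder.
SELECT id
FROM products
ORDER BY embedding <=> '[0.1, 0.2, 0.3]'
LIMIT 10;
```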
What's next
- Learn best practices for tuning ScaNN indexes.
- Get started with vector embeddings using AlloyDB AI.
- Learn more about the AlloyDB AI ScaNN index.