AlloyDB ScaNN Index reference

This page provides a detailed reference for the tuning parameters available for Scalable Nearest Neighbors (ScaNN) indexes in AlloyDB for PostgreSQL.

For a step-by-step tutorial on how to implement vector search from start to finish, see the guide on how to Perform a vector search.

Tuning parameters

The following index and query parameters are used to find the right balance of recall and queries per second (QPS).

Each entry lists the tuning parameter, its description, and its option type.

mode

Defines the ScaNN index as either automatically tuned or manually tuned. The available options are as follows:

AUTO: automatically tuned index
MANUAL: manually tuned index

For more information, see Create a ScaNN index.

Index creation (optional)

max_num_levels

Maximum number of centroid levels of the K-means clustering tree. For guidance on setting this value, see Tune a ScaNN index.

The available values are as follows:

1: Two-level ScaNN index
2: Three-level ScaNN index
3: Four-level ScaNN index

Index creation (optional)

num_leaves

Number of partitions to apply to this index. The maximum value is 30000000. For more information on choosing this value, see Tune a ScaNN index and Best practices for tuning ScaNN indexes in AlloyDB.

Index creation (required for manually tuned indexes)
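The constraints stated so far for mode, max_num_levels, and num_leaves can be sketched as a small pre-flight check. This is an illustrative helper, not part of AlloyDB: the function name check_index_options is hypothetical, and the lower bound of 1 for num_leaves is an assumption, since this reference states only the maximum.

```python
MAX_NUM_LEAVES = 30_000_000  # maximum value stated in this reference


def check_index_options(mode, num_leaves=None, max_num_levels=None):
    """Return a list of problems found in a set of index-creation options.

    Hypothetical helper. Only rules stated in this reference are checked,
    plus an assumed lower bound of 1 for num_leaves.
    """
    problems = []
    if mode not in ("AUTO", "MANUAL"):
        problems.append("mode must be AUTO or MANUAL")
    if mode == "MANUAL" and num_leaves is None:
        problems.append("num_leaves is required for manually tuned indexes")
    if num_leaves is not None and not 1 <= num_leaves <= MAX_NUM_LEAVES:
        problems.append("num_leaves must be between 1 and 30000000")
    if max_num_levels is not None and max_num_levels not in (1, 2, 3):
        problems.append("max_num_levels must be 1, 2, or 3")
    return problems


# A manually tuned, three-level index with 1000 partitions passes every check.
print(check_index_options("MANUAL", num_leaves=1000, max_num_levels=2))
```

An empty list means that the options are consistent with the limits on this page. It does not mean that the values are well tuned for a given dataset.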

quantizer

The type of quantizer to use for the K-means tree. The default value is SQ8, which provides better query performance with minimal recall loss (typically less than 1-2%).

Set this parameter to FLAT if a recall of 99% or higher is required.

Index creation (optional)
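The guidance above reduces to a single threshold, sketched here for illustration. The function name suggest_quantizer is hypothetical, and expressing recall as a fraction between 0 and 1 is an assumption of this sketch.

```python
def suggest_quantizer(target_recall: float) -> str:
    """Pick a quantizer from the guidance on this page.

    Hypothetical helper: FLAT when 99% recall or higher is required,
    otherwise the default, SQ8. Recall is given as a fraction (0.0 to 1.0).
    """
    if not 0.0 <= target_recall <= 1.0:
        raise ValueError("target_recall must be between 0.0 and 1.0")
    return "FLAT" if target_recall >= 0.99 else "SQ8"


print(suggest_quantizer(0.95))
print(suggest_quantizer(0.995))
```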

scann.enable_pca

Enables Principal Component Analysis (PCA), a dimension reduction technique that automatically reduces the size of the embedding when possible. This option is enabled by default.

Set this parameter to false if you observe deterioration in recall.

Index creation (optional)

auto_maintenance

Automatically maintains a ScaNN index so that, as your dataset grows, AlloyDB analyzes and updates centroids and splits large outlier partitions. This improves QPS and search results without manual intervention. For more information, see Maintain indexes automatically.

Index creation (optional)

scann.pct_leaves_to_search

Automatically manages the number of partitions that a vector index searches. For more information, see Search percentage of partitions.

scann.num_leaves_to_search

Absolute number of partitions to search. This lets you trade off between query recall and QPS. The default value is 1% of num_leaves.

Higher values result in better query recall but lower QPS. Lower values result in worse query recall but better QPS.

Query runtime (optional)
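As a rough illustration of the default described above, the following sketch computes 1% of num_leaves and renders a standard PostgreSQL SET statement for the session. Both function names are hypothetical. The rounding behavior (round down, never below 1) is an assumption, because this reference doesn't specify how fractional results are handled.

```python
def default_num_leaves_to_search(num_leaves: int) -> int:
    """Approximate the documented default of 1% of num_leaves.

    Assumption: the result is rounded down and is never less than 1.
    """
    if num_leaves < 1:
        raise ValueError("num_leaves must be at least 1")
    return max(1, num_leaves // 100)


def set_statement(num_leaves_to_search: int) -> str:
    """Render a session-level SET statement for the query parameter."""
    return f"SET scann.num_leaves_to_search = {int(num_leaves_to_search)};"


# Start from the default for a 5000-partition index, then double it
# to trade some QPS for higher recall.
baseline = default_num_leaves_to_search(5000)
print(set_statement(baseline * 2))
```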

scann.pre_reordering_num_neighbors

Specifies the number of candidate neighbors to consider during the reordering stage, after the initial search identifies a set of candidates. Set this parameter to a value higher than the number of neighbors you want the query to return. Higher values result in better recall but lower QPS.

The default is 0, which disables reordering. If PCA is enabled during index creation, the default is 50 x K, where K is the LIMIT specified in the query.

Query runtime (optional)
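The default rule above can be worked through with a short sketch. The helper names are hypothetical, and the check simply encodes the guidance that the value should be higher than the number of neighbors returned.

```python
def default_pre_reordering_num_neighbors(limit_k: int, pca_enabled: bool) -> int:
    """Documented default: 50 x K when PCA is enabled, otherwise 0 (reordering disabled)."""
    if limit_k < 1:
        raise ValueError("limit_k must be at least 1")
    return 50 * limit_k if pca_enabled else 0


def is_useful_override(value: int, limit_k: int) -> bool:
    """Check the guidance that the value should exceed the query's LIMIT."""
    return value > limit_k


# For a query with LIMIT 10 against an index built with PCA enabled:
print(default_pre_reordering_num_neighbors(10, pca_enabled=True))
```

With LIMIT 10 and PCA enabled, the default works out to 500 candidates. A manual override of 10 or fewer would not follow the guidance above.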

scann.num_search_threads

The number of searcher threads for multi-thread search. In latency-sensitive applications, using more than one thread for ScaNN ANN search can help reduce single-query latency. This setting doesn't improve single-query latency if the database is already CPU-bound. The default value is 2.

Query runtime (optional)

scann.satisfy_limit (Preview)

When set to relaxed_order, this database flag addresses insufficient recall. Insufficient recall occurs when a query's observed recall falls below the target recall, which is more likely when using filters. This setting helps achieve the target recall by allowing the vector scan to continue searching beyond the num_leaves_to_search limit until a sufficient number of results are found.

Query runtime (optional)

scann.max_pct_leaves_to_search (Preview)

This database flag sets an upper bound on the percentage of total leaves that a query can visit when scann.satisfy_limit is enabled. It is the only upper-bound GUC available and applies to both auto and manual search modes. It prevents the search from overshooting, which might significantly degrade performance.

You can set this parameter to any value from 0 to 100. The default value is 15%. The rationale for this default is that if a search needs to examine more than 15% of the leaves, the filter is likely selective enough that approximate nearest neighbor (ANN) search doesn't provide a benefit, and pre-filtering k-nearest neighbor (KNN) search is a more suitable choice.

Query runtime (optional)
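To make the upper bound concrete, the following sketch converts the percentage into an absolute number of leaves. The function name max_leaves_visited is hypothetical, and rounding down is an assumption, because this reference doesn't state how a fractional leaf count is handled.

```python
import math

DEFAULT_MAX_PCT_LEAVES = 15.0  # default stated in this reference


def max_leaves_visited(num_leaves: int, max_pct: float = DEFAULT_MAX_PCT_LEAVES) -> int:
    """Estimate the most leaves a query can visit under the percentage cap.

    Assumption: fractional results are rounded down.
    """
    if num_leaves < 1:
        raise ValueError("num_leaves must be at least 1")
    if not 0 <= max_pct <= 100:
        raise ValueError("max_pct must be from 0 to 100")
    return math.floor(num_leaves * max_pct / 100)


# With 2000 partitions and the default cap, a relaxed_order scan that starts
# at num_leaves_to_search = 20 could grow to at most this many leaves.
print(max_leaves_visited(2000))
```

Under these assumptions, an index with 2000 partitions and the default 15% cap visits at most 300 leaves per query, compared with a starting point of 20 leaves if num_leaves_to_search is left at 1% of num_leaves.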

What's next

Get started with vector embeddings using AlloyDB AI.
