Tuning parameters
The following index parameters and database flags are used together to find the right balance of recall and QPS.
 
Each entry below gives the tuning parameter, its description, and its option type.
max_num_levels 
The maximum number of centroid levels of the K-means clustering tree.
 
 
- Two-level tree index: Set to 1 by default for a two-level tree (1 centroid level + bottom leaf level).
- Three-level tree index: Set to 2 by default for a three-level tree (2 centroid levels + bottom leaf level).
- Set the value to 2 if the number of vector rows exceeds 100 million.
- Set the value to 1 if the number of vector rows is less than 10 million.
- Set the value to either 1 or 2 if the number of vector rows lies between 10 million and 100 million: use 2 to optimize for index build time, or 1 to optimize for search recall.
Option type: Index creation (optional)
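The row-count guidance for max_num_levels can be sketched as a small helper. This is only an illustration: the function name `suggest_max_num_levels` and its `prefer` argument are hypothetical, and the treatment of exactly 10 million or 100 million rows is an assumption, since the guidance doesn't say which side those boundaries fall on.

```python
def suggest_max_num_levels(rows: int, prefer: str = "recall") -> int:
    """Suggest a max_num_levels value from the number of vector rows.

    Hypothetical helper that mirrors the thresholds described above.
    `prefer` is used only between 10 million and 100 million rows:
    "build_time" picks 2, "recall" picks 1.
    """
    if rows < 0:
        raise ValueError("rows must be non-negative")
    if prefer not in ("recall", "build_time"):
        raise ValueError("prefer must be 'recall' or 'build_time'")

    if rows < 10_000_000:
        return 1  # two-level tree
    if rows > 100_000_000:
        return 2  # three-level tree
    # Assumption: the boundary values themselves fall in this middle range.
    return 2 if prefer == "build_time" else 1


if __name__ == "__main__":
    for n in (5_000_000, 50_000_000, 500_000_000):
        print(n, suggest_max_num_levels(n), suggest_max_num_levels(n, "build_time"))
```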
num_leaves 
The number of partitions to apply to this index. The number of partitions you choose when creating an index affects index performance. Increasing the number of partitions for a fixed number of vectors creates a more fine-grained index, which improves recall and query performance. However, this comes at the cost of longer index creation times.
Since three-level trees build faster than two-level trees, you can increase num_leaves_value when creating a three-level tree index to achieve better performance.
- Two-level index: Set this value to any value between 1 and 1048576.
 For an index that balances fast index build and good search performance, use sqrt(ROWS) as a starting point, where ROWS is the number of vector rows. The number of vectors that each partition holds is calculated by
 ROWS/sqrt(ROWS) = sqrt(ROWS).
 Since a two-level tree index can be created on a dataset with fewer than 10 million vector rows, each partition holds fewer than sqrt(10M) vectors, which is approximately 3200 vectors. For optimal vector search quality, it's recommended to minimize the number of vectors in each partition. The recommended partition size is about 100 vectors per partition, so set num_leaves to ROWS/100. If you have 10 million vectors, you would set num_leaves to 100,000.
- Three-level index: Set this value to any value between 1 and 1048576.
 If you are unsure about selecting the exact value, use power(ROWS, 2/3) as a starting point, where ROWS is the number of vector rows. The number of vectors that each partition holds is calculated by
 ROWS/power(ROWS, 2/3) = power(ROWS, 1/3).
 Since a three-level tree index can be created on a dataset with more than 100 million vector rows, each partition holds more than power(100M, 1/3) vectors, which is approximately 465 vectors. For optimal vector search quality, it's recommended to minimize the number of vectors in each partition. The recommended partition size is about 100 vectors per partition, so set num_leaves to ROWS/100. If you have 100 million vectors, you would set num_leaves to 1 million.
Option type: Index creation (required)
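The num_leaves starting points described for two-level and three-level indexes can be worked out with a few lines of arithmetic. The sketch below is illustrative only: the names `num_leaves_starting_points`, `clamp_num_leaves`, and `tree_levels` are hypothetical, and rounding to the nearest integer is an assumption. Note that `tree_levels` counts the leaf level (2 or 3), so it is not the same number as max_num_levels.

```python
import math

MAX_NUM_LEAVES = 1_048_576  # upper bound of the allowed range for num_leaves


def clamp_num_leaves(value: float) -> int:
    """Round to an integer (an assumption) and keep it within 1..1048576."""
    return max(1, min(MAX_NUM_LEAVES, round(value)))


def num_leaves_starting_points(rows: int, tree_levels: int) -> dict:
    """Return two candidate num_leaves values for a given row count.

    "balanced": sqrt(ROWS) for a two-level tree, power(ROWS, 2/3) for a
                three-level tree.
    "quality":  ROWS/100, which targets about 100 vectors per partition.
    """
    if rows < 1:
        raise ValueError("rows must be at least 1")
    if tree_levels == 2:
        balanced = math.sqrt(rows)
    elif tree_levels == 3:
        balanced = rows ** (2 / 3)
    else:
        raise ValueError("tree_levels must be 2 or 3")
    return {
        "balanced": clamp_num_leaves(balanced),
        "quality": clamp_num_leaves(rows / 100),
    }


if __name__ == "__main__":
    print(num_leaves_starting_points(10_000_000, 2))
    print(num_leaves_starting_points(100_000_000, 3))
```

For very large tables, ROWS/100 exceeds the allowed maximum, so the helper caps the result at 1048576.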
quantizer 
The type of quantizer you want to use for the K-means tree. The default value is SQ8, which provides better query performance with minimal recall loss (typically less than 1-2%). Set it to FLAT if a recall of 99% or higher is required.
Option type: Index creation (optional)
scann.enable_inline_filtering 
Enables inline filtering support that queries your data and applies filters directly within a vector similarity search operation. These vector similarity queries use filters on the same database tables and complete filter evaluation while computing the distance for nearest neighbor identification. This option is disabled by default.
To enable inline filtering, set this parameter to true. If you observe deterioration in performance, then set it to false. This option is available in Preview.
Option type: Query runtime (optional)
scann.enable_pca 
Enables Principal Component Analysis (PCA), a dimension reduction technique used to automatically reduce the size of the embedding when possible. This option is enabled by default.
Set to false if you observe deterioration in recall.
Option type: Index creation (optional)
scann.num_leaves_to_search 
This database flag controls the absolute number of leaves, or partitions, to search, which lets you trade off between recall and QPS. The default value is 1% of the value set in num_leaves.
A higher value results in better recall but lower QPS. Similarly, a lower value results in lower recall but higher QPS.
Option type: Query runtime (optional)
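Because the default for scann.num_leaves_to_search is defined relative to num_leaves, it can help to compute concrete values before experimenting. The sketch below is an illustration under stated assumptions: both function names are hypothetical, the integer rounding and the minimum of 1 are assumptions (how fractional results are handled isn't specified here), and the candidate percentages are arbitrary starting points for a recall-versus-QPS sweep.

```python
def default_num_leaves_to_search(num_leaves: int) -> int:
    """Approximate the default: 1% of num_leaves.

    Assumption: floor to an integer and never go below 1.
    """
    if num_leaves < 1:
        raise ValueError("num_leaves must be at least 1")
    return max(1, num_leaves // 100)


def sweep_candidates(num_leaves: int, percents=(0.5, 1, 2, 5)) -> list:
    """Return sorted, de-duplicated candidate values to try while tuning.

    Each candidate is a percentage of num_leaves, kept within 1..num_leaves.
    """
    if num_leaves < 1:
        raise ValueError("num_leaves must be at least 1")
    values = {
        max(1, min(num_leaves, int(num_leaves * p / 100)))
        for p in percents
    }
    return sorted(values)


if __name__ == "__main__":
    print(default_num_leaves_to_search(100_000))
    print(sweep_candidates(100_000))
```

Measuring recall and QPS at each candidate, from lowest to highest, shows where additional leaves stop paying off for your workload.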
scann.pre_reordering_num_neighbors 
This database flag, when set, specifies the number of candidate neighbors to consider during the reordering stages after the initial search identifies a set of candidates. Set this parameter to a value higher than the number of neighbors you want the query to return.
A higher value results in better recall, but a lower QPS. Set this value to 0 to disable reordering. The default is 0 if PCA is not enabled during index creation. Otherwise, the default is 50 x K, where K is the LIMIT specified in the query.
Option type: Query runtime (optional)
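The default rule for scann.pre_reordering_num_neighbors and the advice on choosing an explicit value can be expressed as a short sketch. Both function names are hypothetical and exist only to illustrate the arithmetic described above.

```python
def default_pre_reordering_num_neighbors(limit_k: int, pca_enabled: bool) -> int:
    """Default described above: 50 x K with PCA, otherwise 0 (reordering off)."""
    if limit_k < 1:
        raise ValueError("limit_k must be at least 1")
    return 50 * limit_k if pca_enabled else 0


def is_reasonable_override(value: int, limit_k: int) -> bool:
    """Check an explicit setting against the guidance above.

    0 disables reordering; any other value should be higher than the
    number of neighbors the query returns (its LIMIT).
    """
    return value == 0 or value > limit_k


if __name__ == "__main__":
    print(default_pre_reordering_num_neighbors(10, pca_enabled=True))
    print(is_reasonable_override(5, limit_k=10))
```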
scann.num_search_threads 
The number of searcher threads for multi-thread search. Using more than one thread for ScaNN ANN search can help reduce single-query latency in latency-sensitive applications. This setting doesn't improve single-query latency if the database is already CPU-bound. The default value is 2.
Option type: Query runtime (optional)

