Optimizing IVF-PQ Performance With RAPIDS CuVS: Key Tuning Techniques

Tony Kim Jul 18, 2024 19:39

Learn how to optimize the IVF-PQ algorithm for vector search performance using RAPIDS cuVS, with practical tips on tuning hyper-parameters and improving recall.


The first part of this series presented an overview of the IVF-PQ algorithm, explaining how it builds on the IVF-Flat algorithm and uses Product Quantization (PQ) to compress the index and support larger datasets. This second part focuses on the practical side of tuning IVF-PQ performance, which is crucial for achieving good results, especially on billion-scale datasets.

Tuning Parameters for Index Building

IVF-PQ shares some parameters with IVF-Flat, such as the coarse-level indexing and search hyper-parameters. However, IVF-PQ introduces additional parameters that control compression. One of the critical parameters is n_lists, which determines the number of partitions (inverted lists) into which the input dataset is clustered. Search performance is influenced by the number of lists probed and their sizes. Experiments suggest that n_lists values in the range of 10K to 50K yield good performance across recall levels, though the optimum can vary with the dataset.
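
As a starting point, the snippet below is a minimal sketch of building an IVF-PQ index with an explicit n_lists value, assuming the cuVS Python API (cuvs.neighbors.ivf_pq with IndexParams and build) and a random CuPy dataset standing in for real data; the concrete values are illustrative, not recommendations.

```python
import cupy as cp
from cuvs.neighbors import ivf_pq

# Random data standing in for a real dataset (1M vectors, 96 features).
n_samples, n_features = 1_000_000, 96
dataset = cp.random.random_sample((n_samples, n_features), dtype=cp.float32)

# n_lists sets how many inverted lists (clusters) the dataset is partitioned into;
# values in the 10K-50K range were a good starting point in the experiments above.
build_params = ivf_pq.IndexParams(n_lists=10_000, metric="sqeuclidean")
index = ivf_pq.build(build_params, dataset)
```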

Another crucial parameter is pq_dim, which sets the dimensionality of the PQ-encoded vectors and therefore the compression ratio. A good tuning technique is to start at one quarter of the number of features in the dataset and increase in steps. Figure 2 in the original blog post illustrates significant drops in QPS at certain settings, which can be attributed to factors such as increased compute work and higher shared-memory requirements per CUDA block.
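
To make the effect of pq_dim concrete, here is a back-of-the-envelope sketch of the per-vector compression it implies; the formula ignores index overhead such as cluster centers and codebooks, so treat the numbers as rough estimates rather than exact index sizes.

```python
def compression_ratio(n_features: int, pq_dim: int, pq_bits: int = 8) -> float:
    """Approximate per-vector compression: each vector is stored as pq_dim codes
    of pq_bits bits instead of n_features float32 values."""
    raw_bytes = n_features * 4             # float32 source vector
    encoded_bytes = pq_dim * pq_bits / 8   # PQ-encoded vector
    return raw_bytes / encoded_bytes

n_features = 96
# Start at roughly one quarter of the feature count and step up from there.
for pq_dim in (n_features // 4, n_features // 2, n_features):
    print(f"pq_dim={pq_dim}: ~{compression_ratio(n_features, pq_dim):.0f}x smaller than float32")
```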

The pq_bits parameter, ranging from 4 to 8, controls the number of bits used in each individual PQ code, affecting the codebook size and recall. Reducing pq_bits can improve search speed by fitting the look-up table (LUT) in shared memory, although this comes at the cost of recall.
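
The shared-memory angle can also be estimated with simple arithmetic. The sketch below assumes the LUT holds 2^pq_bits entries per PQ subspace, 2-byte (fp16) LUT entries, and an illustrative 48 KiB per-block shared-memory budget; actual limits depend on the GPU and the kernel configuration.

```python
def lut_size_bytes(pq_dim: int, pq_bits: int, bytes_per_entry: int = 2) -> int:
    """Approximate per-query look-up table size: one table of 2**pq_bits
    entries per PQ subspace, each entry stored as bytes_per_entry bytes."""
    return pq_dim * (2 ** pq_bits) * bytes_per_entry

budget = 48 * 1024  # illustrative shared-memory budget per CUDA block
for pq_bits in (4, 5, 6, 7, 8):
    size = lut_size_bytes(pq_dim=128, pq_bits=pq_bits)
    verdict = "fits" if size <= budget else "exceeds"
    print(f"pq_bits={pq_bits}: LUT ~ {size / 1024:.0f} KiB ({verdict} the 48 KiB budget)")
```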

Additional Parameters

The codebook_kind parameter determines how the codebooks for the second-level quantizer are constructed, either for each subspace or for each cluster. The choice between these options can impact training time, GPU shared memory utilization, and recall. Parameters such as kmeans_n_iters and kmeans_trainset_fraction are also important, though they rarely need adjustment.
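
For reference, the sketch below shows where these additional build parameters sit, again assuming the cuVS Python API; the values are illustrative defaults rather than tuned recommendations.

```python
from cuvs.neighbors import ivf_pq

build_params = ivf_pq.IndexParams(
    n_lists=10_000,
    pq_dim=32,
    pq_bits=8,
    codebook_kind="subspace",      # or "cluster": one codebook per IVF cluster
    kmeans_n_iters=20,             # k-means iterations for the coarse clustering
    kmeans_trainset_fraction=0.5,  # fraction of the dataset sampled for training
)
```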

Tuning Parameters for Search

The n_probes parameter, discussed in the previous blog post on IVF-Flat, is essential for search accuracy and throughput. IVF-PQ provides additional parameters like internal_distance_dtype and lut_dtype, which control the representation of distance or similarity during the search and the datatype used to store the LUT, respectively. Adjusting these parameters can significantly impact performance, especially for datasets with large dimensionality.
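
A minimal search sketch follows, assuming the cuVS Python API (cuvs.neighbors.ivf_pq with SearchParams and search); the narrower dtypes shown trade some precision for speed and smaller LUTs, and the concrete sizes and values are illustrative only.

```python
import cupy as cp
import numpy as np
from cuvs.neighbors import ivf_pq

# Small random data and index so the example is self-contained.
dataset = cp.random.random_sample((100_000, 96), dtype=cp.float32)
queries = cp.random.random_sample((1_000, 96), dtype=cp.float32)
index = ivf_pq.build(ivf_pq.IndexParams(n_lists=1_000), dataset)

search_params = ivf_pq.SearchParams(
    n_probes=50,                         # inverted lists scanned per query
    internal_distance_dtype=np.float16,  # lower-precision distance accumulation
    lut_dtype=np.uint8,                  # smaller LUT entries fit shared memory more easily
)
distances, neighbors = ivf_pq.search(search_params, index, queries, 10)
```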

Improving Recall with Refinement

When tuning parameters is not enough to achieve the desired recall, refinement offers a promising complement. This separate operation, performed after the ANN search, recomputes exact distances for the candidates returned by that search and reranks them. Refinement can significantly improve recall, as demonstrated in Figure 4 of the original blog post, though it requires access to the source dataset.
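
Below is a minimal sketch of this two-step pattern, assuming a cuVS refine function (cuvs.neighbors.refine) that recomputes exact distances for the candidates returned by the ANN search; the 2x oversampling factor is an illustrative choice.

```python
import cupy as cp
from cuvs.neighbors import ivf_pq, refine

k = 10
dataset = cp.random.random_sample((100_000, 96), dtype=cp.float32)
queries = cp.random.random_sample((1_000, 96), dtype=cp.float32)
index = ivf_pq.build(ivf_pq.IndexParams(n_lists=1_000), dataset)

# Oversample: ask the approximate search for more candidates than needed...
_, candidates = ivf_pq.search(ivf_pq.SearchParams(n_probes=50), index, queries, 2 * k)
# ...then recompute exact distances against the source dataset and rerank to the top k.
distances, neighbors = refine(dataset, queries, candidates, k)
```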

Summary

The series on accelerating vector search with inverted-file indexes covers two cuVS algorithms: IVF-Flat and IVF-PQ. IVF-PQ extends IVF-Flat with PQ compression, enabling faster searches and the ability to handle billion-scale datasets with limited GPU memory. By fine-tuning parameters for index building and search, data practitioners can achieve the best results efficiently. The RAPIDS cuVS library offers a range of vector search algorithms to cater to various use cases, from exact search to high-QPS ANN methods that trade some accuracy for speed.

For practical tuning of IVF-PQ parameters, refer to the IVF-PQ notebook on GitHub. For more details on the provided APIs, see the cuVS documentation.
