FAISS (Facebook AI Similarity Search)

FAISS (Facebook AI Similarity Search) is a library developed by Facebook AI Research for efficient similarity search and clustering of dense vectors.

It is useful for large-scale similarity search problems, which are common in various machine learning and information retrieval tasks.

FAISS is designed to run on either CPU or GPU and provides significant performance improvements over other nearest neighbour search implementations.

The research foundations for FAISS are extensive:

FAISS Research Foundations

  1. Inverted File: Originating from Sivic and Zisserman's work in 2003, the inverted file is crucial for non-exhaustive search in large datasets, enabling searches without scanning all elements in the index.

  2. Product Quantization (PQ): Introduced by Jégou et al. in 2011, PQ is a method for lossy compression of high-dimensional vectors that supports relatively accurate reconstructions and distance computations in the compressed domain.

  3. Three-Level Quantization (IVFADC-R): From Tavenard et al.'s 2011 research, this method further refines the quantization process for more efficient searches.

  4. Inverted Multi-Index: Babenko and Lempitsky's 2012 work improves the speed of inverted indexing, enhancing search efficiency.

  5. Optimized PQ: He et al.'s 2013 research optimizes product quantization, adapting the vector space for more effective indexing.

  6. Pre-Filtering of PQ Distances: Introduced by Douze et al. in 2016, this technique adds a binary filtering stage before computing PQ distances, improving search speed.

  7. GPU Implementation and Fast k-Selection: Johnson et al.'s 2017 paper details the adaptation of FAISS for GPU, enabling faster search processes.

  8. HNSW Indexing Method: Malkov et al.'s 2016 work on Hierarchical Navigable Small World graphs contributes to an efficient and robust approximate nearest neighbor search method.

  9. In-Register Vector Comparisons: André et al. (2019) and Guo et al. (2020) explore SIMD optimizations to enhance product quantization's efficiency.

  10. Binary Multi-Index Hashing: Norouzi et al.'s 2012 research introduces a method to expedite searches in Hamming space.

  11. Graph-Based Indexing (NSG): Fu et al.'s 2019 research on the Navigating Spreading-out Graph method aids in fast approximate nearest neighbor searches.

  12. Local Search Quantization: Research by Martinez et al. in 2016 and 2018 introduces methods to improve quantization for higher recall in searches.

  13. Residual Quantizer: Liu et al.'s 2015 paper on residual vector quantization enhances the accuracy of approximate nearest neighbor searches.

  14. A Survey of Product Quantization: A general paper by Matsui et al. in 2018, providing an overview of product quantization and related methods.

These research foundations collectively contribute to FAISS's capabilities in efficient similarity search and clustering, making it a powerful tool for handling large-scale, high-dimensional data.

Vector databases have become the standard for managing vast collections of embedding vectors.

These embeddings, which are dense vector representations of data items like words, images, or user preferences, are typically generated by neural networks.

Embeddings offer a compact yet expressive data representation, enabling efficient similarity searches and a plethora of other operations.

They transform data into a vector space where the proximity of vectors corresponds to the similarity of the data they represent, thereby facilitating operations like similarity search.
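To make this concrete, here is a toy illustration of that idea, using hand-picked 2-D vectors as stand-ins for real learned embeddings (which would typically have hundreds of dimensions):

```python
import numpy as np

# Toy 2-D "embeddings" for three items; hand-picked here, but in
# practice these vectors would come from a neural network.
embeddings = {
    "cat":    np.array([0.9, 0.1]),
    "kitten": np.array([0.85, 0.2]),
    "car":    np.array([0.1, 0.95]),
}

def cosine_similarity(a, b):
    """Similarity of two vectors; close directions score near 1."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Proximity in the vector space mirrors similarity of the items:
# "cat" and "kitten" point in nearly the same direction, "car" does not.
sim_cat_kitten = cosine_similarity(embeddings["cat"], embeddings["kitten"])
sim_cat_car = cosine_similarity(embeddings["cat"], embeddings["car"])
assert sim_cat_kitten > sim_cat_car
```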

Like any database, there are scalability issues: speed, storage cost, and accuracy are the key considerations.

FAISS offers a suite of indexing methods and tools that

1) enable efficient searching and clustering of vectors

2) provide capabilities for compressing and transforming vectors

Central to the operation of FAISS is the "embedding contract," a conceptual framework where the embedding extractor—typically a neural network—is trained to generate vectors such that the distances between them mirror the similarity of the corresponding data items.

Concurrently, the vector index within FAISS is optimised to perform neighbour searches with accuracy, adhering to the established distance metrics.

This dual commitment ensures that FAISS remains a reliable and effective tool for vector database management, addressing the critical needs of the AI domain with precision and efficiency.

FAISS is not involved in the embedding process

FAISS doesn't have the capability to take raw data and convert it into embeddings. FAISS is designed to take embeddings and perform operations like indexing, searching, clustering, and transforming these vectors efficiently.

Its primary role is to deal with the embeddings after they've been created, focusing on how to store, search, and manage them effectively, especially when dealing with large-scale datasets.

Approximate Nearest Neighbor Search (ANN)

ANN operates on the principle that slightly imperfect results can be acceptable if they come with significant gains in efficiency or resource usage.

This trade-off allows for the exploration of new design spaces in solution architecture.

Instead of storing data as a plain matrix, ANN introduces an indexing structure to preprocess the database. This indexing facilitates more efficient querying by organising the data in a manner that speeds up the search for nearest neighbours.

For medium-sized datasets, fast search implementations rely on non-exhaustive search: quickly identifying the subset of database vectors most likely to contain the search results.

FAISS implements two non-exhaustive search methods:

Inverted files - clusters database vectors and stores them in a way that only a subset of clusters is examined during a search.

Graph-based indexing - builds a directed graph of vectors, exploring edges closest to the query vector during the search.

These methods aim to reduce the complexity of searches, making them faster than exhaustive searches, especially as the dataset size grows.

However, their effectiveness can vary with the dataset's dimensionality and size.

The choice between inverted file and graph-based indexing depends on the trade-off between memory usage and search speed, with graph-based methods generally being more memory-intensive but potentially faster for smaller datasets.

FAISS is a library of tools

FAISS offers a comprehensive set of tools for vector similarity search.

It integrates various indexing methods, which often require a sequence of components like preprocessing, compression, and non-exhaustive search.

The diversity in tools accommodates different efficiency needs based on specific usage constraints, meaning the most effective indexing method can vary depending on the scenario.

Technical details

FAISS employs several techniques to achieve efficient similarity search:

Quantization

FAISS uses quantization techniques to compress the embeddings, which significantly reduces memory usage and accelerates distance computations. One of the most widely used quantization techniques in FAISS is Product Quantization (PQ).

PQ approximates each vector by the concatenation of multiple codebook entries, enabling efficient storage and fast distance computation.

Indexing

FAISS provides multiple index types to cater to different use cases and trade-offs between search speed and search quality. Some common index types are:

1) Flat index

A brute-force index that computes exact distances between query vectors and indexed vectors.

2) IVF (Inverted File)

A partitioned index that divides the vector space into Voronoi cells. It stores the centroids of these cells and assigns each vector to the nearest centroid. This reduces the number of distance computations required for each query.

3) HNSW (Hierarchical Navigable Small World)

A graph-based index that builds a hierarchical graph structure, enabling efficient nearest neighbor search with logarithmic complexity.

GPU acceleration

FAISS leverages the parallelism of GPUs to accelerate similarity search operations, making it suitable for large-scale and real-time applications.

Application of FAISS

Here are some highlighted use cases demonstrating FAISS's broad applicability:

Trillion-scale Index

FAISS can handle massive datasets, as illustrated by an example where it indexed 1.5 trillion vectors of 144 dimensions.

The process involved a hierarchical, distributed approach: PCA for dimensionality reduction, scalar quantization of the stored vectors, and HNSW for coarse quantization, followed by sharding techniques for efficient storage and search. This demonstrates FAISS's capability to manage and search through enormous data volumes efficiently.

Text Retrieval

FAISS can facilitate information retrieval tasks like fact-checking, entity linking, and question answering.

Embedding models optimised for text retrieval are used to search across large text corpora, aiding in extracting relevant information or documents quickly.

Data Mining

FAISS aids in the mining and organisation of large datasets, such as finding bilingual texts in vast web-crawled datasets or organising a language model's training corpus.

For instance, it can group similar documents or identify duplicate images in a dataset, optimising the data curation process for better training model performance or more efficient storage.

Conclusion

FAISS is a well-developed and widely used library. It underpins many modern vector database offerings, such as Pinecone and Zilliz.

Copyright Continuum Labs - 2023