FAISS (Facebook AI Similarity Search)
Copyright Continuum Labs - 2023
FAISS (Facebook AI Similarity Search) is a library developed by Facebook AI Research for efficient similarity search and clustering of dense vectors.
It is useful for large-scale similarity search problems, which are common in various machine learning and information retrieval tasks.
FAISS is designed to work on either the GPU or CPU and provides significant performance improvements compared to other nearest neighbour search algorithms.
The research foundations for FAISS are extensive.
Vector databases have become the standard for managing vast collections of embedding vectors.
These embeddings, which are dense vector representations of data items like words, images, or user preferences, are typically generated by neural networks.
Embeddings offer a compact yet expressive data representation, enabling efficient similarity searches and a plethora of other operations.
They transform data into a vector space where the proximity of vectors corresponds to the similarity of the data they represent, thereby facilitating operations like similarity search.
Like any database, there are scalability issues - speed, storage cost, and accuracy are key considerations.
FAISS offers a suite of indexing methods and tools that
1) enable efficient searching and clustering of vectors
2) provide capabilities for compressing and transforming vectors
Central to the operation of FAISS is the "embedding contract," a conceptual framework where the embedding extractor—typically a neural network—is trained to generate vectors such that the distances between them mirror the similarity of the corresponding data items.
Concurrently, the vector index within FAISS is optimised to perform neighbour searches with accuracy, adhering to the established distance metrics.
This dual commitment ensures that FAISS remains a reliable and effective tool for vector database management, addressing the critical needs of the AI domain with precision and efficiency.
FAISS doesn't have the capability to take raw data and convert it into embeddings. FAISS is designed to take embeddings and perform operations like indexing, searching, clustering, and transforming these vectors efficiently.
Its primary role is to deal with the embeddings after they've been created, focusing on how to store, search, and manage them effectively, especially when dealing with large-scale datasets.
Approximate nearest neighbour (ANN) search operates on the principle that slightly imperfect results can be acceptable if they come with significant gains in efficiency or resource usage.
This trade-off allows for the exploration of new design spaces in solution architecture.
Instead of storing data as a plain matrix, ANN introduces an indexing structure to preprocess the database. This indexing facilitates more efficient querying by organising the data in a manner that speeds up the search for nearest neighbours.
Non-exhaustive search in fast search implementations for medium-sized datasets focuses on quickly identifying a subset of database vectors most likely to contain the search results.
Faiss implements two non-exhaustive search methods:
Inverted files - cluster database vectors and store them so that only a subset of clusters needs to be examined during a search.
Graph-based indexing - builds a directed graph of vectors, exploring edges closest to the query vector during the search.
These methods aim to reduce the complexity of searches, making them faster than exhaustive searches, especially as the dataset size grows.
However, their effectiveness can vary with the dataset's dimensionality and size.
The choice between inverted file and graph-based indexing depends on the trade-off between memory usage and search speed, with graph-based methods generally being more memory-intensive but potentially faster for smaller datasets.
FAISS offers a comprehensive set of tools for vector similarity search.
It integrates various indexing methods, which often require a sequence of components like preprocessing, compression, and non-exhaustive search.
The diversity in tools accommodates different efficiency needs based on specific usage constraints, meaning the most effective indexing method can vary depending on the scenario.
FAISS employs several techniques to achieve efficient similarity search:
FAISS uses quantization techniques to compress the embeddings, which significantly reduces memory usage and accelerates distance computations. One of the most widely used quantization techniques in FAISS is Product Quantization (PQ).
PQ approximates each vector by the concatenation of multiple codebook entries, enabling efficient storage and fast distance computation.
FAISS provides multiple index types to cater to different use cases and trade-offs between search speed and search quality. Some common index types are:
1) Flat index
A brute-force index that computes exact distances between query vectors and indexed vectors.
2) IVF (Inverted File)
A partitioned index that divides the vector space into Voronoi cells. It stores the centroids of these cells and assigns each vector to the nearest centroid. This reduces the number of distance computations required for each query.
3) HNSW (Hierarchical Navigable Small World)
A graph-based index that builds a hierarchical graph structure, enabling efficient nearest neighbour search with near-logarithmic search cost in practice.
GPU acceleration
FAISS leverages the parallelism of GPUs to accelerate similarity search operations, making it suitable for large-scale and real-time applications.
Here are some highlighted use cases demonstrating FAISS's broad applicability:
Trillion-scale Index
FAISS can handle massive datasets, as illustrated by an example where it indexed 1.5 trillion vectors of 144 dimensions.
The process involved a hierarchical, distributed approach using PCA for dimensionality reduction, scalar quantization, and HNSW for coarse quantization, followed by sharding techniques for efficient storage and search, demonstrating FAISS's capability to manage and search through enormous data volumes efficiently.
Text Retrieval
FAISS can facilitate information retrieval tasks like fact-checking, entity linking, and question answering.
Embedding models optimised for text retrieval are used to search across large text corpora, aiding in extracting relevant information or documents quickly.
Data Mining
FAISS aids in the mining and organisation of large datasets, such as finding bilingual texts in vast web-crawled datasets or organising a language model's training corpus.
For instance, it can group similar documents or identify duplicate images in a dataset, optimising the data curation process for better training model performance or more efficient storage.
FAISS is a well-developed and widely used library. It underpins many modern vector database offerings, such as those from Pinecone and Zilliz.