# FAISS (Facebook AI Similarity Search)

<mark style="color:blue;">**FAISS (Facebook AI Similarity Search)**</mark> is a library developed by Facebook AI Research for efficient similarity search and clustering of dense vectors. &#x20;

It is useful for large-scale similarity search problems, which are common in various machine learning and information retrieval tasks.

<mark style="color:yellow;">**FAISS is designed to work on either the GPU or CPU**</mark> and provides significant performance improvements compared to other nearest neighbour search algorithms.

{% embed url="https://arxiv.org/abs/2401.08281" %}
The FAISS Library paper
{% endembed %}

The research foundations for FAISS are extensive:

<details>

<summary><mark style="color:green;">FAISS Research Foundations</mark></summary>

FAISS draws on a substantial body of prior research in vector indexing, quantization, and graph-based search:

1. <mark style="color:green;">**Inverted File**</mark><mark style="color:green;">:</mark> Originating from Sivic and Zisserman's work in 2003, the inverted file is crucial for non-exhaustive search in large datasets, enabling searches without scanning all elements in the index.
2. <mark style="color:green;">**Product Quantization (PQ)**</mark><mark style="color:green;">:</mark> Introduced by Jégou et al. in 2011, PQ is a method for lossy compression of high-dimensional vectors that supports relatively accurate reconstructions and distance computations in the compressed domain.
3. <mark style="color:green;">**Three-Level Quantization (IVFADC-R)**</mark><mark style="color:green;">:</mark> From Tavenard et al.'s 2011 research, this method further refines the quantization process for more efficient searches.
4. <mark style="color:green;">**Inverted Multi-Index**</mark><mark style="color:green;">:</mark> Babenko and Lempitsky's 2012 work improves the speed of inverted indexing, enhancing search efficiency.
5. <mark style="color:green;">**Optimized PQ**</mark><mark style="color:green;">:</mark> He et al.'s 2013 research optimizes product quantization, adapting the vector space for more effective indexing.
6. <mark style="color:green;">**Pre-Filtering of PQ Distances**</mark><mark style="color:green;">:</mark> Introduced by Douze et al. in 2016, this technique adds a binary filtering stage before computing PQ distances, improving search speed.
7. <mark style="color:green;">**GPU Implementation and Fast k-Selection**</mark><mark style="color:green;">:</mark> Johnson et al.'s 2017 paper details the adaptation of FAISS for GPU, enabling faster search processes.
8. <mark style="color:green;">**HNSW Indexing Method**</mark><mark style="color:green;">:</mark> Malkov et al.'s 2016 work on Hierarchical Navigable Small World graphs contributes to an efficient and robust approximate nearest neighbor search method.
9. <mark style="color:green;">**In-Register Vector Comparisons**</mark><mark style="color:green;">:</mark> André et al. (2019) and Guo et al. (2020) explore SIMD optimizations to enhance product quantization's efficiency.
10. <mark style="color:green;">**Binary Multi-Index Hashing**</mark><mark style="color:green;">:</mark> Norouzi et al.'s 2012 research introduces a method to expedite searches in Hamming space.
11. <mark style="color:green;">**Graph-Based Indexing (NSG)**</mark><mark style="color:green;">:</mark> Fu et al.'s 2019 research on the Navigating Spreading-out Graph method aids in fast approximate nearest neighbor searches.
12. <mark style="color:green;">**Local Search Quantization**</mark><mark style="color:green;">:</mark> Research by Julieta Martinez et al. in 2016 and 2018 introduces methods to improve quantization for higher recall in searches.
13. <mark style="color:green;">**Residual Quantizer**</mark><mark style="color:green;">:</mark> Liu et al.'s 2015 paper on residual vector quantization enhances the accuracy of approximate nearest neighbor searches.
14. <mark style="color:green;">**A Survey of Product Quantization**</mark><mark style="color:green;">:</mark> A general paper by Matsui et al. in 2018, providing an overview of product quantization and related methods.

These research foundations collectively contribute to FAISS's capabilities in efficient similarity search and clustering, making it a powerful tool for handling large-scale, high-dimensional data.

</details>

Vector databases have become the standard for managing vast collections of <mark style="color:blue;">**embedding vectors**</mark>.

These embeddings, which are dense vector representations of data items like words, images, or user preferences, are typically generated by neural networks.

Embeddings offer a compact yet expressive data representation, enabling efficient similarity searches and a range of other operations.

They <mark style="color:yellow;">transform data into a vector space</mark> where the proximity of vectors corresponds to the similarity of the data they represent, thereby <mark style="color:yellow;">facilitating operations like similarity search.</mark>

Like any database, there are scalability trade-offs: speed, storage cost, and accuracy are key considerations.

FAISS offers a suite of indexing methods and tools that:

1. enable <mark style="color:yellow;">efficient searching</mark> and clustering of vectors
2. provide capabilities for <mark style="color:yellow;">compressing and transforming vectors</mark> (both are sketched in the example below)
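
As a rough sketch of how searching, compression, and transformation compose, the `index_factory` helper builds a chained index from a string description; the dimensions and parameters below are illustrative only:

```python
import numpy as np
import faiss

d = 256
xt = np.random.random((20_000, d)).astype("float32")  # training vectors
xb = np.random.random((50_000, d)).astype("float32")  # database vectors

# PCA64: transform 256 -> 64 dims; IVF256: 256 clusters for non-exhaustive
# search; PQ16: compress each vector to 16 bytes.
index = faiss.index_factory(d, "PCA64,IVF256,PQ16")
index.train(xt)   # fits the PCA, the IVF centroids, and the PQ codebooks
index.add(xb)
```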

Central to the operation of FAISS is the "embedding contract", a conceptual framework in which the embedding extractor, typically a neural network, is trained to generate vectors such that the distances between them mirror the similarity of the corresponding data items.

Concurrently, the vector index within FAISS is optimised to perform accurate neighbour searches under that established distance metric.

This dual commitment is what makes FAISS a reliable and effective tool for vector database management.

### <mark style="color:purple;">FAISS is not involved in the embedding process</mark>

FAISS cannot take raw data and convert it into embeddings. It is designed to take embeddings and perform operations like indexing, searching, clustering, and transforming these vectors efficiently.

Its primary role is to <mark style="color:yellow;">deal with the embeddings after they've been created</mark>, focusing on how to store, search, and manage them effectively, especially when dealing with large-scale datasets.
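
The resulting pipeline therefore has two stages: an external model produces the embeddings, and FAISS indexes them. A minimal sketch, assuming `sentence-transformers` as the (interchangeable) embedding library and an example model name:

```python
import faiss
from sentence_transformers import SentenceTransformer  # assumed dependency

docs = ["FAISS indexes vectors", "It does not create embeddings"]

# Stage 1 (outside FAISS): a neural network turns raw data into vectors.
model = SentenceTransformer("all-MiniLM-L6-v2")  # example model choice
emb = model.encode(docs, normalize_embeddings=True).astype("float32")

# Stage 2 (FAISS): index and search the vectors. With normalised vectors,
# inner product is equivalent to cosine similarity.
index = faiss.IndexFlatIP(emb.shape[1])
index.add(emb)
D, I = index.search(emb[:1], 2)
```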

### <mark style="color:purple;">Approximate Nearest Neighbor Search (ANN)</mark>&#x20;

ANN operates on the principle that slightly imperfect results can be acceptable if they come with significant gains in efficiency or resource usage.

This trade-off allows for the exploration of new design spaces in solution architecture.

Instead of storing data as a plain matrix, <mark style="color:yellow;">ANN introduces an indexing structure to preprocess the database</mark>. This indexing facilitates more efficient querying by organising the data in a manner that speeds up the search for nearest neighbours.
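
FAISS's IVF index is one concrete example of such a preprocessing structure: it clusters the database at build time so that each query visits only a few cells. A minimal sketch with illustrative parameters:

```python
import numpy as np
import faiss

d, nlist = 64, 100
xb = np.random.random((10_000, d)).astype("float32")

quantizer = faiss.IndexFlatL2(d)                 # assigns vectors to cells
index = faiss.IndexIVFFlat(quantizer, d, nlist)  # partition into 100 cells
index.train(xb)                                  # learn the cell centroids
index.add(xb)

index.nprobe = 8                                 # visit 8 of 100 cells per query
D, I = index.search(xb[:5], 4)
```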

### <mark style="color:purple;">Non-Exhaustive Search</mark>

For medium-sized datasets, non-exhaustive search focuses on <mark style="color:yellow;">quickly identifying a subset of database vectors most likely to contain the search results</mark>.

FAISS implements two non-exhaustive search methods:

<mark style="color:green;">**Inverted files**</mark> - clusters database vectors and store them in a way that only a subset of clusters is examined during a search.&#x20;

<mark style="color:green;">**Graph-based indexing**</mark> - builds a directed graph of vectors, exploring edges closest to the query vector during the search.

These methods aim to reduce the complexity of searches, making them faster than exhaustive searches, especially as the dataset size grows.

However, their effectiveness can vary with the dataset's dimensionality and size.

The choice between inverted file and graph-based indexing depends on the trade-off between memory usage and search speed, with <mark style="color:yellow;">**graph-based methods generally being more memory-intensive but potentially faster for smaller datasets**</mark>.
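
For comparison, a minimal graph-based sketch using FAISS's HNSW index (parameters illustrative); note there is no training step, and the graph links account for the extra memory:

```python
import numpy as np
import faiss

d = 64
xb = np.random.random((10_000, d)).astype("float32")

index = faiss.IndexHNSWFlat(d, 32)  # 32 links per node in the graph
index.hnsw.efConstruction = 40      # build-time exploration depth
index.add(xb)                       # no train() step for HNSW

index.hnsw.efSearch = 64            # query-time exploration breadth
D, I = index.search(xb[:5], 10)
```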

### <mark style="color:purple;">FAISS is a library of tools</mark>

FAISS offers a comprehensive set of tools for vector similarity search.

It integrates various indexing methods, which often require a sequence of components like preprocessing, compression, and non-exhaustive search.

The diversity in tools accommodates different efficiency needs based on specific usage constraints, meaning the most effective indexing method can vary depending on the scenario.

### <mark style="color:purple;">Technical details</mark>

FAISS employs several techniques to achieve efficient similarity search:

#### <mark style="color:green;">**Quantization**</mark>

FAISS uses quantization techniques to <mark style="color:yellow;">**compress the embeddings**</mark>, which <mark style="color:yellow;">**significantly reduces memory usage**</mark> and <mark style="color:yellow;">**accelerates distance computations**</mark>. One of the most widely used quantization techniques in FAISS is Product Quantization (PQ).

PQ splits each vector into sub-vectors and approximates each sub-vector with the nearest entry from a learned codebook, so a vector is stored as a short concatenation of codebook ids. This enables efficient storage and fast distance computation directly in the compressed domain.
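
A minimal PQ sketch with illustrative parameters: with 8 sub-quantizers of 8 bits each, a 64-dimensional float32 vector shrinks from 256 bytes to 8 bytes.

```python
import numpy as np
import faiss

d, m, nbits = 64, 8, 8              # 8 sub-vectors, 256 centroids each
xb = np.random.random((10_000, d)).astype("float32")

index = faiss.IndexPQ(d, m, nbits)  # each vector stored in m = 8 bytes
index.train(xb)                     # learn the codebooks
index.add(xb)
D, I = index.search(xb[:5], 4)      # distances computed on compressed codes
```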

#### <mark style="color:green;">**Indexing**</mark>

FAISS provides multiple index types to cater to different use cases and trade-offs between search speed and search quality. Some common index types are:

<mark style="color:blue;">**1) Flat index**</mark>

A brute-force index that computes exact distances between query vectors and indexed vectors.

<mark style="color:blue;">**2) IVF (Inverted File)**</mark>

A partitioned index that divides the vector space into Voronoi cells. It stores the centroids of these cells and assigns each vector to the nearest centroid. This reduces the number of distance computations required for each query.

<mark style="color:blue;">**3) HNSW (Hierarchical Navigable Small World)**</mark>

A graph-based index that builds a hierarchical graph structure, enabling efficient nearest neighbor search with logarithmic complexity.
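
All three index types can also be built through the `index_factory` string syntax, often the most convenient way to experiment with these trade-offs (parameters illustrative):

```python
import faiss

d = 128
flat = faiss.index_factory(d, "Flat")          # 1) exact brute-force search
ivf = faiss.index_factory(d, "IVF1024,Flat")   # 2) inverted file, 1024 cells
hnsw = faiss.index_factory(d, "HNSW32")        # 3) graph index, 32 links per node
```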

<mark style="color:blue;">**GPU acceleration**</mark>

FAISS leverages the parallelism of GPUs to accelerate similarity search operations, making it suitable for large-scale and real-time applications.
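
A sketch of moving an index onto a GPU, assuming a `faiss-gpu` build and at least one CUDA device:

```python
import numpy as np
import faiss

d = 128
xb = np.random.random((100_000, d)).astype("float32")
xq = np.random.random((10, d)).astype("float32")

cpu_index = faiss.IndexFlatL2(d)
res = faiss.StandardGpuResources()                     # GPU memory/stream manager
gpu_index = faiss.index_cpu_to_gpu(res, 0, cpu_index)  # copy index to GPU 0
gpu_index.add(xb)
D, I = gpu_index.search(xq, 4)                         # search runs on the GPU
```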

### <mark style="color:purple;">Application of FAISS</mark>

Here are some highlighted use cases demonstrating FAISS's broad applicability:

<mark style="color:green;">**Trillion-scale Index**</mark>

FAISS <mark style="color:yellow;">can handle massive datasets</mark>, as illustrated by an example where it indexed 1.5 trillion vectors of 144 dimensions.

The process used a hierarchical, distributed approach: PCA for dimensionality reduction, scalar quantization, and HNSW for coarse quantization, followed by sharding for efficient storage and search. This demonstrates FAISS's capability to manage and search enormous data volumes efficiently.
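
On a single machine, a scaled-down analogue of that per-shard recipe can be expressed as an `index_factory` string; the parameters below are illustrative and not the configuration used in the trillion-scale deployment:

```python
import faiss

# PCA for dimensionality reduction, an HNSW-assigned IVF for coarse
# quantization, and 8-bit scalar quantization for the stored codes.
index = faiss.index_factory(144, "PCA64,IVF4096_HNSW32,SQ8")
# index.train(...) and index.add(...) follow as usual, per shard.
```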

<mark style="color:green;">**Text Retrieval**</mark>

FAISS can facilitate information retrieval tasks like fact-checking, entity linking, and question answering.

Embedding models optimised for text retrieval are used to search across large text corpora, aiding in extracting relevant information or documents quickly.

<mark style="color:green;">**Data Mining**</mark>

FAISS aids in the mining and organisation of large datasets, such as finding bilingual texts in vast web-crawled datasets or organising a language model's training corpus.

For instance, it can group similar documents or identify duplicate images in a dataset, optimising data curation for better model training or more efficient storage.
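
For the clustering side of such workflows, FAISS ships its own k-means implementation. A minimal sketch that groups vectors (standing in for document embeddings) into clusters, with illustrative sizes:

```python
import numpy as np
import faiss

d, ncentroids = 64, 256
xb = np.random.random((50_000, d)).astype("float32")  # e.g. document embeddings

kmeans = faiss.Kmeans(d, ncentroids, niter=20, seed=42)
kmeans.train(xb)                            # learn the cluster centroids
D, assignment = kmeans.index.search(xb, 1)  # nearest centroid id per vector
```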

### <mark style="color:purple;">Conclusion</mark>

FAISS is a well-developed and widely used library. It underpins many modern vector database offerings, such as Pinecone and Zilliz.

