A Comprehensive Survey on Vector Databases

This October 2023 paper presents a comprehensive review of vector databases, specialised databases designed to store and manage high-dimensional data.

Unlike traditional relational databases, vector databases are optimised for unstructured data, which they transform into high-dimensional vectors via embedding functions derived from various machine learning models or algorithms.

Storage in Vector Databases

In vector databases, data storage and retrieval are optimised for handling high-dimensional vector representations of unstructured complex data - such as images, text, or audio.

Sharding

Sharding distributes the database across multiple servers or clusters; each piece of the database is called a shard. There are two main sharding strategies (hash-based routing is sketched after the list):

  • Hash-based sharding: Here, data is allocated to different shards based on the hash value of a key column or a set of columns. This method ensures even data distribution and helps avoid load imbalances.

  • Range-based sharding: This method assigns data to shards based on value ranges of a key column or a set of columns. It allows efficient querying by enabling data retrieval based on specific shard names or value ranges.
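
A minimal sketch of hash-based routing in Python. The function name, key format, and shard count are illustrative assumptions, not details from the paper:

```python
import hashlib

NUM_SHARDS = 4  # illustrative shard count

def shard_for_key(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard index with a stable hash.

    md5 is used for stability across processes (Python's built-in hash()
    is salted per process); any uniformly distributed hash would do.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Vector IDs spread roughly evenly across the shards.
for vec_id in ["vec-001", "vec-002", "vec-003", "vec-004"]:
    print(vec_id, "-> shard", shard_for_key(vec_id))
```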

Partitioning

Similar to sharding, partitioning divides the database into smaller, manageable segments, but it typically occurs within a single database system rather than across multiple systems. Two common partitioning methods are:

  • Range partitioning: Data is divided into partitions based on value ranges of a key column. This method is useful for time-series data or any scenario where data can be segmented into well-defined ranges.

  • List partitioning: Data is grouped into partitions based on the value lists of a key column. This approach is suitable when the partitioning criteria are categorical or when grouping by specific, discrete values.
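
Range partitioning can likewise be sketched as a routing function. The monthly boundaries and partition names below are hypothetical:

```python
import bisect
from datetime import date

# Hypothetical monthly partitions over a date column.
BOUNDARIES = [date(2023, 2, 1), date(2023, 3, 1), date(2023, 4, 1)]
PARTITIONS = ["p_2023_01", "p_2023_02", "p_2023_03", "p_2023_04"]

def partition_for(row_date: date) -> str:
    """Route a row to a partition by binary search over the range boundaries."""
    return PARTITIONS[bisect.bisect_right(BOUNDARIES, row_date)]

print(partition_for(date(2023, 1, 15)))  # p_2023_01
print(partition_for(date(2023, 3, 9)))   # p_2023_03
```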

Caching

To enhance data retrieval speed, frequently accessed or recently used data is stored in fast, accessible memory. Caching strategies in vector databases include:

  • Least Recently Used (LRU) caching: This policy removes the least recently accessed data when the cache is full, prioritising data that is more likely to be accessed again.

  • Partitioned caching: The cache is divided based on certain criteria, allowing for tailored cache management for different data segments, optimising resource use and retrieval efficiency.
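
The LRU policy above fits in a few lines of Python; this sketch caches query results in memory and evicts the least recently used entry when full:

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: evicts the least recently used entry when full."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._store = OrderedDict()

    def get(self, key):
        if key not in self._store:
            return None
        self._store.move_to_end(key)         # mark as most recently used
        return self._store[key]

    def put(self, key, value):
        if key in self._store:
            self._store.move_to_end(key)
        self._store[key] = value
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict the least recently used

cache = LRUCache(capacity=2)
cache.put("q1", [0.1, 0.2])
cache.put("q2", [0.3, 0.4])
cache.get("q1")                  # touching q1 makes q2 the LRU entry
cache.put("q3", [0.5, 0.6])      # evicts q2
print(cache.get("q2"))           # None
```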

Replication

Creating multiple copies of data ensures higher availability, durability, and read performance. Two primary replication methods are:

  • Leaderless replication: This method allows any node to handle write and read requests, enhancing scalability and avoiding single points of failure. However, it can lead to consistency challenges that require conflict-resolution strategies.

  • Leader-follower replication: One node acts as the leader (handling write operations) and propagates changes to follower nodes. This method ensures strong consistency but requires effective failover strategies to maintain availability in case the leader node fails.

Nearest Neighbour Search

In the context of vector databases, nearest neighbour search (NNS) is crucial for finding the closest or most similar data points to a given query point, leveraging vector distances or similarities.

Brute Force Approach

This straightforward method scans every point in the dataset, computing the distance to the query point and keeping track of the closest one found. It is guaranteed to find the exact nearest neighbour, but it is computationally expensive, with time complexity O(n) for n data points.
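
A minimal sketch with NumPy (dataset size and dimensionality here are arbitrary):

```python
import numpy as np

def brute_force_nn(query: np.ndarray, data: np.ndarray):
    """Exact nearest neighbour: compare the query against every point, O(n)."""
    dists = np.linalg.norm(data - query, axis=1)  # Euclidean distance to all points
    idx = int(np.argmin(dists))
    return idx, float(dists[idx])

rng = np.random.default_rng(0)
data = rng.normal(size=(10_000, 128))   # 10k synthetic vectors, 128-d
query = rng.normal(size=128)
idx, dist = brute_force_nn(query, data)
print(f"nearest neighbour: index {idx}, distance {dist:.3f}")
```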

Tree-Based Approaches

These methods enhance efficiency by structuring data in tree formats, enabling quicker nearest neighbour searches by pruning irrelevant sections of the data space.

KD-tree: Organises points in k-dimensional space using a binary tree. Each node represents a point that splits the space, reducing the search area by comparing distances within relevant partitions.

Ball-tree: Groups data points into hyperspheres, making it effective in high-dimensional spaces. It iteratively refines the search to the most promising hypersphere, enhancing search efficiency.

R-tree: Uses rectangles to encapsulate points, supporting efficient spatial queries. It's particularly beneficial for geographical or multidimensional data, enabling rapid identification of nearest neighbours within defined bounds.

M-tree: Similar to a ball-tree but supports dynamic updates (insertions and deletions), maintaining an efficient structure for ongoing nearest neighbour queries even as data changes.

These tree-based methods reduce the search space significantly compared to the brute force approach, enhancing efficiency, especially in large and high-dimensional datasets.
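
As a concrete example of the pruning these structures enable, here is a KD-tree query using SciPy (assuming scipy is installed; the data is synthetic):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
data = rng.normal(size=(10_000, 8))   # KD-trees work best at modest dimensionality

tree = cKDTree(data)                  # build once
query = rng.normal(size=8)
dists, idxs = tree.query(query, k=3)  # 3 nearest neighbours, pruning whole subtrees
print(idxs, dists)
```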

Approximate Nearest Neighbour Search (ANNS)

Approximate Nearest Neighbour Search (ANNS) in vector databases allows for quick and efficient searches for data points that are close or similar to a query point, albeit with some margin of error.

This is particularly useful in handling vast or high-dimensional datasets where exact nearest neighbour search (NNS) might be computationally intensive.

Hash-Based Approach

  • Locality-Sensitive Hashing (LSH): Transforms high-dimensional vectors into compact binary codes using hash functions, preserving the locality so that similar vectors have similar codes. This method boosts search speed and reduces memory usage by operating on these codes rather than the original vectors.

  • Spectral Hashing: Uses spectral graph theory to generate hash functions that minimise quantization error and maximise the variance of binary codes, effective when data points reside on a low-dimensional manifold within a high-dimensional space.

  • Deep Hashing: Employs deep neural networks to learn hash functions, converting high-dimensional vectors into binary codes. This approach retains semantic information, making it suitable for complex data like images or texts.
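
A minimal random-hyperplane LSH sketch for cosine similarity (one common LSH family; the bit width and perturbation scale below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, N_BITS = 128, 16
planes = rng.normal(size=(N_BITS, DIM))  # random hyperplanes define the hash

def lsh_code(v: np.ndarray) -> int:
    """One bit per hyperplane: the sign of the projection onto it.

    Vectors at a small angle fall on the same side of most hyperplanes,
    so similar vectors receive similar binary codes.
    """
    code = 0
    for bit in (planes @ v) > 0:
        code = (code << 1) | int(bit)
    return code

v = rng.normal(size=DIM)
v_near = v + 0.05 * rng.normal(size=DIM)  # small perturbation of v
v_far = rng.normal(size=DIM)              # unrelated vector
print(bin(lsh_code(v) ^ lsh_code(v_near)).count("1"))  # few bits differ
print(bin(lsh_code(v) ^ lsh_code(v_far)).count("1"))   # roughly half differ
```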

Tree-Based Approach

  • Approximate Nearest Neighbours Oh Yeah (Annoy): Creates a forest of binary trees to partition the vector space. It assigns vectors to leaf nodes based on random hyperplanes, collecting candidates from the same nodes for distance computations.

  • Best Bin First: Uses a kd-tree to partition data into bins, then prioritises searching in bins closer to the query point, improving search time and accuracy.

  • K-means Tree: Clusters data points into a hierarchical structure, where each node represents a cluster. It facilitates efficient search by navigating through the branches likely to contain the nearest neighbours.
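
Annoy's Python API makes the forest-of-trees idea concrete (assuming the annoy package is installed; the sizes here are arbitrary):

```python
import random
from annoy import AnnoyIndex

DIM = 64
index = AnnoyIndex(DIM, "angular")  # angular distance ~ cosine similarity
for i in range(1_000):
    index.add_item(i, [random.gauss(0, 1) for _ in range(DIM)])
index.build(10)  # 10 trees: more trees give better recall but a bigger index

query = [random.gauss(0, 1) for _ in range(DIM)]
print(index.get_nns_by_vector(query, 5))  # ids of 5 approximate neighbours
```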

These ANNS methods provide a balance between search accuracy and computational efficiency, making them invaluable in large-scale data environments typical of vector databases.

They enable applications like image retrieval, recommendation systems, and more, where rapid and efficient data retrieval is crucial.

Graph Based Approach to ANNS

The graph-based approach to approximate nearest neighbour search (ANNS) in vector databases employs graph structures to efficiently store and retrieve high-dimensional vectors based on their similarity or distance.

This approach includes the Navigable Small World (NSW) and Hierarchical Navigable Small World (HNSW) methods, both optimising search processes through graph structures.

Navigable Small World (NSW)

NSW creates a graph where each vector is connected to its nearest neighbours and some randomly chosen distant vectors. These connections form shortcuts, enabling rapid traversal of the graph.

The addition of each new point involves a random walk to find the nearest neighbour and establish connections based on proximity. This method ensures that similar vectors are interconnected, facilitating a quick nearest neighbour search.

Hierarchical Navigable Small World (HNSW)

An advancement over NSW, HNSW constructs a multi-layered graph where each layer operates at a different scale.

Points are assigned to layers probabilistically, creating a pyramid-like structure with fewer points at higher levels. Searches begin at the top layer and use the hierarchy to quickly narrow down the search space as they descend to the more densely populated lower layers.
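
The hnswlib library exposes these construction and search parameters directly (assuming hnswlib is installed; the parameter values below are illustrative):

```python
import numpy as np
import hnswlib

DIM, N = 128, 10_000
data = np.random.default_rng(0).normal(size=(N, DIM)).astype(np.float32)

index = hnswlib.Index(space="l2", dim=DIM)
# M: links per node; ef_construction: candidate-list size while building layers.
index.init_index(max_elements=N, M=16, ef_construction=200)
index.add_items(data)

index.set_ef(50)  # search-time candidate-list size: higher = better recall, slower
labels, dists = index.knn_query(data[:1], k=5)
print(labels, dists)
```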

The Quantization-Based Approach

The quantization-based approach in vector databases offers a method to compress high-dimensional vectors into compact codes, significantly enhancing the efficiency of approximate nearest neighbour searches.

This approach encompasses three main methods:

Product Quantization (PQ)

PQ divides each high-dimensional vector into several sub-vectors and assigns each to the nearest centroid in a predefined codebook, effectively transforming the vector into a sequence of centroid indices.

This process significantly reduces the storage requirement and accelerates the search process by comparing these compact codes instead of the full-dimensional vectors. While PQ simplifies the implementation and enhances search efficiency, its effectiveness depends on factors like data dimensionality, the granularity of sub-vector segmentation, and the size of the codebook.
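
The mechanics fit in a short NumPy sketch. In practice the codebooks are learned with k-means on training data; random centroids are used here only to keep the example self-contained:

```python
import numpy as np

rng = np.random.default_rng(0)
D, M, K = 128, 8, 256   # dimension, sub-vectors per vector, centroids per sub-space
D_SUB = D // M          # each sub-vector is 16-d

# Stand-in codebooks (normally learned with k-means per sub-space).
codebooks = rng.normal(size=(M, K, D_SUB))

def pq_encode(v: np.ndarray) -> np.ndarray:
    """Replace each sub-vector with the index of its nearest centroid."""
    subs = v.reshape(M, D_SUB)
    codes = [int(np.argmin(np.linalg.norm(codebooks[m] - subs[m], axis=1)))
             for m in range(M)]
    return np.array(codes, dtype=np.uint8)  # 128 floats -> 8 one-byte codes

def pq_decode(codes: np.ndarray) -> np.ndarray:
    """Approximate reconstruction by concatenating the chosen centroids."""
    return np.concatenate([codebooks[m][codes[m]] for m in range(M)])

v = rng.normal(size=D)
codes = pq_encode(v)
print(codes)                                 # compact 8-byte representation
print(np.linalg.norm(v - pq_decode(codes)))  # quantization error (large here,
                                             # since the codebooks are untrained)
```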

Optimized Product Quantization (OPQ)

OPQ refines the PQ process by optimising the space decomposition and the codebook to reduce quantization errors, thereby preserving more information from the original vectors.

This optimisation usually involves a rotation of the data space to align it more effectively with the quantization grid, enhancing the discriminative power of the generated codes.

The method involves balancing the benefits of a more nuanced space decomposition with the increased computational complexity of the optimisation process.

Online Product Quantization (O-PQ)

O-PQ adapts the PQ framework to dynamic datasets by continuously updating the quantization codebook and the codes as new data arrives.

This method is particularly relevant for systems dealing with data streams or incremental datasets, maintaining the relevance of the quantization process over time without the need for complete retraining.

The adaptability of O-PQ provides a robust framework for evolving datasets but requires careful management of learning and forgetting rates to ensure the timely incorporation of new information while discarding outdated data.

These quantization methods transform the challenge of searching a high-dimensional space into the more manageable problem of searching a set of discrete codes, balancing search accuracy against computational efficiency.

They offer a scalable and flexible approach to handling vast datasets, making them valuable tools for vector databases and their applications in fields such as image retrieval and recommendation systems.

Challenges

Vector databases, and their integration with large language models (LLMs), present several technical hurdles:

Index Construction and Searching of High-Dimensional Vectors

  • Dimensionality Catastrophe: Traditional indexing methods struggle with high-dimensional data due to the "curse of dimensionality," where the distance between points becomes less meaningful, complicating nearest neighbour searches.

  • Specialised Techniques: Advanced techniques like approximate nearest neighbour (ANN) search, hashing, quantization, and graph-based methods are essential for handling the complexity and improving the search accuracy for vector data.

Support for Heterogeneous Vector Data Types

  • Diverse Vector Characteristics: Vector databases must accommodate various vector types—dense, sparse, binary, etc.—each with unique traits such as dimensionality and sparsity.

  • Adaptive Indexing System: A flexible indexing system is required to efficiently manage different vector data types, optimising performance based on the specific characteristics of each data type.

Distributed Parallel Processing Support

  • Scalability: To manage large-scale vector data, vector databases should leverage distributed computing, distributing the data and processing across multiple nodes or clusters.

  • Distributed Processing Challenges: This involves overcoming issues related to data partitioning, load balancing, fault tolerance, and maintaining consistency across distributed systems.

Integration with Mainstream Machine Learning Frameworks

  • Framework Compatibility: Seamless integration with prevalent machine learning frameworks is vital for the generation and utilisation of vector embeddings within vector databases.

  • APIs and Connectors: Providing user-friendly APIs and connectors is essential for ensuring that vector databases can effectively interact with different machine learning frameworks and support various data formats and models.

Synergy with Large Language Models

The paper outlines potential applications of integrating vector databases with large language models (LLMs) and vice versa, highlighting the enhanced capabilities and interactive systems that can emerge from this synergy.

Applications of Vector Databases on LLMs

  • Long-term Memory: Vector databases can augment LLMs with a form of long-term memory. They store information in vector form, allowing quick retrieval of relevant or similar vectors when a user interacts with an LLM, thereby enabling the LLM to provide more personalised and informed responses.

  • Semantic Search: These databases empower LLMs to perform semantic searches, where users can query using natural language, and the system retrieves text based on meaning rather than mere keywords, allowing the LLM to provide concise summaries or paraphrases.

  • Recommendation Systems: By analysing vector representations, vector databases can help LLMs in recommending items (like movies) that align with user preferences, providing reasons for recommendations and additional context or reviews.

Applications of LLMs on Vector Databases

  • Text Generation: LLMs can generate texts corresponding to specific vectors from the database, aiding in content creation across various formats and styles.

  • Text Augmentation: LLMs can enhance texts by infusing additional relevant details from the vector database, improving text quality and diversity.

  • Text Transformation: Using vector databases, LLMs can transform texts across languages, domains, or formats, supporting tasks like translation, paraphrasing, and summarisation.

Retrieval-Based LLM

  • A retrieval-based LLM is a language model that integrates an external datastore, which it queries at inference time.

  • Advantages: Such LLMs offer enhanced memory for long-tail knowledge, easy updates without extensive retraining, capabilities for source citation and fact-checking, improved privacy through encryption and anonymization, and cost-effectiveness by using external datastores and efficient retrieval mechanisms.

  • Inference Process: Involves a complex data flow where user queries trigger retrieval from a diverse datastore, employing algorithms to identify relevant data subsets, facilitating informed and context-rich responses.
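
The inference flow above can be sketched schematically. All three callables here - embed, vector_search, and generate - are hypothetical placeholders standing in for an embedding model, a vector-database lookup, and an LLM call; none is a specific library's API:

```python
from typing import Callable, List

def retrieval_augmented_answer(
    query: str,
    embed: Callable[[str], List[float]],                     # hypothetical embedder
    vector_search: Callable[[List[float], int], List[str]],  # hypothetical datastore lookup
    generate: Callable[[str], str],                          # hypothetical LLM call
    k: int = 5,
) -> str:
    """Embed the query, retrieve the k most similar passages, and condition
    generation on them."""
    query_vec = embed(query)
    passages = vector_search(query_vec, k)
    context = "\n\n".join(passages)
    prompt = (
        "Answer the question using the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
    return generate(prompt)
```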

Reference: Yikun Han, Chunjiang Liu and Pengfei Wang, "A Comprehensive Survey on Vector Database: Storage and Retrieval Technique, Challenge", arXiv, October 2023.