The infrastructure behind modern AI search
Traditional search matches words. AI needs to match meaning.
User searches:
"How to fix a leaky pipe?"
Misses "Resolving Residential Plumbing Failures": the tokens don't overlap.
Words must match exactly
Finds it: embeddings encode that "leaky pipe" and "plumbing failure" mean the same thing.
Meaning is what matters
This is the keyword gap, and it's fundamental to why vector databases exist.
The word "bank" appears in every sentence, but only one answers the question.
Query:
"Where can I store my money?"
Keyword search can't distinguish these. Vector search understands context.
Think GPS, but for meaning instead of location.
A GPS coordinate is 2 numbers describing physical location. An embedding is 1,536 numbers describing location in meaning-space.
Words with similar meaning cluster together: cat, kitten, and dog are close because they're all animals. Car and truck form their own cluster, similar to each other but far from the animals.
The distance between dots = how different their meanings are. This is how vector search finds relevant results without matching keywords.
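In code, "distance between dots" is usually measured as cosine similarity. A minimal sketch with made-up 3-dimensional vectors (real embeddings have on the order of 1,536 dimensions; the numbers below are illustrative, not model output):

```python
import math

def cosine_similarity(a, b):
    # Angle-based closeness: near 1.0 = same direction in meaning-space.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings: the animal vectors point one way, the vehicle another.
cat    = [0.90, 0.80, 0.10]
kitten = [0.85, 0.82, 0.12]
car    = [0.10, 0.20, 0.95]

print(cosine_similarity(cat, kitten))  # high: similar meaning
print(cosine_similarity(cat, car))     # much lower: different meaning
```

The actual comparison a vector database runs is this same dot-product math, just over millions of vectors with far more dimensions.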
The engineering deep-dive starts here
Approximate Nearest Neighbor (ANN) trades tiny accuracy loss for massive speed gains.
Imagine a library with 1 million books. You want the 5 most similar to yours.
Read every book and compare. Correct, but takes years.
Organize into neighborhoods. Walk to the right section, compare nearby books only. Done in seconds.
HNSW: vectors connected in a multi-layer graph. Search hops between neighbors, getting closer each step. Fast, accurate, and the most widely used.
IVF: vectors grouped into clusters. The query is compared to cluster centers first, then only the closest clusters are searched. Tunable speed/accuracy trade-off.
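The cluster-first idea can be sketched in a few lines. This is a toy two-stage search over 2-d points, not a production index; the `nprobe` parameter (how many clusters to scan) is borrowed from IVF-style libraries, and the data is invented:

```python
import math

def ivf_search(query, centroids, clusters, nprobe=1, k=5):
    # Stage 1: compare the query only to cluster centers ("walk to the right section").
    ranked = sorted(range(len(centroids)), key=lambda i: math.dist(query, centroids[i]))
    # Stage 2: scan just the nprobe closest clusters, not the whole collection.
    candidates = [v for i in ranked[:nprobe] for v in clusters[i]]
    return sorted(candidates, key=lambda v: math.dist(query, v))[:k]

# Two hypothetical clusters of 2-d vectors (real systems use many clusters of high-dim vectors).
clusters = [
    [(0.10, 0.10), (0.20, 0.15), (0.12, 0.30)],  # one "neighborhood"
    [(0.90, 0.85), (0.80, 0.90), (0.95, 0.80)],  # another
]
centroids = [(0.14, 0.18), (0.88, 0.85)]

print(ivf_search((0.15, 0.20), centroids, clusters, nprobe=1, k=2))
```

Raising `nprobe` scans more clusters: slower, but closer to exact results. That is the tunable speed/accuracy knob.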
Retrieval-Augmented Generation: how AI answers questions from your documents.
Too small = missing context. Too large = diluted relevance. Test multiple strategies.
A general model on domain-specific text retrieves poorly. Use domain-tuned models.
Top-k by similarity: best for answering. Add a reranker for production quality.
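The retrieval step above reduces to "rank pre-embedded chunks by similarity, keep the top k." A minimal sketch; the 2-d vectors stand in for real embedding-model output and are invented for illustration:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def retrieve_top_k(query_vec, chunks, k=3):
    # chunks: (text, embedding) pairs, embedded ahead of time at index build.
    scored = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in scored[:k]]

# Toy pre-embedded chunks.
chunks = [
    ("Shut off the water supply before repairs.", (0.90, 0.20)),
    ("Our return policy lasts 30 days.",          (0.10, 0.90)),
    ("Tighten the compression fitting slowly.",   (0.85, 0.30)),
]
query = (0.88, 0.25)  # embedding of "how do I fix a leaking joint?"
print(retrieve_top_k(query, chunks, k=2))
```

The top-k texts are then placed into the prompt as context; a reranker would re-score just these k candidates with a heavier model before that step.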
A decision tree, not a comparison table: pick the path that fits your situation.
Three things people overlook, and they're more important than which database you pick.
A strong embedding model with a simple index will outperform a weak model on the fanciest database. Invest in embedding quality first.
There's no universal chunk size. It depends on your data, queries, and context window. Test 256, 512, 1024 tokens with overlap.
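Chunking with overlap is a sliding window: each chunk repeats the tail of the previous one so no sentence is stranded at a boundary. A minimal token-agnostic sketch (here "tokens" are just words for simplicity):

```python
def chunk_tokens(tokens, size, overlap):
    # Step by (size - overlap) so consecutive chunks share `overlap` tokens.
    step = size - overlap
    return [tokens[i:i + size] for i in range(0, max(len(tokens) - overlap, 1), step)]

words = "shut off the water supply before starting any pipe repair".split()
for chunk in chunk_tokens(words, size=4, overlap=1):
    print(" ".join(chunk))
```

Run the same corpus through several (size, overlap) settings and measure retrieval quality on your own queries; that experiment, not a rule of thumb, picks the chunk size.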
Production systems combine vector similarity + keyword/BM25 + metadata filters into a final ranking. Don't rely on vector search alone.
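One common way to fuse the signals is a weighted sum over normalized scores. A sketch, assuming both scores are already scaled to [0, 1] and that the `alpha` weight (0.7 here) is tuned on your own query set, not a standard value:

```python
def hybrid_score(vector_sim, keyword_score, alpha=0.7):
    # alpha weights semantic similarity; the remainder keeps exact-keyword
    # evidence (e.g. a normalized BM25 score) in the final ranking.
    return alpha * vector_sim + (1 - alpha) * keyword_score

# Hypothetical candidates that already passed the metadata filters:
# (doc id, vector similarity, normalized keyword score).
candidates = [
    ("doc_a", 0.92, 0.10),  # semantically close, few exact keyword hits
    ("doc_b", 0.55, 0.95),  # strong exact match, weaker semantics
    ("doc_c", 0.30, 0.20),
]
ranked = sorted(candidates, key=lambda d: hybrid_score(d[1], d[2]), reverse=True)
print([doc for doc, _, _ in ranked])
```

Rank-based fusion (e.g. reciprocal rank fusion) is a common alternative when the two scores aren't on comparable scales.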
Climacs IT Consulting · Based on "Vector Databases Explained" by Darren Broemmer · March 2026
© 2026 Climacs IT Consulting