Vector Databases Made Easy with Real Examples

26th June, 2025

1. So... What is a Vector Database?

Alright, imagine trying to find resumes that *"feel similar"* to a job description — not just based on keywords like “Python” or “ETL” but based on meaning.

That’s where a vector database comes in. It stores content as embeddings: high-dimensional vectors (lists of numbers!) produced by a model so that texts with similar meaning end up close together.
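To make "close in meaning" a bit more concrete, here is a minimal sketch of cosine similarity, one common way to compare two vectors. The three-dimensional vectors are made-up toy values (real embeddings have hundreds of dimensions), and the name `cosine_similarity` is just for illustration.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between a and b (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" (invented values, not real model output)
airflow_resume = [0.9, 0.1, 0.0]
data_eng_job = [0.8, 0.2, 0.1]
pastry_chef = [0.0, 0.1, 0.9]

print(cosine_similarity(airflow_resume, data_eng_job))  # close to 1: similar
print(cosine_similarity(airflow_resume, pastry_chef))   # close to 0: unrelated
```

A score near 1 means the two vectors point the same way, which is what "similar meaning" boils down to once text has been turned into numbers.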

2. Let's Get Nerdy: Code Time!

We’ll use sentence-transformers to convert text into embeddings:

from sentence_transformers import SentenceTransformer

# Load a small pretrained embedding model (downloaded on first use)
model = SentenceTransformer('all-MiniLM-L6-v2')

text = "Built scalable data pipelines using Apache Airflow"
embedding = model.encode(text)  # NumPy array of floats
print(embedding[:5])  # Just showing the first 5 numbers

This returns a long vector that represents the meaning of the text. With all-MiniLM-L6-v2 it has 384 dimensions; many other embedding models produce 768 or more.

3. Storing in a Vector DB (Using ChromaDB)

Now let’s store our resume vector in a vector store. We’ll use ChromaDB, which is easy to use and runs locally.

import chromadb

# In-memory client: data is lost when the process exits
client = chromadb.Client()

collection = client.create_collection("resumes")

collection.add(
    documents=["Built scalable data pipelines using Apache Airflow"],
    embeddings=[embedding.tolist()],  # convert the NumPy array to a plain list
    ids=["resume_1"]
)

4. Searching for Similar Candidates

Say a recruiter types this:

job_desc = "Looking for someone experienced in data engineering and Airflow"
job_embedding = model.encode(job_desc)

# Ask for the single closest stored document
results = collection.query(
    query_embeddings=[job_embedding.tolist()],
    n_results=1
)

print(results["documents"])  # one list of matches per query embedding

✨ Boom! You just performed a semantic search. The most similar resume comes back even if it doesn’t use the exact words of the job description. Raise n_results to get more than one match.
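Curious what a query does conceptually? It ranks the stored vectors by their distance to the query vector and returns the closest ones. Here is a rough brute-force sketch of that idea in plain Python. The helper names `squared_l2` and `nearest` and the toy vectors are assumptions for illustration. Real vector databases typically rely on an approximate nearest-neighbor index rather than scanning every vector, so don't read this as how ChromaDB works internally.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors (smaller = closer)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(store, query, n_results=1):
    """Return the ids of the n_results stored vectors closest to query."""
    ranked = sorted(store, key=lambda item_id: squared_l2(store[item_id], query))
    return ranked[:n_results]

# Toy store: id -> made-up 3-d vector (illustrative values only)
store = {
    "resume_1": [0.9, 0.1, 0.0],
    "resume_2": [0.1, 0.8, 0.1],
    "resume_3": [0.0, 0.1, 0.9],
}
job_vector = [0.8, 0.2, 0.1]

print(nearest(store, job_vector, n_results=2))  # ['resume_1', 'resume_2']
```

Scanning everything like this is fine for a handful of resumes, but it gets slow with millions of vectors, which is why dedicated vector stores exist.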

5. Real-World Use Cases

  • ⚡ AI Chatbots with memory (context retrieval)
  • 🔍 Image or audio similarity search
  • 📄 Document Q&A with RAG (Retrieval Augmented Generation)
  • 🧠 Smart recommendation systems
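
For the RAG use case, the vector search is only the retrieval half: the matching documents are then pasted into a prompt for a language model. Here is a minimal sketch of that gluing step. The function name `build_prompt` and the prompt wording are hypothetical, and the retrieved list stands in for whatever a vector search would return.

```python
def build_prompt(question, retrieved_docs):
    """Combine retrieved passages and the user's question into one prompt string."""
    context = "\n".join(f"- {doc}" for doc in retrieved_docs)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )

# Stand-in for the documents a vector search would return
retrieved = ["Built scalable data pipelines using Apache Airflow"]
print(build_prompt("Which tools has this candidate used?", retrieved))
```

The resulting string would then be sent to whichever language model you use; that call is left out here because it depends on the provider.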

6. TL;DR

✅ Vector DBs let you search by **meaning**, not just keywords.

✅ Great for building **AI-driven search, Q&A, and recommendations**.

✅ Try it out with open tools like ChromaDB, FAISS, or Weaviate.

Want a full tutorial on this? I’ve got a resume matcher project using this exact setup. Let me know and I’ll write a step-by-step build guide!