PostgreSQL for the AI Era: Using pgvector and Semantic Search
You don't need a separate vector database. PostgreSQL can do semantic search alongside your existing relational data — and the hybrid query capability is where it beats dedicated vector databases for most applications.
What pgvector actually is
pgvector is a PostgreSQL extension that adds a vector data type and distance operators. Install it, create a column of type vector(n) where n is the dimension count, and query by similarity using standard SQL.
Requirements: PostgreSQL 12 or later. pgvector 0.5.0+ recommended (HNSW indexes were added in 0.5.0).
Setting it up
CREATE EXTENSION IF NOT EXISTS vector;
-- 1536 dims for OpenAI text-embedding-3-small / ada-002
-- 768 dims for most open-source models (e5-base, bge-base, etc.)
CREATE TABLE documents (
    id BIGSERIAL PRIMARY KEY,
    content TEXT NOT NULL,
    category TEXT NOT NULL,
    source_url TEXT,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    embedding vector(1536)
);
In practice, generate the embedding in your application and pass it as a parameter:
import openai
import numpy as np
import psycopg2
from pgvector.psycopg2 import register_vector  # pip install pgvector

def store_document(conn, content, category, source_url):
    response = openai.embeddings.create(
        model="text-embedding-3-small",
        input=content,
    )
    embedding = response.data[0].embedding  # list of 1536 floats

    # psycopg2 won't adapt a plain list to the vector type;
    # register_vector (once per connection) handles numpy arrays.
    register_vector(conn)
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO documents (content, category, source_url, embedding)"
            " VALUES (%s, %s, %s, %s)",
            (content, category, source_url, np.array(embedding)),
        )
    conn.commit()
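If you'd rather not pull in the pgvector Python package, the extension also accepts its plain text input format (e.g. '[0.1,-0.2,0.3]'), so you can serialise the list yourself and cast in SQL. `to_pgvector` below is a hypothetical helper, not part of any library:

```python
def to_pgvector(values):
    """Serialise floats into pgvector's text input format, e.g. '[0.1,-0.2,0.3]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"

# Used with an explicit cast so PostgreSQL parses the string as a vector:
#   cur.execute(
#       "INSERT INTO documents (content, category, source_url, embedding)"
#       " VALUES (%s, %s, %s, %s::vector)",
#       (content, category, source_url, to_pgvector(embedding)),
#   )
```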
The three distance operators
<-> — Euclidean distance (L2)
Measures straight-line distance in vector space. Use for spatial data or image similarity where magnitude matters.
<=> — Cosine distance
Measures the angle between vectors, ignoring magnitude. Most text embedding models are designed for cosine similarity — use this for semantic search.
SELECT content, embedding <=> '[0.1, -0.2, ...]'::vector AS distance
FROM documents
ORDER BY distance
LIMIT 5;
<#> — Negative inner product
For normalised vectors (unit vectors), inner product equals cosine similarity. Check your model docs to know whether embeddings are normalised.
For most text search use cases, start with <=> (cosine).
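To make the operators concrete, here's a pure-Python sketch of what each one computes (pgvector implements these in C; the formulas below mirror the documented semantics):

```python
import math

def l2_distance(a, b):
    # <-> : straight-line (Euclidean) distance
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    # <=> : 1 - cosine similarity (angle only, magnitude ignored)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)

def neg_inner_product(a, b):
    # <#> : negated dot product (negated so that smaller = more similar)
    return -sum(x * y for x, y in zip(a, b))

a = [3.0, 4.0]
b = [6.0, 8.0]  # same direction as a, twice the magnitude

assert cosine_distance(a, b) < 1e-9   # cosine sees them as identical
assert l2_distance(a, b) == 5.0       # Euclidean does not

# For unit vectors, inner product equals cosine similarity:
u = [x / 5.0 for x in a]   # a normalised (|a| = 5)
v = [x / 10.0 for x in b]  # b normalised (|b| = 10)
assert abs(-neg_inner_product(u, v) - (1 - cosine_distance(u, v))) < 1e-9
```

This is why the choice of operator matters: two documents with embeddings pointing the same way are a perfect cosine match even when their magnitudes differ.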
Making it fast — IVFFlat and HNSW indexes
Without an index, a nearest-neighbour query scans every row in the table. This is exact but slow at scale. pgvector provides two approximate nearest-neighbour index types:
IVFFlat
Partitions the vector space into clusters and searches only the nearest clusters at query time.
-- Build after inserting data (IVFFlat needs data to learn the clusters)
-- Rule of thumb: lists = sqrt(row_count) for < 1M rows
CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
SET ivfflat.probes = 10; -- higher = more accurate, slower
HNSW
Graph-based index with better recall than IVFFlat. Slower to build, uses more memory.
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);
SET hnsw.ef_search = 64; -- higher = more accurate, slower
Trade-off: IVFFlat builds faster and uses less memory, which is good enough for most use cases; HNSW gives better recall and is worth the extra build time and memory for precision-critical applications.
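The rules of thumb above can be wrapped in a small helper. This is a sketch of the heuristics stated here (lists = sqrt(row_count), the HNSW defaults of m = 16 and ef_construction = 64), not official pgvector guidance, and `index_ddl` is a hypothetical name:

```python
import math

def index_ddl(table, column="embedding", row_count=0, precision_critical=False):
    """Sketch: choose an ANN index from the rules of thumb above."""
    if precision_critical:
        # HNSW: better recall, slower build, more memory
        return (f"CREATE INDEX ON {table} USING hnsw ({column} vector_cosine_ops) "
                f"WITH (m = 16, ef_construction = 64);")
    # IVFFlat: lists = sqrt(row_count) for < 1M rows
    lists = max(1, round(math.sqrt(row_count)))
    return (f"CREATE INDEX ON {table} USING ivfflat ({column} vector_cosine_ops) "
            f"WITH (lists = {lists});")

print(index_ddl("documents", row_count=10_000))
# CREATE INDEX ON documents USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
```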
The real power — hybrid queries
This is what PostgreSQL does better than dedicated vector databases like Pinecone. When metadata filtering happens after the ANN search (post-filtering), you search for 100 nearest neighbours and then filter down, which means you may get fewer results than you asked for if many neighbours don't match your filter.
In PostgreSQL, you write a WHERE clause and the query planner handles it:
-- Semantic search within a specific category, last 30 days
SELECT
    d.id,
    d.content,
    d.source_url,
    d.embedding <=> $1 AS distance
FROM documents d
WHERE d.category = 'postgresql'
  AND d.created_at > NOW() - INTERVAL '30 days'
ORDER BY d.embedding <=> $1
LIMIT 10;
The WHERE clause runs on indexed relational columns, and vector similarity is computed on rows that actually match the filter, so you get genuinely filtered results rather than post-hoc filtering of a pre-computed list. (One caveat: when the planner uses the ANN index, a highly selective filter can still return fewer rows than the LIMIT; pgvector 0.8.0 added iterative index scans to mitigate this.) You can also combine vector search with JOINs, CTEs, and aggregates.
A practical example: find similar products
CREATE TABLE products (
    id BIGSERIAL PRIMARY KEY,
    name TEXT NOT NULL,
    category TEXT NOT NULL,
    price NUMERIC(10, 2) NOT NULL,
    description TEXT NOT NULL,
    embedding vector(1536)
);
CREATE INDEX ON products USING hnsw (embedding vector_cosine_ops);
CREATE INDEX ON products (category);
CREATE INDEX ON products (price);
Query: find products similar to product #42, in the same category, under €50:
WITH target AS (
    SELECT embedding, category FROM products WHERE id = 42
)
SELECT
    p.id, p.name, p.price,
    p.embedding <=> t.embedding AS similarity_distance
FROM products p
CROSS JOIN target t
WHERE p.category = t.category
  AND p.price < 50.00
  AND p.id != 42
ORDER BY similarity_distance
LIMIT 10;
This query would require multiple API calls and client-side merging in a dedicated vector database. In PostgreSQL, it's one query.
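For contrast, here's roughly what that client-side merging looks like, sketched in Python over in-memory rows (`similar_products` is a hypothetical function; with a vector-only store, the candidate fetch and the filtering would be separate round trips):

```python
import math

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1 - dot / (norm_a * norm_b)

def similar_products(products, target_id, max_price, k=10):
    """Client-side stand-in for the CTE query: filter first, then rank by distance."""
    target = next(p for p in products if p["id"] == target_id)
    candidates = [
        p for p in products
        if p["category"] == target["category"]
        and p["price"] < max_price
        and p["id"] != target_id
    ]
    candidates.sort(key=lambda p: cosine_distance(p["embedding"], target["embedding"]))
    return candidates[:k]
```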
Limitations worth knowing
For datasets over a few million vectors with very high query throughput requirements, dedicated vector databases (Qdrant, Weaviate, Milvus) offer better performance — they're built exclusively for this use case. pgvector is also single-machine; if you need horizontal scaling of the vector index specifically, you're in custom territory.
The sweet spot for pgvector is what most applications actually need: tens of thousands to a few million vectors, hybrid queries combining vector similarity with relational data, and a single database to maintain. I've been running pgvector in production on a few projects. For document search over ~200,000 documents with category and date filters, query time with HNSW is 8–15ms on a mid-range server.
pgvector turns PostgreSQL into a database that handles vector search without giving up anything else PostgreSQL is good at. For most applications, that's enough: one database to run, one to maintain, no sync process between your vectors and your relational data.