There is no specific code to create an extension inside the Drizzle schema. We assume that if you are using vector types, indexes, and queries, you have a PostgreSQL database with the pg_vector extension installed.
PostgreSQL extensions
pg_vector
pg_vector is open-source vector similarity search for Postgres
Store your vectors with the rest of your data. Supports:
- exact and approximate nearest neighbor search
- single-precision, half-precision, binary, and sparse vectors
- L2 distance, inner product, cosine distance, L1 distance, Hamming distance, and Jaccard distance
Column Types
vector
Store your vectors with the rest of your data
For more info please refer to the official pg_vector docs docs.
const table = pgTable('table', {
embedding: vector({ dimensions: 3 }),
embedding2: vector({ dimensions: 3 }).default([0, -2, 3])
})CREATE TABLE "table" (
"embedding" vector(3),
"embedding2" vector(3) DEFAULT '[0,-2,3]'
);halfvec
Half-precision vector type, stores dense vectors of half-precision float values
For more info please refer to the official pg_vector docs docs.
export const table = pgTable('table', {
halfvec: halfvec({ dimensions: 3 }),
halfvec2: halfvec({ dimensions: 3 }).default([0, -2, 3])
})CREATE TABLE "table" (
"halfvec" halfvec(3),
"halfvec2" halfvec(3) DEFAULT '[0,-2,3]'
);sparsevec
Sparse vector type, stores vectors with mostly zero values efficiently
For more info please refer to the official pg_vector docs docs.
export const table = pgTable('table', {
sparsevec: sparsevec({ dimensions: 3 }),
sparsevec2: sparsevec({ dimensions: 5 }).default(`{1:-1,3:2,5:3}/5`)
})CREATE TABLE "table" (
"sparsevec" sparsevec(3),
"sparsevec2" sparsevec(5) DEFAULT '{1:-1,3:2,5:3}/5'
);bit
Bit string type, used for binary vectors in pgvector
For more info please refer to the official pg_vector docs.
export const table = pgTable('table', {
bit: bit({ dimensions: 5 }),
bit2: bit({ dimensions: 3 }).default(`101`)
})CREATE TABLE "table" (
"bit" bit(5),
"bit2" bit(3) DEFAULT '101'
);Indexes
You can now specify indexes for pg_vector and utilize pg_vector functions for querying, ordering, etc.
Let’s take a few examples of pg_vector indexes from the pg_vector docs and translate them to Drizzle
L2 distance, Inner product and Cosine distance for vector
// CREATE INDEX ON items USING hnsw (embedding vector_l2_ops);
// CREATE INDEX ON items USING hnsw (embedding vector_ip_ops);
// CREATE INDEX ON items USING hnsw (embedding vector_cosine_ops);
export const table = pgTable('items', {
embedding: vector({ dimensions: 3 })
}, (table) => [
index('l2_index').using('hnsw', table.embedding.op('vector_l2_ops')),
index('ip_index').using('hnsw', table.embedding.op('vector_ip_ops')),
index('cosine_index').using('hnsw', table.embedding.op('vector_cosine_ops'))
])L2 distance, Inner product and Cosine distance for halfvec
// CREATE INDEX ON items USING hnsw (embedding halfvec_l2_ops);
// CREATE INDEX ON items USING hnsw (embedding halfvec_ip_ops);
// CREATE INDEX ON items USING hnsw (embedding halfvec_cosine_ops);
export const table = pgTable('items', {
embedding: halfvec({ dimensions: 3 })
}, (table) => [
index('l2_index').using('hnsw', table.embedding.op('halfvec_l2_ops')),
index('ip_index').using('hnsw', table.embedding.op('halfvec_ip_ops')),
index('cosine_index').using('hnsw', table.embedding.op('halfvec_cosine_ops')),
])L2 distance, Inner product and Cosine distance for sparsevec
// CREATE INDEX ON items USING hnsw (embedding sparsevec_l2_ops);
// CREATE INDEX ON items USING hnsw (embedding sparsevec_ip_ops);
// CREATE INDEX ON items USING hnsw (embedding sparsevec_cosine_ops);
export const table = pgTable('items', {
embedding: sparsevec({ dimensions: 3 })
}, (table) => [
index('l2_index').using('hnsw', table.embedding.op('sparsevec_l2_ops')),
index('ip_index').using('hnsw', table.embedding.op('sparsevec_ip_ops')),
index('cosine_index').using('hnsw', table.embedding.op('sparsevec_cosine_ops')),
])Hamming distance and Jaccard distance for bit
// CREATE INDEX "hamming_index" ON "items" USING hnsw ("embedding" bit_hamming_ops);
// CREATE INDEX "bit_jaccard_index" ON "items" USING hnsw ("embedding" bit_jaccard_ops);
export const table = pgTable("items", {
embedding: bit({ dimensions: 3 }),
}, (table) => [
index("hamming_index").using("hnsw", table.embedding.op("bit_hamming_ops")),
index("bit_jaccard_index").using("hnsw", table.embedding.op("bit_jaccard_ops"))
]);Helper Functions
For queries, you can use predefined functions for vectors or create custom ones using the SQL template operator.
You can also use the following helpers:
import { l2Distance, l1Distance, innerProduct,
cosineDistance, hammingDistance, jaccardDistance } from 'drizzle-orm'
l2Distance(table.column, [3, 1, 2]) // table.column <-> '[3, 1, 2]'
l1Distance(table.column, [3, 1, 2]) // table.column <+> '[3, 1, 2]'
innerProduct(table.column, [3, 1, 2]) // table.column <#> '[3, 1, 2]'
cosineDistance(table.column, [3, 1, 2]) // table.column <=> '[3, 1, 2]'
hammingDistance(table.column, '101') // table.column <~> '101'
jaccardDistance(table.column, '101') // table.column <%> '101'If pg_vector has some other functions to use, you can replicate implementation from existing one we have. Here is how it can be done
export function l2Distance(
column: SQLWrapper | AnyColumn,
value: number[] | string[] | TypedQueryBuilder<any> | string,
): SQL {
if (is(value, TypedQueryBuilder<any>) || typeof value === 'string') {
return sql`${column} <-> ${value}`;
}
return sql`${column} <-> ${JSON.stringify(value)}`;
}Name it as you wish and change the operator. This example allows for a numbers array, strings array, string, or even a select query. Feel free to create any other type you want or even contribute and submit a PR
Examples
Let’s take a few examples of pg_vector queries from the pg_vector docs and translate them to Drizzle
import { l2Distance } from 'drizzle-orm';
// SELECT * FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 5;
db.select().from(items).orderBy(l2Distance(items.embedding, [3,1,2]))
// SELECT embedding <-> '[3,1,2]' AS distance FROM items;
db.select({ distance: l2Distance(items.embedding, [3,1,2]) })
// SELECT * FROM items ORDER BY embedding <-> (SELECT embedding FROM items WHERE id = 1) LIMIT 5;
const subquery = db.select({ embedding: items.embedding }).from(items).where(eq(items.id, 1));
db.select().from(items).orderBy(l2Distance(items.embedding, subquery)).limit(5)
// SELECT (embedding <#> '[3,1,2]') * -1 AS inner_product FROM items;
db.select({ innerProduct: sql`(${innerProduct(items.embedding, [3,1,2])}) * -1` }).from(items)
// and more!postgis
There is no specific code to create an extension inside the Drizzle schema. We assume that if you are using postgis types, indexes, and queries, you have a PostgreSQL database with the postgis extension installed.
As PostGIS website mentions:
PostGIS extends the capabilities of the PostgreSQL relational database by adding support for storing, indexing, and querying geospatial data.
If you are using the introspect or push commands with the PostGIS extension and don’t want PostGIS tables to be included, you can use extensionsFilters to ignore all the PostGIS tables
Column Types
geometry
Store your geometry data with the rest of your data
For more info please refer to the official PostGIS docs docs.
const items = pgTable('items', {
geo: geometry('geo', { type: 'point' }),
geoObj: geometry('geo_obj', { type: 'point', mode: 'xy' }),
geoSrid: geometry('geo_options', { type: 'point', mode: 'xy', srid: 4000 }),
});mode
Type geometry has 2 modes for mappings from the database: tuple and xy.
tuplewill be accepted for insert and mapped on select to a tuple. So, the database geometry will be typed as [1,2] with drizzle.xywill be accepted for insert and mapped on select to an object with x, y coordinates. So, the database geometry will be typed as{ x: 1, y: 2 }with drizzle
type
The current release has a predefined type: point, which is the geometry(Point) type in the PostgreSQL PostGIS extension. You can specify any string there if you want to use some other type
Indexes
With the available Drizzle indexes API, you should be able to write any indexes for PostGIS
Examples
// CREATE INDEX custom_idx ON table USING GIST (geom);
const table = pgTable('table', {
geo: geometry({ type: 'point' }),
}, (table) => [
index('custom_idx').using('gist', table.geo)
])