Upstash Vector is a serverless vector database with per-request pricing and a REST API, making it a natural companion to Upstash Redis for AI applications. You get similarity search without running a persistent vector database server. For applications on Vercel Edge or Cloudflare Workers, where persistent TCP connections aren't available, an HTTP-based store like Upstash Vector is often the most practical option for vector search.

## When to Use Upstash Vector

| Scenario | Recommendation |
|---|---|
| RAG on Vercel Edge Functions | Upstash Vector (REST API works on Edge) |
| High-volume production RAG | Supabase pgvector or Pinecone (more features) |
| Small knowledge base (< 100k docs) | Upstash Vector (simple, cheap) |
| Hybrid search (vector + keyword) | Supabase pgvector (SQL + pgvector) |
| Metadata filtering on vectors | Upstash Vector (supported natively) |

## Setup

```bash
npm install @upstash/vector openai
```

Create a Vector Index at console.upstash.com → Vector. Choose a dimension count matching your embedding model (1536 for `text-embedding-3-small`, 3072 for `text-embedding-3-large`).

```bash
# .env.local
UPSTASH_VECTOR_REST_URL=https://your-index.upstash.io
UPSTASH_VECTOR_REST_TOKEN=your-token-here
```

## Indexing Documents

```ts
// lib/vector-store.ts
import { Index } from "@upstash/vector";
import OpenAI from "openai";

const index = new Index({
  url: process.env.UPSTASH_VECTOR_REST_URL!,
  token: process.env.UPSTASH_VECTOR_REST_TOKEN!,
});
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function indexDocument(doc: {
  id: string;
  content: string;
  metadata: Record<string, string | number | boolean>;
}) {
  // Generate embedding
  const response = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: doc.content,
  });
  const embedding = response.data[0].embedding;

  // Upsert into Upstash Vector
  await index.upsert({
    id: doc.id,
    vector: embedding,
    metadata: {
      ...doc.metadata,
      content: doc.content, // store content in metadata for retrieval
    },
  });

  return { id: doc.id, dimensions: embedding.length };
}

export async function indexDocumentBatch(docs: Array<{
  id: string;
  content: string;
  metadata: Record<string, string | number | boolean>;
}>) {
  // Batch embed for efficiency
  const response = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: docs.map((d) => d.content),
  });

  const vectors = docs.map((doc, i) => ({
    id: doc.id,
    vector: response.data[i].embedding,
    metadata: { ...doc.metadata, content: doc.content },
  }));

  // Upstash accepts batches of up to 1,000 vectors
  await index.upsert(vectors);
}
```
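Since Upstash caps upsert batches at 1,000 vectors (and embedding requests have their own input limits), indexing a larger corpus means splitting it into batches first. A minimal, generic splitting helper (the name `chunkArray` is our own, not part of either SDK):

```typescript
// Split an array into consecutive chunks of at most `size` elements.
export function chunkArray<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new RangeError("size must be >= 1");
  const chunks: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Usage sketch: index a large corpus 1,000 docs at a time.
// for (const batch of chunkArray(allDocs, 1000)) {
//   await indexDocumentBatch(batch);
// }
```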

## Querying with Metadata Filters

```ts
export async function searchDocuments(
  query: string,
  options: {
    topK?: number;
    filter?: string; // Upstash metadata filter expression
    includeMetadata?: boolean;
  } = {}
) {
  const { topK = 5, filter, includeMetadata = true } = options;

  // Embed the query
  const response = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: query,
  });
  const queryVector = response.data[0].embedding;

  // Search with optional metadata filter
  const results = await index.query({
    vector: queryVector,
    topK,
    includeMetadata,
    filter, // e.g., 'category = "docs" AND published = true'
  });

  return results.map((r) => ({
    id: r.id,
    score: r.score,
    content: r.metadata?.content as string,
    metadata: r.metadata,
  }));
}
```
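Filter strings are easy to get wrong by hand (quoting, `AND` joining). Here's a small sketch of a builder for flat equality filters; `buildFilter` is a hypothetical helper of our own, and it covers only the `=` operator and `AND` conjunction, a small subset of Upstash's filter syntax:

```typescript
// Build an Upstash-style metadata filter string from flat equality conditions.
// String values are double-quoted; numbers and booleans are emitted bare.
export function buildFilter(
  conditions: Record<string, string | number | boolean>
): string {
  return Object.entries(conditions)
    .map(([key, value]) =>
      typeof value === "string" ? `${key} = "${value}"` : `${key} = ${value}`
    )
    .join(" AND ");
}
```

Usage: `searchDocuments(query, { filter: buildFilter({ category: "docs", published: true }) })`.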

## Building a RAG Endpoint on Vercel Edge

```ts
// app/api/ask/route.ts
import { Index } from "@upstash/vector";
import OpenAI from "openai";

export const runtime = "edge";

const vectorIndex = new Index({
  url: process.env.UPSTASH_VECTOR_REST_URL!,
  token: process.env.UPSTASH_VECTOR_REST_TOKEN!,
});
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function POST(req: Request) {
  const { question, namespace } = await req.json();

  // 1. Embed the question
  const embedResponse = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: question,
  });
  const queryVector = embedResponse.data[0].embedding;

  // 2. Retrieve relevant context
  const results = await vectorIndex.query(
    {
      vector: queryVector,
      topK: 4,
      includeMetadata: true,
    },
    { namespace: namespace ?? "default" } // Upstash Vector supports namespaces
  );

  const context = results
    .filter((r) => r.score > 0.75) // threshold for relevance
    .map((r) => r.metadata?.content as string)
    .join(" --- ");

  // 3. Generate answer with context
  const stream = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    stream: true,
    messages: [
      {
        role: "system",
        content: `Answer based on the following context. If the context doesn't contain the answer, say so.

Context: ${context}`,
      },
      { role: "user", content: question },
    ],
  });

  // 4. Stream the response
  const encoder = new TextEncoder();
  const readable = new ReadableStream({
    async start(controller) {
      for await (const chunk of stream) {
        const text = chunk.choices[0]?.delta?.content ?? "";
        if (text) controller.enqueue(encoder.encode(text));
      }
      controller.close();
    },
  });

  return new Response(readable, {
    headers: { "Content-Type": "text/plain; charset=utf-8" },
  });
}
```
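The 0.75 cutoff above is an empirical relevance threshold, not a universal constant. Upstash normalizes query scores to the 0–1 range; for a cosine-metric index the underlying measure is cosine similarity. To build intuition when tuning a threshold, here is plain cosine similarity computed locally (illustrative only; the database computes scores server-side, and its exact normalization may differ from the raw value):

```typescript
// Cosine similarity between two equal-length vectors: dot(a, b) / (|a| * |b|).
// Ranges from -1 (opposite) through 0 (orthogonal) to 1 (identical direction).
export function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new RangeError("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```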

## Namespaces for Multi-Tenant Apps

```ts
// Isolate vectors per tenant using namespaces.
// Each org's documents stay logically separated.

// Index into a specific namespace
await index.upsert(
  [{ id: "doc1", vector: embedding, metadata: { content: "..." } }],
  { namespace: `org-${orgId}` }
);

// Query only within a namespace
const results = await index.query(
  {
    vector: queryVector,
    topK: 5,
    includeMetadata: true,
  },
  { namespace: `org-${orgId}` }
);
```

Upstash Vector namespaces are free — they don't increase cost or require separate indexes. Use them liberally to isolate data by user, organization, or document type.
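Tenant identifiers often contain characters (spaces, punctuation, mixed case) you may not want in a namespace name. A sketch of deriving a deterministic, sanitized namespace per tenant; `tenantNamespace` is a hypothetical helper of our own, not part of the SDK:

```typescript
// Derive a deterministic namespace name from a tenant identifier.
// Lowercases, collapses runs of disallowed characters to "-", trims edge dashes.
export function tenantNamespace(orgId: string): string {
  const safe = orgId
    .toLowerCase()
    .replace(/[^a-z0-9-]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return `org-${safe}`;
}
```

Because the mapping is deterministic, the same tenant always lands in the same namespace at both index and query time.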

## Upstash Vector Pricing

| Tier | Vectors | Queries/day | Price |
|---|---|---|---|
| Free | 10,000 | 10,000 | $0 |
| Pay-as-you-go | Unlimited | Unlimited | $0.40 per 100k queries |
| Fixed 100M | 100M | Unlimited | $70/month |