Add Vector Search to Your FastAPI App With VectorAI DB

Summary

  • Build semantic search in FastAPI with a simple local setup instead of a heavier cloud stack.
  • Create an ingest endpoint that embeds product descriptions and stores them with metadata.
  • Create a search endpoint that returns products by meaning, not just exact keywords.
  • Add structured filters like category directly inside the vector search.
  • Next steps are async support, hybrid search, and richer filtering for production use.

Adding semantic search to a FastAPI application often starts with friction. Most guides push developers toward cloud services that require account setup and API keys, or toward local stacks that depend on multiple services before you write any useful code.

This overhead appears before the feature is even built, shifting effort away from API development and toward infrastructure setup.

This tutorial takes a simpler approach. Actian VectorAI DB runs as a single Docker container and connects through a Python or JavaScript client. For early development and prototyping, you do not need a cloud account, API keys, or multiple services. Install the database and the client, and you can begin building your API. Compared to setups that rely on managed services or multi-container stacks, this reduces setup time, cost, and moving parts.

In this tutorial, you will build a small product search API to make this concrete. The goal is to let users search by meaning instead of exact keywords. For example, a query like “something warm to wear in winter” should return jackets or sweaters, even if those exact words do not appear in the product description.

By the end of this tutorial, you will have a working FastAPI application with semantic search running locally, using a setup you can run, test, and extend without extra infrastructure.

Setup

In this section, you will start VectorAI DB and set up your FastAPI project using uv. At the end, you will have a running vector database and a Python environment ready to connect to it.

Prerequisites

To follow along, install the following tools on your local machine:

  • Docker and Docker Compose
  • Python 3.10 or later
  • pip or uv (this guide uses uv)

Start Actian VectorAI DB

Run VectorAI DB as a single Docker container. Create a docker-compose.yml file with the following content:

services:
  vectorai:
    image: williamimoh/actian-vectorai-db:latest
    platform: linux/amd64
    container_name: vectorai_db
    ports:
      - "50051:50051"
    volumes:
      # vector data persists across restarts
      - ./data:/app/data
    environment:
      - VECTORAI_LOG_LEVEL=info
    restart: unless-stopped

Start the service:

docker compose up -d

You now have VectorAI DB running locally on port 50051.
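Before installing the SDK, you can confirm that something is listening on the published port with a standard-library check. This is a minimal sketch: the helper name port_is_open is hypothetical, and a successful TCP connection only shows that the port is reachable, not that the database itself is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 50051 is the port published in docker-compose.yml.
    print("VectorAI DB port reachable:", port_is_open("localhost", 50051))
```

If this prints False, check the container status with docker compose ps before moving on.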

Create Your FastAPI Project With uv

Now set up your Python project. If you do not have uv installed, install it first.

Initialize a new project and virtual environment in your current directory by running these commands:

uv init .
uv venv

Install FastAPI, Uvicorn, and the embedding model dependency:

uv add fastapi uvicorn sentence-transformers

Sign up for the Actian VectorAI DB community edition. Once registered, you will receive instructions for setting up the client, either by downloading the binary or by running it with a Docker container, depending on your preference.

Install the VectorAI DB Python SDK:

uv add actian-vectorai-client

Verify the Connection

Create a simple script, test_connection.py, to check that your app connects to VectorAI DB:

from actian_vectorai import VectorAIClient

VECTORAI_HOST = "localhost:50051"

with VectorAIClient(VECTORAI_HOST) as client:
    # Health check
    info = client.health_check()
    print(f"Connected to {info['title']} v{info['version']}")

Run the script:

uv run test_connection.py

If everything works, you should see a successful connection message.

[Figure: Test the connection]

You are now ready to define your data model and start building your API.

Build the Ingest Endpoint

You will build a POST /ingest endpoint that accepts a list of products, embeds each description, and stores the results in VectorAI DB. This enables you to send a JSON list of products to your API and have them ready to search.

Start with the data model and collection setup. Create a file main.py with the following contents:

from contextlib import asynccontextmanager
from typing import List

from fastapi import FastAPI
from pydantic import BaseModel
from sentence_transformers import SentenceTransformer
from actian_vectorai import VectorAIClient, VectorParams, Distance, PointStruct, CollectionExistsError

COLLECTION = "products_collection"
DIMENSION = 384  # all-MiniLM-L6-v2 produces 384-dimensional vectors

model = SentenceTransformer("all-MiniLM-L6-v2")

class Product(BaseModel):
    id: int
    name: str
    description: str
    category: str
    price: float

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup
    with VectorAIClient("localhost:50051") as client:
        try:
            client.collections.create(
                COLLECTION,
                vectors_config=VectorParams(size=DIMENSION, distance=Distance.Cosine)
            )
        except CollectionExistsError:
            pass
    yield

app = FastAPI(lifespan=lifespan)

This tutorial uses all-MiniLM-L6-v2, a 22.7M-parameter model that runs on CPU and produces 384-dimensional vectors. It is fast enough for local development and accurate enough for product search. The collection uses cosine distance, which measures the angle between vectors rather than their magnitude. This works well for text embeddings because the ranking depends on the direction of each vector, which encodes meaning, rather than on its magnitude.
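To see why the angle matters more than the magnitude, here is a minimal pure-Python sketch of cosine similarity (cosine distance is commonly defined as one minus this value). The three toy vectors are made up for illustration and are not real model embeddings.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between a and b: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


short_text = [1.0, 2.0, 3.0]
long_text = [10.0, 20.0, 30.0]  # same direction, ten times the magnitude
unrelated = [3.0, -1.5, 0.0]    # orthogonal to short_text: 3 - 3 + 0 = 0

print(round(cosine_similarity(short_text, long_text), 6))  # 1.0
print(round(cosine_similarity(short_text, unrelated), 6))  # 0.0
```

Scaling a vector leaves its score unchanged, while changing its direction does not.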

Now build the ingest endpoint.

@app.post("/ingest")
def ingest(products: List[Product]):
    descriptions = [p.description for p in products]
    embeddings = model.encode(descriptions, convert_to_numpy=True)

    points = [
        PointStruct(
            id=p.id,
            vector=embeddings[i].tolist(),
            payload={
                "name": p.name,
                "category": p.category,
                "price": p.price,
            }
        )
        for i, p in enumerate(products)
    ]

    with VectorAIClient("localhost:50051") as client:
        client.points.upsert(COLLECTION, points)

    return {"inserted": len(points)}

This does the following:

  • Extracts all product descriptions into a list so you can embed them in one batch.
  • Generates embeddings using the all-MiniLM-L6-v2 model.
  • Converts each product into a PointStruct with an id, vector, and payload.
  • Stores useful metadata like name, category, and price alongside the vector.
  • Inserts all points into VectorAI DB in a single request.

Build the Search Endpoint

Next, add a GET /search endpoint that accepts a text query and an optional category filter. The endpoint embeds the query, performs a vector search in VectorAI DB, and returns the most similar products (the top five by default).

This step completes the search flow, moving from raw text input to semantically ranked results in a single request.

Why this filter matters

The category filter is where VectorAI DB improves the developer experience. With FAISS, you typically run a full vector search first, then filter results in Python. That means you waste compute on results you will discard.

VectorAI DB filters inside the search operation. It considers only matching vectors during retrieval. This keeps search efficient and simpler to reason about.
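The difference between the two approaches can be illustrated with a toy in-memory example. This is only a sketch of the idea, not how FAISS or VectorAI DB work internally: the item vectors are made up, and ranking uses a plain dot product.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


ITEMS = [
    {"name": "Bluetooth Speaker", "category": "electronics", "vector": [0.9, 0.1]},
    {"name": "Headphones", "category": "electronics", "vector": [0.8, 0.2]},
    {"name": "Cashmere Scarf", "category": "clothing", "vector": [0.7, 0.3]},
    {"name": "Wool Socks", "category": "clothing", "vector": [0.6, 0.4]},
]


def post_filter_search(query, category, top_k):
    # Rank everything first, then throw away results in the wrong category.
    ranked = sorted(ITEMS, key=lambda it: dot(query, it["vector"]), reverse=True)
    return [it["name"] for it in ranked[:top_k] if it["category"] == category]


def filtered_search(query, category, top_k):
    # Only items that match the filter are ever scored and ranked.
    candidates = [it for it in ITEMS if it["category"] == category]
    ranked = sorted(candidates, key=lambda it: dot(query, it["vector"]), reverse=True)
    return [it["name"] for it in ranked[:top_k]]


query = [1.0, 0.0]
print(post_filter_search(query, "clothing", top_k=2))  # []
print(filtered_search(query, "clothing", top_k=2))     # ['Cashmere Scarf', 'Wool Socks']
```

In this toy case, post-filtering scores every item and can still return fewer than top_k results, because the top slots were taken by items the filter later discards.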

Implement the endpoint

Add the following imports and endpoint to your main.py file:

from typing import Optional

from actian_vectorai import FilterBuilder, Field

@app.get("/search")
def search(query: str, category: Optional[str] = None, top_k: int = 5):
    # Embed the query text with the same model used at ingest time
    query_vector = model.encode([query], convert_to_numpy=True)[0].tolist()

    # Build a filter only when a category is supplied
    search_filter = None
    if category:
        search_filter = FilterBuilder().must(Field("category").eq(category)).build()

    with VectorAIClient("localhost:50051") as client:
        results = client.points.search(
            COLLECTION,
            vector=query_vector,
            limit=top_k,
            filter=search_filter
        )

    return [
        {
            "id": r.id,
            "score": round(r.score, 4),
            "name": r.payload["name"],
            "category": r.payload["category"],
            "price": r.payload["price"],
        }
        for r in results
    ]

How it works

  • Convert the user query into an embedding.
  • Build a filter only if a category is provided.
  • Pass both the vector and filter to VectorAI DB in a single request.
  • Return the top results with similarity scores and metadata.

The filter objects are typed Python objects. This helps you catch mistakes early and keeps your query structure consistent.

When category is not provided, search_filter stays None. The search then runs across all products.

Run It

Start VectorAI DB, then start FastAPI. This takes two commands:

docker compose up -d

uv run uvicorn main:app --reload

In another terminal window, ingest three sample products:

curl -X POST http://localhost:8000/ingest \
  -H "Content-Type: application/json" \
  -d '[
    {"id": 1, "name": "Cashmere Scarf", "description": "Soft cashmere scarf, ideal for cold weather", "category": "clothing", "price": 49.00},
    {"id": 2, "name": "Bluetooth Speaker", "description": "Portable waterproof speaker with 12-hour battery", "category": "electronics", "price": 59.99},
    {"id": 3, "name": "Trail Mix", "description": "Mixed nuts and dried fruit, high-energy snack", "category": "food", "price": 8.50}
  ]'

You get a response:

{"inserted": 3}

Run a semantic search with a category filter:

curl "http://localhost:8000/search?query=something+warm+for+winter&category=clothing"

You should see a response like the following:

[Figure: Search query with category]

Run the same query without the filter to see all categories ranked by relevance:

curl -s "http://localhost:8000/search?query=something+warm+for+winter" | jq

The scarf still scores highest. The speaker and trail mix return with lower scores because their descriptions carry less semantic overlap with the query.

[Figure: Search query without category]
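If you would rather build these search URLs from Python than type them by hand, the standard library can encode the query string for you. This is a small sketch: build_search_url is a hypothetical helper, and the parameter names match the /search endpoint defined earlier.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000/search"  # uvicorn's default port, as used above


def build_search_url(
    query: str,
    category: Optional[str] = None,
    top_k: Optional[int] = None,
) -> str:
    """Build a /search URL, leaving out any optional parameter that is not set."""
    params = {"query": query}
    if category is not None:
        params["category"] = category
    if top_k is not None:
        params["top_k"] = top_k
    # urlencode escapes spaces as "+" and joins the pairs with "&".
    return f"{BASE_URL}?{urlencode(params)}"


print(build_search_url("something warm for winter", category="clothing"))
# http://localhost:8000/search?query=something+warm+for+winter&category=clothing
```

The printed URL is the same one used in the filtered curl command.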

So far, you have built the architecture as shown in the image.

[Figure: High-level architectural diagram]

Let’s look at what to build next.

What to Build Next

This setup works well for local development and small workloads. The next step is to prepare it for higher throughput and more advanced search patterns.

Use the async client for production workloads

The current implementation uses a synchronous client, which works well for local development and small tests but limits throughput as concurrency increases. Each database call blocks the request thread until completion, reducing overall performance under load.

VectorAI DB provides an AsyncVectorAIClient that removes this bottleneck. It integrates with FastAPI’s async model, allowing endpoints to handle other requests while awaiting database operations. This approach is especially important for concurrent searches, batch ingestion, and background data updates.

In a production setup, this change also pairs well with running multiple uvicorn workers. Together, they allow your service to scale horizontally without changing your core logic.
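The benefit of awaiting database calls can be sketched with asyncio alone. In the example below, fake_search is a hypothetical stand-in for an awaitable database call; it does not use AsyncVectorAIClient, whose method signatures are not covered in this tutorial.

```python
import asyncio
import time


async def fake_search(query: str) -> str:
    # Hypothetical stand-in for an awaitable database call that takes 0.2s.
    await asyncio.sleep(0.2)
    return f"results for {query!r}"


async def handle_requests(queries: list[str]) -> list[str]:
    # While one call is waiting, the event loop runs the others.
    return await asyncio.gather(*(fake_search(q) for q in queries))


start = time.perf_counter()
results = asyncio.run(handle_requests(["scarf", "speaker", "snack"]))
elapsed = time.perf_counter() - start

print(results)
print(f"3 overlapping calls took {elapsed:.2f}s")
```

Run one after another, the three calls would take about 0.6 seconds. Because they overlap, the total stays close to the duration of a single call.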

Add hybrid search

Vector similarity alone works well when the query is vague or descriptive, because it focuses on meaning rather than exact words. However, it can struggle in cases where precision matters, especially with product names, model numbers, or branded terms.

VectorAI DB combines vector similarity with keyword matching in a single query. The vector component captures semantic meaning, while the keyword component ensures exact terms are not lost in translation. Both signals contribute to the final ranking.

Hybrid search helps in real-world scenarios where users mix intent and precision in the same query. For example, a search like “Nike running shoes size 42” contains both semantic intent (running shoes for sports use) and exact constraints (Nike and size 42). Hybrid search ensures you do not lose either signal, which improves result quality without adding extra logic on your side.
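One common way to merge a vector ranking with a keyword ranking is reciprocal rank fusion (RRF). The sketch below illustrates that general technique with made-up product IDs. It is not a description of how VectorAI DB combines the two signals internally.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists: each list adds 1 / (k + rank) to a document's score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical results for the query "Nike running shoes size 42".
vector_ranking = ["trail-shoe", "nike-runner", "gym-sock"]   # ranked by meaning
keyword_ranking = ["nike-runner", "nike-cap", "trail-shoe"]  # ranked by exact terms

print(reciprocal_rank_fusion([vector_ranking, keyword_ranking]))
# ['nike-runner', 'trail-shoe', 'nike-cap', 'gym-sock']
```

The product that ranks well in both lists comes out on top, even though it was not first in the vector ranking.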

Extend filtering and ranking logic

Right now, your search only filters by category, which works for a simple product demo. In real applications, filtering becomes more dynamic because users expect to narrow results based on multiple conditions.

You can extend your filters to support things like price ranges, stock availability, brand names, or any other structured attribute in your dataset. VectorAI DB’s typed filter system makes this easier because you build filters as structured Python objects rather than ad-hoc query dictionaries. This reduces errors and keeps your search logic consistent with the rest of your FastAPI application.

As the dataset grows, these filters become essential for controlling result space before ranking. Vector search continues to handle relevance, while filtering narrows the dataset to records that match user constraints.
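As a way to think about multi-condition filters before wiring them into the database, the sketch below models the conditions in plain Python. SearchFilters and matches are hypothetical names, and the code deliberately avoids the FilterBuilder API, because this tutorial only shows its equality operator.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchFilters:
    category: Optional[str] = None
    min_price: Optional[float] = None
    max_price: Optional[float] = None


def matches(payload: dict, f: SearchFilters) -> bool:
    """Return True if the payload satisfies every condition that is set."""
    if f.category is not None and payload["category"] != f.category:
        return False
    if f.min_price is not None and payload["price"] < f.min_price:
        return False
    if f.max_price is not None and payload["price"] > f.max_price:
        return False
    return True


PAYLOADS = [
    {"name": "Cashmere Scarf", "category": "clothing", "price": 49.00},
    {"name": "Bluetooth Speaker", "category": "electronics", "price": 59.99},
    {"name": "Trail Mix", "category": "food", "price": 8.50},
]

under_50 = SearchFilters(max_price=50.0)
print([p["name"] for p in PAYLOADS if matches(p, under_50)])
# ['Cashmere Scarf', 'Trail Mix']
```

Each unset field is skipped, so the same structure covers anything from no filter at all to several combined constraints.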

Conclusion

In this tutorial, you built a complete semantic search system using FastAPI and VectorAI DB. You moved from a blank project to a working API that can understand natural language queries and return relevant results based on meaning, not just keywords.

You created an ingestion pipeline that converts product descriptions into embeddings using all-MiniLM-L6-v2 and stores them in a vector collection with metadata. You then built a search endpoint that embeds user queries and retrieves the most relevant products using vector similarity, with optional filtering for structured constraints like category.

Adding vector search to a FastAPI application does not require a complex stack or external cloud service. VectorAI DB runs as a single container and integrates directly through a Python client, making it easy to move from setup to implementation quickly.

From here, you can extend the system with async clients, hybrid search, and richer filtering logic as your application grows into more production-ready workloads.

Refer to the VectorAI DB documentation and GitHub repository for updates and implementation details.

Sign up for the Actian VectorAI DB Community Edition and begin building today.