ALPHA RELEASE (v0.1) — Aixgo is in active development. Not all features are complete. Production release planned for late 2025.

Provider Comparison

Compare vectorstore and embedding providers to choose the right stack

Choosing the right combination of vectorstore and embedding provider is critical for performance, cost, and developer experience. This guide helps you make informed decisions based on your requirements.

Why Provider Choice Matters

Your choice of vectorstore and embedding provider affects:

  • Performance: Query latency, throughput, and scalability
  • Cost: Development costs, API fees, and infrastructure expenses
  • Developer Experience: Setup complexity, debugging tools, and documentation
  • Production Readiness: Reliability, monitoring, and operational overhead

The good news: Aixgo’s abstraction layer lets you start simple and upgrade later without code changes.
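
As an illustration, provider selection can live behind a single constructor so the rest of the application never changes. This is a sketch only: the shared vectorstore.Store interface name and its parent package import are assumptions, and gcfirestore is our alias for Google's client package.

import (
    "context"

    gcfirestore "cloud.google.com/go/firestore"
    "github.com/aixgo-dev/aixgo/pkg/vectorstore" // hypothetical parent package
    "github.com/aixgo-dev/aixgo/pkg/vectorstore/firestore"
    "github.com/aixgo-dev/aixgo/pkg/vectorstore/memory"
)

// newStore selects a backend by environment. Callers depend only on the
// returned interface, so swapping providers is a configuration change.
func newStore(ctx context.Context, env string) (vectorstore.Store, error) {
    if env == "production" {
        client, err := gcfirestore.NewClient(ctx, "my-project")
        if err != nil {
            return nil, err
        }
        return firestore.New(ctx, client) // managed, persistent
    }
    return memory.New() // development: fast, ephemeral
}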

Vector Store Providers

Detailed Comparison

Feature          | Memory    | Firestore   | Qdrant           | pgvector
---------------- | --------- | ----------- | ---------------- | ---------------
Persistence      | No        | Yes         | Yes              | Yes
Scalability      | 10K docs  | Unlimited   | Very High        | High
Setup Complexity | None      | Medium      | Medium           | Medium
Cost             | Free      | Pay-per-use | Self-host/Cloud  | Self-host
Collections      | Yes       | Yes         | Yes              | Yes
TTL Support      | Yes       | Yes         | Yes              | No
Multi-tenancy    | Yes       | Yes         | Yes              | Yes
Query Speed      | Very Fast | Fast        | Very Fast        | Fast
Distributed      | No        | Yes         | Yes              | Yes
Backup/Recovery  | No        | Automatic   | Manual           | Manual
Best For         | Dev/Test  | Production  | High Performance | PostgreSQL Apps

Memory

Ideal for: Development, testing, prototyping

import "github.com/aixgo-dev/aixgo/pkg/vectorstore/memory"

store, err := memory.New()
if err != nil {
    log.Fatal(err)
}

collection := store.Collection("my-data")

Pros:

  • Zero setup required
  • Blazing fast (in-memory)
  • Perfect for development and testing
  • No external dependencies

Cons:

  • Data lost on restart
  • Limited to single process
  • Not suitable for production
  • Memory constraints (typically 10K docs max)

When to use:

  • Local development
  • Integration tests
  • Proof of concepts
  • Learning and experimentation
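
The collection handle from the example above is what you read and write. This guide doesn't document the collection methods, so the following is a hypothetical sketch: Add, Search, and their signatures are assumptions, not published Aixgo API.

// Hypothetical usage sketch: Add and Search are assumed method names.
vec := []float32{ /* embedding from your provider */ }

// Store a document alongside its embedding (assumed signature).
if err := collection.Add(ctx, "doc-1", vec, map[string]any{"source": "guide"}); err != nil {
    log.Fatal(err)
}

// Retrieve the top-3 nearest documents (assumed signature).
results, err := collection.Search(ctx, vec, 3)
if err != nil {
    log.Fatal(err)
}
fmt.Println(results)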

Firestore

Ideal for: Managed production deployments, serverless applications

import (
    gcfirestore "cloud.google.com/go/firestore"
    "github.com/aixgo-dev/aixgo/pkg/vectorstore/firestore"
)

// gcfirestore aliases Google's client package so it doesn't collide with
// Aixgo's firestore adapter.
client, err := gcfirestore.NewClient(ctx, "project-id")
if err != nil {
    log.Fatal(err)
}
store, err := firestore.New(ctx, client)
if err != nil {
    log.Fatal(err)
}

collection := store.Collection("my-data")

Pros:

  • Fully managed (no ops overhead)
  • Automatic scaling
  • Built-in backup and recovery
  • Global distribution
  • Strong consistency guarantees
  • Generous free tier

Cons:

  • Vendor lock-in (GCP only)
  • Pay-per-operation pricing
  • Query complexity limitations
  • Network latency for queries

Pricing (approximate):

  • Free tier: 1 GB storage, 50K reads/day, 20K writes/day
  • Beyond free tier: $0.18/GB/month storage, $0.06 per 100K reads

When to use:

  • Production applications on GCP
  • Serverless deployments (Cloud Run, Cloud Functions)
  • Multi-region applications
  • When you want zero operational overhead

Qdrant

Ideal for: High-performance production, large-scale deployments

Status: Coming soon (the example below shows the planned interface)

import "github.com/aixgo-dev/aixgo/pkg/vectorstore/qdrant"

store, err := qdrant.New(ctx, qdrant.Config{
    URL: "http://localhost:6333",
})
if err != nil {
    log.Fatal(err)
}

collection := store.Collection("my-data")

Pros:

  • Extremely fast queries
  • Rich filtering capabilities
  • Excellent documentation
  • Active development
  • Cloud or self-hosted options
  • REST and gRPC APIs

Cons:

  • Requires infrastructure setup
  • Self-hosting operational overhead
  • Cloud pricing can be expensive at scale

Pricing (approximate):

  • Self-hosted: Free (infrastructure costs only)
  • Qdrant Cloud: Starting at $25/month for 1GB

When to use:

  • High-throughput applications
  • Complex filtering requirements
  • Large datasets (millions of vectors)
  • When you need maximum performance
  • Self-hosted infrastructure preferred

pgvector

Ideal for: PostgreSQL-based applications, existing database infrastructure

Status: Coming soon (the example below shows the planned interface)

import "github.com/aixgo-dev/aixgo/pkg/vectorstore/pgvector"

store, err := pgvector.New(ctx, "postgresql://user:pass@localhost/db")
if err != nil {
    log.Fatal(err)
}

collection := store.Collection("my-data")

Pros:

  • Leverage existing PostgreSQL infrastructure
  • ACID transactions
  • Mature ecosystem
  • Familiar SQL interface
  • Great for hybrid relational + vector data

Cons:

  • Manual scaling required
  • Performance depends on PostgreSQL tuning
  • No built-in TTL support
  • Less optimized for pure vector search

When to use:

  • Already using PostgreSQL
  • Need transactional consistency
  • Hybrid relational/vector queries
  • Want to avoid additional infrastructure

Embedding Providers

Detailed Comparison

Provider        | Quality        | Speed     | Cost                 | Dimensions | Best For
--------------- | -------------- | --------- | -------------------- | ---------- | -----------
HuggingFace API | Good-Excellent | Medium    | Free                 | 384-1024   | Development
HuggingFace TEI | Good-Excellent | Very Fast | Self-host            | 384-1024   | Production
OpenAI          | Excellent      | Fast      | $0.02-0.13/1M tokens | 1536-3072  | Production

HuggingFace API

Ideal for: Development, testing, learning

import "github.com/aixgo-dev/aixgo/pkg/embeddings/huggingface"

embedder, err := huggingface.New(ctx, huggingface.Config{
    APIKey: os.Getenv("HF_API_KEY"),
    Model:  "sentence-transformers/all-MiniLM-L6-v2",
})
if err != nil {
    log.Fatal(err)
}

embedding, err := embedder.EmbedText(ctx, "Hello world")
if err != nil {
    log.Fatal(err)
}

Popular Models:

  • sentence-transformers/all-MiniLM-L6-v2 (384 dims, fast, good quality)
  • BAAI/bge-small-en-v1.5 (384 dims, excellent for English)
  • intfloat/multilingual-e5-base (768 dims, multilingual)

Pros:

  • Free tier available
  • Many model choices
  • Good quality embeddings
  • Easy to get started

Cons:

  • API rate limits
  • Network latency
  • Potential availability issues
  • Slower than self-hosted

Pricing:

  • Free tier with rate limits
  • Paid tiers starting at $9/month

When to use:

  • Development and prototyping
  • Low-volume applications
  • Budget-conscious projects
  • Testing different models

HuggingFace TEI (Text Embeddings Inference)

Ideal for: Production, high-throughput, cost-sensitive deployments

import "github.com/aixgo-dev/aixgo/pkg/embeddings/huggingface"

embedder, err := huggingface.New(ctx, huggingface.Config{
    BaseURL: "http://localhost:8080",
    Model:   "BAAI/bge-small-en-v1.5",
})
if err != nil {
    log.Fatal(err)
}

Deployment (Docker):

docker run -p 8080:80 \
    -v $PWD/data:/data \
    ghcr.io/huggingface/text-embeddings-inference:latest \
    --model-id BAAI/bge-small-en-v1.5

Pros:

  • Very fast (optimized inference)
  • No API costs (after infrastructure)
  • Batch processing support
  • GPU acceleration available
  • Full control over deployment

Cons:

  • Requires infrastructure setup
  • Operational overhead
  • GPU costs for maximum performance
  • Manual scaling required

Cost Estimate:

  • CPU instance: ~$50-100/month (Cloud Run, ECS)
  • GPU instance: ~$200-500/month (for high throughput)
  • Unlimited embeddings after infrastructure cost

When to use:

  • Production applications
  • High-volume embedding generation
  • Cost optimization at scale
  • When you need predictable latency

OpenAI

Ideal for: Production applications prioritizing quality over cost

import "github.com/aixgo-dev/aixgo/pkg/embeddings/openai"

embedder, err := openai.New(ctx, openai.Config{
    APIKey: os.Getenv("OPENAI_API_KEY"),
    Model:  "text-embedding-3-small",
})
if err != nil {
    log.Fatal(err)
}

Available Models:

  • text-embedding-3-small (1536 dims, $0.02/1M tokens)
  • text-embedding-3-large (3072 dims, $0.13/1M tokens)
  • text-embedding-ada-002 (1536 dims, $0.10/1M tokens, legacy)

Pros:

  • Excellent quality
  • Reliable infrastructure
  • Fast response times
  • Simple API
  • No operational overhead

Cons:

  • Costs scale with usage
  • Vendor lock-in
  • API rate limits (tier-based)
  • Less control over model

Pricing Examples (text-embedding-3-small at $0.02/1M tokens; the figures below work out to roughly 100 tokens per document, and a sketch of the arithmetic follows the list):

  • 1K documents: ~$0.002 (essentially free)
  • 100K documents: ~$0.20
  • 1M documents: ~$2.00
  • 10M documents: ~$20.00
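
The arithmetic behind these figures is easy to rerun for your own corpus. A plain helper, no Aixgo APIs involved:

// estimateEmbeddingCost returns the one-time USD cost to embed a corpus.
func estimateEmbeddingCost(docs, avgTokensPerDoc int, usdPerMillionTokens float64) float64 {
    totalTokens := float64(docs) * float64(avgTokensPerDoc)
    return totalTokens / 1_000_000 * usdPerMillionTokens
}

// estimateEmbeddingCost(1_000_000, 100, 0.02) == 2.00, matching the
// "1M documents: ~$2.00" row above at ~100 tokens per document.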

When to use:

  • Production applications
  • When quality is critical
  • Moderate to high volume
  • When you want reliable, managed service

Development Stack

Best for: Local development, prototyping, learning

Embedding: HuggingFace API (free tier)
Vectorstore: Memory

Setup time: 5 minutes

// Complete working example
import (
    "context"
    "log"
    "os"

    "github.com/aixgo-dev/aixgo/pkg/embeddings/huggingface"
    "github.com/aixgo-dev/aixgo/pkg/vectorstore/memory"
)

ctx := context.Background()

// Embedding provider
embedder, err := huggingface.New(ctx, huggingface.Config{
    APIKey: os.Getenv("HF_API_KEY"),
    Model:  "sentence-transformers/all-MiniLM-L6-v2",
})
if err != nil {
    log.Fatal(err)
}

// Vector store
store, err := memory.New()
if err != nil {
    log.Fatal(err)
}
collection := store.Collection("dev-data")

Cost: Free
Performance: Fast (local storage, API network latency)
Best for: Getting started quickly

Production Stack (Managed)

Best for: Production applications on GCP, serverless deployments

Embedding: OpenAI text-embedding-3-small
Vectorstore: Firestore

Setup time: 30 minutes

import (
    gcfirestore "cloud.google.com/go/firestore"
    "github.com/aixgo-dev/aixgo/pkg/embeddings/openai"
    "github.com/aixgo-dev/aixgo/pkg/vectorstore/firestore"
)

// Embedding provider
embedder, err := openai.New(ctx, openai.Config{
    APIKey: os.Getenv("OPENAI_API_KEY"),
    Model:  "text-embedding-3-small",
})
if err != nil {
    log.Fatal(err)
}

// Vector store (gcfirestore aliases Google's client package)
client, err := gcfirestore.NewClient(ctx, "project-id")
if err != nil {
    log.Fatal(err)
}
store, err := firestore.New(ctx, client)
if err != nil {
    log.Fatal(err)
}
collection := store.Collection("prod-data")

Cost: ~$50-200/month (depending on scale)
Performance: Fast, globally distributed
Best for: Most production applications

Production Stack (Self-Hosted)

Best for: High-volume, cost-sensitive, maximum performance

Embedding: HuggingFace TEI (self-hosted)
Vectorstore: Qdrant (self-hosted or cloud)

Setup time: 2-4 hours

Infrastructure (Docker Compose):

version: '3.8'
services:
  embeddings:
    image: ghcr.io/huggingface/text-embeddings-inference:latest
    command: --model-id BAAI/bge-small-en-v1.5
    ports:
      - "8080:80"
    volumes:
      - ./data:/data

  qdrant:
    image: qdrant/qdrant:latest
    ports:
      - "6333:6333"
    volumes:
      - ./qdrant_data:/qdrant/storage

Application Code:

import (
    "github.com/aixgo-dev/aixgo/pkg/embeddings/huggingface"
    "github.com/aixgo-dev/aixgo/pkg/vectorstore/qdrant"
)

// Embedding provider (self-hosted TEI)
embedder, _ := huggingface.New(ctx, huggingface.Config{
    BaseURL: "http://localhost:8080",
    Model:   "BAAI/bge-small-en-v1.5",
})

// Vector store (self-hosted Qdrant)
store, _ := qdrant.New(ctx, qdrant.Config{
    URL: "http://localhost:6333",
})
collection := store.Collection("prod-data")

Cost: ~$100-300/month (infrastructure only)
Performance: Extremely fast
Best for: High-volume applications, cost optimization

Budget Stack

Best for: Bootstrapped startups, side projects, MVPs

Embedding: HuggingFace API (free tier)
Vectorstore: Firestore (free tier)

Setup time: 15 minutes
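
The wiring is the development stack's embedder pointed at the Firestore adapter shown earlier; gcfirestore aliases Google's client package to avoid the name clash.

import (
    gcfirestore "cloud.google.com/go/firestore"
    "github.com/aixgo-dev/aixgo/pkg/embeddings/huggingface"
    "github.com/aixgo-dev/aixgo/pkg/vectorstore/firestore"
)

// Embedding provider (HF free tier)
embedder, err := huggingface.New(ctx, huggingface.Config{
    APIKey: os.Getenv("HF_API_KEY"),
    Model:  "sentence-transformers/all-MiniLM-L6-v2",
})
if err != nil {
    log.Fatal(err)
}

// Vector store (Firestore free tier)
client, err := gcfirestore.NewClient(ctx, "project-id")
if err != nil {
    log.Fatal(err)
}
store, err := firestore.New(ctx, client)
if err != nil {
    log.Fatal(err)
}
collection := store.Collection("budget-data")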

Cost: Free up to limits, then pay-as-you-go
Performance: Good for low-moderate volume
Best for: Getting to market fast with minimal costs

High-Volume Stack

Best for: Large-scale applications, millions of queries/day

Embedding: HuggingFace TEI (GPU-accelerated)
Vectorstore: Qdrant cluster

Cost: ~$500-2000/month
Performance: Maximum throughput and minimal latency
Best for: Applications with millions of users

Cost Calculator

Scenario 1: Small Application

Volume: 10K documents, 100K queries/month

Stack                        | Embedding Cost | Storage Cost | Total/Month
---------------------------- | -------------- | ------------ | -----------
Dev (HF API + Memory)        | Free           | Free         | $0
Budget (HF API + Firestore)  | Free           | Free         | $0
Managed (OpenAI + Firestore) | $0.20          | $0.18        | $0.38
Self-Hosted (TEI + Qdrant)   | $50            | $50          | $100

Recommendation: Budget stack (HF API + Firestore)

Scenario 2: Medium Application

Volume: 100K documents, 1M queries/month

Stack                        | Embedding Cost | Storage Cost | Total/Month
---------------------------- | -------------- | ------------ | -----------
Budget (HF API + Firestore)  | $9             | $18          | $27
Managed (OpenAI + Firestore) | $2.00          | $18          | $20
Self-Hosted (TEI + Qdrant)   | $100           | $100         | $200

Recommendation: Managed stack (OpenAI + Firestore)

Scenario 3: Large Application

Volume: 1M documents, 10M queries/month

Stack                        | Embedding Cost | Storage Cost | Total/Month
---------------------------- | -------------- | ------------ | -----------
Managed (OpenAI + Firestore) | $20            | $180         | $200
Self-Hosted (TEI + Qdrant)   | $200           | $200         | $400

Recommendation: Self-hosted stack becomes cost-effective at this scale

Scenario 4: Enterprise Application

Volume: 10M documents, 100M queries/month

Stack                        | Embedding Cost | Storage Cost | Total/Month
---------------------------- | -------------- | ------------ | -----------
Managed (OpenAI + Firestore) | $200           | $1,800       | $2,000
Self-Hosted (TEI + Qdrant)   | $500           | $1,000       | $1,500

Recommendation: Self-hosted with dedicated infrastructure

Performance Benchmarks

Query Latency (p95)

Provider       | Single Vector | Batch (10) | Batch (100)
-------------- | ------------- | ---------- | -----------
Memory         | 0.1ms         | 0.5ms      | 3ms
Firestore      | 50ms          | 75ms       | 200ms
Qdrant (local) | 1ms           | 5ms        | 25ms
Qdrant (cloud) | 25ms          | 40ms       | 100ms

Notes:

  • Memory is fastest but not persistent
  • Firestore latency includes network round-trip
  • Qdrant local is nearly as fast as memory
  • All tested with 100K documents, 384-dimensional vectors

Embedding Generation

Provider     | Single Text | Batch (10) | Batch (100)
------------ | ----------- | ---------- | -----------
HF API       | 200ms       | 500ms      | 2000ms
HF TEI (CPU) | 50ms        | 200ms      | 800ms
HF TEI (GPU) | 10ms        | 30ms       | 100ms
OpenAI       | 100ms       | 300ms      | 1500ms

Notes:

  • HF TEI with GPU is fastest
  • Batch processing significantly improves throughput (see the worker-pool sketch after these notes)
  • OpenAI offers good balance of speed and quality
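
To exploit that on the client side, batch or parallelize your embedding calls. Only EmbedText appears in this guide, so the bounded worker pool below is our own sketch, and the []float32 return type is an assumption about Aixgo's signature.

import (
    "context"

    "golang.org/x/sync/errgroup"
)

// Embedder is a local interface over the one call documented in this guide.
type Embedder interface {
    EmbedText(ctx context.Context, text string) ([]float32, error)
}

// embedAll fans EmbedText out across a bounded worker pool, preserving order.
func embedAll(ctx context.Context, e Embedder, texts []string, workers int) ([][]float32, error) {
    out := make([][]float32, len(texts))
    g, ctx := errgroup.WithContext(ctx)
    g.SetLimit(workers)
    for i, t := range texts {
        i, t := i, t // capture loop variables (needed before Go 1.22)
        g.Go(func() error {
            vec, err := e.EmbedText(ctx, t)
            if err != nil {
                return err
            }
            out[i] = vec
            return nil
        })
    }
    return out, g.Wait()
}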

Throughput (queries per second)

Provider       | Single Thread | 10 Threads | 100 Threads
-------------- | ------------- | ---------- | -----------
Memory         | 10,000        | 50,000     | 100,000
Firestore      | 100           | 500        | 2,000
Qdrant (local) | 5,000         | 25,000     | 50,000
Qdrant (cloud) | 500           | 2,500      | 10,000

Notes:

  • Memory throughput limited only by CPU
  • Firestore throughput limited by API quotas
  • Qdrant scales well with parallelism

Migration Paths

Memory to Firestore

When: Moving from development to production

Process:

  1. Update configuration (no code changes needed)
  2. Re-index your documents
  3. Update monitoring/alerting

// Before (Memory)
store, _ := memory.New()

// After (Firestore) - same interface!
// (gcfirestore aliases cloud.google.com/go/firestore)
client, _ := gcfirestore.NewClient(ctx, "project-id")
store, _ := firestore.New(ctx, client)

Downtime: None (parallel indexing possible)
Difficulty: Easy
Time: 1-2 hours

HuggingFace API to TEI

When: Scaling beyond free tier, need better performance

Process:

  1. Deploy TEI container
  2. Update base URL in configuration
  3. Test embedding compatibility (see the similarity-check sketch below)
  4. Cutover

// Before (API)
embedder, _ := huggingface.New(ctx, huggingface.Config{
    APIKey: os.Getenv("HF_API_KEY"),
    Model:  "BAAI/bge-small-en-v1.5",
})

// After (TEI) - same interface!
embedder, _ := huggingface.New(ctx, huggingface.Config{
    BaseURL: "http://localhost:8080",
    Model:   "BAAI/bge-small-en-v1.5",
})
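
For step 3, embed a few probe texts with both providers and compare cosine similarity; with the same model the scores should be very close to 1.0. A sketch (the []float32 element type is an assumption about Aixgo's signature):

import (
    "log"
    "math"
)

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float32) float64 {
    var dot, na, nb float64
    for i := range a {
        dot += float64(a[i]) * float64(b[i])
        na += float64(a[i]) * float64(a[i])
        nb += float64(b[i]) * float64(b[i])
    }
    return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

for _, text := range []string{"hello world", "vector databases"} {
    v1, _ := apiEmbedder.EmbedText(ctx, text) // old: HF API
    v2, _ := teiEmbedder.EmbedText(ctx, text) // new: self-hosted TEI
    if cosine(v1, v2) < 0.999 {
        log.Printf("embeddings diverge for %q", text)
    }
}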

Downtime: None
Difficulty: Medium
Time: 2-4 hours

Firestore to Qdrant

When: Need better performance, higher volume, cost optimization

Process:

  1. Deploy Qdrant
  2. Export documents from Firestore
  3. Re-index in Qdrant (see the sketch below)
  4. Parallel run for validation
  5. Cutover

// Before (Firestore)
// (gcfirestore aliases cloud.google.com/go/firestore)
client, _ := gcfirestore.NewClient(ctx, "project-id")
store, _ := firestore.New(ctx, client)

// After (Qdrant) - same interface!
store, _ := qdrant.New(ctx, qdrant.Config{
    URL: "http://localhost:6333",
})
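
Steps 2-3 reduce to a read-from-old, write-to-new loop. The iteration and write methods below (List, Add) are hypothetical names, since this guide doesn't document the collection API; treat this as a sketch only.

// Hypothetical re-index sketch: List and Add are assumed method names.
src := oldStore.Collection("prod-data")
dst := newStore.Collection("prod-data")

docs, err := src.List(ctx) // assumed: returns all documents with vectors
if err != nil {
    log.Fatal(err)
}
for _, doc := range docs {
    if err := dst.Add(ctx, doc.ID, doc.Vector, doc.Metadata); err != nil {
        log.Fatal(err)
    }
}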

Downtime: None (with parallel indexing)
Difficulty: Medium-Hard
Time: 1-2 days

OpenAI to HuggingFace

When: Cost optimization, need offline capability

Challenges:

  • Different embedding dimensions (requires re-indexing)
  • Potential quality differences
  • Need to test thoroughly

Process:

  1. Choose comparable HF model
  2. Generate test embeddings
  3. Evaluate quality on your use case
  4. Re-index all documents
  5. Cutover

Downtime: Depends on index size
Difficulty: Hard
Time: 1-2 weeks (including testing)

Important: Different embedding models are NOT compatible. You must re-index all data.

Decision Framework

Start Here

Ask yourself these questions:

  1. What’s your deployment environment?

    • GCP → Consider Firestore
    • AWS/Azure → Consider self-hosted options
    • Multi-cloud → Memory (dev) or self-hosted (prod)
  2. What’s your budget?

    • $0/month → HuggingFace API + Memory/Firestore
    • $50-200/month → OpenAI + Firestore
    • $200+/month → Self-hosted TEI + Qdrant
  3. What’s your scale?

    • <100K docs → Managed solutions
    • 100K-1M docs → Either managed or self-hosted
    • >1M docs → Self-hosted recommended

  4. What’s your team’s expertise?

    • Small team, no ops → Managed solutions
    • DevOps capability → Self-hosted for better economics
  5. What’s your performance requirement?

    • <100ms p95 → Self-hosted required
    • <500ms p95 → Managed solutions work
    • >500ms p95 → Any solution works

Quick Decision Tree

Are you in production?
├─ No → HuggingFace API + Memory
└─ Yes
   └─ On GCP?
      ├─ Yes → OpenAI + Firestore
      └─ No
         └─ High volume (>1M queries/day)?
            ├─ Yes → HuggingFace TEI + Qdrant
            └─ No → OpenAI + Firestore

Next Steps

  1. Start with the Development Stack: Get familiar with the APIs
  2. Benchmark your use case: Real performance depends on your data
  3. Plan for migration: Design with future scaling in mind
  4. Monitor costs: Set up billing alerts early

Additional Resources