
Vespa

Vespa.ai operates Vespa Cloud, which companies use to run big data serving with AI, online. The team maintains the Vespa open-source project, continuously released and used by organizations with demanding performance, availability, and functional requirements.

Founded: 2023
Location: Trondheim
Employees: 57
Funding: OSS

Vespa — Open-Source Search and Vector Engine for Large-Scale AI

Overview

**Vespa** is an open-source search and vector engine purpose-built for production AI workloads. It stores and serves vectors, text, and structured data in one system, then ranks results using on-node machine learning for ultra‑low latency at scale. Use it for **hybrid retrieval**, **RAG**, **recommendation**, and **personalization**—self-managed or on the managed **Vespa Cloud**.

  • Open source core: [Vespa](https://vespa.ai/)
  • Managed service: [Vespa Cloud](https://cloud.vespa.ai/)
  • Originating from Yahoo’s search stack and open-sourced in 2017, Vespa is maintained by Vespa.ai (HQ Trondheim, Norway), a compact team of senior engineers.

Why Vespa

Vespa’s pitch: a search platform with native vector support beats a pure vector database for production AI. You get ANN search with filters, text retrieval, aggregations, custom ranking features, and fresh updates in one system—plus **on-node inference** to cut glue code and latency. Notably, **Perplexity** brought its search in-house with Vespa to scale more effectively.

  • Vespa’s published benchmark vs Elasticsearch reports up to 12.9x faster vector search at lower cost
  • Designed for billions of items, real-time updates, and strict latency budgets

Key Capabilities

  • **Hybrid retrieval (sparse + dense)** with ANN, BM25/text, metadata filtering, grouping, and aggregations
  • **HNSW ANN with multi‑vector indexing** for improved recall/precision and passage-level retrieval
  • **On-node ML inference** in ranking: serve ONNX and TensorFlow models; supports LTR with XGBoost/LightGBM
  • **Flexible querying with YQL**: filters, re‑ranking, grouping, aggregations, custom rank profiles
  • **Real-time freshness**: stream or batch feed, immediate availability, schema with tensors and structured fields
  • **Scale and reliability**: distributed, stateful serving and storage with horizontal scaling
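As a sketch of how these capabilities combine in practice, the snippet below builds the kind of request body sent to Vespa's `/search/` HTTP endpoint: a YQL query mixing a `nearestNeighbor` ANN clause with text matching (`userQuery()`) and a metadata filter. The field names (`embedding`, `category`), rank-profile name (`hybrid`), and parameter values are hypothetical, not from a real schema:

```python
# Sketch of a hybrid Vespa query body (POSTed as JSON to the /search/ endpoint).
# Field names, the "hybrid" rank profile, and targetHits are illustrative only.

def hybrid_query(user_text: str, query_vector: list[float], category: str) -> dict:
    """Combine an ANN clause, text matching, and a metadata filter in one YQL query."""
    yql = (
        "select * from sources * where "
        "(({targetHits: 100}nearestNeighbor(embedding, q_vec)) "  # dense ANN leg
        "or userQuery()) "                                        # sparse/text leg
        f"and category contains '{category}'"                     # metadata filter
    )
    return {
        "yql": yql,
        "query": user_text,                  # consumed by userQuery()
        "input.query(q_vec)": query_vector,  # query tensor for nearestNeighbor
        "ranking.profile": "hybrid",         # hypothetical custom rank profile
        "hits": 10,
    }

body = hybrid_query("running shoes", [0.1, 0.2, 0.3], "footwear")
```

Because the ANN clause, text clause, and filter live in one query, there is no client-side merging of results from separate systems.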

Primary Use Cases

  • **Enterprise and site search** with hybrid retrieval and re‑ranking
  • **RAG over large knowledge bases** with filters and aggregations
  • **Recommendations and personalization** with on‑node models
  • **E‑commerce search** with query understanding and freshness
  • **Document/PDF retrieval** with multi‑vector or ColBERT‑style patterns

Who It’s For

  • Teams shipping RAG, search, or recommendation with strict latency and freshness requirements
  • Companies operating at billion‑document scale with heavy metadata filtering
  • Engineering groups that want **ranking and inference on the data nodes** (not in app code)
  • Orgs that prefer open source with a managed path via Vespa Cloud

Integrations and Ecosystem

  • SDKs and APIs: REST/HTTP, Java, and **PyVespa**
  • RAG frameworks: **LangChain** retriever/vector store and **LlamaIndex** vector store
  • Model formats: **ONNX**, **TensorFlow**, plus XGBoost/LightGBM for LTR
  • Sample apps and notebooks: [Vespa sample apps](https://github.com/vespa-engine/sample-apps)
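To ground the "real-time freshness" point from the API side, here is a sketch of the payload shape for feeding one document through Vespa's `/document/v1` REST API (the namespace `mynamespace`, document type `doc`, and all field names are hypothetical; a real schema defines the actual fields):

```python
import json

# Sketch of a real-time feed operation against Vespa's /document/v1 API.
# A PUT to this path with this JSON body makes the document searchable
# near-immediately. Namespace, doctype, and fields are invented examples.

doc_id = "doc-42"
path = f"/document/v1/mynamespace/doc/docid/{doc_id}"  # PUT target

payload = {
    "fields": {
        "title": "Hybrid search with Vespa",
        "body": "Combine dense and sparse retrieval in one query.",
        "category": "search",
        # Dense embedding stored in a tensor field for ANN retrieval.
        "embedding": [0.12, 0.05, 0.33],
    }
}

request_body = json.dumps(payload)
```

PyVespa wraps this same feed-and-query surface in Python, which is what the LangChain and LlamaIndex integrations build on.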

Deployment, Pricing, and Free Trial

  • Self‑managed open source: [Vespa](https://vespa.ai/)
  • Managed service with usage‑based pricing: [Vespa Cloud](https://cloud.vespa.ai/)
  • Price calculator: [Cloud pricing](https://cloud.vespa.ai/price-calculator)
  • Free trial (14 days, promotions like $300 credits appear periodically): [Free trial](https://vespa.ai/free-trial/), [Product terms](https://vespa.ai/product-terms/), [Cloud FAQ](https://cloud.vespa.ai/en/faq), [AWS Marketplace listing](https://aws.amazon.com/marketplace/pp/prodview-5pkxkencasnoo)

Proof Points and References

  • Product overview: [Homepage](https://vespa.ai/), [Features](https://vespa.ai/features/), [Use cases](https://vespa.ai/use-cases/)
  • Technical deep dives: [Hybrid search deep dive](https://blog.vespa.ai/redefining-hybrid-search-possibilities-with-vespa/), [Multi‑vector HNSW](https://blog.vespa.ai/semantic-search-with-multi-vector-indexing/), [ANN guide](https://docs.vespa.ai/en/nearest-neighbor-search-guide.html)
  • Getting started: [Quickstart](https://docs.vespa.ai/en/getting-started.html), [Hybrid search tutorial](https://docs.vespa.ai/en/tutorials/hybrid-search.html)
  • Notable adoption: [Perplexity partnership](https://vespa.ai/perplexity-partners-with-vespa-ai-to-bring-its-search-function-in-house/)
  • Comparative benchmark: [Elasticsearch alternative](https://vespa.ai/elasticsearch-alternative/)

User Sentiment Snapshot

Pros

  • Production‑grade hybrid search with filters, text, and vectors in one system
  • Strong performance and multi‑vector support
  • Effective for billion‑scale with robust metadata filtering
  • Complete vector search engine for production use
  • Hybrid search and re‑ranking work well in practice
  • Helpful free credits to get started

Cons

  • Smaller community and fewer third‑party tutorials than some vector DBs
  • Heavier setup/ops than simple vector stores at small scale
  • Deep documentation but a meaningful learning curve
  • Desire for more out‑of‑the‑box integrations/examples

Technical Highlights

  • ANN: **HNSW** with per‑document and multi‑vector support
  • Data model: tensors, sparse text, and structured fields in one schema
  • Query: **YQL**, filters, grouping, aggregations, and custom rank profiles
  • Inference: **ONNX**/**TensorFlow** served on data nodes for low‑latency ranking
  • Hybrid: combine dense/sparse signals; re‑rank with learned models
  • Freshness: streaming/batch feeds with near‑immediate availability
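The "combine dense/sparse signals" point can be illustrated with a toy first-phase score: a weighted blend of a BM25 text score and vector closeness, the kind of expression a Vespa rank profile evaluates on the content nodes. The weights, inputs, and closeness mapping below are invented for the example and are not Vespa's exact rank features:

```python
import math

# Toy illustration of a hybrid first-phase ranking expression. Vespa rank
# profiles evaluate expressions like this on the data nodes; the specific
# blend weight and score values here are made up.

def closeness(query_vec: list[float], doc_vec: list[float]) -> float:
    """Cosine similarity mapped into [0, 1] (a stand-in for a closeness feature)."""
    dot = sum(q * d for q, d in zip(query_vec, doc_vec))
    norm = math.sqrt(sum(q * q for q in query_vec)) * math.sqrt(sum(d * d for d in doc_vec))
    cosine = dot / norm
    return (cosine + 1) / 2  # map [-1, 1] onto [0, 1]

def hybrid_score(bm25: float, close: float, alpha: float = 0.7) -> float:
    """Weighted blend of the sparse (BM25) and dense (closeness) signals."""
    return alpha * close + (1 - alpha) * bm25

# Example: a document with BM25 2.4 and a fairly close embedding.
score = hybrid_score(bm25=2.4, close=closeness([1.0, 0.0], [0.6, 0.8]))
```

Running this blend where the data lives, rather than in application code, is what removes the extra network hop and glue code the document describes.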

Getting Started

  • Start building: [Getting started](https://docs.vespa.ai/en/getting-started.html)
  • Explore hybrid and RAG patterns: [Hybrid search tutorial](https://docs.vespa.ai/en/tutorials/hybrid-search.html), [RAG blueprint](https://docs.vespa.ai/en/tutorials/rag-blueprint.html)
  • Try managed: [Vespa Cloud free trial](https://vespa.ai/free-trial/)

Company Snapshot

  • Size: ~57 employees
  • Location: Trondheim, Norway
  • Profile: [LinkedIn](https://www.linkedin.com/company/vespa-ai)
---

In short: Vespa is a production‑ready, open‑source search and vector engine that unifies vectors, text, and structured data with on‑node ML for fast, scalable hybrid search, RAG, and recommendations—self‑managed or fully managed on Vespa Cloud.
