
Vespa

Vespa.ai operates Vespa Cloud, which companies use to run big data serving with AI, online. The team maintains the Vespa open-source project, continuously released and used by organizations with demanding performance, availability, and functional requirements.

Founded: 2023
Location: Trondheim
Employees: 57
Funding: OSS

Vespa — Open-Source Search and Vector Engine for Large-Scale AI

Overview

**Vespa** is an open-source search and vector engine purpose-built for production AI workloads. It stores and serves vectors, text, and structured data in one system, then ranks results using on-node machine learning for ultra‑low latency at scale. Use it for **hybrid retrieval**, **RAG**, **recommendation**, and **personalization**—self-managed or on the managed **Vespa Cloud**.

  • Open source core: [Vespa](https://vespa.ai/)
  • Managed service: [Vespa Cloud](https://cloud.vespa.ai/)
  • Originating from Yahoo’s search stack and open-sourced in 2017, Vespa is maintained by Vespa.ai (HQ Trondheim, Norway), a compact team of senior engineers.

Why Vespa

Vespa’s pitch: a search platform with native vector support beats a pure vector database for production AI. You get ANN search with filters, text retrieval, aggregations, custom ranking features, and fresh updates in one system—plus **on-node inference** to cut glue code and latency. Notably, **Perplexity** brought its search in-house with Vespa to scale more effectively.

  • Vespa’s published benchmark vs Elasticsearch reports up to 12.9x faster vector search at lower cost
  • Designed for billions of items, real-time updates, and strict latency budgets

Key Capabilities

  • **Hybrid retrieval (sparse + dense)** with ANN, BM25/text, metadata filtering, grouping, and aggregations
  • **HNSW ANN with multi‑vector indexing** for improved recall/precision and passage-level retrieval
  • **On-node ML inference** in ranking: serve ONNX and TensorFlow models; supports LTR with XGBoost/LightGBM
  • **Flexible querying with YQL**: filters, re‑ranking, grouping, aggregations, custom rank profiles
  • **Real-time freshness**: stream or batch feed, immediate availability, schema with tensors and structured fields
  • **Scale and reliability**: distributed, stateful serving and storage with horizontal scaling
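As a sketch of how these capabilities combine in practice, the snippet below builds the kind of request body sent to Vespa's `/search/` HTTP endpoint: a YQL query mixing a `nearestNeighbor` ANN clause with text matching (`userQuery()`) and a metadata filter. The field names (`embedding`, `category`), rank-profile name (`hybrid`), and parameter values are hypothetical, not from a real schema:

```python
# Sketch of a hybrid Vespa query body (POSTed as JSON to the /search/ endpoint).
# Field names, the "hybrid" rank profile, and targetHits are illustrative only.

def hybrid_query(user_text: str, query_vector: list[float], category: str) -> dict:
    """Combine an ANN clause, text matching, and a metadata filter in one YQL query."""
    yql = (
        "select * from sources * where "
        "(({targetHits: 100}nearestNeighbor(embedding, q_vec)) "  # dense ANN leg
        "or userQuery()) "                                        # sparse/text leg
        f"and category contains '{category}'"                     # metadata filter
    )
    return {
        "yql": yql,
        "query": user_text,                  # consumed by userQuery()
        "input.query(q_vec)": query_vector,  # query tensor for nearestNeighbor
        "ranking.profile": "hybrid",         # hypothetical custom rank profile
        "hits": 10,
    }

body = hybrid_query("running shoes", [0.1, 0.2, 0.3], "footwear")
```

Because the ANN clause, text clause, and filter live in one query, there is no client-side merging of results from separate systems.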

Primary Use Cases

  • **Enterprise and site search** with hybrid retrieval and re‑ranking
  • **RAG over large knowledge bases** with filters and aggregations
  • **Recommendations and personalization** with on‑node models
  • **E‑commerce search** with query understanding and freshness
  • **Document/PDF retrieval** with multi‑vector or ColBERT‑style patterns

Who It’s For

  • Teams shipping RAG, search, or recommendation with strict latency and freshness requirements
  • Companies operating at billion‑document scale with heavy metadata filtering
  • Engineering groups that want **ranking and inference on the data nodes** (not in app code)
  • Orgs that prefer open source with a managed path via Vespa Cloud

Integrations and Ecosystem

  • SDKs and APIs: REST/HTTP, Java, and **PyVespa**
  • RAG frameworks: **LangChain** retriever/vector store and **LlamaIndex** vector store
  • Model formats: **ONNX**, **TensorFlow**, plus XGBoost/LightGBM for LTR
  • Sample apps and notebooks: [Vespa sample apps](https://github.com/vespa-engine/sample-apps)
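To ground the "real-time freshness" point from the API side, here is a sketch of the payload shape for feeding one document through Vespa's `/document/v1` REST API (the namespace `mynamespace`, document type `doc`, and all field names are hypothetical; a real schema defines the actual fields):

```python
import json

# Sketch of a real-time feed operation against Vespa's /document/v1 API.
# A PUT to this path with this JSON body makes the document searchable
# near-immediately. Namespace, doctype, and fields are invented examples.

doc_id = "doc-42"
path = f"/document/v1/mynamespace/doc/docid/{doc_id}"  # PUT target

payload = {
    "fields": {
        "title": "Hybrid search with Vespa",
        "body": "Combine dense and sparse retrieval in one query.",
        "category": "search",
        # Dense embedding stored in a tensor field for ANN retrieval.
        "embedding": [0.12, 0.05, 0.33],
    }
}

request_body = json.dumps(payload)
```

PyVespa wraps this same feed-and-query surface in Python, which is what the LangChain and LlamaIndex integrations build on.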

Deployment, Pricing, and Free Trial

  • Self‑managed open source: [Vespa](https://vespa.ai/)
  • Managed service with usage‑based pricing: [Vespa Cloud](https://cloud.vespa.ai/)
  • Price calculator: [Cloud pricing](https://cloud.vespa.ai/price-calculator)
  • Free trial (14 days, promotions like $300 credits appear periodically): [Free trial](https://vespa.ai/free-trial/), [Product terms](https://vespa.ai/product-terms/), [Cloud FAQ](https://cloud.vespa.ai/en/faq), [AWS Marketplace listing](https://aws.amazon.com/marketplace/pp/prodview-5pkxkencasnoo)

Proof Points and References

  • Product overview: [Homepage](https://vespa.ai/), [Features](https://vespa.ai/features/), [Use cases](https://vespa.ai/use-cases/)
  • Technical deep dives: [Hybrid search deep dive](https://blog.vespa.ai/redefining-hybrid-search-possibilities-with-vespa/), [Multi‑vector HNSW](https://blog.vespa.ai/semantic-search-with-multi-vector-indexing/), [ANN guide](https://docs.vespa.ai/en/nearest-neighbor-search-guide.html)
  • Getting started: [Quickstart](https://docs.vespa.ai/en/getting-started.html), [Hybrid search tutorial](https://docs.vespa.ai/en/tutorials/hybrid-search.html)
  • Notable adoption: [Perplexity partnership](https://vespa.ai/perplexity-partners-with-vespa-ai-to-bring-its-search-function-in-house/)
  • Comparative benchmark: [Elasticsearch alternative](https://vespa.ai/elasticsearch-alternative/)

User Sentiment Snapshot

Pros

  • Production‑grade hybrid search with filters, text, and vectors in one system
  • Strong performance and multi‑vector support
  • Effective for billion‑scale with robust metadata filtering
  • Complete vector search engine for production use
  • Hybrid search and re‑ranking work well in practice
  • Helpful free credits to get started

Cons

  • Smaller community and fewer third‑party tutorials than some vector DBs
  • Heavier setup/ops than simple vector stores at small scale
  • Deep documentation but a meaningful learning curve
  • Desire for more out‑of‑the‑box integrations/examples

Technical Highlights

  • ANN: **HNSW** with per‑document and multi‑vector support
  • Data model: tensors, sparse text, and structured fields in one schema
  • Query: **YQL**, filters, grouping, aggregations, and custom rank profiles
  • Inference: **ONNX**/**TensorFlow** served on data nodes for low‑latency ranking
  • Hybrid: combine dense/sparse signals; re‑rank with learned models
  • Freshness: streaming/batch feeds with near‑immediate availability
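The "combine dense/sparse signals" point can be illustrated with a toy first-phase score: a weighted blend of a BM25 text score and vector closeness, the kind of expression a Vespa rank profile evaluates on the content nodes. The weights, inputs, and closeness mapping below are invented for the example and are not Vespa's exact rank features:

```python
import math

# Toy illustration of a hybrid first-phase ranking expression. Vespa rank
# profiles evaluate expressions like this on the data nodes; the specific
# blend weight and score values here are made up.

def closeness(query_vec: list[float], doc_vec: list[float]) -> float:
    """Cosine similarity mapped into [0, 1] (a stand-in for a closeness feature)."""
    dot = sum(q * d for q, d in zip(query_vec, doc_vec))
    norm = math.sqrt(sum(q * q for q in query_vec)) * math.sqrt(sum(d * d for d in doc_vec))
    cosine = dot / norm
    return (cosine + 1) / 2  # map [-1, 1] onto [0, 1]

def hybrid_score(bm25: float, close: float, alpha: float = 0.7) -> float:
    """Weighted blend of the sparse (BM25) and dense (closeness) signals."""
    return alpha * close + (1 - alpha) * bm25

# Example: a document with BM25 2.4 and a fairly close embedding.
score = hybrid_score(bm25=2.4, close=closeness([1.0, 0.0], [0.6, 0.8]))
```

Running this blend where the data lives, rather than in application code, is what removes the extra network hop and glue code the document describes.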

Getting Started

  • Start building: [Getting started](https://docs.vespa.ai/en/getting-started.html)
  • Explore hybrid and RAG patterns: [Hybrid search tutorial](https://docs.vespa.ai/en/tutorials/hybrid-search.html), [RAG blueprint](https://docs.vespa.ai/en/tutorials/rag-blueprint.html)
  • Try managed: [Vespa Cloud free trial](https://vespa.ai/free-trial/)

Company Snapshot

  • Size: ~57 employees
  • Location: Trondheim, Norway
  • Profile: [LinkedIn](https://www.linkedin.com/company/vespa-ai)
---

In short: Vespa is a production‑ready, open‑source search and vector engine that unifies vectors, text, and structured data with on‑node ML for fast, scalable hybrid search, RAG, and recommendations—self‑managed or fully managed on Vespa Cloud.
