Your AI strategy is only as good as your vector index

Andrew Fletcher • Published: 29 September 2025 • Updated: 7 October 2025 • 3 minutes read

Organisations are rushing to build internal AI tools that let teams query their own data. Yet many projects stall when the system returns empty or unreliable results. One of the most common, and least understood, culprits is the vector index.

If your engineering logs show warnings such as “No Atlas vector index named ‘vec_idx’. Skipping. Vector search returned no candidates”, your platform is likely missing a core building block. The issue isn’t your data or your AI model; it’s how you’ve set up your database to support semantic search.

Why vector indexes matter

Traditional database indexes, often called B-tree or regular indexes, are optimised for exact matches (e.g., finding all records where “ProjectNumber = 2025-001”). AI systems, however, rely on vector search: a way of comparing dense numerical representations (“embeddings”) of text to find semantically similar content.

MongoDB Atlas, one of the most widely used cloud databases, supports this through Atlas Search. But it’s not automatic. A vector search requires a Search/Vector index: a specialised configuration that knows how to handle high-dimensional data (such as 768-dimensional Nomic embeddings or 1536-dimensional OpenAI embeddings) and calculate cosine similarity.

Without the right index, the system can’t search meaningfully, so it falls back to returning nothing.
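To make “semantically similar” concrete, here is a minimal sketch of the cosine similarity calculation that a vector index performs at scale. The three-dimensional vectors and the names `query` and `docs` are toy assumptions for illustration; real embeddings have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy "embeddings" (hypothetical values, not produced by a real model).
query = [1.0, 0.0, 1.0]
docs = {
    "similar": [2.0, 0.0, 2.0],    # points in the same direction as the query
    "unrelated": [0.0, 1.0, 0.0],  # orthogonal to the query
}

# Rank documents from most to least similar to the query.
ranked = sorted(
    docs,
    key=lambda name: cosine_similarity(query, docs[name]),
    reverse=True,
)
```

A vector index exists so the database does not have to run this comparison against every stored document for each query.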

A pragmatic fix for teams

If your logs show vector search returning no candidates, the fastest way to recover is to:

1. Create an Atlas Search index

In the Atlas UI, open your cluster and go to the Atlas Search tab.
Create a new Search Index for your database and collection (for example, `projects`).
Give it a clear name; most teams use something like `vec_idx`.
Set the mapping to match your embeddings:

    {
      "mappings": {
        "dynamic": false,
        "fields": {
          "nomic_embed_unit": {
            "type": "knnVector",
            "dimensions": 768,
            "similarity": "cosine"
          }
        }
      }
    }

Ensure the `dimensions` field matches the size of your model’s embeddings (768 is common).
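One way to catch a dimension mismatch before it reaches the database is a small pre-flight check. The sketch below mirrors the mapping above as a Python dictionary; the helper name `check_embedding_dimensions` is hypothetical and not part of any MongoDB library.

```python
# The index definition from above, expressed as a Python dict.
INDEX_DEFINITION = {
    "mappings": {
        "dynamic": False,
        "fields": {
            "nomic_embed_unit": {
                "type": "knnVector",
                "dimensions": 768,
                "similarity": "cosine",
            }
        },
    }
}


def check_embedding_dimensions(definition, field, embedding):
    """Raise ValueError if the embedding length differs from the indexed size.

    Hypothetical helper for illustration; returns the expected size on success.
    """
    expected = definition["mappings"]["fields"][field]["dimensions"]
    if len(embedding) != expected:
        raise ValueError(
            f"{field}: index expects {expected} dimensions, got {len(embedding)}"
        )
    return expected
```

Running a check like this once at start-up, against a single embedding from the model you actually deploy, is usually enough to surface a mismatch.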

2. Adjust your configuration if an index already exists
Many teams discover an index was created but given a different name (often “default”). If so, simply update your configuration file so the code points to the correct index:

    mongo_vector_index_name = "default"

This change avoids downtime and rebuilds trust in your search results.
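A name mismatch like this is easier to spot when the application says which indexes it can see. The following is a minimal sketch, assuming your code already has the list of index names for the collection; the function name `diagnose_index_name` is hypothetical.

```python
def diagnose_index_name(configured, available):
    """Return None when the configured index exists, else a hint for the logs.

    `configured` is the name from your config (e.g. mongo_vector_index_name);
    `available` is a list of index names found on the collection.
    """
    if configured in available:
        return None
    if not available:
        return (
            f"No search index named '{configured}'; "
            "the collection has no search indexes."
        )
    names = ", ".join(sorted(available))
    return (
        f"No search index named '{configured}'. Available: {names}. "
        "Update mongo_vector_index_name or create the index."
    )
```

For example, `diagnose_index_name("vec_idx", ["default"])` produces a message that names `default`, which points straight at the configuration change described above.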

To support basic keyword search alongside the vector index, create an Atlas Search index named `text_idx` per collection with at least:

    {
      "mappings": {
        "dynamic": false,
        "fields": {
          "Title":      {"type": "string"},
          "Tags":       {"type": "string"},
          "Category":   {"type": "string"},
          "chunk_text": {"type": "string"}
        }
      }
    }

3. Test and validate early
Use the Atlas Aggregations UI or `mongosh` to run a quick `$vectorSearch` query and confirm results are returning with a `score`. Add simple logging to your code to list available search indexes when something is misconfigured. This turns silent failures into actionable alerts.
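As a rough sketch of that check in application code, the functions below build a `$vectorSearch` pipeline and log the available search indexes when the expected one is missing. It assumes `collection` is a PyMongo `Collection` from a recent PyMongo release that provides `list_search_indexes()`; the function names, the logger name, and the `numCandidates` multiplier are illustrative choices, not recommendations from MongoDB.

```python
import logging

logger = logging.getLogger("vector_search_check")


def build_vector_search_pipeline(index_name, path, query_vector, limit=5):
    """Build an aggregation pipeline that returns document ids with a score."""
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": path,
                "queryVector": query_vector,
                # Assumption: consider 20x more candidates than results returned.
                "numCandidates": limit * 20,
                "limit": limit,
            }
        },
        {"$project": {"_id": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]


def run_vector_check(collection, index_name, path, query_vector, limit=5):
    """Run a quick vector query; log loudly instead of failing silently."""
    available = [ix["name"] for ix in collection.list_search_indexes()]
    if index_name not in available:
        logger.warning(
            "Search index %r not found; available indexes: %s",
            index_name,
            available,
        )
        return []
    pipeline = build_vector_search_pipeline(index_name, path, query_vector, limit)
    results = list(collection.aggregate(pipeline))
    if not results:
        logger.warning("Vector search on %r returned no candidates", index_name)
    return results
```

Calling `run_vector_check(collection, "vec_idx", "nomic_embed_unit", some_embedding)` with one real embedding should return a short list of documents, each with a `score`.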

The leadership takeaway

For executives sponsoring AI initiatives, this is more than a technical detail: it’s a trust issue. When internal AI tools return empty or irrelevant answers, users disengage and question the investment. The fix is straightforward but easily missed if teams assume standard indexes are enough.

Leaders should ensure their data platform teams:

Understand the distinction between traditional indexes and vector indexes.
Monitor AI pipelines for silent failures (e.g., empty `doc_lines` or “no candidates” logs).
Build resilience: automatic fallbacks, clearer diagnostics, and basic search when the vector path fails.
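For teams wondering what such a fallback might look like in code, here is a minimal sketch. It assumes two callables supplied by your application, `vector_search` and `text_search`, which each take a query and return a list of hits; both names, and the `retrieve` function itself, are hypothetical.

```python
import logging

logger = logging.getLogger("retrieval")


def retrieve(query, vector_search, text_search):
    """Try the vector path first; use basic text search if it fails or is empty."""
    try:
        hits = vector_search(query)
        reason = "returned no candidates"
    except Exception as exc:  # broad on purpose: any failure triggers the fallback
        hits = []
        reason = f"failed: {exc}"
    if hits:
        return {"source": "vector", "hits": hits}
    # Log the reason so the fallback is visible rather than silent.
    logger.warning("Vector search %s; falling back to text search", reason)
    return {"source": "text", "hits": text_search(query)}
```

Returning the `source` alongside the hits lets dashboards track how often the fallback is used, which is itself a useful signal that the vector path needs attention.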

From empty results to meaningful insights

Many AI projects stall not because the model is weak or the data is poor, but because the infrastructure isn’t configured for semantic retrieval. By prioritising the right index and a clear fallback strategy, organisations can quickly restore user trust and unlock the insights their AI investments promise.

