Vector Search

Vector search enables semantic similarity searches across diverse data types such as documents, images, audio, and video. By leveraging your MySQL expertise, you can build scalable AI applications with advanced search functionality.

Get started

To get started with TiDB vector search, refer to the following tutorials:

Auto Embedding

The Auto Embedding feature lets you perform vector searches directly with plain text, without providing your own vectors. With this feature, you can insert text data directly and perform semantic searches using text queries, while TiDB automatically converts the text into vectors behind the scenes.

Currently, TiDB supports various embedding models, such as Amazon Titan, Cohere, Jina AI, OpenAI, Gemini, Hugging Face, and NVIDIA NIM. You can choose the one that best fits your needs. For more information, see Auto Embedding Overview.

Integrations

To accelerate your development, you can integrate TiDB vector search with popular AI frameworks (such as LlamaIndex and LangChain), embedding services (such as Jina AI), and ORM libraries (such as SQLAlchemy, Peewee, and Django ORM). You can choose the one that best fits your needs.

For more information, see Vector Search Integration Overview.

Text search

Unlike vector search, which focuses on semantic similarity, full-text search lets you retrieve documents for exact keywords.

To improve the retrieval quality in RAG scenarios, you can combine vector search with full-text search.

Scenario	Documentation
Perform keyword-based search using SQL.	Full-Text Search with SQL
Implement full-text search in Python applications.	Full-Text Search with Python
Combine vector and full-text search for better results.	Hybrid Search

Improve performance

To optimize the performance of your vector search queries, you can follow a series of best practices, such as adding vector indexes, monitoring index build progress, reducing dimensions, excluding vector columns, and warming up indexes.

For more information about these best practices, see Improve Vector Search Performance.

Limitations

Before implementing vector search, be aware of the following limitations:

Maximum 16383 dimensions per vector
Vector columns cannot be primary keys, unique indexes, or partition keys
No direct casting between vector and other data types (use string as intermediate)

For a complete list, see Vector Search Limitations.