So you want to create AI-powered applications with Vector Databases.

Yes, apps with Gen AI. Even better: with local open-source LLMs and your own custom data (local databases).

You have already had a look at projects like PrivateGPT, which use an embedding model and a conversational model, and now you wonder how to manage those VectorDBs with that local knowledge.

Keep reading if you want to be one of the first to use LLMs with your private knowledge base.

VectorDBs and LLMs

Vector databases store and manage data in the form of vectors. Each vector represents a data point in a multidimensional space.


Basically, data like text, images, or audio is converted into numerical vectors using models (like neural networks). These vectors are called embeddings, and they capture the essence or features of the data.
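To make the idea of "text in, vector out" concrete, here is a minimal sketch. It is not how a real embedding model works: the `VOCAB` list and the `toy_embed` function are hypothetical names made up for this illustration, and simple word counting stands in for the neural network. The only point is to show that each piece of text ends up as a fixed-length list of numbers.

```python
from collections import Counter

# Hypothetical fixed vocabulary: one vector dimension per word.
# A real embedding model learns its features instead of using a word list.
VOCAB = ["cat", "dog", "car", "road"]


def toy_embed(text: str) -> list[float]:
    """Map text to a vector by counting vocabulary words (illustration only)."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in VOCAB]


vec = toy_embed("The cat chased the dog and the other dog")
print(vec)  # [1.0, 2.0, 0.0, 0.0]
```

Whatever the input length, the output always has `len(VOCAB)` dimensions, which is what lets a database compare any two items with the same math.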

VectorDBs excel in searching for similar items.

For example, given an image embedding, a vector database can quickly find the most similar images in its storage.

The same applies to text, where we can retrieve semantically similar results.
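A similarity search can be sketched in a few lines of plain Python. This is a brute-force version under stated assumptions: the three-dimensional vectors and the item names in `STORE` are invented for the example, and cosine similarity is just one common choice of metric. Real vector databases typically use approximate indexes rather than comparing against every stored vector.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query: list[float], store: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the names of the k stored vectors closest to the query."""
    ranked = sorted(
        store.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Made-up embeddings: similar items point in similar directions.
STORE = {
    "kitten photo": [0.9, 0.1, 0.0],
    "puppy photo": [0.7, 0.3, 0.1],
    "truck photo": [0.0, 0.2, 0.9],
}

print(most_similar([1.0, 0.0, 0.0], STORE))  # ['kitten photo', 'puppy photo']
```

The query never has to match a stored item exactly; it only has to land near it in the vector space.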

Why VectorDBs?

  • Handling Complex Data: Ideal for applications dealing with non-traditional data types like images, audio, and natural language. 🖼️🎵📝
  • Scalability: They can efficiently handle large-scale datasets, crucial for machine learning and big data applications. 🚀📊
  • Speed and Accuracy: Provide fast and accurate results for similarity searches, crucial for recommendation systems, image retrieval, etc. ⚡✔️
  • AI and Machine Learning Projects: Useful for students working on AI projects, as they often involve dealing with embeddings. 🤖📚

How to use VectorDBs?

We can SelfHost many F/OSS vector databases with Docker, but the point here is: how do we properly manage the content of such DBs?

We are lucky enough to have VectorAdmin (also a F/OSS project), which allows us to manage VectorDBs with a UI.

Consider VectorAdmin our frontend for VectorDBs: embed your knowledge once and manage it with a UI.

The good news is that you can get started pretty quickly with VectorAdmin, the frontend for vector databases.

The Vector Admin Project

SelfHosting VectorAdmin with Docker

To make sure that it works for any of you, I prepared this SelfHosting setup of VectorAdmin with Docker.

Pre-Requisites - Get Docker 🐋

An important step, and highly recommended for any SelfHosting project: get Docker installed.

It takes a single command if you are on Linux:

apt-get update && sudo apt-get upgrade && curl -fsSL -o
sh && docker version

The steps that we need are:

git clone ./vector-admin
cd vector-admin
cd docker
cp .env.example .env  # and adjust it

Once you have adjusted the .env, let's build our VectorAdmin Docker image:

sudo docker-compose up -d --build vector-admin

Now it's time to relax and enjoy your GUI for vector DBs like Qdrant, ChromaDB, or Pinecone.


F/OSS Vector DBs for AI Projects?


ChromaDB is a vector database tailored for efficient storage and retrieval of high-dimensional data.

It presents itself as the AI-native open-source embedding database. You will see it everywhere from now on. And yes, you can SelfHost ChromaDB.

Why ChromaDB as VectorDB?
  • Key Features:
    • Optimized for Similarity Search: Specializes in nearest neighbor search, crucial for tasks like image or voice recognition.
    • High Scalability: Can handle large datasets, which is essential for machine learning and AI-based applications.
  • Use Cases: Suited for applications that need efficient similarity search in large vector datasets, such as facial recognition systems, audio fingerprinting, etc.


Weaviate is an open-source smart vector search engine that allows for storage and retrieval of high-dimensional vector data.

Why Weaviate as VectorDB?
  • Key Features:
    • Semantic Search: Integrates machine learning models to enable semantic search capabilities.
    • GraphQL API: Offers a GraphQL interface for querying, making it accessible and easy to integrate into various applications.
    • Scalable Architecture: Designed to scale horizontally, facilitating the management of large datasets.
  • Use Cases: Particularly useful for developers building applications that require semantic understanding and context-aware searching, like advanced search engines, recommendation systems, etc.
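To give a feel for the GraphQL interface, here is a small sketch that only builds the request body, without contacting any server. The class name `Article` and its `title` and `summary` fields are hypothetical, and `build_near_text_query` is a helper invented for this post. The `nearText` operator assumes a Weaviate instance with a text vectorizer module enabled; to the best of my knowledge the payload would be POSTed to the `/v1/graphql` endpoint, but check this against the version you run.

```python
import json


def build_near_text_query(class_name: str, concept: str, fields: list[str], limit: int = 3) -> dict:
    """Build a GraphQL request body for a semantic (nearText) search."""
    field_list = " ".join(fields)
    query = "{ Get { %s(nearText: {concepts: [%s]}, limit: %d) { %s } } }" % (
        class_name,
        json.dumps(concept),  # quotes and escapes the search phrase
        limit,
        field_list,
    )
    return {"query": query}


# "Article", "title" and "summary" are made-up schema names for illustration.
payload = build_near_text_query(
    "Article", "self-hosted vector databases", ["title", "summary"]
)
print(json.dumps(payload, indent=2))
```

Note that the query contains a phrase in natural language and no vector at all: in this setup the vectorizer module is what turns the phrase into an embedding before the search runs.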
Check more F/OSS VectorDBs 👇
  • Elasticsearch

    • While primarily a search engine, it can be used as a vector database with its dense_vector field type and kNN search capabilities.
  • Milvus

    • An open-source vector database designed for scalable similarity search and AI applications.
  • Qdrant

    • A vector search engine that is optimized for storing and searching large volumes of vector data.
  • Faiss

    • By Facebook AI: Primarily a library for efficient similarity search, but can be used in conjunction with databases to handle vector data.
  • Pinecone

    • A scalable managed vector database service. It is not open source, but it offers a free tier that can be useful for students.
  • LanceDB

    • LanceDB is a vector database that focuses on providing high performance for both ingestion and querying of vector data.
      • Key Features:
        • Efficient Indexing: It uses advanced indexing techniques to handle large-scale vector data efficiently.
        • Real-time Processing: Designed for real-time data processing, making it suitable for applications that require immediate insights from vector data.
      • Use Cases: Ideal for scenarios where both high-speed data ingestion and querying are critical, such as real-time recommendation systems, image retrieval systems, etc.
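Going back to the Elasticsearch entry in the list above, this sketch shows roughly what the dense_vector mapping and a kNN search body look like. It assumes an Elasticsearch 8.x cluster; the field names `title` and `embedding`, the tiny `DIMS` value, and the `build_knn_search` helper are all made up for the example. Nothing is sent to a server here, the code only builds the JSON bodies.

```python
import json

DIMS = 3  # toy size; real embeddings usually have hundreds of dimensions

# Index mapping with one text field and one vector field (hypothetical names).
index_mapping = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "embedding": {
                "type": "dense_vector",
                "dims": DIMS,
                "index": True,
                "similarity": "cosine",
            },
        }
    }
}


def build_knn_search(query_vector: list[float], k: int = 2, num_candidates: int = 10) -> dict:
    """Build a kNN search body for the 'embedding' field."""
    if len(query_vector) != DIMS:
        raise ValueError(f"expected {DIMS} dimensions, got {len(query_vector)}")
    return {
        "knn": {
            "field": "embedding",
            "query_vector": query_vector,
            "k": k,
            "num_candidates": num_candidates,
        }
    }


print(json.dumps(build_knn_search([0.1, 0.2, 0.3]), indent=2))
```

The dimension check matters in practice: every vector stored in or queried against a field has to match the `dims` declared in the mapping.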