Embedchain lets us combine all these data sources with many different models, some of them local and private, such as Ollama or GPT4All.

The support for Hugging Face (HF) is appreciated as well.

The vector DBs

And embedding models

The EmbedChain Project

Installing EmbedChain

Conda

https://docs.conda.io/projects/miniconda/en/latest/

conda --version
# conda create --name embedchain python=3.11
# conda activate embedchain
conda install numpy

Venv

# !python -m venv embedchain_venv
#Unix
#!source embedchain_venv/bin/activate
#Windows
#.\embedchain_venv\Scripts\activate

#deactivate

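# PowerShell only: if the activation script is blocked, check and adjust the execution policy
#Get-ExecutionPolicy
#Set-ExecutionPolicy RemoteSigned
# Restore the restrictive policy afterwards if desired
#Set-ExecutionPolicy Restricted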

!pip install embedchain

Using EmbedChain

https://www.youtube.com/watch?v=jE24Y_GasE8

Default with OpenAI

export OPENAI_API_KEY=sk-blablabla # Linux/macOS (bash, zsh)
#set OPENAI_API_KEY=sk-blablabla # Windows cmd
$env:OPENAI_API_KEY = "sk-blablabla" # PowerShell
# Alternatively, set the key from Python before creating the bot instance
import os
os.environ["OPENAI_API_KEY"] = "your_API_key"
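Whichever shell you use, it can help to confirm that the key is actually visible to Python before creating the bot, since a missing key otherwise only surfaces on the first API call. A minimal sketch using only the standard library (the helper name `require_env` is hypothetical, not part of Embedchain):

```python
import os


def require_env(name: str = "OPENAI_API_KEY") -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before creating the bot")
    return value


if __name__ == "__main__":
    # Fails early with a readable error instead of at the first query.
    require_env("OPENAI_API_KEY")
    print("OPENAI_API_KEY found")
```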

With Ollama Locally

https://docs.embedchain.ai/components/llms#ollama
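As a rough sketch of what the linked page describes, the LLM provider can be switched to Ollama through a configuration dictionary. This assumes an Embedchain version whose `App.from_config` accepts a `config` dictionary, a running local Ollama server, and a model (here `llama2`, chosen only for illustration) that has already been pulled. The helper names are hypothetical, and the exact keys should be checked against the docs for your installed version.

```python
def build_ollama_config(model: str = "llama2") -> dict:
    """Build a config dict that selects Ollama as the LLM provider.

    The key names follow the Embedchain docs linked above; treat them as an
    assumption and verify them against your installed version.
    """
    return {
        "llm": {
            "provider": "ollama",
            "config": {
                "model": model,
                "temperature": 0.5,
                "stream": False,
            },
        },
    }


def create_local_bot(model: str = "llama2"):
    """Create an Embedchain app backed by a local Ollama model."""
    # Imported lazily so the config helper works without embedchain installed.
    from embedchain import App

    return App.from_config(config=build_ollama_config(model))


if __name__ == "__main__":
    bot = create_local_bot()
    bot.add("https://en.wikipedia.org/wiki/Elon_Musk")
    print(bot.query("How many companies does Elon Musk run?"))
```

Note that this only swaps the LLM. The embedding model is configured separately, so a fully local setup also needs a local embedder in the same config.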

Code Skeleton

import os
from embedchain import Pipeline as App


elon_bot = App()

# Embed online resources
elon_bot.add("https://en.wikipedia.org/wiki/Elon_Musk")
elon_bot.add("https://www.forbes.com/profile/elon-musk")

# Query the bot
elon_bot.query("How many companies does Elon Musk run and name those?")
# Answer: Elon Musk currently runs several companies. As of my knowledge, he is the CEO and lead designer of SpaceX, the CEO and product architect of Tesla, Inc., the CEO and founder of Neuralink, and the CEO and founder of The Boring Company. However, please note that this information may change over time, so it's always good to verify the latest updates.

FAQ

F/OSS Vector Databases

Self-Hosting ChromaDB

Other F/OSS RAG Frameworks

  • LlamaIndex - It brings data to LLMs
  • LangChain - another alternative for RAG?

Ollama embeddings

https://www.youtube.com/watch?v=Ml179HQoy9o

GPUs for Ollama: https://www.youtube.com/watch?v=QRot1WtivqI