Embedchain allows us to combine all these data sources with many models, some of them local and private, like Ollama or GPT4All.
The support for Hugging Face (HF) is appreciated as well.
The vector DBs
And embedding models
The EmbedChain Project
Installing EmbedChain
- We just need the embedchain package from PyPI
Conda
https://docs.conda.io/projects/miniconda/en/latest/
conda --version
# conda create --name embedchain python=3.11
# conda activate embedchain
conda install numpy
Venv
# !python -m venv embedchain_venv
#Unix
#!source embedchain_venv/bin/activate
#Windows
#.\embedchain_venv\Scripts\activate
#deactivate
#Get-ExecutionPolicy
#Set-ExecutionPolicy RemoteSigned
#Set-ExecutionPolicy Restricted
!pip install embedchain
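To confirm that the install landed in the environment you just activated (and not in some other Python), a quick check like the one below can help. It is a minimal sketch that uses only the standard library; the helper name `installed_version` is made up for this example.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("embedchain")
if version is None:
    print("embedchain is not installed in this environment")
else:
    print(f"embedchain {version} is installed")
```

If the message says the package is missing, the conda or venv environment is most likely not the active one.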
Using EmbedChain
https://www.youtube.com/watch?v=jE24Y_GasE8
Default with OpenAI
export OPENAI_API_KEY=sk-blablabla # Linux/Mac (bash)
#set OPENAI_API_KEY=sk-blablabla # Windows cmd
$env:OPENAI_API_KEY = "sk-blablabla" # PowerShell
# Or set the key from Python (requires import os), before creating the bot instance
os.environ["OPENAI_API_KEY"] = "your_API_key"
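Since the default setup depends on `OPENAI_API_KEY` being present, it can be useful to fail early with a clear message rather than at the first query. The sketch below uses only the standard library; `require_env` is a hypothetical helper, not part of embedchain.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before creating the bot")
    return value


# Typical use, right before creating the bot:
# api_key = require_env("OPENAI_API_KEY")
```

Reading the key from the environment also keeps it out of the notebook or script itself.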
With Ollama Locally
https://docs.embedchain.ai/components/llms#ollama
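As a rough sketch of what a local setup might look like, the snippet below builds a configuration dictionary that points both the LLM and the embedder at an Ollama server. Several things here are assumptions to check against the page linked above: the key names (`provider`, `config`, `model`, `base_url`), the `App.from_config(config=...)` entry point, the model names `llama2` and `nomic-embed-text`, and the default Ollama address `http://localhost:11434`. The helper names are made up for this example.

```python
from typing import Any, Dict


def build_ollama_config(
    model: str = "llama2",                      # assumed model, pulled beforehand with Ollama
    embed_model: str = "nomic-embed-text",      # assumed embedding model
    base_url: str = "http://localhost:11434",   # assumed local Ollama address
) -> Dict[str, Any]:
    """Build a config dict in the shape the embedchain docs appear to use (assumption)."""
    return {
        "llm": {
            "provider": "ollama",
            "config": {"model": model, "temperature": 0.5, "base_url": base_url},
        },
        "embedder": {
            "provider": "ollama",
            "config": {"model": embed_model, "base_url": base_url},
        },
    }


def make_local_bot(config: Dict[str, Any]):
    """Create a bot from the config; needs embedchain installed and Ollama running."""
    from embedchain import App  # imported lazily so the config can be built without it

    return App.from_config(config=config)  # assumed entry point, see the linked docs


config = build_ollama_config()
print(config["llm"]["provider"], config["llm"]["config"]["model"])
```

Configuring the embedder as well as the LLM matters for privacy: if only the LLM is switched to Ollama, the embeddings may still be computed by a hosted provider.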
Code Skeleton
import os
from embedchain import Pipeline as App
# Create a bot instance
elon_bot = App()
# Embed online resources
elon_bot.add("https://en.wikipedia.org/wiki/Elon_Musk")
elon_bot.add("https://www.forbes.com/profile/elon-musk")
# Query the bot
elon_bot.query("How many companies does Elon Musk run and name those?")
# Answer: Elon Musk currently runs several companies. As of my knowledge, he is the CEO and lead designer of SpaceX, the CEO and product architect of Tesla, Inc., the CEO and founder of Neuralink, and the CEO and founder of The Boring Company. However, please note that this information may change over time, so it's always good to verify the latest updates.
FAQ
F/OSS Vector Databases
Self-Hosting ChromaDB
Other F/OSS RAG Frameworks
- LlamaIndex - it brings data to LLMs
- LangChain - another alternative for RAG?
Ollama embeddings
https://www.youtube.com/watch?v=Ml179HQoy9o
GPUs for Ollama: https://www.youtube.com/watch?v=QRot1WtivqI