Embedchain: Open Source RAG Framework

The Embedchain project lets us combine data from many sources with a choice of LLMs, embedding models, and vector databases.

Embedchain focuses on data management and LLM integration. It excels at personalizing LLM responses by creating embeddings from your own data, tailored to the specific application.

It prepares data for LLMs, generates embeddings, and simplifies the process of interacting with LLMs.

🎯 Today’s Objective:
- Install **Embedchain** in an isolated environment, so that AI applications can be self-hosted cleanly.
- Use Embedchain with Ollama.
- Along the way, understand what RAG is and why it matters for AI apps.

These are the key Embedchain components:
- Data Sources: Use the wide range of data sources listed in the Embedchain documentation, in combination with various models.
- Models: Access multiple models, including local and private ones like Ollama and GPT4All.
- Hugging Face Support: Models from Hugging Face are supported as well.
- Vector Databases: Use vector databases to handle large datasets effectively.
- Embedding Models: Choose among several embedding models to process your data.


The EmbedChain Project

Embedchain Key Features: 🔗
- Data Streamlining: Simplifies managing unstructured data for training and personalizing LLMs. Segments data, generates embeddings, and stores them in a vector database for efficient retrieval.
- Personalization Capabilities: Enables personalized responses by feeding the LLM with user-specific data embeddings.
- Ease of Use: Emphasizes a “conventional but configurable” approach, catering to both software engineers and machine learning specialists. Offers a user-friendly interface and configuration options for specific needs.
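
The segment, embed, store, and retrieve flow described in the list above can be sketched in plain Python. This is a toy illustration only, not Embedchain's implementation: the names `chunk_text`, `embed`, and `TinyVectorStore` are hypothetical, and the word-count "embedding" stands in for a real embedding model.

```python
import math
from collections import Counter


def chunk_text(text: str, max_words: int = 8) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real system would call a model)."""
    return Counter(word.strip(".,!?").lower() for word in text.split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorStore:
    """In-memory stand-in for a vector database (hypothetical helper)."""

    def __init__(self) -> None:
        self.items: list[tuple[str, Counter]] = []

    def add(self, text: str, max_words: int = 8) -> None:
        # Segment the text, embed each chunk, and store chunk plus vector.
        for chunk in chunk_text(text, max_words):
            self.items.append((chunk, embed(chunk)))

    def search(self, query: str, k: int = 1) -> list[str]:
        # Embed the query and return the k most similar stored chunks.
        query_vector = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(query_vector, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]
```

A real setup would swap `embed` for an embedding model and `TinyVectorStore` for an actual vector database; the shape of the flow stays the same.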

Embedchain is also fully open source:
- Embedchain Docs
- The Embedchain code on GitHub
  - License: Apache 2.0 ✅

Installing EmbedChain

We just need the embedchain package from PyPI, plus a properly installed Python environment for it and its dependencies.

Conda

https://docs.conda.io/projects/miniconda/en/latest/

# Check that conda is installed
conda --version
# Create and activate an isolated environment
conda create --name embedchain python=3.11
conda activate embedchain
conda install numpy

Venv

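# Create the virtual environment
python -m venv embedchain_venv
# Activate it on Linux/macOS
source embedchain_venv/bin/activate
# Activate it on Windows (PowerShell)
# .\embedchain_venv\Scripts\activate

# Leave the environment when you are done
# deactivate

# Windows only: if PowerShell blocks the activation script, check and relax the execution policy
# Get-ExecutionPolicy
# Set-ExecutionPolicy RemoteSigned
# Revert to the restrictive setting afterwards
# Set-ExecutionPolicy Restricted

# Install Embedchain inside the activated environment
pip install embedchain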
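
To confirm that the package landed in the environment you just activated, a small check with the standard library can help. The helper name `installed_version` is hypothetical; it only reports what `importlib.metadata` finds for the active interpreter.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Run this with the interpreter from the conda env or venv created above.
print("embedchain version:", installed_version("embedchain") or "not installed")
```

If it prints "not installed", the environment was most likely not active when `pip install` ran.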

How to Use EmbedChain

Default with OpenAI

export OPENAI_API_KEY=sk-blablabla # Linux/macOS (bash/zsh)

#set OPENAI_API_KEY=sk-blablabla # Windows (cmd)
$env:OPENAI_API_KEY = "sk-blablabla" # Windows (PowerShell)

# Or set the key from Python, before creating the bot instance
import os
os.environ["OPENAI_API_KEY"] = "your_API_key"
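
Rather than hard-coding the key in a script, it is usually safer to read it from the environment and fail early when it is missing. The sketch below assumes the key was exported with one of the shell commands above; `require_env` is a hypothetical helper, not part of Embedchain.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before creating the bot")
    return value


# Example usage (raises RuntimeError if the key was never exported):
# api_key = require_env("OPENAI_API_KEY")
```

Failing at startup gives a clearer message than an authentication error on the first query.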

With Ollama Locally

Get Ollama Ready
Follow: https://docs.embedchain.ai/components/llms#ollama
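
As a rough sketch of what the linked page describes, the configuration can be built as a plain dictionary and handed to Embedchain. Everything here is an assumption to verify against the docs for your installed version: the `llm`/`embedder` keys, the `ollama` and `huggingface` provider names, the model names, and the `App.from_config(config=...)` call (older releases took a YAML file path instead). The helpers `build_ollama_config` and `create_local_app` are hypothetical.

```python
def build_ollama_config(model: str = "llama2", temperature: float = 0.5) -> dict:
    """Build a config dict pointing the LLM at a local Ollama model (assumed schema)."""
    return {
        "llm": {
            "provider": "ollama",
            "config": {"model": model, "temperature": temperature, "stream": False},
        },
        # Assumption: a local embedder is configured too, so no OpenAI key is needed.
        "embedder": {
            "provider": "huggingface",
            "config": {"model": "BAAI/bge-small-en-v1.5"},
        },
    }


def create_local_app(model: str = "llama2"):
    """Create an Embedchain app from the config above (requires embedchain and a running Ollama)."""
    from embedchain import App  # imported lazily so the config can be inspected without it

    # Assumption: recent releases accept a dict through the `config` keyword.
    return App.from_config(config=build_ollama_config(model))
```

Before calling `create_local_app()`, the model must already be available locally, for example via `ollama pull llama2`.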

Code Skeleton for EmbedChain

import os
from embedchain import Pipeline as App


elon_bot = App()

# Embed online resources
elon_bot.add("https://en.wikipedia.org/wiki/Elon_Musk")
elon_bot.add("https://www.forbes.com/profile/elon-musk")

# Query the bot and print the answer
print(elon_bot.query("How many companies does Elon Musk run and name those?"))
# Answer: Elon Musk currently runs several companies. As of my knowledge, he is the CEO and lead designer of SpaceX, the CEO and product architect of Tesla, Inc., the CEO and founder of Neuralink, and the CEO and founder of The Boring Company. However, please note that this information may change over time, so it's always good to verify the latest updates.

FAQ

Understanding Retrieval-Augmented Generation

RAG, or Retrieval-Augmented Generation, is a technique used with large language models (LLMs) to improve the quality and relevance of generated text. Here’s a breakdown of how RAG works and its benefits:

How RAG Works

- Retrieval 🔍: Uses the initial query or prompt to retrieve relevant documents or passages from a large dataset.
- Augmentation 🔄: Adds the retrieved passages to the prompt as context, optionally after condensing them (for example by summarizing), so the LLM can draw on them when answering.
- Generation 📝: The LLM generates a response or completes the task from the augmented prompt, which helps keep the text on-topic, factual, and coherent.
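
The three steps above can be sketched as three small functions. This is a minimal illustration under loud assumptions: retrieval is a naive word-overlap ranking rather than a vector search, and `fake_llm` is a stand-in for a real model call. All names are hypothetical.

```python
def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Retrieval: rank documents by how many words they share with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def augment(query: str, passages: list[str]) -> str:
    """Augmentation: place the retrieved passages in the prompt as context."""
    context = "\n".join(f"- {passage}" for passage in passages)
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"


def generate(prompt: str, llm) -> str:
    """Generation: hand the augmented prompt to any callable LLM."""
    return llm(prompt)


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model; it just reports the prompt size."""
    return f"(stub answer for a prompt of {len(prompt)} characters)"


docs = [
    "SpaceX was founded by Elon Musk in 2002",
    "Ollama runs models locally",
    "Tesla builds electric cars",
]
question = "who founded spacex"
answer = generate(augment(question, retrieve(question, docs, k=1)), fake_llm)
```

Swapping `retrieve` for a vector-database lookup and `fake_llm` for a real model gives the usual RAG loop.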

Benefits of RAG

- Improved Factual Accuracy: RAG improves the accuracy and truthfulness of LLM-generated responses by grounding them in retrieved information.
- Enhanced Relevance: Keeps the text relevant to the initial prompt or query.
- Better Coherence: Helps generate more cohesive and structured text by providing contextual support.

Popular RAG Implementations: the two frameworks below use RAG for different purposes, which shows how it can extend LLM capabilities.
- LangChain: A versatile framework for building complex LLM applications, which can integrate RAG into larger workflows for tasks like information retrieval and text generation.
- LlamaIndex: Focuses on search and retrieval applications, using RAG to help LLMs search through large text corpora efficiently.

FREE Vector DataBases

Ollama embeddings

https://www.youtube.com/watch?v=Ml179HQoy9o
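
For reference, a local Ollama server can also be asked for embeddings directly over HTTP. The sketch below uses only the standard library and rests on assumptions worth checking against the Ollama API docs for your version: the default address `localhost:11434`, the `/api/embeddings` endpoint with `model` and `prompt` fields, an `embedding` field in the response, and `nomic-embed-text` as an example model that has already been pulled.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/embeddings"


def build_embedding_request(text: str, model: str = "nomic-embed-text") -> urllib.request.Request:
    """Build the POST request asking Ollama to embed one piece of text."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def get_embedding(text: str, model: str = "nomic-embed-text") -> list:
    """Send the request and return the embedding vector (needs a running Ollama server)."""
    with urllib.request.urlopen(build_embedding_request(text, model), timeout=30) as response:
        return json.loads(response.read())["embedding"]
```

Calling `get_embedding("hello world")` only works while the Ollama server is running and the model is available locally.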

GPUs for Ollama: https://www.youtube.com/watch?v=QRot1WtivqI
