Deploy local LLMs like Containers - OLLama Docker

Ollama is a great open source project that can help us to use large language models locally, even without internet connection and CPU only.

Why Ollama?

This year we are living an explosion on the number of new LLMs model.

Some of them are great, like ChatGPT or bard, yet private source. But there are of course open source LLMs that we can use, even in our own hardware (even without GPU).

I love this kind of projects, like TextGenWebUI or Ollama because they are really making Generative AI accesible to everyone.

What makes Ollama better?

The good thing about Ollama is that there is a Docker container to deploy it out of the box.

Even better, they have made the project so that we can spin up LLMs like if they were containers inside Ollama.

You can get to know more about the Ollama project on their Official Page.

And as F/OSS, they made the Ollama source code available on Github.

Ollama with Docker

As always with Self-Hosting, make sure you have Docker installed (it will really make your life easier to try new software with all dependencies checked).

We are going to use the Ollama Docker Container Image. that the project provides.

This command should tell you the Docker version you just installed and that you are ready to go ahead.

docker --version

Ollama Docker CLI

If you have used Docker before to Self-Host some service, feel free the CLI command below to deploy Ollama with Docker

docker volume create ollama_data

docker run -d --name ollama -p 11434:11434 -v ollama_data:/root/.ollama ollama/ollama

Wait for the first time use download and you are ready to use LLMs locally with Ollama.

You can check that everything is ready with this command that will tell you the Ollama version you deployed:

docker exec -it ollama ollama --version

Ollama Docker Stack

If you are not yet familiar with Docker, I would recommend you to install Portainer to manage the containers with a GUI.

Then, use the Stack below (it replicates the CLI command above) to deploy Ollama.

#version: '3'
services:
  ollama:
    image: ollama/ollama
    container_name: ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama

volumes:
  ollama_data:

After it downloads the OLLama Docker image the first time, you will see that everything is running in Portainer.

Successfully deploying OLLama with Docker

You can now go inside the Ollama container with the terminal and try if verything is fine and which OLLama version you have:

docker exec -it ollama /bin/bash #to use the container interactive terminal
ollama --version #0.1.8

Now the fun part begins with Ollama

Selecting Models with Ollama

In Ollama’s official site, you can find a library of open source models already configured to work with it.

Let try one LLM locally with our Ollama Docker container.

Using Ollama - DeepSeek Coder Locally

The DeepSeek Coder model has gain popularity lately and can help us with code related tasks.

Let’s try DeepSeek Coder with Ollama.

Just execute the command below and it will download the LLM model and will wait for your prompts:

ollama run deepseek-coder:1.3b

As I mentioned, the Ollama project is giving us a way to control LLMs in a way very familiar to Docker users: ollama run some_llm

You will see the following in Ollama’s container terminal:

Using DeepSeek Coder LLM with OLLama for free

Once it finishes, you can try the model with any question you have:

Using DeepSeek Coder LLM with OLLama for free

Other great LLMs for Ollama

For general purpose
- Llama2
For coding
- CodeLlama.
- SQLCoder.

FAQ

Does OLLama provides API?

Yes, for example with DeepSeek Coder, we can do:

curl -X POST http://localhost:11434/api/generate -d '{"model": "deepseek-coder:1.3b", "prompt":"What it is Python?"}'

#Windows
# curl.exe -X POST http://localhost:11434/api/generate -d "{\"model\": \"deepseek-coder:1.3b\", \"prompt\":\"What it is Python?\"}"

And if you want just the full response:

curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "Why is the sky blue?",
  "stream": false
}'

#Windows
# curl.exe -X POST http://localhost:11434/api/generate -d "{\"model\":\"mistral\",\"prompt\":\"Why is the sky blue?\",\"stream\":false}"

Can I use Ollama from Python?

Yes, with the help of the LLamaIndex project, which it is an open source data framework for our LLMs.

First, install the LLamaIndex Python Package:

pip install llama-index

from llama_index.llms import Ollama
llm = Ollama(model="llama2")
llm.complete("Why is the sky blue?")

You can follow the full example here.

The LLamaIndex projects also provide great information and concepts to know if you want to get Started and build more serious Projects with LLMs

How to use Ollama with https?

You can use NGINX as reverse proxy together with Ollama Docker container. NGINX will create the SSL certificates and you can expose OLLama safely if you need to.

If you are interested in deploying a separated NGINX instance with Docker, I already created a guide for that here.

Why Ollama?#

What makes Ollama better?#

Ollama with Docker#

Ollama Docker CLI#

Ollama Docker Stack#

Selecting Models with Ollama#

Using Ollama - DeepSeek Coder Locally#

Other great LLMs for Ollama#

FAQ#

Does OLLama provides API?#

Can I use Ollama from Python?#

How to use Ollama with https?#

Is there a Video explaining How to Install Ollama with Docker?#