KoboldCpp is easy-to-use AI text-generation software for GGML and GGUF models.
It is a single self-contained distributable from Concedo that builds on llama.cpp and adds a versatile Kobold API endpoint.
The KoboldCpp Project
- The KoboldCpp code on GitHub: https://github.com/LostRuins/koboldcpp
- License: AGPL-3 ✅
Installing koboldcpp
Check latest releases: https://github.com/LostRuins/koboldcpp/releases/
wget https://github.com/LostRuins/koboldcpp/releases/download/v1.58/koboldcpp-linux-x64
# Or fetch the latest release instead of the pinned v1.58 build:
#curl -fLo koboldcpp-linux-x64 https://github.com/LostRuins/koboldcpp/releases/latest/download/koboldcpp-linux-x64
chmod +x koboldcpp-linux-x64
./koboldcpp-linux-x64
Select a model in the launcher. For example, you can download one from https://huggingface.co/eachadea/ggml-vicuna-7b-1.1/tree/main or https://huggingface.co/TheBloke/dolphin-2.5-mixtral-8x7b-GGUF/tree/main
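The download and start-up can also be scripted from the command line. The following is a minimal sketch, not the project's documented procedure: the file name in MODEL_FILE is an assumption (check the repository's file list for the quantization you want), and model_url and start_koboldcpp are hypothetical helper names. Note that Mixtral files are large downloads.

```shell
#!/bin/sh
# Sketch: fetch one GGUF file and start KoboldCpp on it from the command line.
# MODEL_FILE is an assumed file name; verify it against the repository listing.
MODEL_REPO="TheBloke/dolphin-2.5-mixtral-8x7b-GGUF"
MODEL_FILE="dolphin-2.5-mixtral-8x7b.Q4_K_M.gguf"

# Hugging Face serves raw files under /resolve/<branch>/<file>.
model_url() {
    printf 'https://huggingface.co/%s/resolve/main/%s\n' "$1" "$2"
}

# Download the model (resuming a partial download) and launch the server.
start_koboldcpp() {
    wget -c "$(model_url "$MODEL_REPO" "$MODEL_FILE")" &&
        ./koboldcpp-linux-x64 --model "$MODEL_FILE" --port 5001 --contextsize 4096
}

# Uncomment to run:
# start_koboldcpp
```

The context size of 4096 is only an illustrative value; pick one that fits the model and the available memory.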
Once the model has loaded, open http://localhost:5001 in a browser to use the web UI.
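The same address also serves the Kobold API, so text can be generated without the browser. A minimal sketch, assuming the server is running on the default port and exposes the Kobold-style /api/v1/generate route; build_payload and generate are hypothetical helper names, and the sketch does no JSON escaping, so keep quotes and backslashes out of the prompt.

```shell
#!/bin/sh
# Sketch: send one prompt to a running KoboldCpp server via the Kobold API.
KOBOLD_URL="${KOBOLD_URL:-http://localhost:5001}"

# Build the JSON request body from a prompt and a maximum number of tokens.
# No escaping is done: the prompt must not contain quotes or backslashes.
build_payload() {
    printf '{"prompt": "%s", "max_length": %d}\n' "$1" "$2"
}

# POST the payload and print the raw JSON response (80 tokens by default).
generate() {
    curl -fsS -X POST "$KOBOLD_URL/api/v1/generate" \
        -H 'Content-Type: application/json' \
        -d "$(build_payload "$1" "${2:-80}")"
}

# Example (needs a running server):
# generate "Once upon a time" 60
```

The response is JSON; the generated text should be found in the results array, which a tool such as jq can extract.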
Trying a MoE LLM
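Mixtral 8x7B is Mistral AI's mixture-of-experts (MoE) model; the variant tried here is Dolphin 2.5, a fine-tune of it.
You can read more about Mixtral at https://mistral.ai/news/mixtral-of-experts/
GGUF files for Dolphin 2.5 Mixtral 8x7B: https://huggingface.co/TheBloke/dolphin-2.5-mixtral-8x7b-GGUF/tree/main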
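A practical consequence of the MoE design is that all experts must be held in memory even though only some are used for each token, so the model's size follows its total parameter count. The sketch below is a back-of-the-envelope estimate only. It assumes the 46.7B total parameters quoted in the Mistral announcement and roughly 4.5 bits per weight for a 4-bit quantization, which is an approximation rather than a measured figure; estimate_gb is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: rough model size in GB from parameter count and bits per weight.
#   $1 = parameters in billions, $2 = average bits per weight (assumed values)
estimate_gb() {
    LC_ALL=C awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b / 8 }'
}

estimate_gb 46.7 4.5   # 4-bit quantization, prints 26.3
estimate_gb 46.7 16    # 16-bit weights, prints 93.4
```

Actual memory use will be higher than the file size because of the context cache and runtime overhead.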
FAQ
Why Cpp and not Python?
- Performance: C++ typically offers better performance than Python due to its lower-level nature and more direct control over hardware resources. For computationally intensive AI tasks, especially those involving large datasets or complex algorithms, C++ can provide significant speed advantages.
- Development Time: Python is often favored for its simplicity and ease of development. It offers concise syntax, dynamic typing, and extensive libraries (such as TensorFlow, PyTorch, and scikit-learn) that make it convenient for prototyping and experimenting with AI models. In contrast, C++ development may require more time and effort due to its stricter syntax and manual memory management.
- Portability: Python’s high-level nature and platform independence make it more portable than C++, which is typically compiled to machine code specific to the target platform. Python code can run on various platforms without modification, whereas C++ code may need to be recompiled for different platforms.