🤖 llama.cpp
by ggerganov
About
Run quantized language models locally with llama.cpp. Highly optimized CPU inference for Llama, Mistral, Phi, and other models in GGUF format.
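As a rough illustration, local inference over a GGUF model might look like this using the community llama-cpp-python bindings. Note these bindings are a separate package from this MCP server, and the model filename below is hypothetical:

```python
# Minimal sketch: local GGUF inference via the community llama-cpp-python
# bindings (installed with `pip install llama-cpp-python`). This is a
# separate project from the MCP server; the model filename is hypothetical.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/mistral-7b-instruct.Q4_K_M.gguf",  # local GGUF file
    n_ctx=2048,   # context window size
    n_threads=8,  # CPU threads to use for inference
)

out = llm("Q: What is the GGUF format? A:", max_tokens=64, stop=["Q:"])
print(out["choices"][0]["text"])
```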
Frequently Asked Questions
What is the llama.cpp MCP server?
An MCP server that exposes llama.cpp to AI clients, letting them run quantized language models locally with highly optimized CPU inference for Llama, Mistral, Phi, and other models in GGUF format.
How do I install llama.cpp?
Visit the GitHub repository (https://github.com/ggerganov/llama.cpp) for installation instructions.
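Since the install type is a binary, one way to use it after building or downloading is to invoke the executable from a script. A hedged sketch follows; the binary name (llama-cli) and flags (-m, -p, -n) follow current upstream llama.cpp documentation and may differ between releases, and the model path is hypothetical:

```python
# Hedged sketch: driving the prebuilt llama.cpp binary from Python.
# Binary name and flags are taken from upstream docs and may vary
# between releases; file paths here are hypothetical.
import subprocess

result = subprocess.run(
    [
        "./llama-cli",
        "-m", "./models/mistral-7b-instruct.Q4_K_M.gguf",  # GGUF model path
        "-p", "Explain quantization in one sentence.",     # prompt text
        "-n", "64",                                        # tokens to generate
    ],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout)
```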
What AI clients work with llama.cpp?
Quick Info
- Install Type: binary
- Author: ggerganov
- Categories: 1
- Integrations: 3
Related Servers
🧠 Memory
Knowledge graph-based persistent memory system. Store and retrieve contextual information.
🤖 Sequential Thinking
Dynamic and reflective problem-solving through thought sequences.
🔍 Exa
Search engine made for AIs. Neural search that understands the meaning of content.
🗄️ Milvus
Search, query, and interact with data in your Milvus vector database.
🗄️ Chroma
Embeddings, vector search, document storage, and full-text search with the open-source AI application database.