llama.cpp

by ggerganov

About

Run quantized language models locally with llama.cpp: highly optimized CPU inference for Llama, Mistral, Phi, and other models in GGUF format.

Frequently Asked Questions

What is the llama.cpp MCP server?
The llama.cpp MCP server runs quantized language models locally through llama.cpp, providing highly optimized CPU inference for Llama, Mistral, Phi, and other models in GGUF format.
How do I install llama.cpp?
Installation instructions are in the llama.cpp GitHub repository (https://github.com/ggerganov/llama.cpp); a typical build is sketched below.
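For reference, a minimal sketch of the usual from-source build, following the CMake steps documented in the llama.cpp README (targets and flags may change between releases):

git clone https://github.com/ggerganov/llama.cpp    # fetch the source
cd llama.cpp
cmake -B build                                      # configure; GPU backends use extra flags, e.g. -DGGML_CUDA=ON
cmake --build build --config Release                # builds llama-cli, llama-server, etc. into build/bin

Prebuilt binaries and package-manager installs are also listed in the repository for platforms where building from source is not needed.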
What AI clients work with llama.cpp?
llama.cpp works with MCP clients such as Claude Desktop, Cursor, and VS Code.