AlexsJones

llmfit

// summary

llmfit is a terminal-based utility that analyzes your system's hardware to identify which large language models will run well on it. It provides an interactive TUI and a CLI that score models on quality, speed, and memory fit, and it supports backends such as Ollama, llama.cpp, and MLX. A hardware simulation mode lets you override the detected specs to see how models would fit a different target system.

// technical analysis

llmfit matches hardware capabilities against Large Language Model (LLM) requirements. It automatically detects system specifications, including CPU, RAM, and a range of GPU architectures, then scores and ranks models on fit, speed, and quality, replacing the manual trial and error of local model deployment. The scoring engine accounts for dynamic quantization, Mixture-of-Experts (MoE) architectures, and context-length constraints, so the ranking shows which models are likely to run well on the detected hardware.
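
The ranking idea described above could be sketched as follows. This is a minimal illustration and not llmfit's actual engine: the weights, the 0 to 1 score ranges, and the names `ModelEstimate`, `composite_score`, and `rank` are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ModelEstimate:
    name: str
    quality: float  # assumed range 0..1, higher is better
    speed: float    # assumed range 0..1, higher is faster
    fit: float      # assumed range 0..1, 0 means the model does not fit in memory


# Hypothetical weights; the real tool may combine the scores differently.
WEIGHTS = {"quality": 0.4, "speed": 0.3, "fit": 0.3}


def composite_score(m: ModelEstimate) -> float:
    """Weighted sum of the three scores; a model that does not fit scores zero."""
    if m.fit <= 0.0:
        return 0.0
    return (
        WEIGHTS["quality"] * m.quality
        + WEIGHTS["speed"] * m.speed
        + WEIGHTS["fit"] * m.fit
    )


def rank(models: List[ModelEstimate]) -> List[ModelEstimate]:
    """Return the models ordered from best to worst composite score."""
    return sorted(models, key=composite_score, reverse=True)


if __name__ == "__main__":
    candidates = [
        ModelEstimate("big", quality=0.9, speed=0.2, fit=0.0),
        ModelEstimate("mid", quality=0.7, speed=0.6, fit=0.8),
        ModelEstimate("small", quality=0.4, speed=0.9, fit=1.0),
    ]
    for m in rank(candidates):
        print(f"{m.name}: {composite_score(m):.2f}")
```

Treating a non-fitting model as a hard zero, rather than as a low fit score, keeps a high-quality model that cannot load from outranking one that can.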

// key highlights

Provides an interactive TUI and CLI to rank hundreds of models based on hardware-specific fit, speed, and quality scores.

Features a hardware simulation mode that allows users to override system specs to test model compatibility for different hardware configurations.

Handles Mixture-of-Experts (MoE) models by estimating effective VRAM requirements from the active expert parameters rather than the total parameter count.

Implements dynamic quantization selection, automatically choosing the highest-quality quantization that fits within available memory.

Includes a built-in web dashboard and REST API for remote monitoring and integration with cluster schedulers or external scripts.

Offers hardware detection across a wide range of accelerators, including NVIDIA, AMD, and Intel Arc GPUs, Apple Silicon, and Ascend NPUs.
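
The quantization selection and MoE handling listed above can be sketched together. This is a rough sketch under stated assumptions, not the project's implementation: the bits-per-weight figures are approximate illustrative values for common GGUF quantization levels, the fixed `OVERHEAD_GB` stands in for KV cache and runtime memory, and the function names are hypothetical. For an MoE model the sketch passes the active parameter count, assuming inactive experts can live outside VRAM.

```python
from typing import Optional

# Quantization levels ordered from highest to lowest quality.
# Bits-per-weight values are approximate and for illustration only.
QUANT_BITS = [
    ("Q8_0", 8.5),
    ("Q6_K", 6.6),
    ("Q5_K_M", 5.7),
    ("Q4_K_M", 4.8),
    ("Q3_K_M", 3.9),
    ("Q2_K", 3.0),
]

# Hypothetical flat allowance for KV cache and runtime buffers.
OVERHEAD_GB = 1.0


def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Memory for the weights in GB: 1e9 params * bits / 8 bytes each."""
    return params_billions * bits_per_weight / 8.0


def pick_quant(params_billions: float, available_gb: float) -> Optional[str]:
    """Return the highest-quality quantization that fits, or None if none do."""
    for name, bits in QUANT_BITS:
        if weights_gb(params_billions, bits) + OVERHEAD_GB <= available_gb:
            return name
    return None


if __name__ == "__main__":
    # Dense 7B model on an 8 GB GPU.
    print(pick_quant(7.0, 8.0))
    # MoE model with 47B total but 13B active parameters on a 12 GB GPU:
    # sizing by active parameters finds a fit, sizing by the total does not.
    print(pick_quant(13.0, 12.0), pick_quant(47.0, 12.0))
```

Walking the list from best to worst quality and returning the first fit is what makes the choice "highest quality that fits"; a real estimate would also scale the overhead with context length.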

// use cases

Automated hardware detection and model compatibility scoring for local LLM execution.

Hardware simulation mode to predict model performance on different RAM, VRAM, and CPU configurations.

REST API and JSON output support for integrating model recommendations into automated workflows and cluster schedulers.
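
The simulation and JSON use cases above could be combined along these lines. This is a sketch only: `HardwareSpec`, `simulate`, `recommend_json`, the catalog of memory requirements, and the JSON field names are all hypothetical and do not describe llmfit's real output schema or API.

```python
import json
from dataclasses import asdict, dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class HardwareSpec:
    ram_gb: float
    vram_gb: float
    cpu_cores: int


def simulate(detected: HardwareSpec, **overrides) -> HardwareSpec:
    """Return a copy of the detected spec with selected fields overridden."""
    return replace(detected, **overrides)


def recommend_json(spec: HardwareSpec, catalog: Dict[str, float]) -> str:
    """Emit the models that fit the spec as a JSON string.

    `catalog` maps a model name to its assumed memory requirement in GB.
    A model runs on the GPU if it fits in VRAM, else on the CPU if it fits in RAM.
    """
    models = []
    for name, required_gb in catalog.items():
        if required_gb <= spec.vram_gb:
            target = "gpu"
        elif required_gb <= spec.ram_gb:
            target = "cpu"
        else:
            continue  # does not fit anywhere on this machine
        models.append({"model": name, "required_gb": required_gb, "runs_on": target})
    return json.dumps({"hardware": asdict(spec), "models": models}, sort_keys=True)


if __name__ == "__main__":
    detected = HardwareSpec(ram_gb=16.0, vram_gb=8.0, cpu_cores=8)
    upgraded = simulate(detected, vram_gb=24.0)  # what if the GPU had 24 GB?
    catalog = {"small": 5.0, "medium": 20.0, "large": 48.0}
    print(recommend_json(detected, catalog))
    print(recommend_json(upgraded, catalog))
```

Keeping the spec immutable and deriving the simulated one with `replace` means the detected values are never lost, so a script or scheduler can compare both results side by side.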

// getting started

To begin, install llmfit with a package manager (Scoop on Windows, Homebrew on macOS/Linux, or MacPorts on macOS) or with the provided shell script. Once installed, run the 'llmfit' command in your terminal to launch the interactive TUI, or use 'llmfit recommend' to get machine-readable model suggestions for your current hardware.