model-fit — Find the Local LLMs That Actually Run on Your Hardware
1 Developer

Hardware-Aware LLM Compatibility Engine
model-fit is an open-source command-line tool, published on npm, that tells you which local LLMs will actually run on your machine before you spend time downloading them. It reads your CPU, GPU, and RAM, then computes each model's real memory budget (weights + KV cache sized to your context length + overhead) and classifies the model as full-GPU, hybrid, CPU-only, or won't-run, with an estimated tokens/sec and the reasoning behind every number. No opaque scores, just transparent VRAM math.
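The four-way classification can be sketched roughly as follows. This is a minimal illustration, not model-fit's actual code: the type names, the `classify` function, and the tier rules (in particular, treating VRAM plus system RAM as the hybrid budget and ignoring memory reserved by the OS) are assumptions made for the example.

```typescript
// Hypothetical sketch: names and tier rules are assumptions, not model-fit's API.
type Fit = "full-gpu" | "hybrid" | "cpu-only" | "wont-run";

interface MemoryBudget {
  weightsGiB: number;  // model weights at the chosen quantization
  kvCacheGiB: number;  // KV cache sized to the requested context length
  overheadGiB: number; // runtime buffers and similar fixed costs
}

interface Hardware {
  vramGiB: number; // 0 when there is no usable GPU
  ramGiB: number;  // system memory
}

function classify(budget: MemoryBudget, hw: Hardware): Fit {
  const total = budget.weightsGiB + budget.kvCacheGiB + budget.overheadGiB;
  const hasGpu = hw.vramGiB > 0;

  // Everything fits in VRAM.
  if (hasGpu && total <= hw.vramGiB) return "full-gpu";
  // Some layers stay on the GPU, the rest spill to system RAM.
  if (hasGpu && total <= hw.vramGiB + hw.ramGiB) return "hybrid";
  // No GPU, but the model fits in system RAM.
  if (total <= hw.ramGiB) return "cpu-only";
  return "wont-run";
}

const desktop: Hardware = { vramGiB: 8, ramGiB: 16 };
const small: MemoryBudget = { weightsGiB: 4, kvCacheGiB: 1, overheadGiB: 0.5 };
const large: MemoryBudget = { weightsGiB: 14, kvCacheGiB: 4, overheadGiB: 1 };

console.log(classify(small, desktop)); // full-gpu
console.log(classify(large, desktop)); // hybrid
```

Keeping the three budget terms separate, rather than collapsing them into a single score, is what lets a tool explain why a model landed in a given tier.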


The Challenge
Anyone running local LLMs hits the same wall: models that crash with out-of-memory errors or crawl at 2 tokens/sec. Existing tools hand out vague compatibility scores that ignore the single biggest variable: the KV cache, which grows with context length and can dwarf the model weights themselves. Worse, naive estimators size the cache by the full number of attention heads rather than the smaller number of key/value heads, so they wrongly flag modern grouped-query attention (GQA) architectures like Llama 3 and Qwen2.5 as "won't run" at long context. The challenge was to model real memory behaviour accurately across heterogeneous hardware (Windows, macOS, Linux) and a constantly changing model catalog, while keeping the tool fast and fully usable offline.
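To make the GQA point concrete, here is a minimal sketch of the standard KV cache size formula (2 tensors × layers × KV heads × head dimension × context tokens × bytes per element). The function and type names are hypothetical, and an fp16 cache (2 bytes per element) is assumed. The shape values are those of Llama 3 8B (32 layers, 32 attention heads, 8 KV heads, head dimension 128).

```typescript
// Hypothetical sketch: names are illustrative, not model-fit's API.
interface AttentionShape {
  layers: number;
  attentionHeads: number; // query heads
  kvHeads: number;        // key/value heads (fewer than attentionHeads under GQA)
  headDim: number;
}

const GiB = 1024 ** 3;

function kvCacheBytes(
  shape: AttentionShape,
  contextTokens: number,
  bytesPerElement: number = 2, // assumption: fp16 cache
  gqaAware: boolean = true
): number {
  // A naive estimator uses the query-head count, overstating the cache for GQA models.
  const heads = gqaAware ? shape.kvHeads : shape.attentionHeads;
  // Factor of 2 covers the separate key and value tensors.
  return 2 * shape.layers * heads * shape.headDim * contextTokens * bytesPerElement;
}

const llama3_8b: AttentionShape = { layers: 32, attentionHeads: 32, kvHeads: 8, headDim: 128 };

console.log(kvCacheBytes(llama3_8b, 8192) / GiB);           // 1 (GQA-aware)
console.log(kvCacheBytes(llama3_8b, 8192, 2, false) / GiB); // 4 (naive)
```

At an 8,192-token context the naive figure is four times the GQA-aware one, and the gap scales linearly with context length, which is enough to push a model from "fits" to "won't run" on a typical consumer GPU.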
The Solution
I built model-fit as a zero-config Node.js CLI that runs instantly via npx model-fit. It detects hardware through the systeminformation library, then derives every figure from first principles: weights = params × bytes-per-weight for the chosen quantization; a KV cache sized to the actual context window and the model's real attention architecture (GQA-aware); and generation speed approximated as memory bandwidth ÷ active weights, which accounts for mixture-of-experts (MoE) models that only read their active experts for each token. The catalog merges three sources: a curated offline seed list, the user's local Ollama models with real on-disk sizes, and the Hugging Face Hub GGUF catalog ranked by popularity. All three are deduplicated and cached for 24 hours, so the tool stays fast and works offline. Commands include detect, recommend (with category filters like coding, reasoning and vision) and check, which prints a full memory breakdown for a single model. Shipped as an MIT-licensed package.
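The weight and speed formulas above can be sketched as below. The names are hypothetical, and the inputs are illustrative assumptions rather than measured values: 0.5 bytes per weight stands in for an idealized 4-bit quantization (real GGUF quantizations carry some extra bits per weight), and 400 GB/s is an arbitrary bandwidth figure. The result is a bandwidth-bound ceiling, not a benchmark.

```typescript
// Hypothetical sketch: names and sample numbers are assumptions, not model-fit's API.
interface ModelSpec {
  totalParams: number;  // all parameters (determines memory footprint)
  activeParams: number; // parameters read per token (equals totalParams for dense models)
}

// weights = params × bytes-per-weight
function weightBytes(params: number, bytesPerWeight: number): number {
  return params * bytesPerWeight;
}

// speed ≈ memory bandwidth ÷ bytes of active weights read per token
function estimateTokensPerSec(
  model: ModelSpec,
  bytesPerWeight: number,
  bandwidthBytesPerSec: number
): number {
  return bandwidthBytesPerSec / weightBytes(model.activeParams, bytesPerWeight);
}

const dense: ModelSpec = { totalParams: 8e9, activeParams: 8e9 };
const moe: ModelSpec = { totalParams: 40e9, activeParams: 10e9 }; // made-up MoE shape

const bytesPerWeight = 0.5; // assumption: idealized 4-bit quantization
const bandwidth = 400e9;    // assumption: 400 GB/s memory bandwidth

console.log(weightBytes(dense.totalParams, bytesPerWeight));        // 4000000000
console.log(estimateTokensPerSec(dense, bytesPerWeight, bandwidth)); // 100
console.log(estimateTokensPerSec(moe, bytesPerWeight, bandwidth));   // 80
```

Note that the MoE model still needs memory for all 40 billion parameters (20 GB at this quantization), but its speed estimate depends only on the 10 billion it reads per token.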