Олесь Пічак

FLAP - Fine-tune any LLM (100B+) on your GPU, zero cloud costs

Fine-tune any LLM on your local GPU, no cloud required. FLAP uses memory-mapped sharding (sketched below) to train models from 1B to 670B+ on as little as 6 GB of VRAM. Your data never leaves your machine. No per-hour GPU bills. No vendor lock-in.

✓ 21.5× faster than traditional fine-tuning
✓ ~95% cost reduction vs. cloud APIs
✓ Supports Llama, Mistral, Qwen, and more
✓ Free tier, no credit card required
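FLAP's internals aren't published in this listing, so the following is only a minimal, hypothetical sketch of what memory-mapped parameter sharding generally looks like: weights live on disk in a memory-mapped file, and only the active layer's shard is materialized at a time. All names and sizes are illustrative, and the toy example stays on CPU with NumPy rather than touching a real GPU.

```python
# Illustrative sketch of memory-mapped parameter sharding (not FLAP's code).
# Weights live on disk; only one layer's shard is resident at a time.
import numpy as np

hidden, n_layers = 1024, 8                 # toy sizes; real models are far larger
path = "weights.mmap"                      # hypothetical on-disk checkpoint

# One-time setup: write toy fp16 weights to disk (stand-in for a checkpoint).
w = np.memmap(path, dtype=np.float16, mode="w+", shape=(n_layers, hidden, hidden))
w[:] = np.float16(1.0 / hidden)            # exactly representable; keeps activations stable
w.flush()
del w

# Run time: map the file read-only; nothing is actually loaded yet.
weights = np.memmap(path, dtype=np.float16, mode="r", shape=(n_layers, hidden, hidden))

x = np.ones(hidden, dtype=np.float16)
for layer in range(n_layers):
    # Copying one slice pages in only that layer's (hidden x hidden) shard,
    # so peak memory is one shard, never the whole model.
    shard = np.asarray(weights[layer])
    x = shard @ x                          # forward pass through one layer
    del shard                              # free the shard before the next layer
print(x[:4])                               # stays near 1.0 given the 1/hidden weights
```

The same pattern extends to training: stream a shard in, compute its forward/backward contribution, write updates back, and move on, trading disk bandwidth for VRAM.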

Replies

Олесь Пічак
Hey PH! 👋 I built FLAP after paying $400/month to fine-tune models on cloud GPUs for a side project. There had to be a better way. FLAP runs entirely on your local GPU using memory-mapped parameter sharding. A 70B model that would traditionally need 140 GB of VRAM? FLAP handles it on 6 GB. Would love your feedback, especially from anyone who's tried fine-tuning before and hit the cost wall. AMA in the comments!
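To make the 140 GB vs. 6 GB claim concrete, here is the back-of-envelope math. The assumptions are mine, not the maker's: fp16 weights at 2 bytes per parameter, and an 80-layer architecture as in Llama-2-70B. Real training also needs room for activations, gradients, and optimizer state, which the listing doesn't break down.

```python
# Rough VRAM arithmetic behind "70B needs 140GB, FLAP handles it on 6GB".
# Assumptions (mine, not the maker's): fp16 weights, 80 transformer layers.
params = 70e9
bytes_per_param = 2                        # fp16
full_model_gb = params * bytes_per_param / 1e9
print(f"all weights resident: {full_model_gb:.0f} GB")    # 140 GB

n_layers = 80                              # e.g., Llama-2-70B
per_layer_gb = full_model_gb / n_layers
print(f"one layer resident:   {per_layer_gb:.2f} GB")     # ~1.75 GB, well under 6 GB

# Note: activations, gradients, and optimizer state add overhead on top of
# weights; adapter methods (e.g., LoRA) and checkpointing shrink that part.
```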