FLAP - Fine-tune any LLM (even 100B+) on your GPU with zero cloud costs
Fine-tune any LLM on your local GPU, no cloud required.
FLAP uses memory-mapped sharding to train models from 1B
to 670B+ on as little as 6 GB VRAM. Your data never leaves
your machine. No per-hour GPU bills. No vendor lock-in.
✓ 21.5× faster than traditional fine-tuning
✓ ~95% cost reduction vs cloud APIs
✓ Supports Llama, Mistral, Qwen, and more
✓ Free tier, no credit card required
Replies
Maker 📌 (pinned)
Hey PH! 👋
I built FLAP after paying $400/month to fine-tune models
on cloud GPUs for a side project. There had to be a
better way.
FLAP runs entirely on your local GPU using memory-mapped parameter
sharding: the weights stay on disk and are streamed into VRAM a
shard at a time. A 70B model that would traditionally need 140GB
of VRAM? FLAP handles it on 6GB.
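For anyone curious what memory-mapped parameter sharding can look
like in practice, here's a rough sketch of the general idea. This is
illustrative only, not FLAP's internals: SHARD_DIR, layer_XX.npy, and
apply_layer are placeholder names.

# Illustrative sketch of memory-mapped parameter sharding -- NOT
# FLAP's actual code. SHARD_DIR, layer_XX.npy, and apply_layer are
# made-up names for this example.
import numpy as np
import torch

SHARD_DIR = "model_shards"   # hypothetical directory with one .npy shard per layer
NUM_LAYERS = 80              # e.g. a 70B-class transformer

def load_layer(i, device="cuda"):
    # open_memmap keeps the shard on disk and only pages in what we
    # touch; we then copy a single layer's weights to the GPU.
    w = np.lib.format.open_memmap(f"{SHARD_DIR}/layer_{i:02d}.npy", mode="r")
    return torch.from_numpy(np.array(w)).to(device)

def forward_sharded(hidden, apply_layer):
    # Stream the model layer by layer: only one layer's weights live
    # in VRAM at a time, so peak GPU memory is roughly one shard plus
    # activations, not the full set of parameters.
    for i in range(NUM_LAYERS):
        weights = load_layer(i)
        hidden = apply_layer(hidden, weights)  # hypothetical per-layer forward
        del weights
        torch.cuda.empty_cache()
    return hidden

For fine-tuning, a common pattern on top of this is to keep only a
small set of trainable parameters (e.g. adapters) resident on the GPU
while the frozen base weights are streamed like this.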
Would love your feedback, especially from anyone who's tried
fine-tuning before and hit the cost wall.
AMA in the comments!