onLM - Run LLMs on your iPhone — no cloud, fully private

onLM brings LLMs and speech recognition to your iPhone — fully on-device, no cloud, no API keys, no subscriptions. Choose between Apple Intelligence and open-source models from Hugging Face, all running locally via MLX on Apple Silicon. Voice transcription powered by Qwen3-ASR lets you dictate messages without a single byte leaving your device. Record, transcribe, and summarize — all offline. Download models once, use them anywhere. Your conversations and voice data never touch a server.

Hey Product Hunt! I'm Alex, the maker of onLM.

I built this app because every AI chat app I used required sending my conversations to someone else's server. I wanted something that works entirely on my device: private by default, not by policy.

onLM runs large language models and speech recognition directly on your iPhone or iPad using Apple Silicon. No accounts, no API keys, no subscriptions. You download a model once and it's yours.

How it works:
- Pick from open-source models on Hugging Face (4-bit quantized, optimized for mobile) or use Apple Intelligence built into iOS
- Chat works offline, even in airplane mode
- Voice input uses Qwen3-ASR running locally: tap the mic, speak, and the transcription happens on-device. Nothing is uploaded anywhere
- One model in memory at a time to stay within iPhone's RAM limits; the app handles swapping automatically

Why on-device matters: It's not just about privacy. There's zero latency from network round-trips, no rate limits, no outages, and no recurring costs. The trade-off is that on-device models are smaller than cloud ones, but they're getting better fast.

I'd love your feedback. What models would you like to see supported? What features would make this more useful for you?
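For readers curious about the "one model in memory at a time" point in the note above, here is a minimal, language-agnostic sketch of the idea in Python. It is not onLM's actual code: the class name `SingleModelSlot`, the `loaders` mapping, and the model names are all hypothetical, and a real app would load weights through its own runtime rather than plain callables.

```python
from typing import Callable, Dict, Optional


class SingleModelSlot:
    """Holds at most one loaded model at a time (hypothetical helper)."""

    def __init__(self, loaders: Dict[str, Callable[[], object]]) -> None:
        # Maps a model name to a zero-argument function that loads it.
        self._loaders = loaders
        self._name: Optional[str] = None
        self._model: Optional[object] = None

    @property
    def resident(self) -> Optional[str]:
        """Name of the model currently held, or None if the slot is empty."""
        return self._name

    def get(self, name: str) -> object:
        """Return the requested model, swapping out the current one if needed."""
        if name not in self._loaders:
            raise KeyError(f"unknown model: {name}")
        if self._name == name:
            return self._model  # already resident, no reload
        # Drop the reference to the old model before loading the new one,
        # so the slot never holds two models at once.
        self._model = None
        self._name = None
        self._model = self._loaders[name]()
        self._name = name
        return self._model
```

Asking for the chat model and then the speech model would load each once, with only the most recent one kept by the slot. Releasing the old reference before loading is the design choice that matters here: peak memory stays near the size of one model instead of two.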

