I kept bouncing between ChatGPT, Gemini, and Grok for stuff that didn't need a massive model and all of them wanted my data or my money. n0x runs in your browser LLM inference on your GPU, document Q&A, web search, Python execution, image gen, memory. Pick a model, start using it. Nothing leaves your machine. Works offline after first visit. Supports Ollama and any OpenAI-compatible API for bigger models.
This is the 2nd launch from N0X. View more
N0X
Launching today
Since launch 1: added a ReAct agent loop that runs tools
(web search, Python, docs, image gen) autonomously. Hybrid
RAG now uses BM25 + vector search fused with RRF + MMR
reranking way better recall than vector-only. Auto-routing
picks local vs cloud per message complexity. Chrome AI
(Gemini Nano, zero download) works alongside WebGPU, Ollama,
and any OpenAI-compatible endpoint. Persistent semantic
memory across sessions. GPU-tier detection now blocks models
that'd OOM your device.

Free
Launch Team / Built With




