Launched this week

ReteAi
Your own AI. On your Mac. No subscription, ever.
8 followers
Mesh your Apple devices into a local AI supercomputer. Rete runs models like Llama and Mistral on your Mac — and pools compute from your iPhone, iPad, or friends' Macs to handle larger models. Fully private: conversations stay on your network. Works offline. No accounts, no subscriptions, no data harvesting. One payment of $20, yours forever. Stop renting AI and start owning it.


jared.so
The "mesh your Apple devices" angle is what differentiates this from every other local-LLM wrapper — pooling compute is the missing infrastructure piece. Curious what the latency looks like once you bring in a non-LAN device (a friend's Mac across the internet): is throughput still usable, or does it degrade fast enough that the mesh ends up being effectively same-network only in practice?
@mcarmonas Yes, it degrades; your instinct about the math is right. LAN is near-free (~2s per response with local Metal), but the WAN path through our relay hits the per-token RTT wall hard. We see ~5s/token for short responses cross-internet, exactly the compounding you'd predict.
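If it helps, here's the shape of that compounding as a toy model. This is a minimal sketch, not our actual scheduler; only the ~2s-per-response and ~5s/token figures come from the measurements above, and everything else is a placeholder assumption:

```swift
import Foundation

// Toy decode-latency model. Assumption: each generated token pays model
// compute plus the network cost of crossing the device boundary, so
// network overhead compounds linearly with output length.
func responseSeconds(tokens: Int, computePerToken: Double, networkPerToken: Double) -> Double {
    Double(tokens) * (computePerToken + networkPerToken)
}

// Placeholder per-token costs: LAN transfer is negligible next to Metal
// compute; the WAN figure is the ~5s/token we observe through the relay.
let lan = responseSeconds(tokens: 64, computePerToken: 0.03, networkPerToken: 0.001)
let wan = responseSeconds(tokens: 64, computePerToken: 0.03, networkPerToken: 5.0)
print(String(format: "LAN: %.0f s, WAN: %.0f s", lan, wan)) // LAN: 2 s, WAN: 322 s
```

Same formula both ways; only the network term changes, which is why WAN chat feels unusable for throughput while LAN feels instant.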
So the WAN mesh isn't really for "make my Mac faster." What it's actually for is memory pooling: a 70B+ model that won't fit on one device becomes runnable when a friend's Mac holds half the layers, and at that point 5s/token beats "can't run at all." For models that fit locally, LAN is where the throughput story lands.
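Rough arithmetic on the memory side, assuming 4-bit quantized weights (my back-of-envelope, not a statement about any specific model file; KV cache and activations add overhead on top):

```swift
import Foundation

// Weight footprint: params * bits per param / 8 bytes. Assumes 4-bit
// quantization; actual formats and runtime overhead vary.
func weightGB(params: Double, bitsPerParam: Double) -> Double {
    params * bitsPerParam / 8 / 1e9
}

let total = weightGB(params: 70e9, bitsPerParam: 4) // ~35 GB for a 70B model
let perDevice = total / 2                           // ~17.5 GB when a peer holds half the layers
print(String(format: "70B @ 4-bit: %.1f GB total, %.1f GB per device", total, perDevice))
// ~17.5 GB fits on a 32 GB Mac where ~35 GB would not: that's the
// "runnable at all" win, independent of tokens per second.
```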
So in practice: same-network for chat throughput, cross-internet for big-model access. Two different value props on one substrate.