We run the AI on-device so your weather never leaves your phone
A lot of apps have bolted on "AI weather insights" in the last year. Almost all of them work the same way: the app sends your location, your recent queries, and the weather context to an OpenAI or Anthropic endpoint, then prints what comes back.
That's fine for a novelty. It's not fine for a tool you check every morning, in every room of your house, with your location pinned to your exact coordinates.
DewLogic takes the opposite approach. The primary LLM path is local inference running inside the app. Cloud providers are a fallback, not the default.
What that actually means
When you ask DewLogic "should I cover my tomatoes tonight?" the question, your location, and your forecast never leave the device. A quantized language model (GGUF format, running on llama.cpp via Rust) reads the weather context and writes the answer on your own CPU or GPU.
No API bill. No account sign-up. No telemetry on what you asked. Works on a plane.
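
To make that concrete, here's a rough Rust sketch of the call path. The names (WeatherContext, InferenceEngine, ask_dewlogic) are illustrative rather than our actual dewlogic_core API, but the shape is right: question and forecast go into one in-process function, an answer comes back, and nothing touches the network.

// Illustrative only: these names are not the real dewlogic_core API.
// The point is the shape - the whole round trip happens in one process.

/// Forecast facts assembled on-device. None of this is sent anywhere.
pub struct WeatherContext {
    pub overnight_low_c: f32,
    pub dew_point_c: f32,
    pub frost_risk: bool,
}

/// Anything that can complete a prompt locally (llama.cpp-backed in the app).
pub trait InferenceEngine {
    fn generate(&self, prompt: &str, max_tokens: usize) -> String;
}

pub fn ask_dewlogic(engine: &dyn InferenceEngine, question: &str, ctx: &WeatherContext) -> String {
    // The forecast is injected as plain structured text, so the model
    // summarizes facts it was handed instead of guessing at them.
    let prompt = format!(
        "Tonight's forecast: low {:.1} C, dew point {:.1} C, frost risk: {}.\n\
         User question: {}\n\
         Answer in two sentences:",
        ctx.overnight_low_c, ctx.dew_point_c, ctx.frost_risk, question
    );
    engine.generate(&prompt, 256)
}

A canned-response mock of that trait is also all you need to exercise the rest of the pipeline in tests without loading a multi-gigabyte model.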
The stack
Dart (Flutter UI)
    |
    | flutter_rust_bridge FFI
    v
Rust (dewlogic_core)
    |
    | momusdev_bridge
    v
TaskEngine -> InferenceEngine -> llama_cpp (GGUF)
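
One thing the diagram undersells: the FFI layer is thin. flutter_rust_bridge generates Dart bindings from public Rust functions, so the UI's "ask" call lands directly in dewlogic_core as an ordinary function call. An illustrative sketch, not our real signature:

// Illustrative sketch of the FFI boundary, not the real dewlogic_core signature.
// flutter_rust_bridge generates a matching Dart method from a public Rust
// function like this one.
pub async fn ask_assistant(question: String, location_id: String) -> String {
    // In the app this dispatches through momusdev_bridge to the TaskEngine,
    // which assembles the weather context and runs local inference.
    // Stubbed here so the sketch stands on its own.
    format!("(local answer for \"{question}\" at {location_id})")
}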
The model loads at startup through an asset manager we built specifically to handle multi-gigabyte downloads, integrity checks, and swapping between models without restarting the app. You can pick which model runs. Smaller ones (1 to 3B params) are fast enough for live conversation on mid-range phones. Larger ones (7 to 13B) run on desktops and newer flagships.
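
For the curious, here's a compressed sketch of how a swap can work without a restart. The byte-count comparison is a stand-in for the real hash check and the download itself is omitted, but the Arc/RwLock shape is the part that matters: a generation already running keeps its handle to the old model, while the next request picks up the new one.

// Hypothetical sketch of a model registry; names and checks are placeholders.
use std::fs;
use std::path::{Path, PathBuf};
use std::sync::{Arc, RwLock};

pub struct ModelManifest {
    pub expected_bytes: u64, // stand-in for a cryptographic checksum
}

pub struct LoadedModel {
    pub path: PathBuf, // in the app this would wrap a llama.cpp context
}

#[derive(Clone)]
pub struct ModelRegistry {
    active: Arc<RwLock<Option<Arc<LoadedModel>>>>,
}

impl ModelRegistry {
    pub fn new() -> Self {
        Self { active: Arc::new(RwLock::new(None)) }
    }

    /// Integrity check: cheap size comparison standing in for a hash check.
    fn verify(path: &Path, manifest: &ModelManifest) -> std::io::Result<bool> {
        Ok(fs::metadata(path)?.len() == manifest.expected_bytes)
    }

    /// Inference threads grab the current model; a swap never yanks it out
    /// from under a generation already running on the old Arc.
    pub fn current(&self) -> Option<Arc<LoadedModel>> {
        self.active.read().unwrap().clone()
    }

    /// Verify the downloaded file, then swap it in. No restart required:
    /// the next request simply sees the new model.
    pub fn swap_in(&self, path: PathBuf, manifest: &ModelManifest) -> std::io::Result<()> {
        if !Self::verify(&path, manifest)? {
            return Err(std::io::Error::new(
                std::io::ErrorKind::InvalidData,
                "model file failed integrity check",
            ));
        }
        *self.active.write().unwrap() = Some(Arc::new(LoadedModel { path }));
        Ok(())
    }
}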
Why go to the trouble
Three reasons, in order:
1. Privacy. Weather data is location data, and location data over time is behavior data. We don't want that pipe, and we don't want to be the company that eventually sells that pipe to someone else.
2. Availability. The places where specialty weather apps matter most (farms, trails, boats, backcountry, construction sites, remote properties) are the places with the worst cell coverage. A weather assistant that stops working when LTE drops is worse than no assistant.
3. Cost to the user. Cloud LLM calls add up. A free app that quietly pays a cloud provider $0.02 per query is eventually going to serve you ads, cap your usage, or put up a subscription wall. Local inference removes that entire category of decision.
Trade-offs, honestly
App size is larger because models have to live somewhere. We keep the base install reasonable and let you download your preferred model on first run.
First-token latency is worse than a server with an H100. It's fine in practice, but if you're used to ChatGPT's speed on a data-center GPU, adjust your expectations.
Quality ceiling is lower than a frontier cloud model. A 7B quantized model is not GPT-5. For weather reasoning tasks (summarizing a forecast, scoring conditions for an activity, explaining what a pressure trend means) it's more than enough, and we can feed it structured context from the Decision Engine so it doesn't need to invent facts.
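
That last point does a lot of the work. The Decision Engine computes the numbers; the model only has to put them into words. Roughly like this (field names are illustrative, not our real schema):

// Illustrative only: the real Decision Engine schema isn't shown in this post.
// Scores and factors are computed deterministically in Rust, then rendered as
// plain text the model is asked to explain - it never derives a number itself.

pub struct ActivityAssessment {
    pub activity: String,        // e.g. "morning trail run"
    pub score: u8,               // 0-100, computed by the Decision Engine
    pub limiting_factor: String, // e.g. "wind gusts to 45 km/h after 09:00"
}

pub fn render_context(a: &ActivityAssessment) -> String {
    // The prompt carries both the conclusion and the reason, so the model's
    // job is phrasing, not meteorology.
    format!(
        "Assessment: {} scores {}/100. Limiting factor: {}.\n\
         Explain this to the user in plain language without adding new facts.",
        a.activity, a.score, a.limiting_factor
    )
}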
Cloud as fallback
If you want to use Claude, GPT, or Gemini, you can plug in your own API key. Some tasks (long narrative summaries, complex multi-factor analysis) genuinely benefit from a larger model. That path exists. It just isn't the default, and we aren't in the middle of it.
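
Mechanically, the routing is a small local decision, not a service sitting between you and the provider. A sketch of the idea (the provider list and the opt-in shape are simplified):

// Hypothetical routing sketch. The cloud call itself isn't shown; the point
// is that local is the default branch, and cloud only runs when the task is
// marked heavy AND the user has supplied their own key.

pub enum CloudProvider { Claude, Gpt, Gemini }

pub enum Backend {
    Local,
    Cloud { provider: CloudProvider, api_key: String },
}

pub struct TaskRequest {
    pub prompt: String,
    /// Set for long narrative summaries and complex multi-factor analysis.
    pub heavy: bool,
}

pub fn choose_backend(req: &TaskRequest, user_key: Option<(CloudProvider, String)>) -> Backend {
    match (req.heavy, user_key) {
        // Cloud is opt-in twice over: the task has to benefit from it and
        // the user has to have brought a key. Everything else stays local.
        (true, Some((provider, api_key))) => Backend::Cloud { provider, api_key },
        _ => Backend::Local,
    }
}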
Running local inference in a consumer app has more footguns than you'd expect. Happy to go deeper on model selection, quantization trade-offs, or how we handle model switching without restarting.
Good morning, everyone! It's launch day!
I'm here all day to answer your questions, so feel free to drop me a line!