NexaSDK for Mobile

Easiest solution to deploy multimodal AI to mobile

NexaSDK for Mobile lets developers run the latest multimodal AI models fully on-device in iOS & Android apps, with Apple Neural Engine and Snapdragon NPU acceleration. In just 3 lines of code, build chat, multimodal, search, and audio features with no cloud cost, complete privacy, 2× faster speed, and 9× better energy efficiency.

Zack Li

Hey Product Hunt — I’m Zack Li, CTO and co-founder of Nexa AI 👋

We built NexaSDK for Mobile after watching too many mobile app development teams hit the same wall: the best AI experiences want to use your users’ real context (notes, photos, docs, in-app data)… but pushing that to the cloud is slow, expensive, and uncomfortable from a privacy standpoint. Going fully on-device is the obvious answer — until you try to ship it across iOS + Android with modern multimodal models.

NexaSDK for Mobile is our “make on-device AI shippable” kit. It lets you run state-of-the-art models locally across text + vision + audio with a single SDK, and it’s designed to use the phone’s NPU (the dedicated AI engine) so you get ~2× faster inference and ~9× better energy efficiency — which matters because battery life is important.
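
To give a rough sense of the shape of that integration, here is a minimal sketch. The types and names below are placeholders invented for illustration, not the actual NexaSDK for Mobile API:

```kotlin
// Sketch only: LocalLlm and OnDeviceSdk are placeholder interfaces defined here
// to illustrate the flow; they are NOT the real NexaSDK for Mobile API.
interface LocalLlm {
    fun generate(prompt: String): String          // runs inference fully on-device
}

interface OnDeviceSdk {
    fun loadLlm(modelId: String): LocalLlm        // loads a local model, NPU-accelerated when available
}

// With an SDK shaped like this, a private chat feature is roughly three calls:
fun answerLocally(sdk: OnDeviceSdk, userPrompt: String): String {
    val llm = sdk.loadLlm("some-small-llm")       // 1. load the model ("some-small-llm" is a made-up id)
    val reply = llm.generate(userPrompt)          // 2. generate locally; nothing leaves the phone
    return reply                                  // 3. use the result in the app
}
```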

What you can build quickly:

  • On-device LLM copilots over user data (messages/notes/files) — private by default

  • Multimodal understanding (what’s on screen / in camera frames) fully offline

  • Speech recognition for low-latency transcription & voice commands (see the transcription sketch after this list)

  • Plus: no cloud API cost, day-0 model support, and one SDK across iOS/Android
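
As a concrete example of the speech item above, here is a hedged sketch of low-latency, on-device transcription. `LocalAsr` and `transcribeChunk` are placeholder names for this illustration, not the real NexaSDK API:

```kotlin
// Sketch only: LocalAsr is a placeholder interface, not the real NexaSDK API.
// It illustrates feeding audio buffers to an on-device ASR model for live captions.
interface LocalAsr {
    fun transcribeChunk(pcm16: ShortArray): String   // partial transcript for one audio buffer
}

// Feed microphone buffers as they arrive and surface partial results immediately,
// so the UI updates with low latency and no network round trip.
fun liveCaptions(asr: LocalAsr, audioChunks: Sequence<ShortArray>, onPartial: (String) -> Unit) {
    for (chunk in audioChunks) {
        onPartial(asr.transcribeChunk(chunk))
    }
}
```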

Try it today at https://sdk.nexa.ai/mobile. I’d love your real feedback:

  1. What’s the first on-device feature you’d ship if it was easy?

  2. What’s your biggest blocker today — model support, UX patterns, or performance/battery?

Alan Zhu

@zack_learner We look forward to hearing everyone's feedback! Feel free to ask us any questions.

Masum Parvej

@zack_learner How does NexaSDK handle different NPUs across devices? Is performance consistent on older phones too?

Alan Zhu

@zack_learner  @masump Great question. NexaSDK uses our NexaML runtime as an abstraction layer: at runtime we detect the device’s available accelerators (Apple Neural Engine / Snapdragon NPU / GPU / CPU) and route each model/operator to the best backend for that device. Same app code — the SDK handles the hardware differences.

On older phones, performance won’t be identical (hardware is the bottleneck), but it’s predictable: we automatically fall back to GPU/CPU when an NPU isn’t available.
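
Conceptually, that routing boils down to "prefer the NPU, otherwise fall back." A minimal sketch of the idea, with types defined here purely for illustration (they are not NexaML/NexaSDK classes):

```kotlin
// Illustrative types defined for this sketch; not NexaSDK/NexaML classes.
enum class Backend { NPU, GPU, CPU }               // ordered from most to least preferred

interface Accelerator {
    val backend: Backend
    fun isAvailable(): Boolean                     // e.g. probe for ANE / Snapdragon NPU support
}

// Pick the best available backend at runtime; app code never changes per device.
fun selectBackend(accelerators: List<Accelerator>): Backend =
    accelerators
        .filter { it.isAvailable() }
        .minByOrNull { it.backend.ordinal }        // NPU wins if present, then GPU, then CPU
        ?.backend ?: Backend.CPU                   // CPU is the universal fallback
```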

Zack Li

@masump We support Android with the Snapdragon Gen 4 NPU (e.g., the Samsung S25), plus iOS & macOS (iPhone 12+). Performance is consistent across all compatible devices. For devices without NPU support, you can also use the GPU & CPU version of the SDK.

Anton Gubarenko

@zack_learner Congrats! Very anticipated. I’ve been waiting for an SDK like this with dynamic LLM loading. Please add text/image generation examples to the docs, for Qwen for example. It will be very popular.

Zack Li

@anton_gubarenko Yes, we support Qwen models, see: https://docs.nexa.ai/nexa-sdk-android/overview#supported-models
Qwen3-4B is supported on PC/mobile, and Qwen3-VL is supported on PC.

Nikita Savchenko

Local AI models are definitely the future! Wondering how you price this product. Is it free? Because pricing something that runs locally is quite tricky.

Zack Li

@nikitaeverywhere Yes, NexaSDK is free; we only charge for large enterprise adoption of NPU inference.

Alan Zhu

@nikitaeverywhere Yes, this product is free for you to use! We believe local AI will be in every device in the future. Please feel free to share your feedback.

Anton Loss

Very impressive! So, you re-package models to make them compatible with different devices? What is `NEXA_TOKEN` needed for? Maybe you could quickly explain how it works and which models are available?

Alan Zhu

@avloss Thank you Anton for the support. We optimize the models so they are compatible with, and accelerated on, different Android and iOS devices. We are the only SDK that can run SOTA models on the NPU. `NEXA_TOKEN` is needed to activate the SDK; you can get an access token for free from our website. We support all types of the latest AI models:

  • LLM: Granite, Liquid

  • Multimodal (audio + vision): OmniNeural

  • ASR: Parakeet

  • Embedding: EmbedNeural (multimodal)

  • OCR: PaddleOCR

Zack Li

@avloss We have an internal conversion pipeline and quantization algorithm that make models compatible with different devices. For NPU inference on PC, `NEXA_TOKEN` is needed the first time to validate the device, since NPU inference is free only for individual developers.
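
As a sketch of that "validate once" flow (placeholder code illustrating the pattern, not NexaSDK’s actual activation API), the token check only needs to reach the network on first run:

```kotlin
import java.io.File

// Sketch only: not NexaSDK's activation API, just the "validate once, then cache"
// pattern described above. The cache file and callback are made up for illustration.
fun ensureActivated(
    nexaToken: String,
    cacheFile: File,                               // e.g. a file in the app's private storage
    validateRemotely: (String) -> Boolean          // one-time check against a license server
): Boolean {
    if (cacheFile.exists()) return true            // already validated on this device
    val ok = validateRemotely(nexaToken)
    if (ok) cacheFile.writeText("activated")       // later runs (even offline) skip re-validation
    return ok
}
```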

Lynn Li

On-device is clearly the right answer for anything touching real user context — notes, messages, photos, screen content. Curious to see how teams use this for:

  • screen-aware copilots

  • offline multimodal assistants

  • privacy-sensitive workflows

  • ...

Alan Zhu

@llnx Exactly! On-device AI will be powering every app by 2030!

Zack Li

@llnx I cannot agree more, thanks!

Kate Ramakaieva

Nice work, team 👏 How deeply do you integrate with Apple’s NPU? Is this Core ML-based or a custom runtime?

Alan Zhu

@kate_ramakaieva We are using Core ML, but we built our inference engine from scratch. We are the only SDK that can support the latest models on the Apple NPU.

Zack Li

@kate_ramakaieva We use a custom pipeline and built our inference engine ourselves. We only leveraged some low-level APIs in Core ML.

Helena

Congrats on the launch!

Alan Zhu

@hehe6z Thank you, Helena!

Zack Li

@hehe6z Thank you, and we look forward to more feedback!

Polman Trudo

What types of models are supported today (e.g., language, vision, speech), and how easy is it to bring your own model?

Zack Li

@polman_trudo We support language (LLMs), vision (VLMs and CV models), and speech (ASR models). Our SDK has a converter to support bringing your own model for enterprise customers.

Alan Zhu

@polman_trudo We support almost all model types and tasks: vision, language, ASR, and embedding models. NexaSDK is the only SDK that supports the latest, state-of-the-art models on NPU, GPU, and CPU. It is easy to bring your own model, and we will release an easy-to-use converter tool soon.
