Alpie Core

A 4-bit reasoning model with frontier-level performance

Alpie Core is a 32B reasoning model trained, fine-tuned, and served entirely at 4-bit precision. Built with a reasoning-first design, it delivers strong performance in multi-step reasoning and coding while using a fraction of the compute of full-precision models. Alpie Core is open source and OpenAI-compatible, supports long context, and is available via Hugging Face, Ollama, and a hosted API for real-world use.

Chirag Arya

Hey builders

Modern AI keeps getting better, but only if you can afford massive GPUs and memory. We didn’t think that was sustainable or accessible for most builders, so we took a different path.

Alpie Core is a 32B reasoning model trained, fine-tuned, and served entirely at 4-bit precision. It delivers strong multi-step reasoning, coding, and analytical performance while dramatically reducing memory footprint and inference cost, without relying on brute-force scaling.

It supports 65K context, is open source (Apache 2.0), OpenAI-compatible, and runs efficiently on practical, lower-end GPUs. You can use it today via Hugging Face, Ollama, our hosted API, or the 169Pi Playground.
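
If you want a feel for the integration, here’s a minimal sketch of what an OpenAI-compatible call looks like with the official Python client. The base URL and model id below are placeholders, not confirmed values; check our docs for the real ones.

```python
# Minimal sketch of calling an OpenAI-compatible endpoint.
# base_url and model are hypothetical placeholders -- substitute
# the actual values from the docs.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.169pi.example/v1",  # placeholder endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="alpie-core",  # placeholder model id
    messages=[
        {"role": "user", "content": "Walk through 17 * 24 step by step."},
    ],
)
print(response.choices[0].message.content)
```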

To keep you building over Christmas and the New Year, we’re offering 5 million free tokens on your first API usage, so you can test, benchmark, and ship without friction.

This launch brings the model, benchmarks, API access, and infrastructure together in one place, and we’d love feedback from builders, researchers, and infra teams. Questions, critiques, and comparisons are all welcome as we shape v2.

Sujal Meghwal

@chirag_a2207 This is a solid direction: 4-bit end-to-end with 65K context is not easy to get right.

I run a security and adversarial testing practice focused on LLM, API, and inference-time risks (prompt injection, jailbreaks, context poisoning, OpenAI-compatibility gaps, abuse vectors).

If you’re open to it, I'd be happy to do a free adversarial assessment of Alpie Core and share a short report with findings + mitigations.

No pitch, just stress-testing before v2.

Chirag Arya

@sujal_meghwal Really appreciate this, and thanks for the kind words. You’re absolutely right: getting 4-bit end-to-end with long-context stability is non-trivial.

We’d be open to an adversarial assessment, especially ahead of v2. Stress-testing around prompt injection, jailbreaks, and inference-time risks is something we take seriously. Happy to connect and see how we can collaborate and learn from the findings.

Thanks for offering, will reach out to coordinate soon.

Sujal Meghwal

@chirag_a2207 Great. When you’re ready, I can share a short scope outlining what we’d test (prompt injection, jailbreak surfaces, long-context abuse, OpenAI-compat edge cases, inference-time abuse) and the format of the report, so expectations are clear upfront. Happy to adapt it to whatever stage or constraints you’re working with. Looking forward to working with you and your team.

Malek Moumtaz

@chirag_a2207 A 32B model at 4-bit with strong reasoning is impressive. How do you think about the trade-off between aggressive quantization and reasoning reliability, especially on long, multi-step chains or edge cases where small precision errors can compound?

Chirag Arya

@malekmoumtaz That’s a great question, and it’s exactly the trade-off we spent the most time on.

We don’t treat 4-bit as a post-training compression step. Alpie Core is trained, fine-tuned, and served entirely at 4-bit, so the model learns to reason within low-precision constraints instead of being forced into them later. That significantly reduces error accumulation compared to aggressive quantization applied after the fact.
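
For anyone curious what the conventional path looks like, here’s a rough sketch of load-time 4-bit (NF4) quantization with Hugging Face transformers and bitsandbytes. This is the generic post-training route described above, shown for contrast, not a description of our pipeline, and the repo id is illustrative only.

```python
# Generic load-time 4-bit quantization (NF4) with transformers +
# bitsandbytes -- the "compress after training" route, shown for contrast.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",               # 4-bit NormalFloat weights
    bnb_4bit_compute_dtype=torch.bfloat16,   # matmuls still run in bf16
    bnb_4bit_use_double_quant=True,          # also quantize the quant constants
)

# Illustrative repo id, not an actual package reference.
tokenizer = AutoTokenizer.from_pretrained("169Pi/alpie-core")
model = AutoModelForCausalLM.from_pretrained(
    "169Pi/alpie-core",
    quantization_config=bnb_config,
    device_map="auto",
)
```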

That said, long multi-step chains are still where issues surface first. We’ve found that stability depends less on individual arithmetic precision and more on how the model represents intermediate reasoning states. In practice, this means we actively test for instruction drift, compounding errors, and state collapse across long contexts, and we design training and evaluation around those failure modes.

We’re very cautious about claiming “no trade-offs”; the goal is to make the trade-offs explicit and measurable, and to improve with each iteration, especially for long-horizon and edge-case reasoning. We’ll be happy to hear your feedback once you try it out.

Malek Moumtaz

@chirag_a2207 Chirag! Thanks for the clarification! Will let you know my feedback soon.

Koder Kashif

At first it seemed like it could run on any laptop. Hope you keep optimizing so it runs on most laptops.
Chirag Arya

@koderkashif Good question! Right now, it does need GPU VRAM or a fairly high-end CPU to run locally at this scale. That said, we’re actively optimising it further so it can run on more everyday laptops and eventually even phones over time.
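
For a rough sense of the numbers, here’s a back-of-envelope VRAM estimate for a 32B model with 4-bit weights. The layer and head counts below are illustrative assumptions, not our exact architecture.

```python
# Back-of-envelope VRAM estimate: 4-bit weights plus an fp16 KV cache.
# Ignores activations and framework overhead; architecture numbers are
# illustrative, not Alpie Core's actual config.
params = 32e9
weights_gb = params * 0.5 / 1e9          # 4 bits = 0.5 bytes/param -> ~16 GB

ctx, layers, kv_heads, head_dim = 65_536, 64, 8, 128
# K and V caches: ctx * layers * 2 (K+V) * heads * dim * 2 bytes (fp16)
kv_gb = ctx * layers * 2 * kv_heads * head_dim * 2 / 1e9  # ~17 GB at full 65K

print(f"weights ~{weights_gb:.0f} GB, KV cache at 65K ~{kv_gb:.0f} GB")
```

So even with 4-bit weights, a full 65K-context session wants serious VRAM, which is why typical laptop hardware isn’t quite there yet.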

Thanks for your interest. We’re working on this and will share updates as we make progress.

Koder Kashif
@chirag_a2207 Appreciate it.
Peter Shu

What is the strength of this compared to using the API from OpenRouter? Do you think this is better suited for putting into a product, or for development?

Chirag Arya

@peterz_shu Good question, Peter. OpenRouter is great for quick experimentation and model comparison. Alpie Core is designed to be a consistent, production-ready model that you can rely on for development and products, with predictable behaviour, lower latency, and improved cost control.

We’ll be available on OpenRouter soon for easy evaluation. For now, we’re offering a free first API key on our website so teams can test it properly and share feedback.

Mykyta Semenov 🇺🇦🇳🇱

Congratulations on the launch! We are actually developing several AI startups based on ChatGPT. We’ll check out your product with our team; it might be applicable to our tasks.

Chirag Arya

@mykyta_semenov_ Thank you, appreciate that. Do check it out with your team; we’d be happy to jump on a call and share more context if helpful. It’s a state-of-the-art reasoning model at this scale, trained and served entirely at 4-bit, so it can be a good fit for real product workflows.

Looking forward to hearing your thoughts.

Samuel

Congrats on the launch 👏 Alpie Core is interesting, especially as an alternative to larger reasoning models like Llama or Qwen that rely on heavier hardware and higher precision. Running a 32B model at 4-bit with strong reasoning sounds promising, but I’m curious how it holds up on tougher multi-step reasoning and coding benchmarks where higher-precision models usually shine. Are early users seeing better value in cost efficiency, or in being able to run it on lower-end GPUs? What’s been the most surprising comparison result so far when benchmarking against other open-source reasoning models?