Empromptu AI - Train Fine Tuned Models With AI Apps You're Already Building
Most AI apps launch on someone else’s model and stay there forever. Empromptu AI turns live AI features into custom models you own. As your app runs, Empromptu AI captures real-world usage, human corrections, and edge cases from live AI workflows, then uses that signal to train a custom model you own. Improve accuracy, lower inference costs, and stop depending forever on rented intelligence from the same providers moving into your category.


Replies
Empromptu AI
@lakshminath_dondeti You can only fine-tune OSS models. The frontier model providers are deprecating the ability to fine-tune their models. 😬 That's one of the reasons we're making this available. But Empromptu is totally model agnostic.
Empromptu AI
@lakshminath_dondeti To add some architecture context, the fine tuned model you build on Alchemy is your asset regardless of what happens upstream with frontier models. When a new frontier model drops, you're not starting over. Your domain knowledge, your corrections, your edge cases are portable. We can use a newer base and retrain on your existing data. The expertise your team captured doesn't deprecate with the model version.
@lakshminath_dondeti The dataset you build up and fine-tune on is the real value that accumulates as your app is used. Alchemy lets you turn that into a fine-tuned model, and when a new model releases you can quickly fine-tune it on your existing data.
Empromptu AI
@lakshminath_dondeti We want to remove complexity and enhance outcomes from building with AI, sort of universally. What we would suggest is that because what you have is a dataset created by the 'shape' of the AI function and its trained outcomes, we can take that dataset and re-train it using the new model at an accelerated rate because the judgment exhibited in training is what creates the signal that matters, not the foundation model provider.
Optimizing our platform in this way helps us to refocus the conversation from "which model" to "what's the best/most optimal outcome" of the AI function, and to engineer the solution/function set accordingly!
Empromptu AI
I’m Shanea, co-founder and CEO of @Empromptu AI
We built Empromptu AI's Alchemy because we believe the next phase of AI is not just building apps faster.
It is building AI that learns how your business works.
Right now, everyone is rushing to learn or add AI whether you are someone trying to figure out how to survive as an employee or take your expertise and monetize it. Everyone is plugging into the same frontier models, shipping the same generic workflows, and calling it a moat. But if everyone is using the same intelligence, no one is differentiated for long.
We're changing that.
With Empromptu AI's Alchemy you can fine-tune a model with no ML expertise simply by building an AI application. Our platform automatically captures customer usage, your corrections as a subject matter expert, edge cases, and application feedback. Alchemy turns those signals into a fine-tuned model that can keep itself up to date. Yes, a self-learning, self-improving AI that doesn't cost trillions.
The simple version:
You've spent the last 5-10 years at your job learning really valuable skills, whether it's engineering, content, or, even more insane, a highly regulated or specialized field.
Now your AI can learn from you so you can own your expertise.
This matters because the best knowledge usually lives inside people’s heads. The accountant knows the exception. The support lead knows when an escalation is real. The operator knows the edge case. The product team knows what “good” actually looks like.
Alchemy gives everyone a way to turn that expertise into AI that self-improves, with up to 98% accuracy.
Thank you for checking us out today. I’d love your feedback, questions, and brutal honesty.
@shanealeven Congrats on the launch team. How do you stop bad user feedback, noisy corrections, or outdated domain assumptions from being absorbed into the tuning loop
Empromptu AI
@zolani_matebese First, the user can set the eval and we can automatically optimize towards that goal. As a backup, you can actually go in and correct the training set manually with natural language.
Instrumenting production app usage as a fine-tuning data source is genuinely clever. You avoid the cold start problem of manually curating datasets that don't reflect real user behavior. We hit that exact wall building our AI features and ended up with synthetic data that didn't generalize well. What does your quality filtering pipeline look like between raw app interactions and the training checkpoint?
Empromptu AI
@retain_dev Actually, that's kind of the best part, if I do say so myself. Dr. Sean Robinson, my cofounder, has a PhD in computational astrophysics, and he invented a way to get up to 98% accurate outputs out of any model! It's built in to happen completely automatically based on the eval that you write, and that last mile is what you label. Thanks for the compliment. We know the problem: unless you're a founder, the SMEs and AI/ML engineers are usually separate, so you never really get that perfect dataset.
Empromptu AI
@retain_dev To add the engineering layer to what Shanea described, the quality filtering is what makes the self-improving loop actually work in practice rather than in theory.
The eval you write upfront becomes the ground truth signal. Every production interaction gets scored against it automatically. What surfaces for labeling isn't a random sample of outputs, it's specifically the cases that fell outside your accuracy threshold. You're not reviewing everything, you're reviewing the exact delta between what the model did and what your domain requires.
That's why the labeled dataset stays small and high signal over time. The model improves, the eval catches the new edge cases, and you're always training on the real distribution of your own users rather than synthetic approximations.
The part that resonates with what you described about SMEs and ML eng being separate is exactly the failure mode we designed around. The eval layer is built to be owned by the domain expert, not the ML team. The person who knows what correct looks like is the one defining the signal, not translating it through an eng team.
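To make the filtering step above concrete, here is a minimal sketch of scoring logged interactions against an eval and surfacing only the below-threshold cases for labeling. It is an illustration under stated assumptions, not Empromptu's implementation: `Interaction`, `split_by_eval`, `toy_eval`, the keyword rule, and the threshold value are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Interaction:
    """One logged production call (hypothetical shape)."""
    prompt: str
    output: str


def split_by_eval(
    interactions: List[Interaction],
    eval_fn: Callable[[Interaction], float],
    threshold: float,
) -> Tuple[List[Interaction], List[Interaction]]:
    """Score every interaction and return (passed, needs_label)."""
    passed, needs_label = [], []
    for item in interactions:
        if eval_fn(item) >= threshold:
            passed.append(item)
        else:
            # Only these cases are shown to the domain expert.
            needs_label.append(item)
    return passed, needs_label


# Hypothetical eval owned by the domain expert:
# a support reply must mention both required terms.
REQUIRED_TERMS = ("refund", "5 business days")


def toy_eval(item: Interaction) -> float:
    text = item.output.lower()
    hits = sum(term in text for term in REQUIRED_TERMS)
    return hits / len(REQUIRED_TERMS)


logs = [
    Interaction("Where is my refund?", "Your refund arrives within 5 business days."),
    Interaction("Refund status?", "We are looking into it."),
]
passed, needs_label = split_by_eval(logs, toy_eval, threshold=1.0)
print(len(passed), len(needs_label))
```

The design choice worth noting is that the review queue is defined by the eval rather than by random sampling, so its size shrinks as the model improves.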
Empromptu AI
@retain_dev @sean_robinson1 Yes. Second what Sean said. We think that this is truly overlooked in the tools out there today.
Empromptu AI
@retain_dev Appreciate that you're sharing this insight!
We train on raw app interactions, actually. We find that the real-world edge case training often provides the most unique and distinguishing insights that help us to effectively compress training timelines. We often will say "your model is unique because of these edge cases," whereas most people generally find them intrusive or something they should try to mitigate.
The reality is quite the opposite.
Wion - Audio Dating
Empromptu AI
@tanjum thank you so much yes. We always try to make everything we do accessible.
Empromptu AI
@tanjum Two years of production deployments across healthcare, retail, and financial workflows is what shaped the architecture. The edge cases you only hit in production are exactly what we built around.
CodeSee
@tanjum I’ve seen this in my work too. It’s been hard with the tools to help non technical folks get excited about labeling. But it really is required to get that high level of accuracy
Empromptu AI
@tanjum @joshua_leven thanks so much for the support it's been really incredible bringing this to life
Empromptu AI
@tanjum Exactly! A really common reaction for us is: I feel like this is the perfect conclusion to my Claude Code / Codex projects, a real deployment environment!
The 'bring your own expertise' angle is the right way to think about the next wave of AI. We struggle constantly with customer support AI missing the nuance of our specific software updates. If this plugs directly into customer usage signals to self-improve, it solves a massive operational headache. Amazing job @shanea_leven
Congrats on the launch 🎉
Curious, when the dynamic prompt optimization kicks in after 30 runs, does the user get visibility into what changed, or does it just happen silently in the background?
Empromptu AI
@boyuan_deng1 The user absolutely gets visibility into what runs. You can also do this manually as well for all the tinkerers out there.
Empromptu AI
@boyuan_deng1 And the visibility is intentional from an architecture standpoint. A model you can't inspect is one you can't trust in production. You should always know what signal drove a change before it goes into training.
Empromptu AI
@boyuan_deng1 Our users always know what's happening and why -- we have evaluations and audits built in from the beginning, not added in as an afterthought.
Earth.fm
Empromptu AI
@1mirul Thanks so much. If you own your asset, you should be able to decide what you do with it, whether you compete or whether you decide to sell that asset, but you and everyone else should be able to capitalize on the data you own. Your data is getting scraped and captured anyway. You should at least be compensated.
Empromptu AI
@1mirul The compounding part is what makes it structurally different. Most AI deployments get smarter for the vendor. This one gets smarter for you. That asymmetry is the whole point.
Empromptu AI
@1mirul Thanks! We simply believe there's a better way to build great, value-additive functions that are governed entirely by AI, and that the discussion about 'AI costs' is actually a discussion about implementation discipline and tightly controlling deployments around known workflows instead of chaotic experimentation everywhere.
I like the part abt capturing corrections and edge cases from real usage. That feels more useful than trying to guess everything upfront. One thing I wonder, how do you keep the model from learning the wrong patterns when user feedback is inconsistent or when diff experts correct the same situation in diff ways?
Empromptu AI
@busra_seker1 that's a great question. There is one ground truth so SMEs can override customers but SMEs have to agree what is ground truth
Empromptu AI
@busra_seker1 Exactly right on the ground truth architecture. On the inconsistency problem specifically, that's where the eval becomes the arbitration layer. Conflicting corrections don't both make it through, the eval scores against a defined expected outcome so noise and contradictions get filtered before they touch training. The model learns from signal that passed a quality bar, not raw feedback volume.
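One way to picture the arbitration described above is the sketch below: corrections that fail the eval are dropped, and when several survive for the same input, an SME correction outranks a customer one. This is an assumption-laden illustration, not the platform's actual logic; `Correction`, `arbitrate`, `EXPECTED`, and the `"sme"`/`"customer"` labels are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Correction:
    input_id: str
    corrected_output: str
    source: str  # "sme" or "customer" (hypothetical labels)


def arbitrate(
    corrections: List[Correction],
    eval_fn: Callable[[Correction], float],
    threshold: float,
) -> Dict[str, Correction]:
    """Keep at most one correction per input: it must pass the eval,
    and SME corrections beat customer corrections."""
    best: Dict[str, Tuple[Tuple[int, float], Correction]] = {}
    for c in corrections:
        score = eval_fn(c)
        if score < threshold:
            continue  # noise never reaches the training set
        rank = (1 if c.source == "sme" else 0, score)
        if c.input_id not in best or rank > best[c.input_id][0]:
            best[c.input_id] = (rank, c)
    return {input_id: entry[1] for input_id, entry in best.items()}


# Hypothetical expected outcomes that the eval checks against.
EXPECTED = {"t1": "billing", "t2": "escalate"}


def toy_eval(c: Correction) -> float:
    return 1.0 if EXPECTED[c.input_id] in c.corrected_output.lower() else 0.0


chosen = arbitrate(
    [
        Correction("t1", "Route to billing", "customer"),
        Correction("t1", "Route to billing and flag as urgent", "sme"),
        Correction("t1", "Route to sales", "customer"),
        Correction("t2", "Close the ticket", "customer"),
    ],
    toy_eval,
    threshold=1.0,
)
print(sorted(chosen))
```

Here the contradictory "Route to sales" correction and the failing `t2` correction are filtered out before training, which matches the idea that volume of feedback does not matter, only feedback that clears the quality bar.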
Empromptu AI
@busra_seker1 @sean_robinson1 Evals are the most important thing, and yet some tools make evals really inefficient for people learning this technology to access and take advantage of their true power.
Empromptu AI
@busra_seker1 I think that what you're talking about is like the '7 out of 10 dentists' sort of conjecturing. It's important to note that if policy enables flexibility for, say, a healthcare provider in making recommendations based on particular signals, it would seem only appropriate that the governance works in a similar way, within a tightly banded range of outcome likelihoods.
We're focused on optimizing toward outcomes, and find that helping people focus on results instead of the process itself helps to create alignment in the directionality of training outcomes. I hope that makes sense! If you'd like to discuss more, we're available for meetings to talk about your specific use case or scenario.
Build Check
This is awesome Shanea! Wish you all the best on this impressive launch
Empromptu AI
@german_merlo1 Thank you so much. Excited to get this out to the world.
Empromptu AI
@german_merlo1 Thank you Germán!
@german_merlo1 Thank you!
Empromptu AI
@german_merlo1 Thanks for the support!
@shanea_leven Congratulations on the launch.
One thing I’m trying to understand from your positioning. If the underlying model providers keep improving rapidly every few months, how do you measure whether the gains your customers see are actually coming from Empromptu’s learning layer versus improvements in the foundation model itself?
It seems like that’s a pretty important distinction because both could lead to better outputs over time, but only one creates a real competitive advantage for the customer. Are you able to quantify that difference in a meaningful way?
Empromptu AI
@shanea_leven @moh_codokiai yes absolutely you will actually see the performance and accuracy improvements directly in the product. And frontier models have deprecated the ability for you to fine tune their models any more. Also a model trained on your data for your product and your users is always going to be eventually more accurate than a general model.
@shanea_leven @shanealeven That makes sense. I agree a model trained on a company’s own users should eventually outperform a general-purpose model for that specific use case.
What I’m curious about is how you prove that improvement to customers. Do you have any benchmarking or evaluation framework that shows accuracy before and after the learning process, or is the validation mainly based on production outcomes and user feedback?
Empromptu AI
@shanea_leven @moh_codokiai yes! It's literally built into the product and you can see in our optimizer the performance. It's actually one of the things that my co-founder invented. Our optimizer is what our entire platform is based on. We have benchmarks that models trained on our platform are 30% more accurate than frontier models
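For readers who want to separate the two effects raised in this exchange, one possible approach is to score three models on the same fixed eval set: the old base model, the new base model, and the new base fine-tuned on your data. The sketch below is a hedged illustration, not a description of Empromptu's optimizer; `accuracy`, `attribute_gains`, and the toy dictionary "models" are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

EvalSet = List[Tuple[str, str]]  # (input, expected output)


def accuracy(predict: Callable[[str], str], eval_set: EvalSet) -> float:
    """Fraction of eval examples the model answers exactly right."""
    correct = sum(1 for x, expected in eval_set if predict(x) == expected)
    return correct / len(eval_set)


def attribute_gains(
    old_base: Callable[[str], str],
    new_base: Callable[[str], str],
    new_tuned: Callable[[str], str],
    eval_set: EvalSet,
) -> Dict[str, float]:
    """Split the total improvement into a foundation-model part and a
    fine-tuning part, measured on the same held-out eval set."""
    a_old = accuracy(old_base, eval_set)
    a_new = accuracy(new_base, eval_set)
    a_tuned = accuracy(new_tuned, eval_set)
    return {
        "foundation_gain": a_new - a_old,
        "tuning_gain": a_tuned - a_new,
    }


# Toy stand-ins for real models: lookup tables over four eval inputs.
eval_set = [("a", "1"), ("b", "2"), ("c", "3"), ("d", "4")]
old_base = {"a": "1"}.get                                # 1 of 4 correct
new_base = {"a": "1", "b": "2"}.get                      # 2 of 4 correct
new_tuned = {"a": "1", "b": "2", "c": "3", "d": "4"}.get  # 4 of 4 correct

gains = attribute_gains(old_base, new_base, new_tuned, eval_set)
print(gains)
```

Keeping the eval set fixed across all three runs is what makes the comparison meaningful; if the eval changes between runs, the two gains can no longer be told apart.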
Empromptu AI
@shanea_leven @moh_codokiai First, we're model agnostic -- the user can specify whatever they'd like to use, and we'll adapt the 'baseline' accordingly. For our users, the difference often comes from the other efficiencies, such as the training, vertical integrations, and other optimization components. Combined, these make an enormous difference both in the quality of life they experience while building (setting up actual databases, auth sequencing, et cetera, with real-world best practices) and in doing all of the AI-focused functionality from a template-based system that lets me say something like: "I want to build a growth function, and that's going to be me following up with leads we haven't talked to in more than 3 weeks, and I want that sort of outreach to look like this."
I haven't met a business leader yet who wants to replace someone on their team with a system that does that at 98% accuracy. And that's the difference they walk away remembering.