How are you building AI that takes actions — not just answers?

Amarsia

3mo ago

We've been getting the same request over and over from our users: "My AI gives great answers, but it can't actually do anything."

It got us thinking — most AI integrations today are still essentially fancy search boxes. The AI talks, the human acts. But the real unlock is when the AI can close the loop itself — query the database, send the email, update the record — without a human in the middle.

The hard part isn't the action itself. It's the non-determinism. How do you build a system where the AI decides when to act, which action to take, and what parameters to pass — based purely on context — without it going off the rails?

A few things we've learned building this:

Intent detection is the core problem. The AI needs to understand not just what the user said, but what they actually need done. "Check if John is on a paid plan" should trigger a database lookup, not a paragraph explaining what a paid plan is.

Isolation matters more than you think. Each action needs to be stateless and sandboxed. When the AI is calling functions autonomously, you need a bounded blast radius: one bad call shouldn't affect anything else.

Logging is non-negotiable. Non-deterministic systems are hard to debug. Every action call needs full params + response logged so you can understand exactly what the AI did and why.
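
To make the logging and isolation points concrete, here is a minimal Python sketch, not Amarsia's actual implementation. The names `run_action` and `lookup_plan` are hypothetical. Each call gets its own record with full params and response, and a failing handler is contained instead of propagating. This covers only failure containment; real sandboxing would need process- or network-level isolation on top.

```python
import json
import logging
import time
import uuid
from typing import Any, Callable, Dict

logger = logging.getLogger("ai_actions")


def run_action(name: str, handler: Callable[..., Any], params: Dict[str, Any]) -> Dict[str, Any]:
    """Run one action call and log its params and outcome as a single JSON record."""
    record: Dict[str, Any] = {"call_id": str(uuid.uuid4()), "action": name, "params": params}
    started = time.monotonic()
    try:
        # Pass a copy so the handler receives its own keyword arguments.
        record["response"] = handler(**dict(params))
        record["status"] = "ok"
    except Exception as exc:
        # Contain the failure: the caller gets an error record, not an exception.
        record["status"] = "error"
        record["error"] = repr(exc)
    record["duration_ms"] = round((time.monotonic() - started) * 1000, 2)
    logger.info(json.dumps(record, default=str))
    return record


def lookup_plan(user: str) -> Dict[str, str]:
    """Hypothetical action: a stand-in for a real database lookup."""
    return {"user": user, "plan": "paid"}


result = run_action("lookup_plan", lookup_plan, {"user": "John"})
```

The returned record is the same thing that gets logged, so "what did the AI do and why" can be answered from one place.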

Curious how others are thinking about this:

How are you deciding which actions to expose to your AI vs. which ones stay human-controlled?
Are you prompting heavily to guide action selection, or letting the model figure it out?
What's broken for you that current tool-use implementations don't solve?

We just shipped AI Actions on Amarsia, with a Product Hunt launch coming up, and would love to hear how you're approaching this.

https://www.producthunt.com/products/amarsia

Replies

I've run into this exact gap. My AI explains things beautifully but hesitates to act. I’ve started limiting actions to high-confidence intents only.

3mo ago

@rahul_manjhi1 For me, the biggest shift was realizing actions need tighter boundaries than conversations. I’m exposing only reversible or low-risk actions first. Anything destructive stays human-controlled.

3mo ago

@rahul_manjhi1 @simran_kumar I’m approaching this by tiering actions. Simple reads (like fetching data) are fully automated, but writes or updates require confirmation layers. I’ve found that letting the model “suggest” an action before executing helps a lot.
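
A rough Python sketch of that tiering, under my own assumptions (the `Tier`, `Action`, and `execute` names are hypothetical, as is the `confirm` callback standing in for a human approval step). Reads run straight through; writes are first turned into a suggestion that has to be confirmed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Dict


class Tier(Enum):
    READ = "read"    # safe to run automatically
    WRITE = "write"  # must be confirmed before running


@dataclass
class Action:
    name: str
    tier: Tier
    handler: Callable[..., Any]


def execute(action: Action, params: Dict[str, Any], confirm: Callable[[str], bool]) -> Dict[str, Any]:
    """Run reads directly; for writes, show a suggestion and run only if confirmed."""
    if action.tier is Tier.WRITE:
        suggestion = f"{action.name}({params})"
        if not confirm(suggestion):
            return {"status": "declined", "suggestion": suggestion}
    return {"status": "done", "result": action.handler(**params)}


# Hypothetical actions for illustration only.
get_plan = Action("get_plan", Tier.READ, lambda user: f"{user}: paid")
update_plan = Action("update_plan", Tier.WRITE, lambda user, plan: f"{user} -> {plan}")
```

In a real product `confirm` would probably be a UI prompt or an approval queue rather than a plain function, but the shape stays the same: the model proposes, the system decides whether to run it.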

3mo ago

The logging point resonates most for me. I'm building an AI-powered product and the hardest debugging sessions are always "what did the agent actually decide and why." Full params + context logged per action call has become non-negotiable — not just for debugging but for building trust with users who want to understand what the AI did on their behalf.

3mo ago

I’d expose actions in layers, not all at once.

Low-risk actions like reading data, searching records, or drafting emails can be autonomous. Anything that changes money, permissions, customer records, or external communication should probably start with human approval until there’s enough trust.

Heavy prompting helps, but I think permissions, scoped actions, logs, and clear approval rules matter more than the prompt itself. The model can choose the action, but the system should define the guardrails.
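
As a small illustration of guardrails living in the system rather than the prompt, here is a Python sketch with made-up action and scope names. The model can pick any action it likes, but `authorize` only lets it through if the caller was granted the matching scope, and anything not in the table is denied by default.

```python
from typing import Dict, FrozenSet

# Hypothetical mapping from action name to the scope it requires.
REQUIRED_SCOPE: Dict[str, str] = {
    "search_records": "records:read",
    "draft_email": "email:draft",
    "send_email": "email:send",
    "issue_refund": "billing:write",
}


def authorize(action: str, granted: FrozenSet[str]) -> bool:
    """Allow an action only if it is known and its required scope was granted."""
    required = REQUIRED_SCOPE.get(action)
    return required is not None and required in granted


# An agent that may read and draft, but not send or touch billing.
agent_scopes = frozenset({"records:read", "email:draft"})
```

With these scopes, `authorize("draft_email", agent_scopes)` passes while `authorize("send_email", agent_scopes)` does not, no matter what the prompt says.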

3mo ago

Closing the loop is definitely the next frontier. You mentioned intent detection as the core problem—are you finding that smaller, fine-tuned models handle action-selection better than the larger 'generalist' models, or is heavy system prompting still the most reliable way to keep them on the rails?

3mo ago

Amarsia

@tehreem_fatima5 system prompting and tool prompting still work best in practice. We’ve also found smaller models specialized for specific tasks often make better judgments while taking actions.

There’s definitely still risk with fully autonomous agents, but proper orchestration and guardrails are what make them reliable in production.

3mo ago

We’re thinking about this as a permissioned action system, not just “let the model call tools.” The model can suggest the intent and parameters, but high-risk actions should still go through policies, approvals, scoped permissions, and full traces/logs. For me, the real challenge is not tool calling itself — it’s deciding which actions are safe to automate fully, which need confirmation, and how to make every step explainable when something goes wrong.

3mo ago

The framing I keep coming back to is this: the question is not which actions to give the AI, but which actions to permanently take away.

For Sharpread, I made one hard architectural decision early. The AI is never allowed to touch the financial math.

Revenue, operating loss, EPS, and debt all come directly from SEC XBRL structured data feeds. The model only reads the text. It cannot query, compute, or estimate a number. That action is permanently off the table.

The result is that the non-determinism problem you describe is fully contained. The AI does exactly one thing: extract and cite verbatim sentences from the filing. If it cannot find the source sentence, it drops the claim entirely. It does not approximate, summarise, or fill in the gaps.
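
To show the general idea of the "cite verbatim or drop" rule, here is a simplified Python sketch. It is an illustration only, not the production code, and the function names and sample text are made up. A claim survives only if its quote appears word for word in the filing text, after collapsing whitespace.

```python
import re
from typing import Dict, List


def _normalize(text: str) -> str:
    """Collapse runs of whitespace so line breaks in the filing do not block a match."""
    return re.sub(r"\s+", " ", text).strip()


def keep_verbatim_claims(claims: List[str], filing_text: str) -> List[Dict[str, object]]:
    """Keep only claims whose quote appears verbatim in the filing; drop the rest."""
    source = _normalize(filing_text)
    kept: List[Dict[str, object]] = []
    for claim in claims:
        quote = _normalize(claim)
        offset = source.find(quote) if quote else -1
        if offset >= 0:
            # The offset is what lets each claim link back to its source passage.
            kept.append({"quote": quote, "offset": offset})
    return kept


filing = "Revenue increased.  Operating loss\nwidened."
claims = ["Operating loss widened.", "Revenue roughly doubled."]
kept = keep_verbatim_claims(claims, filing)
```

Here the second claim is dropped because that sentence never appears in the filing, which is the whole point: no match, no claim.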

Your point on logging is exactly right. Every Sharpread analysis links each claim back to the highlighted paragraph in the original EDGAR government filing. That is not just a UX feature. It is the audit trail that lets me debug what the model actually did versus what it was supposed to do.

The trust ceiling for AI actions rises dramatically when you define the blast radius before you define the capability. Isolation first, then widen the scope.

3mo ago