Aadil Ghani

How do you stay aware of what your AI coding agents are doing?

I've been running Claude Code, Cursor, and Codex pretty heavily for the last few months and I keep hitting the same loop:

1. Start a task in one agent

2. Switch to something else (Slack, Twitter, another terminal)

3. Come back 30-40 minutes later

4. Agent finished 35 minutes ago. Or worse, it's been waiting for my approval the entire time.

The more agents I run, the worse it gets. There's no unified way to know what's happening across them.

Curious what other people's setups look like:

- Do you just keep terminals visible and check manually?

- Built any custom notification scripts?

- Use something like ntfy or Pushover?

- Just... accept the wasted time?

I've been building something in this space (push notifications + approval flows for AI agents) and I'm trying to understand if everyone's workflow is as janky as mine, or if some of you have figured out something clever.

Would love to hear what's working and what's not.

Replies

Riya Pariyar

mostly just accept the wasted time:)

the approval-wait problem is the one that actually stings though.
a finished agent i can recover from.
one that's been sitting on a y/n for 40 minutes while i'm in another tab. that one hurts differently.

curious how you're handling that in pushary: does the agent push the approval request directly, or is there a layer in between?

Aadil Ghani

@riya_pariyar 

"A finished agent I can recover from, one sitting on a y/n for 40 minutes hurts differently" is painfully accurate. The finished one wasted your time. The blocked one wasted its time and yours, simultaneously, while smugly doing nothing.

To your question: there's a layer in between, on purpose. The agent doesn't push to you directly, it hits Pushary, and we relay it as the approval request with context attached, then route your y/n back to the right terminal. Going direct would mean every agent needs your device, your auth, and your phone number baked in, which is a security nightmare and breaks the second you add a seventh terminal. The middle layer is what lets one inbox sit across Claude Code, Cursor, and Codex at once.

The fun part is the layer can also be smart: batch the noise, flag the risky approvals louder than the trivial ones, so you're not getting paged for "can I create a file."
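To make the "layer in between" idea concrete, here is a minimal in-memory sketch of a relay that collects approval requests, lists risky ones first, and routes each answer back to the terminal that asked. It is an illustration of the pattern only, not Pushary's implementation; the `Relay` class, its method names, and the terminal labels are all hypothetical.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ApprovalRequest:
    request_id: int
    terminal: str   # hypothetical label for the session that asked
    question: str
    risky: bool


class Relay:
    """Toy relay: agents submit requests, a human answers from one inbox."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._pending: Dict[int, ApprovalRequest] = {}
        # terminal -> list of (question, approved) pairs delivered back to it
        self.answers: Dict[str, List[Tuple[str, bool]]] = {}

    def submit(self, terminal: str, question: str, risky: bool = False) -> ApprovalRequest:
        request = ApprovalRequest(next(self._ids), terminal, question, risky)
        self._pending[request.request_id] = request
        return request

    def inbox(self) -> List[ApprovalRequest]:
        """Pending requests, risky ones first, then oldest first."""
        return sorted(self._pending.values(), key=lambda r: (not r.risky, r.request_id))

    def answer(self, request_id: int, approved: bool) -> str:
        """Record the y/n and return the terminal it should be routed to."""
        request = self._pending.pop(request_id)
        self.answers.setdefault(request.terminal, []).append((request.question, approved))
        return request.terminal
```

The agents only ever hold a request id, so none of them needs to know anything about the human's device or credentials.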

Out of curiosity, would you actually want to approve from your phone, or just be told it's blocked so you can walk back to the machine? Trying to figure out how many people want full remote control vs. just the heads-up.

Rob Stout

I use Warp IDE. It has passive notifications (I think it also has active ones) when the agents finish. It also has tabbed views that you can set up with names/colors/groups to keep your sessions straight. One of the best features: you can save the views to open later. For example, if you're working on an app, you can open iOS/Android/Watch/Wear/Web all at the same time to help keep key features in sync and get them directly onto your FAQ/Support site. I think it works pretty well.

It does have an agent that is supposed to orchestrate, but I don't know what model it's on, and I'd rather do that myself — less drift.

If you're curious about the results:

https://hopscotch.city

https://trytokenmomics.com

Side note: Claude Code does have push notifications when agents finish, also.

Aadil Ghani

@robstout 

Warp's saved views are legitimately great, the iOS/Android/Watch/Web group you can reopen in one click is exactly the kind of thing more tools should steal. And you're right that Warp and Claude Code both ping on "done." I'm not pretending that signal doesn't exist.

Here's the gap though: "done" is the easy notification. Every tool is converging on it because it's the obvious one. The two that nobody handles well are "blocked waiting on your approval" and "here's what actually changed," and they're handled per-tool, locked inside each app. Pushary's bet is the cross-tool layer: one inbox for approvals and handoffs across Warp, Claude Code, Cursor, and Codex, so you're not juggling three different notification systems with three different definitions of "done."

Also, big agree on the orchestrator. "I'd rather do that myself, less drift" is the correct instinct. A black-box orchestrator on a mystery model is just trading visible work for invisible work. Pushary deliberately doesn't try to drive your agents, it just tells you the truth about what they're doing so you stay the orchestrator.

hopscotch.city and trytokenmomics shipping out of that multi-view setup is a solid flex, by the way.

Quick one: do Warp's notifications fire on approval-blocked, or only on finished? That's the line I keep trying to map across tools.

rongze

I'd separate progress logs from needs attention.

For coding agents, the useful notification is usually not every step they take. It's just a few state changes: running, blocked / waiting for approval, failed tests, and finished with a short summary of changed files.

Even a tiny script that watches the terminal or log file and sends those states is probably enough. The key is making the agent say when it needs a human, instead of silently waiting for 35 minutes.
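A rough sketch of that "tiny script", assuming the agent's output is available as lines of text. The regex patterns are guesses and would need tuning to what a given agent actually prints, and the topic name is hypothetical. The request builder targets ntfy's plain HTTP publish endpoint and only constructs the request; passing it to `urllib.request.urlopen` would send it.

```python
import re
import urllib.request
from typing import Iterable, Iterator, Optional, Tuple

# Hypothetical patterns: adjust them to whatever your agent really prints.
STATE_PATTERNS = [
    ("blocked", re.compile(r"\[y/n\]|waiting for approval", re.I)),
    ("failed", re.compile(r"\btests? failed\b", re.I)),
    ("finished", re.compile(r"\btask complete\b", re.I)),
]


def classify(line: str) -> Optional[str]:
    """Return the first attention-worthy state a log line matches, or None."""
    for state, pattern in STATE_PATTERNS:
        if pattern.search(line):
            return state
    return None


def state_changes(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (state, line) only when the state differs from the last one reported."""
    last = None
    for line in lines:
        state = classify(line)
        if state is not None and state != last:
            last = state
            yield state, line.strip()


def build_ntfy_request(topic: str, state: str, detail: str) -> urllib.request.Request:
    """Build (but do not send) a POST that publishes a message to an ntfy topic."""
    priority = "high" if state == "blocked" else "default"
    return urllib.request.Request(
        f"https://ntfy.sh/{topic}",
        data=detail.encode("utf-8"),
        headers={"Title": f"agent {state}", "Priority": priority},
        method="POST",
    )
```

Progress lines fall through `classify` as `None`, so they stay in the log and never reach the notification channel.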

Aadil Ghani

@kevinzrzgg 

"Separate progress logs from needs-attention" should be the first line of the spec, and it's the line most people get wrong. They pipe every step to a notification and train themselves to ignore the channel, which is worse than silence. The signal is the four state changes you listed, nothing else. Progress is for the log you read when you choose to. Attention is for the ping you can't ignore.

You're also right that a tiny script gets you most of the way on one tool, one machine. The catch is that's where everyone starts and then quietly drowns. The script that watches Claude Code's log doesn't know about Cursor's, the one for your laptop doesn't follow you to your phone, and "finished with a summary of changed files" needs the agent to actually emit that summary reliably, which is its own small project. It scales to one of you and one tool beautifully, and falls apart at six agents across three tools.

That's the entire reason Pushary exists: the exact state machine you described (running, blocked, failed, done-with-diff) but as one layer across tools and devices, so you're not maintaining a script graveyard. You nailed the spec. We just productized the part where it has to keep working at scale.

The real unlock is your last sentence: making the agent declare when it needs a human instead of silently waiting. Curious, in your setup do you prompt the agent to announce that, or do you infer "blocked" from the terminal going quiet? The detection method is the part I keep going back and forth on.

Mert Tunay

I still check manually most of the time and the worst part is finding out the agent was waiting for one small approval. A simple done/stuck/needs input notification would be useful. Are you building this mainly for terminals or also for Slack/GitHub workflows?

Aadil Ghani

@mert__33 

Direct answer: the capture starts at the terminal layer, because that's where the agents actually live and where the "waiting on one tiny approval for 35 minutes" pain is sharpest. Claude Code, Cursor, Codex. But the delivery side is deliberately channel-agnostic. The whole point is the alert finds you wherever you are, so phone and Slack are first-class, not the terminal you've already walked away from.

GitHub workflows are the natural next layer and very much on the radar, because a blocked agent and a PR waiting on review are the same problem wearing different clothes: work that's done moving and now stuck on a human who doesn't know yet. Unifying "your agent needs you" and "your PR needs you" into one inbox is exactly the direction.

And yes, done / stuck / needs-input is the entire signal. Three states, not a firehose. The one that earns its keep is "needs input," because that's the silent 35-minute killer you described. Done you recover from, stuck-on-a-y/n just bleeds.

Quick one back: if it pinged you in Slack, would you want to actually answer the approval right there in the thread, or just get the nudge and walk back to the terminal? Trying to figure out how much people want to resolve in-channel versus just be told.

rongze

The useful notification is probably not just “agent finished”. I’d want three boring bits with it: what changed, what it needs from me, and whether it touched anything risky.

For coding agents, a good status update is closer to a tiny handoff report than a push alert: changed files, tests run, failures, and “waiting on approval for X”.
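One way to picture that handoff report is as a small data structure rather than a string. This is only a sketch; the `HandoffReport` class and its field names are made up for illustration and do not come from any of the tools mentioned.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class HandoffReport:
    agent: str
    changed_files: List[str] = field(default_factory=list)
    tests_run: int = 0
    failures: List[str] = field(default_factory=list)
    waiting_on: Optional[str] = None  # what needs approval, if anything

    def headline(self) -> str:
        """The one line a notification would show first."""
        if self.waiting_on:
            return f"{self.agent}: waiting on approval for {self.waiting_on}"
        if self.failures:
            return f"{self.agent}: {len(self.failures)} failure(s)"
        return f"{self.agent}: finished"

    def render(self) -> str:
        """Headline plus the body: changed files, tests run, failures."""
        files = ", ".join(self.changed_files) or "none"
        lines = [
            self.headline(),
            f"changed files ({len(self.changed_files)}): {files}",
            f"tests run: {self.tests_run}, failed: {len(self.failures)}",
        ]
        lines += [f"  FAILED {name}" for name in self.failures]
        return "\n".join(lines)
```

The ordering in `headline` encodes the priority: a pending approval outranks failures, and failures outrank a plain "finished".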

Aadil Ghani

@kevinzrzgg 

"Closer to a tiny handoff report than a push alert" is the cleanest way I've heard anyone draw that line. A bare "agent finished" just relocates the work, you still have to climb back into the terminal to find out what finished means. The notification should carry the decision, not just announce that a decision is waiting.

Your three boring bits are the exact payload: what changed, what it needs from me, and did it touch anything risky. That last one is the underrated star. Most setups surface changed files and failures but never flag "by the way, I edited the auth flow," which is precisely the change you'd want screaming at you and the one that arrives silently. Changed files and tests are the body of the report. "Touched something risky" is the headline.

This is more or less Pushary's whole spec, so either you've been reading my notes or we independently arrived at the same correct answer. The push is the handoff: changed files, tests run, failures, waiting-on-approval-for-X, with risk flags promoted to the top so the loud stuff stays loud.

The hard part is the risk flag, because the agent has to reliably know it touched something sensitive. Curious how you'd draw that line, declare risky paths per project up front (auth, payments, migrations), or have the agent infer risk on its own? I trust the first more than the second right now.
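The "declare risky paths per project up front" option can be sketched with nothing more than glob patterns. The pattern list below is a hypothetical example config, and substring-style globs like these are crude (for instance, `*auth*` would also match `author.py`), so treat it as a starting point rather than a reliable detector.

```python
from fnmatch import fnmatchcase
from typing import List, Sequence, Tuple

# Hypothetical per-project declaration of sensitive areas.
RISKY_PATTERNS = ["*auth*", "*payment*", "*migrations/*"]


def risky_files(changed_files: Sequence[str],
                patterns: Sequence[str] = RISKY_PATTERNS) -> List[str]:
    """Return the changed files that match any declared risky pattern."""
    return [
        path for path in changed_files
        if any(fnmatchcase(path.lower(), pattern) for pattern in patterns)
    ]


def split_for_report(changed_files: Sequence[str],
                     patterns: Sequence[str] = RISKY_PATTERNS) -> Tuple[List[str], List[str]]:
    """Split into (risky, everything else) so risky files can lead the report."""
    risky = risky_files(changed_files, patterns)
    rest = [path for path in changed_files if path not in risky]
    return risky, rest
```

Because the rule is a static list checked against the diff, it does not depend on the agent noticing on its own that a change was sensitive.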

Goddey Uwamari

Interesting parallel — this visibility problem shows up at the infrastructure layer too.

Most multi-tenant SaaS teams have almost no real-time awareness of what their workloads are actually doing or costing until the AWS bill arrives (or something breaks). Same observability gap, different layer of the stack.

Curious if you've found any solid patterns or tools for keeping track of multiple agents yet?

Aadil Ghani

@goddey_uwamari 

The AWS-bill parallel is painfully good. "You find out what it cost when the invoice lands" is the exact same failure mode as "you find out the agent was blocked when you happen to glance at the terminal." In both cases the system was telling you the whole time, you just had no surface listening. Observability gaps are all the same shape: work happening faster than the human can passively perceive it.

To your actual question, the patterns that keep showing up in this thread and hold up: isolate first (git worktrees, one per agent, so contexts never collide), then collapse the signal to three states (done, blocked, failed) instead of streaming every step, then make the notification a tiny handoff report (changed files, tests, and crucially a flag if it touched anything risky) rather than a bare "done." And the cultural one: make the agent declare when it needs a human instead of silently waiting. Silence should never be the blocked state.

That stack is basically what Pushary productizes, which is the honest answer to "any solid tools yet." Most people are hand-rolling it with curl pings and log-watchers, and it works until you hit six agents across three tools and your script graveyard becomes its own maintenance job. The pattern is well understood now. The unsolved part is making it work at scale without you babysitting the babysitter.

Funny enough your world might be the next layer down: an agent that's burning tokens or spinning a runaway loop is a cost-observability problem wearing a coding-agent hat. Have you seen anyone tie agent activity to spend in real time, or is that still bill-arrives-and-you-cry?

Aadil Ghani

Launch is live guys - would love some support https://www.producthunt.com/posts/pushary-3

Veer Singh

The "waiting for approval" loop is the absolute silent killer of AI productivity! 🛑 Right now, it's mostly manual terminal-watching or messy custom scripts. A unified notification layer for agent execution is a brilliant move—definitely a tool the space needs.

Aadil Ghani

@veer_singh14 

"Silent killer" is the perfect phrase for it 🎯 — the agent finishes its actual work in 5 minutes and then just... sits there waiting, and you don't find out for half an hour. The work was fast; the waiting on you to notice is what's slow.

Appreciate the read on it being a unified layer rather than another one-off script. That's the bet exactly — everyone's already solved this badly with their own custom ntfy/Pushover hacks, but nobody wants to maintain that glue across Claude Code, Cursor, and Codex. Are you currently running any custom notification scripts yourself, or still in the manual terminal-watching phase? Trying to gauge how many people have already duct-taped a solution vs. just living with it.

Veer Singh

@aadilghani Couldn't agree more—maintaining that custom glue code across different AI tools is a massive hidden time-sink for engineering teams.

We've built some of our own notification workarounds for our dev workflows to dodge the terminal-watching trap, so I completely validate the pain point you're solving here. Centralizing this into a single, reliable layer is a game-changer for team efficiency. Really looking forward to seeing how Pushary handles the multi-tool ecosystem!

Toch

This is a real problem, especially once you’re running more than one coding agent at the same time.

What has helped me is treating AI coding agents less like autocomplete and more like junior engineers that need task boundaries, check-ins, and review points.

My workflow is usually:

  1. I break tasks into very small tickets before sending anything to Claude Code/Cursor/Codex.

  2. I define the expected output upfront: files to touch, files not to touch, acceptance criteria, and what “done” means.

  3. For any critical change, I ask the agent to update the project .md file with what changed, why it changed, files touched, assumptions made, and any follow-up risks.

  4. I ask for a summary before implementation when the task is risky.

  5. I review diffs before accepting changes.

  6. For longer tasks, I use checkpoints: “stop after planning,” “stop after backend changes,” “stop before modifying auth/payment/database logic.”

That .md file has been surprisingly important because it becomes a running memory and audit trail for the project. When something breaks later, I’m not trying to reverse-engineer what the agent did from vibes.
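For illustration, the entry format described in step 3 could look something like the sketch below. The heading layout, function names, and file name are assumptions, not a description of the actual file; in practice the agent would be instructed to write an entry like this itself.

```python
from datetime import date
from pathlib import Path
from typing import Optional, Sequence


def format_entry(what: str, why: str, files: Sequence[str],
                 assumptions: Sequence[str] = (), risks: Sequence[str] = (),
                 day: Optional[date] = None) -> str:
    """Render one audit-trail entry as a markdown section."""
    day = day or date.today()
    lines = [f"## {day.isoformat()}: {what}", "", f"Why: {why}", "", "Files touched:"]
    lines += [f"- {name}" for name in files]
    for title, items in (("Assumptions", assumptions), ("Follow-up risks", risks)):
        if items:  # skip empty sections entirely
            lines += ["", f"{title}:"] + [f"- {item}" for item in items]
    return "\n".join(lines) + "\n"


def append_entry(log_path, entry: str) -> None:
    """Append an entry to the project log, separated from earlier entries by a blank line."""
    path = Path(log_path)
    separator = "\n" if path.exists() and path.stat().st_size > 0 else ""
    with path.open("a", encoding="utf-8") as handle:
        handle.write(separator + entry)
```

Appending rather than rewriting keeps the file usable as a chronological trail when something breaks later.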

For me, the bigger issue is not just notification visibility. It’s workflow observability.

I don’t only want to know “the agent is done.” I want to know what changed, why it changed, what assumptions it made, what files were touched, and what needs human approval before moving forward.

That’s where I think the real opportunity is: status, approvals, logs, summaries, risk flags, and handoff points across agentic coding tools.

Because once you’re building production software, the real question becomes: how do I stay in control while AI moves faster than I can manually monitor?

Aadil Ghani

@toch_aria 

You basically wrote my product spec, so I'm either flattered or out of a job.

You're right: "done" is the shallow version. What changed, why, what it touched, what needs sign-off, that's the actual signal. The notification is just the courier. The .md-as-audit-trail move is the smart part, because future-you debugging at 2am does not accept "vibes" as a commit message.

That control-while-it-moves-faster-than-you-can-watch problem is the whole reason pushary.com exists. Think of it as a control panel for your agents: status, approvals, and risk flags across Claude Code, Cursor, and Codex, so you stay in the loop without babysitting six terminals.

Quick one back at you: do you prompt for that .md update every time, or have you wired it into a skill so it's automatic? That reliability is load-bearing for the whole thing.

Toch

@aadilghani Glad to hear that, I will be reviewing the product link. Initially I prompted for that every time, which can be a token killer. I've since switched to automating it with a skill.

Florent Duthoit

There are hundreds of ways to handle this, I believe.
In my case, the most efficient way was giving my agent access to my Slack and instructing it to notify whichever team member is relevant to the question, so they can answer in Slack and the agent can read the response.
You can also have the agent wait X minutes and check back to see whether it got an answer.
That has been quite smooth with my co-founder. They went to sleep while the agent was running and building stuff. When it got stuck, the agent just pinged me on Slack, I gave it the solution or fixed the problem it was facing, then it confirmed with me and got back to work, and so on, back and forth.

Of course, there's still the first step of planning the work, which needs attention, and you can't really get away from the computer for it. But while it's building, anyone from a company should be able to help it get the work done.

Aadil Ghani

@florent_duthoit 

This is genuinely one of the better setups in the thread, and the insight buried in it is the real gold: "anyone from a company should be able to help it get the work done." That's the part most people miss. Once the agent can route a question to whoever can actually answer it, you've turned a solo bottleneck into a team that happens to be asleep half the time.

The Slack relay is smart, and it's basically Pushary's thesis built by hand. The two places it gets fragile: you had to wire it up per agent, give it Slack access, and format the prompts just so, and the "wait X minutes and check back" loop is polling sneaking back in through the side door. Works great with a cofounder who knows the dance. Gets messy across six agents and a bigger team where you also want to know what it changed, not just answer its question.

Pushary is that pattern as a product instead of a custom integration: structured handoffs with risk flags routed to the right person, no per-agent plumbing, and the relay handles the wait so the agent isn't burning cycles checking back.

You nailed the one honest limitation too. Planning still chains you to the desk. Agreed, and I think that's correct for now. You should be in the room for the plan. It's the babysitting after the plan that shouldn't require a human at all. Curious, does your agent ever ping the wrong person, or have you got the routing dialed in?

Florent Duthoit

@aadilghani You're right about the two fragile places, but we have an easy solution to counter that.
Let me give you more context about our team workflow.

We train each of our team members on Opencode: sales, customer support, delivery, developers, finance, whatever. We have 30+ MCPs connected (through a built-in MCP server that is used as a gateway) that our team can use in Opencode across all the company services, each connecting their own company account to it.
It has been the best productivity tool we've found, and by far the best performing one.
Setup is an easy script that can run on any device and configures everything properly, with no technical knowledge required. Just a one-line command to run (it could become an executable).

So, to address the two fragile places:
- No need to format the prompt: global instructions are set up so that it knows what to do.
- We don't really care how long it waits, and the wait can be incremental. In the end, we just want the work done, whatever time it takes. If it has waited too long, it pings the person concerned again to follow up. I wish there was some way to wake up an agent on a specific event (Slack message received and so on); that could be the perfect improvement/solution long term.
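The incremental wait with follow-up pings might be sketched like this. It assumes two callables supplied by the caller: `check_reply`, which returns the answer or `None`, and `remind`, which re-pings the person. Both are hypothetical stand-ins for whatever Slack access the agent has, and the delay values are arbitrary.

```python
import time
from typing import Callable, Iterator, Optional


def backoff_delays(first: float = 60.0, factor: float = 2.0,
                   cap: float = 1800.0) -> Iterator[float]:
    """Yield waits that grow by `factor` each time, never exceeding `cap` seconds."""
    delay = first
    while True:
        yield min(delay, cap)
        delay *= factor


def wait_for_reply(check_reply: Callable[[], Optional[str]],
                   remind: Callable[[], None],
                   max_checks: int = 6,
                   remind_every: int = 3,
                   sleep: Callable[[float], None] = time.sleep) -> Optional[str]:
    """Poll for an answer with growing waits, re-pinging every `remind_every` misses."""
    delays = backoff_delays()
    for attempt in range(1, max_checks + 1):
        reply = check_reply()
        if reply is not None:
            return reply
        if attempt % remind_every == 0:
            remind()  # follow up with the person concerned
        sleep(next(delays))
    return None  # gave up; the caller decides what happens next
```

This is still polling, so an event-driven wake-up would replace the loop entirely; the sketch only shows the "incremental" part.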

As for pinging the wrong person, we of course use agent memory. It is not yet shared across the team, but we may set that up ASAP. Right now, everyone has their own personal memory, and the agent can remember who to talk to depending on the task we're dealing with at the moment.

I think the problem most people face is big at the beginning, and they all need to find an approach that matches the team's mindset.
We are evolving our ecosystem with what we discover and relying more and more on agents. With this setup we have already automated most of it:
- Development
- New workflows
- Create/Update documentation
- & much more.

The most underrated winning MCP has been Playwright MCP. There are no more limits with this one, and it's way more efficient than any built-in Claude/GPT/Perplexity browsing agent system.

I feel your platform is interesting, but if I have to keep being on my phone confirming notifications, it will drive me crazy.
We need to give the agent a dedicated environment where it can fail, so that the basic question/authorization is not required anymore.

Aadil Ghani

@florent_duthoit 

You just argued me into a better version of my own product, so thank you for that.

You're right, and I'll go further than you expect: if Pushary is pinging your phone for "can I create a file," it has already failed. That class of approval shouldn't exist. A sandboxed environment where the agent is free to fail and self-correct kills 90% of interruptions, and I'm fully on board with that being the goal. Anyone building approval flows for reversible actions is just adding friction with extra steps.

But there's a second class that no sandbox removes, because it's not "did it fail," it's "which way did you want this." Touch the payment logic or not. Ship to prod now or wait. The spec was ambiguous and there are two reasonable interpretations. In those, the human isn't a safety net, the human is the missing input. The agent can't fail its way to your intent. That irreducible slice is the only thing Pushary should ever surface, and if it surfaces more than that, I'm doing it wrong.

And here's the fun part: you already designed the rest of it. "I wish there was some way to wake up an agent on a specific event, Slack message received" is exactly the return path Pushary is built around. The agent sleeps, the event fires, it wakes and keeps going. You're not arguing against the platform, you're spec'ing its core loop. We just disagree on whether you want to be the device that confirms, and I think the answer is "only for the handful that actually need your brain."

Separately, your Opencode plus 30+ MCP gateway setup with non-technical teammates running one-line installs is one of the more impressive org-wide agent rollouts I've heard described, and the shared-memory routing is the obvious next unlock. Big co-sign on Playwright MCP too, the built-in browsing agents aren't close. Quick one: when the agent hits a genuine judgment call, not a failure, who in that setup ends up being the input, and how does it know?

Florent Duthoit

@aadilghani we believe more in agent than god itself 😂

When the agent has designed everything, the human mostly guides it the wrong way.
If the agent comes up against a blocker, we just need to let it investigate and come back with choices that match what we want. The human is just there to give insights or information it doesn't have, but never to make a decision.

I remember Jensen saying: "I'm not asking it to think for me. I'm asking it to teach me things that I don't know."

This happens during plan mode. That's why the focus and the waiting (reading, actually) during this phase are important: you learn every minute.
We know what we want, but we don't know how to achieve it in the proper and quickest way.