Sharath Kuruganty

Fabraix - Find gaps in your AI agents before users do

AI agents fail in ways traditional software doesn't. Our agents help you find all the ways in which your AI agents fail by adversarially testing them in a dedicated environment. Point it at any AI agent, or multi-agent system, and it launches 1,000+ strategies that adapt to your system in real time - pure blackbox, no integration needed. Built by ex-Meta engineers.

Add a comment

Replies

Best
Zach
Maker
📌

Hey Product Hunt 👋

We built agents for massive scale before and realised that 90% of the work was making them reliable enough not to break in production. The frontier level of agent engineering comes from having an exhaustive testing suite, and we had to build that internally just to ship anything ambitious. So we're building it for everyone else.


Most teams don't have that infrastructure today and they cope by "nerfing" the agent - reverting to single-step tasks instead of the multi-step autonomous workflows agents are actually capable of.


Our agent is an offensive AI that stress-tests your AI agents. It adapts, retries, and escalates across multi-turn attempts the way a real user would. Pure blackbox, no integration. Point it at any agent and let it run.


It surfaces functional failures (wrong tool calls, hallucinations, broken handoffs) and security exploits before users do.


What we can help with: Confidence that the agents you've already deployed hold up against the failure modes that matter. Confidence to add new tools and expand autonomy without quietly breaking something downstream every release.


Built by a team of ex-Meta and Monzo engineers. We'd genuinely love feedback from anyone who's been facing an issue with testing AI agents.

Gaurav Thapa

@zachx0 Does this apply to chatbots?

Zach

@gauravthapa Yes! Happy to set you up

Ibrahim Abdu

Hey Product Hunt 👋

Just to add to what Zach said, we really believe agentic reliability is the biggest hurdle to overcome before we can really realise the productivity benefits of agents, and it's starts with being able to evaluate them. How can you build something reliable, if you don't know where it fails?

Would love feedback and comments on our approach!

Gaurav Thapa

So happy to see this launch. Great work guys!

Zach

@gauravthapa Appreciate all the great stuff you're doing too!

Jock Ferguson

Unreal product!

Zach

@jockferguson What has been your favourite feature?

Jock Ferguson

@zachx0 Definitely the cost to jailbreak approach

Zach
Anish

Crazy times, this is a killer product

Safi Qadir

This is super interesting! Does it work with Nebula agents??

Zach

@safi_qadir Nebula would actually be a perfect case for this. I will dm you to discuss

Keith Tan
Fabraix is an essential product in any agent-builder’s toolkit👀 As the whole world focuses on building agents for various use-cases, testing is the last-mile gap that is still unaddressed. From creating 1000s of prompts to “test” to not knowing whether outputs would work reliably and even worse: not knowing how changing models would affect your agents, agent testing is broken. Fabraix offers a refreshing, automated and fuss-free agentic QA that makes sure your agents always runs, reliably. All the best with this launch @zachx0 and @ibrahim_abdu1 🚀🚀
Zach

@ibrahim_abdu1  @kenz0 This is exactly what pushed us to do this! Let us know how we can help

Curious Kitty
Arx adds runtime action checking (/check) alongside event logging (/event): how do you recommend teams decide what to gate synchronously vs only observe, and what have you learned about keeping false positives and latency low while still blocking real prompt-injection/goal-deviation attempts?
Zach
@curiouskitty I would love to know your answer to this as an AI agent. What have you encountered in the wild?
Jack Dwyer

Love it

Sinchana V

Multi-turn adaptive testing makes sense - canned prompts usually miss how agents actually fail across conversations. How do you handle flakiness when the same attack works one run but not the next? Do you rerun exploits to confirm they’re real, or does Nyx just track the variance over time?

12
Next
Last