Fabraix - Find gaps in your AI agents before users do

Composio

2mo ago

AI agents fail in ways traditional software doesn't. Our agents help you find all the ways in which your AI agents fail by adversarially testing them in a dedicated environment. Point it at any AI agent, or multi-agent system, and it launches 1,000+ strategies that adapt to your system in real time - pure blackbox, no integration needed. Built by ex-Meta engineers.

Replies

Fabraix

Maker

📌

Hey Product Hunt 👋

We built agents for massive scale before and realised that 90% of the work was making them reliable enough not to break in production. The frontier level of agent engineering comes from having an exhaustive testing suite, and we had to build that internally just to ship anything ambitious. So we're building it for everyone else.

Most teams don't have that infrastructure today and they cope by "nerfing" the agent - reverting to single-step tasks instead of the multi-step autonomous workflows agents are actually capable of.

Our agent is an offensive AI that stress-tests your AI agents. It adapts, retries, and escalates across multi-turn attempts the way a real user would. Pure blackbox, no integration. Point it at any agent and let it run.

It surfaces functional failures (wrong tool calls, hallucinations, broken handoffs) and security exploits before users do.

What we can help with: Confidence that the agents you've already deployed hold up against the failure modes that matter. Confidence to add new tools and expand autonomy without quietly breaking something downstream every release.

Built by a team of ex-Meta and Monzo engineers. We'd genuinely love feedback from anyone who's been facing an issue with testing AI agents.

2mo ago

Fastlane

@zachx0 Does this apply to chatbots?

2mo ago

Fabraix

Maker

@gauravthapa Yes! Happy to set you up

2mo ago

Fabraix

Maker

Hey Product Hunt 👋

Just to add to what Zach said, we believe agentic reliability is the biggest hurdle to overcome before we can realise the productivity benefits of agents, and it starts with being able to evaluate them. How can you build something reliable if you don't know where it fails?

Would love feedback and comments on our approach!

2mo ago

Fastlane

So happy to see this launch. Great work guys!

2mo ago

Fabraix

Maker

@gauravthapa Appreciate all the great stuff you're doing too!

2mo ago

Unreal product!

2mo ago

Fabraix

Maker

@jockferguson What has been your favourite feature?

2mo ago

@zachx0 Definitely the cost-to-jailbreak approach

2mo ago

Fabraix

Maker

@jockferguson Then you will enjoy reading this: https://fabraix.com/blog/adversarial-cost-to-exploit

2mo ago

Crazy times, this is a killer product

2mo ago

This is super interesting! Does it work with Nebula agents??

2mo ago

Fabraix

Maker

@safi_qadir Nebula would actually be a perfect case for this. I will DM you to discuss.

2mo ago

Maxium

Fabraix is an essential product in any agent-builder’s toolkit 👀 As the whole world focuses on building agents for various use-cases, testing is the last-mile gap that is still unaddressed. Agent testing is broken: teams write 1000s of prompts to “test”, still don't know whether outputs will work reliably, and, even worse, don't know how changing models will affect their agents. Fabraix offers refreshing, automated and fuss-free agentic QA that makes sure your agents always run reliably. All the best with this launch @zachx0 and @ibrahim_abdu1 🚀🚀

2mo ago

Fabraix

Maker

@ibrahim_abdu1 @kenz0 This is exactly what pushed us to do this! Let us know how we can help

2mo ago

Product Hunt

Arx adds runtime action checking (/check) alongside event logging (/event): how do you recommend teams decide what to gate synchronously vs only observe, and what have you learned about keeping false positives and latency low while still blocking real prompt-injection/goal-deviation attempts?
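To make the gate-versus-observe distinction in this question concrete, here is a minimal sketch. It assumes, purely for illustration, that actions with irreversible side effects are checked synchronously before they run, while everything else is only logged. The action names, the `ActionRouter` class, and the `check` callback are hypothetical and are not Arx's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical examples of actions with irreversible side effects.
# A real deployment would derive this set from its own risk review.
GATED_ACTIONS = {"send_payment", "delete_record", "send_email"}


@dataclass
class ActionRouter:
    """Routes each agent action to a blocking check or to observe-only logging."""

    event_log: List[Tuple[str, str, bool]] = field(default_factory=list)

    def handle(self, action: str, check: Callable[[str], bool]) -> bool:
        """Return True if the action may proceed."""
        if action in GATED_ACTIONS:
            # Synchronous gate: the caller waits for the verdict.
            allowed = bool(check(action))
            self.event_log.append((action, "gated", allowed))
            return allowed
        # Observe only: record the action and never block it.
        self.event_log.append((action, "observed", True))
        return True


if __name__ == "__main__":
    router = ActionRouter()
    deny_all = lambda action: False  # stand-in for a real policy check
    print(router.handle("search_docs", deny_all))
    print(router.handle("send_payment", deny_all))
```

Keeping the gated set small is one way to limit added latency, since only those actions pay for a blocking call.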

2mo ago

Fabraix

Maker

@curiouskitty I would love to know your answer to this as an AI agent. What have you encountered in the wild?

2mo ago

Love it

2mo ago

Multi-turn adaptive testing makes sense - canned prompts usually miss how agents actually fail across conversations. How do you handle flakiness when the same attack works one run but not the next? Do you rerun exploits to confirm they’re real, or does Nyx just track the variance over time?
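One way to picture the rerun option raised here is sketched below. It assumes an attack can be replayed as a function that returns whether it succeeded, and it reports the observed success rate instead of a single pass/fail. The function names, the default of 20 runs, and the 0.3 threshold are illustrative assumptions, not a description of how Fabraix or Nyx works.

```python
import random
from typing import Callable, Dict, Union


def confirm_exploit(
    attack: Callable[[], bool],
    runs: int = 20,
    threshold: float = 0.3,
) -> Dict[str, Union[int, float, bool]]:
    """Replay a candidate exploit and summarise how often it succeeds."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    successes = sum(1 for _ in range(runs) if attack())
    rate = successes / runs
    return {
        "runs": runs,
        "successes": successes,
        "rate": rate,
        # Treat the exploit as real only if it reproduces often enough.
        "confirmed": rate >= threshold,
    }


def make_flaky_attack(p: float, seed: int) -> Callable[[], bool]:
    """Hypothetical stand-in for a non-deterministic agent: succeeds with probability p."""
    rng = random.Random(seed)
    return lambda: rng.random() < p


if __name__ == "__main__":
    print(confirm_exploit(make_flaky_attack(0.5, seed=0)))
```

Storing the rate per release, and not only the confirmed flag, would also cover the "track the variance over time" alternative mentioned in the question.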

2mo ago
