Self-driving AI observability and evals for agents

Respan Gateway - One AI gateway with built-in observability and evals

Y Combinator

•23d ago

Respan AI Gateway connects your app to 1,000+ AI models through one endpoint. But routing is the easy part. Respan keeps production AI reliable and under control with fallbacks, retries, caching, spend limits, alerts, and full traces for every call. Gateway, observability, evals, prompt management, monitors, and cost controls all run on one platform, so you do not need to stitch together five tools to debug production.

Replies

The part I'd want to stress-test is how traces map back to customer and deployment context; that is usually where gateway-only setups stop being enough for debugging production incidents.

22d ago

Respan

Maker

@jimmy_lee12 Absolutely, that's a crucial point. Tracing back to the customer and deployment context is often where simple gateway setups fall short. For production incidents, that full visibility really makes a difference; otherwise it's tough to pinpoint the root cause. That's something we’re keeping in mind with Keywords AI, making sure traces carry enough context to be actionable.

22d ago

Lancepilot

Really impressed with how Keywords AI makes managing multiple models and routing so seamless. Congrats!

22d ago

Respan

Maker

@priyankamandal Thanks a lot! We put a lot of focus on making routing and multi-model management as seamless as possible. Glad to hear it’s coming through in Keywords AI. We’re also excited about how the traces and spend-limiting features help teams keep everything under control in real time.

22d ago

todai

How does Keywords AI decide which model to route a query to when there are multiple options available?

22d ago

Respan

Maker

@umar_saleem Great question! Keywords AI uses a combination of intent detection, output quality, and cost/speed considerations to decide which model should handle a query. It’s not just about picking the “fastest” model—it evaluates what’s best for that specific input in real time. From what we’ve built on the platform, this helps teams get accurate responses without over-spending or slowing things down.
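To make that trade-off concrete, here is a minimal sketch of score-based routing in Python. It is an illustration only, not the platform's actual routing logic: the `Candidate` fields, the model names, the quality/cost/latency numbers, and the weights are all hypothetical, and intent detection is reduced to picking a different weight for quality.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str           # hypothetical model name
    quality: float      # 0..1, assumed eval score for this kind of input
    cost_per_1k: float  # assumed USD per 1k tokens
    latency_s: float    # assumed typical latency in seconds


def score(c, w_quality=1.0, w_cost=0.5, w_latency=0.2):
    # Higher is better: reward quality, penalize cost and latency.
    return w_quality * c.quality - w_cost * c.cost_per_1k - w_latency * c.latency_s


def route(candidates, **weights):
    # Pick the candidate with the best weighted score.
    return max(candidates, key=lambda c: score(c, **weights))


CANDIDATES = [
    Candidate("large-model", quality=0.95, cost_per_1k=0.60, latency_s=2.0),
    Candidate("small-model", quality=0.80, cost_per_1k=0.05, latency_s=0.5),
]

# Default weights favor the cheaper, faster model.
print(route(CANDIDATES).name)
# A quality-sensitive intent can raise w_quality and flip the choice.
print(route(CANDIDATES, w_quality=5.0).name)
```

The point of the sketch is that "fastest" is only one term in the score; changing the weights per intent changes which model wins for the same candidate list.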

22d ago

Congrats on the launch! Curious to hear: after working with real customer workloads, what's one assumption about AI infrastructure that you were confident about early on but later realized was wrong?

17d ago

Sounds useful. We have a travel AI, and we want to run tests comparing the quality of our model’s responses against other popular models. Do you have any built-in mechanisms for that?

22d ago

Respan

Maker

@natalia_iankovych Yes, absolutely.

This is one of the main use cases for Respan evals. You can run the same travel-related test cases across your current model and other popular models, then compare response quality, latency, and cost side by side.
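As a rough picture of what such a side-by-side run looks like, here is a small Python sketch. It does not use the Respan API: the two "models" are stub functions, the test case is made up, and `keyword_grade` is a deliberately naive grader standing in for a real quality metric. Cost is left out because it depends on provider pricing.

```python
import time
from statistics import mean


def compare(models, cases, grade):
    """Run every test case through every model; average quality and latency."""
    report = {}
    for name, call in models.items():
        scores, latencies = [], []
        for prompt, expected in cases:
            start = time.perf_counter()
            answer = call(prompt)
            latencies.append(time.perf_counter() - start)
            scores.append(grade(answer, expected))
        report[name] = {"quality": mean(scores), "latency_s": mean(latencies)}
    return report


def keyword_grade(answer, expected):
    # Naive grader: 1.0 if the expected keyword appears in the answer, else 0.0.
    return 1.0 if expected.lower() in answer.lower() else 0.0


# Hypothetical stand-ins for real model calls.
MODELS = {
    "our-model": lambda prompt: "Fly into Lisbon (LIS), then take the metro.",
    "other-model": lambda prompt: "I am not sure.",
}
CASES = [("Which airport serves Lisbon?", "LIS")]

REPORT = compare(MODELS, CASES, keyword_grade)
for name, row in REPORT.items():
    print(name, row["quality"])
```

In a real suite the stubs would be replaced by actual model calls and the grader by whatever quality measure fits travel answers, for example a rubric or a judge model.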

Feel free to send me an email, and our team will reach out directly. We’d be happy to help you set up the first eval suite for your travel AI!

22d ago

The AI stack is getting more complex, and having gateway, observability, evals, and cost controls in one place is a huge advantage. Great launch 🚀

22d ago

The fallback + spend-limit combo is the part I'd test first. In real LLM apps the annoying bit isn't routing, it's knowing whether a fallback quietly changed latency/cost. Curious if alerts can be tied to a specific customer or workspace?

22d ago

Respan

Maker

@xiaosong001 Totally agree, that fallback and spend-limit combo is where you really see the difference. With Keywords AI, you can tie alerts and traces back to specific customers or workspaces, so you get visibility on latency or cost changes without guessing. It’s been really useful for catching subtle issues in live traffic.
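For anyone wondering what per-customer alerting amounts to, here is a minimal Python sketch that groups traces by customer and flags spend or fallback-rate breaches. The trace fields (`customer_id`, `used_fallback`, `cost_usd`), the thresholds, and the sample data are assumptions for illustration, not the platform's actual trace schema or alert rules.

```python
from collections import defaultdict


def customer_alerts(traces, spend_limit_usd, max_fallback_rate):
    """Aggregate traces per customer and return (customer, reason) alerts."""
    totals = defaultdict(lambda: {"calls": 0, "fallbacks": 0, "cost": 0.0})
    for t in traces:
        agg = totals[t["customer_id"]]
        agg["calls"] += 1
        agg["fallbacks"] += 1 if t["used_fallback"] else 0
        agg["cost"] += t["cost_usd"]

    alerts = []
    for customer, agg in sorted(totals.items()):
        if agg["cost"] > spend_limit_usd:
            alerts.append((customer, "spend_limit"))
        if agg["fallbacks"] / agg["calls"] > max_fallback_rate:
            alerts.append((customer, "fallback_rate"))
    return alerts


# Hypothetical traces: one customer hit a pricier fallback model.
TRACES = [
    {"customer_id": "acme", "used_fallback": False, "cost_usd": 0.02},
    {"customer_id": "acme", "used_fallback": True, "cost_usd": 0.09},
    {"customer_id": "globex", "used_fallback": False, "cost_usd": 0.01},
]

ALERTS = customer_alerts(TRACES, spend_limit_usd=0.10, max_fallback_rate=0.25)
print(ALERTS)
```

Because every trace carries the customer identifier, a fallback that quietly raises cost shows up against the affected customer instead of being averaged away across all traffic.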

22d ago

The evals layer baked into the gateway is particularly interesting since most teams still just eyeball logs to check model performance.

22d ago
