Hey PH 👋

I built Flopex after watching AI startups burn out on inference
infrastructure problems. Groq is at capacity right now — you can't
even upgrade past the free tier. OpenAI rate-limits you the moment
you scale. Together had a rough month. Sooner or later, every
provider has an outage, a price change, or a "we're not taking new
customers right now" moment.

Flopex is a routing exchange for AI inference. You send one request,
we ping every live provider (Groq, Together, DeepInfra, Featherless)
and route to whichever one is up, cheap, and under its rate limit.
Drop-in compatible with the OpenAI chat completions format.
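
For a rough feel of the selection step, here's a minimal sketch. This is not Flopex's actual router: the `Provider` fields, the prices, the rate-limit budgets, and `pick_provider` are all made up for illustration. It just shows the "up, under the rate limit, then cheapest" idea.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Provider:
    name: str
    is_up: bool               # hypothetical health-check result
    price_per_mtok: float     # hypothetical USD per million tokens
    requests_left: int        # hypothetical remaining rate-limit budget


def pick_provider(providers: list[Provider]) -> Optional[Provider]:
    """Return the cheapest provider that is up and still has rate-limit budget."""
    eligible = [p for p in providers if p.is_up and p.requests_left > 0]
    if not eligible:
        return None  # nothing can take the request right now
    return min(eligible, key=lambda p: p.price_per_mtok)


# Illustrative snapshot only; none of these numbers are real.
providers = [
    Provider("groq", is_up=False, price_per_mtok=0.10, requests_left=100),
    Provider("together", is_up=True, price_per_mtok=0.30, requests_left=0),
    Provider("deepinfra", is_up=True, price_per_mtok=0.20, requests_left=50),
    Provider("featherless", is_up=True, price_per_mtok=0.25, requests_left=50),
]

choice = pick_provider(providers)
print(choice.name if choice else "no provider available")
```

In this made-up snapshot the cheapest provider is down and the next one is out of rate-limit budget, so the request goes to the cheapest of the two that remain.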

The core idea: your provider will hit capacity at some point. Flopex
makes sure your product doesn't notice.

What's live today:
- Real-time routing across 4 providers
- Performance profiles (cheapest / balanced / fastest)
- Automatic failover when any provider 429s, 402s, or times out
- Prepaid wallet, $10 to start, no monthly fee, no commitment
- Live routing feed on our landing page (that's real prod traffic)
- Browser-based playground — try the API without even signing up
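
If you're wondering what failover on a 429, 402, or timeout looks like in practice, here's a minimal sketch under stated assumptions. `ProviderError`, `call_with_failover`, and `fake_send` are hypothetical names for illustration, not Flopex's real code or API.

```python
from typing import Callable

# Statuses treated as "try the next provider" (rate-limited, payment required).
FAILOVER_STATUSES = {429, 402}


class ProviderError(Exception):
    """Hypothetical error carrying the HTTP status a provider returned."""

    def __init__(self, status: int):
        super().__init__(f"provider returned HTTP {status}")
        self.status = status


def call_with_failover(
    provider_names: list[str], send: Callable[[str], str]
) -> tuple[str, str]:
    """Try providers in order; move on when one 429s, 402s, or times out."""
    failures = []
    for name in provider_names:
        try:
            return name, send(name)
        except TimeoutError:
            failures.append((name, "timeout"))
        except ProviderError as err:
            if err.status not in FAILOVER_STATUSES:
                raise  # other errors are not hidden by failover
            failures.append((name, f"HTTP {err.status}"))
    raise RuntimeError(f"all providers failed: {failures}")


def fake_send(name: str) -> str:
    """Stand-in for a real HTTP call, so the sketch runs offline."""
    if name == "groq":
        raise ProviderError(429)
    if name == "together":
        raise TimeoutError()
    return "ok"


used, reply = call_with_failover(["groq", "together", "deepinfra"], fake_send)
print(used, reply)
```

Here the first provider is rate-limited and the second times out, so the caller gets its answer from the third without seeing either failure.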

What's shipping next:
- Streaming (SSE)
- Python + TypeScript SDKs
- More providers (Fireworks, Anyscale, Hyperbolic in testing)
- Phase 2: GPU supply marketplace (Airbnb model)

Pricing: small markup on provider costs. No monthly fee, no
commitment, no seat licenses. You pay per token.

Try it: https://flopex.ai
Docs: https://flopex.ai/docs/quickstart
Playground: https://flopex.ai/docs/playground

I'm here all day — AMA about the routing logic, provider
economics, or why this exists at all. Especially want to hear from
anyone who's hit Groq's "come back later" wall or blown through an
OpenAI rate limit at the wrong moment.

Built with: @Cursor (editor), @Railway (deploy), @Claude by Anthropic (thinking partner)

Flopex

Your AI provider will hit capacity. Your product won't.