Launching today

Leo: The Prompt Engineering SDK
Optimize, benchmark & evaluate LLM prompts with 1 command.
Bring rigor to your AI agents. Trusted by 8,500+ developers, Leo is a lightweight Python SDK designed to integrate prompt optimization directly into your CI/CD pipelines or internal tools. Stop shipping prompts that only work "most of the time." Leo gives you a structured way to optimize drafts into role-based instructions and automatically evaluate them against real-world test cases using G-Eval and Hallucination Accuracy metrics. It's the missing piece of the LLM dev stack.
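
To make that workflow concrete, here is a minimal sketch of what such a check could look like in CI. Every name in it (`TestCase`, `optimize_draft`, `evaluate`, the 0.8 threshold) is an illustrative placeholder rather than Leo's actual API, and the model call and scoring are stubbed so the snippet runs on its own; the real SDK would swap in a live model call and its G-Eval and Hallucination Accuracy metrics.

```python
# Conceptual sketch only: the names below are illustrative placeholders,
# not Leo's actual API. The model call and the metric are stubbed so the
# structure of a prompt-quality CI check is visible and the file runs as-is.
from dataclasses import dataclass


@dataclass
class TestCase:
    input: str          # what the end user would ask
    expected: str       # reference answer used by the evaluator


def optimize_draft(draft: str) -> str:
    """Turn a loose draft into role-based instructions (placeholder logic)."""
    return (
        "You are a precise assistant. Follow these instructions exactly.\n"
        f"Task: {draft.strip()}\n"
        "Answer concisely and only from the provided context."
    )


def evaluate(prompt: str, cases: list[TestCase]) -> float:
    """Placeholder for metric-based evaluation (G-Eval / hallucination checks
    in the real SDK). Here: trivial word overlap, just so the sketch runs."""
    scores = []
    for case in cases:
        # A real run would send `prompt` plus case.input to the model.
        fake_output = case.expected
        overlap = len(set(fake_output.split()) & set(case.expected.split()))
        scores.append(overlap / max(len(case.expected.split()), 1))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    draft = "summarize the support ticket and flag anything urgent"
    prompt = optimize_draft(draft)
    cases = [
        TestCase("Ticket: printer on fire", "Urgent: hardware fire, escalate"),
        TestCase("Ticket: password reset", "Routine: send reset link"),
    ]
    score = evaluate(prompt, cases)
    print(f"optimized prompt:\n{prompt}\n\nmean score: {score:.2f}")
    # In CI, fail the build when prompt quality regresses below a threshold.
    assert score >= 0.8, "prompt quality regressed"
```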

Been using Leo Prompt Optimizer for my thesis and honestly it made things way less frustrating.
Before, I was just tweaking prompts over and over and hoping they’d work. Now it feels a bit more structured and predictable. The evaluation part is especially useful: you can actually see if a change made things better or not. I also really like that it’s not tied to one model; I’ve tried stuff across different APIs without having to rethink everything. Overall, it just saves time and removes a lot of guesswork.
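
To illustrate the kind of comparison described in that review, the sketch below scores two prompt versions against two model backends. The backend names and the `call_model` and `score` helpers are hypothetical stand-ins, not Leo's API; the provider calls and the metric are stubbed so the snippet runs on its own, where a real run would hit actual providers and an evaluator such as G-Eval.

```python
# Illustrative only: backends, call_model, and score are stand-ins, not
# Leo's API. Calls and scoring are stubbed so the comparison runs as-is.
import random

random.seed(0)


def call_model(backend: str, prompt: str, question: str) -> str:
    """Stub for a provider call (OpenAI, Anthropic, a local model, ...)."""
    return f"[{backend}] answer to: {question}"


def score(answer: str, expected: str) -> float:
    """Stub metric; a real run would use an evaluator such as G-Eval."""
    return random.uniform(0.6, 1.0)


prompts = {
    "v1": "Answer the question.",
    "v2": "You are a careful analyst. Answer only from the given facts.",
}
backends = ["provider-a", "provider-b"]
question, expected = "What changed in Q3 revenue?", "Revenue grew 12% in Q3."

for name, prompt in prompts.items():
    results = [
        score(call_model(b, prompt, question), expected) for b in backends
    ]
    print(name, [f"{r:.2f}" for r in results])
# Comparing per-backend scores for v1 vs. v2 shows whether a prompt change
# actually helped, instead of guessing.
```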