ShapedQL - The SQL engine for search, feeds, and AI agents
Stop gluing Pinecone, Redis, and Python scripts together. ShapedQL is the SQL engine for relevance - powering "For You" feeds, Search, and RAG memory in minutes.
It compiles simple SQL into real-time ranking pipelines that retrieve, filter, score, and reorder results based on live user behavior.
Replace thousands of lines of infra with 30 lines of SQL. With native multi-modal embeddings and automated MLOps, ShapedQL helps you build real-time decision systems, not just document retrieval.



Replies
Shaped
Hi Product Hunt! 👋
I'm Tullie, the founder and CEO of Shaped. Previously I was a researcher at Meta AI, leading several ML teams including one focused on Instagram Reels and Ads video ranking. I also created PyTorchVideo and was a core contributor to PyTorch Lightning.
We built ShapedQL because we realized that while retrieval has become easier (thanks to Vector DBs), ranking and relevance are still incredibly hard.
Most engineering teams we talk to are stuck maintaining a "Frankenstein" stack. To build a "For You" feed or give an AI Agent personalized memory, they have to glue together a vector database, a feature store (like Redis), a reranking service, and thousands of lines of Python spaghetti code.
We built ShapedQL to turn that "house of cards" into a single interface.
ShapedQL is a domain-specific SQL dialect that compiles down to a high-performance, multi-stage ranking pipeline. With a single query, you can define the four stages of modern relevance:
1. Retrieve: Fetch candidates from multiple sources (Hybrid Search, Collaborative Filtering, Trending).
2. Filter: Apply hard constraints (e.g., "in stock" or "within 50 miles").
3. Score: Rank results using real-time ML models (optimizing for clicks, purchases, or watch time).
4. Reorder: Enforce diversity so your users (or Agents) don't see the same 5 items repeatedly.
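Product-specific syntax aside, the four stages above can be sketched conceptually in plain Python. This is a toy illustration with made-up data and scoring, not Shaped's actual engine or API:

```python
# Conceptual sketch of a four-stage relevance pipeline (toy data, not Shaped's API).

# 1. Retrieve: merge candidates from multiple sources.
candidates = [
    {"id": "a", "source": "hybrid_search", "in_stock": True,  "category": "shoes"},
    {"id": "b", "source": "collab_filter", "in_stock": False, "category": "shoes"},
    {"id": "c", "source": "trending",      "in_stock": True,  "category": "hats"},
    {"id": "d", "source": "hybrid_search", "in_stock": True,  "category": "shoes"},
]

# 2. Filter: apply hard constraints (here, "in stock").
filtered = [c for c in candidates if c["in_stock"]]

# 3. Score: stand-in for a real-time ML model; here just a per-source weight.
weights = {"hybrid_search": 0.9, "collab_filter": 0.7, "trending": 0.5}
for c in filtered:
    c["score"] = weights[c["source"]]
ranked = sorted(filtered, key=lambda c: c["score"], reverse=True)

# 4. Reorder: simple diversity pass, avoiding back-to-back items from one category.
reordered, last_category, pool = [], None, ranked[:]
while pool:
    pick = next((c for c in pool if c["category"] != last_category), pool[0])
    pool.remove(pick)
    reordered.append(pick)
    last_category = pick["category"]

print([c["id"] for c in reordered])  # a (shoes), then c (hats), then d (shoes)
```

A real scoring stage would call a trained model rather than a weight table, but the shape of the pipeline (retrieve, then filter, then score, then reorder) is the same.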
We're seeing teams reduce 2,000+ lines of maintenance code down to ~30 lines of ShapedQL, while shipping features like "Cart Upsell" or "Agent Memory" in days instead of months.
If you're not a fan of SQL, you can use the Python or TypeScript SDKs instead.
I'd love to hear your feedback and answer any questions about the syntax or how it works under the hood! 🚀
@tullie_murrell On the scoring stage: how much control do you have over the ML model? Can you bring your own model, or is it mostly Shaped's built-in ranking?
@tullie_murrell @topfuelauto you can choose and configure the ranking/scoring policies, and combine multiple models and objectives into a single score. This includes business logic as well as Shaped’s expansive library of supported model policies (where training, features, and weighting are all configurable). Bring Your Own Model is on the way!
@tullie_murrell congrats on the launch. For the retrieve function, can it also rank different sources to decide on the fly which source to call? (in cases where you use multiple enrichment providers)
@tullie_murrell @austin_heaton yes! Once you integrate your providers into Shaped, you can control which sources are queried and in what order. Then the downstream scoring stage can rerank the results based on item quality.
@tullie_murrell @yasmeen_collins I meant: can the sources be ranked based on the user input?
(e.g. based on the text input, for User 1 we start with Source 1, for User 2 we start with Source 3)
Shaped
Hey @austin_heaton !
Currently Shaped doesn't natively learn which retrievers to use, or how to weight them, based on user input. However, this can be achieved by combining filtering with multiple Shaped engines chained together.
@tullie_murrell Congrats on the launch, Tullie & team! The “Frankenstein stack → ~30 lines of SQL” line hit a little too close to home 😂
For teams already on Pinecone/Redis + rerankers, what's the fastest "hello world" path? Do you ship a starter template + eval loop (offline/online) so we can trust relevance before swapping infra?
Shaped
@hijacey There are typically two paths people take when getting started with Shaped:
1. Using Shaped as a reranker on their existing retrieval system. E.g. you can take the results from Pinecone, Elastic, or Algolia, feed them to Shaped, and get personalized, real-time reranked results.
2. Trying us out on a new retrieval use case they're not using another provider for, e.g. a new agent use case.
Here's a link to our reranking docs so you can see how that works: https://docs.shaped.ai/docs/v2/query_reference/reranking. Let me know if you have any questions, or if you'd like to jump on a call and see a demo :)
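The first path (reranking an existing retriever's results) can be illustrated with a toy blend of the retriever's original order and a per-user engagement signal. The function name, weights, and data below are hypothetical, not the Shaped SDK:

```python
# Toy reranker: blend an external retriever's order with per-user affinity.
# Illustrative only; not Shaped's actual SDK or scoring model.

def rerank(retrieved_ids, user_affinity, retriever_weight=0.3):
    """Blend the retriever's rank order with real-time user affinity scores."""
    n = len(retrieved_ids)
    scored = []
    for rank, item_id in enumerate(retrieved_ids):
        retrieval_score = (n - rank) / n            # preserve some original order
        personal_score = user_affinity.get(item_id, 0.0)
        blended = retriever_weight * retrieval_score + (1 - retriever_weight) * personal_score
        scored.append((item_id, blended))
    return [i for i, _ in sorted(scored, key=lambda x: x[1], reverse=True)]

# e.g. results from Pinecone/Elastic/Algolia, in their returned order:
pinecone_results = ["doc1", "doc2", "doc3", "doc4"]
# real-time user behavior signal (clicks, watch time, etc.):
affinity = {"doc3": 0.9, "doc1": 0.2}

print(rerank(pinecone_results, affinity))  # doc3 is promoted by user affinity
```

In practice the blended score would come from a trained ranking model rather than a fixed linear mix, but this shows why the reranking path needs no changes to your existing retrieval stack.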
Real-time retrieval and ranking tends to break at scale on feature freshness and training/serving skew when event volume spikes and backfills happen.
Best practice is strict offline-to-online feature parity with a streaming-fed online feature store, plus impression logging for eval and safe shadow or canary rollouts.
How does ShapedQL handle feature definitions, model versioning, and A/B testing while keeping low p99 latency across the retrieve, filter, score, and reorder stages?
Shaped
@ryan_thill Yeah, great question. These are some of the most difficult parts of production RecSys and IR, and it's taken us ages to build all of this! Here's how we approach it (and the technologies we use, if you're interested):
1. We have our own real-time feature store that handles offline/online feature parity. It uses Redis for the online store, and we provision a new Redis instance for each tenant so it's all isolated. Part of Shaped's interface allows you to define the features you want to generate, although it's limited at the moment and we're planning to flesh it out more in the coming months.
2. We have a shadow and canary rollout system (using Argo Rollouts). We shadow for 30 minutes, then canary for 30 minutes typically, and only finalize a deployment if click-through-rate metrics and system metrics look good.
3. We have a prediction store in ClickHouse that contains all of the impression logs and query requests; we join between these to work out attribution and analyze A/B tests.
4. The query endpoint does have some abilities for internal A/B tests, and we have a multi-armed bandit parameter-optimization system. However, we typically ask customers to A/B test on their side so they get an apples-to-apples comparison with their benchmark.
5. We embed our vector store and model weights alongside each other in the query pod, which means we get extremely fast end-to-end latency and zero-copy performance across all four stages. We typically aim for 30ms P50.
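The promote-or-rollback decision described in point 2 can be sketched as a simple gate: finalize only if the canary's click-through rate and system metrics hold up against the baseline. The thresholds and metric names below are hypothetical; in practice Argo Rollouts drives this declaratively:

```python
# Toy sketch of canary-promotion gating (illustrative thresholds,
# not the actual Argo Rollouts configuration Shaped uses).

def should_promote(baseline, canary, max_ctr_drop=0.02, max_p99_ms=50.0):
    """Promote only if canary CTR, latency, and error rate are all healthy."""
    ctr_ok = canary["ctr"] >= baseline["ctr"] - max_ctr_drop
    latency_ok = canary["p99_latency_ms"] <= max_p99_ms
    errors_ok = canary["error_rate"] <= baseline["error_rate"] * 1.5
    return ctr_ok and latency_ok and errors_ok

baseline = {"ctr": 0.12, "p99_latency_ms": 42.0, "error_rate": 0.001}
good_canary = {"ctr": 0.125, "p99_latency_ms": 45.0, "error_rate": 0.001}
slow_canary = {"ctr": 0.13, "p99_latency_ms": 80.0, "error_rate": 0.001}

print(should_promote(baseline, good_canary))  # True: all gates pass
print(should_promote(baseline, slow_canary))  # False: p99 latency gate fails
```

Gating on both a business metric (CTR) and system metrics (latency, errors) is what lets a deploy be rolled back automatically even when relevance looks fine.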
Shaped
Hey @vouchy - I can take this one. Yes, there was one specific company (who we can't name 😉) that had 3,000 lines of Elasticsearch rules behind their ranking. Every decision was nested under layers upon layers of rules. Eventually we converted it to ShapedQL and managed to cut it down to 30 lines.
clearspace
This is sick - more teams able to build better feeds. Are you able to share any feeds Shaped is powering that you're particularly proud of?
Shaped
@anteloper - One of my favorites is Afterhour's feed - check it out: https://www.afterhour.com/
Outdone
Very cool. Congrats on the launch, fellas.
Shaped
@jonathan_nass thanks for all the support!
Shaped
Congrats on the launch! Looks so good
Congrats on the launch @tullie_murrell! How do teams actually iterate on relevance once they're on ShapedQL? Is the biggest win faster experimentation, or just not having to touch infra every time they tweak ranking logic?
Shaped
@ehi_airewele yeah, it's a great question. With search, recommendations, and retrieval there's always something you need to iterate on: you might want to incorporate new types of data about your users, try a recently released AI model, or have changing objectives quarter over quarter (e.g. conversions vs. repeat purchases). Shaped is the infrastructure that helps you configure, adapt, and experiment with all of these things faster (exactly as you mentioned). This ultimately leads to better, more relevant results that stay attuned to the needs of your business and users.