Launching today
Metoro
AI SRE that detects, root causes & auto-fixes K8s incidents
90 followers
Metoro is an AI SRE for systems running in Kubernetes. It autonomously monitors your environment and detects incidents in real time. After detecting an incident, it root causes the issue and opens a pull request to fix it; you just get pinged with the fix. Metoro brings its own telemetry, collected with eBPF at the kernel level, so no code changes or configuration are required. A single Helm install gets you up and running in less than 5 minutes.

Metoro
Hey PH! We're Chris & @ece_kayan, the founders of Metoro.
We built Metoro because dealing with production issues is still far too manual.
Teams are shipping faster than ever with AI, but when something breaks, engineers still end up jumping between dashboards, logs, traces, infra state, and code changes just to figure out what happened and how to fix it.
We started working on this back in 2023 during YC’s S23 batch, and learned a hard lesson from customers early on: generalized AI SRE doesn't work reliably for two reasons.
Every system is different. The architecture is different. Some teams run on VMs, some on Lambdas, some on managed services, some on Kubernetes, others on mixtures of all of them.
On top of that, telemetry is usually inconsistent. Some services have traces, some don’t. Some have structured logs, some barely log at all. Metrics are named differently everywhere.
This means that teams need to spend weeks or even months generating system docs, adding runbooks, producing documentation and instrumenting services before the AI SRE can be useful. That wasn't workable.
So we took a different approach.
With Metoro, we generate telemetry ourselves at the kernel level using eBPF. That gives us consistent telemetry out of the box with zero code changes required. No waiting around for teams to instrument services. No huge observability blind spots.
And because Metoro is built specifically for Kubernetes, the agent already understands the environment it’s operating in. It doesn’t need to learn a brand new architecture every time.
The result is an AI SRE that works out of the box in under 5 minutes.
We automatically monitor your infrastructure and applications. When we detect an issue, we investigate and root cause it. Once we have the root cause, we automatically generate a pull request to fix it, whether the fix is in application code or infrastructure configuration. Detect, root cause, fix.
We’re really excited to be launching on Product Hunt today 🚀
We’d love for you to check it out, try it, and ask us anything. Whether that’s about Metoro, Kubernetes observability, or AI in the SRE space.
@ece_kayan @chrisbattarbee
I’ve been burned by 'AI SRE' promises before, but your approach to the data problem (eBPF) makes this actually feel technically grounded. Really great to see.
Metoro
@ece_kayan @priya_kushwaha1
Thanks Priya! Honestly, the hardest problem is getting the right data to the agent at the right time.
eBPF (plus Kubernetes specificity) helps us make that possible.
@ece_kayan @chrisbattarbee That's great. All the best!
@ece_kayan @chrisbattarbee eBPF is the part that made me stop here. Most tools in this space sound great until the data gets patchy. Starting with your own telemetry makes the whole thing feel a lot more believable.
How often are teams merging the PR as is?
Metoro
@ece_kayan @artem_kosilov Hey Artem!
Thanks!
Based on our tracking, around 60% of generated PRs get merged, either as is or after follow-ups on the PR itself.
That said, we know some people use the generated PR as the basis for a new PR that they open and iterate on themselves. We don't have good metrics for that right now, since it's separate from the initial PR that we create.
Does it work well with many scheduled jobs/tasks for which the code is in a large monorepo?
Metoro
@alexander_zakon Yes!
So each k8s cronjob gets mapped to a service internally in Metoro. Then each service is assigned a codepath which is a combination of repository and source path. It looks something like:
Metoro discovers these automatically by comparing emitted logs, profiling information, etc., but you can also set the codepath manually with an annotation on the pod or the CronJob itself: https://metoro.io/docs/integrations/github#option-1-using-kubernetes-annotations-recommended
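To make the CronJob-to-codepath mapping a bit more concrete, here is a minimal sketch in Python. It is an illustration only, not Metoro's actual schema or API: the `CodePath` class, the `resolve_codepath` helper, the job names, and the repository value are all hypothetical. It assumes one service per CronJob, a codepath made of a repository plus a source path, and a manual override taking precedence over the discovered mapping.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class CodePath:
    """Hypothetical codepath: a repository plus a source path inside it."""
    repository: str
    source_path: str


# Hypothetical discovered mapping: each CronJob maps to one service,
# and every service points at a different directory of the same monorepo.
DISCOVERED: Dict[str, CodePath] = {
    "nightly-report": CodePath("github.com/example-org/monorepo", "jobs/nightly_report"),
    "cache-warmer": CodePath("github.com/example-org/monorepo", "jobs/cache_warmer"),
}


def resolve_codepath(
    cronjob_name: str,
    overrides: Optional[Dict[str, CodePath]] = None,
) -> Optional[CodePath]:
    """Return the codepath for a CronJob's service.

    A manual override (standing in for an annotation) wins over the
    discovered mapping; unknown jobs return None.
    """
    if overrides and cronjob_name in overrides:
        return overrides[cronjob_name]
    return DISCOVERED.get(cronjob_name)
```

With a layout like this, a large monorepo is not a problem in principle: many jobs can share one repository while each keeps its own source path.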
Where is telemetry data stored when using Metoro (cloud vs self-hosted)?
Do you support running on Azure Kubernetes Service (AKS), and are there any limitations?
Metoro
@anil_yucel1
Hey Anil :)
So we offer three distinct hosting options:
Metoro Cloud - Fully managed by Metoro; we run the infrastructure in our environment.
Telemetry data is stored in our cloud environment.
BYOC (Bring Your Own Cloud) - Managed by Metoro, hosted in your cloud (in your case, your Azure account).
Telemetry data is stored in your cloud environment, in buckets that you own but Metoro operates (Azure Blob Storage in your case).
On Prem - Fully managed by you; we just provide support.
Telemetry data is stored wherever you choose to host Metoro. We support cloud-based storage options like S3 and Azure Blob Storage, as well as disk-based solutions (SSDs are recommended).
Yep, we fully support AKS, no limitations!