Overseer helps engineers perform faster root cause analysis using machine learning. It integrates with your existing monitoring tools to fetch your dashboards, crunch your data, and notify you of the top dashboards/metrics to look at when an incident is triggered in PagerDuty.



Thanks @nivo0o0! Hi everyone, I'm one of the founders of Overseer. We built this tool because we noticed that engineers had to dig through many dashboards when diagnosing an incident. We felt this process could be streamlined with machine learning.

When we first started working on this project over a year ago, we weren't sure if the algorithms would work, or if our insights would be of value to anyone. We were also struggling to figure out how to make it easy for people to try the product without changing their existing workflow. Since then, we've made huge improvements to the algorithms, deployed the tech for several large customers, and demonstrated value. Now I'd love to get a bit more feedback from you guys and see if we're going in the right direction!

So here's how the tool works:

1 - We pull down your dashboards from your existing monitoring tool (e.g. Datadog/Wavefront/Librato) using your API key.
2 - We integrate with your PagerDuty account via a webhook that notifies us when an incident is triggered.
3 - When our webhook is invoked, we use machine learning to rank your dashboards, rank the metrics on those dashboards, and notify you via Slack/Email of the top dashboards and the top metrics on them to look at.

I'd love to see if the PH community would find this product useful and get feedback on how we can make it better. We're offering a free 30 day trial just for you guys, so sign up today!
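
To make the three steps above a bit more concrete, here is a rough Python sketch of the flow. It is not Overseer's actual algorithm, which isn't described here. As a stand-in for the machine learning step it uses a simple z-score: how far a metric's most recent samples sit from its earlier baseline. The names `Metric`, `Dashboard`, `anomaly_score`, `rank_dashboards`, and `on_incident_triggered`, as well as the payload's `title` field, are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Metric:
    name: str
    values: list[float]  # time-ordered samples, most recent last


@dataclass
class Dashboard:
    title: str
    metrics: list[Metric]


def anomaly_score(metric: Metric, recent: int = 5) -> float:
    """Z-score of the last `recent` samples against the earlier baseline.

    Stand-in for the real ranking model, which is not public.
    """
    baseline = metric.values[:-recent]
    window = metric.values[-recent:]
    if len(baseline) < 2 or not window:
        return 0.0  # not enough history to judge
    spread = pstdev(baseline) or 1e-9  # avoid dividing by zero on flat baselines
    return abs(mean(window) - mean(baseline)) / spread


def rank_dashboards(dashboards: list[Dashboard], top_n: int = 3):
    """Return (score, dashboard title, top metric names), most anomalous first."""
    ranked = []
    for dash in dashboards:
        scored = sorted(
            ((anomaly_score(m), m.name) for m in dash.metrics), reverse=True
        )
        top_score = scored[0][0] if scored else 0.0
        ranked.append((top_score, dash.title, [name for _, name in scored[:top_n]]))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked[:top_n]


def on_incident_triggered(payload: dict, dashboards: list[Dashboard]) -> str:
    """Hypothetical webhook handler: build the text of a Slack/Email notification.

    `payload` is an assumed shape, not PagerDuty's real webhook schema.
    """
    lines = [f"Incident: {payload.get('title', 'unknown')}"]
    for score, title, metrics in rank_dashboards(dashboards):
        lines.append(f"{title} (score {score:.1f}): {', '.join(metrics)}")
    return "\n".join(lines)
```

In a sketch like this, the dashboards would be fetched ahead of time from the monitoring tool's API (step 1), `on_incident_triggered` would be called by whatever web framework receives the webhook (step 2), and the returned text would be posted to Slack or emailed (step 3).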
