Turn Postgres into a scale-out distributed database

Craig Kerstiens (Maker) @craigkerstiens · Product at Citus
Hi fellow Product Hunters, I'm Craig from the Citus team. Today we released Citus 5.0, a Postgres extension (not a fork of Postgres) that transforms your database into a distributed database. Citus makes it easy to distribute your data across multiple Postgres databases, or shards. Further, as of today Citus is entirely open source. You can read more details about the release over on our blog. Looking forward to answering any questions you have.
Mike Coutermarsh @mscccc · Code @ GitHub
@craigkerstiens Hey Craig! If a team is currently using Postgres, what problems could they be having that Citus solves? Also, would love to know where you got the idea for this & how it started.
Craig Kerstiens (Maker) @craigkerstiens · Product at Citus
@mscccc At Heroku, when running Heroku Postgres, I saw the common issue of one large table growing beyond what a single node could handle. Very commonly this table would be called 'events', 'messages', 'logs'... you get the idea. That one table would make up the vast majority of the database and grow to the point where it could no longer be held in memory, which left you either with crappy performance or having to figure out how to scale out. You can shard with stock Postgres, but that is engineering time you're not spending on building your product. Another option is to move to a different distributed data store, but those don't have quite the level of tooling that SQL does, and that makes analytics hard. Citus generally fits well when you've got a relational database with a single table, or a few, that you need to scale out. This can happen as early as 20-30 GB, probably more commonly at around 100 GB, and Citus remains a fit up to terabytes. The common qualification that leads people to choose Citus is that you need real-time inserts, as opposed to batch processing/ETL, and real-time queries and analytics across your data, again as opposed to long-running data warehouse jobs.
Mike Coutermarsh @mscccc · Code @ GitHub
@craigkerstiens Ahh got it. Makes sense & have definitely run into that problem. Are people using this currently on Heroku Postgres?
Craig Kerstiens (Maker) @craigkerstiens · Product at Citus
@mscccc No, to date it's been sold directly to customers who run it themselves; one example is Cloudflare. A number of customers are running it in the cloud but managing it themselves.
Chase Lee @_chaselee · CTO, Ambassador
@craigkerstiens this looks awesome! really love that Citus has moved from a fork to an extension. in theory this would work with Heroku Postgres now. has anyone tried it, or is the team there on board?
Andreas Klinger (Hiring) @andreasklinger · Tech at Product Hunt 💃
@craigkerstiens So the TL;DR is simple clustering for Postgres datasets? E.g. for tables that are too large, or for a database that is too large in general?
Craig Kerstiens (Maker) @craigkerstiens · Product at Citus
@andreasklinger For the most part yes, it means simple sharding. To Citus, a shard is just a Postgres table, and Citus takes care of distributing those shards across different Postgres instances. In addition, it has different executors to handle various workloads. If you need just one record, it can retrieve it in under 10 ms; if you need an aggregation, it handles it differently, in a way that is more efficient for that workload, but it might take 50-100 ms. Not sure that was a perfect TL;DR, but hope it helps.
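For readers who want a feel for the two access patterns described above (a single-record lookup that touches one shard versus an aggregation that fans out to all of them), here is a rough toy sketch in Python. It is not how Citus is implemented and uses no Citus API: the `ToyCluster` class, the `shard_for` helper, the shard count, and the use of CRC32 as the hash are all assumptions made up for illustration, and each "shard" is just an in-memory dict standing in for a Postgres table.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard count, chosen only for the example


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a distribution key to a shard index using a stable hash (CRC32 here)."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


class ToyCluster:
    """In-memory stand-in for a sharded 'events' table, keyed by user_id."""

    def __init__(self, num_shards: int = NUM_SHARDS):
        self.num_shards = num_shards
        # One dict per shard: user_id -> list of events for that user.
        self.shards = [defaultdict(list) for _ in range(num_shards)]

    def insert(self, user_id: str, event: str) -> None:
        # Each row lands on exactly one shard, chosen by hashing the key.
        self.shards[shard_for(user_id, self.num_shards)][user_id].append(event)

    def lookup(self, user_id: str) -> list:
        # Single-key query: only the one shard that owns the key is consulted.
        shard = self.shards[shard_for(user_id, self.num_shards)]
        return list(shard.get(user_id, []))

    def count_events(self) -> int:
        # Aggregate query: compute a partial count on every shard, then combine.
        partials = [
            sum(len(events) for events in shard.values())
            for shard in self.shards
        ]
        return sum(partials)


if __name__ == "__main__":
    cluster = ToyCluster()
    cluster.insert("alice", "login")
    cluster.insert("alice", "click")
    cluster.insert("bob", "login")
    print(cluster.lookup("alice"))
    print(cluster.count_events())
```

The point of the sketch is only the shape of the work: the lookup is cheap because it is routed to one place, while the count has to visit every shard and merge partial results, which is why the two kinds of query would plausibly have different latencies.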
Joshua Pinter (Pro) @joshuapinter · Product at CNTRAL. Maker of ntwrk.
What's the estimated cost for a small setup if we wanted to get you to do it?
Craig Kerstiens (Maker) @craigkerstiens · Product at Citus
@joshuapinter Josh, it really varies based on the size of the machines and the number of nodes you need. If you'd like, please reach out; happy to talk in more detail.