Estuary Flow

Your data, where you want it, in milliseconds

5.0
1 review

427 followers

Managed change data capture and ETL pipelines with streaming SQL transforms.
Free

David Yaffe
The Pitch:

🚰 Estuary Flow is a real-time data platform for building reliable, no-code pipelines that don't require scheduling, and that support batch, streaming, and materialized views in milliseconds.

📒 A free account with up to 10 GB/mo of data movement is available at www.estuary.dev

The Details:

Estuary Flow is built on top of an open-source streaming framework (Gazette) that combines millisecond-latency pub/sub with native persistence to cloud storage. Basically, it's a real-time data lake. Beyond being able to sync data continuously between sources and destinations without configuring, say, Kafka, there are a few benefits to a UI built on top of this streaming framework, specifically:

🗄️ Managed CDC. Simple, efficient change data capture from databases with minimal impact and latency. Seamless backfills, even over the very large tables that Debezium tends to choke on, and real-time streaming out of the box.

🧑‍💻 Streaming SQL transformations. A powerful transformation product that allows streaming SQL transforms without requiring windowing: join historical data with real-time data without having to think about it. Flow also offers schema validation and first-class support for testing transformations, with continuous integration whenever you make changes.

💽 Collections instead of buffers. When a data source is captured (Postgres CDC, Kinesis, or streaming Salesforce, say), the data is stored in your cloud storage as regular JSON files. Later, you can materialize all of that juicy history and ongoing updates into a variety of different data systems. Create identical, up-to-date views of your data in multiple places, now or in the future.

📈 Continuous views instead of sinks. Materialized views update in place. Go beyond append-only sinks to build real-time fact tables that update with your captured data, even in systems not designed for it, like PostgreSQL or Google Sheets. Make any database a "real-time" database.

✅ Completely incremental, exactly-once. Flow uses a continuous processing model that propagates transactional data changes through your processing graph. This helps keep costs low while maintaining exact copies across different systems.

⏩ Turnkey batch and streaming connectors. Both real-time and historical data are supported through one tool, with pre-built connectors to ~50 endpoints. For example, you can capture from the batch Stripe API, join it with data from Kafka, and push it all to Google Sheets, all without building a custom integration. Or, if you want, plug in your own connector through Flow's open protocol.
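[Editor's note] The "continuous views instead of sinks" idea above can be sketched in a few lines: rather than appending events to a sink, each change event reduces into a keyed fact table that is updated in place. A minimal, purely illustrative Python sketch (not Flow's actual implementation; the field names are hypothetical):

```python
# Sketch of a continuously-updated materialized view: change events
# reduce into a keyed fact table in place, rather than appending to a sink.

def apply_change(view: dict, event: dict) -> None:
    """Fold one change event into the view, keyed by customer_id."""
    key = event["customer_id"]
    view[key] = view.get(key, 0) + event["amount"]

view = {}
events = [
    {"customer_id": "c1", "amount": 30},
    {"customer_id": "c2", "amount": 10},
    {"customer_id": "c1", "amount": -5},  # a later change updates the row in place
]
for ev in events:
    apply_change(view, ev)

print(view)  # {'c1': 25, 'c2': 10}
```

In Flow, this reduction runs continuously and transactionally against the target system, which is what makes an in-place "real-time fact table" possible even in databases that only know how to upsert.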
Dineshan
@dyaffe Congrats on the launch! Real-time data gathering and synchronising data between multiple sources are impressive features that could save us a lot of time.
David Yaffe
@dineshan_sithamparanathan thanks! Appreciate the support and excited to see your product!
Huy Doan
Congrats guys on the launch. The idea looks good. Just a curious question: what is the main difference between Segment and Estuary Flow?
Phil Fried
@huy_doan_quang thanks! One difference is that Segment is a Customer Data Platform, whereas Flow is much more generic. We've tended to focus a bit more on integrations with things like databases, while Segment focuses more on things like capturing data directly from your website. We've actually had users push data from Segment into Flow (using webhooks) in order to materialize their data to an OLAP database like BigQuery or Snowflake.

One other difference is that Segment is more focused on point-to-point integrations, while Flow focuses more on "data products". When you "capture" data in Flow, you don't just have the ability to materialize it somewhere else. Flow captures data into "collections", which have all the features of data products:
- Discoverable: users can come to Flow to find the data they need.
- Governance: Flow's authorization system allows you to safely share data across users, teams, and even organizations.
- Quality: Flow does schema validation at every step, and built-in transforms let you clean up your data and provide meaningful guarantees about the output data.
- Observable: Flow automatically instruments every task in a pipeline so you can stay on top of what's happening. (Note that we're still working on exposing some of these observability features in the UI.)
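[Editor's note] The "schema validation at every step" guarantee mentioned above is worth making concrete. Flow validates every document against its collection's JSON Schema; the toy, stdlib-only validator below only checks required keys and simple types, so treat it as a conceptual sketch rather than Flow's validator:

```python
# Toy illustration of per-document schema validation. Flow uses full
# JSON Schema; this sketch only checks required keys and value types.
SCHEMA = {
    "required": ["id", "email"],
    "types": {"id": int, "email": str},
}

def validate(doc: dict, schema: dict) -> list:
    """Return a list of validation errors (empty means the doc is valid)."""
    errors = []
    for field in schema["required"]:
        if field not in doc:
            errors.append("missing required field: " + field)
    for field, expected in schema["types"].items():
        if field in doc and not isinstance(doc[field], expected):
            errors.append(field + ": expected " + expected.__name__)
    return errors

assert validate({"id": 1, "email": "a@b.co"}, SCHEMA) == []
assert validate({"id": "oops"}, SCHEMA) == [
    "missing required field: email",
    "id: expected int",
]
```

Running this check on every document at every pipeline step is what lets a system reject bad data at the boundary instead of discovering it downstream.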
Johnny Graettinger
@huy_doan_quang Just to piggyback on Phil's answer: a fun thing you can do with Flow is capture webhooks coming from Segment and then use them for your own purposes:
- Flow them into an analytics warehouse like Snowflake or BigQuery.
- Roll them up into a live dashboard kept in Google Sheets.
- Transform and aggregate your Segment events into a customer 360 profile.
Huy Doan
@johnny_graettinger1 wow, it sounds super cool. Thank you for the explanation guys! Keep it up! Also, is Estuary Flow on Twitter? Would love to see and catch up with your product updates in the next few weeks/months.
David Yaffe
@johnny_graettinger1 @huy_doan_quang yes we are but we haven't been super active yet. Our handle is @estuarydev!
Gurudeep Shrotriya
Wonderful real-time data platform for building reliable no-code pipelines. Can you please enlighten me about a few use cases of Estuary Flow?
Mahdi Dibaiee
@gurudeep_shrotriya Hello Gurudeep, thank you for your kind words! Estuary Flow is for when you want to move data from APIs, databases, or other technologies to certain destinations, defining real-time transformations along the way. All of this happens in real time, so you get millisecond latency between changes in your source database and your destination. If you have a specific problem you're trying to solve, we'd be more than happy to discuss it as well. A few use cases that come to mind:
1. Gathering your analytics from the different SaaS products you use (e.g. HubSpot, Salesforce, Mailchimp) and consolidating them in a single destination for analysis (e.g. Snowflake).
2. Synchronising data between a database and an Elasticsearch cluster, so your data can be searched.
3. Synchronising data between SQL and/or NoSQL databases, e.g. if you are migrating from one database to another and want to keep data synchronised between the two, or if you have specific use cases for different databases: move data from Firestore to Postgres, MySQL to MongoDB, etc.
4. Sending events to a queue technology (e.g. Google Pub/Sub) when certain changes are made to your data in any source.
5. [... The list goes on!]
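[Editor's note] Use case 4 above (emitting events to a queue when certain changes happen) boils down to filtering a change stream. A hypothetical sketch, assuming CDC-style events with `before`/`after` images and an illustrative "order became paid" condition:

```python
# Sketch of use case 4: watch a stream of CDC-style change events and
# emit a notification event when a condition is met (status -> "paid").
# Field names and the condition are illustrative, not Flow configuration.

def changes_to_events(changes):
    for change in changes:
        before = change.get("before") or {}
        after = change["after"]
        if before.get("status") != "paid" and after.get("status") == "paid":
            yield {"type": "order.paid", "order_id": after["id"]}

changes = [
    {"before": None, "after": {"id": 1, "status": "pending"}},          # insert
    {"before": {"id": 1, "status": "pending"},
     "after": {"id": 1, "status": "paid"}},                             # update
]
events = list(changes_to_events(changes))
print(events)  # [{'type': 'order.paid', 'order_id': 1}]
```

In a real pipeline, the yielded events would be published to the queue (e.g. Google Pub/Sub) by a materialization connector rather than collected in a list.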
David Yaffe
@gurudeep_shrotriya @mahdi_dibaiee And another very common one is just doing change data capture from databases to warehouses! If you use streaming SQL to create a materialized view, it can save a good amount of money when you query that warehouse.
Gurudeep Shrotriya
@dyaffe Fantastic! Thanks.
Gurudeep Shrotriya
@mahdi_dibaiee that is really awesome. Thank you very much. I'll really use Estuary Flow.
Nick Vas
Estuary Flow seems like a powerful and versatile tool for real-time data management. The fact that it offers a free account with up to 10 GB/mo in data movement is a great way to get people interested in trying it out. How does Estuary Flow compare to other real-time data platforms, such as Kafka?
Johnny Graettinger
@cata Flow is built on top of an open-source streaming spine (gazette.dev) that, from a distance, has a lot of overlap with Kafka. A killer feature of Gazette, which Flow uses extensively, is that cloud storage is _the_ native persistence layer for all data in the system. When you capture data with Flow, it's quite literally building out a real-time data lake. Later -- potentially much later -- when you use that data, it backfills from your cloud storage and thereafter stays up to date.

Flow builds atop this to take your stated desired outcomes -- I want data from here, transformed in this way, and materialized to there -- and make it happen on your behalf.

tl;dr You get the operational capabilities of Kafka, without having to actually run Kafka.
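[Editor's note] The backfill-then-tail read pattern described above can be sketched simply: replay every document already persisted to cloud storage, then continue from the live stream. This is conceptual only; a real reader tracks journal offsets so the handoff misses nothing:

```python
# Sketch of the Gazette/Flow read pattern: backfill historical documents
# from cloud-storage files first, then stay current from the live stream.
# Conceptual only; real readers track offsets across the handoff.
import itertools

def read_collection(stored_files, live_stream):
    # 1) Backfill: replay every document already persisted to storage.
    for f in stored_files:
        yield from f
    # 2) Tail: continue with documents arriving in real time.
    yield from live_stream

stored = [[{"id": 1}, {"id": 2}], [{"id": 3}]]   # JSON docs in storage
live = iter([{"id": 4}, {"id": 5}])              # ongoing updates
docs = list(itertools.islice(read_collection(stored, live), 5))
print([d["id"] for d in docs])  # [1, 2, 3, 4, 5]
```

Because the persistence layer is ordinary cloud storage, the same generator-style read can start "potentially much later" and still see the full history before going live.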
David Yaffe
@cata Kafka is low-level infrastructure that can be tough to manage. Confluent makes it easier, but even then there are some big advantages with Flow:
1. Data history -- you get the ability to retain all historical data and use it very easily with Flow. Kafka and Kinesis are better viewed as buffers that retain recent data.
2. Transactional connectors -- where possible, our connectors are fully transactional, meaning they employ exactly-once semantics.
3. Integration with batch systems -- you can also pull data from batch systems with no problem.
4. Ease and cost -- you can create a live data flow without being super technical, for about 1/10th the cost of something like Kafka.
Happy to help if I can provide more info!
Jenny Man
@cata Hi Nick, this article goes over some more differences in case you want to read more: https://www.estuary.dev/confluen...
srprs
Estuary Flow is a fantastic addition to the market with its ability to backfill data for large tables and stream data in real-time. However, I have some concerns about data security and pre-built connectors. Despite this, Estuary Flow remains a robust Real-Time Data Platform with advanced features that can streamline data processing workflows. Congratulations on the launch, and I'm eager to explore further!
Johnny Graettinger
@srprs Thanks! Data security is of course a wide and deep topic, but a couple of quick, top-of-mind points:
- Flow is designed to store all data in your infrastructure: your cloud storage, your databases, etc. Our managed service _moves_ data but doesn't persist extra copies outside of your systems.
- All credentials are protected using Mozilla's [sops](https://github.com/mozilla/sops) tool, within both our UI and CLI. If you want, you can even set up your own KMS for key management.

Regarding connectors, our central focus is to offer and enhance a wide catalog of ready-to-use connectors. Flow is built such that connectors are lightweight plugins: it's pretty easy to put together a basic connector. In the future, Flow will let you bring your own connectors for bespoke or in-house systems you want to integrate. We think _most_ users just want a good out-of-the-box experience when connecting the systems they use, but we don't want to get in the way of power users who want to create custom integrations!
Dzianis Yatsenka
Hey! Good luck with the launch! Can I use it in my application with a Firebase database?
David Yaffe
@dzianis_yatsenka Thanks for the support, and yes! That's natively supported. We can capture from Firebase, optionally automatically schematize the data, potentially transform it, and push it where you'd like.
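[Editor's note] "Automatically schematize the data" in the reply above means inferring a schema from the documents themselves, which matters for schemaless sources like Firebase. A toy sketch of the idea, inferring a field-to-types map from sample JSON documents (illustrative only; Flow infers full JSON Schema):

```python
# Sketch of automatic schematization: infer a simple field -> types map
# from sample JSON documents. Illustrative; real inference produces a
# full JSON Schema, handles nesting, optionality, and more.

def infer_schema(docs):
    schema = {}
    for doc in docs:
        for field, value in doc.items():
            # Record every type observed for this field across documents.
            schema.setdefault(field, set()).add(type(value).__name__)
    return {field: sorted(names) for field, names in schema.items()}

docs = [
    {"id": 1, "name": "ada"},
    {"id": 2, "name": "bob", "active": True},  # new field appears later
]
print(infer_schema(docs))
# {'id': ['int'], 'name': ['str'], 'active': ['bool']}
```

Fields seen with multiple types would show up with more than one entry, which is exactly the kind of inconsistency a schematization step surfaces before data lands in a typed destination.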
Dzianis Yatsenka
@dyaffe Thank you for your answer! Good luck! Can I use it like Rowy, to transfer Firebase data to XLS (or Google Sheets)?
David Yaffe
@dzianis_yatsenka Yes, it can be used to transfer it to Google Sheets. We actually have a bunch of other connectors you can transfer it to as well, e.g. Snowflake, MongoDB, etc.!
Dzianis Yatsenka
@dyaffe thank you, I’ll try!
Max T.Pham
Congrats David & team on your launch! Estuary Flow seems like an impressive real-time data platform that simplifies streaming data products. With 23 years in tech, my team and I currently handle thousands of data flows (streaming & batch), so I'm excited to see how Estuary Flow can benefit my data and analytics work. I'll send it to my SRE team to check out. Good luck!
Mahdi Dibaiee
@maxtpham Thank you Max for your motivating words! We are always happy to learn about the new use cases people have for data pipelines, and we're eager to help you get started at any time! We look forward to seeing how you and your team get on with the app, and if at any point you'd like to discuss things (specific connectors, requirements, etc.), reach out to us!
Max T.Pham
@mahdi_dibaiee Sure, thank you. Our team will contact you if we need any help!
David Yaffe
@maxtpham please feel free to reach out directly if you have any questions at all!