Hey, great launch!
I’m Anirudh, founder of Bubbly. We let you create GPT-3 powered help for your product in under a minute that can be added anywhere!
I went ahead and trained Bubbly on your landing page. Here is a quick demo: https://www.getbubblyai.com/estu...
Is this something you are interested in?
Congrats David on creating Estuary Flow! 🎉 It's an awesome real-time data platform with great features. Have you considered adding more capture options? That would be really useful. Thanks for providing a free plan, looking forward to exploring more! 🙏
The Pitch:
🚰 Estuary Flow is a Real-Time Data Platform that lets you build reliable, no-code pipelines that don’t require scheduling and that support batch/streaming data and materialized views with millisecond latency.
📒 A free account with up to 10 GB/mo in data movement can be had here: www.estuary.dev
The Details:
Estuary Flow is built on top of an open-source streaming framework (Gazette) that combines millisecond-latency pub/sub with native persistence to cloud storage. Basically, it’s a real-time data lake.
Beyond being able to sync data continuously between sources/destinations without configuring, say, Kafka, there are a few benefits to a UI built on top of this streaming framework, specifically:
🗄️ Managed CDC. Simple, efficient change data capture from databases with minimal impact and latency. Seamless backfills – even over your very large tables that Debezium tends to choke on – and real-time streaming out of the box.
🧑‍💻 Streaming SQL transformations. We have quite a powerful transformation product that allows for streaming SQL transforms without requiring windowing. Join historical data with real-time data without having to think about it. Flow also offers schema validation and first-class support for testing transformations, with continuous integration whenever you make changes.
💽 Collections instead of Buffers. When a data source is captured – like Postgres CDC, or Kinesis, or streaming Salesforce – the data is stored in your cloud storage as regular JSON files. Later, you can materialize all of that juicy history and ongoing updates into a variety of different data systems. Create identical, up-to-date views of your data in multiple places, now or in the future.
📈 Continuous Views instead of Sinks. Materialized views update in-place. Go beyond append-only sinks to build real-time fact tables that update with your captured data – even in systems not designed for it, like PostgreSQL or Google Sheets. Make any database a “real time” database. (There’s a toy sketch of this idea just after this list.)
✅ Completely Incremental, Exactly-Once. Flow uses a continuous processing model, which propagates transactional data changes through your processing graph. This helps keep costs low while maintaining exact copies across different systems.
⏩ Turnkey batch and streaming connectors. Both real-time and historical data are supported through one tool, with access to pre-built connectors for ~50 endpoints. For example, you can capture from the batch Stripe API, join it with data from Kafka, and push all of that to Google Sheets – all without building a custom integration. Or, if you want, plug in your own connector through Flow’s open protocol.
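To make the "continuous views" and exactly-once ideas above a bit more concrete, here is a toy TypeScript sketch of in-place view maintenance. It is purely illustrative: the types and names are made up, and this is not Flow's actual API.

```typescript
// Toy illustration of a continuously maintained view: instead of appending
// raw events to a sink, each event is reduced into keyed state and upserted,
// so the destination always reflects current totals. All names here are
// hypothetical; this is not Flow's API.

type OrderEvent = { customerId: string; amountCents: number };

// The "materialized view": one row per customer, updated in place.
const totalsByCustomer = new Map<string, number>();

function applyEvent(event: OrderEvent): void {
  const previous = totalsByCustomer.get(event.customerId) ?? 0;
  totalsByCustomer.set(event.customerId, previous + event.amountCents);
  // In a real pipeline this upsert would target PostgreSQL, Google Sheets,
  // etc., keyed on customerId, rather than appending a new row.
}

// Simulate a stream of captured events.
const stream: OrderEvent[] = [
  { customerId: "c1", amountCents: 500 },
  { customerId: "c2", amountCents: 1200 },
  { customerId: "c1", amountCents: 300 },
];
for (const event of stream) applyEvent(event);

console.log(totalsByCustomer); // Map { 'c1' => 800, 'c2' => 1200 }
```

Because each event is folded into existing state rather than appended, the view stays small and cheap to query no matter how much history has been captured.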
@dyaffe Congrats on the launch! Real-time data gathering and synchronising data between multiple sources are impressive features that could save us a lot of time.
@huy_doan_quang thanks! One difference is that Segment is a Customer Data Platform, whereas Flow is much more generic. We've tended to focus a bit more on integrations with things like databases, while Segment focuses more on things like capturing data directly from your website. We've actually had users push data from Segment into Flow (using webhooks) in order to materialize their data to an OLAP database like BigQuery or Snowflake.
One other difference is that Segment is more focused on point-to-point integrations, while Flow focuses more on "data products". When you "capture" data in Flow, you don't just have the ability to materialize it somewhere else. Flow captures data into "collections", which have all the features of data products:
- Discoverable: Users can come to Flow to find the data they need
- Governance: Flow's authorization system allows you to safely share data across users, teams, and even organizations
- Quality: Flow does schema validation at every step, and built-in transforms let you clean up your data and provide meaningful guarantees about the output data (there's a small validation sketch just below)
- Observable: Flow automatically instruments every task in a pipeline so you can stay on top of what's happening. (note that we're still working on exposing some of these observability features in the UI)
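To give a feel for the quality point: Flow collections are described with JSON Schema, and documents are validated against it. Here is a minimal standalone sketch of that kind of check using the ajv library; the schema and documents are made-up examples, not Flow internals.

```typescript
// Minimal JSON Schema validation sketch using the ajv library. The schema
// below is a made-up example of what a captured event document might look
// like; it is not a real Flow collection schema.
import Ajv from "ajv";

const ajv = new Ajv();

const schema = {
  type: "object",
  required: ["userId", "event"],
  properties: {
    userId: { type: "string" },
    event: { type: "string" },
    amountCents: { type: "integer", minimum: 0 },
  },
};

const validate = ajv.compile(schema);

const good = { userId: "u1", event: "purchase", amountCents: 499 };
const bad = { event: "purchase", amountCents: -5 }; // missing userId, negative amount

console.log(validate(good)); // true
console.log(validate(bad)); // false
console.log(validate.errors); // explains why `bad` was rejected
```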
@huy_doan_quang Just to piggyback on Phil's answer: a fun thing you can do with Flow is capture WebHooks coming from Segment to then use for your own purposes.
- Flow them into an analytics warehouse like Snowflake or BigQuery.
- Roll them up into a live dashboard kept in a Google Sheet.
- Transform and aggregate your Segment events into a customer 360 profile.
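Concretely, capturing a Segment-style webhook is just an HTTP POST of a JSON event. Here is a minimal sketch; the capture URL is a placeholder rather than a real endpoint, and it assumes a runtime with a global fetch (e.g. Node 18+).

```typescript
// Toy sketch of forwarding a Segment-style "track" event to a webhook
// capture endpoint. The URL is a placeholder; in practice Segment's own
// webhook destination would make this POST for you.
const event = {
  type: "track",
  userId: "u1",
  event: "Signed Up",
  properties: { plan: "free" },
  timestamp: new Date().toISOString(),
};

async function send(): Promise<void> {
  const response = await fetch("https://example.com/your-webhook-capture", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(event),
  });
  console.log("capture endpoint responded:", response.status);
}

send().catch(console.error);
```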
@johnny_graettinger1 wow, it sounds super cool. Thank you for the explanation, guys! Keep it up! Also, is Estuary Flow on Twitter? I'd love to follow your product updates over the next few weeks/months.
@gurudeep_shrotriya Hello Gurudeep, thank you for your kind words!
Estuary Flow is for when you want to move data from APIs, databases, or other technologies to one or more destinations. Along the way, you can define real-time transformations. All of this happens in real time, so you get millisecond latency between a change in your source database and your destination. If you have a specific problem you're trying to solve, we'd be more than happy to discuss it as well.
A few use cases that come to mind:
1. Gathering your analytics from different SaaS that you use (e.g. HubSpot, Salesforce, Mailchimp, etc.) and consolidating them in a single destination for analysis (e.g. Snowflake)
2. Synchronising data between a database and an Elasticsearch cluster, so your data can be searched
3. Synchronising data between SQL and/or NoSQL databases, e.g. if you are migrating from one database to another and want to keep the two in sync, or if you have specific use cases for different databases: moving data from Firestore to Postgres, or MySQL to MongoDB, etc. (there's a toy sketch of this pattern just after this list)
4. Sending events to a queue technology (e.g. Google PubSub) when certain changes are made to your data in any source
5. [... The list goes on!]
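To picture what use case 3 looks like under the hood, here is a toy TypeScript sketch of sync via change events: each change from the source is applied as an upsert or delete at the destination. Everything here is made up for illustration; in practice Flow's CDC connectors handle this for you.

```typescript
// Illustrative-only sketch of database-to-database sync via change events.
// A real pipeline would use CDC connectors; this just shows the pattern.

type ChangeEvent =
  | { op: "upsert"; key: string; doc: Record<string, unknown> }
  | { op: "delete"; key: string };

// Stand-in for the destination store (e.g. MongoDB or Elasticsearch).
const destination = new Map<string, Record<string, unknown>>();

function apply(change: ChangeEvent): void {
  if (change.op === "upsert") {
    // Insert-or-update keyed on the primary key, so replays are idempotent.
    destination.set(change.key, change.doc);
  } else {
    destination.delete(change.key);
  }
}

// A few changes as they might arrive from the source database's log.
const changes: ChangeEvent[] = [
  { op: "upsert", key: "user:1", doc: { name: "Ada" } },
  { op: "upsert", key: "user:1", doc: { name: "Ada Lovelace" } },
  { op: "delete", key: "user:1" },
];
changes.forEach(apply);

console.log(destination.size); // 0, the destination mirrors the source
```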
@gurudeep_shrotriya @mahdi_dibaiee and another very common one is just doing change data capture from databases to warehouses! If you use streaming SQL to create a materialized view, it can save a good amount of money when you query that warehouse.