Launched this week

Clusy
AI notebook platform for modern data science
164 followers
AI notebook platform for modern data science
164 followers
Clusy is an agent-native notebook platform for researchers and data teams to build, branch, run, and evaluate ML and data science workflows in the cloud. Describe a goal in natural language, and Clusy plans the workflow, sources datasets, preprocesses data, runs parallel experiments in replicated kernels, compares model architectures, and helps produce optimal models through a human-in-the-loop notebook experience.






Free Options
Launch Team / Built With



Clusy
@eldar_hasanov Interesting to see this land right as I'm dealing with the tail end of a similar problem — most "data science platform" tools assume you're staying inside the notebook the whole time, but a lot of real work ends with someone needing a client-facing PDF or report out of it, which usually means exporting to a totally different tool. Curious how Clusy handles that last step, or if it's staying notebook-native by design.
Clusy
@deepanshu_garg9 thank you for the feedback! Clusy is able to produce essentially any format of files, including figures and reports. We are adding a new skill that will allow it to better render PDFs and make this feature even more catered
The parallel-experiments-in-replicated-kernels part is exactly where I'd want to stress-test this.
When I do this by hand I lose the thread fast — three branches, each with slightly different preprocessing, and a week later I genuinely can't tell which dataset version + seed produced the model I ended up keeping.
So when Clusy branches and runs experiments in parallel, does each branch pin its own environment and dataset snapshot, so a result stays reproducible weeks later?
And when I pick a winner, can I merge that branch back into the main notebook cleanly — or does it live on as a separate artifact?
Clusy
@rudratosh great questions - each branch is its own replicated kernel state, so yes, it does have its own environment setup technically. And in terms of merging, you can do both: merge to main or export as its own snapshot!
Clusy
@thys_beesman almost completely automated! The best part is: you decide. You can jump into the process anytime or let the agent handle parts you don't like.
How does it handle large datasets that don't fit in memory, and what integrations does it currently support for pulling data from sources like S3 or BigQuery?
Clusy
@adnan664848 The current free plan sandbox has 20GB of memory allowance. Datasets that don't fit in memory are processed in batches. We use our own S3 at the moment, but we will soon release features that allow users to connect their own data sources (databases, etc.). We already support Snowflake and Databricks.
How does the human-in-the-loop part actually work when the agent is branching off into parallel experiments. Do you step in between runs or only after the comparison view comes back?
Clusy
@araszengin54995 you can do either!
How does the "agent-native" planning actually handle steps where I need to bring in proprietary data sources or private models that aren't publicly available?
Clusy
@abdurrahmaaz1o in the available tiers, we allow BYOK, but for very proprietary use cases, we can also discuss on-premise self-hosted enterprise options
how does the branching actually work under the hood, like can i fork an experiment midway and rerun only the changed cells without burning through compute on the whole pipeline again?
Clusy
@narinengnejqio basically yes!