Dropstone

The Recursive Swarm IDE. 10,000 Agents in one tab.


Dropstone v3 introduces Horizon Mode, a recursive swarm architecture that breaks the "Linearity Barrier" in AI coding. Powered by the D3 Engine (Dynamic Distillation & Deployment), it replaces linear token prediction with Divergent Trajectory Search, simulating 10,000+ potential futures to prune errors before they happen. Features Semantic Entropy Tracking for hallucination detection, Flash-Gated Consensus, and a Neuro-Symbolic Runtime that separates reasoning from retention.
This is the 3rd launch from Dropstone.


Santosh Arron

Hi Product Hunt! We are the team at Blankline.

Like many of you, we saw Andrej Karpathy's tweet about feeling "left behind" as programmers. We felt it too. But we realized the problem isn't the engineers—it's the tools.

Standard AI coding tools hit a "Linearity Barrier." They predict the next token, then the next. But real engineering isn't linear; it's a tree of possibilities. If the model makes one mistake at step 5, the whole codebase is broken by step 50.

So we built Dropstone Horizon.

It’s not just a wrapper. It’s a neuro-symbolic runtime that spawns a Recursive Swarm of up to 10,000 "Scout" agents to explore divergent solution paths in the background.

  • Scouts find the dead ends so you don't have to.

  • The D3 Engine compresses the context so you can work for 24+ hours without the AI "forgetting" the plan.

We built this because we wanted an IDE that could actually think, not just guess.

We’d love your feedback on our Distributed Reasoning Architecture. Does this solve the context fatigue you feel in other tools?

LIVE DEMO: I’ll be demonstrating the core capabilities and real-world use cases of the D3 Engine shortly on X. Follow along here to see the engine in action: https://x.com/santosh_arron

Santosh Arron

For the engineers asking about the topology: We believe in building in public. Here are the technical whitepapers behind the runtime:

1. The D3 Engine (Logic-Regularized Autoencoding): https://archive.blankline.org/api/media/file/d3_engine_public_release%20(1)-1.pdf (See Page 5 for the 'Logic-Regularized' compression method).

2. Horizon Mode (Recursive Swarm Topology): https://archive.blankline.org/api/media/file/horizon_mode_public_release%20(1).pdf (See Page 6 for the 'Flash-Gated Consensus' protocol).

We invite you to tear apart our architecture. If you find a flaw in the consensus logic, let us know.

Malek Moumtaz

@santosharron The swarm-based exploration is ambitious. How do you surface and prioritize the right solution path for the developer, so the system’s depth and parallelism feel empowering, not overwhelming or opaque during real-world coding sessions?

Santosh Arron

@malekmoumtaz This is the UX challenge we obsessed over. Raw parallelism without curation isn't a tool; it's a denial-of-service attack on the developer's attention.

We solve this with an 'Iceberg Architecture' (referenced as Context Promotion in our docs):

1. The 'Underwater' Chaos (Hidden):

The 10,000 agents (Scouts) are running in the background, hitting dead ends, failing compiles, and finding race conditions. We hide this. You do not see 9,999 failure logs. We treat these failures as 'Negative Knowledge'—they are used strictly to prune the search tree, not to notify the user.

2. The 'Surface' Signal (Visible):

The system only surfaces a solution when it passes the Verification Gauntlet (Syntax -> Compilation -> Test). We don't ask you to 'choose' from 50 options. We present the one solution that actually survived the physics of the environment.

3. The UI (Orchestration vs. Noise):

As you can see in the demo, we abstract the swarm into a high-level 'Orchestrator Log.' You see the strategy ('Identified race condition in Auth'), not the tactics ('Agent #405 failed line 32').

The Goal: The feeling shouldn't be 'Managing a Team of Interns'; it should feel like 'Code Reviewing a Principal Engineer.' You only see the polished work, not the rough drafts.
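
For anyone who wants the mechanics of that spelled out, here is a minimal, illustrative Python sketch. It is not Dropstone's code and every name in it is hypothetical; it only shows the shape of the idea: push each candidate through syntax, compile, and test gates, keep failures as hidden negative knowledge, and surface the first survivor.

```python
import ast
from dataclasses import dataclass, field

@dataclass
class GauntletResult:
    survivor: str | None = None                                   # the one solution shown to the user
    negative_knowledge: list[str] = field(default_factory=list)   # hidden dead ends, used only for pruning

def passes_syntax(code: str) -> bool:
    """L1: cheap static gate. Does the candidate even parse?"""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False

def passes_compile(code: str) -> bool:
    """L2: stand-in for a real build step (bytecode compilation here)."""
    try:
        compile(code, "<candidate>", "exec")
        return True
    except Exception:
        return False

def passes_tests(code: str) -> bool:
    """L3: stand-in for running the project's test suite in a sandbox."""
    scope: dict = {}
    try:
        exec(code, scope)                  # a real system would use an isolated sandbox, not exec()
        return scope["add"](2, 3) == 5     # one toy unit test
    except Exception:
        return False

def run_gauntlet(candidates: list[str]) -> GauntletResult:
    result = GauntletResult()
    for i, code in enumerate(candidates):
        for stage, check in (("syntax", passes_syntax),
                             ("compile", passes_compile),
                             ("tests", passes_tests)):
            if not check(code):
                # Recorded to prune the search, never surfaced as a notification.
                result.negative_knowledge.append(f"scout {i} pruned at {stage}")
                break
        else:
            result.survivor = code         # first candidate to survive every gate
            break
    return result

if __name__ == "__main__":
    scouts = [
        "def add(a, b) return a + b",      # fails syntax
        "def add(a, b): return a - b",     # compiles, fails the toy test
        "def add(a, b): return a + b",     # survives the gauntlet
    ]
    outcome = run_gauntlet(scouts)
    print("Surfaced:", outcome.survivor)
    print("Hidden dead ends:", len(outcome.negative_knowledge))
```

The design choice the sketch is meant to highlight: failures are data for the search, not output for the human.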

André J

Love the "infinite possibility swarm maze runner approach" to finding the best possible solution. However, finding the best possible solution often involves intersecting trial and error with testing / logging, not just simulating theoretical solutions. Miss that insight on step 5, and all future paths break, but you'd never know about it. So the presented optimal solution in the end is not only wrong, it's probably very wrong and not even remotely related to the initial ask. This might work for simple stuff that's easy to simulate, but hard problems are often hard because they are intertwined. But I guess this is more for solving hard problems that are solvable in "one-shot" attempts. That being said, this is really cool! What are some problems you have seen it solve that blew your mind? That normal AI agentic coders wouldn't even come close to solving.

Santosh Arron

@conduit_design You nailed the core problem with 'Chain of Thought'—it’s just 'Chain of Hallucination' if you don't execute.

To clarify: We don't just simulate. We execute.

Every 'Scout' in the swarm is running inside an ephemeral sandbox with access to the compiler and runtime. As detailed in the Horizon Paper (Section 3.1), agents 'write, compile, fail, debug, and iterate in real-time.'

If a Scout writes code that looks correct but fails the build, that branch is pruned instantly. We treat the compiler error as a negative reward signal.
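
As a rough sketch of what 'compiler error as a negative reward signal' means mechanically (illustrative only; the build command and the reward values are hypothetical stand-ins, not the D3 Engine's actual scoring):

```python
import subprocess
import sys
import tempfile
from pathlib import Path

def score_candidate(patch: str, build_cmd: list[str]) -> float:
    """Apply a candidate patch inside an ephemeral sandbox and turn the build
    outcome into a reward: negative prunes the branch, positive keeps it alive."""
    with tempfile.TemporaryDirectory() as sandbox:
        workdir = Path(sandbox)
        (workdir / "candidate.py").write_text(patch)
        build = subprocess.run(build_cmd + ["candidate.py"],
                               cwd=workdir, capture_output=True, text=True)
        if build.returncode != 0:
            # The compiler's own error output is the pruning signal.
            return -1.0
        return +1.0

if __name__ == "__main__":
    # "python -m py_compile" stands in for a real project build.
    good = "def greet(name):\n    return f'hello {name}'\n"
    bad = "def greet(name)\n    return f'hello {name}'\n"
    for patch in (good, bad):
        print(score_candidate(patch, [sys.executable, "-m", "py_compile"]))
```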

Two examples of where this 'Execution Search' blew our minds vs. standard linear agents:

1. The 'Circular Dependency' Refactor: We asked it to refactor a core utility used by 50+ files.

  • Linear Agents: Fix File A → Break File B → Fix File B → Break File A. (Infinite Loop).

  • Dropstone: The swarm spawned agents to traverse the dependency graph. It held the state in 'Latent History' and only committed the change once the entire dependency tree compiled successfully (L3 Verification).

2. The 'Heisenbug' Hunt (Race Conditions): We had a bug that only appeared 1% of the time.

  • Linear Agents: Ran the code once, saw it passed, and said 'Fixed!'

  • Dropstone: We forced the swarm to run 'Property-Based Testing' (Page 6 of D3 Paper). It spawned 100 tiny scouts with randomized inputs (fuzzing) until one triggered the crash, then back-propagated that failure vector to the main solution.
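
A toy version of that fuzzing step, to make the idea concrete. The flaky function and the inputs are invented for illustration; the real swarm runs property-based tests against your code, not against this toy:

```python
import concurrent.futures
import random

def flaky_parse(payload: str) -> int:
    """Stand-in for code with a rare 'Heisenbug': it only crashes when the
    payload happens to contain two adjacent commas."""
    if ",," in payload:
        raise ValueError("empty field")    # the 1%-of-the-time failure
    return len(payload.split(","))

def fuzz_once(seed: int) -> tuple[int, str] | None:
    """One tiny scout: build a randomized input and report it if it crashes."""
    rng = random.Random(seed)
    payload = ",".join(rng.choice(["a", "b", ""]) for _ in range(rng.randint(1, 8)))
    try:
        flaky_parse(payload)
        return None
    except Exception:
        return seed, payload               # the failure vector to back-propagate

if __name__ == "__main__":
    # Spawn 100 fuzzing scouts in parallel; collect whatever crashes.
    with concurrent.futures.ThreadPoolExecutor(max_workers=16) as pool:
        failures = [f for f in pool.map(fuzz_once, range(100)) if f]
    print("crashing inputs found:", failures[:3])
```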

Great insight on the testing/logging necessity. That is exactly why we built the D3 Engine.

André J

@santosharron Yes, but this works in confined prototype-level scopes. For IRL projects there are usually more moving parts that don't fit into the end-to-end testable sandboxes required for this infinite-variation algo. My point is, sell this for the correct use case: not production code in thorny, complex projects (most projects 😅), but rather prototypes, experiments, and simpler production scopes. The value alone in that is tremendous. Try to sell this as a silver bullet and users automatically think these guys are overselling, and then go back to Claude Code. 😅 Users look for what breaks this. You sell 10-20 innovations but breeze over the core issue: that production code doesn't fit into your rails. So my suggestion is: sell to the correct user first. Show 3 cases where this excels, and be honest about what it doesn't work well with. That way, I'm inclined to invest time and convert. Like Steve Jobs kept saying:

• Who is our target customer? 👈
• Why do they choose our product over others?
• How do we reach them?

Santosh Arron

@conduit_design This is the highest-signal feedback we've received all day. You are absolutely right. Selling this as a 'Silver Bullet' for messy, entangled production code is a trap.

You nailed the 'Sandbox Constraint.' We can't spin up a proprietary AWS VPC or a 500GB production database inside a container to verify a change. That is physics.

So, to answer your questions directly (and honestly):

1. Who is the Target Customer? Not the 'Greenfield Prototyper.' It is actually the Senior Engineer doing 'Surgery' on Legacy Code. The person who needs to refactor a core utility used by 50 files and is terrified of the ripple effects.

2. Why us over Claude 4.5 Opus? Claude is a "Linear Writer." It builds one brick at a time. Even with sub-agents, it acts like a Manager who has to read everyone's reports one by one, bottlenecking everything through a single context window. We use a "Parallel Swarm." We spawn 10,000 agents to build or refactor simultaneously. Agent A handles the DB, Agent B handles the API, Agent C handles the Types. Crucially, they talk to each other. If Agent A changes the schema, Agent B updates the API logic instantly. We verify the entire system state at once.

3. The 4 Use-Cases where we excel:

  • Impact Analysis (The "Ripple Effect"): "I changed a core utility function. Did I just break the dashboard?" Linear chat models guess. Our swarm traces the actual dependency graph in parallel to list exactly which 5 obscure files will fail to compile before you commit (see the sketch after this list).

  • Atomic Full-Stack Refactors: "Migrate this feature from REST to GraphQL." Instead of doing it file-by-file (where the frontend typically breaks because it's using the old API), we spawn agents to update the Database Schema, API Resolvers, and Frontend Types simultaneously. They communicate changes in real-time so the stack stays in sync.

  • Concurrency Debugging (Heisenbugs): "It fails in CI but works on my machine." We use the swarm to fuzz-test code paths in parallel, forcing race conditions and state desyncs that a linear code generator would never "see" because it doesn't execute the code.

  • Mass Test Generation: "Write the 50 edge-case tests I’m too lazy to write." We don't just write "happy path" tests. The swarm generates adversarial inputs (nulls, overflows, weird strings) to prove your logic handles the edge cases.
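
To make the 'Impact Analysis' case concrete, here is a tiny, hypothetical sketch of a reverse-dependency walk. The file names and the graph are invented; a real analysis parses the codebase rather than reading a hard-coded dict:

```python
from collections import defaultdict, deque

# Hypothetical module -> imports edges; a real tool would extract these from the repo.
DEPENDS_ON = {
    "dashboard.py": ["charts.py", "api.py"],
    "charts.py":    ["utils.py"],
    "api.py":       ["utils.py", "auth.py"],
    "auth.py":      ["utils.py"],
}

def impacted_by(changed: str) -> set[str]:
    """Walk the reverse dependency graph: everything that (transitively)
    imports the changed file is at risk of failing to compile."""
    reverse = defaultdict(set)
    for module, deps in DEPENDS_ON.items():
        for dep in deps:
            reverse[dep].add(module)

    at_risk, pending = set(), deque([changed])
    while pending:
        node = pending.popleft()
        for dependent in reverse[node]:
            if dependent not in at_risk:
                at_risk.add(dependent)
                pending.append(dependent)
    return at_risk

if __name__ == "__main__":
    print(sorted(impacted_by("utils.py")))
    # -> ['api.py', 'auth.py', 'charts.py', 'dashboard.py']
```

Each branch of that walk can be handed to a separate agent, which is where the parallelism pays off.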

Where we fail (The Honest Truth):

  • UI/Visual Work: We can't 'see' if a CSS transition feels janky.

  • Hardware/Driver Code: We can't mock specific hardware interrupts.

  • 'God Object' Monoliths: If the code requires a live connection to a legacy mainframe to compile, we can't help you.

We are positioning as a 'Logic Verification Engine' rather than a 'Full Stack Deployer.' Does that distinction land better with your experience?

André J

@santosharron It's cool. I think there are many times you need this, especially when you refactor successful, mature projects. You can go back and revisit parts you never had time to optimize. Backlog warrior. Even if you find 1 optimization out of 50 attempts, your overall product becomes 0.5% better, which can mean a lot in a mature product. Have you considered making this a VS Code extension? So I could use Antigravity or Cline, and then drop into Dropstone easily in the same IDE I already work in, no pun intended, when I stumble on a task that could use brute-force swarm help. I also use 3 VS Code IDEs at the same time running different agents, so this would really enhance that workflow as well.

Dakota Burrow
Monte Carlo meets AI models? Fascinating. I have so many questions. Is the output the set of all scouts, or do they converge to a consensus? From your experience, how many scouts do you actually need? i.e., how different is scout from scout? This seems very compute-heavy. Any optimizations you've found? In general, love the concept.
Santosh Arron

@dakota_burrow Spot on with the Monte Carlo analogy, Dakota. That was actually our internal mental model during development (shifting from 'Next Token' prediction to 'Trajectory Search').

To answer your questions:

1. Convergence: It converges. We use a 'Flash-Gated Consensus' (Page 6 of the Horizon paper). We don't average the outputs; instead, if a Scout passes the unit tests (L3 Verification), it emits a 'Flash Signal' that freezes the swarm and promotes that single winning state to the Frontier Model.

2. Scout Diversity: It varies by task entropy. For simple refactors, ~50 scouts usually suffice. For complex architectural queries, we've seen the swarm spike to 2,000+ branches to find the 'P < 0.05' edge case. We force diversity by randomizing the temperature and system prompt slightly for each Scout batch.

3. Optimizations: This is the critical part. If we ran 10k GPT-4 agents, we'd go bankrupt in an hour. The Fix: We use Heterogeneous Routing (D3 Paper, Page 5). 98% of the 'Scouts' are cheap 8B models (Llama 3 or Haiku). We only pay for the heavy compute (Opus/Sonnet) when a path is already verified.
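
For the curious, here is a toy, single-process sketch of how a flash gate (point 1) and heterogeneous routing (point 3) can fit together. Every name and number is hypothetical; this illustrates the pattern, not the D3 Engine itself:

```python
import concurrent.futures
import random
import threading

# Hypothetical cost split: cheap models do the scouting, the expensive model
# is only invoked once a path has already been verified.
CHEAP_MODEL, EXPENSIVE_MODEL = "scout-8b", "frontier-large"

flash = threading.Event()                          # the "flash signal" that freezes the swarm

def run_scout(task: str, seed: int) -> str | None:
    """One scout: cheap model, slightly randomized settings, bails out early
    if another scout has already flashed."""
    rng = random.Random(seed)
    temperature = 0.2 + rng.random() * 0.8         # forced diversity per scout
    if flash.is_set():
        return None                                # swarm already frozen
    candidate = f"{task} @ T={temperature:.2f} via {CHEAP_MODEL}"
    passed_verification = rng.random() < 0.05      # stand-in for L3 verification
    if passed_verification and not flash.is_set():
        flash.set()                                # freeze the swarm
        return candidate
    return None

def flash_gated_consensus(task: str, n_scouts: int = 200) -> str | None:
    flash.clear()
    with concurrent.futures.ThreadPoolExecutor(max_workers=32) as pool:
        winners = [w for w in pool.map(lambda s: run_scout(task, s), range(n_scouts)) if w]
    if not winners:
        return None
    # Ties (two scouts passing in the same instant) are broken by scout order.
    # Only now is the expensive model paid for: it promotes the verified state.
    return f"{EXPENSIVE_MODEL} refines: {winners[0]}"

if __name__ == "__main__":
    print(flash_gated_consensus("refactor auth util"))
```

Note there is no averaging anywhere: the first verified state wins, everything else is discarded.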

Great questions. Let me know if you run into any bottlenecks on the local runtime.

yama

The divergent trajectory search approach is interesting. I work on a project that involves fetching and parsing content from various sources, and context management is always a challenge. How does Horizon Mode handle cases where the external data format changes unexpectedly? Does the swarm adapt mid-task?

Santosh Arron

@yamamoto7 Yes, the swarm adapts mid-flight. It works via a process we call 'Fail-Fast, Broadcast-Instantly':

  1. The Canary Fails: Agent #1 tries to parse the content using the old logic and hits an error (or a high-perplexity return).

  2. Vectorized Failure: Instead of just retrying blindly, that agent broadcasts a 'Constraint Embedding' to the Hive Mind, effectively saying: 'The <div> class is no longer content-body.'

  3. Global Pruning: The D3 Engine instantly prunes all other agents attempting the old path.

  4. The Pivot: The swarm automatically respawns new agents with the specific objective: 'Infer the NEW data structure.'

In a linear model (like standard Claude/GPT), the bot often hallucinates data to fit the old schema. Because our agents share state via Distributed Knowledge Sharing, once one agent sees the change, the whole swarm 'knows' the format has shifted.
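
A toy illustration of that fail-fast/broadcast pattern in plain Python threads. The selectors and the shared set are made up; the real engine broadcasts constraint embeddings rather than strings:

```python
import queue
import threading

# Shared "hive mind": constraints broadcast by failed agents. In the real
# system this would be an embedding store; here it is a thread-safe set.
constraints: set[str] = set()
constraints_lock = threading.Lock()

OLD_SELECTOR, NEW_SELECTOR = "div.content-body", "div.article-body"

def fetch(page: int) -> str:
    # The site has silently changed its markup: only the new selector exists now.
    return f"<html><{NEW_SELECTOR}>page {page}</div></html>"

def parse_with(selector: str, html: str) -> str:
    if selector not in html:               # toy "parser": plain containment check
        raise ValueError(f"selector {selector!r} not found")
    return f"page parsed via {selector}"

def agent(page: int, results: queue.Queue) -> None:
    with constraints_lock:
        old_path_dead = OLD_SELECTOR in constraints
    html = fetch(page)
    if not old_path_dead:
        try:
            results.put(parse_with(OLD_SELECTOR, html))
            return
        except ValueError:
            # Fail fast, broadcast instantly: tell every other agent the old
            # path is dead so nobody wastes a retry on it.
            with constraints_lock:
                constraints.add(OLD_SELECTOR)
    # The pivot: fall back to the inferred new structure.
    results.put(parse_with(NEW_SELECTOR, html))

if __name__ == "__main__":
    results: queue.Queue = queue.Queue()
    threads = [threading.Thread(target=agent, args=(p, results)) for p in range(8)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    while not results.empty():
        print(results.get())
```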

yama

@santosharron 

That's a clever approach. The fail-fast broadcast pattern makes sense for handling unpredictable content changes. Thanks for the detailed explanation.

Jeppe Nørregaard
This sounds wild - I would love to try it. Any Linux build on its way? 😁🙏
Santosh Arron

@northguard Honestly, we are dying to ship the Linux build, but we are currently holding it back for one specific reason: Sandboxing Integrity.

Since Dropstone agents actually compile and execute code in the background, we rely on strict kernel-level syscall filtering to prevent 'Instrumental Convergence' risks (basically, preventing an agent from accidentally running rm -rf / to solve a disk space error).

Standardizing that 'Adversarial Oversight' layer across every Linux distro’s seccomp/BPF configuration is proving to be a non-trivial challenge. We refuse to ship a recursive swarm that executes code without a mathematically verified containment field.
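
For a flavor of what process-level containment means, here is a best-effort sketch using POSIX resource limits (Linux-oriented and illustrative only; rlimits are a much weaker stand-in for the seccomp/BPF syscall filtering described above, and this is not our actual sandbox):

```python
import resource
import subprocess
import sys
import tempfile

def restrict() -> None:
    """Runs in the child between fork and exec: a crude containment field."""
    for limit, value in ((resource.RLIMIT_CPU, 2),           # 2 seconds of CPU
                         (resource.RLIMIT_AS, 256 * 2**20),   # 256 MB of address space
                         (resource.RLIMIT_NOFILE, 64)):       # few open files
        try:
            resource.setrlimit(limit, (value, value))
        except (ValueError, OSError):
            pass                                              # best effort: some platforms reject some limits

def run_scout_code(code: str) -> subprocess.CompletedProcess:
    """Execute untrusted scout code in a throwaway directory with limits applied."""
    with tempfile.TemporaryDirectory() as sandbox:
        return subprocess.run([sys.executable, "-c", code],
                              cwd=sandbox, preexec_fn=restrict,
                              capture_output=True, text=True, timeout=10)

if __name__ == "__main__":
    print(run_scout_code("print('scout ran inside the limits')").stdout)
```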

It is top of our roadmap, but we won't release it until the sandbox is impenetrable.

Jeppe Nørregaard
@santosharron sounds reasonable 😅 Is there some community work on this? Are Cursor, Antigravity, or other groups working on this level of safety for their agents? I hope we get past this hurdle soon 😁
Santosh Arron

@northguard This is a great question that cuts through the hype. The short answer is: The industry is splitting into two distinct "Camps," and we are solving different variables.

1. The "Velocity" Camp (Cursor, Windsurf, Antigravity)

  • Their Goal: Latency (Speed). "How fast can I get the user a code snippet?"

  • Their Architecture: They are optimizing the Linear Loop. They are incredible "Exoskeletons" for the developer - making you write faster.

  • The Trade-off: To stay fast (instant replies), they cannot afford to spin up 10,000 agents to verify a race condition. That would take too long for a "Chat" interface. They rely on "Context Economics" where speed is king.

2. The "Integrity" Camp (Dropstone/Horizon Mode)

  • Our Goal: Fidelity (Safety). "Did we just break the build?"

  • Our Architecture: We optimize the Recursive Swarm. We trade Latency for Verification. We are okay taking 5 minutes (or 5 hours) to find a bug that would take you 3 days to debug.

  • The Tech: While they focus on better prompting, we focus on "Adversarial Oversight" - literally spawning agents whose only job is to try and hack/break the code the other agents wrote.

"Is anyone else working on this?"

Academically? Yes. The research on "Instrumental Convergence" (agents cheating to solve tasks) is growing.

Commercially? Most tools are still in the "Copilot" phase (Assistant). We are pioneering the "Autopilot" phase (Verification Engine).

The Future:

I believe you will eventually use Cursor to Draft (Velocity) and Dropstone to Merge (Integrity). We aren't trying to be your "fast" typer; we are your "thorough" reviewer.

Alex Cloudstar

10k agents in one tab sounds nuts. Horizon Mode for planning + the entropy thing for hallucinations… if it actually trims dumb paths before they happen, that’s a win. Curious how heavy it is on my laptop and if it plays nice with VS Code. Saving to poke later.

Santosh Arron

@alexcloudstar Valid concern! 10k agents running locally would definitely be a fire hazard.

For this beta, we actually run the entire Scout swarm via our API to keep it lightweight. Our servers do the heavy lifting, so your laptop executes zero inference. We're adding a fully local option later for the privacy diehards, but right now we eat the compute cost so your machine stays cool.

As for VS Code - it's built on the same open-source core, so all your extensions and themes work out of the box. No need to 'switch' really.

Give the entropy graph a look, it’s honestly pretty satisfying to watch the bad paths get killed off.

Liudas Jankauskas
Interesting approach. One thing I’m curious about: how do you make failures visible? When a scout prunes a path, do I see what failed? And why? And what caused it? Because without explicit failure signals, a system like this can look impressive but hard to trust.
Santosh Arron

@liudas You nailed the 'Black Box' dilemma. If you don't see the dead ends, you can't trust the destination.

To be 100% transparent: In the current Beta, this visibility is high-level. You see the 'Pruned' and 'Dead End' counters in the dashboard so you know that work is happening, but you can't always drill down into the specific stack trace of every failed scout yet.

However, the Engine is already capturing this data. We use what we call 'Constraint Embeddings' (or Negative Knowledge). When a Scout fails—say, it induces a race condition—it vectorizes that failure and broadcasts it so other agents don't repeat it.

The Goal for the Stable Release: We are building a 'Pruning Inspector.' Instead of just seeing '5 Pruned Paths,' you will be able to click a 'Dead End' node and see the 'Tombstone': 'Pruned because: Solution failed L2 Static Analysis (Buffer Overflow) at step 4'.
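
As a sketch of what such a Tombstone record could carry (the field names here are illustrative, not the final schema):

```python
from dataclasses import dataclass
from enum import Enum

class Stage(Enum):
    L1_SYNTAX = "syntax check"
    L2_STATIC = "static analysis"
    L3_TESTS = "test execution"

@dataclass(frozen=True)
class Tombstone:
    """What a 'Dead End' node keeps around so a pruned path stays auditable."""
    scout_id: int
    stage: Stage       # which gate the candidate died at
    step: int          # how deep into the trajectory it got
    reason: str        # human-readable cause, surfaced in the Pruning Inspector

    def summary(self) -> str:
        return f"Pruned because: failed {self.stage.value} ({self.reason}) at step {self.step}"

if __name__ == "__main__":
    t = Tombstone(scout_id=405, stage=Stage.L2_STATIC, step=4, reason="buffer overflow")
    print(t.summary())   # -> Pruned because: failed static analysis (buffer overflow) at step 4
```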

We want this to be a 'Glass Box,' not a magic wand. You should be able to audit the failure as much as the success.