Selçuk Kızıltuğ

DecisionBox Fine-Tuning - A model trained on your data that runs on your infra

Autonomous AI discovery on your data warehouse, fine-tuned to your business: your schema, your terminology, and your analysis patterns. Training runs on your GPUs. Combined with Ollama, zero bytes ever leave your network — from the first query to the final finding.

Selçuk Kızıltuğ

Hey PH,

Most "AI on your data" tools have a ceiling. The model doesn't know what your schema means, doesn't understand your internal terminology, and surfaces findings that are generically correct but contextually wrong.

Fine-tuning is how you close that gap.

DecisionBox Fine-Tuning lets you train an open-source base model on your own schema, your own terminology, and the analysis patterns that actually matter for your business.

Training runs on your GPUs, inside your network. Combined with Ollama, zero bytes ever leave your environment — from the first query to the final finding.
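To make "zero bytes leave your network" concrete, here's a minimal sketch of a discovery query hitting a locally served model through Ollama's REST API. The model name and prompt are placeholders; everything stays on localhost.

```python
# Minimal sketch: query a locally served model through Ollama's REST API.
# "decisionbox-sql" is a placeholder for your fine-tuned model loaded into Ollama;
# nothing here leaves localhost.
import json
import urllib.request

payload = {
    "model": "decisionbox-sql",
    "prompt": "List the top 10 customers by net revenue last quarter.",
    "stream": False,
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",   # Ollama's default local endpoint
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```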

Three steps: pick an open-source base model → train it on your data → run it inside your network.
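Here's a rough sketch of what the middle step can look like with a Hugging Face + PEFT + TRL stack. The base model, file path, and hyperparameters are illustrative, not the exact DecisionBox pipeline, and the TRL API varies a bit across versions.

```python
# Rough sketch of step 2: LoRA fine-tune an open base model on exported runs.
# Assumes datasets / peft / trl; model choice and hyperparameters are illustrative only.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Exported runs, one prompt/completion pair per line (hypothetical path).
dataset = load_dataset("json", data_files="decisionbox_export.jsonl", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-14B-Instruct",        # any open base model: Llama, Mistral, Qwen...
    train_dataset=dataset,
    peft_config=LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM"),
    args=SFTConfig(
        output_dir="decisionbox-sql-lora",
        num_train_epochs=2,
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        learning_rate=2e-4,
    ),
)
trainer.train()
trainer.save_model("decisionbox-sql-lora")    # merge/convert to GGUF to serve via Ollama
```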

A fine-tuned 14B model running on your own hardware can match Claude or GPT-4 for your specific task at a fraction of the cost. When you're running hundreds of queries per discovery, that math matters.
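A back-of-envelope version of that math (every number below is an assumption for illustration, not a benchmark):

```python
# Back-of-envelope cost per discovery: frontier API vs. self-hosted fine-tuned model.
# Every number here is an illustrative assumption, not a measurement.
queries_per_discovery = 300          # "hundreds of queries per discovery"
tokens_per_query = 4_000             # prompt + completion, assumed
api_cost_per_mtok = 10.0             # assumed blended $/1M tokens for a frontier API
gpu_cost_per_hour = 2.0              # assumed cost of a GPU hour on your own hardware
discovery_runtime_hours = 0.5        # assumed wall-clock time per discovery

api_cost = queries_per_discovery * tokens_per_query / 1_000_000 * api_cost_per_mtok
self_hosted_cost = discovery_runtime_hours * gpu_cost_per_hour

print(f"frontier API per discovery: ${api_cost:.2f}")        # ~$12.00 with these assumptions
print(f"self-hosted per discovery:  ${self_hosted_cost:.2f}")  # ~$1.00 with these assumptions
```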

Happy to answer questions here on the product, the architecture, or anything else.

Can Abacigil

Hey Product Hunt, Can here (CTO).


Quick note on who this feature is for, because it's not everyone.

If you're running a handful of discoveries a month, stay on Claude or GPT. The economics don't make sense.

Fine-tuning is for the companies already running DecisionBox at real volume. Hundreds of discoveries a month, big warehouses, token bills starting to hurt. At that point you have two problems: cost per run keeps going up, and your agent still doesn't really know your schema on discovery number 500.

What DecisionBox already captures solves both. Every query, every SQL error and its fix, every validated insight, every thumbs-up from your team. Structured, labeled, verified against your warehouse. Not scraped, not synthetic - your actual runs.

Export it, train an open model (Llama, Mistral, Qwen, whatever), and you get something a generic frontier model can't give you: a small model that actually knows your warehouse schema, your dialect quirks, and what your team cares about. Run it on your own hardware. Frontier API latency and cost, gone.
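To give a feel for it, here's a hypothetical sketch of what an exported run can turn into as training data. The field names and paths are made up for illustration, not the actual export schema.

```python
# Hypothetical illustration of turning captured runs into fine-tuning examples.
# Field names and file paths are made up; the real export format may differ.
import json

captured_runs = [
    {
        "question": "Weekly active customers by region, trailing 8 weeks",
        "sql": "SELECT region, DATE_TRUNC('week', created_at) ...",
        "error": None,
        "fix": None,
        "validated": True,          # analyst gave the finding a thumbs-up
    },
    {
        "question": "Orders per day, last 30 days",
        "sql": "SELECT DATEDIFF(day, order_ts, GETDATE()) ...",
        "error": "function dateadd(...) does not exist",
        "fix": "SELECT DATEADD(day, -30, GETDATE()) ...",   # the self-healed query
        "validated": True,
    },
]

with open("decisionbox_export.jsonl", "w") as f:
    for run in captured_runs:
        # Train on the query that actually worked: the fix if there was one, else the original.
        f.write(json.dumps({
            "prompt": run["question"],
            "completion": run["fix"] or run["sql"],
        }) + "\n")
```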

A couple of places this pays off:

A customer had six tables named some variant of "orders", three deprecated. Claude was guessing and wasting 20% of runs on dead tables. A model trained on their successful runs stops doing that.

Same customer kept hitting the same Redshift date math quirk on every discovery. The agent self-healed every time, but it cost tokens and minutes. After fine-tuning on the fix history, the error class just disappeared.