Victor Strandmoe

AI Alignment and Safety entrepreneur

Democratizing dataset influence on model performance

AI teams are data-constrained, not model-constrained, and waste millions retraining models on data that has little or even negative impact.

They spend most of their budget collecting, processing, and labeling data without knowing what actually improves performance.

This leads to repeated failed retraining cycles, wasted GPU runs, and slow iteration, because teams lack insight into which datasets improve the model and which degrade it.

Dowser - Find the right data, optimize training, ship models fast

Dowser doesn’t just clean or label data: it directly trains and benchmarks models to prove which datasets help or hurt performance. Using influence-guided training, it produces high-confidence influence scores in minutes on commodity hardware, across Hugging Face datasets. Teams get precise guidance on which data actually moves the model before spending GPU budget. Once the benchmarks are completed, you can use the app to upload your model to Hugging Face.
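To make the scores concrete, here is a rough sketch of what a dataset-level influence score means in principle. This is an illustration only, not Dowser's actual algorithm; `train_model` and `evaluate` are hypothetical stand-ins for your own training and benchmark pipeline:

```python
# Illustration only: score each candidate dataset by how much adding it
# to the training mix shifts a benchmark metric. This brute-force loop
# shows what the scores represent; influence-guided training exists to
# avoid paying for a full retrain per candidate dataset.
def influence_scores(base_mix, candidates, train_model, evaluate):
    baseline = evaluate(train_model(base_mix))
    scores = {}
    for name, dataset in candidates.items():
        metric = evaluate(train_model(base_mix + [dataset]))
        scores[name] = metric - baseline  # > 0 helps, < 0 hurts
    return scores
```

A positive score means the dataset moves the benchmark up; a negative score flags data you can drop before it burns GPU budget.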

How do you benchmark your local LLM performance? 🤔

Hey everyone!

I've been running a lot of local LLMs (Llama, Mistral) and diffusion models (via Diffusers) on my machine lately, but I always struggle to measure their performance accurately.

Usually, I just look at "tokens/sec" in the terminal, but it feels inconsistent.
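For reference, here's a minimal sketch of how I currently time generation with transformers (the model name is just an example; any local path works):

```python
import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any Hub ID or local model path works; Mistral-7B is just an example.
model_name = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Explain the transformer architecture in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Warm-up pass so first-call overhead doesn't skew the measurement.
model.generate(**inputs, max_new_tokens=16)

start = time.perf_counter()
output = model.generate(**inputs, max_new_tokens=256)
elapsed = time.perf_counter() - start

new_tokens = output.shape[-1] - inputs["input_ids"].shape[-1]
print(f"{new_tokens / elapsed:.1f} tokens/sec")
```

Even this jumps around between runs depending on prompt length, generation settings, and whatever else the GPU is doing, which is part of why I'm asking.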

How do you guys benchmark your local AI setup? Do you use any specific tools, or just rely on vibes?