DataSalon.ai - Discover datasets worth training on.
A dataset discovery platform that aggregates and AI-enriches datasets from 40+ open data sources for AI/ML practitioners.
Hey Product Hunt!
I'm the maker of DataSalon, a dataset discovery platform for ML practitioners.
The problem:
Finding the right training dataset takes hours. You open six browser tabs, read inconsistent descriptions, can't tell which sources are trustworthy, and still miss the datasets you didn't know existed. And when you truly can't find one, there's nowhere to ask.
DataSalon solves this five ways:
One place for 40+ platforms: Kaggle, Hugging Face, Zenodo, data.gov, Papers with Code and 35+ more, aggregated and deduplicated. No more tab-juggling.
Quality you can trust: Every dataset is scored on 4 dimensions (Description, Source, Reputation, Access). Spam is filtered and duplicates are merged. You only see what's earned its spot.
AI that fills the gaps: Raw metadata is messy and uneven. Our AI pipeline normalizes every dataset into a clear title, structured summary, and unified taxonomy, so 300K+ datasets finally speak the same language.
From search to discovery: Keyword search is table stakes. What you can't search for are the combinations, like "Synthetic weather data for autonomous driving perception" or "Multi-lingual legal contracts with annotations". We surface those as curated Topics, so you find what you didn't know you needed.
A community for data supply & demand: Can't find it? Post a request. Have it? Post an offer. The community bridges the gap our aggregation can't cover.
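For readers curious what "deduplicated" and "scored on 4 dimensions" could look like in practice, here is a rough sketch. It is not DataSalon's actual pipeline: the record fields, the 0 to 1 score scale, the equal weights, the title-based duplicate key, and the 0.5 cutoff are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    # Hypothetical record shape; the real schema is not described in this post.
    title: str
    platform: str
    description: float  # each dimension assumed to be scored in [0.0, 1.0]
    source: float
    reputation: float
    access: float


def quality(d: Dataset) -> float:
    """Equal-weight mean of the four dimensions (the weighting is an assumption)."""
    return (d.description + d.source + d.reputation + d.access) / 4


def dedup_key(d: Dataset) -> str:
    """Naive duplicate key: lowercased title with non-alphanumerics dropped."""
    return "".join(ch for ch in d.title.lower() if ch.isalnum())


def merge_and_rank(records, min_quality=0.5):
    """Keep the best-scoring copy of each duplicate group, drop low scores, sort."""
    best = {}
    for d in records:
        key = dedup_key(d)
        if key not in best or quality(d) > quality(best[key]):
            best[key] = d
    kept = [d for d in best.values() if quality(d) >= min_quality]
    return sorted(kept, key=quality, reverse=True)


records = [
    Dataset("Road Weather v2", "kaggle", 0.9, 0.8, 0.7, 1.0),
    Dataset("road-weather V2", "zenodo", 0.5, 0.5, 0.5, 0.5),  # same dataset, weaker listing
    Dataset("Spam set", "unknown", 0.1, 0.2, 0.1, 0.3),        # falls below the cutoff
]
ranked = merge_and_rank(records)
```

In this toy run the two "Road Weather" listings collapse into the stronger Kaggle copy and the low-scoring entry is filtered out. A production system would likely need fuzzier matching than a normalized title.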
What's next: Quality radar charts, dataset subscriptions, and an agent interface so your AI assistants can query DataSalon directly.
Shipping in beta. Feedback and "this platform is missing" shouts are all very welcome.
https://datasalon.ai