Instella

Open 3B Small LMs from AMD

Instella, from AMD, is a family of high-performance 3B language models. The model weights are released under a ResearchRAIL license and the code under the MIT license. The models were trained on AMD Instinct MI300X GPUs.

Zac Zuo
Hi everyone! Check out Instella, a new family of 3B language models from AMD. These models are interesting because they achieve performance comparable to leading open-weight models of similar size while being trained in the open.

Key points:
📈 Strong Performance: Outperforms other 3B models and is competitive with open-weight models such as Llama-3.2-3B and Gemma-2-2B.
✨ Reasoning Focus: The second stage of pretraining focused on math and reasoning.
📦 Multiple Versions: They've released pretrained, SFT (supervised fine-tuned), and DPO (Direct Preference Optimization) versions.
🔥 Trained from scratch on AMD Instinct MI300X GPUs.
🔑 Research License: The model weights are under a ResearchRAIL license, so check the terms for your specific use case. The code is open-source (MIT).

It's a great example of how much can be achieved with efficient training and a focus on specific capabilities like reasoning. The release of multiple versions is also a plus.