Instella

Open 3B Small LMs from AMD

Instella, from AMD, is a family of high-performance 3B language models. Model weights are released under a ResearchRAIL license and the code under the MIT license. Trained on MI300X GPUs.
Free

Zac Zuo
Hi everyone! Check out Instella, a new family of 3B language models from AMD. These models are interesting because they are competitive with well-known open-weight models while keeping a small 3B footprint.

Key points:
📈 Strong Performance: Outperforms other fully open models of similar size and is competitive with open-weight models such as Llama-3.2-3B and Gemma-2-2B.
✨ Reasoning Focus: The second stage of pretraining focused on math and reasoning.
📦 Multiple Versions: Pretrained, SFT (supervised fine-tuned), and DPO (Direct Preference Optimization) versions are all available.
🔥 Trained from scratch on AMD Instinct MI300X GPUs.
🔑 Research License: The model weights are under a ResearchRAIL license, so check the terms for your specific use case. The code is open-source (MIT).

It's a great example of how much can be achieved with efficient training and a focus on specific capabilities such as reasoning. The release of multiple versions is also a plus.