Launching today

Sovereign-Lila-E8
Scaling is dead. Geometry is the new Scale
Geometric Attention Transformer with the E8 Root System: Lila-E8 (Lie Lattice Attention Language Model)

The Geometry of Scale: standard transformers scale by adding more "Euclidean soup" (more parameters). Lila-E8 scales by increasing the packing density of the manifold: the 8D version crunches 40M parameters into SOTA-level performance.

🚀 Results at 200k steps:
- Model: 40M parameters
- Performance: 0.37 train loss / 0.44 val loss
- Stability: 1000+ generated tokens without semantic loops
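The announcement does not publish Lila-E8's internals, but the E8 root system it is named after is a well-defined mathematical object: 240 vectors in 8 dimensions forming the densest known sphere packing in 8D. As a minimal sketch of the geometric ingredient (not the model's actual attention code), the roots can be constructed like this:

```python
import itertools
import numpy as np

def e8_roots():
    """Construct the 240 roots of the E8 root system.

    Two families:
      - 112 roots of the form (+-1, +-1, 0, ..., 0) with the two
        nonzero entries in any pair of coordinates;
      - 128 roots of the form (+-1/2, ..., +-1/2) with an even
        number of minus signs.
    """
    roots = []
    # Integer roots: +-e_i +- e_j for i < j
    for i, j in itertools.combinations(range(8), 2):
        for si in (1.0, -1.0):
            for sj in (1.0, -1.0):
                r = np.zeros(8)
                r[i], r[j] = si, sj
                roots.append(r)
    # Half-integer roots: even number of negative signs
    for signs in itertools.product((0.5, -0.5), repeat=8):
        if sum(s < 0 for s in signs) % 2 == 0:
            roots.append(np.array(signs))
    return np.stack(roots)

R = e8_roots()
print(R.shape)  # (240, 8)
```

Every root has squared norm 2, and the roots sum to zero; a "lattice attention" scheme could, for example, score 8D query/key projections against these 240 fixed directions instead of against learned parameters, which is one plausible reading of "packing density of the manifold" above.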

