SCAO - Optimizer
I built a 2nd-order optimizer for LLMs.
54% faster LLM training. SCAO is a sparse, second-order PyTorch optimizer designed as a high-throughput, drop-in replacement for AdamW. - whispering3/scao

Hi everyone! Danilo here, the creator of SCAO.
The journey from a rejected pull request to a working standalone tool taught me a lot about the gap between academic papers and what actually works on the GPUs we have at home.
I built this because I wanted to see if we could make "expensive" math affordable for devs with modest setups.
Some quick tips for testing:
- Grab scao.py from the repo.
- If you have <8GB VRAM, use the train_local.py example (it uses LoRA).
- If you want to see raw speed, try train_1m.py.
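
The swap itself is meant to be one line. Here's a minimal sketch, assuming scao.py exposes a SCAO class with an AdamW-style constructor; train_local.py and train_1m.py show the exact signatures:

    import torch
    from scao import SCAO  # assumption: scao.py exports a SCAO class

    model = torch.nn.Linear(512, 512)  # stand-in for your LLM

    # Before: optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.01)
    optimizer = SCAO(model.parameters(), lr=3e-4, weight_decay=0.01)

    for _ in range(100):
        x = torch.randn(32, 512)
        loss = model(x).pow(2).mean()  # dummy objective
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()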
I'll be around all day to answer questions about the preconditioner math, the INT8 implementation, or just to chat about LLM architecture.
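
For intuition on the preconditioner question: a diagonal second-order method rescales each gradient coordinate by an estimate of local curvature, so flat directions take larger steps and sharp ones take smaller steps. A generic sketch of that idea (not SCAO's exact math; the sparse and INT8 details are in scao.py):

    import torch

    # Generic diagonal second-order update, for intuition only.
    def diag_preconditioned_step(p, g, h_diag, lr=1e-3, eps=1e-8):
        # h_diag estimates the Hessian diagonal (per-coordinate curvature);
        # dividing by it shrinks steps in sharp directions, grows them in flat ones.
        p.add_(g / (h_diag.abs() + eps), alpha=-lr)
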
Can't wait to hear your feedback!