If you've ever tried to fine-tune an LLM locally, you know the "CUDA out of memory" heartbreak.
I wanted the convergence speed of second-order optimizers (like Shampoo), but those methods usually blow past consumer GPU memory because they have to store and invert large preconditioner matrices.
The result is SCAO (whispering3/scao): a sparse, second-order PyTorch optimizer designed as a high-throughput, drop-in replacement for AdamW, claiming 54% faster LLM training.
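To make "drop-in replacement" concrete, here is a minimal sketch of what that interface contract means. None of the class names or update rules below come from the SCAO repo; `ToyAdamW` and `ToySCAO` are hypothetical stand-ins that only illustrate the `torch.optim`-style surface (construct from params, `zero_grad()`, `step()`) a drop-in optimizer has to match, using plain Python dicts instead of tensors:

```python
class ToyAdamW:
    """Stand-in mimicking the torch.optim.AdamW call surface (not its math)."""
    def __init__(self, params, lr=1e-3, weight_decay=0.01):
        self.params = list(params)  # each param: {"value": float, "grad": float}
        self.lr = lr
        self.weight_decay = weight_decay

    def zero_grad(self):
        for p in self.params:
            p["grad"] = 0.0

    def step(self):
        for p in self.params:
            # decoupled weight decay, then a plain gradient step
            p["value"] -= self.lr * self.weight_decay * p["value"]
            p["value"] -= self.lr * p["grad"]


class ToySCAO(ToyAdamW):
    """Hypothetical drop-in: same constructor and methods, different update."""
    def step(self):
        for p in self.params:
            # placeholder for a sparse, curvature-aware update rule
            p["value"] -= self.lr * p["grad"] / (abs(p["grad"]) + 1e-8)


def train(optimizer_cls):
    """Toy loop minimizing f(x) = x^2; only the optimizer class changes."""
    params = [{"value": 1.0, "grad": 0.0}]
    opt = optimizer_cls(params)  # identical construction either way
    for _ in range(3):
        opt.zero_grad()
        params[0]["grad"] = 2.0 * params[0]["value"]  # d/dx of x^2
        opt.step()
    return params[0]["value"]


# Swapping ToyAdamW -> ToySCAO requires no other changes to the loop:
x_adamw = train(ToyAdamW)
x_scao = train(ToySCAO)
```

The point is that the training loop never mentions the optimizer's internals, so replacing one class name with the other is the entire migration.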