Prashanth Manohar started a discussion
~1s cold start for a 32B model.
Most setups we’ve seen fall into two buckets:
• multi-second to minute-long cold starts (model load + init)
• keeping GPUs warm to avoid that

We’ve been experimenting with restoring initialized model state instead of reloading weights. This demo shows a ~1s cold start for a 32B model: https://youtu.be/G8DsbS1mcwo
