s1

Matching o1-preview with Only 1000 Examples

s1 is a simple recipe for test-time scaling of LLMs, achieving reasoning performance comparable to o1-preview using only 1,000 training examples and a technique called budget forcing. The model, data, and code are all open-source.


Zac Zuo

Hi everyone,

Sharing s1, a new approach to improving LLM performance at test time. This work comes from researchers at Stanford University and the University of Washington, and offers some exciting results:

- Strong Reasoning: Matches the performance of larger models (like o1-preview) on reasoning tasks.
- Minimal Data: Achieves this with only 1,000 examples (insane, right?).
- Budget Forcing: Uses a novel "budget forcing" technique during inference.
- Test-Time Scaling: Improves performance without retraining the model.
- Fully Open-Source: Model, data, and code are all available.