I built a framework where AI agents search for market inefficiencies and optimize risk-adjusted returns through an agentic loop:
(1) propose a hypothesis with economic justification,
(2) iterate on historical data (2010–2016),
(3) validate on out-of-sample periods (2017–2021).
No hyperparameter tuning is allowed once a strategy enters validation, which forces the system to commit to its hypothesis up front, the way a researcher would, instead of fitting to out-of-sample data.
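
The date splits and the no-tuning rule above could be enforced along these lines. This is only a minimal sketch, not the framework's actual code: `SPLITS`, `split_for`, `Strategy`, `tune`, and `enter_validation` are hypothetical names, and the exact calendar-day boundaries of each period are an assumption (the post gives years only).

```python
from datetime import date
from types import MappingProxyType

# Year ranges come from the post; day-level boundaries are assumed.
SPLITS = {
    "train": (date(2010, 1, 1), date(2016, 12, 31)),
    "validation": (date(2017, 1, 1), date(2021, 12, 31)),
    "holdout": (date(2022, 1, 1), date(2025, 12, 31)),
}


def split_for(day: date) -> str:
    """Return the name of the split that contains the given day."""
    for name, (start, end) in SPLITS.items():
        if start <= day <= end:
            return name
    raise ValueError(f"{day} is outside every split")


class Strategy:
    """Hypothetical container: a hypothesis plus tunable parameters."""

    def __init__(self, hypothesis: str, params: dict):
        self.hypothesis = hypothesis
        self._params = dict(params)
        self._frozen = False

    @property
    def params(self):
        # Read-only view so callers cannot bypass tune().
        return MappingProxyType(self._params)

    def tune(self, **updates):
        """Change parameters; allowed only before validation starts."""
        if self._frozen:
            raise RuntimeError("parameters are locked once validation starts")
        self._params.update(updates)

    def enter_validation(self):
        """Lock the parameters for the out-of-sample and holdout runs."""
        self._frozen = True
```

Calling `tune()` after `enter_validation()` raises an error, so an agent cannot quietly refit a strategy against the 2017–2021 or 2022–2025 data.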
Holdout results (2022–2025): Sharpe 0.86 vs. 0.67 benchmark, 28.1% turnover, 11.4% max drawdown.
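
For readers who want to reproduce numbers like these on their own return series, here is a rough sketch of two of the metrics. The post does not say how they were computed, so everything here is an assumption: daily simple returns, 252 periods per year, a zero risk-free rate by default, and the function names `sharpe_ratio` and `max_drawdown`.

```python
import math
import statistics


def sharpe_ratio(returns, periods_per_year=252, risk_free_per_period=0.0):
    """Annualized Sharpe ratio from per-period simple returns."""
    excess = [r - risk_free_per_period for r in returns]
    volatility = statistics.stdev(excess)  # sample standard deviation
    if volatility == 0:
        raise ValueError("returns have zero volatility")
    return statistics.mean(excess) / volatility * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough loss of the compounded equity curve, as a fraction."""
    equity = 1.0
    peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst
```

Turnover is left out on purpose, since its definition (one-way vs. two-way, per rebalance vs. annualized) varies and the post does not specify which one it uses.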