Validated Backtest Results

See how each of the 15 portfolio strategies performed in out-of-sample testing: returns, risk metrics, and consistency scores, all measured on data that was not used to train the models.

Start with the Summary tab for the big picture, then drill into individual strategies. Green rows beat the S&P 500 benchmark. New here? Start with the Getting Started Guide or check the Glossary for unfamiliar terms.

What's on this page

Summary — Test period performance table (CAGR, Sharpe, Max DD, Calmar, Win Rate, MC Grade, Deploy Score) and CAGR bar chart for all 15 variants
Annual Returns — Year-by-year test period returns (2020–2025) per variant with cross-year average
Monthly Breakdown — Expected monthly return heatmap, monthly volatility heatmap, annual returns context table, cross-variant monthly bar chart, and good-vs-bad month expectation cards
Monte Carlo — 10K-simulation robustness results: CAGR percentiles (worst case/25th/median/75th/best case), max drawdown percentiles, % positive CAGR, and MC grade per variant
Overlay Impact — Crash overlay comparison: base vs overlay CAGR, CAGR cost, drawdown reduction, and risk-adjusted metrics (Calmar, Sharpe)
Episodes — Roundtrip trade analysis: total episodes, win rate, avg win/loss, profit factor, best/worst episode, and max consecutive win/loss streaks
Live Paper Trading — Real-time forward-test of all 15 variants since Jan 2026: portfolio value, return, position count, and status

What am I looking at?

Rolling test period results for 15 deployment variants, tested on 2020–2025 data the model never saw during training. All returns are net of simulated taxes and transaction costs. Each variant was optimized on earlier data and tested forward. CAGR, Sharpe, Calmar, and Max Drawdown come from the test period equity curves. MC Grade is the result of the Monte Carlo robustness test (10,000 simulations). Deploy Score (0–100) combines test period performance, simulation validation, and overlay compatibility.
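As a rough illustration of how the four equity-curve metrics relate to each other, here is a minimal sketch. It assumes evenly spaced period-end portfolio values, a zero risk-free rate for Sharpe, and the sample standard deviation; the function name `performance_metrics` and its parameters are hypothetical and are not taken from this site's code.

```python
import math

def performance_metrics(equity, periods_per_year=12):
    """Compute CAGR, Sharpe, Max Drawdown and Calmar from an equity curve.

    equity: portfolio values at evenly spaced period ends (at least 3 points).
    Assumptions: zero risk-free rate, sample standard deviation.
    """
    returns = [equity[i] / equity[i - 1] - 1 for i in range(1, len(equity))]
    years = len(returns) / periods_per_year

    # CAGR: the constant annual growth rate that links first and last value.
    cagr = (equity[-1] / equity[0]) ** (1 / years) - 1

    # Sharpe: mean period return over its volatility, annualized.
    mean = sum(returns) / len(returns)
    var = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    std = math.sqrt(var)
    sharpe = mean / std * math.sqrt(periods_per_year) if std > 0 else float("nan")

    # Max drawdown: the deepest fall from any running peak (a negative number).
    peak = equity[0]
    max_dd = 0.0
    for value in equity:
        peak = max(peak, value)
        max_dd = min(max_dd, value / peak - 1)

    # Calmar: CAGR per unit of maximum drawdown.
    calmar = cagr / abs(max_dd) if max_dd < 0 else float("inf")

    return {"cagr": cagr, "sharpe": sharpe, "max_dd": max_dd, "calmar": calmar}
```

Calmar is the only one of the four that depends on two of the others, which is why a strategy with a modest CAGR can still rank well if its worst drawdown is shallow.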

Variant	Name	Account	Test CAGR	vs SPY	Sharpe	Max DD	Calmar	Win Rate	MC Grade	Deploy Score

Test Period CAGR Comparison

The bottom line — how each variant's compound annual growth rate stacks up side by side. The dashed line is a static SPY allocation; bars above it are outperforming the benchmark.

What am I looking at?

Year-by-year test period returns for each variant (2020–2025). These are calendar-year returns from the rolling test period, net of taxes. 2022 was a bear market (most strategies had drawdowns). 2020 and 2023 were strong recovery/bull years. This shows how each variant performs across different market regimes.

Variant	Name	Account	2020	2021	2022	2023	2024	2025	Avg

Variants starting mid-year show partial-year returns which may appear unusually high or low. "--" indicates the variant was not active that year.
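To make the calendar-year figures concrete, the sketch below compounds per-period returns into yearly returns and then takes a simple arithmetic mean across years. The input layout, the function names, and the choice of an arithmetic mean for the "Avg" column are assumptions for illustration only.

```python
from collections import defaultdict

def calendar_year_returns(period_returns):
    """Compound (year, return) pairs into one return per calendar year.

    A variant that starts mid-year simply contributes fewer periods to
    its first year, which is why partial-year figures can look unusual.
    """
    growth = defaultdict(lambda: 1.0)
    for year, r in period_returns:
        growth[year] *= 1.0 + r
    return {year: g - 1.0 for year, g in sorted(growth.items())}

def cross_year_average(yearly):
    """Arithmetic mean of the yearly returns (assumed definition of 'Avg')."""
    return sum(yearly.values()) / len(yearly)
```

For example, periods of +10% and −5% in the same year compound to 1.10 × 0.95 − 1 = +4.5%, not +5%.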

What am I looking at?

Monthly return patterns from the 6-year test period (2020–2025). Each cell shows the expected monthly return based on historical rebalance-period returns in that calendar month. Use this to set expectations: some months are historically strong (e.g., Nov, Dec) while others are weak (Feb, Sep). Compare against actual annual returns to understand how monthly patterns compound into yearly performance. Volatility shows the typical dispersion of returns within each month.
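A minimal sketch of the grouping described above: rebalance-period returns are bucketed by calendar month, then averaged for the expected return and measured for dispersion. Using the sample standard deviation as "volatility" is an assumption, and `monthly_profile` is a hypothetical name.

```python
from collections import defaultdict
from statistics import mean, stdev

def monthly_profile(period_returns):
    """Summarize (month, return) pairs per calendar month (1 = Jan ... 12 = Dec).

    Returns {month: {"expected": mean return, "volatility": sample std dev}}.
    A month with a single observation gets volatility 0.0.
    """
    buckets = defaultdict(list)
    for month, r in period_returns:
        buckets[month].append(r)
    return {
        month: {
            "expected": mean(rs),
            "volatility": stdev(rs) if len(rs) > 1 else 0.0,
        }
        for month, rs in sorted(buckets.items())
    }
```

With only six years of data, each month's cell rests on a handful of observations, so the volatility figure is worth reading alongside the expected return.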

Expected Monthly Returns (Test Period 2020-2025)

Color intensity = magnitude. Derived from avg rebalance-period returns per calendar month.

Variant	Acct	Jan	Feb	Mar	Apr	May	Jun	Jul	Aug	Sep	Oct	Nov	Dec	Avg/Mo

Monthly Volatility (Risk per Month)

Higher volatility = wider range of outcomes. Helps identify which months have the most uncertain returns.

Variant	Acct	Jan	Feb	Mar	Apr	May	Jun	Jul	Aug	Sep	Oct	Nov	Dec	Avg

Annual Returns (for Context)

How monthly patterns compound into yearly results. Good months need to overcome bad months + taxes + slippage.

Variant	Acct	2020	2021	2022	2023	2024	2025	CAGR	Best Yr	Worst Yr

Cross-Variant Monthly Summary

Averaged across all 15 variants. Shows which months are systematically strong or weak for the strategy.

Setting Expectations: Good vs Bad Months

Why this matters — knowing which months are historically weak helps avoid panic-selling during normal seasonal dips. A −2% month in September is expected behavior, not a broken strategy.

What am I looking at?

10,000 block-bootstrap Monte Carlo simulations per variant. This tests robustness by resampling blocks of consecutive returns from the historical sequence to see the range of possible outcomes. p5 = the 5th-percentile outcome (shown as worst case), p50 = median, p95 = the 95th-percentile outcome (shown as best case). "% Positive" = percentage of simulations with positive CAGR. "% Beat SPY" = percentage of simulations that outperformed a static SPY allocation. Results are shown for the recommended overlay (G2 Aggressive) per variant. MC Grade: STRONG PASS = all 6 criteria met, PASS = 5/6.
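The general idea of a block bootstrap can be sketched as follows. This is an illustration only: the block length, the fixed-length overlapping blocks, the nearest-rank percentile, and all function names are assumptions, and the site's actual simulation settings are not documented here.

```python
import math
import random

def block_bootstrap_cagrs(returns, n_sims=10_000, block_size=3,
                          periods_per_year=12, seed=0):
    """Simulate CAGRs by stitching together randomly chosen blocks of
    consecutive historical returns (sampled with replacement)."""
    n = len(returns)
    if block_size > n:
        raise ValueError("block_size must not exceed the number of returns")
    rng = random.Random(seed)
    years = n / periods_per_year
    cagrs = []
    for _ in range(n_sims):
        path = []
        while len(path) < n:
            start = rng.randrange(n - block_size + 1)
            path.extend(returns[start:start + block_size])
        growth = 1.0
        for r in path[:n]:  # trim to the original history length
            growth *= 1.0 + r
        cagrs.append(growth ** (1 / years) - 1)
    return sorted(cagrs)

def percentile(sorted_values, p):
    """Nearest-rank percentile of an ascending list (p in 0..100)."""
    rank = math.ceil(p * len(sorted_values) / 100)
    return sorted_values[max(0, min(len(sorted_values) - 1, rank - 1))]

def pct_positive(cagrs):
    """Share of simulations with a positive CAGR, in percent."""
    return 100.0 * sum(c > 0 for c in cagrs) / len(cagrs)
```

Sampling blocks rather than single periods keeps short runs of good or bad returns together, so the simulated paths retain some of the clustering seen in real markets.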

Variant	MC Grade	Worst CAGR	25th CAGR	Median CAGR	75th CAGR	Best CAGR	Worst MaxDD	Median MaxDD	% Positive

What am I looking at?

Crash overlay impact on each variant. Overlays automatically reduce position size during detected crash regimes to protect capital. Each overlay sets a minimum exposure during crashes: G2 Aggressive = 20%, G1 Balanced = 30%, G Light = 50%. The table compares each variant's no-overlay baseline with its recommended overlay. Overlays typically sacrifice 5–10% CAGR but cut max drawdown by 50–75%, dramatically improving risk-adjusted returns (Calmar ratio).
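The trade-off can be illustrated with a simplified sketch. It assumes that crash periods are already flagged, that exposure drops straight to the overlay's minimum while a crash is flagged, and that the uninvested remainder earns nothing. How crashes are actually detected is not described on this page, and the names below are hypothetical.

```python
# Minimum exposure per overlay, taken from the description above.
OVERLAY_MIN_EXPOSURE = {
    "G2 Aggressive": 0.20,
    "G1 Balanced": 0.30,
    "G Light": 0.50,
}

def apply_overlay(returns, crash_flags, overlay):
    """Scale each period's return by the exposure held in that period.

    Assumption: full exposure normally, the overlay minimum while a
    crash is flagged, and zero return on the uninvested cash.
    """
    floor = OVERLAY_MIN_EXPOSURE[overlay]
    return [r * (floor if crash else 1.0)
            for r, crash in zip(returns, crash_flags)]

def max_drawdown(returns):
    """Deepest peak-to-trough decline of the compounded curve (negative)."""
    value, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        value *= 1.0 + r
        peak = max(peak, value)
        worst = min(worst, value / peak - 1)
    return worst
```

The same scaling that softens losses also mutes any rebound that happens while the crash flag is still set, which is where the CAGR cost comes from.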

Variant	Overlay	Base CAGR	Overlay CAGR	CAGR Cost	Base MaxDD	Overlay MaxDD	DD Reduction	Overlay Calmar	Overlay Sharpe

What am I looking at?

Roundtrip episode analysis for the full test period (2020–2025). Each "episode" is one complete hold period, from buy to sell, for a single stock position. Win Rate = % of episodes that were profitable. Profit Factor = gross wins / gross losses (a value above 1.0 means the system is profitable). Avg Win/Loss shows the typical magnitude of winning vs losing trades. Max consecutive wins/losses shows streak behavior.
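The episode statistics defined above can be sketched as below. The input is assumed to be a chronological list of per-episode returns, and breakeven episodes are assumed to count as neither a win nor a loss and to end any streak; `episode_stats` is a hypothetical name.

```python
def episode_stats(pnls):
    """Summarize a chronological list of per-episode returns."""
    wins = [p for p in pnls if p > 0]
    losses = [p for p in pnls if p < 0]
    gross_win = sum(wins)
    gross_loss = -sum(losses)  # expressed as a positive number

    # Longest run of consecutive wins and of consecutive losses.
    max_w = max_l = run_w = run_l = 0
    for p in pnls:
        run_w = run_w + 1 if p > 0 else 0
        run_l = run_l + 1 if p < 0 else 0
        max_w = max(max_w, run_w)
        max_l = max(max_l, run_l)

    return {
        "total": len(pnls),
        "win_rate": 100.0 * len(wins) / len(pnls) if pnls else 0.0,
        "avg_win": gross_win / len(wins) if wins else 0.0,
        "avg_loss": -gross_loss / len(losses) if losses else 0.0,
        "profit_factor": gross_win / gross_loss if gross_loss > 0 else float("inf"),
        "best": max(pnls) if pnls else 0.0,
        "worst": min(pnls) if pnls else 0.0,
        "max_consec_wins": max_w,
        "max_consec_losses": max_l,
    }
```

Win rate and profit factor answer different questions: a system can win fewer than half its episodes and still have a profit factor above 1.0 if its average win is much larger than its average loss.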

Variant	Total Episodes	Win Rate	Avg Win	Avg Loss	Profit Factor	Best Episode	Worst Episode	Max Consec W	Max Consec L

What am I looking at?

Real-time paper trading of all 15 variants since Jan 2, 2026. Each variant starts with positions carried from the 2025 backtest, normalized to $100,000. This is forward-testing to validate backtest results in live market conditions.
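One plausible reading of "normalized to $100,000" is that carried-over positions keep their relative weights while the total is rescaled. The sketch below shows that arithmetic under that assumption; the tickers and the function name are placeholders.

```python
def normalize_positions(position_values, target=100_000.0):
    """Rescale position values so they sum to `target`, keeping weights."""
    total = sum(position_values.values())
    if total <= 0:
        raise ValueError("total position value must be positive")
    scale = target / total
    return {ticker: value * scale for ticker, value in position_values.items()}

# Placeholder tickers: a $120,000 book rescaled to $100,000.
start = normalize_positions({"AAA": 30_000.0, "BBB": 90_000.0})
```

Rescaling every variant to the same starting value makes the live portfolio values directly comparable across variants.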
