Sphinx Edge — The Stress Test

The stress test — including the part where it caught us

18 months of out-of-sample backtesting (October 2024 – April 2026), and the retest that exposed two of our own backtests as inflated.

A correction, on the record. An earlier version of this page called these backtests "the floor, not the ceiling," arguing that live execution would only improve on them. That was wrong, and our own data proved it. When we re-ran the crypto backtests "scan-aware" — evaluating entries and exits only at the moments a live bot would actually check, instead of on clean hourly closes — two strategies collapsed. We're leaving the original numbers visible next to the corrected ones, because a performance page that quietly edits its history isn't worth reading.

The retest that killed two bots

Original backtest vs scan-aware retest, same data, same strategy code.

Strategy	Original backtest	Scan-aware retest	Verdict
BTC short (MACD cross + EMA50)	+72.8% ROI · 63% WR	-9.2% ROI · 29% WR	Retired — never funded
ETH short (MACD cross + EMA50)	+79.8% ROI · 59% WR	+2.0% ROI · 31% WR	Rebuilt as v2, re-validating

Why the originals were wrong

Three compounding flaws. The backtests graded stop-losses on hourly closing prices, while the live bots check prices every 2 minutes — so the backtest never saw the intrabar spikes that stop out real positions. Entry timing carried look-ahead bias: the backtest entered at prices the live bot could never have gotten. And slippage was modeled at zero. Each flaw alone flatters the results; together they manufactured a +72.8% fantasy out of a losing strategy.
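The first flaw is easy to see in a minimal sketch. It assumes a short position with a fixed stop and a made-up hour of 2-minute prices; the function names and every number below are hypothetical, not taken from the actual strategy code.

```python
def stopped_on_hourly_close(prices, stop):
    """Original grading: only the hour's last price is compared to the stop."""
    # Short position: the stop triggers when price rises to or above it.
    return prices[-1] >= stop


def stopped_on_scans(prices, stop):
    """Live behavior: every 2-minute scan price is compared to the stop."""
    return any(p >= stop for p in prices)


# Hypothetical 2-minute scan prices across one hour (30 samples) for a short
# entered at 100.0 with a stop at 101.0. The spike to 101.4 is intrabar only.
scan_prices = [100.0] * 10 + [100.6, 101.4, 100.9] + [99.8] * 17
stop_price = 101.0

hourly_view = stopped_on_hourly_close(scan_prices, stop_price)
scan_view = stopped_on_scans(scan_prices, stop_price)
print(hourly_view, scan_view)  # False True
```

Graded on the hourly close, this trade survives and goes on to look like a winner. Graded the way a live bot sees it, the position is stopped out at the spike.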

The proof from live data

The ETH short bot ran in dry-run while this played out. Its backtest claimed a 59% win rate. Its live record finished at 19 trades with a 21% win rate. The deeper diagnosis: 51% of the BTC short's backtested entries never moved even 0.5% in the trade's favor, so no exit logic could have saved them. The entries themselves were the problem.
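The "never moved 0.5% in the trade's favor" check is a maximum favorable excursion test. A minimal sketch for short trades follows, assuming each trade is stored as an entry price plus the prices seen afterward; the helper names and the sample trades are hypothetical, not the real trade log.

```python
def max_favorable_excursion_short(entry_price, prices_after_entry):
    """Largest move in the short's favor, as a fraction of the entry price."""
    lowest = min(prices_after_entry)
    return max(0.0, (entry_price - lowest) / entry_price)


def share_never_favorable(trades, threshold=0.005):
    """Fraction of trades whose best favorable move stayed below the threshold."""
    misses = sum(
        1 for entry, path in trades
        if max_favorable_excursion_short(entry, path) < threshold
    )
    return misses / len(trades)


# Hypothetical (entry price, subsequent prices) pairs, not real trade data.
trades = [
    (100.0, [100.2, 100.8, 101.5]),  # never moved in favor
    (100.0, [99.8, 99.7, 100.9]),    # best move 0.3%, below the threshold
    (100.0, [99.0, 98.5, 99.2]),     # best move 1.5%
    (100.0, [99.4, 100.1, 100.6]),   # best move 0.6%
]
share = share_never_favorable(trades)
print(share)  # 0.5
```

A high share here points at the entry signal, because a trade that never goes green cannot be rescued by a better stop or target.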

What this bought us: both bots were dry-run only. Total real money lost to two broken strategies: $0. That's the validation pipeline doing its one job — strategies don't touch capital until their numbers survive hostile re-testing. The ETH strategy returns as v2 (ATR-sized stops, trailing locks, trend-resume entries) and must pass a 2-minute-granularity backtest — the honest kind — before any funding discussion.

ETF rotation — the backtest that survived

TQQQ/SQQQ, 18 months out-of-sample. The rotation family also passed scan-aware re-testing: SOXL/SOXS at +63% ROI, FNGU/FNGD at +50%.

Metric	Value	Detail
Net P&L	+$1,125	On $5K starting capital
ROI	+22.5%	18 months out-of-sample
Win rate	57%	47 wins / 36 losses
Profit factor	2.61x	$2.61 gained per $1 lost
Max drawdown	2.0%	$121 peak-to-trough
Profitable months	74%	14 of 19 months green

Month	Net P&L
Oct 24	+$58
Nov 24	$0
Dec 24	+$104
Jan 25	+$29
Feb 25	+$121
Mar 25	+$42
Apr 25	$0
May 25	+$145
Jun 25	+$78
Jul 25	-$34
Aug 25	+$22
Sep 25	+$91
Oct 25	$0
Nov 25	+$165
Dec 25	$0
Jan 26	+$140
Feb 26	$0
Mar 26	+$202
Apr 26	-$27

The bear market edge: November 2025 — TQQQ lost $111, but SQQQ made +$277, netting +$165. March 2026 — SQQQ caught the crash for +$162, turning a losing month into the best month (+$202). The rotation made money in 14 of 19 months because it profits from both rallies and selloffs. Only 2 stop-loss exits in 83 trades. The live week of June 3–6, 2026 repeated the pattern at small scale: every dollar of fleet profit came from the bear side (SOXS, FNGD) while the Nasdaq dumped.
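The summary figures for the rotation backtest follow standard definitions. The sketch below recomputes the three ratios that can be derived from numbers quoted on this page, then shows how profit factor and max drawdown are typically calculated. The function names and the sample trade and equity lists are hypothetical, for illustration only.

```python
def profit_factor(trade_pnls):
    """Gross profit divided by gross loss (loss taken as a positive number)."""
    gross_profit = sum(p for p in trade_pnls if p > 0)
    gross_loss = -sum(p for p in trade_pnls if p < 0)
    return gross_profit / gross_loss


def max_drawdown(equity_curve):
    """Largest peak-to-trough drop in account equity, in dollars."""
    peak = equity_curve[0]
    worst = 0.0
    for equity in equity_curve:
        peak = max(peak, equity)
        worst = max(worst, peak - equity)
    return worst


# Ratios derived from figures quoted on this page.
win_rate = 47 / (47 + 36)   # wins / total trades
roi = 1125 / 5000           # net P&L / starting capital
green_share = 14 / 19       # profitable months / total months

# Hypothetical per-trade P&L and equity values, for illustration only.
sample_pnls = [50.0, -20.0, 30.0, -10.0]
sample_equity = [5000.0, 5050.0, 5030.0, 5060.0, 5050.0]

print(round(win_rate * 100), round(roi * 100, 1), round(green_share * 100))
print(round(profit_factor(sample_pnls), 2), max_drawdown(sample_equity))
```

The first line of output reproduces the 57%, 22.5%, and 74% shown above. The second line applies only to the made-up sample lists.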

BTC long — backtested, now parked

Momentum + dual trend filter, 18 months out-of-sample. Honest caveats attached.

Metric	Value	Detail
Net P&L	+$904	On $5K starting capital
ROI	+18.1%	18 months out-of-sample
Win rate	40%	Wins averaged 2x losses
Trades	182	~10 per month

Why it's parked anyway: this backtest shares the hourly-resolution methodology that inflated the crypto shorts, and has not yet passed a scan-aware retest — treat the numbers above as unverified. Its live dry-run behavior was correct (it refused to buy a downtrend, exactly as designed), but its target venue's liquidity is too thin for tight-stop strategies. If it returns, it returns on a deeper venue, rebuilt on the v2 chassis, with an honest backtest behind it.
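A 40% win rate can still be profitable when wins average 2x losses. The arithmetic is sketched below, assuming the 2x payoff ratio is exact and ignoring costs; the function names are hypothetical. Because the inputs come from the unverified backtest, the result illustrates the math and says nothing about the strategy's real edge.

```python
def expectancy_per_unit_risk(win_rate, payoff_ratio):
    """Average profit per trade, measured in units of the average loss."""
    return win_rate * payoff_ratio - (1.0 - win_rate)


def breakeven_win_rate(payoff_ratio):
    """Win rate at which expectancy is exactly zero for a given payoff ratio."""
    return 1.0 / (1.0 + payoff_ratio)


edge = expectancy_per_unit_risk(0.40, 2.0)  # about 0.2 average losses per trade
floor = breakeven_win_rate(2.0)             # about 33.3%
print(round(edge, 3), round(floor, 3))
```

At a 2:1 payoff the break-even win rate is about 33%, so 40% leaves a thin margin that intrabar stop-outs could erase.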

Methodology — both kinds

Original backtests

162,565 one-minute bars (TQQQ + SQQQ) and 696,623 one-minute BTC bars, resampled to 5-minute resolution, October 2024 – April 2026. Indicators computed on hourly bars. Costs: a $0.02/share spread for ETFs; a 0.03% taker fee per side plus funding for Coinbase perps. The flaw: trade outcomes were graded on hourly closes.
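Two pieces of this setup are easy to sketch: the resampling step and the taker fee. The sketch assumes bars are dictionaries with open, high, low, and close fields and that the series has no gaps. Neither assumption is confirmed by the original pipeline, and the function names and sample bars are hypothetical. Funding is left out because its rate is not stated.

```python
def resample_ohlc(bars, factor=5):
    """Group consecutive one-minute bars into `factor`-minute OHLC bars.

    Assumes a gap-free series; a trailing partial group is dropped.
    """
    out = []
    for i in range(0, len(bars) - factor + 1, factor):
        chunk = bars[i:i + factor]
        out.append({
            "open": chunk[0]["open"],
            "high": max(b["high"] for b in chunk),
            "low": min(b["low"] for b in chunk),
            "close": chunk[-1]["close"],
        })
    return out


def round_trip_taker_fee(notional, rate_per_side=0.0003):
    """Taker fee for entry plus exit at 0.03% per side (funding excluded)."""
    return 2 * notional * rate_per_side


# Hypothetical one-minute bars, for illustration only.
one_minute = [
    {"open": 100.0, "high": 100.5, "low": 99.9, "close": 100.2},
    {"open": 100.2, "high": 100.9, "low": 100.1, "close": 100.8},
    {"open": 100.8, "high": 101.0, "low": 100.3, "close": 100.4},
    {"open": 100.4, "high": 100.6, "low": 99.5, "close": 99.7},
    {"open": 99.7, "high": 100.0, "low": 99.6, "close": 99.9},
]
five_minute = resample_ohlc(one_minute)
fee = round_trip_taker_fee(5000.0)
print(five_minute, fee)
```

The resampled bar keeps the true high and low of its five minutes. That intrabar range is exactly what grading on closes throws away.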

Scan-aware retests

Same data, same strategy code — but entries and exits are evaluated only at the timestamps a live bot would actually scan, with stops checked against intrabar prices rather than hourly closes. This is the standard every Sphinx Edge strategy must now pass before funding. It is harsher, slower, and tells the truth.
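The entry side of a scan-aware retest can be sketched as follows. The 2-minute interval comes from this page; the alignment of scans to minute 0, the slippage fraction, and the function name are assumptions made for illustration, not the actual retest code.

```python
SCAN_INTERVAL_MIN = 2  # live bots check prices every 2 minutes (from this page)


def scan_aware_short_entry(minute_prices, signal_minute, slippage=0.0005):
    """Fill a short at the first scan at or after the signal, never earlier.

    minute_prices[i] is the price at minute i. `slippage` is a hypothetical
    adverse fraction: a short sells slightly below the observed price.
    Scans are assumed to fall on minutes 0, 2, 4, and so on.
    """
    # Round the signal time up to the next minute on the scan schedule.
    fill_minute = -(-signal_minute // SCAN_INTERVAL_MIN) * SCAN_INTERVAL_MIN
    if fill_minute >= len(minute_prices):
        return None  # no scan left in the data, so no trade is recorded
    fill_price = minute_prices[fill_minute] * (1.0 - slippage)
    return fill_minute, fill_price


# Hypothetical minute-by-minute prices, for illustration only.
minute_prices = [100.0, 100.4, 100.1, 99.6, 99.2, 99.0]
fill = scan_aware_short_entry(minute_prices, signal_minute=3)
print(fill)
```

A signal at minute 3 fills at the minute-4 scan, after the price has already dropped, and slippage worsens the fill further. A backtest that fills at the signal bar's price grants the strategy an entry the live bot could not have gotten.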

The rule going forward: a backtest that hasn't been stress-tested against live-execution mechanics is marketing, not evidence. We learned that on our own numbers, in public, for free. Most learn it with money.