Project 1
ICT Turtle Soup
Backtest
A parameter-optimised backtest of the ICT Turtle Soup strategy on EURUSD daily data — with in-sample/out-of-sample validation and overfitting analysis.
Market: EURUSD / EURGBP (FX)
Timeframe: Daily
Language: Python
Status: Complete — v1
What is the Turtle Soup strategy?
Turtle Soup is an ICT (Inner Circle Trader) concept that fades the classic Turtle Trader breakout. Where traditional Turtle Traders buy new highs and sell new lows, Turtle Soup does the opposite — it anticipates false breakouts.
The setup
Price sweeps a recent N-day high or low, triggering breakout traders. It then reverses back inside the range — the "soup" — trapping those traders on the wrong side.
Liquidity sweep
Long signal
Price breaks below the N-day low (sweeps sell-side liquidity, the sell stops resting under the low), then closes back above it within the same candle. Enter long at the next open.
Bullish reversal
Short signal
Price breaks above the N-day high (sweeps buy-side liquidity, the buy stops resting over the high), then closes back below it within the same candle. Enter short at the next open.
Bearish reversal
Stop & target
The stop loss is placed at ATR × multiplier from entry, and the take profit at R:R × the stop distance. Both parameters are optimised in the grid search.
ATR-based sizing
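As a minimal sketch of the stop and target rule described above, the helper below computes both levels from an entry price and an ATR reading. The function name `stop_and_target` and the example numbers are illustrative assumptions, not values from the backtest.

```python
def stop_and_target(entry, atr, atr_mult=1.0, rr_ratio=2.0, direction="long"):
    """Return (stop_loss, take_profit) for a trade entered at `entry`."""
    stop_dist = atr * atr_mult  # stop distance in price units
    if direction == "long":
        return entry - stop_dist, entry + stop_dist * rr_ratio
    return entry + stop_dist, entry - stop_dist * rr_ratio


# Hypothetical long: entry 1.1000, ATR 0.0060, ATR multiplier 1.25, R:R 2.0
sl, tp = stop_and_target(entry=1.1000, atr=0.0060, atr_mult=1.25, rr_ratio=2.0)
print(f"stop={sl:.4f} target={tp:.4f}")
```

With these inputs the stop sits 75 pips below entry and the target 150 pips above it, so the reward is twice the risk regardless of how wide the ATR makes the stop.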
How the model works
Methodology
This is more than a simple backtest: it includes a parameter optimisation pipeline with a guard against overfitting. Three years of data are split into an in-sample period (years 1–2) and an out-of-sample period (year 3). Parameters are optimised on in-sample data only, then validated on the unseen out-of-sample period.
| Component | Details |
| --- | --- |
| Data | EURUSD daily OHLCV via yfinance (2021–2024) |
| Signal | N-day high/low sweep with same-candle close reversal |
| ATR | 14-period Average True Range for stop sizing |
| Position sizing | Fixed 1% risk per trade on £10,000 capital |
| Grid search | 96 parameter combinations across lookback, R:R, ATR mult |
| Ranking metric | Sharpe ratio on in-sample data |
| Validation | Best params re-run on out-of-sample data blind |
| Overfitting test | Sharpe decay: IS Sharpe vs OOS Sharpe comparison |
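The in-sample/out-of-sample split can be sketched as a simple date cutoff. The helper name `split_in_out_of_sample` is hypothetical, and the price series is a synthetic placeholder so the example runs without downloading data; the project's own split logic may differ in detail.

```python
import numpy as np
import pandas as pd


def split_in_out_of_sample(df, oos_years=1):
    """Split a date-indexed frame: the last `oos_years` years are out-of-sample."""
    cutoff = df.index.max() - pd.DateOffset(years=oos_years)
    in_sample = df.loc[df.index <= cutoff]
    out_of_sample = df.loc[df.index > cutoff]
    return in_sample, out_of_sample


# Synthetic placeholder prices on weekdays, purely for illustration
idx = pd.bdate_range("2021-01-01", "2023-12-29")
prices = pd.DataFrame({"Close": np.linspace(1.20, 1.10, len(idx))}, index=idx)

is_df, oos_df = split_in_out_of_sample(prices)
print(len(is_df), "in-sample bars,", len(oos_df), "out-of-sample bars")
```

The important property is that the two frames never overlap, so nothing the optimiser sees can leak into the validation period.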
Parameters searched
| Parameter | Values tested | What it controls |
| --- | --- | --- |
| Lookback | 10, 15, 20, 25, 30, 40 days | Window for defining the N-day high/low level |
| R:R ratio | 1.5, 2.0, 2.5, 3.0 | Take profit as a multiple of the stop loss distance |
| ATR multiplier | 0.75, 1.0, 1.25, 1.5 | Stop loss distance as a multiple of ATR |
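The grid above has 6 × 4 × 4 = 96 combinations. A minimal sketch of enumerating and ranking them follows; `grid_search` and `dummy_score` are hypothetical names, and the scorer is a stand-in so the example runs on its own. In the real pipeline the scorer would run the in-sample backtest and return its Sharpe ratio.

```python
from itertools import product

LOOKBACKS = [10, 15, 20, 25, 30, 40]
RR_RATIOS = [1.5, 2.0, 2.5, 3.0]
ATR_MULTS = [0.75, 1.0, 1.25, 1.5]


def grid_search(score_fn):
    """Score every parameter combination and return results sorted best-first."""
    results = []
    for lookback, rr, atr_mult in product(LOOKBACKS, RR_RATIOS, ATR_MULTS):
        results.append({
            "lookback": lookback,
            "rr_ratio": rr,
            "atr_mult": atr_mult,
            "score": score_fn(lookback, rr, atr_mult),
        })
    return sorted(results, key=lambda r: r["score"], reverse=True)


def dummy_score(lookback, rr, atr_mult):
    """Placeholder scorer (NOT a backtest): peaks at lookback=20, rr=2.0, mult=1.0."""
    return -abs(lookback - 20) - abs(rr - 2.0) - abs(atr_mult - 1.0)


ranked = grid_search(dummy_score)
print(len(ranked), "combinations; best:", ranked[0])
```

Keeping the scorer as an argument makes it easy to swap the ranking metric later without touching the enumeration.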
Why the results were likely poor
Several structural issues likely explain the underperformance on daily EURUSD, and all of them are fixable.
⚠ Wrong timeframe
ICT Turtle Soup is primarily an intraday concept — it is designed to be traded on the 15min or 1hr chart during London/NY sessions. On daily bars, the signal logic loses most of its edge because the "sweep and reverse" often happens within a single day, invisible at daily resolution.
High impact
⚠ No session filter
ICT setups are session-specific. Turtle Soup setups that form outside of London open or New York open are significantly less reliable. Daily data cannot distinguish between a London open sweep and a random mid-session move.
High impact
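On intraday data, a session filter could be expressed as a boolean mask over the bar timestamps. The sketch below assumes the index is in GMT/UTC and uses 07:00–10:00 and 13:30–16:00 as the London and New York windows; both the window boundaries and the name `session_mask` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Assumed windows in minutes since midnight GMT: 07:00-10:00 and 13:30-16:00
SESSION_WINDOWS = ((7 * 60, 10 * 60), (13 * 60 + 30, 16 * 60))


def session_mask(index, windows=SESSION_WINDOWS):
    """True where a bar's timestamp falls inside any [start, end) window."""
    minutes = np.asarray(index.hour) * 60 + np.asarray(index.minute)
    mask = np.zeros(len(index), dtype=bool)
    for start, end in windows:
        mask |= (minutes >= start) & (minutes < end)
    return pd.Series(mask, index=index)


# One synthetic day of 30-minute bar timestamps
idx = pd.date_range("2024-01-02 00:00", periods=48, freq="30min", tz="UTC")
mask = session_mask(idx)
print(int(mask.sum()), "of", len(idx), "bars fall inside a session window")
```

A signal column can then be combined with the mask (for example `signal & mask`) so that only sweeps inside the chosen windows trigger entries.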
⚠ No market structure filter
Turtle Soup longs should only be taken in a bullish market structure, and shorts in a bearish one. Trading both directions without a higher timeframe bias doubles noise.
Medium impact
⚠ Too few signals
On daily bars, sweep-and-reverse setups are rare. Fewer signals mean higher variance in results and less statistical confidence. As a rule of thumb, a backtest needs on the order of 100 or more trades before its statistics can be trusted.
Medium impact
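To make the sample-size point concrete, the sketch below estimates the 95% margin of error on a measured win rate using the normal approximation to the binomial. The 40% win rate is a hypothetical figure, not a result from this backtest.

```python
import math


def win_rate_margin(win_rate, n_trades, z=1.96):
    """Approximate 95% margin of error on a win rate (normal approximation)."""
    return z * math.sqrt(win_rate * (1 - win_rate) / n_trades)


# Hypothetical 40% win rate measured over different trade counts
for n in (20, 100, 400):
    print(n, "trades: +/-", round(win_rate_margin(0.40, n), 3))
```

With 20 trades the margin is roughly ±21 percentage points, which is wide enough that a 40% win rate cannot be told apart from a coin flip; at 100 trades it narrows to about ±10 points.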
Planned improvements
These changes would meaningfully improve the strategy's edge and make the backtest results more trustworthy.
1
Switch to intraday data (1hr or 15min)
Download hourly EURGBP data via yfinance or a broker API. Redefine the N-period high/low using hourly bars. This aligns with how the strategy is actually traded and will generate far more signals with better signal quality.
2
Add London and New York session filters
Only allow entries between 07:00–10:00 GMT (London open) or 13:30–16:00 GMT (NY open). ICT setups derive their edge from institutional order flow at session opens — filtering to these windows removes a large amount of noise.
3
Add higher timeframe market structure filter
Use the daily chart to determine trend direction. Only take long Turtle Soup setups when daily structure is bullish (higher highs and higher lows), and only short when bearish. This aligns trades with institutional bias and reduces counter-trend losses.
4
Tighten the entry logic
The current signal fires on any candle that sweeps and closes back inside. A stricter version requires: (1) a clean displacement candle after the sweep, (2) the close to be more than 50% back inside the range, and (3) volume above the rolling average. Each condition removes false positives.
5
Test on multiple pairs
Run the same optimised parameters on other pairs such as GBPUSD and USDJPY, alongside EURUSD, to test robustness. A strategy that only works on one pair is likely curve-fitted. One that holds up across several pairs is more likely to have a genuine edge.
6
Add walk-forward validation
Replace the single train/test split with rolling walk-forward optimisation — re-optimise every 3 months on the preceding 12 months of data. This is a much stronger test of whether the strategy adapts to changing market conditions.
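The rolling scheme in step 6 can be sketched as a generator of date windows: 12 months of training data followed by a 3-month test period, stepping forward 3 months each time. The function name `walk_forward_windows` and the date range are assumptions for illustration; intervals are treated as half-open, so each test period starts exactly where its training period ends.

```python
import pandas as pd


def walk_forward_windows(start, end, train_months=12, test_months=3):
    """Yield (train_start, train_end, test_end); the test period begins at train_end."""
    start, end = pd.Timestamp(start), pd.Timestamp(end)
    train_start = start
    while True:
        train_end = train_start + pd.DateOffset(months=train_months)
        test_end = train_end + pd.DateOffset(months=test_months)
        if test_end > end:
            break  # not enough data left for a full test period
        yield train_start, train_end, test_end
        train_start = train_start + pd.DateOffset(months=test_months)


windows = list(walk_forward_windows("2021-01-01", "2024-01-01"))
for train_start, train_end, test_end in windows:
    print(train_start.date(), "->", train_end.date(), "| test to", test_end.date())
```

For each window, the grid search would be run on `[train_start, train_end)` and the winning parameters applied to `[train_end, test_end)`; stitching the test periods together gives an out-of-sample equity curve covering two of the three years.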
The code
Full Python source. Runs with yfinance for real data or falls back to synthetic EURUSD if yfinance is unavailable.
How to run
Install dependencies:
pip install pandas numpy matplotlib yfinance
Then run:
python turtle_soup_optimisation.py
Output: a PNG chart showing a parameter heatmap, the equity curves, and the top 10 parameter combinations.
"""
Turtle Soup ICT — Parameter Optimisation + Out-of-Sample Test
==============================================================
1. Load EURUSD daily data (real via yfinance or synthetic)
2. Split: Years 1-2 = IN-SAMPLE | Year 3 = OUT-OF-SAMPLE
3. Grid search over lookback, R:R ratio, ATR multiplier
4. Rank by Sharpe ratio on in-sample data
5. Validate best params on out-of-sample data
6. Measure overfitting via Sharpe decay
"""
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from itertools import product
# ── SIGNAL LOGIC ─────────────────────────────────────────
def prepare(df, lookback=20):
    """Add N-day levels, ATR and Turtle Soup signals to an OHLC frame."""
    d = df.copy()
    # Prior N-day extremes; shift(1) excludes the current bar to avoid look-ahead
    d['turtle_high'] = d['High'].shift(1).rolling(lookback).max()
    d['turtle_low'] = d['Low'].shift(1).rolling(lookback).min()
    # True range = max(high - low, |high - prev close|, |low - prev close|)
    hl = d['High'] - d['Low']
    hpc = (d['High'] - d['Close'].shift(1)).abs()
    lpc = (d['Low'] - d['Close'].shift(1)).abs()
    # ATR as a 14-period simple moving average of true range
    d['ATR'] = pd.concat([hl, hpc, lpc], axis=1).max(axis=1).rolling(14).mean()
    # Turtle Soup: sweep the level then close back inside
    d['long_signal'] = (d['Low'] < d['turtle_low']) & (d['Close'] > d['turtle_low'])
    d['short_signal'] = (d['High'] > d['turtle_high']) & (d['Close'] < d['turtle_high'])
    return d
# ── BACKTEST ENGINE ───────────────────────────────────────
def run_backtest(df, rr_ratio=2.0, atr_mult=1.0, risk=0.01, capital=10_000):
    """Bar-by-bar backtest; returns (equity curve, trades DataFrame)."""
    rows = df.reset_index()
    equity = [capital]
    trades = []
    in_trade = False
    entry_price = sl = tp = direction = None
    for i in range(1, len(rows)):
        row = rows.iloc[i]
        prev = rows.iloc[i - 1]
        if in_trade:
            # Check whether this bar's range touched the stop or the target
            if direction == 'long':
                hit_sl = row['Low'] <= sl
                hit_tp = row['High'] >= tp
            else:
                hit_sl = row['High'] >= sl
                hit_tp = row['Low'] <= tp
            if hit_tp or hit_sl:
                # If both levels fall inside one bar, the target is assumed to
                # fill first, which is the optimistic reading of that bar
                exit_px = tp if hit_tp else sl
                # 1 pip = 0.0001, which holds for EURUSD but not for JPY pairs
                pnl_pips = ((exit_px - entry_price) if direction == 'long'
                            else (entry_price - exit_px)) * 10_000
                stop_pips = abs(entry_price - sl) * 10_000
                if stop_pips > 0:
                    # Compound equity by the risk fraction times the R-multiple
                    capital *= (1 + risk * (pnl_pips / stop_pips))
                trades.append({'result': 'win' if hit_tp else 'loss',
                               'pnl_pips': round(pnl_pips, 1)})
                in_trade = False
        if not in_trade:
            # Signals come from the previous bar; entry is at this bar's open
            atr = prev['ATR']
            if pd.isna(atr) or atr == 0:
                equity.append(capital)
                continue
            stop_dist = atr * atr_mult
            if prev['long_signal']:
                direction = 'long'
                entry_price = row['Open']
                sl = entry_price - stop_dist
                tp = entry_price + stop_dist * rr_ratio
                in_trade = True
            elif prev['short_signal']:
                direction = 'short'
                entry_price = row['Open']
                sl = entry_price + stop_dist
                tp = entry_price - stop_dist * rr_ratio
                in_trade = True
        equity.append(capital)
    eq = pd.Series(equity, index=df.index)
    td = pd.DataFrame(trades) if trades else pd.DataFrame(columns=['result', 'pnl_pips'])
    return eq, td
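The listing above returns an equity curve but does not show how the Sharpe ratio used for ranking is derived from it. One common convention, assumed here rather than taken from the source, is to annualise the mean-to-standard-deviation ratio of per-bar returns with a zero risk-free rate and 252 trading days per year. The function name `annualised_sharpe` and the sample equity values are hypothetical.

```python
import numpy as np
import pandas as pd


def annualised_sharpe(equity, periods_per_year=252):
    """Annualised Sharpe ratio of an equity curve, risk-free rate assumed zero."""
    returns = equity.pct_change().dropna()
    if len(returns) < 2 or returns.std() == 0:
        return 0.0  # undefined for flat or too-short curves; report zero
    return float(np.sqrt(periods_per_year) * returns.mean() / returns.std())


# Hypothetical equity values, for illustration only
rising = pd.Series([10_000, 10_100, 10_050, 10_200, 10_150, 10_300], dtype=float)
flat = pd.Series([10_000.0] * 5)
print(annualised_sharpe(rising) > 0, annualised_sharpe(flat))
```

Because the backtest equity only changes on bars where a trade closes, most daily returns are zero; that lowers the standard deviation and the mean together, so the ratio remains comparable across parameter sets evaluated on the same period.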
What this project taught me
In-sample / out-of-sample testing
Optimising on all available data and then reporting those results is the most common mistake in backtesting. Holding back a test set is the minimum standard for any honest evaluation.
Methodology
Overfitting detection
Sharpe decay (how much the Sharpe ratio degrades from IS to OOS) is a practical overfitting metric. As a rule of thumb, decay below 30% suggests low overfitting risk, while decay above 60% suggests the strategy is curve-fitted.
Statistics
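A minimal sketch of the decay calculation, assuming it is defined as the fractional drop from the in-sample Sharpe to the out-of-sample Sharpe. The 30% and 60% thresholds come from the text above; the "moderate risk" label for the range in between, the function names, and the example Sharpe values are assumptions.

```python
def sharpe_decay(is_sharpe, oos_sharpe):
    """Fractional drop in Sharpe ratio from in-sample to out-of-sample."""
    if is_sharpe <= 0:
        raise ValueError("decay is only meaningful for a positive in-sample Sharpe")
    return (is_sharpe - oos_sharpe) / is_sharpe


def overfitting_label(decay):
    """Map a decay fraction to a rough risk label."""
    if decay < 0.30:
        return "low risk"
    if decay > 0.60:
        return "likely curve-fitted"
    return "moderate risk"  # assumed label for the unspecified middle band


# Hypothetical Sharpe pairs (in-sample, out-of-sample)
for is_s, oos_s in [(1.2, 0.9), (1.5, 0.3)]:
    d = sharpe_decay(is_s, oos_s)
    print(f"IS={is_s} OOS={oos_s} decay={d:.0%} -> {overfitting_label(d)}")
```

A negative out-of-sample Sharpe gives a decay above 100%, which falls in the curve-fitted band as expected.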
ATR-based position sizing
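Setting the stop distance from ATR rather than a fixed pip amount lets the stop adapt to current market volatility, while the position size scales inversely so that each trade still risks the same 1% of capital. This is a more robust approach than a fixed-pip stop.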
Risk management
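The relationship between stop distance and position size can be sketched as below. The helper name `position_size` is hypothetical, and the example ignores currency conversion by assuming the account is denominated in the pair's quote currency. The `run_backtest` listing reaches the same result implicitly by scaling equity by the risk fraction times the trade's R-multiple.

```python
def position_size(capital, risk_fraction, stop_distance):
    """Units to trade so that a stop-out loses `risk_fraction` of capital."""
    risk_amount = capital * risk_fraction
    return risk_amount / stop_distance


# Hypothetical: 10,000 capital, 1% risk, quiet market (50-pip stop) vs volatile (100-pip stop)
quiet = position_size(10_000, 0.01, 0.0050)
volatile = position_size(10_000, 0.01, 0.0100)
print(round(quiet), round(volatile))
```

Doubling the ATR-derived stop halves the position, so the amount at risk stays at 100 in both cases.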
Timeframe alignment
Strategy concepts designed for one timeframe often do not transfer directly to another. ICT setups are intraday by nature — forcing them onto daily data loses their core logic.
Strategy design