Project 1

ICT Turtle Soup Backtest

A parameter-optimised backtest of the ICT Turtle Soup strategy on EURUSD daily data — with in-sample/out-of-sample validation and overfitting analysis.

Market: EURUSD / EURGBP (FX)
Timeframe: Daily
Language: Python
Status: Complete (v1)

What is the Turtle Soup strategy?

Turtle Soup is an ICT (Inner Circle Trader) concept that fades the classic Turtle Trader breakout. Where traditional Turtle Traders buy new highs and sell new lows, Turtle Soup does the opposite — it anticipates false breakouts.

The setup

Price sweeps a recent N-day high or low, triggering breakout traders. It then reverses back inside the range — the "soup" — trapping those traders on the wrong side.

Liquidity sweep

Long signal

Price breaks below the N-day low (sweeping the sell-side liquidity resting under it), then closes back above it within the same candle. Enter long at the next open.

Bullish reversal

Short signal

Price breaks above the N-day high (sweeping the buy-side liquidity resting above it), then closes back below it within the same candle. Enter short at the next open.

Bearish reversal

Stop & target

Stop loss placed at ATR × multiplier from entry. Take profit at R:R × stop distance. Both parameters are optimised in the grid search.

ATR-based sizing
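As a rough illustration of the stop and target arithmetic above, the sketch below computes both levels from an entry price and an ATR value. The helper name `stop_and_target` and the example numbers are hypothetical; the formulas mirror those used in the backtest engine further down.

```python
def stop_and_target(entry, atr, atr_mult=1.0, rr_ratio=2.0, direction="long"):
    """Return (stop_loss, take_profit) for an ATR-scaled stop.

    Hypothetical helper for illustration only.
    """
    stop_dist = atr * atr_mult          # stop distance in price units
    target_dist = stop_dist * rr_ratio  # target is R:R times the stop distance
    if direction == "long":
        return entry - stop_dist, entry + target_dist
    return entry + stop_dist, entry - target_dist


# Example: long at 1.1000 with ATR = 0.0080, ATR multiplier 1.25, R:R 2.0.
# Stop distance is 0.0100 (100 pips) and target distance is 0.0200 (200 pips).
sl, tp = stop_and_target(1.1000, 0.0080, atr_mult=1.25, rr_ratio=2.0)
short_sl, short_tp = stop_and_target(1.1000, 0.0080, 1.25, 2.0, direction="short")
```

A wider ATR multiplier moves both the stop and the target further away, since the target is always expressed as a multiple of the stop distance.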

How the model works

Methodology

This goes beyond a single backtest run: it includes a parameter optimisation pipeline with a guard against overfitting. Three years of data are split into in-sample (years 1–2) and out-of-sample (year 3). Parameters are optimised on the in-sample data only, then validated on the out-of-sample period, which the optimiser never sees.

Component	Details
Data	EURUSD daily OHLCV via yfinance (2021–2024)
Signal	N-day high/low sweep with same-candle close reversal
ATR	14-period Average True Range for stop sizing
Position sizing	Fixed 1% risk per trade on £10,000 capital
Grid search	96 parameter combinations across lookback, R:R, ATR mult
Ranking metric	Sharpe ratio on in-sample data
Validation	Best params re-run on out-of-sample data blind
Overfitting test	Sharpe decay: IS Sharpe vs OOS Sharpe comparison
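The in-sample/out-of-sample split in the table can be sketched as a date-based cut. This is a minimal sketch, assuming a date-indexed DataFrame; the function name `split_in_out_of_sample` and the synthetic price series are placeholders for illustration, not the project's actual loader.

```python
import numpy as np
import pandas as pd


def split_in_out_of_sample(df, in_sample_years=2):
    """Split a date-indexed frame into in-sample and out-of-sample parts.

    The cut-off sits `in_sample_years` after the first timestamp. Rows on or
    after the cut-off are out-of-sample, so the two parts never overlap.
    """
    cutoff = df.index[0] + pd.DateOffset(years=in_sample_years)
    return df[df.index < cutoff], df[df.index >= cutoff]


# Synthetic stand-in for three years of daily bars (not real EURUSD data)
idx = pd.bdate_range("2021-01-01", "2023-12-29")
prices = pd.DataFrame({"Close": np.linspace(1.20, 1.10, len(idx))}, index=idx)

in_sample, out_of_sample = split_in_out_of_sample(prices)
print(in_sample.index.max().date(), out_of_sample.index.min().date())
```

Splitting by date rather than by row count keeps the boundary stable if the download returns a slightly different number of bars.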

Parameters searched

Parameter	Values tested	What it controls
Lookback	10, 15, 20, 25, 30, 40 days	Window for defining the N-day high/low level
R:R ratio	1.5, 2.0, 2.5, 3.0	Take profit as a multiple of the stop loss distance
ATR multiplier	0.75, 1.0, 1.25, 1.5	Stop loss distance as a multiple of ATR
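The grid above has 6 × 4 × 4 = 96 combinations. The sketch below enumerates them with `itertools.product` and ranks them with a scoring function. `rank_by_score` and `toy_score` are hypothetical names; in the real pipeline the score would be the in-sample Sharpe ratio from a backtest run, which is not reproduced here.

```python
from itertools import product

LOOKBACKS = [10, 15, 20, 25, 30, 40]
RR_RATIOS = [1.5, 2.0, 2.5, 3.0]
ATR_MULTS = [0.75, 1.0, 1.25, 1.5]

# Every (lookback, rr_ratio, atr_mult) combination in the search space
grid = list(product(LOOKBACKS, RR_RATIOS, ATR_MULTS))


def rank_by_score(grid, score_fn):
    """Score every combination and return (params, score) pairs, best first."""
    scored = [(params, score_fn(*params)) for params in grid]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def toy_score(lookback, rr_ratio, atr_mult):
    """Placeholder score that peaks at lookback=20, R:R=2.0. Not a real metric."""
    return -abs(lookback - 20) - abs(rr_ratio - 2.0)


ranked = rank_by_score(grid, toy_score)
print(len(grid), ranked[0][0])
```

With 96 candidates ranked on the same data, the top score is partly luck, which is why the best combination is re-run on the out-of-sample year.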

Why the results were likely poor

A few structural issues explain underperformance on daily EURUSD — and they are all fixable.

⚠ Wrong timeframe

ICT Turtle Soup is primarily an intraday concept — it is designed to be traded on the 15min or 1hr chart during London/NY sessions. On daily bars, the signal logic loses most of its edge because the "sweep and reverse" often happens within a single day, invisible at daily resolution.

High impact

⚠ No session filter

ICT setups are session-specific. Turtle Soup setups that form outside of London open or New York open are significantly less reliable. Daily data cannot distinguish between a London open sweep and a random mid-session move.

High impact

⚠ No market structure filter

Turtle Soup longs should only be taken in a bullish market structure, and shorts in a bearish one. Trading both directions without a higher timeframe bias doubles noise.

Medium impact

⚠ Too few signals

On daily bars, sweep-and-reverse setups are rare. Fewer signals mean higher variance in results and less statistical confidence. As a rough rule of thumb, a backtest needs on the order of 100 trades or more before its conclusions become meaningful.

Medium impact

Planned improvements

These changes would meaningfully improve the strategy's edge and make the backtest results more trustworthy.

Switch to intraday data (1hr or 15min)

Download hourly EURGBP data via yfinance or a broker API. Redefine the N-period high/low using hourly bars. This aligns with how the strategy is actually traded and will generate far more signals with better signal quality.

Add London and New York session filters

Only allow entries between 07:00–10:00 GMT (London open) or 13:30–16:00 GMT (NY open). ICT setups derive their edge from institutional order flow at session opens — filtering to these windows removes a large amount of noise.
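One way to express these windows in code is a boolean mask over intraday timestamps. This is a sketch under stated assumptions: the index is already in GMT/UTC, each bar is stamped with its start time, and the window end is treated as exclusive. `session_mask` and `SESSION_WINDOWS` are hypothetical names, and daylight-saving shifts in local session times are not handled.

```python
import numpy as np
import pandas as pd

# (start, end) in minutes after midnight GMT: 07:00-10:00 and 13:30-16:00
SESSION_WINDOWS = [(7 * 60, 10 * 60), (13 * 60 + 30, 16 * 60)]


def session_mask(index):
    """Return a boolean Series, True where a bar starts inside a session window."""
    minutes = (index.hour * 60 + index.minute).to_numpy()
    inside = np.zeros(len(index), dtype=bool)
    for start, end in SESSION_WINDOWS:
        inside |= (minutes >= start) & (minutes < end)
    return pd.Series(inside, index=index)


# One day of hourly bar timestamps in UTC
bars = pd.date_range("2024-01-02", periods=24, freq=pd.Timedelta(hours=1), tz="UTC")
mask = session_mask(bars)
allowed_hours = [ts.hour for ts in bars[mask.to_numpy()]]
print(allowed_hours)
```

On hourly bars the 13:00 bar starts before the 13:30 window opens, so it is excluded; 15-minute bars would capture the New York window more precisely.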

Add higher timeframe market structure filter

Use the daily chart to determine trend direction. Only take long Turtle Soup setups when daily structure is bullish (higher highs and higher lows), and only short when bearish. This aligns trades with institutional bias and reduces counter-trend losses.
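A simple proxy for this filter compares the most recent block of daily bars with the block before it. This is one simplified reading of "higher highs and higher lows", not a full swing-point analysis; `daily_bias`, `allow_trade`, and the 10-bar window are assumptions made for illustration.

```python
def daily_bias(highs, lows, window=10):
    """Classify daily structure as 'bullish', 'bearish', or 'neutral'.

    Compares the extreme high and low of the latest `window` bars with
    those of the `window` bars before them.
    """
    if len(highs) < 2 * window or len(lows) < 2 * window:
        return "neutral"  # not enough history to judge
    prev_high = max(highs[-2 * window:-window])
    prev_low = min(lows[-2 * window:-window])
    last_high = max(highs[-window:])
    last_low = min(lows[-window:])
    if last_high > prev_high and last_low > prev_low:
        return "bullish"  # higher high and higher low
    if last_high < prev_high and last_low < prev_low:
        return "bearish"  # lower high and lower low
    return "neutral"


def allow_trade(direction, bias):
    """Permit longs only in bullish structure and shorts only in bearish."""
    return (direction == "long" and bias == "bullish") or (
        direction == "short" and bias == "bearish"
    )


# Synthetic rising and falling series (plain lists, oldest bar first)
up_highs = [1.10 + 0.001 * i for i in range(20)]
up_lows = [h - 0.005 for h in up_highs]
down_highs = list(reversed(up_highs))
down_lows = list(reversed(up_lows))
```

When the bias is neutral, both directions are blocked, which reduces the trade count further and is worth weighing against the sample-size concern above.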

Tighten the entry logic

The current signal fires on any candle that sweeps and closes back inside. A stricter version requires: (1) a clean displacement candle after the sweep, (2) the close to be more than 50% back inside the range, and (3) volume above the rolling average. Each condition removes false positives.

Test on multiple pairs

Run the same optimised parameters on other pairs such as GBPUSD and USDJPY, alongside EURUSD, to test robustness. A strategy that only works on one pair is likely curve-fitted. One that holds up across several pairs, especially less correlated ones, is more likely to have a genuine edge.

Add walk-forward validation

Replace the single train/test split with rolling walk-forward optimisation — re-optimise every 3 months on the preceding 12 months of data. This is a much stronger test of whether the strategy adapts to changing market conditions.
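The rolling schedule described here can be sketched as a list of date windows. This is a minimal sketch: `walk_forward_windows` is a hypothetical helper, and the optimisation and backtest steps that would run inside each window are left out.

```python
import pandas as pd


def walk_forward_windows(start, end, train_months=12, test_months=3):
    """Return (train_start, train_end, test_end) tuples.

    Each window trains on [train_start, train_end) and tests on
    [train_end, test_end). Windows step forward by `test_months`.
    """
    windows = []
    train_start = pd.Timestamp(start)
    end = pd.Timestamp(end)
    while True:
        train_end = train_start + pd.DateOffset(months=train_months)
        test_end = train_end + pd.DateOffset(months=test_months)
        if test_end > end:
            break  # not enough data left for a full test window
        windows.append((train_start, train_end, test_end))
        train_start = train_start + pd.DateOffset(months=test_months)
    return windows


windows = walk_forward_windows("2021-01-01", "2024-01-01")
for train_start, train_end, test_end in windows:
    print(train_start.date(), train_end.date(), test_end.date())
```

Because every test window lies strictly after its training window, stitching the test-period results together gives an equity curve built entirely from unseen data.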

The code

Full Python source. Runs with yfinance for real data or falls back to synthetic EURUSD if yfinance is unavailable.

How to run

Install dependencies: pip install pandas numpy matplotlib yfinance
Then run: python turtle_soup_optimisation.py
Output: a PNG chart showing heatmap, equity curves, and top 10 parameter combinations.

Python — turtle_soup_optimisation.py

"""
Turtle Soup ICT — Parameter Optimisation + Out-of-Sample Test
==============================================================
1. Load EURUSD daily data (real via yfinance or synthetic)
2. Split: Years 1-2 = IN-SAMPLE | Year 3 = OUT-OF-SAMPLE
3. Grid search over lookback, R:R ratio, ATR multiplier
4. Rank by Sharpe ratio on in-sample data
5. Validate best params on out-of-sample data
6. Measure overfitting via Sharpe decay
"""

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from itertools import product

# ── SIGNAL LOGIC ─────────────────────────────────────────
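def prepare(df, lookback=20):
    """Add turtle levels, a 14-period ATR, and Turtle Soup signals to an OHLC frame."""
    d = df.copy()
    # N-day high/low from the prior `lookback` bars; shift(1) excludes the current bar
    d['turtle_high'] = d['High'].shift(1).rolling(lookback).max()
    d['turtle_low']  = d['Low'].shift(1).rolling(lookback).min()
    # True range: the largest of high-low, |high - prev close|, |low - prev close|
    hl  = d['High'] - d['Low']
    hpc = (d['High'] - d['Close'].shift(1)).abs()
    lpc = (d['Low']  - d['Close'].shift(1)).abs()
    # ATR as a simple 14-bar mean of true range (not Wilder's smoothing)
    d['ATR'] = pd.concat([hl, hpc, lpc], axis=1).max(axis=1).rolling(14).mean()
    # Turtle Soup: sweep the level then close back inside
    d['long_signal']  = (d['Low']  < d['turtle_low'])  & (d['Close'] > d['turtle_low'])
    d['short_signal'] = (d['High'] > d['turtle_high']) & (d['Close'] < d['turtle_high'])
    return d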

# ── BACKTEST ENGINE ───────────────────────────────────────
def run_backtest(df, rr_ratio=2.0, atr_mult=1.0, risk=0.01, capital=10_000):
    rows = df.reset_index()
    equity = [capital]; trades = []; in_trade = False
    entry_price = sl = tp = direction = None

    for i in range(1, len(rows)):
        row = rows.iloc[i]; prev = rows.iloc[i - 1]

        if in_trade:
            hit_tp = hit_sl = False
            if direction == 'long':
                if row['Low']  <= sl: hit_sl = True
                if row['High'] >= tp: hit_tp = True
            else:
                if row['High'] >= sl: hit_sl = True
                if row['Low']  <= tp: hit_tp = True
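            if hit_tp or hit_sl:
                # If one bar touches both levels, daily data cannot show which
                # came first, so assume the stop filled first (conservative).
                exit_px = sl if hit_sl else tp
                pnl_pips = ((exit_px - entry_price) if direction == 'long'
                            else (entry_price - exit_px)) * 10_000
                stop_pips = abs(entry_price - sl) * 10_000
                if stop_pips > 0:
                    # Compound by R multiple: risk * (pips won or lost / pips risked)
                    capital *= (1 + risk * (pnl_pips / stop_pips))
                trades.append({'result': 'loss' if hit_sl else 'win',
                               'pnl_pips': round(pnl_pips, 1)})
                in_trade = False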

        if not in_trade:
            atr = prev['ATR']
            if pd.isna(atr) or atr == 0:
                equity.append(capital); continue
            stop_dist = atr * atr_mult
            if prev['long_signal']:
                direction = 'long'; entry_price = row['Open']
                sl = entry_price - stop_dist; tp = entry_price + stop_dist * rr_ratio
                in_trade = True
            elif prev['short_signal']:
                direction = 'short'; entry_price = row['Open']
                sl = entry_price + stop_dist; tp = entry_price - stop_dist * rr_ratio
                in_trade = True
        equity.append(capital)

    eq = pd.Series(equity, index=df.index)
    td = pd.DataFrame(trades) if trades else pd.DataFrame(columns=['result','pnl_pips'])
    return eq, td

What this project taught me

In-sample / out-of-sample testing

Optimising on all available data and then reporting those results is the most common mistake in backtesting. Holding back a test set is the minimum standard for any honest evaluation.

Methodology

Overfitting detection

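Sharpe decay (how much the Sharpe ratio degrades from IS to OOS, expressed as a percentage of the IS value) is a practical overfitting metric. As a rule of thumb used in this project, decay below 30% suggests low overfitting risk, while decay above 60% suggests the parameters are curve-fitted.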

Statistics
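Sharpe decay can be computed in a few lines. The sketch below assumes decay is defined as (IS Sharpe − OOS Sharpe) / IS Sharpe and that the Sharpe ratio is annualised from daily returns with a zero risk-free rate; both are assumptions for illustration, since the project's exact formulas are not shown here.

```python
import math
import statistics


def sharpe_ratio(returns, periods_per_year=252):
    """Annualised Sharpe ratio of per-period returns (risk-free rate taken as 0)."""
    sd = statistics.stdev(returns)
    if sd == 0:
        return 0.0
    return statistics.mean(returns) / sd * math.sqrt(periods_per_year)


def sharpe_decay(is_sharpe, oos_sharpe):
    """Fraction of the in-sample Sharpe lost out-of-sample.

    Returns None when the in-sample Sharpe is not positive, because the
    ratio is not meaningful in that case.
    """
    if is_sharpe <= 0:
        return None
    return (is_sharpe - oos_sharpe) / is_sharpe


# Hypothetical Sharpe values, not results from this project
decay = sharpe_decay(1.2, 0.9)   # roughly 0.25, i.e. about 25% decay
flipped = sharpe_decay(1.0, -0.5)  # OOS turned negative: decay above 100%
```

A negative out-of-sample Sharpe pushes decay above 100%, which is the clearest sign that the in-sample result did not generalise.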

ATR-based position sizing

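Setting the stop distance by ATR rather than by a fixed pip amount, while risking a fixed 1% of capital per trade, means the stop and therefore the position size adapt to current market volatility. This is more robust than a fixed stop with a fixed lot size.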

Risk management
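With a fixed fractional risk, each trade changes capital by the risk fraction times its R multiple (profit or loss divided by the amount risked). The sketch below mirrors the capital update in the backtest engine; `apply_trade` is a hypothetical helper name and the trade sequence is made up.

```python
def apply_trade(capital, r_multiple, risk=0.01):
    """Compound capital by one trade expressed as an R multiple.

    A full stop-out is -1R; a take profit at R:R 2.0 is +2R.
    """
    return capital * (1 + risk * r_multiple)


capital = 10_000.0
capital = apply_trade(capital, 2.0)   # win at R:R 2.0 adds 2%
capital = apply_trade(capital, -1.0)  # stop-out removes 1% of the new balance
print(round(capital, 2))
```

Because the risk is a fraction of current capital, losses shrink in absolute terms during a drawdown and wins grow as the account grows.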

Timeframe alignment

Strategy concepts designed for one timeframe often do not transfer directly to another. ICT setups are intraday by nature — forcing them onto daily data loses their core logic.

Strategy design

Version 1 — March 2026
