Summary: Most EA backtests are flawed due to look-ahead bias and curve-fitting. This guide covers how to detect future functions, avoid overfitting with out-of-sample data, use genetic algorithms correctly, and validate with walk-forward analysis.




# EA Backtesting Accuracy: How to Avoid the Most Common Pitfalls

## The Hard Truth About Backtests

A beautiful backtest equity curve is often a lie. Many EAs that appear incredibly profitable in historical tests fail immediately in forward testing. The culprit is almost always either look-ahead bias (using future information) or overfitting (curve-fitting to historical noise).

According to the MQL5 documentation, "No matter how carefully you design the optimization criteria, the optimizer will find a way to exploit any hidden loophole in your testing method."

## 1. The Future Function Trap

### What Is a Look-Ahead Function?

A future function (also called a look-ahead function) is any code that references price data that would not have been available at the time of trade execution.

Common offenders in MQL4/MQL5:

| Violation | Why It's Wrong |
|:---|:---|
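| Using `Close[0]` as if it were the bar's final close | While bar 0 is still forming, its closing price isn't known |
| `iCustom()` with shift 0 on the current bar | The indicator value is calculated from an unfinished bar and can change before the bar closes |
| `iHighest(NULL,0,MODE_HIGH,10,0)` starting at bar 0 | Includes the current bar's high, which can still move |
| Treating `Time[0]` as the bar's close time | `Time[0]` is the open time of bar 0; the bar is only complete once the next bar opens |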

### Safe vs Unsafe Code Example

Dangerous – Uses future data:
```cpp
// NEVER do this – uses current bar's close before bar closes
double currentClose = Close[0];
double currentHigh = High[0];
if(currentClose > currentHigh - 10 * Point)
{
OpenBuy(); // This close price wasn't available at decision time
}
```

Safe – Uses only confirmed data:
```cpp
// Correct – uses only completed bar data
double previousClose = Close[1]; // Bar 1 (the previous bar) is closed
double previousHigh = High[1];
if(previousClose > previousHigh - 10 * Point)
{
OpenBuy(); // All data was available at the time
}

// For real-time trading on current bar, use Bid/Ask
double currentBid = Bid;
double currentAsk = Ask;
```
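
One common way to make sure entry logic only ever sees completed bars is to evaluate it once per new bar. The sketch below is plain C++ rather than MQL code, so it can be compiled outside the terminal; `NewBarGate` and `IsNewBar` are hypothetical names. In an EA you would feed it the open time of bar 0 on every tick (`Time[0]` in MQL4, `iTime(_Symbol, _Period, 0)` in MQL5).

```cpp
#include <cstdint>

// Hypothetical helper: reports when the bar currently forming has changed,
// which means the previous bar has closed and its OHLC values are final.
class NewBarGate
{
public:
    // barOpenTime: open time of bar 0, in seconds since the epoch.
    // Returns true the first time a given bar open time is seen.
    bool IsNewBar(std::int64_t barOpenTime)
    {
        if (barOpenTime == lastBarOpenTime_)
            return false;               // still inside the same bar
        lastBarOpenTime_ = barOpenTime;
        return true;                    // bar 1 is complete and safe to read
    }

private:
    std::int64_t lastBarOpenTime_ = -1; // no bar seen yet
};
```

Running the signal check only when `IsNewBar` returns true, and reading only shift 1 and older inside it, keeps backtest and live behaviour aligned regardless of the tick model.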

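### Detecting Future Functions

MQL5 has no built-in detector for future functions, but `MQLInfoInteger(MQL_TESTER)` reports whether the program is running in the Strategy Tester, so extra diagnostics can be switched on during backtests:
```cpp
#ifdef __MQL5__
// Illustrative wrapper (the name is arbitrary); call it from OnTick()
void LogFutureFunctionChecks()
{
   if(MQLInfoInteger(MQL_TESTER))
   {
      Print("WARNING: Checking for future functions...");
      // Log if using Close[0] without Open[0] validation
   }
}
#endif
```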
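
A more direct check is a truncation test: compute the signal for bar `i` using only the history up to bar `i`, compute it again with the full history, and flag any bar where the two disagree. The following is a minimal sketch in plain C++, with series stored oldest-first (the opposite of MQL4's indexing). `SignalFn`, `FindLookAheadBar` and the two sample signals are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Signal under test: receives closes (oldest first) and the index of the bar
// being evaluated, and returns -1, 0 or +1.
using SignalFn = std::function<int(const std::vector<double>&, std::size_t)>;

// Returns the index of the first bar whose signal changes once later bars
// become visible, or -1 if none does. A change means the signal depends on
// data that did not exist when bar i was the latest bar.
long FindLookAheadBar(const std::vector<double>& closes, const SignalFn& signal)
{
    for (std::size_t i = 0; i < closes.size(); ++i)
    {
        // History exactly as it looked when bar i was the most recent bar.
        std::vector<double> truncated(closes.begin(),
                                      closes.begin() + static_cast<std::ptrdiff_t>(i) + 1);
        if (signal(truncated, i) != signal(closes, i))
            return static_cast<long>(i);
    }
    return -1;
}

// Honest sample signal: compares bar i with the bar before it.
int MomentumSignal(const std::vector<double>& c, std::size_t i)
{
    if (i == 0) return 0;
    return c[i] > c[i - 1] ? 1 : -1;
}

// Leaky sample signal: peeks at the next bar whenever it exists.
int PeekingSignal(const std::vector<double>& c, std::size_t i)
{
    if (i + 1 >= c.size()) return 0;
    return c[i + 1] > c[i] ? 1 : -1;
}
```

The same idea applies to indicators: export the values an indicator reported in real time and compare them with the values it shows for the same bars after more history has arrived.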

## 2. Overfitting and Curve-Fitting

### The 20-30% Rule

A common rule of thumb in algorithmic trading: if your optimized parameters differ from the default parameters by more than 20-30%, you are likely overfitting.
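
As a rough illustration, the rule can be written as a relative deviation between the default and the optimized value. This is a plain C++ sketch; `RelativeDrift` and `ExceedsDriftLimit` are hypothetical names, and the 30% default limit is simply the upper end of the range quoted above.

```cpp
#include <cmath>

// Relative deviation of an optimized parameter from its default, as a fraction.
// Assumes the default value is non-zero.
// Example: default 14, optimized 21 gives 0.5 (a 50% change).
double RelativeDrift(double defaultValue, double optimizedValue)
{
    return std::fabs(optimizedValue - defaultValue) / std::fabs(defaultValue);
}

// True when the drift is larger than the chosen limit (30% unless overridden).
bool ExceedsDriftLimit(double defaultValue, double optimizedValue, double limit = 0.30)
{
    return RelativeDrift(defaultValue, optimizedValue) > limit;
}
```

A flagged parameter is not proof of overfitting, only a prompt to look harder at the out-of-sample results for that setting.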

Signs of Overfitting:
  • The equity curve looks "too perfect" (smooth upward with tiny drawdowns)

  • The number of trades is very small (under 100-200)

  • Performance collapses when shifting the backtest period by one month

  • Parameters are extremely specific (e.g., period=14.7, deviation=2.13)


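### Walk-Forward Validation Method

The core of walk-forward validation is to judge parameters only on data the optimizer never saw. The MT5 Strategy Tester supports this split through its Forward setting: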

```
Step 1: Divide data into In-Sample (IS) and Out-of-Sample (OOS)
Step 2: Optimize on IS period
Step 3: Test best parameters on OOS period WITHOUT re-optimizing
Step 4: If OOS performance degrades by <30%, parameters are robust
Step 5: If OOS performance degrades by >50%, you have overfit
```
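
Steps 4 and 5 can be sketched as a small calculation. The code is plain C++ with hypothetical function names. The steps above do not say what to conclude between 30% and 50%, so the "inconclusive" label for that band is an assumption of this sketch.

```cpp
#include <string>

// Fractional drop from in-sample to out-of-sample performance.
// Both values must be the same metric (for example net profit);
// isMetric is assumed to be positive.
double Degradation(double isMetric, double oosMetric)
{
    return (isMetric - oosMetric) / isMetric;
}

// Applies the thresholds from Steps 4 and 5.
std::string ClassifyRobustness(double isMetric, double oosMetric)
{
    const double d = Degradation(isMetric, oosMetric);
    if (d < 0.30) return "robust";
    if (d > 0.50) return "overfit";
    return "inconclusive"; // assumption: the 30-50% band is a grey zone
}
```

For example, an in-sample net profit of 10,000 followed by 8,000 out of sample is a 20% degradation and counts as robust, while 3,000 out of sample is a 70% degradation and counts as overfit.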

Example timeline structure:

| Data Segment | Purpose | Length |
|:---|:---|:---|
| 2020-2022 | In-Sample (Optimization) | 3 years |
| 2023-2024 | Out-of-Sample (Validation) | 2 years |
| 2025 | Forward test (live/demo) | 1 year |
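
The timeline above is a single split. Walk-forward analysis in the stricter sense repeats that split over rolling windows, so that several out-of-sample segments are tested. Here is a minimal plain C++ sketch of how such windows could be generated; `WalkForwardWindow` and `BuildWindows` are hypothetical names, and whole calendar years are used only to keep the example short.

```cpp
#include <vector>

struct WalkForwardWindow
{
    int isStartYear;   // first year of the in-sample (optimization) segment
    int isEndYear;     // last year of the in-sample segment, inclusive
    int oosStartYear;  // first year of the out-of-sample segment
    int oosEndYear;    // last year of the out-of-sample segment, inclusive
};

// Builds rolling windows: optimize on isYears, validate on the following
// oosYears, then slide forward by oosYears so OOS segments never overlap.
std::vector<WalkForwardWindow> BuildWindows(int firstYear, int lastYear,
                                            int isYears, int oosYears)
{
    std::vector<WalkForwardWindow> windows;
    if (isYears <= 0 || oosYears <= 0) return windows;

    for (int start = firstYear;
         start + isYears + oosYears - 1 <= lastYear;
         start += oosYears)
    {
        windows.push_back({start,
                           start + isYears - 1,
                           start + isYears,
                           start + isYears + oosYears - 1});
    }
    return windows;
}
```

`BuildWindows(2020, 2024, 3, 2)` reproduces the single split in the table, while `BuildWindows(2020, 2024, 3, 1)` yields two windows (optimize 2020-2022 and test 2023, then optimize 2021-2023 and test 2024).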

## 3. Genetic Algorithm Optimization Best Practices

The genetic algorithm in the MT5 Strategy Tester is powerful but easily misused. The tester's built-in "Fast genetic based algorithm" mode does not expose its internal settings, so the values below apply when you run a custom or external GA optimizer.

Typical GA Settings:

| Parameter | Recommended Value | Why |
|:---|:---|:---|
| Initial Population | 1000-2000 | Avoids local optimum traps |
| Generation Count | 50-100 | Enough for convergence |
| Crossover Probability | 0.7-0.9 | Recombines strong parameter sets |
| Mutation Probability | 0.01-0.05 | Maintains genetic diversity and prevents premature convergence |
| Convergence Tolerance | 0.1-0.5% | Stop when no improvement |
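
To make the role of each setting concrete, here is a toy single-parameter GA in plain C++. It is only a sketch of the general technique, not a reproduction of the Strategy Tester's internal algorithm. `GaSettings` and `RunToyGa` are hypothetical names, and tournament selection, arithmetic crossover and elitism are choices made for this example.

```cpp
#include <functional>
#include <random>
#include <vector>

// Hypothetical settings struct mirroring the table above.
struct GaSettings
{
    int populationSize = 1000;
    int generations = 50;
    double crossoverProbability = 0.8;
    double mutationProbability = 0.03;
};

// Toy GA that maximizes fitness(x) for one parameter x in [lo, hi].
double RunToyGa(const std::function<double(double)>& fitness,
                double lo, double hi, const GaSettings& s, unsigned seed)
{
    if (s.populationSize < 2) return lo;            // nothing to evolve

    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> gene(lo, hi);
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    std::uniform_int_distribution<int> pick(0, s.populationSize - 1);

    // Initial population: random values spread over the whole range.
    std::vector<double> pop(s.populationSize);
    for (double& x : pop) x = gene(rng);

    double best = pop[0];
    for (double x : pop) if (fitness(x) > fitness(best)) best = x;

    for (int g = 0; g < s.generations; ++g)
    {
        // Tournament selection: the fitter of two random individuals.
        auto select = [&]() {
            const double a = pop[pick(rng)], b = pop[pick(rng)];
            return fitness(a) >= fitness(b) ? a : b;
        };

        std::vector<double> next;
        next.push_back(best);                       // elitism: never lose the best
        while (static_cast<int>(next.size()) < s.populationSize)
        {
            const double p1 = select(), p2 = select();
            double child = p1;
            if (unit(rng) < s.crossoverProbability)
                child = 0.5 * (p1 + p2);            // arithmetic crossover
            if (unit(rng) < s.mutationProbability)
                child = gene(rng);                  // mutation: fresh random value
            next.push_back(child);
        }
        pop.swap(next);
        for (double x : pop) if (fitness(x) > fitness(best)) best = x;
    }
    return best;
}
```

A larger population samples the range more densely at the start, while mutation keeps injecting values that crossover alone would never produce, which is why both settings matter for escaping local optima.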

Complete GA Optimization Code Snippet:

```cpp
//+------------------------------------------------------------------+
//| Input parameters for GA optimization                             |
//+------------------------------------------------------------------+
input double TakeProfitPoints = 50.0; // TP in points (20 to 200)
input double StopLossPoints   = 30.0; // SL in points (15 to 150)
input int    MA_Period        = 14;   // MA period (5 to 50)
input double LotSize          = 0.01; // Fixed lot (0.01 to 0.10)

// In Strategy Tester, the optimization range is defined by:
// min=20, max=200, step=5 for TakeProfitPoints
// min=15, max=150, step=5 for StopLossPoints
// min=5, max=50, step=1 for MA_Period

//+------------------------------------------------------------------+
//| Optimization fitness function                                    |
//+------------------------------------------------------------------+
double OnTester()
{
   // Never optimize on a single metric only
   double netProfit = TesterStatistics(STAT_PROFIT);
   double sharpe    = TesterStatistics(STAT_SHARPE_RATIO);
   double drawdown  = TesterStatistics(STAT_EQUITY_DDREL_PERCENT);
   double trades    = TesterStatistics(STAT_TRADES);

   // Guard against insufficient trades
   if(trades < 50) return -DBL_MAX;

   // Multi-metric fitness: profit + sharpe - drawdown penalty
   double fitness = (netProfit / 1000) + (sharpe * 100) - (drawdown * 2);

   return fitness;
}
```

### The Sharpe Ratio Requirement

Many developers optimize purely for profit. This is a mistake. A robust EA should have:

| Metric | Acceptable Threshold |
|:---|:---|
| Sharpe Ratio | > 0.7 |
| Profit Factor | > 1.5 |
| Max Drawdown | < 25% |
| Average Trade | > 2x spread |
| Number of trades | > 500 (5 years of data) |

## 4. Modeling Quality and Tick Data

### The 90% Rule

MQL5 documentation states: *"Modeling quality below 90% indicates that the test results cannot be relied upon."*

| Modeling Quality | Reliability |
|:---|:---|
| 99% | Very High (use raw ticks) |
| 90-98% | Acceptable |
| 80-89% | Questionable |
| <80% | Unreliable – do not use |

To achieve high modeling quality:
1. Download tick data (not just M1 bars)
2. Use Every tick mode (in MT5, preferably "Every tick based on real ticks"), not "Control points" or "Open prices only"
3. Ensure your date range has full tick history

Data Validation Code:

```cpp
// Check if the current symbol has sufficient H1 history.
// Illustrative wrapper (the name is arbitrary); call it from OnInit().
bool HasSufficientHistory()
{
   datetime startDate = D'2020.01.01';
   datetime currentDate = TimeCurrent();

   int barsAvailable = Bars(Symbol(), PERIOD_H1, startDate, currentDate);
   if(barsAvailable < 5000)
   {
      Print("WARNING: Insufficient data. Only ", barsAvailable, " bars available.");
      Print("Download more history via Tools > Options > Charts > Max bars in chart");
      return false;
   }
   return true;
}
```

## 5. Monte Carlo Simulation

After optimization, run Monte Carlo analysis to test robustness. The principle is simple: randomly drop, resample, or reorder the recorded trades and check that the result stays profitable with an acceptable drawdown.

Manual Monte Carlo approach:
```cpp
// During backtest, record all trades
struct TradeRecord
{
   datetime openTime;
   double profit;
   int barsHeld;
};

// After backtest, randomly resample 1000 times
// If >90% of resampled runs are profitable, the strategy is robust
```
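
The resampling step can be sketched as a bootstrap: draw trades with replacement to build synthetic trade sequences of the original length, then count how many end in profit. The code below is plain C++ meant to run on exported trade profits outside the terminal. `ProfitableFraction` is a hypothetical name, and resampling with replacement is one of several possible Monte Carlo variants.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Returns the fraction of `runs` bootstrap samples whose total profit is
// positive. Each sample draws tradeProfits.size() trades with replacement.
double ProfitableFraction(const std::vector<double>& tradeProfits,
                          int runs, unsigned seed)
{
    if (tradeProfits.empty() || runs <= 0) return 0.0;

    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, tradeProfits.size() - 1);

    int profitable = 0;
    for (int r = 0; r < runs; ++r)
    {
        double total = 0.0;
        for (std::size_t i = 0; i < tradeProfits.size(); ++i)
            total += tradeProfits[pick(rng)];   // one randomly chosen trade
        if (total > 0.0) ++profitable;
    }
    return static_cast<double>(profitable) / runs;
}
```

A call such as `ProfitableFraction(profits, 1000, 42)` returning more than 0.90 would satisfy the robustness criterion stated above. Pure reordering of trades leaves the total profit unchanged, so a shuffle-based variant should track maximum drawdown instead.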

## 6. Complete Validation Checklist

Before trusting any backtest result:

  • [ ] No `Close[0]`, `High[0]`, `Low[0]` used in entry conditions

  • [ ] All indicators use shift >= 1 for signals

  • [ ] In-sample period contains >= 500 trades

  • [ ] Out-of-sample performance degradation < 30%

  • [ ] Modeling quality >= 90%

  • [ ] Sharpe ratio > 0.7

  • [ ] Profit factor > 1.5

  • [ ] Maximum drawdown < 25%

  • [ ] At least two different market regimes tested (trending + ranging)

  • [ ] Monte Carlo simulation shows >90% profitable resamples


## Summary

| Pitfall | Detection | Solution |
|:---|:---|:---|
| Future functions | Code review | Replace `[0]` with `[1]` or use Bid/Ask |
| Overfitting | OOS test degrades badly | Reduce parameters, increase in-sample period |
| Low modeling quality | Check tester report | Download tick data, use Every tick |
| Single-metric optimization | Sharpe ratio <0.7 | Use multi-metric fitness function |
| Insufficient trades | <500 trades | Extend backtest period |

A strategy that survives rigorous backtesting validation has a real chance in live markets. One that doesn't will almost certainly fail.

References:
1. MQL5 Documentation – Strategy Tester: Modeling Quality
2. MQL5 Documentation – OnTester Function for Custom Optimization
3. MetaQuotes – Genetic Algorithm Optimization Guide (2025)
4. QuantConnect – Backtesting Pitfalls and Overfitting Detection
5. MQL4/MQL5 Community Forum – Future Function Detection Methods (2025)