The answer to this question isn’t particularly straightforward, but few things in this game are.
Try to think about it this way: imagine you have a coin with a 50% chance of landing heads or tails. How many times would you have to toss the coin to PROVE EXPERIMENTALLY that the chance of getting a head or a tail is 50%?
If you toss it once, the experimental result is going to be 100% heads or 100% tails, right? So you need to toss it at least twice! But then you might get a head and a tail and the correct answer of 50%, or you might get 2 heads or 2 tails!
The more times you toss the coin, the closer you’ll get to the theoretical 50%. So how many trials do you need? Well, the error between the expected number of heads and the number you actually count grows roughly with the square root of the number of trials (for a fair coin that is about two standard deviations, so it covers roughly 95% of outcomes). So if you toss the coin 9 times, you’d expect 4.5 heads with an error of plus or minus 3, so you might get anywhere between 1 and 8 heads! If you tossed the coin 400 times, you’d expect 200 heads in theory, but with an error of plus or minus 20, so somewhere between 180 and 220 heads. That puts the measured rate between 45% and 55%, so within about 5 percentage points of the true value. The error in the count gets bigger, but as a fraction of the number of tosses it shrinks.
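If you’d rather see the square root rule than take it on trust, here is a minimal sketch that tosses a simulated coin and reports the band that holds the middle 95% or so of experiments. The function name heads_range and its parameters are made up for this illustration, and the number of repeat experiments is an arbitrary choice.

```python
import random

def heads_range(n_tosses, n_experiments=2000, p_heads=0.5, seed=1):
    """Repeat an n-toss experiment many times and return the (low, high)
    head counts that bracket roughly the middle 95% of experiments."""
    rng = random.Random(seed)
    counts = sorted(
        sum(rng.random() < p_heads for _ in range(n_tosses))
        for _ in range(n_experiments)
    )
    low = counts[int(0.025 * n_experiments)]
    high = counts[int(0.975 * n_experiments) - 1]
    return low, high

for n in (9, 100, 400):
    low, high = heads_range(n)
    # n ** 0.5 is the rule-of-thumb error in the head count
    print(f"{n} tosses: {low} to {high} heads (rule of thumb: +/- {n ** 0.5:.0f})")
```

The band in head counts gets wider as the number of tosses goes up, but much more slowly than the number of tosses itself, which is the whole point.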
So how does this relate to trading systems?
Well, if I gave you a trading system with a 50% win rate, it’s going to be like the coin example. In your first 9 trades, for example, you might expect anywhere between 1 and 8 wins. If you only got 1 win and 8 losses, the chances are you’d abandon the system. If you got 8 wins, you’d think you’d found the holy grail. But in reality you’d just have a system with a 50% win rate.
Now imagine I gave you a weighted coin that came up heads 100% of the time. How many trials would you need to get the correct answer by experiment? Every toss gives the same result, so there is no scatter at all.
So as the system’s win rate moves away from the 50% level, this is going to affect the error in your experimental testing too: the scatter in the win count is largest at 50% and shrinks as the win rate heads towards 0% or 100%. It gets even more complex because for most systems it’s not just a matter of wins and losses; the size of those wins and losses varies too.
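To put rough numbers on how the win rate changes the scatter, here is a small sketch using the standard binomial formula, sqrt(n * p * (1 - p)), for the standard deviation of the win count. It assumes every trade is independent and has the same win rate, which real trades often aren’t. The name win_count_sd is just for this example.

```python
import math

def win_count_sd(n_trades, win_rate):
    """Standard deviation of the number of wins, assuming independent
    trades that all share the same win rate (binomial model)."""
    return math.sqrt(n_trades * win_rate * (1.0 - win_rate))

for p in (0.5, 0.7, 0.9, 1.0):
    band = 2 * win_count_sd(100, p)  # about two standard deviations
    print(f"win rate {p:.0%}: roughly +/- {band:.1f} wins in 100 trades")
```

The band is widest at 50% and collapses to nothing at 100%, which is the weighted coin case.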
The answer to your question is that you probably have to go down the simulation route. So simulate 10 trades many times over and find the upper and lower boundaries in profit or loss, then do the same for 50 trades, 100 trades, 200 trades and so on. The range between the upper and lower boundaries, measured per trade, should start to converge. At that point it’s up to you to decide what sort of error rate in testing you can live with.
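Something along these lines would do as a first pass at that simulation. Everything specific in it is an assumption made up for the example, not taken from any real system: a 50% win rate, an average win of 1.5 units, an average loss of 1 unit, and win and loss sizes drawn from an exponential distribution. The function name pnl_band is hypothetical too. Note that it tracks average profit per trade, because the band on total profit keeps widening in absolute terms as you add trades.

```python
import random

def pnl_band(n_trades, n_runs=2000, win_rate=0.5,
             avg_win=1.5, avg_loss=1.0, seed=7):
    """Simulate n_runs sequences of n_trades each and return the
    (5th, 95th) percentiles of average profit per trade."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_runs):
        total = 0.0
        for _ in range(n_trades):
            if rng.random() < win_rate:
                # expovariate(1 / m) draws a positive size with mean m
                total += rng.expovariate(1.0 / avg_win)
            else:
                total -= rng.expovariate(1.0 / avg_loss)
        results.append(total / n_trades)
    results.sort()
    return results[int(0.05 * n_runs)], results[int(0.95 * n_runs) - 1]

for n in (10, 50, 100, 200):
    low, high = pnl_band(n)
    print(f"{n:>3} trades: {low:+.2f} to {high:+.2f} per trade")
```

Swap in your own win rate and win and loss sizes, and watch how many trades it takes before the lower boundary stops dipping below zero.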
The size of your edge is important too. Very large edges require less testing; smaller edges need longer testing to differentiate themselves from statistical noise.
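As a back-of-envelope illustration of that, the usual standard error argument says the average result of n trades has a noise level of about sd / sqrt(n), so the edge only stands clear of zero once n is around (z * sd / edge) squared. The sketch below assumes independent trades and a per-trade standard deviation that stays constant, neither of which the market guarantees. The name trades_needed and the default z of 2 are choices made for this example.

```python
import math

def trades_needed(edge_per_trade, sd_per_trade, z=2.0):
    """Rough number of trades before the average result sits about
    z standard errors away from zero."""
    return math.ceil((z * sd_per_trade / edge_per_trade) ** 2)

for edge in (0.5, 0.25, 0.1):
    n = trades_needed(edge, sd_per_trade=1.0)
    print(f"edge of {edge} per trade (sd 1.0): about {n} trades")
```

Halving the edge roughly quadruples the number of trades you need, which is why small edges are so painful to test.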
And of course, as all this is happening in real time, the market is constantly changing too, so any assumptions you make about win rate or volatility in your simulation are probably going to be wrong, but it’s a starting point.
It’s all well and good someone telling you that you need 400 trades, or even showing you the maths to calculate a number, but if you are trading a daily chart and there haven’t been 400 trading opportunities over the past 20 years, you are pretty much screwed. What are you going to do, spend the next 100 years forward testing? So you have to take practical constraints into account, or design a system with a trading frequency that can be forward tested in a reasonable time period.
Perhaps not the answer you were looking for, but hopefully it’s useful to somebody.