Walk Forward Analysis is quite a hot topic on here but I would like to highlight some of the benefits that are less readily covered.
I personally like walk forward analysis, or at least I prefer it to the alternatives. As a general rule it is a good idea to make the way a strategy is tested the same as the way it is traded. If you do not respect this rule then you can run into a lot of problems with back test results not matching reality, and it is best to just remove this as a degree of freedom if you can. The points made in this post assume that the back testing framework you use is reflective of production. For more information on this please see the post in regards to data.
I personally see walk forward analysis as being separate to back testing. I see it as a layer on top of back testing which is used for strategy optimisation and selection. The selection process is dependent on individual backtest results but the power of walk forward analysis is its ability to give you a probability distribution for a strategy's forward performance. Without this it is very difficult to know if a strategy is broken or if you are just unlucky. It is very important to manage your losers and WFA gives you an effective way to do this.
I would say there are 3 normal ways people test strategies.
- Take all of the data available, optimise their strategy in sample and then start trading it.
- Split the data into an in sample and an out of sample set, fit in sample and then test out of sample and make sure you make money in both.
- Split the data into slices, use testing strategy 2 but then repeat the process as you walk forward through the data. This is WFA.
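Testing strategy 3 can be sketched in a few lines of Python. This is only an illustration, not a full framework: `walk_forward_windows` is a hypothetical helper name, the lengths are counted in bars (months here), and I am assuming each step moves forward by one out of sample period.

```python
def walk_forward_windows(n_bars, in_sample, out_sample):
    """Yield (fit_range, test_range) index pairs that walk forward through the data.

    Hypothetical helper: lengths are in bars, and each step moves forward
    by one out of sample period so the test slices never overlap.
    """
    start = 0
    while start + in_sample + out_sample <= n_bars:
        fit = range(start, start + in_sample)
        test = range(start + in_sample, start + in_sample + out_sample)
        yield fit, test
        start += out_sample


# 10 years of monthly bars: fit on 12 months, then test on the next 3
windows = list(walk_forward_windows(120, in_sample=12, out_sample=3))
first_fit, first_test = windows[0]
print(len(windows), first_fit, first_test)

# Testing strategy 2 is the special case where one cycle uses up all the data
single = list(walk_forward_windows(120, in_sample=90, out_sample=30))
print(len(single))
```

With 120 monthly bars the first call gives 36 cycles, because the first 12 months are only ever used for fitting. The second call gives exactly one cycle, which is testing strategy 2.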
Most people should know testing strategy 1 is like juggling hand grenades. What is less well known is that testing strategy 2 is a specific example of walk forward analysis: one in which the in sample and out of sample periods are so long that 1 cycle of the walk forward takes up your whole dataset. Succinctly, testing strategy 2 is a subset of WFA, and as such it cannot give you more than WFA can.
Say that you use testing strategy 2 to fit and test an EA you have written. Say for instance that you use a 1 year out of sample period. What will you do after you have traded it for a year?
- leave it alone
- re-calibrate it
- bin it
If the EA made money then I can see the argument for "don't fix what isn't broken". But if it didn't, then how do you know whether to re-fit it or to put it in the bin? How bad does it have to suck before you give up? This is a difficult question to answer if you have not traded that strategy for multiple years or have not used walk forward analysis.
You can lose money for many reasons but for simplification I would like to say there are two main ones.
- God hates you and it's just not your day.
- Your strategy is fitted to a specific market regime which is no longer valid due to structural changes in the market
Before you stop trading a strategy you need to know which one of these two things is the reason you are not making money. Walk forward analysis provides you with a handy way to do this.
The most important result from a selection procedure is that it should give you predictability in regards to what will happen in the future. If the back test says that you made money in sample then, when you trade it, there should be a good chance you will make money out of sample, i.e. when you are live trading. One of the most important things when selecting a calibration of a strategy is that if it is in the top x performers in sample, then it should consistently be in the top x performers out of sample. If this is the case then you are likely to have found something robust. This is where walk forward analysis is useful. Say, for instance, you run a walk forward with an in sample period of 1 year and an out of sample period of 3 months. Then over 10 years of out of sample quarters you have 40 in sample and 40 out of sample results (36 if the first year of fitting has to come out of the same 10 years). The question is: if the fitted strategy makes $10,000 in sample, what is the expectation of the out of sample result? This can be worked out using a simple linear regression of the out of sample results on the in sample results, and there are a few results that can happen when doing this.
- There is no relation => your fundamental idea/strategy is crap
- There is a relation but the variance is really high => you should trade it but only when your expected pnl is very high
- There is a good relationship with low variance => you just found your own personal atm
- There is a good relationship with low variance => you threw a million calibrations at it so you were always going to find one calibration like this.
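The regression itself is only a few lines. Below is a minimal sketch using NumPy, assuming you already have one in sample pnl and one out of sample pnl per walk forward cycle. The function name `is_oos_regression` is made up for this example and the numbers are synthetic, generated purely to have something to fit; they are not real trading results.

```python
import numpy as np


def is_oos_regression(in_sample_pnl, out_sample_pnl):
    """Fit out_sample = slope * in_sample + intercept by least squares."""
    x = np.asarray(in_sample_pnl, dtype=float)
    y = np.asarray(out_sample_pnl, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    return {
        "slope": slope,
        "intercept": intercept,
        "r": np.corrcoef(x, y)[0, 1],        # strength of the relation
        "resid_std": residuals.std(ddof=2),  # scatter around the fitted line
    }


# Synthetic pnls for 40 walk forward cycles, for illustration only
rng = np.random.default_rng(0)
in_pnl = rng.normal(8000, 3000, size=40)
out_pnl = 0.25 * in_pnl + rng.normal(0, 1000, size=40)

fit = is_oos_regression(in_pnl, out_pnl)
# Expected out of sample pnl if the fit made $10,000 in sample
expected_oos = fit["slope"] * 10_000 + fit["intercept"]
print(fit, expected_oos)
```

A correlation `r` near zero is the "no relation" case, a decent `r` with a large `resid_std` is the high variance case, and a decent `r` with a small `resid_std` is one of the last two cases. The regression on its own cannot tell you which of those two you are in.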
This is also useful as it answers the question of what happens if we fit this strategy over 2013 and trade it for Q1 in 2014. If it made $10,000 in the fit over 2013, what can we expect the results for Q1 to be? Pro rata, the expected pnl for Q1 is 2.5k. If the walk forward says that the std. deviation is 2.5k a year, which is 1.25k a quarter if it scales with the square root of time, then you can expect the pnl in Q1 to be 2.5k ± 1.25k, with only a 2.28% chance of it coming in more than two standard deviations low, i.e. below zero (1 tail test, assumes Gaussian pnl distribution and sufficient number of trades). If you are way outside of that it's worth taking a look at some of the assumptions made in the selection process.
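One way to make those numbers line up is sketched below. It assumes the 2.5k standard deviation is a yearly figure that scales with the square root of time. That scaling is an assumption added for this sketch, alongside the Gaussian assumption already mentioned. The variable names are just for illustration.

```python
from math import sqrt
from statistics import NormalDist

annual_fit_pnl = 10_000    # pnl of the fit over 2013
annual_std = 2_500         # assumed here to be a yearly standard deviation
fraction_of_year = 0.25    # we only trade Q1

expected_q1 = annual_fit_pnl * fraction_of_year   # pro rata expectation
std_q1 = annual_std * sqrt(fraction_of_year)      # square root of time scaling

q1 = NormalDist(mu=expected_q1, sigma=std_q1)
p_below_zero = q1.cdf(0)   # one tail: two standard deviations below the mean


def z_score(live_pnl):
    """How many standard deviations a live Q1 pnl is from the expectation."""
    return (live_pnl - expected_q1) / std_q1


print(expected_q1, std_q1, round(p_below_zero, 4))
print(z_score(-1_000))
```

Under these assumptions the 2.28% is the chance of a losing quarter. A live result with a `z_score` well below -2 is a hint that something other than bad luck is going on.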
This is also an important point in regards to trading automated strategies. I see a lot of people who expect to trade a strategy continuously and for it to continue to work forever. This assumption often comes from a lack of understanding of the markets and where trading opportunities come from. Very few last forever, most are fleeting. You should only trade an EA when you have confidence in its forward performance. There are a couple of ways of gaining that confidence: one is to trade it for years, the other is WFA. Either way you need to know when the opportunity you are making money from is gone.
This leads me neatly into the next thing I wanted to discuss: in a walk forward analysis, what is the correct length of the in and out of sample periods? Good question pip, I'm glad you asked!
It depends upon the market conditions that you would like to exploit and how long you think that regime will last. We are currently in a state of low vol, and strategies that require high vol (you know who you are) are currently having a hard time. This is where the trade-off is: a short out of sample period runs the risk of not locking into a regime, and thus giving bad out of sample results, while a really long out of sample period means that you cannot adapt to changes of regime very quickly and can have long drawdowns. So what to do? My answer is not to care about the specific length of the in or out of sample periods, but to look at which lengths give you the best predictability between the in and out of sample results. There is a balance to be struck and it's specific to the individual strategy, but this is how it can be struck and tested for. A long in sample period can give a good mean but bad variance of out of sample performance, while a short in sample period can give a bad mean but low variance of out of sample performance. Really you are looking for the best expected pnl out of sample with the smallest expected variance, so that you can be confident you will be successful when you are trading live.
I am hoping from the last paragraph that you have noticed something weird. I am suggesting that there should be an optimisation of the selection procedure used to optimise the overall strategy (insert Inception reference). This is exactly what I am suggesting, because the alternative is that you use the same in and out of sample periods for all strategies. It is not possible to say it makes sense to always fit a strategy on 2 years' worth of data. It's a magic number, and magic numbers should always be questioned. The right answer is: I should select the calibration fitted over x months/weeks/days of data because it gives me the highest expected return over the next y months/weeks/days.
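As a rough sketch of what picking x and y could look like: run the walk forward once per candidate pair of lengths, collect the out of sample pnls, and compare them. The score below (mean divided by standard deviation) is just one possible way to trade off expected pnl against variance, not the only one. The function names are hypothetical and the pnls are made up for illustration. They are assumed to be already scaled to pnl per month so that different out of sample lengths are comparable.

```python
from statistics import mean, stdev


def score(oos_pnls):
    """Mean out of sample pnl per unit of its standard deviation."""
    spread = stdev(oos_pnls)
    return mean(oos_pnls) / spread if spread > 0 else float("-inf")


def best_window(results):
    """results maps (in_sample_months, out_sample_months) -> list of OOS pnls."""
    return max(results, key=lambda lengths: score(results[lengths]))


# Made-up out of sample pnls per month for three window choices
results = {
    (24, 6): [900, -1500, 2600, 400, -800, 3100],  # good mean, bad variance
    (12, 3): [700, 500, 900, 300, 800, 600],
    (3, 1): [100, -50, 150, 0, 120, -30],          # low variance, bad mean
}

print(best_window(results))
for lengths, pnls in results.items():
    print(lengths, round(mean(pnls), 1), round(stdev(pnls), 1))
```

Keep in mind that this is itself a fitting step, so the more window lengths you try, the more likely one of them looks good by chance.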
The other thing that should be highlighted is that WFA is like a seat belt. When seatbelts were introduced in the US the death rate increased for a short time because everyone started driving like d1cks, thinking "I have a seatbelt on so I can't die". If you throw enough indicators and calibrations at a good WFA strategy selector you will always find a few specific instances that pass all of the tests you throw at them even though they will not make you money. This is where I think a lot of the grievances in regards to walk forward testing come from. My argument to this is the same as the argument in regards to the seat belt. I realise that it will not save my life in all eventualities, but I think if I drive carefully it will help me to survive most of the unexpected eventualities.
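To put a rough number on "throw enough calibrations at it", here is a back of the envelope sketch. It assumes a calibration with no edge is a coin flip in each out of sample window, that windows are independent, and that the test is simply "profitable in every window". All three are simplifications chosen for the example, and `false_pass_stats` is a hypothetical name.

```python
def false_pass_stats(n_calibrations, n_windows, p_win=0.5):
    """Chance that no-edge calibrations pass an 'all windows profitable' test."""
    p_pass = p_win ** n_windows                       # one calibration passing by luck
    expected_passers = n_calibrations * p_pass        # how many lucky ones to expect
    p_at_least_one = 1 - (1 - p_pass) ** n_calibrations
    return expected_passers, p_at_least_one


for n in (100, 10_000, 1_000_000):
    print(n, false_pass_stats(n, n_windows=10))
```

With 10 windows a single worthless calibration passes about once in a thousand tries, so at a million calibrations you should expect hundreds of them to pass on luck alone.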
I hope this helps.