Forex Statistics and Probability

Hi guys

I have a question for you :slight_smile:

How useful is statistical data obtained by studying historical prices?

I will try to explain by giving an example: Let’s say I found out by studying historical prices that a signal from an indicator gave a 60% probability of winning 20 pips. Can I use that in future trades? Will it work? Why? or Why not?

I nominate Lexy, Clint, and Peterma for an answer to this :slight_smile:

For me, I find statistical analysis very appealing, but in the end there is no guarantee of price history repeating itself (or not repeating itself).

"If you want a guarantee, buy a toaster."
Clint Eastwood


Excellent :slight_smile:

Variably useful. The range of variation can go all the way from "indispensable" to "of bugger-all use".

Yes.

Sometimes.

Because sometimes your back-testing information will be adequate, accurate and statistically significant enough for forward-testing or forward-trading to produce very similar overall outcomes over a significant timescale, and sometimes the market won’t have changed in ways that impact adversely on whatever you’re looking at. You can try to verify this, by doing some forward-testing as well as back-testing. This equates (near enough) to what statisticians call "in-sample" and "out-of-sample" analysis.
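To make that in-sample/out-of-sample idea concrete, here is a minimal sketch (purely hypothetical outcomes, not anything from this thread) of splitting a chronological record of signal results and checking whether the win rate holds up out of sample:

```python
import numpy as np

# Hypothetical outcomes: 1 = the signal reached its target, 0 = it didn't,
# in chronological order. In practice these would come from your own back-test.
outcomes = np.random.default_rng(0).binomial(1, 0.6, size=500)

split = int(len(outcomes) * 0.7)              # e.g. first 70% is "in-sample"
in_sample, out_of_sample = outcomes[:split], outcomes[split:]

print(f"In-sample win rate:     {in_sample.mean():.1%}")
print(f"Out-of-sample win rate: {out_of_sample.mean():.1%}")
# If the out-of-sample rate collapses, the in-sample figure was probably
# back-fitted rather than a real, persistent edge.
```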

It also depends on the basis of the underlying ideas and method.

If you apply the principle to the trades of the top-performing Zulutrade traders, for example, the value of it will be roughly [I]zero[/I], because they’re almost all using high-risk, highly backfitted ("designed specifically to have back-tested well, typically at the cost of being unsound") methods which will eventually crash and burn, and the fact that they’ve performed well (and apparently safely) over the previous 6 months doesn’t in any way imply that they’ll continue to do so for the next 6 months (in fact, if anything, it probably rather implies the exact opposite).

On the other hand, if you take a method out of a book by Bob Volman or Al Brooks and back-test it yourself (not easy to do, admittedly), the results you’ll get are highly likely to be duplicated going forward as well, subject to your own skill-set, of course.

That’s because some methods are reproducible while others aren’t.

Or, as Clint put it so much more succinctly and eloquently above, "If you want a guarantee …" :slight_smile:

For a broad overview of the probability and statistics involved in this subject, as applicable to forex trading, you probably need look no further than the book [I]Profitability and Systematic Trading[/I] by Michael Harris (Wiley, 2007).

Your reply is so sharp and to the point!

I hope the O.P. will appreciate it too :slight_smile:

I would just add the words ‘market conditions’ to stress what you so eloquently put, that is, how historic data and back-testing can fail due to changing market conditions… As Anton Kreil once said: when the machines fail, humans need to step in… Any E.A. needs human management… As Anna Coulling says: trading is more art than science.

Thank you guys for your answers! I have to admit, I was expecting to be silenced. You gave me more confidence :smiley:

Thund3r can no more be silenced than l1ghtn1ng can be extinguished. :wink:

I consider there to be three types of probabilities to be looked at in a scenario like this. I call them Hit %, non-hit %, and trade %.

What you have discovered is hit %: the probability that the trade will move in your favor by your criteria (in this case 60%, 20 pips).
The second probability is playing the other side of it. Non-hit % would ask a question like: what is the probability that price will not go down by 20 pips?
The last is executing a full "trade scenario": what is the probability that price will hit +20 pips before going down 20 pips?
The combination of hit % and non-hit % is the same as the trade %, but I think you can learn a lot more from dissecting the pieces than from the combined figure alone. You can continue to combine them and you can sometimes find some cool relationships.

For example, if your non-hit threshold is basically the same as your stop loss, then the lower the number, the more likely you can go for a position and hold it. The hit % and non-hit % together (in other words, the probability for price to move 20 pips up then 20 pips down, or vice versa) can tell you about the "exhaustion" probability of the current price.
Stat trading is a tricky route, but it certainly rewards creativity and hard work.
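For anyone who wants to play with these three numbers, here is a rough sketch of estimating hit %, non-hit % and trade % from a price series. The fixed 20-pip target and stop, the pip size, the look-ahead horizon, and the toy random-walk data are all assumptions for illustration:

```python
import numpy as np

PIP = 0.0001
TARGET = 20 * PIP        # +20 pips target
STOP = 20 * PIP          # 20-pip adverse move (assumed symmetric here)

def three_probabilities(prices, horizon=100):
    """Estimate hit %, non-hit % and trade % from a 1-D array of prices."""
    hit = non_hit = trade_win = samples = 0
    for i in range(len(prices) - horizon):
        entry = prices[i]
        window = prices[i + 1:i + 1 + horizon]
        up = np.where(window >= entry + TARGET)[0]
        down = np.where(window <= entry - STOP)[0]
        samples += 1
        if up.size:                   # price reached +20 pips at some point
            hit += 1
        if not down.size:             # price never fell 20 pips
            non_hit += 1
        # trade %: did +20 pips come before -20 pips?
        if up.size and (not down.size or up[0] < down[0]):
            trade_win += 1
    return hit / samples, non_hit / samples, trade_win / samples

# Toy random-walk data, just to show the mechanics:
prices = 1.1000 + np.cumsum(np.random.default_rng(1).normal(0, PIP, 5000))
print(three_probabilities(prices))
```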

Bottom line is that there is no guarantee. There will be patterns, but at any time the pattern could break.

There is no guarantee of future performance, but professional trading is really about statistics and probabilities; there is no other way to put the odds in your favor if you don’t have statistics at hand.

Thund3r, you are raising an incredibly important concept here, one that underpins the viability and justifiability of pretty much the entire usage of technical indicators in forming trading strategies across the whole industry. If indicators do not give an "edge" then they are totally useless.

I don’t think you were actually talking about any "guarantees"; I think you were more interested in relying on "probabilities" and "statistics"? These are so tricky and can be [I]so [/I]misleading!
Personally, I don’t think 60% is a very useful result; it indicates an overall expectancy of maybe 50/50 (see the quick expectancy sketch after the list below). But the question of whether [I]any [/I]kind of historical outcome is reliable for current decision-making depends on many factors. A few examples:

Duration:
E.g. was it for one week or one year data?

Selectivity:
Was it across entire trading days, or selective, omitting quiet times?

Extremities and criteria:
The result may be 60% in total, but what are the extremes in max loss and max gain? Do the TP and SL match your own trading criteria?
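As a quick illustration of why a bare 60% figure says little on its own: the expectancy depends entirely on the size of the stop and the trading costs, neither of which is given in the OP. The stop and cost numbers below are assumptions, not anything from the thread:

```python
def expectancy(win_rate, target_pips, stop_pips, cost_pips=2):
    """Average pips per trade for a fixed target/stop, net of assumed costs."""
    return win_rate * target_pips - (1 - win_rate) * stop_pips - cost_pips

# 60% hit rate, 20-pip target, assumed 20-pip stop and 2 pips of spread/slippage:
print(expectancy(0.60, 20, 20))   #  2.0 pips -- positive, but thin
print(expectancy(0.60, 20, 30))   # -2.0 pips -- the same "60%" now loses money
```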

I think it is also important to study in this way the [I]nature [/I]of the historical data used and compare it with what your future intentions are. For example, if a result of 60% has been achieved over the last 18 months can you expect it to produce 60% over the next 2 weeks?

Another important factor is whether the analysis has been carried out on a backtest/optimising program. In my opinion this kind of backtesting is only best-fitting the indicator to the specific time period concerned. The characteristics of markets do change over time and the "best-fit" scenario from the past will not necessarily continue in the future. This suggests regular backtesting and optimising may be worthwhile.

There are many, many indicators both new and old, some stay, many fade away. Probably the reason why the older established ones continue to survive is because they do indeed give an edge.

But I personally think that one should always remember that an "indicator" is exactly that - an indicator. One must not lose sight of the price action actually taking place and only stare at the indicator. Indicators can help you see the "woods" in spite of all the "trees", and which way the tide might be flowing, but if you only stare at the compass and don’t look where you are going, well, you might just fall over the cliff.

Good thread going here, I just wanna chime in. Most of this is talking about back-testing a technical indicator to generate a signal using past data and estimating whether that will work in the future, and for how long. That’s all well and good, with some really solid info posted above me. In this case we are developing a trading system which happens to use a statistical measurement, and it has all of the pitfalls mentioned above and more.

But I want to talk about something much more general: general descriptive statistics about any market. These can quantify the personality of a market, as well as be used to make determinations on how far the market has deviated from its baseline/normal state. (I am assuming everyone here knows that markets are non-normal distributions, and I don’t mean Gaussian normal.) An inherent advantage of this type of analysis is that it is not data-mined/curve-fitted, because we are extracting the data we are given and not transforming it or applying additional parameters to it, e.g. as with a technical indicator and its settings. It is exactly what the instrument’s time series has given us, no more, no less.

So let’s consider a situation where pure descriptive statistics can assist you, with no data mining, just logical reasoning.

Imagine we know that the average volatility of some instrument is 50, and that any up day is followed by a down day 50% of the time on average through the entire history. We notice that during extremely high-volatility days (+2 std devs) an up day is in fact followed by a down day only 48% of the time, and the reverse for low-volatility days (at -2 std devs).

This would still net out to a 50-50 average up-day/down-day ratio, which would basically be un-tradeable. However, by identifying a certain regime, in this case extreme volatility, we earn a natural 2% statistical edge. Now the argument can be made that the number of std devs is a parameter, and it is, as you could have chosen 1 or 3, etc. But it is still based on the underlying distribution, and you would like to see stability over the choice of std dev anyway. IMHO this is a much more naturally occurring phenomenon to attempt to exploit: since it revealed itself in the underlying time series, the probability of persistence is higher, rather than attempting to try dozens of indicators to reveal some signals.
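A rough sketch of how one might check this kind of conditional statistic on data of one's own. The column names, the ±2 std dev thresholds and the toy random data are all assumptions to show the mechanics; on this random data every figure will sit near 50%, and the point of doing it on real data is to look for a stable difference:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
# Toy daily data: a return and a volatility measure (e.g. daily range).
df = pd.DataFrame({
    "ret": rng.normal(0, 1, 3000),
    "vol": rng.normal(50, 10, 3000),   # average volatility ~50, as in the example
})

df["up_day"] = df["ret"] > 0
df["next_down"] = df["ret"].shift(-1) < 0

hi_vol = df["vol"] > df["vol"].mean() + 2 * df["vol"].std()
lo_vol = df["vol"] < df["vol"].mean() - 2 * df["vol"].std()

base = df.loc[df["up_day"], "next_down"].mean()
after_hi = df.loc[df["up_day"] & hi_vol, "next_down"].mean()
after_lo = df.loc[df["up_day"] & lo_vol, "next_down"].mean()

print(f"P(down day | up day):                  {base:.1%}")
print(f"P(down day | up day, vol > +2 stdev):  {after_hi:.1%}")
print(f"P(down day | up day, vol < -2 stdev):  {after_lo:.1%}")
```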

I know this example was long-winded and used a very simplistic toy example. But I want to put emphasis on the fact that most people are not quantifying the descriptive statistics and general tendencies of the markets as a whole. Instead they are immediately jumping into data-mining technical indicators for setups, where curve-fitting can become a much larger issue. Don’t miss the forest for the trees.

I think there is a lot of value in what you are saying here. I guess in a nutshell we could say that most backtesting of indicator performance is not a true statistical analysis at all?

I think, rather than accepting some quoted percentage accuracy of any given indicator or system and diving straight into the deep end, it is far better to run the indicator alongside one’s existing methods and see how it either complements or conflicts with it. Sometimes it may improve on and replace a component part of one’s method, sometimes it may work well but not in the TF or trading parameters that one prefers.

Either way, it is less expensive to run it in parallel and test it in your own back yard rather than blindly accepting a stated success rate - after all, nowadays one can’t even accept back-tested car emissions results as stated! :smiley:

Jack Schwager talking about risk, probability and data.

A possible answer for your question: see 12:30, 14:42 & 20:27.

I want to say yes and no so badly… so I will say yes and no. But seriously, back-testing an indicator is statistical analysis. It is telling you the hypothetical results of some experiment, usually the trading results based on this indicator. The statistics generated are measuring the interaction between the indicator, the market, and the hypothetical trades. Yes, this is the foundation of all back-tests.

The real question is: what are you measuring statistics of? The market itself, as in my example above? A potential setup or strategy, in which case back-testing makes sense? Or something else? It’s really about asking the right question. If you can’t ask the right question, how will you know the answer you get is meaningful?

I want to take some time to discuss technical analysis. 90% of it is based on price data: OHLC. If we take that as a basis, every technical indicator, trend line, fib, candlestick pattern, etc. is some derivative of price. If that’s the case, then any statistical study based on an indicator or technical-analysis tool is some way of filtering, transforming, lagging or modifying the data in order for it to be understood by a human being. We need this, as no one could fully comprehend tick-by-tick data in any market.

Now, if every TA method is a derivative, then they are all at least one degree of separation away from the "truth". Truth in this case is true price. As you add more indicators, oscillators, etc., you add additional degrees of separation from the market price. The same goes for modifying parameters: the more noise you remove, the more lag you add. The easiest way to understand this is to look at a simple moving average: a parameter of 1 is the price itself, 5 is a very fast SMA, 50 slower, 200 even slower. But as you get slower, you get smoother.

Let’s stack another indicator on that: another SMA, making an SMA crossover. Now take a look at the results. Before, with price simply above or below the SMA, moving the parameter from 1 to 200 made it slower but smoother. But when we add the crossover, chances are price has already been trading above the original moving average for quite some time, and then, after some lag, the second moving average will cross over. Yes, it removes even more noise and reduces whipsaw, but we have done that at the cost of increased lag / increased degrees of separation.
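A minimal sketch of that smoothing-versus-lag trade-off, on toy random-walk data (the SMA lengths 5 and 50 are just illustrative choices):

```python
import numpy as np
import pandas as pd

# Toy price series (random walk), just to illustrate the trade-off.
price = pd.Series(np.cumsum(np.random.default_rng(3).normal(0, 1, 1000)))

sma_fast = price.rolling(5).mean()     # close to price, noisy
sma_slow = price.rolling(50).mean()    # smooth, but well behind price

above_fast = price > sma_fast          # "price above the fast SMA" signal
crossover = sma_fast > sma_slow        # fast/slow crossover signal

# The crossover changes state far less often (less whipsaw)...
print("Fast-SMA signal flips:", above_fast.astype(int).diff().abs().sum())
print("Crossover signal flips:", crossover.astype(int).diff().abs().sum())
# ...but each flip arrives later, because both averages have to turn first.
```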

Now you’re thinking: Jesus, Meihua, why don’t we all just trade price action then? It should be the least-lagging method. Well, you would be wrong: candlesticks themselves are in fact indicators, and so are any bars. You still have to wait for the bar to complete to see if the pattern fits, even if it’s a 1-second bar. That is, of course, the dreaded lag. The only true price is tick data, which, as a human being, you are very unlikely to be able to make heads or tails of.

I am not saying indicators are worthless, and I am not saying that everyone should stare at tick charts. What I am saying is that you need to understand EXACTLY what you are measuring, and why the statistics from these measurements will be meaningful. Price action and indicators have their uses; EAs/trading algos have theirs as well. There is a proper amount by which to remove noise to unearth signal, but it’s a Goldilocks zone. Proper application of statistical measurement, asking the right questions, and an understanding of the toolset of different statistics are required.

These two posts should be stickied somewhere.

What’s the other 10%?

Well said. Everyone who trades does so with some kind of bias, be it trend-following, S/R, channel, breakout, etc. TA traders measure and test these theories with indicators (tools), and therefore one must make sure their tool is created with the intention of measuring said theory as accurately (and cleanly) as possible.

Excellent summary, thanks :slight_smile:
Even without the issue of interpreting back-tested statistics it is equally important to understand the indicator or system itself and exactly what it is doing. Nowadays, indicators can be quite complex and not even clearly explained concerning their internal functioning. Even well-known indicators like Stochastics and MACD are used without always clearly understanding what they are actually measuring.

But regardless of how well one examines the nature of the statistical analysis and the indicators concerned, that still leaves open the actual question in the OP: how much can one rely on these results in making forward projections and in designing and taking new positions on their basis?

One could say that on a purely instinctive basis a historical 60% success rate sounds very weak, but, on the other hand, a 90% success rate would sound very suspicious, i.e. perhaps based on manipulated or selective data.

I guess, in the final analysis, the truth of the matter is in your statement that: "There is a proper amount by which to remove noise to unearth signal, but it’s a Goldilocks zone."

I left the other 10% for volume. The reason it is such a low percentage is that you are always looking at the candle that created that volume, so it is not a 50-50 split: you are naturally connecting/correlating the volume to a price. So I gave it a lower percentage.

Also, good to hear from you after a long time, Liquid.