Statistics in Technical Analysis

All of my stuff came from using Excel and should be repeatable by anyone who cares to crunch a few numbers. In other words, no subjectivity (except for my interpretation of the results, which is as subjective as it gets). Variations will exist due to using different brokers or, in the case of pivots, using different calculation times. The general trends should, however, hold true if the same conditions are applied (5 pm EST for close of day and pivot calculations).

Futures data will also have variations from the spot markets but general trends should, hopefully, apply across related markets. I would love to see anything you have tested in the past.

Well, I thought I would contribute something to this thread. This is an update of my original thread, which got little traction, but maybe this is the right crowd. Some of the JPEGs from the hosting site I used are dead now, so I attached a txt file with all the stats.

http://forums.babypips.com/newbie-island/46361-datamining-quantifying-your-markets-personality.html

I do this type of very basic data mining on every instrument I trade. The data feed I am using is FXCM. The dates for each level of study are noted for each time frame, and all bars are built from 1-minute bars.

This study is based on the closing prices of bars. I am using percentages as opposed to pure price movements for several reasons. Usually at higher prices we have higher volatility, and changing to percentages helps remove that bias. Also, financial time series are in general non-stationary and non-normal, so determining any confidence level becomes more difficult. Changing to % returns (you can use log returns as well) gives a series that is closer to weakly stationary, which is easier and more reliable to work with.
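For anyone who wants to check the percentage idea outside Excel, here is a minimal Python sketch. The function names and the sample closes are made up for illustration; they are not from the attached stats file.

```python
import math

def pct_returns(closes):
    """Percent change between consecutive closes, in percent."""
    return [(b - a) / a * 100.0 for a, b in zip(closes, closes[1:])]

def log_returns(closes):
    """Log returns between consecutive closes, in percent."""
    return [math.log(b / a) * 100.0 for a, b in zip(closes, closes[1:])]

# Made-up closes at two price levels: very different price moves,
# but the same moves once expressed as percentages.
low = [1.0000, 1.0100, 0.9999]
high = [100.00, 101.00, 99.99]

print(pct_returns(low))
print(pct_returns(high))
print(log_returns(high))
```

The two `pct_returns` lines print essentially the same numbers, which is the point of the conversion: the price level drops out.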

Let's address what each field means.

Number of periods = how many bars of that time frame were analyzed
Number of Moves = how many runs of bars closed consecutively in the same direction
Average period in move = how many bars those runs lasted on average
Avg AMP = the average amplitude, in %, that the moves traveled
Chance to last X+ periods = the % chance that a move lasts X or more periods/candles
Chance > X % move = the % chance that a move larger than X percent occurs
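The fields above can be sketched in Python roughly as follows. This is only my reading of the definitions, not the original spreadsheet: `run_stats` is a hypothetical name, a "move" is taken to be a run of consecutive closes in one direction, and bars with an unchanged close are simply skipped (an assumption).

```python
def run_stats(closes, direction=None, min_lengths=(2, 3, 4)):
    """Summarise runs of consecutive same-direction closes.

    direction: 1 for up moves only, -1 for down moves only, None for both.
    Unchanged closes are skipped (assumption, not from the original study).
    """
    runs = []
    for prev, cur in zip(closes, closes[1:]):
        if cur == prev:
            continue
        d = 1 if cur > prev else -1
        if runs and runs[-1]["dir"] == d:
            # Same direction as the previous bar: extend the current run.
            runs[-1]["bars"] += 1
            runs[-1]["end"] = cur
        else:
            runs.append({"dir": d, "bars": 1, "start": prev, "end": cur})
    if direction is not None:
        runs = [r for r in runs if r["dir"] == direction]
    n = len(runs)
    if n == 0:
        return {"moves": 0}
    amps = [abs(r["end"] - r["start"]) / r["start"] * 100.0 for r in runs]
    return {
        "moves": n,
        "avg_periods": sum(r["bars"] for r in runs) / n,
        "avg_amp_pct": sum(amps) / n,
        "chance_to_last": {
            k: 100.0 * sum(r["bars"] >= k for r in runs) / n for k in min_lengths
        },
    }

# Made-up closes: up 2 bars, down 1, up 3, down 2.
closes = [100, 101, 102, 101, 102, 103, 104, 103, 102]
stats = run_stats(closes)
print(stats)
```

Passing `direction=1` or `direction=-1` gives the long and short sides separately, which is how you would check for the symmetry discussed in the post.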

My analysis of this data:
EUR/USD is symmetrical in both directions on all time frames. This is not the case with other instruments: stocks, for example, exhibit a long-side bias. The symmetry holds both for the number of moves and for the average amplitude of moves.

Also note that as the time frame decreases, the average amplitude (volatility) does not decrease at the same rate; it decreases significantly less. Going from the daily to the 4 hour (6 periods per day), a linear relationship would give 1/6th of the average daily amplitude (.155), but you actually get .37, more than double what we would have guessed. This holds for all time frames.
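One way to put the non-linear scaling in context is the square-root-of-time rule that holds for a random walk with independent bars. The sketch below compares the linear guess with the square-root guess. The daily figure of 0.93 is only implied by the post (6 x .155), and treating average move amplitude like a standard deviation is my own simplification, so take this as a rough check rather than the original calculation.

```python
import math

def scaled_amplitude(amp, bars_per_period, rule="sqrt"):
    """Scale an amplitude down to a shorter bar.

    rule="linear": divide by the number of short bars per long bar.
    rule="sqrt":   divide by its square root (random-walk assumption).
    """
    if rule == "linear":
        return amp / bars_per_period
    return amp / math.sqrt(bars_per_period)

daily_amp = 0.155 * 6      # implied by the post, not quoted directly
observed_4h = 0.37         # figure quoted in the post

print("linear guess:", round(scaled_amplitude(daily_amp, 6, "linear"), 3))
print("sqrt guess:  ", round(scaled_amplitude(daily_amp, 6, "sqrt"), 3))
print("observed:    ", observed_4h)
```

Under these assumptions the square-root guess lands much closer to the observed 4 hour figure than the linear one does.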

Let's look at the characteristics of the % chance of an X+ period move; think of this as a trending characteristic. The % chances on the daily, 4h and 1h are all very similar, with only 1-2 percentage points of difference. Now let's compare the daily and the 5 minute. Most people would say that the higher the time frame, the less noise and the "easier to trade". So this comparison is a natural one, as the 5 minute is very "noisy" according to numerous forum posts.

The 5 minute % chance of an X+ period move is larger than the daily in every category. Actually, it is larger than on every other time frame. But the significance comes from the chance of a 4+ period move: it is about 6 percentage points (5.81 on average) higher than on the daily time frame and 3.79 points higher on average than on the hourly. What does this tell us? That lower time frames have "momentum". They tend to exhibit more trending behavior than they are given credit for. However, the amplitude is very small, so market friction (slippage, commissions, execution latency, etc.) becomes a larger factor.

This suggests that being profitable at the 5 minute level and lower is possible and that there is an inherent edge there. The edge is best shown by the 2+ period stat: taking a random entry (50/50 long/short), the % chance to last 2 periods is 51.96 and 52.04 respectively, basically a net 2% edge in either direction. Is that edge large enough for a retail trader to profit after those market frictions? Probably not. But you can see where the HFTs and large institutions have a fertile trading ground.
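To judge whether something like 51.96% is really different from a coin flip, it helps to have the coin-flip baseline and a rough significance check. The sketch below is my own addition, not part of the original study: it assumes each bar's direction is an independent fair coin for the baseline, uses a plain normal approximation for the z-score, and the number of moves (`n_moves`) is a hypothetical value, since the post does not quote it here.

```python
import math

def coin_flip_survival(k):
    """P(a move lasts k or more bars) if every bar is an independent fair coin."""
    return 0.5 ** (k - 1)

def z_score(p_hat, n, p0=0.5):
    """Normal-approximation z-score of an observed proportion against p0."""
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

n_moves = 10_000  # hypothetical sample size, for illustration only

for k in (2, 3, 4):
    print(k, "+ bars baseline:", coin_flip_survival(k))

print("z for 51.96% on", n_moves, "moves:", round(z_score(0.5196, n_moves), 2))
```

With the real number of moves plugged in, a z-score well above 2 would suggest the edge is not just sampling noise, although moves on neighbouring bars are probably not fully independent.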

Great. Thanks to MeiHua, any future posts I make on this thread will have to be worded so it sounds like I know what I am doing (which is doubtful). Interesting stuff Mei. Seeing market truisms refuted and dashed upon the rocks like that is kind of fun to watch.

I hate to see this thread die, especially after my post. I truly believe in quantitative research and evidence-based TA. I don't know if this community just isn't ready for this type of material or plainly isn't interested. I was planning on doing more posts as time went on and discussing others' findings, but nothing more has been shared. I could post some of my work here, but I feel it would hijack the thread if I were the only one sharing. So if there are people still interested in that kind of work, I may split off into my own thread just for that, or I hope to see more people share here. Like I said in my original post, my threads on this type of topic always die. If anyone can give me insight into why, that would be great. Is there truly no interest in scientifically repeatable and statistically significant results???

I am interested, but the problem is that to post on this thread, I would expect myself to have something insightful to add that includes research I'd done, which I can only do at a minimum. It is a really good thread idea, but I wonder how many readers want to do that kind of research, and for the ones that do, I imagine some don't want to share what they find. I'm interested in what you have to say MeiHua, so wherever you wind up sharing, I'll subscribe. I'd like to see this thread continue, so I'd hope for it to be here, but obviously it's not my thread.

Hehe, yeah, mee tooo :slight_smile:

Since the last time, I have tried to make my own seasonal analysis. Check this out (almost guaranteed not accurate though :/):

The green line is the data for 2013, the big purple one is the total mean, the blue one is the mean for the last 5 years, and the red one is the mean for the first 5 years of the study...

Haven't tested many TA ideas yet, still trying to grasp R :wink:

It would be nice to have all the stat related posts located here in one central location. It would beat jumping from thread to thread IMO. I will post a few things that are candle pattern related in the near future. There may be very few interested but that will be their problem.

Why do these types of threads die? For a number of reasons. I happen to like knowing that my chances when doing X are greater than my chances when doing Y or that doing Z is a statistics-declared disaster so I should avoid it. Some don't care. Also, many if not most methods in trading are quite subjective. It is hard to quantify subjectivity.

Anyway, I am all for giving a bit of CPR to the thread.

No, I am very interested in seeing the research you and everyone else has done. I made the thread specifically for people from all possible trading approaches to share their understanding of the markets with the evidence that supports their opinions. Post away; I enjoyed reading what you shared and I'm sure others did too. Contributing is not hijacking this thread.

This type of thread will never be popular, there is no leadership or structure in place to follow, no plan to make the readers rich, just a place to share what you find and the evidence that supports your claims and see if you get useful/useless feedback.

While I started the thread, I have little to contribute as the field is not something I am experienced in, but I find that sharing ideas about real evidence is a great way to learn and/or demonstrate ideas.

What exactly are you calculating? It's not really clear. How are you isolating the seasonal cyclical factor and then parsing the data to calculate it? Why the last 5 years, the first 5 years and the total mean? Are we looking for drift? The euro is not like corn, crude or other commodities, which are locked into either natural or fundamental cycles, so I think it would be hard to tie a drift to fundamentals. Though I am not certain your data speaks to that. Also, how many years and what data feed were used in your study?

Hi, sorry for the unclear stats, I was just trying to produce my own seasonal tendencies... The data is for 2001 - 2011, and the close price is transformed with rate of change...

Well, I was actually going to post my own seasonality study, although it is calculated completely differently than the one above. It's still quite redundant, so instead I will show my seasonal volatility study that goes along with the price trend. This uses EUR/USD data from 1999-2012. I am calculating the % change of each month for the volatility study, and also introducing the max and min volatility values to understand the bounds as well as the average itself. Just because Richard brought out his R skills, I thought I would use it too. I am such a sheep. It is the most powerful stats package around, although most of this kind of work can be completed in Excel, as it is not very demanding. But playing with R is fun :stuck_out_tongue:
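The min / average / max idea can be sketched in a few lines of Python (the original was done in R, which is not shown in the post). This version assumes the monthly % changes are already computed and uses their absolute value as the volatility measure, which is my assumption; `seasonal_volatility` and the sample numbers are made up.

```python
from collections import defaultdict

def seasonal_volatility(monthly_changes):
    """Group absolute monthly % changes by calendar month.

    monthly_changes: iterable of (year, month, pct_change).
    Returns {month: (min, mean, max)} of the absolute % change.
    """
    by_month = defaultdict(list)
    for _year, month, change in monthly_changes:
        by_month[month].append(abs(change))
    return {
        m: (min(v), sum(v) / len(v), max(v))
        for m, v in sorted(by_month.items())
    }

# Made-up monthly % changes for two calendar months over two years.
changes = [
    (2011, 5, -3.0), (2012, 5, -6.0),
    (2011, 6, 1.0), (2012, 6, -2.0),
]

for month, (lo, avg, hi) in seasonal_volatility(changes).items():
    print(month, lo, avg, hi)
```

Plotting the three values per month gives the min, average and max lines discussed in the takeaways.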



What did I take away from this:
Well, the range of volatility from min to max is huge; I mean, sometimes we explode. Most of this huge upper bound comes from the last 2 years, due to the euro crisis etc. The min values are fairly stable. But I think most of the information can be garnered from the average volatility line (black). Most people claim that when the "summer doldrums" are here it's time to take a break and go on vacation. Well, there is some merit to that, as June seems to be the least volatile month of the year in both the average and max categories. July and August, though, are about average, on par with April and February. May, rather than any winter month, is on average the most volatile, although January, September and December are still up there. So to take a cue from Mythbusters, I would say the summer doldrums are plausible. Why? Because you do have the lowest volatility of any single month of the year, but the surrounding months are not nearly low enough to be completely out of line with the rest of the year. Taking summer as June, July and August, compare that to the 3-month period of February, March and April. The numbers are remarkably similar, but nobody talks about the spring doldrums. So can summer be a low enough volatility time for EUR/USD to stop trading for an entire season? Possibly. But on average it's comparable to several other months of the year that people consider tradable.

Any theories as to why the max volatility for Dec is off the chart like that? I have always assumed there was an end of year winding down period but it looks like max, min and average all steadily climb from Nov on.

Interesting.

It is cool to see interest in this topic. I spend most of my time trying to statistically analyze my systems. I would like to post some of my findings here. Below is a glimpse of what I do.

Stops and Profit Target:

Every system has the chance that the trade will move against the position, and price usually does for at least some amount of time. In trying to determine how far away I should set my stop I thought, "I wish I could measure the average number of pips the market moves against me." So I dug out the old college statistics book, and after a year of playing with statistics this is what I've come up with. Basically, I model my mechanical system in Excel and collect the number of pips, max and min, that the trade moves in both directions over the life of the trade.


I do this over and over again, hundreds if not thousands of times, which looks like what you see below:


(If you look at the bottom left corner you can see that I have over 60,000 data points)

I then mine this data using EasyFit 5.5 Professional (Student Version). This program uses advanced statistics to fit a distribution to the data. The distributions are anything but normal!


Here you can see that the stop analysis fits the "Kumaraswamy" distribution...


Above you can see that there is an 83.34% [P(X>x1)=.83342] chance that my stop will not be hit; stated another way, there is a 16.66% probability that I will be stopped out if I place my stop 30 pips away from my entry. I use the same method for limit placement and for a combined analysis of stop and limit data, historically and randomly.
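If you don't have a distribution-fitting package, a cruder version of the same question can be answered straight from the data. The sketch below is not the EasyFit / Kumaraswamy fit described above; it swaps in a plain empirical frequency: the share of past trades whose worst adverse move reached a given stop distance. The function name and the sample excursions are made up.

```python
def stop_hit_probability(adverse_pips, stop_pips):
    """Share of trades whose maximum adverse move reached the stop distance.

    adverse_pips: max pips each historical trade moved against the entry.
    """
    hits = sum(1 for m in adverse_pips if m >= stop_pips)
    return hits / len(adverse_pips)

# Made-up maximum adverse moves (in pips) for ten trades.
adverse = [5, 12, 18, 25, 31, 40, 8, 15, 22, 60]

for stop in (20, 30, 40):
    p = stop_hit_probability(adverse, stop)
    print(f"stop {stop} pips: hit {p:.0%}, survive {1 - p:.0%}")
```

With tens of thousands of data points the empirical frequency and a fitted distribution should tell a similar story in the middle of the range; the fit mostly matters out in the tails where the data are thin.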

You should take a look at John Sweeney's work on MAE and MFE; it's basically what you're measuring. Also Thomas Stridsman's (could be someone else but I think it's him) life cycle of a trade, which uses Sweeney's MAE and MFE.

Just from looking at the numbers, 2012 was extremely volatile; the fiscal cliff in tandem with the euro crisis increased volatility in the winter. I think that explains the max vol. Basically anything on this time scale, monthly seasonality, is going to be fundamentally driven. If we look at the averages, because we are using 13 years of data, the winter is still highly volatile but not as spiky as 2012. But yes, that would confirm that Dec/Jan is a great time to trade, and I have always traded it, minus Xmas to New Year's.

Thanks MeiHua! I've never heard of John Sweeney or his work. I think I might have to get his book. I searched for him and his work looks interesting. Do you have any experience with MAE and MFE?

As I've been looking at this data and trying to place my stop and profit target, I have been trying to get as close to a 1:2 risk:reward ratio as possible. Last week, after a year of playing with this, I was reading about Hoosain Harneker and his goal of 10 pips a day. He stated that he was about 90% accurate and that his risk was double the reward. He would place his stop 20 pips away and take profit at 10 pips.

Using a mechanical system that I have tested, I found that if I set my stop at 60 pips there was only a 10% probability that I would get stopped out before the trade ended (i.e. it hit its profit target or I got a signal for a trade in the opposite direction). On the other side of the trade, I found that there is a 54% chance that the trade will go 90 pips.

Here is my stop analysis:


And here is my target analysis:


So I currently have a 50% accuracy after about 2 months of trading, which isn't that large of a sample.

As I read about Harneker's method I began to wonder what it would do to my account if I were to set my stop at 60 pips and take profit at 30 pips (i.e. 2:1 risk:reward with a theoretical accuracy of 80%). So I built my own random number generator and Monte Carlo system in Excel. I found that if I used my standard 2% risk the gains were consistent, but the slope of the gains was significantly less than with the 1:2 r:r method I've been working on, shown above. However, if I risk 5% to 10% of my capital, with an 80% accuracy, the profit curve becomes parabolic. I'M NOT ABOUT TO TRY THIS!!! But it is interesting.
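Until the spreadsheet is available, here is a rough Python sketch of the same kind of Monte Carlo comparison. It is not the Excel model from the post: it assumes independent trades, fixed-fractional risk, no costs, and the win rates quoted above (50% at 1:2 and a theoretical 80% at 2:1). All function names and defaults are my own.

```python
import random

def expectancy(win_rate, reward_to_risk):
    """Expected profit per trade in units of risk (R)."""
    return win_rate * reward_to_risk - (1 - win_rate)

def simulate_equity(win_rate, reward_to_risk, risk_fraction, n_trades, rng):
    """One equity path, risking a fixed fraction of current equity per trade."""
    equity = 1.0
    for _ in range(n_trades):
        stake = equity * risk_fraction
        if rng.random() < win_rate:
            equity += stake * reward_to_risk
        else:
            equity -= stake
    return equity

def median_outcome(win_rate, reward_to_risk, risk_fraction,
                   n_trades=200, n_paths=500, seed=1):
    """Median final equity over many simulated paths (start equity = 1.0)."""
    rng = random.Random(seed)
    finals = sorted(
        simulate_equity(win_rate, reward_to_risk, risk_fraction, n_trades, rng)
        for _ in range(n_paths)
    )
    return finals[len(finals) // 2]

print("expectancy 1:2 @ 50%:", expectancy(0.5, 2.0), "R")
print("expectancy 2:1 @ 80%:", round(expectancy(0.8, 0.5), 2), "R")
print("median, 1:2 @ 50%, 2% risk: ", round(median_outcome(0.5, 2.0, 0.02), 2))
print("median, 2:1 @ 80%, 2% risk: ", round(median_outcome(0.8, 0.5, 0.02), 2))
print("median, 2:1 @ 80%, 10% risk:", round(median_outcome(0.8, 0.5, 0.10), 2))
```

The expectancy lines show why the 2:1 setup has the flatter slope at the same risk: 0.2R per trade against 0.5R. The last line shows how compounding at a larger risk fraction bends the curve upward, which also means a much deeper hole if the real win rate turns out lower than 80%.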

I'll try to upload my spreadsheet for anyone who wants to play around with it. Also, I'll try to make some graphs today on risk of ruin.

Wow, interesting analysis!

MeiHua, could you show your seasonal study and explain it? I have tried to figure out how to do it on my own, and my results show it :stuck_out_tongue: I would love to dissect your way of doing it and learn a little :slight_smile:

For the record, I have no statistical background, so don't trust me at all!

Yes, I am familiar with his work and use his concepts of MFE and MAE very frequently in my system development. I think that for stop and profit target placement it's probably the best tool around, other than straight-up optimization. Although it's kind of an optimization in itself, because you are deciding which trades to cut off or when to ring the cash register. But as with anything in system development, it's all curve fitting. It's a question of how much, and at what point it is still robust.

I don't really want to get into very specific details, to be honest, because it's going to hijack this thread. I feel this thread is about doing statistical or quantitative analysis on pairs that can be shared, not system development. If you want to start your own thread about your system, you can share there.

Alright guys, now you can take a look at how I personally analyze seasonality. To be perfectly honest, this works much better for commodities that have an intrinsic cycle governed by nature, say corn, soybeans or wheat, and for things that have a fundamental yearly business cycle, for example energy products. Currencies are different, especially EUR/USD, which is a gathering of many different countries, all with different resources, business cycles and policies. For me it's hard to conceive that a cycle there would be stable, or at least grounded in something so fundamental as to be in the bedrock of the instrument.

However, this is not a forum on any of the products that best fit this type of study, so I had to go with the best analogue: USD/CAD. Why? Because it's a comdoll with some correlation, I note SOME, with crude oil, which has cyclical properties. So as an example I thought it would be best to use this, as the cycle will probably be much clearer than if I used EUR/USD.

Let's pause for a second. I am the first to say there is a lot going on in this picture. It's probably overwhelming for those of you who don't do this kind of study or have never seen it done this way before.

Let's get into my procedure. I used data from 1993 to 2013 on the loonie. Using daily prices, I calculated the average price and the median price of each month, giving each month about 30 points of granularity, or 120 if you count open, high, low and close. I don't believe the closing price of the month as a single point is particularly representative, because a lot of price shocks can occur and resolve in that time. Then I analyzed directional and volatility patterns; this is shown in the first chart on the upper left. The bar above or below the zero line shows direction, and the size of the bar is the % change, indicating volatility. It is basically an expanded version of the study I showed earlier, with the same work behind it. I then created an average cumulative seasonal change, basically isolating those average movements to give us an idea of how they may build on themselves throughout the year.

It's important to note the percentage of increasing months while looking at the seasonal directional patterns. This is because a seasonal directional pattern may be caused by a single large price shock. If the average change and the percentage of up months line up, we can take it as another step towards confirming the bias, as opposed to relying on a straight average, which can be skewed by a large outlier.
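A small Python sketch of that cross-check, with made-up numbers: for each calendar month it reports the average % change next to the share of years in which the month closed up. `directional_summary` is a hypothetical helper, not the code behind the charts.

```python
from collections import defaultdict

def directional_summary(monthly_changes):
    """monthly_changes: iterable of (year, month, pct_change).

    Returns {month: (mean_change, pct_up)} where pct_up is the percentage
    of years in which that calendar month had a positive change.
    """
    by_month = defaultdict(list)
    for _year, month, change in monthly_changes:
        by_month[month].append(change)
    return {
        m: (sum(v) / len(v), 100.0 * sum(c > 0 for c in v) / len(v))
        for m, v in sorted(by_month.items())
    }

# Made-up data: month 4 is down in three years out of four, but one large
# shock drags its average up; month 5 is up in three years out of four.
changes = [
    (2010, 4, -0.5), (2011, 4, -0.4), (2012, 4, 6.0), (2013, 4, -0.3),
    (2010, 5, 0.5), (2011, 5, 0.7), (2012, 5, -0.2), (2013, 5, 0.4),
]

for month, (mean_change, pct_up) in directional_summary(changes).items():
    print(month, round(mean_change, 2), pct_up)
```

In this toy data month 4 has the larger average gain but is up only 25% of the time, which is exactly the kind of mismatch that points to an outlier rather than a seasonal bias.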

The bottom left chart shows the monthly medians. This chart gives us roughly where each month trades on average, although I don't put much weight on it as a bias or as a guide to where we should be trading. I am looking for things that are very much out of line. Take the month of April: it's largely out of whack, higher than most months. But April is an up month <50% of the time, and the seasonal directional pattern shows this. So I am led to believe there was a price shock somewhere in the past in April, which can be studied further.

IMHO the most important information comes from the middle right and bottom right charts. I created a linear model and then extracted the residuals. I then plotted the residuals over time to give a chart of the detrended price. This is highly important because I need to isolate the cyclical seasonal factor; I cannot allow a sustained trend in the instrument to overwhelm it. So by using linear modeling I take the overall trend of the instrument and remove it. As you can see, these points oscillate around the zero line. I then parse the residuals back into their respective dates and create an average of all the residuals from each month. This gives me the best representation of the cyclical oscillations of the instrument. It moves around the zero line, and in general follows the outline of the seasonal directional pattern.
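The detrending step can be sketched in plain Python as below. The post does not say exactly what the linear model regresses on, so this assumes the simplest case: monthly price against a time index, fitted by ordinary least squares. The data are synthetic (a steady trend plus a bump every April) and all names are hypothetical.

```python
from collections import defaultdict

def linear_fit(ys):
    """Least-squares line through (0, ys[0]), (1, ys[1]), ...; returns (slope, intercept)."""
    n = len(ys)
    x_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

def seasonal_residuals(monthly_prices):
    """monthly_prices: list of (year, month, price) in chronological order.

    Removes the fitted linear trend, then averages the residuals per
    calendar month. Returns {month: average_residual}.
    """
    prices = [p for _, _, p in monthly_prices]
    slope, intercept = linear_fit(prices)
    by_month = defaultdict(list)
    for i, (_year, month, price) in enumerate(monthly_prices):
        by_month[month].append(price - (intercept + slope * i))
    return {m: sum(r) / len(r) for m, r in sorted(by_month.items())}

# Synthetic data: 3 years of a steady uptrend with a bump every April.
data = []
for i in range(36):
    year, month = 2010 + i // 12, i % 12 + 1
    bump = 0.05 if month == 4 else 0.0
    data.append((year, month, 1.0 + 0.01 * i + bump))

seasonal = seasonal_residuals(data)
for month, resid in seasonal.items():
    print(month, round(resid, 4))
```

Even though every month is higher than the one before in raw price terms, the detrended averages hover around zero and only April stands out, which is the effect the residual charts are meant to show.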


So I decided to post this study, again something that I use in my other trading, mostly futures and equities, where we have a lot more diversification. I thought I would try it on 11 forex pairs and see what we have. So let's start with the nuts and bolts: what is Fractal Efficiency? For those more mathematically inclined, it is given by this formula

where n is the period, and P is price.

How did I measure this? I used the last 365 days of the pairs eur/usd, gbp/usd, aud/usd, eur/gbp, hkd/jpy, nzd/usd, usd/jpy, usd/chf, nzd/cad, usd/cad, chf/nok. I then took a moving window of 20 1-hour bars and calculated FE using the formula above, then I took the average of all calculations to come up with the FE of the past 365 days.

So what does FE measure? It measures what I consider to be signal to noise: how much up and down motion we need to get from point A to point B compared with a straight line between them. A perfect trend would be a straight line from bottom left to upper right, or from upper left to bottom right, from point A to point B. Noise creates spikes in this perfect trend that make the travel less efficient; we see these as the bounces up and down that eventually get from point A to B but have a lot of "wasted movement" in between. FE measures that difference, with 1 being a perfect trend and 0 being all noise.
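The formula image did not survive in this thread, so the sketch below is an assumption based on the description: the net move over the window divided by the sum of the absolute bar-to-bar moves, which is the usual efficiency ratio and gives 1 for a straight line and 0 for no net progress. Treating "20 bars" as 20 bar-to-bar moves is also my choice, and the function names are made up.

```python
def efficiency_ratio(prices):
    """Net move divided by total path length over the given prices."""
    net = abs(prices[-1] - prices[0])
    path = sum(abs(b - a) for a, b in zip(prices, prices[1:]))
    return net / path if path else 0.0

def average_efficiency(closes, window=20):
    """Mean efficiency ratio over every rolling window of `window` moves."""
    if len(closes) <= window:
        raise ValueError("need more closes than the window length")
    ratios = [
        efficiency_ratio(closes[i:i + window + 1])
        for i in range(len(closes) - window)
    ]
    return sum(ratios) / len(ratios)

print(efficiency_ratio([1, 2, 3, 4]))      # straight line
print(efficiency_ratio([1, 2, 1, 2, 1]))   # all back and forth
print(efficiency_ratio([1, 3, 2, 4]))      # trend with some wasted movement
```

Feeding `average_efficiency` a year of 1-hour closes for each pair would give one number per pair to compare, in the spirit of the study described above.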

Unfortunately, because all of these are currencies, we don't see any stark difference. If we were comparing different commodities, futures, equities of different sectors and activity, and/or fixed income, then you would see a HUGE difference. FX is very uniform and basically 1 "sector", so to speak.

What we can see here is that there is basically no difference between trading one currency pair and another, unless it is very exotic, like CHF/NOK. Even HKD/JPY stands in the middle of the range. So patterns that arise on the majors should show up on the other pairs too. The majors actually have the LEAST amount of noise, which is quite an insight. I would never have imagined that; usually the more liquid and heavily traded an instrument is, the more noise it has.

So we have the adage that if you can trade fiber and cable then you can carry that over to other pairs. Well, the efficiency ratio, or signal to noise, of basically all majors and even most exotics is very similar. This means that if you can extract enough signal out of fiber to trade it profitably, you should be able to extract enough signal out of most other pairs to trade them profitably as well. I mean, even if you are trading CHF/NOK it's only 10% less signal than, say, fiber.

This can also be used to gauge the effect of HFT. HFT generates a lot of noise, and major index futures have less than half the FE value of these FX pairs. So their signal to noise ratio is very low, caused by the constant pushing of price in small areas. I would therefore presume that the amount of HFT in the spot FX market is significantly less than what exists in the equity index markets and possibly others.