Statistics in Technical Analysis

This can't be correct, with a moment's logic.

If the probability of throwing a six is 1 out of 6, which it is (forget converting to a percentage), it is 1/6.

If you throw the die 6 times, and ADD:

1/6 +1/6+1/6…

you get 6/6

=1

= certainty!
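The sum overshoots because the six throws are not mutually exclusive (you can roll more than one six), so their probabilities can't simply be added. A minimal sketch of the usual fix, working through the complement, assuming a fair die and independent throws; the function name is just for illustration:

```python
from fractions import Fraction

def prob_at_least_one(p, n):
    """Chance of at least one success in n independent tries, each with chance p."""
    # "At least one" is the complement of "none at all".
    return 1 - (1 - p) ** n

p_six = Fraction(1, 6)
exact = prob_at_least_one(p_six, 6)   # 1 - (5/6)**6
print(exact, round(float(exact), 4))
```

So six throws give roughly a 66.5% chance of seeing at least one six, not certainty.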

This is a good start, guys. I do this analysis for every tool in my toolbox. I had shared information like this previously, but it never really stuck, though that was mostly in the chat room. Most people just glazed over it. I have one problem with some of these posts that use manual backtesting: it's not a repeatable process for everyone, because it's inherently subjective. So I cannot verify your work, even given the same data set. When I do my research it's 100% programmed, so that it is a repeatable experiment for anyone; this is what is required of scientific findings. Unfortunately most of my research is on futures contracts, but if you guys are still interested I will share it. I hate to criticize and then contribute nothing. But if this thread is going to take off, I think we should have a set of rules so all findings meet some minimum criteria in order to be seen as valid.

Yes, and there was a post with pivots from a 2 hr chart. That's great for software that allows 2 hr charts, but MetaTrader jumps from 1 hr to 4 hr, which makes the application limited.

However, in the spirit of contribution, here's a book everyone might want to check out:

Microsoft Excel for Stock and Options Traders, by Jeff Augen.

All of my stuff came from using Excel and should be repeatable by anyone who cares to crunch a few numbers. In other words, no subjectivity (except for my interpretation of the results, which is as subjective as it gets). Variations will exist due to using different brokers or, in the case of pivots, using different calc times. The general trends should, however, hold true if the same conditions are applied (5 pm EST for close of day and pivot calculations).

Futures data will also have variations from the spot markets but general trends should, hopefully, apply across related markets. I would love to see anything you have tested in the past.

Well, I thought I would contribute something to this thread. This is an update from my original thread, which caught little traction, but maybe this is the right crowd. Some of the JPEGs are dead now on the hosting site I used, so I attached a txt file with all the stats.

http://forums.babypips.com/newbie-island/46361-datamining-quantifying-your-markets-personality.html

I do this type of very basic data mining on every instrument I trade. The data feed I am using is FXCM. The dates for each level of study are noted for each time frame; all bars are built from 1 minute bars.

This study is based on the closing prices of bars. I am using percentages as opposed to pure price movements for several reasons. Usually at higher prices we have higher volatility; changing to percentages helps remove that bias. Also, financial time series are in general non-stationary and non-normal, so determining any confidence level becomes more difficult. Changing to % returns (you can use log returns as well) gives a series that is much closer to weakly stationary, which is easier and more reliable to work with.
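For anyone who wants to reproduce the transformation, here is a minimal sketch of both return types, computed from a list of bar closes. The function names and the sample closes are made up for illustration; they are not from the FXCM data.

```python
import math

def pct_returns(closes):
    """Simple % change from each close to the next."""
    return [(cur - prev) / prev * 100.0 for prev, cur in zip(closes, closes[1:])]

def log_returns(closes):
    """Log return, in %, from each close to the next."""
    return [math.log(cur / prev) * 100.0 for prev, cur in zip(closes, closes[1:])]

# Made-up EUR/USD closes, only to show the shape of the output.
closes = [1.3000, 1.3065, 1.3026, 1.3091]
print([round(r, 3) for r in pct_returns(closes)])
print([round(r, 3) for r in log_returns(closes)])
```

For small moves the two are nearly identical; log returns have the convenience that they add up across consecutive bars.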

Let's address what each field means.

Number of periods = how many bars of that time frame were analyzed
Number of moves = how many moves occurred, where a move is a run of bars that closed consecutively in the same direction
Average period in move = how many bars the moves lasted on average
Avg AMP = the average amplitude, in %, that the moves traveled
Chance to last X+ periods = the % chance that a move lasts X or more periods/candles
Chance > X % move = the % chance that a move larger than X percent occurs
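The fields above can be sketched in a few lines. This is my reading of the definitions, not the original program: a "move" is taken to be a run of bars closing in the same direction, amplitude is measured from the close before the run to the last close in it, and an unchanged close is assumed to end the current run. All names and the sample closes are hypothetical.

```python
def find_moves(closes):
    """Return a list of (direction, n_bars, amplitude_pct), one entry per move."""
    moves = []
    run_dir, run_len, run_start = 0, 0, None
    for prev, cur in zip(closes, closes[1:]):
        step = (cur > prev) - (cur < prev)      # +1 up close, -1 down close, 0 flat
        if step != run_dir:
            if run_dir != 0:                    # a run just ended at `prev`
                moves.append((run_dir, run_len, abs(prev - run_start) / run_start * 100.0))
            run_dir, run_len, run_start = step, 0, prev
        if step != 0:
            run_len += 1
    if run_dir != 0:                            # close out the final run
        moves.append((run_dir, run_len, abs(closes[-1] - run_start) / run_start * 100.0))
    return moves

def chance_to_last(moves, x):
    """% of moves that lasted x or more bars."""
    return 100.0 * sum(1 for _, n, _ in moves if n >= x) / len(moves)

def chance_bigger_than(moves, pct):
    """% of moves whose amplitude exceeded pct percent."""
    return 100.0 * sum(1 for _, _, amp in moves if amp > pct) / len(moves)

closes = [100, 101, 102, 101, 100, 99, 100]     # made-up closes
moves = find_moves(closes)
print("number of moves:", len(moves))
print("average period in move:", sum(n for _, n, _ in moves) / len(moves))
print("avg amp %:", sum(a for _, _, a in moves) / len(moves))
print("chance to last 2+ periods:", chance_to_last(moves, 2))
```

Filtering `moves` by direction before computing the stats gives the separate long and short columns.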

My analysis of this data:
Euro/Usd is symmetrical in both directions on all time frames. This is not the case with other instruments; stocks, for example, exhibit a long-side bias. The symmetry holds both for the number of moves and for the average amplitude of moves.

Also note that as the time frame decreases, the average amplitude (volatility) does not decrease at the same rate; it decreases significantly less. Going from the daily to the 4 hour (6 periods per day), you would expect 1/6th of the average amplitude of the daily move (.155). You actually get .37, more than double what we would have guessed from a linear relationship. This is true again for all time frames.
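One hedged reading of this: if bar-to-bar changes behaved like a random walk, volatility would scale with the square root of time rather than linearly. A rough check against the numbers above. Note that the daily figure here is back-calculated from .155 x 6, not read from the stats file, and that the average amplitude of a move is not exactly the same thing as per-bar volatility, so treat it as a sanity check only.

```python
import math

periods = 6                    # 4-hour bars per day
daily_amp = 0.155 * periods    # implied daily avg amplitude in % (back-calculated assumption)
observed_4h = 0.37             # 4-hour avg amplitude quoted in the post

linear_guess = daily_amp / periods            # what linear scaling predicts
sqrt_guess = daily_amp / math.sqrt(periods)   # what square-root-of-time scaling predicts

print(f"linear: {linear_guess:.3f}  sqrt: {sqrt_guess:.3f}  observed: {observed_4h:.3f}")
```

The square-root guess comes out at roughly .38, much closer to the observed .37 than the linear guess of .155.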

Let's look at the characteristics of the % chance of an X+ period move; think of this as a trending characteristic. The % chances for the daily, 4h and 1h are all very similar, only 1-2 percentage points apart. Now let's compare daily and 5 minute. Most people would say that the higher the time frame, the less noise and the “easier to trade”. So this comparison is a natural one, as the 5 minute is very “noisy”, as seen in numerous forum posts. The 5 minute % chance of an X+ move is larger than the daily % chance in every category. Actually, it's larger than the chance on every other time frame. But the significance comes from the chance of a 4+ period move: about 6% (5.81% on average) larger than the daily time frame, and 3.79% higher on average than the hourly time frame.

What does this tell us? That lower time frames have “momentum”. They tend to exhibit more trending behavior than they are given credit for; however, the amplitude is very small, so market friction (slippage, commissions, execution latency etc.) becomes a larger factor. But this just shows that being profitable at the 5 minute and lower level is possible and that there is an inherent edge there. This edge is best shown by the 2+ period chance stat: taking a random entry (50/50 long/short), the % chance to last 2 periods is 51.96 and 52.04 respectively. Basically a net 2% edge in either direction. Now, is the bar large enough for a retail trader to profit after those market frictions? Probably not. But you can see where the HFTs and large institutions have a fertile trading ground.

Great. Thanks to MeiHua, any future posts I make on this thread will have to be worded so it sounds like I know what I am doing (which is doubtful). Interesting stuff Mei. Seeing market truisms refuted and dashed upon the rocks like that is kind of fun to watch.

I hate to see this thread die. Especially after my post. I truly believe in quantitative research and evidence-based TA. I don’t know if this community just isn’t ready for this type of material or plainly isn’t interested. I was planning on doing more posts as time went on and discussing others’ findings. But there hasn’t been any more shared. I could post some of my work here, but I feel it would hijack the thread if I was the only one sharing. So if there are people still interested in that kind of work, I may split off into my own thread just for that, or I hope to see more people share here. Like I said in my original post about this, my threads on this type of topic always die. If anyone can give me insight into why, that would be great. Is there truly no interest in scientifically repeatable and statistically significant results???

I am interested, but the problem is that to post on this thread, I would expect myself to have something insightful to add that includes research I’d done, which I can only do at a minimum. It is a really good thread idea, but I wonder how many readers want to do that kind of research, and for the ones that do, I imagine some don’t want to share what they find. I’m interested in what you have to say, MeiHua, so wherever you wind up sharing, I’ll subscribe. I’d like to see this thread continue, so I’d hope for it to be here, but obviously it’s not my thread.

Hehe, yeah, mee tooo :slight_smile:

Since the last time, I have tried to make my own seasonal analysis. Check this out (almost guaranteed not accurate, though :/):

The green line is data for 2013, big purple one is total mean, blue one is the mean for the last 5 years, red is the mean for the first 5 years of the study…

Haven't tested many TA ideas yet, still trying to grasp R :wink:

It would be nice to have all the stat related posts located here in one central location. It would beat jumping from thread to thread IMO. I will post a few things that are candle pattern related in the near future. There may be very few interested but that will be their problem.

Why do these types of threads die? For a number of reasons. I happen to like knowing that my chances when doing X are greater than my chances when doing Y or that doing Z is a statistics-declared disaster so I should avoid it. Some don’t care. Also, many if not most methods in trading are quite subjective. It is hard to quantify subjectivity.

Anyway, I am all for giving a bit of CPR to the thread.

No, I am very interested in seeing the research you and everyone else has done. I made the thread specifically for people from all possible trading approaches to share their understanding of the markets with the evidence that supports their opinions. Post away. I enjoyed reading what you shared and I’m sure others did too. Contributing is not hijacking this thread.

This type of thread will never be popular, there is no leadership or structure in place to follow, no plan to make the readers rich, just a place to share what you find and the evidence that supports your claims and see if you get useful/useless feedback.

While I started the thread, I have little to contribute, as the field is not something I am experienced in, but I find that sharing ideas about real evidence is a great way to learn and/or demonstrate ideas.

What exactly are you calculating? It's not really clear. How are you isolating the seasonal cyclical factor and then parsing the data to calculate it? Why the last 5 years, the first 5 years and the total mean? Are we looking for drift? Consider that the euro is not like, say, corn, crude or other commodities, which are locked into either natural or fundamental cycles. I think it would be hard to reconcile a drift with fundamentals, though I am not certain your data speaks to that. Also, how many years and what data feed were used in your study?

Hi, sorry for the unclear stats, I was just trying to produce my own seasonal tendencies… The data is for 2001-2011, and the close price is transformed with rate of change…

Well, I was actually going to post my own seasonality study, although calculated completely differently than above. It's still quite redundant, though, so I will show my seasonal volatility study that goes along with the price trend. This uses Euro/Usd data from 1999-2012. I am calculating the % change of each month for the volatility study, and also introducing the max and min volatility values, to understand the bounds as well as the average itself. Just because richard brought out his R skills, I thought I would use it too. I am such a sheep. It is the most powerful stats package around, although most of this kind of work can be done in Excel, as it is not very demanding. But playing with R is fun :stuck_out_tongue:
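This is not the R script itself, just a rough Python sketch of the calculation as described. It assumes volatility is measured as the absolute month-over-month % change of closes and that the input is one close per month in chronological order. The function name, input layout and sample values are all hypothetical.

```python
from collections import defaultdict

def monthly_volatility(month_end_closes):
    """month_end_closes: chronological list of (year, month, close).

    Returns {month: (min, average, max)} of the absolute % change into that month.
    """
    by_month = defaultdict(list)
    for (_, _, prev), (_, month, cur) in zip(month_end_closes, month_end_closes[1:]):
        by_month[month].append(abs(cur - prev) / prev * 100.0)
    return {m: (min(v), sum(v) / len(v), max(v)) for m, v in sorted(by_month.items())}

# Made-up closes, only to show the shape of the result.
sample = [(2011, 12, 1.30), (2012, 1, 1.31), (2012, 2, 1.33),
          (2012, 12, 1.32), (2013, 1, 1.36), (2013, 2, 1.31)]
for month, (lo, avg, hi) in monthly_volatility(sample).items():
    print(month, round(lo, 2), round(avg, 2), round(hi, 2))
```

With real data the list would hold every month of every year, so that each calendar month collects one value per year.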



What did I take away from this:
Well, the range of volatility from min to max is huge; I mean, sometimes we explode. Most of this huge upper bound comes from the last 2 years, due to the euro crisis etc. The min values are fairly stable. But I think most of the information can be garnered from the average volatility line (black).

Most people claim that the “summer doldrums” are here and it's time to take a break and go on vacation. Well, there is some merit to that, as June seems to be the least volatile month of the year, both in the average and max category. July and August, though, are about average, on par with April and February. May is on average the most volatile month, rather than the winter months, although January, September and December are still up there.

So, to take a cue from Mythbusters, I would say the summer doldrums is plausible. Why? Because you do have the lowest volatility of any single month of the year, but the surrounding months are not nearly so low as to be completely out of line with the rest of the year. Taking summer as June, July and August, compare that to the 3 month period of February, March and April. The numbers are remarkably similar, but you don’t hear about the spring doldrums. So can summer be a low enough volatility time for the Euro/Usd to stop trading for an entire season? Possibly. But on average it's comparable to several other months of the year where people would consider that level of volatility tradable.

Any theories as to why the max volatility for Dec is off the chart like that? I have always assumed there was an end of year winding down period but it looks like max, min and average all steadily climb from Nov on.

Interesting.

It is cool to see interest in this topic. I spend most of my time trying to statistically analyze my systems. I would like to post some of my findings here. Below is a glimpse of what I do.

Stops and profit targets:

Every system has the chance that the trade will move against the position, and price usually does for at least some amount of time. In trying to determine how far away I should set my stop I thought, “I wish I could measure the average number of pips the market moves against me.” So I dug out the old college statistics book, and after a year of playing with statistics this is what I’ve come up with. Basically, I model my mechanical system in Excel and collect the number of pips, max and min, that the trade moves in both directions over the life of the trade.


I do this over and over again, hundreds if not thousands of times, which looks like what you see below:


(If you look at the bottom left corner you can see that I have over 60,000 data points)

I then mine this data using EasyFit 5.5 Professional (Student Version). This program uses advanced statistics to fit the distribution. They are anything but normal!


Here you can see that the stop analysis fits the Kumaraswamy distribution…


Above you can see that there is an 83.34% [P(X>x1)=.83342] chance that my stop will not be hit or, stated another way, there is a 16.66% probability that I will be stopped out if I place my stop 30 pips away from my entry. Of course, I use this method for limit placement as well, and for a combined analysis of stop and limit data, historically and randomly.
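For anyone without EasyFit, a cruder version of the same question can be answered straight from the collected data, with no fitted distribution at all. This is a swapped-in technique (plain empirical frequency), not the Kumaraswamy fit used above, and the sample excursions are made up.

```python
def stop_out_probability(adverse_pips, stop_distance):
    """Fraction of past trades whose worst adverse move reached the stop distance."""
    hits = sum(1 for x in adverse_pips if x >= stop_distance)
    return hits / len(adverse_pips)

# Made-up worst adverse moves (in pips), one per historical trade.
adverse_pips = [5, 12, 31, 8, 45, 20, 30, 3, 17, 9]

for stop in (20, 30, 60):
    p = stop_out_probability(adverse_pips, stop)
    print(f"stop at {stop} pips: stopped out in {p:.0%} of trades")
```

With tens of thousands of data points the empirical frequency and the fitted curve should land close to each other in the body of the distribution; the fit mostly helps in the tails, where observations are sparse.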

You should take a look at John Sweeney’s work on MAE and MFE (maximum adverse excursion and maximum favorable excursion); it's basically what you're measuring. Also Thomas Stridsman's (could be someone else, but I think it's him) life cycle of a trade, which uses Sweeney's MAE and MFE.

Just from looking at the numbers, 2012 was extremely volatile; the fiscal cliff in tandem with the euro crisis increased volatility in the winter time. I think that explains the max vol. Basically, anything on this time scale, monthly seasonality, is going to be fundamentally driven. But if we look at the averages, because we are using 13 years of data, the winter time is highly volatile but not as spiky as the max line, which is driven by 2012. But yes, that would confirm that Dec/Jan is a great time to trade, which I have always traded, minus Xmas to New Year's.

Thanks MeiHua! I’ve never heard of John Sweeney or his work. I think I might have to get his book. I searched for him and his work looks interesting. Do you have any experience with MAE and MFE?

As I’ve been looking at this data and trying to place my stop and profit target, I have been trying to get as close to a 1:2 risk:reward ratio as possible. Last week, after a year of playing with this, I was reading about Hoosain Harkener and his goal of 10 pips a day. He stated that he was about 90% accurate and that his risk was double the reward. He would place his stop 20 pips away and take profit at 10 pips.

Using a mechanical system that I have tested, I found that if I set my stop at 60 pips there was only a 10% probability that I would get stopped out before the trade ended (i.e. it hit its profit target or I got a signal for a trade in the opposite direction). On the other side of the trade, I found that there is a 54% chance that the trade will go 90 pips.

Here is my stop analysis:


And here is my target analysis:


So I currently have a 50% accuracy after about 2 months of trading, which isn’t that large of a sample.

As I read about Harkener’s method I began to wonder what it would do to my account if I were to set my stop at 60 pips and take profit at 30 pips (i.e. 2:1 risk:reward with a theoretical accuracy of 80%). So I built my own random number generator and Monte Carlo system in Excel. I found that if I was to use my standard 2% risk, the gains were consistent but the slope of the gains was significantly less than with the 1:2 r:r method I’ve been working on, shown above. However, if I risk 5% to 10% of my capital, with an 80% accuracy, the profit curve becomes parabolic. I’M NOT ABOUT TO TRY THIS!!! But it is interesting.
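A minimal sketch of this kind of Monte Carlo run, in Python rather than Excel. It is not the spreadsheet itself: it assumes every trade is independent with a fixed win rate, risks a fixed fraction of current equity, and ignores spread and slippage. The 1.5 reward-to-risk figure is taken from the 60 pip stop and 90 pip target mentioned earlier; all names and the starting balance are hypothetical.

```python
import random

def expectancy_r(win_rate, reward_to_risk):
    """Average result per trade, in units of the amount risked (R)."""
    return win_rate * reward_to_risk - (1 - win_rate)

def simulate_equity(win_rate, risk_frac, reward_to_risk, n_trades, start=10_000.0, seed=0):
    """Final equity after n_trades, risking risk_frac of current equity each time."""
    rng = random.Random(seed)
    equity = start
    for _ in range(n_trades):
        risk = equity * risk_frac
        if rng.random() < win_rate:
            equity += risk * reward_to_risk   # winner pays reward_to_risk times the risk
        else:
            equity -= risk                    # loser costs the full risk
    return equity

# 60 pip stop / 30 pip target at 80% vs 60 pip stop / 90 pip target at 50%.
print("expectancy, 80% at 0.5R:", round(expectancy_r(0.80, 0.5), 2))
print("expectancy, 50% at 1.5R:", round(expectancy_r(0.50, 1.5), 2))

for risk_frac in (0.02, 0.05, 0.10):
    finals = sorted(simulate_equity(0.80, risk_frac, 0.5, 200, seed=s) for s in range(500))
    print(f"risk {risk_frac:.0%}: median final equity {finals[len(finals) // 2]:,.0f}")
```

Under these assumptions the expectancy works out to 0.2R per trade for the 80% setup against 0.25R for the 50% setup, which lines up with the flatter slope at 2% risk. The steep curve at 5% to 10% risk comes from compounding a larger fraction of equity, and the same compounding magnifies losing streaks.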

I’ll try to upload my spreadsheet for anyone who wants to play around with it. Also, I’ll try to make some graphs today on risk of ruin.

Wow, interesting analysis!

MeiHua, could you show your seasonal study and explain it? I have tried to figure out how to do it on my own, and my results show that :stuck_out_tongue: I would love to dissect your way of doing it and learn a little :slight_smile:

For the record, I have no statistical background, so don't trust me at all!