Data mining: quantifying your market’s personality

So I was doing some research for a new project and I realized that not many traders here use all the data they have at their fingertips. Every one of us has years and years of back data from our broker, all for free. But what do you do with it? Chances are NOTHING! Well, that’s just a waste. So I am going to run through a very simple way to data mine and come up with some quantitative results based on FIBER (EUR/USD). The method is very simple, and I credit the author Thomas Strideman for it.

A few caveats: I am going to be using the 6E futures contract to generate the data, which will be ratio back adjusted. The reason is that I personally believe the futures data I have from the exchange is much cleaner and gives a better picture than data from different spot brokers. Feeds differ from broker to broker in pricing and bid/ask levels, and this becomes especially apparent when you have a broker with a variable spread. The data ranges from 3-20-2001 to 8-10-2012, over 10 years of data to work with. Let me give you a snapshot of what the final result looks like.

PS: I take no responsibility for what you do with this data, nor do I claim any accuracy. This is just to show what can be learned from historical data and how to turn it into something quantifiable.

This is brilliant. It’s exactly the kind of data I was going to start researching myself. I was reading an old but very good trading book called ‘Market Wizards’, and one of the traders in there explained how he used this kind of data. Tbh I don’t know what the 6E futures contract you refer to is. Also, what does Avg AMP stand for, and how do you define what a move is?
Sorry for coming straight out with a lot of questions, and I apologise in advance if these were going to be answered in further posts. I’m just quite excited to see this thread as it’s something I was going to start looking at!!

I do a lot of data mining myself as well, especially looking at cointegration vs. correlation right now. I think it’s a good avenue to start off. :slight_smile: Some pretty interesting, and not so interesting, results so far.

Nice, so you’re saying,

Daily?

Haha… What are you talking about…?

MeiHua, How do you define a move in your stats above?

Lots of interesting things to be found, Clark, if you go off the well-beaten path.

After a couple of days of working with Pipstradamous to resolve this technical issue, I should now be able to reply to my own thread. YIPPEE!!!

Let me explain what each field means first, then I will show you how I generated it. This is just a simple sample; once you grasp the concept you will be able to take any sort of idea and determine its probability.

Periods (up/down): these are bars in which the bar’s close is higher/lower than the previous bar’s close.

Moves: this is the count of sequences in which candles moved in one single direction, i.e. were trending.

Avg Period in move: how many consecutive bars, on average, the market continued to move in that direction.

Average AMP(litude): this is the amplitude of the move in percent, not pips. The reason is that we need this to be comparable across all markets. If we use pips, say comparing USD/JPY to Fiber, we run into a problem. If we use percentages, everything is relative and a straight comparison between any instruments can be made without converting values.

Chance to last: This is the % probability that a move will last X bars in duration.

Chance to move > X %: this is the % probability that a move will go X % distance.
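To make the field definitions above concrete, here is a minimal Python sketch. The list of moves is made up for illustration and is not taken from the 6E data, and the function names are my own. It also assumes that "chance to last X" means "at least X bars", which the definitions above do not spell out.

```python
def amplitude_pct(start_close, end_close):
    """Size of a move as a percentage of the price it started from."""
    return (end_close / start_close - 1) * 100

# Why percent and not pips: both of these are "100 pip" moves.
eurusd_amp = amplitude_pct(1.2500, 1.2600)   # roughly 0.8 %
usdjpy_amp = amplitude_pct(80.00, 81.00)     # roughly 1.25 %

# Hypothetical completed moves: (signed bar count, amplitude in %).
# Positive = consecutive higher closes, negative = consecutive lower closes.
moves = [(2, 1.0), (-3, -1.5), (1, 0.25), (-1, -0.5), (4, 2.0), (-2, -0.75)]

def side_stats(moves, direction):
    """Periods, Moves, Avg Period in move and Average AMP for one direction (+1 or -1)."""
    side = [(abs(bars), abs(amp)) for bars, amp in moves if bars * direction > 0]
    periods = sum(bars for bars, _ in side)
    return {
        "periods": periods,
        "moves": len(side),
        "avg_periods_in_move": periods / len(side),
        "avg_amp_pct": sum(amp for _, amp in side) / len(side),
    }

def chance_to_last(moves, n_bars):
    """Percent of all moves that lasted at least n_bars (assumed meaning)."""
    return 100 * sum(abs(bars) >= n_bars for bars, _ in moves) / len(moves)

def chance_to_move(moves, x_pct):
    """Percent of all moves whose amplitude exceeded x_pct."""
    return 100 * sum(abs(amp) > x_pct for _, amp in moves) / len(moves)

print(side_stats(moves, +1))
print(side_stats(moves, -1))
print(chance_to_last(moves, 3), chance_to_move(moves, 1.0))
```

With real data, the `moves` list would come from your own counters rather than being typed in by hand.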

I am not saying anything, you should use this information to your advantage though. How you use it is basically up to you. This is the general topography of the battle field so to speak. How you plan to attack is completely up to you. But now we are not flying blind.

But let’s give this a little analysis from my point of view, and maybe that will give some insight into how I use this data.

So I marked up this screenshot to explain what I mean.

Blue: You can see from the right hand side the chance for any given move to last X number of periods is pretty much the same in any time frame. So your chance to see a straight 3 period rally in the hourly is the same chance to see a straight 3 period rally in the daily.

Red: Obviously, larger time frames give you the chance to hit larger moves. But how much of a move are you giving up if you trade lower? I don’t know if it’s exactly exponential, but we can see here that it is larger than linear. For example, the 4-hour chance for a > 1 % move is 7.06, which if you divide by 4 should give about 1.77 in the 1-hour. However, that is not the case; you actually get 1.08.
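The arithmetic behind the Red observation can be checked in a few lines. The two percentages are the ones quoted above; the variable names are just for illustration.

```python
chance_4h = 7.06   # % chance of a > 1 % move on the 4-hour chart (quoted above)
chance_1h = 1.08   # % chance of a > 1 % move on the 1-hour chart (quoted above)

# If the chance scaled linearly with bar length, the 1-hour figure
# would simply be a quarter of the 4-hour figure.
linear_1h = chance_4h / 4

# How many times more likely the move actually is on the 4-hour chart.
actual_ratio = chance_4h / chance_1h

print(f"linear guess for 1h: {linear_1h:.3f} %, observed: {chance_1h} %")
print(f"4h is {actual_ratio:.2f}x the 1h chance, not 4x")
```

So quadrupling the bar length multiplies the chance of a > 1 % move by roughly 6.5 instead of 4, which is the "larger than linear" effect.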

Green: This also shows that the distribution of up and down moves is about equal on all time frames. This could be why a lot of academics believe that the random walk theory applies to markets.
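As a baseline for comparison (my own illustration, not part of the table), this is what a pure coin-flip market would look like. It assumes every bar closes up or down with a 50/50 chance, independent of the previous bar. Under that assumption the chance for a move to last at least n bars is 0.5^(n-1), no matter what the time frame is, which is one way to read the Blue and Green observations together.

```python
import random

def run_lengths(steps):
    """Lengths of consecutive same-direction runs in a sequence of +1/-1 steps."""
    lengths, count = [], 1
    for prev, curr in zip(steps, steps[1:]):
        if curr == prev:
            count += 1
        else:
            lengths.append(count)
            count = 1
    lengths.append(count)  # the final run (possibly cut short by the end of the data)
    return lengths

def coin_flip_chance_to_last(n_bars):
    """Theoretical % chance that a fair coin-flip move lasts at least n_bars."""
    return 100 * 0.5 ** (n_bars - 1)

rng = random.Random(42)  # fixed seed so reruns give the same numbers
steps = [rng.choice((1, -1)) for _ in range(200_000)]
lengths = run_lengths(steps)

for n in (1, 2, 3, 4, 5):
    simulated = 100 * sum(length >= n for length in lengths) / len(lengths)
    print(n, round(simulated, 2), coin_flip_chance_to_last(n))
```

Where your own "chance to last" numbers sit close to this baseline, the market looks like a coin flip on that measure; where they differ, there may be something worth digging into.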

Good stuff Mei,

That’s what I got out of it, just narrowed it down to one word, lol

I guess I am not as concise

Thanks for this thread, MeiHua. I’ve been trying to pursue a similar line of thought myself.

Well, of course you would be, Mei. You’re explaining a point…

and again, I applaud your efforts,

Keep Rockin it,

Here’s a screenshot of how I did it in Excel. I won’t give away my spreadsheet, because I believe that if you make it yourself you can start to experiment with different concepts, all of which can and should be shared here. But here is the basic setup.

First I created a RAD (ratio back-adjusted) contract for the 6E futures symbol (Euro FX futures). It has accurate ticks, and my data also comes with real volume (which I didn’t need for this experiment but have stored). Then I just created these columns. The counters are based only on the close price in relation to the previous close, although any value can be measured from the OHLC info. The first line must be skipped, as that is the baseline from which the second line is measured, thus starting the count. You can see in line 2 and line 3 that we have 2 consecutive lower closes, giving us a move count of 2. I then calculate the direction of the move and how much it has moved, giving a count of -2: negative for down, positive for up. Then, using SUMIF and COUNTIF statements in Excel, you can calculate anything you want using this base system.

Although it is tedious to get the data in there, once it’s in you can do all sorts of analysis, and if you add volume you can come up with a lot of stuff. A few ideas: test price and volume action, session times, price to volume, or volume on breakouts referencing the daily info; measure how far a breakout goes on average, or how far a trend continuation goes; determine the average moves on any given day, say NFP days. The ideas are basically limitless.
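For anyone who would rather script it than build the sheet, here is a rough Python sketch of the counter column described above. It is not the actual spreadsheet. The closes are made-up numbers, the function names are my own, and resetting the count to 0 on an unchanged close is an assumption, since the post does not say how ties are handled.

```python
def signed_run_counter(closes):
    """Running count of consecutive higher (+) or lower (-) closes.

    The first row is the baseline and gets no count (None).
    """
    counter, run = [None], 0
    for prev, curr in zip(closes, closes[1:]):
        if curr > prev:
            run = run + 1 if run > 0 else 1
        elif curr < prev:
            run = run - 1 if run < 0 else -1
        else:
            run = 0  # assumption: an unchanged close resets the count
        counter.append(run)
    return counter

def completed_moves(counter):
    """Keep only the last count of each run, e.g. -2 for two lower closes in a row."""
    values = counter[1:]
    finished = []
    for value, nxt in zip(values, values[1:] + [0]):
        continues = nxt == value + (1 if value > 0 else -1)
        if value != 0 and not continues:
            finished.append(value)
    return finished

# Hypothetical closes, only to show the mechanics.
closes = [1.2300, 1.2290, 1.2280, 1.2295, 1.2310, 1.2320, 1.2305]
counter = signed_run_counter(closes)
moves = completed_moves(counter)

down_moves = sum(1 for m in moves if m < 0)      # like COUNTIF(range, "<0")
down_periods = -sum(m for m in moves if m < 0)   # like SUMIF(range, "<0"), sign flipped
avg_down_periods = down_periods / down_moves

print(counter)
print(moves, down_moves, down_periods, avg_down_periods)
```

The same two helpers work on any column you care to count, so swapping closes for highs, lows or volume is a one-line change.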