That’s the way to do it. That’s the nature of analysts. One single question:
How can you possibly get around the fact that a technology hasn’t been working so far, yet expect it to work in the future?
What would be the fundamental key in any new technology for getting around that issue?
I’ve done quite a lot of work on the effects of time of trade on various auto-traded strategies.
One of the approaches I use is to retrospectively analyse trade results by applying filters to limit trades to certain hours. For example, I’ll analyse equity curves comprised of samples of trades taken between certain hours: 6am-7am, 6am-8am, 6am-9am, 7-8, 7-9, 7-10, etc.
Extending this to a full 24-hour period, and rounding session times to the nearest hour, gives me around 300 unique session windows that can be plotted as a 2D surface, and it’s quite noticeable that the best results tend to cluster around particular sessions.
It’s also quite clear that if I take a simple analysis of average profit vs. time of trade, the results for trades with above-average returns definitely cluster around particular sessions. If I do the same analysis for losses, I tend to get the inverse.
I suspect in my case there’s a direct correlation between performance and market volatility (or I’m being fooled by randomness!). Trades taken around particular market opening times also tend to do better than average, e.g. the London and New York opens, but again that’s a consequence of the edge I’m trying to exploit.
Personally I’d be reluctant to include a time based filter unless I had a fairly solid understanding of why the filter worked, otherwise it’s potentially another data mining bias.
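A minimal sketch of the session-slicing idea described above: for every (start hour, end hour) window in a 24-hour day, average the P/L of trades whose entry falls inside the window, which yields exactly the ~300 unique windows mentioned. The column names (`entry_time`, `pnl`) and the toy random data are assumptions for illustration, not anyone’s actual schema.

```python
import numpy as np
import pandas as pd

def session_surface(trades: pd.DataFrame) -> pd.DataFrame:
    """Average P/L and trade count for every (start_hour, end_hour) window."""
    hours = trades["entry_time"].dt.hour
    rows = []
    for start in range(24):
        for end in range(start + 1, 25):          # 300 windows in total
            mask = (hours >= start) & (hours < end)
            if mask.sum() == 0:
                continue
            rows.append({
                "start": start,
                "end": end,
                "n_trades": int(mask.sum()),
                "avg_pnl": trades.loc[mask, "pnl"].mean(),
            })
    return pd.DataFrame(rows)

# Toy usage with random trades, just to show the shape of the output.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "entry_time": pd.to_datetime("2009-01-05")
                  + pd.to_timedelta(rng.integers(0, 24 * 60, 2000), unit="min"),
    "pnl": rng.normal(0, 10, 2000),
})
surface = session_surface(demo)
print(surface.sort_values("avg_pnl", ascending=False).head())
```

The resulting `(start, end, avg_pnl)` table is what gets pivoted into the 2D surface; clusters of good windows then show up as contiguous hot regions rather than isolated cells.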
The first thing you have to do is identify why it quit working. Then you have to pay me a small fortune to tell you the rest. lol. Not all failures are created equal, plus I happen to be in a field where the base technology itself continues to be improved, as do the computers it runs on, not to mention there is always new thinking in how to apply it. A large part of my failure was in the data-set: the structure of it was no longer relevant; it needed different information. That took a very long time though, and a lot of research, and I realize that is a process I need to stay on top of (ahead of). - JP
I think the danger of data mining in any of this can’t be overstated, but I do think there is something to be said for it if the sample set is large enough, which will help prevent curve-fitting. The other thing I would consider in your case is the time periods before and after the sweet spot. Are the signals generated dramatically different in terms of P/L? If they are, I’d be concerned - I do occasionally see this in my modeling. If they aren’t, and the sample size is significant (1500 unique trades give or take), then you may actually be finding some efficiency in your approach you can exploit. - JP
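One rough way to sanity-check the sweet-spot question JP raises is to compare trade P/L inside the candidate window against trades just outside it. The Welch t-test below is only one possible check (and assumes roughly independent trades); the arrays and sample sizes are illustrative, not anyone’s real results.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
in_window_pnl = rng.normal(1.5, 10, 1500)    # e.g. trades entered inside the sweet spot
out_window_pnl = rng.normal(0.0, 10, 1500)   # e.g. trades entered in the adjacent hours

# Welch's t-test: does the in-window mean differ from the out-of-window mean?
t_stat, p_value = stats.ttest_ind(in_window_pnl, out_window_pnl, equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

A large, abrupt difference between adjacent windows on a small sample would be a red flag for curve-fitting; a modest, stable edge over ~1500 trades is closer to the kind of exploitable efficiency JP describes.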
You mentioned that the structure was no longer relevant. Just so we’re on the same page here, are you referring to
the data feed or a selected, optimized source?
How could you be certain about the reason why it stopped working?
I never feel certain about the markets, or the modeling. The best we can do is model enough data that the results are statistically significant and then keep an eye on the efficiency of the model. Where I failed with my previous data-set was in not trying to account for increasing complexity over time. A model built around 2002 data doesn’t represent the impacts the economies are experiencing in 2007 or 2009, etc. We have to be digging all the time to find and include the things that have an impact and remove those that don’t. Or, if you don’t want to remove them (in case they become relevant again), you need to improve your computational power to account for the extra size and computational load in the set. The data-set I classified today had just under 3mm data-points in it, and this was a small one. It took the equivalent of 2700 hours of processing (based on a single core) and evaluated just over 300mm models. To help avoid data snooping (given the huge number of models evaluated), I don’t rely on a single one; instead I take a blended approach, creating a multi-model that incorporates several different criteria injected into different phases of the modeling process. - JP
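JP doesn’t spell out how his blending works, so the following is only a generic sketch of the broad idea: fit several models under different criteria and combine their outputs rather than trusting any single one. The models, features, and split used here are placeholders.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
X = rng.normal(size=(5000, 20))                               # placeholder feature matrix
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=5000) > 0).astype(int)

# Three models with different biases, standing in for "different criteria".
models = [
    LogisticRegression(max_iter=1000),
    RandomForestClassifier(n_estimators=200, random_state=0),
    GradientBoostingClassifier(random_state=0),
]
for m in models:
    m.fit(X[:4000], y[:4000])

# Blend by averaging predicted probabilities on held-out data; a real pipeline
# might instead weight models by out-of-sample performance or inject criteria
# at different phases of the modeling process.
blended = np.mean([m.predict_proba(X[4000:])[:, 1] for m in models], axis=0)
accuracy = ((blended > 0.5).astype(int) == y[4000:]).mean()
print(f"blended hold-out accuracy: {accuracy:.3f}")
```

The point of the blend is robustness against data snooping: a spurious pattern picked up by one model under one criterion is less likely to dominate an average of several.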
That makes sense. In that order you can implement what you know has already worked at a given time and maintain the ability
to tweak criteria for further analysis.
Once you do that at intervals you are always up to date (and keep your energy provider happy). Moreover, you’ve got a Plan B:
if in time your model fails you completely, you’ll have invested in other possibilities, i.e. thinking or technology.
Just one thought: do not discount the impact Daylight Saving Time has on time-based entries. Platform GMT does not change, but other market times change their relationship with GMT.
I have several time-based strategies that I employ with high success. But when I was in the early stages of programming them, I noticed a serious year-after-year change in March-October test results. It took me a while to realize what caused that. When I did, everything fell into place, so to speak.
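A minimal illustration of the DST point above: a filter pegged to platform GMT stays fixed, but the London and New York opens move against GMT when each market switches to summer time. The dates and open times below are examples only, not anyone’s actual strategy parameters.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

UTC = ZoneInfo("UTC")

def open_in_gmt(date_str: str, local_open: str, tz: str) -> str:
    """Express a market's local opening time on a given date in GMT/UTC."""
    local = datetime.fromisoformat(f"{date_str} {local_open}").replace(tzinfo=ZoneInfo(tz))
    return local.astimezone(UTC).strftime("%H:%M GMT")

# Winter, the gap where only the US has switched, and full summer time.
for day in ("2009-01-15", "2009-03-20", "2009-04-15"):
    print(day,
          "| London 08:00 ->", open_in_gmt(day, "08:00", "Europe/London"),
          "| New York 09:30 ->", open_in_gmt(day, "09:30", "America/New_York"))
```

Between early and late March the US switches first, so the New York open moves from 14:30 to 13:30 GMT while London is still at 08:00 GMT; a static GMT-hour filter therefore silently shifts which part of the session it captures, which is consistent with the March-October discrepancy described above.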
Right now yes. It’s 80+ hrs a week, the suck factor is really high, but that’s what I get for taking time off. I’m hoping in a year to be back in a position to hire. - JP