Monte Carlo Simulation of trades in backtesting a la Van Tharp

aarbee · 5 February 2009

Hi Howard,

Many thanks for your detailed replies. I completely concur with your view on ranking of trades in backtesting thus obviating the need for Monte Carlo and have been using the PositionScore to rank the trades.

Apropos Monte Carlo analysis on the trade database: in your opinion, is there any value at all in it for stress testing the trading system? I am fully aware of your views on WFO etc. My next mission is to fully understand the automatic WFO methodology so that I can use it effectively with my Objective Function when testing trading systems.

Thanks again for responding. I am sure most members of the forum would find your responses of immense benefit.

Cheers,

TradeSim · 6 February 2009

aarbee said:
Hi Howard,

Many thanks for your detailed replies. I completely concur with your view on ranking of trades in backtesting thus obviating the need for Monte Carlo and have been using the PositionScore to rank the trades.

Apropos Monte Carlo analysis on the trade database: in your opinion, is there any value at all in it for stress testing the trading system? I am fully aware of your views on WFO etc. My next mission is to fully understand the automatic WFO methodology so that I can use it effectively with my Objective Function when testing trading systems.

Thanks again for responding. I am sure most members of the forum would find your responses of immense benefit.

Cheers,

But why rank the trades if tossing a pair of dice and selecting trades according to the outcome could provide a better result for the trading system as a whole?

What is it about your trade ranking criteria that makes you think you are always selecting the best trades, and what tests did you use to verify this?

I'm not saying that trade ranking is a bad idea, but blindly accepting a particular ranking strategy as some sort of panacea for developing an optimum trading system may be a bit short-sighted without some sort of verification.

aarbee · 6 February 2009

TradeSim said:
But why rank the trades if tossing a pair of dice and selecting trades according to the outcome could provide a better result for the trading system as a whole?

What is it about your trade ranking criteria that makes you think you are always selecting the best trades, and what tests did you use to verify this?

I'm not saying that trade ranking is a bad idea, but blindly accepting a particular ranking strategy as some sort of panacea for developing an optimum trading system may be a bit short-sighted without some sort of verification.

I refer you back to post #28 onwards in the following thread:
https://www.aussiestockforums.com/forums/showthread.php?t=2060&page=2&highlight=hometrader

Cheers,

tech/a · 6 February 2009

My understanding of Monte Carlo analysis is that it applies as many combinations of variables to a data set as possible, to establish with as much confidence as possible that any combination of variables will give a positive result.

Now I'm taking this beyond financial data and the variables WE set in our system. From what I understand, the variables we set in a system fall way short of the POSSIBLE variables that can be applied to any data set.
As such, Monte Carlo analysis the way we use it evidently falls way short of a definitive result.

Which then brings us to the view of ranking.
All well and good today, but as not all possible variables are present in the test, it is highly likely that a test in a week or so will give an entirely different set of rankings.

If all possible variables were being viewed then the ranking would be more likely to persist.

Let's say I was testing the deflection in steel.
My data only gave me variables related to heat applied to the steel.
I could also use live or dead load, but I don't have that information available for running this Monte Carlo test on the steel.

Monte Carlo ranks my results and I find an optimum temperature.
I'm sure you can see that these results would vary if I then introduced other variables which are present but not used in my analysis. Even just with temperature.

So without ALL variables present, ranking seems pointless?

weird · 6 February 2009

In terms of the original question, it's already been covered by David.

To discuss the other topics,

Single path ranking lends itself to too much over-optimization of backtesting results.

While I respect the scientific or statistical use of out-of-sample data to confirm or validate in-sample results, I wonder whether this is as valid for people who test trading systems again and again and again on the 'same' static in-sample and out-of-sample database until they find something that agrees across both. Perhaps a smooth equity curve over both periods would suffice? New data, such as actually trading, even better?

Personally I think the logic of using out-of-sample testing to confirm in-sample results, on the same static database, again and again and again until something agrees across both, is flawed ... but would like to be proven wrong.

After one run on out-of-sample data, that data set then becomes in-sample. I hope the first run is a good one, unless you have enough other out-of-sample periods (and not the same ones) to test further on.

Nick Radge · 6 February 2009

I agree Dave. It's much more productive to understand how to gain a positive expectancy regardless of the data being tested. In sample/out of sample is just an added 'confidence' booster. I think an innate appreciation for positive expectancy and probability theory will do a lot more for long term survival.

aarbee · 7 February 2009

Nick Radge said:
I agree Dave. It's much more productive to understand how to gain a positive expectancy regardless of the data being tested. In sample/out of sample is just an added 'confidence' booster. I think an innate appreciation for positive expectancy and probability theory will do a lot more for long term survival.

Nick and Tech/a,

Thanks for your posts. Putting the issue of ranking and in/out-of-sample testing aside, I would like your opinion on the validity or usefulness of conducting an MC analysis on the R-multiples dataset obtained from backtesting, as a way of gaining better knowledge of how the system is likely to behave in the future. Again, I am referring to the methodology put forth by Van Tharp (Definitive Guide to Position Sizing) and in Larry Sanders' (tradelabstrategies.com) ebook.

Best regards

Nick Radge · 7 February 2009

aarbee,
Yes, Monte Carlo simulations are necessary. I'm not sure how Van does it as I have not seen his book, but a single test run will leave you in the dark as to where that run stands within a series of runs. We don't know if it's at the lower end of the range, which could mean the system is incorrectly discarded, or at the top end of the range, meaning one has high expectations that may never be realized.

bingk6 · 7 February 2009

aarbee said:
Nick and Tech/a,

Thanks for your posts. Putting the issue of ranking and in/out-of-sample testing aside, I would like your opinion on the validity or usefulness of conducting an MC analysis on the R-multiples dataset obtained from backtesting, as a way of gaining better knowledge of how the system is likely to behave in the future. Again, I am referring to the methodology put forth by Van Tharp (Definitive Guide to Position Sizing) and in Larry Sanders' (tradelabstrategies.com) ebook.

Best regards

aarbee,

Just a few simple thoughts.

If you ran just a single pass through your in-sample data and the results are acceptable, it does not necessarily mean you have a robust system. You should run the Monte Carlo simulations (using your in-sample data) so that you know the high and low bounds and where your single-pass result ranks in the overall scope of things. As an example, if you ran a simulation consisting of 50,000 possible runs and 50% of those runs showed losses while your single pass was profitable, would you trade that system? I suspect not, but the point is that you would not have known that if you had not conducted the Monte Carlo test in the first place.

Alternatively, you may find that your single-pass result ranks in the bottom 10%, in which case you could legitimately question the effectiveness of your ranking strategy, when 90% of random selections are superior to it. Once again, you would not have known that if you had not conducted the Monte Carlo test in the first place. Monte Carlo analysis tells you a great deal about your system.
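The comparison described above, locating a single ranked pass within the spread of randomly selected trade paths, can be sketched roughly as follows. This is only an illustration under simplifying assumptions: each day's candidate signals are represented by their eventual profit, listed in rank order, every trade is treated as independent of holding periods and capital overlap, and the names `random_selection_runs` and `percentile_rank` are hypothetical rather than taken from any backtesting package.

```python
import random

def random_selection_runs(daily_signals, slots_per_day, runs, seed=0):
    """Total profit of many runs that pick each day's trades at random."""
    rng = random.Random(seed)
    totals = []
    for _ in range(runs):
        total = 0.0
        for signals in daily_signals:
            # Capital only allows `slots_per_day` of the day's signals.
            k = min(slots_per_day, len(signals))
            total += sum(rng.sample(signals, k))
        totals.append(total)
    return totals

def percentile_rank(single_pass, simulated):
    """Percentage of simulated runs that the single pass beats."""
    return 100.0 * sum(1 for t in simulated if t < single_pass) / len(simulated)

# Made-up profits per candidate signal, best-ranked first within each day.
daily_signals = [
    [1.2, -0.5, 0.3, 2.0],
    [-1.0, 0.8, 0.1],
    [0.4, -0.2, 1.5, -0.7, 0.9],
]
ranked_total = sum(sum(day[:2]) for day in daily_signals)  # ranked single pass
totals = random_selection_runs(daily_signals, slots_per_day=2, runs=5000, seed=1)

print("ranked single pass:", ranked_total)
print("low / high bound:", min(totals), max(totals))
print("random runs showing a loss (%):",
      100.0 * sum(1 for t in totals if t < 0) / len(totals))
print("ranked pass beats (%) of random runs:", percentile_rank(ranked_total, totals))
```

A ranked pass that lands near the bottom of this distribution would be the warning sign mentioned above, and a wide gap between the low and high bounds indicates that results depend heavily on which trades happen to be taken.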

The ideal situation is for the high and low bounds to be quite close together across a large number of simulation runs, with all runs sufficiently profitable. Then you can have added confidence in your system because it has shown its ability to perform regardless of the actual trades taken. As far as possible, it's best to take the “Luck” factor out of the equation and to be able to say that the system has demonstrated its robustness by performing at an acceptable level no matter which combination of trades it takes. This confidence can only increase with a larger number of Monte Carlo runs (in excess of 20,000), a large number of trades being evaluated, a large universe, and a test run over a sufficiently long period that the system is evaluated across a range of market conditions.

I should qualify the previous paragraph by saying that the system must generate enough signals to provide enough alternative trade paths for the Monte Carlo process to evaluate. If there are too few signals then most of the runs will, in all likelihood, share large chunks of trades and you’ll end up with a small variance in results from high to low bound, which defeats the purpose of performing Monte Carlo analysis and invalidates the test.

I should also mention that the Monte Carlo simulations should only take place using in-sample data. As Howard has said on many occasions, perform as much analysis on the in-sample data as you wish and, when you are completely satisfied, try it out on the out-of-sample data and see how it goes. That is the ultimate test.

I would put forward the proposition that a well-structured testing methodology, encompassing comprehensive Monte Carlo analysis using in-sample data, will in all probability give better results in out-of-sample testing than a system developed without any Monte Carlo analysis.

nizar · 7 February 2009

bingk6 said:
I should also mention the Monte Carlo simulations should only take place using insample data.

Why so?

Just as there are multiple possible paths of trades in the in-sample data, there are many equally likely paths that are possible in the out-of-sample data.

If Monte Carlo analysis is required for in-sample testing, I don't see why it should not be used in the walk-forward test.

I agree with everything else you said, and that's a good point you made about how Monte Carlo analysis is required so you know how your trade ranking affected the performance (upper, mid, or lower end).

TradeSim · 7 February 2009

nizar said:
Why so?

Just as there are multiple possible paths of trades in the in-sample data, there are many equally likely paths that are possible in the out-of-sample data.

If Monte Carlo analysis is required for in-sample testing, I don't see why it should not be used in the walk-forward test.

I agree with everything else you said, and that's a good point you made about how Monte Carlo analysis is required so you know how your trade ranking affected the performance (upper, mid, or lower end).

Yes, this is correct. In this case you would be comparing one set of statistics (a distribution over many runs) from in-sample data with another set from out-of-sample data, and hoping that the two would be in some sort of agreement. This is in contrast to comparing a single set of metrics from one in-sample run with a single set of metrics from one out-of-sample run. Comparing the statistics from the two different sample spaces would provide a much more conclusive comparison.

Regards
David

aarbee · 7 February 2009

tech/a said:
My understanding of Monte Carlo analysis is that it applies as many combinations of variables to a data set as possible, to establish with as much confidence as possible that any combination of variables will give a positive result.

Now I'm taking this beyond financial data and the variables WE set in our system. From what I understand, the variables we set in a system fall way short of the POSSIBLE variables that can be applied to any data set.
As such, Monte Carlo analysis the way we use it evidently falls way short of a definitive result.

Which then brings us to the view of ranking.
All well and good today, but as not all possible variables are present in the test, it is highly likely that a test in a week or so will give an entirely different set of rankings.

If all possible variables were being viewed then the ranking would be more likely to persist.

Let's say I was testing the deflection in steel.
My data only gave me variables related to heat applied to the steel.
I could also use live or dead load, but I don't have that information available for running this Monte Carlo test on the steel.

Monte Carlo ranks my results and I find an optimum temperature.
I'm sure you can see that these results would vary if I then introduced other variables which are present but not used in my analysis. Even just with temperature.

So without ALL variables present, ranking seems pointless?

Let’s look at the following two scenarios:

AA My system has filters for turnover, volatility, trend strength, etc. that often give more signals on daily scans than the available capital would allow me to take.

BB This other system has filters for turnover, volatility, trend strength, and a couple of others. It triggers far fewer trades each day than AA, and very rarely are there more signals than available capital to take them.

In AA the MC as excellently done in TradeSim is useful for all the reasons that other posters have so clearly outlined.
In BB the MC would be quite redundant because there would only be a single path.

Just because there is a single path in BB, does the backtesting result become any less reliable than AA? The filters in BB are essentially the same as in AA, only more stringent.
If your answer is “Yes”, I would like to hear the reason. If the answer is “No”, then what’s the problem with ranking? It’s just another filter.

As for ranking changing every week, I am not sure I understand. The optimal value for any filter in the system would also change from week to week or month to month. That per se doesn't invalidate the filter's use in screening stocks.

Cheers,

aarbee · 7 February 2009

Nick Radge said:
aarbee,
Yes, Monte Carlo simulations are necessary. I'm not sure how Van does it as I have not seen his book, but a single test run will leave you in the dark as to where that run stands within a series of runs. We don't know if it's at the lower end of the range, which could mean the system is incorrectly discarded, or at the top end of the range, meaning one has high expectations that may never be realized.

Hi Nick,
You are quite correct in pointing out the deficiency of picking a single run when many different runs are possible.
Van Tharp's recommended way of doing MC analysis is along the following lines. It is the same as the method outlined on Larry Sanders' site and in his ebook "Trading Strategies".
In backtesting, a trade list is produced. Either this or a list of actual past trades can be used. Let’s say the backtest has given us one sample of trades, say 100 in number, stretching over months or years. We then calculate the R-multiple for each of these trades. These are fed into the MC simulator. The MC simulator randomly picks one R-multiple value from the sample and treats it as the result of the first trade. For the next trade, it does the same thing and selects another R-multiple from the sample (it could even be the same trade, i.e. sampling is with replacement). It keeps doing this 100 times. This forms one run and gives one equity curve in R-multiples. The simulator then does, say, 20,000 such runs, generating an equity curve for each. It then tallies the drawdowns, profits, losing streaks, winning streaks etc. across the runs and works out the statistical probability of each.

This would be invaluable in getting an idea of what can happen if the system is traded, and would assist in understanding the system and working out the position sizing for the trades. I know that there are certain assumptions implicit in this way of doing MC analysis. The whole purpose of this thread was to initiate discussion on the validity or usefulness, limited or otherwise, of conducting this kind of analysis. I am quite well aware of the usefulness of MC the way Compuvision’s Tradesim does it and the efficient way it does so. It’s the MC on R-multiples that I need clarity about.
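A minimal sketch of the R-multiple resampling procedure described above, for anyone who wants to experiment with it. It is not Van Tharp's or Larry Sanders' actual software: the function names, the sample R-multiples and the choice of summary statistics are all assumptions made for illustration, and the key implicit assumption of the method (each trade is an independent draw from the same distribution) is built in.

```python
import random

def simulate_r_multiples(r_multiples, trades_per_run=100, runs=20000, seed=0):
    """Resample R-multiples with replacement; summarise each simulated run.

    Returns one (total_R, max_drawdown_R, longest_losing_streak) per run.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        equity = peak = max_dd = 0.0   # all measured in units of R
        streak = longest_streak = 0
        for _ in range(trades_per_run):
            r = rng.choice(r_multiples)        # same trade may be drawn again
            equity += r
            peak = max(peak, equity)
            max_dd = max(max_dd, peak - equity)
            streak = streak + 1 if r < 0 else 0
            longest_streak = max(longest_streak, streak)
        results.append((equity, max_dd, longest_streak))
    return results

def percentile(values, pct):
    """Simple nearest-rank style percentile, adequate for a sketch."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, int(pct / 100 * len(ordered)))]

# Made-up R-multiples standing in for a backtest's trade list.
sample = [-1.0, -1.0, -0.5, 2.0, 3.5, -1.0, 0.5, -1.0, 5.0, -1.0]
results = simulate_r_multiples(sample, trades_per_run=100, runs=2000, seed=42)

final_r = [r[0] for r in results]
drawdowns = [r[1] for r in results]
streaks = [r[2] for r in results]
print("median total R:", percentile(final_r, 50))
print("95th percentile max drawdown (R):", percentile(drawdowns, 95))
print("95th percentile losing streak:", percentile(streaks, 95))
print("fraction of losing runs:", sum(1 for x in final_r if x < 0) / len(final_r))
```

The drawdown and losing-streak percentiles are the figures that would feed into position sizing, since they indicate how much adversity should be planned for at a given risk per trade. Because trades are drawn independently, any serial dependence or clustering of losses in the real trade sequence is lost.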

Cheers

howardbandy · 8 February 2009

weird said:
While I respect the scientific or statistical use of out-of-sample data to confirm or validate in-sample results, I wonder whether this is as valid for people who test trading systems again and again and again on the 'same' static in-sample and out-of-sample database until they find something that agrees across both. Perhaps a smooth equity curve over both periods would suffice? New data, such as actually trading, even better?

Personally I think the logic of using out-of-sample testing to confirm in-sample results, on the same static database, again and again and again until something agrees across both, is flawed ... but would like to be proven wrong.

After one run on out-of-sample data, that data set then becomes in-sample. I hope the first run is a good one, unless you have enough other out-of-sample periods (and not the same ones) to test further on.

Hi Dave, and all --

I think we are agreeing. One of the points I make is that adjusting the trading system based on information obtained by an analysis of the out-of-sample results changes the previously out-of-sample data into in-sample data.

There are modeling and simulation techniques that use three sets of data. The first, in-sample data, is used to develop the model. Then the system is run using the second set of data -- call it the "tuning data set", for want of a common name. Based on the results from the tuning set, the model is changed. The procedure does go back and forth between those two sets of data, often in an automated way such that the values of the tunable parameters are saved from each iteration. The results will start at some level, rise to a peak, then drop. The procedure notes that the peak has been reached and remembers the parameter values from the peak point. Using those values, it makes one more test, this time on the third set of data, called the out-of-sample or validation data set. Based on one evaluation of the out-of-sample data, use the model or start over. The idea is the same -- do whatever you want to using in-sample data, but use out-of-sample data sparingly -- ideally only once.
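The three-data-set procedure can be sketched as below. This is a generic illustration, not any particular package's implementation: `tune_then_validate`, the `fit` and `score` callables and the `patience` rule for deciding that the peak has passed are all hypothetical, and the toy model at the end exists only to make the example runnable.

```python
def tune_then_validate(candidates, fit, score, in_sample, tuning, validation,
                       patience=2):
    """Fit on in-sample data, pick the peak on tuning data, validate once."""
    best_score = float("-inf")
    best_model = None
    since_best = 0
    for params in candidates:
        model = fit(params, in_sample)      # develop on in-sample data
        s = score(model, tuning)            # judge on the tuning set
        if s > best_score:
            best_score, best_model, since_best = s, model, 0
        else:
            since_best += 1
            if since_best >= patience:      # results have dropped: peak passed
                break
    # The validation (out-of-sample) set is touched exactly once.
    return best_model, best_score, score(best_model, validation)

# Toy stand-ins: the "model" is just the parameter, scored by closeness
# to a target value held in each data set.
fit = lambda params, data: params
score = lambda model, target: -(model - target) ** 2

model, tuning_score, validation_score = tune_then_validate(
    candidates=range(1, 10), fit=fit, score=score,
    in_sample=None, tuning=4, validation=5)
print(model, tuning_score, validation_score)
```

The point of the structure is the last line of the function: however many times the loop visits the in-sample and tuning data, the validation set yields a single number, after which the model is either used or discarded.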

My larger point is this: Tomorrow, when I plan to place trades based on my trading system, I want as much confidence as possible that the signals will be accurate. Tomorrow is out-of-sample. If I do make a trade tomorrow, I only get to make that trade one time. I cannot see what happens, adjust my system, scratch that trade, and re-trade tomorrow. The best -- the only -- way I have of estimating the performance on tomorrow's data is to follow a procedure that lets me see many transitions from in-sample development to out-of-sample performance. Each transition is one data point -- one sample of what might happen tomorrow. Since there is no way I can peek into tomorrow's data, I should not be peeking into the out-of-sample data during the development of the system. Said differently -- every time I peek at the out-of-sample results that I get during development of the system and then make a modification to the trading system, I am reducing the probability that the signals I will receive in real time will be accurate and profitable.

Thanks for listening,
Howard

howardbandy · 8 February 2009

Nick Radge said:
I agree Dave. It's much more productive to understand how to gain a positive expectancy regardless of the data being tested. In sample/out of sample is just an added 'confidence' booster. I think an innate appreciation for positive expectancy and probability theory will do a lot more for long term survival.

Hi Nick, and all --

I completely agree that the trading system must have a positive expectancy.

The point I am making is that the expectancy when measured over the in-sample results is certain to be good -- and I should ignore those -- they will always be good, and they have no value in estimating the future performance of the system.

It is only the expectancy when measured over truly out-of-sample results that is important.

Thanks,
Howard

howardbandy · 8 February 2009

Greetings all --

I know I am repeating myself here, but it is important to your wealth that proper, rigorous, valid modeling and simulation techniques are applied to the construction of trading systems.

If the trading system always has the same result when applied to the same data, then Monte Carlo techniques have value in these ways:
1. Add random noise to the data. This tests the sensitivity of the system to the precise data that was used.
2. Add random perturbations to those parameter values where it makes sense to perturb them. This tests the robustness of the system relative to its parameters. It helps identify best solutions that are at the top of peaks in profitability, when areas that are lower in profit but more stable, more like plateaus than peaks, would be safer.
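Both uses can be sketched in a few lines. The trading system here is a deliberately trivial moving-average crossover on a synthetic price series, chosen only so the example runs on its own; the function names, the noise model (a uniform percentage shift per bar) and the parameter ranges are assumptions for illustration, not a recommended system or a standard method.

```python
import random

def sma_system_profit(prices, fast, slow):
    """Toy long-only system: hold bar i if fast SMA > slow SMA at bar i-1."""
    profit = 0.0
    for i in range(slow, len(prices)):
        fast_avg = sum(prices[i - fast:i]) / fast
        slow_avg = sum(prices[i - slow:i]) / slow
        if fast_avg > slow_avg:
            profit += prices[i] - prices[i - 1]
    return profit

def noise_test(prices, fast, slow, noise_pct, runs, seed=0):
    """Use 1: rerun the system on copies of the data with random noise added."""
    rng = random.Random(seed)
    out = []
    for _ in range(runs):
        noisy = [p * (1 + rng.uniform(-noise_pct, noise_pct)) for p in prices]
        out.append(sma_system_profit(noisy, fast, slow))
    return out

def parameter_test(prices, fast, slow, max_shift, runs, seed=0):
    """Use 2: rerun the system with randomly perturbed parameter values."""
    rng = random.Random(seed)
    out = []
    for _ in range(runs):
        f = max(1, fast + rng.randint(-max_shift, max_shift))
        s = max(f + 1, slow + rng.randint(-max_shift, max_shift))
        out.append(sma_system_profit(prices, f, s))
    return out

# Synthetic random-walk prices standing in for real data.
gen = random.Random(1)
prices = [100.0]
for _ in range(500):
    prices.append(prices[-1] * (1 + gen.gauss(0.0005, 0.01)))

base = sma_system_profit(prices, 10, 30)
noisy = noise_test(prices, 10, 30, noise_pct=0.002, runs=200)
perturbed = parameter_test(prices, 10, 30, max_shift=3, runs=200)
print("base profit:", base)
print("noise test range:", min(noisy), max(noisy))
print("parameter test range:", min(perturbed), max(perturbed))
```

If the base result sits far above most of the perturbed results, the chosen parameters are probably on a peak rather than a plateau. A narrow spread around the base result suggests the system is less sensitive to the exact data and settings.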

I know how hard it is to keep from peeking at the out-of-sample results and adjusting the system. That is why I continue to recommend that the best way to avoid peeking is to automate the parameter selection and out-of-sample process through use of automated walk-forward testing.
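As a rough illustration of the mechanics of walk-forward testing, the sketch below only generates the rolling in-sample and out-of-sample index ranges. The optimiser that would choose parameters on each in-sample block is left out, and the function name and window lengths are hypothetical.

```python
def walk_forward_windows(n_bars, in_sample_len, out_sample_len):
    """Return [(in_sample_range, out_of_sample_range), ...] as (start, end)
    index pairs, end exclusive, stepping forward one out-of-sample block
    at a time."""
    windows = []
    start = 0
    while start + in_sample_len + out_sample_len <= n_bars:
        split = start + in_sample_len
        windows.append(((start, split), (split, split + out_sample_len)))
        start += out_sample_len
    return windows

for in_sample, out_sample in walk_forward_windows(1000, 500, 100):
    # In a real run: optimise on bars in_sample[0]..in_sample[1], then
    # record results on bars out_sample[0]..out_sample[1] without
    # adjusting anything.
    print("optimise on", in_sample, "-> test on", out_sample)
```

Each step is one transition from in-sample development to out-of-sample performance, and the out-of-sample blocks join end to end without overlapping, so their concatenated trades form the out-of-sample record.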

Automated walk-forward testing is far, far more valuable in developing robust and likely-to-be-profitable trading systems than Monte Carlo applied to in-sample results.

At the completion of the walk-forward testing, then use Monte Carlo analysis of the out-of-sample results to develop estimates of performance that might be expected when the system is traded.

Thanks for listening,
Howard

Wysiwyg · 8 February 2009

howardbandy said:
Greetings all --

I know I am repeating myself here, but it is important to your wealth that proper, rigorous, valid modeling and simulation techniques are applied to the construction of trading systems.

I have been following this thread, and thank you for providing clear, concise explanations of the correct application. It really is a waste of time complicating these things, but regardless of this we have returned to the correct way.

tech/a · 8 February 2009

aarbee said:
Let’s look at the following two scenarios:

AA My system has filters for turnover, volatility, trend strength, etc. that often give more signals on daily scans than the available capital would allow me to take.

BB This other system has filters for turnover, volatility, trend strength, and a couple of others. It triggers far fewer trades each day than AA, and very rarely are there more signals than available capital to take them.

In AA the MC as excellently done in TradeSim is useful for all the reasons that other posters have so clearly outlined.
In BB the MC would be quite redundant because there would only be a single path.

Just because there is a single path in BB, does the backtesting result become any less reliable than AA? The filters in BB are essentially the same as in AA, only more stringent.
If your answer is “Yes”, I would like to hear the reason. If the answer is “No”, then what’s the problem with ranking? It’s just another filter.

Firstly, I'd be trying other bourses to get more results. I personally wouldn't feel comfortable with a single-run type of method. My opinion on the second part of your question falls within my answer to the next of your questions.

As for ranking changing every week, I am not sure I understand. The optimal value for any filter in the system would also change from week to week or month to month. That per se doesn't invalidate the filter's use in screening stocks.

Cheers,

My opinions are based more on my logic than on mathematical argument.

A 20,000-run Monte Carlo gives me results of a "what if" on 20,000 portfolios.
How do I choose the best "what if" going forward? If your answer is optimisation then this is quite different to Monte Carlo. I have a suspicion this is where the arguments are crossing.

If we are ranking systems then yes.

As for ranking altering:
Longer-term methods don't often have large alterations in the data set, i.e. market conditions are more prolonged.
Shorter-term methods have a chance of seeing a wider variety of swings within a data set. 10,000 1-min bars could show a great diversity of bull, bear and flat conditions,
whereas 10,000 daily bars are not likely to produce this.
This is, in my opinion, why most people, including myself, find it difficult to find excellent short-term trading methods if only trading bullish!

So ranking is likely to alter with market diversity.
The more diverse the data tested, the less likely the swings in ranking.

Notwithstanding Howard's explanations above, which I agree with, both on Monte Carlo results and smoothness from peaks and troughs, and on walk-forward analysis.

TradeSim · 8 February 2009

howardbandy said:
Greetings all --

Automated walk-forward testing is far, far more valuable in developing robust and likely-to-be-profitable trading systems than Monte Carlo applied to in-sample results.

But what happens if I don't rank trades and there is variance in the trading system results due to multiple entry triggers? How do you deal with that situation, and how would you optimize the system then?

Regards
David

pilbara · 9 February 2009

TradeSim said:
But what happens if I don't rank trades and there is variance in the trading system results due to multiple entry triggers? How do you deal with that situation, and how would you optimize the system then?

If the triggers are simultaneous, maybe make a basket of them all, each an equal fraction, with the total size equal to a single stake. This would mean higher brokerage costs, though.
