Australian (ASX) Stock Market Forum

Systems testing

I'll attempt to clarify my reasoning on survivorship bias. First, let's define what it is and what it can mean. Survivorship bias (or selection bias) arises when something is included in, or removed from, a testing environment using hindsight. As an example, a value investor may deem that they would never have held HIH or SGW in their hypothetical portfolios, or tech stocks during the tech crash. The opposite would be including a stock such as Davnet which, from memory, ran from 2c to over $5 or so. If these singular events create a large skew on the test results, then there is concern that those results cannot be replicated in the future. In other words, the fact that they were included is nothing more than luck.

A recent post on this forum highlighted the exact case where a poster had picked a single stock which had provided him/her with vast profits. His/her conclusion was that all analysis was garbage and that selecting the next great stock was all that was needed. Maybe so, but basing that theory on a limited "lucky" outcome is representative of survivorship bias, even more so during this massive resources boom.

The point being put forward in this post is that selecting a current universe of symbols and testing them backward would lead to survivorship bias, because had we actually started back at that point of time the selection would have been different. The example here was based on the ASX-200 of today vs. the ASX-200 of some point in history.

My argument is that if the system being used is robust, then the universe being used does not matter a great deal. After all, a trend is a trend is a trend. A robust trend-following system is designed to pick up and ride trends until they reverse; which symbol is actually being ridden is secondary. I stress this is not a discussion of market phases, in which certain phases create great results for trend following whilst others show poor results. My argument is: so what if OneTel or Davnet were included in a trend-following system test in the '90s? Today contains trends that are just as spectacular, only in different stocks. The key element is that the system can pick up trends when and where they come along. However, as stated earlier, so long as a single stock does not make up the vast majority of the P&L, there is no real risk from survivorship bias.
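To make the universe point concrete, here is a minimal sketch (with entirely hypothetical symbols and index-membership dates) of why testing today's index list backward differs from testing the list as it stood at the time:

```python
# Hypothetical index membership: symbol -> (year joined, year left).
# 9999 means "still a member today". All data here is made up.
membership = {
    "AAA": (1998, 2010),  # delisted in 2010, so absent from today's list
    "BBB": (1998, 9999),  # a long-term survivor
    "CCC": (2012, 9999),  # joined well after the test window started
}

def universe_today():
    """The universe a naive backtest uses: current members only."""
    return [sym for sym, (joined, left) in membership.items() if left == 9999]

def universe_at(year):
    """The universe that was actually tradable in a given year."""
    return [sym for sym, (joined, left) in membership.items()
            if joined <= year < left]

print(universe_today())   # survivors only
print(universe_at(2005))  # includes the stock that later delisted
```

A backtest over 2005 built from `universe_today()` silently drops AAA and includes CCC, which was not even in the index then; that gap is the survivorship bias being discussed.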

Let's use a real-time example, but to show the survivorship bias I will work backward. Below are the real-time results of one of my trend-following systems.

[Image attachment: 1growthdn9.png]


Before we move on, I will ask whether you feel these results are acceptable. There are no severe outliers skewing the results. If so, read on.

So where is the survivorship bias?

Well, there are two instances occurring. The universe of stocks selected for this system must have a minimum Broker Consensus rating of BUY. As we know, Broker Consensus is a fluid measure (in other words, brokers change their minds), so testing with one universe at one period of time will be quite different to trading that same universe at another period. One example here is ORG, which was entered on Dec 4th. The issue is that brokers now do not have a BUY ranking on the stock. In essence it should not be included in these results, because if we were to run the test again now, we would not select ORG to be involved.

However, there are plenty of others. Note the following three charts: none of them appear in the current results even though the testing suggests they should.

IIN
[Image attachment: 1iinuj3.png]



RIO
[Image attachment: 1riobw6.png]


MBL
[Image attachment: 1mbltd4.png]


These three charts all show clear BUY signals that are all nicely profitable. At the time of entry for these three signals, Broker Consensus did not meet the required criteria, so they missed being in the universe in actual trading. If we were to do the testing now, we would be selecting these stocks.

In summary, we have a moving universe of stocks, both for the testing phase and for the real-time trading phase. BUT I did pose the question right at the start about the real-time results and whether or not they were acceptable. If the answer was yes, then we've managed to achieve this even though there is clear survivorship bias occurring. In turn this means the system is able to continue to operate and generate profits regardless of what is actually contained within that universe.

If I now take this one step further back, I can show you the actual testing before these trades were taken.

[Image attachment: 1testbv4.png]


We can see a reasonable outperformance in real time versus testing. But the real issue is this: "How did you test the Broker Consensus criteria?" You can't. I didn't. I ran the test raw, THEN added that overlay. It's a common-sense assumption. We work with the worst possible raw data, then make common-sense assumptions in real time. The end result is more than likely going to be better than the test, and this adds another element of robustness.
 
Compounding and re investment of profits.
Increasing trade size rather than number of trades.
Amplified again by Leverage.

Yes, but only point three explains the 1300%
 
Yes, but only point three explains the 1300%

On its own no.
While it has a large impact, some trades were double the initial parcel size traded.
Doubling AND Leverage are very powerful in the end result.

Note the time held and return achieved V length of trend ridden.
ZFX V ASX
 

Attachment: Results.gif

Well no, that's true. But while compounding and increasing trade size are jolly essential, they are a given (IMHO).

But leverage is the turbocharger here.
 
We can see a reasonable outperformance in real time versus testing. But the real issue is this: "How did you test the Broker Consensus criteria?" You can't. I didn't. I ran the test raw, THEN added that overlay. It's a common-sense assumption. We work with the worst possible raw data, then make common-sense assumptions in real time. The end result is more than likely going to be better than the test, and this adds another element of robustness.

The same seems to be true for the discretionary entry on TechTrader, which states that the share must be clearly breaking out or reversing trend (or words to that effect)... practically impossible to apply in testing.

Additionally, there is the $10 price limit and the money-flow filter, both of which might ultimately be affected by inflation (unless, of course, $10 has been some psychological level for decades). So unless you adjust such filters going back in time, you will pick the RIOs and the BHPs back when they ticked those boxes and make yourself a killing. But maybe that is okay? Maybe we would have traded those if the system was running back then. I'm not convinced, and I'm not sure it's wise to over-rate hindsight.

I do agree that such a system is robust enough to pick such trends going forward though. There is perhaps then less value in the actual statistical results of backtesting shares, and more value in going back and looking at the charts, as you have done with IIN, RIO and MBL, to see if your system is catching what you intend it to.

I would never have thought about the broker consensus rating as a filter or a liquidity measure. Does it mean that if enough brokers rate it a buy, then either they or their clients are making a market for it? Clearly not everyone gets common-sense assumptions :)
 
Nick,

Thanks for your explanation.

Nick Radge said:
His/her conclusion was that all analysis was garbage and that selecting the next great stock was all that was needed.
Not sure if you're referring to my recent post here, but I certainly didn't say or conclude that.

My conclusion was essentially that one needs to be wary of backtesting over conditions that rarely occur in the universe being tested.

My argument is that if the system being used was robust, then the universe being used would not matter a great deal.
That is perhaps the significant point. With a robust system it shouldn't matter, but on the road to developing a robust system you need to be sure that your backtesting is providing reasonably realistic results, otherwise you might end up thinking you have a profitable and robust system when in fact you haven't.

As an example, during testing of a system under development I was using the ASX300 as the universe and getting some pretty good results, even with Monte Carlo testing. However, on closer inspection of the results I noticed that the main winning trades were ones purchased at very low share prices, which would mostly have been before those stocks were in the ASX300. In an attempt to more closely simulate the current ASX300, I then set a minimum share price of $2. Running the same backtests again showed a drop in return to almost nothing. That's the sort of thing I'm suggesting one needs to be wary of.
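GP's check can be sketched like this (trades and numbers are entirely hypothetical; the point is only that a few cheap big winners can dominate the average):

```python
# Hypothetical trade list: (symbol, entry price, fractional return).
trades = [
    ("LOW1", 0.12, 4.50),   # large winner bought at a very low price
    ("LOW2", 0.30, 2.10),   # another cheap big winner
    ("MID1", 2.40, 0.08),
    ("MID2", 3.10, -0.05),
    ("MID3", 5.60, 0.04),
]

def avg_return(trades, min_price=0.0):
    """Average trade return, keeping only entries at or above min_price."""
    kept = [ret for _, price, ret in trades if price >= min_price]
    return sum(kept) / len(kept) if kept else 0.0

print(round(avg_return(trades), 3))        # all trades
print(round(avg_return(trades, 2.00), 3))  # with a $2 minimum, as described
```

With the two sub-$1 winners excluded, the average collapses toward zero, which is exactly the "almost nothing" effect described.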

Cheers,
GP
 
Not sure if you're referring to my recent post here, but I certainly didn't say or conclude that.

I'm pretty sure Nick was referring to STC's thread. i.e. Just get out your crystal ball and buy the next PDN (or whatever share it was)
 
In an attempt to more closely simulate the current ASX300, I then set a minimum share price of $2. Running the same backtests again showed a drop in return to almost nothing. That's the sort of thing I'm suggesting one needs to be wary of.

Definitely!

I've found that during this bull market, some of the shares that caused my portfolio (actual, not test) to outperform were the likes of MCR and OXR when they were trading below $1. A lot of my other winners/losers have been around 1R either way and hence tended to cancel one another out or just track the index. Although this probably reflects my ability to trade more than anything concrete.

Having said all that, I'm surprised that removing trades on shares below $2 caused the system to return "almost nothing".
 
Applying a $2 filter on my current ASX300 list, I would be removing 75 candidates from testing =(

TT applies a less-than-$10 filter. I have even read Nick once mentioning the idea of possibly using a less-than-$5 filter to catch the early ones ...

GP, just now testing two mechanical systems I trade with a greater-than-$2 filter, the performance did drop incredibly ... both systems dropped to around 13% p.a. over a six-year period.

However, I don't use a greater-than-$2 filter in system design or trading. I have in the past used a greater-than-$0.40 filter in backtesting, but that number was chosen for no particularly good reason.

I do remove large outliers from backtesting; however, the occasional 300%+ return from some stocks is not that unusual in long-term trend-trading systems.
 
In an attempt to more closely simulate the current ASX300, I then set a minimum share price of $2.

Try the whole database with a variable turnover filter (highest average turnover = generally the "top stocks"). Makes more sense than using the ASX300 :headshake

Edit: and purchase data containing delisted stocks.
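As a sketch of that suggestion (symbols and turnover figures are made up), the universe can be built by ranking the whole database on average dollar turnover instead of current index membership. A real implementation would recompute the ranking at each point in time from trailing data only, to avoid lookahead:

```python
# Hypothetical daily dollar-turnover history per symbol.
turnover = {
    "BHP": [9.0e6, 8.0e6, 9.5e6],   # large, liquid stock
    "TIN": [4.0e4, 3.0e4, 5.0e4],   # tiny, illiquid stock
    "MID": [2.0e6, 1.8e6, 2.2e6],   # somewhere in between
}

def top_by_turnover(turnover, k):
    """Return the k symbols with the highest average turnover."""
    avg = {sym: sum(vals) / len(vals) for sym, vals in turnover.items()}
    return sorted(avg, key=avg.get, reverse=True)[:k]

print(top_by_turnover(turnover, 2))
```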
 
theasxgorilla said:
I'm surprised that removing trades on shares below $2 caused the system to return "almost nothing"
When I say "system", I mean an idea for a system that I'd recently coded and had started doing some testing on. It was by no means a well-tested and robust system - although of course I had hopes that it might get to be :D.

And there was nothing special about the $2 figure. It was just a number I quickly picked out of the air to see what difference it made to the results after noticing many of the large-gain trades having much lower purchase prices - lower than most stocks currently in the ASX300 (and yes, I know there are some stocks in the ASX300 with lower prices than that).

Cheers,
GP
 
GP,
That comment was certainly not aimed at you. I think Wayne has pointed out the correct individual.

Some other things to consider when testing a system and its viability in the real world.

Firstly, there is no real point building a simplistic system that, when operational, requires a discretionary override to distinguish signals. I have numerous people come to me because they've done certain courses, which shall all remain nameless, that teach trend following in various formats. The problem is that when they run the system rules they get 10 signals each and every day. From that juncture one must pick and choose which to take, and as we all know, we'll always pick the wrong ones.

The above issue comes from "simplistic" systems. A simplistic system will have several inherent errors when it comes to real-world trading. The one I just mentioned is the most common, especially in this kind of environment. The other is more serious when it comes to the psychology of the system. A simplistic system will tend to generate many false signals during a bearish market phase, which in turn will lead to a lot of frustration and a dwindling account balance. I see this as a real risk for those embarking on this journey in the current environment. Any old simple trend-following system will do okay in this environment. I know of a very well-known broker touting such a system to the public that uses a 22/35-week crossover method. Yes, it will catch killer long-term trends, but it will get mauled in any longer-term bearish environment and will also be prone to false signals in a sideways period. Another simplistic system would be some kind of price breakout, like a channel breakout of an n-day high.
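For reference, the n-day channel breakout mentioned above can be sketched in a few lines (a Donchian-style signal on closing prices; the data here is made up):

```python
def breakout_signals(closes, n):
    """True on any bar whose close exceeds the highest close of the prior n bars."""
    signals = []
    for i, close in enumerate(closes):
        if i < n:
            signals.append(False)  # not enough history yet
        else:
            signals.append(close > max(closes[i - n:i]))
    return signals

# A made-up price series: quiet, then a breakout on the last bar.
print(breakout_signals([1.0, 2.0, 3.0, 2.0, 5.0], n=3))
```

As the surrounding discussion notes, a raw signal like this fires constantly in a strong bull market, which is precisely the trade-frequency problem of simplistic systems.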

HERE is an interesting paper on trend following in stocks. I think this is a simple way to overcome the n-day breakout suggested above.

Another "common sense" assumption is finding those golden trends, albeit with added volatility and risk. Essentially, a business will grow and with it so will the share price. One therefore needs to find the "emerging" businesses, that is, those that may end up being the next TOL or CSL. Almost every one of these will be a relatively new listing and therefore trading below $10.00. I recommend to my subscribers looking to get a bit more "bang for their buck" to only take signals on stocks trading below $10, without any lower limit. If you remove the signals above $10.00 you get a reasonable increase in the risk-adjusted return. If you only take signals below $5 you get a substantial increase in risk-adjusted return.

This process will also reduce the universe and, with simplistic systems, reduce the potential trade frequency and therefore the discretionary input levels.

What is an easy way to find these? Focus on stocks that do not pay dividends, which is, funnily enough, another portfolio I run.
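The price-cap effect described above can be sketched as follows. All trades are hypothetical, and mean return divided by return standard deviation is only a stand-in metric, not necessarily the risk-adjusted measure actually used:

```python
import statistics

# Hypothetical trades: (entry price, fractional return). Cheaper entries
# are given the bigger returns here, mirroring the "emerging business" idea.
trades = [
    (3.20, 0.60), (4.10, 0.45), (7.50, 0.30),
    (9.20, 0.10), (15.00, 0.05), (22.00, 0.04),
]

def risk_adjusted(trades, price_cap=None):
    """Mean trade return over its (population) standard deviation,
    keeping only entries below price_cap if one is given."""
    rets = [r for p, r in trades if price_cap is None or p < price_cap]
    return statistics.mean(rets) / statistics.pstdev(rets)

print(round(risk_adjusted(trades), 2))        # no cap
print(round(risk_adjusted(trades, 10.0), 2))  # only signals under $10
print(round(risk_adjusted(trades, 5.0), 2))   # only signals under $5
```

On this toy data the ratio improves as the cap tightens, matching the described pattern; real results obviously depend on the actual trade distribution.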
 
Edit: and purchase data containing delisted stocks.

This idea has merit. Can anyone suggest who provides such data for the ASX?

I just read the following quote from the paper that you linked to Nick:

"Survivorship bias:
The database used for this project included historical data for all stocks that were delisted at some point
between 1983 and 2004. Slightly more than half of the database is comprised of delisted stocks."

Whilst I gather from your comments that survivorship bias has a tendency to be overrated as a factor and a good, robust system will catch trends and backtest okay regardless, for my peace of mind it would be useful to have such data. Particularly if our own ASX historical data could claim to have around half of its quantity in delisted shares! (as their study did).

ASX.G
 
I think it is available.

But here is my contention: stocks that delist don't just evaporate overnight. They usually trend continually down for an extended period of time. Granted, a super-long-term weekly/monthly system may not capture that trend, but most daily and weekly systems would, and therefore even though a stock is now delisted, it would still have provided the opportunity prior. ONE is a classic example. It was a great trending and highly profitable stock at the time. Yes, it no longer exists and could be considered a candidate for survivorship bias. However, it should not have been detrimental to a portfolio's performance at the time.
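That contention can be illustrated with a toy trailing-stop exit (prices hypothetical): on a stock sliding toward delisting, the stop fires long before the price goes to zero, so the eventual delisting never touches the portfolio.

```python
def trailing_stop_exit(closes, trail=0.20):
    """Index of the first close more than `trail` below the running high,
    or None if the stop is never hit."""
    high = closes[0]
    for i, close in enumerate(closes):
        high = max(high, close)
        if close < high * (1 - trail):
            return i
    return None

# A made-up slow slide to oblivion, HIH/ONE style.
decline = [5.00, 5.50, 5.20, 4.60, 4.20, 3.00, 1.50, 0.40, 0.05]
print(trailing_stop_exit(decline))  # exits early in the decline
```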

Let's use an assumption in today's terms. Evans and Tate would be an example of a stock that is slowly going to oblivion, no different from HIH, ONE or SGW. It's clearly in no position to trigger a buy for a system such as the one outlined in that report.

I think perhaps the difference in thinking is that whilst I may trade a trend in ABC Company, I do not look at it being a candidate ever again. There will always be another to take its specific place. Whereas survivorship bias relates specifically to ABC Company being available again.
 
They usually trend continually down for an extended period of time.

Then there is the flipside... the likes of, say, GCN. I think that was the code... I don't have the chart as my provider pulls feeds from delisted companies :( ... a small gold miner that formed a consolidation pattern like OXR or MCR, then broke out... went from .35 to .50 in some weeks and was then taken over (for all intents and purposes, delisted). In all likelihood a TechTrader-like system would have traded it (assuming it met the BT list, money-flow and breakout criteria, which I can't be sure of), but now, in backtesting with my data at least, that trade would not be there to be taken.

It seems, to a degree, to be swings and roundabouts, as they say. My curiosity just wonders to what degree. It wouldn't stop me trading a good system, but if you wanted to produce a white paper on the ASX like the guys at the link you showed, it would be handy to have ALL the data.
 
GGN! Gallery Gold...from the chart it looks like TechTrader would certainly have traded it.
 

Attachment: 20070707 - GGN - Gallery Gold.png
theasxgorilla said:
and was then taken over (for all intents and purposes, delisted)
If it was taken over though, then presumably it wouldn't have been detrimental to your portfolio at the time since I assume you would have got some equivalent value in the parent stock.

GP
 
If it was taken over though, then presumably it wouldn't have been detrimental to your portfolio at the time since I assume you would have got some equivalent value in the parent stock.

:) So the next logical question is: would NOT testing with the full history of data, including delisted shares, understate backtesting results? Since, as Nick pointed out, the system would not trade the downtrending dogs, but in times of high M&A activity like now, and perhaps the late '90s, our testing may not include the likes of GGN, which broke out prior to being taken over.
 