Australian (ASX) Stock Market Forum

Hello Howard. Substance? Evidence in the form of results or more to the point, evidence in the form of trading algorithms or methods.

The techniques for developing trading systems using machine learning, and the techniques for managing trading using sequential Bayesian learning, including fully disclosed ready to run code that can be used as a template for additional development, have been published in the "Quantitative Technical Analysis" book.

Best, Howard
 
Howard,

You mentioned "random forest" earlier. Presumably each decision tree will become a classifier for a particular pattern.

1 - Is a decision tree the best choice for classifying data patterns?
2 - How do you determine whether a tree is worth putting in the forest?
3 - Do you look at purely raw data, or do you filter it, or use indicators?
4 - What variables would you use for the nodes? Standard indicators?

Over the last 3-4 years I have built an adaptive trading model. In one of its stages, the classification stage, I use clusters of NNs (neural networks), which act more like agents or bots, to determine a chart pattern. I might try plugging in a random forest (decision trees) instead. I think my conclusions will be much the same, but who knows?
 
Hi David --

There is a "theorem" of machine learning that is called "no free lunch." It states, loosely, that it is not possible to know which model is best for a given problem in advance of running the fit and test processes.

Random forest is one of several "ensemble" techniques that combine many (often thousands) of relatively simple but, importantly, dissimilar models, each of which is a "weak learner," into a more powerful learner. Ensemble techniques do well in many applications.

Each individual tree is predicting the same target. (More below) Individually, they may be identifying different patterns, but then together they are predicting the same target.

The logic used to decide how individual trees will be included depends on the ensemble algorithm used -- the individual trees may be equally weighted or the weight may depend on the accuracy.
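As an illustration only, the two weighting schemes just described (equal weights, or weights tied to accuracy) can be sketched in a few lines of Python. This is not code from the book; the function name ensemble_vote, the labels, and the sample weights are all made up for the example, and real ensemble libraries handle this internally.

```python
from collections import defaultdict

def ensemble_vote(predictions, weights=None):
    """Combine class labels from several weak learners into one prediction.

    predictions: one label per tree, e.g. "long" or "flat".
    weights: optional non-negative weight per tree (hypothetical accuracies here).
             None means every tree counts equally.
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("need exactly one weight per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # The label with the largest total weight wins; ties go to the label seen first.
    return max(totals, key=totals.get)

tree_votes = ["long", "flat", "flat"]

# Equal weighting: two "flat" votes beat one "long" vote.
equal = ensemble_vote(tree_votes)

# Accuracy weighting (made-up weights): one strong tree outvotes two weak ones.
weighted = ensemble_vote(tree_votes, weights=[0.70, 0.30, 0.30])
```

With these made-up numbers the equal-weight vote comes out "flat" and the weighted vote comes out "long", which is the whole difference between the two schemes.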

The modeler (that is the person building the model) chooses the target to be modeled. A common target is the relation between today's close and tomorrow's close. There are two broad model categories -- classification and regression.

For a classification problem, the target is a category, and the categories are often binary -- for example "long" or "flat." For a regression problem, the target is a more-or-less continuous numeric value -- for example the percentage change from today's close to tomorrow's close. In the vocabulary of traditional analysis, this is the dependent variable.
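A minimal sketch of the two kinds of target, assuming a plain list of daily closes. The function names, the zero threshold, and the sample prices are illustrative choices, not anything prescribed in this thread.

```python
def next_day_pct_change(closes):
    """Regression target: percent change from each close to the next close.

    The final bar has no 'tomorrow', so the result is one element shorter.
    """
    return [100.0 * (nxt - cur) / cur for cur, nxt in zip(closes, closes[1:])]

def long_or_flat(pct_changes, threshold=0.0):
    """Classification target: "long" if tomorrow's gain exceeds the threshold."""
    return ["long" if change > threshold else "flat" for change in pct_changes]

closes = [100.0, 102.0, 101.0, 101.0]  # made-up prices

regression_target = next_day_pct_change(closes)
classification_target = long_or_flat(regression_target)
```

The same price series yields both targets; the classification target is just the regression target pushed through a threshold.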

The predictor variables (independent variables) can be anything. To be useful, they should change in some identifiable way at about the same frequency as the target changes. Technical indicators with short lookback periods work -- RSI(2), for example. Pick one -- multiple fast oscillating technical indicators are redundant and usually reduce the accuracy. Raw price is almost never used. Formulation of predictor variables is one of the major tasks of the modeler.
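For readers who have not computed a short-lookback indicator by hand, here is a rough sketch of RSI(2). It assumes the simple-average variant of RSI; many platforms use Wilder's smoothing instead, so values will not match every charting package. The neutral value of 50 for a completely flat window is also just a convention chosen for this example.

```python
def rsi(closes, lookback=2):
    """RSI from simple averages of gains and losses over `lookback` bars.

    Returns a list aligned with `closes`; bars without enough history are None.
    """
    values = [None] * len(closes)
    for i in range(lookback, len(closes)):
        gains = 0.0
        losses = 0.0
        for j in range(i - lookback + 1, i + 1):
            change = closes[j] - closes[j - 1]
            if change > 0:
                gains += change
            else:
                losses -= change  # store losses as positive numbers
        if gains + losses == 0:
            values[i] = 50.0  # no movement in the window: treat as neutral
        else:
            # Same as 100 - 100 / (1 + gains / losses), without dividing by zero.
            values[i] = 100.0 * gains / (gains + losses)
    return values

closes = [10.0, 11.0, 12.0, 11.0, 10.0, 10.5]  # made-up prices
rsi2 = rsi(closes, lookback=2)
```

With a two-bar lookback the indicator swings from one extreme to the other within a few bars, which is the point made above about changing at roughly the same frequency as the target.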

Nodes? What are you thinking of when you say node?

Ensemble models are made up of "simpler" models. Pruned decision trees are often used because they are fast to create and train, and they have high variance. Conceivably, neural networks could be used. Calculation time could be a problem. Try it to see. Think through when, relative to placing the trading order, the models will be trained and how they will be recalled for prediction.

Best regards, Howard
 
Hi Howard,

What would you recommend as a starting guide, actions, or resources for someone beginning their Quant journey?

cheers
 
Howard,

When I mentioned node, I meant the branch on a tree, or decision point. For example, if a formula such as c<ma(c,5) is true then go left, otherwise go right. Building such a tree would be relatively simple. Is that what you should use, or is there something better?

I could actually see this working a lot better on fundamental data rather than price action.

Thanks,
 
Hi David --

You are describing creation of the rules that define the decision tree. There are several criteria for deciding how to divide the data points -- information gain (which is computed from entropy) and Gini impurity are the two most common. This webpage has a reasonable explanation of entropy and information gain:
http://www.saedsayad.com/decision_tree.htm
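As a small worked sketch of entropy and information gain (my own illustration, not taken from the linked page): the labels are the targets for a handful of rows, and goes_left records whether a candidate rule such as c<ma(c,5) was true for each row. The function and variable names are made up.

```python
from math import log2

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return sum(-(c / n) * log2(c / n) for c in counts.values())

def information_gain(labels, goes_left):
    """Reduction in entropy from splitting rows by a true/false rule."""
    left = [lab for lab, flag in zip(labels, goes_left) if flag]
    right = [lab for lab, flag in zip(labels, goes_left) if not flag]
    n = len(labels)
    after = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - after

labels = ["long", "long", "flat", "flat"]

# A rule that separates the classes perfectly recovers all of the entropy.
perfect_gain = information_gain(labels, [True, True, False, False])

# A rule that leaves both branches as mixed as the parent gains nothing.
useless_gain = information_gain(labels, [True, False, True, False])
```

A tree-building algorithm of this kind tries many candidate rules at each node and keeps the one with the highest gain.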

When developing trading systems using a traditional development platform, the modeler (that is the person guiding the development) creates a set of rules and their associated parameters. The example you use -- c<ma(c,5) -- is a good one. After a sequence of these rules, the system arrives at a terminal node and issues Buy or Sell signals. The metric used is neither entropy nor information gain. Rather, it is whatever objective function was provided by the platform or designed by the developer, and is usually based on some variation of profit and risk.

Traditional development is an example of "compute an indicator, then see what happens after."

When developing using machine learning, the modeler provides a target, such as gain for the next evaluation period (regression) or a beLong / beFlat signal (classification), and some predictor variables that represent whatever indicators the developer thinks might contain information. Think of each data point being a row in a spreadsheet for a single day with columns for each of the predictor variables and one column for the target. The machine learning process decides how to divide the set of data so that the rows are divided or sorted accurately according to the values of the target. The machine learning library function builds the rules from the data, remembering the relationships in a matrix of coefficients for later use.
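The "row in a spreadsheet" picture can be sketched as follows. This is only an illustration: the two predictor columns (both derived from the c<ma(c,5) example earlier in the thread), the column names, and the sample prices are assumptions made for the example, and a real project would more likely hold the table in a data-frame library.

```python
def build_rows(closes, period=5):
    """One row per day: predictor columns known at that day's close,
    plus a target column describing what happened the next day."""
    rows = []
    # Start once the moving average exists; stop one bar early so a target exists.
    for i in range(period - 1, len(closes) - 1):
        average = sum(closes[i - period + 1 : i + 1]) / period
        rows.append({
            "close_over_ma": closes[i] / average,   # predictor: ratio form
            "below_ma": closes[i] < average,        # predictor: c < ma(c, 5)
            "target": "beLong" if closes[i + 1] > closes[i] else "beFlat",
        })
    return rows

closes = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 14.0]  # made-up prices
rows = build_rows(closes)
```

Each row uses only information available at that day's close, and the target looks one day ahead. A learning algorithm would then be handed the predictor columns and the target column and left to find the dividing rules itself.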

Machine learning is an example of "identify some desirable outcome, then see what preceded it."

In both cases, as with all modeling, in-sample results are (almost) always good. The value of the system for trading can only be determined by testing data not used during development -- out-of-sample data. If the model is overfit to the data, or if the data is not stationary beyond the in-sample period, OOS results will be poor and trading would not be profitable.
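One common way to keep development data and test data apart is a walk-forward split: fit on a block of days, test on the days immediately after it, then slide forward. The sketch below assumes that scheme; the function name and the block lengths are illustrative, and nothing here says this is the only valid way to hold out data.

```python
def walk_forward_splits(n_rows, in_sample, out_of_sample):
    """Yield (in_sample_indices, out_of_sample_indices) pairs in time order.

    Each out-of-sample block starts right after its in-sample block,
    so the test rows are never seen while fitting.
    """
    start = 0
    while start + in_sample + out_of_sample <= n_rows:
        split = start + in_sample
        yield range(start, split), range(split, split + out_of_sample)
        start += out_of_sample  # slide forward by one out-of-sample block

# Ten rows, fit on four, test on the next two, then step forward.
splits = list(walk_forward_splits(10, in_sample=4, out_of_sample=2))
```

Only the results on the out-of-sample blocks say anything about how the system might trade.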

Decision trees are among the most basic of the models machine learning can develop. They are fast and easy to build, easily stored for later retrieval and prediction, and readily interpreted. They are the model type used by traditional trading system development platforms.

Decision trees are one of many types of models that can be used to develop trading systems using machine learning. To name a few, others include: linear regression, logistic regression, discriminant analysis, neural network, nearest neighbor, support vector, Bayesian, random forest, ... They vary in many aspects -- accuracy, ease of training, robustness, speed of prediction, interpretability, ...

------------------

Related to your comment about using fundamental data --

The data used to predict the target must itself change state at about the same frequency as the target changes state. Predictor variables that have the same state for extended periods of time are not helpful. An example that is commonly used is a long-lookback trend filter such as a 200 day moving average: c>MA(c,200). If this condition is imposed in a traditional system, trades are blocked for long periods of time. That might be what the developer intended.

Including a predictor variable that is related to that same filter -- perhaps a binary variable that indicates whether the close is above or below the moving average -- or the ratio of the closing price to the moving average value -- as an input to a machine learning function will result in that variable being ignored.
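To see the frequency mismatch in numbers, one can count how often each series changes state. The sketch below uses made-up data (a steady uptrend with a daily zigzag) and a 20-day average as a short stand-in for the 200-day filter, so that the example stays small; the function names are my own.

```python
def count_state_changes(states):
    """Number of bars on which the state differs from the previous bar."""
    return sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)

def above_moving_average(closes, period):
    """True where the close is above its simple moving average: c > MA(c, period)."""
    return [
        closes[i] > sum(closes[i - period + 1 : i + 1]) / period
        for i in range(period - 1, len(closes))
    ]

# Synthetic closes: rises 0.5 per day on average, with an up/down zigzag on top.
closes = [100.0 + 0.5 * i + (1.0 if i % 2 == 0 else -1.0) for i in range(60)]
period = 20  # short stand-in for the 200-day lookback

# Drop the last bar of the filter: it has no next-day target to pair with.
trend_filter = above_moving_average(closes, period)[:-1]
target = [closes[i + 1] > closes[i] for i in range(period - 1, len(closes) - 1)]

filter_changes = count_state_changes(trend_filter)
target_changes = count_state_changes(target)
```

On this data the target flips every day while the trend filter never changes, so the filter carries no information about which days are up days.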

With regard to use of fundamental data, a paper I published several years ago might be helpful. It can be read and/or downloaded from this page:
http://www.blueowlpress.com/wp-content/uploads/2016/10/FT-Fundamental-Analysis-Appendix-A.pdf

Best regards, Howard
 
Hi Omega --

For free, begin by watching these videos:
http://www.blueowlpress.com/video-presentations

The very first book to read is Daniel Kahneman's "Thinking, Fast and Slow."
https://www.amazon.com/Thinking-Fast-Slow-Daniel-Kahneman/dp/0374533555/

Of my books, "Quantitative Technical Analysis."
http://www.blueowlpress.com/123-2/quantitative-technical-analysis

Best regards, Howard
 
Thank you very much, and for the fast reply.

Time to read, will get back to annoy you later

ahaha :)
 
Morgan will not build those departments and pay those costs unless the trading results are adequate to fund them.

Hi Howard,

So what returns are these companies making, on average, vs the market index? Because I have often read that the vast majority of fund managers fail to beat the market index. Is that correct, or am I way off the mark there?
 
I've read the top ones are consistently in the 20-30%pa range. Pretty good for their size I guess.

http://www.barrons.com/articles/best-100-hedge-funds-1466223924?tesla=y

It's only a 3 year annual return shown, so not really representative of their typical returns, but even so it shows that these very top 100 hedge funds returned an average of 16.98% vs 15.13% for the S&P500. I'd say that if you looked at their 10 year performance the difference would be far less. I just came across an article stating "The 20-Year Performance Of Hedge Funds And The S&P 500 Are Almost Identical". So if that is right, you would have to wonder why these companies would spend vast amounts of money employing all these quants, when a strategy of just buying and holding the index would produce more or less the same returns.
 
The index has the advantage of no commissions or spread, and it assumes unlimited liquidity in its stocks. So matching the index in the real world, after all the costs, is actually outperforming it.

Also of course everyone in the game thinks they can outperform.
 
Managing millions or billions to a profit, year after year, is a vastly different and more difficult task than doing the same as a single retail investor.
 
Exactly! However Howard is painting this picture that these companies are going to totally destroy us retail traders, due to all their resources and talented people that they employ. But I don’t see any evidence of that in these results. These funds are at a severe disadvantage to us retail traders due to their size. Perhaps Howard can show us some evidence to support his claim.

I don’t see these big trading companies as being our competition anyway. Due to their size, they are playing a completely different ballgame from the one we are playing.

And as for simple systems not working anymore – well you can’t get any simpler than buy-and-hold a low cost index fund, but this approach will, apparently, beat the majority of professional fund managers.
 
I consider myself an expert in another field (not trading or financial markets), and I can tell you the "top" people in this field - the 'names', the professors, the lecturers, the academics, the 'published' - have very little idea about the real world or what's really possible on the leading edge. Those who are really ahead of the game - no one recognizes them, or they look like lunatics. So if the hedge funds with their teams of Ivy League quants are achieving 20% pa, you can bet there are individuals who are absolutely smashing that by many times (using computing or discretionary methods), even with big accounts.
 
Exactly! However Howard is painting this picture that these companies are going to totally destroy us retail traders, due to all their resources and talented people that they employ. But I don’t see any evidence of that in these results. These funds are at a severe disadvantage to us retail traders due to their size. Perhaps Howard can show us some evidence to support his claim.

I don't think he is making that claim at all. What he is saying is that the masses are now able to access the sort of quant analysis which was once the realm of the bigger companies. So our/your competition is now very likely to be a growing throng of very skilled (or learning to be very skilled) people with growing computer capability and advanced analytic ideas and techniques. This new breed will become the norm in trading and investing.

I don’t see these big trading companies as being our competition anyway. Due to their size, they are playing a completely different ballgame from the one we are playing.

Yes, they are, but the skills and ideas, which are forever growing, can now be accessed by Joe Normal.
Pretty soon Joe Normal will become Joe Abnormal!
Personally I traded my horse in as soon as I saw the first T Ford!

And as for simple systems not working anymore – well you can’t get any simpler than buy-and-hold a low cost index fund, but this approach will, apparently, beat the majority of professional fund managers.

If you just want so-so returns, fine. There is also bank interest.
 
If you're really bright and can program at a high level, why on earth would you work for a hedge fund where other people are getting wealthy on the back of your hard work... and in the process eroding your edge? You'd maybe work for them for a while, keep your best work to yourself until you save up and then go solo.

The only time it would make sense to use someone else's money is if you're a good discretionary trader, and even then, the moment you succeed, everyone is going to be watching your every move.

What I'm saying is, the big funds aren't set up to be able to find the best ideas. People know what ideas are worth and they don't want to have their IP used unless they are paid extremely well.
 
What, you seriously think they wouldn't be paid well?
 