Australian (ASX) Stock Market Forum

Designing a Neural Network, need your help...

A little bit about me: I'm an analytics professional who has for a while now been making a living designing predictive models (regression analysis, decision trees, neural networks, etc.).

A while ago I decided to build a neural network to predict the outcome of sporting matches. It turned out to be quite successful, with 70%+ accuracy. I had designed over 200 variables to be built into the model. Well, long story short, I learnt it doesn't take long before bookies stop taking your bets. I kept betting, but only in small amounts, not in amounts large enough to really make a living. (ROI on these models was between 15% and 25%.)

So the second option is to move to parimutuel betting (read up on Zeljko Ranogajec and David Walsh). The only problem is that the data needed to build a decent model in horse racing would be almost impossible to get; the aforementioned guys have teams of people collecting the info just for them.

So what I've decided is to hit the stock market. I have a few ideas I want to try out, but I'm finding that the data is extremely hard to get. I need an API, or even a site I could screen scrape, to get data on price, dividends, volume, franking, PE/PB ratios, EPS, etc.

Yahoo and Google have historical price and dividend data, but that's it.

Every place has the metrics I need, but only at the current point in time.

So the million dollar question is: where can I get this data historically?
 
What are you planning on trading, and what data do you want?

It sounds like you are after historical fundamental data. I'm not sure what your thinking is, but I would think the price action is more applicable. Fundamental data is merely a historical financial projection, i.e. someone in the past thought a future price would be $x.

If stocks, then Yahoo is probably fine. It's not clean, but it's 95%+ accurate, which is more than enough for a NN. The percentage win rate is still not earth shattering. Don't underestimate the number of NNs out there that you're competing with, or their complexity.

I've been looking into BSP lately as they're more rule based, which would more closely follow a human. Also the code could then be transported to other platforms, ninja, ab etc.

My Advice,
Learn the stock market first. Look for things that you can program first, e.g. support / resistance areas, VSA, all the methods that people currently use. Plug this data into the NN to give you an edge. It's pretty basic, but some of these signals should help.

Aaaahhh too many ideas, too many kids, not enough time :(
 
Hi Neural --

I have been working with artificial intelligence applications, including neural networks, to financial markets for about 50 years.

There are several neural network packages commercially available that have functions specific to trading.

Alyuda addins for Excel:
http://www.alyuda.com/index.html

BioComp Systems Profit and Dakota:
http://www.biocompsystems.com/markets/financial.html

Tradecision:
http://www.tradecision.com/index.htm

Ward Systems NeuroShell:
http://www.wardsystems.com/

A web search will list several others. Many are no longer being supported. There are some questionable ethics among vendors and consultants.

----------------

Neural network models are very susceptible to overfitting. Use the best modeling and simulation techniques, including rigorous validation, to avoid introducing unfavorable bias.

Based on my research and experience, accurate forecasts depend on having high quality data, thoughtfully preprocessed, predicting one bar ahead.

Note that neural networks are one technique among many machine learning techniques. Consider exploring others, such as support vector machines, that are more robust and often provide better models.

Financial data is extremely non-stationary. Modeling techniques that work well on stationary data, including sports betting, fail badly when applied to non-stationary data.
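
One rough way to see non-stationarity is to compare summary statistics of returns across consecutive windows. The sketch below is only an illustration: the return series is synthetic (its volatility is deliberately tripled halfway through), and `window_stats` is a hypothetical helper, not part of any package mentioned above.

```python
import random
import statistics

def window_stats(values, window):
    """Return (mean, stdev) for each consecutive, non-overlapping window."""
    stats = []
    for start in range(0, len(values) - window + 1, window):
        chunk = values[start:start + window]
        stats.append((statistics.mean(chunk), statistics.stdev(chunk)))
    return stats

# Synthetic daily returns (an assumption, for illustration only):
# volatility triples in the second half, so the series is non-stationary.
rng = random.Random(0)
returns = [rng.gauss(0.0, 0.01) for _ in range(500)]
returns += [rng.gauss(0.0, 0.03) for _ in range(500)]

stats = window_stats(returns, window=250)
for i, (mean, stdev) in enumerate(stats):
    print(f"window {i}: mean={mean:+.5f} stdev={stdev:.5f}")
```

A model fitted on the first two windows would have learned a volatility that no longer applies in the last two, which is the kind of failure described above.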

Best regards,
Howard
 
Doesn't the TAB take all bets?

Give a squizz at your system and I'll check out if the TAB cut me off :D
 
Hi Howard,

Based on your experience, are neural networks worth exploring for price estimation? Can they provide a better result than a human, a comparable result, or at the very least hold their own against the benchmarks?

I have been using them more for portfolio optimisation. However, like all things, once you understand a complex problem it's easier to write the algorithm than it is to have the computer learn it.
 
Hi Dave --

In my experience -- experience which includes fifty years of working with neural networks and other artificial intelligence techniques, and two concentrated years of research focused on use of neural networks (sponsored by a major trading company) -- and based on the disappearance of previously promising neural network-based financial trading system development platforms, I think there are better techniques than neural networks for development of trading systems.

If you are using a platform such as Python with the scikit-learn library to present financial data to a machine learning routine, it is easy to try out a variety of learning techniques simply by performing data pre-processing one time, then calling each of the different functions. Neural network is one, decision tree another, support vector machine a third, and so on for about twenty different learning techniques. In that case, include neural networks as one of the alternatives. But I would not begin a project focused solely on neural networks for use with financial time series data.
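
A minimal sketch of that workflow, assuming scikit-learn and NumPy are installed. The data here is random placeholder data, and the model settings (hidden layer size, tree depth, SVM kernel) are arbitrary choices for illustration, not recommendations.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Placeholder data (an assumption): 400 time-ordered rows of 5 predictors
# and a binary target. Replace with real, preprocessed features.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=400) > 0).astype(int)

# Keep time order: learn on the earlier rows, evaluate on the later rows.
split = 300
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Preprocess once, fitting the scaler on the learning rows only.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Each learner exposes the same fit / score interface.
models = {
    "neural network": MLPClassifier(
        hidden_layer_sizes=(8,), max_iter=2000, random_state=0
    ),
    "decision tree": DecisionTreeClassifier(max_depth=3, random_state=0),
    "support vector machine": SVC(kernel="rbf", C=1.0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train_s, y_train)
    scores[name] = model.score(X_test_s, y_test)  # accuracy on held-out rows
    print(f"{name}: {scores[name]:.3f}")
```

Adding another technique is a one-line change to the `models` dictionary, which is what makes the comparison cheap once the preprocessing is done.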

Best regards,
Howard
 
HB, do you happen to have a good primer for SVM to hand? Thanks.
 
So the million dollar question is: where can I get this data historically?

I would agree that Yahoo has reasonable daily data, including volume. NinjaTrader is fairly efficient, with C# programming and a strategy wizard that make it pretty quick to build a strategy.

I read a couple of neural network sites, and it looks like you will still need a strategy of some sort.

Step 4: Selecting strategy rules for performance testing
To test a neural model, you need to define a strategy based on the model forecasts (model-based strategy).

Trading strategies produced by New Model Wizard are based solely on model forecasts. These strategies produce buy/sell signals that depend only on model forecasts, and, therefore, allow testing model performance only, and not the money management or indicator-based rules.


Money management and self control are two things that you will have to work out. I will be looking for your future posts to see how it all goes. :)

Pnut.
 
HB, do you happen to have a good primer for SVM to hand? Thanks.

Hi DeepState --

svm == support vector machine, one of the techniques for model building in the general framework of machine learning / pattern recognition / artificial intelligence.

A Google search for "machine learning svm" will bring up several high quality articles.

An Amazon search for "book support vector machine" will bring up many results -- some specific to svm, others covering general machine learning / artificial intelligence, most of which introduce svm.

You can read the bibliography from my latest book:
http://www.quantitativetechnicalanalysis.com/book.html
Click on the link to "Bibliography"

Best regards,
Howard
 
Hi Howard,

I am also interested in checking the performance of a backpropagation algorithm, but I am unsure about the training methodology. As a first step, I want to play with simulated data - I want to supply a stream based on sine wave values, with a bit of noise added, to the single input of the neural network and get an output that should predict the value of the next data point.
From some basic examples, I noticed it can take many iterations for a bp network with three inputs and two outputs to learn a single set of data. So, if my training stream has 10000 points, what should the training procedure be? Do I need to supply the complete set of training data in cycles, repeating over and over again from the beginning until the mean square error over all points is under a desired target value?
Or does it have to learn each point in multiple iterations and then advance to the next data value?
I am attempting to design the software in C#, so I have to understand just about everything that is going on in this process. If my network has, at the moment, one input and one output, how many hidden neurons should it have? Is 4 good enough?

Another question - for Forex, there is so-called "Expert Advisor" software that can be added to MetaTrader, and the real accounts that make use of the software appear to have (in some cases) very good results with decently low drawdowns.
For example, https://www.mql5.com/en/signals/65880 shows a pretty consistent profit of about 5% per month. Does this mean that software can be made so successful that it can consistently achieve these figures? I somehow doubt it.

Kind regards,
Nick
 
Hi Nick --

For your first question --

Simulated data -- known data and known relationships -- is valuable to help understand the process of working with the tools and understanding the results. [Assuming you will eventually be working with price data for some tradable issue, leave the simulated data and work with the real data when you begin developing the trading system. Simulated stock price data has no value in developing a system for trading real stock price data.]

Select a neural network tool kit. It is possible to design, program, debug, and maintain your own. If you do that, maintaining the tool kit will become your project, rather than stock trading being your project.

Select a data source for the real data you will eventually use. Make certain that the data supplier has the data streams that you need, in the format you need them, at the time you need them.

Read the documentation for the neural network tool to learn its requirements for data format and its capabilities, such as parameters to specify the number of layers and nodes per layer.

The tool probably has some toy problems that can be used to get started. Work with them first.
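
As a rough sketch of such a toy problem, here is the noisy sine wave from your question, written in plain Python only to show the mechanics. The network has three lagged inputs, four tanh hidden nodes, and one linear output. It is trained by passing over the whole training set again and again (epochs), adjusting the weights after each row. The sizes, learning rate, and epoch count are arbitrary assumptions, and a maintained tool kit does the same job with far less effort.

```python
import math
import random

rng = random.Random(1)

# Noisy sine wave (simulated data; all constants are arbitrary assumptions).
series = [math.sin(0.2 * t) + rng.gauss(0.0, 0.05) for t in range(300)]

LAGS, HIDDEN, RATE, EPOCHS = 3, 4, 0.05, 200
# Each row: the previous LAGS values as inputs, the next value as target.
rows = [(series[t - LAGS:t], series[t]) for t in range(LAGS, len(series))]

# Small random starting weights; the last entry of each list is a bias.
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(LAGS + 1)]
            for _ in range(HIDDEN)]
w_out = [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN + 1)]

def forward(x):
    """Return (hidden activations, prediction) for one input row."""
    h = [math.tanh(sum(w * v for w, v in zip(ws, x)) + ws[-1])
         for ws in w_hidden]
    return h, sum(w * a for w, a in zip(w_out, h)) + w_out[-1]

def mean_squared_error():
    return sum((forward(x)[1] - y) ** 2 for x, y in rows) / len(rows)

mse_before = mean_squared_error()
for epoch in range(EPOCHS):            # one epoch = one pass over all rows
    for x, y in rows:                  # adjust the weights after every row
        h, pred = forward(x)
        err = pred - y
        # Backpropagate: hidden gradients use the not-yet-updated output weights.
        for j in range(HIDDEN):
            grad_h = err * w_out[j] * (1.0 - h[j] ** 2)
            for i in range(LAGS):
                w_hidden[j][i] -= RATE * grad_h * x[i]
            w_hidden[j][-1] -= RATE * grad_h
        for j in range(HIDDEN):
            w_out[j] -= RATE * err * h[j]
        w_out[-1] -= RATE * err
mse_after = mean_squared_error()
print(f"MSE before training: {mse_before:.4f}, after: {mse_after:.4f}")
```

Note that the error reported here is measured on the same rows used for learning, so it says nothing yet about performance on new data.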

When you move on to trading data, the data "rows" must stand alone. If you use a lagged value of an indicator, say yesterday's value of an RSI indicator, there must be a field that holds the value of yesterday as well as a field that holds the value of today. There will be a special field, often the final field in the data, that has the value of the target. The target is the value being predicted. It might be the direction of price change one bar into the future.
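
A minimal sketch of that row layout, assuming a short list of made-up closing prices and using a one-bar percentage change as a stand-in for an indicator such as RSI. The helper name `build_rows` and the field names are hypothetical.

```python
def build_rows(closes):
    """Build stand-alone rows: today's and yesterday's indicator, then the target.

    The indicator is a one-bar percentage change, a stand-in for RSI or any
    other indicator. The target is 1 if the next bar closes higher, else 0.
    """
    change = [None] + [
        (closes[i] - closes[i - 1]) / closes[i - 1]
        for i in range(1, len(closes))
    ]
    rows = []
    # Start at 2 so yesterday's indicator exists; stop one bar early so the
    # target (the next bar's direction) is known.
    for t in range(2, len(closes) - 1):
        rows.append({
            "change_today": change[t],
            "change_yesterday": change[t - 1],  # lagged value stored in the row
            "target": 1 if closes[t + 1] > closes[t] else 0,
        })
    return rows

closes = [10.0, 10.2, 10.1, 10.4, 10.3, 10.6]  # made-up prices
rows = build_rows(closes)
for row in rows:
    print(row)
```

Each row can be read on its own, with no need to look at its neighbours, and the most recent bar is left out because its target is not yet known.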

The learning process is an attempt to separate the rows of data into groups where the target values in each group are similar. Successful learning is required for profitable trading. Successful learning requires predictor variables that do differentiate between target categories by means of discoverable relationships. Most of your time will be spent looking for useful predictors -- data series and transformations of that data.

It will be hard work. Neural networks are very sensitive to stationarity and tend to overfit. My presentation on stationarity on YouTube might be helpful.
https://www.youtube.com/watch?v=iBhrZKErJ6A

Set aside some data that is not used during learning, and save it for validation.
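
One simple way to do that is to split the rows by time, without shuffling. In this sketch the helper name `chronological_split` and the 25% validation fraction are assumptions for illustration.

```python
def chronological_split(rows, validation_fraction=0.25):
    """Split time-ordered rows into (learning, validation) without shuffling."""
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    cut = int(len(rows) * (1.0 - validation_fraction))
    return rows[:cut], rows[cut:]

rows = list(range(100))  # stand-in for 100 time-ordered data rows
learning, validation = chronological_split(rows)
print(len(learning), len(validation))
```

Every validation row comes after every learning row, so the model is never evaluated on a period it has already seen.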

Best regards,
Howard
 
Hi Nick --

Your second question about the future profitability of the Forex system you referenced --

In-sample results are always good. They have little to no value in estimating future performance. The only way to estimate future performance is by running the system on data that was not used during development of the model -- out-of-sample data -- resulting in out-of-sample performance. It takes surprisingly few uses of what was once out-of-sample data, followed by modification of the rules, to compromise the out-of-sampleness of the validation data, so use it sparingly.

Do not trust any posted system results until you have verified to your own satisfaction that it does provide enough profit to compensate for the risk as measured over a truly out-of-sample period of real trading.

First, be certain that the system is tradable and that the results posted can be achieved. Check that signals arrive in time to make the trades, or that the trades posted by automated executions are the trades credited to customers' accounts.

If that part looks good, then try this technique:
1. Download, or copy and paste, the trade list for the most recent period and store it on your computer.
2. Using the date you did that as a starting point, monitor the real-time performance of the system and evaluate the results.
3. If the results over the period of time you observed are good enough that you want to pay the developer's fee and trade the system, download another copy of the trade list for the period you used in step 1 above. Check to be certain that it has not changed.
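
The comparison in step 3 can be automated by storing a digest of the saved trade list and checking the later download against it. This is a sketch under assumptions: the trade rows are made up, and `fingerprint` is a hypothetical helper built on Python's standard `hashlib` module.

```python
import hashlib

def fingerprint(trade_list_text):
    """Return a SHA-256 digest of the trade list, ignoring line-ending differences."""
    normalized = "\n".join(
        line.rstrip() for line in trade_list_text.splitlines()
    )
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Step 1: the trade list as saved at the start (made-up rows for illustration).
saved = "2015-03-02,BUY,EURUSD,1.1180\n2015-03-04,SELL,EURUSD,1.1050\n"
saved_digest = fingerprint(saved)

# Step 3: the same period downloaded again later. Here the exit price of a
# losing trade has been altered, which is what the check is meant to catch.
later = "2015-03-02,BUY,EURUSD,1.1180\n2015-03-04,SELL,EURUSD,1.1250\n"

unchanged = fingerprint(later) == saved_digest
print("trade list unchanged:", unchanged)
```

A mismatch only says that something changed. Reading the two copies side by side is still needed to see which trades were edited or removed.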

Best regards,
Howard
 
Dear Howard,

Thank you very much for the advice. I also watched the video, I found it very interesting and useful.

Best regards,
Nick
 
This is sound advice.
In your experience, are there any automated systems capable of providing a consistent profit in Forex? What would be a result that somebody would be proud of? Is 10% per year attainable? Is the share market easier to predict than Forex?

Best regards,
Nick
 
Hi Nick --

I am not familiar with enough of the systems trading Forex to make a reasonable reply specific to Forex.

I have a friend who is an excellent trader who regularly tells me that he would gladly give up all gains in excess of 8% per year if he could be assured of a steady 8%.

Which is easier -- shares or Forex? I would begin with an analysis of the price series of each market, independent of rules to buy and sell -- similar to what I describe in my Quantitative Technical Analysis book. I would look at the symmetry of upward and downward changes in price. We know that shares have an inherent upward bias that favors being long, and that influences the design of rules to buy and sell. Forex may be more symmetrical and in some sense easier to model.
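
A first look at symmetry can be as simple as counting and averaging the up and down changes separately. This is a hedged sketch with made-up prices, not the analysis from the book; `change_symmetry` is a hypothetical helper.

```python
def change_symmetry(closes):
    """Summarize up and down bar-to-bar percentage changes separately."""
    changes = [(b - a) / a for a, b in zip(closes, closes[1:])]
    ups = [c for c in changes if c > 0]
    downs = [-c for c in changes if c < 0]
    return {
        "up_fraction": len(ups) / len(changes),
        "mean_up": sum(ups) / len(ups) if ups else 0.0,
        "mean_down": sum(downs) / len(downs) if downs else 0.0,
    }

closes = [100.0, 101.0, 100.5, 102.0, 101.0, 103.0, 104.0]  # made-up prices
summary = change_symmetry(closes)
print(summary)
```

For a symmetrical series the up fraction would sit near 0.5 and the two mean magnitudes would be close. A persistent tilt in either is the kind of bias that influences the design of the rules.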

Before trading Forex, carefully assess the relationship between the quote vendor, order execution, and clearing organizations. Without unified price distribution (best bid and offer) and independent central clearing, Forex retains and deserves its Wild West reputation.

Best regards,
Howard
 