# Designing a Neural Network, need your help...



## Neural (9 April 2015)

A little bit about me: I'm an analytics professional who has been making a living designing predictive models for a while now (regression analysis, decision trees, neural networks, etc.).

A while ago I decided to build a neural network to predict the outcome of sporting matches, and it turned out to be quite successful: 70%+ accuracy. I had designed over 200 variables to be built into the model. Long story short, I learnt it doesn't take long before bookies stop taking your bets. I kept betting, but only in amounts too small to really make a living from. (ROI on these models was between 15% and 25%.)

So the second option is to move to parimutuel betting (read up on Zeljko Ranogajec and David Walsh). The only problem is that the data needed to build a decent model in horse racing would be almost impossible to get; the aforementioned guys have teams of people collecting the info just for them.

So what I've decided is to hit the stock market. I have a few ideas I want to try out, but I'm finding the data extremely hard to get. I need an API, or even a site I could screen-scrape, for data on price, dividends, volume, franking, PE/PB ratios, EPS, etc.

Yahoo and Google have historical prices and dividends, but that's it.

Every place has the metrics I need, but only at the current point in time.

So the million dollar question is: where can I get this data historically?


----------



## skc (9 April 2015)

Neural said:


> So the million dollar question is: where can I get this data historically?




https://www.aussiestockforums.com/forums/showthread.php?t=23925

Hope you make trillions!


----------



## Atari rose (9 April 2015)

I'm more interested in your sporting model! Links please...


----------



## DaveDaGr8 (9 April 2015)

What are you planning on trading, and what data do you want?

It sounds like you are after historical fundamental data. I'm not sure what your thinking is, but I would think the price action is more applicable. Fundamental data is merely a historical financial projection, i.e. someone in the past thought a future price would be $x.

If stocks, then Yahoo is probably fine. It's not clean, but it's 95%+ accurate, which is more than enough for an NN. The percentage win rate is still not earth-shattering. Don't underestimate the number of NNs out there that you're competing with, or their complexity.

I've been looking into BSPs lately as they're more rule-based, which would more closely follow a human. Also, the code could then be ported to other platforms: ninja, ab, etc.

My advice:
Learn the stock market first. Look for things that you can program, i.e. support/resistance areas, VSA, all the methods that people currently use. Plug this data into the NN to give you an edge. It's pretty basic, but some of these signals should help.

Aaaahhh too many ideas, too many kids, not enough time


----------



## howardbandy (10 April 2015)

Hi Neural --

I have been applying artificial intelligence, including neural networks, to the financial markets for about 50 years.

There are several neural network packages commercially available that have functions specific to trading.  

Alyuda addins for Excel:
http://www.alyuda.com/index.html

BioComp Systems Profit and Dakota:
http://www.biocompsystems.com/markets/financial.html

Tradecision:
http://www.tradecision.com/index.htm

Ward Systems NeuroShell: 
http://www.wardsystems.com/

A web search will list several others.  Many are no longer being supported.   There are some questionable ethics among vendors and consultants.

----------------

Neural network models are very susceptible to overfitting.  Use the best modeling and simulation techniques, including rigorous validation, to avoid introducing unfavorable bias.  

Based on my research and experience, accurate forecasts depend on having high quality data, thoughtfully preprocessed, predicting one bar ahead.

Note that neural networks are one technique among many machine learning techniques.  Consider exploring others, such as support vector machines, that are more robust and often provide better models. 

Financial data is extremely non-stationary.  Modeling techniques that work well on stationary data, including sports betting, fail badly when applied to non-stationary data.
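The overfitting danger can be seen even in a toy sketch: fit an over-flexible model on noisy in-sample data, then score it on held-back points. Everything below is synthetic and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 80)
y = 0.3 * np.sin(2 * np.pi * x) + rng.normal(0, 0.5, x.size)  # weak signal, heavy noise

# Hold back the last 20 points for validation -- never touch them while fitting.
x_in, y_in, x_out, y_out = x[:60], y[:60], x[60:], y[60:]

# An over-flexible model: a degree-15 polynomial fit in-sample.
coeffs = np.polyfit(x_in, y_in, deg=15)
mse_in = np.mean((np.polyval(coeffs, x_in) - y_in) ** 2)
mse_out = np.mean((np.polyval(coeffs, x_out) - y_out) ** 2)
print(f"in-sample MSE:     {mse_in:.3f}")
print(f"out-of-sample MSE: {mse_out:.3f}")
```

The in-sample fit looks excellent; the held-back points expose it. The same discipline applies to any neural network experiment.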

Best regards,
Howard


----------



## fiftyeight (10 April 2015)

Doesn't the TAB take all bets???

Give us a squizz at your system and I'll find out if the TAB cuts me off.


----------



## DaveDaGr8 (11 April 2015)

Hi Howard,

Based on your experience, are they worth exploring for price estimation? Can they provide a better result, a result comparable to a human, or at the very least hold their own against the benchmarks?

I have been using them more for portfolio optimisation. However, like all things, once you understand a complex problem it's easier to write the algorithm yourself than it is to have the computer learn it.


----------



## tech/a (11 April 2015)

Dr Bruce Vanstone has done a wealth of work on the topic

http://works.bepress.com/bruce_vanstone/


----------



## howardbandy (11 April 2015)

Hi Dave --

In my experience -- experience which includes fifty years of working with neural networks and other artificial intelligence techniques, and two concentrated years of research focused on use of neural networks (sponsored by a major trading company) -- and based on the disappearance of previously promising neural network-based financial trading system development platforms, I think there are better techniques than neural networks for development of trading systems.  

If you are using a platform such as Python with the scikit-learn library to present financial data to a machine learning routine, it is easy to try out a variety of learning techniques simply by performing data pre-processing one time, then calling each of the different functions.  Neural network is one, decision tree another, support vector machine a third, and so on for about twenty different learning techniques.  In that case, include neural networks as one of the alternatives.  But I would not begin a project focused solely on neural networks for use with financial time series data. 
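A sketch of that workflow in Python with scikit-learn: preprocess once, then simply call each learner in turn on the same prepared data. The data here is synthetic (`make_classification`), purely to show the mechanics, and the model settings are arbitrary illustrative choices.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

# Synthetic stand-in for preprocessed market data.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Preprocess one time, fitting the scaler on training data only.
scaler = StandardScaler().fit(X_tr)
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

# Then try a variety of learning techniques with the same interface.
models = {
    "neural net": MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "support vector machine": SVC(),
}
scores = {name: m.fit(X_tr, y_tr).score(X_te, y_te) for name, m in models.items()}
for name, s in scores.items():
    print(f"{name}: {s:.2f}")
```

The uniform `fit`/`score` interface is what makes the "one preprocessing pass, many learners" approach cheap to try.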

Best regards,
Howard


----------



## DeepState (12 April 2015)

HB, do you happen to have a good primer for SVM to hand? Thanks.


----------



## Pnut (12 April 2015)

Neural said:


> So the million dollar question is, where can I get this data historical?




I would agree Yahoo has reasonable daily data with volume etc. Ninja Trader is fairly efficient, with a C# strategy wizard that makes it pretty quick to build a strategy.

I read a couple of neural network sites, and it looks like you will still need a strategy of some sort.

> Step 4: Selecting strategy rules for performance testing
> To test a neural model, you need to define a strategy based on the model forecasts (model-based strategy).
>
> Trading strategies produced by New Model Wizard are based solely on model forecasts. These strategies produce buy/sell signals that depend only on model forecasts and, therefore, allow testing model performance only, not the money management or indicator-based rules.

Money management and self-control are two things you will have to work out for yourself. I will be looking for your future posts to see how it all goes.

Pnut.


----------



## howardbandy (13 April 2015)

DeepState said:


> HB, do you happen to have a good primer for SVM to hand? Thanks.




Hi DeepState --

svm == support vector machine.  One of the techniques for model building in the general framework of machine learning / pattern recognition / artificial intelligence.

Google "machine learning svm" will bring up several high quality articles.

Amazon search "book support vector machine" will bring up many results -- some specific to svm, others general machine learning / artificial intelligence, most of which introduce svm.

You can read the bibliography from my latest book:
http://www.quantitativetechnicalanalysis.com/book.html
Click on the link to "Bibliography"

Best regards,
Howard


----------



## DaveDaGr8 (14 April 2015)

howardbandy said:


> svm == support vector machine.
> Howard




Love it. I have accidentally used != in an email to non-programmers before. Blank looks everywhere, but nobody wanted to ask me what it meant. Nobody wanted to be the first to sound stupid, I guess.


----------



## NickF (19 May 2015)

howardbandy said:


> Hi DeepState --
> 
> svm == support vector machine.  One of the techniques for model building in the general framework of machine learning / pattern recognition / artificial intelligence.
> 
> ...




Hi Howard,

I am also interested in checking the performance of a backpropagation algorithm, but I am unsure about the training methodology. As a first step, I want to play with simulated data: I want to supply a stream of sine-wave values with a bit of noise added to the single input of the neural network, and get an output that predicts the value of the next data point.
From some basic examples, I noticed it can take many iterations for a BP network with three inputs and two outputs to learn a single set of data. So, if my training stream has 10000 points, what should the training procedure be? Do I need to supply the complete set of training data in cycles, repeating over and over from the beginning, until the mean squared error over all points is under a desired target value?
Or does it have to learn each point in multiple iterations and then advance to the next data value?
I am (attempting to) design the software in C#, so I have to understand just about everything that is going on in this process. If, for the moment, my network has one input and one output, how many hidden neurons should it have? Is 4 good enough?

Another question: for Forex, there is so-called "Expert Advisor" software that can be added to MetaTrader, and the real accounts that make use of this software appear to have (in some cases) very good results with decently low drawdowns.
For example, https://www.mql5.com/en/signals/65880 shows a pretty consistent profit of about 5% per month. Does this mean that software can be made so successful that it can consistently achieve these figures? I somehow doubt it.

Kind regards,
Nick


----------



## howardbandy (20 May 2015)

Hi Nick --

For your first question --

Simulated data -- known data and known relationships -- is valuable to help understand the process of working with the tools and understanding the results.  [Assuming you will eventually be working with price data for some tradable issue, leave the simulated data and work with the real data when you begin developing the trading system.  Simulated stock price data has no value in developing a system for trading real stock price data.]

Select a neural network tool kit.  It is possible to design, program, debug, and maintain your own.  If you do that, maintaining the tool kit will become your project, rather than stock trading being your project.  

Select a data source for the real data you will eventually use.  Make certain that the data supplier has the data streams that you need, in the format you need them, at the time you need them.

Read the documentation from the neural network tools to learn its requirements for data format, and its capabilities -- such as parameters to specify the number of layers and nodes per layer.  

The tool probably has some toy problems that can be used to get started.  Work with them first.

When you move on to trading data, the data "rows" must stand alone.  If you use a lagged value of an indicator, say yesterday's value of an RSI indicator, there must be a field that holds the value of yesterday as well as a field that holds the value of today.  There will be a special field, often the final field in the data, that has the value of the target.  The target is the value being predicted.  It might be the direction of price change one bar into the future.  

The learning process is an attempt to separate the rows of data into groups where the target values in each group are similar.  Successful learning is required for profitable trading.  Successful learning requires predictor variables that do differentiate between target categories by means of discoverable relationships.  Most of your time will be spent looking for useful predictors -- data series and transformations of that data.

It will be hard work.  Neural networks are very sensitive to non-stationarity and tend to overfit.  My presentation on stationarity on YouTube might be helpful.
https://www.youtube.com/watch?v=iBhrZKErJ6A

Allow for some data that is not used during learning to be saved for validation.
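On the specific training-procedure question: the usual scheme is epochs -- repeated full passes over the entire training set, continuing until the mean squared error over all points falls below a target or an epoch limit is hit. A minimal hand-rolled sketch in NumPy, on a noisy sine wave; the layer size, window length, and learning rate are illustrative, not tuned recommendations.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(500)
series = np.sin(0.1 * t) + rng.normal(0, 0.05, t.size)   # noisy sine wave

# Windowed samples: 5 past values in, the next value out.
X = np.array([series[i:i + 5] for i in range(len(series) - 5)])
y = series[5:].reshape(-1, 1)

# One hidden layer of 8 tanh units; sizes and learning rate are illustrative.
W1 = rng.normal(0, 0.5, (5, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 0.5, (8, 1)); b2 = np.zeros(1)
lr = 0.02
history = []

for epoch in range(3000):              # one epoch = one pass over all points
    h = np.tanh(X @ W1 + b1)           # forward pass
    out = h @ W2 + b2
    err = out - y
    history.append(float(np.mean(err ** 2)))
    if history[-1] < 0.01:             # stop once the MSE target is reached
        break
    g_out = 2 * err / len(X)           # backward pass (batch gradient descent)
    g_h = (g_out @ W2.T) * (1 - h ** 2)
    W2 -= lr * h.T @ g_out; b2 -= lr * g_out.sum(0)
    W1 -= lr * X.T @ g_h;   b1 -= lr * g_h.sum(0)

print(f"stopped after {len(history)} epochs, MSE {history[-1]:.4f}")
```

The loop answers the "complete set in cycles" question directly: whole-set passes repeated until the stopping condition, not one point at a time to convergence.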

Best regards,
Howard


----------



## howardbandy (20 May 2015)

Hi Nick --

Your second question about the future profitability of the Forex system you referenced --

In-sample results are always good.  They have little to no value in estimating future performance.  The only way to estimate future performance is by running the system on data that was not used during development of the model -- out-of-sample data -- resulting in out-of-sample performance.  It takes surprisingly few uses of what was once out-of-sample data, followed by modification of the rules, to compromise the out-of-sampleness of the validation data, so use it sparingly.

Do not trust any posted system results until you have verified to your own satisfaction that it does provide enough profit to compensate for the risk as measured over a truly out-of-sample period of real trading.  

First, be certain that the system is tradable and that the posted results can be achieved: that signals come in time to make the trades, and that the trades posted by automated execution are the trades credited to customers' accounts.

If that part looks good, then try this technique:
1.  Download, or copy and paste, the trade list for the most recent period and store it on your computer.
2.  Using the date you did that as a starting point, monitor the realtime performance of the system and evaluate the results.
3.  If the results over the period you observed are good enough that you want to pay the developer's fee and trade the system, download another copy of the trade list for the period you used in step 1 above.  Check to be certain that it has not changed.

Best regards,
Howard


----------



## NickF (20 May 2015)

howardbandy said:


> Hi Nick --
> 
> For your first question --
> 
> ...




Dear Howard,

Thank you very much for the advice. I also watched the video, I found it very interesting and useful.

Best regards,
Nick


----------



## NickF (20 May 2015)

howardbandy said:


> Hi Nick --
> 
> Your second question about the future profitability of the Forex system you referenced --
> 
> ...




This is sound advice. 
In your experience, are there any automated systems capable of providing a consistent profit in Forex? What would be a result that somebody would be proud of? Is 10% per year attainable? Is the share market easier to predict than Forex?

Best regards,
Nick


----------



## howardbandy (21 May 2015)

NickF said:


> This is sound advice.
> In your experience, are there any automated systems capable of providing a consistent profit in Forex? What would be a result that somebody would be proud of? Is 10% per year attainable? Is the share market easier to predict than Forex?
> 
> Best regards,
> Nick




Hi Nick --

I am not familiar with enough of the systems trading Forex to make a reasonable reply specific to Forex.  

I have a friend who is an excellent trader who regularly tells me that he would gladly give up all gains in excess of 8% per year if he could be assured of a steady 8%.

Which is easier -- shares or Forex?  I would begin with an analysis of the price series of each market, independent of any rules to buy and sell -- similar to the one I describe in my Quantitative Technical Analysis book.  I would look at the symmetry of upward and downward changes in price.  We know that shares have an inherent upward bias that favors being long, and that influences the design of rules to buy and sell.  Forex may be more symmetrical and in some sense easier to model.
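A minimal sketch of that symmetry check, with synthetic log returns standing in for a real share or Forex series; the drift and volatility numbers are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)
log_ret = rng.normal(0.0003, 0.01, 2500)   # stand-in for daily log returns

ups = log_ret[log_ret > 0]
downs = -log_ret[log_ret < 0]
# Nonparametric skew (Pearson's second coefficient) as a rough symmetry measure.
skew = 3 * (log_ret.mean() - np.median(log_ret)) / log_ret.std()
print(f"up days: {ups.size}, average size {ups.mean():.5f}")
print(f"down days: {downs.size}, average size {downs.mean():.5f}")
print(f"skew: {skew:.4f}")
```

Run the same comparison on a share series and a currency pair, and the counts, average move sizes, and skew give a first read on which market is more symmetrical before any buy/sell rules enter the picture.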

Before trading Forex, carefully assess the relationship between the quote vendor, order execution, and clearing organizations.  Without unified price distribution (best bid and offer) and independent central clearing, Forex retains and deserves its Wild West reputation.

Best regards,
Howard


----------



## DaveDaGr8 (24 May 2015)

Hi Nick,

You said 1 input. Are you using feedback loops ?


----------



## NickF (24 May 2015)

DaveDaGr8 said:


> Hi Nick,
> 
> You said 1 input. Are you using feedback loops ?




Hi Dave,

I am just at the beginning, where I am learning and trying to figure out the proper architecture of neural networks for data forecasting. I have a lot to read. I came across a very valuable (in my opinion) document that describes an apparently successful approach to forecasting Forex: http://arxiv.org/ftp/cond-mat/papers/0304/0304469.pdf

It provides methods to analyse whether data is forecastable and to estimate the quality of prediction. This can be used not only with neural networks but with any kind of system, for example any of the indicators. If we implemented various strategies (Ichimoku, MA crossover, etc.) on a given sample, we could identify which of these indicators are worthwhile and which are not worth bothering with.

I will probably purchase one of Howard's books in the future; it should provide even more valuable information.
I already installed Python on one of my computers, but I have to focus in one direction, as I tend to look everywhere and not finish anything.

So far I am trying to understand and get comfortable with some C# sample code for neural networks - I got excellent quality code, but I will have to modify it for my needs. I've been a beginner programmer for the last 20 years.

I am uncertain whether using just one or two inputs would be sufficient for forecasting. At the moment (without enough reading), I tend to believe that for pattern recognition it would be better to provide input data from the last n periods (for example, have 10 inputs receive smoothed data from the last 10 days, plus some extra inputs). I changed my mind about using one input based on this: if we consider a sine wave with values between -1 and +1, even if I tell you that the current value is 0.5, you would be unable to tell me whether the next value will be higher or lower than 0.5. Whereas if you have the last n values and plot them (provided the data is sampled correctly), it would be trivial to answer the question. If I read more, I might change my theory again.

To answer your question, in the first stage I will probably build a backpropagation network with 100 inputs and one hidden layer (not sure how many hidden neurons, probably between 2 and 100) and see if it is capable of learning the data, maybe for 1000 values. With ANNs, there is a problem with choosing the number of neurons in a hidden layer: too few and it may never learn, too many and it will overfit the data. Also, the learning rate and momentum need to be chosen carefully: too small and it will take forever to learn, too large and it will never learn. At least, this is what I know at the moment.

Later, I may try to implement the network as described in the document above (an Elman-Jordan architecture with two hidden layers, each of 100 neurons). Then I could laugh at how much better my own network is...

Anyway, ANNs and artificial intelligence are such a vast area of research, I guess I will need an entire month to become an expert in it...

Cheers,
Nick


----------



## howardbandy (25 May 2015)

Hi Nick --

It is possible to use only one input data stream for a neural network.  But there will probably be several derivatives created from it, so the input layer of the NN will have several input nodes.

For example, begin by using closing price as the input stream.  Since all variables must be transformation invariant, price itself is not a good input.  So create several derivative indicators that are transformation invariant.  Try RSI(2), RSI(3), Z-Score(5), ROC(1), etc.  There should be as many zero crossings per time period (say month) in each of the inputs to the NN as there are changes in state of the target variable per that same time period.  

Since a neural network works best when the inputs are all in the range of 0 to 1, normalize everything using a sliding window.  There will be, say 10, inputs to the NN.  Each row of the input must be independent of all other rows.  So, in order to compare values or use changes, the prior data or change must be included as its own variable.  For example, RSI(2) today and RSI(2)ChangeFromYesterday.  10 inputs is not many, but look at what happens next. 
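A rough sketch of that preprocessing in Python with pandas: a couple of indicators derived from close alone, then squashed into the 0 to 1 range with a trailing sliding-window normalization. Prices are synthetic and the window lengths are illustrative assumptions, not recommendations.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 300))))

roc1 = close.pct_change()                                        # ROC(1): scale-free
zscore5 = (close - close.rolling(5).mean()) / close.rolling(5).std()

def window_norm(s, window=50):
    """Min-max normalize each point against its own trailing window."""
    lo = s.rolling(window).min()
    hi = s.rolling(window).max()
    return (s - lo) / (hi - lo)

features = pd.DataFrame({
    "roc1": window_norm(roc1),
    "zscore5": window_norm(zscore5),
}).dropna()
print(features.describe())
```

The trailing window matters: normalizing against the whole series at once would leak future information into each row.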

There are proofs that a single hidden layer is sufficient.  Try two, if you wish.  Not three or more.

The number of terms in the equation represented by the NN is the product of all the nodes.  With 10 inputs, 5 in each of two hidden layers, and 1 output, there are 250 terms.  If the period of stationarity is determined to be one year, 252 trading days, the learning process will be fitting a 250-term equation to 252 data points.  This is all in-sample and the fit will be really good.  Test by predicting the next three months -- 60 trading days -- as an out-of-sample test.  Expect the results to be really bad.  (If they are good, double check everything -- you are probably fooling yourself and made a mistake somewhere.)  If they are bad, either reduce the number of nodes or increase the number of data points.  Reducing the number of nodes means the equation will be less complex -- fewer inputs and / or fewer hidden nodes.  Deciding what to keep and what to give up is difficult and will require experimentation.  Increasing the number of data points is good provided that the relationships being learned are consistent throughout -- this is the concept of stationarity.  More data is bad when the additional data is different.

Keep us posted.


Best regards,
Howard


----------



## keithj (25 May 2015)

howardbandy said:


> There are proofs that a single hidden layer is sufficient.  Try two, if you wish.  Not three or more.



I would agree with a single layer - the 2nd & subsequent layers usually make a trivial difference to the results and take an order of magnitude longer to learn.

Visit the forums at kaggle.com for more ideas - IMO a small community of v. high quality people solving real world problems.

Also consider getting a (free) AWS account & running your Python scripts there 24/7.


----------



## NickF (25 May 2015)

howardbandy said:


> Hi Nick --
> 
> It is possible to use only one input data stream for a neural network.  But there will probably be several derivatives created from it, so that input layer to the NN will have several input nodes.
> 
> ...




Hi Howard,

First of all, I am really grateful that somebody with your experience is guiding us. I will need to read some more about the matters you explained. But please let me know if I understand this correctly: neural networks are all about pattern recognition. You say that one can use only one variable and a few derivatives of that variable. Let's say I use Close and RSI(2), RSI(3), Z-Score(5), ROC(1). Since these variables are based on data from the last 5 days at most (let's make this assumption, even if it may not be correct), does it mean that the pattern recognition is limited to data from the last 5 days? Does it mean that if I have a very smart ANN, I can supply just the close values of the last five days to its five inputs and, being so smart, it will remove the noise by itself, find the best data, create its own internal indicators and provide better results?

If this is the case, would this approach work? It is an alternative to an ANN. I have 4 input variables. I normalize each of them to the 0...1 interval, then I create a large matrix and split this interval into 3 ranges. Then, for every new data set, I store the following information in pattern[0..2][0..2][0..2][0..2]: the number of appearances of this particular pattern, how many times the price increased following this pattern, and how many times the price decreased. Then I choose only the patterns that appear most often and that have the largest bias between wins/losses. There is an obvious roughness to this method compared to an ANN, but the advantage would be that learning would be much faster.
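In code, the binned-pattern idea might look something like this sketch. The data is purely synthetic and the frequency and bias thresholds are arbitrary illustrative choices.

```python
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(3)
n = 2000
inputs = rng.random((n, 4))          # 4 variables already normalized to 0..1
price_up = rng.random(n) < 0.5       # outcome following each sample

# Discretize each input into 3 ranges -> 3**4 = 81 possible patterns.
bins = np.minimum((inputs * 3).astype(int), 2)

stats = defaultdict(lambda: [0, 0])  # pattern -> [appearances, price-up count]
for row, up in zip(bins, price_up):
    stats[tuple(row)][0] += 1
    stats[tuple(row)][1] += int(up)

# Keep only patterns that appear often AND show a strong up/down bias.
frequent = {k: (seen, ups / seen) for k, (seen, ups) in stats.items() if seen >= 20}
biased = {k: v for k, v in frequent.items() if abs(v[1] - 0.5) > 0.1}
print(f"{len(stats)} patterns seen, {len(biased)} frequent and biased")
```

Note the trap: on purely random data like this, any "biased" pattern found is noise, so the same selection applied to real data still needs out-of-sample validation.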

By the way, what is the expected learning time for a neural network like the one you described above? Is it minutes, hours, days, weeks?

Best regards,
Nick


----------



## NickF (25 May 2015)

keithj said:


> I would agree with a single layer - 2nd & subsequent layers usually make a trivial difference to the results and take an order of magnitude more to learn.
> 
> Visit the forums at kaggle.com for more ideas - IMO a small community of v. high quality people solving real world problems.
> 
> Also consider getting a (free) AWS account & running your python scripts there 24/7.




Hi Keith,

Thank you very much for the invitation. I will join that forum as well. It is quite amazing that Amazon offers computing power for running user scripts. My computers sleep for most of the day, so at the moment I could put them to work, if I have some ideas.

Cheers,
Nick


----------



## keithj (25 May 2015)

NickF said:


> But please let me know if I understand this correctly - neural networks are all about pattern recognition.



Yes. And the best pattern recognition system by far is the human brain. However, as demonstrated by some of the TA threads, it can suffer from overfitting.



NickF said:


> You say that one can use only one variable and a few derivatives of that variable. Let's say I use Close and RSI(2), RSI(3), Z-Score(5), ROC(1). Since these variables are based on data from the last 5 days at most (let's make this assumption, even if it may not be correct), does it mean that the pattern recognition is limited to data from the last 5 days? Does it mean that if I have a very smart ANN, I can supply just the close values of the last five days to its five inputs and, being so smart, it will remove the noise by itself, find the best data, create its own internal indicators and provide better results?
> 
> If this is the case, would this approach work? It is an alternative to an ANN. I have 4 input variables. I normalize each of them to the 0...1 interval, then I create a large matrix and split this interval into 3 ranges. Then, for every new data set, I store the following information in pattern[0..2][0..2][0..2][0..2]: the number of appearances of this particular pattern, how many times the price increased following this pattern, and how many times the price decreased. Then I choose only the patterns that appear most often and that have the largest bias between wins/losses. There is an obvious roughness to this method compared to an ANN, but the advantage would be that learning would be much faster.



So you are essentially the brains behind the NN.

At its simplest, an NN is a list of nodes, each with a weighting that it may or may not pass to an output. If one of the inputs is the weather in Timbuktu yesterday, you'd hope that node was weighted at 0.  For inputs that aren't completely random (e.g. yesterday's price move), there should be a non-zero weighting.  It will take an NN minutes to work out that Timbuktu isn't relevant to tomorrow's price, whereas you would know it instantly.

Your NN should find there are some fairly obvious patterns that work a little over 50% of the time.  Your own brain should be able to spot them too. Whether they are tradeable or not is a different question.

The clever bit of NNs is the back propagation (or deciding on each node's weightings). As you propose above, human brains are far better at it for small data sets.



NickF said:


> By the way, what is the expected learning time for a neural network like you described above? Is it minutes, hours, days, weeks?



As long as a piece of string... It depends on how many layers, how many nodes, how much data, how much processing power, and your stopping conditions.  When developing an NN, aim for minutes, so you can iterate to the next dud idea faster.  When you've got something vaguely reasonable, throw lots of data at it and let it train for a couple of days?

And as mentioned by others, it is absolutely essential to try it with out of sample data to see how it actually performs.


----------



## NickF (25 May 2015)

keithj said:


> Yes. And the best pattern recognition system by far is the human brain. However, as demonstrated by some of the TA threads it can suffer from overfitting.




Hmm, my attempt at using an ANN was to find a better alternative to my brain. Without trying to hurt its feelings, my brain is not capable of winning the Forex game. It is pretty lazy and predisposed to gambling...

The good part is that software is now making good progress. For example, some software based on artificial intelligence has managed to attain super-human capabilities in character recognition. Of course, there is a need for a good brain behind good software...

Also, I was reading years ago about an ANN system developed for playing backgammon. The system played millions of games against itself and became, according to expert players, a top player itself. And people even managed to learn from it which of some particular alternative moves were better.

I was reading an article that said about 21% of Forex players with accounts under $1500 are winning, while about forty-something percent of Forex players with accounts above $5000 are winning. That means there is hope. And the obvious way to become a more successful player is to add more money to my shrinking real account.

Cheers,
Nick


----------



## howardbandy (25 May 2015)

Hi Nick --

Please re-read my posts #5 and #9 in this thread.

Forecasting the direction of price change one bar ahead (or something equivalent as a target for the learning) is a very difficult problem.  The markets are nearly efficient, the signal to noise ratio is very low, the data is nonstationary, neural networks are prone to overfitting to the in-sample data, etc.

Become very familiar with modeling and simulation techniques.  Begin with modeling stationary data (such as the Iris data), then progress to time series, then to financial time series.  Each is more difficult than the previous.  Whenever you are reading about or working with any technique, pay very close attention to how validation will be done.

The most value the modeler (that is, you) can add is clever transformations of the raw input data to produce a highly predictive model.  Expect to spend 80 percent of your time on this task.

Begin by reading a lot.  The internet for starters.

Enroll in artificial intelligence / pattern recognition classes in Coursera or one of the other online educational programs.  They are (mostly) free and university undergraduate and graduate level.  If necessary, review the math needed -- mostly linear algebra, some statistics.    

Learn to use one of the machine learning libraries that has a neural network component.  I recommend the Python language with the Scikit-Learn library.  Also free.
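As a concrete starting point along those lines: a minimal scikit-learn run on the Iris data with a held-out validation split. The hidden layer size and iteration cap are arbitrary illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Hidden layer size and iteration cap are illustrative, not tuned.
clf = MLPClassifier(hidden_layer_sizes=(10,), max_iter=3000, random_state=0)
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)   # accuracy on data never seen during fitting
print(f"held-out accuracy: {acc:.2f}")
```

Stationary, well-behaved data like Iris is where the tooling should feel easy; the difficulty ramps up sharply when the same recipe meets financial time series.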

Keep in mind that the problem you are working on is modeling, simulating, and forecasting financial time series.  Financial time series analysis is an order of magnitude more difficult than analysis of stationary data.   

Expect to spend on the order of 5,000 to 10,000 hours in an attempt to develop a usable solution.  Do not be too disappointed if you never find one that you have confidence in using for real money trading, even after all that time and effort.

Best,
Howard


----------



## luutzu (25 May 2015)

howardbandy said:


> Hi Nick --
> 
> Please re-read my posts #5 and #9 in this thread.
> 
> ...





hi Howard,

Did you just say: The road is long and hard; it will take years of effort and studies, and in the end you might end up with a useless solution.

You don't sugarcoat things do you?

Beats university any day.


----------



## NickF (25 May 2015)

howardbandy said:


> Hi Nick --
> 
> Expect to spend on the order of 5,000 to 10,000 hours in an attempt to develop a usable solution.  Do not be too disappointed if you never find one that you have confidence in using for real money trading, even after all that time and effort.
> 
> ...




Well, to be honest, I like to dream that I will find an automated method that will pour money into my pockets, slowly but surely. If that doesn't happen, at least I will enjoy some of the side benefits of the journey: learning to program better, reading about interesting things, feeling I am part of a major common goal (similar to the alchemists' dream of making gold), etc. It surely beats watching TV six hours a day... 
Of course, there are probably better ways to invest many thousands of hours to reap some benefits.

Thanks for bringing me back to earth 

By the way, the Iris you mention -- is it a database for recognition of eye irises? 
If that's the case, I read about a database for character recognition, MNIST -- would the data in it also be stationary? Any advantage in using Iris instead of MNIST (http://yann.lecun.com/exdb/mnist/)?
Where can I find details about Iris (apart from looking into a mirror  )?

Kind regards,
Nick


----------



## keithj (25 May 2015)

NickF said:


> By the way, the Iris you mention -- is it a database for recognition of eye irises?
> If that's the case, I read about a database for character recognition, MNIST -- would the data in it also be stationary? Any advantage in using Iris instead of MNIST (http://yann.lecun.com/exdb/mnist/)?
> Where can I find details about Iris (apart from looking into a mirror  )?



The Iris dataset is just 4 numeric attributes for a sample of 150 iris flowers.


----------



## NickF (25 May 2015)

keithj said:


> The Iris dataset is just 4 numeric attributes for a sample of 150 iris flowers.




Thanks Keith,

I'll give it a go.

Nick


----------



## howardbandy (26 May 2015)

Hi Nick --

Keith is correct about the Iris dataset.  It is one of the datasets often used to demonstrate machine learning techniques.
https://archive.ics.uci.edu/ml/datasets/Iris

If you have not already, you might download and read the free chapters here:
http://www.quantitativetechnicalanalysis.com/book.html

Here are links to some of the many machine learning courses available free online:
https://www.coursera.org/course/ml
https://work.caltech.edu/telecourse.html
http://ocw.mit.edu/courses/electric...ence/6-034-artificial-intelligence-fall-2010/

Note that none of these courses go beyond analysis of stationary data.  None discuss time series. 

Read about the developments already made in applications of machine learning to trading (among many other fields):
http://www.amazon.com/Rise-Robots-T...1432569128&sr=8-1&keywords=rise+of+the+robots

Best regards,
Howard


----------



## NickF (26 May 2015)

howardbandy said:


> Hi Nick --
> 
> Keith is correct about the Iris dataset.  It is one of datasets often used to demonstrate machine learning techniques.
> https://archive.ics.uci.edu/ml/datasets/Iris
> ...




Hi Howard,

I enrolled in the Coursera class. I just installed MATLAB. There is also a package called Octave, which can be used in a similar manner, and it's free. I may try it as well later.

I checked the book you suggested, I am very interested in reading it.

I came across a great video that explains how backpropagation works and why we choose the sigmoid function (or something similar) instead of a step function: https://www.youtube.com/watch?v=q0pm3BrIUFo

Thanks a lot for the advice. 

Best regards,
Nick


----------



## howardbandy (27 May 2015)

Hi Nick --

Good decisions!

You wrote: "I came across a great video that explains how backpropagation works and why we choose the sigmoid function (or similar) instead of a step function."

One way to think of a sigmoid transfer function is as a step function with curved transitions from the linear central section to the upper and lower limits.  Sigmoid is a class of functions; one that is particularly valuable in developing trading systems is "softmax."
http://en.wikipedia.org/wiki/Softmax_function
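A quick numeric illustration of both points (my sketch, not from the post): the logistic sigmoid behaves like a smoothed step, and for two outputs softmax reduces to it.

```python
# Sketch: sigmoid as a smoothed step; softmax as its multi-output cousin.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(z):
    z = z - np.max(z)        # shift by the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

print(sigmoid(np.array([-10.0, 0.0, 10.0])))  # step-like: ~0, 0.5, ~1
p = softmax(np.array([2.0, 1.0, 0.1]))
print(p, p.sum())                              # probabilities summing to 1
```

With two logits (x, 0), softmax gives exactly sigmoid(x) for the first output, which is the sense in which softmax generalises the sigmoid to several classes.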

Best regards,
Howard


----------

