The Predictive Power of High-Performance Computing in Finance


The machine learning process separates cleanly into three steps: calibration, validation, and verification.

Let us for the moment assume that the available data has been separated into two disjoint sets, one for training and one for validation. The three steps are then as follows.

I. Calibration (a.k.a. training) is the process of taking a supervised learning algorithm and finding a set of parameters which approximates a desired target to a satisfactory degree when measured on the training data set.

II. Validation is the process of checking the performance of the calibrated learner on the previously unused part of the data.

III. Verification is the process of choosing our desired learner amongst a variety of algorithms based on their calibration and validation performances.

Once we have calibrated, validated, and verified our learner, we’re ready to deploy it on previously unseen data, i.e. we can run it out-of-sample. For trading algorithms, out-of-sample normally refers to the real-time investment process (whereas all previous steps work with historical data).
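The three-step routine can be sketched in a few lines. Everything below is illustrative, not the author's actual setup: the learners are least-squares fits that use the first k features, and the data is synthetic.

```python
import numpy as np

# Synthetic data: the target depends on the first two features only
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X[:, 0] - 0.5 * X[:, 1] + 0.1 * rng.normal(size=200)

# Two disjoint sets for training and validation
X_tr, y_tr = X[:150], y[:150]
X_va, y_va = X[150:], y[150:]

def calibrate(X, y, k):
    """Step I: fit a learner that uses only the first k features."""
    beta, *_ = np.linalg.lstsq(X[:, :k], y, rcond=None)
    return beta

def validate(beta, X, y):
    """Step II: measure performance on the held-out data."""
    return np.mean((X[:, :len(beta)] @ beta - y) ** 2)

# Step III: verification -- choose among a variety of candidate learners
candidates = {k: calibrate(X_tr, y_tr, k) for k in (1, 2, 3, 4, 5)}
errors = {k: validate(b, X_va, y_va) for k, b in candidates.items()}
best_k = min(errors, key=errors.get)
print("chosen learner size:", best_k)
```

Only after all three steps would such a learner be run out-of-sample on genuinely new data.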

So let’s look at what exactly artificial intelligence means in this context.

I. A learner A is more intelligent than a learner B if it can learn the solution to at least as many problems as B.

II. General artificial intelligence is achieved when a learner is more intelligent than all other available learners.

When general artificial intelligence is achieved, verification (the third step above) becomes obsolete and machine learning reduces to a two-step routine.

However, more intelligent is not necessarily better. In a data-rich environment, the more intelligent an algorithm, the better. But in a data-poor environment, algorithmic intelligence can lead to very poor results out-of-sample.

An example of this phenomenon can easily be constructed with neural networks. If the data sets for training and validation are finite, then for every calibrated and validated supervised learner that depends on a certain parameter set, another supervised learner can be found which

  • has identical training and validation performance, and
  • has a much larger parameter set.

The second learner can be constructed by simply appending a small piece of neural network to the first learner and making sure that the parameters in the added piece are such that none of its neurons ever get triggered on the training and validation data.

Due to the bigger parameter set of the second learner, potential differences between the two learners can arise when the algorithms encounter out-of-sample data that is dissimilar from the training and validation sets.
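This construction is easy to demonstrate. In the sketch below (all sizes and names are illustrative), the appended ReLU block is given hugely negative biases, so none of its neurons ever fire on the available data and both learners produce identical outputs there:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))          # stands in for training + validation inputs

W1 = rng.normal(size=(3, 4))           # first learner: a single linear map
def small(x):
    return x @ W1

# Appended bit of network: large negative biases keep every neuron silent on X
W_extra = rng.normal(size=(3, 8))
b_extra = -1e6 * np.ones(8)
W_out = rng.normal(size=(8, 4))
def big(x):
    hidden = np.maximum(x @ W_extra + b_extra, 0.0)   # all zeros on data like X
    return x @ W1 + hidden @ W_out                    # identical to small(x) there

print(bool(np.allclose(small(X), big(X))))            # True: same on the data

x_far = 1e7 * np.ones((1, 3))          # input far outside the data's range
print(bool(np.allclose(small(x_far), big(x_far))))    # may now differ
```

On inputs dissimilar enough from the training and validation data, the silent neurons can switch on and the two learners diverge.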

Conversely, whenever we use a supervised learner with a parameter set that is not minimised for the task at hand, we are consciously accepting out-of-sample errors.

For any application relying on financial market data, data can generally be considered scarce. Hence, this last observation is of enormous importance in finance. Put differently: unless we are minimising the parameters in our supervised learner to the absolute minimum required to achieve the desired training and validation performance, we are creating out-of-sample errors.

For finite data sets, we therefore need to adapt our verification procedure to find a parameter-minimised algorithm which has our desired calibration and validation performances. Unfortunately, this can be a computationally very expensive task, since a very large number of different algorithms need to be tried. The benefit, of course, is an improvement in out-of-sample performance.
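One way to organise such a verification procedure is to sweep candidate learners in order of increasing parameter count and keep the smallest one whose training and validation errors meet the desired tolerances. The model family (polynomial least squares) and the tolerance below are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, size=120)
y = np.sin(2 * x) + 0.05 * rng.normal(size=120)       # synthetic target
x_tr, y_tr, x_va, y_va = x[:90], y[:90], x[90:], y[90:]

def mse(coefs, x, y):
    return np.mean((np.polyval(coefs, x) - y) ** 2)

TOL = 0.01  # desired calibration and validation performance (assumed)
chosen = None
for degree in range(1, 10):            # smallest parameter sets first
    coefs = np.polyfit(x_tr, y_tr, degree)
    if mse(coefs, x_tr, y_tr) < TOL and mse(coefs, x_va, y_va) < TOL:
        chosen = degree                # smallest adequate learner found
        break                          # stop: more parameters buy nothing
print("smallest adequate degree:", chosen)
```

In realistic settings the candidate set is vastly larger than a handful of polynomial degrees, which is where the computational expense (and the case for HPC) comes in.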

Let us look at a numerical example. We take data for five US stocks and create a target return by taking the daily average return plus a small white noise error term. Figure 1 shows the calibration results for the daily returns of 2017 and we see that both the shallow and the deep learner calibrate well.
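The target construction can be sketched as follows, with simulated returns standing in for the five US stocks (the article's actual price data is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(3)
n_days, n_stocks = 250, 5                              # ~one trading year, 5 stocks
returns = 0.01 * rng.normal(size=(n_days, n_stocks))   # placeholder daily returns

noise = 0.001 * rng.normal(size=n_days)                # small white-noise error term
target = returns.mean(axis=1) + noise                  # daily average return + noise

# A shallow (linear) learner calibrated on this target tracks it closely
beta, *_ = np.linalg.lstsq(returns, target, rcond=None)
tracking = float(np.corrcoef(returns @ beta, target)[0, 1])
print(target.shape, round(tracking, 3))
```

Because the target is (up to noise) a linear combination of the inputs, even a very small learner can calibrate well, which is exactly the point of the experiment.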

In Figure 2 we see the out-of-sample performance on 2018 data. As would be expected, the tracking of the target data is not as good as in training, but, with the naked eye, we cannot yet see a noticeable difference between the shallow and the deep learner.

However, Figure 3 shows the out-of-sample performance on the daily returns for the period 2010 to 2016. Over this longer out-of-sample period, we see that, clearly, the error of the deep learner is much greater than that of the shallow learner. This is the type of out-of-sample error that a verification procedure looking for a parameter-minimised learner would have avoided.

In Figure 4, we see the same 2010-2016 out-of-sample error results for a variety of learners which are ranked on the x-axis by the number of their parameters. (Deep1 has 10 free parameters, Deep2 has 12, Deep3 has 14, and so forth.) All of the plotted learners were calibrated to achieve identical results on the 2017 data. We see that, clearly, the out-of-sample performance decays substantially as the number of parameters increases.

Our observations can be summarised in two points. First, general artificial intelligence can never exist in an environment of data scarcity. Second, the importance of high-performance computing is inversely related to the size of the available calibration and validation data sets.

For the creation of algorithmic trading strategies, data can generally be considered scarce. Not because there isn’t a lot of financial data around (there is plenty), but because most of it is useless for prediction purposes due to structural changes in the markets (political or otherwise). Consequently, the more computational power you have available for your verification procedure, the better your chances of making correct predictions.

Figure 1: We calibrate a shallow and a deep learner on the daily returns of 2017.

Figure 2: Out-of-sample performance on the daily returns of 2018.

Figure 3: Out-of-sample performance on the daily returns of 2010 to 2016.

Figure 4: Out-of-sample performance on the daily returns of 2010 to 2016 for a variety of learners ordered on the x-axis by increasing number of parameters. (Deep1 has 10 free parameters, Deep2 has 12, Deep3 has 14, and so forth.) We see that, clearly, the out-of-sample performance decays substantially as the number of parameters increases.



Written by Jan Witte


Jan is a quantitative analyst who is generally interested in the areas of numerical mathematical finance, systematic trading, and portfolio optimisation.

