Can machine learning assist in investment processes?

Stephán Engelbrecht and Alungile Gcaza, co-founders of Mazi NextGen

There is a solid case to be made that artificial intelligence (AI) or machine learning (ML) has been the central theme in the investment community in 2023. According to an article in Business Insider, AI-related terms were mentioned 7,358 times across all companies in the S&P 500 during their second-quarter earnings calls. That is a staggering number of mentions! The management teams of companies, from manufacturers of GPUs (Nvidia) to grocery retailers (Kroger), have touted AI, whether the demand for it or its capabilities, as their competitive advantage in the current operating environment.

Investment managers have also been touting how AI and ML will disrupt almost every business sector. They are probably correct. But the one area where many investment managers have been downplaying the impact of ML is in their own industry. This could be for several reasons, from an ostrich mentality of hoping that it will not impact their lucrative industry, to trying to hide the progress they have made in ML from their competitors.

Let’s hope that it is more the latter than the former…

In this piece, we will highlight two areas where ML can assist investment managers in improving their investment processes: asset allocation and stock screening. These are but two of several areas where ML can assist in the investment management industry.

Asset allocation

In his memo to clients of November 20, 2001, titled “You can’t predict. You can prepare”, Howard Marks of Oaktree Capital made the following statement: “In my opinion, the key to dealing with the future lies in knowing where you are, even if you can’t know precisely where you’re going.”

Forecasting the macro environment is notoriously difficult. It is common to have experts argue over what macro environment we are currently facing, not to mention the environment we will face in the future. This is because the macro environment is so intricate, with many indicators and decision-makers. It is thus not surprising that different experts will focus on different indicators, and often these indicators may point in different directions.

ML can assist investment managers and asset allocators in identifying the current market environment by fairly considering many macro indicators and comparing them to historic experiences. In this example, we consider 21 variables, comprising economic indicators, local and global interest rates, local and global market indicators, currencies, and commodities. Although we believe these variables capture a significant portion of the information about the economic environment, there is no limit to the number of variables that can be considered.

Chart 1 (above) shows the different market regimes identified by reducing 21 economic, market, and commodity indicators through Principal Component Analysis (PCA) and then grouping the Principal Components (PCs) using K-Means Clustering. The four regimes identified are created by grouping the time periods where the market environments were most similar.
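The two steps named in the chart description, PCA followed by K-Means clustering, can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation: the data are random numbers standing in for 120 months of 21 indicators, the choice of five principal components is an assumption, and the function names are hypothetical. Only the 21 variables and the four regimes come from the article.

```python
import numpy as np

def pca_scores(X, n_components):
    """Standardise each column, then project onto the top principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # SVD of the standardised data: the rows of Vt are the principal directions.
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = (S ** 2) / np.sum(S ** 2)
    return Z @ Vt[:n_components].T, explained[:n_components]

def nearest_centre(points, centres):
    """Index of the closest centre (Euclidean distance) for every point."""
    dists = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per row and the centres."""
    rng = np.random.default_rng(seed)
    centres = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = nearest_centre(points, centres)
        # Move each centre to the mean of its members (keep it if it has none).
        new_centres = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return nearest_centre(points, centres), centres

# Synthetic stand-in data: 120 monthly observations of 21 indicators.
rng = np.random.default_rng(42)
X = rng.normal(size=(120, 21))

scores, explained = pca_scores(X, n_components=5)  # 5 components is an assumption
labels, centres = kmeans(scores, k=4)              # four regimes, as in the article

print("share of variance kept:", round(float(explained.sum()), 3))
print("months per regime:", np.bincount(labels, minlength=4))
```

With real indicator data, each month would receive a regime label, and contiguous runs of the same label would correspond to the shaded regimes in a chart of this kind.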

The 21 variables are fed to the ML algorithm, which uses principal component analysis and clustering to collate economic and market conditions and to calculate, using Euclidean distances, how close past economic environments are to the current one. This information can then be used to examine how investment factors, sectors, asset classes, or fund managers performed over the investment period that followed each of those similar environments.
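The proximity calculation can be illustrated with a few lines of code. The sketch below ranks past periods by their Euclidean distance to the current period; the period labels and the two-component vectors are made up for illustration, and the helper name is hypothetical.

```python
import math

def nearest_past_periods(history, current, top_n=3):
    """Rank past periods by Euclidean distance to the current environment.

    history: dict mapping a period label to its vector of principal components
    current: vector of principal components for the latest period
    """
    ranked = sorted(
        ((label, math.dist(vector, current)) for label, vector in history.items()),
        key=lambda pair: pair[1],
    )
    return ranked[:top_n]

# Made-up two-component vectors, purely for illustration.
history = {
    "period_A": (0.0, 0.0),
    "period_B": (3.0, 4.0),
    "period_C": (1.0, 2.0),
}
current = (1.0, 0.5)

closest = nearest_past_periods(history, current, top_n=2)
for label, distance in closest:
    print(label, round(distance, 3))
```

Here period_A is the closest match and period_C the second closest. In practice, one would then look up what happened to each factor, sector, or asset class in the months following those closest periods.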

Stock screening

ML algorithms can assist investment professionals by identifying high-probability investment opportunities based on fundamental, momentum, quality, growth, and technical variables that various academic studies have highlighted as being associated with successful investment outcomes. These algorithms then analyse the entire investment universe and identify potential investment opportunities the investment team may have missed.

It is important to highlight that these are not simple linear screening models. These ML algorithms are non-linear and multi-layered. In statistical terms, this means that the model will consider conditionalities when calculating the probability of success. In a layperson’s words, the algorithm will identify that, as a simplistic example, a company exhibiting low price volatility and reasonable price momentum that historically was able to generate a return on equity (ROE) of 20% and is currently trading on a 10x price-earnings multiple is more attractive than a company with low or negative price momentum that was only able to generate an ROE of 8% trading on the same 10x PE multiple. The algorithms will identify these conditionalities by optimising classification trees that attempt to group companies as outperformers and underperformers, given their current characteristics.
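To make the idea of conditionality concrete, here is a toy classification tree written from scratch. It is a sketch under loud assumptions: the eight companies and their labels are invented, only two features are used (ROE and price momentum, with the PE multiple held equal as in the example above), and a production model would use far more data, more variables, and typically many trees.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (1 = outperformer)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(rows, labels):
    """Return (feature, threshold, weighted impurity) of the best split, or None."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[2]:
                best = (f, t, score)
    return best

def build_tree(rows, labels, depth=0, max_depth=2):
    """Recursively split until the node is pure or max_depth is reached."""
    split = best_split(rows, labels) if depth < max_depth and gini(labels) > 0 else None
    if split is None:
        return {"prob": sum(labels) / len(labels)}  # leaf: share of outperformers
    f, t, _ = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
        "right": build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth),
    }

def predict_prob(tree, row):
    """Walk the tree and return the leaf's share of outperformers."""
    while "prob" not in tree:
        tree = tree["left"] if row[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["prob"]

# Invented training data: (ROE, price momentum) -> 1 if the share outperformed.
rows = [(0.20, 0.10), (0.22, 0.05), (0.25, 0.08), (0.21, -0.10),
        (0.23, -0.05), (0.08, 0.10), (0.07, -0.08), (0.09, 0.06)]
labels = [1, 1, 1, 0, 0, 0, 0, 0]

tree = build_tree(rows, labels)
print(predict_prob(tree, (0.20, 0.10)))   # high ROE, positive momentum
print(predict_prob(tree, (0.20, -0.08)))  # high ROE, negative momentum
print(predict_prob(tree, (0.08, -0.02)))  # low ROE
```

On this toy data the tree splits first on ROE and then, only within the high-ROE group, on momentum. That nested structure is the conditionality described above: a high ROE is treated as attractive only when momentum is also reasonable. The leaf probabilities of exactly 0 and 1 are an artefact of the tiny invented sample.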

These models are very successful at identifying high-probability investment opportunities. To highlight this, let’s consider the age-old critique of active investment management made by Burton Malkiel in his 1973 book, A Random Walk Down Wall Street, that “…a blindfolded monkey throwing darts at a newspaper’s financial pages could select a portfolio that would do just as well as one carefully selected by experts”.

We created a simulation where the computer randomly selects 20 shares every month from the 100 most liquid shares listed on the Johannesburg Stock Exchange (JSE) at the time and holds the shares for three months. We did this every month over the 10 years from July 2013 to July 2023. We then repeated the simulation 1,000 times to get a distribution of the average annualised returns that our “dart-throwing monkey” could generate. This distribution is depicted by the black dotted line in Chart 5. We then recreated the experiment, but instead of offering the “dart-throwing monkey” all 100 most liquid shares, we first offered only the liquid shares with the highest probability of investment success at the time (depicted by the blue shaded histograms) and then only the shares with the lowest probability of investment success at the time (represented by the red shaded histograms).

The results are fascinating. The average annualised returns clearly become greater when the “dart-throwing monkey” can only “pick” from the population of high probability shares and lower when the “dart-throwing monkey” can only “pick” from the population of low probability shares. We are confident that humans can do slightly better than the “dart-throwing monkey”.
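The structure of the experiment can be sketched in a few lines. Everything numerical here is an assumption for illustration: the returns are synthetic, the model score is built to carry a weak signal by construction, the high- and low-probability pools are taken to be 30 shares each (the article does not state their size), only 200 simulations are run to keep the example fast, and the annualisation simply compounds the average three-month return four times.

```python
import random
import statistics

N_SHARES, N_MONTHS, N_PICKS = 100, 120, 20  # universe, 10 years, picks per month
N_SIMS, POOL_SIZE = 200, 30                 # assumptions; the article ran 1,000 sims

def synthetic_month(rng):
    """One month of (model score, 3-month forward return) pairs; purely synthetic."""
    month = []
    for _ in range(N_SHARES):
        score = rng.random()
        # Assumed weak link between score and return, swamped by noise.
        fwd_return = 0.02 + 0.04 * (score - 0.5) + rng.gauss(0.0, 0.10)
        month.append((score, fwd_return))
    return month

def simulate(pools, n_sims, rng):
    """Each simulation picks N_PICKS shares at random per month from that month's pool."""
    results = []
    for _ in range(n_sims):
        quarterly = [statistics.fmean(rng.sample(pool, N_PICKS)) for pool in pools]
        # Simplified annualisation of the average 3-month return.
        results.append((1.0 + statistics.fmean(quarterly)) ** 4 - 1.0)
    return results

rng = random.Random(2023)
months = [sorted(synthetic_month(rng)) for _ in range(N_MONTHS)]  # ascending by score

all_pools = [[r for _, r in m] for m in months]
high_pools = [[r for _, r in m[-POOL_SIZE:]] for m in months]  # highest scores
low_pools = [[r for _, r in m[:POOL_SIZE]] for m in months]    # lowest scores

mean_all = statistics.fmean(simulate(all_pools, N_SIMS, rng))
mean_high = statistics.fmean(simulate(high_pools, N_SIMS, rng))
mean_low = statistics.fmean(simulate(low_pools, N_SIMS, rng))

print(f"all shares:       {mean_all:.1%}")
print(f"high-probability: {mean_high:.1%}")
print(f"low-probability:  {mean_low:.1%}")
```

Because the synthetic score is informative by design, the ordering of the three averages here only demonstrates the mechanics of the test. Whether a real model's scores shift the distribution in the same way is an empirical question, which is what the chart addresses.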


Difficult as it is to imagine, some investment managers in the late 1980s and early 1990s thought there would never be any use for personal computers in the investment industry. Fast forward to today, and it would be difficult to find any investment professional who is not highly proficient with some computer tool, whether that is Excel or a search engine. It is our opinion that ML algorithms are in the same nascent position today.

The above examples are two very simplistic methods of using ML in the investment process, and there are many more exciting examples and ideas.

It is the investment managers’ responsibility to their investors to start incorporating these powerful tools into their process and approach, or they risk falling behind.

Copyright. HedgeNews Africa – November 2023.

Stephán Engelbrecht and Alungile Gcaza are co-founders of Mazi NextGen, a partnership with Mazi Asset Management, where they focus on pioneering asset management systems and skills.

The Mazi NextGen Long Short Prescient RI Hedge Fund is a South African long/short equity hedge fund, using machine learning and artificial intelligence as a differentiator.

Engelbrecht is a CFA charterholder with a degree in financial mathematics and investment management from the University of Johannesburg and an MBA from Stellenbosch University. He is busy with a PhD in Finance with a focus on machine learning and the implications for asset management.

Gcaza has a BSc Honours in Statistical Sciences from the University of Cape Town.