1 Introduction

Long Short-Term Memory (LSTM) neural networks have become extremely popular in the deep learning and data science space in recent years. After being largely ignored for decades, this type of recurrent neural networks (RNN) have garnered significant attention due to their ability to model sequential data and capture long-term dependencies in data, making them perfectly suited for tasks such as language and speech processing, time series analysis, and much more. LSTM models get their power from their capacity to effectively handle the vanishing and exploding gradient problem seen with other RNN’s, enabling the modeling of sequential data with a great level of precision. As a result, LSTM networks have become a cornerstone of machine learning.

This paper dives into the history of LSTM neural networks, diving into previous literature and research studies conducted using LSTM networks. We then plan on putting an LSTM model to the test by seeing how accurately it can predict financial time series data, specifically looking at numerous companies listed on the New York Stock Exchange. While modeling financial data is an extremely difficult task due to the unpredictability of financial markets, we will explore whether they hold any potential to accurately predict short and medium term movements in stock prices. Our hope is this paper can spur further research into the ability to accurately model financial data and how such modeling can be improved in the years to come.

1.1 What is an LSTM Network?

LSTM networks are a type of recurrent neural network that works well with sequential data, such as time series data and text. LSTMs are designed to solve sequence prediction problems. Such sequence prediction problems involve predicting the next value based on a given input. A simple example is the input [1, 2, 3, 4, 5] and the sequence prediction model would output [6] as the next digit in the sequence. In theory, RNN’s would be able to solve such a problem. Historically, however, and when dealing with real-world datasets, RNN’s have faced the challenge of the weights being changed too quickly after a few cycles, meaning the results would be either too small that it won’t affect the output (vanishing gradients problem) or too large that it results in an overflow (exploding gradients problem). LSTMs overcome this challenge by regulating the weights with three “gates.” The Forget Gate decides what information to discard from the cell, the Input Gate decides which values from the input to update the memory cell, and the Output Gate decides what to output based on input and the memory of the cell.

1.2 Applications of LSTM Networks

Long short term memory (LSTM) models are used in a wide range of situations. The influence of the LSTM network has been notable in natural language modeling, speech recognition, machine translation, and other applications[1]. LSTM networks were mainly created to solve the exploding/vanishing gradient problem[1]. Their capacity to model and understand long range dependencies makes them critical in executing various tasks. The advantage of using LSTMs over other recurrent neural networks is that an LSTM is able to save the data for much longer periods[2] than their RNN counterparts. They are especially essential in the field of finance because they can solve several problems such as identifying complex patterns in past pricing data, forecasting stock prices and the movement of the financial markets as a whole. With their ability to handle complicated sequential data, LSTMs have revolutionized how we approach and resolve issues involving sequences of information. They have become an essential tool in many machine learning and artificial intelligence fields.

1.3 Previous Research

Trying to predict stock market returns is a tale as old as time. There has been extensive research published on the ability to forecast financial markets through machine learning [3]. However, very few papers have focused specifically on LSTM models, which are ideal for such a time-series prediction. These papers have all been published in the last few years, making research on this topic novel and additional research vital to further understand the prediction value of LSTM neural networks as it applies to the financial markets.

In one 2022 paper conducted by researchers at a Chinese university, the authors applied an LSTM model to Chinese stock market data going back three years and found that the LSTM model performed better than linear regression model [4]. They concluded their research by stating that “stock returns are predictable to some extent” and that LSTM models “can help improve the prediction” potential of such returns.

Another 2022 paper, this one published Swedish researcher Dr. Carmina Fjellström, looked at the ability of LSTM models to select better-performing stock portfolios [5]. She took a Swedish stock index with the 29 most-traded companies in Sweden (OMX30) and downloaded the daily closing prices dating back 18 years. After running an LSTM model, she compared the results of the portfolio selected by the LSTM to the index as a whole (all 29 stocks), as well as to a randomly chosen portfolio. Dr. Fjellström found that LSTM portfolio had the highest average daily return of the three.