NeuralProphet applied to Stock Price Prediction

In this article, we will experiment with using NeuralProphet to forecast stock prices. NeuralProphet is inspired by Meta's Prophet package and is often considered its successor. In a previous article, we used Prophet to perform a similar task and the results were unimpressive. Here we explore whether NeuralProphet performs better. From the NeuralProphet website:

NeuralProphet — Fusing traditional time series algorithms using standard deep learning methods, built on PyTorch, inspired by Facebook Prophet and AR-Net.

NeuralProphet uses a decomposable time series model with model components including trend, seasonality, event effects, and regression effects. They are combined using the equation

y(t) = T(t) + S(t) + E(t) + F(t) + A(t) + L(t),

where 

T(t) = Trend at time t
S(t) = Seasonal effects at time t
E(t) = Event and holiday effects at time t
F(t) = Regression effects at time t for future-known exogenous variables
A(t) = Auto-regression effects at time t based on past observations
L(t) = Regression effects at time t for lagged observations of exogenous variables

Such decomposable time series are very common in forecasting and later in the article, we will see how to tune each component of the above equation.

Disclaimer: This is a technical exercise to understand the inner workings and performance of the NeuralProphet model. Stock trading is risky, and I will not be responsible for your losses.

Problem Statement

We aim to predict the daily adjusted closing prices of Vanguard Total Stock Market ETF (VTI), using data from the previous N days. In this experiment, we will use 6 years of historical prices for VTI from 2013-01-02 to 2018-12-28, which can be easily downloaded from Yahoo Finance. After downloading, the dataset looks like this:

Downloaded dataset for VTI.

Altogether, we have 1509 days of data to play with. Note that Saturdays and Sundays are not included in the dataset above. A plot of the adjusted closing price in the entire dataset is shown below:

Adjusted closing prices from 2013-01-02 to 2018-12-28.

We calculate daily returns (in percent, consistent with the price conversion code later in the article) using the formula

r(t) = (p(t) / p(t-1) - 1) × 100,

where r(t) and p(t) denote the daily return and adjusted closing price on day t respectively. To observe the distribution of the daily returns, we plot the distribution plot below:

Distribution plot of the daily returns.
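The return calculation can be sketched in plain Python. This is a minimal illustration, not the notebook's code: the helper name daily_returns_pct and the sample prices are made up, and the percent scaling is an assumption inferred from the price conversion code later in the article.

```python
def daily_returns_pct(prices):
    """Percent change between consecutive prices: 100 * (p_t / p_{t-1} - 1)."""
    return [100.0 * (curr / prev - 1.0) for prev, curr in zip(prices, prices[1:])]


# Made-up prices for illustration only (not VTI data)
prices = [100.0, 102.0, 99.96]
returns = daily_returns_pct(prices)  # one fewer element than prices
```

Note that the first day has no return, so a series of N prices yields N-1 returns.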

As can be observed above, the distribution of the daily returns is approximately Gaussian. To verify this further, we draw a probability plot using the scipy.stats package, shown below. By default, the scipy.stats probability plot compares the sample distribution against the Gaussian distribution: the closer the points lie to the straight red line, the better the Gaussian fit. We observe that the daily returns fit the Gaussian distribution quite well, except at the extreme values (beyond roughly ±2.5).

To effectively evaluate the performance of NeuralProphet, running one forecast at a single date is not enough. Instead, we will perform various forecasts at different dates in this dataset, and average the results. For all forecasts, we will compare the NeuralProphet method with the Last Value method.

To evaluate the effectiveness of our methods, we will use the root mean square error (RMSE), mean absolute percentage error (MAPE), and mean absolute error (MAE) metrics. For all metrics, the lower the value, the better the prediction.
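For reference, the three metrics can be written out as follows. This is a minimal sketch using only the standard library; the function names and the sample numbers are illustrative and are not taken from the notebook.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(y_true, y_pred)) / len(y_true)


# Made-up values for illustration only
actual = [100.0, 102.0, 104.0]
predicted = [101.0, 101.0, 101.0]
scores = (rmse(actual, predicted), mape(actual, predicted), mae(actual, predicted))
```

RMSE penalizes large errors more heavily than MAE, while MAPE expresses the error relative to the price level.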

Training and Validation

To perform a forecast, we need training and validation data. We will use 3 years of data as the train set, which corresponds to 756 days since there are about 252 trading days in a year (252*3 = 756). We will use the next 1 year of data to perform validation, which corresponds to 252 days. In other words, for each forecast we make, we need 756+252 = 1,008 days of data for model training and validation. The model will be trained using the train set, and model hyperparameters will be tuned using the validation set.

To tune the hyperparameters, we will use the moving window validation method. This is best explained using an example. Suppose we have 896 days of data in all and we wish to perform a forecast at the 857th day with forecast horizon = 40 days. As a heuristic, for a forecast horizon H, we generally make a forecast every H/2 periods. With a train set of 756 days we will be able to do validation 4 times (see diagram below). The error metrics from these 4 validations will be averaged, and model hyperparameters will be chosen based on the combination that gives the lowest averaged error metrics. With these optimum hyperparameters, we will then do a forecast on the 857th day and report the results.

Moving window validation example.
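The arithmetic of the example above can be checked with a few lines of Python. This is a sketch under stated assumptions: days are 1-indexed, the step is H/2 rounded down, and the helper name validation_starts is hypothetical.

```python
def validation_starts(forecast_day, train_size, horizon):
    """1-indexed days on which each validation forecast begins.

    The first validation forecast starts right after the first full train
    window; the last one must end the day before the final forecast day.
    """
    step = horizon // 2                 # heuristic: forecast every H/2 periods
    first = train_size + 1              # day after the first train window
    last = forecast_day - horizon       # last start that fits before forecast_day
    return list(range(first, last + 1, step))


starts = validation_starts(forecast_day=857, train_size=756, horizon=40)
# starts == [757, 777, 797, 817], i.e. 4 validation forecasts
```

Each validation forecast starting on day s would be trained on the 756 days immediately before s, so the train window moves along with the forecast date.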

In what follows, we will first perform a forecast on the 1009th day of our dataset, with a forecast horizon of 21 days (note there are about 21 trading days in a month, excluding weekends). For NeuralProphet, we will use the first 1008 days as training and validation set, with a 756:252 split as mentioned above. We will first explain the Last Value method in the next section.

Last Value

In the Last Value method, we simply set the prediction as the last observed value. In our context, this means we set the current adjusted closing price as the previous day’s adjusted closing price. This is the most cost-effective forecasting model and is commonly used as a benchmark against which more sophisticated models can be compared. There are no hyperparameters to be tuned here. Our forecast on the 1009th day of our dataset, for a forecast horizon of 21 days is shown below.

Predictions on test set using Last Value method.

The forecast above gives a RMSE of 1.89, MAPE of 1.59%, and MAE of 1.80.
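A minimal sketch of the Last Value benchmark is shown here, assuming the last observed price is carried flat across the whole forecast horizon. The function name and the sample prices are illustrative only.

```python
def last_value_forecast(history, horizon):
    """Repeat the most recent observation for every step of the horizon."""
    if not history:
        raise ValueError("history must contain at least one observation")
    return [history[-1]] * horizon


# Made-up prices for illustration only
forecast_prices = last_value_forecast([140.2, 141.0, 139.5], horizon=3)
```

Because it needs no fitting at all, any model that cannot beat this baseline is adding cost without adding accuracy.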

NeuralProphet with no Hyperparameter Tuning

To run a forecast using NeuralProphet, use the code below.

from neuralprophet import NeuralProphet, set_random_seed

train_size = 252*3                     # Use 3 years of data as train set
val_size = 252                         # Use 1 year of data as validation set
train_val_size = train_size + val_size # Size of train+validation set
i = train_val_size                     # Day to forecast
H = 21                                 # Forecast horizon

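random_seed = 100            # Assumed value; any fixed integer works
set_random_seed(random_seed) # Set a random seed for reproducibility

# df_nprophet holds the series in NeuralProphet's expected format:
# a 'ds' column (dates) and a 'y' column (here, the daily returns)
m = NeuralProphet()
m.set_plotting_backend("plotly-static")
metrics = m.fit(df_nprophet[i-train_val_size:i])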

# Create dataframe with the dates we want to predict
future = m.make_future_dataframe(df_nprophet[i-train_val_size:i], n_historic_predictions=True, periods=H)

# Predict
forecast = m.predict(future)

For a quick visualization, you can plot your forecast using NeuralProphet as such:

m.plot(forecast)

Predictions for stock returns plotted using NeuralProphet.

This is a simple model with a trend, a weekly seasonality, and a yearly seasonality estimated by default. You can also look at the individual components separately as below.

m.plot_components(forecast);

Note that the weekly seasonality plot is almost flat, which means NeuralProphet did not detect any meaningful intra-week pattern in this data.

The individual coefficient values can also be plotted as below to gain further insights.

m.plot_parameters()

Note that the Seasonality: yearly and Seasonality: weekly charts shown above each display a single period of the corresponding component.

In the above chart for the forecast, the forecasts shown are for stock returns. We can convert them back to price using code as shown:

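# Convert predicted returns (in percent) back to prices by compounding
# from the last observed price. preds_list is assumed to hold the H
# predicted daily returns, e.g. forecast['yhat1'].iloc[-H:].
est_adj_close = []
prev_tg = df.loc[i-1, 'adj_close']  # Last known adjusted closing price
for n in range(H):
    prev_tg = (float(preds_list.iloc[n])/100+1)*prev_tg  # Apply day n's return
    est_adj_close.append(prev_tg)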

After doing so, the forecast of stock prices is shown below.

The forecast above gives a RMSE of 1.60, MAPE of 1.35%, and MAE of 1.52.

NeuralProphet with Hyperparameter Tuning — Changepoints

Time series often have abrupt changes in their trajectories. The flexibility of the trend can be adjusted using the parameter n_changepoints, which sets the number of potential changepoints at which the growth rate may change. Increasing n_changepoints makes the trend more flexible and can lead to overfitting; decreasing it makes the trend less flexible and can lead to underfitting. By default, this parameter is set to 10.

NeuralProphet uses a classic approach to model the trend as the combination of an offset m and a growth rate k. The trend effect at a time t1 is given by multiplying the growth rate k by the difference (t1−t0) in time since the starting point t0 on top of the offset m.

trend(t1) = m + k(t1 - t0) = trend(t0) + k(t1-t0)
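To see how changepoints interact with this equation, here is a minimal sketch of a continuous piecewise linear trend in which the growth rate changes at each changepoint. It is an illustration of the idea, not NeuralProphet's internal implementation; the function name, the choice t0 = 0, and all numbers are assumptions.

```python
def piecewise_linear_trend(t, offset, base_rate, changepoints, rate_deltas):
    """Trend at time t with t0 = 0.

    The slope starts at base_rate and changes by rate_deltas[j] once t
    passes changepoints[j]; the trend itself stays continuous.
    """
    trend = offset + base_rate * t
    for cp, delta in zip(changepoints, rate_deltas):
        if t > cp:
            trend += delta * (t - cp)
    return trend


# One changepoint at t = 5 where the growth rate drops from 1.0 to 0.5
before = piecewise_linear_trend(4, offset=10.0, base_rate=1.0,
                                changepoints=[5], rate_deltas=[-0.5])
after = piecewise_linear_trend(7, offset=10.0, base_rate=1.0,
                               changepoints=[5], rate_deltas=[-0.5])
```

Within a single segment the trend still follows trend(t1) = trend(t0) + k(t1 - t0), with k equal to that segment's growth rate. More changepoints mean more segments and hence a more flexible trend.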

We will test with values of 2, 5, 10, 15, 20 for n_changepoints. For each value of n_changepoints, we ran our forecast using the training and validation sets, and below are the results:

Tuning n_changepoints using RMSE, MAPE and MAE.

The above process took 8 mins in total. From the above, we observe that the optimum value for n_changepoints is 2. Next, we use this value and run our predictions on the test set. The results are shown below.

Predictions on test set after tuning n_changepoints.

The forecast above gives a RMSE of 2.75, MAPE of 2.35%, and MAE of 2.65.

NeuralProphet with Hyperparameter Tuning — Monthly Seasonality

Monthly seasonality can be set in NeuralProphet as such:

m = NeuralProphet()
m = m.add_seasonality(name="monthly", period=30.5, fourier_order=3)

Seasonality in NeuralProphet is modeled with the help of Fourier terms. The Fourier order above refers to the number of terms in the partial sum used to estimate the seasonality. For more details about the Fourier order, refer to the NeuralProphet documentation and the NeuralProphet paper.
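As a rough illustration of what a Fourier order means, the sketch below builds the sine and cosine features for a given period. It is a generic construction of Fourier terms, not code taken from NeuralProphet, and the function name is hypothetical.

```python
import math


def fourier_features(t, period, fourier_order):
    """Return [sin_1, cos_1, ..., sin_N, cos_N] for time t (in days)."""
    feats = []
    for n in range(1, fourier_order + 1):
        angle = 2.0 * math.pi * n * t / period
        feats.extend([math.sin(angle), math.cos(angle)])
    return feats


# A monthly seasonality with fourier_order=3 yields 2 * 3 = 6 features per day
monthly = fourier_features(t=3.0, period=30.5, fourier_order=3)
```

The seasonal effect is a weighted sum of these features, so a higher fourier_order allows a more wiggly seasonal shape at the risk of fitting noise.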

We will test with values of 2, 4, 6, 8, and 10 for fourier_order. For each value of fourier_order, we ran our forecast using the training and validation sets, and below are the results:

Tuning fourier_order using RMSE, MAPE and MAE.

The above process took 8 mins in total. From the above, we observe that the optimum value for fourier_order is 8. Next, we use this value and run our predictions on the test set. The results are shown below.

Predictions on test set after tuning fourier_order.

The forecast above gives a RMSE of 2.19, MAPE of 1.87%, and MAE of 2.12.

NeuralProphet with Hyperparameter Tuning — Events and Holidays

The holiday period may affect how the stock market performs. For example, there is a well-known phenomenon known as the Christmas rally where the stock market rises around the Christmas festive season. NeuralProphet is able to take events into account when performing predictions. First, we need to construct an events dataframe that looks like the one below, which can be easily constructed from a CSV file.

Holidays of the United States from 2013 to 2018 (only first 5 rows are shown).

Events can be set in NeuralProphet as such (events is the dataframe shown above):

m = NeuralProphet()
m.add_events("hols", lower_window=lower_window, upper_window=upper_window)
history_df = m.create_df_with_events(df, events)
m.fit(history_df)
future = m.make_future_dataframe(history_df, events_df=events, n_historic_predictions=True, periods=H)
forecast = m.predict(future)
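The lower_window and upper_window arguments in the call above control how many days around each event are also treated as affected. The effect can be pictured with a minimal plain-Python sketch; it is not NeuralProphet's internal code, the helper name expand_event_window is hypothetical, and it assumes a symmetric window counted in calendar days rather than trading days.

```python
from datetime import date, timedelta


def expand_event_window(event_dates, window):
    """Return every date within +/- window days of any event date, sorted."""
    days = set()
    for d in event_dates:
        for offset in range(-window, window + 1):
            days.add(d + timedelta(days=offset))
    return sorted(days)


christmas = [date(2017, 12, 25)]
affected = expand_event_window(christmas, window=1)
# affected covers 2017-12-24, 2017-12-25 and 2017-12-26
```

With window=0 only the holiday itself is flagged, which is why the window size is worth tuning.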

The window size hyperparameter extends the holiday out to [lower_window, upper_window] days around the date. For simplicity, we assume lower_window and upper_window have the same magnitude, i.e. |lower_window| = upper_window = window size. We will test with values of 0, 1, and 2 for window size. For each value of window size, we ran our forecast using the training and validation sets, and below are the results:

Tuning window size for events using RMSE, MAPE and MAE.

The above process took 5 mins in total. From the above, we observe that the optimum value for window size is 1. Next, we use this value and run our predictions on the test set. The results are shown below.

Predictions on test set after tuning window_size.

The forecast above gives a RMSE of 2.05, MAPE of 1.74%, and MAE of 1.96.

NeuralProphet with Hyperparameter Tuning — Autoregression

Autoregression is a time series model that uses observations from previous time steps as inputs to a regression equation to predict the value at the next time step.
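To make the idea concrete, the sketch below turns a series into (lagged inputs, target) pairs, which is the kind of data an autoregressive model with a given number of lags is fitted on. It is a generic illustration with a hypothetical helper name and made-up values, not NeuralProphet's own data preparation.

```python
def make_lagged_samples(series, n_lags):
    """Pair each value with the n_lags values that immediately precede it."""
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append(series[t - n_lags:t])
        targets.append(series[t])
    return inputs, targets


# Made-up daily returns for illustration only
sample_returns = [0.5, -0.2, 0.1, 0.3, -0.4]
lag_inputs, lag_targets = make_lagged_samples(sample_returns, n_lags=2)
```

The first n_lags observations have no complete history, so a series of N values yields N - n_lags training pairs.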

Autoregression in NeuralProphet is determined by the n_lags parameter. We will test with values of 0, 2, 5, and 10 for n_lags. For each value of n_lags, we ran our forecast using the training and validation sets, and below are the results:

Tuning n_lags for autoregression using RMSE, MAPE and MAE.

The above process took 7 mins in total. From the above, we observe that the optimum value for n_lags is 2. Next, we use this value and run our predictions on the test set. The results are shown below.

Predictions on test set after tuning n_lags.

The forecast above gives a RMSE of 1.18, MAPE of 0.89%, and MAE of 1.01.

NeuralProphet with Hyperparameter Tuning — All Hyperparameters Combined

The next step is to test combinations of the hyperparameters to find the optimum set. Testing every combination of the values above in a grid search would have taken 5 × 5 × 3 × 4 × 7 mins = 2,100 mins = 35 hours in total. To save time, we refer to the results above and test only the values below:

n_changepoints_list = [2, 5]
fourier_order_list = [6, 8]
window_list = [1]         
n_lags_list = [0, 2] 
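The size of the reduced grid can be enumerated with itertools.product, as in the sketch below. The 7-minute figure comes from the article's own estimate; the variable names grid, full_grid_size, and full_grid_hours are illustrative.

```python
from itertools import product

n_changepoints_list = [2, 5]
fourier_order_list = [6, 8]
window_list = [1]
n_lags_list = [0, 2]

# Every (n_changepoints, fourier_order, window, n_lags) combination to test
grid = list(product(n_changepoints_list, fourier_order_list,
                    window_list, n_lags_list))

# The full grid from the earlier sections, for comparison
full_grid_size = 5 * 5 * 3 * 4            # 300 combinations
full_grid_hours = full_grid_size * 7 / 60  # at 7 minutes each
```

The reduced grid has 8 combinations instead of 300, which is what makes the combined search practical.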

Further, we tried the above hyperparameters firstly without the events hyperparameter, and secondly with the events hyperparameter. Below are the results without the events hyperparameter.

Below are the results with the events hyperparameter.

From the above, we observe that the optimum values for n_changepoints, fourier_order, window, and n_lags are 2, 6, 1, and 2 respectively, with events enabled. Next, we use these values and run our predictions on the test set. The results are shown below.

Predictions for test set after tuning n_changepoints, fourier_order, window_size, n_lags and events.

The forecast above gives a RMSE of 1.04, MAPE of 0.81%, and MAE of 0.91.

At this point it will be good to consolidate our results from above:

Performance evaluation of our various methods, based on forecast made for the 1009th day.

By tuning all the hyperparameters combined, we achieved superior performance using NeuralProphet for our forecast on the 1009th day. RMSE, MAPE, and MAE are the lowest when all hyperparameters are tuned, beating out all other methods including the Last Value benchmark. Next, let’s see how NeuralProphet works for other days.

Forecasts on Multiple Days

Having observed the effect of hyperparameter tuning above, here we will extend our methodology to run forecasts on multiple days. We will start by performing a forecast on the 1009th day of our dataset, and repeat this forecast every 42 days. Since we have 1509 days in our dataset, we will make 12 forecasts in all. For each forecast, we will stick to a forecast horizon of 21 days. Also, for each forecast on day t, we use the optimum hyperparameters from above.
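The count of 12 forecasts can be verified with a short calculation. This sketch assumes 1-indexed days and that a forecast is only made when its full 21-day horizon fits inside the dataset; the helper name is hypothetical.

```python
def forecast_start_days(first_day, n_days, horizon, step):
    """Days on which a forecast starts such that the horizon fits in the data."""
    last_start = n_days - horizon + 1
    return list(range(first_day, last_start + 1, step))


days = forecast_start_days(first_day=1009, n_days=1509, horizon=21, step=42)
# len(days) == 12, from day 1009 through day 1471
```

The step of 42 days is twice the horizon, so consecutive forecast windows do not overlap.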

Predictions on multiple days using NeuralProphet.

From the above, we observe that not all forecasts are good. There are certain days where the direction and level of the forecasts match the actual values well and certain days where the direction and level of the forecasts are totally off. Below are the performance results of each forecast.

Performance evaluation of each forecast.

Let’s compare the above results with the Last Value method shown below.

Predictions on multiple days using Last Value method.

NeuralProphet returned a mean RMSE (averaged over the 12 forecasts) of 2.15, a mean MAPE of 1.41%, and a mean MAE of 1.88. The Last Value method returned a mean RMSE of 2.53, a mean MAPE of 1.69%, and a mean MAE of 2.26. In this case, NeuralProphet is superior on all three metrics. I found this interesting because in my previous experiment with Prophet (not NeuralProphet), the Last Value method turned out to be better. This time, NeuralProphet redeemed itself.

You can find my Jupyter Notebook here. Feel free to play with it, and if you manage to find hyperparameters such that NeuralProphet can consistently beat the Last Value method, please share in the comments below! Also, please give a star to the repo if you find it useful!