Easily Optimize a Stock Portfolio using PyPortfolioOpt in Python
How to obtain stock data, analyze it and use PyPortfolioOpt to optimize a portfolio for max Sharpe ratio
In this article, we will fetch stock prices for the companies we are interested in including in our portfolio. We will then analyze the data to introduce the concepts of returns, volatility, the Sharpe ratio, Modern Portfolio Theory and the efficient frontier. Finally, we will use the PyPortfolioOpt library to optimize the portfolio and obtain the optimized weights and portfolio performance.
The code for the article is available in this Jupyter Notebook.
These are the necessary imports. Remember to pip install the following libraries if you do not already have them: pandas-datareader, PyPortfolioOpt and plotly.
from pandas_datareader.data import DataReader
from pypfopt.discrete_allocation import DiscreteAllocation, get_latest_prices
from pypfopt import EfficientFrontier
from pypfopt import risk_models
from pypfopt import expected_returns
from pypfopt import plotting
import copy
import numpy as np
import pandas as pd
import plotly.express as px
import matplotlib.pyplot as plt
import seaborn as sns
Get Stock Prices using pandas-datareader Library
Let’s get some data for the stock tickers that we want to include in our portfolio. The pandas-datareader library provides a method to pull stock price data from the web and store it in a DataFrame.
If you do not pass in the start and end dates, pandas-datareader falls back to a default date range (in recent versions, roughly the last five years), so it is safer to set both explicitly. Note that the available date range may differ between stocks depending on when they were listed.
start_date = '2013/01/01'
end_date = '2022/4/17'
tickers = ['MA', 'FB', 'V', 'AMZN', 'JPM', 'BA']
stocks_df = DataReader(tickers, 'yahoo', start = start_date, end = end_date)['Adj Close']
stocks_df.head()

Plot Individual Stock Prices
Here we use the plotly library to make a line plot of the prices. Unlike a Matplotlib plot, this is an interactive graph, and you can hover your mouse pointer over the lines for details.
fig_price = px.line(stocks_df, title='Price of Individual Stocks')
fig_price.show()

As seen above, Amazon seems to dominate the scale of the graph as the absolute price of the stock is very high. The graphs of all other stocks are flattened out. A graph like this is not very useful to compare the relative performance of the stocks. To address this, let’s see how we can better measure the performance of a stock by exploring the concepts of daily returns and volatility.
Daily Returns
The daily return of a stock is the fractional gain (or loss) on a given day relative to the previous day. It is given by

$$ r_t = \frac{P_t - P_{t-1}}{P_{t-1}} $$

where $P_t$ is the price of the stock on day $t$.
As it is a relative value, it provides a fairer comparison between stock returns regardless of absolute stock prices. The pct_change() method can be used to get the daily returns efficiently.
daily_returns = stocks_df.pct_change().dropna()
daily_returns.head()
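To make the definition concrete, here is a minimal sketch that computes daily returns by hand as (today's price - yesterday's price) / yesterday's price and compares the result with pct_change(). The price series is hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical closing prices for one stock over four days (illustrative only)
prices = pd.Series([100.0, 102.0, 99.96, 104.958])

# Manual formula: (today's price - yesterday's price) / yesterday's price
manual_returns = (prices - prices.shift(1)) / prices.shift(1)

# pandas shortcut used in the article
pct_returns = prices.pct_change()

# The first entry is NaN in both cases because day 0 has no previous day
print(manual_returns.dropna().round(4).tolist())
print(pct_returns.dropna().round(4).tolist())
```

Both lines print the same values (a 2% gain, a 2% loss, then a 5% gain), which is why dropna() is applied to the result in the article's code.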

Let’s plot out the daily returns of 2 of the stocks, Boeing (BA) and Visa (V), throughout the whole time period.
fig = px.line(daily_returns[['BA', 'V']], title='Daily Returns')
fig.show()

We see that they tend to fluctuate gently around 0. Notably, the fluctuations are much greater during periods of high volatility, such as the Covid crash in March 2020.
Volatility
Daily Volatility
Daily volatility measures how far the daily returns typically deviate from their average over the time period. Mathematically, it is the standard deviation of the daily returns. Volatility is one measure of risk: highly volatile investments tend to carry greater risk.
daily_returns.std()

Here we see that BA has a slightly higher volatility, compared to that of V. When we compare the density plots of their daily returns, we can see that V has a narrower curve with a higher peak, while BA has a wider curve indicating higher standard deviation and hence volatility.
sns.displot(data=daily_returns[['BA', 'V']], kind = 'kde', aspect = 2.5)
plt.xlim(-0.1, 0.1)

Annual Volatility
For completeness, annual volatility is the more commonly quoted measure. It can be calculated by multiplying the daily volatility by the square root of the number of trading days in a year, i.e. 252.
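As a rough sketch of that annualization step, the snippet below scales daily volatility by the square root of 252. The returns are randomly generated and the tickers AAA and BBB are hypothetical, so that the example runs on its own. With the article's daily_returns DataFrame, the same two lines at the end would apply.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 252  # conventional number of trading days in a year

# Hypothetical daily returns for two made-up tickers (random data, illustrative only)
rng = np.random.default_rng(0)
daily_returns = pd.DataFrame(
    rng.normal(0.0005, 0.01, size=(500, 2)), columns=["AAA", "BBB"]
)

# Daily volatility is the standard deviation of daily returns
daily_vol = daily_returns.std()

# Annual volatility scales it by the square root of the number of trading days
annual_vol = daily_vol * np.sqrt(TRADING_DAYS)
print(annual_vol)
```

The square root appears because variance, not standard deviation, adds up across days when daily returns are assumed to be uncorrelated.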
Plot Individual Cumulative Returns
The cumulative returns of the stock can be easily calculated by adding one to the daily returns and taking the cumulative product over the whole period. Here we plot the cumulative returns of stocks starting with an initial investment of $100 (i.e. how much would investing $100 at the start in each individual stock get you over the time period?) This is a fair comparison for the performance of the stocks.
def plot_cum_returns(data, title):
    daily_cum_returns = 1 + data.dropna().pct_change()
    daily_cum_returns = daily_cum_returns.cumprod()*100
    fig = px.line(daily_cum_returns, title=title)
    return fig

fig_cum_returns = plot_cum_returns(stocks_df, 'Cumulative Returns of Individual Stocks Starting with $100')
fig_cum_returns.show()
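One way to check the cumulative-return calculation is to note that compounding (1 + daily return) is the same as rescaling each price by the first price. The sketch below compares the two on a hypothetical price series.

```python
import pandas as pd

# Hypothetical prices (illustrative only)
prices = pd.Series([50.0, 55.0, 52.25, 60.0875])

# Approach used in the article: compound (1 + daily return), scaled to $100
growth = (1 + prices.pct_change()).cumprod() * 100

# Equivalent shortcut: rescale prices by the first price
rebased = prices / prices.iloc[0] * 100

# The first entry of `growth` is NaN because day 0 has no return
print(growth.dropna().round(2).tolist())
print(rebased.iloc[1:].round(2).tolist())
```

Because the first daily return is undefined, the compounded series starts on the second trading day, whereas the rebased series also includes the starting value of 100.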

Correlation Matrix between Stocks
The correlation matrix gives us the correlation coefficients between every pair of stocks. A correlation coefficient indicates the strength and direction of the linear relationship between two variables. It is a value from -1 to 1, with values closer to -1 or 1 indicating a stronger relationship.
A positive value indicates a positive relationship, i.e. the two variables move together.
A negative value indicates a negative relationship, i.e. the two variables are inversely related.
A zero value indicates no linear relationship.
Pandas has a convenient corr() method for us to generate the matrix.
corr_df = stocks_df.corr().round(2) # round to 2 decimal places
fig_corr = px.imshow(corr_df, text_auto=True, title = 'Correlation between Stocks')
fig_corr.show()

In general (though not always), stock prices tend to move together (rising in a bull market, falling in a bear market), hence the correlations are mostly positive, as shown above. Also notice how Boeing (BA) is only weakly correlated with the other stocks during this time period. This may be because it is in a very different industry, or because of the bad news that has hit it in recent years. And notice how the similar companies Mastercard (MA) and Visa (V) are almost perfectly correlated.
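A side note on the matrix above: it is computed on price levels. Series that trend in the same direction can show a high correlation in levels even when their day-to-day moves are unrelated, so a common alternative is to correlate daily returns instead (with the article's data, that would be daily_returns.corr()). The sketch below illustrates the difference on synthetic data. The two tickers are made up and their returns are drawn independently.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Two hypothetical stocks whose daily returns are drawn independently,
# but which both drift upward over time (synthetic data, illustrative only)
returns = pd.DataFrame(
    rng.normal(0.002, 0.01, size=(1000, 2)), columns=["AAA", "BBB"]
)
prices = 100 * (1 + returns).cumprod()

# Correlation of price levels picks up the shared upward trend
price_corr = prices.corr().loc["AAA", "BBB"]

# Correlation of daily returns measures co-movement from day to day
returns_corr = prices.pct_change().dropna().corr().loc["AAA", "BBB"]

print(f"Correlation of price levels:  {price_corr:.2f}")
print(f"Correlation of daily returns: {returns_corr:.2f}")
```

Here the price-level correlation comes out high while the return correlation stays close to zero, even though the two series are independent by construction.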
Expected Returns and Covariance Matrix
Similarly, the covariance matrix measures whether stocks move in the same direction (a positive covariance) or in opposite directions (a negative covariance).
It is used to calculate the volatility of the whole portfolio of stocks, which in turn is used by portfolio managers to quantify its risk. PyPortfolioOpt makes it easy to get this matrix, as well as the mean annual return of each stock, printed below. We need these as inputs to find our optimized portfolio later.
mu = expected_returns.mean_historical_return(stocks_df)
S = risk_models.sample_cov(stocks_df)

print(mu)

Portfolio Returns, Risk Free Rate, Volatility and the Sharpe Ratio
Similar to stocks, a portfolio has an expected return and volatility as well. Its expected return is obtained from the weights of each stock multiplied by the stock’s expected return above and then summing them together. Its volatility can be calculated from the covariance matrix as mentioned earlier.
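As an illustration of the two calculations just described, the sketch below computes a portfolio's expected return as the weighted sum of the stocks' expected returns, and its volatility as the square root of w' S w, where w is the weight vector and S is the covariance matrix. All numbers are hypothetical. In the article, mu and S come from PyPortfolioOpt instead.

```python
import numpy as np

# Hypothetical inputs for a three-stock portfolio (illustrative numbers only)
weights = np.array([0.5, 0.3, 0.2])   # fractions of capital, summing to 1
mu = np.array([0.10, 0.20, 0.15])     # expected annual return of each stock
S = np.array([                        # annualized covariance matrix
    [0.04, 0.01, 0.00],
    [0.01, 0.09, 0.02],
    [0.00, 0.02, 0.0625],
])

# Expected return: weighted sum of the individual expected returns
portfolio_return = weights @ mu

# Volatility: square root of the portfolio variance w' S w
portfolio_volatility = np.sqrt(weights @ S @ weights)

print(f"Expected return: {portfolio_return:.2%}")
print(f"Volatility:      {portfolio_volatility:.2%}")
```

With these inputs the expected return is 14% and the portfolio variance is 0.026, giving a volatility of about 16.1%. This is lower than the weighted average of the individual volatilities (about 24%), which is the diversification effect the covariance matrix captures.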
Sharpe Ratio
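The Sharpe ratio of a portfolio measures its return in excess of the risk-free rate (e.g. the U.S. Treasury rate) per unit of risk (standard deviation). It is given by:

$$ \text{Sharpe ratio} = \frac{R_p - R_f}{\sigma_p} $$

where $R_p$ is the return of the portfolio, $R_f$ is the risk-free rate and $\sigma_p$ is the standard deviation of the portfolio's excess return. (Formula from Investopedia.)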
A higher Sharpe ratio is more desirable because it means greater risk-adjusted performance. In fact, the max Sharpe ratio portfolio is the optimized portfolio we want.
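A small sketch may help here. The helper below, whose name sharpe_ratio is introduced only for this example, computes (portfolio return - risk-free rate) / portfolio volatility and compares two hypothetical portfolios, using the 2% risk-free rate assumed later in the article.

```python
def sharpe_ratio(portfolio_return, portfolio_volatility, risk_free_rate=0.02):
    """Excess return per unit of volatility (all inputs annualized)."""
    return (portfolio_return - risk_free_rate) / portfolio_volatility

# Hypothetical portfolios (illustrative numbers only)
steady = sharpe_ratio(0.10, 0.10)  # 10% return, 10% volatility
racy = sharpe_ratio(0.16, 0.28)    # 16% return, 28% volatility

print(f"Steady portfolio: {steady:.2f}")
print(f"Racy portfolio:   {racy:.2f}")
```

The second portfolio has the higher raw return, but the first has the higher Sharpe ratio (0.80 versus 0.50) because it earns more excess return per unit of risk taken.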
Visualize the Efficient Frontier and max Sharpe Ratio Portfolio
The Modern Portfolio Theory (MPT) is a model for developing an asset portfolio that maximizes expected return for a given level of risk. The theory assumes that the average human is risk-averse. Hence for a given level of expected return, the least risky portfolio is always preferred. The set of optimal portfolios that offer the lowest risk (volatility) for a given level of expected return forms the Efficient Frontier. It is represented by a curve on a Return vs Volatility graph.
Of course, the max Sharpe ratio portfolio lies on the efficient frontier.
To represent everything visually, the function below (code from the documentation) generates 1000 portfolios of our stocks with random weights and plots their returns and volatility. The efficient frontier and the max Sharpe ratio portfolio are also plotted on the graph.
def plot_efficient_frontier_and_max_sharpe(mu, S):
    # Optimize portfolio for maximal Sharpe ratio
    ef = EfficientFrontier(mu, S)

    fig, ax = plt.subplots(figsize=(8,6))
    ef_max_sharpe = copy.deepcopy(ef)
    plotting.plot_efficient_frontier(ef, ax=ax, show_assets=False)

    # Find the max sharpe portfolio
    ef_max_sharpe.max_sharpe(risk_free_rate=0.02)
    ret_tangent, std_tangent, _ = ef_max_sharpe.portfolio_performance()
    ax.scatter(std_tangent, ret_tangent, marker="*", s=100, c="r", label="Max Sharpe")

    # Generate random portfolios
    n_samples = 1000
    w = np.random.dirichlet(np.ones(ef.n_assets), n_samples)
    rets = w.dot(ef.expected_returns)
    stds = np.sqrt(np.diag(w @ ef.cov_matrix @ w.T))
    sharpes = rets / stds
    ax.scatter(stds, rets, marker=".", c=sharpes, cmap="viridis_r")

    # Output
    ax.set_title("Efficient Frontier with Random Portfolios")
    ax.legend()
    plt.tight_layout()
    plt.show()

plot_efficient_frontier_and_max_sharpe(mu, S)

Hopefully the graph gives you a better idea of what the efficient frontier is. In the graph above, we see that no portfolio can lie to the left of the frontier (otherwise we would have a portfolio with the same expected return as one on the frontier but with lower volatility).
Get Weights for Optimized Portfolio
The max Sharpe ratio portfolio and the weights of its constituent stocks can be easily obtained with PyPortfolioOpt using the following code. Here we use a risk free rate of 2% which corresponds roughly to the U.S. treasury yield today. The weights are given as an OrderedDict object, which is converted to a DataFrame for easier reference.
ef = EfficientFrontier(mu, S)
ef.max_sharpe(risk_free_rate=0.02)
weights = ef.clean_weights()
print(weights)

weights_df = pd.DataFrame.from_dict(weights, orient = 'index')
weights_df.columns = ['weights']
weights_df

Notice how the weights are dominated by high-performing stocks such as Amazon (high returns with relatively moderate volatility), while some weights are 0. We will come back to this towards the end of the article.
Expected Annual Return, Annual Volatility and Sharpe Ratio for Optimized Portfolio
The portfolio_performance() method allows us to view the expected annual return, annual volatility and Sharpe ratio of the portfolio. These values should be the same as those of the ‘starred’ portfolio in the efficient frontier plot above.
expected_annual_return, annual_volatility, sharpe_ratio = ef.portfolio_performance()

print('Expected annual return: {}%'.format((expected_annual_return*100).round(2)))
print('Annual volatility: {}%'.format((annual_volatility*100).round(2)))
print('Sharpe ratio: {}'.format(sharpe_ratio.round(2)))

Generate Portfolio with Optimized Weights
Now let us generate the portfolio with optimized weights and plot out its cumulative returns over time.
stocks_df['Optimized Portfolio'] = 0

for ticker, weight in weights.items():
    stocks_df['Optimized Portfolio'] += stocks_df[ticker]*weight

stocks_df.head()
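A caveat on this step: multiplying price levels by the weights is equivalent to holding `weight` shares of each stock, so stocks with a high share price contribute more to the series than their weight suggests. An alternative, shown here only as a sketch and not as the method used in this article, treats the weights as fractions of capital and applies them to daily returns (which implicitly assumes daily rebalancing). The prices, tickers and weights below are hypothetical.

```python
import pandas as pd

# Hypothetical prices and weights (illustrative only)
prices = pd.DataFrame({
    "AAA": [10.0, 11.0, 12.1],          # cheap stock rising 10% per day
    "BBB": [1000.0, 1000.0, 1000.0],    # expensive stock that stays flat
})
weights = {"AAA": 0.5, "BBB": 0.5}

# Approach in the article: weight the price levels
weighted_prices = sum(prices[t] * w for t, w in weights.items())
growth_from_prices = weighted_prices.iloc[-1] / weighted_prices.iloc[0]

# Alternative: weight the daily returns (capital split 50/50, rebalanced daily)
daily_returns = prices.pct_change().dropna()
portfolio_returns = daily_returns.mul(pd.Series(weights), axis=1).sum(axis=1)
growth_from_returns = (1 + portfolio_returns).cumprod().iloc[-1]

print(f"Growth from weighted prices:  {growth_from_prices:.4f}")
print(f"Growth from weighted returns: {growth_from_returns:.4f}")
```

In this toy case the price-weighted series barely moves because the flat $1000 stock swamps it, while the return-weighted portfolio grows by 10.25%. The figures quoted in this article come from the price-weighted calculation above.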

Plot Cumulative Returns of Optimized Portfolio
fig_cum_returns_optimized = plot_cum_returns(stocks_df['Optimized Portfolio'], 'Cumulative Returns of Optimized Portfolio Starting with $100')
fig_cum_returns_optimized.show()

Final Words
We see from the above plot that if we had invested $100 at the start of 2013, we would have ended up with $1146 by the end of the period (April 2022), which is pretty lucrative.
However, recall that the weights in the portfolio are dominated by high-performing stocks such as Amazon (high returns with relatively moderate volatility), while some weights are 0. Hence one of the weaknesses of the max Sharpe portfolio optimization approach is that the portfolio may not be as diversified (across types of stocks or industries) as we want it to be.
Also, as cliché as it may sound, past performance is not indicative of future results. In our approach, we obtained the expected returns and volatility from past stock price movements; future returns and volatility may well differ.
Nevertheless, I hope this gives you a good introduction to portfolio optimization and the power of the PyPortfolioOpt library and Python, as you explore the other methods of portfolio optimization in the future.