The HAR-X model for Volatility Trading. A new approach
Introduction
Effective risk management is at the heart of successful investment strategies. Among the various approaches developed over the years, volatility-based strategies stand out for their ability to adjust market exposure based on expected market conditions. The HAR-X strategy represents an advanced implementation of this concept, combining sophisticated volatility modeling with dynamic asset allocation to potentially enhance risk-adjusted returns.
Theoretical Foundation
The HAR-X strategy is built on a fundamental market observation: there exists an inverse relationship between volatility and long-term returns. Historically, periods of extreme market volatility often coincide with poor investment returns, while calmer market environments tend to deliver more consistent positive performance.
This relationship isn’t merely coincidental. During high-volatility periods, market participants typically demand higher risk premiums, leading to lower asset prices. Conversely, low-volatility environments generally reflect investor confidence and stability, creating favorable conditions for asset appreciation.
The strategy leverages this relationship by dynamically adjusting market exposure, increasing allocation during predicted calm periods and reducing it when turbulence is expected.
Key Components of the HAR-X Model
Yang-Zhang Volatility Estimation
At the core of the HAR-X strategy is an accurate estimate of market volatility. Rather than relying on the standard deviation of close-to-close returns, the strategy employs the Yang-Zhang estimator, which uses the full open-high-low-close bar and accounts for the overnight gap between one day's close and the next day's open.
The Yang-Zhang approach combines three volatility components:
Overnight volatility (close-to-open price movements)
Intraday volatility (open-to-close price movements)
Rogers-Satchell volatility (a range-based estimate built from the high, low, open, and close)
This comprehensive calculation captures more nuanced market behavior than standard volatility measures, providing a more accurate risk assessment, especially during periods of market stress or unusual trading patterns.
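The three components above can be combined as in the following minimal sketch. It uses only the standard library and a handful of made-up bars, and it assumes sample variances for the overnight and open-to-close terms, which is one common way to write the estimator. The function name `yang_zhang_variance` and the sample data are illustrative assumptions, not part of the full script further down.

```python
import math

def yang_zhang_variance(bars):
    """Daily Yang-Zhang variance from (open, high, low, close) tuples, oldest first."""
    n = len(bars) - 1  # the first bar only supplies a previous close
    if n < 2:
        raise ValueError("need at least three bars")
    overnight, open_close, rs = [], [], []
    for prev, (o, h, l, c) in zip(bars, bars[1:]):
        overnight.append(math.log(o / prev[3]))   # previous close to open
        open_close.append(math.log(c / o))        # open to close
        # Rogers-Satchell term for one bar
        rs.append(math.log(h / c) * math.log(h / o) + math.log(l / c) * math.log(l / o))

    def sample_var(xs):
        mean = sum(xs) / len(xs)
        return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

    # Weight that balances the open-to-close and Rogers-Satchell terms
    k = 0.34 / (1.34 + (n + 1) / (n - 1))
    return sample_var(overnight) + k * sample_var(open_close) + (1 - k) * sum(rs) / n

# Hypothetical OHLC bars, for illustration only
bars = [
    (100.0, 101.0, 99.0, 100.5),
    (100.7, 102.0, 100.0, 101.5),
    (101.2, 101.8, 100.1, 100.4),
    (100.6, 101.0, 99.5, 100.0),
]
daily_var = yang_zhang_variance(bars)
annualized_vol = math.sqrt(daily_var * 252)
```

The overnight variance enters with weight 1, while the weight `k` splits the remainder between the open-to-close variance and the Rogers-Satchell term.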
Heterogeneous Autoregressive Model
The strategy uses a Heterogeneous Autoregressive (HAR) modeling framework, extended with exogenous variables (hence the “X” in HAR-X). This approach acknowledges that market participants operate on different time horizons, from day traders to long-term investors.
The HAR-X model incorporates volatility components from multiple timeframes:
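Daily volatility (the previous day's value; immediate market conditions)
Weekly volatility (average over the previous 5 trading days; medium-term trends)
Monthly volatility (average over the previous 22 trading days; longer-term market environment)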
By combining these measures, the model captures the heterogeneous nature of market volatility across different time scales, allowing for more nuanced predictions of future volatility.
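As a rough illustration of how the three horizons become regression inputs, the sketch below builds lagged daily, weekly (5-day), and monthly (22-day) averages from a plain list of volatility values. The helper name `har_features` and the toy series are assumptions made for this example. Each row uses only values observed before day `t`.

```python
def har_features(vol, weekly=5, monthly=22):
    """Return (t, daily, weekly_avg, monthly_avg) rows built from values before day t."""
    rows = []
    for t in range(monthly, len(vol)):
        daily = vol[t - 1]                           # yesterday's volatility
        week = sum(vol[t - weekly:t]) / weekly       # mean of the last 5 days
        month = sum(vol[t - monthly:t]) / monthly    # mean of the last 22 days
        rows.append((t, daily, week, month))
    return rows

# Toy series 0.0, 1.0, ..., 29.0 so the averages are easy to check by hand
vol = [float(i) for i in range(30)]
rows = har_features(vol)
```

Regressing the volatility on day `t` against these three columns gives the plain HAR model. The "X" part adds further columns, such as the VIX measures described next.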
VIX Term Structure Integration
To enhance its predictive power, the HAR-X strategy incorporates data from the VIX volatility index and its term structure. This includes:
VIX (30-day implied volatility)
VIX3M (3-month implied volatility)
VIX6M (6-month implied volatility)
VIX curve slope (VIX divided by VIX3M, minus 1; positive when short-term implied volatility is above longer-term)
The VIX term structure provides forward-looking information about market expectations, complementing the backward-looking historical volatility measures. In calm markets the curve usually slopes upward, with longer-dated implied volatility above the 30-day VIX. When the curve inverts (short-term VIX well above longer-term), it often indicates acute but potentially temporary market stress. Conversely, when the whole curve is elevated and roughly flat, the market may be pricing in more persistent volatility.
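The slope measure can be sketched in a few lines. The ratio matches the one the script computes (`VIX / VIX3M - 1`). The `curve_state` helper and its `tolerance` band are hypothetical additions, used here only to label the shape of the curve.

```python
def vix_slope(vix, vix3m):
    """Relative gap between 30-day and 3-month implied volatility."""
    return vix / vix3m - 1

def curve_state(slope, tolerance=0.02):
    """Label the curve shape; the tolerance band is an illustrative assumption."""
    if slope > tolerance:
        return "inverted"          # short-term VIX above 3-month VIX
    if slope < -tolerance:
        return "upward-sloping"    # short-term VIX below 3-month VIX
    return "flat"

calm = curve_state(vix_slope(15.0, 18.0))
stressed = curve_state(vix_slope(40.0, 30.0))
```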
Strategy Implementation
The HAR-X strategy translates volatility predictions into concrete investment decisions through a simple yet effective framework:
Low Volatility Environment (below 25th percentile): The strategy adopts a 2x market exposure, effectively leveraging the favorable risk-return characteristics typical of calm market periods.
Medium Volatility Environment (between 25th and 75th percentiles): The strategy maintains normal (1x) market exposure, reflecting balanced risk-return prospects.
High Volatility Environment (above 75th percentile): The strategy reduces market exposure to zero, moving to cash to avoid potential drawdowns associated with turbulent markets.
This tiered approach allows for a systematic, rules-based implementation that removes emotional biases from the investment process. By adjusting exposure based on predictable volatility regimes rather than trying to forecast market direction, the strategy acknowledges the inherent unpredictability of short-term market movements while capitalizing on more reliable volatility patterns.
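The three tiers reduce to a small mapping from predicted volatility to exposure. The sketch below assumes quartile thresholds taken from a list of predictions. The names `quartile_thresholds` and `assign_exposure` and the sample predictions are illustrative.

```python
import statistics

def quartile_thresholds(predictions):
    """25th and 75th percentiles using linear interpolation between data points."""
    q1, _, q3 = statistics.quantiles(predictions, n=4, method="inclusive")
    return q1, q3

def assign_exposure(vol_pred, p25, p75):
    if vol_pred < p25:
        return 2.0   # low predicted volatility: double exposure
    if vol_pred > p75:
        return 0.0   # high predicted volatility: move to cash
    return 1.0       # otherwise: normal exposure

# Hypothetical annualized volatility predictions
preds = [0.08, 0.10, 0.12, 0.14, 0.16, 0.18, 0.20, 0.22, 0.24]
p25, p75 = quartile_thresholds(preds)
exposures = [assign_exposure(v, p25, p75) for v in preds]
```

Thresholds taken from the whole history use information that was not available on earlier dates. A live implementation would more plausibly compute them from past predictions only.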
Performance Characteristics
When properly implemented, the HAR-X strategy is intended to exhibit several beneficial characteristics:
Reduced Maximum Drawdowns: By moving to cash during high-volatility periods, the strategy often avoids the worst market downturns, preserving capital during market crashes.
Improved Risk-Adjusted Returns: The dynamic allocation approach frequently results in higher Sharpe and Sortino ratios compared to a simple buy-and-hold strategy, delivering more return per unit of risk.
Asymmetric Return Profile: The strategy aims to participate in market upside during favorable conditions while limiting exposure during adverse environments, creating an asymmetric return pattern that can be particularly valuable for risk-conscious investors.
Regime Adaptability: Unlike static allocation approaches, the HAR-X strategy dynamically adapts to changing market environments, making it potentially more resilient across different economic and market cycles.
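To make the drawdown point concrete, here is a toy comparison on made-up daily returns. It assumes the exposure chosen at one day's close is held during the next day, and that the portfolio starts fully invested. All numbers are illustrative and are not backtest results.

```python
def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve (as a negative number)."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1 + r
        peak = max(peak, equity)
        worst = min(worst, equity / peak - 1)
    return worst

daily_returns = [0.01, -0.05, -0.10, 0.02]   # made-up simple returns
signal = [1.0, 0.0, 0.0, 1.0]                # exposure decided at each day's close
held = [1.0] + signal[:-1]                   # exposure actually held: the prior day's signal
strategy_returns = [h * r for h, r in zip(held, daily_returns)]

mdd_buy_hold = max_drawdown(daily_returns)
mdd_strategy = max_drawdown(strategy_returns)
```

In this toy path the signal turns off one day after the first loss, so the strategy still takes the 5% drop but avoids the 10% drop that follows.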
Practical Considerations
While theoretically sound, several practical factors should be considered when implementing the HAR-X strategy:
Transaction Costs: The strategy involves periodic portfolio rebalancing, which can generate transaction costs. These should be carefully considered, especially for smaller portfolios.
Tax Implications: Frequent rebalancing may have tax consequences in taxable accounts. The strategy might be more suitable for tax-advantaged accounts or for investors with sophisticated tax management approaches.
Implementation Vehicles: The strategy can be implemented using various instruments, including ETFs, futures, or options. The choice of implementation vehicle will affect the strategy’s cost, efficiency, and risk characteristics.
Parameter Sensitivity: The specific percentile thresholds (25th and 75th in our example) can be adjusted based on investor risk tolerance and market conditions. Some implementations might benefit from more or less aggressive thresholds.
Short instead of cash: This one is for the adventurous people out there. Instead of moving to cash, go short when predicted volatility is above the 75th percentile.
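The transaction-cost point can be sketched by charging a proportional cost on every change in exposure. The cost rate of 10 basis points per unit of turnover, the starting exposure of zero, and the function name `net_returns` are assumptions chosen for illustration.

```python
def net_returns(held, returns, cost_rate=0.001):
    """Strategy returns after a proportional cost on each change in exposure."""
    out = []
    previous = 0.0  # assume the portfolio starts in cash
    for position, r in zip(held, returns):
        turnover = abs(position - previous)       # units of exposure traded
        out.append(position * r - cost_rate * turnover)
        previous = position
    return out

held = [1.0, 2.0, 2.0, 0.0]             # exposure held during each day
returns = [0.01, 0.01, -0.02, 0.03]     # made-up simple daily returns
net = net_returns(held, returns)
```

Moving between 2x and 0x trades two units of exposure at once, so frequent regime switches are where costs accumulate fastest.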
Here is the code, so you can modify it or backtest it:
import yfinance as yf
import pandas as pd
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta
from scipy.stats import norm
# Style configuration for charts
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("viridis")
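def get_historical_data(symbol, start_date, end_date):
    """Gets historical data using yfinance"""
    print(f"Downloading {symbol} data...")
    data = yf.download(symbol, start=start_date, end=end_date, progress=False)
    # Recent yfinance versions may return (field, ticker) MultiIndex columns
    # even for a single symbol; flatten them so data['Close'] is a Series
    if isinstance(data.columns, pd.MultiIndex):
        data.columns = data.columns.get_level_values(0)
    print(f"Downloaded {len(data)} days of {symbol} data")
    return data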
def get_vix_data(start_date, end_date):
    """Gets VIX, VIX3M and VIX6M data using yfinance"""
    def download_close(symbol):
        # Returns the Close series, or None when nothing could be downloaded
        # (a failed download usually yields an empty frame rather than an exception)
        print(f"Downloading {symbol} data...")
        try:
            raw = yf.download(symbol, start=start_date, end=end_date, progress=False)
        except Exception as e:
            print(f"Could not download {symbol} data: {e}")
            return None
        if raw.empty:
            return None
        close = raw['Close']
        # Collapse a single-column DataFrame (MultiIndex columns) to a Series
        return close.iloc[:, 0] if isinstance(close, pd.DataFrame) else close

    vix_data = pd.DataFrame()
    vix_data['VIX'] = download_close("^VIX")
    # Fall back to a rolling mean of VIX when a longer tenor is unavailable
    for name, symbol, window in [('VIX3M', "^VIX3M", 63), ('VIX6M', "^VIX6M", 126)]:
        close = download_close(symbol)
        if close is None:
            print(f"No {name} data, creating synthetic approximation")
            close = vix_data['VIX'].rolling(window=window).mean()  # ~3 or ~6 months
        vix_data[name] = close
    # VIX curve slope: positive when 30-day VIX is above 3-month VIX
    vix_data['VIX_Slope'] = vix_data['VIX'] / vix_data['VIX3M'] - 1
    print(f"Downloaded VIX data with {len(vix_data)} entries")
    return vix_data
def yang_zhang_volatility(data, window=10):
    """
    Calculates Yang-Zhang volatility, which combines:
    - Overnight volatility (previous close to open)
    - Intraday volatility (open-to-close)
    - Rogers-Satchell volatility
    It uses the full OHLC bar rather than close-to-close returns only.
    """
    # Ensure window is at least 2 to avoid division by zero
    window = max(2, window)
    # Logarithms for daily calculations
    log_ho = np.log(data['High'] / data['Open'])
    log_lo = np.log(data['Low'] / data['Open'])
    log_oc = np.log(data['Close'] / data['Open'])           # open-to-close
    log_co = np.log(data['Open'] / data['Close'].shift(1))  # overnight: previous close to open
    # Rogers-Satchell range term
    rs = log_ho * (log_ho - log_oc) + log_lo * (log_lo - log_oc)
    # Rolling means of squared returns (treats the mean return as zero)
    overnight_var = (log_co**2).rolling(window=window).mean()
    open_close_var = (log_oc**2).rolling(window=window).mean()
    window_rs = rs.rolling(window=window).mean()
    # This is the k factor from the Yang-Zhang formula
    k = 0.34 / (1.34 + (window + 1)/(window - 1))
    # Overnight variance has weight 1, open-to-close k, Rogers-Satchell 1 - k
    yz_vol = np.sqrt(overnight_var + k * open_close_var + (1 - k) * window_rs)
    # Annualize volatility (multiply by sqrt(252))
    yz_vol_annualized = yz_vol * np.sqrt(252)
    return yz_vol_annualized
def calculate_performance_metrics(returns, risk_free_rate=0.02):
    """Calculates detailed performance metrics from simple daily returns"""
    metrics = {}
    # Convert to series if it's a dataframe
    if isinstance(returns, pd.DataFrame):
        returns = returns.iloc[:, 0]
    # Drop missing values (np.percentile below returns NaN if any are present)
    returns = returns.dropna()
    # Cumulative returns
    cum_returns = (1 + returns).cumprod()
    # Total return
    total_return = cum_returns.iloc[-1] - 1
    metrics['Total Return'] = total_return * 100  # In percentage
    # Annualized return
    years = len(returns) / 252
    annual_return = (1 + total_return) ** (1 / years) - 1
    metrics['Annual Return'] = annual_return * 100  # In percentage
    # Volatility
    volatility = returns.std() * np.sqrt(252)
    metrics['Annual Volatility'] = volatility * 100  # In percentage
    # Sharpe Ratio
    risk_free_daily = ((1 + risk_free_rate) ** (1/252)) - 1
    excess_returns = returns - risk_free_daily
    sharpe_ratio = (excess_returns.mean() / returns.std()) * np.sqrt(252)
    metrics['Sharpe Ratio'] = sharpe_ratio
    # Maximum drawdown
    rolling_max = cum_returns.expanding().max()
    drawdown = (cum_returns / rolling_max) - 1
    max_drawdown = drawdown.min()
    metrics['Max Drawdown'] = max_drawdown * 100  # In percentage
    # Sortino Ratio (only considers negative volatility)
    downside_returns = returns[returns < 0]
    downside_volatility = downside_returns.std() * np.sqrt(252)
    sortino_ratio = (annual_return - risk_free_rate) / downside_volatility if downside_volatility != 0 else np.nan
    metrics['Sortino Ratio'] = sortino_ratio
    # Calmar Ratio (annualized return / maximum drawdown)
    calmar_ratio = annual_return / abs(max_drawdown) if max_drawdown != 0 else np.nan
    metrics['Calmar Ratio'] = calmar_ratio
    # Information Ratio placeholder: no benchmark is passed in,
    # so this simply repeats the Sharpe ratio
    information_ratio = sharpe_ratio
    metrics['Information Ratio'] = information_ratio
    # Share of days with a positive return
    win_rate = len(returns[returns > 0]) / len(returns)
    metrics['Win Rate'] = win_rate * 100  # In percentage
    # Profit/loss ratio
    avg_win = returns[returns > 0].mean()
    avg_loss = returns[returns < 0].mean()
    profit_loss_ratio = abs(avg_win / avg_loss) if avg_loss != 0 else np.nan
    metrics['Profit/Loss Ratio'] = profit_loss_ratio
    # Additional risk metrics
    metrics['Kurtosis'] = returns.kurtosis()  # Measures the presence of extreme values
    metrics['Skewness'] = returns.skew()  # Measures the asymmetry of the distribution
    # Value at Risk (VaR) - 95%
    var_95 = np.percentile(returns, 5)
    metrics['VaR 95%'] = var_95 * 100  # In percentage
    # Conditional VaR (CVaR) - 95% - Expected Shortfall
    cvar_95 = returns[returns <= var_95].mean()
    metrics['CVaR 95%'] = cvar_95 * 100  # In percentage
    return metrics
def create_performance_tearsheet(df):
    """
    Creates a detailed analysis of strategy performance vs Buy & Hold
    """
    # Calculate metrics
    spy_metrics = calculate_performance_metrics(df['SPY_ret'])
    strategy_metrics = calculate_performance_metrics(df['strategy_ret'])
    # (row label, key in the metrics dict, unit suffix)
    rows = [
        ('Total Return', 'Total Return', '%'),
        ('Annualized Return', 'Annual Return', '%'),
        ('Annualized Volatility', 'Annual Volatility', '%'),
        ('Sharpe Ratio', 'Sharpe Ratio', ''),
        ('Sortino Ratio', 'Sortino Ratio', ''),
        ('Calmar Ratio', 'Calmar Ratio', ''),
        ('Maximum Drawdown', 'Max Drawdown', '%'),
        ('Win Rate', 'Win Rate', '%'),
        ('Profit/Loss Ratio', 'Profit/Loss Ratio', ''),
        ('VaR 95%', 'VaR 95%', '%'),
        ('CVaR 95%', 'CVaR 95%', '%'),
    ]

    def format_column(metrics):
        return [f"{metrics[key]:.2f}{suffix}" for _, key, suffix in rows]

    # Create a comparative summary
    metrics_comparison = pd.DataFrame({
        'HAR-X Strategy': format_column(strategy_metrics),
        'Buy & Hold SPY': format_column(spy_metrics)
    }, index=[label for label, _, _ in rows])
    print("\n=== PERFORMANCE COMPARISON ===")
    print(metrics_comparison)
    return metrics_comparison
def analyze_performance_by_regime(df):
    """Analyzes performance by volatility regime"""
    # Define volatility regimes
    vol_low = np.percentile(df['YZ_vol_pred'], 25)
    vol_high = np.percentile(df['YZ_vol_pred'], 75)
    # Open-ended outer bins so no prediction falls outside the regimes
    df['vol_regime'] = pd.cut(df['YZ_vol_pred'],
                              bins=[-np.inf, vol_low, vol_high, np.inf],
                              labels=['Low', 'Medium', 'High'])
    grouped = df.groupby('vol_regime', observed=False)
    # Calculate returns by regime
    regime_returns = grouped[['SPY_ret', 'strategy_ret']].mean() * 252 * 100
    regime_volatility = grouped[['SPY_ret', 'strategy_ret']].std() * np.sqrt(252) * 100
    regime_sharpe = regime_returns / regime_volatility
    # Calculate days in each regime
    regime_days = grouped.size()
    regime_pct = regime_days / len(df) * 100
    # Create results table
    regime_analysis = pd.DataFrame({
        'Days': regime_days,
        'Percentage': regime_pct,
        'SPY Return': regime_returns['SPY_ret'],
        'Strategy Return': regime_returns['strategy_ret'],
        'SPY Vol': regime_volatility['SPY_ret'],
        'Strategy Vol': regime_volatility['strategy_ret'],
        'SPY Sharpe': regime_sharpe['SPY_ret'],
        'Strategy Sharpe': regime_sharpe['strategy_ret']
    })
    print("\n=== ANALYSIS BY VOLATILITY REGIME ===")
    print(regime_analysis)
    return regime_analysis
def analyze_monthly_returns(df):
    """Analyzes monthly returns"""
    # Calculate monthly returns for SPY and the strategy
    # ('M' is the month-end alias; newer pandas versions prefer 'ME')
    spy_monthly = df['SPY_ret'].resample('M').apply(lambda x: (1 + x).prod() - 1) * 100
    strategy_monthly = df['strategy_ret'].resample('M').apply(lambda x: (1 + x).prod() - 1) * 100
    # Create DataFrame for analysis
    monthly_returns = pd.DataFrame({
        'SPY': spy_monthly,
        'Strategy': strategy_monthly
    })
    # Calculate difference
    monthly_returns['Difference'] = monthly_returns['Strategy'] - monthly_returns['SPY']
    # Monthly statistics
    monthly_stats = pd.DataFrame({
        'Mean': monthly_returns.mean(),
        'Median': monthly_returns.median(),
        'Min': monthly_returns.min(),
        'Max': monthly_returns.max(),
        'Positive %': (monthly_returns > 0).mean() * 100,
        'Std Dev': monthly_returns.std()
    })
    print("\n=== MONTHLY STATISTICS ===")
    print(monthly_stats.round(2))
    return monthly_returns, monthly_stats
def plot_monthly_returns_heatmap(monthly_returns):
    """Creates a heatmap of monthly returns by year"""
    # Create dataframe with year, month and returns
    heatmap_data = pd.DataFrame({
        'Year': monthly_returns.index.year,
        'Month': monthly_returns.index.month,
        'Strategy': monthly_returns['Strategy']
    })
    # Pivot to create table with years as rows and months as columns
    pivot_data = heatmap_data.pivot(index='Year', columns='Month', values='Strategy')
    # Create heatmap
    plt.figure(figsize=(14, 8))
    sns.heatmap(pivot_data, annot=True, fmt=".1f", cmap="RdYlGn", center=0,
                linewidths=0.5, cbar_kws={"shrink": 0.8})
    plt.title('HAR-X Strategy Monthly Returns (%)', fontsize=16)
    plt.xlabel('Month', fontsize=12)
    plt.ylabel('Year', fontsize=12)
    # Convert month labels to names
    month_names = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                   'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
    plt.xticks(np.arange(12) + 0.5, month_names)
    plt.tight_layout()
    plt.show()
def plot_strategy_exposure_distribution(df):
    """Shows the distribution of strategy exposure"""
    # Calculate the percentage of time at each exposure level
    exposure_counts = df['position'].value_counts().sort_index()
    exposure_pct = exposure_counts / len(df) * 100
    # Create bar chart
    plt.figure(figsize=(10, 6))
    bars = plt.bar(exposure_counts.index, exposure_pct, color=['red', 'yellow', 'green'])
    plt.title('Market Exposure Distribution', fontsize=16)
    plt.xlabel('Exposure Level', fontsize=14)
    plt.ylabel('Percentage of Time (%)', fontsize=14)
    plt.xticks([0, 1, 2], ['No Exposure (0x)', 'Normal (1x)', 'Double (2x)'])
    # Add value labels
    for bar in bars:
        height = bar.get_height()
        plt.text(bar.get_x() + bar.get_width()/2., height + 1,
                 f'{height:.1f}%', ha='center', fontsize=12)
    plt.grid(axis='y', alpha=0.3)
    plt.tight_layout()
    plt.show()
def plot_drawdown_comparison(df):
    """Compares drawdowns of SPY vs the strategy"""
    # Calculate drawdowns
    spy_cum_ret = (1 + df['SPY_ret']).cumprod()
    strategy_cum_ret = (1 + df['strategy_ret']).cumprod()
    spy_drawdown = spy_cum_ret / spy_cum_ret.expanding().max() - 1
    strategy_drawdown = strategy_cum_ret / strategy_cum_ret.expanding().max() - 1
    # Create chart
    plt.figure(figsize=(14, 7))
    plt.plot(spy_drawdown, label='SPY Drawdown', color='red', alpha=0.7)
    plt.plot(strategy_drawdown, label='Strategy Drawdown', color='blue')
    plt.fill_between(spy_drawdown.index, spy_drawdown, 0, color='red', alpha=0.1)
    plt.fill_between(strategy_drawdown.index, strategy_drawdown, 0, color='blue', alpha=0.1)
    plt.title('Drawdown Comparison', fontsize=16)
    plt.xlabel('Date', fontsize=12)
    plt.ylabel('Drawdown (%)', fontsize=12)
    plt.legend(fontsize=12)
    plt.grid(True, alpha=0.3)
    # Format Y axis as percentage
    plt.gca().yaxis.set_major_formatter(plt.FuncFormatter(lambda y, _: f'{y:.0%}'))
    plt.tight_layout()
    plt.show()
    # Drawdown statistics
    print("\n=== DRAWDOWN ANALYSIS ===")
    print(f"SPY - Maximum Drawdown: {spy_drawdown.min()*100:.2f}%")
    print(f"Strategy - Maximum Drawdown: {strategy_drawdown.min()*100:.2f}%")
def plot_rolling_performance(df, window=252):
    """Shows performance metrics in rolling windows"""
    # Calculate rolling annualized return
    rolling_spy_ret = df['SPY_ret'].rolling(window).mean() * 252 * 100
    rolling_strat_ret = df['strategy_ret'].rolling(window).mean() * 252 * 100
    # Calculate rolling annualized volatility
    rolling_spy_vol = df['SPY_ret'].rolling(window).std() * np.sqrt(252) * 100
    rolling_strat_vol = df['strategy_ret'].rolling(window).std() * np.sqrt(252) * 100
    # Calculate rolling Sharpe Ratio
    rolling_spy_sharpe = rolling_spy_ret / rolling_spy_vol
    rolling_strat_sharpe = rolling_strat_ret / rolling_strat_vol
    # Create subplot for each metric
    fig, axes = plt.subplots(3, 1, figsize=(14, 15), sharex=True)
    # Plot annualized return
    axes[0].plot(rolling_spy_ret, label='SPY', color='gray', alpha=0.7)
    axes[0].plot(rolling_strat_ret, label='HAR-X Strategy', color='blue')
    axes[0].set_title(f'Annualized Return (Rolling {window} days window)', fontsize=14)
    axes[0].set_ylabel('Return (%)', fontsize=12)
    axes[0].axhline(y=0, color='r', linestyle='-', alpha=0.3)
    axes[0].legend()
    axes[0].grid(True, alpha=0.3)
    # Plot annualized volatility
    axes[1].plot(rolling_spy_vol, label='SPY', color='gray', alpha=0.7)
    axes[1].plot(rolling_strat_vol, label='HAR-X Strategy', color='blue')
    axes[1].set_title(f'Annualized Volatility (Rolling {window} days window)', fontsize=14)
    axes[1].set_ylabel('Volatility (%)', fontsize=12)
    axes[1].legend()
    axes[1].grid(True, alpha=0.3)
    # Plot Sharpe Ratio
    axes[2].plot(rolling_spy_sharpe, label='SPY', color='gray', alpha=0.7)
    axes[2].plot(rolling_strat_sharpe, label='HAR-X Strategy', color='blue')
    axes[2].set_title(f'Sharpe Ratio (Rolling {window} days window)', fontsize=14)
    axes[2].set_ylabel('Sharpe Ratio', fontsize=12)
    axes[2].axhline(y=0, color='r', linestyle='-', alpha=0.3)
    axes[2].legend()
    axes[2].grid(True, alpha=0.3)
    plt.xlabel('Date', fontsize=12)
    plt.tight_layout()
    plt.show()
def plot_return_distribution(df):
    """Visualizes the distribution of daily returns"""
    # Histograms and medians cannot handle missing values
    df = df.dropna(subset=['SPY_ret', 'strategy_ret'])
    plt.figure(figsize=(14, 7))
    # Create histograms
    bins = 50
    plt.hist(df['SPY_ret']*100, bins=bins, alpha=0.5, label='SPY', color='gray')
    plt.hist(df['strategy_ret']*100, bins=bins, alpha=0.5, label='HAR-X Strategy', color='blue')
    # Add normal distribution lines
    x = np.linspace(df['SPY_ret'].min()*100, df['SPY_ret'].max()*100, 100)
    # Normal distribution for SPY (scaled by count and bin width to match the histogram)
    spy_mean = df['SPY_ret'].mean() * 100
    spy_std = df['SPY_ret'].std() * 100
    spy_pdf = norm.pdf(x, spy_mean, spy_std) * len(df) * (df['SPY_ret'].max()*100 - df['SPY_ret'].min()*100) / bins
    plt.plot(x, spy_pdf, color='black', linestyle='--', linewidth=2, label='SPY Normal Fit')
    # Normal distribution for the strategy
    strat_mean = df['strategy_ret'].mean() * 100
    strat_std = df['strategy_ret'].std() * 100
    strat_pdf = norm.pdf(x, strat_mean, strat_std) * len(df) * (df['strategy_ret'].max()*100 - df['strategy_ret'].min()*100) / bins
    plt.plot(x, strat_pdf, color='darkblue', linestyle='--', linewidth=2, label='Strategy Normal Fit')
    # Add vertical lines for the mean
    plt.axvline(spy_mean, color='gray', linestyle='-', linewidth=2)
    plt.axvline(strat_mean, color='blue', linestyle='-', linewidth=2)
    plt.title('Daily Returns Distribution', fontsize=16)
    plt.xlabel('Return (%)', fontsize=12)
    plt.ylabel('Frequency', fontsize=12)
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.show()
    # Distribution statistics
    print("\n=== RETURN DISTRIBUTION STATISTICS ===")
    print(f"SPY - Mean: {spy_mean:.4f}%, Median: {np.median(df['SPY_ret']*100):.4f}%, Std Dev: {spy_std:.4f}%")
    print(f"Strategy - Mean: {strat_mean:.4f}%, Median: {np.median(df['strategy_ret']*100):.4f}%, Std Dev: {strat_std:.4f}%")
    print(f"SPY - Skewness: {df['SPY_ret'].skew():.4f}, Kurtosis: {df['SPY_ret'].kurtosis():.4f}")
    print(f"Strategy - Skewness: {df['strategy_ret'].skew():.4f}, Kurtosis: {df['strategy_ret'].kurtosis():.4f}")
def main():
    # Start and end dates
    start_date = "2015-01-01"
    end_date = datetime.now().strftime("%Y-%m-%d")
    print("====================================================")
    print("HAR-X TRADING STRATEGY: DETAILED ANALYSIS")
    print("====================================================")
    print(f"\nPeriod: {start_date} to {end_date}")
    print("\nDownloading and processing data...")
    # Get data
    spy = get_historical_data("SPY", start_date, end_date)
    vix_term = get_vix_data(start_date, end_date)
    # Calculate Yang-Zhang volatility for SPY
    spy['YZ_vol'] = yang_zhang_volatility(spy, window=10)
    # Construction of HAR variables
    spy['YZ_d'] = spy['YZ_vol'].shift(1)  # Daily lag
    spy['YZ_w'] = spy['YZ_vol'].rolling(window=5).mean().shift(1)  # Weekly average
    spy['YZ_m'] = spy['YZ_vol'].rolling(window=22).mean().shift(1)  # Monthly average
    # Build final dataset
    df = pd.DataFrame(index=spy.index)
    df['YZ_vol'] = spy['YZ_vol']
    df['YZ_d'] = spy['YZ_d']
    df['YZ_w'] = spy['YZ_w']
    df['YZ_m'] = spy['YZ_m']
    # Simple daily returns of SPY for the backtest
    # (exposure multiplies simple returns, not log returns)
    df['SPY_ret'] = spy['Close'].pct_change()
    # Align indices to join DataFrames
    common_dates = df.index.intersection(vix_term.index)
    df = df.loc[common_dates]
    vix_term = vix_term.loc[common_dates]
    # Add VIX indicators to the main DataFrame
    df['VIX'] = vix_term['VIX']
    df['VIX3M'] = vix_term['VIX3M']
    df['VIX6M'] = vix_term['VIX6M']
    df['VIX_Slope'] = vix_term['VIX_Slope']
    # Remove rows with missing data
    df.dropna(inplace=True)
    print(f"Data processed. {len(df)} trading days available.")
    # --- HAR-X MODELING ---
    print("\nTraining HAR-X model...")
    # VIX itself enters only through VIX_Slope (VIX / VIX3M - 1)
    X = df[['YZ_d', 'YZ_w', 'YZ_m', 'VIX3M', 'VIX6M', 'VIX_Slope']]
    X = sm.add_constant(X)
    y = df['YZ_vol']
    # Note: the model is fitted on the full sample, so the predictions
    # and the percentile thresholds below are in-sample
    harx_model = sm.OLS(y, X).fit()
    print("\nHAR-X model coefficients:")
    print(harx_model.summary().tables[1])
    # Fitted (predicted) volatility
    df['YZ_vol_pred'] = harx_model.predict(X)
    # --- Generate Trading Signals ---
    print("\nGenerating trading signals...")
    p25 = np.percentile(df['YZ_vol_pred'], 25)
    p75 = np.percentile(df['YZ_vol_pred'], 75)

    def assign_exposure(vol_pred, p25, p75):
        if vol_pred < p25:
            return 2.0  # Double exposure (more risk if volatility is low)
        elif vol_pred > p75:
            return 0.0  # Move to cash (avoid risk in high volatility)
        else:
            return 1.0  # Normal exposure

    df['position'] = df['YZ_vol_pred'].apply(assign_exposure, args=(p25, p75))
    # --- Strategy Backtest ---
    print("Running backtest...")
    # Use previous day's position; the first day has no prior signal, so it stays in cash
    df['strategy_ret'] = (df['position'].shift(1) * df['SPY_ret']).fillna(0)
    # Cumulative returns (growth of 1 unit invested)
    df['cum_SPY'] = (1 + df['SPY_ret']).cumprod()
    df['cum_strategy'] = (1 + df['strategy_ret']).cumprod()
    print(f"\nLow Vol. Threshold (p25): {p25:.4f}")
    print(f"High Vol. Threshold (p75): {p75:.4f}")
    print(f"Final SPY Return: {df['cum_SPY'].iloc[-1]:.2f}x")
    print(f"Final Strategy Return: {df['cum_strategy'].iloc[-1]:.2f}x")
    # --- Results Analysis ---
    print("\n====================================================")
    print("RESULTS ANALYSIS")
    print("====================================================")
    # 1. Detailed metrics analysis
    metrics_comparison = create_performance_tearsheet(df)
    # 2. Analysis by volatility regime
    regime_analysis = analyze_performance_by_regime(df)
    # 3. Monthly returns analysis
    monthly_returns, monthly_stats = analyze_monthly_returns(df)
    # --- Advanced Charts ---
    print("\n====================================================")
    print("ADVANCED VISUALIZATIONS")
    print("====================================================")
    # 1. Main chart: Cumulative return
    plt.figure(figsize=(14, 7))
    plt.plot(df.index, df['cum_SPY'], label="Buy & Hold SPY", linewidth=2, color='gray', alpha=0.7)
    plt.plot(df.index, df['cum_strategy'], label="HAR-X Strategy", linewidth=2, color='blue')
    plt.title("Cumulative Return: HAR-X Strategy vs. Buy & Hold", fontsize=16)
    plt.xlabel("Date", fontsize=12)
    plt.ylabel("Cumulative Return", fontsize=12)
    plt.legend(fontsize=12)
    plt.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.show()
    # 2. Predicted volatility and bands chart
    plt.figure(figsize=(14, 6))
    plt.plot(df.index, df['YZ_vol_pred'], label="Predicted volatility (YZ)", linewidth=2, color='purple')
    plt.axhline(p25, color='green', linestyle='--', label=f"25th Percentile ({p25:.4f})")
    plt.axhline(p75, color='red', linestyle='--', label=f"75th Percentile ({p75:.4f})")
    # Color volatility regime areas
    plt.fill_between(df.index, 0, p25, color='green', alpha=0.1, label='Low Vol. - 2x Exposure')
    plt.fill_between(df.index, p25, p75, color='yellow', alpha=0.1, label='Medium Vol. - 1x Exposure')
    plt.fill_between(df.index, p75, df['YZ_vol_pred'].max(), color='red', alpha=0.1, label='High Vol. - No Exposure')
    plt.title("Predicted Volatility and Trading Thresholds", fontsize=16)
    plt.xlabel("Date", fontsize=12)
    plt.ylabel("Annualized Volatility", fontsize=12)
    plt.legend(fontsize=10)
    plt.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.show()
    # 3. Monthly returns heatmap
    plot_monthly_returns_heatmap(monthly_returns)
    # 4. Exposure distribution
    plot_strategy_exposure_distribution(df)
    # 5. Drawdown comparison
    plot_drawdown_comparison(df)
    # 6. Rolling performance
    plot_rolling_performance(df)
    # 7. Returns distribution
    plot_return_distribution(df)
    print("\n====================================================")
    print("CONCLUSIONS")
    print("====================================================")
    print("""
    The HAR-X strategy uses a heterogeneous autoregressive (HAR) model
    extended with exogenous variables (X) to predict future market volatility
    and dynamically adjust risk exposure. The basic principles are:
    1. In periods of low predicted volatility (< 25th percentile), increase exposure (2x)
    2. In periods of medium volatility, maintain normal exposure (1x)
    3. In periods of high volatility (> 75th percentile), move to cash (0x)
    This approach seeks to capitalize on the inverse relationship between volatility
    and long-term returns, avoiding turbulent periods and taking advantage of calm periods.
    The HAR-X model combines:
    - Yang-Zhang (YZ) volatility with components at different horizons (daily, weekly, monthly)
    - Information from the VIX term structure (VIX3M, VIX6M and the VIX/VIX3M slope)
    The strategy aims to:
    1. Reduce volatility and maximum drawdowns
    2. Improve risk-adjusted return ratios (Sharpe, Sortino)
    3. Maintain suitable exposure in different market regimes
    """)
    return df, harx_model


# Run the complete analysis
if __name__ == "__main__":
    df, model = main()
Conclusion
The HAR-X strategy represents a sophisticated approach to dynamic asset allocation, leveraging advanced volatility modeling to adjust market exposure based on expected risk conditions. By combining multiple volatility timeframes with forward-looking implied volatility information, the strategy aims to provide a more nuanced view of market risk than simpler approaches.
While no strategy can perfectly predict market movements, the HAR-X approach offers a systematic framework for managing risk across different market environments. For investors concerned with drawdown management and risk-adjusted returns, it provides a theoretically sound and empirically tested alternative to static allocation strategies.
The true value of the HAR-X approach lies not just in its potential to enhance returns but in its ability to help investors maintain discipline during volatile markets. By providing a rules-based framework for reducing exposure during high-risk periods, it may help investors avoid the emotional decisions that often lead to poor investment outcomes.