Glossary
A comprehensive reference for statistical, mathematical, financial, and trading terms used throughout Hidden Regime. Each entry includes definitions, formulas (where applicable), interpretation guidance, use cases, and cross-references to related concepts.
Updated: October 2025 - Expanded with 51 new trading and risk management terms from the Advanced Trading Applications series.
Section A
Akaike Information Criterion (AIC)
Definition: A measure of statistical model quality that balances goodness-of-fit against model complexity.
Formula:
$$\text{AIC} = 2k - 2\ln(\hat{L})$$
where $k$ is the number of estimated parameters and $\hat{L}$ is the maximized value of the likelihood function.
Interpretation: Lower values are better. AIC penalizes model complexity to prevent overfitting. When comparing models, prefer the one with the lowest AIC. AIC tends to favor more complex models than BIC.
Use case: Model selection when prediction accuracy is the priority. In the HMM context, use AIC to choose the number of hidden states.
See also: BIC, Log Likelihood
Reference: Wikipedia - Akaike Information Criterion
Augmented Dickey-Fuller Test (ADF)
Definition: Statistical test to determine if a time series is stationary (does not have a unit root).
Hypothesis:
- Null hypothesis ($H_0$): Series has a unit root (non-stationary)
- Alternative ($H_1$): Series is stationary
Interpretation: A low p-value (< 0.05) rejects the unit-root null, so the series is treated as stationary. A high p-value means only that a unit root cannot be ruled out. The null hypotheses of ADF and KPSS are reversed, so their p-values read in opposite directions.
Use case: Verify stationarity before applying HMMs or other time series models. Log returns should reject the unit-root null; raw prices typically do not.
See also: KPSS Test, Stationary, Phillips-Perron
Reference: Wikipedia - Augmented Dickey-Fuller Test
Autocorrelation
Definition: The correlation of a time series with a delayed copy of itself, measuring how current values relate to past values.
Formula:
$$\rho_k = \frac{\sum_{t=k+1}^T (y_t - \bar{y})(y_{t-k} - \bar{y})}{\sum_{t=1}^T (y_t - \bar{y})^2}$$
where $k$ is the lag.
Interpretation: Values range from -1 to +1. High autocorrelation at lag $k$ means past values are predictive of future values. Financial returns typically show low autocorrelation (close to zero), while volatility shows high autocorrelation (volatility clustering).
Use case: Identify patterns in time series data. Statistically significant autocorrelation in returns is inconsistent with the weak form of the efficient market hypothesis.
See also: Time Series, Stationary
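The formula above can be evaluated directly. The following is a minimal sketch in plain Python; the function name `autocorrelation` and the toy series are illustrative assumptions, not part of Hidden Regime.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation at the given lag, following the formula above."""
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    # Denominator: sum of squared deviations over all T observations.
    # It is zero for a constant series, in which case the ratio is undefined.
    denom = sum((y - mean) ** 2 for y in series)
    # Numerator: products of deviations `lag` steps apart (t = k+1 .. T).
    numer = sum(
        (series[t] - mean) * (series[t - lag] - mean) for t in range(lag, n)
    )
    return numer / denom


alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(autocorrelation(alternating, 1))  # strongly negative: the sign flips every step
print(autocorrelation(alternating, 2))  # positive: values repeat every two steps
```

Because the numerator has fewer terms than the denominator, this estimator shrinks toward zero as the lag grows relative to the series length.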
Annualized Return
Definition: The yearly rate of return calculated from returns over a period other than one year, allowing comparison across different time horizons.
Formula:
$$R_{\text{annual}} = R_{\text{daily}} \times 252$$
where 252 is the number of trading days per year. For multi-period returns:
$$R_{\text{annual}} = (1 + R_{\text{total}})^{\frac{252}{n}} - 1$$
where $n$ is the number of trading days in the period.
Interpretation: Annualization scales returns to a common yearly basis for fair comparison. Under arithmetic scaling, a mean daily return of 0.1% annualizes to 25.2% per year (0.1% × 252); compounding the same daily return instead gives $(1.001)^{252} - 1 \approx 28.6\%$.
Important: For mean returns, use arithmetic scaling (multiply by 252). For volatility, use square root scaling (multiply by $\sqrt{252}$).
Use case: Essential for comparing strategies with different time horizons. HMM regime returns are typically annualized for interpretability.
See also: Log Returns, Volatility, Sharpe Ratio
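The three scaling rules in this entry can be sketched as small helpers. The function names and the 126-day example are assumptions chosen for illustration; only the 252-day convention comes from the entry itself.

```python
import math

TRADING_DAYS = 252  # trading-day convention used in this glossary


def annualize_mean(daily_mean):
    """Arithmetic scaling for a mean daily return."""
    return daily_mean * TRADING_DAYS


def annualize_volatility(daily_vol):
    """Square-root-of-time scaling for daily volatility."""
    return daily_vol * math.sqrt(TRADING_DAYS)


def annualize_total_return(total_return, n_days):
    """Geometric annualization of a total return earned over n_days trading days."""
    return (1.0 + total_return) ** (TRADING_DAYS / n_days) - 1.0


print(annualize_mean(0.001))               # about 0.252
print(annualize_volatility(0.01))          # about 0.159
print(annualize_total_return(0.10, 126))   # 10% over half a year compounds to about 0.21
```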
Section B
Backtesting
Definition: The process of testing a trading strategy on historical data to evaluate its performance before deploying it with real capital.
Process:
- Define strategy rules (entry/exit signals)
- Apply rules to historical price data
- Calculate hypothetical returns and risk metrics
- Analyze performance and drawdowns
Key metrics evaluated:
- Total and annualized returns
- Sharpe ratio
- Maximum drawdown
- Win rate
- Transaction costs impact
Interpretation: Past performance does not guarantee future results. Backtesting shows how a strategy would have performed, not how it will perform.
Pitfalls:
- Overfitting to historical data
- Look-ahead bias (using future information)
- Survivorship bias (only testing on surviving assets)
- Ignoring transaction costs and slippage
Use case: Essential validation step before live trading. For HMM strategies, backtest regime-based signals on out-of-sample data.
See also: Walk-Forward Analysis, Out-of-Sample Testing, Paper Trading
Reference: Read our Advanced Trading Applications article
Backward Variable
Definition: In the forward-backward algorithm for HMMs, the backward variable represents the probability of observing the remaining sequence given the current state.
Formula:
$$\beta_t(i) = P(o_{t+1}, o_{t+2}, \ldots, o_T | q_t = s_i, \lambda)$$
where:
- $o_{t+1}, \ldots, o_T$ are future observations
- $q_t = s_i$ is being in state $i$ at time $t$
- $\lambda$ is the HMM model parameters
Initialization:
$$\beta_T(i) = 1$$
Recursion:
$$\beta_t(i) = \sum_{j=1}^N a_{ij} b_j(o_{t+1}) \beta_{t+1}(j)$$
Interpretation: Backward variables are computed recursively from the end of the sequence backwards to the beginning. Combined with forward variables, they enable smoothing - estimating state probabilities using the full observation sequence.
Use case: Core component of forward-backward algorithm. Used in Baum-Welch training and for computing state probabilities at each time step.
See also: Forward Variable, Forward-Backward Algorithm, Baum-Welch Algorithm, Smoothing
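The backward recursion can be sketched for a small discrete-emission HMM. This is an illustrative, unscaled version: the function name, the list-of-lists layout, and the toy parameters are assumptions, and it starts from the standard terminal condition $\beta_T(i) = 1$. Long sequences would need scaling or log-space arithmetic to avoid underflow.

```python
def backward(A, B, obs):
    """Return beta[t][i] for a discrete-emission HMM.

    A[i][j]: transition probability from state i to state j
    B[j][k]: probability that state j emits symbol k
    obs:     list of observed symbol indices o_1 .. o_T
    """
    n_states = len(A)
    T = len(obs)
    beta = [[0.0] * n_states for _ in range(T)]
    # Terminal condition: beta_T(i) = 1 for every state.
    beta[T - 1] = [1.0] * n_states
    # Walk from the second-to-last time step back to the first.
    for t in range(T - 2, -1, -1):
        for i in range(n_states):
            beta[t][i] = sum(
                A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                for j in range(n_states)
            )
    return beta


# Toy two-state model (hypothetical numbers).
A = [[0.9, 0.1], [0.2, 0.8]]
B = [[0.7, 0.3], [0.4, 0.6]]
pi = [0.5, 0.5]
obs = [0, 1]

beta = backward(A, B, obs)
# The sequence likelihood can be recovered from beta at the first time step.
likelihood = sum(pi[i] * B[i][obs[0]] * beta[0][i] for i in range(len(pi)))
print(beta[0], likelihood)
```

The likelihood recovered this way should match the value obtained from the forward variables, which is a convenient consistency check when implementing both passes.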
Basis Point
Definition: One hundredth of a percentage point (0.01%).
Formula:
$$1 \text{ bp} = 0.0001 = 0.01\%$$
Interpretation: Used in finance to describe small percentage changes. 100 basis points = 1%.
Example: If interest rates rise from 5.00% to 5.25%, they increased by 25 basis points.
Reference: Investopedia - Basis Points
Baum-Welch Algorithm
Definition: An Expectation-Maximization (EM) algorithm for training Hidden Markov Model parameters by maximizing the likelihood of observed data.
Purpose: Estimates the transition matrix $A$, emission parameters $B$, and initial state distribution $\pi$ that best explain the observations.
What it does: Iteratively refines model parameters until convergence. Does NOT determine state sequences - that’s Viterbi’s job.
Process:
- E-step: Compute forward and backward probabilities
- M-step: Update parameters based on expected state occupancy
- Repeat until log-likelihood converges
Interpretation: Convergence typically occurs in fewer than 100 iterations. Because EM only finds a local maximum of the likelihood, slow or failed convergence may indicate poor initialization or model misspecification.
Use case: Training HMMs on historical data. Always run Baum-Welch before using Viterbi for prediction.
See also: Viterbi Algorithm, Forward-Backward Algorithm, Expectation-Maximization, HMM
Reference: Read our HMM Algorithms Explained article
Bayesian Uncertainty Quantification (UQ)
Definition: Characterization of uncertainty in model predictions and parameters using Bayesian probability theory.
Key principle: Uncertainty is expressed as probability distributions over parameters rather than point estimates.
Use case: Quantifying confidence in regime predictions. In HMMs, provides probability distributions over states rather than hard classifications.
See also: HMM
Reference: Wikipedia - Uncertainty Quantification
Bear Market
Definition: A market condition where prices generally decrease, typically defined as a 20% decline from recent highs.
Characteristics: Negative returns, increasing volatility, risk-off sentiment
Use case: One of the fundamental market regimes detected by HMMs. Often characterized by high volatility and negative mean returns.
See also: Bull Market, Regime, Crisis
Bayesian Information Criterion (BIC)
Definition: A measure of statistical model quality that penalizes complexity more heavily than AIC.
Formula:
$$\text{BIC} = k\ln(n) - 2\ln(\hat{L})$$
where $k$ is the number of parameters, $n$ is sample size, and $\hat{L}$ is maximum likelihood.
Interpretation: Lower values are better. BIC charges $\ln(n)$ per parameter where AIC charges 2, so its penalty is heavier whenever $n \geq 8$ and grows with the sample size. It therefore tends to favor simpler models with fewer states.
Rule of thumb: Use BIC when model interpretability matters and you want to avoid overfitting. In HMM context, BIC often suggests fewer states than AIC.
Example: For HMM model selection, if 2-state model has BIC = 1500 and 3-state model has BIC = 1520, prefer the 2-state model (lower is better).
See also: AIC, Log Likelihood
Reference: Wikipedia - Bayesian Information Criterion
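To see how the two criteria can disagree, the sketch below applies the AIC and BIC formulas to a 2-state and a 3-state model. The log-likelihood values are made up for illustration, and the parameter count assumes a univariate Gaussian HMM; other model forms would count parameters differently.

```python
import math


def aic(log_likelihood, n_params):
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    return n_params * math.log(n_obs) - 2 * log_likelihood


def gaussian_hmm_param_count(n_states):
    """Free parameters of a univariate Gaussian HMM (assumed model form).

    Transition rows sum to 1: N * (N - 1)
    Initial distribution:     N - 1
    Mean and variance:        2 * N
    """
    return n_states * (n_states - 1) + (n_states - 1) + 2 * n_states


n_obs = 1000
# Hypothetical fitted log-likelihoods; real values come from training.
candidates = {2: -700.0, 3: -690.0}

for n_states, ll in candidates.items():
    k = gaussian_hmm_param_count(n_states)
    print(n_states, k, aic(ll, k), round(bic(ll, k, n_obs), 2))
```

With these numbers AIC is lower for the 3-state model while BIC is lower for the 2-state model, because the extra seven parameters cost 14 under AIC but about 48 under BIC.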
Bollinger Bands
Definition: A technical indicator consisting of a moving average and two bands (upper and lower) placed at standard deviations above and below the average, used to identify overbought and oversold conditions.
Formula:
$$\text{Middle Band} = SMA_{20}$$
$$\text{Upper Band} = SMA_{20} + 2\sigma_{20}$$
$$\text{Lower Band} = SMA_{20} - 2\sigma_{20}$$
where $SMA_{20}$ is the 20-period simple moving average and $\sigma_{20}$ is the 20-period standard deviation.
Interpretation:
- Price near upper band: Potentially overbought, reversal down possible
- Price near lower band: Potentially oversold, bounce up possible
- Band width: Narrow bands = low volatility, wide bands = high volatility
- Bollinger squeeze: Very narrow bands often precede large moves
Trading signals:
- Price touching upper band in uptrend → continuation signal
- Price touching lower band in downtrend → continuation signal
- Mean reversion: Price outside bands tends to return to middle
Use case: Volatility measurement and overbought/oversold identification. In HMM context, Bollinger Band signals can confirm regime transitions or contradict them.
See also: Moving Average, Volatility, Overbought, Oversold, Technical Indicators
Reference: Investopedia - Bollinger Bands
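A minimal sketch of the band calculation for the most recent window follows. The function name and defaults are illustrative assumptions; the population standard deviation is used here, while some charting tools use the sample version, which gives slightly wider bands.

```python
import statistics


def bollinger_bands(prices, window=20, num_std=2.0):
    """Return (middle, upper, lower) computed from the last `window` prices."""
    if len(prices) < window:
        raise ValueError("need at least `window` prices")
    recent = prices[-window:]
    middle = statistics.fmean(recent)   # simple moving average
    sigma = statistics.pstdev(recent)   # population standard deviation (assumption)
    return middle, middle + num_std * sigma, middle - num_std * sigma


# Toy prices oscillating between 10 and 12.
prices = [10.0, 12.0] * 10
middle, upper, lower = bollinger_bands(prices)
print(middle, upper, lower)
print("band width:", upper - lower)
```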
Bull Market
Definition: A market condition where prices generally increase, characterized by positive returns and investor optimism.
Characteristics: Positive mean returns, relatively low volatility, risk-on sentiment
Use case: One of the fundamental market regimes detected by HMMs. Typically has different emission characteristics than bear or sideways regimes.
See also: Bear Market, Regime, Euphoric
Buy and Hold
Definition: A passive investment strategy where an investor purchases securities and holds them for a long period regardless of market fluctuations.
Philosophy: Based on the belief that markets trend upward over long periods despite short-term volatility. Minimizes trading costs and tax implications.
Advantages:
- Low transaction costs (minimal trading)
- Simple to implement and maintain
- Tax-efficient (long-term capital gains)
- Historically effective for broad market indices
Disadvantages:
- Full exposure during market downturns
- No risk management during crises
- Ignores regime changes
- Maximum drawdowns can be severe
Performance baseline: Buy and hold typically serves as the benchmark against which active strategies are measured.
Use case: In backtesting HMM strategies, buy-and-hold provides the baseline performance. Regime-based strategies aim to outperform by reducing exposure during bear markets.
See also: Backtesting, Maximum Drawdown, Regime, Risk Management
Reference: Read our Advanced Trading Applications article
Section C
Confidence Level
Definition: In risk measurement, the probability threshold used to calculate Value at Risk (VaR) and other tail risk metrics.
Common levels:
- 95% confidence: Captures events that occur 1 in 20 days (about once per trading month)
- 99% confidence: Captures events that occur 1 in 100 days (2-3 times per year)
- 99.9% confidence: Captures rare extreme events
Interpretation: A 95% confidence VaR of -$10,000 means: "We expect losses will not exceed $10,000 on 95% of days."
Trade-off: Higher confidence levels capture more extreme events, but the estimate rests on fewer tail observations and is therefore less statistically precise.
Use case: Setting risk limits and position sizing. Higher confidence levels are appropriate for risk-averse strategies. In the HMM context, regime confidence means something different: it is the probability of being in a particular state.
See also: Value at Risk, Expected Shortfall, Tail Risk
Reference: Wikipedia - Confidence Interval
Correlation
Definition: A statistical measure describing the linear relationship between two random variables.
Formula:
$$\rho_{X,Y} = \frac{\text{Cov}(X,Y)}{\sigma_X \sigma_Y} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2} \sqrt{\sum_i (y_i - \bar{y})^2}}$$
Interpretation:
- $\rho = 1$: Perfect positive correlation
- $\rho = 0$: No linear correlation
- $\rho = -1$: Perfect negative correlation
Important: Correlation measures LINEAR relationships only. Assets can be strongly related nonlinearly while showing zero correlation.
Use case: Measuring co-movement between assets. Correlation often increases during crisis regimes (often called correlation breakdown, since diversification benefits shrink when they are needed most).
See also: Standard Deviation, Volatility
Reference: Wikipedia - Correlation
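The point that correlation captures only linear dependence can be checked numerically. The sketch below implements the sample formula above; the function name and the toy data are illustrative assumptions.

```python
import math


def pearson_correlation(xs, ys):
    """Sample Pearson correlation, following the formula above."""
    if len(xs) != len(ys):
        raise ValueError("inputs must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (math.sqrt(var_x) * math.sqrt(var_y))


xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
linear = [2.0 * x + 1.0 for x in xs]   # exact linear dependence
squared = [x ** 2 for x in xs]         # exact but nonlinear dependence

print(pearson_correlation(xs, linear))   # close to 1
print(pearson_correlation(xs, squared))  # 0, even though y is fully determined by x
```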
Crisis
Definition: A market condition where prices are decreasing sharply, characterized by extreme volatility and panic selling.
Characteristics: Large negative returns, very high volatility, fat tails in return distribution
Statistical signature: Large negative mean return, very high standard deviation, often brief duration but severe impact
Use case: One of the most important regimes to detect for risk management. HMMs can identify crisis regimes and trigger defensive portfolio adjustments.
See also: Bear Market, Regime, Volatility
Cumulative Return
Definition: The total return on an investment over a specified period, accounting for compounding of gains and losses.
Formula (from log returns):
$$R_{\text{cumulative}} = \exp\left(\sum_{t=1}^T r_t\right) - 1$$
where $r_t$ are log returns.
Formula (from simple returns):
$$R_{\text{cumulative}} = \prod_{t=1}^T (1 + r_t) - 1$$
where $r_t$ are simple returns.
Interpretation: A cumulative return of 0.50 (or 50%) means the investment grew by 50% over the period. Unlike individual period returns, cumulative returns show the total wealth creation.
Visualization: Cumulative return charts show portfolio growth over time, making it easy to identify drawdown periods and overall performance.
Use case: Primary metric for backtesting and performance evaluation. Compare cumulative returns of buy-and-hold vs regime-based strategies.
See also: Annualized Return, Log Returns, Drawdown, Backtesting
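The two formulas give the same answer when the log returns are derived from the same simple returns. A short sketch with made-up returns (function names are illustrative assumptions):

```python
import math


def cumulative_from_simple(simple_returns):
    """Compound simple returns: prod(1 + r_t) - 1."""
    growth = 1.0
    for r in simple_returns:
        growth *= 1.0 + r
    return growth - 1.0


def cumulative_from_log(log_returns):
    """Sum log returns, then convert back: exp(sum r_t) - 1."""
    return math.exp(sum(log_returns)) - 1.0


simple = [0.10, -0.05, 0.02]                  # hypothetical period returns
logs = [math.log(1.0 + r) for r in simple]    # the same moves as log returns

print(cumulative_from_simple(simple))  # about 0.0659
print(cumulative_from_log(logs))       # same value up to floating-point error
```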
Section D
Downtrend
Definition: A market condition where prices make a series of lower highs and lower lows over time, indicating sustained selling pressure.
Characteristics:
- Each price peak is lower than the previous peak
- Each price trough is lower than the previous trough
- Moving averages slope downward
- Negative momentum
Technical identification:
- Price consistently below moving averages
- Sequence of lower highs and lower lows
- Negative MACD and RSI in lower ranges
Trading implications:
- Trend-following strategies take short positions
- Mean-reversion strategies wait for reversal signals
- HMM may identify as Bear regime
Use case: Identifying directional bias in markets. Downtrends often persist longer than expected, making trend-following profitable.
See also: Uptrend, Bear Market, Momentum, Trend-Following
Drawdown
Definition: The decline in value from a peak to a trough in an investment or portfolio, expressed as a percentage.
Formula:
$$\text{DD}_t = \frac{V_t - \max_{\tau \leq t} V_\tau}{\max_{\tau \leq t} V_\tau}$$
where $V_t$ is portfolio value at time $t$ and $\max_{\tau \leq t} V_\tau$ is the running maximum.
Interpretation: A drawdown of -20% means the portfolio has declined 20% from its peak. Drawdowns are always negative or zero.
Recovery time: Number of periods needed to return to previous peak. Long recovery times indicate severe market stress.
Use case: Essential risk metric for portfolio management. Large drawdowns are psychologically difficult to tolerate and may force liquidation. HMM regimes help predict drawdown risk.
See also: Maximum Drawdown, Risk Management, Value at Risk
Reference: Investopedia - Drawdown
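The drawdown formula reduces to tracking a running maximum. The sketch below assumes strictly positive portfolio values; the function name and the sample values are illustrative.

```python
def drawdown_series(values):
    """Drawdown at each step relative to the running peak (always <= 0)."""
    running_max = float("-inf")
    drawdowns = []
    for v in values:
        running_max = max(running_max, v)
        drawdowns.append((v - running_max) / running_max)
    return drawdowns


values = [100.0, 120.0, 90.0, 110.0, 130.0]   # hypothetical portfolio values
dd = drawdown_series(values)
print(dd)
print("maximum drawdown:", min(dd))   # -0.25: the fall from 120 to 90
```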
Section E
Exponential Moving Average (EMA)
Definition: A type of moving average that gives more weight to recent prices, making it more responsive to new information than a simple moving average.
Formula:
$$EMA_t = \alpha \cdot P_t + (1-\alpha) \cdot EMA_{t-1}$$
where:
- $P_t$ is the current price
- $\alpha = \frac{2}{n+1}$ is the smoothing factor
- $n$ is the period (e.g., 12-day, 26-day)
Characteristics:
- More responsive to recent price changes than SMA
- Reduces lag compared to simple moving average
- Commonly used in MACD indicator (EMA12 - EMA26)
Common periods:
- EMA(12): Short-term trend
- EMA(26): Medium-term trend
- EMA(50), EMA(200): Long-term trends
Interpretation:
- Price > EMA → Uptrend
- Price < EMA → Downtrend
- EMA crossovers generate trading signals
Use case: Trend identification and signal generation. MACD uses difference between EMA(12) and EMA(26). Can confirm HMM regime transitions.
See also: Moving Average, Simple Moving Average, MACD, Trend-Following
Reference: Investopedia - Exponential Moving Average
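A minimal sketch of the EMA recursion follows. The formula does not say how to start the recursion; seeding with the first price is an assumption made here, and some implementations seed with a simple moving average instead.

```python
def ema(prices, period):
    """Exponential moving average with smoothing factor 2 / (period + 1)."""
    alpha = 2.0 / (period + 1)
    value = prices[0]          # seed with the first price (assumption)
    out = [value]
    for p in prices[1:]:
        value = alpha * p + (1.0 - alpha) * value
        out.append(value)
    return out


prices = [10.0, 11.0, 12.0]
print(ema(prices, period=3))   # alpha = 0.5 gives [10.0, 10.5, 11.25]
```

The lag is visible even in this tiny example: prices rise by 1 per step, but the average trails behind them.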
Expectation-Maximization
Definition: An iterative algorithm for finding maximum likelihood estimates of parameters in models with latent (hidden) variables.
Process:
- E-step: Compute expected value of log-likelihood with respect to current parameter estimates
- M-step: Maximize this expectation to find new parameter estimates
- Repeat until convergence
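Key property: The log-likelihood never decreases from one iteration to the next (monotonic convergence). The algorithm converges to a local maximum, which is not necessarily the global one, so results can depend on initialization.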
Use case: Baum-Welch is an EM algorithm specialized for HMMs. Used whenever you have hidden variables that need to be marginalized out.
See also: Baum-Welch Algorithm, Maximum Likelihood Estimation
Emission Matrix
Definition: Matrix of probabilities defining how likely each observation is given each hidden state in an HMM.
Formula:
$$b_j(k) = P(o_k | s_j)$$
where $o_k$ is observation $k$ and $s_j$ is state $j$.
For Gaussian emissions:
$$b_j(x) = \frac{1}{\sqrt{2\pi\sigma_j^2}} \exp\left(-\frac{(x-\mu_j)^2}{2\sigma_j^2}\right)$$
Interpretation: Each state has a characteristic distribution of outputs. In financial HMMs, each regime (state) has a typical mean return and volatility (emission parameters).
Example: Bull regime might emit returns with mean $\mu = 0.001$ and standard deviation $\sigma = 0.01$ (positive mean, low volatility), while Crisis regime emits returns with $\mu = -0.003$ and $\sigma = 0.03$ (negative mean, high volatility).
See also: Transition Matrix, HMM, Baum-Welch Algorithm
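The Gaussian emission density can be evaluated for each regime to see which one better explains a given return. The sketch below reuses the Bull and Crisis numbers from the example, reading the second number in each pair as a standard deviation; the function name is an illustrative assumption.

```python
import math


def gaussian_emission(x, mu, sigma):
    """Density of observation x under a state with mean mu and std dev sigma."""
    var = sigma ** 2
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


bull = (0.001, 0.01)      # (mean, standard deviation)
crisis = (-0.003, 0.03)

daily_return = -0.04      # a hypothetical 4% one-day loss
print("bull:  ", gaussian_emission(daily_return, *bull))
print("crisis:", gaussian_emission(daily_return, *crisis))
```

The Crisis density is far larger for this observation, which is the kind of evidence the forward pass accumulates when it shifts probability toward that state.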
Euphoric
Definition: A market condition where prices are increasing sharply, characterized by excessive optimism and potentially unsustainable gains.
Characteristics: Very high positive returns, increasing volatility, often precedes corrections
Statistical signature: High positive mean return, elevated volatility, may show unsustainable trends
Use case: Important regime for risk management - detecting euphoria can signal profit-taking opportunities before reversal.
See also: Bull Market, Crisis, Regime
Expected Shortfall
Definition: The average loss that occurs beyond the Value at Risk (VaR) threshold, providing a more complete picture of tail risk than VaR alone. Also known as Conditional Value at Risk (CVaR).
Formula:
$$ES_\alpha = E[r \mid r \leq -VaR_\alpha] = -\frac{1}{\alpha} \int_0^\alpha VaR_p \, dp$$
where $\alpha$ is the tail probability (e.g., 0.05 for 95% confidence) and $VaR_p$ is expressed as a positive loss, so $ES_\alpha$ comes out as a negative return.
Interpretation: If VaR(95%) = -$10,000, and ES(95%) = -$15,000, this means: on the worst 5% of days, the average loss is $15,000 (not just $10,000).
Why ES is preferred over VaR:
- ES captures tail severity: VaR only gives the threshold, ES gives the average loss beyond it
- ES is coherent: It satisfies all axioms of coherent risk measures, including subadditivity, which VaR can violate
- ES for position sizing: Better than VaR for capital allocation
Example:
- VaR(95%) = -2%: Losses exceed 2% on 5% of days
- ES(95%) = -3%: When losses exceed VaR, average loss is 3%
Use case: Superior to VaR for risk management and position sizing. In HMM context, calculate regime-specific ES to size positions appropriately.
See also: Value at Risk, Tail Risk, Confidence Level, Position Sizing
Reference: Wikipedia - Expected Shortfall
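A historical (empirical) estimate is one simple way to compute both measures from a return sample. The sketch below reports VaR and ES as returns, so losses appear as negative numbers, matching the examples in this entry. The function name, the tail-counting rule, and the sample data are assumptions; other quantile conventions give slightly different values.

```python
def historical_var_es(returns, tail_prob=0.05):
    """Empirical VaR and ES, both expressed as (negative) returns."""
    ordered = sorted(returns)                       # worst returns first
    # Number of observations in the tail; use at least one.
    n_tail = max(1, int(len(ordered) * tail_prob))
    tail = ordered[:n_tail]
    var = tail[-1]              # mildest return that still falls in the tail
    es = sum(tail) / n_tail     # average return within the tail
    return var, es


# Hypothetical sample: five bad days among 100 observations.
sample_returns = [-0.10, -0.08, -0.06, -0.04, -0.02] + [0.001] * 95
var_95, es_95 = historical_var_es(sample_returns, tail_prob=0.05)
print(var_95, es_95)   # VaR is -0.02, ES is about -0.06
```

ES is at least as severe as VaR by construction, since it averages the returns at or below the VaR threshold.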
Section F
Forward-Backward Algorithm
Definition: An algorithm that computes the probability of an observation sequence given an HMM, and the probability of being in each state at each time step.
Purpose: Solves the “evaluation problem” - what is $P(\text{observations} | \lambda)$?
Components:
- Forward variable $\alpha_t(i) = P(o_1, ..., o_t, q_t = s_i | \lambda)$: Probability of observing sequence up to time $t$ and being in state $i$
- Backward variable $\beta_t(i) = P(o_{t+1}, ..., o_T | q_t = s_i, \lambda)$: Probability of observing remaining sequence given state $i$ at time $t$
Use case: Core component of Baum-Welch algorithm. Also used for smoothing - estimating state probabilities at all time points given full observation sequence.
See also: Baum-Welch Algorithm, Viterbi Algorithm, HMM
Reference: Read our HMM Algorithms Explained article
Forward Variable
Definition: In the forward-backward algorithm for HMMs, the forward variable represents the probability of observing the sequence up to time $t$ and being in state $i$ at time $t$.
Formula:
$$\alpha_t(i) = P(o_1, o_2, \ldots, o_t, q_t = s_i | \lambda)$$
where:
- $o_1, \ldots, o_t$ are observations up to time $t$
- $q_t = s_i$ is being in state $i$ at time $t$
- $\lambda$ is the HMM model parameters
Recursion:
$$\alpha_t(i) = \left[\sum_{j=1}^N \alpha_{t-1}(j) a_{ji}\right] b_i(o_t)$$
Initialization:
$$\alpha_1(i) = \pi_i b_i(o_1)$$
Interpretation: Forward variables are computed recursively from the beginning of the sequence forward. They accumulate evidence for each state based on all observations seen so far.
Use case: Core component of forward-backward algorithm. Used to compute the likelihood $P(O|\lambda)$ and for filtering - estimating the current state given observations so far.
See also: Backward Variable, Forward-Backward Algorithm, Baum-Welch Algorithm, Smoothing
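The initialization and recursion can be sketched for a small discrete-emission HMM. This is an unscaled illustration: the function name, the list-of-lists layout, and the toy parameters are assumptions, and long sequences would need scaling or log-space arithmetic to avoid underflow.

```python
def forward(pi, A, B, obs):
    """Return alpha[t][i] for a discrete-emission HMM.

    pi[i]:   initial probability of state i
    A[j][i]: transition probability from state j to state i
    B[i][k]: probability that state i emits symbol k
    obs:     list of observed symbol indices o_1 .. o_T
    """
    n_states = len(pi)
    # Initialization: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(n_states)]]
    # Recursion: propagate through the transitions, then weight by the emission.
    for t in range(1, len(obs)):
        prev = alpha[-1]
        alpha.append([
            sum(prev[j] * A[j][i] for j in range(n_states)) * B[i][obs[t]]
            for i in range(n_states)
        ])
    return alpha


# Toy two-state model (hypothetical numbers).
pi = [0.5, 0.5]
A = [[0.9, 0.1], [0.2, 0.8]]
B = [[0.7, 0.3], [0.4, 0.6]]
obs = [0, 1]

alpha = forward(pi, A, B, obs)
likelihood = sum(alpha[-1])                        # P(O | lambda)
filtered = [a / likelihood for a in alpha[-1]]     # P(q_T = s_i | o_1..o_T)
print(likelihood, filtered)
```

Normalizing the final row gives the filtered state probabilities, which is the quantity used for real-time regime estimates.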
Section G
Geometric Brownian Motion (GBM)
Definition: A continuous-time stochastic process where the logarithm of the variable follows a Brownian motion with drift.
Formula:
$$dS_t = \mu S_t dt + \sigma S_t dW_t$$
where $\mu$ is drift, $\sigma$ is volatility, and $W_t$ is a Wiener process.
Key property: Prices $S_t$ are log-normally distributed, so log returns over any fixed interval are normally distributed under the model. Real market returns satisfy this only approximately.
Use case: Classical model for stock price movements. Foundation for Black-Scholes option pricing. HMMs extend GBM by allowing parameters to switch between regimes.
See also: Log Returns, Volatility
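A GBM path can be simulated with the exact discretization $S_{t+\Delta t} = S_t \exp\left((\mu - \tfrac{1}{2}\sigma^2)\Delta t + \sigma\sqrt{\Delta t}\,Z\right)$ with $Z \sim \mathcal{N}(0, 1)$, which follows from the SDE above. The function name, the daily step size, and the parameter values are illustrative assumptions.

```python
import math
import random


def simulate_gbm(s0, mu, sigma, n_steps, dt=1.0 / 252, seed=0):
    """Simulate one GBM price path using the exact log-normal step."""
    rng = random.Random(seed)       # seeded for reproducibility
    path = [s0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)
        # Log return over one step: drift (with the -sigma^2/2 correction) plus shock.
        log_return = (mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z
        path.append(path[-1] * math.exp(log_return))
    return path


path = simulate_gbm(s0=100.0, mu=0.08, sigma=0.20, n_steps=252)
print(len(path), min(path) > 0.0)   # prices stay positive by construction

# With zero volatility the path grows deterministically at rate mu.
no_vol = simulate_gbm(s0=100.0, mu=0.08, sigma=0.0, n_steps=252)
print(no_vol[-1])                   # about 100 * exp(0.08)
```

A regime-switching extension would draw `mu` and `sigma` from the current hidden state at each step instead of keeping them fixed.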
Section H
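Heteroskedasticity
Definition: The property of a time series where the variance (volatility) changes over time rather than remaining constant.
Contrast with homoskedasticity: Constant variance over time.
Formula (ARCH test): Engle's ARCH test regresses squared residuals on their own lags; rejecting the null hypothesis of no ARCH effects indicates that $\sigma_t^2$ varies over time.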
Interpretation: Financial returns exhibit strong heteroskedasticity - volatility clusters (high volatility followed by high volatility, low by low). This is why volatility modeling (GARCH) is important.
Use case: Violates assumptions of basic linear regression. HMMs naturally handle heteroskedasticity by allowing different volatility in different states.
See also: Volatility, Standard Deviation
Reference: Investopedia - Heteroskedasticity
Histogram
Definition: A graphical representation showing the frequency distribution of data by dividing it into bins.
Use case: Visual tool for identifying distribution shape. Use to check if returns are approximately normal, or if fat tails/skewness are present.
Interpretation: Bell-shaped histogram suggests normal distribution. Fat tails indicate excess kurtosis. Asymmetry indicates skewness.
See also: Q-Q Plot, Kurtosis, Skewness
Reference: Tableau - What is a Histogram
Hidden Markov Model (HMM)
Definition: A statistical model for systems with hidden (unobservable) states, where state transitions and observations are probabilistic.
Five components: $\lambda = (S, O, A, B, \pi)$
- $S$: Hidden states (e.g., Bull, Bear, Sideways)
- $O$: Observations (e.g., daily returns)
- $A$: Transition matrix (state-to-state probabilities)
- $B$: Emission matrix (observation probabilities given state)
- $\pi$: Initial state distribution
Key assumptions:
- Markov property: $P(q_t | q_{t-1}, ..., q_1) = P(q_t | q_{t-1})$
- Output independence: $P(o_t | q_t, o_{t-1}, ...) = P(o_t | q_t)$
Three fundamental problems:
- Evaluation: $P(\text{observations} | \lambda)$ - solved by Forward-Backward (the forward pass alone is sufficient)
- Decoding: Most likely state sequence - solved by Viterbi
- Learning: Find best $\lambda$ - solved by Baum-Welch
Use case: Detecting market regimes from price data. The “hidden” part is the market regime (bull/bear/crisis), and the “observations” are daily returns.
See also: Baum-Welch Algorithm, Viterbi Algorithm, Forward-Backward Algorithm, Regime, Markov Property
Reference: Read our Guide to HMMs and HMM 101 articles
Section I
In-Sample
Definition: The dataset used to train, fit, or optimize a model’s parameters.
Purpose: Develop and calibrate model by finding parameters that best explain historical patterns.
Characteristics:
- Used for parameter estimation
- Model has “seen” this data
- Performance metrics may be optimistic
- Risk of overfitting if model too complex
Typical split: 70-80% of data for in-sample training.
Interpretation: In-sample performance represents how well the model explains known data, not how well it predicts new data.
Pitfalls:
- Overfitting: High in-sample performance, poor out-of-sample
- Data snooping: Repeatedly optimizing on same data
- Look-ahead bias: Using future information in training
Use case: For HMMs, train model parameters (transition matrix, emissions) on in-sample data, then validate on out-of-sample.
See also: Out-of-Sample Testing, Backtesting, Overfitting, Walk-Forward Analysis
Initial State Distribution (π)
Definition: The probability distribution over hidden states at the start of the observation sequence.
Formula:
$$\pi_i = P(q_1 = s_i)$$
where $q_1$ is the state at time $t=1$.
Constraint: $\sum_i \pi_i = 1$
Interpretation: Represents prior belief about which state the system starts in. Often initialized uniformly ($\pi_i = 1/N$) if no prior knowledge.
Use case: One of three parameter sets estimated by Baum-Welch. Less critical than transition/emission matrices for long sequences.
See also: Transition Matrix, Emission Matrix, HMM
Inverse CDF (Quantile Function)
Definition: The inverse of the cumulative distribution function (CDF), mapping probabilities to values. Given a probability $p$, returns the value $x$ such that $P(X \leq x) = p$.
Formula:
$$F^{-1}(p) = x \text{ such that } F(x) = p$$
where $F$ is the CDF.
Interpretation: For a 95% confidence level (p=0.95), $F^{-1}(0.95)$ gives the value below which 95% of the data falls.
Common use - VaR calculation:
$$\text{VaR}_\alpha = -F^{-1}(\alpha)$$
For example, VaR at 5% level uses the inverse CDF at p=0.05.
Relationship to quantiles: The inverse CDF is the quantile function. The pth quantile is $F^{-1}(p)$.
Use case: Essential for Value at Risk calculations and generating random samples from distributions. In HMM context, use regime-specific inverse CDF to calculate tail risk.
See also: Value at Risk, Quantile, Confidence Level
Reference: Wikipedia - Quantile Function
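For a normal return distribution, the inverse CDF is available in Python's standard library as `statistics.NormalDist.inv_cdf`. The sketch below applies $\text{VaR}_\alpha = -F^{-1}(\alpha)$ to two regimes; the function name is an assumption, the regime parameters are the Bull and Crisis values from the Emission Matrix example, and assuming normality will understate tail risk when returns have fat tails.

```python
from statistics import NormalDist


def parametric_var(mu, sigma, tail_prob=0.05):
    """VaR as a positive loss: minus the tail_prob-quantile of N(mu, sigma)."""
    return -NormalDist(mu, sigma).inv_cdf(tail_prob)


# Regime parameters as (mean, standard deviation) of daily returns.
regimes = {"bull": (0.001, 0.01), "crisis": (-0.003, 0.03)}

for name, (mu, sigma) in regimes.items():
    print(name, round(parametric_var(mu, sigma), 4))
```

The standard normal 5% quantile is about -1.645, so each VaR equals roughly $1.645\sigma - \mu$.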
Section J
Jarque-Bera Test
Definition: A statistical test for normality based on sample skewness and kurtosis.
Formula:
$$JB = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$$
where $S$ is sample skewness, $K$ is sample kurtosis (not excess kurtosis, hence the $K-3$ term), and $n$ is sample size.
Hypothesis:
- Null ($H_0$): Data is normally distributed
- Alternative ($H_1$): Data is not normal
Interpretation: Low p-value (< 0.05) rejects normality. Financial returns typically fail this test due to fat tails and skewness.
Use case: Testing distributional assumptions. HMMs can still work with non-normal data if mixture of Gaussian states approximates the true distribution.
Reference: Wikipedia - Jarque-Bera Test
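The statistic can be computed from the second, third, and fourth central moments. In the sketch below the p-value uses the asymptotic chi-square distribution with 2 degrees of freedom, whose survival function is $e^{-x/2}$; that approximation is unreliable for small samples, and the function name and toy sample are illustrative assumptions.

```python
import math


def jarque_bera(sample):
    """Return (JB statistic, asymptotic p-value) for a list of numbers."""
    n = len(sample)
    mean = sum(sample) / n
    # Central moments (dividing by n, as in the usual definition of the test).
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2            # raw kurtosis; equals 3 for a normal distribution
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)  # chi-square(2) survival function
    return jb, p_value


jb, p_value = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
print(jb, p_value)
```

A sample this small is far too short for the asymptotic p-value to be meaningful; it is used here only so the arithmetic can be checked by hand.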
Section K
KPSS Test
Definition: A statistical test for stationarity (Kwiatkowski-Phillips-Schmidt-Shin test).
Hypothesis:
- Null ($H_0$): Series is stationary
- Alternative ($H_1$): Series has a unit root (non-stationary)
Interpretation: High p-value (> 0.05) indicates stationarity. This is the OPPOSITE of ADF interpretation. For robust stationarity testing, use both ADF and KPSS - series should reject null in ADF and fail to reject in KPSS.
Use case: Complementary to ADF test. Log returns should come out stationary under both tests (ADF rejects, KPSS does not), while raw prices should come out non-stationary under both (ADF does not reject, KPSS rejects).
See also: ADF Test, Stationary
Reference: Wikipedia - KPSS Test
Kolmogorov-Smirnov Test (KS Test)
Definition: A non-parametric statistical test that compares two probability distributions to determine if they differ significantly. Can compare a sample distribution to a reference distribution, or compare two sample distributions.
Test Statistic:
$$D = \sup_x |F_1(x) - F_2(x)|$$
where $F_1$ and $F_2$ are the empirical cumulative distribution functions (CDFs) of the two samples, and $\sup$ denotes the supremum (maximum distance).
Hypothesis:
- Null ($H_0$): The two distributions are the same
- Alternative ($H_1$): The distributions are different
Interpretation: Low p-value (< 0.05) indicates distributions are significantly different. The test measures the maximum vertical distance between two CDFs. Larger distances (higher D statistic) indicate more different distributions.
Advantages:
- Non-parametric (makes no assumptions about distribution shape)
- Sensitive to differences in both location and shape
- Easy to interpret (maximum distance between CDFs)
Use case: Common in regime detection to validate that market behavior changed between periods. For example, comparing return distributions before and after a crisis to quantify paradigm shifts. Also used to test if returns follow a normal distribution.
See also: Time Series, Stationary, Jarque-Bera Test
Reference: Wikipedia - Kolmogorov-Smirnov Test
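The two-sample statistic $D$ can be computed by comparing the empirical CDFs at every observed value. The sketch below covers only the statistic, not the p-value; the function name and toy samples are illustrative assumptions.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Maximum vertical distance between two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    # Empirical CDFs only change at observed values, so checking those is enough.
    points = sorted(set(a) | set(b))
    d = 0.0
    for x in points:
        cdf_a = bisect_right(a, x) / len(a)   # share of sample_a that is <= x
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


before = [1.0, 2.0, 3.0, 4.0]   # hypothetical samples
after = [3.0, 4.0, 5.0, 6.0]
print(ks_statistic(before, after))    # 0.5
print(ks_statistic(before, before))   # 0.0 for identical samples
```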
Kurtosis
Definition: A measure of the “tailedness” of a probability distribution, indicating the frequency of extreme values.
Formula:
$$\text{Kurt}[X] = \frac{E[(X-\mu)^4]}{(\sigma^2)^2}$$
Excess kurtosis: $\text{Kurt}[X] - 3$ (normal distribution has kurtosis = 3)
Interpretation:
- Excess kurtosis = 0: Normal distribution (mesokurtic)
- Excess kurtosis > 0: Fatter tails than normal (leptokurtic) - more extreme events
- Excess kurtosis < 0: Thinner tails than normal (platykurtic)
Financial reality: Stock returns often show excess kurtosis in the range of 3 to 20, meaning extreme events (crashes, rallies) occur much more frequently than a normal distribution predicts.
Use case: Assessing tail risk. High kurtosis indicates need for robust risk management. HMMs can capture fat tails through mixture of states.
See also: Jarque-Bera Test, Skewness
Reference: Investopedia - Kurtosis
Section L
Leverage
Definition: The use of borrowed capital or financial derivatives to amplify potential returns (and losses) from an investment.
Formula:
$$\text{Leverage Ratio} = \frac{\text{Total Position Size}}{\text{Equity}}$$
Types:
- 2x leverage: $100 equity controls $200 position
- Margin: Borrowing from broker
- Derivatives: Options, futures provide built-in leverage
Risk magnification: Leverage multiplies both gains AND losses.
Example:
- Without leverage: 10% gain = $10 profit on $100
- With 2x leverage: 10% gain = $20 profit, but 10% loss = -$20 (20% of equity)
Margin call: If losses reduce equity below maintenance requirement, broker may force liquidation.
Use case: Can enhance returns in favorable regimes, but dramatically increases risk. HMM regime detection can inform when to use leverage (Bull regimes) vs when to reduce (Bear/Crisis).
Important: Leverage can lead to total loss of capital. Use cautiously with proper risk management.
See also: Position Sizing, Risk Management, Maximum Drawdown
Reference: Investopedia - Leverage
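The arithmetic in the example can be written out as a small helper. This sketch ignores financing costs and margin requirements, and the function name is an illustrative assumption.

```python
def leveraged_pnl(equity, leverage, asset_return):
    """Return (profit or loss, return on equity) for a leveraged position."""
    position = equity * leverage        # total position size
    pnl = position * asset_return       # gain or loss on the full position
    return pnl, pnl / equity            # the loss is borne by equity alone


print(leveraged_pnl(100.0, 1.0, 0.10))    # unlevered: about $10, 10% of equity
print(leveraged_pnl(100.0, 2.0, 0.10))    # 2x: about $20, 20% of equity
print(leveraged_pnl(100.0, 2.0, -0.10))   # 2x: about -$20, -20% of equity
```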
Long Position
Definition: Owning an asset with the expectation that its price will increase, profiting from upward price movement.
Mechanics:
- Buy asset at price $P_0$
- Hold asset
- Sell at higher price $P_1$
- Profit = $P_1 - P_0$ (minus costs)
Characteristics:
- Maximum loss: Purchase price (if asset goes to zero)
- Maximum gain: Unlimited (theoretically)
- Cash flow: Requires upfront capital
- Holding period: Can be indefinite
In returns: Long position profits when returns are positive.
Use case: Default position in buy-and-hold strategies. HMM strategies take long positions in Bull regimes, exit to cash in Bear regimes.
See also: Short Position, Buy and Hold, Bull Market, Position Sizing
Log Likelihood
Definition: The natural logarithm of the likelihood function, measuring how well a statistical model fits observed data.
Formula:
$$\ell(\theta) = \log L(\theta | x) = \sum_{i=1}^n \log f(x_i | \theta)$$where $f(x_i | \theta)$ is the probability density function.
Why use log:
- Converts products to sums (numerically stable)
- Easier to optimize (differentiable)
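- Comparable across sample sizes once divided by $n$ (average log-likelihood per observation); the raw sum scales with the number of observations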
Interpretation: Higher values indicate better fit. Note: Often reported as negative log-likelihood where lower is better.
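Worked example (sketch): The snippet below evaluates the sum-of-logs formula for a normal density and shows why the raw product of densities is avoided. The return values and the two candidate $\sigma$ values are made up for illustration, and the observations are assumed to be i.i.d.

```python
import math


def gaussian_logpdf(x, mu, sigma):
    """Log of the normal density f(x | mu, sigma)."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)


def log_likelihood(data, mu, sigma):
    """Sum of log densities: the log-likelihood for i.i.d. observations."""
    return sum(gaussian_logpdf(x, mu, sigma) for x in data)


# Hypothetical daily log returns, repeated to mimic a long sample (2000 points).
returns = [0.004, -0.012, 0.007, 0.001, -0.003] * 400

ll_good = log_likelihood(returns, mu=0.0, sigma=0.007)  # sigma close to the data
ll_bad = log_likelihood(returns, mu=0.0, sigma=0.10)    # sigma far too large

# The raw likelihood is a product of 2000 densities and leaves the
# representable floating-point range, while the sum of logs stays finite.
raw_likelihood = math.prod(math.exp(gaussian_logpdf(x, 0.0, 0.007)) for x in returns)

print(f"log-likelihood (sigma=0.007): {ll_good:.1f}")
print(f"log-likelihood (sigma=0.10):  {ll_bad:.1f}")
print(f"raw likelihood: {raw_likelihood}")
```

The better-fitting $\sigma$ yields the higher log-likelihood, which is the comparison Baum-Welch relies on between iterations.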
Use case: Core metric in Baum-Welch algorithm. Monitors convergence - training stops when log-likelihood increase falls below threshold.
See also: AIC, BIC, Maximum Likelihood Estimation
Reference: StatLect - Log Likelihood
Log Returns
Definition: The logarithmic change in price from one period to the next.
Formula:
$$R_t = \log\left(\frac{P_t}{P_{t-1}}\right) = \log(P_t) - \log(P_{t-1})$$
Key properties:
- Time-additive: $R_{1 \to T} = R_1 + R_2 + ... + R_T$
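- Approximately stationary: Mean and variance are roughly stable over time (unlike raw prices)
- Scale-invariant: Comparable across different price levels
- Symmetric: A log return of +0.10 followed by -0.10 returns the price to its starting level, unlike +10% then -10% in simple returns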
- Approximately normal for small returns
Converting to percentage: $\%\text{return} = (e^R - 1) \times 100$
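Worked example (sketch): A minimal check of the formula, the time-additivity property, and the percentage conversion, using a made-up price series.

```python
import math

prices = [100.0, 102.0, 99.0, 105.0]  # hypothetical closing prices

# R_t = log(P_t / P_{t-1})
log_returns = [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

# Time-additivity: one-period log returns sum to the full-period log return.
total_log_return = sum(log_returns)
assert math.isclose(total_log_return, math.log(prices[-1] / prices[0]))

# Convert back to a percentage return: (e^R - 1) * 100
pct_return = (math.exp(total_log_return) - 1) * 100
print(f"total log return {total_log_return:.4f} -> {pct_return:.2f}%")
```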
Why use instead of simple returns:
- Required for statistical modeling (stationarity)
- Mathematically elegant (additivity)
- Small return approximation: $\log(1+r) \approx r$
Use case: ALWAYS use log returns for HMMs and time series analysis. Raw prices violate stationarity assumption.
See also: Simple Returns, Stationary, Scale Invariance
Reference: Read our Why Use Log Returns? article
Log Space
Definition: Representing values using their logarithms rather than original scale.
Purpose:
- Converts multiplicative relationships to additive
- Compresses large ranges (makes visualization easier)
- Stabilizes variance
Example: Price $P_t$ in log space is $\log(P_t)$. Returns in log space are $\log(P_t/P_{t-1})$.
Use case: Financial modeling uses log space because price changes are multiplicative (percentage-based) not additive (dollar-based).
See also: Log Returns
Section M
MACD (Moving Average Convergence Divergence)
Definition: A trend-following momentum indicator that shows the relationship between two exponential moving averages of an asset’s price.
Formula:
$$\text{MACD Line} = EMA_{12} - EMA_{26}$$
$$\text{Signal Line} = EMA_9(\text{MACD})$$
$$\text{Histogram} = \text{MACD Line} - \text{Signal Line}$$
Components:
- MACD Line: Difference between fast (12) and slow (26) EMAs
- Signal Line: 9-period EMA of MACD line
- Histogram: Visual representation of divergence
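Worked example (sketch): The three components can be computed as follows. This is one common convention, not necessarily the one used by any particular charting package: the EMA smoothing factor is assumed to be $2/(n+1)$ and each EMA is seeded with the first value. The function names `ema` and `macd` are illustrative.

```python
def ema(values, span):
    """Exponential moving average, alpha = 2 / (span + 1), seeded with values[0]."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out


def macd(prices, fast=12, slow=26, signal=9):
    """Return (macd_line, signal_line, histogram) as equal-length lists."""
    macd_line = [f - s for f, s in zip(ema(prices, fast), ema(prices, slow))]
    signal_line = ema(macd_line, signal)
    histogram = [m - s for m, s in zip(macd_line, signal_line)]
    return macd_line, signal_line, histogram


# Hypothetical steadily rising prices: the fast EMA stays above the slow EMA.
prices = [100.0 + 0.5 * i for i in range(60)]
macd_line, signal_line, histogram = macd(prices)
print(f"last MACD {macd_line[-1]:.3f}, signal {signal_line[-1]:.3f}, "
      f"histogram {histogram[-1]:.3f}")
```

In a steady uptrend the MACD line is positive, matching the "MACD > 0: Upward momentum" reading.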
Trading signals:
- Bullish: MACD crosses above signal line
- Bearish: MACD crosses below signal line
- Divergence: Price makes new high/low but MACD doesn’t (reversal signal)
Interpretation:
- MACD > 0: Upward momentum
- MACD < 0: Downward momentum
- Increasing histogram: Strengthening trend
- Decreasing histogram: Weakening trend
Use case: Trend identification and momentum measurement. Can confirm HMM regime transitions or provide early warning signals.
See also: EMA, Moving Average, Momentum, Trend-Following
Reference: Investopedia - MACD
Markov Property
Definition: The property that the conditional probability distribution of future states depends only on the present state, not on the sequence of events that preceded it.
Formula:
$$P(X_{t+1} = x | X_t = x_t, X_{t-1} = x_{t-1}, ..., X_0 = x_0) = P(X_{t+1} = x | X_t = x_t)$$
Interpretation: “The future is independent of the past given the present.” No “memory” beyond current state.
First-order Markov: Depends only on previous state (most HMMs)
Higher-order Markov: Depends on multiple previous states (more complex)
Use case: Core assumption of HMMs. Makes computation tractable. Reasonable approximation for many financial processes.
See also: HMM, Random Walk
Reference: Wikipedia - Markov Property
Maximum Drawdown
Definition: The largest peak-to-trough decline in portfolio value over a specified time period, representing the worst loss an investor who bought at the peak would have experienced during that period.
Formula:
$$\text{MDD} = \max_{t} \left[ \frac{\max_{\tau \leq t} V_\tau - V_t}{\max_{\tau \leq t} V_\tau} \right]$$where $V_t$ is portfolio value at time $t$.
Interpretation: An MDD of 30% means the portfolio declined 30% from its peak at some point. This is the worst historical drawdown.
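Worked example (sketch): The formula can be evaluated in a single pass by tracking the running peak. The equity curve below is invented for illustration.

```python
def max_drawdown(values):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = values[0]
    mdd = 0.0
    for v in values:
        peak = max(peak, v)                # running maximum of V up to time t
        mdd = max(mdd, (peak - v) / peak)  # drawdown at time t
    return mdd


equity_curve = [100.0, 120.0, 90.0, 110.0, 130.0, 104.0]  # hypothetical values
print(f"MDD = {max_drawdown(equity_curve):.1%}")
```

Here the drop from 120 to 90 (25%) exceeds the later drop from 130 to 104 (20%), so the MDD is 25%.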
Key metric: MDD shows the maximum pain an investor would have endured, which affects psychology and ability to stay invested.
Recovery factor: $\frac{\text{Total Return}}{\text{MDD}}$ measures return per unit of maximum risk taken.
Acceptable levels:
- < 10%: Low risk, conservative
- 10-20%: Moderate risk
- 20-30%: High risk
- > 30%: Very high risk, may cause panic selling
Use case: Essential for risk assessment and strategy evaluation. Regime-based strategies aim to reduce MDD by exiting during Bear/Crisis regimes.
See also: Drawdown, Sharpe Ratio, Risk Management, Backtesting
Reference: Investopedia - Maximum Drawdown
Mean-Reversion
Definition: A trading strategy based on the theory that prices tend to return to their average over time after deviating from it.
Core hypothesis: Extreme price movements are temporary and will reverse toward the mean.
Mathematical basis: If a time series is stationary with mean $\mu$, deviations from $\mu$ should be temporary.
Trading approach:
- Buy when price significantly below mean
- Sell when price significantly above mean
- Uses indicators like Bollinger Bands, RSI
When it works:
- Sideways/ranging markets
- Stationary price series
- Over-reactions to news
- High-frequency trading
When it fails:
- Strong trends (Bull or Bear regimes)
- Structural changes in asset fundamentals
- Regime shifts
Contrast with trend-following: Mean-reversion bets on reversal, trend-following bets on continuation.
Use case: Effective in Sideways regimes detected by HMMs. Switch to trend-following in Bull/Bear regimes.
See also: Trend-Following, Sideways, Reversal, Overbought, Oversold
Reference: Investopedia - Mean Reversion
Momentum
Definition: The rate of change of an asset’s price or volume, representing the strength of a price trend.
Concept: Assets that have performed well recently tend to continue performing well in the near term (momentum effect).
Measurement:
- Rate of change: $(P_t - P_{t-n}) / P_{t-n}$
- RSI: Relative strength over period
- MACD: Difference between moving averages
Momentum trading:
- Positive momentum: Buy (expect continuation)
- Negative momentum: Sell or short
- Momentum reversal: Extreme momentum may signal reversal
Characteristics:
- Works in trends: Bull and Bear regimes
- Fails in ranges: Sideways markets
- Time-dependent: Momentum measured over specific period (e.g., 12-month)
Psychological basis: Herding behavior and delayed reaction to information cause momentum.
Use case: Confirms regime strength. Strong momentum in Bull regime = high confidence. Weakening momentum may signal regime transition.
See also: Trend-Following, RSI, MACD, Reversal
Reference: Investopedia - Momentum
Moving Average
Definition: The average price of an asset over a specified number of periods, updated as new data becomes available.
Types:
- Simple Moving Average (SMA): Arithmetic mean of prices
- Exponential Moving Average (EMA): Weighted mean favoring recent prices
Common periods:
- MA(20): Short-term trend
- MA(50): Medium-term trend
- MA(200): Long-term trend
Trading signals:
- Price crosses above MA: Bullish signal
- Price crosses below MA: Bearish signal
- Golden cross: MA(50) crosses above MA(200) - bullish
- Death cross: MA(50) crosses below MA(200) - bearish
Characteristics:
- Lags price: Moving averages smooth data but delay signals
- Trend identification: Direction of MA shows trend
- Support/resistance: MA can act as dynamic support or resistance levels
Use case: Simple trend identification. Can confirm HMM regime transitions. Price crossing MA may signal regime change.
See also: SMA, EMA, MACD, Bollinger Bands, Trend-Following
Reference: Investopedia - Moving Average
Maximum Likelihood Estimation (MLE)
Definition: A method for estimating parameters of a statistical model by maximizing the likelihood function.
Goal: Find parameters $\theta^*$ that maximize $P(\text{data} | \theta)$
Formula:
$$\theta^* = \arg\max_\theta L(\theta | x) = \arg\max_\theta \log L(\theta | x)$$
Use case: Baum-Welch is an MLE algorithm for HMMs. It searches for the transition and emission parameters that best explain the observed data, converging to a local (not necessarily global) maximum of the likelihood.
Advantages: Statistically efficient, asymptotically unbiased
Disadvantages: Can overfit with small samples, may need regularization
See also: Log Likelihood, Expectation-Maximization, Baum-Welch Algorithm
Section N
Section O
Out-of-Sample Testing
Definition: Evaluating a model’s performance on data that was NOT used during training, providing an unbiased estimate of predictive performance.
Purpose: Test if model generalizes to new data or has merely overfit to training data.
Process:
- Split data into training (in-sample) and test (out-of-sample) sets
- Train model on in-sample data only
- Evaluate model on out-of-sample data
- Compare in-sample vs out-of-sample performance
Typical split: 70-30 or 80-20 (training-test)
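Worked example (sketch): For time series the split should be chronological, with no shuffling, so that the test set lies strictly after the training set. The helper `chronological_split` and the synthetic returns are hypothetical.

```python
def chronological_split(series, train_fraction=0.7):
    """Split a time-ordered series into (in_sample, out_of_sample) without shuffling."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = round(len(series) * train_fraction)
    return series[:cut], series[cut:]


# Hypothetical daily log returns, oldest first.
returns = [0.001 * ((i % 7) - 3) for i in range(1000)]
train, test = chronological_split(returns, train_fraction=0.7)
print(len(train), len(test))
```

The model would be fitted on `train` only and evaluated on `test`.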
Red flags:
- Large performance gap: In-sample much better than out-of-sample → overfitting
- Negative out-of-sample Sharpe: Strategy doesn’t generalize
- Increasing gap over time: Model becoming stale
Gold standard: Out-of-sample performance determines real-world viability, not in-sample performance.
Use case: Critical validation for HMM trading strategies. Model must perform reasonably on unseen data before live trading.
See also: In-Sample, Walk-Forward Analysis, Backtesting, Overfitting
Overbought
Definition: A condition where an asset’s price has risen significantly and may be due for a correction or reversal downward.
Technical indicators:
- RSI > 70: Traditionally considered overbought
- Price > upper Bollinger Band: Extended beyond normal range
- Stochastic > 80: Momentum indicator shows overbought
Interpretation: Overbought doesn’t mean “immediately sell”. In strong uptrends (Bull regimes), assets can remain overbought for extended periods.
Two perspectives:
- Mean-reversion: Overbought = sell signal (expect reversal)
- Trend-following: Overbought in uptrend = strength (continue holding)
Use case: Context matters. In Sideways regimes, overbought is sell signal. In Bull regimes, may be continuation signal.
See also: Oversold, RSI, Bollinger Bands, Mean-Reversion
Reference: Investopedia - Overbought
Oversold
Definition: A condition where an asset’s price has fallen significantly and may be due for a bounce or reversal upward.
Technical indicators:
- RSI < 30: Traditionally considered oversold
- Price < lower Bollinger Band: Extended below normal range
- Stochastic < 20: Momentum indicator shows oversold
Interpretation: Oversold doesn’t mean “immediately buy”. In strong downtrends (Bear regimes), assets can remain oversold for extended periods.
Two perspectives:
- Mean-reversion: Oversold = buy signal (expect bounce)
- Trend-following: Oversold in downtrend = weakness (stay out or short)
“Catching a falling knife”: Buying oversold assets in Bear regimes can lead to further losses.
Use case: Context matters. In Sideways regimes, oversold is buy signal. In Bear/Crisis regimes, may indicate continued weakness.
See also: Overbought, RSI, Bollinger Bands, Mean-Reversion
Reference: Investopedia - Oversold
Overfitting
Definition: Creating a model that is too complex and fits the training data too closely, capturing noise rather than signal, resulting in poor performance on new data.
Symptoms:
- High in-sample performance
- Poor out-of-sample performance
- Too many parameters relative to data points
- Model is overly sensitive to small data changes
Causes:
- Too many states in HMM (5+ states often overfit)
- Too many features relative to sample size
- Over-optimization of parameters
- Data snooping: repeatedly testing on same data
Prevention:
- Use simpler models (2-4 states for HMMs)
- Out-of-sample validation
- Walk-forward analysis
- Regularization techniques
- Larger datasets
Detection: Compare in-sample vs out-of-sample Sharpe ratios. Large gap indicates overfitting.
Trade-off: Balance between underfitting (too simple) and overfitting (too complex).
Use case: Major risk in HMM regime detection. Prefer 3-state models over 5+ state models unless clear evidence supports complexity.
See also: In-Sample, Out-of-Sample Testing, Backtesting, Walk-Forward Analysis
Reference: Wikipedia - Overfitting
Section P
Paper Trading
Definition: Simulated trading using real market prices but virtual (fake) money, allowing traders to test strategies without financial risk.
Purpose: Validate strategy in real-time market conditions before risking capital.
Benefits:
- Risk-free learning: Practice without losing money
- Real-time validation: Test strategy on live market data
- Psychology practice: Experience emotional aspects of trading
- System debugging: Identify technical issues
Limitations:
- No real money psychology: Fear and greed feel different with real capital
- Perfect execution: May not capture slippage and market impact
- Selection bias: Tendency to be more conservative or aggressive than with real money
Best practices:
- Paper trade for 1-3 months minimum
- Use realistic position sizes
- Include transaction costs and slippage
- Track emotional responses
- Don’t cherry-pick results
Use case: Essential step between backtesting and live trading. All HMM strategies should be paper traded before real deployment.
See also: Backtesting, Transaction Costs, Slippage
Reference: Investopedia - Paper Trading
Phillips-Perron Test
Definition: A unit root test for stationarity, similar to ADF but more robust to heteroskedasticity and autocorrelation.
Advantage over ADF: Uses non-parametric correction for serial correlation, making it more robust to volatility clustering.
Interpretation: Same as ADF - low p-value indicates stationarity.
Use case: Alternative to ADF when you suspect heteroskedasticity (which is common in financial data).
See also: ADF Test, KPSS Test, Heteroskedasticity
Reference: Wikipedia - Phillips-Perron Test
Position Sizing
Definition: The process of determining how much capital to allocate to a particular trade or investment, balancing potential returns against risk.
Common methods:
- Fixed dollar: Risk same dollar amount per trade
- Fixed percentage: Risk same % of capital per trade (e.g., 2%)
- Volatility-based: Scale position by $1/\sigma$ (inverse volatility)
- Kelly Criterion: $f^* = \frac{p \cdot b - q}{b}$ where $p$ = win rate, $b$ = win/loss ratio, $q = 1-p$
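Worked example (sketch): Two of the methods above in code. The Kelly function follows the formula in the list. The fixed-percentage function assumes the risk per unit is the distance between an entry price and a stop price, which is an illustrative assumption rather than something specified above. Both function names are hypothetical.

```python
def kelly_fraction(win_rate, win_loss_ratio):
    """Kelly fraction f* = (p*b - q) / b with q = 1 - p; a negative value means no edge."""
    p, b = win_rate, win_loss_ratio
    q = 1.0 - p
    return (p * b - q) / b


def fixed_fraction_size(capital, risk_fraction, entry, stop):
    """Units to trade so that hitting the stop loses risk_fraction of capital."""
    risk_per_unit = abs(entry - stop)
    if risk_per_unit == 0:
        raise ValueError("entry and stop must differ")
    return capital * risk_fraction / risk_per_unit


print(f"Kelly fraction: {kelly_fraction(0.55, 1.5):.2f}")
print(f"Units at 2% risk: {fixed_fraction_size(100_000, 0.02, entry=50.0, stop=48.0):.0f}")
```

A 55% win rate with a 1.5 win/loss ratio gives a Kelly fraction of 0.25. Full Kelly sizing is aggressive, so many practitioners use only a fraction of it.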
Regime-based sizing:
$$\text{Position} = \text{Base} \times \frac{\sigma_{\text{target}}}{\sigma_{\text{regime}}} \times \text{Confidence} \times f(\text{Sharpe})$$
Key principles:
- Never risk more than 1-2% of capital on single trade
- Scale positions inversely with volatility
- Reduce size in low-confidence situations
- Consider correlation across positions
Common mistakes:
- Oversizing: Risking too much per trade
- Ignoring volatility: Same position size regardless of regime risk
- Ignoring correlation: Multiple correlated positions = concentrated risk
Use case: Critical for HMM trading strategies. Adjust position size based on regime volatility and confidence levels.
See also: Leverage, Risk Management, Volatility, Expected Shortfall
Reference: Investopedia - Position Sizing
Profit Factor
Definition: The ratio of gross profits to gross losses over a trading period, measuring the profitability of a trading strategy.
Formula:
$$\text{Profit Factor} = \frac{\sum \text{Winning Trades}}{\sum |\text{Losing Trades}|}$$
Interpretation:
- PF > 1: Strategy is profitable
- PF = 1: Break-even
- PF < 1: Losing strategy
- PF > 2: Strong strategy
- PF > 3: Excellent strategy
Example:
- Total wins: $10,000
- Total losses: $4,000
- Profit factor: 10,000 / 4,000 = 2.5
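Worked example (sketch): The same calculation from a list of per-trade P&L values. The individual trades are invented so that they add up to the totals in the example above.

```python
def profit_factor(trade_pnls):
    """Gross profit divided by gross loss (absolute value)."""
    gross_profit = sum(p for p in trade_pnls if p > 0)
    gross_loss = sum(-p for p in trade_pnls if p < 0)
    if gross_loss == 0:
        # No losing trades: the ratio is unbounded (or undefined with no trades).
        return float("inf") if gross_profit > 0 else float("nan")
    return gross_profit / gross_loss


trades = [4000.0, -1500.0, 6000.0, -2500.0]  # wins total 10,000, losses total 4,000
print(f"Profit factor: {profit_factor(trades):.2f}")
```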
Relationship to win rate:
- High win rate + low profit factor → Many small wins, few large losses
- Low win rate + high profit factor → Few large wins, many small losses
Use case: Evaluate trading strategy quality. Compare profit factors across different regimes. High profit factor in Bull regime validates regime-based approach.
See also: Win Rate, Sharpe Ratio, Backtesting
Reference: Investopedia - Profit Factor
Section Q
Q-Q Plot (Quantile-Quantile Plot)
Definition: A graphical tool for assessing if a dataset follows a theoretical distribution (usually normal) by plotting quantiles against each other.
Interpretation:
- Points on diagonal line → Data matches theoretical distribution
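- Points above the line at the upper (right) end, with sample quantiles on the vertical axis → Data has a heavier right tail than the theoretical distribution
- Points below the line at the lower (left) end → Data has a heavier left tail than the theoretical distribution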
- S-shaped curve → Data has both heavy tails (common for returns)
Use case: Visual test for normality. Complements statistical tests (Jarque-Bera, Shapiro-Wilk). Financial returns typically show S-curve indicating fat tails.
See also: Histogram, Jarque-Bera Test, Kurtosis
Reference: UVA - Understanding Q-Q Plots
Quantile
Definition: A value below which a given percentage of observations in a distribution fall. The pth quantile is the value $x$ such that $P(X \leq x) = p$.
Common quantiles:
- 0.25 quantile: 25th percentile (Q1, first quartile)
- 0.50 quantile: 50th percentile (median, Q2)
- 0.75 quantile: 75th percentile (Q3, third quartile)
- 0.95 quantile: 95th percentile (used in VaR)
Formula:
$$Q(p) = F^{-1}(p)$$where $F^{-1}$ is the inverse CDF.
Interpretation: The 0.95 quantile (95th percentile) is the value that separates the bottom 95% from the top 5% of the distribution.
Use in risk management:
- VaR at 5% level: Negative of 5th percentile (0.05 quantile)
- Tail risk: Focus on extreme quantiles (0.01, 0.05, 0.95, 0.99)
Interquartile range (IQR): $Q(0.75) - Q(0.25)$ measures dispersion.
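Worked example (sketch): An empirical quantile with linear interpolation between order statistics, used to compute a historical VaR and the IQR. Several interpolation conventions exist, and this is only one of them. The evenly spaced returns are synthetic.

```python
def quantile(data, p):
    """Empirical p-quantile using linear interpolation between order statistics."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    xs = sorted(data)
    pos = p * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


# Synthetic returns from -5.0% to +5.0% in 0.1% steps (101 values).
returns = [i / 1000 for i in range(-50, 51)]

var_95 = -quantile(returns, 0.05)                        # VaR at the 5% level
iqr = quantile(returns, 0.75) - quantile(returns, 0.25)  # interquartile range
print(f"VaR(5%) = {var_95:.3f}, IQR = {iqr:.3f}")
```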
Use case: Essential for tail risk measurement and VaR calculation. HMM regimes have different quantile structures - Crisis regimes have extreme lower quantiles.
See also: Inverse CDF, Value at Risk, Q-Q Plot
Reference: Wikipedia - Quantile
Section R
Random Walk
Definition: A stochastic process where each step is a random deviation from the previous value.
Formula:
$$X_t = X_{t-1} + \epsilon_t$$where $\epsilon_t$ is random noise (often $\mathcal{N}(0, \sigma^2)$).
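Key property: Non-stationary. For independent noise with variance $\sigma^2$ and a fixed starting value, $\text{Var}(X_t) = t\sigma^2$, so the variance grows linearly with time. A random walk is the cumulative sum of a stationary noise process.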
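Worked example (sketch): A small simulation illustrating that the cross-sectional variance of a random walk grows with the number of steps. The seed, path count, and step count are arbitrary choices.

```python
import random
import statistics


def random_walk(n_steps, sigma=1.0, rng=random):
    """Simulate X_t = X_{t-1} + eps_t with eps_t ~ N(0, sigma^2), starting from 0."""
    x, path = 0.0, []
    for _ in range(n_steps):
        x += rng.gauss(0.0, sigma)
        path.append(x)
    return path


rng = random.Random(42)  # fixed seed for reproducibility
paths = [random_walk(100, sigma=1.0, rng=rng) for _ in range(2000)]

# Variance across paths after 10 steps and after 100 steps.
var_10 = statistics.pvariance([p[9] for p in paths])
var_100 = statistics.pvariance([p[99] for p in paths])
print(f"variance after 10 steps: {var_10:.1f}, after 100 steps: {var_100:.1f}")
```

With unit-variance steps, the two estimates should land near 10 and 100 respectively.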
Efficient Market Hypothesis: Prices follow random walk because all information is immediately incorporated.
Use case: Null hypothesis for many market models. If HMM detects regime structure, it provides evidence against pure random walk.
See also: Stationary, ADF Test, Markov Property
Reference: Wikipedia - Random Walk
Reversal
Definition: A change in the direction of a price trend, from upward to downward or vice versa.
Types:
- Bullish reversal: Change from downtrend to uptrend
- Bearish reversal: Change from uptrend to downtrend
- V-shaped reversal: Sharp, sudden reversal
- Rounded reversal: Gradual trend change
Technical signals:
- Overbought/oversold conditions reversing
- MACD crossovers in opposite direction
- Price breaking through support/resistance
- Divergence between price and momentum indicators
vs Retracement: Reversals are trend changes; retracements are temporary pullbacks within ongoing trend.
HMM perspective: Regime transitions (Bull → Bear) are reversals. HMMs can sometimes flag regime transitions before traditional indicators do.
Use case: Identifying reversals early provides trading opportunities. Exit longs before bearish reversals, enter longs after bullish reversals.
See also: Momentum, Overbought, Oversold, Regime, Trend-Following
Regime
Definition: A hidden state in an HMM representing a distinct market condition with characteristic statistical properties.
Common financial regimes:
- Bull: Positive returns, low-moderate volatility
- Bear: Negative returns, moderate-high volatility
- Sideways: Near-zero returns, low-moderate volatility
- Crisis: Large negative returns, very high volatility
- Euphoric: Large positive returns, high volatility
Statistical characterization: Each regime has distinct mean return and volatility (emission parameters).
Use case: The hidden states we’re trying to detect with HMMs. Goal is to identify current regime and predict transitions.
See also: HMM, Bull Market, Bear Market, Crisis, Euphoric, Sideways
Risk-Adjusted Return
Definition: Investment returns scaled by the amount of risk taken to achieve them, allowing fair comparison across strategies with different risk profiles.
Common measures:
- Sharpe Ratio: Excess return per unit of total risk (volatility)
- Sortino Ratio: Excess return per unit of downside risk
- Calmar Ratio: Annual return / maximum drawdown
- Information Ratio: Excess return / tracking error
Why adjust for risk: On a risk-adjusted basis, a 20% return with 50% volatility (return/volatility = 0.4) is inferior to a 15% return with 10% volatility (return/volatility = 1.5).
Formula (general):
$$\text{Risk-Adjusted Return} = \frac{\text{Return} - \text{Risk-Free Rate}}{\text{Risk Measure}}$$
Use case: Essential for comparing HMM regime-based strategies against buy-and-hold. Focus on Sharpe ratio improvement, not just absolute returns.
See also: Sharpe Ratio, Volatility, Maximum Drawdown
Reference: Investopedia - Risk-Adjusted Return
Risk Budget
Definition: The maximum acceptable amount of risk (measured in dollars, volatility, VaR, or drawdown) that a portfolio or trading strategy is allowed to take.
Common approaches:
- Dollar risk: Max $X loss per trade
- Volatility budget: Target Y% annual volatility
- VaR budget: No more than Z% VaR
- Drawdown limit: Stop trading if drawdown exceeds W%
Example: Risk budget of 2% per trade means: with $100,000 capital, max loss per trade = $2,000.
Allocation: Distribute risk budget across:
- Multiple strategies
- Multiple assets
- Multiple regimes
Use case: Core risk management tool. In HMM context, allocate more risk budget to high-Sharpe regimes (Bull), less to low-Sharpe regimes (Bear).
See also: Position Sizing, Value at Risk, Expected Shortfall
Risk-Free Rate
Definition: The theoretical return on an investment with zero risk, typically approximated by government bonds (e.g., US Treasury bills).
Common proxies:
- 3-month T-Bill: Short-term risk-free rate
- 10-year Treasury: Long-term risk-free rate
- 0%: Often used for simplicity in calculations
Use in Sharpe ratio:
$$\text{Sharpe} = \frac{R_p - R_f}{\sigma_p}$$where $R_f$ is the risk-free rate.
Reality check: No investment is truly “risk-free” (even governments can default), but T-bills are closest approximation.
Current environment: Risk-free rate changes over time. Higher risk-free rates raise the bar for strategy performance.
Use case: Baseline for evaluating investment returns. Strategies must beat risk-free rate to be worthwhile (after adjusting for risk).
See also: Sharpe Ratio, Risk-Adjusted Return
RSI (Relative Strength Index)
Definition: A momentum oscillator measuring the speed and magnitude of price changes, used to identify overbought and oversold conditions.
Formula:
$$RSI = 100 - \frac{100}{1 + RS}$$where $RS = \frac{\text{Average Gain over } n \text{ periods}}{\text{Average Loss over } n \text{ periods}}$
Default period: 14 days
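Worked example (sketch): The formula applied to the most recent $n$ price changes using simple averages of gains and losses. Many charting packages instead use Wilder's exponential smoothing, so values may differ slightly from theirs. The function name and the price series are illustrative.

```python
def rsi(prices, period=14):
    """RSI over the last `period` price changes, using simple averages."""
    if len(prices) < period + 1:
        raise ValueError("need at least period + 1 prices")
    window = prices[-(period + 1):]
    changes = [b - a for a, b in zip(window, window[1:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        # No losses in the window: RS is unbounded (or undefined if prices are flat).
        return 100.0 if avg_gain > 0 else 50.0
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


# Hypothetical prices alternating +2 and -1: average gain is twice the average loss.
prices = [100.0]
for i in range(14):
    prices.append(prices[-1] + (2.0 if i % 2 == 0 else -1.0))

print(f"RSI = {rsi(prices):.2f}")
```

With RS = 2 the result is 100 - 100/3, roughly 66.67, which sits just below the traditional overbought threshold of 70.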
Interpretation:
- RSI > 70: Overbought (potential reversal down)
- RSI < 30: Oversold (potential reversal up)
- RSI = 50: Neutral momentum
- RSI trending up: Building bullish momentum
- RSI trending down: Building bearish momentum
Trading signals:
- Divergence: Price makes new high but RSI doesn’t → bearish reversal
- Centerline crossover: RSI crosses 50 → trend change
- Failure swings: RSI fails to confirm price movement
Limitations: In strong trends (Bull/Bear regimes), RSI can remain overbought/oversold for extended periods.
Use case: Momentum confirmation for HMM regimes. RSI and regime detection can provide complementary signals.
See also: Momentum, Overbought, Oversold, MACD
Reference: Investopedia - RSI
Section S
Sharpe Ratio
Definition: A measure of risk-adjusted return calculated as excess return per unit of volatility, the most widely used metric for comparing investment strategies.
Formula:
$$\text{Sharpe} = \frac{R_p - R_f}{\sigma_p}$$where:
- $R_p$ = Portfolio return (annualized)
- $R_f$ = Risk-free rate
- $\sigma_p$ = Portfolio volatility (annualized std dev)
Interpretation:
- Sharpe < 0: Underperforming the risk-free rate (not necessarily losing money)
- Sharpe = 0-1: Poor to acceptable
- Sharpe = 1-2: Good
- Sharpe > 2: Excellent
- Sharpe > 3: Outstanding (rare, scrutinize for errors)
Why it matters: Compares strategies on equal footing. Taking $R_f = 0$, a 30% return with 40% volatility (Sharpe=0.75) is worse than a 20% return with 15% volatility (Sharpe=1.33).
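Worked example (sketch): The comparison above in code, plus one common way to annualize a Sharpe ratio from daily returns. The $\sqrt{252}$ scaling assumes 252 trading days per year and roughly independent daily returns. Both function names and the daily returns are hypothetical.

```python
import math
import statistics


def sharpe_ratio(annual_return, annual_vol, risk_free=0.0):
    """Sharpe = (R_p - R_f) / sigma_p, with all inputs annualized."""
    return (annual_return - risk_free) / annual_vol


def annualized_sharpe(daily_returns, daily_risk_free=0.0, periods_per_year=252):
    """Annualize the daily mean/std ratio of excess returns by sqrt(periods_per_year)."""
    excess = [r - daily_risk_free for r in daily_returns]
    return statistics.mean(excess) / statistics.stdev(excess) * math.sqrt(periods_per_year)


print(f"{sharpe_ratio(0.30, 0.40):.2f}")  # 30% return, 40% volatility
print(f"{sharpe_ratio(0.20, 0.15):.2f}")  # 20% return, 15% volatility

daily = [0.010, -0.005, 0.007, 0.002, -0.003, 0.004]  # hypothetical daily returns
print(f"annualized Sharpe from daily data: {annualized_sharpe(daily):.2f}")
```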
Limitations:
- Assumes returns are normally distributed (they’re not)
- Treats upside and downside volatility equally
- Can be gamed with smoothed returns
Use case: PRIMARY metric for evaluating HMM trading strategies. Regime-based strategies aim to improve Sharpe ratio by reducing exposure in Bear regimes.
See also: Risk-Adjusted Return, Volatility, Maximum Drawdown
Reference: Investopedia - Sharpe Ratio
Short Position
Definition: Selling an asset you don’t own with the expectation that its price will decrease, profiting from downward price movement.
Mechanics:
- Borrow asset from broker
- Sell at current price $P_0$
- Buy back at lower price $P_1$
- Return asset to broker
- Profit = $P_0 - P_1$ (minus costs)
Characteristics:
- Maximum gain: Limited to 100% (if price goes to zero)
- Maximum loss: Unlimited (price can rise indefinitely)
- Costs: Borrowing fees, interest charges
- Requirements: Margin account, availability of shares to borrow
Risks:
- Short squeeze: Rapid price increase forces covering at loss
- Margin call: Must post additional collateral if position moves against you
- Dividend payments: Responsible for dividends while short
Use case: Profit from Bear regimes. Advanced HMM strategies can short in Bear/Crisis regimes, though many retail traders only go long or cash.
See also: Long Position, Bear Market, Leverage
Reference: Investopedia - Short Selling
Signal Strength
Definition: A numerical value (typically 0.0 to 1.0) representing the confidence or conviction in a trading signal, used for position sizing.
Formula (HMM context):
$$\text{Strength} = f(\text{Confidence}, \text{Agreement}, \text{Regime Quality})$$
Components:
- Regime confidence: HMM probability of current state
- Indicator agreement: Alignment between HMM and technical indicators
- Regime quality: Historical Sharpe ratio of regime
Interpretation:
- 0.0-0.3: Weak signal, small position or skip
- 0.3-0.7: Moderate signal, normal position
- 0.7-1.0: Strong signal, larger position
Use case: Dynamic position sizing based on signal quality. Strong Bull regime with high confidence → high signal strength → larger position.
See also: Position Sizing, Confidence Level
Scale Invariance
Definition: The property where a law, process, or measurement behaves similarly regardless of the scale of the variables.
In finance: A $5 move means different things for a $50 stock (10%) and a $500 stock (1%). Returns are scale-invariant because they express the move relative to the price level. Log returns automatically account for this.
Formula relationship:
$$\frac{P_t - P_{t-1}}{P_{t-1}} = \frac{\Delta P}{P}$$is scale-invariant (percentages), while $\Delta P$ is not.
Use case: Essential property for comparing assets with different price levels. Allows HMM trained on SPY to potentially work on other assets.
See also: Log Returns
Reference: Wikipedia - Scale Invariance
Sideways
Definition: A market condition where prices remain relatively stable, oscillating around a mean without clear trend.
Characteristics: Near-zero mean returns, low-moderate volatility, range-bound behavior
Statistical signature: Mean close to zero, volatility typically lower than crisis but can vary, may show mean-reversion
Use case: Important regime for option sellers and range-trading strategies. HMMs can detect sideways markets and adjust strategy accordingly.
See also: Bull Market, Bear Market, Regime
Simple Returns (Percentage Returns)
Definition: The percentage change in price from one period to the next.
Formula:
$$r_t = \frac{P_t - P_{t-1}}{P_{t-1}} = \frac{P_t}{P_{t-1}} - 1$$
Key properties:
- Intuitive interpretation (directly in percentages)
- NOT time-additive (must compound)
- NOT perfectly symmetric (+10% then -10% ≠ break even)
Converting from log returns: $r = e^R - 1$
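Worked example (sketch): Simple returns must be compounded rather than added, which is why +10% followed by -10% does not break even. The two returns are illustrative.

```python
import math

simple_returns = [0.10, -0.10]  # +10% then -10%

# Compounding: multiply the growth factors (1 + r).
growth = 1.0
for r in simple_returns:
    growth *= 1.0 + r
compounded = growth - 1.0

# The equivalent log returns add up, and r = e^R - 1 recovers the same result.
log_returns = [math.log(1.0 + r) for r in simple_returns]
from_logs = math.exp(sum(log_returns)) - 1.0

print(f"naive sum: {sum(simple_returns):+.2%}")
print(f"compounded: {compounded:+.2%}")
print(f"via log returns: {from_logs:+.2%}")
```

The naive sum suggests break-even, while the compounded result is a 1% loss.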
Why NOT to use for modeling:
- Violates stationarity for long horizons
- Non-additive (complex multi-period calculations)
- Asymmetric treatment of gains/losses
Use case: Reporting and communication (easier to understand). But ALWAYS use log returns for statistical analysis.
See also: Log Returns, Stationary
Reference: Read our Why Use Log Returns? article
Skewness
Definition: A measure of the asymmetry of a probability distribution.
Formula:
$$\text{Skew}[X] = \frac{E[(X-\mu)^3]}{(\sigma^2)^{3/2}}$$
Interpretation:
- Skewness = 0: Symmetric distribution (normal)
- Skewness > 0: Right tail is longer (positive outliers more common)
- Skewness < 0: Left tail is longer (negative outliers more common)
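Worked example (sketch): The population form of the formula applied to two small made-up samples, one symmetric and one with a single large loss. Sample-corrected estimators used by statistics libraries may give slightly different values.

```python
import statistics


def skewness(data):
    """Population skewness: E[(X - mu)^3] / (sigma^2)^(3/2)."""
    mu = statistics.fmean(data)
    var = sum((x - mu) ** 2 for x in data) / len(data)
    third = sum((x - mu) ** 3 for x in data) / len(data)
    return third / var ** 1.5


symmetric = [-2.0, -1.0, 0.0, 1.0, 2.0]
crash_heavy = [0.01] * 9 + [-0.09]  # nine small gains and one large loss

print(f"symmetric sample: {skewness(symmetric):.3f}")
print(f"crash-heavy sample: {skewness(crash_heavy):.3f}")
```

The single large loss stretches the left tail and produces a clearly negative skewness.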
Financial markets: Often show negative skewness (crashes are more severe than rallies), though this varies by regime.
Use case: Assessing distribution asymmetry. Combined with kurtosis in Jarque-Bera test. HMMs can capture skewness through asymmetric regime transitions.
See also: Kurtosis, Jarque-Bera Test
Reference: Wikipedia - Skewness
Small Return Approximation
Definition: For small values of $r$, the approximation $\log(1+r) \approx r$.
Mathematical basis: Taylor series expansion of $\log(1+r)$ around $r=0$
Accuracy: Very good for $|r| < 0.01$ (daily returns), breaks down for $|r| > 0.1$
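Worked example (sketch): The approximation error at a few return sizes. By the Taylor expansion $\log(1+r) = r - r^2/2 + r^3/3 - \dots$, the leading error term is about $r^2/2$. The helper name `approx_error` is illustrative.

```python
import math


def approx_error(r):
    """Absolute error of approximating log(1 + r) by r."""
    return abs(r - math.log1p(r))


for r in (0.001, 0.01, 0.05, 0.10, 0.30):
    err = approx_error(r)
    print(f"r = {r:<5} log(1+r) = {math.log1p(r):.5f}  "
          f"error = {err:.5f}  relative = {err / r:.1%}")
```

At daily-sized returns the error is negligible, while at 10% and above it becomes a meaningful share of the return itself.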
Use case: Explains why log returns ≈ simple returns for daily data. Justifies multiplying log return std dev by 100 for volatility approximation.
Important: Only use for volatility, NOT for mean returns. Mean must always use proper conversion: $(e^{\bar{R}} - 1)$.
See also: Log Returns, Simple Returns
Reference: Read our Why Use Log Returns? article
Standard Deviation
Definition: A measure of the dispersion of data points around their mean.
Formula:
$$\sigma = \sqrt{\frac{1}{N-1}\sum_{i=1}^N (x_i - \bar{x})^2}$$where $\bar{x}$ is the sample mean.
Population vs sample: Use $N-1$ for the sample standard deviation (Bessel’s correction, which makes the variance estimate unbiased); use $N$ when the data is the full population.
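Worked example (sketch): The $N-1$ formula written out and checked against Python's standard library, alongside the population version that divides by $N$. The data values are arbitrary.

```python
import math
import statistics


def sample_std(xs):
    """Sample standard deviation with Bessel's correction (divide by N - 1)."""
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (len(xs) - 1))


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print(f"sample std (N-1): {sample_std(data):.4f}")
print(f"population std (N): {statistics.pstdev(data):.4f}")
```

`statistics.stdev` uses the same $N-1$ convention, and `statistics.pstdev` divides by $N$.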
Interpretation:
- About 68% of data falls within 1 std dev of mean (for normal distribution)
- About 95% within 2 std devs
- About 99.7% within 3 std devs
In finance: Standard deviation of returns is called volatility. Higher std dev = higher risk.
Use case: Core measure of risk in portfolio theory. In HMMs, each regime has characteristic std dev (emission parameter).
See also: Volatility, Variance, Correlation
Reference: Wikipedia - Standard Deviation
Stationary
Definition: A stochastic process whose statistical properties (mean, variance, autocorrelation) do not change over time.
Weak (covariance) stationarity requires:
- Constant mean: $E[X_t] = \mu$ for all $t$
- Constant variance: $\text{Var}(X_t) = \sigma^2$ for all $t$
- Autocovariance depends only on lag: $\text{Cov}(X_t, X_{t-k})$ depends on $k$, not $t$
Why it matters: Most statistical models (including HMMs) assume stationarity. Non-stationary data violates assumptions and produces invalid inferences.
Tests: ADF (reject null → stationary), KPSS (fail to reject null → stationary)
Financial data:
- Prices: NON-stationary (random walk with drift)
- Returns: Approximately stationary (what we use for modeling)
- Log returns: More stationary than simple returns
Use case: First step in time series analysis - verify stationarity before modeling. If non-stationary, take differences (returns) or apply transformations.
See also: ADF Test, KPSS Test, Random Walk, Log Returns
Reference: Wikipedia - Stationary Process
Simple Moving Average (SMA)
Definition: The arithmetic mean of an asset’s price over a specified number of periods, giving equal weight to all prices.
Formula:
$$SMA_n = \frac{1}{n}\sum_{i=0}^{n-1} P_{t-i}$$
Common periods:
- SMA(20): Short-term trend (~ 1 month)
- SMA(50): Medium-term trend (~ 2-3 months)
- SMA(200): Long-term trend (~ 1 year)
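Worked example (sketch): A direct implementation of the formula over a rolling window. Returning `None` until a full window is available is a design choice made here for illustration, and the prices are invented.

```python
def sma(prices, n):
    """Simple moving average; entries before a full window of n prices are None."""
    if n <= 0:
        raise ValueError("n must be positive")
    out = []
    for t in range(len(prices)):
        if t + 1 < n:
            out.append(None)  # not enough history yet
        else:
            out.append(sum(prices[t - n + 1 : t + 1]) / n)
    return out


prices = [10.0, 11.0, 12.0, 13.0, 14.0]  # hypothetical closing prices
print(sma(prices, 3))
```

Each value lags the latest price, which is the "lagging indicator" behavior noted for moving averages.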
Characteristics:
- Equal weighting: All prices in window treated equally
- Lagging indicator: Responds slowly to price changes
- Smooth: Reduces noise but delays signals
vs EMA: SMA gives equal weight to all periods; EMA weights recent prices more heavily, making it more responsive.
Golden/Death Cross:
- Golden cross: SMA(50) crosses above SMA(200) - bullish
- Death cross: SMA(50) crosses below SMA(200) - bearish
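A rough sketch of the SMA formula and the crossover rule follows. The helper names `sma` and `crossovers` are illustrative assumptions, and the short windows (2 and 4) stand in for the 50/200 pair only to keep the sample data small.

```python
def sma(prices, n):
    """Return SMA_n at each index; None until n prices are available."""
    out = []
    for t in range(len(prices)):
        if t + 1 < n:
            out.append(None)
        else:
            out.append(sum(prices[t - n + 1 : t + 1]) / n)
    return out

def crossovers(prices, fast, slow):
    """Return (index, 'golden' | 'death') where the fast SMA crosses the slow SMA."""
    f, s = sma(prices, fast), sma(prices, slow)
    events = []
    for t in range(1, len(prices)):
        if None in (f[t - 1], s[t - 1], f[t], s[t]):
            continue  # not enough history for both averages yet
        before, after = f[t - 1] - s[t - 1], f[t] - s[t]
        if before <= 0 < after:
            events.append((t, "golden"))
        elif before >= 0 > after:
            events.append((t, "death"))
    return events

prices = [10, 9, 8, 7, 8, 9, 10, 11, 10, 8, 6, 5]   # hypothetical prices
events = crossovers(prices, fast=2, slow=4)
```

On this sample the fast average crosses above the slow one at index 5 and back below at index 9, which illustrates the lag: both signals arrive a couple of bars after the actual turn.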
Use case: Simple trend identification. Price above SMA → uptrend. Can confirm HMM regime transitions.
See also: EMA, Moving Average, Bollinger Bands
Reference: Investopedia - Simple Moving Average
Slippage
Definition: The difference between the expected execution price of a trade and the actual price received, caused by market conditions and order size.
Types:
- Positive slippage: Better price than expected (rare)
- Negative slippage: Worse price than expected (common)
Causes:
- Market orders: Execute at best available price (may differ from quote)
- Low liquidity: Large orders move the market
- High volatility: Prices change rapidly between order and execution
- Market impact: Your order affects the price
Typical magnitude:
- Highly liquid: 0.01-0.05% slippage
- Moderate liquidity: 0.1-0.3% slippage
- Low liquidity: 0.5%+ slippage
Mitigation:
- Use limit orders (but risk non-execution)
- Trade liquid assets
- Avoid trading during news events
- Split large orders
In backtesting: ALWAYS include realistic slippage assumptions (e.g., 0.1-0.2% per trade). Ignoring slippage leads to overly optimistic results.
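One simple way to apply a slippage assumption in a backtest is to worsen every fill by a fixed percentage. The sketch below assumes a flat 0.1% per trade; `fill_price` is a hypothetical helper, not a Hidden Regime function.

```python
def fill_price(quoted_price, side, slippage_pct):
    """Return a pessimistic fill price: buys fill higher, sells fill lower."""
    if side not in ("buy", "sell"):
        raise ValueError("side must be 'buy' or 'sell'")
    direction = 1 if side == "buy" else -1
    return quoted_price * (1 + direction * slippage_pct / 100)

# Hypothetical trade: buy at a quote of 100, sell at a quote of 110.
entry = fill_price(100.0, "buy", slippage_pct=0.1)
exit_ = fill_price(110.0, "sell", slippage_pct=0.1)

gross_return = 110.0 / 100.0 - 1   # return if fills matched the quotes
net_return = exit_ / entry - 1     # return after slippage on both legs
```

A flat percentage is a simplification; real slippage varies with order size, liquidity, and volatility, as the causes listed above suggest.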
Use case: Critical cost in frequent-trading strategies. HMM strategies with regime transitions must account for slippage when entering/exiting positions.
See also: Transaction Costs, Backtesting, Paper Trading
Reference: Investopedia - Slippage
Smoothing
Definition: In HMM context, using the full observation sequence (past, present, and future) to estimate state probabilities at each time point, providing the most accurate state estimates.
Forward-Backward Smoothing:
$$P(q_t = s_i | O, \lambda) = \frac{\alpha_t(i) \beta_t(i)}{\sum_{j=1}^N \alpha_t(j) \beta_t(j)}$$
where $\alpha_t(i)$ is the forward probability and $\beta_t(i)$ is the backward probability.
Smoothing vs Filtering vs Prediction:
- Filtering: Estimate state at $t$ using observations up to $t$
- Smoothing: Estimate state at $t$ using ALL observations (past, present, future)
- Prediction: Estimate state at $t+1$ using observations up to $t$
Accuracy: Smoothing > Filtering > Prediction (because smoothing uses more information).
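The difference between filtering and smoothing can be made concrete with a small forward-backward pass. This is a minimal sketch for a discrete-emission HMM with per-step scaling; the parameters `pi`, `A`, `B` and the observation sequence are made-up values, and a Gaussian-emission model would replace `B[i][o]` with a density.

```python
def forward_backward(pi, A, B, obs):
    """Return (filtered, smoothed) state probabilities for a discrete-emission HMM.

    pi[i]   : initial probability of state i
    A[i][j] : transition probability from state i to state j
    B[i][o] : probability that state i emits symbol o
    """
    N, T = len(pi), len(obs)

    # Forward pass, normalised at each step so alpha[t] is the filtered estimate.
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            raw = [pi[i] * B[i][obs[0]] for i in range(N)]
        else:
            raw = [
                sum(alpha[t - 1][i] * A[i][j] for i in range(N)) * B[j][obs[t]]
                for j in range(N)
            ]
        c = sum(raw)
        scale.append(c)
        alpha.append([x / c for x in raw])

    # Backward pass, reusing the forward scaling factors.
    beta = [[1.0] * N for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(N):
            beta[t][i] = sum(
                A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(N)
            ) / scale[t + 1]

    # Smoothed posterior: alpha * beta, normalised over states.
    smoothed = []
    for t in range(T):
        w = [alpha[t][i] * beta[t][i] for i in range(N)]
        total = sum(w)
        smoothed.append([x / total for x in w])
    return alpha, smoothed

pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
obs = [0, 0, 1, 1, 0]
filtered, smoothed = forward_backward(pi, A, B, obs)
```

At the final time step the two estimates coincide because there are no future observations left to use; at earlier steps the smoothed estimate also reflects what happened afterwards.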
Use case: When analyzing historical data, smoothing provides best state estimates. For real-time trading, use filtering (can’t see future).
See also: Forward Variable, Backward Variable, Forward-Backward Algorithm, Viterbi Algorithm
Stop-Loss
Definition: A predetermined price level at which a position is automatically closed to limit losses, a fundamental risk management tool.
Types:
- Fixed stop: Set % or $ below entry (e.g., 5% stop-loss)
- Trailing stop: Moves with price, locks in gains
- Volatility-based: Based on ATR or standard deviation
- Time stop: Exit after X days regardless of price
Placement strategies:
- Below support: For longs, place below key support level
- ATR-based: 2× ATR below entry (adapts to volatility)
- Regime-based: Wider stops in high-volatility regimes
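The ATR-based placement can be sketched as follows. This version assumes a simple average of true ranges (Wilder's original ATR uses a smoothed average instead), and the function names and bar data are hypothetical.

```python
def average_true_range(highs, lows, closes, period=14):
    """Simple-average ATR over the most recent `period` true ranges."""
    if len(closes) < 2:
        raise ValueError("need at least two bars to compute a true range")
    true_ranges = []
    for t in range(1, len(closes)):
        prev_close = closes[t - 1]
        true_ranges.append(
            max(
                highs[t] - lows[t],
                abs(highs[t] - prev_close),
                abs(lows[t] - prev_close),
            )
        )
    window = true_ranges[-period:]
    return sum(window) / len(window)

def atr_stop(entry_price, atr, multiple=2.0):
    """Stop level for a long position: `multiple` ATRs below the entry price."""
    return entry_price - multiple * atr

# Hypothetical daily bars.
highs = [11, 12, 13]
lows = [9, 10, 11]
closes = [10, 11, 12]
atr = average_true_range(highs, lows, closes, period=2)
stop = atr_stop(entry_price=12.0, atr=atr)
```

A regime-based variant could raise `multiple` when the model reports a high-volatility regime, which widens the stop in line with the placement strategies above.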
Psychology: Prevents hope-based holding of losing positions. Pre-commitment device.
Trade-off: Tighter stops cap the loss per trade but trigger more false exits (whipsaws).
Use case: ESSENTIAL for all trading strategies. In HMM context, set wider stops in Crisis regimes (higher volatility), tighter stops in Sideways regimes.
See also: Risk Management, Volatility, Maximum Drawdown
Reference: Investopedia - Stop-Loss
Section T
Tail Risk
Definition: The risk of extreme events that occur in the tails of a probability distribution, representing rare but severe losses.
Characteristics:
- Low probability: Occur less than 5% of the time
- High impact: Much larger than typical losses
- Fat tails: Occur more frequently than normal distribution predicts
Measurement:
- VaR: Loss threshold at confidence level
- Expected Shortfall: Average loss beyond VaR
- Kurtosis: Statistical measure of tail thickness
Financial examples:
- 2008 financial crisis: -50% market decline
- Flash crashes: Sudden extreme moves
- Black swan events: Unpredicted large moves
HMM perspective: Crisis regimes are characterized by extreme tail risk. Detecting and avoiding these regimes reduces tail risk exposure.
Use case: Understanding tail risk essential for survival in trading. One extreme event can wipe out years of gains.
See also: Value at Risk, Expected Shortfall, Kurtosis, Crisis
Reference: Investopedia - Tail Risk
Threshold-Based Classification
Definition: A method for labeling HMM states as regime types (Bear/Bull/Sideways) by comparing emission means against data-driven thresholds rather than sorting by state index.
Approach:
$$\text{Regime}_k = \begin{cases} \text{Bear} & \text{if } \mu_k < \theta_{\text{bear}} \\ \text{Bull} & \text{if } \mu_k > \theta_{\text{bull}} \\ \text{Sideways} & \text{otherwise} \end{cases}$$
Why necessary: HMM state indices (0, 1, 2) are arbitrary. State 0 may have highest or lowest mean depending on initialization.
Threshold determination:
- Statistical: Based on percentiles of historical returns
- Economic: Based on annualized return targets
- Volatility-adjusted: Account for risk in classification
Best practice: Let the DATA determine regime labels, not state order.
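The rule above maps directly to a small function. This is a sketch only: `classify_regimes`, the state means, and the threshold values are illustrative assumptions rather than Hidden Regime's actual API or defaults.

```python
def classify_regimes(state_means, bear_threshold, bull_threshold):
    """Label each HMM state by comparing its emission mean to the thresholds."""
    labels = {}
    for state, mu in enumerate(state_means):
        if mu < bear_threshold:
            labels[state] = "Bear"
        elif mu > bull_threshold:
            labels[state] = "Bull"
        else:
            labels[state] = "Sideways"
    return labels

# Hypothetical daily mean log returns for states 0, 1, 2 of a fitted model.
state_means = [0.0012, -0.0021, 0.0001]
labels = classify_regimes(state_means, bear_threshold=-0.0005, bull_threshold=0.0005)
```

Here state 0 turns out to be the Bull regime and state 1 the Bear regime, which shows why labels should come from the means rather than the index order.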
Use case: Essential for interpreting HMM results correctly. Never assume State 0 = Bear without checking emission means.
See also: Regime, Emission Matrix, HMM
Reference: Read our Full Pipeline Advanced Analysis article
Time Series
Definition: A sequence of data points indexed in time order, typically at equally-spaced intervals.
Examples: Daily stock prices, hourly temperatures, quarterly GDP
Key characteristics:
- Temporal ordering matters (unlike cross-sectional data)
- Often autocorrelated
- May have trends, seasonality, cycles
- May be stationary or non-stationary
Analysis goals:
- Description: Summarize patterns (trend, seasonality)
- Explanation: Model relationships between variables
- Prediction: Forecast future values
- Control: Use forecasts for decision-making
Use case: Financial data is inherently time series. HMMs are specialized time series models that account for regime changes.
See also: Stationary, Autocorrelation, HMM
Transition Matrix
Definition: Matrix of probabilities defining how likely the system is to transition from one hidden state to another in an HMM.
Formula:
$$a_{ij} = P(q_{t+1} = s_j | q_t = s_i)$$
where $q_t$ is the state at time $t$.
Constraint: Each row sums to 1: $\sum_j a_{ij} = 1$
Interpretation:
- Diagonal elements $a_{ii}$: Persistence (staying in same state)
- Off-diagonal $a_{ij}$ $(i \neq j)$: Transition probability
- High diagonal → States persist (regime stability)
- Low diagonal → Rapid switching (regime volatility)
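Example (2-state; rows are the current state, columns the next state, both ordered Bull, Bear):
$$A = \begin{bmatrix} 0.95 & 0.05 \\ 0.10 & 0.90 \end{bmatrix}$$
This says bull markets are highly persistent (95% chance of staying each period), while bear markets are slightly less persistent (90% stay, 10% exit).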
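Two quantities often read off a transition matrix are the expected time spent in each state and the long-run share of time in each state. The sketch below uses the example matrix; the function names are hypothetical, and the stationary distribution is found by simple repeated multiplication rather than an eigen-solver.

```python
def expected_durations(A):
    """Expected number of consecutive periods in each state: 1 / (1 - a_ii)."""
    # Assumes no absorbing states (a_ii < 1), otherwise this divides by zero.
    return [1.0 / (1.0 - A[i][i]) for i in range(len(A))]

def stationary_distribution(A, iterations=1000):
    """Approximate the long-run state probabilities by iterating dist = dist * A."""
    n = len(A)
    dist = [1.0 / n] * n
    for _ in range(iterations):
        dist = [sum(dist[i] * A[i][j] for i in range(n)) for j in range(n)]
    return dist

A = [[0.95, 0.05],   # Bull -> Bull, Bull -> Bear
     [0.10, 0.90]]   # Bear -> Bull, Bear -> Bear

durations = expected_durations(A)
long_run = stationary_distribution(A)
```

For this matrix a bull regime lasts about 20 periods on average and a bear regime about 10, so the chain spends roughly two thirds of the time in the bull state.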
Use case: Captures regime dynamics. Learned by Baum-Welch algorithm. Critical for understanding regime transitions.
See also: Emission Matrix, Initial State Distribution, HMM, Markov Property
Transaction Costs
Definition: The total costs incurred when buying or selling securities, including commissions, fees, spreads, and market impact.
Components:
- Commissions: Broker fees per trade
- Bid-ask spread: Difference between buy and sell prices
- Slippage: Deviation from expected execution price
- Exchange fees: Transaction fees charged by exchanges
- Market impact: Price movement caused by your order
Typical costs (retail trading):
- Commissions: $0-$5 per trade (many brokers now zero-commission)
- Spread: 0.01-0.10% for liquid stocks
- Slippage: 0.05-0.20% per trade
- Total: ~0.1-0.3% per round trip (buy + sell)
Impact on strategies:
- High-frequency: Transaction costs can eliminate profits
- Low-frequency: Less impactful but still important
- Round trips matter: Buy + Sell = 2× costs
In backtesting: MUST include realistic transaction cost assumptions. Common mistake: ignoring costs leads to unrealistic performance.
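To see how quickly costs compound for an active strategy, the following sketch deducts a fixed round-trip cost from each trade. The function name and all numbers are hypothetical, and it assumes every trade earns the same gross return.

```python
def net_annual_return(gross_per_trade, round_trips_per_year, round_trip_cost_pct):
    """Compound a per-trade return after deducting a proportional round-trip cost."""
    per_trade = (1 + gross_per_trade) * (1 - round_trip_cost_pct / 100) - 1
    return (1 + per_trade) ** round_trips_per_year - 1

# Hypothetical strategy: 0.3% gross edge per trade, 50 round trips per year.
gross = (1 + 0.003) ** 50 - 1
net = net_annual_return(0.003, round_trips_per_year=50, round_trip_cost_pct=0.2)
net_at_high_cost = net_annual_return(0.003, round_trips_per_year=50, round_trip_cost_pct=0.3)
```

With a 0.2% round-trip cost most of the gross edge disappears, and at 0.3% the strategy loses money, which is why a filter on regime transitions can matter more than a small gain in signal quality.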
Use case: HMM strategies with frequent regime transitions must carefully account for transaction costs. May need to add filters to avoid excessive trading.
See also: Slippage, Backtesting, Paper Trading
Reference: Investopedia - Transaction Costs
Trend-Following
Definition: A trading strategy that attempts to capture gains by riding established price trends, buying in uptrends and selling/shorting in downtrends.
Core principle: “The trend is your friend” - established trends tend to continue.
Characteristics:
- Wins: Few large winning trades (capture big moves)
- Losses: Many small losing trades (whipsawed in ranges)
- Win rate: Typically 30-50% (but large wins compensate)
Entry signals:
- Price crosses above moving average
- MACD turns positive
- Price breaks out of range
Exit signals:
- Price crosses below moving average
- MACD turns negative
- Trailing stop hit
When it works:
- Bull/Bear regimes: Strong directional markets
- High momentum: Sustained price movement
- Low mean reversion: Trends persist
When it fails:
- Sideways regimes: Range-bound markets
- High volatility: Frequent whipsaws
- Trend reversals: Late entry/exit
HMM application: Use trend-following strategies in Bull and Bear regimes, switch to mean-reversion in Sideways regimes.
See also: Mean-Reversion, Momentum, Moving Average, MACD
Reference: Investopedia - Trend Trading
Section U
Uptrend
Definition: A market condition where prices make a series of higher highs and higher lows over time, indicating sustained buying pressure.
Characteristics:
- Each price peak is higher than the previous peak
- Each price trough is higher than the previous trough
- Moving averages slope upward
- Positive momentum
Technical identification:
- Price consistently above moving averages
- Sequence of higher highs and higher lows
- Positive MACD and RSI in upper ranges
Trading implications:
- Trend-following: Take long positions
- Dips as opportunities: Buy pullbacks within trend
- HMM may identify as Bull regime
“Higher highs, higher lows”: Classic definition of uptrend structure.
Use case: Identifying directional bias. Uptrends often persist longer than expected, making trend-following profitable.
See also: Downtrend, Bull Market, Momentum, Trend-Following
Section V
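Value at Risk (VaR)
Definition: A statistical measure estimating the loss that a position or portfolio is not expected to exceed over a specified time period at a given confidence level. It is a threshold, not a true maximum: larger losses still occur with probability one minus the confidence level.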
Formula:
$$\text{VaR}_\alpha = -F^{-1}(\alpha)$$
where $F^{-1}$ is the inverse CDF of returns and $\alpha$ is the tail probability, i.e. one minus the confidence level (e.g., $\alpha = 0.05$ for 95% confidence).
Interpretation: A 95% VaR of $10,000 means: "We are 95% confident that losses will not exceed $10,000 over the specified period."
Methods:
- Historical: Use empirical distribution of past returns
- Parametric: Assume normal distribution (often wrong for finance)
- Monte Carlo: Simulate many scenarios
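The historical method is the easiest to sketch. The example below also computes Expected Shortfall so the two can be compared; `historical_var_es` is a hypothetical helper, the returns are invented, and the quantile convention (take the worst `ceil(alpha * n)` observations) is one of several in use.

```python
import math

def historical_var_es(returns, alpha=0.05):
    """Return (VaR, ES) as positive loss fractions from an empirical return sample.

    alpha is the tail probability (0.05 corresponds to 95% confidence).
    """
    ordered = sorted(returns)  # worst returns first
    # Number of observations in the tail; round() guards against float noise.
    k = max(1, math.ceil(round(alpha * len(ordered), 9)))
    tail = ordered[:k]
    var = -tail[-1]            # loss at the empirical alpha-quantile
    es = -sum(tail) / k        # average loss within the tail
    return var, es

# Hypothetical sample of 20 periodic returns.
returns = [-0.08, -0.03, -0.02, -0.01, -0.01, 0.0, 0.0, 0.01, 0.01, 0.01,
           0.01, 0.02, 0.02, 0.02, 0.02, 0.03, 0.03, 0.03, 0.04, 0.05]
var_90, es_90 = historical_var_es(returns, alpha=0.10)
```

On this sample the 90% VaR is a 3% loss while the Expected Shortfall is 5.5%, because the single 8% loss sits beyond the VaR threshold and VaR alone does not see it.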
Confidence levels:
- 95%: Standard for risk management
- 99%: More conservative, regulatory requirements
- 99.9%: Extreme tail events
Limitations:
- Doesn’t capture severity beyond threshold
- Assumes past predicts future
- May underestimate tail risk
Use case: Risk budgeting and position sizing. In HMM context, calculate regime-specific VaR to adjust exposure dynamically.
See also: Expected Shortfall, Confidence Level, Tail Risk, Inverse CDF
Reference: Investopedia - Value at Risk
Variance
Definition: The expected squared deviation from the mean.
Formula (sample estimate; dividing by $N-1$ rather than $N$ corrects the bias from estimating the mean):
$$\sigma^2 = \frac{1}{N-1}\sum_{i=1}^N (x_i - \bar{x})^2$$
Relationship to std dev: Variance is the square of standard deviation: $\text{Var}(X) = \sigma^2$
Use case: Theoretical foundation for volatility. Variance has nice mathematical properties (additive for independent variables), but std dev is easier to interpret (same units as data).
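As a quick check of the sample formula, the sketch below computes it by hand and compares it with Python's `statistics.variance`, which also uses the $N-1$ divisor. The data values are arbitrary.

```python
import math
import statistics

def sample_variance(xs):
    """Sample variance with the N-1 divisor."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # arbitrary sample
var = sample_variance(data)
std = math.sqrt(var)   # standard deviation, in the same units as the data
```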
See also: Standard Deviation, Volatility
Viterbi Algorithm
Definition: A dynamic programming algorithm that finds the most likely sequence of hidden states that produced a given observation sequence.
Purpose: Solves the “decoding problem” - given observations and trained HMM, what states did we visit?
Key idea: At each time step, track the most probable path to each state. Then backtrack to find globally optimal path.
Complexity: $O(N^2 T)$ where $N$ is number of states and $T$ is sequence length. Much more efficient than brute force $O(N^T)$.
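The track-then-backtrack idea can be sketched for a discrete-emission HMM as follows. The code works in log space to avoid underflow on long sequences; the parameter values and the function name `viterbi` are illustrative assumptions, not a specific library's API.

```python
import math

def safe_log(p):
    """Log that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")

def viterbi(pi, A, B, obs):
    """Return the most likely hidden state path for a discrete-emission HMM."""
    N = len(pi)
    # delta[t][j]: log-probability of the best path that ends in state j at time t.
    delta = [[safe_log(pi[i]) + safe_log(B[i][obs[0]]) for i in range(N)]]
    backptr = []
    for t in range(1, len(obs)):
        row, ptr = [], []
        for j in range(N):
            best_i = max(range(N), key=lambda i: delta[-1][i] + safe_log(A[i][j]))
            row.append(delta[-1][best_i] + safe_log(A[best_i][j]) + safe_log(B[j][obs[t]]))
            ptr.append(best_i)
        delta.append(row)
        backptr.append(ptr)

    # Backtrack from the best final state.
    state = max(range(N), key=lambda i: delta[-1][i])
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
obs = [0, 0, 1, 1, 0]
path = viterbi(pi, A, B, obs)
```

The two nested loops over states inside the loop over time are where the $O(N^2 T)$ cost comes from.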
Viterbi vs Forward-Backward:
- Viterbi: Finds single best path (hard decoding)
- Forward-Backward: Computes probability of each state at each time (soft decoding)
Use case: After training HMM with Baum-Welch, use Viterbi to predict current regime and historical regime sequence.
Important limitation: The state on the Viterbi path at a given time can differ from the individually most probable state computed by Forward-Backward. Viterbi returns the most likely PATH, not the most likely state at each time.
See also: Baum-Welch Algorithm, Forward-Backward Algorithm, HMM
Reference: Read our HMM Algorithms Explained and HMM 101 articles
Volatility
Definition: A measure of the dispersion of returns for a given security or market index. Represents the degree of variation (risk) in the price.
Formula: Typically measured as annualized standard deviation of returns:
$$\sigma_{\text{annual}} = \sigma_{\text{daily}} \times \sqrt{252}$$
where 252 is the number of trading days per year.
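A minimal sketch of the annualization step, assuming daily returns in a plain list. The function name and sample data are hypothetical, and the square-root rule itself assumes returns are uncorrelated from day to day.

```python
import math
import statistics

def annualized_volatility(daily_returns, periods_per_year=252):
    """Scale the sample std dev of periodic returns to an annual figure."""
    return statistics.stdev(daily_returns) * math.sqrt(periods_per_year)

daily_returns = [0.01, -0.01, 0.01, -0.01]   # hypothetical daily returns
vol = annualized_volatility(daily_returns)
```

For weekly or monthly data, pass `periods_per_year=52` or `periods_per_year=12` instead.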
Types:
- Historical volatility: Calculated from past returns
- Implied volatility: Derived from option prices (market’s expectation)
- Realized volatility: Actual volatility measured over a period
Volatility clustering: High volatility tends to follow high volatility (heteroskedasticity). This is why HMMs with state-dependent volatility are useful.
Interpretation:
- Low volatility (~10% annual): Stable stocks, defensive sectors
- Medium volatility (~15-20% annual): Market average
- High volatility (~30%+ annual): Growth stocks, crypto
Use case: Primary risk measure in finance. In HMMs, each regime has characteristic volatility (std dev of emission distribution). Crisis regimes have very high volatility.
See also: Standard Deviation, Heteroskedasticity, Regime
Reference: Wikipedia - Volatility (Finance)
Volatility Clustering
Definition: The tendency for periods of high volatility to cluster together and periods of low volatility to cluster together in financial time series.
Observation: “Large changes tend to be followed by large changes, of either sign, and small changes tend to be followed by small changes.”
Statistical evidence:
- High autocorrelation in |returns| and returns²
- ARCH/GARCH effects
- Time-varying volatility
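The first piece of evidence can be reproduced on simulated data. The sketch below assumes a toy two-regime process in which volatility alternates every 100 days; the regime lengths, sigmas, and seed are arbitrary choices made for illustration.

```python
import random

def autocorrelation(xs, lag):
    """Sample autocorrelation of xs at the given lag."""
    n = len(xs)
    mean = sum(xs) / n
    denom = sum((x - mean) ** 2 for x in xs)
    numer = sum((xs[t] - mean) * (xs[t - lag] - mean) for t in range(lag, n))
    return numer / denom

# Hypothetical two-regime simulation: calm and turbulent blocks of 100 days.
rng = random.Random(42)
returns = []
for day in range(2000):
    sigma = 0.005 if (day // 100) % 2 == 0 else 0.03
    returns.append(rng.gauss(0.0, sigma))

acf_returns = autocorrelation(returns, lag=1)
acf_abs = autocorrelation([abs(r) for r in returns], lag=1)
```

The returns themselves should show little lag-1 autocorrelation, while their absolute values are clearly autocorrelated, which is the signature of clustering.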
Causes:
- Information arrival: News comes in clusters
- Market structure: Liquidity varies over time
- Behavioral: Herding and momentum effects
HMM perspective: Volatility clustering suggests regime-switching behavior. High-volatility and low-volatility regimes persist before transitioning.
Modeling:
- ARCH/GARCH: Explicitly model time-varying volatility
- HMMs: Capture clustering via regime persistence (transition matrix)
- Stochastic volatility: Volatility follows its own stochastic process
Use case: Critical phenomenon in finance. HMM regime detection naturally captures volatility clustering through state persistence.
See also: Volatility, Heteroskedasticity, Autocorrelation, Regime
Reference: Wikipedia - Volatility Clustering
Section W
Walk-Forward Analysis
Definition: A validation technique that repeatedly trains a model on a rolling window of historical data and tests it on the subsequent out-of-sample period, simulating real-time performance.
Process:
- Train on window 1 (e.g., days 1-500)
- Test on window 2 (e.g., days 501-600)
- Roll forward: Train on days 101-600
- Test on days 601-700
- Repeat through entire dataset
Advantages:
- Realistic: Mimics how model would be used in production
- Adaptive: Model parameters update with new data
- Robust: Tests if model degrades over time
- Out-of-sample: All test periods are truly unseen
Parameters:
- Training window: Length of history for fitting (e.g., 500 days)
- Test window: Length of forward period (e.g., 100 days)
- Step size: How far to roll forward (e.g., 100 days)
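The rolling scheme can be expressed as a small generator of index ranges. This is a sketch with a hypothetical function name; it uses zero-based indices, so "days 1-500" in the process above corresponds to `range(0, 500)`.

```python
def walk_forward_windows(n_obs, train_size, test_size, step):
    """Yield (train_range, test_range) index ranges for a rolling walk-forward split."""
    start = 0
    while start + train_size + test_size <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += step

windows = list(walk_forward_windows(n_obs=1000, train_size=500, test_size=100, step=100))
```

With 1,000 observations and the example parameters this produces five train/test pairs. In practice you would refit the model on each training range and score it only on the matching test range.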
Interpretation: If walk-forward results are similar to in-sample results, model is robust. Large degradation indicates overfitting or non-stationarity.
Use case: GOLD STANDARD for validating HMM trading strategies. More rigorous than single train/test split.
See also: Out-of-Sample Testing, In-Sample, Backtesting, Overfitting
Reference: Investopedia - Walk-Forward Analysis
Win Rate
Definition: The percentage of trading periods or trades that result in positive returns.
Formula:
$$\text{Win Rate} = \frac{\text{Number of Winning Trades}}{\text{Total Number of Trades}} \times 100\%$$
Interpretation:
- 50%: Break-even only if the average win equals the average loss (before costs)
- > 50%: More wins than losses
- > 60%: Good for mean-reversion strategies
- < 50%: Can still be profitable if wins are large (trend-following)
Win rate vs Profit factor:
- High win rate + low profit factor = Many small wins, few large losses (mean-reversion)
- Low win rate + high profit factor = Few large wins, many small losses (trend-following)
Misleading metric: A 90% win rate means nothing if the 10% of losses wipe out all gains. Must consider average win vs average loss size.
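The 90% example can be checked numerically. The sketch below uses invented trade returns and a hypothetical `trade_stats` helper; expectancy here means the average return per trade.

```python
def trade_stats(trade_returns):
    """Summarise a list of per-trade returns."""
    wins = [r for r in trade_returns if r > 0]
    losses = [r for r in trade_returns if r < 0]
    return {
        "win_rate": 100 * len(wins) / len(trade_returns),
        "avg_win": sum(wins) / len(wins) if wins else 0.0,
        "avg_loss": sum(losses) / len(losses) if losses else 0.0,
        "profit_factor": sum(wins) / abs(sum(losses)) if losses else float("inf"),
        "expectancy": sum(trade_returns) / len(trade_returns),
    }

# Hypothetical record: nine +1% winners and one -12% loser.
stats = trade_stats([0.01] * 9 + [-0.12])
```

The win rate is 90%, yet the profit factor is below 1 and the expectancy is negative, so the strategy loses money on average.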
Use case: Diagnostic metric for trading strategies. Compare win rates across regimes - expect high win rate in favorable regimes, lower in unfavorable.
See also: Profit Factor, Sharpe Ratio, Backtesting
Reference: Investopedia - Win Rate
Section X
White Noise
Definition: A random signal with constant power spectral density - completely uncorrelated across time.
Properties:
- Zero mean: $E[\epsilon_t] = 0$
- Constant variance: $\text{Var}(\epsilon_t) = \sigma^2$
- No autocorrelation: $\text{Cov}(\epsilon_t, \epsilon_s) = 0$ for $t \neq s$
Gaussian white noise: $\epsilon_t \sim \mathcal{N}(0, \sigma^2)$ and uncorrelated
Use case:
- Null model for time series (residuals should be white noise if model is good)
- Component of many time series models (noise term)
- If returns are white noise → unpredictable (efficient markets)
Testing: Use Ljung-Box test on residuals to check for white noise
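The Ljung-Box statistic is simple enough to compute by hand, which helps show what the test measures. This sketch returns only the Q statistic; turning it into a p-value requires a chi-square distribution, which the standard library does not provide, so the comparison against the 5% critical value for one lag (3.841) is hard-coded as an assumption.

```python
def ljung_box_q(xs, max_lag):
    """Ljung-Box Q statistic: n(n+2) * sum_k rho_k^2 / (n - k) for k = 1..max_lag."""
    n = len(xs)
    mean = sum(xs) / n
    denom = sum((x - mean) ** 2 for x in xs)
    total = 0.0
    for k in range(1, max_lag + 1):
        rho_k = sum((xs[t] - mean) * (xs[t - k] - mean) for t in range(k, n)) / denom
        total += rho_k ** 2 / (n - k)
    return n * (n + 2) * total

# A perfectly alternating series is strongly autocorrelated, so it is not white noise.
series = [1.0, -1.0] * 10
q = ljung_box_q(series, max_lag=1)
rejects_white_noise = q > 3.841   # chi-square 5% critical value, 1 degree of freedom
```

When the test is applied to model residuals rather than raw data, the degrees of freedom are usually reduced by the number of fitted parameters.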
See also: Autocorrelation, Stationary
Contributing
Found an error or want to suggest an addition? Please open an issue or contact us.
Last updated: October 12, 2025