42 Appendix A: Mathematical Foundations

This appendix collects the mathematical prerequisites and intermediate results needed for the course. It is organized by mathematical area, not by session. Use it as a reference; consult specific sections as needed.

Notation conventions:

- $\mathbb{R}$ denotes the real numbers; $\mathbb{R}^n$ denotes $n$-dimensional Euclidean space
- $\mathbb{N} = \{0, 1, 2, \ldots\}$ denotes the non-negative integers
- $\mathbb{P}$ denotes a probability measure; $E[\cdot]$ denotes expectation under $\mathbb{P}$
- $\mathcal{N}(\mu, \sigma^2)$ denotes the normal distribution with mean $\mu$ and variance $\sigma^2$
- $W_t$ denotes a standard Brownian motion

42.1 A.1 Probability Theory Essentials

42.1.1 A.1.1 Random Variables and Distributions

A random variable $X$ is a measurable function from a probability space $(\Omega, \mathcal{F}, \mathbb{P})$ to $\mathbb{R}$.

Expectation: $E[X] = \int_\Omega X(\omega) d\mathbb{P}(\omega)$

Variance: $\text{Var}(X) = E[(X - E[X])^2] = E[X^2] - (E[X])^2$

Conditional expectation: $E[X | \mathcal{G}]$ for $\sigma$-algebra $\mathcal{G} \subseteq \mathcal{F}$. The conditional expectation is itself a random variable, measurable with respect to $\mathcal{G}$.

Key property (tower): $E[E[X | \mathcal{G}]] = E[X]$

42.1.2 A.1.2 The Normal Distribution

If $X \sim \mathcal{N}(\mu, \sigma^2)$:

Density: $f_X(x) = \dfrac{1}{\sigma\sqrt{2\pi}} \exp\left(-\dfrac{(x-\mu)^2}{2\sigma^2}\right)$
Mean: $\mu$; Variance: $\sigma^2$
Moment generating function: $M_X(t) = e^{\mu t + \sigma^2 t^2/2}$
Standardization: $Z = (X - \mu)/\sigma \sim \mathcal{N}(0, 1)$

Important moments:

Moment	Value
$E[X]$	$\mu$
$\text{Var}(X)$	$\sigma^2$
$E[(X - \mu)^3]$	0 (symmetry)
$E[(X - \mu)^4]$	$3\sigma^4$
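The standardization rule can be checked numerically. The sketch below computes a normal tail probability from the standard library's complementary error function; the helper name `normal_tail` and the example parameters are illustrative choices, not part of the course materials.

```python
import math

def normal_tail(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Return P(X > x) for X ~ N(mu, sigma^2) via standardization."""
    z = (x - mu) / sigma                      # Z = (X - mu) / sigma ~ N(0, 1)
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Tail of the standard normal at 1.96.
p_std = normal_tail(1.96)

# Same z-score for a hypothetical N(1, 2^2) variable: the tail must agree.
p_shifted = normal_tail(1.0 + 1.96 * 2.0, mu=1.0, sigma=2.0)

print(f"P(Z > 1.96) = {p_std:.4f}")
```

Because the tail depends on $x$ only through the z-score, `p_std` and `p_shifted` coincide up to floating-point rounding.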

42.1.3 A.1.3 Jensen’s Inequality (Foundational for GE-LAV)

Statement: For a convex function $f: \mathbb{R} \to \mathbb{R}$ and any random variable $X$ with finite expectation: \[f(E[X]) \leq E[f(X)]\] with strict inequality when $f$ is strictly convex and $X$ is non-degenerate.

For concave $f$: the inequality reverses: $f(E[X]) \geq E[f(X)]$.

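Proof sketch (convex case): By the supporting-line property of convex functions, for any $x_0$ there is a slope $c$ (equal to $f'(x_0)$ wherever $f$ is differentiable) such that $f(x) \geq f(x_0) + c(x - x_0)$ for all $x$. Take $x_0 = E[X]$, substitute $x = X$, and take expectations of both sides: $E[f(X)] \geq f(E[X]) + c \cdot E[X - E[X]] = f(E[X])$. ∎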
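A small discrete example makes the inequality concrete. The sketch below uses a hypothetical three-point distribution and the convex function $f(x) = x^2$, for which the Jensen gap $E[f(X)] - f(E[X])$ is exactly $\text{Var}(X)$; the support, probabilities, and helper name are assumptions chosen for illustration.

```python
def expectation(values, probs, f=lambda x: x):
    """E[f(X)] for a discrete X taking values[i] with probability probs[i]."""
    return sum(p * f(v) for v, p in zip(values, probs))

values = [-1.0, 0.0, 2.0]       # hypothetical support
probs = [0.25, 0.5, 0.25]       # hypothetical probabilities (sum to 1)

def square(x):
    return x * x                # convex test function

mean = expectation(values, probs)
f_of_mean = square(mean)                          # f(E[X])
mean_of_f = expectation(values, probs, square)    # E[f(X)]
jensen_gap = mean_of_f - f_of_mean                # equals Var(X) when f(x) = x^2

print(f"f(E[X]) = {f_of_mean}, E[f(X)] = {mean_of_f}, gap = {jensen_gap}")
```

The gap is strictly positive here because $f$ is strictly convex and $X$ is non-degenerate.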

Application to GE-LAV: The discount factor $f(L) = e^{-r(L)T}$ is convex in $L$ when $r(L)$ is affine, and more generally wherever $T\,r'(L)^2 \geq r''(L)$. This produces the systematic upward Jensen bias in LAV vs. DCF.

42.1.4 A.1.4 Conditional Expectation Properties

For random variables $X, Y$:

Linearity: $E[aX + bY | \mathcal{G}] = aE[X | \mathcal{G}] + bE[Y | \mathcal{G}]$
Pull-out: If $Z$ is $\mathcal{G}$-measurable: $E[ZX | \mathcal{G}] = Z E[X | \mathcal{G}]$
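Tower property: For $\mathcal{H} \subseteq \mathcal{G}$: $E[E[X | \mathcal{G}] | \mathcal{H}] = E[X | \mathcal{H}]$; in particular $E[E[X | \mathcal{G}]] = E[X]$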
Independence: If $X$ independent of $\mathcal{G}$: $E[X | \mathcal{G}] = E[X]$

42.2 A.2 Stochastic Processes

42.2.1 A.2.1 Definition and Classification

A stochastic process $(X_t)_{t \in T}$ is a collection of random variables indexed by time. Common index sets: $T = \mathbb{N}$ (discrete time) or $T = [0, \infty)$ (continuous time).

Classification by state space:

- Discrete state: e.g., Markov chains on $\{0, 1, 2, \ldots\}$
- Continuous state: e.g., Brownian motion, OU process

Classification by index set:

- Discrete time: $(X_n)_{n \in \mathbb{N}}$
- Continuous time: $(X_t)_{t \geq 0}$

42.2.2 A.2.2 Markov Property

A process $(X_t)$ has the Markov property if, for all $s < t$: \[\mathbb{P}(X_t \in A | \mathcal{F}_s) = \mathbb{P}(X_t \in A | X_s)\] where $\mathcal{F}_s$ is the natural filtration.

In words: “future depends on past only through the present.” All the SDE-based processes in this course (Brownian motion, OU, GE-LAV state) are Markov.

42.2.3 A.2.3 Random Walks (Discrete-Time Building Block)

A simple random walk $(S_n)_{n=0,1,2,\ldots}$:

- $S_0 = 0$
- $S_n = X_1 + X_2 + \cdots + X_n$
- $X_i$ i.i.d. with $\mathbb{P}(X_i = +1) = \mathbb{P}(X_i = -1) = 1/2$

Properties:

- $E[S_n] = 0$
- $\text{Var}(S_n) = n$ (variance grows linearly with time)
- The walk is recurrent; the analogous walk on $\mathbb{Z}^d$ is recurrent for $d = 1, 2$ and transient for $d \geq 3$ (Pólya's theorem)

Scaling limit: The rescaled process $S_{\lfloor nt \rfloor}/\sqrt{n}$ converges in distribution to standard Brownian motion $W_t$ as $n \to \infty$ (Donsker’s theorem).
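The mean and variance of $S_n$ can be verified exactly, without simulation, because $S_n = 2K - n$ with $K \sim \text{Binomial}(n, 1/2)$. The sketch below is one way to do this; the function names are illustrative.

```python
from math import comb

def walk_distribution(n: int) -> dict[int, float]:
    """Exact law of S_n, using S_n = 2K - n with K ~ Binomial(n, 1/2)."""
    return {2 * k - n: comb(n, k) / 2**n for k in range(n + 1)}

def walk_moments(n: int) -> tuple[float, float]:
    """Return (E[S_n], Var(S_n)) computed from the exact distribution."""
    dist = walk_distribution(n)
    mean = sum(s * p for s, p in dist.items())
    var = sum((s - mean) ** 2 * p for s, p in dist.items())
    return mean, var

for n in (10, 100):
    mean, var = walk_moments(n)
    print(f"n = {n}: E[S_n] = {mean:.6f}, Var(S_n) = {var:.6f}")
```

The variance equals $n$, so $S_n/\sqrt{n}$ has unit variance for every $n$, which is the normalization used in the scaling limit.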

42.3 A.3 Brownian Motion

42.3.1 A.3.1 Definition

A stochastic process $W_t$ is a standard Brownian motion if:

$W_0 = 0$
Paths $t \mapsto W_t$ are continuous almost surely
Increments are independent: for $s < t$, $W_t - W_s$ is independent of $\mathcal{F}_s$
Increments are normally distributed: $W_t - W_s \sim \mathcal{N}(0, t - s)$

42.3.2 A.3.2 Key Properties

Martingale property: $E[W_t | \mathcal{F}_s] = W_s$ for $s \leq t$.

Quadratic variation: $[W, W]_t = t$ — Brownian motion accumulates quadratic variation at unit rate.

Self-similarity (scaling): For any $c > 0$, the process $(c^{-1/2} W_{ct})$ has the same law as $(W_t)$.

Non-differentiability: Almost surely, $W_t$ is nowhere differentiable.

Time-reversal: The process $(W_T - W_{T-t})_{0 \leq t \leq T}$ has the same law as $(W_t)_{0 \leq t \leq T}$ for any $T > 0$.
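The quadratic variation property above can be illustrated by simulation: on a fine partition of $[0, T]$, the sum of squared increments of one path should be close to $T$, even though the path itself is random. The sketch below is a minimal illustration with an arbitrary seed and step count, both chosen only for this example.

```python
import math
import random

def brownian_increments(T: float, n: int, rng: random.Random) -> list[float]:
    """n independent N(0, T/n) increments of a standard Brownian motion on [0, T]."""
    sd = math.sqrt(T / n)
    return [rng.gauss(0.0, sd) for _ in range(n)]

rng = random.Random(0)            # fixed seed so the run is reproducible
T, n = 1.0, 100_000
dW = brownian_increments(T, n, rng)

W_T = sum(dW)                               # terminal value, distributed N(0, T)
quad_var = sum(x * x for x in dW)           # realized quadratic variation

print(f"W_T = {W_T:.4f}, sum of squared increments = {quad_var:.4f}")
```

The realized quadratic variation has mean $T$ and standard deviation $T\sqrt{2/n}$, so it concentrates around $T$ as the partition is refined, while $W_T$ stays random.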

42.3.3 A.3.3 Multidimensional Brownian Motion

$d$-dimensional standard Brownian motion: $W_t = (W_t^1, W_t^2, ..., W_t^d)$ where each $W_t^i$ is an independent 1D standard Brownian motion.

Cross-covariance: $\text{Cov}(W_t^i, W_t^j) = t \cdot \delta_{ij}$ (Kronecker delta).

42.4 A.4 Itô Calculus

42.4.1 A.4.1 Stochastic Integral

For an $\mathcal{F}_t$-adapted process $f(t, \omega)$ satisfying $\int_0^T E[f(t, \omega)^2] dt < \infty$, the Itô integral: \[\int_0^T f(t, \omega) dW_t\] is well-defined as the $L^2$ limit of step-function approximations.

Key property (Itô isometry): \[E\left[\left(\int_0^T f dW_t\right)^2\right] = \int_0^T E[f^2] dt\]

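The right-hand side is the squared $L^2(\Omega \times [0, T])$ norm of $f$, so the Itô integral preserves norms between that space and $L^2(\Omega)$. Variance calculations for stochastic integrals rest on this identity.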

42.4.2 A.4.2 Itô SDEs

A stochastic differential equation in Itô form: \[dX_t = \mu(X_t, t) dt + \sigma(X_t, t) dW_t\] with initial condition $X_0$, has a unique strong solution under standard conditions:

- $\mu, \sigma$ are Lipschitz continuous in $x$, uniformly in $t$
- $\mu, \sigma$ satisfy a linear growth bound in $x$

42.4.3 A.4.3 Itô’s Lemma (Chain Rule)

For a smooth function $f: \mathbb{R} \to \mathbb{R}$ and an Itô process $X_t$: \[df(X_t) = f'(X_t) dX_t + \frac{1}{2} f''(X_t) \sigma^2(X_t, t) dt\]

The Itô correction $\frac{1}{2} f''(X_t) \sigma^2 dt$ is what distinguishes stochastic calculus from ordinary calculus. It is the second-order Taylor term, which contributes at first order in $dt$ because $(dW_t)^2 = dt$ in the mean-square sense.

Multivariate version: For $f: \mathbb{R}^d \to \mathbb{R}$: \[df(X_t) = \sum_i \partial_i f(X_t) dX_t^i + \frac{1}{2} \sum_{i,j} \partial_{ij} f(X_t) \, d[X^i, X^j]_t\] where $d[X^i, X^j]_t = (\sigma\sigma^T)_{ij}(X_t, t) dt$.
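Worked example (one-dimensional case): take $X_t = W_t$, so $\mu = 0$ and $\sigma = 1$, and $f(x) = x^2$, so $f'(x) = 2x$ and $f''(x) = 2$. Itô's lemma gives \[d(W_t^2) = 2 W_t \, dW_t + dt\] and, integrating from $0$ to $T$, \[W_T^2 = 2\int_0^T W_t \, dW_t + T\]

The stochastic integral has zero expectation, so $E[W_T^2] = T$, in agreement with $W_T \sim \mathcal{N}(0, T)$. Ordinary calculus would omit the $+\,dt$ term and wrongly give $E[W_T^2] = 0$.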

42.4.4 A.4.4 Connection to Jensen’s Inequality

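Apply Itô’s lemma to $f(X) = e^{-rX}$ with $r > 0$ (a convex function):

- $f'(X) = -r e^{-rX}$
- $f''(X) = r^2 e^{-rX}$

If $X$ has zero diffusion, $X_T$ is deterministic and there is no correction. If $X$ is stochastic, $E[f(X_T)] > f(E[X_T])$ by Jensen. A second-order Taylor expansion around $E[X_T]$ gives a gap of approximately $\frac{r^2}{2} e^{-r E[X_T]} \text{Var}(X_T)$, that is, a relative magnitude of $\frac{r^2}{2} \text{Var}(X_T)$ to leading order.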

This is exactly the Jensen bias mechanism that makes $V^{LAV} > V^{DCF}$ when $r(L)$ is convex in $L$ and $L$ is stochastic.
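When $X_T$ is Gaussian, the Jensen gap for $f(x) = e^{-rx}$ is available in closed form from the normal moment generating function (A.1.2), so the second-order approximation $\frac{1}{2} f''(E[X_T]) \text{Var}(X_T)$ can be checked directly. The sketch below does this; the numerical inputs are illustrative assumptions, not calibrated course values.

```python
import math

def jensen_gap_gaussian(r: float, mean: float, var: float) -> tuple[float, float]:
    """Exact and second-order Jensen gap for f(x) = exp(-r x), X ~ N(mean, var)."""
    f_of_mean = math.exp(-r * mean)                        # f(E[X])
    mean_of_f = math.exp(-r * mean + 0.5 * r * r * var)    # E[f(X)]: normal MGF at t = -r
    exact = mean_of_f - f_of_mean
    second_order = 0.5 * r * r * var * f_of_mean           # (1/2) f''(E[X]) Var(X)
    return exact, second_order

# Illustrative inputs only (assumed for this example).
exact, second_order = jensen_gap_gaussian(r=0.075, mean=1.0, var=0.1138)
print(f"exact gap = {exact:.6e}, second-order approximation = {second_order:.6e}")
```

The two numbers agree closely whenever $r^2 \text{Var}(X_T)$ is small, and the exact gap is always positive, as Jensen's inequality requires.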

42.5 A.5 Ornstein-Uhlenbeck Process

42.5.1 A.5.1 Definition

The OU process is the Itô SDE: \[dL_t = \kappa(\bar{L} - L_t) dt + \sigma dW_t\] with parameters $\kappa > 0$ (mean reversion speed), $\bar{L}$ (long-run mean), $\sigma > 0$ (volatility).

42.5.2 A.5.2 Closed-Form Solution

Using the integrating factor $e^{\kappa t}$: \[L_t = e^{-\kappa t} L_0 + \bar{L}(1 - e^{-\kappa t}) + \sigma \int_0^t e^{-\kappa(t-s)} dW_s\]

The integral is Gaussian (linear in $dW$), so $L_t$ given $L_0$ is Gaussian.

Conditional distribution: \[L_t | L_0 \sim \mathcal{N}\left(\bar{L} + (L_0 - \bar{L})e^{-\kappa t}, \frac{\sigma^2}{2\kappa}(1 - e^{-2\kappa t})\right)\]
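Derivation of the moments: the stochastic integral in the closed-form solution has zero expectation, so \[E[L_t | L_0] = e^{-\kappa t} L_0 + \bar{L}(1 - e^{-\kappa t}) = \bar{L} + (L_0 - \bar{L}) e^{-\kappa t}\]

For the variance, apply the Itô isometry (A.4.1) with the deterministic integrand $e^{-\kappa(t-s)}$: \[\text{Var}(L_t | L_0) = \sigma^2 E\left[\left(\int_0^t e^{-\kappa(t-s)} dW_s\right)^2\right] = \sigma^2 \int_0^t e^{-2\kappa(t-s)} ds = \frac{\sigma^2}{2\kappa}(1 - e^{-2\kappa t})\]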

42.5.3 A.5.3 Stationary Distribution

As $t \to \infty$, the OU distribution converges: \[L_\infty \sim \mathcal{N}\left(\bar{L}, \frac{\sigma^2}{2\kappa}\right)\]

This is the invariant distribution of the OU process — independent of starting point $L_0$.

42.5.4 A.5.4 Calibrated Parameters (GE-LAV)

Parameter	Value	Interpretation
$\kappa$	0.45/yr	Mean reversion speed
$\sigma$	0.32	Volatility coefficient
$\bar{L}$	1.0 (or 0.0 depending on normalization)	Long-run mean
$\text{Var}(L_\infty)$	0.1138	Stationary variance
Std$(L_\infty)$	0.337	Stationary standard deviation
Half-life	1.54 years	$\ln(2)/\kappa$

42.6 A.6 Variational Inequalities and Free Boundaries

42.6.1 A.6.1 The American Option Analogue

The GE-LAV exit problem is an optimal stopping problem. The value function satisfies: \[\min\left\{ -\mathcal{L}V + rV - C, \quad V - G \right\} = 0\]

where:

- $\mathcal{L}$ is the HJB operator (drift + diffusion + time)
- $G(L)$ is the exit payoff
- The first argument is the continuation condition
- The second argument is the exercise condition

This is mathematically identical to American option pricing.

42.6.2 A.6.2 Smooth Pasting

At the free boundary $L^*(t)$:

1. Value match: $V(L^*, t) = G(L^*)$
2. Smooth pasting: $\partial_L V(L^*, t) = G'(L^*)$

Together with the PDE in the continuation region, these two conditions determine $L^*(t)$.

42.6.3 A.6.3 Numerical Methods

For the GE-LAV exit problem:

- Discretize the state space: $L \in [-3, 3]$ on a 200-point grid
- Discretize time: backward induction from $t = T$ to $t = 0$
- At each step: solve the HJB PDE assuming continuation, then take the maximum with the exit payoff
- Output: the exit boundary $L^*(t)$ emerges from the switching surface
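The steps above can be sketched on a toy problem. The code below is not the course's solver: it assumes OU dynamics for $L$, reads $C$ as a running payoff, and uses a hypothetical flow $C(L) = 0.10 + 0.05L$, a hypothetical constant exit payoff $G = 1$, a discount rate of $7.5\%$, a one-year horizon, an explicit finite-difference step, and linear extrapolation at the grid edges. All of these are assumptions made only to keep the example self-contained.

```python
def solve_exit_problem(kappa=0.45, sigma=0.32, L_bar=0.0, r=0.075,
                       T=1.0, n_grid=200, n_steps=500):
    """Explicit finite-difference backward induction for a toy stopping problem."""
    lo, hi = -3.0, 3.0
    dL = (hi - lo) / (n_grid - 1)
    dt = T / n_steps
    grid = [lo + i * dL for i in range(n_grid)]
    C = [0.10 + 0.05 * x for x in grid]       # hypothetical running payoff
    G = [1.0 for _ in grid]                   # hypothetical exit payoff

    # The explicit scheme is only stable for small time steps.
    assert dt * (sigma**2 / dL**2 + r) < 1.0, "time step too large"

    V = G[:]                                  # terminal condition V(L, T) = G(L)
    for _ in range(n_steps):
        cont = V[:]
        for i in range(1, n_grid - 1):
            V_L = (V[i + 1] - V[i - 1]) / (2 * dL)
            V_LL = (V[i + 1] - 2 * V[i] + V[i - 1]) / dL**2
            gen = kappa * (L_bar - grid[i]) * V_L + 0.5 * sigma**2 * V_LL
            cont[i] = V[i] + dt * (gen - r * V[i] + C[i])    # value if continuing
        cont[0] = 2 * cont[1] - cont[2]       # linear extrapolation at the edges
        cont[-1] = 2 * cont[-2] - cont[-3]
        V = [max(c, g) for c, g in zip(cont, G)]             # exit where payoff is larger

    exit_region = [x for x, v, g in zip(grid, V, G) if v <= g]
    boundary = max(exit_region) if exit_region else None     # grid estimate of L*(0)
    return grid, V, G, boundary

grid, V, G, L_star = solve_exit_problem()
print(f"estimated exit boundary at t = 0: L* = {L_star:.3f}")
```

In this toy setup exit is optimal for low $L$, so the boundary is reported as the largest grid point at which the value equals the exit payoff. An implicit or Crank-Nicolson step would relax the stability restriction on the time step.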

42.7 A.7 Convex Analysis Essentials

42.7.1 A.7.1 Convex Functions

A function $f: \mathbb{R} \to \mathbb{R}$ is convex if: \[f(\lambda x + (1-\lambda) y) \leq \lambda f(x) + (1-\lambda) f(y)\] for all $x, y \in \mathbb{R}$ and $\lambda \in [0, 1]$.

Strict convexity: Inequality is strict whenever $x \neq y$.

Equivalent for twice-differentiable $f$: $f''(x) \geq 0$ for all $x$ (strictly convex if $f''(x) > 0$).

42.7.2 A.7.2 Convexity in GE-LAV

The premium function $\pi(L) = \pi_0 - \pi_1 L + \pi_2 L^2$ with $\pi_2 > 0$ is strictly convex.

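The discount factor $f(L) = e^{-r(L)T}$ is convex in $L$ when $r(L)$ is affine, since the exponential of an affine function is convex. For general twice-differentiable $r$, $f''(L) = T e^{-r(L)T}\left[T\,r'(L)^2 - r''(L)\right]$, so $f$ is convex wherever $T\,r'(L)^2 \geq r''(L)$.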

This convexity is the source of the Jensen bias in private market valuation.

42.8 A.8 Game Theory: Nash Equilibrium and Mean-Field

42.8.1 A.8.1 Nash Equilibrium

A Nash equilibrium for a game with $N$ players is a profile $(s_1^*, ..., s_N^*)$ such that: \[U_i(s_i^*, s_{-i}^*) \geq U_i(s_i, s_{-i}^*) \quad \forall s_i, \forall i\]

Each player’s strategy is a best response given others’ strategies; mutually consistent.

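Existence (Nash 1950): Every finite game has at least one Nash equilibrium in mixed strategies. For games with continuous strategy spaces, additional regularity conditions are needed (for example, compact convex strategy sets and continuous, quasi-concave payoffs).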

42.8.2 A.8.2 Mean-Field Approximation

When $N$ is large, tracking individual strategies is intractable. Replace with the distribution of strategies across the population: \[\mu_t = \text{distribution of } (s_t^i) \text{ across } i\]

Each player reacts to $\mu_t$ rather than to individual others. As $N \to \infty$, this becomes exact (propagation of chaos).

42.8.3 A.8.3 Fixed-Point Structure

A mean-field equilibrium is a distribution $\mu^*$ such that:

1. Each player’s optimal strategy, given $\mu^*$, is $s^*(\mu^*)$
2. The aggregate distribution of $s^*(\mu^*)$ across the population equals $\mu^*$

This is a fixed point: $\mu^* = \Phi(\mu^*)$ where $\Phi$ is the mapping from population state to individual best response back to aggregate.

Existence: Schauder fixed-point theorem (a continuous map from a compact convex subset of a Banach space into itself has a fixed point).

Uniqueness: Banach fixed-point theorem under contraction (stability condition).
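Under the contraction condition, the fixed point can be computed by simple iteration of $\Phi$. The sketch below uses a hypothetical scalar map $\Phi(m) = 0.5m + 1$ as a stand-in for the population-to-best-response mapping; a real mean-field problem would iterate on a distribution rather than a number.

```python
def picard_iteration(phi, x0: float, tol: float = 1e-12, max_iter: int = 1000):
    """Iterate x <- phi(x) until successive values differ by less than tol."""
    x = x0
    for k in range(1, max_iter + 1):
        x_next = phi(x)
        if abs(x_next - x) < tol:
            return x_next, k
        x = x_next
    raise RuntimeError("no convergence within max_iter iterations")

def phi(m: float) -> float:
    """Hypothetical aggregate response map; Lipschitz constant 0.5 < 1."""
    return 0.5 * m + 1.0

m_star, n_iter = picard_iteration(phi, x0=10.0)
print(f"fixed point = {m_star:.6f} after {n_iter} iterations")
```

Because the map contracts distances by a factor of 0.5, the error halves at every step and the limit does not depend on the starting point, which is the content of the Banach fixed-point theorem.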

42.9 A.9 Optimization and Lagrange Methods

42.9.1 A.9.1 Unconstrained Optimization

For smooth $f: \mathbb{R}^n \to \mathbb{R}$, an interior critical point satisfies $\nabla f = 0$.

Second-order condition: Hessian $\nabla^2 f$ positive (negative) definite for local minimum (maximum).

42.9.2 A.9.2 Constrained Optimization

For $\max f(x)$ subject to $g(x) = 0$: Lagrangian: $L(x, \lambda) = f(x) - \lambda g(x)$

First-order conditions: $\nabla_x L = 0$, $\nabla_\lambda L = 0$ (i.e., $g(x) = 0$).
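Worked example (an illustration, not a course model): maximize $f(x, y) = xy$ subject to $g(x, y) = x + y - 1 = 0$. The Lagrangian is \[L(x, y, \lambda) = xy - \lambda(x + y - 1)\] and the first-order conditions are \[\partial_x L = y - \lambda = 0, \quad \partial_y L = x - \lambda = 0, \quad \partial_\lambda L = -(x + y - 1) = 0\]

The first two conditions give $x = y = \lambda$, and the constraint then gives $x = y = 1/2$, with $\lambda = 1/2$ and maximum value $f = 1/4$.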

42.9.3 A.9.3 Variational Methods

For functionals $J[u] = \int F(u, u', t) dt$, the Euler-Lagrange equation: \[\frac{\partial F}{\partial u} - \frac{d}{dt}\frac{\partial F}{\partial u'} = 0\]

The HJB equation is the optimal-control counterpart: it arises from applying the dynamic programming principle to the value function.

42.10 A.10 Quick Reference Formulas

42.10.1 A.10.1 OU Calibration

At canonical values $\kappa = 0.45$, $\sigma = 0.32$:

- Stationary variance: $0.1138$
- Stationary std: $0.337$
- Half-life: $1.54$ yr
- $1 - e^{-\kappa t}$ at $t = 5$: $0.895$
- $1 - e^{-2\kappa t}$ at $t = 5$: $0.989$
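These values follow directly from the formulas in A.5. A short sketch that recomputes them (the function and key names are illustrative):

```python
import math

def ou_summary(kappa: float, sigma: float, t: float) -> dict[str, float]:
    """Derived OU quantities from the mean-reversion speed and volatility."""
    stat_var = sigma**2 / (2 * kappa)                    # sigma^2 / (2 kappa)
    return {
        "stationary_variance": stat_var,
        "stationary_std": math.sqrt(stat_var),
        "half_life": math.log(2) / kappa,                # ln(2) / kappa
        "mean_decay": 1 - math.exp(-kappa * t),          # 1 - e^{-kappa t}
        "variance_decay": 1 - math.exp(-2 * kappa * t),  # 1 - e^{-2 kappa t}
    }

summary = ou_summary(kappa=0.45, sigma=0.32, t=5.0)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```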

42.10.2 A.10.2 Jensen Bias

Affine approximation: $B(T) = A \cdot T + C$ with $A \approx 0.16\text{–}0.18\%$/yr by asset class
Closed-form: $B(T) = (\pi_2/2) \Pi_{liq}(T)$ where $\Pi_{liq}(T) = \sigma^2[T - (1-e^{-2\kappa T})/(2\kappa)]$
At calibrated $\pi_2 = 0.021$: $B(5) \approx 0.4\%$, $B(15) \approx 1.5\%$, $B(20) \approx 2.0\%$
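The closed-form bias can be evaluated directly from the formula above with $\kappa = 0.45$ and $\sigma = 0.32$. In the sketch below the function names are illustrative, and the defaults simply restate the calibrated values quoted in this appendix.

```python
import math

def pi_liq(T: float, kappa: float = 0.45, sigma: float = 0.32) -> float:
    """Pi_liq(T) = sigma^2 * [T - (1 - exp(-2 kappa T)) / (2 kappa)]."""
    return sigma**2 * (T - (1 - math.exp(-2 * kappa * T)) / (2 * kappa))

def jensen_bias(T: float, pi_2: float = 0.021,
                kappa: float = 0.45, sigma: float = 0.32) -> float:
    """B(T) = (pi_2 / 2) * Pi_liq(T), returned as a decimal fraction."""
    return 0.5 * pi_2 * pi_liq(T, kappa, sigma)

biases = {T: jensen_bias(T) for T in (5, 15, 20)}
for T, b in biases.items():
    print(f"B({T}) = {100 * b:.2f}%")
```

For long horizons the exponential term vanishes and $B(T)$ grows approximately linearly in $T$, which is why an affine approximation is workable.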

42.10.3 A.10.3 Effective Rate

DCF rate: typically constant at $r_f + \beta \cdot ERP + \pi_0 \approx 7.5\%$
LAV rate: path-dependent at $r(L_t) = r_f + \pi(L_t)$
GE-LAV rate: $r_{GE}(L, \mu) = r(L) + \text{equilibrium uplift}$
At GFC depth ($L = -1.5$): $r_{GE} \approx 32\%$ vs. DCF $7.5\%$ → 4.31× amplification

42.10.4 A.10.4 Welfare

Welfare gap: $\Delta W \approx 2.3\%$/yr
Aggregate: $\$13\text{T} \times 2.3\% \approx \$300\text{B}$/yr
Pigouvian tax at GFC depth: $\tau^* \approx 7\%$ on secondary transactions

42.11 A.11 Mathematical Maturity Required for Each Track

42.11.1 Track 1 (Practitioner)

Required:

- Probability theory through expectations and variances (A.1)
- Comfortable with the normal distribution (A.1.2)
- Aware of Jensen’s inequality and its direction (A.1.3, A.7)
- Basic familiarity with stochastic processes as concepts (A.2)

Recommended but not required:

- Recognition of SDE notation
- Awareness of the HJB equation as a concept (not derivation)
- Comfort reading mathematical statements in lectures

42.11.2 Track 2 (Researcher)

Required (in addition to Track 1):

- Itô calculus, including Itô’s lemma derivation (A.4)
- OU process derivation and properties (A.5)
- Variational inequalities and free boundary problems (A.6)
- Convex analysis (A.7)
- Game theory and mean-field methods (A.8)

Helpful:

- Functional analysis (Hilbert spaces, weak topologies)
- Numerical analysis (finite differences, iterative methods)
- Measure theory (Lebesgue integration, Radon-Nikodym)

42.12 A.12 Suggested Self-Diagnostic

Before the course (or by Session 4 at the latest), confirm understanding of:

For Track 1:

- [ ] Can you state Jensen’s inequality and identify whether $E[X^2] \geq (E[X])^2$ is consistent with it?
- [ ] If $X \sim \mathcal{N}(0, 1)$, approximately what is $\mathbb{P}(X > 1.96)$?
- [ ] If a random variable has mean 0 and variance 9, what is its standard deviation?
- [ ] What is a Markov chain, informally?

For Track 2:

- [ ] Can you derive Itô’s lemma from a Taylor expansion?
- [ ] Can you compute $\text{Var}\left(\int_0^t e^{-\kappa(t-s)} dW_s\right)$ using the Itô isometry?
- [ ] Can you state the HJB equation for a generic control problem?
- [ ] What is a fixed-point theorem? Name one.

If you cannot answer the Track 1 questions, the practitioner track will be challenging — additional preparation recommended. If you cannot answer the Track 2 questions, the researcher track will be very challenging — strongly recommend remediation first.
