Bayesian Statistics HW1¶
1. (Composition Method) Suppose a random variable $X$ follows a Laplace distribution with mean $\mu$ and scale $b$. That is, the pdf of $X$ is as follows: $$f(x) = \frac{1}{2b}\exp\biggl\{-\frac{|x-\mu|}{b} \biggr\}.$$ When $W \sim \text{Exp}\bigl(\frac{1}{2b^2} \bigr)$ and $X|(W=w) \sim \mathcal N(\mu, w)$, prove the following expression. $$\int_0^{\infty}f(x|w)f(w)dw = \frac{1}{2b}\exp\biggl\{ -\frac{|x-\mu|}{b} \biggr\}.$$ Also, using this composition method, extract 10000 samples from Laplace distribution with mean 1 and scale 2, draw histogram, and compare it with true Laplace pdf.¶
(1) MGF of $X$ : $M_X(t)$¶
$$\begin{aligned} E_X[e^{tX}] &= \int_{-\infty}^{\infty} e^{tx} \frac{1}{2b} \exp\left\{ -\frac{1}{b} |x - \mu| \right\} dx \\ &= \frac{1}{2b} \int_{-\infty}^{\infty} \exp\left\{ -\frac{1}{b} |x - \mu| + t x \right\} dx \\ &= \frac{1}{2b} \int_{-\infty}^{\mu} \exp\left\{ \frac{1}{b} (x - \mu) + t x \right\} dx + \frac{1}{2b} \int_{\mu}^{\infty} \exp\left\{ -\frac{1}{b} (x - \mu) + t x \right\} dx \end{aligned}$$

1)
$$\begin{aligned} &\int_{-\infty}^{\mu} \exp\left\{ \left( \frac{1}{b} + t \right) x - \frac{\mu}{b} \right\} dx \\ &= \left[ \frac{1}{\frac{1}{b} + t} \exp\left\{ \left( \frac{1}{b} + t \right) x - \frac{\mu}{b} \right\} \right]_{x = -\infty}^{x = \mu} \\ &= \frac{1}{\frac{1}{b} + t} \left( \exp\{ t \mu \} - 0 \right) \quad \text{(assuming } \frac{1}{b} + t > 0 \text{)} \end{aligned}$$

2)
$$\begin{aligned} &\int_{\mu}^{\infty} \exp\left\{ \left( t - \frac{1}{b} \right) x + \frac{\mu}{b} \right\} dx \\ &= \left[ \frac{1}{t - \frac{1}{b}} \exp\left\{ \left( t - \frac{1}{b} \right) x + \frac{\mu}{b} \right\} \right]_{x = \mu}^{x = \infty} \\ &= \frac{1}{t - \frac{1}{b}} \left( 0 - \exp\{ t \mu \} \right) \quad \text{(assuming } t - \frac{1}{b} < 0 \text{)} \end{aligned}$$Then,
$$\begin{aligned} E_X[e^{tX}] &= \frac{1}{2b} \exp\{ t \mu \} \left( \frac{1}{\frac{1}{b} + t} + \frac{1}{\frac{1}{b} - t} \right) \\ &= \frac{1}{2b} \cdot \frac{\frac{2}{b}}{\frac{1}{b^2} - t^2} \exp\{ t \mu \} \\ &= \frac{1}{b^2} \cdot \frac{1}{\frac{1}{b^2} - t^2} \exp\{ t \mu \} \\ &= \frac{\exp\{ t \mu \}}{1 - b^2 t^2} \quad \left( |t| < \frac{1}{b} \right). \end{aligned}$$

(2) $X \sim \text{Laplace}(\mu, b)$¶
$$ f_W(w) = \frac{1}{2b^2} \exp\left\{ -\frac{1}{2b^2} w \right\}, \quad f_{X|W}(x|w) = \frac{1}{\sqrt{2 \pi w}} \exp\left\{ -\frac{1}{2w} (x - \mu)^2 \right\} $$So,
$$ f_{X|W}(x|w) f_W(w) = \frac{1}{2b^2 \sqrt{2 \pi w}} \exp\left\{ -\frac{w}{2b^2} - \frac{1}{2w} (x - \mu)^2 \right\} $$and
$$ f_X(x) = \int_{0}^{\infty} \frac{1}{2b^2 \sqrt{2 \pi w}} \exp\left\{ -\frac{w}{2b^2} - \frac{1}{2w} (x - \mu)^2 \right\} dw $$Then,
$$\begin{aligned} E_X[e^{tX}] &= \int_{-\infty}^{\infty} \int_{0}^{\infty} e^{t x} \frac{1}{2b^2 \sqrt{2 \pi w}} \exp\left\{ -\frac{w}{2b^2} - \frac{1}{2w} (x - \mu)^2 \right\} dw \, dx\\ &= \int_{0}^{\infty} \frac{1}{2b^2 \sqrt{2 \pi w}} \exp\left\{ -\frac{w}{2b^2} \right\} \left( \int_{-\infty}^{\infty} e^{t x} \exp\left\{ -\frac{1}{2w} (x - \mu)^2 \right\} dx \right) dw \\ &= \int_{0}^{\infty} \frac{1}{2b^2} \exp\left\{ -\frac{w}{2b^2} \right\} \exp\left\{ t \mu + \frac{1}{2} t^2 w \right\} dw \\ &= \frac{1}{2b^2} e^{t \mu} \int_{0}^{\infty} \exp\left\{ \left( \frac{t^2}{2} - \frac{1}{2b^2} \right) w \right\} dw \\ &= \frac{1}{2b^2} \frac{e^{t \mu}}{\left( \frac{t^2}{2} - \frac{1}{2b^2} \right)} \left[ \exp\left\{ \left( \frac{t^2}{2} - \frac{1}{2b^2} \right) w \right\} \right]_{w = 0}^{w = \infty} \\ &= \frac{e^{t \mu}}{b^2 t^2 - 1} \left( 0 - 1 \right) \quad \text{(assuming } t^2 - \frac{1}{b^2} < 0 \text{)} \\ &= \frac{e^{t \mu}}{1 - b^2 t^2} \quad \left( |t| < \frac{1}{b} \right), \end{aligned} $$which is the MGF of $X \sim \text{Laplace}(\mu, b)$.
$$\therefore X \sim \text{Laplace}(\mu, b)$$

The following code draws 10000 samples from the Laplace distribution with mean 1 and scale 2 using the composition method, and overlays the resulting histogram on the true Laplace pdf.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import laplace
np.random.seed(2023311161)
# generate 10000 samples using the composition method:
# W ~ Exp(rate = 1/(2b^2)) with b = 2, i.e. numpy scale = 2*b^2 = 8
W = np.random.exponential(scale=8, size=10000)
# X | (W = w) ~ N(mu = 1, variance = w)
samples_array = np.random.normal(loc=1, scale=np.sqrt(W))
# histogram
plt.hist(samples_array, bins=80, density=True, alpha=0.4, color='g', label='Histogram')
# PDF of the Laplace distribution
x = np.linspace(-20, 22, 10000)
pdf = laplace.pdf(x, loc=1, scale=2)
plt.plot(x, pdf, 'r-', lw=1, label='Laplace PDF')
plt.axvline(x=1, color='black', linestyle='--',lw = 1)
plt.annotate(r'$x=1$', xy=(1, 0.01), xytext=(5, 0.1),
             arrowprops=dict(arrowstyle='->', color='black'),
             fontsize=12)
plt.text(12, 0.2, r'$\mu=1, b=2$', color = 'r')
# show the plot
plt.xlabel('X')
plt.ylabel('Prob')
plt.legend()
plt.title('Histogram Using Composition Method vs Laplace PDF')
plt.show()
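As a quick quantitative check (a sketch, not required by the assignment), the composition samples can be compared against the target distribution with a Kolmogorov–Smirnov test from `scipy.stats`:

```python
import numpy as np
from scipy.stats import kstest, laplace

rng = np.random.default_rng(0)
n = 10_000

# composition method: W ~ Exp(scale = 2*b^2), then X | W ~ N(mu, W)
mu, b = 1.0, 2.0
W = rng.exponential(scale=2 * b**2, size=n)
X = rng.normal(loc=mu, scale=np.sqrt(W))

# compare the empirical CDF of X with the true Laplace(1, 2) CDF
stat, pvalue = kstest(X, laplace(loc=mu, scale=b).cdf)
```

A small KS statistic (on the order of $1/\sqrt{n}$) indicates the samples are consistent with $\text{Laplace}(1, 2)$.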
2. Suppose the log-return series $\log\left(\frac{y_t}{y_{t-1}}\right)$ follows a Laplace distribution with mean $\mu$ and scale parameter $b$: $$f(x) = \frac{1}{2b} \exp\left( -\frac{|x - \mu|}{b} \right).$$ For each data point $y_t$, a latent variable $W_t$ can be 'augmented': $W_t \sim \text{Exp}\bigl( \frac{1}{2b^2}\bigr)$ for each $t=1,\dots,n$, where $n$ is the total number of observations. You are tasked with the following:
(1) Derive the full conditional posterior distribution of $W$, $\mu$, and $b^2$.
(2) Construct a Gibbs sampler code to sample from the joint posterior distribution of $\mu$, $b$, and $W$. You may use the following prior: $$ f(\mu, b^2) \propto \frac{1}{b^2}.$$
(3) Apply your Gibbs sampler to actual log-return data. Plot and summarize the posterior distributions of $\mu$ and $b^2$, reporting posterior means, credible intervals, and any other relevant posterior summaries. [Hint] You may utilize the result: $$\frac{b |y - \mu|}{w} \, \Big| \, (\mu, b^2, y) \sim \text{IG}.$$
(1) Posteriors¶
1) Joint Posterior:¶
$$ \begin{aligned} p(\mu, b^2, \mathbf{w} \mid \mathbf{y}) &\propto p(\mathbf{y} \mid \mu, b^2, \mathbf{w}) \cdot p(\mu, b^2, \mathbf{w}) \\ &= p(\mathbf{y} \mid \mu, \mathbf{w}) \cdot p(\mu, b^2) \cdot p(\mathbf{w} \mid \mu, b^2) \quad (\because \mathbf{y} \perp b^2 \mid \mathbf{w}) \\ &\propto \left[ \prod_{i=1}^{n} w_i^{-\frac{1}{2}} \exp\left\{ -\frac{1}{2w_i} (y_i - \mu)^2 \right\} \right] \cdot \frac{1}{b^2} \cdot \left[ \prod_{i=1}^{n} \frac{1}{2b^2} \exp\left( -\frac{w_i}{2b^2} \right) \right]. \end{aligned} $$

2) Full Conditional of $\mu$:¶
$$ \begin{aligned} p(\mu \mid b^2, \mathbf{w}, \mathbf{y}) &\propto \prod_{i=1}^{n} w_i^{-\frac{1}{2}} \exp\left\{ -\frac{1}{2w_i} (y_i - \mu)^2 \right\} \\ &\propto \exp\left\{ -\sum_{i=1}^{n} \frac{1}{2w_i} (y_i^2 - 2\mu y_i + \mu^2) \right\} \\ &\propto \exp\left\{ -\frac{1}{2} \sum_{i=1}^{n} \frac{y_i^2}{w_i} + \mu \sum_{i=1}^{n} \frac{y_i}{w_i} - \frac{1}{2} \sum_{i=1}^{n} \frac{1}{w_i} \mu^2 \right\} \\ &\propto \exp\left\{ -\frac{1}{2} \left( \sum_{i=1}^{n} \frac{1}{w_i} \right) \left( \mu - \frac{\sum_{i=1}^{n} \frac{y_i}{w_i}}{\sum_{i=1}^{n} \frac{1}{w_i}} \right)^2 \right\}. \end{aligned} $$Thus,
$$ \mu \mid (b^2, \mathbf{w}, \mathbf{y}) \sim \mathcal{N} \left( \frac{\sum_{i=1}^{n} \frac{y_i}{w_i}}{\sum_{i=1}^{n} \frac{1}{w_i}}, \frac{1}{\sum_{i=1}^{n} \frac{1}{w_i}} \right). $$

3) Full Conditional of $b^2$:¶
$$ \begin{aligned} p(b^2 \mid \mu, \mathbf{w}, \mathbf{y}) &\propto \frac{1}{b^2} \cdot \prod_{i=1}^{n} \frac{1}{2b^2} \exp\left( -\frac{w_i}{2b^2} \right) \\ &\propto (b^2)^{-(n+1)} \exp\left( -\frac{\sum_{i=1}^{n} w_i}{2b^2} \right). \end{aligned} $$This shows
$$ b^2 \mid (\mu, \mathbf{w}, \mathbf{y}) \sim \text{Inv-}\chi^2\left( 2n, \frac{\sum_{i=1}^{n} w_i}{2n} \right). $$Note: Since $\text{Inv-}\chi^2(\nu, \sigma^2) = \text{Inv-}\Gamma\left( \frac{\nu}{2}, \frac{\nu \sigma^2}{2} \right)$, we can also write:
$$ b^2 \mid (\mu, \mathbf{w}, \mathbf{y}) \sim \text{Inv-}\Gamma\left( n, \frac{1}{2} \sum_{i=1}^{n} w_i \right). $$

4) Full Conditional of $\mathbf{w}$:¶
$$ \begin{aligned} p(\mathbf{w} \mid \mu, b^2, \mathbf{y}) &= \prod_{i=1}^{n} p(w_i \mid \mu, b^2, y_i) \quad (\text{iid}) \\ p(w_i \mid \mu, b^2, y_i) &\propto w_i^{-\frac{1}{2}} \exp\left\{ -\frac{1}{2w_i} (y_i - \mu)^2 - \frac{w_i}{2b^2} \right\}. \end{aligned} $$Let $\zeta_i := \frac{b |y_i - \mu|}{w_i}$ $\quad \Rightarrow \quad w_i = \frac{b |y_i - \mu|}{\zeta_i}$.
Jacobian:
$$ \mathcal{J} = \frac{\partial w_i}{\partial \zeta_i} = -\frac{b |y_i - \mu|}{\zeta_i^2}, \qquad |\mathcal{J}| = \frac{b |y_i - \mu|}{\zeta_i^2}. $$Then,
$$ \begin{aligned} p(\zeta_i \mid \mu, b^2, y_i) &= p_{w_i}\!\left( \frac{b |y_i - \mu|}{\zeta_i} \,\Big|\, \mu, b^2, y_i \right) \times |\mathcal{J}| \\ &\propto \left( \frac{\zeta_i}{b |y_i - \mu|} \right)^{1/2} \exp\left( -\frac{(y_i - \mu)^2}{2 b |y_i - \mu|} \zeta_i - \frac{b |y_i - \mu|}{2b^2} \zeta_i^{-1} \right) \times \frac{b |y_i - \mu|}{\zeta_i^2} \\ &\propto \zeta_i^{-3/2} \exp\left\{ -\frac{|y_i - \mu|}{2b} \left( \zeta_i + \zeta_i^{-1} \right) \right\}. \end{aligned} $$Since:
$$ f_X(x) \propto x^{-3/2} \exp\left( -\frac{\lambda}{2x} - \frac{\lambda x}{2} \right) \quad \text{for } X \sim \text{Inv-Gaussian}(1, \lambda), $$we recognize:
$$ \zeta_i \mid (\mu, b^2, y_i) \sim \text{Inv-Gaussian} \left( 1, \frac{|y_i - \mu|}{b} \right). $$We may use this to obtain:
$$ w_i = \frac{b |y_i - \mu|}{\zeta_i}. $$

(2) Gibbs Sampler Code¶
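Two quick sanity checks on the samplers used below (a sketch with illustrative toy numbers): the $b^2$ update can be drawn either with `scipy.stats.invgamma` or by inverting a Gamma variate, and `np.random.wald(mean, scale)` is the inverse Gaussian with mean `mean` and shape $\lambda$ = `scale`, so its variance is `mean**3 / scale`.

```python
import numpy as np
from scipy.stats import invgamma

rng = np.random.default_rng(0)

# (i) b^2 | (mu, w, y) ~ Inv-Gamma(n, sum(w)/2): two equivalent samplers
n = 500
w = rng.exponential(scale=8.0, size=n)            # toy latent draws
a, s = n, w.sum() / 2
b2_scipy = invgamma.rvs(a=a, scale=s, size=100_000, random_state=rng)
b2_gamma = s / rng.gamma(shape=a, size=100_000)   # equivalently 1 / Gamma(a, rate=s)
# both means should be close to s / (a - 1)

# (ii) np.random.wald(mean=1.0, scale=lam) draws Inv-Gaussian(1, lam)
lam = 2.5
z = rng.wald(1.0, lam, size=100_000)
# E[z] should be near 1 and Var[z] near 1 / lam
```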
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import invgamma
from scipy.stats import laplace
from scipy.stats import mode
def gibbs_parameter(y, num, mu_in, b_in, txt):
    n = len(y)                    # y : the log-return data
    samples = np.zeros((num, 2))  # columns: mu and b^2; num : number of iterations
    # initial parameters
    mu = mu_in
    b2 = b_in**2
    w = np.random.exponential(2*b2, n)  # prior draw: W_i ~ Exp(1/(2b^2)), numpy scale = 2b^2
    for i in range(num):
        # mu | (b^2, w, y) : Normal full conditional
        mu = np.random.normal(loc=np.sum(y/w)/np.sum(1/w), scale=np.sqrt(1/np.sum(1/w)))
        samples[i, 0] = mu
        # b^2 | (mu, w, y) : Inv-Gamma(n, sum(w)/2)
        b2 = invgamma.rvs(a=n, scale=np.sum(w)/2)
        samples[i, 1] = b2
        # w | (mu, b^2, y) via zeta_i ~ Inv-Gaussian(1, |y_i - mu|/b)
        for j in range(n):
            L = np.abs(y[j] - mu) / np.sqrt(b2)
            z = np.random.wald(mean=1.0, scale=L)  # inverse Gaussian draw
            w[j] = np.sqrt(b2) * np.abs(y[j] - mu) / z
    mus = samples[:, 0]
    b2s = samples[:, 1]
    # txt : the name of the data set (HSI, NASDAQ, SSEC)
    fig, axs = plt.subplots(2, 2, figsize=(12, 10))
    axs[0, 0].plot(range(len(mus)), mus, color='red', alpha=0.7)
    axs[0, 0].set_title(rf'Trace plot of $\mu$ for {txt} data')
    axs[0, 0].set_xlabel('iteration')
    axs[0, 0].set_ylabel(r'$\mu$')
    axs[0, 1].hist(mus, bins=30, range=(np.min(mus), np.max(mus)), density=True, color='red', alpha=0.7)
    axs[0, 1].set_title(rf'Histogram of $\mu$ for {txt} data')
    axs[0, 1].set_xlabel(r'$\mu$')
    axs[0, 1].set_xticks(np.linspace(np.min(mus), np.max(mus), 4))
    axs[0, 1].set_ylabel('density')
    axs[1, 0].plot(range(len(b2s)), b2s, color='blue', alpha=0.7)
    axs[1, 0].set_title(rf'Trace plot of $b^2$ for {txt} data')
    axs[1, 0].set_xlabel('iteration')
    axs[1, 0].set_ylabel(r'$b^2$')
    axs[1, 1].hist(b2s, bins=30, range=(np.min(b2s), np.max(b2s)), density=True, color='blue', alpha=0.7)
    axs[1, 1].set_title(rf'Histogram of $b^2$ for {txt} data')
    axs[1, 1].set_xlabel(r'$b^2$')
    axs[1, 1].set_xticks(np.linspace(np.min(b2s), np.max(b2s), 4))
    axs[1, 1].set_ylabel('density')
    plt.show()
    # note: scipy's mode is of limited use for continuous draws (ties are essentially absent)
    mean_mu = f"{np.mean(mus):.4g}"
    median_mu = f"{np.median(mus):.4g}"
    mode_mu = f"{mode(mus)[0]:.4g}"
    print(f'The posterior mean, median, mode of mu for {txt} are {mean_mu}, {median_mu}, and {mode_mu}')
    mean_b2 = f"{np.mean(b2s):.4g}"
    median_b2 = f"{np.median(b2s):.4g}"
    mode_b2 = f"{mode(b2s)[0]:.4g}"
    print(f'The posterior mean, median, mode of b^2 for {txt} are {mean_b2}, {median_b2}, and {mode_b2}')
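The assignment also asks for credible intervals, which the function above does not report. A minimal sketch of an equal-tailed interval from stored draws (the helper name `credible_interval` and the synthetic demo chain are illustrative):

```python
import numpy as np

def credible_interval(chain, level=0.95, burn_in=0):
    """Equal-tailed credible interval from MCMC draws, after discarding burn-in."""
    kept = np.asarray(chain, dtype=float)[burn_in:]
    lo_q = 100 * (1 - level) / 2
    hi_q = 100 * (1 + level) / 2
    lo, hi = np.percentile(kept, [lo_q, hi_q])
    return lo, hi

# demo on a synthetic chain standing in for posterior draws of mu
rng = np.random.default_rng(1)
draws = rng.normal(loc=0.001, scale=0.0002, size=20_000)
lo, hi = credible_interval(draws, level=0.95, burn_in=1000)
```

In the real runs this would be called as `credible_interval(mus)` and `credible_interval(b2s)` inside `gibbs_parameter`.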
For Google Colab:
from google.colab import drive
drive.mount('/content/drive')
(3) Real Data¶
1) HSI data¶
i) Distribution of Log-Returns for HSI¶
data = pd.read_excel('/content/drive/MyDrive/Bayesian/HSI.xlsx')
df = data['Price']
df = df/df.shift(1)
df = df.drop(0)
log_ratio = np.log(df)
log_samples = log_ratio.to_numpy()
plt.hist(log_samples, bins=30, density=True, alpha=0.8, color='green', label='Histogram')
x = np.linspace(-0.05, 0.05, 1000)
m = 0; b = 0.007
pdf = laplace.pdf(x, loc = m, scale = b)
plt.plot(x, pdf, label=f'Laplace({m}, {b})', color = 'purple')
plt.title(f'Histogram of HSI vs PDF of Laplace({m}, {b})')
plt.xlabel('x')
plt.ylabel('density')
plt.legend()
plt.show()
The distribution of the log-returns (green histogram) is close to a Laplace distribution with mean 0 and scale 0.007 (purple curve).
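The eyeballed parameters (mean 0, scale 0.007) can be cross-checked against the Laplace maximum-likelihood estimates: the location MLE is the sample median, and the scale MLE is the mean absolute deviation about that median. A sketch on synthetic data, since the actual HSI series is not bundled here (the variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
# synthetic stand-in for the log-return series
x = rng.laplace(loc=0.0, scale=0.007, size=5000)

mu_hat = np.median(x)                # Laplace MLE of the location
b_hat = np.mean(np.abs(x - mu_hat))  # Laplace MLE of the scale
```

Applied to `log_samples`, these estimates give a data-driven starting point instead of reading values off the histogram.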
ii) Posterior Distribution of $\mu$, $b^2$ for HSI data¶
np.random.seed(2023311161)
gibbs_parameter(log_samples, 2000, m, b, 'HSI')
The posterior mean, median, mode of mu for HSI are 0.0009909, 0.0009963, and 2.128e-05
The posterior mean, median, mode of b^2 for HSI are 5.094e-05, 5.078e-05, and 4.181e-05
The chain for $\mu$ stabilizes around 0.001, while that for $b^2$ stabilizes around $5 \times 10^{-5}$.
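Statements about convergence and mixing can be supported by a simple diagnostic. A sketch (the helper `lag1_autocorr` is illustrative, not part of the assignment) contrasting an i.i.d. chain with a sticky AR(1) chain:

```python
import numpy as np

def lag1_autocorr(chain):
    """Lag-1 sample autocorrelation of a 1-D MCMC chain."""
    x = np.asarray(chain, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

rng = np.random.default_rng(3)
iid = rng.normal(size=10_000)  # well-mixed reference: near-zero autocorrelation
ar = np.empty(10_000)          # sticky chain: AR(1) with coefficient 0.9
ar[0] = 0.0
for t in range(1, len(ar)):
    ar[t] = 0.9 * ar[t - 1] + rng.normal()
```

A lag-1 autocorrelation near zero (as for `mus` and `b2s` here, if checked) suggests the Gibbs chain mixes well; values near one would call for thinning or longer runs.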
2) NASDAQ¶
i) Distribution of Log-Returns for NASDAQ¶
data = pd.read_excel('/content/drive/MyDrive/Bayesian/NASDAQ.xlsx')
df = data['Price']
df = df/df.shift(1)
df = df.drop(0)
log_ratio = np.log(df)
log_samples = log_ratio.to_numpy()
plt.hist(log_samples, bins=30, density=True, alpha=0.8, color='green', label='Histogram')
x = np.linspace(-0.05, 0.05, 1000)
m = 0.01; b = 0.015
pdf = laplace.pdf(x, loc = m, scale = b)
plt.plot(x, pdf, label=f'Laplace({m}, {b})', color = 'purple')
plt.title(f'Histogram of NASDAQ vs PDF of Laplace({m}, {b})')
plt.xlabel('x')
plt.ylabel('density')
plt.legend()
plt.show()
The distribution of the log-returns (green histogram) is close to a Laplace distribution with mean 0.01 and scale 0.015 (purple curve).
ii) Posterior Distribution of $\mu$, $b^2$ for NASDAQ data¶
np.random.seed(2023311161)
gibbs_parameter(log_samples, 2000, m, b, 'NASDAQ')
The posterior mean, median, mode of mu for NASDAQ are 0.005885, 0.005911, and -0.000564
The posterior mean, median, mode of b^2 for NASDAQ are 0.0002005, 0.0001993, and 0.0001393
The chain for $\mu$ stabilizes around 0.006, while that for $b^2$ stabilizes around $2 \times 10^{-4}$.
3) SSEC¶
i) Distribution of Log-Returns for SSEC¶
data = pd.read_excel('/content/drive/MyDrive/Bayesian/SSEC.xlsx')
df = data['Price']
df = df/df.shift(1)
df = df.drop(0)
log_ratio = np.log(df)
log_samples = log_ratio.to_numpy()
plt.hist(log_samples, bins=30, density=True, alpha=0.8, color='green', label='Histogram')
x = np.linspace(-0.05, 0.05, 1000)
m = 0; b = 0.01
pdf = laplace.pdf(x, loc = m, scale = b)
plt.plot(x, pdf, label=f'Laplace({m}, {b})', color = 'purple')
plt.title(f'Histogram of SSEC vs PDF of Laplace({m}, {b})')
plt.xlabel('x')
plt.ylabel('density')
plt.legend()
plt.show()
The distribution of the log-returns (green histogram) is close to a Laplace distribution with mean 0 and scale 0.01 (purple curve).
ii) Posterior Distribution of $\mu$, $b^2$ for SSEC data¶
np.random.seed(2023311161)
gibbs_parameter(log_samples, 2000, m, b, 'SSEC')
The posterior mean, median, mode of mu for SSEC are 0.001129, 0.001141, and -0.0008247
The posterior mean, median, mode of b^2 for SSEC are 8.19e-05, 8.176e-05, and 6.065e-05
The chain for $\mu$ stabilizes around 0.001, while that for $b^2$ stabilizes around $8 \times 10^{-5}$.