Bayesian Statistics HW1¶
1. (Composition Method) Suppose a random variable $X$ follows a Laplace distribution with mean $\mu$ and scale $b$. That is, the pdf of $X$ is as follows: $$f(x) = \frac{1}{2b}\exp\biggl\{-\frac{|x-\mu|}{b} \biggr\}.$$ When $W \sim \text{Exp}\bigl(\frac{1}{2b^2} \bigr)$ and $X|(W=w) \sim \mathcal N(\mu, w)$, prove the following expression. $$\int_0^{\infty}f(x|w)f(w)dw = \frac{1}{2b}\exp\biggl\{ -\frac{|x-\mu|}{b} \biggr\}.$$ Also, using this composition method, extract 10000 samples from Laplace distribution with mean 1 and scale 2, draw histogram, and compare it with true Laplace pdf.¶
(1) MGF of $X$ : $M_X(t)$¶
$$\begin{aligned} E_X[e^{tX}] &= \int_{-\infty}^{\infty} e^{tx} \frac{1}{2b} \exp\left\{ -\frac{1}{b} |x - \mu| \right\} dx \\ &= \frac{1}{2b} \int_{-\infty}^{\infty} \exp\left\{ -\frac{1}{b} |x - \mu| + t x \right\} dx \\ &= \frac{1}{2b} \int_{-\infty}^{\mu} \exp\left\{ \frac{1}{b} (x - \mu) + t x \right\} dx + \frac{1}{2b} \int_{\mu}^{\infty} \exp\left\{ -\frac{1}{b} (x - \mu) + t x \right\} dx \end{aligned}$$

1)
$$\begin{aligned} &\int_{-\infty}^{\mu} \exp\left\{ \left( \frac{1}{b} + t \right) x - \frac{\mu}{b} \right\} dx \\ &= \left[ \frac{1}{\frac{1}{b} + t} \exp\left\{ \left( \frac{1}{b} + t \right) x - \frac{\mu}{b} \right\} \right]_{x = -\infty}^{x = \mu} \\ &= \frac{1}{\frac{1}{b} + t} \left( \exp\{ t \mu \} - 0 \right) \quad \text{(assuming } \frac{1}{b} + t > 0 \text{)} \end{aligned}$$

2)
$$\begin{aligned} &\int_{\mu}^{\infty} \exp\left\{ \left( t - \frac{1}{b} \right) x + \frac{\mu}{b} \right\} dx \\ &= \left[ \frac{1}{t - \frac{1}{b}} \exp\left\{ \left( t - \frac{1}{b} \right) x + \frac{\mu}{b} \right\} \right]_{x = \mu}^{x = \infty} \\ &= \frac{1}{t - \frac{1}{b}} \left( 0 - \exp\{ t \mu \} \right) \quad \text{(assuming } t - \frac{1}{b} < 0 \text{)} \end{aligned}$$Then,
$$\begin{aligned} E_X[e^{tX}] &= \frac{1}{2b} \exp\{ t \mu \} \left( \frac{1}{\frac{1}{b} + t} + \frac{1}{\frac{1}{b} - t} \right) \\ &= \frac{1}{2b} \cdot \frac{\frac{2}{b}}{\frac{1}{b^2} - t^2} \exp\{ t \mu \} \\ &= \frac{1}{b^2} \cdot \frac{1}{\frac{1}{b^2} - t^2} \exp\{ t \mu \} \\ &= \frac{\exp\{ t \mu \}}{1 - b^2 t^2} \quad \left( |t| < \frac{1}{b} \right). \end{aligned}$$

(2) $X \sim \text{Laplace}(\mu, b)$¶
$$ f_W(w) = \frac{1}{2b^2} \exp\left\{ -\frac{1}{2b^2} w \right\}, \quad f_{X|W}(x|w) = \frac{1}{\sqrt{2 \pi w}} \exp\left\{ -\frac{1}{2w} (x - \mu)^2 \right\} $$So,
$$ f_{X|W}(x|w) f_W(w) = \frac{1}{2b^2 \sqrt{2 \pi w}} \exp\left\{ -\frac{w}{2b^2} - \frac{1}{2w} (x - \mu)^2 \right\} $$and
$$ f_X(x) = \int_{0}^{\infty} \frac{1}{2b^2 \sqrt{2 \pi w}} \exp\left\{ -\frac{w}{2b^2} - \frac{1}{2w} (x - \mu)^2 \right\} dw $$Then,
$$\begin{aligned} E_X[e^{tX}] &= \int_{-\infty}^{\infty} \int_{0}^{\infty} e^{t x} \frac{1}{2b^2 \sqrt{2 \pi w}} \exp\left\{ -\frac{w}{2b^2} - \frac{1}{2w} (x - \mu)^2 \right\} dw \, dx\\ &= \int_{0}^{\infty} \frac{1}{2b^2 \sqrt{2 \pi w}} \exp\left\{ -\frac{w}{2b^2} \right\} \left( \int_{-\infty}^{\infty} e^{t x} \exp\left\{ -\frac{1}{2w} (x - \mu)^2 \right\} dx \right) dw \\ &= \int_{0}^{\infty} \frac{1}{2b^2} \exp\left\{ -\frac{w}{2b^2} \right\} \exp\left\{ t \mu + \frac{1}{2} t^2 w \right\} dw \\ &= \frac{1}{2b^2} e^{t \mu} \int_{0}^{\infty} \exp\left\{ \left( \frac{t^2}{2} - \frac{1}{2b^2} \right) w \right\} dw \\ &= \frac{1}{2b^2} \frac{e^{t \mu}}{\left( \frac{t^2}{2} - \frac{1}{2b^2} \right)} \left[ \exp\left\{ \left( \frac{t^2}{2} - \frac{1}{2b^2} \right) w \right\} \right]_{w = 0}^{w = \infty} \\ &= \frac{e^{t \mu}}{b^2 t^2 - 1} \left( 0 - 1 \right) \quad \text{(assuming } t^2 - \frac{1}{b^2} < 0 \text{)} \\ &= \frac{e^{t \mu}}{1 - b^2 t^2} \quad \left( |t| < \frac{1}{b} \right), \end{aligned} $$which is the MGF of $X \sim \text{Laplace}(\mu, b)$.
$$\therefore X \sim \text{Laplace}(\mu, b)$$

The following code draws 10000 samples from the Laplace distribution with mean 1 and scale 2 using the composition method, and overlays the resulting histogram on the true Laplace pdf.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import laplace
np.random.seed(2023311161)
# generate 10000 samples using the composition method:
# W ~ Exp(rate = 1/(2b^2)) with b = 2, i.e. numpy scale = 2*b^2 = 8
W = np.random.exponential(scale=8, size=10000)
# X | (W = w) ~ N(mu = 1, variance = w)
samples_array = np.random.normal(loc=1, scale=np.sqrt(W))
# histogram
plt.hist(samples_array, bins=80, density=True, alpha=0.4, color='g', label='Histogram')
# PDF of the Laplace distribution
x = np.linspace(-20, 22, 10000)
pdf = laplace.pdf(x, loc=1, scale=2)
plt.plot(x, pdf, 'r-', lw=1, label='Laplace PDF')
plt.axvline(x=1, color='black', linestyle='--',lw = 1)
plt.annotate(r'$x=1$', xy=(1, 0.01), xytext=(5, 0.1),
             arrowprops=dict(arrowstyle='->', color='black'),
             fontsize=12)
plt.text(12, 0.2, r'$\mu=1, b=2$', color = 'r')
# show the plot
plt.xlabel('X')
plt.ylabel('Prob')
plt.legend()
plt.title('Histogram Using Composition Method vs Laplace PDF')
plt.show()
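As a quick quantitative check (a sketch, not required by the assignment), the composition samples can be compared against the target distribution with a Kolmogorov–Smirnov test from `scipy.stats`:

```python
import numpy as np
from scipy.stats import kstest, laplace

rng = np.random.default_rng(0)
n = 10_000

# composition method: W ~ Exp(scale = 2*b^2), then X | W ~ N(mu, W)
mu, b = 1.0, 2.0
W = rng.exponential(scale=2 * b**2, size=n)
X = rng.normal(loc=mu, scale=np.sqrt(W))

# compare the empirical CDF of X with the true Laplace(1, 2) CDF
stat, pvalue = kstest(X, laplace(loc=mu, scale=b).cdf)
```

A small KS statistic (on the order of $1/\sqrt{n}$) indicates the samples are consistent with $\text{Laplace}(1, 2)$.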
2. Suppose the log-return series $\log\left(\frac{y_t}{y_{t-1}}\right)$ follows a Laplace distribution with mean $\mu$ and scale parameter $b$: $$f(x) = \frac{1}{2b} \exp\left( -\frac{|x - \mu|}{b} \right).$$ For each data point $y_t$, a latent variable $W_t$ can be 'augmented': $W_t \sim \text{Exp}\bigl( \frac{1}{2b^2}\bigr)$ for each $t=1,\dots,n$, where $n$ is the total number of observations. You are tasked with the following:
(1) Derive the full conditional posterior distribution of $W$, $\mu$, and $b^2$.
(2) Construct a Gibbs sampler code to sample from the joint posterior distribution of $\mu$, $b$, and $W$. You may use the following prior: $$ f(\mu, b^2) \propto \frac{1}{b^2}.$$
(3) Apply your Gibbs sampler to actual log-return data. Plot and summarize the posterior distributions of $\mu$ and $b^2$, reporting posterior means, credible intervals, and any other relevant posterior summaries. [Hint] You may utilize the result: $$\frac{b |y - \mu|}{w} \, \Big| \, (\mu, b^2, y) \sim \text{IG}.$$
(1) Posteriors¶
1) Joint Posterior:¶
$$ \begin{aligned} p(\mu, b^2, \mathbf{w} \mid \mathbf{y}) &\propto p(\mathbf{y} \mid \mu, b^2, \mathbf{w}) \cdot p(\mu, b^2, \mathbf{w}) \\ &= p(\mathbf{y} \mid \mu, \mathbf{w}) \cdot p(\mu, b^2) \cdot p(\mathbf{w} \mid \mu, b^2) \quad (\because \mathbf{y} \perp b^2 \mid \mathbf{w}) \\ &\propto \left[ \prod_{i=1}^{n} w_i^{-\frac{1}{2}} \exp\left\{ -\frac{1}{2w_i} (y_i - \mu)^2 \right\} \right] \cdot \frac{1}{b^2} \cdot \left[ \prod_{i=1}^{n} \frac{1}{2b^2} \exp\left( -\frac{w_i}{2b^2} \right) \right]. \end{aligned} $$

2) Full Conditional of $\mu$:¶
$$ \begin{aligned} p(\mu \mid b^2, \mathbf{w}, \mathbf{y}) &\propto \prod_{i=1}^{n} w_i^{-\frac{1}{2}} \exp\left\{ -\frac{1}{2w_i} (y_i - \mu)^2 \right\} \\ &\propto \exp\left\{ -\sum_{i=1}^{n} \frac{1}{2w_i} (y_i^2 - 2\mu y_i + \mu^2) \right\} \\ &\propto \exp\left\{ -\frac{1}{2} \sum_{i=1}^{n} \frac{y_i^2}{w_i} + \mu \sum_{i=1}^{n} \frac{y_i}{w_i} - \frac{1}{2} \sum_{i=1}^{n} \frac{1}{w_i} \mu^2 \right\} \\ &\propto \exp\left\{ -\frac{1}{2} \left( \sum_{i=1}^{n} \frac{1}{w_i} \right) \left( \mu - \frac{\sum_{i=1}^{n} \frac{y_i}{w_i}}{\sum_{i=1}^{n} \frac{1}{w_i}} \right)^2 \right\}. \end{aligned} $$Thus,
$$ \mu \mid (b^2, \mathbf{w}, \mathbf{y}) \sim \mathcal{N} \left( \frac{\sum_{i=1}^{n} \frac{y_i}{w_i}}{\sum_{i=1}^{n} \frac{1}{w_i}}, \frac{1}{\sum_{i=1}^{n} \frac{1}{w_i}} \right). $$

3) Full Conditional of $b^2$:¶
$$ \begin{aligned} p(b^2 \mid \mu, \mathbf{w}, \mathbf{y}) &\propto \frac{1}{b^2} \cdot \prod_{i=1}^{n} \frac{1}{2b^2} \exp\left( -\frac{w_i}{2b^2} \right) \\ &\propto (b^2)^{-(n+1)} \exp\left( -\frac{\sum_{i=1}^{n} w_i}{2b^2} \right). \end{aligned} $$This shows
$$ b^2 \mid (\mu, \mathbf{w}, \mathbf{y}) \sim \text{Inv-}\chi^2\left( 2n, \frac{\sum_{i=1}^{n} w_i}{2n} \right). $$Note: Since $\text{Inv-}\chi^2(\nu, \sigma^2) = \text{Inv-}\Gamma\left( \frac{\nu}{2}, \frac{\nu \sigma^2}{2} \right)$, we can also write:
$$ b^2 \mid (\mu, \mathbf{w}, \mathbf{y}) \sim \text{Inv-}\Gamma\left( n, \frac{1}{2} \sum_{i=1}^{n} w_i \right). $$

4) Full Conditional of $\mathbf{w}$:¶
$$ \begin{aligned} p(\mathbf{w} \mid \mu, b^2, \mathbf{y}) &= \prod_{i=1}^{n} p(w_i \mid \mu, b^2, y_i) \quad (\text{iid}) \\ p(w_i \mid \mu, b^2, y_i) &\propto w_i^{-\frac{1}{2}} \exp\left\{ -\frac{1}{2w_i} (y_i - \mu)^2 - \frac{w_i}{2b^2} \right\}. \end{aligned} $$Let $\zeta_i := \frac{b |y_i - \mu|}{w_i}$ $\quad \Rightarrow \quad w_i = \frac{b |y_i - \mu|}{\zeta_i}$.
Jacobian:
$$ \mathcal{J} = \frac{\partial w_i}{\partial \zeta_i} = -\frac{b |y_i - \mu|}{\zeta_i^2}, \qquad |\mathcal{J}| = \frac{b |y_i - \mu|}{\zeta_i^2}. $$Then,
$$ \begin{aligned} p(\zeta_i \mid \mu, b^2, y_i) &= p_{w_i}\!\left( \frac{b |y_i - \mu|}{\zeta_i} \,\Big|\, \mu, b^2, y_i \right) \times |\mathcal{J}| \\ &\propto \left( \frac{\zeta_i}{b |y_i - \mu|} \right)^{1/2} \exp\left( -\frac{(y_i - \mu)^2}{2 b |y_i - \mu|} \zeta_i - \frac{b |y_i - \mu|}{2b^2} \zeta_i^{-1} \right) \times \frac{b |y_i - \mu|}{\zeta_i^2} \\ &\propto \zeta_i^{-3/2} \exp\left\{ -\frac{|y_i - \mu|}{2b} \left( \zeta_i + \zeta_i^{-1} \right) \right\}. \end{aligned} $$Since:
$$ f_X(x) \propto x^{-3/2} \exp\left( -\frac{\lambda}{2x} - \frac{\lambda x}{2} \right) \quad \text{for } X \sim \text{Inv-Gaussian}(1, \lambda), $$we recognize:
$$ \zeta_i \mid (\mu, b^2, y_i) \sim \text{Inv-Gaussian} \left( 1, \frac{|y_i - \mu|}{b} \right). $$We may use this to obtain:
$$ w_i = \frac{b |y_i - \mu|}{\zeta_i}. $$

(2) Gibbs Sampler Code¶
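Two quick sanity checks on the samplers used below (a sketch with illustrative toy numbers): the $b^2$ update can be drawn either with `scipy.stats.invgamma` or by inverting a Gamma variate, and `np.random.wald(mean, scale)` is the inverse Gaussian with mean `mean` and shape $\lambda$ = `scale`, so its variance is `mean**3 / scale`.

```python
import numpy as np
from scipy.stats import invgamma

rng = np.random.default_rng(0)

# (i) b^2 | (mu, w, y) ~ Inv-Gamma(n, sum(w)/2): two equivalent samplers
n = 500
w = rng.exponential(scale=8.0, size=n)            # toy latent draws
a, s = n, w.sum() / 2
b2_scipy = invgamma.rvs(a=a, scale=s, size=100_000, random_state=rng)
b2_gamma = s / rng.gamma(shape=a, size=100_000)   # equivalently 1 / Gamma(a, rate=s)
# both means should be close to s / (a - 1)

# (ii) np.random.wald(mean=1.0, scale=lam) draws Inv-Gaussian(1, lam)
lam = 2.5
z = rng.wald(1.0, lam, size=100_000)
# E[z] should be near 1 and Var[z] near 1 / lam
```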
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats import invgamma
from scipy.stats import laplace
from scipy.stats import mode
def gibbs_parameter(y, num, mu_in, b_in, txt):
    n = len(y)                    # y : the log-return data
    samples = np.zeros((num, 2))  # columns: mu and b^2; num : number of iterations
    # initial parameters
    mu = mu_in
    b2 = b_in**2
    w = np.random.exponential(2*b2, n)  # prior draw: W_i ~ Exp(1/(2b^2)), numpy scale = 2b^2
    for i in range(num):
        # mu | (b^2, w, y) : Normal full conditional
        mu = np.random.normal(loc=np.sum(y/w)/np.sum(1/w), scale=np.sqrt(1/np.sum(1/w)))
        samples[i, 0] = mu
        # b^2 | (mu, w, y) : Inv-Gamma(n, sum(w)/2)
        b2 = invgamma.rvs(a=n, scale=np.sum(w)/2)
        samples[i, 1] = b2
        # w | (mu, b^2, y) via zeta_i ~ Inv-Gaussian(1, |y_i - mu|/b)
        for j in range(n):
            L = np.abs(y[j] - mu) / np.sqrt(b2)
            z = np.random.wald(mean=1.0, scale=L)  # inverse Gaussian draw
            w[j] = np.sqrt(b2) * np.abs(y[j] - mu) / z
    mus = samples[:, 0]
    b2s = samples[:, 1]
    # txt : the name of the data set (HSI, NASDAQ, SSEC)
    fig, axs = plt.subplots(2, 2, figsize=(12, 10))
    axs[0, 0].plot(range(len(mus)), mus, color='red', alpha=0.7)
    axs[0, 0].set_title(rf'Trace plot of $\mu$ for {txt} data')
    axs[0, 0].set_xlabel('iteration')
    axs[0, 0].set_ylabel(r'$\mu$')
    axs[0, 1].hist(mus, bins=30, range=(np.min(mus), np.max(mus)), density=True, color='red', alpha=0.7)
    axs[0, 1].set_title(rf'Histogram of $\mu$ for {txt} data')
    axs[0, 1].set_xlabel(r'$\mu$')
    axs[0, 1].set_xticks(np.linspace(np.min(mus), np.max(mus), 4))
    axs[0, 1].set_ylabel('density')
    axs[1, 0].plot(range(len(b2s)), b2s, color='blue', alpha=0.7)
    axs[1, 0].set_title(rf'Trace plot of $b^2$ for {txt} data')
    axs[1, 0].set_xlabel('iteration')
    axs[1, 0].set_ylabel(r'$b^2$')
    axs[1, 1].hist(b2s, bins=30, range=(np.min(b2s), np.max(b2s)), density=True, color='blue', alpha=0.7)
    axs[1, 1].set_title(rf'Histogram of $b^2$ for {txt} data')
    axs[1, 1].set_xlabel(r'$b^2$')
    axs[1, 1].set_xticks(np.linspace(np.min(b2s), np.max(b2s), 4))
    axs[1, 1].set_ylabel('density')
    plt.show()
    # note: scipy's mode is of limited use for continuous draws (ties are essentially absent)
    mean_mu = f"{np.mean(mus):.4g}"
    median_mu = f"{np.median(mus):.4g}"
    mode_mu = f"{mode(mus)[0]:.4g}"
    print(f'The posterior mean, median, mode of mu for {txt} are {mean_mu}, {median_mu}, and {mode_mu}')
    mean_b2 = f"{np.mean(b2s):.4g}"
    median_b2 = f"{np.median(b2s):.4g}"
    mode_b2 = f"{mode(b2s)[0]:.4g}"
    print(f'The posterior mean, median, mode of b^2 for {txt} are {mean_b2}, {median_b2}, and {mode_b2}')
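The assignment also asks for credible intervals, which the function above does not report. A minimal sketch of an equal-tailed interval from stored draws (the helper name `credible_interval` and the synthetic demo chain are illustrative):

```python
import numpy as np

def credible_interval(chain, level=0.95, burn_in=0):
    """Equal-tailed credible interval from MCMC draws, after discarding burn-in."""
    kept = np.asarray(chain, dtype=float)[burn_in:]
    lo_q = 100 * (1 - level) / 2
    hi_q = 100 * (1 + level) / 2
    lo, hi = np.percentile(kept, [lo_q, hi_q])
    return lo, hi

# demo on a synthetic chain standing in for posterior draws of mu
rng = np.random.default_rng(1)
draws = rng.normal(loc=0.001, scale=0.0002, size=20_000)
lo, hi = credible_interval(draws, level=0.95, burn_in=1000)
```

In the real runs this would be called as `credible_interval(mus)` and `credible_interval(b2s)` inside `gibbs_parameter`.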
For Google Colab:
from google.colab import drive
drive.mount('/content/drive')
(3) Real Data¶
1) HSI data¶
i) Distribution of Log-Returns for HSI¶
data = pd.read_excel('/content/drive/MyDrive/Bayesian/HSI.xlsx')
df = data['Price']
df = df/df.shift(1)
df = df.drop(0)
log_ratio = np.log(df)
log_samples = log_ratio.to_numpy()
plt.hist(log_samples, bins=30, density=True, alpha=0.8, color='green', label='Histogram')
x = np.linspace(-0.05, 0.05, 1000)
m = 0; b = 0.007
pdf = laplace.pdf(x, loc = m, scale = b)
plt.plot(x, pdf, label=f'Laplace({m}, {b})', color = 'purple')
plt.title(f'Histogram of HSI vs PDF of Laplace({m}, {b})')
plt.xlabel('x')
plt.ylabel('density')
plt.legend()
plt.show()
The distribution of the log-returns (green histogram) is close to a Laplace distribution with mean 0 and scale 0.007 (purple curve).
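The eyeballed parameters (mean 0, scale 0.007) can be cross-checked against the Laplace maximum-likelihood estimates: the location MLE is the sample median, and the scale MLE is the mean absolute deviation about that median. A sketch on synthetic data, since the actual HSI series is not bundled here (the variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
# synthetic stand-in for the log-return series
x = rng.laplace(loc=0.0, scale=0.007, size=5000)

mu_hat = np.median(x)                # Laplace MLE of the location
b_hat = np.mean(np.abs(x - mu_hat))  # Laplace MLE of the scale
```

Applied to `log_samples`, these estimates give a data-driven starting point instead of reading values off the histogram.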
ii) Posterior Distribution of $\mu$, $b^2$ for HSI data¶
np.random.seed(2023311161)
gibbs_parameter(log_samples, 2000, m, b, 'HSI')
The posterior mean, median, mode of mu for HSI are 0.0009909, 0.0009963, and 2.128e-05
The posterior mean, median, mode of b^2 for HSI are 5.094e-05, 5.078e-05, and 4.181e-05
The chain for $\mu$ stabilizes around 0.001, while that for $b^2$ stabilizes around $5 \times 10^{-5}$.
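Statements about convergence and mixing can be supported by a simple diagnostic. A sketch (the helper `lag1_autocorr` is illustrative, not part of the assignment) contrasting an i.i.d. chain with a sticky AR(1) chain:

```python
import numpy as np

def lag1_autocorr(chain):
    """Lag-1 sample autocorrelation of a 1-D MCMC chain."""
    x = np.asarray(chain, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

rng = np.random.default_rng(3)
iid = rng.normal(size=10_000)  # well-mixed reference: near-zero autocorrelation
ar = np.empty(10_000)          # sticky chain: AR(1) with coefficient 0.9
ar[0] = 0.0
for t in range(1, len(ar)):
    ar[t] = 0.9 * ar[t - 1] + rng.normal()
```

A lag-1 autocorrelation near zero (as for `mus` and `b2s` here, if checked) suggests the Gibbs chain mixes well; values near one would call for thinning or longer runs.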
2) NASDAQ¶
i) Distribution of Log-Returns for NASDAQ¶
data = pd.read_excel('/content/drive/MyDrive/Bayesian/NASDAQ.xlsx')
df = data['Price']
df = df/df.shift(1)
df = df.drop(0)
log_ratio = np.log(df)
log_samples = log_ratio.to_numpy()
plt.hist(log_samples, bins=30, density=True, alpha=0.8, color='green', label='Histogram')
x = np.linspace(-0.05, 0.05, 1000)
m = 0.01; b = 0.015
pdf = laplace.pdf(x, loc = m, scale = b)
plt.plot(x, pdf, label=f'Laplace({m}, {b})', color = 'purple')
plt.title(f'Histogram of NASDAQ vs PDF of Laplace({m}, {b})')
plt.xlabel('x')
plt.ylabel('density')
plt.legend()
plt.show()
The distribution of the log-returns (green histogram) is close to a Laplace distribution with mean 0.01 and scale 0.015 (purple curve).
ii) Posterior Distribution of $\mu$, $b^2$ for NASDAQ data¶
np.random.seed(2023311161)
gibbs_parameter(log_samples, 2000, m, b, 'NASDAQ')
The posterior mean, median, mode of mu for NASDAQ are 0.005885, 0.005911, and -0.000564
The posterior mean, median, mode of b^2 for NASDAQ are 0.0002005, 0.0001993, and 0.0001393
The chain for $\mu$ stabilizes around 0.006, while that for $b^2$ stabilizes around $2 \times 10^{-4}$.
3) SSEC¶
i) Distribution of Log-Returns for SSEC¶
data = pd.read_excel('/content/drive/MyDrive/Bayesian/SSEC.xlsx')
df = data['Price']
df = df/df.shift(1)
df = df.drop(0)
log_ratio = np.log(df)
log_samples = log_ratio.to_numpy()
plt.hist(log_samples, bins=30, density=True, alpha=0.8, color='green', label='Histogram')
x = np.linspace(-0.05, 0.05, 1000)
m = 0; b = 0.01
pdf = laplace.pdf(x, loc = m, scale = b)
plt.plot(x, pdf, label=f'Laplace({m}, {b})', color = 'purple')
plt.title(f'Histogram of SSEC vs PDF of Laplace({m}, {b})')
plt.xlabel('x')
plt.ylabel('density')
plt.legend()
plt.show()
The distribution of the log-returns (green histogram) is close to a Laplace distribution with mean 0 and scale 0.01 (purple curve).
ii) Posterior Distribution of $\mu$, $b^2$ for SSEC data¶
np.random.seed(2023311161)
gibbs_parameter(log_samples, 2000, m, b, 'SSEC')
The posterior mean, median, mode of mu for SSEC are 0.001129, 0.001141, and -0.0008247
The posterior mean, median, mode of b^2 for SSEC are 8.19e-05, 8.176e-05, and 6.065e-05
The chain for $\mu$ stabilizes around 0.001, while that for $b^2$ stabilizes around $8 \times 10^{-5}$.