Monte Carlo Method HW1
1. Normal from Laplace distribution
\[\begin{aligned} f(x) &= \frac{1}{\sqrt{2\pi}}\exp\Bigl\{-\frac{x^2}{2}\Bigr\} \\ q(x) &= \frac{\beta}{2}\exp\Bigl\{-\beta|x|\Bigr\} \end{aligned}\]
(a) Calculate the value of \(M\) when \(\beta = 1\)
Recall that \(M = \sup_{x}\frac{f(x)}{q(x)}\). Both the target density \(f\) and the proposal density \(q\) are symmetric about \(x=0\), so it suffices to consider \(x \ge 0\) when computing the supremum of \(f(x)/q(x)\) over all \(x\in\mathbb R\).
\[\begin{aligned} f(x)/q(x) &= (1/\sqrt{2\pi})/(\beta/2)\cdot\exp\Bigl\{-\frac{x^2}{2}+\beta x \Bigr\} \\ &= \frac{2}{\sqrt{2\pi}\beta}\exp\Bigl\{-\frac{1}{2}(x-\beta)^2 + \frac{\beta^2}{2} \Bigr\} \\ &\le \frac{2}{\sqrt{2\pi}\beta}\exp\Bigl\{ \frac{\beta^2}{2} \Bigr\}, \quad\text{with equality when } x = \beta \end{aligned}\]
So \(\sup_{x\ge0}\frac{f(x)}{q(x)} = \sqrt{\frac{2}{\pi}}\exp\{1/2\}\) when \(\beta = 1\), and thus \(M = \sqrt{\frac{2}{\pi}}\exp\{1/2\}\). This constant scales the proposal density \(q\) so that the envelope \(e(x) := M\cdot q(x)\) dominates the target density \(f\) everywhere, while touching it (\(e(x) = f(x)\)) at \(x = \pm 1\).
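Though not required by the problem, it is worth noting that \(\beta = 1\) is also the best choice within this proposal family: writing \(M(\beta) = \frac{2}{\sqrt{2\pi}\beta}\exp\{\beta^2/2\}\), \[\frac{d}{d\beta}\log M(\beta) = -\frac{1}{\beta} + \beta = 0 \iff \beta = 1,\] and since \(\log M(\beta)\) is convex on \(\beta > 0\), the Laplace\((0,1)\) proposal minimizes the bound \(M\) and hence maximizes the acceptance rate \(1/M\).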
(b) Sampling from the standard normal distribution using rejection sampling.
We load the VGAM package to sample from the Laplace distribution.
rm(list = ls())
set.seed(2023311161)
#install.packages("VGAM")
library(VGAM) # rlaplace
## Warning: package 'VGAM' was built under R version 4.3.3
## Loading required package: stats4
## Loading required package: splines
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.3.3
As obtained in (a), we set \(M = \sqrt{\frac{2}{\pi}}\exp\{1/2\}\). We will generate 100,000 samples via rejection sampling and store them in the vector X.
M <- sqrt(2/pi)*exp(1/2)
num <- 100000; i <- 1
X <- c()
The rejection algorithm for this problem is as follows:
- Sample from proposal density; \(Y \sim \text{Laplace}(0,1)\).
- Sample from Uniform distribution; \(U \sim \text{Unif}(0,1)\).
- If \(U \le \frac{f(Y)}{M\,q(Y)}\), accept and set \(X=Y\).
- Otherwise, reject \(Y\) and return to step 1. Repeat until 100,000 values of \(X\) have been generated.
while(i <= num){
  y <- rlaplace(1)                       # step 1: proposal draw
  u <- runif(1)                          # step 2: uniform draw
  if(u < dnorm(y)/(M*dlaplace(y))){
    X[i] <- y                            # step 3: accept
    i <- i+1
  }
}
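As a quick sanity check (a sketch, assuming the VGAM package is available as above; the seed and variable names here are illustrative), the theoretical acceptance rate of rejection sampling is \(1/M \approx 0.760\), and counting proposals should reproduce it empirically:

```r
# Sketch: count how many Laplace proposals are needed per accepted sample.
M <- sqrt(2/pi) * exp(1/2)
set.seed(1)                        # illustrative seed
n_acc <- 0; n_prop <- 0
while (n_acc < 10000) {
  y_prop <- VGAM::rlaplace(1)      # proposal draw
  n_prop <- n_prop + 1
  if (runif(1) < dnorm(y_prop) / (M * VGAM::dlaplace(y_prop))) {
    n_acc <- n_acc + 1             # accepted
  }
}
n_acc / n_prop                     # empirical acceptance rate, close to 1/M
```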
The following histogram shows that the rejection-sampling results (cyan) closely match the target density (red), the standard normal distribution.
data_X <- as.data.frame(X)
ggplot(data_X, aes(x = X)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 0.1, fill = 'cyan', color = 'black') +
  stat_function(fun = dnorm, args = list(mean = 0, sd = 1), color = 'red', linewidth = 1) +
  labs(title = "Histogram of X using rejection sampling")
2. Importance Sampling
(a) For a standard normal random variable \(Z\), calculate \(\mathbb P(Z > 2.5)\) using Monte Carlo sums based on indicator functions. How many simulated random variables are needed to obtain three digits of accuracy?
The given probability can be expressed as an expectation of an indicator function: \[\mathbb P(Z>2.5) = \mathbb E[\mathbb{1}(Z > 2.5)].\] The true value is about 0.00621, so the goal is to check whether the difference between the Monte Carlo estimate and the true value is less than 0.000005.
rm(list = ls())
set.seed(2023311161)
1-pnorm(2.5) # 0.006209665, 0.00621
## [1] 0.006209665
ind <- c()
i <- 1
The ind vector contains 0s and 1s indicating whether each draw z exceeds 2.5. We continue sampling until the difference between the running mean of ind and 0.00621 is less than 0.000005.
while(1){
  z <- rnorm(1)
  ind[i] <- as.numeric(z > 2.5)
  if(abs(mean(ind) - 0.00621) < 0.000005){break}
  i <- i+1
}
The Monte Carlo estimate is 0.00621118, and the number of iterations is 322.
mean(ind) # 0.00621118
## [1] 0.00621118
length(ind) # 322
## [1] 322
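One caveat: stopping at the first moment the running mean falls within \(5\times10^{-6}\) of the known answer records only a lucky crossing; it does not mean 322 samples reliably deliver this accuracy. A rough CLT-based sketch (variable names illustrative) of the sample size actually needed for that tolerance with 95% confidence:

```r
# CLT sketch: require 1.96 * sqrt(p(1-p)/n) < tol, i.e. n > 1.96^2 p(1-p)/tol^2.
p        <- 1 - pnorm(2.5)     # true tail probability, ~0.00621
tol      <- 5e-6
n_needed <- ceiling(1.96^2 * p * (1 - p) / tol^2)
n_needed                       # on the order of 1e9 draws
```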
(b) Compare the number of samples needed to obtain three digits of accuracy with importance sampling to the answer obtained in (a).
Given the target and proposal density \(f\) and \(g\), the Importance Sampling algorithm for estimating \(\mathbb E_{X\sim f}[h(X)]\) is as follows:
- Sample \(X_i\) from \(g\)
- Calculate the weight \(w(X_i) = f(X_i)/g(X_i)\).
- Repeat steps 1 and 2 for the desired number of iterations \(n\), and average \(h(X_i)w(X_i)\). That is, \[\widehat{\mathbb E_f[h(X)]} = \frac{1}{n}\sum_{i=1}^n h(X_i)w(X_i).\]
The density of the target \(f\) at \(x=2.5\) is already tiny (about 0.0175), so we need a proposal \(g\) whose tail beyond 2.5 is heavier than that of \(f\). If the proposal is \(\text{Exp}(1)\) shifted to start at \(x=2.5\), i.e. \(g(x) = \exp\{-(x-2.5)\}\) for \(x>2.5\), then \[\begin{aligned} \frac{f(x)}{g(x)} &= \frac{1}{\sqrt{2\pi}}\exp\Bigl\{-\frac{1}{2}x^2 + x - 2.5 \Bigr\} \\ &= \frac{1}{\sqrt{2\pi}}\exp\Bigl\{-\frac{1}{2}(x-1)^2-2\Bigr\} \\ &\le \frac{1}{\sqrt{2\pi}}\exp\{-2\} \\ &\approx 0.05399,\qquad \forall x>2.5. \end{aligned}\] Since the weight \(f/g\) is bounded on \((2.5,\infty)\), \(g\) is a valid importance-sampling proposal.
set.seed(2023311161)
dnorm(2.5) # 0.0175283
## [1] 0.0175283
1/sqrt(2*pi)*exp(-2) # 0.05399097
## [1] 0.05399097
For this problem, \(h(X)\) is the indicator function \(\mathbb{1}(X > 2.5)\), so the importance sampling algorithm is implemented as follows:
ind_is <- c()
i <- 1
while(1){
  y <- rexp(1)+2.5 # truncated at 2.5
  ind_is[i] <- dnorm(y)/dexp(y-2.5)*as.numeric(y > 2.5)
  if(abs(mean(ind_is) - 0.00621) < 0.000005){break}
  i <- i+1
}
The estimated value of the probability using importance sampling is 0.006210526, and the number of iterations is 206.
mean(ind_is) # 0.006210526
## [1] 0.006210526
length(ind_is) # 206
## [1] 206
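The gain from importance sampling can also be seen by comparing per-sample variances at a fixed sample size. The following minimal sketch (seed and variable names illustrative) uses the same shifted-exponential proposal:

```r
# Sketch: per-sample variance of the plain MC terms vs the IS weighted terms.
set.seed(1)                               # illustrative seed
n      <- 100000
ind_mc <- as.numeric(rnorm(n) > 2.5)      # indicator terms, mean ~ p
y_is   <- rexp(n) + 2.5                   # proposals from g(x) = exp(-(x - 2.5))
w_is   <- dnorm(y_is) / dexp(y_is - 2.5)  # h(y) = 1 since y_is > 2.5 always
c(var(ind_mc), var(w_is))                 # IS variance is roughly two orders of magnitude smaller
```

This variance gap is what drives the smaller iteration count observed above.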
3. Compute \(\int_0^1 \frac{e^{-x}}{1+x^2}dx\) using the control variate method. (Hint: \(c(y) = \frac{e^{-0.5}}{1+y^2}\).)
The integral \(\int_0^1 \frac{e^{-x}}{1+x^2}dx\) can be viewed as an expectation as follows: \[\int_0^1 \frac{e^{-x}}{1+x^2}\cdot1dx = \mathbb E_{X}\biggl[\frac{e^{-X}}{1+X^2}\biggr],\] where \(X \sim \text{Unif}(0,1)\). The reference value of the integral, computed numerically, is 0.5247971.
rm(list = ls())
set.seed(2023311161)
n <- 100000
h <- function(x) {
exp(-x)/(1+x^2)
}
integrate(h, 0, 1)$value # 0.5247971
## [1] 0.5247971
Instead of the simple Monte Carlo method, we can use the control variate \(c(X) = \frac{e^{-0.5}}{1+X^2}\), whose mean \(\theta = \int_0^1 \frac{e^{-0.5}}{1+x^2}dx = e^{-0.5}\cdot\frac{\pi}{4} \approx 0.476\) is known in closed form.
c_y <- function(y) {
exp(-0.5)/(1+y^2)
}
theta <- integrate(c_y, 0, 1)$value # 0.4763681
Instead of using a distinct random variable \(Y\) as in the lecture notes (page 21), we use the same draws \(X\) to compute \(c(X)\) and the corresponding \(\hat\theta_{MC}\). Then the control variate estimator is \[\hat\mu_{CV} = \hat\mu_{MC} + \lambda(\hat\theta_{MC} - \theta),\] where \(\hat\mu_{MC}\) and \(\hat\theta_{MC}\) are the Monte Carlo estimators of \(\int_0^1 \frac{e^{-x}}{1+x^2}dx\) and \(\int_0^1 \frac{e^{-0.5}}{1+x^2}dx\) respectively, and \[\lambda = -\frac{\text{Cov}[\hat\mu_{MC}, \hat\theta_{MC}]}{\text{Var}[\hat\theta_{MC} ] }.\] We estimate the covariance and variance from the sampled values of \(h(x) := \frac{e^{-x}}{1+x^2}\) and \(c(x)\).
xs <- runif(n)
hs <- h(xs)
c_ys <- c_y(xs)
cov_h_c <- cov(hs, c_ys) # 0.02337966
lambda <- -cov_h_c/var(c_ys)
mu_mc <- mean(hs)
theta_mc <- mean(c_ys)
mu_cv <- mu_mc + lambda*(theta_mc - theta)
Thus, the estimated integral using the control variate is 0.5251. For reference, the plain Monte Carlo estimate is 0.5260.
mu_cv # 0.5251436
## [1] 0.5251436
mu_mc # 0.5260371
## [1] 0.5260371
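Finally, the benefit of the control variate can be summarized by the variance reduction factor \(1-\rho^2\), where \(\rho = \text{Cor}(h(X), c(X))\): the control variate estimator retains only that fraction of the plain Monte Carlo variance. A minimal sketch (fresh draws, illustrative names):

```r
# Sketch: estimate the fraction of MC variance left after the control variate.
set.seed(1)                                  # illustrative seed
xs_chk <- runif(100000)
h_chk  <- exp(-xs_chk) / (1 + xs_chk^2)      # h(x)
c_chk  <- exp(-0.5)   / (1 + xs_chk^2)       # c(x), the control variate
rho    <- cor(h_chk, c_chk)
1 - rho^2                                    # small fraction: h and c are highly correlated
```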