Monte Carlo Methods HW5
1. Slice sampler
(a) Generate a random number from the Gaussian mixture \[p(x) = \frac{3}{10}\mathcal N(x|-3,0.5) + \frac{4}{10}\mathcal N(x|0, 0.5) + \frac{3}{10}\mathcal N(x|3,0.5)\]
We define a function \(p(x)\) as follows:
rm(list = ls())
set.seed(2023311161)
# Define the Gaussian mixture density function
p_x_a <- function(x) {
  (3/10) * dnorm(x, mean = -3, sd = sqrt(0.5)) +
    (4/10) * dnorm(x, mean = 0, sd = sqrt(0.5)) +
    (3/10) * dnorm(x, mean = 3, sd = sqrt(0.5))
}
When \(p\) is the target density and \(x_t\) is the sample from \(p\) at time \(t\), one iteration of the slice sampler is as follows:
- Draw \(u_{t+1}\) from \(\text{Unif}(0, p(x_t))\).
- Draw \(x_{t+1}\) uniformly from the region \(S_{t+1} = \{x : p(x) \ge u_{t+1} \}\).
Since the set \(S_{t+1}\) is needed at every iteration, we define a function ‘slice’ that returns the vector of grid points \(x\) whose density values \(p(x)\) are greater than or equal to a given \(u\).
slice <- function(u, p_x, x_grid){
  density_values <- p_x(x_grid)
  is_above_u <- density_values >= u
  return(x_grid[is_above_u])
}
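As a quick sanity check (not part of the assignment code), the ‘slice’ helper can be verified on a standard normal density: every grid point it returns must lie inside the level set \(\{x : p(x) \ge u\}\). The definition is repeated so the chunk runs on its own; the choice of \(u = 0.2\) and the grid below are illustrative.

```r
# Definition repeated so this chunk runs standalone
slice <- function(u, p_x, x_grid){
  density_values <- p_x(x_grid)
  is_above_u <- density_values >= u
  return(x_grid[is_above_u])
}

x_grid <- seq(-5, 5, by = 0.01)
s <- slice(0.2, dnorm, x_grid)
# Every returned point satisfies p(x) >= u
stopifnot(all(dnorm(s) >= 0.2))
# For a unimodal density the slice is an interval around the mode 0
stopifnot(min(s) < 0, max(s) > 0)
```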
The ‘slice_sampler’ function generates samples from the distribution \(p\) using the slice sampler algorithm. ‘n_trial’ counts the total number of proposals needed to generate ‘n_iter’ samples, ‘x_0’ is the initial value of \(x\), and ‘p_x’ is the target density.
At every iteration, generate \(u\) from \(\text{Unif}(0, p(x))\) (step 1), and find the \(x\) values whose \(p(x)\) is at least \(u\) using the ‘slice’ function (step 2). Let ‘left’ and ‘right’ be the minimum and maximum of these \(x\) values. Then generate ‘x_new’ uniformly from (left, right) and accept it as the \(i\)th sample if its density value is greater than or equal to \(u\) (step 3). Since (left, right) may cover low-density gaps between modes, proposals outside \(S_{t+1}\) are rejected and redrawn, so the accepted value is uniform on \(S_{t+1}\) itself.
slice_sampler <- function(p_x, n_iter, x_0, x_grid) {
  n_trial <- 0
  samples <- numeric(n_iter)
  x <- x_0
  for (i in 1:n_iter) {
    # Step 1: Sample u ~ Uniform(0, p_x(x))
    u <- runif(1, min = 0, max = p_x(x))
    # Step 2: find the slice S = {x : p_x(x) >= u} on the grid
    x_above_u <- slice(u, p_x, x_grid)
    left <- min(x_above_u)
    right <- max(x_above_u)
    # Step 3: Sample x ~ Uniform(left, right), keeping it only if it lies in the slice
    repeat {
      n_trial <- n_trial + 1
      x_new <- runif(1, min = left, max = right)
      if (p_x(x_new) >= u) {
        x <- x_new
        break
      }
    }
    samples[i] <- x
  }
  result <- list(samples = samples, n_trial = n_trial)
  return(result)
}
The following shows how to generate 10000 samples from \(p(x)\), with initial value \(x_0 = 0\). The grid of \(x\) ranges from -10 to 15. The histogram of the generated samples (‘samples_a’) shows that they align with the pdf of \(p(x)\).
n_iter <- 10000
x_0 <- 0 # Initial value
x_grid <- seq(-10,15, by = 0.01)
result_a <- slice_sampler(p_x_a, n_iter, x_0, x_grid)
samples_a <- result_a$samples
# Plot the histogram of the samples
hist(samples_a, breaks = 100, probability = TRUE, col = "skyblue",
     main = "Slice Sampler for Gaussian Mixture, Q1(a)", xlab = "x")
curve(p_x_a, col = "red", lwd = 2, add = TRUE)
The total number of trials needed to obtain 10000 samples is about 13000.
result_a$n_trial
## [1] 13347
(b) Generate a random number from the modified Gaussian mixture and compare to the result in (a) \[p(x) = \frac{3}{10}\mathcal N(x|-3,0.5) + \frac{4}{10}\mathcal N(x|0, 0.5) + \frac{3}{10}\mathcal N(x|10,0.5)\]
The target distribution is the same as in (a) except for the mean of the third component, so we only redefine the target density.
set.seed(2023311161)
p_x_b <- function(x) {
  (3/10) * dnorm(x, mean = -3, sd = sqrt(0.5)) +
    (4/10) * dnorm(x, mean = 0, sd = sqrt(0.5)) +
    (3/10) * dnorm(x, mean = 10, sd = sqrt(0.5))
}
The rest of the algorithm, including the computation of \(S_{t+1}\) and the ‘slice_sampler’ function, is unchanged in (b), so we call ‘slice_sampler’ with ‘p_x_b’. The histogram and the red curve show that the generated samples follow the target distribution \(p\).
result_b <- slice_sampler(p_x_b, n_iter, x_0, x_grid)
samples_b <- result_b$samples
hist(samples_b, breaks = 100, probability = TRUE, col = "lightgreen",
     main = "Slice Sampler for Gaussian Mixture, Q1(b)", xlab = "x")
curve(p_x_b, col = "red", lwd = 2, add = TRUE)
The total number of trials to obtain 10000 samples is about 25000.
result_b$n_trial
## [1] 25155
Despite the identical algorithm, the total numbers of trials needed to obtain the same number of samples differ significantly. This is because the distance between the 2nd and 3rd modes in (b) is much larger than in (a), which leads to more rejections in (b). That is, ‘x_new’ in (b) is more likely to fall outside \(S_{t+1}\), i.e., \(p(x_{\text{new}}) < u_{t+1}\).
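The difference can be quantified as the acceptance rate, i.e., accepted samples per proposal, using the trial counts reported above (exact values depend on the seed):

```r
# Accepted samples per proposal, from the trial counts reported above
eff_a <- 10000 / 13347 # part (a)
eff_b <- 10000 / 25155 # part (b)
round(c(eff_a, eff_b), 2) # roughly 0.75 vs 0.40
```

So roughly one in four proposals is rejected in (a), versus three in five in (b).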
2. 2D Ising Model Simulation with Parallel Tempering
Introduction: In the Ising model, we study the behavior of spins on a lattice that can be in one of two states: up (+1) or down (−1). The model simulates the interaction between neighboring spins, where the interaction energy tends to align the spins in the same direction. At high temperatures, the spins are more randomly oriented, leading to a disordered state, while at low temperatures, spins tend to align, resulting in a more ordered magnetic state.
Model Setup: We simulate the 2D Ising model on a square lattice with size \(L \times L\), where each spin is denoted as \(s_{i,j} \in \{-1,1 \}\). The Hamiltonian for the system is given by: \[H = -J\sum_{k,l}s_k s_l\] where the sum is taken over all pairs of neighboring spins, and \(s_k\) and \(s_l\) represent the spins at positions \(k\) and \(l\) on the lattice. The interaction strength \(J\) is assumed to be \(J = 1\), meaning that neighboring spins that are aligned (i.e., \(s_k = s_l\)) lower the system’s energy.
Energy Function and Flip: To perform the simulation, we use the Metropolis-Hastings algorithm to flip spins based on the energy change. The energy change for flipping a spin is given by: \[\Delta E = 2s_{i,j} (s_{i+1,j} + s_{i−1,j} + s_{i,j+1} + s_{i,j−1}).\] If the change in energy is negative (\(\Delta E < 0\)), the flip is always accepted. If the change in energy is positive (\(\Delta E > 0\)), the flip is accepted with probability \(\exp(−\beta\Delta E)\), where \(\beta = 1/T\) is the inverse temperature. At each step, one spin is randomly selected.
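To double-check this formula, a brute-force sketch (not part of the assignment code) can compare \(\Delta E\) against the direct energy difference \(H(\text{after}) - H(\text{before})\) on a small lattice with periodic boundaries; the lattice size and flip position below are arbitrary choices.

```r
# Brute-force energy with periodic boundaries, counting each bond once
# (right and down neighbors only)
H_brute <- function(s) {
  L <- nrow(s)
  e <- 0
  for (i in 1:L) for (j in 1:L) {
    e <- e - s[i, j] * (s[i %% L + 1, j] + s[i, j %% L + 1])
  }
  e
}

set.seed(1)
L <- 5
s <- matrix(sample(c(-1, 1), L * L, replace = TRUE), L, L)
i <- 2; j <- 3 # arbitrary spin to flip

# Closed-form energy change for flipping s[i, j]
neighbors <- s[i %% L + 1, j] + s[i, j %% L + 1] +
  s[(i - 2) %% L + 1, j] + s[i, (j - 2) %% L + 1]
deltaE <- 2 * s[i, j] * neighbors

s_flipped <- s
s_flipped[i, j] <- -s[i, j]
stopifnot(H_brute(s_flipped) - H_brute(s) == deltaE)
```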
The MH algorithm flips only one spin at a time to gradually explore the system’s energy landscape. This prevents large changes in energy that could disrupt the simulation. The parameter “steps per move” determines how many flips are performed in each move. A larger value allows more flips per move, but too many can cause the system to settle into a local minimum.
Simulation with Parallel Tempering: You will simulate the Ising model using parallel tempering to explore the system at multiple temperatures. The parallel tempering method runs multiple copies (systems) of the model at different temperatures, and periodically swaps configurations between systems based on the energy difference and temperature.
- Lattice size(\(L\)): 20
- Number of systems: 10
- Minimum temperature: \(T_{min}\) = 0.2
- Maximum temperature: \(T_{max}\) = 10
- Number of moves per system: 1000
- Steps per move: 10
(a) Implement the simulation of the Ising model at multiple temperatures using parallel tempering. Compare the lattice configurations at different temperatures, from low temperature (ordered) to high temperature (disordered).
1) lattice initialization
We need a square lattice of size \(L \times L\); the ‘initialize_lattice’ function below creates an \(L \times L\) matrix whose entries are drawn uniformly at random from \(\{-1, 1\}\). In the main ‘ising’ function, we make a list of 10 such matrices, one lattice per temperature.
rm(list = ls())
set.seed(2023311161)
initialize_lattice <- function(L) {
  matrix(sample(c(-1, 1), L * L, replace = TRUE), nrow = L, ncol = L)
}
2) Hamiltonian function
Recall the Hamiltonian of this 2D Ising model (with \(J = 1\)): \[H = -\sum_{k,l}s_k s_l\] where the sum runs over all pairs of neighboring spins, and \(s_k\) and \(s_l\) are the spins at positions \(k\) and \(l\) on the lattice. For example, the neighbors of \(s_{2,2}\) are \(s_{3,2}, s_{2,3}, s_{1,2}, s_{2,1}\) (with periodic boundaries at the edges). So, for every spin, add its 4 neighbors (‘neighbors’), multiply the sum by the spin (‘lattice[i, j] * neighbors’), accumulate this over all spins, and negate the total (‘energy’).
This calculation counts each neighboring pair twice, so we divide the final value by 2 (‘energy / 2’). The ‘Hamiltonian’ function is used later in the State Swapping Step.
We use ‘(i - 2) %% L + 1’ instead of ‘(i - 1) %% L’ because the former returns indices in \(1, 2, \dots, L\), wrapping around the lattice, while the latter returns \(0, 1, 2, \dots, L-1\), and R matrices are 1-indexed.
Hamiltonian <- function(lattice) {
  L <- nrow(lattice)
  energy <- 0
  for (i in 1:L) {
    for (j in 1:L) {
      # Periodic boundaries: (i %% L) + 1 wraps down/right, (i - 2) %% L + 1 wraps up/left
      neighbors <- lattice[(i %% L) + 1, j] + lattice[i, (j %% L) + 1] +
        lattice[(i - 2) %% L + 1, j] + lattice[i, (j - 2) %% L + 1]
      energy <- energy - lattice[i, j] * neighbors
    }
  }
  energy / 2 # Each interaction is counted twice, so divide by 2
}
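A simple sanity check of this function (a sketch, not required by the assignment): on a fully aligned lattice with periodic boundaries, every spin has 4 aligned neighbors, so \(H = -\tfrac{1}{2}\cdot 4L^2 = -2L^2\).

```r
# 'Hamiltonian' repeated from above so this chunk runs standalone
Hamiltonian <- function(lattice) {
  L <- nrow(lattice)
  energy <- 0
  for (i in 1:L) {
    for (j in 1:L) {
      neighbors <- lattice[(i %% L) + 1, j] + lattice[i, (j %% L) + 1] +
        lattice[(i - 2) %% L + 1, j] + lattice[i, (j - 2) %% L + 1]
      energy <- energy - lattice[i, j] * neighbors
    }
  }
  energy / 2
}

L <- 4
aligned <- matrix(1, L, L) # all spins up
stopifnot(Hamiltonian(aligned) == -2 * L^2)
# Flipping every spin leaves the energy unchanged (global spin-flip symmetry)
stopifnot(Hamiltonian(-aligned) == -2 * L^2)
```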
3) Parallel MH step
The parallel tempering algorithm consists of two steps: a parallel MH step and a state swapping step. In this 2D Ising model, flipping spins is the parallel MH step. As stated in the question, one spin is randomly chosen at each step. After the index (i, j) is chosen, sum the values of its neighbors and multiply by twice the current spin; this gives \(\Delta E = 2s_{i,j} (s_{i+1,j} + s_{i-1,j} + s_{i,j+1} + s_{i,j-1})\) (‘deltaE’). The spin is always flipped if \(\Delta E < 0\); when \(\Delta E > 0\), it is flipped with probability \(\exp(-\beta\Delta E)\). After repeating this for all the steps, return the updated lattice.
parallel_MH_step <- function(lattice, beta, steps) {
  L <- nrow(lattice)
  for (step in 1:steps) {
    # Pick one spin uniformly at random
    i <- sample(1:L, 1)
    j <- sample(1:L, 1)
    neighbors <- lattice[(i %% L) + 1, j] + lattice[i, (j %% L) + 1] +
      lattice[(i - 2) %% L + 1, j] + lattice[i, (j - 2) %% L + 1]
    deltaE <- 2 * lattice[i, j] * neighbors
    # Accept if the flip lowers the energy, else with probability exp(-beta * deltaE)
    if (deltaE < 0 || runif(1) < exp(-beta * deltaE)) {
      lattice[i, j] <- -lattice[i, j]
    }
  }
  return(lattice)
}
4) State Swapping Step
When the current system is the \(i\)th, we choose \(j = i-1\) or \(j = i+1\) with probability \(0.5\) each. When the current system is at the start (\(i = 1\)) or the end (\(i = N\)), the partner \(j\) is \(2\) or \(N-1\) respectively with probability 1. Recall that the acceptance probability of a state swap is \[\min\biggl[ 1, \exp\biggl\{\Bigl(H(x_i^{(t)}) - H(x_j^{(t)}) \Bigr)\Bigl(\frac{1}{T_i} - \frac{1}{T_j} \Bigr) \biggr\} \biggr]\] where \(x_i^{(t)}\) denotes the lattice at the \(i\)th temperature at the \(t\)th move. If the swap is accepted, the \(i\)th and \(j\)th lattices are exchanged.
swap_step <- function(lattices, betas) {
  N <- length(lattices) # number of systems
  for (i in 1:N) {
    # Choose the swap partner j among neighboring temperatures
    if (i == 1) {
      j <- 2
    } else if (i == N) {
      j <- N - 1
    } else {
      j <- sample(c(i - 1, i + 1), 1)
    }
    energy_i <- Hamiltonian(lattices[[i]])
    energy_j <- Hamiltonian(lattices[[j]])
    delta <- (energy_i - energy_j) * (betas[i] - betas[j])
    if (runif(1) < exp(delta)) { # accept: exchange the ith and jth lattices
      temp <- lattices[[i]]
      lattices[[i]] <- lattices[[j]]
      lattices[[j]] <- temp
    }
  }
  return(lattices)
}
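One property worth checking (a sketch; the 3-system setup below is illustrative, not the assignment configuration): ‘swap_step’ only permutes configurations between temperatures, so the multiset of lattice energies must be unchanged.

```r
# 'Hamiltonian' and 'swap_step' repeated from above so this chunk runs standalone
Hamiltonian <- function(lattice) {
  L <- nrow(lattice)
  energy <- 0
  for (i in 1:L) {
    for (j in 1:L) {
      neighbors <- lattice[(i %% L) + 1, j] + lattice[i, (j %% L) + 1] +
        lattice[(i - 2) %% L + 1, j] + lattice[i, (j - 2) %% L + 1]
      energy <- energy - lattice[i, j] * neighbors
    }
  }
  energy / 2
}

swap_step <- function(lattices, betas) {
  N <- length(lattices)
  for (i in 1:N) {
    if (i == 1) {
      j <- 2
    } else if (i == N) {
      j <- N - 1
    } else {
      j <- sample(c(i - 1, i + 1), 1)
    }
    energy_i <- Hamiltonian(lattices[[i]])
    energy_j <- Hamiltonian(lattices[[j]])
    delta <- (energy_i - energy_j) * (betas[i] - betas[j])
    if (runif(1) < exp(delta)) {
      temp <- lattices[[i]]
      lattices[[i]] <- lattices[[j]]
      lattices[[j]] <- temp
    }
  }
  return(lattices)
}

set.seed(1)
lattices <- lapply(1:3, function(k) matrix(sample(c(-1, 1), 16, replace = TRUE), 4, 4))
betas <- 1 / c(0.5, 1, 2)
before <- sort(sapply(lattices, Hamiltonian))
after <- sort(sapply(swap_step(lattices, betas), Hamiltonian))
stopifnot(identical(before, after)) # swapping preserves the set of energies
```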
5) Ising model algorithm
‘lattices’ is a list of \(N=10\) matrices, each serving as the initial lattice at a different temperature. The temperatures are sorted in increasing order, spaced so that their logarithms are equally apart. Then, each lattice is updated using ‘parallel_MH_step’ (spin flips), and ‘swap_step’ decides, for every system, whether to swap states.
# Main function for the simulation
ising <- function(L, N, Tmin, Tmax, moves, steps) {
  lattices <- lapply(1:N, function(x) initialize_lattice(L)) # N initial lattices
  temperatures <- exp(seq(log(Tmin), log(Tmax), length.out = N)) # log-spaced, increasing
  betas <- 1 / temperatures
  for (move in 1:moves) {
    # Parallel MH Step
    lattices <- lapply(1:N, function(i) {
      parallel_MH_step(lattices[[i]], betas[i], steps)
    })
    # State Swapping Step
    lattices <- swap_step(lattices, betas)
  }
  return(list(lattices = lattices, temperatures = temperatures))
}
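The temperature ladder built inside ‘ising’ can be inspected on its own; because the grid is equally spaced on the log scale, consecutive temperatures have a constant ratio:

```r
# Log-spaced temperature ladder, as constructed inside 'ising'
N <- 10; Tmin <- 0.2; Tmax <- 10
temperatures <- exp(seq(log(Tmin), log(Tmax), length.out = N))
round(temperatures, 2)
# Equal spacing on the log scale means a constant ratio between neighbors
ratios <- temperatures[-1] / temperatures[-N]
stopifnot(all(abs(ratios - ratios[1]) < 1e-12))
# The ladder spans exactly [Tmin, Tmax]
stopifnot(abs(temperatures[1] - 0.2) < 1e-12, abs(temperatures[N] - 10) < 1e-12)
```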
Set the parameters as given in the question.
L <- 20 # Lattice size
N <- 10 # Number of systems
Tmin <- 0.2 # Minimum temperature
Tmax <- 10 # Maximum temperature
moves <- 1000 # Number of moves per system
steps <- 10 # Steps per move
result <- ising(L, N, Tmin, Tmax, moves, steps)
The colors black and gray correspond to \(-1\) and \(1\) respectively. The generated images show that when the temperature is low (\(T = 0.2, 0.31\)), most of the spins in the lattice share the same value (ordered). As the temperature increases (\(T = 0.74, 1.14, 1.76, \dots\)), each spin is still likely to match its neighbors, so the image retains large patches of one color (less ordered). However, at much higher temperatures (\(T = 6.47, 10\)), the values of the spins appear essentially random (disordered).
par(mfrow = c(5, 2))
for (i in 1:N) {
  image(1:L, 1:L, result$lattices[[i]],
        main = paste("Ising Model Result When T =", round(result$temperatures[i], 2)),
        col = c("black", "gray"), xlab = "i", ylab = "j", axes = FALSE)
  axis(1, at = 1:L, labels = 1:L) # x-axis ticks
  axis(2, at = 1:L, labels = 1:L) # y-axis ticks
}
par(mfrow = c(1, 1))
par(mfrow = c(1,1))
(b) Explain the significance of the temperature in the Ising model. How does the alignment of spins change as the temperature is varied?
In the 2D Ising model, temperature plays a critical role in the alignment of spins. At low temperatures, the system minimizes its energy by aligning neighboring spins, leading to ordered states. At high temperatures, thermal fluctuations dominate, so the spins become more random, leading to disordered states.
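This can be made quantitative with the Metropolis acceptance probability (a sketch using the worst case \(\Delta E = 8\), i.e., flipping a spin aligned with all four neighbors, at the endpoint temperatures of the ladder used above):

```r
# Probability of accepting the most unfavorable single-spin flip (Delta E = 8)
deltaE <- 8
p_low  <- exp(-deltaE / 0.2) # T = 0.2: essentially never accepted
p_high <- exp(-deltaE / 10)  # T = 10: accepted almost half the time
stopifnot(p_low < 1e-15)
stopifnot(p_high > 0.4, p_high < 0.5)
```

At \(T = 0.2\) such a flip is effectively forbidden, which locks in the ordered state, while at \(T = 10\) even the costliest flip is accepted nearly half the time, which randomizes the lattice.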