Monte Carlo simulation for empirical rejection rates in R - calculations and plots

Monte Carlo simulation for empirical rejection rates in R - calculations and plots

Assume you have a random variable that follows a normal distribution, you want to calculate empirical rejection rates and plot them. In the following we will look at the case where the mean is unknown and the variance is known.

Specifications
H_0: mu_0 = 0 vs H1: mu_1 > 0
5% significance 

Therefore, we will conduct a Monte Carlo simulation to come up with a table of empirical rejection rates for different sample sizes and different values for the unknown mean mu. Please note that the code shown in this article is for educational purposes and is not always written in the most efficient way for reasons of comprehensibility - this way it is easier to follow.

# set a seed for the reproducability of the results
set.seed(73)
# define the number of runs of the MC simulation
runs = 10000
# define list of sample sizes 
vector_mu = seq(0, 4, by = 0.25)

Now we will start with a sample size of 10 draws per Monte Carlo simulation:

# set sample size
n = 10
# create empty vector to store results
rates_10 = c()
# set siginficance level
alpha = 0.05
# start interation
for (i in vector_mu){
  accepted = 0
  rejected = 0
  for(k in 1:runs){
    simulation = rnorm(n, mean = i, sd = sqrt(6))
    t_stat = (mean(simulation) - 0) / (sqrt(6) / sqrt(n))
    # calculate critical value of right-tailed test 
    critical_value = qnorm(1-alpha)
    # test decision
    if(t_stat > critical_value){
      rejected = rejected + 1
    } else {
      accepted = accepted + 1
    }
  }
  result = rejected / (accepted + rejected)
  rates_10 = c(rates_10, result)
}
print(rates_10)

Repeat this for other sample sizes (for example n=25 and n=50). You can use the following code to plot your results.

par(mfrow=c(1,1))
y = seq(0.1, 1, by = 0.1)
plot(vector_mu, rates_10, type="l", col = "black", main = "Empirical power function (sigma known)", ylab = "Power", xlab = "mu")
lines(vector_mu, rates_25, type="l", col = "red", lty = 2)
lines(vector_mu, rates_50, type="l", col = "green", lty = 3)
legend("bottomright", title = "Sample size", legend = c(10,25,50),
       col=c("black", "red", "green"),lty=c(1,2,3), cex = 0.5)

The plot of your results can look like the following. Note that your graphs can look different if you choose other sample sizes or a different seed.

If you are interested you can view the entire code and a lot of other interesting statistics projects on github.