I have to run Markov chain Monte Carlo (MCMC) simulations that each take hours and require parameter configuration. On this particular day, it was a Metropolis-Hastings algorithm for which I had to specify the step size of the proposal distribution.
After getting sick of manually changing the step size each time, I gave in and parallelized the MCMCs, running all of them in one go. Given the big time cost of each MCMC, I really didn’t want one failure to jeopardize the rest. So I wrote a parallelized script that accomplishes the following goals:
The result of each chain is saved as soon as it is done
The progress is tracked in a log file
I used the foreach package to parallelize, plus some tricks to create an informative log file and informative file names. Below is the code for my_parallel_mcmc.R:
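What follows is only a minimal sketch of how such a script could look, not the original listing. It assumes the doParallel backend (the post only names foreach), a made-up set of step sizes, a hypothetical file-name pattern, and a placeholder f_mcmc() that stands in for the real sampler. The two ideas it tries to show are the outfile argument of makeCluster(), which sends everything the workers print to my_mcmc.log, and .errorhandling = "pass", which lets the remaining chains finish when one of them fails.

```r
library(foreach)
library(doParallel)  # assumed backend; the post only names foreach

# Hypothetical stand-in for the real sampler: prints progress, then saves its result.
f_mcmc <- function(step_size, n_iter, out_file) {
  draws <- numeric(n_iter)
  for (i in seq_len(n_iter)) {
    draws[i] <- rnorm(1, sd = step_size)  # placeholder work, not a real MCMC update
    if (i %% (n_iter / 4) == 0) {
      cat(sprintf("[%s] step_size=%g: iteration %d of %d\n",
                  format(Sys.time(), "%H:%M:%S"), step_size, i, n_iter))
    }
  }
  saveRDS(draws, out_file)  # written as soon as this chain is done
  out_file
}

step_sizes <- c(0.1, 0.5, 1, 2)  # assumed values to try
log_file <- "my_mcmc.log"

# outfile redirects everything the workers print into the log file.
n_workers <- max(1, min(length(step_sizes), detectCores() - 1, na.rm = TRUE))
cl <- makeCluster(n_workers, outfile = log_file)
registerDoParallel(cl)

# "pass" returns a failed chain's error object instead of aborting the other chains.
results <- foreach(s = step_sizes, .errorhandling = "pass") %dopar% {
  out_file <- sprintf("mcmc_step_size_%g.rds", s)  # step size encoded in the file name
  f_mcmc(step_size = s, n_iter = 1000, out_file = out_file)
}

stopCluster(cl)
```

Because each chain writes its own .rds file from inside the worker, a crash in one chain (or in the master session) does not lose the chains that already finished.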
Your f_mcmc() function should 1) periodically print out a progress report, and 2) save the MCMC result at the end.
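To make those two requirements concrete, here is a rough sketch of an f_mcmc() built around a random-walk Metropolis-Hastings sampler. The standard normal target, the argument names (n_iter, report_every, out_file), and the contents of the saved list are all assumptions for illustration; your own sampler will differ.

```r
# Hypothetical target: log-density of a standard normal (assumption, for illustration).
log_target <- function(x) dnorm(x, log = TRUE)

f_mcmc <- function(step_size, n_iter = 10000, report_every = 2000,
                   out_file = sprintf("mcmc_step_size_%g.rds", step_size)) {
  chain <- numeric(n_iter)
  x <- 0          # starting value
  n_accept <- 0

  for (i in seq_len(n_iter)) {
    # Gaussian random-walk proposal; step_size is its standard deviation.
    proposal <- x + rnorm(1, sd = step_size)

    # The proposal is symmetric, so the acceptance ratio is just the target ratio.
    if (log(runif(1)) < log_target(proposal) - log_target(x)) {
      x <- proposal
      n_accept <- n_accept + 1
    }
    chain[i] <- x

    # 1) Periodic progress report; ends up in the log file when run on a worker.
    if (i %% report_every == 0) {
      cat(sprintf("[%s] step_size=%g: iteration %d of %d, acceptance rate %.2f\n",
                  format(Sys.time(), "%H:%M:%S"), step_size, i, n_iter, n_accept / i))
    }
  }

  # 2) Save the result at the end, so a finished chain survives later failures.
  saveRDS(list(step_size = step_size, chain = chain,
               acceptance = n_accept / n_iter), out_file)
  invisible(out_file)
}

# Example call with a short chain, saved to a temporary directory.
out <- f_mcmc(step_size = 0.5, n_iter = 2000, report_every = 500,
              out_file = file.path(tempdir(), "mcmc_step_size_0.5.rds"))
```

Printing the step size in every progress line matters once several chains write to the same log, since their lines will be interleaved.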
I used Git and GitHub to clone my script onto a remote cluster. From its terminal, I run the script with Rscript my_parallel_mcmc.R. To check the progress, I use tail -f my_mcmc.log. To keep the job running even after you disconnect from the remote cluster, use tmux.