Thursday, January 24, 2013

Running on a server

I use MrBayes quite a lot, and my analyses often take quite a long time to run. That's why I use the lab's server to run most of my MrBayes scripts. Its name is Aquarium.

It was recently upgraded to 8 cores, so I frequently have seven scripts running at once (I like to leave at least one for someone else who might need it). I also submit them as low priority so that someone with more pressing needs can use the high priority slots.

Using Aquarium was my first exposure to SSH (secure shell) and SCP (secure copy), and back when I only had a PC, I had to learn PuTTY to connect to it. I've now started to get the hang of it, but here are a couple of tricks I've used recently to overcome some issues.

One thing I noticed was that some scripts were taking an abnormally long time: months of waiting for 50 million generations. Most of the slow ones were prior-only runs, where I had replaced the sequence data with the appropriate number of question marks. For some reason, MrBayes, when estimating under the prior alone, will just stall. I know this now because I used a little trick to see whether my file sizes were changing:

ls -l

This lists the files and folders with details, including file size. If a script is running correctly, most of its output files (like my .p and .t files) will keep growing. But even overnight, mine were staying the same size.
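Comparing ls -l output by eye works, but the same check can be scripted. The sketch below is a hypothetical helper (the name is_growing and the default 60-second wait are illustrative assumptions, not part of MrBayes) that measures a file's size twice and succeeds only if the file grew in between.

```shell
# Hypothetical helper: succeed only if FILE grew during the wait.
# Usage: is_growing FILE [SECONDS]   (SECONDS defaults to 60)
is_growing() {
    file=$1
    wait_secs=${2:-60}
    # wc -c prints the size in bytes; $(( )) strips any padding spaces
    before=$(( $(wc -c < "$file") ))
    sleep "$wait_secs"
    after=$(( $(wc -c < "$file") ))
    [ "$after" -gt "$before" ]
}
```

For example, is_growing myrun.nex.run1.p 600 && echo "growing" || echo "no change" would wait ten minutes before deciding; myrun.nex.run1.p is a placeholder file name. The wait needs to be longer than the interval at which the run writes its samples, or a healthy run will look stalled.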


When I use top, I can see the processes that are running, each with its process ID (PID). Since some of my scripts were stalled, I just wanted to stop them and download the files using SCP. To stop the correct process, I need to use the correct PID:

kill -9 [PID]
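One caveat: kill -9 sends SIGKILL, which a process cannot catch, so it gets no chance to finish writing its output. A gentler pattern, sketched below under the assumption that the stalled run still responds to signals, is to send the default SIGTERM first and fall back to -9 only if the process is still there after a grace period. The function name stop_process and the 10-second default are hypothetical.

```shell
# Hypothetical helper: ask a process to exit, force it only if needed.
# Usage: stop_process PID [GRACE_SECONDS]   (grace defaults to 10)
stop_process() {
    pid=$1
    grace=${2:-10}
    # Plain kill sends SIGTERM; fail early if the PID does not exist
    kill "$pid" 2>/dev/null || return 1
    i=0
    while [ "$i" -lt "$grace" ]; do
        # kill -0 sends no signal; it only tests whether the PID exists
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        i=$((i + 1))
    done
    # Still there after the grace period: fall back to SIGKILL
    kill -9 "$pid" 2>/dev/null
}
```

Calling stop_process [PID] then behaves like the command above, but only reaches for -9 as a last resort.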

If I happen to remember the order in which I submitted my scripts, I can just use the time that it's been running to find the correct PID. But if I don't remember, I can use this:

ps aux | grep [command name]

This way, I know exactly which script corresponds to which PID!
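A variation on the same idea is to ask ps for only the columns that matter here: the PID, how long the process has been running, and the full command line. This is a sketch rather than a fixed recipe; the function name find_runs is hypothetical, and the pattern you pass depends on how the program shows up in the process list.

```shell
# Hypothetical helper: list PID, elapsed time, and command line
# for every process whose ps line matches PATTERN.
# Usage: find_runs PATTERN
find_runs() {
    # pid, etime, and args are standard ps output fields.
    # NR == 1 keeps the header row; $3 != "awk" drops this awk
    # process itself, whose own arguments contain the pattern.
    ps -eo pid,etime,args |
        awk -v pat="$1" 'NR == 1 || ($0 ~ pat && $3 != "awk")'
}
```

For example, find_runs mb would list the MrBayes runs, assuming the executable is named mb on your machine. The ELAPSED column shows the running time directly, which helps when matching scripts to the order in which they were submitted.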
