Talapas and R
How to use
R in talapas
You can find the Talapas knowledge base here.
Logging in to Talapas
ssh to login through the terminal:
scp to copy files to/from your local machine. Adding the
-r flag lets you copy whole directories and all of the files within that directory. Some examples of its usage:
scp ~/myfile.tsv email@example.com:./projects/my_project scp firstname.lastname@example.org:./projects/my_project/myfile.tsv ~/ scp -r ~/my_directory email@example.com:./projects/my_project
To login to a virtual desktop environment go to talapas-login.uoregon.edu with your browser.
Submitting a job to slurm
When you login to talapas, you will be on one of the login nodes. To run analyses, you will have to access a compute node by submitting a job. From the terminal, there are two ways of submitting a job. You can either start an interactive session in your terminal using
srun or submit a batch script using
sbatch, which will run automatically in the background. In either case, you will have to wait in the queue until a node opens up for the amount of CPU, memory, and time that you need.
srun --account=crobe --pty --partition=short --mem=1024M --time=240 bash
you must have
bash in there. The other commands are optional, but let you spectify memory, time, CPUs, etc. Also, these flags (e.g.,
--time=240) are the same as what you would put into a batch script.
--partition=short is the default type of partition you will use.
--mem=1024M gives you a gig of RAM/memory and
--time=240 gives you 4 hours of time on the node.
An example batch script:
#!/bin/bash #SBATCH --partition=short ### queue to submit to #SBATCH --job-name=varcomp ### job name #SBATCH --output=varcomp.out ### file in which to store job stdout #SBATCH --error=varcomp.err ### file in which to store job stderr #SBATCH --time=30 ### wall-clock time limit, in minutes #SBATCH --nodes=1 ### number of nodes to request #SBATCH --ntasks-per-node=1 ### number of tasks per node #SBATCH --cpus-per-task=28 ### number of cores/CPUs #SBATCH -A crobe ### account module load R/3.6.1 # optional if you want to load a specific version cd projects/Royal_Society/R Rscript var_comp.R
A batch script is essentially just a normal
bash script with some extra parameters at the top, which slurm reads. You should specify
--job-name and where to put
--time sets the time in minutes.
--ntasks-per-node will pretty much always by equal to
R jobs. You can set how many cores/CPUs you want with
--cpus-per-task and the amount of memory you want with
--mem where it takes a number and a unit. For example,
--mem=64G will gives you 64 gigabytes of RAM on that node. I think the default is something like 4 GB per CPU. At the bottom of the script, you should enter normal terminal commands as you would in a
bash script or as you would enter them directly at the command line. This script can be in your home directory or in the project directory.
Run a batch script by typing
sbatch and then the name of the batch script (the filename extension doesn’t matter so I usually just use
To check on your jobs you can type in:
squeue -u username
username with your username.) This will tell you whether they’re still in the queue, how long they’ve been running, etc.
You can cancel a job by getting the job ID from
squeue and then entering:
job-id with the actual number.) Any output that your commands print to the terminal will be saved in the
output file that you specify. Any errors specifically from slurm, for example if your job crashes for some reason, will be printed to the
error file that you specify.
R is installed
First, login to talapas using
ssh. By default, when you login to talapas,
R is already loaded with a recent installed version (at least on my account). Check by entering:
at the command line. If this returns an error or if you want a specific version of R you can see the installed versions with:
module spider R
and load the latest version with:
module load R
or a specific version with:
module load R/3.6.1
This is important to check first, because if it’s not automatically loaded, you will have to include
module load R at the top of your batch file when you go to submit jobs.
(It’s probably safest to always load a specific version of
R at the top of your batch file, but I don’t do this. 🤷)
R at the command line
R directly by typing
at the command line, which will open an
R console. This is a good way of testing things and installing packages. You should manually install all of the packages you need this way (with the appropriate
R module loaded) before submitting a batch script. When you run
install.package() from talapas it will ask you to select a mirror from which to download the files. I usually choose the OSU one.
R script from the command line
Use the function
Rscript to run a script from the terminal:
By default, the output is not saved to a file, but is just printed to the terminal. So any output you need should be manually saved within your script.
(if you’re running scripts manually at the command line without a batch job (i.e., without using
sbatch) then remember to login to a compute node using
srun instead of running your script on the login node.)
R script from within a batch file using slurm
If you want to submit your
R script as a job to talapas, just create a batch script in which you
cd into the folder with your
R script and use the
Rscript command like you would at the command line.
Your batch script will look something like this:
#!/bin/bash #SBATCH --partition=short ### queue to submit to #SBATCH --job-name=varcomp ### job name #SBATCH --output=varcomp.out ### file in which to store job stdout #SBATCH --error=varcomp.err ### file in which to store job stderr #SBATCH --time=30 ### wall-clock time limit, in minutes #SBATCH --nodes=1 ### number of nodes to request #SBATCH --ntasks-per-node=1 ### number of tasks per node #SBATCH --cpus-per-task=28 ### number of cores you want to use #SBATCH -A crobe ### account module load R/3.6.1 # optional if you want to load a specific version cd projects/Royal_Society/R Rscript var_comp.R
And you will run it at the command line like this (except replace
varcomp.batch with whatever the name of your batch file is):
(It can be helpful to use
require('package_name') instead of
library('package_name') so that you receive output in the event that a package fails to load.)
(If you’ve ever used
R CMD BATCH to run
R scripts at the command line,
Rscript works a little differently. It prints output to the terminal (a.k.a.,
stdout) and does not save an
.Rout file. Therefore, any output or errors will print to your
sbatch.out file defined in your batch script.)
rstudio on talapas
If you want to use
rstudio on talapas you can login to an Open OnDemand virtual desktop by opening your web browser and navigating to talapas-login.uoregon.edu. Enter your uoregon login info and then click on “Interactive Apps” at the top of the screen and then “Talapas Desktop” from the dropdown menu that appears. This will take you to a screen where you enter your normal
sbatch info for slurm including your account name (crobe), the length of your job, how many cpus you need, etc. Once your session begins, click on “Launch noVNC in New Tab” to open the desktop.
Once you have the talapas desktop open, go to the top of the screen and click on the terminal button. At the terminal type in
module load rstudio
and if you want a specific version of
R you can type in
module load R/3.6.1
or whatever version you want. Finally, type in
to start the
rstudio application. Your working directory will be your normal talapas home directory and you can use
rstudio like you would on your local machine but with the power 💪 of talapas.
If you want to use
qiime2 on talapas, you should put the following commands into your batch script:
module load miniconda conda activate qiime2-2019.10
where you should replace
qiime2-2019.10 with the name of the
conda environment that has the version of
qiime2 that you need. You can see which
qiime2 environments are installed with:
conda env list
How to install a specific version of
qiime2 on Talapas using
When you do this, you should go to the qiime2 docs under “Installing QIIME 2” and run the commands from there to get the latest version. This will create a miniconda environment specific to your talapas user account. Below, I have copied how to do this for the current version (version 2020.2) as an example.
First, if you haven’t already, activate the
module load miniconda
Then you can install
qiime2 in a new environment. Just update the url and environment name with whatever specific version you want to install:
wget https://data.qiime2.org/distro/core/qiime2-2020.2-py36-linux-conda.yml conda env create -n qiime2-2020.2 --file qiime2-2020.2-py36-linux-conda.yml # OPTIONAL CLEANUP rm qiime2-2020.2-py36-linux-conda.yml
Then any time you what to use that version simply run:
source activate qiime2-2020.2
If you forget what it’s called, it will be at the top of this left:
conda env list