You will get access to the Orion computing cluster that we will use in BIN310. However, for the next 3 weeks we will use our own local computers; from week 5 we will use Orion for all heavy computations. Since some students experience more hassle than others getting this to work, it is good that we have some time to fix it before we start using Orion every day.
R is a ‘workhorse’ in the courses given at KBM. We assume all are familiar with the use of R and RStudio. This is not a course where we teach new coding concepts, but we will make use of RStudio as our ‘workbench’, and we will frequently need to write small pieces of R code. We take some time to refresh R coding this week, but be prepared to constantly update yourself on this.
We also have a look at some background for the sequencing data we will meet in BIN310. Summarized, these are the learning goals for this module:
I have recorded two short lectures with an introduction to this course:
I will talk about this on our opening day (September 4), but here you have a recorded version as well.
In this course we will spend time working on a High Performance
Computing (HPC) cluster called Orion here at NMBU. This is a
standard working environment for most computational sciences, including
computational biology. Our very first step is therefore to get access to
this.
A High Performance Computing cluster can be seen as a collection of (more or less) powerful computers linked together. We refer to each computer in the cluster as a node. Each node may in itself not be particularly powerful, at least compared to a powerful gaming PC you may have at home, but together they make up a computing facility with resources exceeding what we typically have locally. We immediately divide the nodes into two distinct categories:
Compared to a local PC, the advantages of a computing cluster are typically:
In Norway we have a national HPC facility called sigma2. It is quite likely that several of you will be using this in the future, and you may read about sigma2 here.
In addition to this, we have at NMBU a local computing cluster called Orion. This is the HPC we will be using in BIN310. It is small compared to the national facilities, but is a nice platform for learning the necessary skills. It has the same operating system and queuing system as sigma2, and works in much the same way. Thus, what you learn by using Orion is very much transferable to sigma2 and other similar HPC facilities.
We will now take some necessary steps to access the Orion HPC.
The documentation website for Orion is https://orion.nmbu.no/. NB! You may need a VPN connection (see below) to actually reach this site.
You may start looking into this site now, but it will probably be more useful once you have started to make use of the facilities. Bookmark this site.
In order to access Orion, you need to be inside the NMBU firewall, for security reasons. This means you need a Virtual Private Network (VPN) connection to NMBU. This may be required even if you are at campus.
How to establish a VPN connection depends a little on your PC. Here are some links to help you:
Please consult the IT service resources at NMBU for help on this; it is not something we (BIN310 teachers) can help you with, since our NMBU-employee computers are set up to not require VPN.
Setting this up is something you do once, but each time you restart your computer, you need to establish a VPN connection before you can access Orion. Previous experience is that some computers are more problematic than others when it comes to establishing a VPN connection to NMBU.
Please make certain you have a VPN connection! Contact the proper IT help services if you run into problems. We (BIN310 teachers) are probably of little help, but let me know if you have some problems. Then I can give feedback to the IT department to improve things for later.
To be able to log into Orion, you must first get an account, i.e. you must be registered with a username and a password.
For this course we have created some student accounts on Orion. These are all based on the students registered for the course in Fagpersonweb on Monday, September 1. You should get an email about this from the IT department on your NMBU email some time after September 1. You use the same username on Orion as you use anywhere else at NMBU, but you need a separate password.
In the email from the IT you get a dummy password. The dummy password is for first time login only, and you must immediately change this, see below how to do this.
In case you did not get any email from the IT department about this before Friday, September 5, email me at lars.snipen@nmbu.no from your NMBU email, using the subject orion student. Since we will not use Orion right away, there is no need for panic, but we also want this to be settled as quickly as possible.
Let us connect to Orion using software that we will refer to as a Terminal. This gives us a single Terminal-window access to Orion, and is the basic way of connecting to Orion, or any other HPC. A Terminal is simply a window on your local computer that typically may look something like this:
Mac users: The Mac operating system is very similar to the operating system of Orion (both are variants of UNIX). Thus, a Mac already has a Terminal app that you can make use of now! Start the Terminal app on your Mac, and you have a window for logging into Orion. In this window you always type
ssh <username>@login.orion.nmbu.no
where you replace <username> by your proper Orion username. After this you press the Return key, and assuming your VPN connection is fine, you will be prompted for the password. Type it in, and hit Return.
NOTE: Nothing is displayed while typing the password (for security reasons), i.e. it may look like it is not being typed. But it is! Thus, you need to be able to type your password blindfolded! If this was successful you should get some text scrolling over the screen, see my short video for Windows users below.
Windows users: You first need to install software that gives you a Terminal window; unlike on a Mac, this is not standard on Windows computers. The Orion website above suggests using either MobaXterm or PuTTY. Both are free software tools, and may be installed from the NMBU Software Center.
This video is from 2023, and the usernames were a little different then, but the procedure is the same.
When you log onto Orion for the first time, using your dummy password, you will immediately be asked to change it (see the MobaXterm video above).
Just do as the text in the Terminal window tells you. Again, when typing the old and new passwords, nothing will appear on the screen, but it is working!
After completion the new password is the one you use from now on. Please do not forget this! You should now end the session, and start a new one, and log in to see if the new password is now working. It may take some time for the new password to be registered.
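Should you want to change your password again at some later time, the standard UNIX command for this is passwd, typed on the Orion command line. This is the generic UNIX way; check the Orion documentation if it behaves differently there.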
In the small black text box above you can see the first example of my writing a UNIX command line in this course (ssh <username>@login.orion.nmbu.no). All commands/code written with such a black background are meant to be written directly in the Terminal window. This is what we refer to as the command line. This is the basic access to a UNIX system; there is no graphical interface (Desktop). We will type a lot of commands on command lines like this in BIN310.
On our local computers we are used to a graphical interface, i.e. some Desktop on which we can see files as small images, and where we can click and drag stuff. UNIX systems also have this, but on an HPC facility it is not used. Here our only ‘window’ into the system is a Terminal window, and in order to communicate with the system we need to type in commands on the command line. You may recognize this from the Console window in RStudio, where you can type R commands directly, and get output as text. There are several reasons this command line facility has survived and is still very much in use. It requires little computing resources, and it gives us more direct access to the operating system. It also reflects that this is actually a computer in the original meaning of the word, unlike our Mac or Windows ‘computers’, which are mostly used for communication, administration and gaming, and hardly ever for actual computing.
We want you all to get an account and everything fixed to access Orion now. However, we will not start using Orion until week/module 5. The reason is that modules 2, 3 and 4 are all about metabarcoding data processing and analysis, and such data are usually small enough to be handled well by our local computers. We could have used Orion for this as well, but since it is not really necessary we have chosen not to this year. Only when we start working with whole genome and metagenome data do we need the power of Orion. Thus, if you experience problems with VPN or accessing Orion, you now have some time to fix this before we start using it in week 5.
You should have some familiarity with R coding, as this is a prerequisite for this course. In this course we will not do very much statistics, but rather use R for data wrangling. In most cases this means reading text files into R, doing some filtering, selection, sorting and simple calculations, and then plotting the results.
You will also need to make some reports in RMarkdown and hand in the HTML output from this.
We will later in BIN310 use the Orion computing cluster at NMBU, but we start out by using our local computers the first weeks. We assume you have R and RStudio on your local computer. Update R and RStudio to the newest versions now at the start of the semester.
You can configure RStudio more or less as you like, but some settings are actually important to be aware of, see the following short video:
Here is a short video with some practical hints on configuring RStudio
This was made using RStudio on Orion (we come back to that later), but the same applies to RStudio on your local computer as well.
We will, as always, need some extra R packages during BIN310. Before you start installing R-packages, it is a good idea to create a specified R library, which is simply a folder in which you install all your R-packages.
RStudio will always suggest a folder the first time you try to install a package. This is often OK to accept on a local computer. However, even on a local computer RStudio may choose a sub-optimal folder; many problems could have been avoided by never storing R packages in the cloud! The main reason we focus on this here now is that when we later start using RStudio on Orion, life becomes simpler if we, not RStudio, decide where the R packages should be stored. The procedure will then be the same as the one we sketch here now.
You may omit this if you have done a proper job along these lines already.
First, create a folder somewhere on your local disk (not in some cloud). As an example, I have on my laptop a folder named Rlib directly under the C: disk (Windows), i.e. the folder is C:\Rlib in Windows language. You may call the folder whatever you like, and put it wherever you like as long as it is local on your laptop. This is where you want RStudio to install all R packages.
Next, we need to tell RStudio to use this folder. There are several ways to do this. We now focus on using the .Rprofile file, since this is what we will do on Orion later.
Whether you have a Windows, Mac or Linux computer, you always have a HOME folder somewhere. To locate this, run the following code in the Console window of RStudio:

path.expand("~")
## [1] "C:/Users/larssn/OneDrive - Norwegian University of Life Sciences/Documents"

The folder listed is what RStudio considers to be your HOME folder.
In this folder you must put a file named .Rprofile (note that it starts with a dot). In RStudio, just open a new R script (File - New File - R Script). Fill it with the line

.libPaths("C:/Rlib")

and replace the text with the full path to the R library folder you created above. Note: use forward slash / even in Windows, where we otherwise use backslash \. Now save this file in the HOME folder you revealed above, and call the file exactly .Rprofile.
Every time R starts, it will look through your HOME folder for a file named .Rprofile. If such a file exists, the R commands in it will be executed. The command .libPaths() is used to set the path to the folder in which to install R packages. Thus, if you did this correctly, you should now restart R (so that the new .Rprofile is read) and run .libPaths() with no arguments (empty parentheses) in the Console. It will typically list more than one path, but the first one should be the one you created and specified in your .Rprofile file.
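For example, with the folder above, the output could look something like this (the exact paths, and the R version in the second path, are just an illustration and will differ on your computer):

.libPaths()
## [1] "C:/Rlib"                             "C:/Program Files/R/R-4.4.1/library"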
Here is a short video illustrating the procedure
Again, this video was made running RStudio on Orion. But the procedure is the same on your local computer; you just choose/create the folder you want to use there. We will later do the same on Orion, and then you may want to take a look at this video again.
We install R packages from various sources. Here are some exercises that you need to do; we will need these packages (and more) later:

- Install the microseq package from CRAN. The Comprehensive R Archive Network (CRAN) is the main repository for R packages. Use Tools - Install Packages... in RStudio and search for microseq. This is a small package we have made, and we use it mostly for reading/writing sequence files to/from tables. There are other packages for doing this, but they tend to not store data in tables. R loves tables!
- To be able to read compressed data files, you also need to install the package R.utils from CRAN. Do this right away!
- Below you will need the package devtools. Install this unless you already have it.
- We will also need the tidyverse packages from CRAN in BIN310. You may have these already; if not, install tidyverse now. This takes quite some time!
- It is also a good idea to update all packages at the start of the semester. Use the Update button at the top of your Packages tab in RStudio!
- Install the packages phyloseq and dada2 from Bioconductor. Try to google and follow the instructions. The Bioconductor is an alternative to the CRAN repository, and has a lot of useful stuff for bioinformatics. All Bioconductor packages are installed in a similar way, using the BiocManager package.
- Next week we will need the R package Rsearch. This also relies on you installing the software tool vsearch independently of R. Go to the GitHub site https://github.com/CassandraHjo/Rsearch, read the instructions and install Rsearch and also the extra software vsearch.

A sketch of how these installations may look as R code is shown below.
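If you prefer doing the installations from the Console instead of clicking in the menus, a minimal sketch could look like this. The CRAN and Bioconductor commands are standard; the last line is only my assumption that Rsearch is installed from GitHub via devtools, so always follow the instructions on the Rsearch GitHub page (and remember that vsearch itself must be installed separately, outside R).

# Packages from CRAN
install.packages(c("microseq", "R.utils", "devtools", "tidyverse"))

# Packages from Bioconductor, installed via the BiocManager package
install.packages("BiocManager")
BiocManager::install(c("phyloseq", "dada2"))

# Rsearch from GitHub (assumed; see the GitHub page for the authoritative instructions)
devtools::install_github("CassandraHjo/Rsearch")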
Later in BIN310 we will use RStudio running directly on Orion (not your local PC). It looks and behaves quite similar to what you are familiar with on your local PC. We will then use the same procedure for telling R where to install the R packages as we did above. We will come back to this in week 5.
By sequencing we refer to the technology that allows us to actually read the sequence of DNA (or RNA) molecules. In this course we will focus on DNA sequencing data, but the technologies are the same for RNAseq.
Considering microbial genomics in general we can divide sequencing efforts into the following two categories:
They differ mainly in what is done to the DNA before the actual sequencing takes place. With WGS we want to sequence entire genomes, and no particular region is targeted. With metabarcoding or amplicon sequencing, we first copy (amplify) some region from the genomes, usually by PCR (Polymerase Chain Reaction), and then sequence only these amplified pieces.
Anyway, the data we get from sequencing are DNA sequences that we call reads. A read is a copy of the actual DNA sequence. Reads may be short or long, depending on the technology, and we typically get a lot of them. Our raw data are these reads.
This year we will focus on metabarcoding data in the first part of
the course, and then turn the attention to WGS data in the remaining
weeks.
In this course we do not devote much time to this question. But it seems rather obvious that in order to understand, and be able to predict, what happens in biology, knowledge of the DNA sequence is a basic piece of information. It is not, however, the case that in order to understand any biology you need to do some sequencing. One could sometimes get this impression, since ‘everyone’ is sequencing these days.
Sequencing has become rather cheap and easily available, and for this
reason a lot of such data are being produced. In this course we focus on
how to process such data, and how to make some rather basic analyses
from them. The more sophisticated analyses will, however, require a more
focused attention to one particular application (why you sequenced). In
order to make this course useful for as many as possible, we try to
avoid too specific applications, and rather spend time learning the
skills and theory that everyone may benefit from. The downside of this
is of course that the more exciting findings are outside our scope. When
you start on your own project (Master/PhD) you will have the time to dig
into your data looking for answers to your questions!
Instead of me talking about sequencing, it is better to make use of the endless sources of teaching material on the internet.
Below is listed some links to an online course from San Diego State University, US. The topics overlap quite a lot with our BIN310, focusing on genome sequencing and bacteria (even the lecturer looks somewhat similar!). I suggest we listen to what Rob Edwards has to say about sequencing. You should look through these videos, and learn:
In the last video above Rob Edwards lists a number of steps on our left hand side of the board. We will also have a look at this, except for the metabolic modelling, which is outside the scope of BIN310.
Also, all this is about genome sequencing. In the first part of
BIN310 we will instead focus on metabarcoding data, which means we
sequence targeted amplicons instead of random pieces of the genomes. But
in most of BIN310 modules we will also work with whole (meta)genome
sequencing data.
There are a multitude of different technologies available for us to do DNA sequencing. We will mainly focus on Illumina sequencing in this course, since this is the most widely used today. Illumina data is what you meet everywhere. The Illumina technology, and some others, are often referred to as Next Generation Sequencing (NGS) or second generation sequencing (or HTS=High Throughput Sequencing by Rob Edwards).
In addition, we will also briefly see some data from newer technologies for producing long reads, often referred to as third generation technologies. These are usually produced by machines from either Pacific Biosciences or Oxford Nanopore, and we refer to these as PacBio or Nanopore data.
Again, let us listen to Rob Edwards:
There are a couple of other sequencing technologies mentioned in some other Rob Edwards videos. You may have a look at them as well, but we will not dig into this here.
There are also a number of nice videos that animate the Illumina
sequencing process, have a look at this: https://www.youtube.com/watch?v=fCd6B5HRaZ8
Here is a brief summary of the sequencing types and technologies, seen from a BIN310 perspective.
|  | Used with WGS | Used with amplicon | Paired-end | Read lengths (bases) | Precision | Technology |
|---|---|---|---|---|---|---|
| Short-reads | yes | yes | usually | 150-300 | high | Illumina |
| Long-reads | yes | yes | no | 1000-100 000 | medium-low | Nanopore, PacBio |
We notice that both short- and long-read technologies are used for both WGS and metabarcoding types of sequencing, but we could say that for metabarcoding the more precise short reads are preferred, while for WGS the long reads are beneficial if you want to assemble genomes. We will see more of this as we go along.
How many reads do we get from a sequencing run? This varies a lot. With metabarcoding sequencing, we need only a few thousand reads, since we have copied a small piece of the genome and only sequence this piece. For full genome sequencing we need more, and for metagenomes even more. From deep sequencing of a highly diverse metagenome we may have 100 million reads or more. Remember also that with e.g. Nanopore sequencing, the reads are long (many thousand bases) and the number of reads may not be that large. It may still be a large data set; it is the number of bases we sequenced that counts.
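As a small illustration of this last point: 1 million Nanopore reads with an average length of 10 000 bases amount to 10 gigabases of sequence, roughly the same as almost 70 million Illumina reads of 150 bases each.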
Regardless of sequencing technologies, the actual output from
sequencing comes as text-files following the fastq format.
Well, in case of Nanopore technology there is actually a
base-calling step first, but if we consider this as part of the
sequencing itself (it is in the other technologies), we can say that our
job starts by having some fastq files with sequencing data.
For a bioinformatician, the fastq files are usually the raw data we start out with. These fastq files include the reads (sequences) as well as a header (identifier) for each read. But they also contain a quality string for each read, indicating how reliable each base in the read is.
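To make this concrete, here is a small made-up example of a single read record from a fastq file. Each read occupies four lines: a header line starting with @, the sequence itself, a separator line starting with +, and finally the quality string with one symbol per base:

@read_1
ACGTACGTTTGA
+
IIIIHHHGGF#!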
The quality scores are also sometimes referred to as phred
scores. This is due to the software phred
that was
developed during the human genome project, and from which these scores
originate. Some of the analyses we will do make use of this quality
information, but in many cases it is also ignored. We should be aware
that sequencing technologies may be poorly calibrated, such that quality
scores do not always reflect the absolute error probability as mentioned
in the video above. A higher score means a better sequencing result, but
exactly how good or how much better varies a lot.
The fastq files we get from the sequencing machines are sometimes
big, many gigabytes per file. This is why we should always try to keep
them compressed when storing them, to save disk space. You will
typically see the file extension .fastq.gz
on such files,
where the .gz
indicates a compressed (gzipped) file. Most
software (but not all) working with fastq files will be able to read the
compressed file directly, and we need not decompress it. If you want to
inspect the actual text file you have to decompress it first, but this
is rarely something we do.
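As an example, the readFastq() function from the microseq package (with R.utils installed) reads a gzipped fastq file directly, which is also what we will do in the exercise below; the file path here is just an illustration:

library(microseq)
# Reading a compressed fastq file directly, no decompression needed
reads.tbl <- readFastq("data/metabarcoding_R1.fastq.gz")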
In the short video above we saw how the quality symbols in fastq files can be converted to quality scores, and then again how these can be converted to probability of error.
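In short, if a base has a quality symbol with ASCII value \(a\), its quality score is \(Q = a - 33\), and the corresponding probability of a sequencing error at that base is \[p = 10^{-Q/10}\] Thus, a score of 30 corresponds to an error probability of 0.001, a score of 20 to 0.01, and so on. These are exactly the conversions we will code up in the exercise below.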
How many errors can we expect to see in a read? Let us now answer this question by recapitulating binomial variables from introductory statistics.
Assume the quality scores reflect the probabilities of error exactly. If we have a read with \(L\) bases then each base has an error probability. Let us denote them \(p_1, p_2,...,p_L\) for each of the bases along the read. Consider now a single base and its error probability \(p_i\). What is the expected number of errors here? Well, the actual number of errors can be either 0 or 1 of course. The expected value is the average number of errors we would get if we had \(n\) such bases, all having the same error probability. This is the same as coin flipping! You have a coin where \(p_i\) is the probability of Heads. You flip the coin \(n\) times and count how many Heads you ended with. This number of Heads is a binomial variable. The expected value of this binomial variable is defined as \(n\cdot p_i\). Thus, the expected number of errors at the single base is \(1\cdot p_i=p_i\) since there is only this single base.
But the read has several bases, in fact \(L\) of them. The expected number of errors at each position is then \(p_1, p_2,...,p_L\), and the total for the whole read is \[EE = p_1 + p_2 + \cdots + p_L = \sum_{i=1}^{L}p_i\] We use \(EE\) as short for the Expected number of Errors in a read. It is simply the sum of the error probabilities! We will see this again next week when we start processing metabarcoding data.
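As a tiny worked example, a made-up read with only three bases and quality scores 30, 20 and 10 has \(EE = 0.001 + 0.01 + 0.1 = 0.111\), or in R:

q <- c(30, 20, 10)   # quality scores for a toy read
p <- 10^(-q / 10)    # error probabilities: 0.001, 0.010, 0.100
EE <- sum(p)         # expected number of errors: 0.111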
In this exercise we will make a visual display of the expected number of errors for some reads. The exercise also gives you some practice with the kind of R coding listed at the end of this section.
Download (right-click, save as) these two compressed fastq files:
These are examples of Illumina paired-end sequencing data. Here we have amplified a specific piece of DNA from several bacteria in some environment, resulting in many smallish pieces or fragments of double-stranded DNA, typically a few hundred bases long. Then, these double-stranded fragments have been sequenced from both sides, resulting in a read pair from each fragment.

Illumina is often (but not always) paired-end sequencing, i.e. both ends of the genome fragments have been sequenced. This typically results in a pair of fastq files as above. Note that the only difference between the file names is the R1 and R2. The R1 file is often referred to as the R1, the left or the forward reads, and the R2 as the R2, the right or the reverse reads.
Personally, I dislike the use of ‘forward’ and ‘reverse’, because it gives you the impression that all reads in the R1 file are from the forward strand of the genome. They are not! Roughly half the reads are from the forward and half from the reverse strand, in both files. After cutting/amplifying the fragments, we lose all control of forward and reverse strand, and I prefer to simply use the terms R1 and R2 reads. The R1 reads are all sequenced in round 1 of the Illumina sequencing machine, while the R2 reads are all sequenced in round 2. This has nothing to do with strands on the genomes!
Here is what you should do:

- Read the two fastq files into R using readFastq() from the microseq package. You should get 2 tables.
- Rename the columns by adding the suffixes _1 and _2 for the two files, respectively. This means the Header column is named Header_1 in the R1 table and Header_2 in the R2 table.
- Split each Quality text into single characters using str_split(). Store this as new columns Cvec_1 (from Quality_1) and Cvec_2 (from Quality_2). Since str_split() returns a list, these columns become list columns.
- Use the following function to convert the quality characters to phred scores:

# Converts ASCII characters to phred scores
c2s <- function(Cvec){ # Cvec is a vector of characters
  Svec <- numeric(length(Cvec))
  for(i in 1:length(Cvec)){
    Svec[i] <- strtoi(charToRaw(Cvec[i]), 16L) - 33
  }
  return(Svec) # Svec is a vector of integers
}

- Apply this function to the Cvec columns to get new list columns Svec_1 and Svec_2 containing vectors of quality score values (these are called Qvec_1 and Qvec_2 in the solution below).
- Convert the quality scores to error probabilities, stored as new list columns Pvec_1 and Pvec_2.
- Compute the expected number of errors for each read, stored as new columns EE_1 and EE_2.
- Plot the EE values using the ggplot2 package. You choose the proper way to display them, but we should see how the R1 and R2 reads differ. Also, remember these are pairwise data; we should perhaps be able to compare R1 and R2 within each pair?

This exercise will require R coding a little bit beyond the basics from STIN100 or similar courses. It typically makes use of

- lapply() and sapply() for looping, even if a standard for-loop may also be used here
- inline functions, typically used inside the apply-functions, but also in other cases

All these are elements we find useful in bioinformatics. You may consult an AI to learn more about these topics, but make certain you actually learn something. It is of no value to let the AI solve it all without you picking up anything.

One possible solution is shown below.
library(tidyverse)
library(microseq)

# Reading data and renaming
R1.tbl <- readFastq("data/metabarcoding_R1.fastq.gz") |>
  rename(Header_1 = Header,
         Sequence_1 = Sequence,
         Quality_1 = Quality)
R2.tbl <- readFastq("data/metabarcoding_R2.fastq.gz") |>
  rename(Header_2 = Header,
         Sequence_2 = Sequence,
         Quality_2 = Quality)

# Binding data into one table
# and sampling 1000 read pairs at random
pairs.tbl <- bind_cols(R1.tbl, R2.tbl) |>
  slice(sample(1:n(), 1000))

# Splitting Quality texts into single characters
pairs.tbl <- pairs.tbl |>
  mutate(Cvec_1 = str_split(Quality_1, "")) |>
  mutate(Cvec_2 = str_split(Quality_2, ""))

# Including custom function here now
c2s <- function(Cvec){ # Cvec is a vector of characters
  Svec <- numeric(length(Cvec))
  for(i in 1:length(Cvec)){
    Svec[i] <- strtoi(charToRaw(Cvec[i]), 16L) - 33
  }
  return(Svec) # Svec is a vector of integers
}

# Converting single characters to phred scores
pairs.tbl <- pairs.tbl |>
  mutate(Qvec_1 = lapply(Cvec_1, c2s)) |>
  mutate(Qvec_2 = lapply(Cvec_2, c2s))

# Converting phred scores to error probabilities
# using an inline function in lapply
pairs.tbl <- pairs.tbl |>
  mutate(Pvec_1 = lapply(Qvec_1, function(q){return(10^(-q/10))})) |>
  mutate(Pvec_2 = lapply(Qvec_2, function(q){return(10^(-q/10))}))

# Compute expected number of errors
pairs.tbl <- pairs.tbl |>
  mutate(EE_1 = sapply(Pvec_1, sum)) |>
  mutate(EE_2 = sapply(Pvec_2, sum))

# Plotting as points to preserve the paired information
# Sort by EE_1 value to get less noisy output
pairs.tbl |>
  arrange(EE_1) |>
  mutate(pair = 1:n()) |>
  pivot_longer(cols = c(EE_1, EE_2), names_to = "type", values_to = "EE") |>
  ggplot(aes(x = pair, y = EE, color = type)) +
  geom_point(size = 2, alpha = 0.5)
We note that R2 reads in general have higher EE values than R1 reads, as expected from what we know about Illumina data. How big are these values? The largest values seem to be around 10-20, i.e. we must expect 10-20 errors in these reads. If you compute the read lengths you will find they are in most cases around 300 bases. This means that up to around 5% of the bases in a read may be errors, in the worst cases.