class: center, middle

# Intro to Singularity Containers

Pablo Escobar Lopez

https://scicore-unibas-ch.github.io/singularity-slides

https://www.sylabs.io/

---

# License and credits

This work was done by [sciCORE - Center for scientific computing](https://scicore.unibas.ch/) at the [University of Basel](https://www.unibas.ch) and the [SIB Swiss Institute of Bioinformatics](https://www.sib.swiss/)

Except where otherwise noted, this presentation is licensed under
[Creative Commons Attribution 4.0 International (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/)

.small[These slides are based on previous work by Alexander Kashev at [http://www.scits.unibe.ch/](http://www.scits.unibe.ch/)]

---

# Agenda

1. Tutorial intro
2. The problem we're solving
3. Virtual machines vs containers
4. History of containers
5. Docker vs Singularity
6. Singularity workflow
7. Examples

---

## Objective

- Explain the basic technical concepts behind containers
- Differences between virtual machines and containers. Pros and cons
- Differences between Docker and Singularity. Pros and cons
- Some practical examples with Singularity:
  - Execute Singularity containers
  - Create Singularity containers from scratch
  - Share Singularity containers

---

### Requirements to follow the course

- Your laptop with:
  - [VirtualBox](https://www.virtualbox.org/) and [Vagrant](https://www.vagrantup.com)
- Linux shell knowledge

- If you are using a Windows laptop you also need [Cygwin](https://www.cygwin.com/) to have a bash shell. During the Cygwin installation, make sure to install all the ssh-related packages.

---

class: center, middle

# The Problem(s) we would like to solve

---

# Problem (for developers)

Suppose you're writing some software. It works great on your machine.

However, eventually it has to leave your machine: to run on your colleague's machine, or to be deployed in its production environment.

That environment could be a completely different flavour of OS, with a different set of libraries and supporting tools.

It can be difficult to test whether you have accounted for all these variations on your own development system. You may have things in your environment you're not even aware of that make a difference.

Your users could also be less technically inclined to deal with dependencies. You may wish to decrease this friction.

---

# Problem (for researchers)

Suppose you have a piece of scientific software you used to obtain some result.

Then someone halfway across the globe tries to reproduce it, and can't get it to run, or worse, is getting different results for the same inputs. What is to blame?

Or, even simpler: your group tries to use your software a couple of years after you left, and nobody can get it to work.

To enable reproducible science with the help of software packaging, the source code alone might not be enough; the environment should also be predictable.

---

# Problem (for server administrators)

Suppose you have one hundred users, each requesting certain software.

Some of those applications have ancient dependencies that are very hard (or impossible) to compile from source on your system.

Some of those applications only work with Linux distribution "XXX": the developer said "it works on my computer" ;). Of course, your cluster runs a different distro.

Some of the software works with mutually incompatible library versions. Possibly these have known security issues.

And finally, _you most certainly don't trust_ any of this software not to mess up your OS. You've been there before.
---

# What would be a solution?

* A turnkey solution

  A recipe that can build a working instance of your software, reliably and fast.

* BYOE: Bring Your Own Environment

  A way to capture the prerequisites and environment together with the software.

---

class: center, middle

# The Solution(s)

---

# Solution: Virtual Machines?

A virtual machine is an isolated instance of a .highlight[whole other "guest" OS] running under your "host" OS.

A .highlight[hypervisor] is responsible for handling the situations where this isolation causes issues for the guest.

From the point of view of the guest, it runs under its own, dedicated hardware. Hence, it's called .highlight[hardware-level virtualization].

Most* guest/host OS combinations will work: you can run Windows on Linux, Linux on Windows, etc.

------

\* MacOS being a stinker here, of course, with their license.

---

## Virtual machines

.center[(diagram: applications running on guest OSes on top of a hypervisor)]

image source: [http://www.admin-magazine.com/HPC/Articles/Singularity-A-Container-for-HPC](http://www.admin-magazine.com/HPC/Articles/Singularity-A-Container-for-HPC)

---

# Virtual Machines: the good parts

.small[
* The BYOE principle is fully realized

  Whatever your environment is, you can package it fully, OS and everything.

* Easy to precisely measure out resources

  The contained application, together with its OS, has restricted access to hardware: you measure out its disk, memory and allotted CPU.

* More flexible: more OSes are supported

* Security risks are truly minimized

  A very narrow and secured bridge between the guest and the host means little opportunity for a bad actor to break out of isolation.
]

---

# Virtual machines: the not so good parts

* Operational overhead

  For every piece of software, the full underlying OS has to be run.

* Setup overhead

  Starting and stopping a virtual machine is not very fast, and/or requires saving its state.

* Hardware availability

  The isolation between the host and the guest can hinder access to specialized hardware on the host system (e.g. GPUs).

* There are a lot of different hypervisors (e.g. Xen, KVM, VirtualBox). There is very little chance that you get a hypervisor installed in your HPC cluster.

* No integration with HPC schedulers (Slurm, SGE)

---

# Solution: Containers (on Linux)?

If your host OS is Linux and your software expects Linux, there's a more direct and lightweight way to reach similar goals.

Recent kernel advances allow the isolation of processes from the rest of the system, presenting them with their own view of the system.

You can package other Linux distributions, and with the exception of the host kernel, the entire userland can be different for the process.

From the point of view of the application, it's running on the same hardware as the host; hence containers are called .highlight[operating-system-level virtualization].

---

## Containers

.center[(diagram: containerized applications sharing the host kernel)]

image source: [http://www.admin-magazine.com/HPC/Articles/Singularity-A-Container-for-HPC](http://www.admin-magazine.com/HPC/Articles/Singularity-A-Container-for-HPC)

---

# Containers: the good parts

* Lower operational overhead

  You don't need to run a whole second OS to run an application.

* Lower startup overhead

  Setup and teardown of a container is much less costly.

* More hardware flexibility

  You don't have to dedicate a set portion of memory to your VM well in advance, or contain your files in a fixed-size filesystem.

  Also, the level of isolation is up to you. You may present devices on the system directly to containers if you so desire, e.g. GPUs.

* Integration with HPC schedulers is trivial

---

# Containers: the not so good parts

.small[
* Kernel compatibility

  The kernel is shared between the host and the container, so there may be some incompatibilities. Plus, container support is (relatively) new, so it needs a recent kernel on the host.

* Security concerns

  The isolation is thinner than in the VM case. The host OS kernel and hardware are directly exposed, so the attack surface is larger.

* Linux on Linux

  Containers are inherently a Linux technology. You need a Linux host (or a Linux VM) to run containers, and only Linux software can run inside them.
]
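---

## Aside: peeking at the kernel isolation by hand

The kernel isolation just described can be poked at directly. A minimal, optional sketch using `unshare` from util-linux (not part of the course exercises; run as root inside the Vagrant VM):

```terminal
root@vm:~# unshare --fork --pid --mount-proc /bin/bash
root@vm:~# ps aux   # in the new PID namespace, this bash is PID 1
root@vm:~# exit     # leave the namespace again
```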
---

# History of containers

The idea of running an application in a different environment is not new to UNIX-like systems.

Perhaps the first effort in that direction is the `chroot` command and concept (1982): presenting applications with a different view of the filesystem (a different `/`, root directory).

This minimal isolation was improved in FreeBSD with `jail` (2000), separating other resources (processes, users) and restricting how applications can interact with each other and the kernel.

Linux developed facilities for isolating and controlling access to processes with namespaces (2002) and cgroups (2007).

Those facilities led to the creation of solutions for containerization, notably LXC (2008), Docker (2013) and Singularity (2016).

---

## Main technologies used to create containers

All the recent container solutions (LXC, Docker, Singularity, etc.) use namespaces and cgroups to isolate the containers from the host.

These two technologies (namespaces and cgroups) are provided by the Linux kernel.

The container solution "only" provides an easy way for the end user to automate the creation of the required namespaces and cgroups.

In the case of Docker it's the Docker daemon that does the job. In Singularity it's a setuid binary that does it.

---

# Docker

Docker came about in 2013 and has since been on a meteoric rise as the gold standard for containerization technology.

A huge number of tools are built around Docker to build, run, orchestrate and integrate Docker containers.

Many cloud service providers can directly integrate Docker containers.

Docker encourages splitting software into microservice chunks that can be portably used as needed.

---

## Docker concerns

Docker uses a pretty complicated model of images/volumes/metadata, and it is not always very transparent how this is stored.

Also, its isolation features require superuser privileges: Docker has a persistent daemon running with root privileges, and many container operations require root as well. In practice, .highlight[a user who can run Docker containers can get root access].

Docker is mainly designed and used to run [microservices](https://en.wikipedia.org/wiki/Microservices) like web servers, databases, mail servers, etc., and not to run command-line utilities as most users usually do in an HPC cluster.

Microservices also require orchestration tools like [docker-compose](https://docs.docker.com/compose/), [docker swarm](https://docs.docker.com/engine/swarm/) or [kubernetes](https://kubernetes.io/)

---

## Docker concerns

The user IDs inside and outside Docker containers are not easily mapped. This means that processes inside the container can run with a different UID than the user who is executing the container. Because of this, the generated files won't be accessible to the user "outside" the container.

These concerns make Docker undesirable in multi-user, shared HPC environments.

Out of these concerns, and the scientific community, came Singularity.

---

# Singularity

Singularity is quite similar in principle to Docker. In fact, it's pretty straightforward to convert a Docker container to a Singularity image.

Singularity uses a monolithic, image-file based approach. Instead of the dynamically overlaid "layers" of Docker, you have a single file you can build once and simply copy over to the target system.

The privilege problem was a concern from the ground up, and was solved by having a `setuid`-enabled binary that can accomplish container startup - and drop privileges completely as soon as practical.

---

# Singularity

Privilege elevation inside a Singularity container is impossible (in theory ;): to be root inside, you have to be root outside.
Users don't need explicit root access to execute existing containers.

Singularity automatically maps the user IDs: all the processes inside the container run with the same user ID as the user executing the container.

This means that the files generated inside the container will also have the right access permissions outside the container.

---

# Singularity and HPC

root is only required for the initial build of the container, but you can do this in a virtual machine (as we will do in this course) and then move the container to your HPC cluster. .highlight[You don't need root to execute the containers]

If the container is already available in a registry (a containers repository), you can directly download it to your HPC cluster without root permissions.

Thanks to the above improvements over Docker, HPC cluster operators are much more welcoming to the idea of Singularity support.

Once your software is packaged in Singularity, it should work across all Science IT platforms supporting the technology.

---

### Performance comparison: native VS Singularity in the sciCORE cluster

### NGS pipeline without containers

```terminal
user@scicore:~$ sacct -j 86719 -o State,ExitCode,Elapsed,MaxRSS
     State ExitCode    Elapsed     MaxRSS
---------- -------- ---------- ----------
 COMPLETED      0:0   00:55:47
 COMPLETED      0:0   00:55:47  31754776K (31.754776GB)
```

### NGS pipeline using Singularity containers

```terminal
user@scicore:~$ sacct -j 86720 -o State,ExitCode,Elapsed,MaxRSS
     State ExitCode    Elapsed     MaxRSS
---------- -------- ---------- ----------
 COMPLETED      0:0   00:58:46
 COMPLETED      0:0   00:58:46  31755136K (31.755136GB)
```

---

class: center, middle

### Performance benchmark: Singularity VS Docker VS native

.center[(benchmark plot, see the footnote link)]

.footnote[https://twitter.com/ljdursi/status/827186459975286787]

---

# Singularity workflow

.center[(diagram: build the container locally with root, run it on the cluster without root)]

.small[The general idea: prepare the container on a local machine with full control, then move it to the cluster and run it there without root. You can use scp, or upload it to a containers registry.

If you upload the container image to a registry like [docker hub](https://hub.docker.com/), [singularity hub](https://singularity-hub.org/) or a private registry, you can directly download the container to the cluster as a regular user.
]

---

### Some notes about the shell exercises

.small[Every shell prompt is in the format `USER@MACHINE:WORKING_DIR`]

.small[Some examples:]

.small[Username "user" on machine "laptop" in folder "~" (~ means $HOME)]

```terminal
user@laptop:~$ mkdir ~/singularity-vm
```

.small[Username "vagrant" on machine "vm" (virtual machine) in folder "~"]

```terminal
vagrant@vm:~$ id
```

.small[Username "root" on machine "vm" in folder "~/singularity-vm"]

```terminal
root@vm:~/singularity-vm#
```
---

### Installing Singularity in your laptop

.small[In fact we will install Singularity inside a VirtualBox VM running on your laptop.

First boot an Ubuntu 16.04 VM and test that you can ssh into it and become root]

```terminal
user@laptop:~$ mkdir ~/singularity-vm
user@laptop:~$ cd ~/singularity-vm
user@laptop:~/singularity-vm$ vagrant init ubuntu/xenial64
user@laptop:~/singularity-vm$ vagrant up
user@laptop:~/singularity-vm$ vagrant ssh
vagrant@vm:~$ id
uid=1000(vagrant) gid=1000(vagrant) groups=1000(vagrant)
vagrant@vm:~$ sudo -s
root@vm:~# id
uid=0(root) gid=0(root)
root@vm:~# exit
vagrant@vm:~$ exit
user@laptop:~/singularity-vm$
```

---

## Installing Singularity in your laptop

.small[Verify that the folder /vagrant inside the VM is mapped to the local folder ~/singularity-vm on your laptop

```terminal
user@laptop:~$ cd ~/singularity-vm
user@laptop:~/singularity-vm$ touch ~/singularity-vm/dummy.file
user@laptop:~/singularity-vm$ vagrant ssh
vagrant@vm:~$ ls /vagrant
Vagrantfile  dummy.file  ubuntu-xenial-16.04-cloudimg-console.log
vagrant@vm:~$ touch /vagrant/dummy2.file
vagrant@vm:~$ exit
user@laptop:~/singularity-vm$ ls ~/singularity-vm
dummy2.file  dummy.file  ubuntu-xenial-16.04-cloudimg-console.log  Vagrantfile
```

We will use this shared folder to copy files between the VM and the laptop. We can also use it to edit files inside the VM with our preferred editor running on the laptop.

.highlight[Don't generate your containers directly in /vagrant. Generate the containers in /home/vagrant and then copy them to /vagrant if needed]
]

---

## Installing Singularity in your laptop

Copy&paste this in your laptop's terminal to install Singularity in the VM:

```terminal
user@laptop:~$ cd ~/singularity-vm/
user@laptop:~/singularity-vm$ vagrant ssh -c /bin/bash <
```
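A typical Singularity 2.x source install, roughly the kind of script piped into the VM here, looks like this (a sketch only; the version number, package list and download URL are assumptions, check the official installation docs):

```terminal
# sketch: Singularity 2.x source install inside the VM
sudo apt-get update
sudo apt-get install -y build-essential libarchive-dev squashfs-tools wget
VERSION=2.6.0   # assumed version for this course
wget https://github.com/sylabs/singularity/releases/download/$VERSION/singularity-$VERSION.tar.gz
tar xzf singularity-$VERSION.tar.gz
cd singularity-$VERSION
./configure --prefix=/usr/local
make
sudo make install
```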
---

# Using Singularity

Singularity commands have the following format:

```
singularity [global options...] <command> [command options...] ...
```

Singularity is pretty sensitive to the order of those. Use `singularity help [command]` to check the built-in help.

You can find the configuration of Singularity under `/usr/local/etc/singularity`

---

# Container images

A container needs to be somehow bootstrapped to contain a base operating system before further modifications can be made.

A container image is a filesystem tree that will be presented to the applications running inside it.

A Docker container is built with a series of _layers_ that are stacked upon each other to form the filesystem. Singularity collapses those into a single, portable file.

---

### Pulling a pre-built Docker image to test the basic functionality of Singularity

We will do this exercise in the VM:

.small[
```terminal
user@laptop:~/singularity-vm$ vagrant ssh
vagrant@vm:~$ singularity pull docker://centos:6
WARNING: pull for Docker Hub is not guaranteed to produce the
WARNING: same image on repeated pull. Use Singularity Registry
WARNING: (shub://) to pull exactly equivalent images.
Docker image path: index.docker.io/library/centos:6
Cache folder set to /home/vagrant/.singularity/docker
[1/1] |===================================| 100.0%
Importing: base Singularity environment
Importing: /home/vagrant/.singularity/docker/sha256:ca9499a209fd9abe7f919a5c99989fd26d3410164e499fe577be504341f0a352.tar.gz
Importing: /home/vagrant/.singularity/metadata/sha256:b4d98049dd466efa5cfbf4b07aa672ced350fdbec3cd2faa139b194623df29ef.tar.gz
WARNING: Building container as an unprivileged user. If you run this container as root
WARNING: it may be missing some functionality.
Building Singularity image...
Singularity container built: ./centos-6.simg
Cleaning up...
```
]

.small[This will download the layers of the Docker container to your machine and assemble them into an image. Note that this .highlight[does not require `sudo` or Docker]!]

---

# Entering a shell in the container

To test our freshly-created container, we can invoke an interactive shell to explore it with .highlight[`shell`]:

```terminal
vagrant@vm:~$ singularity shell centos-6.simg
Singularity: Invoking an interactive shell within container...

Singularity centos-6.simg:~>
```

At this point, you're within the environment of the container. We can verify we're "running" CentOS:

```terminal
Singularity centos-6.simg:~> cat /etc/centos-release
CentOS release 6.10 (Final)
```

---

# User/group within the container

Inside the container, we are the same user:

```terminal
Singularity centos-6.simg:~> whoami
vagrant
Singularity centos-6.simg:~> groups
vagrant
```

.small[We also have the same groups. That way, if any host resources are mounted in the container, we have the same access privileges. You also have access from outside the container to files generated inside it. As you can imagine, this is quite convenient when using Singularity in an HPC cluster.

One of the biggest problems when working with Docker is dealing with different user UIDs inside and outside the container, but this is "automagically" solved by Singularity, so you don't need to worry about it.

If we launch `singularity` with `sudo`, we would be `root` inside the container.
]
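A quick way to convince yourself of the UID mapping (a suggested check, not one of the original exercises):

```terminal
vagrant@vm:~$ id -u                                   # your UID on the host
vagrant@vm:~$ singularity exec centos-6.simg id -u    # the same UID inside
vagrant@vm:~$ singularity exec centos-6.simg touch /tmp/made-inside
vagrant@vm:~$ ls -l /tmp/made-inside                  # owned by vagrant, not root
```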
---

# Default folder mounts

.small[Additionally, your home folder and the folder we've invoked Singularity from are accessible inside the container (by default):

```
Singularity centos-6.simg:~> ls ~
centos-6.simg  src
```

We have access to the bound folders with the same rights as outside, so we can, for example, write to:

```
Singularity centos-6.simg:~> touch ~/test_container
Singularity centos-6.simg:~> exit
vagrant@vm:~$ ls ~/test_container
/home/vagrant/test_container
```

The current working directory inside the container is the same as outside at launch time.

Note for sysadmins: check `/usr/local/etc/singularity/singularity.conf` for the defaults.
]

---

# Custom folder mounts

You can mount custom folders inside the container using the `--bind` flag:

```terminal
vagrant@vm:~$ touch /var/tmp/file.dummy
vagrant@vm:~$ singularity shell -B /var/tmp:/var/tmp:rw centos-6.simg
Singularity: Invoking an interactive shell within container...

Singularity centos-6.simg:~> ls /var/tmp/
file.dummy
```

---

# Running a command directly

Besides the interactive shell, we can execute any command inside the container directly with .highlight[`exec`]:

```
vagrant@vm:~$ singularity exec centos-6.simg cat /etc/centos-release
CentOS release 6.9 (Final)
```

.exercise[
Invoke the `python` interpreter with `exec`.
]

---

# STDIO with container processes

Standard input/output are processed as normal by Singularity. You can redirect them:

```
vagrant@vm:~$ singularity exec centos-6.simg echo Boo! > ~/test_container
vagrant@vm:~$ singularity exec centos-6.simg cat < ~/test_container
Boo!
```

You can use containers in pipelines:

```
$ singularity exec centos-6.simg echo Boo! | singularity exec centos-6.simg cat
Boo!
```

.exercise[
Count the number of words in the host's `ls /etc` output using the container's copy of `wc`, then the other way around. Hint:

```
ls /etc | wc -w
```
]

---

## STDIO with containers. A real world example

```bash
#!/bin/bash

module load Singularity/2.4

RefGenome='~/scicore-pipeline2-input-data/RefGenome/igenomes/Saccharomyces_cerevisiae/Ensembl/R64-1-1/Sequence/'

echo 'SAM to sorted BAM...'
singularity exec pipeline2.simg samtools view -bS output/bowtie2/yeast_reseq_ds-bt2aln.sam | \
  singularity exec pipeline2.simg samtools sort - --threads ${SLURM_CPUS_PER_TASK} -o output/samtools/yeast_reseq_ds-bt2aln.s.bam

echo 'calling variants with FreeBayes...'
singularity exec pipeline2.simg freebayes -f $RefGenome/WholeGenomeFasta/genome.fa \
  output/samtools/yeast_reseq_ds-bt2aln.s.bam > output/freebayes/yeast_reseq_ds-bt2aln.s.vcf
```

---

## Creating your container from scratch

### Option 1: Create a writable Singularity image for development

.small[First create an [empty writable image](https://www.sylabs.io/guides/2.6/user-guide/quick_start.html#writable-image) 2048MB in size and initialize it with a CentOS 7 Docker image:

```
vagrant@vm:~$ sudo singularity image.create --size 2048 centos7.simg
vagrant@vm:~$ sudo singularity build --writable centos7.simg docker://centos:7
vagrant@vm:~$ sudo singularity shell --writable centos7.simg
Singularity centos7.simg:~> whoami
root
```

Let's get a missing application, `fortune`. It's not available in the default distribution, so we will need to enable a community repository and install it:

```
Singularity centos7.simg:~> yum -y --enablerepo=extras install epel-release
Singularity centos7.simg:~> yum -y install fortune-mod
```

.exercise[
Follow those steps, exit the container, and try out `fortune` with `exec`.
]
]
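One possible solution to the exercise above (the output will vary):

```terminal
Singularity centos7.simg:~> exit
vagrant@vm:~$ singularity exec centos7.simg fortune
```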
---

## Creating your container from scratch

### Option 2: Create a sandbox environment for development

.small[Another approach is to create a [sandbox](https://www.sylabs.io/guides/2.6/user-guide/quick_start.html#sandbox-directory) for development. This is the approach recommended by the Singularity developers.

A [sandbox](https://www.sylabs.io/guides/2.6/user-guide/quick_start.html#sandbox-directory) is just a plain folder which contains all the files of the container. One of the advantages of this approach is that you don't need to define a fixed image size in advance.
]

```terminal
vagrant@vm:~$ sudo singularity build --sandbox centos7_sandbox docker://centos:7
vagrant@vm:~$ ls centos7_sandbox/
vagrant@vm:~$ sudo singularity shell --writable centos7_sandbox/
Singularity centos7_sandbox:~> whoami
root
Singularity centos7_sandbox:~> cat /etc/redhat-release
CentOS Linux release 7.4.1708 (Core)
```

---

# Giving a purpose to the container

A container can have a "default" command which is run without specifying it. Inside the container, it's `/singularity`. Let's try modifying it:

```
vagrant@vm:~$ sudo singularity exec -w centos7.simg vi /singularity
```

By default you'll see the following:

```bash
#!/bin/sh

exec /bin/bash
```

This is a script that will execute `/bin/bash`.

---

# Giving a purpose to the container

We installed `fortune`, so let's use that instead by modifying `/singularity` like this:

```bash
#!/bin/sh

exec /usr/bin/fortune "$@"
```

.exercise[
Make the same modification to your container.
]

Now we can invoke it with .highlight[`run`] or even by .highlight[running the image]:

```terminal
vagrant@vm:~$ singularity run centos7.simg
[..some wisdom or humor..]
vagrant@vm:~$ ./centos7.simg
[..some more wisdom or humor..]
```

---

# Making the container reproducible

Instead of taking some base image and making changes to it by hand, we want to make this build process reproducible. This is achieved with [Singularity recipes](https://www.sylabs.io/guides/2.6/user-guide/quick_start.html#singularity-recipes).

Let's try to retrace our steps to obtain a fortune-telling CentOS container.

.exercise[
Open a file called "fortune.def" in an editor, and prepare to copy along.
]

---

# Bootstrapping

The definition file starts with a [header section](https://www.sylabs.io/guides/2.6/user-guide/container_recipes.html#header). The key part of it is the `Bootstrap:` configuration.

There are currently 2 types of bootstrap methods:

* using `yum`/`apt`/`pacman` on the host system to bootstrap a similar one
* pulling a Docker image

We'll be using the Docker method.

```
Bootstrap: docker
From: centos:7
```

---

# Setting up the container

There are 2 sections for setup commands (essentially shell scripts):

1. [%setup](https://www.sylabs.io/guides/2.6/user-guide/container_recipes.html#setup) for commands to be executed .highlight[outside the container].

   You can use `$SINGULARITY_ROOTFS` to access the container's filesystem, as it is mounted on the host during the build.

2. [%post](https://www.sylabs.io/guides/2.6/user-guide/container_recipes.html#post) for commands to be executed .highlight[inside] the container.

   This is a good place to set up the OS, such as installing packages.
---

# Setting up the container

Let's save the name of the build host and install `fortune`:

```
Bootstrap: docker
From: centos:7

%setup
    hostname -f > $SINGULARITY_ROOTFS/etc/build_host

%post
    yum -y --enablerepo=extras install epel-release
    yum -y install fortune-mod
```

---

# Adding files to the container

An additional section, [%files](https://www.sylabs.io/guides/2.6/user-guide/container_recipes.html#files), allows you to copy files or folders to the container.

We won't be using it here, but the format is as follows (like `cp`, but the destination is inside the container):

```
%files
    some/file /some/other/file
    some/path/
    some/directory some/path/
```

Note that this happens _after_ `%post`. If you need the files earlier, copy them manually in `%setup`.

---

# Setting up the environment

You can specify a script to be sourced when something is run in the container. This goes in the [%environment section](https://www.sylabs.io/guides/2.6/user-guide/container_recipes.html#environment). Treat it like `.bash_profile`.

```
%environment
    export HELLO=World
```

The whole environment is defined in profile files in the folder `/.singularity.d/env/` inside the container.

Note that the host environment variables are passed on by default as well, unless `-e | --cleanenv` is specified.

.small[Feel free to explore the contents of the folder `/.singularity.d/` ;)]

---

# Setting up the runscript

The runscript (`/singularity`) is specified in the [%runscript](https://www.sylabs.io/guides/2.6/user-guide/container_recipes.html#runscript) section.

Let's use the file we created in `%setup` and run `fortune`:

```
%runscript
    read host < /etc/build_host
    echo "Hello, $HELLO! Fortune Teller, built by $host"
    exec /usr/bin/fortune "$@"
```

---

# Testing the built image

You can specify commands to be run at the end of the build process, inside the container, to perform sanity checks.

Use the [%test](https://www.sylabs.io/guides/2.6/user-guide/container_recipes.html#test) section for this:

```
%test
    test -f /etc/build_host
    test -f /usr/bin/fortune
```

All commands must return successfully or the build will fail.

---

# The whole definition file

```
Bootstrap: docker
From: centos:7

%setup
    hostname -f > $SINGULARITY_ROOTFS/etc/build_host

%post
    yum -y --enablerepo=extras install epel-release
    yum -y install fortune-mod

%environment
    export HELLO="World"

%runscript
    read host < /etc/build_host
    echo "Hello, $HELLO! Fortune Teller, built by $host"
    exec /usr/bin/fortune "$@"

%test
    test -f /etc/build_host
    test -f /usr/bin/fortune
```

.exercise[
Check that your `fortune.def` is the same as above.
]

---

# Building a container from a definition file

[build](https://www.sylabs.io/guides/2.6/user-guide/build_a_container.html) the container using the definition (.highlight[requires root]):

```
vagrant@vm:~$ sudo singularity build fortune.simg fortune.def
```

.exercise[
1. Create the container as shown above.
2. Test running it directly.
]

---

### Building a "real world" container

Download this Singularity [definition file](https://github.com/scicore-unibas-ch/singularity-slides/blob/master/pipeline1.def)

.exercise[
Build this container and try to execute the STAR and htseq-count utilities using "singularity exec"
]
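One way to approach this exercise (a sketch; the raw-file URL is derived from the repository link above):

```terminal
vagrant@vm:~$ wget https://raw.githubusercontent.com/scicore-unibas-ch/singularity-slides/master/pipeline1.def
vagrant@vm:~$ sudo singularity build pipeline1.simg pipeline1.def
vagrant@vm:~$ singularity exec pipeline1.simg STAR --version
vagrant@vm:~$ singularity exec pipeline1.simg htseq-count -h
```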
---

## Host resources

A Singularity container can have more host resources exposed. To provide access to more directories, one can specify bind options at runtime with `-B`:

```
-B source[:destination[:mode]]
```

where .highlight[source] is the path on the host, .highlight[destination] is the path in the container (if different) and .highlight[mode] is optionally `ro` if you don't want to give write access.

Sysadmins can define default mounts in the global Singularity config file (e.g. always mount /local_scratch to /tmp inside the container).

Additionally, devices on the host can be exposed, e.g. the [GPU](https://www.sylabs.io/guides/2.6/user-guide/faq.html#does-singularity-support-containers-that-require-gpus)

OpenMPI should also work. [Check the official docs](https://www.sylabs.io/guides/2.6/user-guide/faq.html#can-i-containerize-my-mpi-application-with-singularity-and-run-it-properly-on-an-hpc-system)

---

# X forwarding inside containers

Quoting the [official docs](https://www.sylabs.io/guides/2.6/user-guide/faq.html#can-i-run-x11-apps-through-singularity):

> Can I run X11 apps through Singularity? Yes. This works exactly as you would expect it to.

Assuming you have X forwarding enabled in your current ssh session, you only need to launch the graphical application inside the container.

---

# Fuller isolation

By default, a container is allowed a lot of "windows" into the host system (dictated by the Singularity configuration).

For an untrusted container, you can further restrict this with options like `--contain` and `--containall`.

In this case, you have to manually define where standard binds like the home folder or `/tmp` point.

See `singularity help run` for more information.

---

### Integration with Slurm (or other schedulers)

.small[This is an example of what a Slurm submit script could look like]

```bash
#!/bin/bash

#SBATCH --job-name=using-singularity
#SBATCH --cpus-per-task=8
#SBATCH --mem=10G

module load Singularity/2.4

singularity exec pipeline2.simg samtools view -bS yeast_reseq_ds-bt2aln.sam | \
  singularity exec pipeline2.simg samtools sort - --threads ${SLURM_CPUS_PER_TASK} -o yeast_reseq_ds-bt2aln.s.bam
```

.small[This is the same job without Singularity]

```bash
#!/bin/bash

#SBATCH --job-name=without-containers
#SBATCH --cpus-per-task=8
#SBATCH --mem=10G

module load SAMtools/1.7-goolf-1.7.20

samtools view -bS yeast_reseq_ds-bt2aln.sam | \
  samtools sort - --threads ${SLURM_CPUS_PER_TASK} -o yeast_reseq_ds-bt2aln.s.bam
```

---

# Scientific Filesystem (SCIF) Apps

.small[[The Scientific Filesystem (SCIF)](https://www.sylabs.io/guides/2.6/user-guide/reproducible_scif_apps.html) provides internal modularity for containers, and it makes it easy for the creator to give the container implied metadata about its software.

For example, this metadata can define the list of applications available inside the container, so the user doesn't need to know the details of the container internals.

In this example we list all the utilities available in a container named "pescobar-OpenStructure-Singularity-master.simg". This container can be downloaded from the [singularity hub](https://www.singularity-hub.org/collections/607)
]

```terminal
$ singularity apps pescobar-OpenStructure-Singularity-master.simg
chemdict_tool
lddt
molck
ost
tmalign
tmscore
```

---

# Scientific Filesystem (SCIF) Apps

.exercise[
Download this Singularity [definition file](https://github.com/scicore-unibas-ch/singularity-slides/blob/master/pipeline1.def) and modify it so that "singularity apps" lists the available applications in the container (STAR and htseq-count).
]

Hint: you have to add some `%apprun` lines. See the [docs](https://www.sylabs.io/guides/2.6/user-guide/reproducible_scif_apps.html#sections)
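As a further hint, `%apprun` sections look roughly like this (a sketch; adapt the commands to what the definition file actually installs):

```
%apprun STAR
    exec STAR "$@"

%apprun htseq-count
    exec htseq-count "$@"
```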
---

# Running services

Singularity 2.4 introduces the ability to run [container instances](https://www.sylabs.io/guides/2.6/user-guide/running_services.html), allowing you to run services (e.g. Nginx, MySQL, etc.) using Singularity.

.exercise[
Read the docs about [Singularity instances](https://www.sylabs.io/guides/2.6/user-guide/running_services.html) and do the [hello world](https://www.sylabs.io/guides/2.6/user-guide/running_services.html#nginx-hello-world-in-singularity) exercise to boot an Nginx webserver with Singularity.
]

---

# Distributing the container

## The easiest: just copy it

Using the container after creation on another Linux machine is simple: you just copy the image file there. You can use any file transfer tool, e.g. scp or ftp, or even a USB stick.

Note that you can't run the image file on a host without Singularity installed!

---

# Distributing the container

### Using Singularity Hub

[Singularity Hub](https://singularity-hub.org/) allows you to build your containers in the cloud from Bootstrap files, which you can then simply `pull` on a target host.

This requires a GitHub repository with a `Singularity` definition file.

After creating an account on Singularity Hub and connecting it to your GitHub account, you can select a repository and branches from GitHub to be built.

.small[Example:

[https://github.com/pescobar/OpenStructure-Singularity](https://github.com/pescobar/OpenStructure-Singularity)

[https://www.singularity-hub.org/collections/607](https://www.singularity-hub.org/collections/607)
]

---

# Distributing the container

### Using Singularity Hub

Once your container is in Singularity Hub you can pull the result:

```terminal
vagrant@vm:~$ singularity pull shub://pescobar/OpenStructure-Singularity
vagrant@vm:~$ ./pescobar-OpenStructure-Singularity-master-latest.simg
```

You can check the [official docs](https://github.com/singularityhub/singularityhub.github.io/wiki/Build-A-Container) to learn how to upload your images to the official Singularity Hub.

You can also host your own private [singularity hub](https://github.com/singularityhub/sregistry)

Optional exercise: upload your first container to Singularity Hub

---

# Docker and Singularity

Instead of writing a Singularity definition file, you may write a `Dockerfile`, build a Docker container and convert that.

Pros:

* More portable: for some, using Docker or some other container solution is preferable.

Cons:

* Blackbox: Singularity understands less about the build process, in terms of container metadata.
* Complexity: another tool to learn if you don't know Docker already.
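A minimal sketch of that workflow, assuming Docker on your laptop and a Docker Hub account (the image name `myuser/myapp` is illustrative):

```terminal
user@laptop:~$ docker build -t myuser/myapp:latest .   # build from your Dockerfile
user@laptop:~$ docker push myuser/myapp:latest         # publish to Docker Hub
vagrant@vm:~$ singularity pull docker://myuser/myapp:latest
```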
---

# Docker -> Singularity

If you have a Docker image you want to convert to Singularity, you have at least two options:

1. Upload the image to a Docker Registry (such as Docker Hub) and `pull` from there.

2. Convert it locally with Docker and `docker2singularity`:

   https://github.com/singularityware/docker2singularity

   https://hub.docker.com/r/vanessa/singularity/

Another work in progress: [https://github.com/dctrud/docker2singularity-go](https://github.com/dctrud/docker2singularity-go)

---

# Docker -> Singularity

.exercise[
Execute a Docker container from the [BioContainers project](https://biocontainers.pro/)
]

```terminal
vagrant@vm:~$ singularity pull docker://biocontainers/blast
vagrant@vm:~$ singularity exec blast.simg blastx -h
```

---

# More examples

## Using Conda inside Singularity

Check this Singularity [definition file](https://github.com/scicore-unibas-ch/singularity-slides/blob/master/singularity_conda.def)

.exercise[
Download and build the Singularity/Conda container
]

.exercise[
Extend the container to install some of your preferred bioinformatics tools from [BioConda](https://bioconda.github.io/)
]

---

# More examples

## Using EasyBuild inside Singularity

[EasyBuild](https://easybuilders.github.io/easybuild/) is a tool to build scientific software. By building your software inside a Singularity container, you keep full control over which dependencies your tool uses.

See an example container [here](https://github.com/pescobar/singularity-easybuild)

Optional exercise: build this container and use EasyBuild inside it to build an application

---

# Extra exercises

.exercise[
Try to convert one of your pipelines to Singularity
]

---

## Some other interesting tutorials online

### Another intro to Singularity

[https://singularity-tutorial.github.io/](https://singularity-tutorial.github.io/)

### SCI-F Apps interactive tutorial

[https://sci-f.github.io/tutorials](https://sci-f.github.io/tutorials)

### Tutorial using Singularity, SCIF and SnakeMake

- [https://github.com/sci-f/snakemake.scif](https://github.com/sci-f/snakemake.scif)
- [https://sci-f.github.io/snakemake.scif/](https://sci-f.github.io/snakemake.scif/)