The cluster

Access

See Access Baobab.

Support

If you need help, there are several ways to contact us or the community:

  • Forum: hpc-community.unige.ch, the preferred way if the topic could interest other users
  • Email: hpc@unige.ch if it’s a personal issue
  • support-si
  • We can also organise in-person meetings at Uni Dufour; please contact us to schedule an appointment.

When you reach out for help, we always do our best to get back to you as soon as possible, but please bear in mind that:

  • Hundreds of other people use Baobab, and almost every one of them uses a different combination of software (compilers, libraries, specific versions, etc.). We are good, but we don’t have an expert for every piece of software.
  • The best way to get help quickly is to read this documentation or to search/post on the hpc-community forum.
  • When you ask for help, please give us enough details so we know exactly what you did, such as:
    • What you tried, what didn’t work, what result you expected, what error message you got, etc.
    • If this information is missing, most of the time we will simply reply to ask for it!

Example of useful information to provide:

  • Explain what you want to achieve.
  • Explain what you tried.
  • Explain what is not working.
  • Give us the path to the relevant files in Baobab (sbatch, error logs, etc.).

Acknowledgments

If you publish any data computed using the Baobab cluster, we kindly ask you to add the following acknowledgement to your paper:

The computations were performed at University of Geneva on the Baobab cluster.

Resources

The cluster is mainly composed of two head nodes (also called login nodes) and a number of compute nodes. Each node provides 8, 12, 16, 20 or 32 cores. When you submit a job to the cluster, you allocate resources for your work.

You can either request exclusive resources (allocation by node) or shared resources (allocation by core). See Partitions and limits for the choice of a partition.
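
As an illustration, here is a hedged sketch of both allocation styles using partition names from this documentation (the core counts are illustrative assumptions):

# Allocation by core (shared): request a single core
#SBATCH --partition=mono-EL7
#SBATCH --ntasks=1

# Allocation by node (exclusive): request whole nodes
#SBATCH --partition=parallel-EL7
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16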

Compute nodes

Because the cluster has evolved over time, not all nodes are of the same generation. You can see the details in the following table.

generation | model | freq | nb cores | architecture | node | extra flag
V1 | X5550 | 2.67GHz | 8 cores | “Gainestown” (45 nm) | node[072-075] |
V2 | X5650 | 2.67GHz | 12 cores | “Westmere-EP” (32 nm) Efficient Performance | node[076-153] |
V3 | E5-2660V0 | 2.20GHz | 16 cores | “Sandy Bridge-EP” (32 nm) Efficient Performance | node[001-056,058] |
V3 | E5-2670V0 | 2.60GHz | 16 cores | “Sandy Bridge-EP” (32 nm) Efficient Performance | node[059-062,067-070] |
V3 | E5-4640V0 | 2.40GHz | 32 cores | “Sandy Bridge-EP” (32 nm) Efficient Performance | node[057,186,214-215] |
V4 | E5-2650V2 | 2.60GHz | 16 cores | “Ivy Bridge-EP” (22 nm) Efficient Performance | node[063-066,071,154-172] |
V5 | E5-2643V3 | 3.40GHz | 12 cores | “Haswell-EP” (22 nm) Efficient Performance | gpu[002-003] |
V6 | E5-2630V4 | 2.20GHz | 20 cores | “Broadwell-EP” (14 nm) | node[173-185,187-201,205-213],gpu[004-006] |
V6 | E5-2637V4 | 3.50GHz | 8 cores | “Broadwell-EP” (14 nm) | node[218-219] | HIGH_FREQUENCY
V6 | E5-2643V4 | 3.40GHz | 12 cores | “Broadwell-EP” (14 nm) | node[202,204,216-217] | HIGH_FREQUENCY
V6 | E5-2680V4 | 2.40GHz | 28 cores | “Broadwell-EP” (14 nm) | node[203] |
V7 | EPYC-7601 | 2.20GHz | 32 cores | “Naples” (14 nm) | gpu[011] |

The “generation” column is just a way to classify the nodes on Baobab. The following table lists the instruction set extensions supported by each architecture.

architecture | MMX | SSE | SSE2 | SSE3 | SSSE3 | SSE4.1 | SSE4.2 | AVX | F16C | AVX2 | FMA3
Gainestown | YES | YES | YES | YES | YES | YES | YES | NO | NO | NO | NO
Westmere-EP | YES | YES | YES | YES | YES | YES | YES | NO | NO | NO | NO
Sandy Bridge-EP | YES | YES | YES | YES | YES | YES | YES | YES | NO | NO | NO
Ivy Bridge-EP | YES | YES | YES | YES | YES | YES | YES | YES | YES | NO | NO
Haswell-EP | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | NO
Broadwell-EP | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES
Naples | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES | YES

Specify the cpu type you want

It normally does not matter which type of node your job runs on, but in some cases you need to stick to a given CPU model or generation. For example, you can request only nodes with the E5-2660V0 CPU:

srun --constraint=E5-2660V0
or
#SBATCH --constraint=E5-2660V0

Or you can specify that you want a node of generation V3:

srun --constraint=V3
or
#SBATCH --constraint=V3

You can also combine multiple constraints with OR. For example, if you don’t want to use nodes of generation V1:

srun --constraint="V2|V3|V4|V5|V6"
or
#SBATCH --constraint="V2|V3|V4|V5|V6"
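
Putting this together, here is a minimal sketch of an sbatch script using such a constraint (the job name, time and partition are illustrative assumptions):

#!/bin/sh
#SBATCH --job-name=constraint-demo       # hypothetical job name
#SBATCH --partition=shared-EL7           # any partition you have access to
#SBATCH --ntasks=1
#SBATCH --time=15:00
#SBATCH --constraint="V2|V3|V4|V5|V6"    # exclude generation V1 nodes

srun hostname                            # prints which node the job ran on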

Partitions and limits

There are various partitions available to everybody on the cluster, and some others available only to their owners (private). If a partition name ends with the suffix “-EL7”, the nodes in it run CentOS 7. As of August 2019, all nodes run CentOS 7.

Partition | Time limit | Max mem per core | Default mem per core | Allocation per core | Private | Max cores
debug-EL7 | 15 minutes | full node memory | 3GB | yes | no | 32
mono-EL7 | 4 days | 10GB | 3GB | yes | no | 400
parallel-EL7 | 4 days | full node memory | 3GB | no | no | 400
shared-EL7 | 12 hours | full node memory | 3GB | no | no | 400
mono-shared-EL7 | 12 hours | 10GB | 3GB | yes | no | 400
bigmem-EL7 | 4 days | 250GB | 3GB | yes | no | 16
shared-bigmem-EL7 | 12 hours | 500GB | 3GB | yes | no | ~200
shared-gpu-EL7 | 12 hours | full node memory | 3GB | yes | no | ~100

To avoid confusion, private partitions aren’t detailed here.

Hint

To list all the partitions you have access to, simply type sinfo --format=%P

The max mem per core is not enforced; it is only a suggestion. If you need more than 10GB of RAM per core, you should use a bigmem node.
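
For example, a hedged sketch of requesting a large amount of memory per core on a bigmem node (the memory figure is purely illustrative):

#SBATCH --partition=bigmem-EL7
#SBATCH --ntasks=1
#SBATCH --mem-per-cpu=64G    # 64GB of RAM for this single core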

If you need to access GPUs, please see Nvidia CUDA for details.

Have a look at the “status” tab on the baobab.unige.ch web page to see the details of a given partition. If you belong to one of the groups owning a private partition, please request access to the corresponding partition, as it is not granted automatically.

The shared* and mono-shared-EL7 partitions contain all of Baobab’s nodes, including the private ones. When a private node isn’t being used by its owner, it is available through these partitions. Both partitions are therefore heterogeneous: two nodes (node057 and node186) have 32 cores and 512GB of RAM, while the other nodes are Intel machines with 8 to 20 cores and 3-4GB of RAM per core.

If you want to launch a job that requires less than a full node, please use one of the mono* partitions. If you don’t, you will monopolize a full node, which turns out to be very inefficient.

Important

The default partition is debug-EL7.

To see more details about the partitions (default time, etc.):

sinfo
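
If you want a more targeted view, you can pass a format string to sinfo, for example (the chosen format fields are just one possible selection):

sinfo --format="%P %l %D %c"    # partition, time limit, number of nodes, CPUs per node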

The time formats accepted by SLURM are as follows:

minutes
minutes:seconds
hours:minutes:seconds
days-hours
days-hours:minutes
days-hours:minutes:seconds
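
For example, to request one day and six hours (the value is illustrative):

#SBATCH --time=1-06:00:00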

Home directory

In your home directory you can put your data, code, software, etc. However, it is not personal storage for data such as emails, private pictures, etc.

Your home folder is located at $HOME. Please always use the $HOME variable in your scripts instead of the full path. This is good practice and will save you a lot of trouble.
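
For instance, a small sketch of the difference (the paths are purely illustrative):

# Good: portable, keeps working even if the location of your home directory changes
cp "$HOME/myproject/run.sh" "$HOME/scratch/"

# Avoid: hard-coding the absolute path of your home directory in scripts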

For temporary storage, please read about the Scratch directory

Scratch directory

The scratch directory is for storing data that is not unique or that can be regenerated. Please use it to store all the big files that don’t need a backup. We thank you for your cooperation. You will typically use it as temporary storage when your application writes data to disk.

Your scratch folder is located here: $HOME/scratch.

The content of this folder is persistent, but there is no backup.

Backup

Your home directory is backed up every day.

There is no backup of the scratch folder.

File transfer

There is no need to transfer files between the compute nodes and the master, as they share the same storage space.

That said, if you specifically need to transfer something from the login node to the local storage of a compute node, you can use sbcast (but you probably shouldn’t).
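
A minimal sketch of sbcast inside a job script (the file and program names are hypothetical); sbcast copies a file to the local storage of every node allocated to the job:

sbcast "$HOME/input.dat" /tmp/input.dat    # copy the file to the local disk of each allocated node
srun ./my_program /tmp/input.dat           # every task reads its local copy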

Compilers

You have the choice between the FOSS and Intel toolchains.

module | compiler | mpi
foss/2016a | gcc 4.9.3 | openmpi 1.10.2
foss/2016b | gcc 5.4.0 | openmpi 1.10.3
foss/2017a | gcc 6.3.0 | openmpi 2.0.2
foss/2017b | gcc 6.4.0 | openmpi 2.1.1
foss/2018a | gcc 6.4.0 | openmpi 2.1.2
foss/2018b | gcc 7.3.0 | openmpi 3.1.1
foss/2019a | gcc 8.2.0 | openmpi 3.1.3
intel/2016a | icc 16.0.1 | impi 5.1.2
intel/2016b | icc 16.0.3 | impi 5.1.3
intel/2017a | icc 17.0.1 | impi 2017 Update 1
intel/2018a | icc 18.0.1 | impi 2018.1.163

If you want to compile your software against MPI, it is very important not to compile directly with gcc, icc or similar commands, but rather to rely on the wrappers mpicc, mpic++, mpicxx or similar.
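
For example, a short sketch of compiling an MPI program with the FOSS toolchain (the source file name is hypothetical):

module load foss/2019a
mpicc -O2 -o hello_mpi hello_mpi.c    # mpicc wraps gcc and adds the MPI flags and libraries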

All the newer versions of mpi will be available through the use of Module - lmod.

Example for the latest version of gcc:

module load foss

Example for the latest version of icc from Intel:

module load intel

If needed, you can stick to a particular version:

module load foss/2016a

You can see the details of what is loaded in foss or intel by doing:

module list

Intel compiler licenses

If you want to use the Intel compiler, you need to have your own Intel compiler license.

If you are an academic, you can get one free of charge here. You should ask for the “Parallel Studio XE” product. Once you have the license, copy it into a directory named Licenses in your home directory.
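
For example (the license file name is hypothetical):

mkdir -p "$HOME/Licenses"
cp parallel_studio_xe.lic "$HOME/Licenses/"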

Applications

Several applications are installed on the cluster. We describe how to use them, or how to install them yourself when installation is left to the user. See Applications.

Module - lmod

The easiest way to use an application is the “module” command. With “module”, you don’t need to know where the software is physically located (its full path); you can just type the application name as if it were in your PATH (this is indeed what “module” does). The “module” command can also set other important variables for an application, so it’s always a good idea to use it.

Search for an application:

module spider R

The output of this command will indicate that you need two dependencies:

GCC/7.3.0-2.30  OpenMPI/3.1.1

To Load R:

module load GCC/7.3.0-2.30  OpenMPI/3.1.1 R/3.5.1

Then you can just invoke R by typing R (instead of the full path). Of course, you are still required to use SLURM to launch your software.
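
For instance, a minimal sketch of a job script that runs R this way (the partition, time limit and R script name are illustrative assumptions):

#!/bin/sh
#SBATCH --partition=mono-EL7
#SBATCH --ntasks=1
#SBATCH --time=15:00

module load GCC/7.3.0-2.30  OpenMPI/3.1.1 R/3.5.1
srun R CMD BATCH "$HOME/my_analysis.R"    # my_analysis.R is a hypothetical R script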

You can see the help for a particular module:

module help R

Hint

Module versions: if you load a module without specifying a version, you will always get the latest version of the software:

module load foss

If you want to stick to a particular (or legacy) version, you need to specify it when you load it:

module load foss/2017a

Hint

To load some modules automatically at login, you can add something like this to your .bashrc:

if [ -z "$BASHRC_READ" ]; then
  # BASHRC_READ guards against loading the modules again in sub-shells
  export BASHRC_READ=1
  # Place any module commands here
  module load GCC/7.3.0-2.30  OpenMPI/3.1.1 R/3.5.1
fi