Getting information
squeue      # view the queue
sprio       # see what priority each queued job has
sshare -a   # see the fairshare numbers
sinfo       # see the system state
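If you want to narrow these down, the standard SLURM flags apply. A sketch (the job id here is illustrative):
squeue -u $USER     # only your own jobs
squeue -j 1234      # one particular job
sinfo -N -l         # one line per node, with state and other details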
Running a job
SLURM is slightly different to Torque in that it has both jobs and job steps. A 'job' is an allocation of resources to an account for a time. A 'job step' is a task that runs inside the allocation. You can launch multiple job steps within a job if you want, although probably most of the time you'll just want one.
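As a minimal sketch of the distinction (the command is only an illustration, using the same GPU request as the examples below):
sbatch --gres=gpu:1 --wrap 'srun hostname'   # sbatch creates the job (the allocation); the srun inside it runs as a job step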
As the pat cluster is a GPU cluster, it makes sense to schedule by GPUs rather than CPUs, so the examples concentrate on GPUs.
Interactive jobs
All of these will run for the maximum allowed time, three days. Within the job the variable CUDA_VISIBLE_DEVICES will be set to the appropriate value for the assigned GPU.
srun --pty --gres=gpu:1 -u bash -i                             # one single GPU
srun --pty --gres=gpu:1 -C maxwell -u bash -i                  # one Maxwell GPU
srun --pty --gres=gpu:1 -C titanblack -u bash -i               # one Titan Black GPU
srun --pty --gres=gpu:1 -w compute-titanblack-0-6 -u bash -i   # one GPU on the node called compute-titanblack-0-6
srun --pty --gres=gpu:1 -C happy -u bash -i                    # any one GPU on the node that used to be called 'happy' - old node names are now set as 'features'
To change the GPU mode issue the appropriate nvidia-smi command within the job:
sudo nvidia-smi -c 3 -i $CUDA_VISIBLE_DEVICES # set to exclusive
sudo nvidia-smi -c 0 -i $CUDA_VISIBLE_DEVICES # set to default
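To check which mode the GPU is currently in, a standard nvidia-smi query works within the job:
nvidia-smi -q -d COMPUTE -i $CUDA_VISIBLE_DEVICES   # report the current compute mode of the assigned GPU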
Cancelling
scancel <jobid>
Batch jobs
Run a batch job with sbatch <scriptname>.
Example batch script
#!/bin/bash
#SBATCH --job-name=cudamemtest
#SBATCH --gres=gpu:1
#SBATCH --constraint=maxwell
#SBATCH --mail-type=ALL
hostname
source /etc/profile.d/modules.sh
module add cuda/6.5
sudo nvidia-smi -i $CUDA_VISIBLE_DEVICES -c 3
/home/cen1001/cuda_memtest-1.2.3/cuda_memtest --stress --num_passes 1 --num_iterations 100
Output will appear in slurm-<jobid>.out as each job step finishes. You can change that with sbatch options.
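For example, to pick the output filename yourself (the name here is illustrative; %j expands to the job id):
#SBATCH --output=myjob-%j.out   # send stdout to a file named after the job id
#SBATCH --error=myjob-%j.err    # optionally send stderr to a separate file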
Because a batch job can launch multiple job steps, each taking a part of the job's allocation, you can use the srun command within the batch job to tell SLURM to allocate particular resources to each job step, which in the case of GPUs means setting CUDA_VISIBLE_DEVICES. Here is an example which requests two GPUs and runs a different job step on each.
#!/bin/bash
#SBATCH --job-name=cudamemtest
#SBATCH --gres=gpu:2
#SBATCH --constraint=maxwell
source /etc/profile.d/modules.sh
module add cuda/6.5
srun --gres=gpu:1 /home/cen1001/cuda_memtest-1.2.3/cuda_memtest --disable_all --enable_test 1 --num_passes 100 --num_iterations 100 &
srun --gres=gpu:1 /home/cen1001/cuda_memtest-1.2.3/cuda_memtest --disable_all --enable_test 2 --num_passes 100 --num_iterations 100 &
wait
Attaching to a batch job while it runs
sattach <job>.<step> will let you peek at the output from a running job step.
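For example (the job and step ids are illustrative):
squeue -s        # list running job steps so you can see their <job>.<step> ids
sattach 1234.0   # attach to step 0 of job 1234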
Queues and constraints
Unlike the local Torque systems, there are not lots of different queues providing shorthand for requesting various combinations of CPUs and job time. SLURM's 'partitions', the equivalent of Torque's 'queues', do not support setting default task geometry in the same way, so there's no point in having lots of partitions. On the upside, the task geometry options available for jobs are far more flexible and powerful than Torque's; look at the srun and sbatch manpages if you want to know more. If you don't care, just ask for GPUs as in the examples above and let the CPU allocation take care of itself.
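If you do want to be explicit about CPUs and tasks, the standard SLURM options work alongside the GPU request. A sketch (the values and script name are illustrative):
srun --pty --gres=gpu:1 -c 4 -u bash -i                        # one GPU plus four CPU cores
sbatch --gres=gpu:2 --ntasks=2 --cpus-per-task=2 myscript.sh   # two tasks of two cores each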
You can set the time limit for a job with the -t/--time flag to srun or sbatch. The value is in minutes; the maximum is three days, and if you don't give a time that's what you'll get.
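For example (the times are illustrative):
srun --pty --gres=gpu:1 -t 60 -u bash -i   # a one-hour interactive job
#SBATCH --time=1440                        # in a batch script: 24 hours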
Having said that, the cluster does have two partitions: 'MAIN' and 'DEBUG'. 'DEBUG' is a special-purpose partition only available to people nominated by the Wales group computer reps. It allows unlimited time and jobs in this partition may pre-empt jobs in the 'MAIN' partition. It is there purely for debugging. By default all jobs use MAIN. There are also sometimes other partitions present for special purposes.
Types of compute node
The cluster's nodes are not identical, unlike many local cluster systems. There is a range of GPUs available, and sometimes more than one OS version. Different OS versions have different software available, as not all compilers and CUDA versions are supported on every OS. You select the features you want with SLURM constraints.
Currently available features:
Name | Description
teslak20 | Nvidia Tesla K20m GPUs
titanblack | Nvidia GeForce 700 Titan Black GPUs
3gpu | Node has 3 GPUs
4gpu | Node has 4 GPUs
happy, hitaki, joe, ronn, mongrol, morrigun, blackblood, rojaws, hammerstein, quartz | The names the nodes had before they were properly clustered
To see what combination of features each node has, run 'scontrol show nodes' on pat.
To select particular features use the '-C' or '--constraint' option to srun or sbatch. You can combine multiple features with & for a boolean AND, or | for a boolean OR.
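For example (feature names taken from the table above; the script name is illustrative):
sbatch -C "titanblack&4gpu" myscript.sh                       # a node with Titan Black GPUs AND four GPUs
srun --pty --gres=gpu:1 -C "teslak20|titanblack" -u bash -i   # either GPU type will do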
Memory limits
SLURM on pat is configured to set a default memory limit of 3.5GB per core, but jobs can set that higher by using the --mem-per-cpu setting. However, no matter how high we set the limits in SLURM, Linux itself still imposes some virtual memory limits under extreme conditions, despite our having turned off all the tunable ones we could find.
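For example, to raise the limit for a job (the value is illustrative):
#SBATCH --mem-per-cpu=8G                              # in a batch script
srun --pty --gres=gpu:1 --mem-per-cpu=8G -u bash -i   # or for an interactive job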
SLURM Power saving
The nodes in the ABC cluster sometimes have power saving enabled. When power saving is on SLURM will shut them down after ten minutes of inactivity and then boot them up automatically when they are assigned to a compute job. Booting a GPU node from cold takes about two and a half minutes, so there will be a wait when starting a job if the cluster has been idle for a while.
If a node is power saving then 'sinfo' will show its state as idle~, and 'scontrol show nodes' will show it as IDLE+POWER.
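A compact way to check this uses standard sinfo formatting options (a sketch, not specific to pat):
sinfo -N -o "%N %t"   # one line per node: node name and compact state; power-saving nodes show as idle~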
Using the DEBUG partition
The intention of this partition is to allow people to allocate a GPU for debugging for an indefinite time. It is not for running production work. Only people nominated by group computer reps can have access to this partition. To allocate a GPU, do something like this:
salloc -n1 --gres=gpu:1 -p DEBUG --no-shell
using whatever parameters you need to get the GPU you want. salloc understands all the same ones as sbatch and srun. SLURM will bump running jobs off the GPUs if it needs to in order to satisfy the allocation request. The salloc command will return a job id. You'll be able to see this job in the queue, running with unlimited walltime.
Then, to access the allocated GPU, do something like
srun --jobid=id mycommand
where 'id' is the job id that the salloc command gave you. To get rid of the allocation and allow others to use the GPU, cancel it with
scancel id
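Putting the whole DEBUG workflow together (the job id 1234 is illustrative):
salloc -n1 --gres=gpu:1 -p DEBUG --no-shell   # SLURM prints the granted job id, e.g. 1234
srun --jobid=1234 nvidia-smi                  # run a command on the allocated GPU
scancel 1234                                  # release the GPU when you're finished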