For general advice on how to use SLURM, see SLURM usage. This page describes Ziggy's configuration.
Ziggy has two sorts of node:
| Naming scheme | CPU type | Owned by | Cores per node | Memory per node | Number of nodes | SLURM property |
| --- | --- | --- | --- | --- | --- | --- |
| node-0-X | Ivy Bridge | Hunter group | 16 | 132 GB | 8 | ivybridge |
| swan-1-X | Westmere | CMI | 12 | 24 GB | 4 | westmere |
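The values in the SLURM property column look like node features; assuming they are registered as such, you can request a particular CPU type with the `--constraint` option. A minimal sketch (`my_program` is a placeholder):

```bash
#!/bin/bash
#SBATCH --constraint=ivybridge   # feature name taken from the 'SLURM property' column above
#SBATCH --ntasks=16              # an Ivy Bridge node has 16 cores
srun ./my_program                # placeholder for your own executable
```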
Ziggy has five 'partitions':
| Partition Name | Who can use it | Nodes | Default time limit | Maximum time limit | Other limits | Preemption |
| --- | --- | --- | --- | --- | --- | --- |
| HUNTER | Members of the Hunter group | All the Ivy Bridge nodes | 24 hours | 48 hours | None | Preemptor: can stop and requeue preemptee jobs that are running on its nodes |
| HUNTERLONG | Members of the Hunter group | All the Ivy Bridge nodes | 128 hours | 128 hours | A maximum of 96 cores may be in use by this partition and the LONG partition together at any one time. This is a global limit, not per user. | Preemptor: can stop and requeue preemptee jobs that are running on its nodes |
| SWAN | Anyone | All the Westmere nodes | 24 hours | 48 hours | None | Preemptor: can stop and requeue preemptee jobs that are running on its nodes |
| CLUSTER | Anyone | All the nodes | 24 hours | 48 hours | None | Preemptee: jobs can be stopped and requeued to allow preemptor jobs to run |
| LONG | Anyone | All the nodes | 128 hours | 128 hours | A maximum of 96 cores may be in use by this partition and the HUNTERLONG partition together at any one time. This is a global limit, not per user. | Preemptee: jobs can be stopped and requeued to allow preemptor jobs to run |
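To check the live state of these partitions (node counts, time limits, preemption settings), the standard SLURM query commands can be used, for example:

```bash
# One line per partition: name, time limit, node count, node state, node list
sinfo -o "%P %l %D %t %N"

# Full configuration of a single partition, including its preemption mode
scontrol show partition LONG
```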
When submitting a compute job, you should normally give a list of possible partitions with the -p flag. SLURM will try to place the job in whichever of those partitions lets it start soonest. If you don't choose a partition, SLURM will use CLUSTER, which can lead to the job being stopped and put back on the queue (preempted) to allow other jobs to run.
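For example, a member of the Hunter group wanting a whole Ivy Bridge node could submit something like the following (a minimal sketch; `my_program` and the resource numbers are placeholders), letting SLURM choose between HUNTER and CLUSTER:

```bash
#!/bin/bash
#SBATCH -p HUNTER,CLUSTER    # SLURM picks whichever listed partition can start the job soonest
#SBATCH --time=24:00:00      # within the 48-hour maximum of both partitions
#SBATCH --ntasks=16          # one whole Ivy Bridge node
srun ./my_program            # placeholder for your own executable
```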
The LONG partition allows jobs of up to 128 hours, but because it is a preemptee partition these jobs can be stopped and requeued to let jobs from the preemptor partitions (HUNTER, HUNTERLONG and SWAN) run.
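A job needing more than 48 hours therefore has to run in LONG (or HUNTERLONG for Hunter group members) and should expect to be stopped and restarted. A sketch, assuming your program can resume from its own checkpoint files (the `--resume` flag is hypothetical):

```bash
#!/bin/bash
#SBATCH -p LONG
#SBATCH --time=100:00:00     # must stay within the 128-hour limit
#SBATCH --requeue            # let SLURM put the job back on the queue if it is preempted
srun ./my_program --resume   # hypothetical flag: restart from the last checkpoint
```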