Nest SLURM configuration

See the generic SLURM documentation for how to use SLURM. This page describes the setup on the Nest cluster.

Nest has sixteen nodes funded by the Theory RIG and four nodes funded by the Wales group. The Wales-funded nodes are therefore in a restricted-access partition, WALES.

Nest has five 'partitions':

Partition Name	Who can use it	Nodes	Default time limit	Maximum time limit	Other settings	Priority for preemption
TEST	Anyone	node-0-0	30 minutes	30 minutes	None	100
MAIN	Anyone	node-0-1 to node-0-15	48 hours	48 hours	This is the default partition	50
WALES	Members of nest-wales-users group	node-0-16 to node-0-19	7 days	28 days	None	50
LONG	Anyone	node-0-1 to node-0-15	7 days	28 days	Limited to max 200 cores at a time over all jobs in this partition	50
CLUSTER	Anyone	All the nodes	7 days	28 days	None	0

When submitting a compute job you can give a comma-separated list of possible partitions with the -p flag. SLURM will place the job in whichever of the listed partitions lets it start soonest. If you don't choose a partition at all, SLURM will use MAIN. Jobs in MAIN are safe from preemption, but MAIN does not have access to every node, so they may take longer to start.

If you want a job to start as soon as possible and don't mind the risk of it being preempted, use -p MAIN,CLUSTER. The job will then run in MAIN if there are enough free nodes there. If there are not, but there are enough free nodes over the whole machine, it will run in CLUSTER. However, if someone else then submits a job to MAIN, TEST, LONG, or WALES that can run if the CLUSTER job is cancelled, the CLUSTER job is cancelled. If you want a cancelled job to be put back on the queue to be restarted later, use the --requeue flag.
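As an illustration of the partition advice above, here is a minimal sketch of a batch script that asks for MAIN or CLUSTER and requests requeueing after preemption. The job name, time limit, task count, and the describe_attempt helper are assumptions made for this example, not part of the Nest configuration. SLURM_RESTART_COUNT and SLURM_JOB_PARTITION are standard environment variables that SLURM sets inside a running job.

```shell
#!/bin/bash
# Hypothetical job name, chosen for this example.
#SBATCH --job-name=example
# Accept either partition; SLURM picks whichever lets the job start soonest.
#SBATCH --partition=MAIN,CLUSTER
# Put the job back on the queue if it is preempted while running in CLUSTER.
#SBATCH --requeue
# Assumed run time, within the 48 hour limit of MAIN.
#SBATCH --time=24:00:00
# Assumed resource request.
#SBATCH --ntasks=1

# Hypothetical helper: report which attempt this is and where it landed.
# Outside SLURM the variables are unset, so fall back to defaults.
describe_attempt() {
    local restarts="${SLURM_RESTART_COUNT:-0}"
    local partition="${SLURM_JOB_PARTITION:-unknown}"
    echo "Attempt $((restarts + 1)) in partition ${partition}"
}

describe_attempt

# The real workload would go here. Because a requeued job starts again
# from the top of the script, it should be able to resume from its own
# checkpoint files if restarting from scratch is too expensive.
```

If the script is saved as job.sh (a placeholder name), it would be submitted with sbatch job.sh. The same options can be given on the command line instead, for example sbatch -p MAIN,CLUSTER --requeue job.sh.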
