Adding users

Make sure the user already has an Active Directory account, then add them to the 'mek-quake-users' group in AD and wait an hour. A script on mek-quake runs once per hour and creates local accounts for any new members of that group.

A script is automatically called to do the local setup on mek-quake. Unlike on some of the older clusters, the script does not go round each node doing local setup there; PBS now handles this at job startup time if it's required. This is why there is no /scratch/spqr1 on the nodes just after the account is created. Space on sharedscratch is allocated by looking up the user in AD: users in the Wales group go on /sharedscratch1 and everyone else goes on /sharedscratch2. This assumes that non-Wales people are in the Vendruscolo or Dobson groups. If they aren't, the directory may need to be moved, depending on who gave them access to the machine.
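The sharedscratch placement rule can be sketched as a small shell function. This is only an illustration of the rule as described, not the actual setup script: the function name scratch_root_for_groups and the group name "wales" are assumptions, so substitute whatever the AD group is really called.

```shell
# Hypothetical helper: pick the shared scratch root from a user's group list.
# $1 is a space-separated list of group names.
# "wales" is an assumed group name; the real AD group may be named differently.
scratch_root_for_groups() {
    for g in $1; do
        if [ "$g" = "wales" ]; then
            echo /sharedscratch1
            return 0
        fi
    done
    # Everyone else (assumed to be Vendruscolo or Dobson people)
    echo /sharedscratch2
}
```

On a machine that resolves AD groups, the list could come from something like scratch_root_for_groups "$(id -Gn spqr1)", with the caveat that group names containing spaces would need different handling.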

There is no need to set a local password as mek-quake can use Active Directory passwords.

Give the user links to Mek-quake user notes, mek-quake queues, and mek-quake parallel environments. If they've never used a cluster before also give them a link to the SLURM docs.

Queueing system

Mek-quake uses SLURM. 

Remote Power Management Tools For Nodes

Use the 'apc' script on the head node like this to power nodes on and off:

mek-quake:~# apc on compute-0-0
mek-quake:~#
mek-quake:~# apc off compute-0-0 compute-1-3
mek-quake:~# apc status compute-0-0
Outlet 17 on localhost PDU wcdc-rack12-apc1.ipmi.private.ch.cam.ac.uk state is on
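
To check several nodes in one go, a small loop around the status subcommand may be convenient. This is a minimal sketch: the function name apc_status_all is hypothetical, and it relies only on the 'apc status <node>' form shown above.

```shell
# Hypothetical helper: report the power state of each node given as an argument.
# Calls 'apc status' once per node, in the order given.
apc_status_all() {
    for node in "$@"; do
        apc status "$node"
    done
}

# Example (on the head node): apc_status_all compute-0-0 compute-1-3
```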

 

In theory you can power nodes off and on remotely via the IPMI controllers too, but don't bother: they are very unreliable. IPMI power management works via the IPMI card in each node, which has its own IP and MAC address, set by flashing its firmware. The card does not have its own network port; it shares one on the node's motherboard.

The RAID array

The original RAID array has failed and was replaced by a 2U compute server known as mek-quake-filestore, which has an Adaptec card with a RAID6 array. mek-quake-filestore has IPMI, and its console is redirected to the serial port. It has an interface on the main network.

Reinstalling nodes

This is a Rocks cluster, so first flag the node to reinstall on its next boot:

rocks set host boot compute-X-Y action=install

Then reboot the node.
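
The reinstall procedure can be sketched as a dry-run helper that prints the commands for a given node rather than running them. The function name reinstall_cmds is hypothetical, and rebooting through 'rocks run host' is an assumption; any other way of rebooting the node would serve equally well.

```shell
# Hypothetical dry-run helper: print the commands needed to reinstall one node.
# Nothing is executed; review the output, then run the lines on the head node.
reinstall_cmds() {
    node=$1
    # Flag the node for reinstallation on its next boot
    echo "rocks set host boot $node action=install"
    # Reboot it (assumed method; substitute your preferred way of rebooting)
    echo "rocks run host $node reboot"
}

# Example: reinstall_cmds compute-0-0
```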

Adding packages to nodes

To add a package to the live nodes, run:

rocks run host compute "yum install whatever"

Non-rpm-based software should be put in /usr/local, where all the nodes see it anyway. Don't put anything in /usr/local/shared: that is reserved for software that's an identical copy of that on the NFS server, and is not backed up.

Hardware and software support

The hardware is long out of warranty.
