Elvis is a cluster system, so message-passing libraries are used to make code run in parallel across nodes. Several different libraries are installed. Shared-memory parallelization (OpenMP, or the auto-parallelizing options on some of the compilers) can also be used, but only within a single node, i.e. up to eight cores.
The available message-passing libraries are all MPI implementations: currently we have OpenMPI, MVAPICH2 and MVAPICH.
To change your parallel environment, load the appropriate module. By default MVAPICH2 built with the PGI compilers is loaded. To switch, edit your .bashrc file and change the module add line.
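For example, the relevant part of ~/.bashrc might look like the sketch below. The module names here are assumptions for illustration; check the real names on elvis with module avail.

```shell
# In ~/.bashrc -- exactly one MPI environment should be selected.
# Module names below are assumed; run `module avail` for the real list.

module add mvapich2-pgi      # default: MVAPICH2 with the PGI compilers

# To switch, comment out the line above and add the one you want, e.g.:
# module add openmpi-gcc     # OpenMPI with the GNU compilers
```

Remember that the environment in force when you submit a job is the one the job will use, so change this before submitting, not after.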
Compile your code with the MPI compiler wrappers. These are usually called mpicc (C), mpicxx (C++), mpif77 and mpif90 (Fortran).
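The wrappers are invoked just like the underlying compilers and add the MPI include and library paths for you. A sketch, where the source and output file names are placeholders:

```shell
mpicc  -O2 -o solver solver.c      # C source
mpicxx -O2 -o solver solver.cpp    # C++ source
mpif90 -O2 -o solver solver.f90    # Fortran 90 source
```

Because each MPI library provides its own wrappers, the module you have loaded determines which library your binary is linked against.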
To run MPI code you would normally use a command such as mpirun to launch your program. On elvis, however, a queueing system assigns work to compute nodes, and the MPI library has to interact with it, which complicates things. To make launching non-OpenMPI jobs easier, launchers are provided which should be used within a batch job instead of mpirun. The launchers work out the correct number of CPUs, and which nodes to use, from the queueing system. You must use the launcher appropriate to your MPI library, and your environment must be configured for that library when you submit the job, as otherwise the job won't be able to find its library files. There are some examples of use in /info/torque; in particular, /info/torque/castep.sh is an example CASTEP job file. Here's a list of which job start command goes with which library:
|Library||Launcher||Notes|
|MVAPICH||mpiexec||Needs the mpiexec module loaded. mpiexec needs to know its comm is mpich-ib for MVAPICH. If you use the modules this is set automatically.|
|MVAPICH2||mpiexec||Needs the mpiexec module loaded. mpiexec needs to know its comm is pmi for MVAPICH2. If you use the modules this is set automatically.|
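Putting the pieces together, a minimal batch script for an MVAPICH2 job might look like the sketch below. The program name, walltime and resource request are placeholders; /info/torque/castep.sh is the authoritative local example.

```shell
#!/bin/bash
#PBS -l nodes=4:ppn=8        # 32 cores over 4 full nodes (see examples below)
#PBS -l walltime=12:00:00    # placeholder; request what your job actually needs

cd $PBS_O_WORKDIR            # start in the directory the job was submitted from

# The environment must already select MVAPICH2 (module add in ~/.bashrc)
# and the mpiexec module must be loaded, as described above.
mpiexec ./solver             # the launcher gets CPU count and node list from the queue
```

Note that mpiexec takes no -np argument here: the launcher reads the core count from the queueing system, so the script and the resource request cannot disagree.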
You need to tell the queueing system how many nodes and cores to use. Here are some examples:
qsub run.sh -l nodes=1:ppn=8   # all 8 cores on one node
qsub run.sh -l nodes=4:ppn=8   # 32 cores over 4 nodes
qsub run.sh -l nodes=8:ppn=1   # 8 cores anywhere, despite what it looks like
qsub run.sh -l nodes=8         # 8 cores anywhere, NOT 64 cores as you might reasonably think