Submitting jobs

Majorana uses the SLURM scheduler to schedule and execute jobs. Do not run long-running programs on the submit node; always submit them to the scheduler. To submit a job, you need to prepare a job file. Templates for job files are available in /opt/ohpc/pub/examples/slurm/. Suppose you have a simple job that prints the version of argos3 and then exits. You could create a file called job.slurm with the content below (adapted from /opt/ohpc/pub/examples/slurm/job.slurm).

Content of the job.slurm file

The job file is split into three parts. The first part gives instructions to the scheduler, for example which resources to use or where to put the output files.

The second part is reserved for loading the necessary modules. In this example, only argos3 is necessary.

The third part contains the job steps, that is, the commands that you want to run. If you precede a command with srun, SLURM will try to pass the resource allocation on to it (e.g., for OpenMP or MPI programs). In this example, we print the argos3 version to stdout and then wait for 30 seconds before ending the job (so that the job stays visible in the queue for a moment). Normally, you should not put any sleep statements into your job file.
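A minimal sketch of a job file with the three parts described above might look as follows. The resource values, the output file names, and the `argos3 --version` call are assumptions chosen for illustration; the template in /opt/ohpc/pub/examples/slurm/job.slurm remains the authoritative starting point. The snippet writes the file from the shell so that it can be copied as one piece.

```shell
# Write a hypothetical job.slurm; all #SBATCH values below are illustrative assumptions.
cat > job.slurm <<'EOF'
#!/bin/bash
# Part 1: instructions to the scheduler
#SBATCH --job-name=ArgosVersion
#SBATCH --output=ArgosVersion_%j.out
#SBATCH --error=ArgosVersion_%j.err
#SBATCH --ntasks=1
#SBATCH --time=00:05:00

# Part 2: load the necessary modules
module load argos3

# Part 3: job steps
srun argos3 --version
sleep 30   # only here to keep the job visible in the queue
EOF
```

In the file name patterns, `%j` is replaced by SLURM with the job ID, so repeated submissions do not overwrite each other's output.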

The job file can be submitted to the scheduler by running sbatch job.slurm. If the job file is well-formed, the scheduler will reply with the job ID. You can use this ID later to check the execution status of the job.

If you have lost the job ID, or if you want to check all of your submitted and running jobs, you can execute squeue -u username, where username is replaced with your user name. The output lists several pieces of information, such as the state of each job and the wall time for which it has been running.
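The submit-and-check workflow can be sketched as a few shell lines. The helper `extract_job_id` is a hypothetical name introduced here; it relies on sbatch's usual confirmation line `Submitted batch job <id>`. The real calls are guarded so that the snippet does nothing on a machine without SLURM.

```shell
# Pull the numeric job ID out of sbatch's confirmation line
# ("Submitted batch job <id>"). Hypothetical helper, not part of SLURM.
extract_job_id() {
    awk '/^Submitted batch job/ { print $4 }'
}

# Only attempt the real calls where SLURM is available (i.e., on the submit node).
if command -v sbatch >/dev/null 2>&1 && [ -f job.slurm ]; then
    job_id=$(sbatch job.slurm | extract_job_id)
    echo "Job ID: ${job_id}"
    squeue -j "${job_id}"    # status of this one job
    squeue -u "${USER}"      # all of your pending and running jobs
fi
```

Using `${USER}` instead of typing the user name avoids typos, assuming your shell sets this variable to your cluster login.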

Information provided by squeue

The stdout and stderr streams are captured and saved in files, as specified in the job file.
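As a small sketch, the captured streams of a finished job could be inspected as shown here. It assumes that the job file sets `--output=ArgosVersion_%j.out` and `--error=ArgosVersion_%j.err`; these file names and the helper `show_job_output` are hypothetical. If a job file sets neither option, SLURM by default writes both streams to `slurm-<jobid>.out` in the submission directory.

```shell
# Print the captured stdout and stderr of a job, assuming the hypothetical
# naming pattern ArgosVersion_<jobid>.out / ArgosVersion_<jobid>.err.
show_job_output() {
    job_id="$1"
    for stream in out err; do
        file="ArgosVersion_${job_id}.${stream}"
        echo "== ${file} =="
        if [ -f "${file}" ]; then
            cat "${file}"
        else
            echo "(not found)"
        fi
    done
}
```

Calling `show_job_output <jobid>` from the submission directory prints both files one after the other, with a header line for each.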

Output of stdout and stderr for the ArgosVersion job