Bash Script Commands for Slurm File Creation
For the full list of options, see man sbatch, sbatch --help, or the sbatch documentation.
$ vi script.sh
#!/bin/bash
#SBATCH --job-name=test ## Job name
#SBATCH --output=test_%j.out ## Output file name (%j = Job-ID)
#SBATCH --error=test_%j.err ## Error file name (%j = Job-ID)
#SBATCH --time=10:00 ## Run time limit
#SBATCH --ntasks=1 ## Number of tasks required for the run
#SBATCH --cpus-per-task=1 ## Number of cores (threads/CPUs) required for the run
#SBATCH --partition=cpu ## Specify the required partition
module purge ## Unload all modules because modules may have been loaded previously
module load anaconda3 ## Load the required module, in this example, Anaconda
srun python script.py ## Execute the code
Notes
- For a job that uses threads, set #SBATCH --cpus-per-task= to the number of threads used.
- For an MPI job, set #SBATCH --ntasks= to the required number of processes (see the sketch after this list).
- For a GPU job submitted to the GPU partition, set #SBATCH --cpus-per-task= appropriately: if the job does not spawn threads, set it to 1; if it does spawn threads, 2-4 CPUs per task is usually enough, leaving the remaining cores for jobs on the node's other GPU cards and keeping the system fully utilized.
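As a sketch of the MPI case, the script below requests four MPI processes on the cpu partition from the example above; the module name (openmpi) and the program name (./mpi_app) are placeholders that depend on the cluster.

$ vi mpi_script.sh
#!/bin/bash
#SBATCH --job-name=mpi_test ## Job name
#SBATCH --output=mpi_test_%j.out ## Output file name (%j = Job-ID)
#SBATCH --time=30:00 ## Run time limit
#SBATCH --ntasks=4 ## Four MPI processes
#SBATCH --cpus-per-task=1 ## One core per MPI process
#SBATCH --partition=cpu ## Specify the required partition
module purge ## Unload all previously loaded modules
module load openmpi ## Placeholder: load your cluster's MPI module
srun ./mpi_app ## srun launches one task per MPI process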
$ sbatch script.sh
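On successful submission, sbatch prints the assigned Job-ID, e.g. Submitted batch job 12345 (the number is only an example). The job can then be monitored or cancelled with the standard Slurm commands:

$ squeue -u $USER ## List your pending and running jobs
$ scontrol show job 12345 ## Show the full details of a specific job
$ scancel 12345 ## Cancel the job if needed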
| Sbatch option | Description |
|---|---|
| -J, --job-name=<job_name> | Specify a name for the job allocation |
| -o, --output=<filename_pattern> | Connect the standard output of the job script to the file specified by filename_pattern. %x: job name; %j: Job-ID; %t: task identifier (aka rank), which creates a separate file per task; %N: short hostname, which creates a separate file per node |
| -e, --error=<filename_pattern> | Connect the standard error of the job script to the file specified by filename_pattern, using the patterns described above |
| -i, --input=<filename_pattern> | Connect the standard input of the job script to the file specified by filename_pattern |
| -D, --chdir=<directory> | Set the working directory of the batch script to directory before it is executed |
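For example, the following combination of directives (the directory path is only a placeholder) writes one output file per node, named after the job name, Job-ID, and host:

#SBATCH --job-name=test
#SBATCH --output=%x_%j_%N.out ## e.g. test_12345_node01.out, one file per node
#SBATCH --error=%x_%j.err ## A single error file for the whole job
#SBATCH --chdir=/scratch/myproject ## Placeholder path: run the script from this directory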
| Sbatch option | Description |
|---|---|
| --mail-user=<email> | Define the user who will receive email notification of state changes as defined by --mail-type |
| --mail-type=[TYPE\|ALL\|NONE] | Notify the user by email when certain event types occur. Valid type values include BEGIN, END, and FAIL. The user to be notified is indicated with --mail-user. Several values can be combined as a comma-separated list, e.g. --mail-type=BEGIN,END,FAIL |
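A typical combination looks like the following (the address is only a placeholder):

#SBATCH --mail-user=user@example.com ## Placeholder address: who receives the notifications
#SBATCH --mail-type=BEGIN,END,FAIL ## Send mail when the job starts, finishes, or fails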
| Sbatch option | Description |
|---|---|
| --time=<time> | Set a limit on the total run time of the job allocation |
| -b, --begin=<time> | Submit the batch script to the Slurm controller immediately, like normal, but tell the controller to defer the allocation of the job until the specified time |
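Both options accept the time formats listed in the sbatch man page; the values below are illustrative only:

#SBATCH --time=02:30:00 ## Limit the job to 2 hours and 30 minutes
#SBATCH --begin=now+1hour ## Do not start the job earlier than one hour after submission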
| Sbatch option | Description |
|---|---|
| --mem=<size>[units] | Specify the memory required per node |
| --mem-per-cpu=<size>[units] | Specify the memory required per CPU |
| --mem-per-gpu=<size>[units] | Specify the memory required per GPU |
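These three options are mutually exclusive; units are given with a suffix such as K, M, G, or T. For example (the sizes are arbitrary):

#SBATCH --mem=16G ## Request 16 GB of memory on each allocated node

or, alternatively,

#SBATCH --mem-per-cpu=2G ## Request 2 GB per allocated CPU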
| Sbatch option | Description |
|---|---|
| -n, --ntasks=<number> | Advise the Slurm controller that job steps run within the allocation will launch a maximum of number tasks, and allocate sufficient resources for them. The default is one task per node |
| -N, --nodes=<minnodes>[-maxnodes] | Request that a minimum of minnodes nodes be allocated to this job. A maximum node count may also be specified with maxnodes. If only one number is specified, it is used as both the minimum and maximum node count |
| --ntasks-per-node=<ntasks> | Request that ntasks be invoked on each node |
| -c, --cpus-per-task=<ncpus> | Advise the Slurm controller that ensuing job steps will require ncpus processors per task |
| -p, --partition=<partition_names> | Request a specific partition |
| -w, --nodelist=<node_name_list> | Request a specific list of hosts |
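As a sketch of how these options combine, the following (hypothetical) request asks for 16 MPI processes spread over two nodes, with one CPU per process, on the cpu partition used earlier:

#SBATCH --nodes=2 ## Use two nodes
#SBATCH --ntasks-per-node=8 ## Eight MPI processes on each node (16 in total)
#SBATCH --cpus-per-task=1 ## One CPU per MPI process
#SBATCH --partition=cpu ## Partition name from the earlier example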
| Sbatch option | Description |
|---|---|
| --gpus=[type:]<number> | Specify the total number of GPUs required for the job |
| --gpus-per-node=[type:]<number> | Specify the number of GPUs required for the job on each node included in the job's resource allocation |
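For example, assuming a GPU partition named gpu and a GPU type label a100 (both names are cluster-specific and only illustrative):

#SBATCH --partition=gpu ## GPU partition (name depends on the cluster)
#SBATCH --gpus=1 ## One GPU of any available type
#SBATCH --cpus-per-task=4 ## 2-4 CPUs per task, as recommended in the notes above

or, to pin the GPU model:

#SBATCH --gpus-per-node=a100:2 ## Two a100 GPUs on each allocated node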