In the Slurm job script file, the following parameters should be specified:
#SBATCH --time=[Job Run Duration]
The specified time affects the job's scheduling priority. To prevent CPU cores from sitting idle while a job waits for sufficient resources, Slurm evaluates which queued job can start immediately without delaying previously queued jobs (backfill scheduling), rather than leaving cores unused. Specify a time as close as possible to your job's actual runtime so the scheduler can assign priority correctly. If not specified, the default value is the Timelimit of the partition you are using.
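For example, either of the forms below can be used; the values are only illustrative, and Slurm also accepts a day prefix in the D-HH:MM:SS format.
#SBATCH --time=02:30:00 # 2 hours 30 minutes (HH:MM:SS)
#SBATCH --time=1-12:00:00 # 1 day 12 hours (D-HH:MM:SS)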
#SBATCH -p [Partition Name]
Specifies the partition to use. Each partition has different resource limits and time constraints, so choose a partition appropriate for the resources and runtime your job requires. If not specified, the default partition is 'gpu'.
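To see which partitions exist on the system and their time limits before choosing, you can run the standard Slurm command sinfo on a login node (the partition names shown depend on the site's configuration):
sinfo -o "%P %l %D %C" # partition, time limit, node count, CPUs (allocated/idle/other/total)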
In addition, there are parameters that should be specified for each mode of running Slurm jobs; these are explained in the examples below:
A Serial Job uses a single CPU core. Specify the parameter --ntasks=1. This can be omitted, as the default value of --ntasks is already 1.
#!/bin/bash
#SBATCH --job-name=mytest # create a short name for your job
#SBATCH --nodes=1 # node count
#SBATCH --ntasks=1 # total number of tasks across all nodes
#SBATCH --cpus-per-task=1 # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --time=00:20:00 # total run time limit (HH:MM:SS)
module purge # clear any previously loaded modules
Rscript myscript.R # run the R script (load your site's R module first if Rscript is not already on the default path)
To Run
sbatch myscriptR.job
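After submission, sbatch prints a job ID that can be used to follow the job with the standard Slurm commands below (replace <jobid> with the printed number):
squeue -u $USER # list your pending and running jobs
sacct -j <jobid> --format=JobID,State,Elapsed # state and elapsed time once the job has started or finished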
For Multithreaded Jobs, specify the parameter --cpus-per-task= with the desired number of threads, and then use the variable $SLURM_CPUS_PER_TASK to pass that number to your program's thread option in the command.
Example of running the Gromacs software as a Multithreaded Job
#!/bin/bash
#SBATCH --job-name=multithread # create a short name for your job
#SBATCH --nodes=1 # node count
#SBATCH --ntasks=1 # total number of tasks across all nodes
#SBATCH --cpus-per-task=8 # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --time=00:15:00 # maximum time needed (HH:MM:SS)
module load gromacs_gpu # load the GROMACS module provided on this system
gmx mdrun -ntomp $SLURM_CPUS_PER_TASK -v -noconfout -nsteps 5000 -s 1536/topol.tpr # -ntomp sets the number of OpenMP threads
To Run
sbatch gromac-water.gpu
You should define the --cpus-per-task= parameter and reference it in the command through the variable $SLURM_CPUS_PER_TASK instead of hardcoding a number. This keeps the thread count in step with what Slurm actually allocated, so the job never tries to run on more CPU cores than it was given, which would degrade performance for the whole system.
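The same pattern applies to any multithreaded program, not only Gromacs. Below is a minimal sketch for an OpenMP program; the executable name my_openmp_prog is a placeholder for your own program:
#!/bin/bash
#SBATCH --job-name=omp-test # create a short name for your job
#SBATCH --nodes=1 # node count
#SBATCH --ntasks=1 # total number of tasks across all nodes
#SBATCH --cpus-per-task=8 # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --time=00:15:00 # total run time limit (HH:MM:SS)
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK # run exactly as many threads as cores allocated
./my_openmp_prog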
Conventionally, running an MPI program involves executing a command like the one below, where the number of tasks is specified with the option -np followed by the desired number of tasks:
mpirun -np 192 -hostfile hosts ./myprog.o
When submitting via Slurm, define the parameter --ntasks=[desired number of tasks] and use the prun command in place of the conventional mpirun command.
#!/bin/bash
#SBATCH --job-name=mpi-job # create a short name for your job
#SBATCH -p cpu # partition name
#SBATCH --ntasks=192 # total number of MPI tasks across all nodes
#SBATCH --cpus-per-task=1 # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --time=00:20:00 # total run time limit (HH:MM:SS)
module purge # clear any previously loaded modules
module load intel # load the Intel compiler and MPI environment
prun myprog.o # prun launches the MPI tasks according to --ntasks
To Run
sbatch mpi.job
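If the prun wrapper is not available in your environment, Slurm's own launcher srun can usually start the MPI tasks directly, provided the MPI library was built with Slurm support; srun reads --ntasks from the allocation, so no -np option is needed:
srun ./myprog.o # launches the 192 MPI tasks requested by --ntasks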
For jobs utilizing a GPU, specify the parameter --gpus=1. The example below requests 1 GPU card.
wget https://gist.githubusercontent.com/leimao/bea971e07c98ce669940111b48a4cd7b/raw/f55b4dbf6c51df6b3604f2b598643f9672251f7b/mm_optimization.cu
module load nvhpc
nvcc mm_optimization.cu -o mm_optimization
vi gpu_job.sh
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --job-name=matmul # create a short name for your job
#SBATCH --cpus-per-task=4 # cpu-cores per task (>1 if multi-threaded tasks)
#SBATCH --gpus=1 # total number of GPUs
#SBATCH --time=01:00:00 # total run time limit (HH:MM:SS)
#CUDA matrix multiplication
./mm_optimization
To Run
sbatch gpu_job.sh
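To confirm that the job actually received a GPU, the lines below can be added to the job script before running the program; nvidia-smi is the standard NVIDIA utility, and on most Slurm GPU setups CUDA_VISIBLE_DEVICES is set to the allocated device indices:
nvidia-smi # should list only the GPU(s) allocated to this job
echo $CUDA_VISIBLE_DEVICES # indices of the GPU(s) Slurm assigned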