How do I configure GPU compute jobs for the cluster?

Our Starlight cluster has a separate GPU partition, so if your job requires a GPU, you must set the partition accordingly in your submit script.

To submit a job to the GPU partition:

#SBATCH --partition=GPU        # submits the job to the GPU partition
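
If you want to confirm the partition name before submitting, the standard Slurm sinfo command lists the partitions available on the cluster:

sinfo -s        # one-line summary of each partition, with node counts and states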

To request 1 node, 8 CPU cores, and 4 GPUs, you would use the following syntax:

#SBATCH --nodes=1              # 1 node
#SBATCH --ntasks-per-node=8    # 8 tasks per node (8 CPU cores)
#SBATCH --gres=gpu:4           # 4 GPUs per node
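
Putting these directives together, a minimal submit script might look like the sketch below. The job name, time limit, and program name are illustrative placeholders, not cluster requirements:

#!/bin/bash
#SBATCH --job-name=gpu-test        # placeholder job name
#SBATCH --partition=GPU            # GPU partition, as above
#SBATCH --nodes=1                  # 1 node
#SBATCH --ntasks-per-node=8        # 8 tasks (8 CPU cores)
#SBATCH --gres=gpu:4               # 4 GPUs
#SBATCH --time=01:00:00            # example 1-hour walltime; adjust for your job

srun ./my_gpu_program              # placeholder for your GPU executable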

Requesting a Particular Type of GPU

You can request a specific type of GPU by model name. Currently we have three types of NVIDIA GPUs to choose from:

  • Titan V
  • Titan RTX
  • Tesla V100s

You can specify the GPU model by modifying the "gres" directive, like so:

#SBATCH --gres=gpu:TitanV:4       # reserves 4 Titan V GPUs (max 8 Titan Vs per node)
#SBATCH --gres=gpu:TitanRTX:2     # reserves 2 Titan RTX GPUs (max 4 Titan RTXs per node)
#SBATCH --gres=gpu:V100S:1        # reserves 1 Tesla V100s GPU (max 4 Tesla V100s per node)
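
If you want to confirm which GPU models are configured on the nodes, sinfo can show each node's GRES (a quick check, assuming the partition is named GPU as above):

sinfo -p GPU -o "%N %G"       # node names and their configured GRES (GPU type:count)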

Submitting a Job

Once you are satisfied with the contents of your submit script, save it, then submit it to the Slurm Workload Manager. Here are some helpful commands for managing your job:

Submit Your Job: sbatch submit-script.slurm

Check the Queue: squeue

Show a Job's Detail: scontrol show job -d [job-id]

Cancel a Job: scancel [job-id]
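
For example, a typical session might look like the following (the job ID 123456 is illustrative, not a real job):

sbatch submit-script.slurm       # prints "Submitted batch job 123456"
squeue -u $USER                  # show only your jobs in the queue
scontrol show job -d 123456      # full detail for the job
scancel 123456                   # cancel it if needed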


For more information on GPU compute jobs on the cluster, please refer to https://oneit.charlotte.edu/urc/research-clusters/orion-gpu-slurm-user-notes.