- Upload your code and data to your KoKo home directory using Globus, FileZilla, or SCP.
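For example, to copy a project directory from your own machine to your home directory over SCP (the user name and directory below are placeholders; substitute your own):
scp -r ./my_project username@koko-login.fau.edu:~/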
- Choose the type of job you would like to run (salloc, srun, sbatch)
- salloc requests and holds an allocation on the cluster so you can run interactive jobs using srun, mpiexec, and other applications. See srun for more information.
For more information, execute “man salloc” in a koko-login.fau.edu terminal.
- Example:
salloc -N 1 --exclusive
srun hostname
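Depending on how the cluster is configured, the salloc command above typically starts a new shell that owns the allocation; commands launched with srun in that shell run on the allocated node, and leaving the shell releases it:
exit # end the interactive session and release the allocation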
- srun requests an allocation on the cluster if one has not already been granted by salloc, and then executes the specified command.
For more information, execute “man srun” in a koko-login.fau.edu terminal.
- Example:
srun -N 1 --exclusive hostname # print the hostname of the allocated node
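srun can also launch several copies of a program at once; the line below is only a sketch (the node and task counts are arbitrary and limited by the partition you use):
srun -N 2 -n 4 hostname # run hostname as four tasks spread over two nodes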
- sbatch executes a task in the background, detached from the current terminal. It allocates resources in a similar way to salloc and logs the results. If your computer loses its connection to the cluster, sbatch tasks will continue to run, making this a very powerful command. An example of an sbatch task is provided below.
- Create a script named {JOBNAME}.sh to start your job containing the following:
#!/bin/sh
#SBATCH --partition=shortq7
#SBATCH -N 1
#SBATCH --exclusive
#SBATCH --mem-per-cpu=16000
# Load modules, if needed, run staging tasks, etc. (see Koko Software Modules)
# Execute the task
srun hostname
- Run chmod +x {JOBNAME}.sh to make the job script executable.
- Submit the job using the sbatch command.
sbatch {JOBNAME}.sh
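SLURM prints a job ID when the script is accepted (“Submitted batch job <jobid>”) and, unless you override it, writes the job's output to slurm-<jobid>.out in the directory you submitted from. A sketch of checking the result (the job ID here is hypothetical):
cat slurm-2553999.out # inspect the job's output once it has been written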
- Please adjust the partition (queue), application to execute, memory, tasks, and heap sizes as needed for these different examples to create your own. If you need help, please let us know by submitting a ticket to the Help Desk.
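As an illustration only, a variant for a longer multi-task run might look like the sketch below; the partition name longq7 comes from the sinfo listing further down, while the module and program names are placeholders you would replace with your own:
#!/bin/sh
#SBATCH --partition=longq7   # longer time limit than shortq7 (see sinfo below)
#SBATCH --nodes=2            # two nodes
#SBATCH --ntasks=8           # eight tasks in total
#SBATCH --mem-per-cpu=4000   # memory per CPU in MB
#SBATCH --time=1-00:00:00    # wall-clock limit of one day
module load openmpi          # placeholder module; list real ones with "module avail"
srun ./my_program input.dat  # placeholder program and input file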
- You can print a list of queues (partitions) with the sinfo command.
[user@koko-login2 ~]$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
shortq7*     up    2:00:00      2  down* node[001,056]
shortq7*     up    2:00:00      3    mix node[009,030-031]
shortq7*     up    2:00:00     37  alloc node[002-006,008,011-014,019-020,051,054,057-058,062-065,067-081,083-084]
shortq7*     up    2:00:00     30   idle gpu-exxact[1-5],gpu-k80,node[007,010,027-029,032,052-053,059-061,082,087-098]
longq7       up 7-12:00:00      2  down* node[001,056]
longq7       up 7-12:00:00      1    mix node009
longq7       up 7-12:00:00     37  alloc node[002-006,008,011-014,019-020,051,054,057-058,062-065,067-081,083-084]
longq7       up 7-12:00:00      6   idle node[007,010,059-061,082]
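If you only care about one partition, sinfo can filter on it (the partition name here is taken from the listing above):
sinfo -p longq7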
- You can see the status of jobs in the queue using the squeue command.
[user@koko-login2 ~]$ squeue
  JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
2552838    longq7 MS1b_m14    usera PD       0:00      1 (Dependency)
2553360    longq7 TS_71-72    userb PD       0:00      2 (Resources)
2553425    longq7 TS_69-70    userb PD       0:00      2 (Resources)
2552836    longq7 MS1b_m14    userc  R 4-12:08:47      1 node002
2552837    longq7 MS1b_m14    userc  R 5-17:38:36      1 node063
2553116    longq7 Homoseri    userd  R 7-03:41:54      1 node078
2553117    longq7 Homoseri    userd  R 7-02:29:54      1 node071
2553157    longq7 HGE_V1_1    usere  R 6-07:50:41      1 node003
2553288    longq7 Cys_Ket_    userf  R 4-21:02:05      1 node004
...
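To show only your own jobs rather than the whole queue, filter by user name:
squeue -u $USER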
For more information regarding SLURM, see the manuals and their quick start guide.