...
Queues: We want to keep the queue functionality of Torque/Moab where, for example, hera jobs go to hera nodes and vlass jobs go to vlass nodes. We would also like to be able to have vlasstest jobs go to the vlass nodes with a higher priority without preempting running jobs.
- Slurm
- Queues are called partitions. At some level they are called partitions in Torque as well.
- Job preemption is disabled by default.
- Allows for simple priority settings in partitions with the default PriorityType=priority/basic plugin.
- E.g. PartitionName=vlass Nodes=testpost[002-004] MaxTime=144000 State=UP Priority=1000
- HTCondor
- HTCondor doesn't have queues or partitions like Torque/Moab or Slurm but there are still ways to do what we need.
- Constraints and/or separate pools are good options.
- I don't know how to simulate the vlass/vlasstest queues. Perhaps by the time we move to HTCondor we won't need vlasstest anymore.
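On the Slurm side, the vlass/vlasstest arrangement described above might be sketched as two partitions that share the same nodes and differ only in Priority. This is a sketch under stated assumptions, not a tested configuration: the vlasstest Priority value of 2000 and the output file name are made up for illustration, and the vlass line is copied from the example above. PreemptType=preempt/none is Slurm's default and is spelled out only to make the "no preemption" intent explicit.

```shell
#!/bin/sh
# Sketch: write two partition definitions that share the same nodes.
# The output file name is hypothetical; the lines would be merged into
# slurm.conf by hand.
out="${OUT:-./partitions.sketch.conf}"

cat > "$out" <<'EOF'
# Preemption stays off (preempt/none is the Slurm default), so a higher
# partition Priority should only influence the order of pending jobs.
PreemptType=preempt/none
PartitionName=vlass Nodes=testpost[002-004] MaxTime=144000 State=UP Priority=1000
# Assumed value: anything above the 1000 used for vlass.
PartitionName=vlasstest Nodes=testpost[002-004] MaxTime=144000 State=UP Priority=2000
EOF

echo "wrote $out"
```

Running jobs in vlass would keep their nodes; pending vlasstest jobs should simply be considered ahead of pending vlass jobs.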
Interactive: The ability to assign all or part of a node to a user with shell level access (nodescheduler, qsub -I, etc.). Minimum granularity is per NUMA node; finer would be useful.
- What is it that we like about nodescheduler over something like qsub -I?
- It's not tied to any tty, so a user can log in multiple times from multiple places to their reserved node without requiring screen or tmux or VNC.
- Its creation is asynchronous. If the cluster is full you don't wait around for your reservation to start, you get an email message when it is ready.
- It's time limited (e.g. two weeks). We might be able to do the same with a queue/partition setting but could we then extend that reservation?
- We get to define the shape of a reservation (whole node, NUMA node, etc). If we just let people use qsub -I they could reserve all sorts of sizes which may be less efficient. Then again it may be more efficient. But either way it is simpler for our users.
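One way the nodescheduler properties listed above (no tty, asynchronous start with an email, a time limit, a fixed shape) might be approximated with Slurm is a batch job that only holds the node. This is a minimal sketch, not a replacement for nodescheduler. The partition name "interactive", the email address, the two-week limit, the whole-node shape, and the `sleep infinity` placeholder are all assumptions for illustration. The script is a dry run by default and only prints the sbatch command.

```shell
#!/bin/sh
# Sketch of a nodescheduler-like request on Slurm.
# Assumptions (not from the notes above): the partition name, the email
# address, the 14-day limit, and the placeholder job body.
reserve_cmd() {
  partition="$1"
  email="$2"
  # --exclusive   : whole node, so every reservation has the same shape
  # --time        : job is ended by Slurm after 14 days
  # --mail-type   : email when the job starts, so nobody waits at a prompt
  # --wrap        : placeholder body that just keeps the allocation alive
  printf 'sbatch --partition=%s --exclusive --time=14-00:00:00 --mail-type=BEGIN --mail-user=%s --wrap="sleep infinity"\n' "$partition" "$email"
}

cmd=$(reserve_cmd interactive user@example.org)
echo "$cmd"

# Set RUN=1 on a host with Slurm installed to actually submit.
if [ "${RUN:-0}" = "1" ]; then
  eval "$cmd"
fi
```

Once the job starts, the user would ssh to the allocated node as often as needed. As for extending a reservation, an administrator can usually raise a running job's limit with `scontrol update jobid=<id> TimeLimit=<new limit>`; whether that fits our workflow is untested.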
Access: We would like to prevent users from logging in to nodes unless they have a proper reservation.
- Slurm
- Has a pam_slurm.so module similar to pam_pbssimpleauth.so.
- HTCondor
- Since I don't think we will be using nodescheduler with HTCondor, this isn't needed.
Reservations: The ability to reserve nodes far in the future for things like CASA classes and SIW would be very helpful. It would need to prevent HTCondor from starting jobs on these nodes as reservation time approaches.
- Slurm
- scontrol create reservation starttime=now duration=5 nodes=testpost001 user=root
- scontrol create reservation starttime=2022-05-3T08:00:00 duration=21-0:0:0 nodes=nmpost[020-030] user=root reservationname=siw2022
- scontrol show res (The output of this kinda sucks. Hopefully there is a better way to see all the reservations.)
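For viewing reservations, two alternatives to plain `scontrol show res` may be easier to read: `sinfo -T` prints one table row per reservation, and `scontrol -o show res` prints each reservation on a single line. The sketch below only wraps those two commands in a hypothetical `run` helper so it can be read and tried without Slurm installed. It is a dry run unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch: more compact ways to list reservations.
# "run" is a hypothetical helper: it prints the command unless RUN=1.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "would run: $*"
  fi
}

# One table row per reservation (name, state, start, end, duration, nodes).
run sinfo -T

# One line per reservation, convenient for grep or awk.
run scontrol -o show res
```

On a Slurm host, `RUN=1 sh thisscript.sh` (the script name is a placeholder) would execute both commands for real.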
...