...
Reaper: Clean nodes of unwanted files, dirs and procs. I don't think HTCondor will need this.
- Slurm
- The pam_slurm_adopt.so PAM module is supposed to track and kill errant processes, but it conflicts with systemd and therefore requires some special tweaking.
- HTCondor
- Seems to handle /tmp, /var/tmp, and /dev/shm properly because it uses fake versions of these dirs for each job.
- It seems to handle errant processes as well.
...
Reaper: Cancel jobs when accounts are closed. This could be a cron job on the Central Manager that looks at all the owners of jobs and kills jobs of any user that is not active.
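A rough sketch of what that cron job could look like. Everything here is an assumption rather than a tested setup: the `is_active` check is a placeholder that only tests whether the account still exists locally (a real site would probably query its account database), the function names are made up for illustration, and the script assumes it runs as an HTCondor administrator on a host where `condor_q -allusers` can see the queue (a Central Manager without a schedd might need `condor_q -global` instead). It defaults to a dry run.

```python
import pwd
import subprocess


def job_owners():
    """Return the set of users that currently own jobs in the queue."""
    out = subprocess.run(
        ["condor_q", "-allusers", "-autoformat", "Owner"],
        check=True, capture_output=True, text=True,
    ).stdout
    return {line.strip() for line in out.splitlines() if line.strip()}


def is_active(user):
    """Placeholder check (assumption): the account still exists locally."""
    try:
        pwd.getpwnam(user)
    except KeyError:
        return False
    return True


def owners_to_reap(owners, active_check=is_active):
    """Return the sorted list of job owners whose accounts are not active."""
    return sorted(u for u in owners if not active_check(u))


def main(dry_run=True):
    for user in owners_to_reap(job_owners()):
        if dry_run:
            print(f"would remove all jobs owned by {user}")
        else:
            # condor_rm accepts a user name and removes that user's jobs.
            subprocess.run(["condor_rm", user], check=True)


if __name__ == "__main__":
    main()
```

Keeping the decision in `owners_to_reap` separate from the HTCondor calls makes it possible to check the logic with a fake `active_check` before letting it remove anything.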
Node priority: With Torque/Moab we can control the order in which the scheduler picks nodes. This allows us to run jobs on the faster nodes by default.
- Slurm
- The order of the nodes in PartitionName is not important, but you can set a Weight on a NodeName entry. Nodes with the lowest weight will be chosen first.
- HTCondor
- There isn't a simple list like pbsnodes in Torque, but there is NEGOTIATOR_PRE_JOB_RANK, which can be used to weight nodes by CPU, memory, etc.
- Slurm
...