...
Reaper: Clean nodes of unwanted files, dirs and procs. I don't think HTCondor will need this.
Slurm
- If I run vncserver via Torque, my reaper script has to kill a bunch of processes when the job is done. But when I run vncserver via Slurm those processes remain. So we will need some sort of reaper-type script for Slurm.
- There is pam_slurm_adopt.so, which supposedly tracks and kills errant processes, but it conflicts with systemd and therefore requires some special tweaking.
- https://slurm.schedmd.com/pam_slurm_adopt.html
- I have been unable to get pam_slurm_adopt.so to work. It continually reports "you have no active jobs on this node" even though I clearly do.
- Adding PrologFlags=contain to slurm.conf seems to prevent any job from running.
- Aha. I may have it working. You have to add PrologFlags=contain, EXACTLY the same, to both the client and server slurm.conf files or Slurm doesn't work. Man, this software is fragile!
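If pam_slurm_adopt.so turns out to be too fragile, a reaper run as a Slurm Epilog might look roughly like the sketch below. It is untested against a real cluster and rests on several assumptions: that the Epilog environment provides SLURM_JOB_ID, SLURM_JOB_UID and SLURMD_NODENAME, that regular users have UIDs of 1000 or higher (MIN_UID is my own threshold), and that killing every process owned by the user is acceptable once they have no other jobs on the node. The `--kill` flag and all function names are hypothetical; without `--kill` the script only prints what it would do.

```python
#!/usr/bin/env python3
"""Sketch of a Slurm Epilog reaper: kill a user's leftover processes on this
node once they have no other jobs here."""
import os
import signal
import subprocess
import sys

MIN_UID = 1000  # assumption: regular users start here; never reap system accounts


def other_job_ids(squeue_output, current_job_id):
    """Return the job IDs in squeue's output other than the finishing job."""
    ids = [line.strip() for line in squeue_output.splitlines() if line.strip()]
    return [job_id for job_id in ids if job_id != str(current_job_id)]


def jobs_on_node(uid, node):
    """Ask squeue for this user's jobs on this node, one job ID per line."""
    result = subprocess.run(
        ["squeue", "--noheader", "--format=%A", f"--user={uid}", f"--nodelist={node}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def pids_owned_by(uid, proc_root="/proc"):
    """Return the sorted PIDs whose /proc entry is owned by uid."""
    pids = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries such as /proc/self
        try:
            if os.stat(os.path.join(proc_root, name)).st_uid == uid:
                pids.append(int(name))
        except FileNotFoundError:
            continue  # the process exited while we were scanning
    return sorted(pids)


def main(argv, env):
    job_id = env.get("SLURM_JOB_ID")
    uid = env.get("SLURM_JOB_UID")
    node = env.get("SLURMD_NODENAME")
    if not (job_id and uid and node):
        return  # not running as a Slurm Epilog; do nothing
    uid = int(uid)
    if uid < MIN_UID:
        return  # leave root and system accounts alone
    try:
        listing = jobs_on_node(uid, node)
    except (subprocess.CalledProcessError, FileNotFoundError):
        return  # squeue failed or is missing; safer to kill nothing
    if other_job_ids(listing, job_id):
        return  # the user still has other jobs here; leave their processes alone
    for pid in pids_owned_by(uid):
        if "--kill" in argv:
            try:
                os.kill(pid, signal.SIGKILL)
            except ProcessLookupError:
                pass  # already gone
        else:
            print(f"would kill {pid}")


if __name__ == "__main__":
    main(sys.argv[1:], os.environ)
```

The finishing job is filtered out of the squeue listing because it may still be listed while the Epilog runs. The script always exits normally when it decides to do nothing, since I would not want a reaper failure to take the node out of service.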
HTCondor
- I am unable to run vncserver via HTCondor as a test like I did with Torque and Slurm.
- Seems to handle /tmp, /var/tmp, and /dev/shm properly because it uses fake versions of these dirs for each job.
- It seems to handle errant processes as well.
- There is also condor_preen, which cleans HTCondor directories like /var/lib/condor/spool/...
...