I have an idea for how to make one OS image that can be used for the Torque, HTCondor, and Slurm clusters, such that HTCondor jobs can glide in to the Slurm cluster. First we employ /etc/sysconfig/condor. If this file sets CONDOR_CONFIG to a config file that sets START_MASTER = False, then HTCondor will not start. If it sets CONDOR_CONFIG to /etc/condor/condor_config, or isn't set at all, then HTCondor will start normally. Next we change LOCAL_CONFIG_FILE in /etc/condor/condor_config to /var/run/condor/condor_config.local, which can be modified locally on the diskless host. This allows the Slurm pilot job to create that config file with the DAEMON_SHUTDOWN rules.
Now a special pilot job can be submitted to the Slurm cluster that starts an HTCondor startd by running /usr/sbin/condor_master -f, and therefore makes the node into an HTCondor execution host. This works because the OS is configured as an execution host for HTCondor as well as Slurm (and probably Torque), even though it doesn't start HTCondor on boot. This way, when the pilot job starts condor_master, which starts condor_startd, the node announces itself as an execution host to the central manager. When there are no more HTCondor jobs to run, the startd will exit, then the master will exit, then the Slurm pilot job will do some cleanup and exit, and the node will go back to being just a Slurm node.
CONDOR_CONFIG
The condor_startd reads the CONDOR_CONFIG environment variable, if it exists, to find its config file instead of the default /etc/condor/condor_config, and exits with an error if there is a problem reading that file.
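For example, pointing CONDOR_CONFIG at an alternate file only affects the current shell, which is easy to verify with condor_config_val -config (it lists the config files actually in use). The glidein-slurm.conf path here is just an illustration, not a file that exists yet:

```shell
# Point HTCondor daemons and tools launched from this shell at an
# alternate config file (hypothetical path, for illustration only).
export CONDOR_CONFIG=/etc/condor/glidein-slurm.conf

# Show which config files HTCondor would actually read with this setting.
condor_config_val -config
```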
...
The condor_startd daemon will shut down gracefully and not be restarted if the ClassAd expression STARTD.DAEMON_SHUTDOWN evaluates to True. E.g.
STARTD.DAEMON_SHUTDOWN = size(ChildState) == 0 && size(ChildActivity) == 0 && (MyCurrentTime - EnteredCurrentActivity) > 600
MASTER.DAEMON_SHUTDOWN = STARTD_StartTime == 0
...
The condor.service unit in systemd reads /etc/sysconfig/condor as an EnvironmentFile, but systemd does not evaluate shell syntax in it. So adding something like the following to /etc/sysconfig/condor won't work. Besides, this would cause HTCondor to fail if that file didn't exist, and that isn't what I want.
CONDOR_CONFIG=$(cat /var/run/condor/config)
I could instead add a second EnvironmentFile like so
EnvironmentFile=-/etc/sysconfig/condor
EnvironmentFile=-/var/run/condor/config
where /var/run/condor/config sets CONDOR_CONFIG=/etc/condor/condor_config
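Rather than editing the packaged unit file, the extra EnvironmentFile line could go in a systemd drop-in (the path below is the standard drop-in location, assuming the service is named condor.service):

```
# /etc/systemd/system/condor.service.d/glidein.conf
[Service]
EnvironmentFile=-/var/run/condor/config
```

EnvironmentFile= directives accumulate, and later files override earlier ones, so a setting in /var/run/condor/config would win over /etc/sysconfig/condor. A systemctl daemon-reload is needed after creating the drop-in.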
But I can use this to keep HTCondor from starting, just like I do with Torque and Slurm. I can set CONDOR_CONFIG=/dontstartcondor in /etc/sysconfig/condor in the OS image and override it with a snapshot. Then stop setting 99-nrao as a snapshot.
...
All three schedulers (Torque, Slurm, HTCondor) will be configured to start via systemd. The files pbs_mom, slurm, and condor in /etc/sysconfig will be set such that all of these schedulers fail to start on boot.
/etc/sysconfig/pbs_mom: PBS_ARGS="-h"
/etc/sysconfig/slurm:slurmd: SLURMD_OPTIONS="-h"
/etc/sysconfig/condor: CONDOR_CONFIG=/nosuchfile/etc/condor/condor_off
Where /etc/condor/condor_off is a copy of /etc/condor/condor_config with LOCAL_CONFIG_DIR commented out and START_MASTER = False added.
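A sketch of how /etc/condor/condor_off would differ from the stock config (assuming the stock file sets LOCAL_CONFIG_DIR to the usual /etc/condor/config.d):

```
# /etc/condor/condor_off: identical to /etc/condor/condor_config except:
#LOCAL_CONFIG_DIR = /etc/condor/config.d
START_MASTER = False
```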
If any of these schedulers are wanted to start on boot, the appropriate /etc/sysconfig file (pbs_mom, slurm, condor) will be altered via a snapshot.
/etc/sysconfig/pbs_mom: PBS_ARGS=""
/etc/sysconfig/slurm:slurmd: SLURMD_OPTIONS="--conf-server testpost-serv-1"
/etc/sysconfig/condor: CONDOR_CONFIG=/etc/condor/condor_config
...
The Pilot job submitted to Slurm will use one of the two following options, depending on results from my testing. This will start HTCondor because, unlike the systemd unit file, running condor_master by hand doesn't check /etc/sysconfig/condor.
echo 'CONDOR_CONFIG=/etc/condor/glidein-slurm.conf' > /var/run/condor/config
echo 'STARTD.DAEMON_SHUTDOWN = size(ChildState) == 0 && size(ChildActivity) == 0 && (MyCurrentTime - EnteredCurrentActivity) > 600' > /var/run/condor/condor_config.local
echo 'MASTER.DAEMON_SHUTDOWN = STARTD_StartTime == 0' >> /var/run/condor/condor_config.local
/usr/sbin/condor_master -f
rm -f /var/run/condor/condor_config.local
rm -f /var/run/condor/config
exit
or
echo 'CONDOR_CONFIG=/etc/condor/glidein-slurm.conf' > /var/run/condor/config
echo 'STARTD.DAEMON_SHUTDOWN = State == "Unclaimed" && Activity == "Idle" && (MyCurrentTime - EnteredCurrentActivity) > 600' > /var/run/condor/condor_config.local
systemctl start condor
# loop until condor_startd is no longer a running process
while pgrep -x condor_startd > /dev/null ; do sleep 30 ; done
systemctl stop condor
rm -f /var/run/condor/condor_config.local
rm -f /var/run/condor/config
exit
If the Payload job is very small and exits before the Pilot job can start blocking on condor_startd, then the Pilot job may never end. So it may need some code to exit after some amount of time if condor_startd hasn't been seen.
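A minimal sketch of that guard, assuming pgrep is available and that 600 seconds is a long enough startup window (both assumptions to verify in testing):

```shell
#!/bin/sh
# Wait for condor_startd to appear, but give up after STARTUP_TIMEOUT seconds
# so the Pilot job cannot hang forever if the Payload has already come and gone.
STARTUP_TIMEOUT=600
waited=0
while ! pgrep -x condor_startd > /dev/null ; do
    if [ "${waited}" -ge "${STARTUP_TIMEOUT}" ] ; then
        echo "condor_startd never appeared after ${STARTUP_TIMEOUT}s; giving up" >&2
        exit 1
    fi
    sleep 10
    waited=$((waited + 10))
done

# condor_startd was seen; now block until it exits.
while pgrep -x condor_startd > /dev/null ; do
    sleep 30
done
```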
If the Pilot job starts condor_master then I may not need to add the EnvironmentFile=-/var/run/condor/config line in the condor unit file.
Factory
The factory process that watches the clusters and launches Pilot jobs should be pretty simple
If jobs are waiting in the HTCondor cluster (perhaps only vlapipe jobs)
If nodes are available in the Slurm Cluster (If not perhaps send email)
Launch one Pilot job
Sleep some amount of time, presumably more than the time HTCondor takes to launch a job
Problems
Ideas
...
Factory
The factory process that watches the clusters and launches Pilot jobs should be a pretty simple cron job
PILOT_JOB=/lustre/aoc/admin/tmp/krowe/pilot.sh
idle_condor_jobs=$(condor_q -global -allusers -constraint 'JobStatus == 1' -format "%d\n" 'ServerTime - QDate' | sort -nr | head -1)
#krowe Jul 21 2021: when there are no jobs, condor_q -global returns 'All queues are empty'. Let's reset that.
if [ "${idle_condor_jobs}" = "All queues are empty" ] ; then
idle_condor_jobs=""
fi
# Is there at least one free node in Slurm?
free_slurm_nodes=$(sinfo --states=idle --Format=nodehost --noheader)
# launch one pilot job
if [ -n "${idle_condor_jobs}" ] ; then
if [ -n "${free_slurm_nodes}" ] ; then
if [ -f "${PILOT_JOB}" ] ; then
sbatch --quiet ${PILOT_JOB}
fi
fi
fi
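Assuming the factory script above is saved somewhere like /lustre/aoc/admin/tmp/krowe/factory.sh (hypothetical path), and that 10 minutes is comfortably longer than the time HTCondor takes to launch a job, the "sleep some amount of time" step is just the cron interval:

```
# m h dom mon dow command
*/10 * * * * /lustre/aoc/admin/tmp/krowe/factory.sh
```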