...
Say we have two clusters (HTCondor and Slurm) and both can be submitted to from the same host. We want the HTCondor jobs to use the Slurm cluster resources when the HTCondor cluster resources are full, but we probably don't want to support preemption. How could we have HTCondor submit jobs to a Slurm cluster? (HTCondor-C, flocking, overlapping, batch-grid-type, HTCondor-CE, etc)
ANSWER: Write our own 'factory' that watches HTCondor and, when it is full, submits Pilot jobs to Slurm. The Pilot jobs launch startd daemons, allowing the Payload jobs waiting in HTCondor to run. We will want to set the startd to exit after being idle for a little while, run the Pilot job as root, and figure out how to do cgroups properly.
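The core decision such a factory has to make can be sketched as below. This is only a rough illustration, not an existing tool: the function name, its parameters, and the cap on pilots are all hypothetical, and the counts are assumed to come from whatever queries the factory runs against HTCondor and Slurm.

```python
def pilots_to_submit(idle_payload_jobs, unclaimed_slots, pilots_pending,
                     max_pilots, slots_per_pilot=1):
    """Return how many new Pilot jobs a factory might send to Slurm.

    All arguments are hypothetical inputs gathered elsewhere:
      idle_payload_jobs: Payload jobs waiting in the HTCondor queue
      unclaimed_slots:   HTCondor slots that are still free
      pilots_pending:    Pilot jobs already queued or running in Slurm
      max_pilots:        site-chosen cap on simultaneous Pilot jobs
      slots_per_pilot:   slots each Pilot's startd is assumed to provide
    """
    # Payload jobs that the existing HTCondor slots cannot absorb.
    unmet = max(0, idle_payload_jobs - unclaimed_slots)
    # Pilots already in Slurm are assumed to cover part of that demand.
    remaining = max(0, unmet - pilots_pending * slots_per_pilot)
    # Round up to whole pilots.
    wanted = -(-remaining // slots_per_pilot)
    # Never exceed the cap.
    room = max(0, max_pilots - pilots_pending)
    return min(wanted, room)


if __name__ == "__main__":
    print(pilots_to_submit(idle_payload_jobs=10, unclaimed_slots=2,
                           pilots_pending=3, max_pilots=20))
```

For the "exit after being idle" part, the HTCondor configuration knob STARTD_NOCLAIM_SHUTDOWN (seconds a startd runs without a claim before shutting down) looks like a candidate for the Pilot's startd, though these notes do not confirm that is what was used.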
Glidein
The only documentation I can find on glidein (https://htcondor.readthedocs.io/en/latest/grid-computing/introduction-grid-computing.html?highlight=glidein#introduction) seems to imply that glidein only works with Globus: "HTCondor permits the temporary addition of a Globus-controlled resource to a local pool. This is called glidein." Is this correct? Is there better documentation? Is glidein even a technology or software package, or is it just a generic term?
ANSWER: Greg will look at rewriting this.
Batch Grid Type
What is it? Can I use it to submit HTCondor jobs to Slurm or Torque? How?
request_virtualmemory
If I set request_virtualmemory = 2G, condor_submit accepts it as a valid knob but the job stays idle and never runs.
...
If I set request_virtualmemory = 2000000, which should be the same as 2G, the job runs but doesn't set memory.memsw.limit_in_bytes in the cgroup.
ANSWER: krowe sent mail to Greg about it.
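As a sanity check on the "2000000 should be the same as 2G" assumption, the arithmetic can be worked through as below. This is a sketch only: it assumes "G" means GiB (binary) and that a bare number is read as KiB, neither of which these notes confirm. The helper names are made up for illustration.

```python
KIB_PER_MIB = 1024
MIB_PER_GIB = 1024


def gib_to_kib(gib):
    """Convert GiB to KiB (assumes binary units)."""
    return gib * MIB_PER_GIB * KIB_PER_MIB


def kib_to_gib(kib):
    """Convert KiB back to GiB (assumes binary units)."""
    return kib / (MIB_PER_GIB * KIB_PER_MIB)


if __name__ == "__main__":
    # Under the KiB assumption, exactly 2 GiB is 2097152, not 2000000.
    print(gib_to_kib(2))
    # 2000000 KiB is a little less than 2 GiB.
    print(round(kib_to_gib(2000000), 3))
```

Under those assumptions the two values are close but not identical, so the difference in behavior between them may come from how the suffix is parsed rather than from the size itself.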
...
Memory usage report
The memory usage report at the end of the condor log seems incorrect. I can watch the memory.max_usage_in_bytes in the cgroup get over 8,400MB yet the report in the condor log reads 6,464MB. Does the log only report the memory usage of the parent process and not include all the children? Is it an average memory usage over time?
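The cgroup-side number in this comparison can be read with a small helper like the one below. It is a sketch under assumptions: the cgroup directory is passed in by the caller (the real path for a given job is not stated in these notes), the file is the cgroup v1 memory.max_usage_in_bytes counter mentioned above, and "MB" is taken to mean MiB.

```python
from pathlib import Path


def cgroup_peak_mib(cgroup_dir):
    """Return the peak memory usage recorded for a cgroup, in MiB.

    cgroup_dir is a hypothetical path to the job's cgroup v1 memory
    directory; the counter covers every process in the cgroup, not
    just the parent.
    """
    raw = Path(cgroup_dir, "memory.max_usage_in_bytes").read_text()
    return int(raw.strip()) / (1024 * 1024)
```

Logging this value at job exit and putting it next to the figure in the condor log would show whether the gap is consistent across jobs.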
...
ANSWER: idtokens. Host-based and poolpassword are not sufficient to identify users and allow for this (and probably condor_submit -remote).
HTCondor Workshop vs Condor Week
ANSWER: Essentially it is "Condor Week Europe". Mostly the same talks but different customer presentations. Could be interesting for the different customer presentations.