...

I see that with dynamic slots, the parent slot (slot1) always seems to be Unclaimed and Idle, while the child slots (slot1_1) are Claimed and Busy.  So I tried checking the ChildState attribute, which looks like a list but doesn't behave like one.  For example, neither of these shows any slots:

condor_status -const 'ChildState == { "Claimed" }'

condor_status -const 'sum(ChildState) == 0'
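A possible alternative, sketched here under the assumption that ChildState is a ClassAd list of strings on the parent slot: the ClassAd list functions member() and size() may work where == and sum() do not, since sum() is meant for numeric lists and == is probably not defined for list operands. The variable names below are only illustrative, and this has not been checked against our pool.

```shell
# Assumption: ChildState is a ClassAd list of strings such as { "Claimed", "Claimed" }.

# Parent slots that have at least one child slot in the Claimed state.
claimed_children='member("Claimed", ChildState)'

# Parent slots that currently have no child slots at all.
no_children='size(ChildState) == 0'

# Only run the queries where the HTCondor tools are installed.
if command -v condor_status >/dev/null 2>&1; then
    condor_status -const "$claimed_children"
    condor_status -const "$no_children"
fi
```

If the second query matches nothing, it may be worth confirming that the attribute is actually advertised, for example with condor_status -long on the parent slot.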

HTCondor and Slurm

NRAO effectively has two use cases:  1) Operations-triggered jobs.  These are well-formulated pipeline jobs, though still fairly monolithic and long running (many hours to a few days).   2) User-triggered jobs, which are of course not well formulated.  We will be moving the operations jobs to HTCondor.   We plan to move the user-triggered jobs from Torque to Slurm.   There's enough noise in the two job loads that we don't want strict host carve-outs for type 1 and type 2 jobs.  What we anticipate doing is having one set of nodes known only to HTCondor for the bulk of operations, and another set of hosts controlled by Slurm for the user-facing jobs.   Periodically, when operations has a large set of jobs, we'd like those jobs to burst into the Slurm-controlled nodes.  We neither anticipate nor want the Slurm jobs to burst into the HTCondor set of nodes.

...