...

  • Reservations: The ability to reserve nodes far in the future for things like CASA classes and SIW would be very helpful. The reservation would also need to prevent HTCondor from starting jobs on these nodes as the reservation time approaches.

    • Slurm
      • scontrol create reservation starttime=now duration=5 nodes=testpost001 user=root
      • scontrol create reservation starttime=2022-05-03T08:00:00 duration=21-0:0:0 nodes=nmpost[020-030] user=root reservationname=siw2022
      • scontrol show res  The output of this is verbose and hard to read. Hopefully there is a better way to see all the reservations.
    • HTCondor
      • This isn't really something HTCondor is designed to do.  We will use Slurm for this.
  • Ability to run jobs remotely (AWS, CHTC, OSG, etc)

    • Slurm
      • I don't think we will need this ability with Slurm.
    • HTCondor
      • I have tested both condor_annex to AWS and flocking to CHTC.
  • Array jobs: Do we want to keep the Torque array job functionality?

    • Slurm
      • #SBATCH --array=0-3%2 This syntax is very similar to Torque.
    • HTCondor
      • To some extent, this isn't how HTCondor is meant to be used. That said, DAGMan and the queue command can simulate it.
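As a rough illustration of the Slurm array syntax above, here is a minimal batch script sketch. It assumes one input file per array index; the input_N.txt naming is hypothetical and not taken from any existing workflow. The %2 suffix caps the array at two tasks running at the same time, and Slurm exports SLURM_ARRAY_TASK_ID to each task.

```shell
#!/bin/sh
#SBATCH --array=0-3%2     # four tasks (indices 0..3), at most 2 running at once
#SBATCH --ntasks=1        # each array task is a single-core job (illustrative)

# Slurm sets SLURM_ARRAY_TASK_ID for every array task. Default to 0 so the
# script can also be run by hand outside of Slurm.
task_id="${SLURM_ARRAY_TASK_ID:-0}"

# Hypothetical naming scheme: one input file per array index.
input_file="input_${task_id}.txt"

echo "array task ${task_id} would process ${input_file}"
```

Submitted with sbatch, this should produce four tasks, each seeing a different value of SLURM_ARRAY_TASK_ID.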
  • MPI: We have some users who use MPI across multiple nodes. It would be nice to keep that as an option.

    • Slurm
      • mpich2
        • PATH=${PATH}:/usr/lib64/mpich/bin salloc --ntasks=8 mpiexec mpiexec.sh
        • PATH=${PATH}:/usr/lib64/mpich/bin salloc --nodes=2 mpiexec mpiexec.sh
      • OpenMPI
        • Use #SBATCH to request a number of tasks (cores) and then run mpiexec or mpicasa as normal.
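The OpenMPI note above might look something like the following batch script. This is only a sketch under assumptions: the task count of 8 is illustrative, mpiexec.sh is simply the program name reused from the mpich examples, and it assumes an OpenMPI build with Slurm support so that mpiexec picks up the allocation without an explicit task count. The guard around mpiexec is there only so the script does nothing harmful when run outside a configured environment.

```shell
#!/bin/sh
#SBATCH --ntasks=8        # number of MPI tasks (cores); 8 is illustrative

# Inside a job Slurm exports SLURM_NTASKS; fall back to 1 outside of Slurm.
ntasks="${SLURM_NTASKS:-1}"
echo "allocation provides ${ntasks} task(s)"

# Assumption: this OpenMPI build has Slurm support, so mpiexec reads the
# allocation itself and needs no explicit task count.
# mpiexec.sh is the program name used in the mpich examples.
if command -v mpiexec >/dev/null 2>&1 && [ -e mpiexec.sh ]; then
    mpiexec mpiexec.sh
else
    echo "mpiexec or mpiexec.sh not found; nothing launched" >&2
fi
```

Submitting this with sbatch keeps the resource request in the script itself, instead of on the salloc command line as in the mpich examples.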

...