...

  • Port nodeextendjob to Slurm
  • DONE: Port nodesfree to Slurm
  • DONE: Port nodereboot to Slurm *scontrol reboot ASAP reason=testing testpost001*
  • Create a subset of the testpost cluster that runs only Slurm, for admins to test.
    • Install Slurm on testpost-serv-1, testpost-master, and the OS image
    • Install the Slurm reaper on the OS image
  • Create a small subset of the nmpost cluster that runs only Slurm, for users to test.
    • Install Slurm on nmpost-serv-1, nmpost-master, herapost-master, and the OS image
    • Install the Slurm reaper on the OS image
    • Need at least 4 nodes: batch, interactive, vlass/vlasstest, hera/hera-i
  • Identify stakeholders (e.g. operations, DAs, sci-staff, SSA, HERA, observers) and give them the chance to test Slurm and provide feedback
  • Implement the useful feedback
  • Set a date to transition the remaining cluster to Slurm, possibly before we have to pay for Torque again around June 2022.
  • Do another pass on the documentation at https://info.nrao.edu/computing/guide/cluster-processing
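
As a rough illustration of what the nodereboot and nodesfree ports involve, here is a minimal Python sketch. It is not the actual ported scripts: the function names reboot_command, idle_nodes, and query_idle_nodes are hypothetical, and it assumes that "free" means a node whose Slurm state is exactly idle. The only Slurm calls used are scontrol reboot and sinfo with the -h, -N, and -o options.

```python
import subprocess


def reboot_command(nodes, reason, asap=True):
    """Build the argument list for `scontrol reboot` (hypothetical helper)."""
    argv = ["scontrol", "reboot"]
    if asap:
        # ASAP drains the node so it reboots as soon as running jobs finish.
        argv.append("ASAP")
    argv.append(f"reason={reason}")
    # scontrol accepts a comma-separated node list.
    argv.append(",".join(nodes))
    return argv


def idle_nodes(sinfo_output):
    """Return sorted node names whose state is exactly 'idle'.

    Expects one "<node> <state>" pair per line, as produced by
    `sinfo -h -N -o "%N %T"`. Treating only 'idle' as free is an assumption.
    """
    nodes = set()  # a node in several partitions is listed once per partition
    for line in sinfo_output.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[1] == "idle":
            nodes.add(fields[0])
    return sorted(nodes)


def query_idle_nodes():
    """Run sinfo and return the idle nodes (requires a working Slurm install)."""
    result = subprocess.run(
        ["sinfo", "-h", "-N", "-o", "%N %T"],
        check=True,
        capture_output=True,
        text=True,
    )
    return idle_nodes(result.stdout)
```

For example, reboot_command(["testpost001"], "testing") yields the same command as the one noted in the nodereboot item above. Keeping the parsing separate from the sinfo call lets the logic be checked on a machine without Slurm.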

...