...
- To Do
- Change Slurm so that nodes come up properly after a reboot instead of being marked "unexpectedly rebooted"
- Upgrade testpost-master to RHEL7 so it can run Slurm
- Upgrade nmpost-master to RHEL7 so it can run Slurm
- Configure gibson so that it can flock to CHTC
- Implement a mechanism to keep vlass jobs on vlass nodes, hera jobs on hera nodes, etc.
- Document how to use Slurm with emphasis on transitioning from Torque/Moab
- Create a small subset of the nmpost cluster that runs only Slurm, for users to test
- Identify stakeholders (e.g. operations, DAs, sci-staff, SSA, observers) and give them the chance to test Slurm and provide feedback
- Implement the useful feedback
- Set a date to transition the remaining cluster to Slurm
- DONE
- DONE: Set a PoolName for the testpost and nmpost clusters, e.g. NRAO-NM-PROD and NRAO-NM-TEST. They don't have to be all caps.
- DONE: Change Slurm so that nodes come up properly after a reboot instead of being marked "unexpectedly rebooted". Fixed by setting ReturnToService=2 in slurm.conf.
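
For reference, a minimal sketch of the ReturnToService change noted above. Only the parameter and its value come from these notes; the comment describes the documented behaviour of the setting, and the rest of slurm.conf on the testpost and nmpost masters is not shown here and is assumed unchanged.

```
# slurm.conf (fragment, not a complete configuration)
# ReturnToService=2: a node that was marked DOWN, for any reason including
# an unexpected reboot, returns to service once slurmd registers with a
# valid configuration.
ReturnToService=2
```

The active value can be checked on a running cluster with `scontrol show config | grep ReturnToService`.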