...

  • DONE: Access: We would like to prevent users from logging in to nodes unless they have a proper reservation.  Right now we restrict access via /etc/security/access.conf and use Torque's pam_pbssimpleauth.so to allow access for any user running a job.

    • Slurm
    • HTCondor
      • How do we restrict access to condor nodes to only those users with valid jobs running?
      • Even with the restrictions in access.conf, HTCondor can still run jobs as users like krowe2.  This is probably because HTCondor doesn't go through the login mechanism; it just starts shells as the user.
    • OpenPBS
      • Doesn't come with a PAM module, and the Torque PAM module doesn't work with OpenPBS.
      • restrict_user and restrict_user_exceptions work in the mom_priv/config file, but there is a maximum of 10 user exceptions.  With a PAM module we could make as many exceptions as we like and could use groups and netgroups.
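The 10-exception limit on restrict_user_exceptions noted above is easy to trip over when the list is maintained by hand. The following is a minimal sketch of a helper that renders the two mom_priv/config lines and refuses more than 10 names. The function name render_restrict_user is hypothetical, and the `$` prefix, the `True` value, and the comma-separated list are assumptions about the MoM config syntax that should be checked against the OpenPBS documentation for the installed version.

```shell
#!/bin/sh
# Hypothetical helper: print mom_priv/config lines that enable restrict_user,
# with the exempt users given as arguments.
# Assumption: directives take a leading "$" and the exception list is
# comma-separated; verify against your OpenPBS version.
render_restrict_user() {
    if [ "$#" -gt 10 ]; then
        # The source notes a maximum of 10 user exceptions.
        echo "error: $# exceptions given, but at most 10 are allowed" >&2
        return 1
    fi
    echo '$restrict_user True'
    if [ "$#" -gt 0 ]; then
        # Join all arguments with commas (IFS change is local to the subshell).
        list=$(IFS=,; echo "$*")
        echo "\$restrict_user_exceptions $list"
    fi
}
```

For example, `render_restrict_user krowe2 root` prints a `$restrict_user True` line followed by `$restrict_user_exceptions krowe2,root`, which could then be reviewed before being merged into mom_priv/config.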

...

  • DONE: Cgroups: We will need protection like what cgroups provide so that jobs can’t impact other jobs on the same node.

    • Slurm
      • /etc/slurm/cgroup.conf
    • HTCondor
      • Set CGROUP_MEMORY_LIMIT_POLICY = hard in /etc/condor/config.d/99-nrao on the execute nodes.
    • OpenPBS
      • qmgr -c "set hook pbs_cgroups enabled = true"
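For Slurm the list above names only the file, not the settings. As a rough sketch, a node could be checked for cgroup enforcement by confirming that /etc/slurm/cgroup.conf turns on the relevant constraints. ConstrainCores and ConstrainRAMSpace are real cgroup.conf options, but choosing these two as the ones that matter is an assumption here, not something the page states; the function names are hypothetical.

```shell
#!/bin/sh
# Return 0 if FILE sets KEY to "yes" (case-insensitive, optional whitespace
# around "="). Commented-out lines do not match.
cgroup_conf_enables() {
    _file=$1
    _key=$2
    grep -Eiq "^[[:space:]]*${_key}[[:space:]]*=[[:space:]]*yes([[:space:]]|#|\$)" "$_file"
}

# Report whether the assumed set of constraints is enabled.
# Defaults to the path given in the list above; returns non-zero if any
# constraint is missing.
check_slurm_cgroups() {
    conf=${1:-/etc/slurm/cgroup.conf}
    status=0
    # Assumption: these two options are the ones the site cares about.
    for key in ConstrainCores ConstrainRAMSpace; do
        if cgroup_conf_enables "$conf" "$key"; then
            echo "$key: enabled"
        else
            echo "$key: NOT enabled"
            status=1
        fi
    done
    return "$status"
}
```

Running `check_slurm_cgroups` with no argument reads /etc/slurm/cgroup.conf; passing a path checks a staged copy instead.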


...