...
- Submit host needs to be able to establish a connection to the remote head node on port 9618 (HTCondor)
- Submit host needs to be able to listen for a connection from the remote head node on port 9618 (HTCondor)
- mcilroy has external IPs (146.88.1.66 for 1Gb/s and 146.88.10.66 for 10Gb/s). Is the container listening?
Remote side
- Head node needs to be able to establish a connection to nrao.edu on port 9618. (HTCondor)
- Head node needs to be able to listen for a connection from nrao.edu on port 9618. (HTCondor)
- Execute nodes need to be able to establish a connection to nrao.edu on port 9618. Can be NATed. (HTCondor)
- Execute nodes need to be able to establish a connection to gibson.aoc.nrao.edu on port 22. Can be NATed. (HTCondor, nraorsync)
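
The outbound requirements above could be spot-checked from a remote head node or execute node with a small script. This is only a minimal sketch: the function names `can_connect` and `run_checks`, the `CHECKS` list, and the 5-second timeout are illustrative assumptions, and using mcilroy's 1Gb/s address as the port 9618 target is a guess at the right endpoint. It tests outbound TCP connections only. It does not show whether the head node can accept inbound connections on port 9618.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Targets taken from the notes above; which address to test is an assumption.
CHECKS = [
    ("146.88.1.66", 9618),        # mcilroy 1Gb/s external IP, HTCondor port
    ("gibson.aoc.nrao.edu", 22),  # ssh port used by nraorsync
]


def run_checks(checks):
    """Map each (host, port) pair to whether an outbound connection succeeded."""
    return {(host, port): can_connect(host, port) for host, port in checks}
```

Calling `run_checks(CHECKS)` on a remote node returns a dictionary keyed by `(host, port)`. A `False` entry only says the connection failed from that node. It does not distinguish a firewall block from a service that is not listening.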
Using
- Get NRAO jobs onto the remote racks. The method depends on how we want to use these racks. If we want them to run specific types of jobs, then ClassAd options may be the solution. If we want them as overflow for jobs run at NRAO, then flocking may be the solution. Perhaps we want both. Flocking may be the best method because I think it doesn't require the execute nodes to have external network access.
- Flocking? What are the networking requirements?
- ClassAd options? I think this will require the execute hosts to have routable IPs because our submit host will talk directly to them and vice versa. Could CCB help here?
- Other?
- Remote HTCondor concerns
- Do we want our jobs to run as an NRAO user like vlapipe or nobody?
- Do we want local jobs to run as the local user, some dedicated user, or nobody?
- Need to support 50% workload for NRAO and 50% workload for local. How?
- Could have 15 nodes for us and 15 nodes for them
- What if we do nothing? HTCondor's fair-share algorithm may do the work for us if all our jobs are run as user vlapipe or something like that.
- Use RANK, and therefore preemption. https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToConfigPrioritiesForUsers
- Group Accounting
- User Priority https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToSetUserPriority
- Share disk space on head node 50% NRAO and 50% local
- Two partitions: one for NRAO and one for local?
...