To Do

  • Talk with SSA and VLASS about how we actually use these remote clusters.
    • Staging, flocking, etc
  • Done: Find a spot at DSOC for test system.  253T if we buy a rack.
  • Get account number and routing number.
  • Buy test system

Timeline

  • Buy test system as soon as practical (assuming the project is still a go)
    • Does Jeff Kern know if this is a go or not?
    • Talk to Matthew about where to put this stuff
      • May 3, 2022 krowe: talked to Matthew and Peter.  RADIAL has space 253T reserved.
  • Buy production system by July
  • Receive production system by Aug
  • Install production system by Dec
  • Running in Jan. 2023


Data Path

This is conceptual at this point.  Need to talk with SSA and VLASS about this.

  • We pre-stage data on the remote head node
  • We then either submit a job locally and it flocks to the remote site, or we log in to the remote site and submit from there.
    • Can we use a nifty filesystem to simplify this (Ceph or that LHC fs)?
    • This might be a good phase2 problem to solve.
    • Is this kinda what nraorsync does?
  • The remote execute hosts transfer data from the remote head node
  • The job uploads resulting data to the remote head node
  • We retrieve data from the remote head node
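
The stage-in and retrieve steps above could be sketched as plain rsync-over-ssh commands. This is only an illustration of the conceptual path, not a decision: the host name `head.remote.example` and the `/stage/nrao` directory layout are hypothetical, and the real transfer may end up going through nraorsync or a shared filesystem instead.

```python
import shlex

# Hypothetical values for illustration only; neither is defined in these notes.
REMOTE_HEAD = "head.remote.example"
STAGE_DIR = "/stage/nrao"


def stage_in_cmd(local_path, remote_head=REMOTE_HEAD, stage_dir=STAGE_DIR):
    """Build the rsync command that pre-stages input data on the remote head node."""
    return ["rsync", "-a", "-e", "ssh", local_path,
            f"{remote_head}:{stage_dir}/input/"]


def stage_out_cmd(local_dir, remote_head=REMOTE_HEAD, stage_dir=STAGE_DIR):
    """Build the rsync command that retrieves results from the remote head node."""
    return ["rsync", "-a", "-e", "ssh",
            f"{remote_head}:{stage_dir}/output/", local_dir]


# Print the commands rather than running them; pass a list to subprocess.run()
# once the real host and paths are known.
print(shlex.join(stage_in_cmd("/lustre/data/obs.ms")))
print(shlex.join(stage_out_cmd("/lustre/results/")))
```

The middle steps (execute hosts pulling from and pushing to the head node) would happen inside the job, so they are not shown here.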


Using

  • Get NRAO jobs on the remote racks.  The method depends on how we want to use these racks.  If we want them to run specific types of jobs, then ClassAd options may be the solution.  If we want them as overflow for jobs run at NRAO, then flocking may be the solution.  Perhaps we want both.  Flocking may actually be the best method because I think it doesn't require the execute nodes to have external network access.
    • Staging and submitting remotely?
    • Flocking?  What are the networking requirements?
    • ClassAd options?  I think this will require the execute hosts to have routable IPs because our submit host will talk directly to them and vice versa.  Could CCB help here?
    • Other?
  • Remote HTCondor concerns
    • Do we want our jobs to run as an NRAO user like vlapipe, or as nobody?
    • Do we want remote institution jobs to run as the remote institution user, some dedicated user, or nobody?
  • Need to support 50% workload for NRAO and 50% workload for remote institution.  How?
  • Share disk space on head node 50% NRAO and 50% remote institution
    • Two partitions: one for NRAO and one for remote institution?
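
One possible answer to the 50/50 workload question is HTCondor accounting groups with dynamic quotas. The sketch below only renders a configuration fragment for the remote pool's negotiator. The group names `group_nrao` and `group_remote` are hypothetical, and whether idle share should spill over to the other group (`GROUP_ACCEPT_SURPLUS`) is an open choice, not something decided here.

```python
def group_quota_config(shares, accept_surplus=True):
    """Render an HTCondor config fragment giving each accounting group a
    fraction of the pool. `shares` maps group name -> fraction (must sum to 1)."""
    total = sum(shares.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"shares must sum to 1.0, got {total}")
    lines = ["GROUP_NAMES = " + ", ".join(shares)]
    for name, fraction in shares.items():
        lines.append(f"GROUP_QUOTA_DYNAMIC_{name} = {fraction}")
    # When True, a group may use slots the other group leaves idle.
    lines.append(f"GROUP_ACCEPT_SURPLUS = {accept_surplus}")
    return "\n".join(lines) + "\n"


# Hypothetical group names; the notes do not define any.
print(group_quota_config({"group_nrao": 0.5, "group_remote": 0.5}))
```

Jobs would then need to declare their group in the submit description (for example `accounting_group = group_nrao`), so this approach also depends on the answer to which user the jobs run as.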


Documentation

  • A projectbook like we did for USNO could be appropriate
  • Process diagrams (how systems boot, how jobs get started from NRAO and run, how remote institutions start jobs, etc)


Networking

HTCondor flocking requires

  • From local schedd to remote condor_collector on HTCondor port 9618
  • From remote negotiator and execute hosts to local schedd.  Here the execute hosts can be NATed.
  • From local shadow to remote condor_starter.  Use CCB.  It allows execute hosts to live behind a firewall and be NATed.
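
As a rough sketch of where the flocking settings would live, the snippet below renders the three HTCondor knobs involved: `FLOCK_TO` on the NRAO submit host, `FLOCK_FROM` on the remote central manager, and `CCB_ADDRESS` on the remote execute hosts. The host names are hypothetical placeholders, and the authentication and `ALLOW_*` settings a real pool needs are deliberately left out.

```python
def flocking_config(submit_host, remote_cm):
    """Return minimal HTCondor config fragments for flocking, keyed by the
    machine role each fragment would be installed on."""
    return {
        # Local schedd: try the remote pool when local resources are busy.
        "nrao_submit_host": f"FLOCK_TO = {remote_cm}\n",
        # Remote pool: name the submit host allowed to flock in.
        "remote_central_manager": f"FLOCK_FROM = {submit_host}\n",
        # NATed execute hosts: register with the collector as CCB broker so
        # the shadow can reach the starter through a reversed connection.
        "remote_execute_node": "CCB_ADDRESS = $(COLLECTOR_HOST)\n",
    }


# Hypothetical host names for illustration only.
for role, fragment in flocking_config("submit.nrao.example",
                                      "head.remote.example").items():
    print(f"# {role}\n{fragment}")
```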

Non-flocking requires only ssh access, probably from mcilroy and to gibson.

NRAO side

  • NRAO -> remote head node on port 22 (ssh)
  • Submit Host -> remote head node (condor_collector) on port 9618 (HTCondor) for flocking
  • Submit Host <- remote head node (condor_negotiator) on port 9618 (HTCondor) for flocking
    • mcilroy has external IPs (146.88.1.66 for 1Gb/s and 146.88.10.66 for 10Gb/s).  Is the container listening?
  • Submit Host <- remote execute hosts (condor_starter) on port 9618 (HTCondor) for flocking
  • Submit Host (condor_shadow) -> remote execute hosts (condor_starter) on port 9618 (HTCondor) for flocking.  CCB might alleviate this.

Remote side

  • Head node <- from nrao.edu on port 22 (ssh)
  • Head node -> revere.aoc.nrao.edu on port 25 (smtp)
  • Head node -> NRAO Submit Host on port 9618 (HTCondor) for flocking
  • Head node <- NRAO Submit Host on port 9618 (HTCondor) for flocking
  • Execute node -> NRAO Submit Host on port 9618 (HTCondor) for flocking.  Execute host may be NATed.
  • Execute node -> gibson.aoc.nrao.edu on port 22 (ssh) for flocking with nraorsync.  Execute host can be NATed.
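
Once a test system exists, the flows listed above could be spot-checked with a small TCP probe run from each side. This is a minimal sketch: it only tests that a TCP connection can be opened from the machine it runs on, so each direction has to be tested from its own source host, and a successful connect says nothing about HTCondor authentication. The example host name is a hypothetical placeholder.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or DNS failure
        return False


def check_flows(flows, timeout=3.0):
    """Probe each (label, host, port) tuple and return {label: reachable}."""
    return {label: can_connect(host, port, timeout)
            for label, host, port in flows}


# Example flows as seen from the NRAO submit host (hypothetical host name).
NRAO_SIDE_FLOWS = [
    ("ssh to remote head node", "head.remote.example", 22),
    ("collector on remote head node", "head.remote.example", 9618),
]
# Usage: print(check_flows(NRAO_SIDE_FLOWS))
```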


Services

  • DNS
    • What DNS domain will these hosts be in?  nrao.edu? remote-institution.site? other?
  • DHCP
  • SMTP
  • NTP
  • NFS
  • LDAP?  How do we handle accounts?  I think we will want accounts on at least the head node.  The execution nodes could run everything as nobody or as real users.  If we want real users on the execute hosts then we should use a directory service which should probably be LDAP.  No sense in teaching folks how to use NIS anymore.
    • remote institution accounts only?
  • ssh
  • rsync (nraorsync_plugin.py)
  • NAT so the nodes can download/upload data
  • TFTP (for OSes and switch)
  • condor (port 9618) https://htcondor-wiki.cs.wisc.edu/index.cgi/wiki?p=HowToMixFirewallsAndHtCondor
  • ganglia
  • nagios


Operating System

  • Must support CASA
  • Will need a patching/updating mechanism
  • How to boot diskless OS images
  • What Linux distribution to use?
    • Can we use Red Hat with our current license?  I have looked in JDE and I can't find a recent subscription.  Need to ask David.
    • Should we buy Red Hat licenses like we did for USNO?
      • USNO is between $10K and $15K per year for 81 licensed nodes.  This may not be an EDU license.
      • NRAO used to have a 1,000 host license for Red Hat but I don't know what they have now.
    • Do we even want to use Red Hat?
      • Alternatives would be Rocky Linux or AlmaLinux since CentOS is essentially dead
  • Which version do we use: RHEL7 or RHEL8?

Third party software for VLASS

  • CASA
  • HTCondor
  • Will need a way to maintain the software
    • stow, rpm, modules, containers?

Third party software for remote institution

  • Will need a way to maintain software for the remote institution site
    • stow, rpm, modules, containers?


Management Access

  • PDU
  • UPS
  • BMC/IPMI
  • switch

Maintenance

  • replace disk (remote institution admin)
  • replace/reseat DIMM (remote institution admin)
  • replace power supply (remote institution admin)
  • NRAO may handle replacement hardware.  Drop ship it, or keep spares ourselves?
  • Patching OS images (NRAO)
  • Patching third party software like CASA and HTCondor (NRAO)
  • Altering OS images (NRAO)

Hardware


Shipping

  • Drop ship everything to the site and assemble on site.  This will require an NRAO person on site to assemble with a pre-built OS disk for the head node.  I think this is too much work to do on site.
    • Install DIMMs in nodes
    • Install NVMe drives in nodes
    • Rack everything
    • Cable everything
    • Configure switch
    • Install/Configure OS
  • Ship everything here, assemble, then ship a rack-on-pallet.
  • Mix the two. Ship minimal stuff here (head node, switch, couple of compute nodes, etc) and configure and drop ship most of the nodes to the site.
    • Re-ship head node, switch, compute nodes to site
    • Re-ship memory and drives to site
  • A person from the remote site could travel to NM or CV to see the test system and get instruction.


Other

  • Keep each rack as similar to the other racks as possible.
  • Test system at NRAO should be one of everything.
  • Since we are making our own little OSG, should we try to leverage OSG for this or not?  Or do we want to make each POD a pool and flock?
  • Should we try to buy as much as we can from one vendor like Dell to simplify things?
  • APC sells a packaged rack on a pallet ready for shipping.  We could fill this with gear and ship it.  Not sure if that is a good idea or not.  We will not be able to move the unit into the server room while still on the pallet because no doorway is tall enough.  We would have to roll it off the pallet (it comes with a ramp and the rack is on casters), move it into the server room, fill and configure it, roll it out of the server room, roll it back onto the pallet, probably remove the bottom server(s) so we can attach it to the pallet, then re-add the bottom server(s).  We could use the double glass doors for this but there is a lip on the transition.  We could use the doors in the PRA closet as they have no lip, but that would require moving a lot of shelves and stuff.
  • APC NetShelter SX packaged:
    • On Pallet: Height 85.79in (2179mm) Width 43.5in (1105mm)
    • On Casters: Height 78.39in (1991mm) Width 23.62in (600mm)
  • Double Glass doors: Height: 80in (2032mm) (because of the 2in maglock)
  • NRAO-NM wide server doors: Height: 83in (2133mm) Width: 48in (1187mm)
  • I could start prototyping now using AWS.
  • Do we want jobs to flock or do we want to submit jobs on the remote host and have pre-transferred data?  Involve SSA and VLASS in this question.
  • If jobs are submitted from the remote host does that mean SSA will want a container on that remote host?



Site Questions

  • Voltage in server room (120V or 208V or 240V)
  • Receptacles in server room (L5-30R or L21-30R or ...)
  • Single or dual power feeds?
  • Is power from below or from above?
  • Door width and height and path to server room.
    • Can a rack-on-pallet fit upright?  Height: 85.79inches (2179mm) Width: 43.5inches (1105mm)
    • Can a rack-on-casters fit upright?  Height: 78.39inches (1991mm) Width: 23.62inches (600mm)
    • NRAO-NM wide server door Height: 84inches (2108mm) Width: 46.75inches (1219mm)
  • Firewalls
  • How are you going to use this?
  • Do you care if this is in your DNS zone or ours?
  • Is NAT available for the execute hosts?
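
The two "fit upright" questions above are simple comparisons once a site reports its door sizes. The sketch below uses the APC NetShelter SX dimensions from these notes and, as a worked example, the 80in height of the double glass doors at NRAO (their width is not recorded here, so it is left unset). The `clearance_in` margin is an assumption and defaults to zero.

```python
from dataclasses import dataclass
from typing import Optional

MM_PER_INCH = 25.4


@dataclass(frozen=True)
class Size:
    height_in: float
    width_in: Optional[float] = None  # None when the width is not recorded


def fits_upright(item, opening, clearance_in=0.0):
    """Return True if the item passes through the opening while upright.
    Width is only compared when both widths are known."""
    if item.height_in + clearance_in > opening.height_in:
        return False
    if item.width_in is not None and opening.width_in is not None:
        if item.width_in + clearance_in > opening.width_in:
            return False
    return True


# Dimensions taken from the notes above.
RACK_ON_PALLET = Size(height_in=85.79, width_in=43.5)
RACK_ON_CASTERS = Size(height_in=78.39, width_in=23.62)
GLASS_DOORS = Size(height_in=80.0)  # height limited by the 2in maglock

for name, rack in [("pallet", RACK_ON_PALLET), ("casters", RACK_ON_CASTERS)]:
    mm = round(rack.height_in * MM_PER_INCH)
    print(f"rack on {name}: {rack.height_in}in ({mm}mm) "
          f"fits glass doors: {fits_upright(rack, GLASS_DOORS)}")
```

With these numbers the rack on its pallet does not clear the glass doors but the rack on casters does, which matches the roll-off-the-pallet plan described under Other.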


Resources

  • USNO correlator (Mark Wainright)
  • VLBA Control Computers (William Colburn)
  • Red Hat maintenance (William Colburn)
  • Virtual kickstart (William Colburn)
  • Switch models and ethernet (Jeff Long)
  • HTCondor best practices (Greg Thain)
  • OSG (Lauren Michael)
  • SDSC at UCSD
  • TACC at UT Austin
  • IDIA https://www.idia.ac.za/


