Operating System

  • Must support CASA
  • Will need a patching/updating mechanism
  • Can one OS support both our use and the site's local use?
    • Or the nodes could dual boot?
    • Or Kubernetes?

Third party software for VLASS

  • CASA
  • HTCondor
  • Slurm?
  • Will need a way to maintain the software

Third party software for Local

  • Will need a way to maintain software for the local site

Services

  • DNS
  • DHCP
  • SMTP
  • NTP
  • NFS?
  • LDAP?  How do we handle accounts?
  • ssh
  • rsync (nraorsync_plugin.py)
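
A quick reachability probe for the TCP-based services listed above could look like the following. This is only a sketch: the port numbers are the conventional defaults and are an assumption about how the site will be configured, and the function names are hypothetical. DHCP and NTP run over UDP, so this probe does not cover them.

```python
import socket

# Assumed conventional TCP ports; the site may run these services elsewhere.
SERVICE_PORTS = {
    "ssh": 22,
    "smtp": 25,
    "dns": 53,
    "ldap": 389,
    "nfs": 2049,
}


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_services(host, ports=SERVICE_PORTS, timeout=2.0):
    """Map each service name to True/False depending on TCP reachability."""
    return {name: tcp_port_open(host, port, timeout) for name, port in ports.items()}
```

Calling `check_services("headnode.example.org")` (a placeholder hostname) returns a dictionary such as `{"ssh": True, "smtp": False, ...}`. An open port only shows that something is listening, not that the service is configured correctly.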

Management Access

  • PDU
  • UPS
  • BMC/IPMI
  • switch

Maintenance

  • replace disk
  • replace/reseat DIMM
  • replace power supply
  • NRAO may handle replacement hardware. Drop ship parts to the site, or keep spares ourselves?

Hardware

  • GPUs?  Do we get 1U nodes with room for 1 or 2 Tesla T4 GPUs, or 2U nodes with room for 1 or 2 regular GPUs?
  • 1 Gb/s networking might be enough.  10 Gb/s if the price is good.
  • 30 1U nodes or 15 2U nodes or mix?
  • head node with lots of disk
  • test head node at NRAO (either CV or NM)
  • one PDU or two PDUs?
  • UPS for just the head node and switch?
  • environmental monitoring.  Could the PDU do this?
  • rackmount KVM (not remote) and patch cables
  • NVMe drives for nodes
  • Rack
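
The two node options above occupy the same rack space (30 x 1U = 15 x 2U = 30U), so the choice does not change how much room is left for the other equipment. The arithmetic can be sketched as below. The 42U rack height and the sizes of the non-compute items are assumptions for illustration only, and PDUs are assumed to be vertical (0U) units.

```python
RACK_UNITS = 42  # assumption: a standard full-height rack

# Hypothetical sizes (in rack units) for the non-compute equipment.
OTHER_EQUIPMENT = {
    "head node": 2,
    "switch": 1,
    "UPS": 2,
    "rackmount KVM": 1,
}


def rack_units_used(compute_nodes, units_per_node, other_equipment):
    """Total rack units consumed by compute nodes plus other equipment."""
    return compute_nodes * units_per_node + sum(other_equipment.values())


option_1u = rack_units_used(30, 1, OTHER_EQUIPMENT)  # 30 + 6 = 36
option_2u = rack_units_used(15, 2, OTHER_EQUIPMENT)  # 30 + 6 = 36

print("1U option:", option_1u, "used,", RACK_UNITS - option_1u, "free")
print("2U option:", option_2u, "used,", RACK_UNITS - option_2u, "free")
```

Under these assumptions either option leaves 6U free. A mix of node sizes can be checked by summing two calls with the other equipment counted once.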

Other

Since we are making our own little OSG, should we try to leverage OSG for this or not?  Or do we want to make each POD a pool and flock?
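
If each POD becomes its own HTCondor pool, flocking is driven by two configuration macros: `FLOCK_TO` on the submitting host, listing the central managers of the pools it may flock to, and `FLOCK_FROM` on the receiving pool, listing the submit hosts it accepts. The sketch below only renders those two lines. The helper names and hostnames are hypothetical, and a working setup would also need matching authentication and authorization settings, which are not shown.

```python
def flock_to_config(remote_central_managers):
    """Config line for the submit host: pools that its jobs may flock to."""
    return "FLOCK_TO = " + ", ".join(remote_central_managers)


def flock_from_config(allowed_submit_hosts):
    """Config line for a receiving pool: submit hosts allowed to flock in."""
    return "FLOCK_FROM = " + ", ".join(allowed_submit_hosts)


# Hypothetical hostnames, for illustration only.
pod_managers = ["cm.pod1.example.org", "cm.pod2.example.org"]
submit_hosts = ["submit.example.org"]

print(flock_to_config(pod_managers))
print(flock_from_config(submit_hosts))
```

With flocking, a job is tried in the remote pools only when the local pool cannot run it, in the order the pools are listed in `FLOCK_TO`.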

How do we handle the 50% workload?


