Operating System
- Must support CASA
- Will need a patching/updating mechanism
- Try to have one OS that supports both our use and local use?
- Or they could dual boot?
- Or Kubernetes?
Third party software for VLASS
- CASA
- HTCondor
- Slurm?
- Will need a way to maintain the software
Third party software for Local
- Will need a way to maintain software for the local site
Services
- DNS
- DHCP
- SMTP
- NTP
- NFS?
- LDAP? How do we handle accounts?
- ssh
- rsync (nraorsync_plugin.py)
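Once a site is up, the TCP-based services above could be smoke-tested with something like the sketch below. This is only an illustration: which host runs each service is not decided in these notes, so the host name passed in is up to the caller, and the port numbers are the standard defaults (a site may use others). NTP and DHCP are UDP services and are not covered by a plain TCP connect check.

```python
import socket

# Standard default TCP ports. Which services we actually run, and on which
# host, is still an open question in these notes (assumption for illustration).
TCP_SERVICES = {
    "dns": 53,
    "smtp": 25,
    "ssh": 22,
    "ldap": 389,
    "nfs": 2049,
}


def check_tcp(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_services(host, services=TCP_SERVICES, timeout=2.0):
    """Map each service name to True/False reachability on the given host."""
    return {name: check_tcp(host, port, timeout) for name, port in services.items()}
```

For example, `check_services("head-node.example.org")` (a placeholder host name) would return a dict such as `{"dns": True, "smtp": False, ...}` that a cron job or monitoring script could report on.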
Management Access
- PDU
- UPS
- BMC/IPMI
- switch
Maintenance
- replace disk
- replace/reseat DIMM
- replace power supply
- NRAO may handle replacement hardware. Drop ship replacements to the site? Keep spares ourselves?
Hardware
- GPUs? Do we get 1U nodes with room for 1 or 2 Tesla T4 GPUs or 2U nodes with room for 1 or 2 regular GPUs?
- 1 Gb/s networking might be enough. 10 Gb/s if the price is good.
- 30 1U nodes or 15 2U nodes or mix?
- head node with lots of disk
- test head node at NRAO (either CV or NM)
- one PDU or two PDUs? What plug? What voltage? This may vary across sites.
- UPS for just the head node and switch?
- environmental monitoring. Could the PDU do this?
- rackmount KVM (not remote) and patch cables
- NVMe drives for nodes
- Cabinet rack: locking mesh doors, front and rear. Width: 19". Height: 42U is most common. Depth: 42" or 48"? Rack must support at least 2,000 lbs static load.
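To compare the node options against a 42U cabinet, the rack-unit budget can be sketched as below. The compute node counts and the 42U height come from the list above. The heights for the head node, switch, KVM, and UPS are placeholder assumptions, not measured or quoted values, and the PDUs are assumed to mount vertically (zero-U).

```python
RACK_HEIGHT_U = 42  # most common cabinet height, per the notes

# Assumed heights in rack units for non-compute gear (placeholders only).
# PDUs are assumed to be vertical zero-U units and are not counted.
OVERHEAD_U = {
    "head node": 2,
    "switch": 1,
    "rackmount KVM": 1,
    "UPS": 2,
}


def rack_units_used(nodes_1u, nodes_2u, overhead=OVERHEAD_U):
    """Total rack units taken by compute nodes plus the overhead gear."""
    return nodes_1u * 1 + nodes_2u * 2 + sum(overhead.values())


def rack_units_free(nodes_1u, nodes_2u, rack_height=RACK_HEIGHT_U, overhead=OVERHEAD_U):
    """Rack units left over in the cabinet for a given node mix."""
    return rack_height - rack_units_used(nodes_1u, nodes_2u, overhead)
```

Both 30 1U nodes and 15 2U nodes take 30U of compute space, so under these assumptions the node choice does not change the rack-unit budget; it turns on GPU fit, power, and price instead.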
Other
- Since we are making our own little OSG, should we try to leverage OSG for this or not? Or do we want to make each POD a pool and flock?
- How do we handle the 50% workload?
- Should we try to buy as much as we can from one vendor like Dell to simplify things?