Operating System
- Must support CASA
- Will need a patching/updating mechanism
- Try to have one OS that supports both our use and local use?
- Or they could dual boot?
- Or Kubernetes?
Third party software for VLASS
- CASA
- HTCondor
- Slurm?
- Will need a way to maintain the software
Third party software for Local
- Will need a way to maintain software for the local site
Services
- DNS
- DHCP
- SMTP
- NTP
- NFS?
- LDAP? How do we handle accounts?
- ssh
- rsync (nraorsync_plugin.py)
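Once a pod is up, a quick reachability check of the TCP-based services above could look something like the sketch below. The port numbers are the usual defaults and are an assumption about how the site will be configured. DHCP and NTP are left out because they are UDP-only and a TCP connect cannot probe them. The function names are made up for illustration.

```python
import socket

# Usual default TCP ports for the services listed above (assumed; a site
# may run them elsewhere). DHCP and NTP are UDP-only, so they are omitted.
TCP_SERVICES = {
    "DNS": 53,
    "SMTP": 25,
    "NFS": 2049,
    "LDAP": 389,
    "ssh": 22,
}


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_services(host, services=TCP_SERVICES, timeout=2.0):
    """Map each service name to True/False reachability on host."""
    return {
        name: tcp_port_open(host, port, timeout)
        for name, port in services.items()
    }
```

Calling check_services("head-node.example.org") (a hypothetical hostname) returns a dict such as {"DNS": True, "SMTP": False, ...}. An open port only shows that something is listening, not that the service is configured correctly.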
Management Access
- PDU
- UPS
- BMC/IPMI
- switch
Maintenance
- replace disk
- replace/reseat DIMM
- replace power supply
- NRAO may handle replacement hardware. Drop ship, or keep spares ourselves?
Hardware
- GPUs? Do we get 1U nodes with room for 1 or 2 Tesla T4 GPUs or 2U nodes with room for 1 or 2 regular GPUs?
- Networking: 1 Gb/s might be enough; 10 Gb/s if the price is good.
- 30 1U nodes or 15 2U nodes or mix?
- head node with lots of disk
- test head node at NRAO (either CV or NM)
- one PDU or two PDUs?
- UPS for just the head node and switch?
- environmental monitoring. Could the PDU do this?
- rackmount KVM (not remote) and patch cables
- NVMe drives for nodes
- Rack
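To compare the two node layouts floated above, a rough tally of rack space and GPU slots might help. This is only a sketch: it counts compute nodes alone (no head node, switch, PDU, UPS, or KVM) and assumes the upper figure of 2 GPUs per node for both form factors. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeOption:
    label: str
    count: int            # number of compute nodes
    rack_units_each: int  # height of one node in rack units (U)
    max_gpus_each: int    # assumed upper bound of GPUs per node

    @property
    def rack_units(self):
        return self.count * self.rack_units_each

    @property
    def max_gpus(self):
        return self.count * self.max_gpus_each


# The two options from the list above; 2 GPUs per node is the "1 or 2" upper end.
OPTIONS = [
    NodeOption("30 x 1U", count=30, rack_units_each=1, max_gpus_each=2),
    NodeOption("15 x 2U", count=15, rack_units_each=2, max_gpus_each=2),
]


def summarize(options):
    """Return (label, total rack units, max GPU slots) for each option."""
    return [(o.label, o.rack_units, o.max_gpus) for o in options]


for label, units, gpus in summarize(OPTIONS):
    print(f"{label}: {units}U of rack space, up to {gpus} GPUs")
```

Both layouts take 30U for the compute nodes, so the choice comes down to node count (and therefore CPU and GPU slot count) versus which GPU models fit.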
Other
- Since we are making our own little OSG, should we try to leverage OSG for this or not? Or do we want to make each POD a pool and flock?
- How do we handle the 50% workload?