Operating System
- Must support CASA
- Will need a patching/updating mechanism
- Try to have one OS that supports both our use and local use?
- Or they could dual boot?
- Or Kubernetes?
Third party software for VLASS
- CASA
- HTCondor
- Slurm?
- Will need a way to maintain the software
Third party software for Local
- Will need a way to maintain software for the local site
Services
- DNS
- DHCP
- SMTP
- NTP
- NFS?
- LDAP? How do we handle accounts?
- ssh
- rsync (nraorsync_plugin.py)
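A rough sketch of how the service list above could be turned into a reachability check once a pod is racked. The port numbers are the standard defaults for each protocol; the function names and the idea of probing from a single host are assumptions for illustration, not part of the plan. NTP and DHCP are UDP, so a plain TCP connect cannot test them. rsync is left out because the notes do not say whether it runs over ssh or as its own daemon.

```python
import socket

# Standard default ports. The protocol column matters because
# only TCP services can be probed with a simple connect().
SERVICES = {
    "DNS": ("tcp", 53),    # DNS also uses UDP 53
    "DHCP": ("udp", 67),
    "SMTP": ("tcp", 25),
    "NTP": ("udp", 123),
    "NFS": ("tcp", 2049),
    "LDAP": ("tcp", 389),
    "ssh": ("tcp", 22),
}


def tcp_services(services=SERVICES):
    """Return {name: port} for the services that listen on TCP."""
    return {name: port for name, (proto, port) in services.items() if proto == "tcp"}


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_host(host, services=SERVICES, timeout=2.0):
    """Probe every TCP service on one host; returns {name: bool}."""
    return {
        name: tcp_port_open(host, port, timeout)
        for name, port in tcp_services(services).items()
    }
```

Usage would be something like `check_host("headnode.example.org")`, where the hostname is a placeholder. An open port only shows that something is listening, not that the service is configured correctly.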
Management Access
- PDU
- UPS
- BMC/IPMI
- switch
Maintenance
- replace disk
- replace/reseat DIMM
- replace power supply
- NRAO may handle replacement hardware. Drop ship to the site, or keep spares ourselves?
Hardware
- Cabinet Rack: Locking mesh doors, front and rear. Width: 19". Height: 42U is most common. Depth: 42" or 48"? Rack must support at least 2,000 lbs static load.
- https://greatcabinets.com/product/es-ms/
- https://www.apc.com/shop/us/en/products/APC-NetShelter-SX-Server-Rack-Enclosure-42U-Black-1991H-x-600W-x-1070D-mm/P-AR3100
- https://www.apc.com/shop/us/en/products/APC-NetShelter-SX-Server-Rack-Enclosure-42U-Shock-Packaging-2000-lbs-Black-1991H-x-600W-x-1070D-mm/P-AR3100SP
- PDU: one PDU or two PDUs? What plug? What voltage? This may vary across sites. What if the site has two power sources?
- UPS: for just the head node and switch? This may depend on the voltage of the PDUs.
- KVM: rackmount, not remote, and patch cables
- Switch: 1Gb/s might be enough. 10Gb/s if price is good. Test switch at NRAO?
- Environmental Monitoring: Could the PDU do this?
- Head Node: lots of disk. Test head node at NRAO (either CV or NM)
- 30 1U nodes or 15 2U nodes or mix? NVMe drives for nodes. Swap drive?
- GPUs: Do we get GPUs? If so, 1U nodes with room for 1 or 2 Tesla T4 GPUs, or 2U nodes with room for 1 or 2 regular GPUs?
- Ethernet cables:
- Power cables: single or Y cables depending on number of PDUs and two power sources.
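A quick sanity check of whether the hardware list fits in a 42U rack. The node counts (30 x 1U or 15 x 2U) and the 42U height come from the notes above. The heights of the head node, switch, KVM, and UPS are guesses for illustration, and the PDUs are assumed to be vertical (0U) units; real numbers would come from vendor spec sheets.

```python
from dataclasses import dataclass

RACK_HEIGHT_U = 42  # from the notes: 42U is the most common height


@dataclass(frozen=True)
class Item:
    name: str
    height_u: int   # rack units per unit of this item
    count: int = 1


def used_units(items):
    """Total rack units consumed by a list of items."""
    return sum(item.height_u * item.count for item in items)


def free_units(items, rack_height_u=RACK_HEIGHT_U):
    """Rack units left over; negative means the option does not fit."""
    return rack_height_u - used_units(items)


# ASSUMED heights for the shared gear (not from the notes).
SHARED = [
    Item("head node", 2),       # assumption: lots of disk suggests 2U or more
    Item("switch", 1),          # assumption
    Item("rackmount KVM", 1),   # assumption
    Item("UPS", 2),             # assumption
]

OPTION_1U = SHARED + [Item("1U compute node", 1, 30)]
OPTION_2U = SHARED + [Item("2U compute node", 2, 15)]

for label, option in (("30 x 1U", OPTION_1U), ("15 x 2U", OPTION_2U)):
    print(f"{label}: {used_units(option)}U used, {free_units(option)}U free")
```

Under these assumptions both options use 36U and leave 6U free, so rack space alone does not decide between 1U and 2U nodes. Weight against the 2,000 lbs static load and power draw per PDU would need the same kind of budget.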
Other
Since we are making our own little OSG, should we try to leverage OSG for this or not? Or do we want to make each POD a pool and flock?
How do we handle the 50% workload?
Should we try to buy as much as we can from one vendor like Dell to simplify things?
APC sells a packaged rack on a pallet ready for shipping. We could fill this with gear and ship it. Not sure if that is a good idea or not.