On Jul. 25, 2022, Jeff Kern asked K. Scott Rowe to lead a tiger team to investigate the various issues that have affected the ALMA Archive hosted in CV over the past few weeks to months. The team initially consisted of just K. Scott.
Documented Issues
- https://ictjira.alma.cl/browse/AES-52
- https://confluence.alma.cl/pages/viewpage.action?pageId=91826715
Timeline of events
- 2020-03-19: ALMA suspends science observing and stows the array because of COVID-19.
- 2020-06-24: Archive webapps (aq, asaz, rh, etc, but not SP) moved to new Docker Swarm (na-arc-*) system. See more.
- 2021-03-17: ALMA re-starts limited science observations, resuming Cycle 7. See more.
- 2021-10-01: ALMA starts Cycle 8 observations. See more.
- 2022-02-03: Science Portal (SP) upgraded Plone, Python, RHEL and moved into Docker Swarm. All other webapps had already been in Docker Swarm.
- 2022-04-18: First documented report of performance issues. Webapps moved to pre-production Docker Swarm (natest-arc-*). See more
- 2022-05-09: Moved the Science Portal (SP) from Docker Swarm to an rsync copy on http://almaportal.cv.nrao.edu/ because of performance issues.
- 2022-05-31: Moved the Science Portal (SP) from the rsync copy back to Docker Swarm.
- 2022-06-30: Tracy changed the eth0 MTU on the production docker swarm nodes (na-arc-*) from the default 1500 to 9000. The test swarm is still 1500.
Benchmarks
- Using ApacheBench (ab) every hour to load http://almascience.nrao.edu/ from rastan.aoc.nrao.edu
- ssh.aoc.nrao.edu:/users/krowe/alma_archive/benchmarks/almascience.nrao.edu
- Using a download script to get 2013.1.00226.S-small (no ASDM tarballs) every hour on cvpost-master.aoc.nrao.edu
- ssh.cv.nrao.edu:/lustre/cv/users/krowe/tickets/scg-207/benchmarks/2013.1.00226.S-small
- Using a download script to get 2013.1.00226.S-large (with ASDM tarballs) every hour on testpost-master.aoc.nrao.edu
- iperf tests using iperf3 -s -B <local IP> on the server side and iperf3 -B <source IP> -c <dest IP> on the client side; see the sketch below
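A sketch of how the hourly benchmark and the iperf3 pair tests can be wired up. The cron scheduling, ab flags, and log file name are assumptions; the hosts, URL, and results directory come from the notes above.

```bash
# Hourly ApacheBench run on rastan.aoc.nrao.edu (crontab entry; flags and log name are assumptions)
0 * * * * ab -n 100 -c 4 http://almascience.nrao.edu/ >> /users/krowe/alma_archive/benchmarks/almascience.nrao.edu/ab-$(date +\%F).log 2>&1

# iperf3 pair test between two swarm nodes (per the command line above)
iperf3 -s -B <local IP>                 # receiver: bind the server to a specific local address
iperf3 -B <source IP> -c <dest IP>      # sender: bind the client and connect to the receiver
```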
Production docker swarm iperf tests measured in Gb/s.
| | na-arc-1 (naasc-vs-4) | na-arc-2 (naasc-vs-4) | na-arc-3 (naasc-vs-3) | na-arc-4 (naasc-vs-4) | na-arc-5 (naasc-vs-5) |
| --- | --- | --- | --- | --- | --- |
| na-arc-1 | | 18 | 0.002 | 20 | 10 |
| na-arc-2 | 20 | | 0.002 | 20 | 10 |
| na-arc-3 | 0.002 | 0.002 | | 0.002 | 0.002 |
| na-arc-4 | 20 | 19 | 0.002 | | |
| na-arc-5 | 10 | 10 | 0.002 | 10 | 10 |
There is clearly something wrong with the network to and from na-arc-3: roughly 0.002 Gb/s (2 Mb/s) in both directions, versus 10 to 20 Gb/s between the other nodes.
Test docker swarm iperf tests measured in Gb/s.

| | natest-arc-1 (naasc-dev-vs) | natest-arc-2 (naasc-vs-1) | natest-arc-3 (naasc-vs-5) |
| --- | --- | --- | --- |
| natest-arc-1 | | 0.9 | 0.8 |
| natest-arc-2 | 0.9 | | 0.8 |
| natest-arc-3 | 0.3 | 0.4 | |
The test docker swarm nodes (natest-arc-*) are performing as expected. The VM hosts have 1Gb/s links, so getting 80% to 90% of that bandwidth is about as good as one can expect.
Diagrams
Questions
- Why does na-arc-3 have such poor network performance to the other na-arc nodes?
- ping to na-arc-[1,2,4,5] with anything larger than -s 1490 drops all packets, which suggests an MTU mismatch somewhere on the path (see the diagnostic sketch after this list)
- iperf tests show 10Gb/s between the VM host of na-arc-3 (naasc-vs-3 p5p1.120) and the VM host of na-arc-5 (naasc-vs-5 p2p1.120). So it isn't a bad card in either of the VM hosts.
- iptables rules on na-arc-3 look different from those on na-arc-[2,4,5]; na-arc-1 also looks a bit different.
- docker_gwbridge interface on na-arc-[1,2,4,5] shows NO_CARRIER but not on na-arc-3.
- na-arc-3 has a veth10fd1da@if37 interface. None of the other na-arc-* nodes have a veth interface.
- Why is na-arc-5 using qdisc pfifo_fast instead of fq_codel for eth0? (see ip addr)
- Is putting all the 1Gb/s production docker swarm nodes on the same ASIC on the same Fabric Extender of the cv-nexus switch a good idea?
- I am thinking it does not matter, because it looks like the production docker swarm nodes use the 10Gb/s network, which is on cv-nexus9k.
- Why does natest-arc-3 have ens3 instead of eth0 and why is its speed 100Mb/s?
- virsh domiflist natest-arc-3 shows the Model as rtl8139 instead of virtio
- When I run ethtool eth0 as root on na-arc-{1..5} and natest-arc-{1..2}, the result is just "Link detected: yes" instead of the full report with speed, while natest-arc-3 reports 100Mb/s.
- Why do natest-arc-{1..3} have 9 veth* interfaces in ip addr show while na-arc-{1..5} don't have any veth* interfaces?
- Can we set up a test archive query that uses the "other" docker swarm which in this case would be the production swarm (na-arc-*)?
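A minimal diagnostic sketch for the MTU and NIC questions above, assuming standard iproute2, ethtool, and libvirt tooling; host and interface names are taken from the notes.

```bash
# MTU actually configured on a swarm node's eth0
ip -o link show eth0 | awk '{print $2, $5}'

# Probe whether large, unfragmented packets survive the path
# (for a 1500-byte MTU the largest ICMP payload is 1472 = 1500 - 20 IP - 8 ICMP)
ping -M do -c 3 -s 1472 na-arc-5
ping -M do -c 3 -s 8972 na-arc-5   # only succeeds if the whole path really is MTU 9000

# qdisc, carrier state (NO-CARRIER), docker_gwbridge and veth* interfaces
ip addr show
tc qdisc show dev eth0

# NIC model the hypervisor presents to a guest (run on the VM host)
virsh domiflist natest-arc-3

# Link speed as seen inside the guest; rtl8139 reports 100Mb/s,
# while the virtio guests here report only "Link detected: yes"
ethtool eth0
```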
To Do
- Fix na-arc-3 so it gets the same performance as the other na-arc-* nodes, which is apparently at least 10Gb/s. (pmurphy)
- Launch services on production swarm (sbooth)
- Test the production docker swarm with a test web interface. (lsharp)
- Ask the other ARCs if they use MTU 9000 on their 10Gb/s networks. (krowe)
- Switch the production docker swarm back to MTU 1500, since the test docker swarm uses MTU 1500 and is performing better (see the sketch after this list).
- Fix natest-arc-3 so its NIC model is virtio instead of rtl8139.
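A hedged sketch of the two configuration changes above. The runtime commands are standard iproute2/libvirt; where and how the settings persist (e.g. ifcfg files on RHEL) is an assumption to confirm on the hosts.

```bash
# Revert a production swarm node's eth0 to MTU 1500 (runtime only; the persistent
# setting lives in the distro's network configuration, e.g. ifcfg-eth0 -- assumption)
ip link set dev eth0 mtu 1500
ip -o link show eth0 | awk '{print $2, $5}'   # verify

# Change natest-arc-3's NIC model from rtl8139 to virtio (run on its VM host, naasc-vs-5)
virsh shutdown natest-arc-3
virsh edit natest-arc-3    # in the <interface> stanza, change <model type='rtl8139'/> to <model type='virtio'/>
virsh start natest-arc-3   # note: the guest-side interface name may change (e.g. ens3 -> eth0)
```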
People (not necessarily team members)
- K. Scott Rowe - Tiger Team Lead
- CJ Allen - sysadmin
- Tom Booth - programmer
- Liz Sharp - sysadmin
- Brian Mason - DRM Scientist
- Zhon Butcher - sysadmin
- Tracy Halstead - sysadmin
- Alvaro Aguirre - ALMA software
- Pat Murphy - CIS lead
- Rachel Rosen - previous ICT lead
- Laura Jenson - current ICT lead
- Catherine Vlahakis - Scientist
Communication lines
- asg@listmgr.nrao.edu email list run by rrosen (Sadly, no archives are kept)
- Mattermost NAASC Systems - Mostly used by NAASC sysadmins
Answers
- Why does iperf show 10Gb/s between na-arc-5 and na-arc-[1,2,4]? How is this possible if the default interface on the respective VM Hosts is 1Gb/s?
- ANSWER: The vnets for the VM guests are tied to the 10Gb/s NICs on the VM hosts, not the 1Gb/s NICs (see the sketch below).
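One way to verify this on a VM host, sketched with a placeholder bridge name since the actual bridge/NIC layout is not recorded in these notes:

```bash
# On the VM host (e.g. naasc-vs-5): which host-side bridge backs the guest's vnet?
virsh domiflist na-arc-5

# Which physical NIC is enslaved to that bridge? (substitute the bridge name domiflist reported)
ip link show master <bridge-from-domiflist>
```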
References
- Prepare offline infrastructure from the scratch (Describes docker swarm setup)
- file:///tmp/ALMA%20Offline%20Software%20Test_Deployment%20Concept(2).pdf