...


Timeline of events

  • 2020-03-19: ALMA suspends science observing and stows the array because of COVID-19.
  • 2020-06-24: Archive webapps (aq, asaz, rh, etc., but not SP) moved to the new Docker Swarm (na-arc-*) system.  See more.
  • 2021-03-17: ALMA re-starts limited science observations, resuming Cycle 7.  See more.
  • 2021-10-01: ALMA starts Cycle 8 observations.  See more.
  • 2022-02-03: Science Portal (SP) upgraded (Plone, Python, RHEL) and moved into Docker Swarm.  All other webapps were already in Docker Swarm.
  • 2022-04-18: First documented report of performance issues.  Webapps moved to pre-production Docker Swarm (natest-arc-*).  See more.
  • 2022-05-09: Moved Science Portal (SP) from Docker Swarm to an rsync copy on http://almaportal.cv.nrao.edu/ because of performance issues.
  • 2022-05-31: Moved Science Portal (SP) from the rsync copy back to Docker Swarm.
  • 2022-06-30: Tracy changed the eth0 MTU on the production Docker Swarm nodes (na-arc-*) from the default 1500 to 9000. The test swarm is still at 1500.
  • 2022-07-25: Jeff Kern asked K. Scott Rowe to head a tiger team to investigate the various issues that have affected the ALMA Archive.
  • 2022-08-11: Cloned na-arc-2, moved the clone to naasc-vs-3 as na-arc-3, and changed its MTU to 1500.  The other na-arc nodes are at 9000, but changing na-arc-3 to 9000 would require changing naasc-vs-3, which could affect other, non-archive VM guests.
  • 2022-08-12: Set up http://almaportal.cv.nrao.edu/, which uses the five na-arc nodes.  This is for internal testing.  Results show download speeds of about 32KB/s regardless of which na-arc node the web proxy chooses.
  • 2022-08-17 krowe: Changed eth0 on na-arc-5 from qdisc pfifo_fast to qdisc fq_codel to match all the other na-arc and natest-arc nodes.  This seemed to have no effect on performance.
    • tc qdisc replace dev eth0 root fq_codel
  • 2022-08-19 krowe: For some reason, all the swarm services on na-arc-5 shut down at about 11am Central on Aug. 18, 2022.  Now my wget tests are getting about 100MB/s, and I tested this five times to walk through all four nodes.  I then moved the httpd to na-arc-5, and now na-arc-[1,2,4] download at ~32KB/s while na-arc-[3,5] download at ~100MB/s.
  • 2022-08-25 krowe: Tracy changed the following sysctl options on na-arc-5 to match the other VM hosts.  Sadly, it seems to have had no effect on wget performance: na-arc-1, na-arc-2, and na-arc-4 are at 32KB/s while na-arc-3 and na-arc-5 are at 45MB/s.
    • net.ipv4.conf.all.accept_redirects = 0
    • net.ipv4.conf.all.forwarding = 1
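
The repeated download tests in the entries above could be scripted roughly as follows. This is only a sketch, not the procedure that was actually used: it swaps in curl for wget because curl can report the average rate directly, and TEST_URL, RUNS, format_rate and measure are hypothetical names. It assumes the web proxy picks a different na-arc node on successive requests, so several runs are needed to see every node.

```shell
#!/bin/sh
# Sketch: repeat a download and print the throughput of each run.
# TEST_URL (a large file served through the proxy) and RUNS are hypothetical
# settings supplied by the caller; nothing runs if TEST_URL is unset.

# Convert a rate in bytes per second into B/s, KB/s or MB/s (integer division).
format_rate() {
    bps=${1%.*}      # drop the fractional part that older curl versions print
    bps=${bps:-0}    # treat an empty value as zero
    if [ "$bps" -ge 1048576 ]; then
        echo "$((bps / 1048576))MB/s"
    elif [ "$bps" -ge 1024 ]; then
        echo "$((bps / 1024))KB/s"
    else
        echo "${bps}B/s"
    fi
}

# Download a URL, discard the body, and print the average speed in bytes/s.
measure() {
    curl -s -o /dev/null -w '%{speed_download}' "$1"
}

if [ -n "${TEST_URL:-}" ]; then
    i=1
    while [ "$i" -le "${RUNS:-5}" ]; do
        echo "run $i: $(format_rate "$(measure "$TEST_URL")")"
        i=$((i + 1))
    done
fi
```

Runs that land on a slow node should then stand out immediately, since the rates reported above differ by several orders of magnitude (tens of KB/s versus tens of MB/s).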

...