
Poor Download Performance

This was first reported on 2022-04-18 and documented in https://ictjira.alma.cl/browse/AES-52.  What we have seen, and what has been reported, is that downloads are sometimes incredibly slow (tens of kB/s) and sometimes the transfer is closed with data missing from the download.  Other times we see perfectly reasonable download speeds (~10 MB/s).  This was reproducible with a command like the following:

wget --no-check-certificate http://almascience.nrao.edu/dataPortal/member.uid___A001_X1358_Xd2.3C286_sci.spw31.cube.I.pbcor.fits

Shortly after this report, the almascience portal was redirected from the production docker swarm to the test-prod docker swarm because the latter produced better download performance, although still not as good as expected (tens of MB/s).  Also, around this time the MTUs on the production docker swarm nodes were changed from 1500 to 9000.
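
For reference, checking and changing an interface MTU with iproute2 looks something like the following.  The interface name p5p1.120 is borrowed from the notes below; whether it is the right interface on any given host is an assumption.

ip link show p5p1.120                 # reports the current MTU, among other things
ip link set dev p5p1.120 mtu 9000     # temporary; a persistent change belongs in the interface's network config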

It was noticed that one of the production docker swarm nodes, na-arc-3, was configured differently from the other na-arc-* nodes:

  • pinging na-arc-[1,2,4,5] from na-arc-3 with a packet size larger than -s 1490 drops all packets (see the example commands after this list)
  • iperf tests show 10Gb/s between the VM host of na-arc-3 (naasc-vs-3, p5p1.120) and the VM host of na-arc-5 (naasc-vs-5, p2p1.120), so it isn't a bad card in either of the VM hosts.
  • iptables rules on na-arc-3 look different from those on na-arc-[2,4,5].  na-arc-1 also looks a bit different.
  • the docker_gwbridge interface on na-arc-[1,2,4,5] shows NO_CARRIER, but not on na-arc-3.
  • na-arc-3 has a veth10fd1da@if37 interface.  None of the other na-arc-* nodes have a veth interface.
  • iperf3 tests between all the na-arc-* nodes showed na-arc-3 performing about 10^4 times slower on both sending and receiving.
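
A sketch of the tests behind the first and last bullets, assuming iperf3 is installed on the nodes involved; the exact options used at the time were not recorded, so the -M do and -c flags here are illustrative:

# From na-arc-3: -M do forbids fragmentation, so an MTU/path problem
# shows up as 100% packet loss once -s exceeds the working size
ping -M do -s 1492 -c 5 na-arc-1

# Node-to-node throughput: start a server on na-arc-1 ...
iperf3 -s
# ... then connect to it from na-arc-3
iperf3 -c na-arc-1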

Given the number of issues with na-arc-3, it was decided to just recreate it from a clone of na-arc-2.  This happened on 2022-08-11, and since then iperf3 tests between all the na-arc-* nodes have shown expected performance.

On 2022-08-12, http://almaportal.cv.nrao.edu/ was created so that we could internally test the production docker swarm nodes in a manner similar to how external users would use them.  Now tests could be run against almaportal just like against almascience, e.g.

wget --no-check-certificate https://almaportal.cv.nrao.edu/dataPortal/2013.1.00226.S_uid___A001_X122_X1f1_001_of_001.tar

On 2022-08-19, naasc-vs-5 lost its heartbeat with the docker swarm, which caused all the swarm services on na-arc-5 to shut down about 11am Central and move to other na-arc nodes.  The reason for this lost heartbeat is unknown, but it could have been user error.  After this event, wget tests started downloading at around 100MB/s.  The node na-arc-5 had been running several services, including the rh-download service.  So I moved the rh-download service back to na-arc-5 with docker service update --force production_requesthandler_download and found wget performance was back to about 32KB/s.  I then moved rh-download from na-arc-5 back to na-arc-2 with docker node update --availability drain na-arc-5 and found wget performance was back to about 100MB/s.  I ran the wget test four times to make sure the web proxy walked through all the na-arc nodes.  I then moved the httpd service from na-arc-2 to na-arc-5 and found wget performance varied from about 32KB/s to about 100MB/s from test to test.  Using wget to access each na-arc node directly, instead of going through the web proxy's round-robin selection, showed that performance depended on which na-arc node the wget command hit (the service-moving commands are sketched after the list below).  E.g.

  • wget --no-check-certificate http://na-arc-1.cv.nrao.edu:8088/dataPortal/member.uid___A001_X122_X1f1.LKCA_15_13CO_cube.image.fits  → 32KB/s
  • wget --no-check-certificate http://na-arc-2.cv.nrao.edu:8088/dataPortal/member.uid___A001_X122_X1f1.LKCA_15_13CO_cube.image.fits  → 32KB/s
  • wget --no-check-certificate http://na-arc-3.cv.nrao.edu:8088/dataPortal/member.uid___A001_X122_X1f1.LKCA_15_13CO_cube.image.fits  → 100MB/s
  • wget --no-check-certificate http://na-arc-4.cv.nrao.edu:8088/dataPortal/member.uid___A001_X122_X1f1.LKCA_15_13CO_cube.image.fits  → 32KB/s
  • wget --no-check-certificate http://na-arc-5.cv.nrao.edu:8088/dataPortal/member.uid___A001_X122_X1f1.LKCA_15_13CO_cube.image.fits  → 100MB/s
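
As noted above, the services were shuffled by forcing reschedules and draining nodes.  A sketch of the pattern; the first two commands are the ones quoted above, while the drain/active pairing is standard docker swarm practice rather than a verbatim record of what was run:

# Force the service to be rescheduled by the swarm
docker service update --force production_requesthandler_download

# Drain a node so the scheduler moves its services elsewhere ...
docker node update --availability drain na-arc-5
# ... then make it schedulable again once testing is done
docker node update --availability active na-arc-5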

This was a huge breakthrough: we could now see both the poor performance that users were seeing before the almascience portal was redirected and the desired, expected performance.  It also implicated naasc-vs-4 as the problem, since na-arc-1, na-arc-2, and na-arc-4 were all hosted on naasc-vs-4.

On 2022-08-31 we learned how to perform iperf3 tests over the docker swarm overlay network known as ingress.  This is the network docker swarm uses to redirect traffic that arrives at a node other than the one running the requested service.  You can do this by logging into a docker swarm node like na-arc-1 and starting a shell in the ingress_sbox namespace like so:

nsenter --net=/var/run/docker/netns/ingress_sbox

From there you can use ip -c addr show to see the IPs and interfaces of the ingress network namespace on that node.  You can also use iperf3 to test this ingress network (example commands follow the table below).  Here are the results for our nodes; the values are rounded for simplicity.  Hosts across the top row are receiving while hosts along the left column are transmitting.  You can see that na-arc-3 and na-arc-5 show poor performance when transmitting to na-arc-1, na-arc-2, and na-arc-4.  This seemed to implicate either naasc-vs-4 as the culprit, or na-arc-3 and na-arc-5 (or their VM hosts) as the culprits.  We weren't sure.

Table3: iperf3 to/from ingress_sbox (Mb/s)

              na-arc-1    na-arc-2    na-arc-3    na-arc-4    na-arc-5
              10.0.0.2    10.0.0.21   10.0.0.19   10.0.0.5    10.0.0.6
na-arc-1         -         4,000       2,000       4,000       3,000
na-arc-2       4,000         -         2,000       4,000       3,000
na-arc-3         0.3         0.3         -           0.3       3,000
na-arc-4       4,000       4,000       2,000         -         3,000
na-arc-5         0.3         0.3       2,000         0.3         -
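
As promised above, here is how an iperf3 test over the ingress network can be run end to end.  The IPs are the ingress_sbox addresses from Table3; it is assumed iperf3 is installed on both nodes:

# On the receiver (na-arc-1): enter the ingress namespace and start a server
nsenter --net=/var/run/docker/netns/ingress_sbox
ip -c addr show     # confirm this node's ingress IP (10.0.0.2 for na-arc-1)
iperf3 -s

# On the transmitter (na-arc-3): enter the same namespace and run the client
nsenter --net=/var/run/docker/netns/ingress_sbox
iperf3 -c 10.0.0.2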

On 2022-09-09 a sixth docker swarm node (na-arc-6) was created on a new VM host (naasc-vs-2).  We ran the iperf3 tests over the ingress network again and found the following:

Table6: iperf3 TCP throughput from/to ingress_sbox (Mb/s); as before, hosts across the top are receiving and hosts down the left are transmitting

              na-arc-1       na-arc-2       na-arc-3       na-arc-4       na-arc-5       na-arc-6
              (naasc-vs-4)   (naasc-vs-4)   (naasc-vs-3)   (naasc-vs-4)   (naasc-vs-5)   (naasc-vs-2)
na-arc-1          -            3920           2300           4200           3110           3280
na-arc-2        3950             -            2630           4000           3350           3530
na-arc-3           0.2            0.3           -              0.2          2720           2810
na-arc-4        3860           3580           2410             -            3390           3290
na-arc-5           0.2            0.2         2480             0.2            -            2550
na-arc-6           0.005          0.005       2790             0.005        3290             -

Seeing na-arc-6 also perform poorly when transmitting to nodes on naasc-vs-4 told us that there was something wrong with the receive end of naasc-vs-4.
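
One way to double-check a receive-side suspicion like this (a methodological note, not necessarily a test we ran) is iperf3's reverse mode: the endpoints stay put, but -R makes the server transmit, so both directions of the same path can be compared from one login:

# Server inside ingress_sbox on a naasc-vs-4 node, e.g. na-arc-1 (10.0.0.2)
iperf3 -s

# From na-arc-6: normal mode exercises naasc-vs-4's receive path (0.005 Mb/s in Table6)
iperf3 -c 10.0.0.2
# Reverse mode exercises na-arc-1's transmit path instead (~3280 Mb/s in Table6)
iperf3 -c 10.0.0.2 -R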
