...

  • https://ictjira.alma.cl/browse/AES-52
  • https://confluence.alma.cl/pages/viewpage.action?pageId=91826715
  • You can see poor performance with a command like
    • wget --no-check-certificate http://almaportal.cv.nrao.edu/dataPortal/member.uid___A001_X1358_Xd2.3C286_sci.spw31.cube.I.pbcor.fits
  • krowe has narrowed this down to the ingress overlay network created by Docker Swarm, which is used to re-route traffic sent to the wrong host.
    • On na-arc-2
      • nsenter --net=/var/run/docker/netns/ingress_sbox
      • iperf3 -B 10.0.0.21 -s
    • On na-arc-3
      • nsenter --net=/var/run/docker/netns/ingress_sbox
      • iperf3 -B 10.0.0.19 -c 10.0.0.21
    • SOLUTION: Set rx-gro-hw=off on naasc-vs-4.  See Conclusions for more details.
  • 2022-09-19 krowe: With rx-gro-hw=off, retransmissions have decreased, but I still see them when sending to naasc-vs-5 and even more when sending to naasc-vs-2 but not on .  Sending to naasc-vs-3 or naasc-vs-4 does not produce retransmissions.  This is surprising given how similar naasc-vs-3 and naasc-vs-5 are.  I expect this is caused by congestion, as the retransmissions are not easily reproducible.
    • On naasc-vs-2
      • iperf3 -B 10.2.120.107 -s
    • On naasc-vs-3 or naasc-vs-4 or naasc-vs-5 or naasc-cont-1 or...
      • iperf3 -B <LOCAL_IP> -c 10.2.120.107
  • 2022-09-19 krowe: With rx-gro-hw=off, throughput over the vxlan/overlay network (like ingress_sbox) has improved but still ranges from 1Gb/s to 4.6Gb/s.  The network is rated at 10Gb/s, and VLAN and VXLAN introduce about a 10% overhead penalty, so I would expect throughput closer to 8Gb/s to 9Gb/s.  Granted, this isn't really an issue since at the moment the NGAS servers are on a 1Gb/s network, but that may change someday.
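
The rx-gro-hw setting and the throughput expectation in the notes above can be sketched in shell as follows. This is a rough illustration, not the exact procedure used on naasc-vs-4: the function names are made up for this example, the interface name is a placeholder argument (the notes do not say which NIC was changed), and the overhead percentage is only the approximate figure quoted above.

```shell
#!/bin/sh
# Show the current rx-gro-hw state of an interface.
# "$1" is a placeholder interface name; the notes do not name the actual NIC.
show_gro_hw() {
    ethtool -k "$1" | grep rx-gro-hw
}

# Turn hardware GRO off on an interface (needs root).
# A setting made with ethtool -K does not persist across a reboot by itself.
disable_gro_hw() {
    ethtool -K "$1" rx-gro-hw off
}

# Expected throughput in Gb/s for a link rate ($1, Gb/s) after subtracting
# an encapsulation overhead ($2, percent).
expected_gbps() {
    awk -v rate="$1" -v pct="$2" 'BEGIN { printf "%.1f\n", rate * (1 - pct / 100) }'
}

# 10Gb/s link with the roughly 10% VLAN/VXLAN overhead quoted in the notes.
expected_gbps 10 10
```

The last line prints 9.0, the upper end of the 8Gb/s to 9Gb/s expectation. The measured 1Gb/s to 4.6Gb/s is well below that, so encapsulation overhead alone would not seem to account for the gap.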

...