Poor Download Performance

TL;DR ethtool -K em1 gro off

This was first reported on 2022-04-18 and documented in https://ictjira.alma.cl/browse/AES-52  What we have seen, and what has been reported, is that downloads are sometimes incredibly slow (tens of kB/s) and sometimes the transfer is closed with data missing from the download.  Other times we see perfectly reasonable download speeds (~10 MB/s).  This was reproducible with a command like the following

...

From there you can use ip -c addr show to see the IPs and interfaces of the ingress network namespace on that node.  You can also use iperf3 to test this ingress network.  Here are the results for our nodes.  The values are rounded for simplicity.  Hosts across the top row are receiving while hosts along the left column are transmitting.  You can see that na-arc-3 and na-arc-5 show poor performance when transmitting to na-arc-1, na-arc-2, and na-arc-4.  This seems to implicate either naasc-vs-4 as the culprit, or na-arc-3 and na-arc-5 or their VM hosts as the culprits.  We weren't sure.

Table3: iperf3 to/from ingress_sbox (Mb/s)


TX \ RX     na-arc-1    na-arc-2    na-arc-3    na-arc-4    na-arc-5
            10.0.0.2    10.0.0.21   10.0.0.19   10.0.0.5    10.0.0.6
na-arc-1    -           4,000       2,000       4,000       3,000
na-arc-2    4,000       -           2,000       4,000       3,000
na-arc-3    0.3         0.3         -           0.3         3,000
na-arc-4    4,000       4,000       2,000       -           3,000
na-arc-5    0.3         0.3         2,000       0.3         -

On 2022-09-09 a sixth docker swarm node was created (na-arc-6) on a new VM host (naasc-vs-2).  We ran the iperf3 tests again over the ingress network and found the following

Table6: iperf3 TCP throughput from/to ingress_sbox (Mb/s)

TX \ RX     na-arc-1      na-arc-2      na-arc-3      na-arc-4      na-arc-5      na-arc-6
            (naasc-vs-4)  (naasc-vs-4)  (naasc-vs-3)  (naasc-vs-4)  (naasc-vs-5)  (naasc-vs-2)
na-arc-1    -             3920          2300          4200          3110          3280
na-arc-2    3950          -             2630          4000          3350          3530
na-arc-3    0.2           0.3           -             0.2           2720          2810
na-arc-4    3860          3580          2410          -             3390          3290
na-arc-5    0.2           0.2           2480          0.2           -             2550
na-arc-6    0.005         0.005         2790          0.005         3290          -
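
The pattern in Table6 is easier to see when the slow pairs are grouped by the VM host of the receiving node.  The following is only a minimal sketch, not the procedure we used: it assumes the measurements have been saved as whitespace-separated lines of "tx_node rx_node rx_vm_host Mb/s", and both the function name slow_by_rx_host and the 1 Mb/s threshold are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: count slow transmit/receive pairs per receiving VM host.
# Input (stdin): lines of "tx_node rx_node rx_vm_host mbps".
# Argument 1: threshold in Mb/s below which a pair counts as slow (default 1).
slow_by_rx_host() {
    awk -v limit="${1:-1}" '
        NF >= 4 && $4 + 0 < limit + 0 { slow[$3]++ }
        END { for (h in slow) print h, slow[h] }
    ' | sort
}

# A few of the Table6 values as sample input.
slow_by_rx_host 1 <<'EOF'
na-arc-3 na-arc-1 naasc-vs-4 0.2
na-arc-5 na-arc-2 naasc-vs-4 0.2
na-arc-6 na-arc-4 naasc-vs-4 0.005
na-arc-6 na-arc-3 naasc-vs-3 2790
na-arc-1 na-arc-6 naasc-vs-2 3280
EOF
```

With this sample input every slow pair lands on the same receiving VM host, so the only line printed is "naasc-vs-4 3".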

Seeing na-arc-6 also performing poorly when transmitting to nodes on naasc-vs-4 told us that there is something wrong with the receive end of naasc-vs-4.  So we started to look at the network settings in the kernel (sysctl), the network hardware, and the network hardware features (ethtool -k).  We found that the Network Interface Card (NIC) on naasc-vs-4 was different from those on the other naasc-vs hosts

  • naasc-vs-2 uses a Solarflare Communications SFC9220
  • naasc-vs-3 uses a Solarflare Communications SFC9020
  • naasc-vs-4 uses a Broadcom BCM57412 NetXtreme-E
  • naasc-vs-5 uses a Solarflare Communications SFC9020

There were some sysctl settings that were suspicious

  • naasc-vs-4 has entries for VLANs 101 and 140 while naasc-vs-3 and naasc-vs-5 have entries for VLANs 192 and 96.
  • naasc-vs-4: net.iw_cm.default_backlog = 256  Is this because the IB modules are loaded?
  • naasc-vs-4: net.rdma_ucm.max_backlog = 1024  Is this because the IB modules are loaded?
  • naasc-vs-4: sunrpc.rdma*  Is this because the IB modules are loaded?
  • naasc-vs-4: net.netfilter.nf_log.2 = nfnetlink_log

But the real breakthrough was in the NIC features.  You can see them with ethtool -k <NIC>.  There were many differences, but we found that naasc-vs-4 had rx-gro-hw: on while all the other naasc-vs hosts had it set to off.  This feature is Generic Receive Offload (GRO) done in hardware on the physical NIC.  GRO is an aggregation technique that coalesces several received packets from a stream into a single large packet, thus saving CPU cycles as fewer packets need to be processed by the kernel.  The Solarflare cards don't have this feature.  I found articles suggesting that GRO can make traffic slower when it is enabled, especially when using VXLAN, which the docker swarm ingress network uses.
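
Picking out a difference like this by eye across several hosts is tedious.  Below is a sketch of one way to compare two hosts, assuming the output of ethtool -k <NIC> from each host has been saved to its own file.  The function name feature_diff and the file names in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print offload features whose value differs between two
# saved "ethtool -k <NIC>" outputs. Lines look like "rx-gro-hw: on" and may
# carry a trailing "[fixed]" marker, which is ignored here.
feature_diff() {
    awk -F': *' '
        /^Features for/ { next }                      # skip the header line
        { gsub(/^[ \t]+/, "", $1); split($2, v, " ") } # strip indent, keep on/off
        NR == FNR { a[$1] = v[1]; next }              # first file: remember values
        ($1 in a) && a[$1] != v[1] { print $1 ": " a[$1] " -> " v[1] }
    ' "$1" "$2"
}

# Usage (file names are placeholders):
#   ethtool -k em1 > naasc-vs-4.txt      # run on naasc-vs-4
#   ethtool -k <NIC> > naasc-vs-3.txt    # run on naasc-vs-3
#   feature_diff naasc-vs-4.txt naasc-vs-3.txt
```

Only features present in both files are compared, so a feature that one driver does not report at all will not show up in the output.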

On 2022-09-16 we disabled this feature on naasc-vs-4 with ethtool -K em1 gro off and iperf3 tests now show between about 1 Gb/s and 4 Gb/s in both directions.
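
The check and the change can be sketched as follows.  em1 is the interface on naasc-vs-4 named above; the helper name gro_state is hypothetical.  Keep in mind that features changed with ethtool -K are runtime settings and do not survive a reboot on their own, so the change also has to be applied through whatever network configuration mechanism the host uses.

```shell
#!/bin/sh
# Hypothetical helper: reduce "ethtool -k <NIC>" output (stdin) to the two
# GRO-related lines discussed above.
gro_state() {
    grep -E '^[[:space:]]*(generic-receive-offload|rx-gro-hw):'
}

# On the VM host (needs root; em1 is the NIC on naasc-vs-4):
#   ethtool -k em1 | gro_state     # before: rx-gro-hw was "on"
#   ethtool -K em1 gro off         # the change made on 2022-09-16
#   ethtool -k em1 | gro_state     # after: check that rx-gro-hw now reports "off"
```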

Table7: iperf3 TCP throughput from/to ingress_sbox with rx-gro-hw=off (Mb/s)

TX \ RX     na-arc-1      na-arc-2      na-arc-3      na-arc-4      na-arc-5      na-arc-6
            (naasc-vs-4)  (naasc-vs-4)  (naasc-vs-3)  (naasc-vs-4)  (naasc-vs-5)  (naasc-vs-2)
na-arc-1    -             4460          2580          4630          2860          3150
na-arc-2    4060          -             2590          4220          3690          2570
na-arc-3    2710          2580          -             3080          2770          2920
na-arc-4    1090          3720          2200          -             2970          3200
na-arc-5    4010          3970          2340          4010          -             3080
na-arc-6    3380          3060          3060          3010          3080          -