...

  • 2022-09-02 krowe: sysctl -a | grep br97 across naasc-vs-{3..5} are identical.
  • 2022-09-02 krowe: sysctl -a | grep <vnet> across naasc-vs-{3..5} are identical except for the vnet name (e.g. vnet2, vnet4, etc.)
  • 2022-09-02 krowe: sysctl -a | grep <10Gb NIC> across naasc-vs-{3..5} are different.
    • naasc-vs-3 is identical to naasc-vs-5 except for the name of the NIC (e.g. p5p1, p2p1).
    • naasc-vs-4 has entries for VLANs 101 and 140 while naasc-vs-3 and naasc-vs-5 have entries for VLANs 192 and 96.
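    • For reference, a comparison along these lines (a hedged sketch: assumes root ssh access, uses the NIC names noted above, and hypothetical /tmp paths) reproduces the check:
      # dump the 10Gb NIC sysctl entries on two hosts and diff them
      ssh naasc-vs-3 'sysctl -a 2>/dev/null | grep p5p1' > /tmp/vs3-nic.txt
      ssh naasc-vs-4 'sysctl -a 2>/dev/null | grep p2p1' > /tmp/vs4-nic.txt
      diff /tmp/vs3-nic.txt /tmp/vs4-nic.txt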
  • 2022-09-02 krowe: sysctl -a on naasc-vs-3 and naasc-vs-5 have no significant differences.
  • 2022-09-02 krowe: compared sysctl -a on naasc-vs-4 and naasc-vs-5 and found many questionable differences:
    • naasc-vs-4: net.iw_cm.default_backlog = 256
      • Is this because the IB modules are loaded?
    • naasc-vs-4: net.rdma_ucm.max_backlog = 1024
      • Is this because the IB modules are loaded?
    • naasc-vs-4: sunrpc.rdma*
      • Is this because the IB modules are loaded?
    • naasc-vs-4: net.netfilter.nf_log.2 = nfnetlink_log
      • nfnetlink is a module for packet mangling.  Could this interfere with the docker swarm networking?
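    • One way to check whether the RDMA/IB entries correspond to actual IB hardware (a hedged sketch; neither lspci nor lsmod output was captured here):
      lspci | grep -i -e mellanox -e infiniband    # is there an IB/RDMA-capable card at all?
      lsmod | egrep '^(ib_|rdma_|iw_|mlx)'         # which IB/RDMA modules are actually loaded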
  • 2022-09-06 krowe: CV-NEXUS switch port capabilities for naasc-vs-{3..5} are identical.
  • 2022-09-06 krowe: CV-NEXUS9K switch port capabilities for naasc-vs-{3..5} are identical.
    • Though the recorded output rate of naasc-vs-5 is about 500 Mb/s while that of naasc-vs-{3..4} is about 300 Kb/s.
    • And the recorded input rate of naasc-vs-5 is about 500 Mb/s while that of naasc-vs-{3..4} is about 5 Mb/s.
    • This is very strange: naasc-vs-5 seemed to be the limiting factor, but the switch ports suggest otherwise.  Perhaps this data rate is caused by other VM guests on naasc-vs-5 (helpdesk-prod, naascweb2-prod, cartaweb-prod, natest-arc-3, cobweb2-dev).
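    • The per-guest traffic theory could be checked on the VM host itself, e.g. (guest/vnet pairing is an assumption; map it with domiflist first):
      virsh domiflist helpdesk-prod          # which vnetX belongs to this guest
      virsh domifstat helpdesk-prod vnet2    # rx/tx byte counters for that interface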
  • 2022-09-06 krowe: ethtool -k <NIC> for naasc-vs-3 and naasc-vs-5 are identical except for the NIC name
  • 2022-09-06 krowe: ethtool -k <NIC> for naasc-vs-3 and naasc-vs-4 are very different.
    • hw-tc-offload: off vs hw-tc-offload: on
    • rx-gro-hw: off vs rx-gro-hw: on
    • rx-vlan-offload: off vs rx-vlan-offload: on
    • rx-vlan-stag-hw-parse: off vs rx-vlan-stag-hw-parse: on
    • tcp-segmentation-offload: off vs tcp-segmentation-offload: on
    • tx-gre-csum-segmentation: off vs tx-gre-csum-segmentation: on
    • tx-gre-segmentation: off vs tx-gre-segmentation: on
    • tx-gso-partial: off vs tx-gso-partial: on
    • tx-ipip-segmentation: off vs tx-ipip-segmentation: on
    • tx-sit-segmentation: off vs tx-sit-segmentation: on
    • tx-tcp-segmentation: off vs tx-tcp-segmentation: on
    • tx-udp_tnl-csum-segmentation: off vs tx-udp_tnl-csum-segmentation: on
    • tx-udp_tnl-segmentation: off vs tx-udp_tnl-segmentation: on
    • tx-vlan-offload: off vs tx-vlan-offload: on
    • tx-vlan-stag-hw-insert: off vs tx-vlan-stag-hw-insert: on
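    • If one of these offloads is suspected, a single feature can be toggled at runtime to test (a hedged sketch; not persistent across reboot, NIC name p2p1 per the naasc-vs-4 note above):
      ethtool -k p2p1 | grep tx-vlan-offload     # current setting
      ethtool -K p2p1 tx-vlan-offload off        # turn just this offload off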

...

  • Where is the main docker config (yaml file)?
  • Why does na-arc-5 still have net.ipv4.conf.all.accept_redirects = 1 even after a reboot while all the other na-arc nodes have this set to 0?
    • 2022-09-06 krowe: probably because na-arc-5 didn't reboot when naasc-vs-5 rebooted.  I expect it was suspended instead of rebooted.  Yet natest-arc-3 and naascweb2-prod were rebooted.  I just checked virt-manager and na-arc-5 is hosted by naasc-vs-5.  Can we reboot na-arc-5?
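    • If a reboot isn't an option, the value could be corrected at runtime as a test, e.g.:
      sysctl net.ipv4.conf.all.accept_redirects        # confirm it is still 1
      sysctl -w net.ipv4.conf.all.accept_redirects=0   # match the other na-arc nodes until the next reboot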
  • Why does naasc-vs-4 have all the infiniband modules loaded?  I don't see an IB card.  naasc-vs-1 and naasc-dev-vs also have some IB modules loaded but naasc-vs-3 and naasc-vs-5 don't have any IB modules loaded.
  • Why is nfnetlink logging enabled on naasc-vs-4?  You can see this with cat /proc/net/netfilter/nf_log and lsmod|grep -i nfnet
    • nfnetlink is a module for packet mangling.  Could this interfere with the docker swarm networking?
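    • As a test, the IPv4 logger binding could probably be cleared at runtime (hedged; protocol family 2 is IPv4 per /proc/net/netfilter/nf_log):
      cat /proc/net/netfilter/nf_log            # shows which logger is bound per protocol family
      sysctl -w net.netfilter.nf_log.2=NONE     # unbind the IPv4 logger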
  • Why are the eth1 interfaces in all the containers, and docker_gwbridge, on na-arc-1 in the 172.18.x.x range while all the other na-arc nodes are in the 172.19.x.x range?  Does it matter?
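    • The range is picked per node when docker_gwbridge is created, so a different subnet by itself may just reflect creation order.  It can be confirmed on each node with something like:
      docker network inspect docker_gwbridge --format '{{json .IPAM.Config}}'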
  • Here are some diffs in sysctl on na-arc nodes.  I tried changing na-arc-4 and na-arc-5 to match the others but performance was the same.  I then changed all the nodes to match na-arc-{1..3} and still saw no change in performance.  I still don't understand how na-arc-{4..5} got different settings.  I did find that there is another directory for sysctl settings in /usr/lib/sysctl.d but that isn't why these are different.
    • na-arc-1, na-arc-2, na-arc-3, natest-arc-1, natest-arc-2, natest-arc-3
      • net.bridge.bridge-nf-call-arptables = 0

        net.bridge.bridge-nf-call-ip6tables = 0

        net.bridge.bridge-nf-call-iptables = 1

    • na-arc-4, na-arc-5
      • net.bridge.bridge-nf-call-arptables = 1

        net.bridge.bridge-nf-call-ip6tables = 1

        net.bridge.bridge-nf-call-iptables = 1
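    • To pin na-arc-4 and na-arc-5 to the na-arc-{1..3} values persistently, a drop-in along these lines should work (the file name is arbitrary):
      printf 'net.bridge.bridge-nf-call-arptables = 0\nnet.bridge.bridge-nf-call-ip6tables = 0\nnet.bridge.bridge-nf-call-iptables = 1\n' > /etc/sysctl.d/98-bridge-nf.conf
      sysctl --system    # reloads /etc/sysctl.d and /usr/lib/sysctl.d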

  • I see sysctl differences between the natest-arc servers and the na-arc servers.  Here is a diff of /etc/sysctl.d/99-nrao.conf on natest-arc-1 and na-arc-5
    • < #net.ipv4.tcp_tw_recycle = 1
      ---
      > net.ipv4.tcp_tw_recycle = 1
      22,39d21
      < net.ipv4.conf.all.accept_redirects=0
      < net.ipv4.conf.default.accept_redirects=0
      < net.ipv4.conf.all.secure_redirects=0
      < net.ipv4.conf.default.secure_redirects=0
      <
      < #net.ipv6.conf.all.disable_ipv6 = 1
      < #net.ipv6.conf.default.disable_ipv6 = 1
      <
      < # Mellanox recommends the following
      < net.ipv4.tcp_timestamps = 0
      < net.core.netdev_max_backlog = 250000
      <
      < net.core.rmem_default = 16777216
      < net.core.wmem_default = 16777216
      < net.core.optmem_max = 16777216
      < net.ipv4.tcp_mem = 16777216 16777216 16777216
      < net.ipv4.tcp_low_latency = 1
    • If I set net.ipv4.tcp_timestamps = 0 on na-arc-5, the wget download drops to nothing (--.-KB/s).

    • If I set all the above sysctl options, except net.ipv4.tcp_timestamps, on all five na-arc nodes, wget download performance doesn't change.  It is still about 32KB/s.  Also I still see ZeroWindow packets.
    • Try rebooting VMs after making changes?
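    • A quick way to confirm the values actually persist across a reboot (assuming they were added to /etc/sysctl.d/99-nrao.conf) is to spot-check them afterwards:
      sysctl net.ipv4.tcp_timestamps net.core.netdev_max_backlog net.core.rmem_default net.core.wmem_default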
  • I see ZeroWindow packets sent from na-arc-5 to nangas13 while downloading a file from nangas13 using wget.  This is na-arc-5 telling nangas13 to wait because its network buffer is full.
    • Is this because of qdisc pfifo_fast?  No.  krowe changed eth0 to *qdisc fq_codel* and is still seeing ZeroWindow packets.
    • Now that I have moved the rh_download to na-arc-1 and put httpd on na-arc-5 I no longer see ZeroWindow packets on na-arc-5.  But I am seeing them on na-arc-1 which is where the rh_downloader is now.  Is this because the rh_downloader is being stalled talking to something else like httpd and therefore telling nangas13 to wait?
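    • The ZeroWindow packets can be isolated from a capture taken during a wget test, e.g. (interface and host names as used above; the pcap path is illustrative):
      tcpdump -i eth0 -w /tmp/rhdl.pcap host nangas13        # capture while the wget runs
      tshark -r /tmp/rhdl.pcap -Y tcp.analysis.zero_window   # list only the ZeroWindow packets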
  • Why does almaportal use ens3 while almascience uses eth0?
  • What if we move the rh-downloader container to a different node?  In fact walk it through all five nodes and test.
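    • With a swarm service this could be done via a placement constraint, e.g. (the service name here is an assumption; check docker service ls first):
      docker service ps rh-download                                                  # where the task runs now
      docker service update --constraint-add 'node.hostname==na-arc-2' rh-download   # pin it to another node for the test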
  • Why do I see cv-6509 when tracerouting from na-arc-5 to nangas13 but not when tracerouting from natest-arc-1?
    • [root@na-arc-5 ~]# traceroute nangas13
      traceroute to nangas13 (10.2.140.33), 30 hops max, 60 byte packets
       1  cv-6509-vlan97.cv.nrao.edu (10.2.97.1)  0.426 ms  0.465 ms  0.523 ms
       2  cv-6509.cv.nrao.edu (10.2.254.5)  0.297 ms  0.277 ms  0.266 ms
       3  nangas13.cv.nrao.edu (10.2.140.33)  0.197 ms  0.144 ms  0.109 ms
       
    • [root@natest-arc-1 ~]# traceroute nangas13
      traceroute to nangas13 (10.2.140.33), 30 hops max, 60 byte packets
       1  cv-6509-vlan96.cv.nrao.edu (10.2.96.1)  0.459 ms  0.427 ms  0.402 ms
       2  nangas13.cv.nrao.edu (10.2.140.33)  0.184 ms  0.336 ms  0.311 ms
    • Derek wrote that 10.2.99.1 = CV-NEXUS and 10.2.96.1 = CV-6509
  • Why does natest-arc-3 have ens3 instead of eth0 and why is its speed 100Mb/s?
    • virsh domiflist natest-arc-3 shows the Model as rtl8139 instead of virtio
    • When I run ethtool eth0 on na-arc-{1..5} and natest-arc-{1..2} as root, the result is just Link detected: yes instead of the full report with speed, while natest-arc-3 shows 100Mb/s.
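    • If the emulated rtl8139 is the reason for the 100Mb/s report, switching the guest NIC to virtio might be worth testing, e.g.:
      virsh dumpxml natest-arc-3 | grep -A3 '<interface'   # confirm the current model
      virsh edit natest-arc-3                              # change <model type='rtl8139'/> to <model type='virtio'/>, then shut down and start the guest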
  • Why do iperf tests from natest-arc-1 and natest-arc-2 to natest-arc-3 get about half the expected performance (0.5Gb/s), especially when the reverse tests get the expected performance (0.9Gb/s)?
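    • For repeatable numbers in both directions, something like this (assuming iperf3; the original tests may have used iperf2):
      iperf3 -s                        # on natest-arc-3
      iperf3 -c natest-arc-3 -t 30     # forward test from natest-arc-1 or natest-arc-2
      iperf3 -c natest-arc-3 -t 30 -R  # reverse test, the direction that currently performs as expected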
  • Is putting the production swarm nodes (na-arc-*) on the 10Gb/s network a good idea?  Sure, it makes a fast connection to cvsan, but it adds one more hop to the nangas servers (e.g. na-arc-1 -> cv-nexus9k -> cv-nexus -> nangas11).
  • When I connect to the container acralmaprod001.azurecr.io/offline-production/rh-download:2022.06.01.2022jun I get errors like "unknown user 1009".  I get the same errors on the natest-arc-1 container.
  • Does it matter that the na-arc nodes are on 10.2.97.x and their VM hosts are on 10.2.99.x, while the natest-arc nodes are on 10.2.96.x and their VM hosts (well, 2 out of 3) are also on 10.2.96.x?  Is this why I see cv-6509.cv.nrao.edu when running traceroute from the na-arc nodes?
  • When running wget --no-check-certificate http://na-arc-3.cv.nrao.edu:8088/dataPortal/member.uid___A001_X1358_Xd2.3C286_sci.spw31.cube.I.pbcor.fits I see traffic going through veth14ce034 on na-arc-3 but I can't find a container associated with that veth.
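    • A veth can usually be matched to its namespace by its peer ifindex, e.g. (checking eth0 here; other container interfaces may need the same check):
      ethtool -S veth14ce034 | grep peer_ifindex      # ifindex of the other end of the veth pair
      for c in $(docker ps -q); do echo -n "$c "; docker exec $c cat /sys/class/net/eth0/iflink 2>/dev/null; done
      # the container whose iflink matches peer_ifindex owns the veth; if none match, the peer may live in a
      # non-container namespace such as the swarm ingress sandbox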
  • Why does the httpd container have eth0 (10.0.0.8)?  This is the ingress network.  I don't see any other container with an interface on 10.0.0.0/24.
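    • Any service that publishes a port through the swarm routing mesh gets an address on the ingress network, so this may be expected for httpd.  What is attached to ingress can be checked from a manager node, e.g.:
      docker network inspect ingress --verbose    # --verbose adds swarm service/task details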

...