...
- 2022-09-26 krowe: Can an older Solarflare card (Solarflare Communications SFC9020) replace the card in naasc-vs-2 to see if that helps with the TCP retransmissions?
- 2022-09-28 krowe: When thalstea replaced the card with an old SFC9020 card from cv-vs-1, the machine would not boot, so the original SFC9022 is back in naasc-vs-2. See ticket https://support.nrao.edu/show-ticket.php?ticketid=145153 for details.
- 2022-09-26 krowe: Can someone who is able to log in, log in to the nodes on the 10.2.120 network and see if those interfaces are showing dropped RX packets? I would, but I can't log in to most of them because they are CV machines.
- 2022-09-21 krowe: Why are there dozens of stuck inventory processes on naasc-vs-2?
- 2022-09-20 krowe: ifconfig shows dropped RX packets on all naasc-vs-* nodes. Is that count still increasing with time? What is causing this? CJ mentioned this two months ago. I am finally looking at it now. Sigh.
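- A quick way to answer the "still increasing?" part is to snapshot the per-interface counters, wait, and diff (a sketch; run on each naasc-vs-* node, the interval is arbitrary):
for f in /sys/class/net/*/statistics/rx_dropped; do echo "$f $(cat $f)"; done > /tmp/rx_dropped.1
sleep 600   # wait 10 minutes, then take a second snapshot
for f in /sys/class/net/*/statistics/rx_dropped; do echo "$f $(cat $f)"; done > /tmp/rx_dropped.2
diff /tmp/rx_dropped.1 /tmp/rx_dropped.2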
- 2022-09-20 krowe: It looks like device eno1 on naasc-vs-2 is configured via DHCP instead of STATIC. Is that correct?
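- To confirm how eno1 is configured, something like this should show it (a sketch; assumes the usual RHEL ifcfg layout and that a NetworkManager connection named eno1 exists):
grep -i bootproto /etc/sysconfig/network-scripts/ifcfg-eno1
nmcli -f ipv4.method connection show eno1   # "auto" means DHCP, "manual" means static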
- 2022-09-20 krowe: Why does naasc-vs-2 have APIPA-configured networks (169.254.0.0)? Aren't these usually created only when a network is misconfigured? See the routing table below and the sketch after it.
[root@naasc-vs-2 ~]# netstat -nr
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
0.0.0.0 10.2.99.1 0.0.0.0 UG 0 0 0 eno1
10.2.99.0 0.0.0.0 255.255.255.0 U 0 0 0 eno1
10.2.120.0 0.0.0.0 255.255.255.0 U 0 0 0 ens1f0np0.120
169.254.0.0 0.0.0.0 255.255.0.0 U 0 0 0 ens1f0np0
169.254.0.0 0.0.0.0 255.255.0.0 U 0 0 0 ens1f0np0.120
169.254.0.0 0.0.0.0 255.255.0.0 U 0 0 0 br97
169.254.0.0 0.0.0.0 255.255.0.0 U 0 0 0 br101
192.168.122.0 0.0.0.0 255.255.255.0 U 0 0 0 virbr0
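- If those 169.254.0.0 routes are just the default zeroconf routes that the RHEL network scripts add per interface (rather than evidence of a misconfigured network), they can be suppressed. A sketch, assuming the legacy network-scripts are in use:
echo "NOZEROCONF=yes" >> /etc/sysconfig/network
systemctl restart network   # then verify with: netstat -nr | grep 169.254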
- Why can't I download via na-arc-6? I don't think it is properly set up yet. See the reachability checks after the wget output below.
- wget --no-check-certificate http://na-arc-6.cv.nrao.edu:8088/dataPortal/member.uid___A001_X1284_Xc9b.spt2349-56_sci.spw19.cube.I.pbcor.fits
--2022-09-15 10:22:32-- http://na-arc-6.cv.nrao.edu:8088/dataPortal/member.uid___A001_X1284_Xc9b.spt2349-56_sci.spw19.cube.I.pbcor.fits
Resolving na-arc-6.cv.nrao.edu (na-arc-6.cv.nrao.edu)... 10.2.97.76
Connecting to na-arc-6.cv.nrao.edu (na-arc-6.cv.nrao.edu)|10.2.97.76|:8088... failed: Connection timed out.
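- To narrow down whether this is a listener problem or a network/firewall problem, a couple of quick checks (a sketch; the ss is run on na-arc-6 itself, the curl from the client side):
ss -tlnp | grep 8088                                                     # on na-arc-6: is anything actually listening on 8088?
curl -sv --max-time 10 -o /dev/null http://na-arc-6.cv.nrao.edu:8088/   # "Connection refused" vs a timeout points to different causes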
- Why, with rx-gro-hw=off on naasc-vs-4, does na-arc-6 see so many retransmissions and small Congestion Window (Cwnd)?
[root@na-arc-6 ~]# iperf3 -B 10.0.0.16 -c 10.0.0.21
Connecting to host 10.0.0.21, port 5201
[ 4] local 10.0.0.16 port 38534 connected to 10.0.0.21 port 5201
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 302 MBytes 2.54 Gbits/sec 523 207 KBytes
[ 4] 1.00-2.00 sec 322 MBytes 2.70 Gbits/sec 596 186 KBytes
[ 4] 2.00-3.00 sec 312 MBytes 2.62 Gbits/sec 687 245 KBytes
[ 4] 3.00-4.00 sec 335 MBytes 2.81 Gbits/sec 638 278 KBytes
[ 4] 4.00-5.00 sec 309 MBytes 2.60 Gbits/sec 780 146 KBytes
[root@na-arc-3 ~]# iperf3 -B 10.0.0.19 -c 10.0.0.21
Connecting to host 10.0.0.21, port 5201
[ 4] local 10.0.0.19 port 52986 connected to 10.0.0.21 port 5201
[ ID] Interval Transfer Bandwidth Retr Cwnd
[ 4] 0.00-1.00 sec 309 MBytes 2.59 Gbits/sec 232 638 KBytes
[ 4] 1.00-2.00 sec 358 MBytes 3.00 Gbits/sec 0 967 KBytes
[ 4] 2.00-3.00 sec 351 MBytes 2.95 Gbits/sec 0 1.18 MBytes
[ 4] 3.00-4.00 sec 339 MBytes 2.84 Gbits/sec 74 1.36 MBytes
[ 4] 4.00-5.00 sec 359 MBytes 3.01 Gbits/sec 0 1.54 MBytes
- Actually, the retransmissions seem to vary quite a lot from one run to another; that is the more important question. The throughput also varies, from 1Gb/s to 4Gb/s, and of course the more retransmissions, the less throughput. Granted, this is a second-order effect and, given that the nangas hosts have 1Gb/s links, it probably won't be seen. But if we ever put 10Gb/s cards in the nangas nodes we will see this and be sad.
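- To quantify how much the retransmissions and throughput really vary, a loop like this collects the numbers from several runs (a sketch; assumes an iperf3 server is already running on 10.0.0.21 and that jq is installed):
for i in $(seq 1 10); do
    iperf3 -J -B 10.0.0.16 -c 10.0.0.21 | jq '{bps: .end.sum_sent.bits_per_second, retrans: .end.sum_sent.retransmits}'
done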
- Why does naasc-vs-3 have a br120 in state UNKNOWN? None of the other naasc-vs nodes have a br120.
- Why does naasc-vs-4 have all the infiniband modules loaded? I don't see an IB card. naasc-vs-1 and naasc-dev-vs also have some IB modules loaded but naasc-vs-3 and naasc-vs-5 don't have any IB modules loaded.
- Tracy will look into this
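- A sketch for checking which IB/Mellanox modules are loaded and whether any IB hardware is actually present on a given host:
lsmod | egrep '^(ib_|mlx|rdma)'
lspci | egrep -i 'mellanox|infiniband'   # if this prints nothing, the modules are loaded with no card behind them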
- Why is nfnetlink logging enabled on naasc-vs-4? You can see this with cat /proc/net/netfilter/nf_log and lsmod|grep -i nfnet
- nfnetlink is the netfilter netlink interface, used by modules like nfnetlink_log for packet logging. Could this interfere with the docker swarm networking?
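- To see which logger is bound for each protocol family, and (if it turns out to be unneeded) to unbind it, something like this should work (a sketch; the per-family files under /proc/sys/net/netfilter/nf_log vary by kernel, and writing NONE is supposed to unbind the logger for that family; family 2 = IPv4):
cat /proc/net/netfilter/nf_log
cat /proc/sys/net/netfilter/nf_log/2
echo NONE > /proc/sys/net/netfilter/nf_log/2   # unbind, then re-check /proc/net/netfilter/nf_log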
- Why are the eth1 interfaces in all the containers, and docker_gwbridge, on na-arc-1 in the 172.18.x.x range, while on all the other na-arcs they are in the 172.19.x.x range? Does it matter?
- Here are some diffs in sysctl settings on the na-arc nodes (a sketch for making them consistent follows the values below). I tried changing na-arc-4 and na-arc-5 to match the others, but performance was the same. I then changed all the nodes to match na-arc-{1..3} and still saw no change in performance. I still don't understand how na-arc-{4..5} got different settings. I did find that there is another directory for sysctl settings in /usr/lib/sysctl.d, but that isn't why these are different.
- na-arc-1, na-arc-2, na-arc-3, natest-arc-1, natest-arc-2, natest-arc-3
net.bridge.bridge-nf-call-arptables = 0
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 1
- na-arc-4, na-arc-5
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
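- A sketch for forcing the bridge-nf-call settings to be identical everywhere (the drop-in file name is arbitrary; note these sysctls only exist while br_netfilter is loaded):
cat > /etc/sysctl.d/90-bridge-nf.conf <<'EOF'
net.bridge.bridge-nf-call-arptables = 0
net.bridge.bridge-nf-call-ip6tables = 0
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system   # re-apply; verify with: sysctl -a 2>/dev/null | grep bridge-nf-call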
- I see sysctl differences between the natest-arc servers and the na-arc servers. Here is a diff of /etc/sysctl.d/99-nrao.conf on natest-arc-1 and na-arc-5
< #net.ipv4.tcp_tw_recycle = 1
---
> net.ipv4.tcp_tw_recycle = 1
22,39d21
< net.ipv4.conf.all.accept_redirects=0
< net.ipv4.conf.default.accept_redirects=0
< net.ipv4.conf.all.secure_redirects=0
< net.ipv4.conf.default.secure_redirects=0
<
< #net.ipv6.conf.all.disable_ipv6 = 1
< #net.ipv6.conf.default.disable_ipv6 = 1
<
< # Mellanox recommends the following
< net.ipv4.tcp_timestamps = 0
< net.core.netdev_max_backlog = 250000
<
< net.core.rmem_default = 16777216
< net.core.wmem_default = 16777216
< net.core.optmem_max = 16777216
< net.ipv4.tcp_mem = 16777216 16777216 16777216
< net.ipv4.tcp_low_latency = 1
- If I set net.ipv4.tcp_timestamps = 0 on na-arc-5, the wget download drops to nothing (--.-KB/s).
- If I set all the above sysctl options, except net.ipv4.tcp_timestamps, on all five na-arc nodes, wget download performance doesn't change. It is still about 32KB/s, and I still see ZeroWindow packets.
- Try rebooting VMs after making changes?
- I see ZeroWindow packets sent from na-arc-5 to nangas13 while downloading a file from nangas13 using wget. This is na-arc-5 telling nangas13 to wait because its network buffer is full.
- Is this because of qdisc pfifo_fast? No. krowe changed eth0 to *qdisc fq_codel* and is still seeing ZeroWindow packets.
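- For reference, the qdisc change was along these lines (a sketch; eth0 is the interface in question):
tc qdisc show dev eth0
tc qdisc replace dev eth0 root fq_codel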
- Now that I have moved the rh_download container to na-arc-1 and put httpd on na-arc-5, I no longer see ZeroWindow packets on na-arc-5, but I am seeing them on na-arc-1, which is where the rh_downloader now runs. Is this because the rh_downloader is being stalled talking to something else, like httpd, and is therefore telling nangas13 to wait?
- Why does almaportal use ens3 while almascience uses eth0?
- What if we move the rh-downloader container to a different node? In fact walk it through all five nodes and test.
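- In a swarm, pinning the service to one node at a time can be done with a placement constraint (a sketch; the service name rh-downloader is assumed here, substitute the real one from docker service ls):
docker service ls
docker service update --constraint-add node.hostname==na-arc-2 rh-downloader
# test the download, then repeat with --constraint-rm / --constraint-add for each of the five nodes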
- Why do I see cv-6509 when tracerouting from na-arc-5 to nangas13 but not from natest-arc-1?
[root@na-arc-5 ~]# traceroute nangas13
traceroute to nangas13 (10.2.140.33), 30 hops max, 60 byte packets
1 cv-6509-vlan97.cv.nrao.edu (10.2.97.1) 0.426 ms 0.465 ms 0.523 ms
2 cv-6509.cv.nrao.edu (10.2.254.5) 0.297 ms 0.277 ms 0.266 ms
3 nangas13.cv.nrao.edu (10.2.140.33) 0.197 ms 0.144 ms 0.109 ms
[root@natest-arc-1 ~]# traceroute nangas13
traceroute to nangas13 (10.2.140.33), 30 hops max, 60 byte packets
1 cv-6509-vlan96.cv.nrao.edu (10.2.96.1) 0.459 ms 0.427 ms 0.402 ms
2 nangas13.cv.nrao.edu (10.2.140.33) 0.184 ms 0.336 ms 0.311 ms
- Derek wrote that 10.2.99.1 = CV-NEXUS and 10.2.96.1 = CV-6509
- Why does natest-arc-3 have ens3 instead of eth0 and why is its speed 100Mb/s?
- virsh domiflist natest-arc-3 shows the Model as rtl8139 instead of virtio
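- A sketch of how to confirm and change the NIC model (the model change requires a full shutdown and start of the guest to take effect):
virsh domiflist natest-arc-3     # shows the current model (rtl8139)
virsh edit natest-arc-3          # change <model type='rtl8139'/> to <model type='virtio'/> in the interface stanza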
- When I run ethtool eth0 on na-arc-{1..5} and natest-arc-{1..2} as root, the result is just Link detected: yes instead of the full report with a speed, while natest-arc-3 shows the full report with 100Mb/s.
- Why do iperf tests from natest-arc-1 and natest-arc-2 to natest-arc-3 get about half the expected performance (0.5Gb/s), especially when the reverse tests get the expected performance (0.9Gb/s)?
- Is putting the production swarm nodes (na-arc-*) on the 10Gb/s network a good idea? Sure it makes a fast connection to cvsan but it adds one more hop to the nangas servers (e.g. na-arc-1 -> cv-nexus9k -> cv-nexus -> nangas11)
- When I connect to the container acralmaprod001.azurecr.io/offline-production/rh-download:2022.06.01.2022jun I get errors like "unknown user 1009". I get the same errors on the natest-arc-1 container.
- Does it matter that the na-arc nodes are on 10.2.97.x and their VM hosts are on 10.2.99.x, while the natest-arc nodes are on 10.2.96.x and their VM hosts (well, 2 out of 3) are also on 10.2.96.x? Is this why I see cv-6509.cv.nrao.edu when running traceroute from the na-arc nodes?
- When running wget --no-check-certificate http://na-arc-3.cv.nrao.edu:8088/dataPortal/member.uid___A001_X1358_Xd2.3C286_sci.spw31.cube.I.pbcor.fits I see traffic going through veth14ce034 on na-arc-3 but I can't find a container associated with that veth.
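- One way to map a veth back to whatever owns it is to compare ifindexes (a sketch, run on na-arc-3): every interface inside a container reports the ifindex of its host-side peer in iflink.
ip -o link show veth14ce034      # the leading number is the host-side ifindex of this veth
for c in $(docker ps -q); do
    echo "== $c $(docker inspect -f '{{.Name}}' $c)"
    docker exec "$c" sh -c 'for i in /sys/class/net/*/iflink; do echo "$i $(cat $i)"; done' 2>/dev/null
done
- If no container reports an iflink matching that veth's ifindex, the peer probably lives in one of the hidden swarm namespaces under /var/run/docker/netns (e.g. the ingress sandbox), which would explain why no container appears to be associated with it.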
- Why does the httpd container have eth0 (10.0.0.8)? This is the ingress network. I don't see any other container with an interface on 10.0.0.0/24.
- Do we want to use jumbo frames? If so, some recommend using mtu=8900 and there are a lot of places it needs to be set.
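- If we do try jumbo frames, the MTU has to be raised consistently end to end: the switches, the VM-host NICs and bridges, the VM interfaces, and the docker networks. A sketch of two of those places (the overlay network name my-overlay is hypothetical, and overlay networks only honor an MTU given at creation time, so recreating them is disruptive):
ip link set dev eno1 mtu 8900                 # runtime change on a host NIC; add MTU=8900 to the ifcfg file to persist it
docker network create -d overlay --opt com.docker.network.driver.mtu=8900 my-overlay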
...