public inbox for netdev@vger.kernel.org
From: Miquel Raynal <miquel.raynal@bootlin.com>
To: Eric Dumazet <edumazet@google.com>
Cc: "Russell King (Oracle)" <linux@armlinux.org.uk>,
	Wei Fang <wei.fang@nxp.com>, Shenwei Wang <shenwei.wang@nxp.com>,
	Clark Wang <xiaoning.wang@nxp.com>,
	davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com,
	linux-imx@nxp.com, netdev@vger.kernel.org,
	Thomas Petazzoni <thomas.petazzoni@bootlin.com>,
	Alexandre Belloni <alexandre.belloni@bootlin.com>,
	Maxime Chevallier <maxime.chevallier@bootlin.com>,
	Andrew Lunn <andrew@lunn.ch>,
	Stephen Hemminger <stephen@networkplumber.org>
Subject: Re: Ethernet issue on imx6
Date: Mon, 16 Oct 2023 15:58:58 +0200
Message-ID: <20231016155858.7af3490b@xps-13>
In-Reply-To: <CANn89iKC9apkRG80eBPqsdKEkdawKzGt9EsBRLm61H=4Nn4jQQ@mail.gmail.com>

Hi Eric,

edumazet@google.com wrote on Mon, 16 Oct 2023 13:49:25 +0200:

> On Fri, Oct 13, 2023 at 10:40 AM Miquel Raynal
> <miquel.raynal@bootlin.com> wrote:
> >
> > Hi Russell,
> >
> > linux@armlinux.org.uk wrote on Thu, 12 Oct 2023 20:39:11 +0100:
> >  
> > > On Thu, Oct 12, 2023 at 07:34:10PM +0200, Miquel Raynal wrote:  
> > > > Hello,
> > > >
> > > > > I've been scratching my head for weeks over a strange imx6
> > > > > network issue. I need help to go further, as I feel a bit clueless now.
> > > >
> > > > Here is my setup :
> > > > - Custom imx6q board
> > > > - Bootloader: U-Boot 2017.11 (also tried with a 2016.03)
> > > > - Kernel : 4.14(.69,.146,.322), v5.10 and v6.5 with the same behavior
> > > > - The MAC (fec driver) is connected to a Micrel 9031 PHY
> > > > - The PHY is connected to the link partner through an industrial cable  
> > >
> > > "industrial cable" ?  
> >
> > It is a "unique" hardware cable, the four Ethernet pairs are foiled
> > twisted pair each and the whole cable is shielded. Additionally there
> > is the 24V power supply coming from this cable. The connector is from
> > ODU S22LOC-P16MCD0-920S. The structure of the cable should be similar
> > to a CAT7 cable with the additional power supply line.
> >  
> > > > - Testing 100BASE-T (link is stable)  
> > >
> > > Would that be full or half duplex?  
> >
> > Ah, yeah, sorry for forgetting this detail, it's full duplex.
> >  
> > > > The RGMII-ID timings are probably not totally optimal but offer
> > > > rather good performance. In UDP with iperf3:
> > > > * Downlink (host to the board) runs at full speed with 0% drop
> > > > * Uplink (board to host) runs at full speed with <1% drop
> > > >
> > > > However, if I ever try to limit the bandwidth in uplink (only), the
> > > > drop rate rises significantly, up to 30%:
> > > >
> > > > //192.168.1.1 is my host, so the below lines are from the board:
> > > > # iperf3 -c 192.168.1.1 -u -b100M
> > > > [  5]   0.00-10.05  sec   113 MBytes  94.6 Mbits/sec  0.044 ms
> > > > 467/82603 (0.57%)  receiver # iperf3 -c 192.168.1.1 -u -b90M
> > > > [  5]   0.00-10.04  sec  90.5 MBytes  75.6 Mbits/sec  0.146 ms
> > > > 12163/77688 (16%)  receiver # iperf3 -c 192.168.1.1 -u -b80M
> > > > [  5]   0.00-10.05  sec  66.4 MBytes  55.5 Mbits/sec  0.162 ms
> > > > 20937/69055 (30%)  receiver  
> > >
> > > My setup:
> > >
> > > i.MX6DL silicon rev 1.3
> > > Atheros AR8035 PHY
> > > 6.3.0+ (no significant changes to fec_main.c)
> > > Link, being BASE-T, is standard RJ45.
> > >
> > > Connectivity is via a bridge device (sorry, can't change that as it
> > > would be too disruptive, as this is my Internet router!)
> > >
> > > Running at 1000BASE-T (FD):
> > > > [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> > > > [  5]   0.00-10.01  sec   114 MBytes  95.4 Mbits/sec  0.030 ms  0/82363 (0%)  receiver
> > > > [  5]   0.00-10.00  sec   107 MBytes  90.0 Mbits/sec  0.103 ms  0/77691 (0%)  receiver
> > > > [  5]   0.00-10.00  sec  95.4 MBytes  80.0 Mbits/sec  0.101 ms  0/69060 (0%)  receiver
> > >
> > > Running at 100BASE-Tx (FD):
> > > > [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> > > > [  5]   0.00-10.01  sec   114 MBytes  95.4 Mbits/sec  0.008 ms  0/82436 (0%)  receiver
> > > > [  5]   0.00-10.00  sec   107 MBytes  90.0 Mbits/sec  0.088 ms  0/77692 (0%)  receiver
> > > > [  5]   0.00-10.00  sec  95.4 MBytes  80.0 Mbits/sec  0.108 ms  0/69058 (0%)  receiver
> > >
> > > > Running at 100BASE-Tx (HD):
> > > > [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> > > > [  5]   0.00-10.01  sec   114 MBytes  95.3 Mbits/sec  0.056 ms  0/82304 (0%)  receiver
> > > > [  5]   0.00-10.00  sec   107 MBytes  90.0 Mbits/sec  0.101 ms  1/77691 (0.0013%)  receiver
> > > > [  5]   0.00-10.00  sec  95.4 MBytes  80.0 Mbits/sec  0.105 ms  0/69058 (0%)  receiver
> > >
> > > So I'm afraid I don't see your issue.  
> >
> > I believe the issue cannot be at a higher level than the MAC. I also
> > do not think the MAC driver and PHY driver are specifically buggy. I
> > ruled out a hardware issue given the fact that under certain
> > conditions (high load) the network works rather well... But I certainly
> > see this issue, and when switching to TCP the results are dramatic:
> >
> > # iperf3 -c 192.168.1.1
> > Connecting to host 192.168.1.1, port 5201
> > [  5] local 192.168.1.2 port 37948 connected to 192.168.1.1 port 5201
> > [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> > [  5]   0.00-1.00   sec  11.3 MBytes  94.5 Mbits/sec   43   32.5 KBytes
> > [  5]   1.00-2.00   sec  3.29 MBytes  27.6 Mbits/sec   26   1.41 KBytes
> > [  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > [  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> > [  5]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec    5   1.41 KBytes
> > [  5]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > [  5]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > [  5]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > [  5]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> > [  5]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> >
> > Thanks,
> > Miquèl  
> 
> Can you experiment with :
> 
> - Disabling TSO on your NIC (ethtool -K eth0 tso off)
> - Reducing max GSO size (ip link set dev eth0 gso_max_size 16384)
> 
> I suspect some kind of issue with fec TX completion vs. TSO emulation.

Wow, this appears to have a significant effect. I am using Busybox's
iproute implementation, which does not know gso_max_size, so I hacked the
value directly into netdevice.h just to see if it would have an effect.
I'm adding iproute2 to the image for further testing.

Here is the diff:

--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2364,7 +2364,7 @@ struct net_device {
 /* TCP minimal MSS is 8 (TCP_MIN_GSO_SIZE),
  * and shinfo->gso_segs is a 16bit field.
  */
-#define GSO_MAX_SIZE           (8 * GSO_MAX_SEGS)
+#define GSO_MAX_SIZE           16384u
 
        unsigned int            gso_max_size;
 #define TSO_LEGACY_MAX_SIZE    65536
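
For anyone reading along, here is a rough sketch of the arithmetic behind
the two values of the macro. GSO_MAX_SEGS = 65535 is an assumption on my
side, inferred from the "16bit field" comment in the hunk; its definition
is not part of the diff. The other constants are taken from the hunk.

```python
# Sketch of the arithmetic behind the GSO_MAX_SIZE hack.
# ASSUMPTION: GSO_MAX_SEGS is 65535 (largest value of a 16-bit field);
# the hunk only shows the comment, not the definition.
GSO_MAX_SEGS = 65535
TCP_MIN_GSO_SIZE = 8            # from the comment in the hunk
TSO_LEGACY_MAX_SIZE = 65536     # context line in the hunk

old_gso_max_size = TCP_MIN_GSO_SIZE * GSO_MAX_SEGS   # (8 * GSO_MAX_SEGS)
new_gso_max_size = 16384                             # hacked value

print("old GSO_MAX_SIZE:", old_gso_max_size)
print("new GSO_MAX_SIZE:", new_gso_max_size)
print("legacy TSO limit / new limit:", TSO_LEGACY_MAX_SIZE // new_gso_max_size)
```

Under that assumption the hack lowers the compile-time ceiling to a
quarter of the legacy 64 KiB TSO size.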

And here are the results:

# ethtool -K eth0 tso off
# iperf3 -c 192.168.1.1 -u -b1M
Connecting to host 192.168.1.1, port 5201
[  5] local 192.168.1.2 port 50490 connected to 192.168.1.1 port 5201
[ ID] Interval           Transfer     Bitrate         Total Datagrams
[  5]   0.00-1.00   sec   123 KBytes  1.01 Mbits/sec  87  
[  5]   1.00-2.00   sec   122 KBytes   996 Kbits/sec  86  
[  5]   2.00-3.00   sec   122 KBytes   996 Kbits/sec  86  
[  5]   3.00-4.00   sec   123 KBytes  1.01 Mbits/sec  87  
[  5]   4.00-5.00   sec   122 KBytes   996 Kbits/sec  86  
[  5]   5.00-6.00   sec   122 KBytes   996 Kbits/sec  86  
[  5]   6.00-7.00   sec   123 KBytes  1.01 Mbits/sec  87  
[  5]   7.00-8.00   sec   122 KBytes   996 Kbits/sec  86  
[  5]   8.00-9.00   sec   122 KBytes   996 Kbits/sec  86  
[  5]   9.00-10.00  sec   123 KBytes  1.01 Mbits/sec  87  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec  1.19 MBytes  1.00 Mbits/sec  0.000 ms  0/864 (0%)  sender
[  5]   0.00-10.05  sec  1.11 MBytes   925 Kbits/sec  0.045 ms  62/864 (7.2%)  receiver
iperf Done.
# iperf3 -c 192.168.1.1
Connecting to host 192.168.1.1, port 5201
[  5] local 192.168.1.2 port 34792 connected to 192.168.1.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  1.63 MBytes  13.7 Mbits/sec   30   1.41 KBytes       
[  5]   1.00-2.00   sec  7.40 MBytes  62.1 Mbits/sec   65   14.1 KBytes       
[  5]   2.00-3.00   sec  7.83 MBytes  65.7 Mbits/sec  109   2.83 KBytes       
[  5]   3.00-4.00   sec  2.49 MBytes  20.9 Mbits/sec   46   19.8 KBytes       
[  5]   4.00-5.00   sec  7.89 MBytes  66.2 Mbits/sec  109   2.83 KBytes       
[  5]   5.00-6.00   sec   255 KBytes  2.09 Mbits/sec   22   2.83 KBytes       
[  5]   6.00-7.00   sec  4.35 MBytes  36.5 Mbits/sec   74   41.0 KBytes       
[  5]   7.00-8.00   sec  10.9 MBytes  91.8 Mbits/sec   34   45.2 KBytes       
[  5]   8.00-9.00   sec  5.35 MBytes  44.9 Mbits/sec   82   1.41 KBytes       
[  5]   9.00-10.00  sec  1.37 MBytes  11.5 Mbits/sec   73   1.41 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  49.5 MBytes  41.5 Mbits/sec  644             sender
[  5]   0.00-10.05  sec  49.3 MBytes  41.1 Mbits/sec                  receiver
iperf Done.
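
To compare runs more easily, the UDP loss figure can be recomputed from the
iperf3 summary lines. This is only a small helper sketch; the function name
udp_loss and the regular expression are mine, not part of iperf3, and the
sample line is the receiver summary from the UDP run above.

```python
import re

# Matches the "lost/total (pct%)" field of an iperf3 UDP summary line.
LOSS_RE = re.compile(r"(\d+)/(\d+)\s+\(([\d.]+)%\)")

def udp_loss(line):
    """Return (lost, total, percent) for a UDP summary line, else None."""
    m = LOSS_RE.search(line)
    if m is None:
        return None  # e.g. a TCP line, which has no lost/total field
    lost, total = int(m.group(1)), int(m.group(2))
    percent = 100.0 * lost / total if total else 0.0
    return lost, total, percent

receiver = ("[  5]   0.00-10.05  sec  1.11 MBytes   925 Kbits/sec  "
            "0.045 ms  62/864 (7.2%)  receiver")
lost, total, pct = udp_loss(receiver)
print(f"{lost}/{total} datagrams lost ({pct:.2f}%)")
```

The percentage is recomputed from the counters instead of trusting the
rounded value that iperf3 prints.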

There is still a noticeable amount of drop/retries, but overall the
results are significantly better. What is the rationale behind the
choice of 16384 in particular? Could this be further improved?
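
To put the two sizes in perspective, here is a quick sketch of how many
MSS-sized segments a single GSO packet can be split into. The MSS of 1448
bytes is an assumption (1500-byte MTU, IPv4, TCP timestamps enabled), and
headers are ignored, so the counts are upper bounds and not measurements
from the board.

```python
import math

def segments_per_gso_packet(gso_max_size, mss):
    """Upper bound on the segments one GSO packet is split into."""
    return math.ceil(gso_max_size / mss)

# ASSUMPTION: 1500 MTU - 20 (IPv4) - 20 (TCP) - 12 (timestamp option)
MSS = 1448

for size in (16384, 65536):
    print(f"gso_max_size={size}: up to "
          f"{segments_per_gso_packet(size, MSS)} segments")
```

With these assumptions, 16384 bounds a burst at 12 segments, against 46
for a 64 KiB packet.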

Thanks a lot,
Miquèl


Thread overview: 26+ messages
2023-10-12 17:34 Ethernet issue on imx6 Miquel Raynal
2023-10-12 19:39 ` Russell King (Oracle)
2023-10-13  8:40   ` Miquel Raynal
2023-10-13 10:16     ` Wei Fang
2023-10-16 11:49     ` Eric Dumazet
2023-10-16 13:58       ` Miquel Raynal [this message]
2023-10-16 15:06         ` Eric Dumazet
2023-10-16 15:36         ` Miquel Raynal
2023-10-16 19:37           ` Eric Dumazet
2023-10-16 21:47             ` Russell King (Oracle)
2023-10-17 11:19             ` Miquel Raynal
2023-10-12 20:46 ` Andrew Lunn
2023-10-12 22:58   ` Stephen Hemminger
2023-10-13  8:27     ` Miquel Raynal
2023-10-13 15:51       ` Andrew Lunn
2023-10-27 20:58         ` Miquel Raynal
2023-11-17 15:09           ` Miquel Raynal
2023-10-16  8:48       ` Alexander Stein
2023-10-16 13:31         ` Miquel Raynal
2023-10-16 14:41           ` Alexander Stein
2023-10-17 10:49             ` Miquel Raynal
2023-10-18  9:08               ` Alexander Stein
2023-10-27 20:58                 ` Miquel Raynal
2023-10-13  8:50 ` James Chapman
2023-10-13 10:37   ` Miquel Raynal
2023-10-13 11:54     ` James Chapman
