netdev.vger.kernel.org archive mirror
* Ethernet issue on imx6
@ 2023-10-12 17:34 Miquel Raynal
  2023-10-12 19:39 ` Russell King (Oracle)
                   ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Miquel Raynal @ 2023-10-12 17:34 UTC (permalink / raw)
  To: Wei Fang, Shenwei Wang, Clark Wang, Russell King
  Cc: davem, edumazet, kuba, pabeni, linux-imx, netdev,
	Thomas Petazzoni, Alexandre Belloni, Maxime Chevallier

Hello,

I've been scratching my head for weeks over a strange imx6
network issue. I need help to go further, as I feel a bit clueless now.

Here is my setup:
- Custom imx6q board
- Bootloader: U-Boot 2017.11 (also tried with a 2016.03)
- Kernel: 4.14(.69,.146,.322), v5.10 and v6.5, all with the same behavior
- The MAC (fec driver) is connected to a Micrel 9031 PHY
- The PHY is connected to the link partner through an industrial cable
- Testing 100BASE-T (link is stable)

The RGMII-ID timings are probably not totally optimal but offer rather
good performance. In UDP with iperf3:
* Downlink (host to the board) runs at full speed with 0% drop
* Uplink (board to host) runs at full speed with <1% drop

However, if I ever try to limit the bandwidth in uplink (only), the drop
rate rises significantly, up to 30%:

//192.168.1.1 is my host, so the below lines are from the board:
# iperf3 -c 192.168.1.1 -u -b100M
[  5]   0.00-10.05  sec   113 MBytes  94.6 Mbits/sec  0.044 ms  467/82603 (0.57%)  receiver
# iperf3 -c 192.168.1.1 -u -b90M
[  5]   0.00-10.04  sec  90.5 MBytes  75.6 Mbits/sec  0.146 ms  12163/77688 (16%)  receiver
# iperf3 -c 192.168.1.1 -u -b80M
[  5]   0.00-10.05  sec  66.4 MBytes  55.5 Mbits/sec  0.162 ms  20937/69055 (30%)  receiver
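
The three receiver-side summary lines above can be cross-checked with a few lines of Python. This is only a minimal sketch and not part of the original report: the regular expression and the names `parse_summary` and `offered_mbps` are illustrative assumptions. It recomputes the loss ratio from the Lost/Total column and estimates the rate the sender believes it sent as received rate / (1 - loss).

```python
import re

# Receiver-side summary lines copied from the report above.
SUMMARY = """\
[  5]   0.00-10.05  sec   113 MBytes  94.6 Mbits/sec  0.044 ms  467/82603 (0.57%)  receiver
[  5]   0.00-10.04  sec  90.5 MBytes  75.6 Mbits/sec  0.146 ms  12163/77688 (16%)  receiver
[  5]   0.00-10.05  sec  66.4 MBytes  55.5 Mbits/sec  0.162 ms  20937/69055 (30%)  receiver
"""

# Captures: received bitrate (Mbit/s), lost datagrams, total datagrams.
LINE_RE = re.compile(r"([\d.]+) Mbits/sec\s+[\d.]+ ms\s+(\d+)/(\d+)")


def parse_summary(text):
    """Return (received_mbps, lost, total, loss_ratio) for each summary line."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m is None:
            continue
        received_mbps = float(m.group(1))
        lost, total = int(m.group(2)), int(m.group(3))
        rows.append((received_mbps, lost, total, lost / total))
    return rows


def offered_mbps(received_mbps, loss):
    """Estimate the sender-side rate from the received rate and loss ratio."""
    return received_mbps / (1.0 - loss)


ROWS = parse_summary(SUMMARY)
for received, lost, total, loss in ROWS:
    print(f"received {received:5.1f} Mbit/s  loss {loss:6.2%}  "
          f"offered ~{offered_mbps(received, loss):5.1f} Mbit/s")
```

Under this estimate the offered rates for the last two runs come out close to the requested 90 and 80 Mbit/s, which suggests that iperf3 is pacing as asked and that the datagrams go missing after the sender has counted them.
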

One direct consequence, I believe, is that TCP transfers quickly stall
or run at an insanely low speed (~40 KiB/s).

I've tried to disable all the hardware offloading reported by ethtool
with no additional success.

Last but not least, I observe another very strange behavior: when I
perform an uplink transfer at a "reduced" speed (80Mbps or below), as
said above, I observe a ~30% drop rate. But if I run a full speed UDP
transfer in downlink at the same time, the drop rate lowers to ~3-4%.
See below, this is an iperf server on my host receiving UDP traffic from
my board. After 5 seconds I start a full speed UDP transfer from the
host to the board:

[  5] local 192.168.1.1 port 5201 connected to 192.168.1.2 port 57216
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec  6.29 MBytes  52.7 Mbits/sec  0.152 ms  2065/6617 (31%)  
[  5]   1.00-2.00   sec  6.50 MBytes  54.6 Mbits/sec  0.118 ms  2199/6908 (32%)  
[  5]   2.00-3.00   sec  6.64 MBytes  55.7 Mbits/sec  0.123 ms  2099/6904 (30%)  
[  5]   3.00-4.00   sec  6.58 MBytes  55.2 Mbits/sec  0.091 ms  2141/6905 (31%)  
[  5]   4.00-5.00   sec  6.59 MBytes  55.3 Mbits/sec  0.092 ms  2134/6907 (31%)  
[  5]   5.00-6.00   sec  8.36 MBytes  70.1 Mbits/sec  0.088 ms  853/6904 (12%)  
[  5]   6.00-7.00   sec  9.14 MBytes  76.7 Mbits/sec  0.085 ms  281/6901 (4.1%)  
[  5]   7.00-8.00   sec  9.19 MBytes  77.1 Mbits/sec  0.147 ms  255/6911 (3.7%)  
[  5]   8.00-9.00   sec  9.22 MBytes  77.3 Mbits/sec  0.160 ms  233/6907 (3.4%)  
[  5]   9.00-10.00  sec  9.25 MBytes  77.6 Mbits/sec  0.129 ms  211/6906 (3.1%)  
[  5]  10.00-10.04  sec   392 KBytes  76.9 Mbits/sec  0.113 ms  11/288 (3.8%) 

If the downlink transfer is not at full speed, I don't observe any
difference.
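
To put a number on the change seen in the per-second report above, the sketch below aggregates the Lost/Total column before and after the downlink flood starts. It is only an illustration: the regular expression and the helper names `parse_intervals` and `loss_between` are assumptions, and the 5 to 6 s interval is left out of both windows because it straddles the transition.

```python
import re

# Per-second receiver lines copied from the report above.
INTERVALS = """\
[  5]   0.00-1.00   sec  6.29 MBytes  52.7 Mbits/sec  0.152 ms  2065/6617 (31%)
[  5]   1.00-2.00   sec  6.50 MBytes  54.6 Mbits/sec  0.118 ms  2199/6908 (32%)
[  5]   2.00-3.00   sec  6.64 MBytes  55.7 Mbits/sec  0.123 ms  2099/6904 (30%)
[  5]   3.00-4.00   sec  6.58 MBytes  55.2 Mbits/sec  0.091 ms  2141/6905 (31%)
[  5]   4.00-5.00   sec  6.59 MBytes  55.3 Mbits/sec  0.092 ms  2134/6907 (31%)
[  5]   5.00-6.00   sec  8.36 MBytes  70.1 Mbits/sec  0.088 ms  853/6904 (12%)
[  5]   6.00-7.00   sec  9.14 MBytes  76.7 Mbits/sec  0.085 ms  281/6901 (4.1%)
[  5]   7.00-8.00   sec  9.19 MBytes  77.1 Mbits/sec  0.147 ms  255/6911 (3.7%)
[  5]   8.00-9.00   sec  9.22 MBytes  77.3 Mbits/sec  0.160 ms  233/6907 (3.4%)
[  5]   9.00-10.00  sec  9.25 MBytes  77.6 Mbits/sec  0.129 ms  211/6906 (3.1%)
[  5]  10.00-10.04  sec   392 KBytes  76.9 Mbits/sec  0.113 ms  11/288 (3.8%)
"""

# Captures: interval start, interval end, lost datagrams, total datagrams.
LINE_RE = re.compile(r"([\d.]+)-([\d.]+)\s+sec.*?(\d+)/(\d+)\s+\(")


def parse_intervals(text):
    """Return (start_s, end_s, lost, total) for each interval line."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m:
            rows.append((float(m.group(1)), float(m.group(2)),
                         int(m.group(3)), int(m.group(4))))
    return rows


def loss_between(rows, t0, t1):
    """Aggregate loss ratio over intervals fully contained in [t0, t1]."""
    lost = sum(r[2] for r in rows if r[0] >= t0 and r[1] <= t1)
    total = sum(r[3] for r in rows if r[0] >= t0 and r[1] <= t1)
    return lost / total


ROWS = parse_intervals(INTERVALS)
print(f"uplink only (0-5 s):         {loss_between(ROWS, 0.0, 5.0):.1%}")
print(f"with downlink flood (6-10 s): {loss_between(ROWS, 6.0, 10.0):.1%}")
```
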

I've commented out the runtime_pm callbacks in the fec driver, but
nothing changed.

Any hint or idea will be highly appreciated!

Thanks a lot,
Miquèl

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Ethernet issue on imx6
  2023-10-12 17:34 Ethernet issue on imx6 Miquel Raynal
@ 2023-10-12 19:39 ` Russell King (Oracle)
  2023-10-13  8:40   ` Miquel Raynal
  2023-10-12 20:46 ` Andrew Lunn
  2023-10-13  8:50 ` James Chapman
  2 siblings, 1 reply; 26+ messages in thread
From: Russell King (Oracle) @ 2023-10-12 19:39 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Wei Fang, Shenwei Wang, Clark Wang, davem, edumazet, kuba, pabeni,
	linux-imx, netdev, Thomas Petazzoni, Alexandre Belloni,
	Maxime Chevallier

On Thu, Oct 12, 2023 at 07:34:10PM +0200, Miquel Raynal wrote:
> Hello,
> 
> I've been scratching my head for weeks over a strange imx6
> network issue. I need help to go further, as I feel a bit clueless now.
> 
> Here is my setup :
> - Custom imx6q board
> - Bootloader: U-Boot 2017.11 (also tried with a 2016.03)
> - Kernel : 4.14(.69,.146,.322), v5.10 and v6.5 with the same behavior
> - The MAC (fec driver) is connected to a Micrel 9031 PHY
> - The PHY is connected to the link partner through an industrial cable

"industrial cable" ?

> - Testing 100BASE-T (link is stable)

Would that be full or half duplex?

> The RGMII-ID timings are probably not totally optimal but offer rather
> good performance. In UDP with iperf3:
> * Downlink (host to the board) runs at full speed with 0% drop
> * Uplink (board to host) runs at full speed with <1% drop
> 
> However, if I ever try to limit the bandwidth in uplink (only), the drop
> rate rises significantly, up to 30%:
> 
> //192.168.1.1 is my host, so the below lines are from the board:
> # iperf3 -c 192.168.1.1 -u -b100M
> [  5]   0.00-10.05  sec   113 MBytes  94.6 Mbits/sec  0.044 ms  467/82603 (0.57%)  receiver
> # iperf3 -c 192.168.1.1 -u -b90M
> [  5]   0.00-10.04  sec  90.5 MBytes  75.6 Mbits/sec  0.146 ms  12163/77688 (16%)  receiver
> # iperf3 -c 192.168.1.1 -u -b80M
> [  5]   0.00-10.05  sec  66.4 MBytes  55.5 Mbits/sec  0.162 ms  20937/69055 (30%)  receiver

My setup:

i.MX6DL silicon rev 1.3
Atheros AR8035 PHY
6.3.0+ (no significant changes to fec_main.c)
Link, being BASE-T, is standard RJ45.

Connectivity is via a bridge device (sorry, can't change that as it would
be too disruptive, as this is my Internet router!)

Running at 1000BASE-T (FD):
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.01  sec   114 MBytes  95.4 Mbits/sec  0.030 ms  0/82363 (0%)  receiver
[  5]   0.00-10.00  sec   107 MBytes  90.0 Mbits/sec  0.103 ms  0/77691 (0%)  receiver
[  5]   0.00-10.00  sec  95.4 MBytes  80.0 Mbits/sec  0.101 ms  0/69060 (0%)  receiver

Running at 100BASE-TX (FD):
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.01  sec   114 MBytes  95.4 Mbits/sec  0.008 ms  0/82436 (0%)  receiver
[  5]   0.00-10.00  sec   107 MBytes  90.0 Mbits/sec  0.088 ms  0/77692 (0%)  receiver
[  5]   0.00-10.00  sec  95.4 MBytes  80.0 Mbits/sec  0.108 ms  0/69058 (0%)  receiver

Running at 100BASE-TX (HD):
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.01  sec   114 MBytes  95.3 Mbits/sec  0.056 ms  0/82304 (0%)  receiver
[  5]   0.00-10.00  sec   107 MBytes  90.0 Mbits/sec  0.101 ms  1/77691 (0.0013%)  receiver
[  5]   0.00-10.00  sec  95.4 MBytes  80.0 Mbits/sec  0.105 ms  0/69058 (0%)  receiver

So I'm afraid I don't see your issue.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!


* Re: Ethernet issue on imx6
  2023-10-12 17:34 Ethernet issue on imx6 Miquel Raynal
  2023-10-12 19:39 ` Russell King (Oracle)
@ 2023-10-12 20:46 ` Andrew Lunn
  2023-10-12 22:58   ` Stephen Hemminger
  2023-10-13  8:50 ` James Chapman
  2 siblings, 1 reply; 26+ messages in thread
From: Andrew Lunn @ 2023-10-12 20:46 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Wei Fang, Shenwei Wang, Clark Wang, Russell King, davem, edumazet,
	kuba, pabeni, linux-imx, netdev, Thomas Petazzoni,
	Alexandre Belloni, Maxime Chevallier

> //192.168.1.1 is my host, so the below lines are from the board:
> # iperf3 -c 192.168.1.1 -u -b100M
> [  5]   0.00-10.05  sec   113 MBytes  94.6 Mbits/sec  0.044 ms  467/82603 (0.57%)  receiver
> # iperf3 -c 192.168.1.1 -u -b90M
> [  5]   0.00-10.04  sec  90.5 MBytes  75.6 Mbits/sec  0.146 ms  12163/77688 (16%)  receiver
> # iperf3 -c 192.168.1.1 -u -b80M
> [  5]   0.00-10.05  sec  66.4 MBytes  55.5 Mbits/sec  0.162 ms  20937/69055 (30%)  receiver

Have you tried playing with --pacing-timer?

Maybe iperf is producing big bursts of packets and then silence for
a while. The burst is overflowing a buffer somewhere? Smooth the flow
and it might work better?

  Andrew


* Re: Ethernet issue on imx6
  2023-10-12 20:46 ` Andrew Lunn
@ 2023-10-12 22:58   ` Stephen Hemminger
  2023-10-13  8:27     ` Miquel Raynal
  0 siblings, 1 reply; 26+ messages in thread
From: Stephen Hemminger @ 2023-10-12 22:58 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Miquel Raynal, Wei Fang, Shenwei Wang, Clark Wang, Russell King,
	davem, edumazet, kuba, pabeni, linux-imx, netdev,
	Thomas Petazzoni, Alexandre Belloni, Maxime Chevallier

On Thu, 12 Oct 2023 22:46:09 +0200
Andrew Lunn <andrew@lunn.ch> wrote:

> > //192.168.1.1 is my host, so the below lines are from the board:
> > # iperf3 -c 192.168.1.1 -u -b100M
> > [  5]   0.00-10.05  sec   113 MBytes  94.6 Mbits/sec  0.044 ms  467/82603 (0.57%)  receiver
> > # iperf3 -c 192.168.1.1 -u -b90M
> > [  5]   0.00-10.04  sec  90.5 MBytes  75.6 Mbits/sec  0.146 ms  12163/77688 (16%)  receiver
> > # iperf3 -c 192.168.1.1 -u -b80M
> > [  5]   0.00-10.05  sec  66.4 MBytes  55.5 Mbits/sec  0.162 ms  20937/69055 (30%)  receiver  
> 
> Have you tried playing with --pacing-timer?
> 
> Maybe iperf is producing big bursts of packets and then silence for
> a while. The burst is overflowing a buffer somewhere? Smooth the flow
> and it might work better?
> 
>   Andrew
> 

Please post the basic system info.
Like kernel dmesg log.
All network statistics including ethtool.
Any special qdisc or firewall configuration.

Likely a hardware or driver bug that is doing something wrong
when a lot of packets are received.


* Re: Ethernet issue on imx6
  2023-10-12 22:58   ` Stephen Hemminger
@ 2023-10-13  8:27     ` Miquel Raynal
  2023-10-13 15:51       ` Andrew Lunn
  2023-10-16  8:48       ` Alexander Stein
  0 siblings, 2 replies; 26+ messages in thread
From: Miquel Raynal @ 2023-10-13  8:27 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Andrew Lunn, Wei Fang, Shenwei Wang, Clark Wang, Russell King,
	davem, edumazet, kuba, pabeni, linux-imx, netdev,
	Thomas Petazzoni, Alexandre Belloni, Maxime Chevallier

Hi Stephen & Andrew,

stephen@networkplumber.org wrote on Thu, 12 Oct 2023 15:58:57 -0700:

> On Thu, 12 Oct 2023 22:46:09 +0200
> Andrew Lunn <andrew@lunn.ch> wrote:
> 
> > > //192.168.1.1 is my host, so the below lines are from the board:
> > > # iperf3 -c 192.168.1.1 -u -b100M
> > > [  5]   0.00-10.05  sec   113 MBytes  94.6 Mbits/sec  0.044 ms  467/82603 (0.57%)  receiver
> > > # iperf3 -c 192.168.1.1 -u -b90M
> > > [  5]   0.00-10.04  sec  90.5 MBytes  75.6 Mbits/sec  0.146 ms  12163/77688 (16%)  receiver
> > > # iperf3 -c 192.168.1.1 -u -b80M
> > > [  5]   0.00-10.05  sec  66.4 MBytes  55.5 Mbits/sec  0.162 ms  20937/69055 (30%)  receiver    
> > 
> > Have you tried playing with --pacing-timer?
> > 
> > Maybe iperf is producing big bursts of packets and then silence for
> > a while. The burst is overflowing a buffer somewhere? Smooth the flow
> > and it might work better?

I've just tried and the results are kind of the opposite of what I
would expect. Here are the values so maybe you'll have a different
understanding:

From --pacing-timer 1 to 100000 (should be microseconds IIUC), results
are the same. And then, increasing the period decreases the drop rate:

# iperf3 -c 192.168.1.1 -u -b1M --pacing-timer 1000
[  5]   0.00-10.04  sec   604 KBytes   493 Kbits/sec  0.062 ms  437/864 (51%)  receiver
# iperf3 -c 192.168.1.1 -u -b1M --pacing-timer 10000
[  5]   0.00-10.05  sec   581 KBytes   474 Kbits/sec  0.102 ms  452/863 (52%)  receiver
# iperf3 -c 192.168.1.1 -u -b1M --pacing-timer 100000
[  5]   0.00-10.05  sec   867 KBytes   707 Kbits/sec  0.094 ms  240/853 (28%)  receiver
# iperf3 -c 192.168.1.1 -u -b1M --pacing-timer 1000000
[  5]   0.00-10.05  sec  1.04 MBytes   866 Kbits/sec  0.080 ms  27/778 (3.5%)  receiver
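
One way to read these four runs is to estimate how many datagrams each pacing tick has to carry to hold 1 Mbit/s. The sketch below is a simplified model, not a description of iperf3 internals: it assumes that the sender emits, at each timer tick, however many datagrams are needed to catch up with the target rate, that the test runs for a nominal 10 seconds, and that the payload size can be inferred from the datagram count of the first run. The name `datagrams_per_tick` is hypothetical.

```python
# Observed runs from the report: (pacing timer in microseconds, lost, total).
RUNS = [
    (1_000, 437, 864),
    (10_000, 452, 863),
    (100_000, 240, 853),
    (1_000_000, 27, 778),
]

RATE_BPS = 1_000_000   # -b1M
DURATION_S = 10.0      # assumption: nominal test length in seconds

# Assumption: payload size inferred from the datagram count of the first run.
PAYLOAD_BYTES = RATE_BPS * DURATION_S / 8 / RUNS[0][2]


def datagrams_per_tick(rate_bps, pacing_timer_us, payload_bytes):
    """Average number of datagrams per pacing tick needed to hold rate_bps."""
    return rate_bps * (pacing_timer_us / 1e6) / (payload_bytes * 8)


for timer_us, lost, total in RUNS:
    per_tick = datagrams_per_tick(RATE_BPS, timer_us, PAYLOAD_BYTES)
    print(f"timer {timer_us:>9} us  ~{per_tick:6.2f} datagrams/tick  "
          f"loss {lost / total:6.1%}")
```

Under this model the runs that need less than one datagram per tick, meaning isolated frames separated by idle gaps, show the highest loss, while the run that sends bursts of roughly 86 datagrams shows the lowest. That would match the observation that denser traffic fares better, but it remains an interpretation of the numbers.
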

> Please post the basic system info.
> Like kernel dmesg log.

Please find the logs below.

> All network statistics including ethtool.

PHY statistics remain empty:

# ethtool --phy-statistics eth0
PHY statistics:
     phy_receive_errors: 0
     phy_idle_errors: 0

Interrupts work as expected on the MAC side (I added traces in the IRQ
handler to see how it was behaving):

# cat /proc/interrupts | grep ethernet
 82:     344546          0          0          0  gpio-mxc   6 Level     2188000.ethernet
104:          1          0          0          0  gpio-mxc  28 Level     2188000.ethernet
337:          0          0          0          0     GIC-0 151 Level     2188000.ethernet

# ethtool -S eth0
NIC statistics:
     tx_dropped: 0
     tx_packets: 10118
     tx_broadcast: 0
     tx_multicast: 13
     tx_crc_errors: 0
     tx_undersize: 0
     tx_oversize: 0
     tx_fragment: 0
     tx_jabber: 0
     tx_collision: 0
     tx_64byte: 130
     tx_65to127byte: 61031
     tx_128to255byte: 19
     tx_256to511byte: 10
     tx_512to1023byte: 5
     tx_1024to2047byte: 14459
     tx_GTE2048byte: 0
     tx_octets: 26219280
     IEEE_tx_drop: 0
     IEEE_tx_frame_ok: 10118
     IEEE_tx_1col: 0
     IEEE_tx_mcol: 0
     IEEE_tx_def: 0
     IEEE_tx_lcol: 0
     IEEE_tx_excol: 0
     IEEE_tx_macerr: 0
     IEEE_tx_cserr: 0
     IEEE_tx_sqe: 0
     IEEE_tx_fdxfc: 0
     IEEE_tx_octets_ok: 26219280
     rx_packets: 35369
     rx_broadcast: 1
     rx_multicast: 5
     rx_crc_errors: 0
     rx_undersize: 0
     rx_oversize: 0
     rx_fragment: 0
     rx_jabber: 0
     rx_64byte: 10
     rx_65to127byte: 9083
     rx_128to255byte: 8
     rx_256to511byte: 8
     rx_512to1023byte: 0
     rx_1024to2047byte: 26260
     rx_GTE2048byte: 0
     rx_octets: 436459630
     IEEE_rx_drop: 0
     IEEE_rx_frame_ok: 35369
     IEEE_rx_crc: 0
     IEEE_rx_align: 0
     IEEE_rx_macerr: 0
     IEEE_rx_fdxfc: 0
     IEEE_rx_octets_ok: 436459630
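
The counters above can be sanity-checked by comparing each packet count with the sum of its frame-size buckets. The sketch below does that on a subset of the output; the helper names `parse_counters` and `bucket_sum` are illustrative assumptions, and the check itself is an addition to the report rather than something the thread performs.

```python
# Subset of the `ethtool -S eth0` output above: packet counts and size buckets.
ETHTOOL_S = """\
     tx_packets: 10118
     tx_64byte: 130
     tx_65to127byte: 61031
     tx_128to255byte: 19
     tx_256to511byte: 10
     tx_512to1023byte: 5
     tx_1024to2047byte: 14459
     tx_GTE2048byte: 0
     rx_packets: 35369
     rx_64byte: 10
     rx_65to127byte: 9083
     rx_128to255byte: 8
     rx_256to511byte: 8
     rx_512to1023byte: 0
     rx_1024to2047byte: 26260
     rx_GTE2048byte: 0
"""


def parse_counters(text):
    """Turn 'name: value' lines into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            counters[name.strip()] = int(value)
    return counters


def bucket_sum(counters, direction):
    """Sum the frame-size buckets ('..._NNbyte') for 'tx' or 'rx'."""
    return sum(v for k, v in counters.items()
               if k.startswith(direction + "_") and k.endswith("byte"))


COUNTERS = parse_counters(ETHTOOL_S)
for direction in ("tx", "rx"):
    packets = COUNTERS[direction + "_packets"]
    buckets = bucket_sum(COUNTERS, direction)
    print(f"{direction}: packets={packets} buckets={buckets} "
          f"difference={buckets - packets}")
```

On the receive side the two figures agree. On the transmit side the buckets exceed tx_packets by exactly 65536 (2^16), which would be consistent with that counter having wrapped once. This is an inference from the arithmetic only and is not confirmed anywhere in the thread.
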

> Any special qdisc or firewall configuration.

None.

> Likely a hardware or driver bug that is doing something wrong
> when a lot of packets are received.

Well, isn't it kind of the opposite? If we flood the interface it works
better than when we pace the traffic (that's what I see whenever I
reduce the throughput or when I enlarge the iperf timer).

I'm also doubtful about the fact that receiving full speed traffic makes
the uplink stable.

Thanks,
Miquèl

---

switch to partitions #0, OK
mmc1 is current device
reading boot.scr
444 bytes read in 10 ms (43 KiB/s)
## Executing script at 20000000
Booting from mmc ...
reading zImage
9160016 bytes read in 462 ms (18.9 MiB/s)
reading <board>.dtb
40052 bytes read in 22 ms (1.7 MiB/s)
boot device tree kernel ...
Kernel image @ 0x12000000 [ 0x000000 - 0x8bc550 ]
## Flattened Device Tree blob at 18000000
   Booting using the fdt blob at 0x18000000
   Using Device Tree in place at 18000000, end 1800cc73

Starting kernel ...

[    0.000000] Booting Linux on physical CPU 0x0
[    0.000000] Linux version 6.5.0 (mraynal@xps-13) (arm-linux-gcc.br_real (Buildroot 2020.08-14-ge5a2a90) 10.2.0, GNU ld (GNU Binutils) 2.34) #120 SMP Thu Oct 12 18:10:20 CEST 2023
[    0.000000] CPU: ARMv7 Processor [412fc09a] revision 10 (ARMv7), cr=10c5387d
[    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
[    0.000000] OF: fdt: Machine model: TQ TQMa6Q on MBa6x
[    0.000000] Memory policy: Data cache writealloc
[    0.000000] cma: Reserved 160 MiB at 0x46000000
[    0.000000] Zone ranges:
[    0.000000]   Normal   [mem 0x0000000010000000-0x000000003fffffff]
[    0.000000]   HighMem  [mem 0x0000000040000000-0x000000004fffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000010000000-0x000000004fffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000010000000-0x000000004fffffff]
[    0.000000] percpu: Embedded 13 pages/cpu s23124 r8192 d21932 u53248
[    0.000000] Kernel command line: root=/dev/mmcblk1p2 ro rootwait console=ttymxc1,115200 cma=160M
[    0.000000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes, linear)
[    0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes, linear)
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 260608
[    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
[    0.000000] Memory: 854196K/1048576K available (13312K kernel code, 1308K rwdata, 3944K rodata, 1024K init, 401K bss, 30540K reserved, 163840K cma-reserved, 98304K highmem)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
[    0.000000] rcu: Hierarchical RCU implementation.
[    0.000000] rcu:     RCU event tracing is enabled.
[    0.000000]  Tracing variant of Tasks RCU enabled.
[    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10 jiffies.
[    0.000000] NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
[    0.000000] L2C-310 errata 752271 769419 enabled
[    0.000000] L2C-310 enabling early BRESP for Cortex-A9
[    0.000000] L2C-310 full line of zeros enabled for Cortex-A9
[    0.000000] L2C-310 ID prefetch enabled, offset 16 lines
[    0.000000] L2C-310 dynamic clock gating enabled, standby mode enabled
[    0.000000] L2C-310 cache controller enabled, 16 ways, 1024 kB
[    0.000000] L2C-310: CACHE_ID 0x410000c7, AUX_CTRL 0x76470001
[    0.000000] rcu: srcu_init: Setting srcu_struct sizes based on contention.
[    0.000000] Switching to timer-based delay loop, resolution 333ns
[    0.000001] sched_clock: 32 bits at 3000kHz, resolution 333ns, wraps every 715827882841ns
[    0.000018] clocksource: mxc_timer1: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 637086815595 ns
[    0.001530] Console: colour dummy device 80x30
[    0.001571] Calibrating delay loop (skipped), value calculated using timer frequency.. 6.00 BogoMIPS (lpj=30000)
[    0.001587] CPU: Testing write buffer coherency: ok
[    0.001625] CPU0: Spectre v2: using BPIALL workaround
[    0.001633] pid_max: default: 32768 minimum: 301
[    0.001764] Mount-cache hash table entries: 2048 (order: 1, 8192 bytes, linear)
[    0.001787] Mountpoint-cache hash table entries: 2048 (order: 1, 8192 bytes, linear)
[    0.002589] CPU0: thread -1, cpu 0, socket 0, mpidr 80000000
[    0.003612] RCU Tasks Trace: Setting shift to 2 and lim to 1 rcu_task_cb_adjust=1.
[    0.003775] Setting up static identity map for 0x10100000 - 0x10100078
[    0.003964] rcu: Hierarchical SRCU implementation.
[    0.003970] rcu:     Max phase no-delay instances is 1000.
[    0.005041] smp: Bringing up secondary CPUs ...
[    0.005965] CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
[    0.005983] CPU1: Spectre v2: using BPIALL workaround
[    0.007009] CPU2: thread -1, cpu 2, socket 0, mpidr 80000002
[    0.007025] CPU2: Spectre v2: using BPIALL workaround
[    0.008023] CPU3: thread -1, cpu 3, socket 0, mpidr 80000003
[    0.008040] CPU3: Spectre v2: using BPIALL workaround
[    0.008149] smp: Brought up 1 node, 4 CPUs
[    0.008161] SMP: Total of 4 processors activated (24.00 BogoMIPS).
[    0.008171] CPU: All CPU(s) started in SVC mode.
[    0.008688] devtmpfs: initialized
[    0.017370] VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 4
[    0.017636] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
[    0.017660] futex hash table entries: 1024 (order: 4, 65536 bytes, linear)
[    0.025946] pinctrl core: initialized pinctrl subsystem
[    0.027948] NET: Registered PF_NETLINK/PF_ROUTE protocol family
[    0.035191] DMA: preallocated 256 KiB pool for atomic coherent allocations
[    0.036387] thermal_sys: Registered thermal governor 'step_wise'
[    0.036449] cpuidle: using governor menu
[    0.036568] CPU identified as i.MX6Q, silicon rev 1.5
[    0.042593] platform soc: Fixed dependency cycle(s) with /soc/aips-bus@02000000/gpc@020dc000
[    0.057409] platform 2400000.ipu: Fixed dependency cycle(s) with /soc/aips-bus@02000000/ldb/lvds-channel@0/port@1/endpoint
[    0.057451] platform 2400000.ipu: Fixed dependency cycle(s) with /soc/hdmi@0120000/port@1/endpoint
[    0.057476] platform 2400000.ipu: Fixed dependency cycle(s) with /soc/aips-bus@02000000/ldb/lvds-channel@0/port@0/endpoint
[    0.057507] platform 2400000.ipu: Fixed dependency cycle(s) with /soc/hdmi@0120000/port@0/endpoint
[    0.057530] platform 2400000.ipu: Fixed dependency cycle(s) with /soc/aips-bus@02000000/iomuxc-gpr@020e0000/ipu1_csi0_mux/port@2/endpoint
[    0.058449] platform 2800000.ipu: Fixed dependency cycle(s) with /soc/aips-bus@02000000/ldb/lvds-channel@0/port@3/endpoint
[    0.058510] platform 2800000.ipu: Fixed dependency cycle(s) with /soc/hdmi@0120000/port@3/endpoint
[    0.058558] platform 2800000.ipu: Fixed dependency cycle(s) with /soc/aips-bus@02000000/ldb/lvds-channel@0/port@2/endpoint
[    0.058611] platform 2800000.ipu: Fixed dependency cycle(s) with /soc/hdmi@0120000/port@2/endpoint
[    0.058633] platform 2800000.ipu: Fixed dependency cycle(s) with /soc/aips-bus@02000000/iomuxc-gpr@020e0000/ipu2_csi1_mux/port@2/endpoint
[    0.061550] No ATAGs?
[    0.061690] hw-breakpoint: found 5 (+1 reserved) breakpoint and 1 watchpoint registers.
[    0.061702] hw-breakpoint: maximum watchpoint size is 4 bytes.
[    0.063055] imx6q-pinctrl 20e0000.iomuxc: initialized IMX pinctrl driver
[    0.067250] kprobes: kprobe jump-optimization is enabled. All kprobes are optimized if possible.
[    0.068787] gpio gpiochip0: Static allocation of GPIO base is deprecated, use dynamic allocation.
[    0.070727] gpio gpiochip1: Static allocation of GPIO base is deprecated, use dynamic allocation.
[    0.072435] gpio gpiochip2: Static allocation of GPIO base is deprecated, use dynamic allocation.
[    0.074182] gpio gpiochip3: Static allocation of GPIO base is deprecated, use dynamic allocation.
[    0.075910] gpio gpiochip4: Static allocation of GPIO base is deprecated, use dynamic allocation.
[    0.077677] gpio gpiochip5: Static allocation of GPIO base is deprecated, use dynamic allocation.
[    0.079441] gpio gpiochip6: Static allocation of GPIO base is deprecated, use dynamic allocation.
[    0.083631] SCSI subsystem initialized
[    0.084119] usbcore: registered new interface driver usbfs
[    0.084160] usbcore: registered new interface driver hub
[    0.084205] usbcore: registered new device driver usb
[    0.086993] pca953x 0-0020: supply vcc not found, using dummy regulator
[    0.087157] pca953x 0-0020: using no AI
[    0.100030] pca953x 0-0020: interrupt support not compiled in
[    0.120764] pca953x 0-0021: supply vcc not found, using dummy regulator
[    0.120904] pca953x 0-0021: using no AI
[    0.160606] pca953x 0-0022: supply vcc not found, using dummy regulator
[    0.160732] pca953x 0-0022: using no AI
[    0.200704] i2c i2c-0: IMX I2C adapter registered
[    0.201717] i2c i2c-1: IMX I2C adapter registered
[    0.201937] mc: Linux media interface: v0.10
[    0.202003] videodev: Linux video capture interface: v2.00
[    0.202117] pps_core: LinuxPPS API ver. 1 registered
[    0.202124] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[    0.202151] PTP clock support registered
[    0.202996] Advanced Linux Sound Architecture Driver Initialized.
[    0.203875] Bluetooth: Core ver 2.22
[    0.203918] NET: Registered PF_BLUETOOTH protocol family
[    0.203924] Bluetooth: HCI device and connection manager initialized
[    0.203937] Bluetooth: HCI socket layer initialized
[    0.203946] Bluetooth: L2CAP socket layer initialized
[    0.203965] Bluetooth: SCO socket layer initialized
[    0.204470] vgaarb: loaded
[    0.204864] clocksource: Switched to clocksource mxc_timer1
[    0.205119] VFS: Disk quotas dquot_6.6.0
[    0.205178] VFS: Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
[    0.215795] NET: Registered PF_INET protocol family
[    0.216223] IP idents hash table entries: 16384 (order: 5, 131072 bytes, linear)
[    0.218261] tcp_listen_portaddr_hash hash table entries: 512 (order: 0, 4096 bytes, linear)
[    0.218290] Table-perturb hash table entries: 65536 (order: 6, 262144 bytes, linear)
[    0.218304] TCP established hash table entries: 8192 (order: 3, 32768 bytes, linear)
[    0.218393] TCP bind hash table entries: 8192 (order: 5, 131072 bytes, linear)
[    0.218680] TCP: Hash tables configured (established 8192 bind 8192)
[    0.218891] UDP hash table entries: 512 (order: 2, 16384 bytes, linear)
[    0.218944] UDP-Lite hash table entries: 512 (order: 2, 16384 bytes, linear)
[    0.219141] NET: Registered PF_UNIX/PF_LOCAL protocol family
[    0.219687] RPC: Registered named UNIX socket transport module.
[    0.219697] RPC: Registered udp transport module.
[    0.219703] RPC: Registered tcp transport module.
[    0.219708] RPC: Registered tcp-with-tls transport module.
[    0.219713] RPC: Registered tcp NFSv4.1 backchannel transport module.
[    0.221148] PCI: CLS 0 bytes, default 64
[    0.221977] armv7-pmu soc:pmu: hw perfevents: no interrupt-affinity property, guessing.
[    0.222197] hw perfevents: enabled with armv7_cortex_a9 PMU driver, 7 counters available
[    0.223895] Initialise system trusted keyrings
[    0.224175] workingset: timestamp_bits=30 max_order=18 bucket_order=0
[    0.224818] NFS: Registering the id_resolver key type
[    0.224899] Key type id_resolver registered
[    0.224907] Key type id_legacy registered
[    0.224937] nfs4filelayout_init: NFSv4 File Layout Driver Registering...
[    0.224946] nfs4flexfilelayout_init: NFSv4 Flexfile Layout Driver Registering...
[    0.224982] jffs2: version 2.2. (NAND) © 2001-2006 Red Hat, Inc.
[    0.225263] fuse: init (API version 7.38)
[    0.366002] Key type asymmetric registered
[    0.366012] Asymmetric key parser 'x509' registered
[    0.366111] bounce: pool size: 64 pages
[    0.366141] io scheduler mq-deadline registered
[    0.366149] io scheduler kyber registered
[    0.366177] io scheduler bfq registered
[    0.374797] mxs-dma 110000.dma-apbh: initialized
[    0.380411] 21e8000.serial: ttymxc1 at MMIO 0x21e8000 (irq = 267, base_baud = 5000000) is a IMX
[    0.380466] pfuze100-regulator 0-0008: Full layer: 2, Metal layer: 1
[    0.380517] printk: console [ttymxc1] enabled
[    0.420856] pfuze100-regulator 0-0008: FAB: 0, FIN: 0
[    0.426703] 21f0000.serial: ttymxc3 at MMIO 0x21f0000 (irq = 268, base_baud = 5000000) is a IMX
[    0.429991] pfuze100-regulator 0-0008: pfuze100 found.
[    1.415972] dwhdmi-imx 120000.hdmi: Detected HDMI TX controller v1.30a with HDCP (DWC HDMI 3D TX PHY)
[    1.438220] etnaviv etnaviv: bound 130000.gpu (ops 0xc0eaa4c0)
[    1.444322] etnaviv etnaviv: bound 134000.gpu (ops 0xc0eaa4c0)
[    1.450446] etnaviv etnaviv: bound 2204000.gpu (ops 0xc0eaa4c0)
[    1.456412] etnaviv-gpu 130000.gpu: model: GC2000, revision: 5108
[    1.462736] etnaviv-gpu 134000.gpu: model: GC320, revision: 5007
[    1.468843] etnaviv-gpu 2204000.gpu: model: GC355, revision: 1215
[    1.474978] etnaviv-gpu 2204000.gpu: Ignoring GPU with VG and FE2.0
[    1.481750] [drm] Initialized etnaviv 1.3.0 20151214 for etnaviv on minor 0
[    1.490567] imx-ipuv3 2400000.ipu: IPUv3H probed
[    1.497417] imx-drm display-subsystem: bound imx-ipuv3-crtc.2 (ops 0xc0e990bc)
[    1.504784] imx-drm display-subsystem: bound imx-ipuv3-crtc.3 (ops 0xc0e990bc)
[    1.512221] imx-drm display-subsystem: bound imx-ipuv3-crtc.6 (ops 0xc0e990bc)
[    1.519601] imx-drm display-subsystem: bound imx-ipuv3-crtc.7 (ops 0xc0e990bc)
[    1.526929] imx-drm display-subsystem: bound 120000.hdmi (ops 0xc0e99bb0)
[    1.533801] imx-drm display-subsystem: bound 2000000.aips-bus:ldb (ops 0xc0e9985c)
[    1.542009] [drm] Initialized imx-drm 1.0.0 20120507 for display-subsystem on minor 1
[    1.617102] Console: switching to colour frame buffer device 128x48
[    1.639863] imx-drm display-subsystem: [drm] fb0: imx-drmdrmfb frame buffer device
[    1.647569] imx-ipuv3 2800000.ipu: IPUv3H probed
[    1.664218] brd: module loaded
[    1.675094] loop: module loaded
[    1.678559] at24 0-0050: supply vcc not found, using dummy regulator
[    1.686264] at24 0-0050: 8192 byte 24c64 EEPROM, writable, 32 bytes/write
[    1.693254] at24 0-005e: supply vcc not found, using dummy regulator
[    1.700649] at24 0-005e: 6 byte 24mac402 EEPROM, read-only
[    1.707334] ahci-imx 2200000.sata: fsl,transmit-level-mV not specified, using 00000024
[    1.715309] ahci-imx 2200000.sata: fsl,transmit-boost-mdB not specified, using 00000480
[    1.723325] ahci-imx 2200000.sata: fsl,transmit-atten-16ths not specified, using 00002000
[    1.731527] ahci-imx 2200000.sata: fsl,receive-eq-mdB not specified, using 05000000
[    1.739323] ahci-imx 2200000.sata: supply ahci not found, using dummy regulator
[    1.746879] ahci-imx 2200000.sata: supply phy not found, using dummy regulator
[    1.754171] ahci-imx 2200000.sata: supply target not found, using dummy regulator
[    1.765103] ahci-imx 2200000.sata: SSS flag set, parallel bus scan disabled
[    1.772093] ahci-imx 2200000.sata: AHCI 0001.0300 32 slots 1 ports 3 Gbps 0x1 impl platform mode
[    1.780930] ahci-imx 2200000.sata: flags: ncq sntf stag pm led clo only pmp pio slum part ccc apst
[    1.791631] scsi host0: ahci-imx
[    1.795182] ata1: SATA max UDMA/133 mmio [mem 0x02200000-0x02203fff] port 0x100 irq 281
[    1.807887] CAN device driver interface
[    1.815291] pps pps0: new PPS source ptp0
[    1.824500] fec 2188000.ethernet eth0: registered PHC device 0
[    1.830997] usbcore: registered new device driver r8152-cfgselector
[    1.837331] usbcore: registered new interface driver r8152
[    1.842875] usbcore: registered new interface driver lan78xx
[    1.848605] usbcore: registered new interface driver asix
[    1.854042] usbcore: registered new interface driver ax88179_178a
[    1.860192] usbcore: registered new interface driver cdc_ether
[    1.866080] usbcore: registered new interface driver smsc95xx
[    1.871862] usbcore: registered new interface driver net1080
[    1.877579] usbcore: registered new interface driver cdc_subset
[    1.883536] usbcore: registered new interface driver zaurus
[    1.889163] usbcore: registered new interface driver MOSCHIP usb-ethernet driver
[    1.896643] usbcore: registered new interface driver cdc_ncm
[    1.902339] usbcore: registered new interface driver r8153_ecm
[    1.908265] usbcore: registered new interface driver usb-storage
[    1.915845] imx_usb 2184000.usb: No over current polarity defined
[    1.925481] ci_hdrc ci_hdrc.0: EHCI Host Controller
[    1.930395] ci_hdrc ci_hdrc.0: new USB bus registered, assigned bus number 1
[    1.964891] ci_hdrc ci_hdrc.0: USB 2.0 started, EHCI 1.00
[    1.970466] usb usb1: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 6.05
[    1.978770] usb usb1: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    1.986022] usb usb1: Product: EHCI Host Controller
[    1.990909] usb usb1: Manufacturer: Linux 6.5.0 ehci_hcd
[    1.996243] usb usb1: SerialNumber: ci_hdrc.0
[    2.001210] hub 1-0:1.0: USB hub found
[    2.005040] hub 1-0:1.0: 1 port detected
[    2.013077] ci_hdrc ci_hdrc.1: EHCI Host Controller
[    2.018008] ci_hdrc ci_hdrc.1: new USB bus registered, assigned bus number 2
[    2.054882] ci_hdrc ci_hdrc.1: USB 2.0 started, EHCI 1.00
[    2.060434] usb usb2: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 6.05
[    2.068736] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[    2.075993] usb usb2: Product: EHCI Host Controller
[    2.080880] usb usb2: Manufacturer: Linux 6.5.0 ehci_hcd
[    2.086215] usb usb2: SerialNumber: ci_hdrc.1
[    2.091157] hub 2-0:1.0: USB hub found
[    2.094973] hub 2-0:1.0: 1 port detected
[    2.100576] SPI driver ads7846 has no spi_device_id for ti,tsc2046
[    2.106788] SPI driver ads7846 has no spi_device_id for ti,ads7843
[    2.112974] SPI driver ads7846 has no spi_device_id for ti,ads7845
[    2.119173] SPI driver ads7846 has no spi_device_id for ti,ads7873
[    2.126405] ata1: SATA link down (SStatus 0 SControl 300)
[    2.128977] rtc-ds1307 0-0068: SET TIME!
[    2.131859] ahci-imx 2200000.sata: no device found, disabling link.
[    2.138861] rtc-ds1307 0-0068: registered as rtc0
[    2.142031] ahci-imx 2200000.sata: pass ahci_imx..hotplug=1 to enable hotplug
[    2.148304] rtc-ds1307 0-0068: setting system clock to 2000-01-01T00:00:16 UTC (946684816)
[    2.164313] snvs_rtc 20cc000.snvs:snvs-rtc-lp: registered as rtc1
[    2.170598] i2c_dev: i2c /dev entries driver
[    2.179371] Bluetooth: HCI UART driver ver 2.3
[    2.183827] Bluetooth: HCI UART protocol H4 registered
[    2.189030] Bluetooth: HCI UART protocol LL registered
[    2.195090] sdhci: Secure Digital Host Controller Interface driver
[    2.201278] sdhci: Copyright(c) Pierre Ossman
[    2.205665] sdhci-pltfm: SDHCI platform and OF driver helper
[    2.212740] sdhci-esdhc-imx 2194000.usdhc: Got CD GPIO
[    2.213147] caam 2100000.caam: Entropy delay = 3200
[    2.217980] sdhci-esdhc-imx 2194000.usdhc: Got WP GPIO
[    2.235314] caam 2100000.caam: Instantiated RNG4 SH0
[    2.247792] caam 2100000.caam: Instantiated RNG4 SH1
[    2.252993] caam 2100000.caam: device ID = 0x0a16010000000000 (Era 4)
[    2.254409] mmc0: SDHCI controller on 2198000.usdhc [2198000.usdhc] using ADMA
[    2.259488] caam 2100000.caam: job rings = 2, qi = 0
[    2.274830] caam algorithms registered in /proc/crypto
[    2.280160] caam 2100000.caam: registering rng-caam
[    2.284350] mmc1: SDHCI controller on 2194000.usdhc [2194000.usdhc] using ADMA
[    2.285314] caam 2100000.caam: rng crypto API alg registered prng-caam
[    2.297630] random: crng init done
[    2.303285] usbcore: registered new interface driver usbhid
[    2.308911] usbhid: USB HID core driver
[    2.313588] imx-ipuv3-csi imx-ipuv3-csi.0: Registered ipu1_csi0 capture as /dev/video0
[    2.320064] mmc0: new DDR MMC card at address 0001
[    2.321564] usb 1-1: new high-speed USB device number 2 using ci_hdrc
[    2.326456] imx-ipuv3 2400000.ipu: Registered ipu1_ic_prpenc capture as /dev/video1
[    2.333527] mmcblk0: mmc0:0001 Q2J54A 3.64 GiB
[    2.340788] mmc1: new high speed SDHC card at address 1234
[    2.344970] imx-ipuv3 2400000.ipu: Registered ipu1_ic_prpvf capture as /dev/video2
[    2.345512] imx-ipuv3-csi imx-ipuv3-csi.1: Registered ipu1_csi1 capture as /dev/video3
[    2.351359] mmcblk1: mmc1:1234 SA32G 28.8 GiB
[    2.358940] imx-ipuv3-csi imx-ipuv3-csi.4: Registered ipu2_csi0 capture as /dev/video4
[    2.359085] mmcblk0boot0: mmc0:0001 Q2J54A 2.00 MiB
[    2.361287] mmcblk0boot1: mmc0:0001 Q2J54A 2.00 MiB
[    2.363107] mmcblk0rpmb: mmc0:0001 Q2J54A 512 KiB, chardev (243:0)
[    2.394649]  mmcblk1: p1 p2 p3
[    2.394924] usb 2-1: new high-speed USB device number 2 using ci_hdrc
[    2.394948] imx-ipuv3 2800000.ipu: Registered ipu2_ic_prpenc capture as /dev/video5
[    2.397988] imx-ipuv3 2800000.ipu: Registered ipu2_ic_prpvf capture as /dev/video6
[    2.419931] imx-ipuv3-csi imx-ipuv3-csi.5: Registered ipu2_csi1 capture as /dev/video7
[    2.436511] NET: Registered PF_INET6 protocol family
[    2.442621] Segment Routing with IPv6
[    2.446385] In-situ OAM (IOAM) with IPv6
[    2.450396] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
[    2.457018] NET: Registered PF_PACKET protocol family
[    2.462109] can: controller area network core
[    2.466561] NET: Registered PF_CAN protocol family
[    2.471387] can: raw protocol
[    2.474365] can: broadcast manager protocol
[    2.478607] can: netlink gateway - max_hops=1
[    2.483072] Key type dns_resolver registered
[    2.489383] Registering SWP/SWPB emulation handler
[    2.504238] Loading compiled-in X.509 certificates
[    2.539840] video-mux 20e0000.iomuxc-gpr:ipu1_csi0_mux: Consider updating driver video-mux to match on endpoints
[    2.550967] video-mux 20e0000.iomuxc-gpr:ipu2_csi1_mux: Consider updating driver video-mux to match on endpoints
[    2.555909] usb 1-1: New USB device found, idVendor=0bda, idProduct=8179, bcdDevice= 0.00
[    2.563328] imx-media: Registered ipu_ic_pp csc/scaler as /dev/video8
[    2.569363] usb 1-1: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[    2.576973] imx_thermal 2000000.aips-bus:tempmon: Industrial CPU temperature grade - max:105C critical:100C passive:95C
[    2.582980] usb 1-1: Product: 802.11n NIC
[    2.597803] usb 1-1: Manufacturer: Realtek
[    2.601909] usb 1-1: SerialNumber: 00E04C0001
[    2.610341] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[    2.621866] Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[    2.627617] clk: Disabling unused clocks
[    2.631928] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[    2.634899] ALSA device list:
[    2.640606] cfg80211: failed to load regulatory.db
[    2.643524]   No soundcards found.
[    2.648408] usb 2-1: New USB device found, idVendor=0424, idProduct=2517, bcdDevice= 0.02
[    2.660001] usb 2-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[    2.667658] hub 2-1:1.0: USB hub found
[    2.671598] hub 2-1:1.0: 7 ports detected
[    2.700941] EXT4-fs (mmcblk1p2): INFO: recovery required on readonly filesystem
[    2.708329] EXT4-fs (mmcblk1p2): write access will be enabled during recovery
[    2.994978] usb 2-1.1: new high-speed USB device number 3 using ci_hdrc
[    3.145639] usb 2-1.1: New USB device found, idVendor=0424, idProduct=9e00, bcdDevice= 3.00
[    3.154030] usb 2-1.1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[    3.164547] smsc95xx v2.0.0
[    3.289135] SMSC LAN8710/LAN8720 usb-002:003:01: attached PHY driver (mii_bus:phy_addr=usb-002:003:01, irq=294)
[    3.300938] smsc95xx 2-1.1:1.0 eth1: register 'smsc95xx' at usb-ci_hdrc.1-1.1, smsc95xx USB 2.0 Ethernet, f2:f7:83:3c:d3:e8
[    3.766601] EXT4-fs (mmcblk1p2): recovery complete
[    4.017039] EXT4-fs (mmcblk1p2): mounted filesystem 1c93b4dc-44a6-4b43-93b0-ce3b0bbd0391 ro with ordered data mode. Quota mode: none.
[    4.029221] VFS: Mounted root (ext4 filesystem) readonly on device 179:10.
[    4.037240] devtmpfs: mounted
[    4.042698] Freeing unused kernel image (initmem) memory: 1024K
[    4.049122] Run /sbin/init as init process
[    4.330114] EXT4-fs (mmcblk1p2): re-mounted 1c93b4dc-44a6-4b43-93b0-ce3b0bbd0391 r/w. Quota mode: none.
Starting psplash: OK
Starting syslogd: OK
Starting klogd: OK
Running sysctl: OK
Populating /dev using udev: 
[    4.650852] udevd[134]: starting version 3.2.9
[    4.692906] udevd[135]: starting eudev-3.2.9
done
Starting watchdog...
Initializing random number generator: OK
Saving random seed: OK
Starting usbguard daemon: OK
Starting rngd: OK
Starting system message bus: done
Starting network:
[    6.261676] Micrel KSZ9031 Gigabit PHY 2188000.ethernet-1:03: attached PHY driver (mii_bus:phy_addr=2188000.ethernet-1:03, irq=56)
OK
Starting chrony: OK
Starting php-fpm  done
Starting nginx...
Starting sshd: OK
Touchscreen Firmware
Tool version:   v0.29_20170705
APILIB version: v1.0.62.0705
Try to start Stephanie 5 GUI
login:
[    8.500637] fec 2188000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
[    8.533754] fec 2188000.ethernet eth0: Link is Down
[   11.147566] fec 2188000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
[   12.646102] platform 2008000.ecspi: deferred probe pending
root
Password: 
# 
# ip link set dev eth0 up
# ip addr add 192.168.1.2/24 dev eth0
ip: RTNETLINK answers: File exists
# iperf3 -c 192.168.1.1

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Ethernet issue on imx6
  2023-10-12 19:39 ` Russell King (Oracle)
@ 2023-10-13  8:40   ` Miquel Raynal
  2023-10-13 10:16     ` Wei Fang
  2023-10-16 11:49     ` Eric Dumazet
  0 siblings, 2 replies; 26+ messages in thread
From: Miquel Raynal @ 2023-10-13  8:40 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Wei Fang, Shenwei Wang, Clark Wang, davem, edumazet, kuba, pabeni,
	linux-imx, netdev, Thomas Petazzoni, Alexandre Belloni,
	Maxime Chevallier, Andrew Lunn, Stephen Hemminger

Hi Russell,

linux@armlinux.org.uk wrote on Thu, 12 Oct 2023 20:39:11 +0100:

> On Thu, Oct 12, 2023 at 07:34:10PM +0200, Miquel Raynal wrote:
> > Hello,
> > 
> > I've been scratching my foreheads for weeks on a strange imx6
> > network issue, I need help to go further, as I feel a bit clueless now.
> > 
> > Here is my setup :
> > - Custom imx6q board
> > - Bootloader: U-Boot 2017.11 (also tried with a 2016.03)
> > - Kernel : 4.14(.69,.146,.322), v5.10 and v6.5 with the same behavior
> > - The MAC (fec driver) is connected to a Micrel 9031 PHY
> > - The PHY is connected to the link partner through an industrial cable  
> 
> "industrial cable" ?

It is a "unique" hardware cable, the four Ethernet pairs are foiled
twisted pair each and the whole cable is shielded. Additionally there
is the 24V power supply coming from this cable. The connector is from
ODU S22LOC-P16MCD0-920S. The structure of the cable should be similar
to a CAT7 cable with the additional power supply line.

> > - Testing 100BASE-T (link is stable)  
> 
> Would that be full or half duplex?

Ah, yeah, sorry for forgetting this detail, it's full duplex.

> > The RGMII-ID timings are probably not totally optimal but offer
> > rather good performance. In UDP with iperf3:
> > * Downlink (host to the board) runs at full speed with 0% drop
> > * Uplink (board to host) runs at full speed with <1% drop
> > 
> > However, if I ever try to limit the bandwidth in uplink (only), the
> > drop rate rises significantly, up to 30%:
> > 
> > //192.168.1.1 is my host, so the below lines are from the board:
> > # iperf3 -c 192.168.1.1 -u -b100M
> > [  5]   0.00-10.05  sec   113 MBytes  94.6 Mbits/sec  0.044 ms  467/82603 (0.57%)  receiver
> > # iperf3 -c 192.168.1.1 -u -b90M
> > [  5]   0.00-10.04  sec  90.5 MBytes  75.6 Mbits/sec  0.146 ms  12163/77688 (16%)  receiver
> > # iperf3 -c 192.168.1.1 -u -b80M
> > [  5]   0.00-10.05  sec  66.4 MBytes  55.5 Mbits/sec  0.162 ms  20937/69055 (30%)  receiver
> 
> My setup:
> 
> i.MX6DL silicon rev 1.3
> Atheros AR8035 PHY
> 6.3.0+ (no significant changes to fec_main.c)
> Link, being BASE-T, is standard RJ45.
> 
> Connectivity is via a bridge device (sorry, can't change that as it
> would be too disruptive, as this is my Internet router!)
> 
> Running at 1000BASE-T (FD):
> [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> [  5]   0.00-10.01  sec   114 MBytes  95.4 Mbits/sec  0.030 ms  0/82363 (0%)  receiver
> [  5]   0.00-10.00  sec   107 MBytes  90.0 Mbits/sec  0.103 ms  0/77691 (0%)  receiver
> [  5]   0.00-10.00  sec  95.4 MBytes  80.0 Mbits/sec  0.101 ms  0/69060 (0%)  receiver
> 
> Running at 100BASE-Tx (FD):
> [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> [  5]   0.00-10.01  sec   114 MBytes  95.4 Mbits/sec  0.008 ms  0/82436 (0%)  receiver
> [  5]   0.00-10.00  sec   107 MBytes  90.0 Mbits/sec  0.088 ms  0/77692 (0%)  receiver
> [  5]   0.00-10.00  sec  95.4 MBytes  80.0 Mbits/sec  0.108 ms  0/69058 (0%)  receiver
> 
> Running at 100BASE-Tx (HD):
> [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> [  5]   0.00-10.01  sec   114 MBytes  95.3 Mbits/sec  0.056 ms  0/82304 (0%)  receiver
> [  5]   0.00-10.00  sec   107 MBytes  90.0 Mbits/sec  0.101 ms  1/77691 (0.0013%)  receiver
> [  5]   0.00-10.00  sec  95.4 MBytes  80.0 Mbits/sec  0.105 ms  0/69058 (0%)  receiver
> 
> So I'm afraid I don't see your issue.

I believe the issue cannot be at a higher level than the MAC. I also
do not think the MAC driver and PHY driver are specifically buggy. I
ruled out the hardware issue given the fact that under certain
conditions (high load) the network works rather well... But I certainly
see this issue, and when switching to TCP the results are dramatic:

# iperf3 -c 192.168.1.1
Connecting to host 192.168.1.1, port 5201
[  5] local 192.168.1.2 port 37948 connected to 192.168.1.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  11.3 MBytes  94.5 Mbits/sec   43   32.5 KBytes       
[  5]   1.00-2.00   sec  3.29 MBytes  27.6 Mbits/sec   26   1.41 KBytes       
[  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes       
[  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes       
[  5]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec    5   1.41 KBytes       
[  5]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes       
[  5]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes       
[  5]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes       
[  5]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes       
[  5]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes  

Thanks,
Miquèl
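
The loss figures quoted in this thread can be recomputed from the iperf3
receiver summary lines. A minimal sketch in Python, assuming the UDP
summary format shown in the report; `udp_summary` and `SUMMARY_RE` are
hypothetical names chosen for illustration, not part of iperf3:

```python
import re

# Matches the tail of an iperf3 UDP summary line, e.g.
# "... 55.5 Mbits/sec  0.162 ms  20937/69055 (30%)  receiver"
SUMMARY_RE = re.compile(r"([\d.]+)\s+Mbits/sec\s+[\d.]+\s+ms\s+(\d+)/(\d+)")


def udp_summary(line):
    """Return (bitrate_mbps, lost, total, loss_percent) for one summary line."""
    m = SUMMARY_RE.search(line)
    if m is None:
        raise ValueError("not an iperf3 UDP summary line: %r" % line)
    lost, total = int(m.group(2)), int(m.group(3))
    return float(m.group(1)), lost, total, 100.0 * lost / total


# Receiver lines copied from the report (-b100M, -b90M, -b80M).
lines = [
    "[  5]   0.00-10.05  sec   113 MBytes  94.6 Mbits/sec  0.044 ms  467/82603 (0.57%)  receiver",
    "[  5]   0.00-10.04  sec  90.5 MBytes  75.6 Mbits/sec  0.146 ms  12163/77688 (16%)  receiver",
    "[  5]   0.00-10.05  sec  66.4 MBytes  55.5 Mbits/sec  0.162 ms  20937/69055 (30%)  receiver",
]
for line in lines:
    rate, lost, total, pct = udp_summary(line)
    print("%5.1f Mbits/sec received, %5d/%5d lost (%.2f%%)" % (rate, lost, total, pct))
```

Recomputed this way, the loss is roughly 0.57%, 15.7% and 30.3%: it grows
as the requested bandwidth decreases, which is the anomaly under
discussion.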

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Ethernet issue on imx6
  2023-10-12 17:34 Ethernet issue on imx6 Miquel Raynal
  2023-10-12 19:39 ` Russell King (Oracle)
  2023-10-12 20:46 ` Andrew Lunn
@ 2023-10-13  8:50 ` James Chapman
  2023-10-13 10:37   ` Miquel Raynal
  2 siblings, 1 reply; 26+ messages in thread
From: James Chapman @ 2023-10-13  8:50 UTC (permalink / raw)
  To: Miquel Raynal, Wei Fang, Shenwei Wang, Clark Wang, Russell King
  Cc: davem, edumazet, kuba, pabeni, linux-imx, netdev,
	Thomas Petazzoni, Alexandre Belloni, Maxime Chevallier

On 12/10/2023 18:34, Miquel Raynal wrote:
> Hello,
>
> I've been scratching my foreheads for weeks on a strange imx6
> network issue, I need help to go further, as I feel a bit clueless now.
>
> Here is my setup :
> - Custom imx6q board
> - Bootloader: U-Boot 2017.11 (also tried with a 2016.03)
> - Kernel : 4.14(.69,.146,.322), v5.10 and v6.5 with the same behavior
> - The MAC (fec driver) is connected to a Micrel 9031 PHY
> - The PHY is connected to the link partner through an industrial cable
> - Testing 100BASE-T (link is stable)
>
> The RGMII-ID timings are probably not totally optimal but offer rather
> good performance. In UDP with iperf3:
> * Downlink (host to the board) runs at full speed with 0% drop
> * Uplink (board to host) runs at full speed with <1% drop
>
> However, if I ever try to limit the bandwidth in uplink (only), the drop
> rate rises significantly, up to 30%:
>
> //192.168.1.1 is my host, so the below lines are from the board:
> # iperf3 -c 192.168.1.1 -u -b100M
> [  5]   0.00-10.05  sec   113 MBytes  94.6 Mbits/sec  0.044 ms  467/82603 (0.57%)  receiver
> # iperf3 -c 192.168.1.1 -u -b90M
> [  5]   0.00-10.04  sec  90.5 MBytes  75.6 Mbits/sec  0.146 ms  12163/77688 (16%)  receiver
> # iperf3 -c 192.168.1.1 -u -b80M
> [  5]   0.00-10.05  sec  66.4 MBytes  55.5 Mbits/sec  0.162 ms  20937/69055 (30%)  receiver
>
> One direct consequence, I believe, is that tcp transfers quickly stall
> or run at an insanely low speed (~40kiB/s).
>
> I've tried to disable all the hardware offloading reported by ethtool
> with no additional success.
>
> Last but not least, I observe another very strange behavior: when I
> perform an uplink transfer at a "reduced" speed (80Mbps or below), as
> said above, I observe a ~30% drop rate. But if I run a full speed UDP
> transfer in downlink at the same time, the drop rate lowers to ~3-4%.
> See below, this is an iperf server on my host receiving UDP traffic from
> my board. After 5 seconds I start a full speed UDP transfer from the
> host to the board:
>
> [  5] local 192.168.1.1 port 5201 connected to 192.168.1.2 port 57216
> [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> [  5]   0.00-1.00   sec  6.29 MBytes  52.7 Mbits/sec  0.152 ms  2065/6617 (31%)
> [  5]   1.00-2.00   sec  6.50 MBytes  54.6 Mbits/sec  0.118 ms  2199/6908 (32%)
> [  5]   2.00-3.00   sec  6.64 MBytes  55.7 Mbits/sec  0.123 ms  2099/6904 (30%)
> [  5]   3.00-4.00   sec  6.58 MBytes  55.2 Mbits/sec  0.091 ms  2141/6905 (31%)
> [  5]   4.00-5.00   sec  6.59 MBytes  55.3 Mbits/sec  0.092 ms  2134/6907 (31%)
> [  5]   5.00-6.00   sec  8.36 MBytes  70.1 Mbits/sec  0.088 ms  853/6904 (12%)
> [  5]   6.00-7.00   sec  9.14 MBytes  76.7 Mbits/sec  0.085 ms  281/6901 (4.1%)
> [  5]   7.00-8.00   sec  9.19 MBytes  77.1 Mbits/sec  0.147 ms  255/6911 (3.7%)
> [  5]   8.00-9.00   sec  9.22 MBytes  77.3 Mbits/sec  0.160 ms  233/6907 (3.4%)
> [  5]   9.00-10.00  sec  9.25 MBytes  77.6 Mbits/sec  0.129 ms  211/6906 (3.1%)
> [  5]  10.00-10.04  sec   392 KBytes  76.9 Mbits/sec  0.113 ms  11/288 (3.8%)
>
> If the downlink transfer is not at full speed, I don't observe any
> difference.
>
> I've commented out the runtime_pm callbacks in the fec driver, but
> nothing changed.
>
> Any hint or idea will be highly appreciated!
>
> Thanks a lot,
> Miquèl
>
Check your board's interrupt configuration. At high data rates, NAPI may 
mask interrupt delivery/routing issues since NAPI keeps interrupts 
disabled longer. Also, if the CPU has hardware interrupt coalescing 
features enabled, these may not play well with NAPI.

Low level irq configuration is quite complex (and flexible) in devices 
like iMX. It may be further complicated by some of it being done by the 
bootloader. So perhaps experiment with the fec driver's NAPI weight and 
debug the irq handler first to test whether interrupt handling is 
working as expected on your board before digging in the low level, 
board-specific irq setup code.

James



^ permalink raw reply	[flat|nested] 26+ messages in thread

* RE: Ethernet issue on imx6
  2023-10-13  8:40   ` Miquel Raynal
@ 2023-10-13 10:16     ` Wei Fang
  2023-10-16 11:49     ` Eric Dumazet
  1 sibling, 0 replies; 26+ messages in thread
From: Wei Fang @ 2023-10-13 10:16 UTC (permalink / raw)
  To: Miquel Raynal, Russell King (Oracle)
  Cc: Shenwei Wang, Clark Wang, davem@davemloft.net,
	edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	dl-linux-imx, netdev@vger.kernel.org, Thomas Petazzoni,
	Alexandre Belloni, Maxime Chevallier, Andrew Lunn,
	Stephen Hemminger

> -----Original Message-----
> From: Miquel Raynal <miquel.raynal@bootlin.com>
> Sent: October 13, 2023 16:40
> To: Russell King (Oracle) <linux@armlinux.org.uk>
> Cc: Wei Fang <wei.fang@nxp.com>; Shenwei Wang
> <shenwei.wang@nxp.com>; Clark Wang <xiaoning.wang@nxp.com>;
> davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
> pabeni@redhat.com; dl-linux-imx <linux-imx@nxp.com>;
> netdev@vger.kernel.org; Thomas Petazzoni
> <thomas.petazzoni@bootlin.com>; Alexandre Belloni
> <alexandre.belloni@bootlin.com>; Maxime Chevallier
> <maxime.chevallier@bootlin.com>; Andrew Lunn <andrew@lunn.ch>;
> Stephen Hemminger <stephen@networkplumber.org>
> Subject: Re: Ethernet issue on imx6
> 
> Hi Russell,
> 
> linux@armlinux.org.uk wrote on Thu, 12 Oct 2023 20:39:11 +0100:
> 
> > On Thu, Oct 12, 2023 at 07:34:10PM +0200, Miquel Raynal wrote:
> > > Hello,
> > >
> > > I've been scratching my foreheads for weeks on a strange imx6
> > > network issue, I need help to go further, as I feel a bit clueless now.
> > >
> > > Here is my setup :
> > > - Custom imx6q board
> > > - Bootloader: U-Boot 2017.11 (also tried with a 2016.03)
> > > - Kernel : 4.14(.69,.146,.322), v5.10 and v6.5 with the same behavior
> > > - The MAC (fec driver) is connected to a Micrel 9031 PHY
> > > - The PHY is connected to the link partner through an industrial cable
> >
> > "industrial cable" ?
> 
> It is a "unique" hardware cable, the four Ethernet pairs are foiled
> twisted pair each and the whole cable is shielded. Additionally there
> is the 24V power supply coming from this cable. The connector is from
> ODU S22LOC-P16MCD0-920S. The structure of the cable should be similar
> to a CAT7 cable with the additional power supply line.
> 
Is it necessary to use this 'industrial cable'? Can it be replaced with CAT5 or
CAT6 cable?

I also cannot reproduce this issue with my i.MX6UL and i.MX8ULP boards.
root@imx6ul7d:~# iperf3 -c 10.193.108.176 -u -b80M
Connecting to host 10.193.108.176, port 5201
[  5] local 10.193.102.126 port 46382 connected to 10.193.108.176 port 5201
[ ID] Interval           Transfer     Bitrate         Total Datagrams
[  5]   0.00-1.00   sec  9.53 MBytes  80.0 Mbits/sec  6903
[  5]   1.00-2.00   sec  9.54 MBytes  80.0 Mbits/sec  6909
[  5]   2.00-3.00   sec  9.54 MBytes  80.0 Mbits/sec  6906
[  5]   3.00-4.00   sec  9.54 MBytes  80.0 Mbits/sec  6906
[  5]   4.00-5.00   sec  9.54 MBytes  80.0 Mbits/sec  6905
[  5]   5.00-6.00   sec  9.54 MBytes  80.0 Mbits/sec  6907
[  5]   6.00-7.00   sec  9.54 MBytes  80.0 Mbits/sec  6906
[  5]   7.00-8.00   sec  9.53 MBytes  79.9 Mbits/sec  6901
[  5]   8.00-9.00   sec  9.54 MBytes  80.1 Mbits/sec  6911
[  5]   9.00-10.00  sec  9.53 MBytes  80.0 Mbits/sec  6903
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec  95.4 MBytes  80.0 Mbits/sec  0.000 ms  0/69057 (0%)  sender
[  5]   0.00-10.04  sec  95.4 MBytes  79.6 Mbits/sec  0.046 ms  0/69057 (0%)  receiver

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Ethernet issue on imx6
  2023-10-13  8:50 ` James Chapman
@ 2023-10-13 10:37   ` Miquel Raynal
  2023-10-13 11:54     ` James Chapman
  0 siblings, 1 reply; 26+ messages in thread
From: Miquel Raynal @ 2023-10-13 10:37 UTC (permalink / raw)
  To: James Chapman
  Cc: Wei Fang, Shenwei Wang, Clark Wang, Russell King, davem, edumazet,
	kuba, pabeni, linux-imx, netdev, Thomas Petazzoni,
	Alexandre Belloni, Maxime Chevallier

Hi James,

jchapman@katalix.com wrote on Fri, 13 Oct 2023 09:50:49 +0100:

> Check your board's interrupt configuration. At high data rates, NAPI may mask interrupt delivery/routing issues since NAPI keeps interrupts disabled longer. Also, if the CPU has hardware interrupt coalescing features enabled, these may not play well with NAPI.
> 
> Low level irq configuration is quite complex (and flexible) in devices like iMX. It may be further complicated by some of it being done by the bootloader. So perhaps experiment with the fec driver's NAPI weight and debug the irq handler first to test whether interrupt handling is working as expected on your board before digging in the low level, board-specific irq setup code.

Thanks a lot for looking into this. I've tried to play a little bit
with the NAPI budget but saw no difference at all in the results. With
this new information in mind, do you think I should look deeper?

Thanks,
Miquèl

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Ethernet issue on imx6
  2023-10-13 10:37   ` Miquel Raynal
@ 2023-10-13 11:54     ` James Chapman
  0 siblings, 0 replies; 26+ messages in thread
From: James Chapman @ 2023-10-13 11:54 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Wei Fang, Shenwei Wang, Clark Wang, Russell King, davem, edumazet,
	kuba, pabeni, linux-imx, netdev, Thomas Petazzoni,
	Alexandre Belloni, Maxime Chevallier

On 13/10/2023 11:37, Miquel Raynal wrote:
> Hi James,
>
> jchapman@katalix.com wrote on Fri, 13 Oct 2023 09:50:49 +0100:
>
>> Check your board's interrupt configuration. At high data rates, NAPI may mask interrupt delivery/routing issues since NAPI keeps interrupts disabled longer. Also, if the CPU has hardware interrupt coalescing features enabled, these may not play well with NAPI.
>>
>> Low level irq configuration is quite complex (and flexible) in devices like iMX. It may be further complicated by some of it being done by the bootloader. So perhaps experiment with the fec driver's NAPI weight and debug the irq handler first to test whether interrupt handling is working as expected on your board before digging in the low level, board-specific irq setup code.
> Thanks a lot for looking into this. I've tried to play a little bit
> with the NAPI budget but saw no difference at all in the results. With
> this new information in mind, do you think I should look deeper?
>
> Thanks,
> Miquèl

I think so. I was just suggesting tweaking variables in your setup which 
might change interrupt characteristics while monitoring interrupt 
handlers in order that you can be sure your device interrupts work as 
expected. Some instrumentation in the driver's interrupt and NAPI 
handlers may help. Perhaps plotting various counts (rx/tx interrupts, 
rx/tx interrupt on/off changes, NAPI schedules etc) at different packet 
rates might show where things start going wrong and could help to 
identify the root cause.

James
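
One low-effort way to get such counts without touching the driver is to
sample /proc/interrupts at each packet rate. A minimal sketch in Python;
it assumes the FEC interrupt lines carry the device name
"2188000.ethernet" (taken from the boot log, not verified against
/proc/interrupts on this board), and `irq_counts` and `irq_rates` are
hypothetical helper names:

```python
import time


def irq_counts(text, match):
    """Sum per-CPU counts for every /proc/interrupts line containing `match`."""
    lines = text.splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    counts = {}
    for line in lines[1:]:
        if match not in line:
            continue
        fields = line.split()
        irq = fields[0].rstrip(":")
        # The first `ncpu` fields after the IRQ number are per-CPU counters.
        counts[irq] = sum(int(f) for f in fields[1:1 + ncpu])
    return counts


def irq_rates(match="2188000.ethernet", interval=1.0):
    """Return interrupts per second for each matching IRQ over `interval`."""
    with open("/proc/interrupts") as f:
        first = irq_counts(f.read(), match)
    time.sleep(interval)
    with open("/proc/interrupts") as f:
        second = irq_counts(f.read(), match)
    return {irq: (second[irq] - first.get(irq, 0)) / interval for irq in second}
```

Calling irq_rates() on the board while iperf3 runs at -b100M, -b90M and
-b80M gives one interrupt rate per requested bandwidth. This only covers
hard interrupt counts; NAPI schedule and poll counts would still need
instrumentation in the driver.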



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Ethernet issue on imx6
  2023-10-13  8:27     ` Miquel Raynal
@ 2023-10-13 15:51       ` Andrew Lunn
  2023-10-27 20:58         ` Miquel Raynal
  2023-10-16  8:48       ` Alexander Stein
  1 sibling, 1 reply; 26+ messages in thread
From: Andrew Lunn @ 2023-10-13 15:51 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Stephen Hemminger, Wei Fang, Shenwei Wang, Clark Wang,
	Russell King, davem, edumazet, kuba, pabeni, linux-imx, netdev,
	Thomas Petazzoni, Alexandre Belloni, Maxime Chevallier

> # ethtool -S eth0
> NIC statistics:
>      tx_dropped: 0
>      tx_packets: 10118
>      tx_broadcast: 0
>      tx_multicast: 13
>      tx_crc_errors: 0
>      tx_undersize: 0
>      tx_oversize: 0
>      tx_fragment: 0
>      tx_jabber: 0
>      tx_collision: 0
>      tx_64byte: 130
>      tx_65to127byte: 61031
>      tx_128to255byte: 19
>      tx_256to511byte: 10
>      tx_512to1023byte: 5
>      tx_1024to2047byte: 14459
>      tx_GTE2048byte: 0
>      tx_octets: 26219280

These values come from the hardware. They should reflect what actually
made it onto the wire.

Do the values match what the link peer actually received?

Also, can you compare them to what iperf says it transmitted?

From this, we can rule out the industrial cable, and should also be
able to rule out that the receiver, rather than the transmitter, is the problem.
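
A minimal sketch of that comparison, assuming `ethtool -S eth0` output
is captured before and after an iperf3 run and that the datagram counts
reported by iperf3 and by the link peer are entered by hand. The
function names are hypothetical, and the result is only approximate
because other traffic (ARP, the iperf3 control connection) is counted
by the MAC as well:

```python
def parse_ethtool_stats(text):
    """Parse `ethtool -S` output into {counter_name: int}."""
    stats = {}
    for line in text.splitlines():
        # Tolerate mail quoting ("> ") and indentation in front of each line.
        line = line.lstrip("> \t")
        name, sep, value = line.rpartition(":")
        value = value.strip()
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def tx_accounting(before, after, iperf_sent, peer_rx):
    """Split the loss into 'before the wire' and 'after the wire'.

    iperf_sent: datagrams iperf3 reports as sent by the board.
    peer_rx:    datagrams the link peer reports as received.
    """
    mac_tx = after["tx_packets"] - before["tx_packets"]
    return {
        "mac_tx": mac_tx,
        # Sent by iperf3 but never counted by the MAC: lost in the stack/driver.
        "lost_before_wire": iperf_sent - mac_tx,
        # Counted by the MAC but not seen by the peer: cable, PHY or receiver.
        "lost_after_wire": mac_tx - peer_rx,
    }
```

If lost_after_wire stays close to zero while iperf3 still reports
drops, the frames never reached the MAC counters in the first place.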

     Andrew

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Ethernet issue on imx6
  2023-10-13  8:27     ` Miquel Raynal
  2023-10-13 15:51       ` Andrew Lunn
@ 2023-10-16  8:48       ` Alexander Stein
  2023-10-16 13:31         ` Miquel Raynal
  1 sibling, 1 reply; 26+ messages in thread
From: Alexander Stein @ 2023-10-16  8:48 UTC (permalink / raw)
  To: Stephen Hemminger, Miquel Raynal
  Cc: Andrew Lunn, Wei Fang, Shenwei Wang, Clark Wang, Russell King,
	davem, edumazet, kuba, pabeni, linux-imx, netdev,
	Thomas Petazzoni, Alexandre Belloni, Maxime Chevallier

Hi Miquel,

On Friday, 13 October 2023, 10:27:18 CEST, Miquel Raynal wrote:
> Hi Stephen & Andrew,
> 
> stephen@networkplumber.org wrote on Thu, 12 Oct 2023 15:58:57 -0700:
> > On Thu, 12 Oct 2023 22:46:09 +0200
> > 
> > Andrew Lunn <andrew@lunn.ch> wrote:
> > > > //192.168.1.1 is my host, so the below lines are from the board:
> > > > # iperf3 -c 192.168.1.1 -u -b100M
> > > > [  5]   0.00-10.05  sec   113 MBytes  94.6 Mbits/sec  0.044 ms  467/82603 (0.57%)  receiver
> > > > # iperf3 -c 192.168.1.1 -u -b90M
> > > > [  5]   0.00-10.04  sec  90.5 MBytes  75.6 Mbits/sec  0.146 ms  12163/77688 (16%)  receiver
> > > > # iperf3 -c 192.168.1.1 -u -b80M
> > > > [  5]   0.00-10.05  sec  66.4 MBytes  55.5 Mbits/sec  0.162 ms  20937/69055 (30%)  receiver
> > >
> > > Have you tried playing with --pacing-timer ?
> > > 
> > > Maybe iperf is producing big bursts of packets and then silence for
> > > a while. The burst is overflowing a buffer somewhere? Smooth the flow
> > > and it might work better?
> 
> I've just tried and the results are kind of the opposite of what I
> would expect. Here are the values so maybe you'll have a different
> understanding:
> 
> From --pacing-timer 1 to 100000 (should be microseconds IIUC), results
> are the same. And then, increasing the period decreases the drop rate:
> 
> # iperf3 -c 192.168.1.1 -u -b1M --pacing-timer 1000
> [  5]   0.00-10.04  sec   604 KBytes   493 Kbits/sec  0.062 ms  437/864 (51%)  receiver
> # iperf3 -c 192.168.1.1 -u -b1M --pacing-timer 10000
> [  5]   0.00-10.05  sec   581 KBytes   474 Kbits/sec  0.102 ms  452/863 (52%)  receiver
> # iperf3 -c 192.168.1.1 -u -b1M --pacing-timer 100000
> [  5]   0.00-10.05  sec   867 KBytes   707 Kbits/sec  0.094 ms  240/853 (28%)  receiver
> # iperf3 -c 192.168.1.1 -u -b1M --pacing-timer 1000000
> [  5]   0.00-10.05  sec  1.04 MBytes   866 Kbits/sec  0.080 ms  27/778 (3.5%)  receiver
>
> > Please post the basic system info.
> > Like kernel dmesg log.
> 
> Please find the logs below.
> 
> > All network statistics including ethtool.
> 
> PHY statistics remain empty:
> 
> # ethtool --phy-statistics eth0
> PHY statistics:
>      phy_receive_errors: 0
>      phy_idle_errors: 0
> 
> Interrupts work as expected on the MAC side (I added traces in the IRQ
> handler to see how it was behaving):
> 
> # cat /proc/interrupts | grep ethernet
>  82:     344546          0          0          0  gpio-mxc   6 Level     2188000.ethernet
> 104:          1          0          0          0  gpio-mxc  28 Level     2188000.ethernet
> 337:          0          0          0          0     GIC-0 151 Level     2188000.ethernet
> 
> # ethtool -S eth0
> NIC statistics:
>      tx_dropped: 0
>      tx_packets: 10118
>      tx_broadcast: 0
>      tx_multicast: 13
>      tx_crc_errors: 0
>      tx_undersize: 0
>      tx_oversize: 0
>      tx_fragment: 0
>      tx_jabber: 0
>      tx_collision: 0
>      tx_64byte: 130
>      tx_65to127byte: 61031
>      tx_128to255byte: 19
>      tx_256to511byte: 10
>      tx_512to1023byte: 5
>      tx_1024to2047byte: 14459
>      tx_GTE2048byte: 0
>      tx_octets: 26219280
>      IEEE_tx_drop: 0
>      IEEE_tx_frame_ok: 10118
>      IEEE_tx_1col: 0
>      IEEE_tx_mcol: 0
>      IEEE_tx_def: 0
>      IEEE_tx_lcol: 0
>      IEEE_tx_excol: 0
>      IEEE_tx_macerr: 0
>      IEEE_tx_cserr: 0
>      IEEE_tx_sqe: 0
>      IEEE_tx_fdxfc: 0
>      IEEE_tx_octets_ok: 26219280
>      rx_packets: 35369
>      rx_broadcast: 1
>      rx_multicast: 5
>      rx_crc_errors: 0
>      rx_undersize: 0
>      rx_oversize: 0
>      rx_fragment: 0
>      rx_jabber: 0
>      rx_64byte: 10
>      rx_65to127byte: 9083
>      rx_128to255byte: 8
>      rx_256to511byte: 8
>      rx_512to1023byte: 0
>      rx_1024to2047byte: 26260
>      rx_GTE2048byte: 0
>      rx_octets: 436459630
>      IEEE_rx_drop: 0
>      IEEE_rx_frame_ok: 35369
>      IEEE_rx_crc: 0
>      IEEE_rx_align: 0
>      IEEE_rx_macerr: 0
>      IEEE_rx_fdxfc: 0
>      IEEE_rx_octets_ok: 436459630
> 
> > Any special qdisc or firewall configuration.
> 
> None.
> 
> > Likely a hardware or driver bug that is doing something wrong
> > when a lot of packets are received.
> 
> Well, isn't it kind of the opposite? If we flood the interface it works
> better than when we pace the traffic (that's what I see whenever I
> reduce the throughput or when I enlarge the iperf timer).
> 
> I'm also doubtful about the fact that receiving full speed traffic makes
> the uplink stable.
> 
> Thanks,
> Miquèl
> 
> ---
> 
> switch to partitions #0, OK
> mmc1 is current device
> reading boot.scr
> 444 bytes read in 10 ms (43 KiB/s)
> ## Executing script at 20000000
> Booting from mmc ...
> reading zImage
> 9160016 bytes read in 462 ms (18.9 MiB/s)
> reading <board>.dtb

Which device tree is that?

> 40052 bytes read in 22 ms (1.7 MiB/s)
> boot device tree kernel ...
> Kernel image @ 0x12000000 [ 0x000000 - 0x8bc550 ]
> ## Flattened Device Tree blob at 18000000
>    Booting using the fdt blob at 0x18000000
>    Using Device Tree in place at 18000000, end 1800cc73
> 
> Starting kernel ...
> 
> [    0.000000] Booting Linux on physical CPU 0x0
> [    0.000000] Linux version 6.5.0 (mraynal@xps-13) (arm-linux-gcc.br_real (Buildroot 2020.08-14-ge5a2a90) 10.2.0, GNU ld (GNU Binutils) 2.34) #120 SMP Thu Oct 12 18:10:20 CEST 2023
> [    0.000000] CPU: ARMv7 Processor [412fc09a] revision 10 (ARMv7), cr=10c5387d
> [    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
> [    0.000000] OF: fdt: Machine model: TQ TQMa6Q on MBa6x

Your first mail mentions a custom board, but this indicates "TQMa6Q
on MBa6x", so which is it?
Please note that there are two different module variants, imx6qdl-tqma6a.dtsi 
and imx6qdl-tqma6b.dtsi. They deal with i.MX6's ERR006687 differently.
Packet drops without any load somewhat indicate this issue.

Best regards,
Alexander

> [    0.000000] Memory policy: Data cache writealloc
> [    0.000000] cma: Reserved 160 MiB at 0x46000000
> [    0.000000] Zone ranges:
> [    0.000000]   Normal   [mem 0x0000000010000000-0x000000003fffffff]
> [    0.000000]   HighMem  [mem 0x0000000040000000-0x000000004fffffff]
> [    0.000000] Movable zone start for each node
> [    0.000000] Early memory node ranges
> [    0.000000]   node   0: [mem 0x0000000010000000-0x000000004fffffff]
> [    0.000000] Initmem setup node 0 [mem 0x0000000010000000-0x000000004fffffff]
> [    0.000000] percpu: Embedded 13 pages/cpu s23124 r8192 d21932 u53248
> [    0.000000] Kernel command line: root=/dev/mmcblk1p2 ro rootwait console=ttymxc1,115200 cma=160M
> [    0.000000] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes, linear)
> [    0.000000] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes, linear)
> [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 260608
> [    0.000000] mem auto-init: stack:off, heap alloc:off, heap free:off
> [    0.000000] Memory: 854196K/1048576K available (13312K kernel code, 1308K rwdata, 3944K rodata, 1024K init, 401K bss, 30540K reserved, 163840K cma-reserved, 98304K highmem)
> [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
> [    0.000000] rcu: Hierarchical RCU implementation.
> [    0.000000] rcu:     RCU event tracing is enabled.
> [    0.000000]  Tracing variant of Tasks RCU enabled.
> [    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 10
> jiffies. [    0.000000] NR_IRQS: 16, nr_irqs: 16, preallocated irqs: 16
> [    0.000000] L2C-310 errata 752271 769419 enabled
> [    0.000000] L2C-310 enabling early BRESP for Cortex-A9
> [    0.000000] L2C-310 full line of zeros enabled for Cortex-A9
> [    0.000000] L2C-310 ID prefetch enabled, offset 16 lines
> [    0.000000] L2C-310 dynamic clock gating enabled, standby mode enabled
> [    0.000000] L2C-310 cache controller enabled, 16 ways, 1024 kB
> [    0.000000] L2C-310: CACHE_ID 0x410000c7, AUX_CTRL 0x76470001
> [    0.000000] rcu: srcu_init: Setting srcu_struct sizes based on
> contention. [    0.000000] Switching to timer-based delay loop, resolution
> 333ns [    0.000001] sched_clock: 32 bits at 3000kHz, resolution 333ns,
> wraps every 715827882 841ns
> [    0.000018] clocksource: mxc_timer1: mask: 0xffffffff max_cycles:
> 0xffffffff, max_id le_ns: 637086815595 ns
> [    0.001530] Console: colour dummy device 80x30
> [    0.001571] Calibrating delay loop (skipped), value calculated using
> timer frequency .. 6.00 BogoMIPS (lpj=30000)
> [    0.001587] CPU: Testing write buffer coherency: ok
> [    0.001625] CPU0: Spectre v2: using BPIALL workaround
> [    0.001633] pid_max: default: 32768 minimum: 301
> [    0.001764] Mount-cache hash table entries: 2048 (order: 1, 8192 bytes,
> linear) [    0.001787] Mountpoint-cache hash table entries: 2048 (order: 1,
> 8192 bytes, linear) [    0.002589] CPU0: thread -1, cpu 0, socket 0, mpidr
> 80000000
> [    0.003612] RCU Tasks Trace: Setting shift to 2 and lim to 1
> rcu_task_cb_adjust=1. [    0.003775] Setting up static identity map for
> 0x10100000 - 0x10100078 [    0.003964] rcu: Hierarchical SRCU
> implementation.
> [    0.003970] rcu:     Max phase no-delay instances is 1000.
> [    0.005041] smp: Bringing up secondary CPUs ...
> [    0.005965] CPU1: thread -1, cpu 1, socket 0, mpidr 80000001
> [    0.005983] CPU1: Spectre v2: using BPIALL workaround
> [    0.007009] CPU2: thread -1, cpu 2, socket 0, mpidr 80000002
> [    0.007025] CPU2: Spectre v2: using BPIALL workaround
> [    0.008023] CPU3: thread -1, cpu 3, socket 0, mpidr 80000003
> [    0.008040] CPU3: Spectre v2: using BPIALL workaround
> [    0.008149] smp: Brought up 1 node, 4 CPUs
> [    0.008161] SMP: Total of 4 processors activated (24.00 BogoMIPS).
> [    0.008171] CPU: All CPU(s) started in SVC mode.
> [    0.008688] devtmpfs: initialized
> [    0.017370] VFP support v0.3: implementor 41 architecture 3 part 30
> variant 9 rev 4 [    0.017636] clocksource: jiffies: mask: 0xffffffff
> max_cycles: 0xffffffff, max_idle_ ns: 19112604462750000 ns
> [    0.017660] futex hash table entries: 1024 (order: 4, 65536 bytes,
> linear) [    0.025946] pinctrl core: initialized pinctrl subsystem
> [    0.027948] NET: Registered PF_NETLINK/PF_ROUTE protocol family
> [    0.035191] DMA: preallocated 256 KiB pool for atomic coherent
> allocations [    0.036387] thermal_sys: Registered thermal governor
> 'step_wise' [    0.036449] cpuidle: using governor menu
> [    0.036568] CPU identified as i.MX6Q, silicon rev 1.5
> [    0.042593] platform soc: Fixed dependency cycle(s) with
> /soc/aips-bus@02000000/gpc@ 020dc000
> [    0.057409] platform 2400000.ipu: Fixed dependency cycle(s) with
> /soc/aips-bus@02000 000/ldb/lvds-channel@0/port@1/endpoint
> [    0.057451] platform 2400000.ipu: Fixed dependency cycle(s) with
> /soc/hdmi@0120000/p ort@1/endpoint
> [    0.057476] platform 2400000.ipu: Fixed dependency cycle(s) with
> /soc/aips-bus@02000 000/ldb/lvds-channel@0/port@0/endpoint
> [    0.057507] platform 2400000.ipu: Fixed dependency cycle(s) with
> /soc/hdmi@0120000/p ort@0/endpoint
> [    0.057530] platform 2400000.ipu: Fixed dependency cycle(s) with
> /soc/aips-bus@02000 000/iomuxc-gpr@020e0000/ipu1_csi0_mux/port@2/endpoint
> [    0.058449] platform 2800000.ipu: Fixed dependency cycle(s) with
> /soc/aips-bus@02000 000/ldb/lvds-channel@0/port@3/endpoint
> [    0.058510] platform 2800000.ipu: Fixed dependency cycle(s) with
> /soc/hdmi@0120000/p ort@3/endpoint
> [    0.058558] platform 2800000.ipu: Fixed dependency cycle(s) with
> /soc/aips-bus@02000 000/ldb/lvds-channel@0/port@2/endpoint
> [    0.058611] platform 2800000.ipu: Fixed dependency cycle(s) with
> /soc/hdmi@0120000/p ort@2/endpoint
> [    0.058633] platform 2800000.ipu: Fixed dependency cycle(s) with
> /soc/aips-bus@02000 000/iomuxc-gpr@020e0000/ipu2_csi1_mux/port@2/endpoint
> [    0.061550] No ATAGs?
> [    0.061690] hw-breakpoint: found 5 (+1 reserved) breakpoint and 1
> watchpoint registe rs.
> [    0.061702] hw-breakpoint: maximum watchpoint size is 4 bytes.
> [    0.063055] imx6q-pinctrl 20e0000.iomuxc: initialized IMX pinctrl driver
> [    0.067250] kprobes: kprobe jump-optimization is enabled. All kprobes are
> optimized if possible.
> [    0.068787] gpio gpiochip0: Static allocation of GPIO base is deprecated,
> use dynami c allocation.
> [    0.070727] gpio gpiochip1: Static allocation of GPIO base is deprecated,
> use dynami c allocation.
> [    0.072435] gpio gpiochip2: Static allocation of GPIO base is deprecated,
> use dynami c allocation.
> [    0.074182] gpio gpiochip3: Static allocation of GPIO base is deprecated,
> use dynami c allocation.
> [    0.075910] gpio gpiochip4: Static allocation of GPIO base is deprecated,
> use dynami c allocation.
> [    0.077677] gpio gpiochip5: Static allocation of GPIO base is deprecated,
> use dynami c allocation.
> [    0.079441] gpio gpiochip6: Static allocation of GPIO base is deprecated,
> use dynami c allocation.
> [    0.083631] SCSI subsystem initialized
> [    0.084119] usbcore: registered new interface driver usbfs
> [    0.084160] usbcore: registered new interface driver hub
> [    0.084205] usbcore: registered new device driver usb
> [    0.086993] pca953x 0-0020: supply vcc not found, using dummy regulator
> [    0.087157] pca953x 0-0020: using no AI
> [    0.100030] pca953x 0-0020: interrupt support not compiled in
> [    0.120764] pca953x 0-0021: supply vcc not found, using dummy regulator
> [    0.120904] pca953x 0-0021: using no AI
> [    0.160606] pca953x 0-0022: supply vcc not found, using dummy regulator
> [    0.160732] pca953x 0-0022: using no AI
> [    0.200704] i2c i2c-0: IMX I2C adapter registered
> [    0.201717] i2c i2c-1: IMX I2C adapter registered
> [    0.201937] mc: Linux media interface: v0.10
> [    0.202003] videodev: Linux video capture interface: v2.00
> [    0.202117] pps_core: LinuxPPS API ver. 1 registered
> [    0.202124] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo
> Giometti <gi ometti@linux.it>
> [    0.202151] PTP clock support registered
> [    0.202996] Advanced Linux Sound Architecture Driver Initialized.
> [    0.203875] Bluetooth: Core ver 2.22
> [    0.203918] NET: Registered PF_BLUETOOTH protocol family
> [    0.203924] Bluetooth: HCI device and connection manager initialized
> [    0.203937] Bluetooth: HCI socket layer initialized
> [    0.203946] Bluetooth: L2CAP socket layer initialized
> [    0.203965] Bluetooth: SCO socket layer initialized
> [    0.204470] vgaarb: loaded
> [    0.204864] clocksource: Switched to clocksource mxc_timer1
> [    0.205119] VFS: Disk quotas dquot_6.6.0
> [    0.205178] VFS: Dquot-cache hash table entries: 1024 (order 0, 4096
> bytes) [    0.215795] NET: Registered PF_INET protocol family
> [    0.216223] IP idents hash table entries: 16384 (order: 5, 131072 bytes,
> linear) [    0.218261] tcp_listen_portaddr_hash hash table entries: 512
> (order: 0, 4096 bytes, linear)
> [    0.218290] Table-perturb hash table entries: 65536 (order: 6, 262144
> bytes, linear) [    0.218304] TCP established hash table entries: 8192
> (order: 3, 32768 bytes, linear) [    0.218393] TCP bind hash table entries:
> 8192 (order: 5, 131072 bytes, linear) [    0.218680] TCP: Hash tables
> configured (established 8192 bind 8192) [    0.218891] UDP hash table
> entries: 512 (order: 2, 16384 bytes, linear) [    0.218944] UDP-Lite hash
> table entries: 512 (order: 2, 16384 bytes, linear) [    0.219141] NET:
> Registered PF_UNIX/PF_LOCAL protocol family
> [    0.219687] RPC: Registered named UNIX socket transport module.
> [    0.219697] RPC: Registered udp transport module.
> [    0.219703] RPC: Registered tcp transport module.
> [    0.219708] RPC: Registered tcp-with-tls transport module.
> [    0.219713] RPC: Registered tcp NFSv4.1 backchannel transport module.
> [    0.221148] PCI: CLS 0 bytes, default 64
> [    0.221977] armv7-pmu soc:pmu: hw perfevents: no interrupt-affinity
> property, guessi ng.
> [    0.222197] hw perfevents: enabled with armv7_cortex_a9 PMU driver, 7
> counters avail able
> [    0.223895] Initialise system trusted keyrings
> [    0.224175] workingset: timestamp_bits=30 max_order=18 bucket_order=0
> [    0.224818] NFS: Registering the id_resolver key type
> [    0.224899] Key type id_resolver registered
> [    0.224907] Key type id_legacy registered
> [    0.224937] nfs4filelayout_init: NFSv4 File Layout Driver Registering...
> [    0.224946] nfs4flexfilelayout_init: NFSv4 Flexfile Layout Driver
> Registering... [    0.224982] jffs2: version 2.2. (NAND) © 2001-2006 Red
> Hat, Inc. [    0.225263] fuse: init (API version 7.38)
> [    0.366002] Key type asymmetric registered
> [    0.366012] Asymmetric key parser 'x509' registered
> [    0.366111] bounce: pool size: 64 pages
> [    0.366141] io scheduler mq-deadline registered
> [    0.366149] io scheduler kyber registered
> [    0.366177] io scheduler bfq registered
> [    0.374797] mxs-dma 110000.dma-apbh: initialized
> [    0.380411] 21e8000.serial: ttymxc1 at MMIO 0x21e8000 (irq = 267,
> base_baud = 500000 0) is a IMX
> [    0.380466] pfuze100-regulator 0-0008: Full layer: 2, Metal layer: 1
> [    0.380517] printk: console [ttymxc1] enabled
> [    0.420856] pfuze100-regulator 0-0008: FAB: 0, FIN: 0
> [    0.426703] 21f0000.serial: ttymxc3 at MMIO 0x21f0000 (irq = 268,
> base_baud = 500000 0) is a IMX
> [    0.429991] pfuze100-regulator 0-0008: pfuze100 found.
> [    1.415972] dwhdmi-imx 120000.hdmi: Detected HDMI TX controller v1.30a
> with HDCP (DW C HDMI 3D TX PHY)
> [    1.438220] etnaviv etnaviv: bound 130000.gpu (ops 0xc0eaa4c0)
> [    1.444322] etnaviv etnaviv: bound 134000.gpu (ops 0xc0eaa4c0)
> [    1.450446] etnaviv etnaviv: bound 2204000.gpu (ops 0xc0eaa4c0)
> [    1.456412] etnaviv-gpu 130000.gpu: model: GC2000, revision: 5108
> [    1.462736] etnaviv-gpu 134000.gpu: model: GC320, revision: 5007
> [    1.468843] etnaviv-gpu 2204000.gpu: model: GC355, revision: 1215
> [    1.474978] etnaviv-gpu 2204000.gpu: Ignoring GPU with VG and FE2.0
> [    1.481750] [drm] Initialized etnaviv 1.3.0 20151214 for etnaviv on minor
> 0 [    1.490567] imx-ipuv3 2400000.ipu: IPUv3H probed
> [    1.497417] imx-drm display-subsystem: bound imx-ipuv3-crtc.2 (ops
> 0xc0e990bc) [    1.504784] imx-drm display-subsystem: bound
> imx-ipuv3-crtc.3 (ops 0xc0e990bc) [    1.512221] imx-drm display-subsystem:
> bound imx-ipuv3-crtc.6 (ops 0xc0e990bc) [    1.519601] imx-drm
> display-subsystem: bound imx-ipuv3-crtc.7 (ops 0xc0e990bc) [    1.526929]
> imx-drm display-subsystem: bound 120000.hdmi (ops 0xc0e99bb0) [   
> 1.533801] imx-drm display-subsystem: bound 2000000.aips-bus:ldb (ops
> 0xc0e9985c) [    1.542009] [drm] Initialized imx-drm 1.0.0 20120507 for
> display-subsystem on minor 1
> [    1.617102] Console: switching to colour frame buffer device 128x48
> [    1.639863] imx-drm display-subsystem: [drm] fb0: imx-drmdrmfb frame
> buffer device [    1.647569] imx-ipuv3 2800000.ipu: IPUv3H probed
> [    1.664218] brd: module loaded
> [    1.675094] loop: module loaded
> [    1.678559] at24 0-0050: supply vcc not found, using dummy regulator
> [    1.686264] at24 0-0050: 8192 byte 24c64 EEPROM, writable, 32 bytes/write
> [    1.693254] at24 0-005e: supply vcc not found, using dummy regulator [  
>  1.700649] at24 0-005e: 6 byte 24mac402 EEPROM, read-only
> [    1.707334] ahci-imx 2200000.sata: fsl,transmit-level-mV not specified,
> using 000000 24
> [    1.715309] ahci-imx 2200000.sata: fsl,transmit-boost-mdB not specified,
> using 00000 480
> [    1.723325] ahci-imx 2200000.sata: fsl,transmit-atten-16ths not
> specified, using 000 02000
> [    1.731527] ahci-imx 2200000.sata: fsl,receive-eq-mdB not specified,
> using 05000000 [    1.739323] ahci-imx 2200000.sata: supply ahci not found,
> using dummy regulator [    1.746879] ahci-imx 2200000.sata: supply phy not
> found, using dummy regulator [    1.754171] ahci-imx 2200000.sata: supply
> target not found, using dummy regulator [    1.765103] ahci-imx
> 2200000.sata: SSS flag set, parallel bus scan disabled [    1.772093]
> ahci-imx 2200000.sata: AHCI 0001.0300 32 slots 1 ports 3 Gbps 0x1 impl p
> latform mode
> [    1.780930] ahci-imx 2200000.sata: flags: ncq sntf stag pm led clo only
> pmp pio slum part ccc apst
> [    1.791631] scsi host0: ahci-imx
> [    1.795182] ata1: SATA max UDMA/133 mmio [mem 0x02200000-0x02203fff] port
> 0x100 irq 281
> [    1.807887] CAN device driver interface
> [    1.815291] pps pps0: new PPS source ptp0
> [    1.824500] fec 2188000.ethernet eth0: registered PHC device 0
> [    1.830997] usbcore: registered new device driver r8152-cfgselector
> [    1.837331] usbcore: registered new interface driver r8152
> [    1.842875] usbcore: registered new interface driver lan78xx
> [    1.848605] usbcore: registered new interface driver asix
> [    1.854042] usbcore: registered new interface driver ax88179_178a
> [    1.860192] usbcore: registered new interface driver cdc_ether
> [    1.866080] usbcore: registered new interface driver smsc95xx
> [    1.871862] usbcore: registered new interface driver net1080
> [    1.877579] usbcore: registered new interface driver cdc_subset
> [    1.883536] usbcore: registered new interface driver zaurus
> [    1.889163] usbcore: registered new interface driver MOSCHIP usb-ethernet
> driver [    1.896643] usbcore: registered new interface driver cdc_ncm
> [    1.902339] usbcore: registered new interface driver r8153_ecm
> [    1.908265] usbcore: registered new interface driver usb-storage
> [    1.915845] imx_usb 2184000.usb: No over current polarity defined
> [    1.925481] ci_hdrc ci_hdrc.0: EHCI Host Controller
> [    1.930395] ci_hdrc ci_hdrc.0: new USB bus registered, assigned bus
> number 1 [    1.964891] ci_hdrc ci_hdrc.0: USB 2.0 started, EHCI 1.00
> [    1.970466] usb usb1: New USB device found, idVendor=1d6b,
> idProduct=0002, bcdDevice = 6.05
> [    1.978770] usb usb1: New USB device strings: Mfr=3, Product=2,
> SerialNumber=1 [    1.986022] usb usb1: Product: EHCI Host Controller
> [    1.990909] usb usb1: Manufacturer: Linux 6.5.0 ehci_hcd
> [    1.996243] usb usb1: SerialNumber: ci_hdrc.0
> [    2.001210] hub 1-0:1.0: USB hub found
> [    2.005040] hub 1-0:1.0: 1 port detected
> [    2.013077] ci_hdrc ci_hdrc.1: EHCI Host Controller
> [    2.018008] ci_hdrc ci_hdrc.1: new USB bus registered, assigned bus
> number 2 [    2.054882] ci_hdrc ci_hdrc.1: USB 2.0 started, EHCI 1.00
> [    2.060434] usb usb2: New USB device found, idVendor=1d6b,
> idProduct=0002, bcdDevice = 6.05
> [    2.068736] usb usb2: New USB device strings: Mfr=3, Product=2,
> SerialNumber=1 [    2.075993] usb usb2: Product: EHCI Host Controller
> [    2.080880] usb usb2: Manufacturer: Linux 6.5.0 ehci_hcd
> [    2.086215] usb usb2: SerialNumber: ci_hdrc.1
> [    2.091157] hub 2-0:1.0: USB hub found
> [    2.094973] hub 2-0:1.0: 1 port detected
> [    2.100576] SPI driver ads7846 has no spi_device_id for ti,tsc2046
> [    2.106788] SPI driver ads7846 has no spi_device_id for ti,ads7843
> [    2.112974] SPI driver ads7846 has no spi_device_id for ti,ads7845
> [    2.119173] SPI driver ads7846 has no spi_device_id for ti,ads7873
> [    2.126405] ata1: SATA link down (SStatus 0 SControl 300)
> [    2.128977] rtc-ds1307 0-0068: SET TIME!
> [    2.131859] ahci-imx 2200000.sata: no device found, disabling link.
> [    2.138861] rtc-ds1307 0-0068: registered as rtc0
> [    2.142031] ahci-imx 2200000.sata: pass ahci_imx..hotplug=1 to enable
> hotplug [    2.148304] rtc-ds1307 0-0068: setting system clock to
> 2000-01-01T00:00:16 UTC (9466 84816)
> [    2.164313] snvs_rtc 20cc000.snvs:snvs-rtc-lp: registered as rtc1
> [    2.170598] i2c_dev: i2c /dev entries driver
> [    2.179371] Bluetooth: HCI UART driver ver 2.3
> [    2.183827] Bluetooth: HCI UART protocol H4 registered
> [    2.189030] Bluetooth: HCI UART protocol LL registered
> [    2.195090] sdhci: Secure Digital Host Controller Interface driver
> [    2.201278] sdhci: Copyright(c) Pierre Ossman
> [    2.205665] sdhci-pltfm: SDHCI platform and OF driver helper
> [    2.212740] sdhci-esdhc-imx 2194000.usdhc: Got CD GPIO
> [    2.213147] caam 2100000.caam: Entropy delay = 3200
> [    2.217980] sdhci-esdhc-imx 2194000.usdhc: Got WP GPIO
> [    2.235314] caam 2100000.caam: Instantiated RNG4 SH0
> [    2.247792] caam 2100000.caam: Instantiated RNG4 SH1
> [    2.252993] caam 2100000.caam: device ID = 0x0a16010000000000 (Era 4)
> [    2.254409] mmc0: SDHCI controller on 2198000.usdhc [2198000.usdhc] using
> ADMA [    2.259488] caam 2100000.caam: job rings = 2, qi = 0
> [    2.274830] caam algorithms registered in /proc/crypto
> [    2.280160] caam 2100000.caam: registering rng-caam
> [    2.284350] mmc1: SDHCI controller on 2194000.usdhc [2194000.usdhc] using
> ADMA [    2.285314] caam 2100000.caam: rng crypto API alg registered
> prng-caam [    2.297630] random: crng init done
> [    2.303285] usbcore: registered new interface driver usbhid
> [    2.308911] usbhid: USB HID core driver
> [    2.313588] imx-ipuv3-csi imx-ipuv3-csi.0: Registered ipu1_csi0 capture
> as /dev/vide o0
> [    2.320064] mmc0: new DDR MMC card at address 0001
> [    2.321564] usb 1-1: new high-speed USB device number 2 using ci_hdrc
> [    2.326456] imx-ipuv3 2400000.ipu: Registered ipu1_ic_prpenc capture as
> /dev/video1 [    2.333527] mmcblk0: mmc0:0001 Q2J54A 3.64 GiB
> [    2.340788] mmc1: new high speed SDHC card at address 1234
> [    2.344970] imx-ipuv3 2400000.ipu: Registered ipu1_ic_prpvf capture as
> /dev/video2 [    2.345512] imx-ipuv3-csi imx-ipuv3-csi.1: Registered
> ipu1_csi1 capture as /dev/vide o3
> [    2.351359] mmcblk1: mmc1:1234 SA32G 28.8 GiB
> [    2.358940] imx-ipuv3-csi imx-ipuv3-csi.4: Registered ipu2_csi0 capture
> as /dev/vide o4
> [    2.359085] mmcblk0boot0: mmc0:0001 Q2J54A 2.00 MiB
> [    2.361287] mmcblk0boot1: mmc0:0001 Q2J54A 2.00 MiB
> [    2.363107] mmcblk0rpmb: mmc0:0001 Q2J54A 512 KiB, chardev (243:0)
> [    2.394649]  mmcblk1: p1 p2 p3
> [    2.394924] usb 2-1: new high-speed USB device number 2 using ci_hdrc
> [    2.394948] imx-ipuv3 2800000.ipu: Registered ipu2_ic_prpenc capture as
> /dev/video5 [    2.397988] imx-ipuv3 2800000.ipu: Registered ipu2_ic_prpvf
> capture as /dev/video6 [    2.419931] imx-ipuv3-csi imx-ipuv3-csi.5:
> Registered ipu2_csi1 capture as /dev/vide o7
> [    2.436511] NET: Registered PF_INET6 protocol family
> [    2.442621] Segment Routing with IPv6
> [    2.446385] In-situ OAM (IOAM) with IPv6
> [    2.450396] sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
> [    2.457018] NET: Registered PF_PACKET protocol family
> [    2.462109] can: controller area network core
> [    2.466561] NET: Registered PF_CAN protocol family
> [    2.471387] can: raw protocol
> [    2.474365] can: broadcast manager protocol
> [    2.478607] can: netlink gateway - max_hops=1
> [    2.483072] Key type dns_resolver registered
> [    2.489383] Registering SWP/SWPB emulation handler
> [    2.504238] Loading compiled-in X.509 certificates
> [    2.539840] video-mux 20e0000.iomuxc-gpr:ipu1_csi0_mux: Consider updating
> driver vid eo-mux to match on endpoints
> [    2.550967] video-mux 20e0000.iomuxc-gpr:ipu2_csi1_mux: Consider updating
> driver vid eo-mux to match on endpoints
> [    2.555909] usb 1-1: New USB device found, idVendor=0bda, idProduct=8179,
> bcdDevice= 0.00
> [    2.563328] imx-media: Registered ipu_ic_pp csc/scaler as /dev/video8
> [    2.569363] usb 1-1: New USB device strings: Mfr=1, Product=2,
> SerialNumber=3 [    2.576973] imx_thermal 2000000.aips-bus:tempmon:
> Industrial CPU temperature grade - max:105C critical:100C passive:95C
> [    2.582980] usb 1-1: Product: 802.11n NIC
> [    2.597803] usb 1-1: Manufacturer: Realtek
> [    2.601909] usb 1-1: SerialNumber: 00E04C0001
> [    2.610341] cfg80211: Loading compiled-in X.509 certificates for
> regulatory database [    2.621866] Loaded X.509 cert 'sforshee:
> 00b28ddf47aef9cea7'
> [    2.627617] clk: Disabling unused clocks
> [    2.631928] platform regulatory.0: Direct firmware load for regulatory.db
> failed wit h error -2
> [    2.634899] ALSA device list:
> [    2.640606] cfg80211: failed to load regulatory.db
> [    2.643524]   No soundcards found.
> [    2.648408] usb 2-1: New USB device found, idVendor=0424, idProduct=2517,
> bcdDevice= 0.02
> [    2.660001] usb 2-1: New USB device strings: Mfr=0, Product=0,
> SerialNumber=0 [    2.667658] hub 2-1:1.0: USB hub found
> [    2.671598] hub 2-1:1.0: 7 ports detected
> [    2.700941] EXT4-fs (mmcblk1p2): INFO: recovery required on readonly
> filesystem [    2.708329] EXT4-fs (mmcblk1p2): write access will be enabled
> during recovery [    2.994978] usb 2-1.1: new high-speed USB device number
> 3 using ci_hdrc [    3.145639] usb 2-1.1: New USB device found,
> idVendor=0424, idProduct=9e00, bcdDevic e= 3.00
> [    3.154030] usb 2-1.1: New USB device strings: Mfr=0, Product=0,
> SerialNumber=0 [    3.164547] smsc95xx v2.0.0
> [    3.289135] SMSC LAN8710/LAN8720 usb-002:003:01: attached PHY driver
> (mii_bus:phy_ad dr=usb-002:003:01, irq=294)
> [    3.300938] smsc95xx 2-1.1:1.0 eth1: register 'smsc95xx' at
> usb-ci_hdrc.1-1.1, smsc9 5xx USB 2.0 Ethernet, f2:f7:83:3c:d3:e8
> [    3.766601] EXT4-fs (mmcblk1p2): recovery complete
> [    4.017039] EXT4-fs (mmcblk1p2): mounted filesystem
> 1c93b4dc-44a6-4b43-93b0-ce3b0bbd 0391 ro with ordered data mode. Quota
> mode: none.
> [    4.029221] VFS: Mounted root (ext4 filesystem) readonly on device
> 179:10. [    4.037240] devtmpfs: mounted
> [    4.042698] Freeing unused kernel image (initmem) memory: 1024K
> [    4.049122] Run /sbin/init as init process
> [    4.330114] EXT4-fs (mmcblk1p2): re-mounted
> 1c93b4dc-44a6-4b43-93b0-ce3b0bbd0391 r/w . Quota mode: none.
> Starting psplash: OK
> Starting syslogd: OK
> Starting klogd: OK
> Running sysctl: OK
> Populating /dev using udev:
> [    4.650852] udevd[134]: starting version 3.2.9
> [    4.692906] udevd[135]: starting eudev-3.2.9
> done
> Starting watchdog...
> Initializing random number generator: OK
> Saving random seed: OK
> Starting usbguard daemon: OK
> Starting rngd: OK
> Starting system message bus: done
> Starting network:
> [    6.261676] Micrel KSZ9031 Gigabit PHY 2188000.ethernet-1:03: attached PHY driver (mii_bus:phy_addr=2188000.ethernet-1:03, irq=56) OK
> Starting chrony: OK
> Starting php-fpm  done
> Starting nginx...
> Starting sshd: OK
> Touchscreen Firmware
> Tool version:   v0.29_20170705
> APILIB version: v1.0.62.0705
> Try to start Stephanie 5 GUI
> login:
> [    8.500637] fec 2188000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
> [    8.533754] fec 2188000.ethernet eth0: Link is Down
> [   11.147566] fec 2188000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
> [   12.646102] platform 2008000.ecspi: deferred probe pending
> root
> Password:
> #
> # ip link set dev eth0 up
> # ip addr add 192.168.1.2/24 dev eth0
> ip: RTNETLINK answers: File exists
> # iperf3 -c 192.168.1.1


-- 
TQ-Systems GmbH | Mühlstraße 2, Gut Delling | 82229 Seefeld, Germany
Amtsgericht München, HRB 105018
Geschäftsführer: Detlef Schneider, Rüdiger Stahl, Stefan Schneider
http://www.tq-group.com/



^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Ethernet issue on imx6
  2023-10-13  8:40   ` Miquel Raynal
  2023-10-13 10:16     ` Wei Fang
@ 2023-10-16 11:49     ` Eric Dumazet
  2023-10-16 13:58       ` Miquel Raynal
  1 sibling, 1 reply; 26+ messages in thread
From: Eric Dumazet @ 2023-10-16 11:49 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Russell King (Oracle), Wei Fang, Shenwei Wang, Clark Wang, davem,
	kuba, pabeni, linux-imx, netdev, Thomas Petazzoni,
	Alexandre Belloni, Maxime Chevallier, Andrew Lunn,
	Stephen Hemminger

On Fri, Oct 13, 2023 at 10:40 AM Miquel Raynal
<miquel.raynal@bootlin.com> wrote:
>
> Hi Russell,
>
> linux@armlinux.org.uk wrote on Thu, 12 Oct 2023 20:39:11 +0100:
>
> > On Thu, Oct 12, 2023 at 07:34:10PM +0200, Miquel Raynal wrote:
> > > Hello,
> > >
> > > I've been scratching my foreheads for weeks on a strange imx6
> > > network issue, I need help to go further, as I feel a bit clueless now.
> > >
> > > Here is my setup :
> > > - Custom imx6q board
> > > - Bootloader: U-Boot 2017.11 (also tried with a 2016.03)
> > > - Kernel : 4.14(.69,.146,.322), v5.10 and v6.5 with the same behavior
> > > - The MAC (fec driver) is connected to a Micrel 9031 PHY
> > > - The PHY is connected to the link partner through an industrial cable
> >
> > "industrial cable" ?
>
> It is a "unique" hardware cable, the four Ethernet pairs are foiled
> twisted pair each and the whole cable is shielded. Additionally there
> is the 24V power supply coming from this cable. The connector is from
> ODU S22LOC-P16MCD0-920S. The structure of the cable should be similar
> to a CAT7 cable with the additional power supply line.
>
> > > - Testing 100BASE-T (link is stable)
> >
> > Would that be full or half duplex?
>
> Ah, yeah, sorry for forgetting this detail, it's full duplex.
>
> > > The RGMII-ID timings are probably not totally optimal but offer
> > > rather good performance. In UDP with iperf3:
> > > * Downlink (host to the board) runs at full speed with 0% drop
> > > * Uplink (board to host) runs at full speed with <1% drop
> > >
> > > However, if I ever try to limit the bandwidth in uplink (only), the
> > > drop rate rises significantly, up to 30%:
> > >
> > > //192.168.1.1 is my host, so the below lines are from the board:
> > > # iperf3 -c 192.168.1.1 -u -b100M
> > > [  5]   0.00-10.05  sec   113 MBytes  94.6 Mbits/sec  0.044 ms  467/82603 (0.57%)  receiver
> > > # iperf3 -c 192.168.1.1 -u -b90M
> > > [  5]   0.00-10.04  sec  90.5 MBytes  75.6 Mbits/sec  0.146 ms  12163/77688 (16%)  receiver
> > > # iperf3 -c 192.168.1.1 -u -b80M
> > > [  5]   0.00-10.05  sec  66.4 MBytes  55.5 Mbits/sec  0.162 ms  20937/69055 (30%)  receiver
> >
> > My setup:
> >
> > i.MX6DL silicon rev 1.3
> > Atheros AR8035 PHY
> > 6.3.0+ (no significant changes to fec_main.c)
> > Link, being BASE-T, is standard RJ45.
> >
> > Connectivity is via a bridge device (sorry, can't change that as it
> > would be too disruptive, as this is my Internet router!)
> >
> > Running at 1000BASE-T (FD):
> > [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> > [  5]   0.00-10.01  sec   114 MBytes  95.4 Mbits/sec  0.030 ms  0/82363 (0%)  receiver
> > [  5]   0.00-10.00  sec   107 MBytes  90.0 Mbits/sec  0.103 ms  0/77691 (0%)  receiver
> > [  5]   0.00-10.00  sec  95.4 MBytes  80.0 Mbits/sec  0.101 ms  0/69060 (0%)  receiver
> >
> > Running at 100BASE-Tx (FD):
> > [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> > [  5]   0.00-10.01  sec   114 MBytes  95.4 Mbits/sec  0.008 ms  0/82436 (0%)  receiver
> > [  5]   0.00-10.00  sec   107 MBytes  90.0 Mbits/sec  0.088 ms  0/77692 (0%)  receiver
> > [  5]   0.00-10.00  sec  95.4 MBytes  80.0 Mbits/sec  0.108 ms  0/69058 (0%)  receiver
> >
> > Running at 100bASE-Tx (HD):
> > [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> > [  5]   0.00-10.01  sec   114 MBytes  95.3 Mbits/sec  0.056 ms  0/82304 (0%)  receiver
> > [  5]   0.00-10.00  sec   107 MBytes  90.0 Mbits/sec  0.101 ms  1/77691 (0.0013%)  receiver
> > [  5]   0.00-10.00  sec  95.4 MBytes  80.0 Mbits/sec  0.105 ms  0/69058 (0%)  receiver
> >
> > So I'm afraid I don't see your issue.
>
> I believe the issue cannot be at an higher level than the MAC. I also
> do not think the MAC driver and PHY driver are specifically buggy. I
> ruled out the hardware issue given the fact that under certain
> conditions (high load) the network works rather well... But I certainly
> see this issue, and when switching to TCP the results are dramatic:
>
> # iperf3 -c 192.168.1.1
> Connecting to host 192.168.1.1, port 5201
> [  5] local 192.168.1.2 port 37948 connected to 192.168.1.1 port 5201
> [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> [  5]   0.00-1.00   sec  11.3 MBytes  94.5 Mbits/sec   43   32.5 KBytes
> [  5]   1.00-2.00   sec  3.29 MBytes  27.6 Mbits/sec   26   1.41 KBytes
> [  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> [  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> [  5]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec    5   1.41 KBytes
> [  5]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> [  5]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> [  5]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> [  5]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> [  5]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
>
> Thanks,
> Miquèl

Can you experiment with:

- Disabling TSO on your NIC (ethtool -K eth0 tso off)
- Reducing max GSO size (ip link set dev eth0 gso_max_size 16384)

I suspect some kind of issue with fec TX completion vs. TSO emulation.
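
To confirm that the first setting took effect, the offload state can be
read back with "ethtool -k eth0". Below is a minimal sketch of a parser for
that output, assuming the usual "feature-name: on|off" line format. The
helper name parse_offloads and the SAMPLE text are illustrative only; the
sample was not captured on the board discussed in this thread.

```python
def parse_offloads(ethtool_k_output):
    """Parse the text printed by `ethtool -k <dev>` into {feature: bool}.

    Lines that do not look like "name: on" / "name: off" (for example the
    "Features for eth0:" header) are skipped.
    """
    features = {}
    for line in ethtool_k_output.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue
        words = value.split()
        # The state may be followed by a marker such as "[fixed]".
        if words and words[0] in ("on", "off"):
            features[name.strip()] = (words[0] == "on")
    return features


# Hypothetical sample, shaped like typical `ethtool -k` output.
SAMPLE = """Features for eth0:
tcp-segmentation-offload: off
    tx-tcp-segmentation: off
generic-segmentation-offload: on
scatter-gather: on [fixed]
"""

if __name__ == "__main__":
    state = parse_offloads(SAMPLE)
    print("TSO enabled:", state["tcp-segmentation-offload"])
```

On a real system the input would come from running the command and
capturing its standard output rather than from a literal string.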

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Ethernet issue on imx6
  2023-10-16  8:48       ` Alexander Stein
@ 2023-10-16 13:31         ` Miquel Raynal
  2023-10-16 14:41           ` Alexander Stein
  0 siblings, 1 reply; 26+ messages in thread
From: Miquel Raynal @ 2023-10-16 13:31 UTC (permalink / raw)
  To: Alexander Stein
  Cc: Stephen Hemminger, Andrew Lunn, Wei Fang, Shenwei Wang,
	Clark Wang, Russell King, davem, edumazet, kuba, pabeni,
	linux-imx, netdev, Thomas Petazzoni, Alexandre Belloni,
	Maxime Chevallier

Hi Alexander,

Thanks a lot for your feedback.

> > switch to partitions #0, OK
> > mmc1 is current device
> > reading boot.scr
> > 444 bytes read in 10 ms (43 KiB/s)
> > ## Executing script at 20000000
> > Booting from mmc ...
> > reading zImage
> > 9160016 bytes read in 462 ms (18.9 MiB/s)
> > reading <board>.dtb  
> 
> Which device tree is that?
> 
> > 40052 bytes read in 22 ms (1.7 MiB/s)
> > boot device tree kernel ...
> > Kernel image @ 0x12000000 [ 0x000000 - 0x8bc550 ]
> > ## Flattened Device Tree blob at 18000000
> >    Booting using the fdt blob at 0x18000000
> >    Using Device Tree in place at 18000000, end 1800cc73
> > 
> > Starting kernel ...
> > 
> > [    0.000000] Booting Linux on physical CPU 0x0
> > [    0.000000] Linux version 6.5.0 (mraynal@xps-13) (arm-linux-gcc.br_real (Buildroot 2020.08-14-ge5a2a90) 10.2.0, GNU ld (GNU Binutils) 2.34) #120 SMP Thu Oct 12 18:10:20 CEST 2023
> > [    0.000000] CPU: ARMv7 Processor [412fc09a] revision 10 (ARMv7), cr=10c5387d
> > [    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
> > [    0.000000] OF: fdt: Machine model: TQ TQMa6Q on MBa6x
> 
> Your first mail mentions a custom board, but this indicates "TQMa6Q
> on MBa6x", so which is it?

It's a custom carrier board with a TQMA6Q-AA module.

> Please note that there are two different module variants, imx6qdl-tqma6a.dtsi 
> and imx6qdl-tqma6b.dtsi. They deal with i.MX6's ERR006687 differently.
> Package drop without any load somewhat indicates this issue.

I've tried with and without the fsl,err006687-workaround-present DT
property. It gets successfully parsed and I see the lower idle state
being disabled under mach-imx. I've also tried just commenting out the
registration of the cpuidle driver, just to be sure. I saw no
difference.

By the way, we tried with a TQ eval board with this SoM and saw the same
issue (not me, I don't have this board on hand). Don't you experience
something similar? I came across a couple of people reporting similar
issues with these modules but none of them reported how they fixed it
(if they did). I tried two different images based on TQ's GitHub using
v4.14.69 and v5.10 kernels.

Thanks,
Miquèl

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Ethernet issue on imx6
  2023-10-16 11:49     ` Eric Dumazet
@ 2023-10-16 13:58       ` Miquel Raynal
  2023-10-16 15:06         ` Eric Dumazet
  2023-10-16 15:36         ` Miquel Raynal
  0 siblings, 2 replies; 26+ messages in thread
From: Miquel Raynal @ 2023-10-16 13:58 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Russell King (Oracle), Wei Fang, Shenwei Wang, Clark Wang, davem,
	kuba, pabeni, linux-imx, netdev, Thomas Petazzoni,
	Alexandre Belloni, Maxime Chevallier, Andrew Lunn,
	Stephen Hemminger

Hi Eric,

edumazet@google.com wrote on Mon, 16 Oct 2023 13:49:25 +0200:

> On Fri, Oct 13, 2023 at 10:40 AM Miquel Raynal
> <miquel.raynal@bootlin.com> wrote:
> >
> > Hi Russell,
> >
> > linux@armlinux.org.uk wrote on Thu, 12 Oct 2023 20:39:11 +0100:
> >  
> > > On Thu, Oct 12, 2023 at 07:34:10PM +0200, Miquel Raynal wrote:  
> > > > Hello,
> > > >
> > > > I've been scratching my foreheads for weeks on a strange imx6
> > > > network issue, I need help to go further, as I feel a bit clueless now.
> > > >
> > > > Here is my setup :
> > > > - Custom imx6q board
> > > > - Bootloader: U-Boot 2017.11 (also tried with a 2016.03)
> > > > - Kernel : 4.14(.69,.146,.322), v5.10 and v6.5 with the same behavior
> > > > - The MAC (fec driver) is connected to a Micrel 9031 PHY
> > > > - The PHY is connected to the link partner through an industrial cable  
> > >
> > > "industrial cable" ?  
> >
> > It is a "unique" hardware cable, the four Ethernet pairs are foiled
> > twisted pair each and the whole cable is shielded. Additionally there
> > is the 24V power supply coming from this cable. The connector is from
> > ODU S22LOC-P16MCD0-920S. The structure of the cable should be similar
> > to a CAT7 cable with the additional power supply line.
> >  
> > > > - Testing 100BASE-T (link is stable)  
> > >
> > > Would that be full or half duplex?  
> >
> > Ah, yeah, sorry for forgetting this detail, it's full duplex.
> >  
> > > > The RGMII-ID timings are probably not totally optimal but offer
> > > > rather good performance. In UDP with iperf3:
> > > > * Downlink (host to the board) runs at full speed with 0% drop
> > > > * Uplink (board to host) runs at full speed with <1% drop
> > > >
> > > > However, if I ever try to limit the bandwidth in uplink (only), the
> > > > drop rate rises significantly, up to 30%:
> > > >
> > > > //192.168.1.1 is my host, so the below lines are from the board:
> > > > # iperf3 -c 192.168.1.1 -u -b100M
> > > > [  5]   0.00-10.05  sec   113 MBytes  94.6 Mbits/sec  0.044 ms  467/82603 (0.57%)  receiver
> > > > # iperf3 -c 192.168.1.1 -u -b90M
> > > > [  5]   0.00-10.04  sec  90.5 MBytes  75.6 Mbits/sec  0.146 ms  12163/77688 (16%)  receiver
> > > > # iperf3 -c 192.168.1.1 -u -b80M
> > > > [  5]   0.00-10.05  sec  66.4 MBytes  55.5 Mbits/sec  0.162 ms  20937/69055 (30%)  receiver
> > >
> > > My setup:
> > >
> > > i.MX6DL silicon rev 1.3
> > > Atheros AR8035 PHY
> > > 6.3.0+ (no significant changes to fec_main.c)
> > > Link, being BASE-T, is standard RJ45.
> > >
> > > Connectivity is via a bridge device (sorry, can't change that as it
> > > would be too disruptive, as this is my Internet router!)
> > >
> > > Running at 1000BASE-T (FD):
> > > [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> > > [  5]   0.00-10.01  sec   114 MBytes  95.4 Mbits/sec  0.030 ms  0/82363 (0%)  receiver
> > > [  5]   0.00-10.00  sec   107 MBytes  90.0 Mbits/sec  0.103 ms  0/77691 (0%)  receiver
> > > [  5]   0.00-10.00  sec  95.4 MBytes  80.0 Mbits/sec  0.101 ms  0/69060 (0%)  receiver
> > >
> > > Running at 100BASE-Tx (FD):
> > > [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> > > [  5]   0.00-10.01  sec   114 MBytes  95.4 Mbits/sec  0.008 ms  0/82436 (0%)  receiver
> > > [  5]   0.00-10.00  sec   107 MBytes  90.0 Mbits/sec  0.088 ms  0/77692 (0%)  receiver
> > > [  5]   0.00-10.00  sec  95.4 MBytes  80.0 Mbits/sec  0.108 ms  0/69058 (0%)  receiver
> > >
> > > Running at 100bASE-Tx (HD):
> > > [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> > > [  5]   0.00-10.01  sec   114 MBytes  95.3 Mbits/sec  0.056 ms  0/82304 (0%)  receiver
> > > [  5]   0.00-10.00  sec   107 MBytes  90.0 Mbits/sec  0.101 ms  1/77691 (0.0013%)  receiver
> > > [  5]   0.00-10.00  sec  95.4 MBytes  80.0 Mbits/sec  0.105 ms  0/69058 (0%)  receiver
> > >
> > > So I'm afraid I don't see your issue.  
> >
> > I believe the issue cannot be at an higher level than the MAC. I also
> > do not think the MAC driver and PHY driver are specifically buggy. I
> > ruled out the hardware issue given the fact that under certain
> > conditions (high load) the network works rather well... But I certainly
> > see this issue, and when switching to TCP the results are dramatic:
> >
> > # iperf3 -c 192.168.1.1
> > Connecting to host 192.168.1.1, port 5201
> > [  5] local 192.168.1.2 port 37948 connected to 192.168.1.1 port 5201
> > [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> > [  5]   0.00-1.00   sec  11.3 MBytes  94.5 Mbits/sec   43   32.5 KBytes
> > [  5]   1.00-2.00   sec  3.29 MBytes  27.6 Mbits/sec   26   1.41 KBytes
> > [  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > [  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> > [  5]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec    5   1.41 KBytes
> > [  5]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > [  5]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > [  5]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > [  5]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> > [  5]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> >
> > Thanks,
> > Miquèl  
> 
> Can you experiment with :
> 
> - Disabling TSO on your NIC (ethtool -K eth0 tso off)
> - Reducing max GSO size (ip link set dev eth0 gso_max_size 16384)
> 
> I suspect some kind of issues with fec TX completion, vs TSO emulation.

Wow, this appears to have a significant effect. I am using Busybox's iproute
implementation which does not know gso_max_size, so I hacked it directly
into netdevice.h just to see if it would have an effect. I'm adding
iproute2 to the image for further testing.

Here is the diff:

--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2364,7 +2364,7 @@ struct net_device {
 /* TCP minimal MSS is 8 (TCP_MIN_GSO_SIZE),
  * and shinfo->gso_segs is a 16bit field.
  */
-#define GSO_MAX_SIZE           (8 * GSO_MAX_SEGS)
+#define GSO_MAX_SIZE           16384u
 
        unsigned int            gso_max_size;
 #define TSO_LEGACY_MAX_SIZE    65536

And here are the results:

# ethtool -K eth0 tso off
# iperf3 -c 192.168.1.1 -u -b1M
Connecting to host 192.168.1.1, port 5201
[  5] local 192.168.1.2 port 50490 connected to 192.168.1.1 port 5201
[ ID] Interval           Transfer     Bitrate         Total Datagrams
[  5]   0.00-1.00   sec   123 KBytes  1.01 Mbits/sec  87  
[  5]   1.00-2.00   sec   122 KBytes   996 Kbits/sec  86  
[  5]   2.00-3.00   sec   122 KBytes   996 Kbits/sec  86  
[  5]   3.00-4.00   sec   123 KBytes  1.01 Mbits/sec  87  
[  5]   4.00-5.00   sec   122 KBytes   996 Kbits/sec  86  
[  5]   5.00-6.00   sec   122 KBytes   996 Kbits/sec  86  
[  5]   6.00-7.00   sec   123 KBytes  1.01 Mbits/sec  87  
[  5]   7.00-8.00   sec   122 KBytes   996 Kbits/sec  86  
[  5]   8.00-9.00   sec   122 KBytes   996 Kbits/sec  86  
[  5]   9.00-10.00  sec   123 KBytes  1.01 Mbits/sec  87  
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec  1.19 MBytes  1.00 Mbits/sec  0.000 ms  0/864 (0%)  sender
[  5]   0.00-10.05  sec  1.11 MBytes   925 Kbits/sec  0.045 ms  62/864 (7.2%)  receiver
iperf Done.
# iperf3 -c 192.168.1.1
Connecting to host 192.168.1.1, port 5201
[  5] local 192.168.1.2 port 34792 connected to 192.168.1.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  1.63 MBytes  13.7 Mbits/sec   30   1.41 KBytes       
[  5]   1.00-2.00   sec  7.40 MBytes  62.1 Mbits/sec   65   14.1 KBytes       
[  5]   2.00-3.00   sec  7.83 MBytes  65.7 Mbits/sec  109   2.83 KBytes       
[  5]   3.00-4.00   sec  2.49 MBytes  20.9 Mbits/sec   46   19.8 KBytes       
[  5]   4.00-5.00   sec  7.89 MBytes  66.2 Mbits/sec  109   2.83 KBytes       
[  5]   5.00-6.00   sec   255 KBytes  2.09 Mbits/sec   22   2.83 KBytes       
[  5]   6.00-7.00   sec  4.35 MBytes  36.5 Mbits/sec   74   41.0 KBytes       
[  5]   7.00-8.00   sec  10.9 MBytes  91.8 Mbits/sec   34   45.2 KBytes       
[  5]   8.00-9.00   sec  5.35 MBytes  44.9 Mbits/sec   82   1.41 KBytes       
[  5]   9.00-10.00  sec  1.37 MBytes  11.5 Mbits/sec   73   1.41 KBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  49.5 MBytes  41.5 Mbits/sec  644             sender
[  5]   0.00-10.05  sec  49.3 MBytes  41.1 Mbits/sec                  receiver
iperf Done.

There is still a noticeable amount of drop/retries, but overall the
results are significantly better. What is the rationale behind the
choice of 16384 in particular? Could this be further improved?
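
For reference, the UDP loss percentages quoted in this thread follow
directly from the Lost/Total datagram counts that iperf3 prints. Below is a
small sketch that recomputes them; the run labels and the helper name
loss_percent are only for illustration, while the counts are copied from
the outputs above. The runs use different target bitrates, so the
comparison is indicative only.

```python
def loss_percent(lost, total):
    """Return the datagram loss as a percentage of datagrams sent."""
    return 100.0 * lost / total


# (lost, total) pairs copied from the iperf3 receiver lines in this thread.
RUNS = {
    "initial report, -b100M": (467, 82603),
    "initial report, -b90M": (12163, 77688),
    "initial report, -b80M": (20937, 69055),
    "TSO off + GSO_MAX_SIZE 16384, -b1M": (62, 864),
}

if __name__ == "__main__":
    for label, (lost, total) in RUNS.items():
        print(f"{label}: {lost}/{total} = {loss_percent(lost, total):.2f}% lost")
```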

Thanks a lot,
Miquèl

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Ethernet issue on imx6
  2023-10-16 13:31         ` Miquel Raynal
@ 2023-10-16 14:41           ` Alexander Stein
  2023-10-17 10:49             ` Miquel Raynal
  0 siblings, 1 reply; 26+ messages in thread
From: Alexander Stein @ 2023-10-16 14:41 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Stephen Hemminger, Andrew Lunn, Wei Fang, Shenwei Wang,
	Clark Wang, Russell King, davem, edumazet, kuba, pabeni,
	linux-imx, netdev, Thomas Petazzoni, Alexandre Belloni,
	Maxime Chevallier

Hi Miquel,

Am Montag, 16. Oktober 2023, 15:31:54 CEST schrieb Miquel Raynal:
> Hi Alexander,
> 
> Thanks a lot for your feedback.
> 
> > > switch to partitions #0, OK
> > > mmc1 is current device
> > > reading boot.scr
> > > 444 bytes read in 10 ms (43 KiB/s)
> > > ## Executing script at 20000000
> > > Booting from mmc ...
> > > reading zImage
> > > 9160016 bytes read in 462 ms (18.9 MiB/s)
> > > reading <board>.dtb
> > 
> > Which device tree is that?
> > 
> > > 40052 bytes read in 22 ms (1.7 MiB/s)
> > > boot device tree kernel ...
> > > Kernel image @ 0x12000000 [ 0x000000 - 0x8bc550 ]
> > > ## Flattened Device Tree blob at 18000000
> > > 
> > >    Booting using the fdt blob at 0x18000000
> > >    Using Device Tree in place at 18000000, end 1800cc73
> > > 
> > > Starting kernel ...
> > > 
> > > [    0.000000] Booting Linux on physical CPU 0x0
> > > [    0.000000] Linux version 6.5.0 (mraynal@xps-13) (arm-linux-gcc.br_real (Buildroot 2020.08-14-ge5a2a90) 10.2.0, GNU ld (GNU Binutils) 2.34) #120 SMP Thu Oct 12 18:10:20 CEST 2023
> > > [    0.000000] CPU: ARMv7 Processor [412fc09a] revision 10 (ARMv7), cr=10c5387d
> > > [    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
> > > [    0.000000] OF: fdt: Machine model: TQ TQMa6Q on MBa6x
> > 
> > Your first mail mentions a custom board, but this indicates "TQMa6Q
> > on MBa6x", so which is it?
> 
> It's a custom carrier board with a TQMA6Q-AA module.

Could you please adjust the machine model to your mainboard if it is not an
MBa6x? Thanks.
Which HW revision is this module? It should be printed by U-Boot during startup.
Can you provide a full log?

> > Please note that there are two different module variants,
> > imx6qdl-tqma6a.dtsi and imx6qdl-tqma6b.dtsi. They deal with i.MX6's
> > ERR006687 differently. Package drop without any load somewhat indicates
> > this issue.
> 
> I've tried with and without the fsl,err006687-workaround-present DT
> property. It gets successfully parsed an I see the lower idle state
> being disabled under mach-imx. I've also tried just commenting out the
> registration of the cpuidle driver, just to be sure. I saw no
> difference.

fsl,err006687-workaround-present requires a specific HW workaround, see [1]. 
So this is not applicable on every module.

> By the way, we tried with a TQ eval board with this SoM and saw the same
> issue (not me, I don't have this board in hands). Don't you experience
> something similar? I went across a couple of people reporting similar
> issues with these modules but none of them reported how they fixed it
> (if they did). I tried two different images based on TQ's Github using
> v4.14.69 and v5.10 kernels.

Personally, this is the first time I've heard of this issue; I have never
noticed anything like it. Does it also appear when using TCP, or is it a
UDP-only issue?

Best regards,
Alexander

[1] https://github.com/tq-systems/linux-tqmaxx/blob/TQMa8-fslc-5.10-2.1.x-imx/arch/arm/boot/dts/imx6qdl-tqma6a.dtsi#L36-L48


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Ethernet issue on imx6
  2023-10-16 13:58       ` Miquel Raynal
@ 2023-10-16 15:06         ` Eric Dumazet
  2023-10-16 15:36         ` Miquel Raynal
  1 sibling, 0 replies; 26+ messages in thread
From: Eric Dumazet @ 2023-10-16 15:06 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Russell King (Oracle), Wei Fang, Shenwei Wang, Clark Wang, davem,
	kuba, pabeni, linux-imx, netdev, Thomas Petazzoni,
	Alexandre Belloni, Maxime Chevallier, Andrew Lunn,
	Stephen Hemminger

On Mon, Oct 16, 2023 at 3:59 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> Hi Eric,
>
> edumazet@google.com wrote on Mon, 16 Oct 2023 13:49:25 +0200:
>
> > On Fri, Oct 13, 2023 at 10:40 AM Miquel Raynal
> > <miquel.raynal@bootlin.com> wrote:
> > >
> > > Hi Russell,
> > >
> > > linux@armlinux.org.uk wrote on Thu, 12 Oct 2023 20:39:11 +0100:
> > >
> > > > On Thu, Oct 12, 2023 at 07:34:10PM +0200, Miquel Raynal wrote:
> > > > > Hello,
> > > > >
> > > > > I've been scratching my foreheads for weeks on a strange imx6
> > > > > network issue, I need help to go further, as I feel a bit clueless now.
> > > > >
> > > > > Here is my setup :
> > > > > - Custom imx6q board
> > > > > - Bootloader: U-Boot 2017.11 (also tried with a 2016.03)
> > > > > - Kernel : 4.14(.69,.146,.322), v5.10 and v6.5 with the same behavior
> > > > > - The MAC (fec driver) is connected to a Micrel 9031 PHY
> > > > > - The PHY is connected to the link partner through an industrial cable
> > > >
> > > > "industrial cable" ?
> > >
> > > It is a "unique" hardware cable, the four Ethernet pairs are foiled
> > > twisted pair each and the whole cable is shielded. Additionally there
> > > is the 24V power supply coming from this cable. The connector is from
> > > ODU S22LOC-P16MCD0-920S. The structure of the cable should be similar
> > > to a CAT7 cable with the additional power supply line.
> > >
> > > > > - Testing 100BASE-T (link is stable)
> > > >
> > > > Would that be full or half duplex?
> > >
> > > Ah, yeah, sorry for forgetting this detail, it's full duplex.
> > >
> > > > > The RGMII-ID timings are probably not totally optimal but offer
> > > > > rather good performance. In UDP with iperf3:
> > > > > * Downlink (host to the board) runs at full speed with 0% drop
> > > > > * Uplink (board to host) runs at full speed with <1% drop
> > > > >
> > > > > However, if I ever try to limit the bandwidth in uplink (only), the
> > > > > drop rate rises significantly, up to 30%:
> > > > >
> > > > > //192.168.1.1 is my host, so the below lines are from the board:
> > > > > # iperf3 -c 192.168.1.1 -u -b100M
> > > > > [  5]   0.00-10.05  sec   113 MBytes  94.6 Mbits/sec  0.044 ms  467/82603 (0.57%)  receiver
> > > > > # iperf3 -c 192.168.1.1 -u -b90M
> > > > > [  5]   0.00-10.04  sec  90.5 MBytes  75.6 Mbits/sec  0.146 ms  12163/77688 (16%)  receiver
> > > > > # iperf3 -c 192.168.1.1 -u -b80M
> > > > > [  5]   0.00-10.05  sec  66.4 MBytes  55.5 Mbits/sec  0.162 ms  20937/69055 (30%)  receiver
> > > >
> > > > My setup:
> > > >
> > > > i.MX6DL silicon rev 1.3
> > > > Atheros AR8035 PHY
> > > > 6.3.0+ (no significant changes to fec_main.c)
> > > > Link, being BASE-T, is standard RJ45.
> > > >
> > > > Connectivity is via a bridge device (sorry, can't change that as it
> > > > would be too disruptive, as this is my Internet router!)
> > > >
> > > > Running at 1000BASE-T (FD):
> > > > [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> > > > [  5]   0.00-10.01  sec   114 MBytes  95.4 Mbits/sec  0.030 ms  0/82363 (0%)  receiver
> > > > [  5]   0.00-10.00  sec   107 MBytes  90.0 Mbits/sec  0.103 ms  0/77691 (0%)  receiver
> > > > [  5]   0.00-10.00  sec  95.4 MBytes  80.0 Mbits/sec  0.101 ms  0/69060 (0%)  receiver
> > > >
> > > > Running at 100BASE-Tx (FD):
> > > > [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> > > > [  5]   0.00-10.01  sec   114 MBytes  95.4 Mbits/sec  0.008 ms  0/82436 (0%)  receiver
> > > > [  5]   0.00-10.00  sec   107 MBytes  90.0 Mbits/sec  0.088 ms  0/77692 (0%)  receiver
> > > > [  5]   0.00-10.00  sec  95.4 MBytes  80.0 Mbits/sec  0.108 ms  0/69058 (0%)  receiver
> > > >
> > > > Running at 100bASE-Tx (HD):
> > > > [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> > > > [  5]   0.00-10.01  sec   114 MBytes  95.3 Mbits/sec  0.056 ms  0/82304 (0%)  receiver
> > > > [  5]   0.00-10.00  sec   107 MBytes  90.0 Mbits/sec  0.101 ms  1/77691 (0.0013%)  receiver
> > > > [  5]   0.00-10.00  sec  95.4 MBytes  80.0 Mbits/sec  0.105 ms  0/69058 (0%)  receiver
> > > >
> > > > So I'm afraid I don't see your issue.
> > >
> > > I believe the issue cannot be at an higher level than the MAC. I also
> > > do not think the MAC driver and PHY driver are specifically buggy. I
> > > ruled out the hardware issue given the fact that under certain
> > > conditions (high load) the network works rather well... But I certainly
> > > see this issue, and when switching to TCP the results are dramatic:
> > >
> > > # iperf3 -c 192.168.1.1
> > > Connecting to host 192.168.1.1, port 5201
> > > [  5] local 192.168.1.2 port 37948 connected to 192.168.1.1 port 5201
> > > [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> > > [  5]   0.00-1.00   sec  11.3 MBytes  94.5 Mbits/sec   43   32.5 KBytes
> > > [  5]   1.00-2.00   sec  3.29 MBytes  27.6 Mbits/sec   26   1.41 KBytes
> > > [  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > > [  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> > > [  5]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec    5   1.41 KBytes
> > > [  5]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > > [  5]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > > [  5]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > > [  5]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> > > [  5]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> > >
> > > Thanks,
> > > Miquèl
> >
> > Can you experiment with :
> >
> > - Disabling TSO on your NIC (ethtool -K eth0 tso off)
> > - Reducing max GSO size (ip link set dev eth0 gso_max_size 16384)
> >
> > I suspect some kind of issues with fec TX completion, vs TSO emulation.
>
> Wow, appears to have a significant effect. I am using Busybox's iproute
> implementation which does not know gso_max_size, but I hacked directly
> into netdevice.h just to see if it would have an effect. I'm adding
> iproute2 to the image for further testing.
>
> Here is the diff:
>
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -2364,7 +2364,7 @@ struct net_device {
>  /* TCP minimal MSS is 8 (TCP_MIN_GSO_SIZE),
>   * and shinfo->gso_segs is a 16bit field.
>   */
> -#define GSO_MAX_SIZE           (8 * GSO_MAX_SEGS)
> +#define GSO_MAX_SIZE           16384u
>
>         unsigned int            gso_max_size;
>  #define TSO_LEGACY_MAX_SIZE    65536
>
> And here are the results:
>
> # ethtool -K eth0 tso off
> # iperf3 -c 192.168.1.1 -u -b1M
> Connecting to host 192.168.1.1, port 5201
> [  5] local 192.168.1.2 port 50490 connected to 192.168.1.1 port 5201
> [ ID] Interval           Transfer     Bitrate         Total Datagrams
> [  5]   0.00-1.00   sec   123 KBytes  1.01 Mbits/sec  87
> [  5]   1.00-2.00   sec   122 KBytes   996 Kbits/sec  86
> [  5]   2.00-3.00   sec   122 KBytes   996 Kbits/sec  86
> [  5]   3.00-4.00   sec   123 KBytes  1.01 Mbits/sec  87
> [  5]   4.00-5.00   sec   122 KBytes   996 Kbits/sec  86
> [  5]   5.00-6.00   sec   122 KBytes   996 Kbits/sec  86
> [  5]   6.00-7.00   sec   123 KBytes  1.01 Mbits/sec  87
> [  5]   7.00-8.00   sec   122 KBytes   996 Kbits/sec  86
> [  5]   8.00-9.00   sec   122 KBytes   996 Kbits/sec  86
> [  5]   9.00-10.00  sec   123 KBytes  1.01 Mbits/sec  87
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> [  5]   0.00-10.00  sec  1.19 MBytes  1.00 Mbits/sec  0.000 ms  0/864 (0%)  sender
> [  5]   0.00-10.05  sec  1.11 MBytes   925 Kbits/sec  0.045 ms  62/864 (7.2%)  receiver
> iperf Done.
> # iperf3 -c 192.168.1.1
> Connecting to host 192.168.1.1, port 5201
> [  5] local 192.168.1.2 port 34792 connected to 192.168.1.1 port 5201
> [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> [  5]   0.00-1.00   sec  1.63 MBytes  13.7 Mbits/sec   30   1.41 KBytes
> [  5]   1.00-2.00   sec  7.40 MBytes  62.1 Mbits/sec   65   14.1 KBytes
> [  5]   2.00-3.00   sec  7.83 MBytes  65.7 Mbits/sec  109   2.83 KBytes
> [  5]   3.00-4.00   sec  2.49 MBytes  20.9 Mbits/sec   46   19.8 KBytes
> [  5]   4.00-5.00   sec  7.89 MBytes  66.2 Mbits/sec  109   2.83 KBytes
> [  5]   5.00-6.00   sec   255 KBytes  2.09 Mbits/sec   22   2.83 KBytes
> [  5]   6.00-7.00   sec  4.35 MBytes  36.5 Mbits/sec   74   41.0 KBytes
> [  5]   7.00-8.00   sec  10.9 MBytes  91.8 Mbits/sec   34   45.2 KBytes
> [  5]   8.00-9.00   sec  5.35 MBytes  44.9 Mbits/sec   82   1.41 KBytes
> [  5]   9.00-10.00  sec  1.37 MBytes  11.5 Mbits/sec   73   1.41 KBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bitrate         Retr
> [  5]   0.00-10.00  sec  49.5 MBytes  41.5 Mbits/sec  644             sender
> [  5]   0.00-10.05  sec  49.3 MBytes  41.1 Mbits/sec                  receiver
> iperf Done.
>
> There is still a noticeable amount of drop/retries, but overall the
> results are significantly better. What is the rationale behind the
> choice of 16384 in particular? Could this be further improved?

The fec driver was also the common factor in another thread recently
discussed on netdev@.

Can you go back to standard gso_max_size, and apply the patch found here :

https://lore.kernel.org/netdev/CANn89iJUBujG2AOBYsr0V7qyC5WTgzx0GucO=2ES69tTDJRziw@mail.gmail.com/

You could also build a more recent iproute2 (ip command) and experiment
with gso_max_size. I wonder whether the software TSO used in the fec
driver has some corner cases.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Ethernet issue on imx6
  2023-10-16 13:58       ` Miquel Raynal
  2023-10-16 15:06         ` Eric Dumazet
@ 2023-10-16 15:36         ` Miquel Raynal
  2023-10-16 19:37           ` Eric Dumazet
  1 sibling, 1 reply; 26+ messages in thread
From: Miquel Raynal @ 2023-10-16 15:36 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Russell King (Oracle), Wei Fang, Shenwei Wang, Clark Wang, davem,
	kuba, pabeni, linux-imx, netdev, Thomas Petazzoni,
	Alexandre Belloni, Maxime Chevallier, Andrew Lunn,
	Stephen Hemminger

Hello again,

> > > # iperf3 -c 192.168.1.1
> > > Connecting to host 192.168.1.1, port 5201
> > > [  5] local 192.168.1.2 port 37948 connected to 192.168.1.1 port 5201
> > > [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> > > [  5]   0.00-1.00   sec  11.3 MBytes  94.5 Mbits/sec   43   32.5 KBytes
> > > [  5]   1.00-2.00   sec  3.29 MBytes  27.6 Mbits/sec   26   1.41 KBytes
> > > [  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > > [  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> > > [  5]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec    5   1.41 KBytes
> > > [  5]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > > [  5]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > > [  5]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > > [  5]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> > > [  5]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> > >
> > > Thanks,
> > > Miquèl    
> > 
> > Can you experiment with :
> > 
> > - Disabling TSO on your NIC (ethtool -K eth0 tso off)
> > - Reducing max GSO size (ip link set dev eth0 gso_max_size 16384)
> > 
> > I suspect some kind of issues with fec TX completion, vs TSO emulation.  
> 
> Wow, appears to have a significant effect. I am using Busybox's iproute
> implementation which does not know gso_max_size, but I hacked directly
> into netdevice.h just to see if it would have an effect. I'm adding
> iproute2 to the image for further testing.
> 
> Here is the diff:
> 
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -2364,7 +2364,7 @@ struct net_device {
>  /* TCP minimal MSS is 8 (TCP_MIN_GSO_SIZE),
>   * and shinfo->gso_segs is a 16bit field.
>   */
> -#define GSO_MAX_SIZE           (8 * GSO_MAX_SEGS)
> +#define GSO_MAX_SIZE           16384u
>  
>         unsigned int            gso_max_size;
>  #define TSO_LEGACY_MAX_SIZE    65536
> 
> And here are the results:
> 
> # ethtool -K eth0 tso off
> # iperf3 -c 192.168.1.1 -u -b1M
> Connecting to host 192.168.1.1, port 5201
> [  5] local 192.168.1.2 port 50490 connected to 192.168.1.1 port 5201
> [ ID] Interval           Transfer     Bitrate         Total Datagrams
> [  5]   0.00-1.00   sec   123 KBytes  1.01 Mbits/sec  87  
> [  5]   1.00-2.00   sec   122 KBytes   996 Kbits/sec  86  
> [  5]   2.00-3.00   sec   122 KBytes   996 Kbits/sec  86  
> [  5]   3.00-4.00   sec   123 KBytes  1.01 Mbits/sec  87  
> [  5]   4.00-5.00   sec   122 KBytes   996 Kbits/sec  86  
> [  5]   5.00-6.00   sec   122 KBytes   996 Kbits/sec  86  
> [  5]   6.00-7.00   sec   123 KBytes  1.01 Mbits/sec  87  
> [  5]   7.00-8.00   sec   122 KBytes   996 Kbits/sec  86  
> [  5]   8.00-9.00   sec   122 KBytes   996 Kbits/sec  86  
> [  5]   9.00-10.00  sec   123 KBytes  1.01 Mbits/sec  87  
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> [  5]   0.00-10.00  sec  1.19 MBytes  1.00 Mbits/sec  0.000 ms  0/864 (0%)  sender
> [  5]   0.00-10.05  sec  1.11 MBytes   925 Kbits/sec  0.045 ms  62/864 (7.2%)  receiver
> iperf Done.
> # iperf3 -c 192.168.1.1
> Connecting to host 192.168.1.1, port 5201
> [  5] local 192.168.1.2 port 34792 connected to 192.168.1.1 port 5201
> [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> [  5]   0.00-1.00   sec  1.63 MBytes  13.7 Mbits/sec   30   1.41 KBytes       
> [  5]   1.00-2.00   sec  7.40 MBytes  62.1 Mbits/sec   65   14.1 KBytes       
> [  5]   2.00-3.00   sec  7.83 MBytes  65.7 Mbits/sec  109   2.83 KBytes       
> [  5]   3.00-4.00   sec  2.49 MBytes  20.9 Mbits/sec   46   19.8 KBytes       
> [  5]   4.00-5.00   sec  7.89 MBytes  66.2 Mbits/sec  109   2.83 KBytes       
> [  5]   5.00-6.00   sec   255 KBytes  2.09 Mbits/sec   22   2.83 KBytes       
> [  5]   6.00-7.00   sec  4.35 MBytes  36.5 Mbits/sec   74   41.0 KBytes       
> [  5]   7.00-8.00   sec  10.9 MBytes  91.8 Mbits/sec   34   45.2 KBytes       
> [  5]   8.00-9.00   sec  5.35 MBytes  44.9 Mbits/sec   82   1.41 KBytes       
> [  5]   9.00-10.00  sec  1.37 MBytes  11.5 Mbits/sec   73   1.41 KBytes       
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bitrate         Retr
> [  5]   0.00-10.00  sec  49.5 MBytes  41.5 Mbits/sec  644             sender
> [  5]   0.00-10.05  sec  49.3 MBytes  41.1 Mbits/sec                  receiver
> iperf Done.
> 
> There is still a noticeable amount of drop/retries, but overall the
> results are significantly better. What is the rationale behind the
> choice of 16384 in particular? Could this be further improved?

Apparently I was too enthusiastic. After sending this e-mail I
regenerated an image with iproute2 and dd'ed the whole image onto an
SD card (until now I was just updating the kernel/DT manually), and I
got the same performance as above without the GSO size trick. I need
to clarify this further.

Thanks,
Miquèl

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Ethernet issue on imx6
  2023-10-16 15:36         ` Miquel Raynal
@ 2023-10-16 19:37           ` Eric Dumazet
  2023-10-16 21:47             ` Russell King (Oracle)
  2023-10-17 11:19             ` Miquel Raynal
  0 siblings, 2 replies; 26+ messages in thread
From: Eric Dumazet @ 2023-10-16 19:37 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Russell King (Oracle), Wei Fang, Shenwei Wang, Clark Wang, davem,
	kuba, pabeni, linux-imx, netdev, Thomas Petazzoni,
	Alexandre Belloni, Maxime Chevallier, Andrew Lunn,
	Stephen Hemminger

On Mon, Oct 16, 2023 at 5:37 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> Hello again,
>
> > > > # iperf3 -c 192.168.1.1
> > > > Connecting to host 192.168.1.1, port 5201
> > > > [  5] local 192.168.1.2 port 37948 connected to 192.168.1.1 port 5201
> > > > [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> > > > [  5]   0.00-1.00   sec  11.3 MBytes  94.5 Mbits/sec   43   32.5 KBytes
> > > > [  5]   1.00-2.00   sec  3.29 MBytes  27.6 Mbits/sec   26   1.41 KBytes
> > > > [  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > > > [  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> > > > [  5]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec    5   1.41 KBytes
> > > > [  5]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > > > [  5]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > > > [  5]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > > > [  5]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> > > > [  5]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> > > >
> > > > Thanks,
> > > > Miquèl
> > >
> > > Can you experiment with :
> > >
> > > - Disabling TSO on your NIC (ethtool -K eth0 tso off)
> > > - Reducing max GSO size (ip link set dev eth0 gso_max_size 16384)
> > >
> > > I suspect some kind of issues with fec TX completion, vs TSO emulation.
> >
> > Wow, appears to have a significant effect. I am using Busybox's iproute
> > implementation which does not know gso_max_size, but I hacked directly
> > into netdevice.h just to see if it would have an effect. I'm adding
> > iproute2 to the image for further testing.
> >
> > Here is the diff:
> >
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -2364,7 +2364,7 @@ struct net_device {
> >  /* TCP minimal MSS is 8 (TCP_MIN_GSO_SIZE),
> >   * and shinfo->gso_segs is a 16bit field.
> >   */
> > -#define GSO_MAX_SIZE           (8 * GSO_MAX_SEGS)
> > +#define GSO_MAX_SIZE           16384u
> >
> >         unsigned int            gso_max_size;
> >  #define TSO_LEGACY_MAX_SIZE    65536
> >
> > And here are the results:
> >
> > # ethtool -K eth0 tso off
> > # iperf3 -c 192.168.1.1 -u -b1M
> > Connecting to host 192.168.1.1, port 5201
> > [  5] local 192.168.1.2 port 50490 connected to 192.168.1.1 port 5201
> > [ ID] Interval           Transfer     Bitrate         Total Datagrams
> > [  5]   0.00-1.00   sec   123 KBytes  1.01 Mbits/sec  87
> > [  5]   1.00-2.00   sec   122 KBytes   996 Kbits/sec  86
> > [  5]   2.00-3.00   sec   122 KBytes   996 Kbits/sec  86
> > [  5]   3.00-4.00   sec   123 KBytes  1.01 Mbits/sec  87
> > [  5]   4.00-5.00   sec   122 KBytes   996 Kbits/sec  86
> > [  5]   5.00-6.00   sec   122 KBytes   996 Kbits/sec  86
> > [  5]   6.00-7.00   sec   123 KBytes  1.01 Mbits/sec  87
> > [  5]   7.00-8.00   sec   122 KBytes   996 Kbits/sec  86
> > [  5]   8.00-9.00   sec   122 KBytes   996 Kbits/sec  86
> > [  5]   9.00-10.00  sec   123 KBytes  1.01 Mbits/sec  87
> > - - - - - - - - - - - - - - - - - - - - - - - - -
> > [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> > [  5]   0.00-10.00  sec  1.19 MBytes  1.00 Mbits/sec  0.000 ms  0/864 (0%)  sender
> > [  5]   0.00-10.05  sec  1.11 MBytes   925 Kbits/sec  0.045 ms  62/864 (7.2%)  receiver
> > iperf Done.
> > # iperf3 -c 192.168.1.1
> > Connecting to host 192.168.1.1, port 5201
> > [  5] local 192.168.1.2 port 34792 connected to 192.168.1.1 port 5201
> > [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> > [  5]   0.00-1.00   sec  1.63 MBytes  13.7 Mbits/sec   30   1.41 KBytes
> > [  5]   1.00-2.00   sec  7.40 MBytes  62.1 Mbits/sec   65   14.1 KBytes
> > [  5]   2.00-3.00   sec  7.83 MBytes  65.7 Mbits/sec  109   2.83 KBytes
> > [  5]   3.00-4.00   sec  2.49 MBytes  20.9 Mbits/sec   46   19.8 KBytes
> > [  5]   4.00-5.00   sec  7.89 MBytes  66.2 Mbits/sec  109   2.83 KBytes
> > [  5]   5.00-6.00   sec   255 KBytes  2.09 Mbits/sec   22   2.83 KBytes
> > [  5]   6.00-7.00   sec  4.35 MBytes  36.5 Mbits/sec   74   41.0 KBytes
> > [  5]   7.00-8.00   sec  10.9 MBytes  91.8 Mbits/sec   34   45.2 KBytes
> > [  5]   8.00-9.00   sec  5.35 MBytes  44.9 Mbits/sec   82   1.41 KBytes
> > [  5]   9.00-10.00  sec  1.37 MBytes  11.5 Mbits/sec   73   1.41 KBytes
> > - - - - - - - - - - - - - - - - - - - - - - - - -
> > [ ID] Interval           Transfer     Bitrate         Retr
> > [  5]   0.00-10.00  sec  49.5 MBytes  41.5 Mbits/sec  644             sender
> > [  5]   0.00-10.05  sec  49.3 MBytes  41.1 Mbits/sec                  receiver
> > iperf Done.
> >
> > There is still a noticeable amount of drop/retries, but overall the
> > results are significantly better. What is the rationale behind the
> > choice of 16384 in particular? Could this be further improved?
>
> Apparently I've been too enthusiastic. After sending this e-mail I've
> re-generated an image with iproute2 and dd'ed the whole image into an
> SD card, while until now I was just updating the kernel/DT manually and
> got the same performances as above without the gro size trick. I need
> to clarify this further.
>

Looking a bit at fec, I think fec_enet_txq_put_hdr_tso() is bogus...

txq->tso_hdrs should be properly aligned by definition.

If FEC_QUIRK_SWAP_FRAME is requested, it would be better to copy the
right thing (the header in txq->tso_hdrs), not the original skb->data?

diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 77c8e9cfb44562e73bfa89d06c5d4b179d755502..520436d579d66cc3263527373d754a206cb5bcd6 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -753,7 +753,6 @@ fec_enet_txq_put_hdr_tso(struct fec_enet_priv_tx_q *txq,
        struct fec_enet_private *fep = netdev_priv(ndev);
        int hdr_len = skb_tcp_all_headers(skb);
        struct bufdesc_ex *ebdp = container_of(bdp, struct bufdesc_ex, desc);
-       void *bufaddr;
        unsigned long dmabuf;
        unsigned short status;
        unsigned int estatus = 0;
@@ -762,11 +761,11 @@ fec_enet_txq_put_hdr_tso(struct fec_enet_priv_tx_q *txq,
        status &= ~BD_ENET_TX_STATS;
        status |= (BD_ENET_TX_TC | BD_ENET_TX_READY);

-       bufaddr = txq->tso_hdrs + index * TSO_HEADER_SIZE;
        dmabuf = txq->tso_hdrs_dma + index * TSO_HEADER_SIZE;
-       if (((unsigned long)bufaddr) & fep->tx_align ||
-               fep->quirks & FEC_QUIRK_SWAP_FRAME) {
-               memcpy(txq->tx_bounce[index], skb->data, hdr_len);
+       if (fep->quirks & FEC_QUIRK_SWAP_FRAME) {
+               void *bufaddr = txq->tso_hdrs + index * TSO_HEADER_SIZE;
+
+               memcpy(txq->tx_bounce[index], bufaddr, hdr_len);
                bufaddr = txq->tx_bounce[index];

                if (fep->quirks & FEC_QUIRK_SWAP_FRAME)

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Ethernet issue on imx6
  2023-10-16 19:37           ` Eric Dumazet
@ 2023-10-16 21:47             ` Russell King (Oracle)
  2023-10-17 11:19             ` Miquel Raynal
  1 sibling, 0 replies; 26+ messages in thread
From: Russell King (Oracle) @ 2023-10-16 21:47 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Miquel Raynal, Wei Fang, Shenwei Wang, Clark Wang, davem, kuba,
	pabeni, linux-imx, netdev, Thomas Petazzoni, Alexandre Belloni,
	Maxime Chevallier, Andrew Lunn, Stephen Hemminger

On Mon, Oct 16, 2023 at 09:37:58PM +0200, Eric Dumazet wrote:
> diff --git a/drivers/net/ethernet/freescale/fec_main.c
> b/drivers/net/ethernet/freescale/fec_main.c
> index 77c8e9cfb44562e73bfa89d06c5d4b179d755502..520436d579d66cc3263527373d754a206cb5bcd6
> 100644
> --- a/drivers/net/ethernet/freescale/fec_main.c
> +++ b/drivers/net/ethernet/freescale/fec_main.c
> @@ -753,7 +753,6 @@ fec_enet_txq_put_hdr_tso(struct fec_enet_priv_tx_q *txq,
>         struct fec_enet_private *fep = netdev_priv(ndev);
>         int hdr_len = skb_tcp_all_headers(skb);
>         struct bufdesc_ex *ebdp = container_of(bdp, struct bufdesc_ex, desc);
> -       void *bufaddr;
>         unsigned long dmabuf;
>         unsigned short status;
>         unsigned int estatus = 0;
> @@ -762,11 +761,11 @@ fec_enet_txq_put_hdr_tso(struct fec_enet_priv_tx_q *txq,
>         status &= ~BD_ENET_TX_STATS;
>         status |= (BD_ENET_TX_TC | BD_ENET_TX_READY);
> 
> -       bufaddr = txq->tso_hdrs + index * TSO_HEADER_SIZE;
>         dmabuf = txq->tso_hdrs_dma + index * TSO_HEADER_SIZE;
> -       if (((unsigned long)bufaddr) & fep->tx_align ||
> -               fep->quirks & FEC_QUIRK_SWAP_FRAME) {
> -               memcpy(txq->tx_bounce[index], skb->data, hdr_len);
> +       if (fep->quirks & FEC_QUIRK_SWAP_FRAME) {
> +               void *bufaddr = txq->tso_hdrs + index * TSO_HEADER_SIZE;
> +
> +               memcpy(txq->tx_bounce[index], bufaddr, hdr_len);
>                 bufaddr = txq->tx_bounce[index];
> 
>                 if (fep->quirks & FEC_QUIRK_SWAP_FRAME)

I'm not sure this has any effect on the reported issue.

1. For imx6 based devices, FEC_QUIRK_SWAP_FRAME is not set.
2. fep->tx_align is 15, TSO_HEADER_SIZE is 256, and ->tso_hdrs is
   derived from dma_alloc_coherent() which will be page aligned.
   So this condition will also always be false.

So, while the patch looks correct to me, I think it will only have an
effect on imx28 based systems that set FEC_QUIRK_SWAP_FRAME. There the
change is right, since the header is always located in txq->tso_hdrs
and we need to copy it from there into the "bounce" buffer.
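
The second point above can be checked with a few lines of arithmetic.
This is only a userspace sketch in Python, not driver code: the values
256 and 15 come from the explanation above, while the 4 KiB page size,
the base address and the ring size of 512 slots are assumptions made up
for illustration.

```python
TSO_HEADER_SIZE = 256  # from the message above
TX_ALIGN = 15          # fep->tx_align on i.MX6, from the message above
PAGE_SIZE = 4096       # assumption: 4 KiB pages


def needs_bounce(base, index, swap_frame_quirk=False):
    """Mirror the pre-patch condition: misaligned header slot or SWAP_FRAME quirk."""
    bufaddr = base + index * TSO_HEADER_SIZE
    return bool(bufaddr & TX_ALIGN) or swap_frame_quirk


def any_slot_bounced(base, slots):
    """True if any header slot in the ring would take the bounce path."""
    return any(needs_bounce(base, i) for i in range(slots))


if __name__ == "__main__":
    base = 16 * PAGE_SIZE  # hypothetical page-aligned address
    print(any_slot_bounced(base, 512))  # hypothetical ring of 512 slots
```

Since 256 is a multiple of 16, every slot offset keeps the low four
bits of a page-aligned base at zero, so without the quirk the bounce
path is never taken.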

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

Thanks!

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Ethernet issue on imx6
  2023-10-16 14:41           ` Alexander Stein
@ 2023-10-17 10:49             ` Miquel Raynal
  2023-10-18  9:08               ` Alexander Stein
  0 siblings, 1 reply; 26+ messages in thread
From: Miquel Raynal @ 2023-10-17 10:49 UTC (permalink / raw)
  To: Alexander Stein
  Cc: Stephen Hemminger, Andrew Lunn, Wei Fang, Shenwei Wang,
	Clark Wang, Russell King, davem, edumazet, kuba, pabeni,
	linux-imx, netdev, Thomas Petazzoni, Alexandre Belloni,
	Maxime Chevallier

Hi Alexander,

alexander.stein@ew.tq-group.com wrote on Mon, 16 Oct 2023 16:41:50
+0200:

> Hi Miquel,
> 
> Am Montag, 16. Oktober 2023, 15:31:54 CEST schrieb Miquel Raynal:
> > Hi Alexander,
> > 
> > Thanks a lot for your feedback.
> >   
> > > > switch to partitions #0, OK
> > > > mmc1 is current device
> > > > reading boot.scr
> > > > 444 bytes read in 10 ms (43 KiB/s)
> > > > ## Executing script at 20000000
> > > > Booting from mmc ...
> > > > reading zImage
> > > > 9160016 bytes read in 462 ms (18.9 MiB/s)
> > > > reading <board>.dtb  
> > > 
> > > Which device tree is that?
> > >   
> > > > 40052 bytes read in 22 ms (1.7 MiB/s)
> > > > boot device tree kernel ...
> > > > Kernel image @ 0x12000000 [ 0x000000 - 0x8bc550 ]
> > > > ## Flattened Device Tree blob at 18000000
> > > > 
> > > >    Booting using the fdt blob at 0x18000000
> > > >    Using Device Tree in place at 18000000, end 1800cc73
> > > > 
> > > > Starting kernel ...
> > > > 
> > > > [    0.000000] Booting Linux on physical CPU 0x0
> > > > [    0.000000] Linux version 6.5.0 (mraynal@xps-13) (arm-linux-gcc.br_real (Buildroot 2020.08-14-ge5a2a90) 10.2.0, GNU ld (GNU Binutils) 2.34) #120 SMP Thu Oct 12 18:10:20 CEST 2023
> > > > [    0.000000] CPU: ARMv7 Processor [412fc09a] revision 10 (ARMv7), cr=10c5387d
> > > > [    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
> > > > [    0.000000] OF: fdt: Machine model: TQ TQMa6Q on MBa6x
> > > 
> > > Your first mail mentions a custom board, but this indicates "TQMa6Q
> > > on MBa6x", so which is it?  
> > 
> > It's a custom carrier board with a TQMA6Q-AA module.  
> 
> Could you please adjust the machine model to your mainboard if it is not a 
> MBa6x? Thanks.
> Which HW revision is this module? It should be printed in u-boot during start. 
> Can you provide a full log?

The full kernel log is at the bottom of this e-mail:
https://lore.kernel.org/netdev/20231013102718.6b3a2dfe@xps-13/

On the module I read on a white sticker:
	TQMA6Q-AA
	RK.0203
And on one side of the PCB:
	TQMa6x.0201

Do you know if this module has the hardware workaround discussed below?
(I don't have the schematics of the module)

Here is also the U-Boot log:

U-Boot 2017.11 (Aug 11 2023 - 19:35:47 +0200)

CPU:   Freescale i.MX6Q rev1.5 at 792 MHz
Reset cause: POR
Board: TQMa6Q on a MBa6x
I2C:   ready
DRAM:  1 GiB
PMIC: PFUZE100 ID=0x10 REV=0x21
MMC:   FSL_SDHC: 0, FSL_SDHC: 1
reading uboot.env
In:    serial
Out:   serial
Err:   serial
Net:   FEC [PRIME]
Warning: FEC MAC addresses don't match:
Address in SROM is         00:d0:93:44:a4:c0
Address in environment is  fc:c2:3d:18:5f:91

starting USB...
USB0:   Port not available.
USB1:   USB EHCI 1.00
scanning bus 1 for devices... 3 USB Device(s) found
       scanning usb for storage devices... 0 Storage Device(s) found
       scanning usb for ethernet devices... 1 Ethernet Device(s) found
Hit any key to stop autoboot:  0 
switch to partitions #0, OK
mmc1 is current device
reading boot.scr
444 bytes read in 10 ms (43 KiB/s)
## Executing script at 20000000
Booting from mmc ...
reading zImage
7354128 bytes read in 368 ms (19.1 MiB/s)
reading stephan_Stephanie_ControlUnit_A809_60_408.dtb
40002 bytes read in 25 ms (1.5 MiB/s)
boot device tree kernel ...
Kernel image @ 0x12000000 [ 0x000000 - 0x703710 ]
## Flattened Device Tree blob at 18000000
   Booting using the fdt blob at 0x18000000
   Using Device Tree in place at 18000000, end 1800cc41

Starting kernel ...

> > > Please note that there are two different module variants,
> > > imx6qdl-tqma6a.dtsi and imx6qdl-tqma6b.dtsi. They deal with i.MX6's
> > > ERR006687 differently. Package drop without any load somewhat indicates
> > > this issue.  
> > 
> > I've tried with and without the fsl,err006687-workaround-present DT
> > property. It gets successfully parsed and I see the lower idle state
> > being disabled under mach-imx. I've also tried just commenting out the
> > registration of the cpuidle driver, just to be sure. I saw no
> > difference.  
> 
> fsl,err006687-workaround-present requires a specific HW workaround, see [1]. 
> So this is not applicable on every module.

Based on the information provided above, do you think I can rely on the
HW workaround?

I've tried disabling the registration of both the CPUidle and CPUfreq
drivers in the machine code and I see a real difference. The transfers
are still not perfect, but I believe this is related to the ~1% drop on
the RGMII lines (the timings are not perfect, but I could not extend
them further).

If the hardware workaround is not available on this module, I believe I
can still disable CPUidle and CPUfreq as a workaround of the
workaround...?

> > By the way, we tried with a TQ eval board with this SoM and saw the same
> > issue (not me, I don't have this board in hands). Don't you experience
> > something similar? I went across a couple of people reporting similar
> > issues with these modules but none of them reported how they fixed it
> > (if they did). I tried two different images based on TQ's Github using
> > v4.14.69 and v5.10 kernels.  
> 
> Personally, this is the first time I've heard about this issue. I never
> noticed something like this. Does this issue also appear when using TCP?
> Or is it a UDP-only issue?

With a mainline kernel:
* With UDP I get a high drop rate.
* With TCP I get slow/bumpy throughputs.

> [1] https://github.com/tq-systems/linux-tqmaxx/blob/TQMa8-fslc-5.10-2.1.x-imx/arch/arm/boot/dts/imx6qdl-tqma6a.dtsi#L36-L48

Thanks,
Miquèl

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Ethernet issue on imx6
  2023-10-16 19:37           ` Eric Dumazet
  2023-10-16 21:47             ` Russell King (Oracle)
@ 2023-10-17 11:19             ` Miquel Raynal
  1 sibling, 0 replies; 26+ messages in thread
From: Miquel Raynal @ 2023-10-17 11:19 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Russell King (Oracle), Wei Fang, Shenwei Wang, Clark Wang, davem,
	kuba, pabeni, linux-imx, netdev, Thomas Petazzoni,
	Alexandre Belloni, Maxime Chevallier, Andrew Lunn,
	Stephen Hemminger, Alexander Stein

Hi Eric,

edumazet@google.com wrote on Mon, 16 Oct 2023 21:37:58 +0200:

> On Mon, Oct 16, 2023 at 5:37 PM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
> >
> > Hello again,
> >  
> > > > > # iperf3 -c 192.168.1.1
> > > > > Connecting to host 192.168.1.1, port 5201
> > > > > [  5] local 192.168.1.2 port 37948 connected to 192.168.1.1 port 5201
> > > > > [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> > > > > [  5]   0.00-1.00   sec  11.3 MBytes  94.5 Mbits/sec   43   32.5 KBytes
> > > > > [  5]   1.00-2.00   sec  3.29 MBytes  27.6 Mbits/sec   26   1.41 KBytes
> > > > > [  5]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > > > > [  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> > > > > [  5]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec    5   1.41 KBytes
> > > > > [  5]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > > > > [  5]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > > > > [  5]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes
> > > > > [  5]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> > > > > [  5]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes
> > > > >
> > > > > Thanks,
> > > > > Miquèl  
> > > >
> > > > Can you experiment with :
> > > >
> > > > - Disabling TSO on your NIC (ethtool -K eth0 tso off)
> > > > - Reducing max GSO size (ip link set dev eth0 gso_max_size 16384)
> > > >
> > > > I suspect some kind of issues with fec TX completion, vs TSO emulation.  
> > >
> > > Wow, appears to have a significant effect. I am using Busybox's iproute
> > > implementation which does not know gso_max_size, but I hacked directly
> > > into netdevice.h just to see if it would have an effect. I'm adding
> > > iproute2 to the image for further testing.
> > >
> > > Here is the diff:
> > >
> > > --- a/include/linux/netdevice.h
> > > +++ b/include/linux/netdevice.h
> > > @@ -2364,7 +2364,7 @@ struct net_device {
> > >  /* TCP minimal MSS is 8 (TCP_MIN_GSO_SIZE),
> > >   * and shinfo->gso_segs is a 16bit field.
> > >   */
> > > -#define GSO_MAX_SIZE           (8 * GSO_MAX_SEGS)
> > > +#define GSO_MAX_SIZE           16384u
> > >
> > >         unsigned int            gso_max_size;
> > >  #define TSO_LEGACY_MAX_SIZE    65536
> > >
> > > And here are the results:
> > >
> > > # ethtool -K eth0 tso off
> > > # iperf3 -c 192.168.1.1 -u -b1M
> > > Connecting to host 192.168.1.1, port 5201
> > > [  5] local 192.168.1.2 port 50490 connected to 192.168.1.1 port 5201
> > > [ ID] Interval           Transfer     Bitrate         Total Datagrams
> > > [  5]   0.00-1.00   sec   123 KBytes  1.01 Mbits/sec  87
> > > [  5]   1.00-2.00   sec   122 KBytes   996 Kbits/sec  86
> > > [  5]   2.00-3.00   sec   122 KBytes   996 Kbits/sec  86
> > > [  5]   3.00-4.00   sec   123 KBytes  1.01 Mbits/sec  87
> > > [  5]   4.00-5.00   sec   122 KBytes   996 Kbits/sec  86
> > > [  5]   5.00-6.00   sec   122 KBytes   996 Kbits/sec  86
> > > [  5]   6.00-7.00   sec   123 KBytes  1.01 Mbits/sec  87
> > > [  5]   7.00-8.00   sec   122 KBytes   996 Kbits/sec  86
> > > [  5]   8.00-9.00   sec   122 KBytes   996 Kbits/sec  86
> > > [  5]   9.00-10.00  sec   123 KBytes  1.01 Mbits/sec  87
> > > - - - - - - - - - - - - - - - - - - - - - - - - -
> > > [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
> > > [  5]   0.00-10.00  sec  1.19 MBytes  1.00 Mbits/sec  0.000 ms  0/864 (0%)  sender
> > > [  5]   0.00-10.05  sec  1.11 MBytes   925 Kbits/sec  0.045 ms  62/864 (7.2%)  receiver
> > > iperf Done.
> > > # iperf3 -c 192.168.1.1
> > > Connecting to host 192.168.1.1, port 5201
> > > [  5] local 192.168.1.2 port 34792 connected to 192.168.1.1 port 5201
> > > [ ID] Interval           Transfer     Bitrate         Retr  Cwnd
> > > [  5]   0.00-1.00   sec  1.63 MBytes  13.7 Mbits/sec   30   1.41 KBytes
> > > [  5]   1.00-2.00   sec  7.40 MBytes  62.1 Mbits/sec   65   14.1 KBytes
> > > [  5]   2.00-3.00   sec  7.83 MBytes  65.7 Mbits/sec  109   2.83 KBytes
> > > [  5]   3.00-4.00   sec  2.49 MBytes  20.9 Mbits/sec   46   19.8 KBytes
> > > [  5]   4.00-5.00   sec  7.89 MBytes  66.2 Mbits/sec  109   2.83 KBytes
> > > [  5]   5.00-6.00   sec   255 KBytes  2.09 Mbits/sec   22   2.83 KBytes
> > > [  5]   6.00-7.00   sec  4.35 MBytes  36.5 Mbits/sec   74   41.0 KBytes
> > > [  5]   7.00-8.00   sec  10.9 MBytes  91.8 Mbits/sec   34   45.2 KBytes
> > > [  5]   8.00-9.00   sec  5.35 MBytes  44.9 Mbits/sec   82   1.41 KBytes
> > > [  5]   9.00-10.00  sec  1.37 MBytes  11.5 Mbits/sec   73   1.41 KBytes
> > > - - - - - - - - - - - - - - - - - - - - - - - - -
> > > [ ID] Interval           Transfer     Bitrate         Retr
> > > [  5]   0.00-10.00  sec  49.5 MBytes  41.5 Mbits/sec  644             sender
> > > [  5]   0.00-10.05  sec  49.3 MBytes  41.1 Mbits/sec                  receiver
> > > iperf Done.
> > >
> > > There is still a noticeable amount of drop/retries, but overall the
> > > results are significantly better. What is the rationale behind the
> > > choice of 16384 in particular? Could this be further improved?  
> >
> > Apparently I've been too enthusiastic. After sending this e-mail I've
> > re-generated an image with iproute2 and dd'ed the whole image into an
> > SD card, while until now I was just updating the kernel/DT manually and
> > got the same performances as above without the gro size trick. I need
> > to clarify this further.
> >  
> 
> Looking a bit at fec, I think fec_enet_txq_put_hdr_tso() is  bogus...
> 
> txq->tso_hdrs should be properly aligned by definition.
> 
> If FEC_QUIRK_SWAP_FRAME is requested, better copy the right thing, not
> original skb->data ???

I've clarified the situation after looking at the build artifacts and
going through (way) longer testing sessions, as successive 10-second
tests can lead to really different results.

On a 4.14.322 kernel (still maintained) I really get extremely crappy
throughput.

On a mainline 6.5 kernel I thought I had a similar issue, but this was
due to wrong RGMII-ID timings being used (I ported the board from 4.14
to 6.5 and made a mistake). With the right timings I get much better
throughput, but it is still significantly lower than what I would
expect.

So I tested Eric's fixes:
- TCP fix:
https://lore.kernel.org/netdev/CANn89iJUBujG2AOBYsr0V7qyC5WTgzx0GucO=2ES69tTDJRziw@mail.gmail.com/
- FEC fix:
https://lore.kernel.org/netdev/CANn89iLxKQOY5ZA5o3d1y=v4MEAsAQnzmVDjmLY0_bJPG93tKQ@mail.gmail.com/
As well as different CPUfreq/CPUidle parameters, as pointed out by
Alexander:
https://lore.kernel.org/netdev/2245614.iZASKD2KPV@steina-w/

Here are the results of 100-second iperf uplink TCP tests, as reported
by the receiver. The first value is the mean; the raw results are in
parentheses.
Unit: Mbps

Default setup:
CPUidle yes, CPUfreq yes, TCP fix no, FEC fix no: 30.2 (23.8, 28.4, 38.4)

CPU power management tests (with TCP fix and FEC fix):
CPUidle yes, CPUfreq yes: 26.5 (24.5, 28.5)
CPUidle  no, CPUfreq yes: 50.3 (44.8, 55.7)
CPUidle yes, CPUfreq  no: 80.2 (75.8, 79.5, 80.8, 81.8, 83.1)
CPUidle  no, CPUfreq  no: 85.4 (80.6, 81.1, 86.2, 87.5, 91.8)

Eric's fixes tests (No CPUidle, no CPUfreq):
TCP fix yes, FEC fix yes: 85.4 (80.6, 81.1, 86.2, 87.5, 91.8) (same as above)
TCP fix  no, FEC fix yes: 82.0 (74.5, 75.9, 82.2, 87.5, 90.2)
TCP fix yes, FEC fix  no: 81.4 (77.5, 77.7, 82.8, 83.7, 85.4)
TCP fix  no, FEC fix  no: 79.6 (68.2, 77.6, 78.9, 86.4, 87.1)

So indeed the TCP and FEC patches don't seem to have a real impact (or
only a small one; I can't tell given how scattered the results are).
However, there is definitely something wrong with the low-power settings,
and I believe the erratum pointed out by Alexander may have a real impact
there (ERR006687 ENET: Only the ENET wake-up interrupt request can wake
the system from Wait mode [i.MX 6Dual/6Quad Only]); my hardware probably
lacks the hardware workaround.
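
As a rough way to look at the scatter, the sketch below recomputes the
mean and the min/max range for three of the configurations above and
checks whether their ranges overlap. It is only an illustration: the
figures are copied from the tables above, the labels and helper names
are made up, and an overlap test on a handful of runs is no substitute
for a proper statistical comparison.

```python
from statistics import fmean

# Receiver-side throughput in Mbps, copied from the tables above.
RUNS = {
    "idle+freq, both fixes": [24.5, 28.5],
    "no idle, no freq, both fixes": [80.6, 81.1, 86.2, 87.5, 91.8],
    "no idle, no freq, no fixes": [68.2, 77.6, 78.9, 86.4, 87.1],
}


def summarize(samples):
    """Return (mean, min, max) for one configuration."""
    return fmean(samples), min(samples), max(samples)


def ranges_overlap(a, b):
    """True if the [min, max] intervals of two sample sets intersect."""
    return max(min(a), min(b)) <= min(max(a), max(b))


if __name__ == "__main__":
    for name, samples in RUNS.items():
        mean, lo, hi = summarize(samples)
        print(f"{name}: mean={mean:.1f} range=[{lo}, {hi}]")
```

With and without the two patches the ranges overlap, whereas the runs
with CPUidle and CPUfreq enabled sit entirely below the runs with both
disabled.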

I believe the remaining fluctuations are due to the RGMII-ID timings
not being totally optimal. I think I would need to extend them slightly
more in the Tx path, but they are already set to the maximum value.
Anyhow, I no longer see any difference in the drop rate between -b1M
and -b0 (<1%), so I believe it is acceptable as it is.

Now I might try to track what is missing in 4.14.322 and perhaps ask
for a backport if it's relevant.

Thanks a lot for all your feedback,
Miquèl


* Re: Ethernet issue on imx6
  2023-10-17 10:49             ` Miquel Raynal
@ 2023-10-18  9:08               ` Alexander Stein
  2023-10-27 20:58                 ` Miquel Raynal
  0 siblings, 1 reply; 26+ messages in thread
From: Alexander Stein @ 2023-10-18  9:08 UTC (permalink / raw)
  To: Miquel Raynal
  Cc: Stephen Hemminger, Andrew Lunn, Wei Fang, Shenwei Wang,
	Clark Wang, Russell King, davem, edumazet, kuba, pabeni,
	linux-imx, netdev, Thomas Petazzoni, Alexandre Belloni,
	Maxime Chevallier

Hi Miquel,

Am Dienstag, 17. Oktober 2023, 12:49:19 CEST schrieb Miquel Raynal:
> Hi Alexander,
> 
> alexander.stein@ew.tq-group.com wrote on Mon, 16 Oct 2023 16:41:50 +0200:
> 
> > Hi Miquel,
> > 
> > Am Montag, 16. Oktober 2023, 15:31:54 CEST schrieb Miquel Raynal:
> > > Hi Alexander,
> > > 
> > > Thanks a lot for your feedback.
> > > 
> > > > > switch to partitions #0, OK
> > > > > mmc1 is current device
> > > > > reading boot.scr
> > > > > 444 bytes read in 10 ms (43 KiB/s)
> > > > > ## Executing script at 20000000
> > > > > Booting from mmc ...
> > > > > reading zImage
> > > > > 9160016 bytes read in 462 ms (18.9 MiB/s)
> > > > > reading <board>.dtb
> > > > 
> > > > Which device tree is that?
> > > > 
> > > > > 40052 bytes read in 22 ms (1.7 MiB/s)
> > > > > boot device tree kernel ...
> > > > > Kernel image @ 0x12000000 [ 0x000000 - 0x8bc550 ]
> > > > > ## Flattened Device Tree blob at 18000000
> > > > > 
> > > > >    Booting using the fdt blob at 0x18000000
> > > > >    Using Device Tree in place at 18000000, end 1800cc73
> > > > > 
> > > > > Starting kernel ...
> > > > > 
> > > > > [    0.000000] Booting Linux on physical CPU 0x0
> > > > > [    0.000000] Linux version 6.5.0 (mraynal@xps-13) (arm-linux-gcc.br_real (Buildroot 2020.08-14-ge5a2a90) 10.2.0, GNU ld (GNU Binutils) 2.34) #120 SMP Thu Oct 12 18:10:20 CEST 2023
> > > > > [    0.000000] CPU: ARMv7 Processor [412fc09a] revision 10 (ARMv7), cr=10c5387d
> > > > > [    0.000000] CPU: PIPT / VIPT nonaliasing data cache, VIPT aliasing instruction cache
> > > > > [    0.000000] OF: fdt: Machine model: TQ TQMa6Q on MBa6x
> > > > 
> > > > Your first mail mentions a custom board, but this indicates "TQMa6Q
> > > > on MBa6x", so which is it?
> > > 
> > > It's a custom carrier board with a TQMA6Q-AA module.
> > 
> > Could you please adjust the machine model to your mainboard if it is not a
> > MBa6x? Thanks.
> > Which HW revision is this module? It should be printed in u-boot during
> > start. Can you provide a full log?
> 
> The full kernel log is at the bottom of this e-mail:
> https://lore.kernel.org/netdev/20231013102718.6b3a2dfe@xps-13/
> 
> On the module I read on a white sticker:
> 	TQMA6Q-AA
> 	RK.0203
> And on one side of the PCB:
> 	TQMa6x.0201
> 
> Do you know if this module has the hardware workaround discussed below?
> (I don't have the schematics of the module)

Yes, the TQMA6Q-AA RK.0203 has the Ethernet hardware workaround implemented.
So you should use the imx6q-tqma6a.dtsi (and eventually imx6qdl-tqma6a.dtsi)
module device tree.

> Here is also the U-Boot log:
> 
> U-Boot 2017.11 (Aug 11 2023 - 19:35:47 +0200)
> 
> CPU:   Freescale i.MX6Q rev1.5 at 792 MHz
> Reset cause: POR
> Board: TQMa6Q on a MBa6x
> I2C:   ready
> DRAM:  1 GiB
> PMIC: PFUZE100 ID=0x10 REV=0x21
> MMC:   FSL_SDHC: 0, FSL_SDHC: 1
> reading uboot.env
> In:    serial
> Out:   serial
> Err:   serial
> Net:   FEC [PRIME]
> Warning: FEC MAC addresses don't match:
> Address in SROM is         00:d0:93:44:a4:c0
> Address in environment is  fc:c2:3d:18:5f:91
> 
> starting USB...
> USB0:   Port not available.
> USB1:   USB EHCI 1.00
> scanning bus 1 for devices... 3 USB Device(s) found
>        scanning usb for storage devices... 0 Storage Device(s) found
>        scanning usb for ethernet devices... 1 Ethernet Device(s) found
> Hit any key to stop autoboot:  0
> switch to partitions #0, OK
> mmc1 is current device
> reading boot.scr
> 444 bytes read in 10 ms (43 KiB/s)
> ## Executing script at 20000000
> Booting from mmc ...
> reading zImage
> 7354128 bytes read in 368 ms (19.1 MiB/s)
> reading stephan_Stephanie_ControlUnit_A809_60_408.dtb
> 40002 bytes read in 25 ms (1.5 MiB/s)
> boot device tree kernel ...
> Kernel image @ 0x12000000 [ 0x000000 - 0x703710 ]
> ## Flattened Device Tree blob at 18000000
>    Booting using the fdt blob at 0x18000000
>    Using Device Tree in place at 18000000, end 1800cc41
> 
> Starting kernel ...
> 
> > > > Please note that there are two different module variants,
> > > > imx6qdl-tqma6a.dtsi and imx6qdl-tqma6b.dtsi. They deal with i.MX6's
> > > > ERR006687 differently. Packet drop without any load somewhat indicates
> > > > this issue.
> > > 
> > > I've tried with and without the fsl,err006687-workaround-present DT
> > > property. It gets successfully parsed and I see the lower idle state
> > > being disabled under mach-imx. I've also tried just commenting out the
> > > registration of the cpuidle driver, just to be sure. I saw no
> > > difference.
> > 
> > fsl,err006687-workaround-present requires a specific HW workaround, see
> > [1]. So this is not applicable on every module.
> 
> Based on the information provided above, do you think I can rely on the
> HW workaround?

The original U-Boot auto-detects whether the hardware workaround is present
and selects the appropriate device tree by default, either variant A or B,
for MBa6x usage.

> I've tried disabling the registration of both the CPUidle and CPUfreq
> drivers in the machine code and I see a real difference. The transfers
> are still not perfect though, but I believe this is related to the ~1%
> drop of the RGMII lines (timings are not perfect, but I could not
> extend them more).
> 
> I believe if the hardware workaround is not available on this module I
> can still disable CPUidle and CPUfreq as a workaround of the
> workaround...?

It's hard to say without knowing the cause of your problem. I didn't see any of
these problems here.

> > > By the way, we tried with a TQ eval board with this SoM and saw the same
> > > issue (not me, I don't have this board in hands). Don't you experience
> > > something similar? I went across a couple of people reporting similar
> > > issues with these modules but none of them reported how they fixed it
> > > (if they did). I tried two different images based on TQ's Github using
> > > v4.14.69 and v5.10 kernels.

You mentioned a couple of other people having similar problems with these 
modules. Can you tell me more about those? I'd like to gather more 
information. Thanks.

Best regards,
Alexander

> > 
> > Personally, this is the first time I've heard about this issue. I never
> > noticed something like this. Does this issue also appear when using TCP?
> > Or is it a UDP-only issue?
> 
> With a mainline kernel:
> * With UDP I get a high drop rate.
> * With TCP I get slow/bumpy throughputs.
> 
> > [1]
> > https://github.com/tq-systems/linux-tqmaxx/blob/TQMa8-fslc-5.10-2.1.x-imx/arch/arm/boot/dts/imx6qdl-tqma6a.dtsi#L36-L48
> 
> Thanks,
> Miquèl


-- 
TQ-Systems GmbH | Mühlstraße 2, Gut Delling | 82229 Seefeld, Germany
Amtsgericht München, HRB 105018
Geschäftsführer: Detlef Schneider, Rüdiger Stahl, Stefan Schneider
http://www.tq-group.com/




* Re: Ethernet issue on imx6
  2023-10-18  9:08               ` Alexander Stein
@ 2023-10-27 20:58                 ` Miquel Raynal
  0 siblings, 0 replies; 26+ messages in thread
From: Miquel Raynal @ 2023-10-27 20:58 UTC (permalink / raw)
  To: Alexander Stein
  Cc: Stephen Hemminger, Andrew Lunn, Wei Fang, Shenwei Wang,
	Clark Wang, Russell King, davem, edumazet, kuba, pabeni,
	linux-imx, netdev, Thomas Petazzoni, Alexandre Belloni,
	Maxime Chevallier

Hi Alexander,

> > The full kernel log is at the bottom of this e-mail:
> > https://lore.kernel.org/netdev/20231013102718.6b3a2dfe@xps-13/
> > 
> > On the module I read on a white sticker:
> > 	TQMA6Q-AA
> > 	RK.0203
> > And on one side of the PCB:
> > 	TQMa6x.0201
> > 
> > Do you know if this module has the hardware workaround discussed below?
> > (I don't have the schematics of the module)  
> 
> Yes, the TQMA6Q-AA RK.0203 has the Ethernet hardware workaround implemented.
> So you should use the imx6q-tqma6a.dtsi (and eventually imx6qdl-tqma6a.dtsi)
> module device tree.

[...]

> > > > > Please note that there are two different module variants,
> > > > > imx6qdl-tqma6a.dtsi and imx6qdl-tqma6b.dtsi. They deal with i.MX6's
> > > > > ERR006687 differently. Packet drop without any load somewhat indicates
> > > > > this issue.
> > > > 
> > > > I've tried with and without the fsl,err006687-workaround-present DT
> > > > property. It gets successfully parsed and I see the lower idle state
> > > > being disabled under mach-imx. I've also tried just commenting out the
> > > > registration of the cpuidle driver, just to be sure. I saw no
> > > > difference.  
> > > 
> > > fsl,err006687-workaround-present requires a specific HW workaround, see
> > > [1]. So this is not applicable on every module.  
> > 
> > Based on the information provided above, do you think I can rely on the
> > HW workaround?  
> 
> The original U-Boot auto-detects whether the hardware workaround is present
> and selects the appropriate device tree by default, either variant A or B,
> for MBa6x usage.

So apparently the hardware workaround is present on my module and is
already enabled in software. The erratum would then not be the real
issue, but would just make it worse. I think I have diagnosed an issue
related to DMA reads from RAM happening concurrently with the IPU.
Here is the link to the new discussion:
https://lists.freedesktop.org/archives/dri-devel/2023-October/428251.html

> > I've tried disabling the registration of both the CPUidle and CPUfreq
> > drivers in the machine code and I see a real difference. The transfers
> > are still not perfect though, but I believe this is related to the ~1%
> > drop of the RGMII lines (timings are not perfect, but I could not
> > extend them more).
> > 
> > I believe if the hardware workaround is not available on this module I
> > can still disable CPUidle and CPUfreq as a workaround of the
> > workaround...?  
> 
> It's hard to say without knowing the cause of your problem. I didn't see any of
> these problems here.
> 
> > > > By the way, we tried with a TQ eval board with this SoM and saw the same
> > > > issue (not me, I don't have this board in hands). Don't you experience
> > > > something similar? I went across a couple of people reporting similar
> > > > issues with these modules but none of them reported how they fixed it
> > > > (if they did). I tried two different images based on TQ's Github using
> > > > v4.14.69 and v5.10 kernels.  
> 
> You mentioned a couple of other people having similar problems with these 
> modules. Can you tell me more about those? I'd like to gather more 
> information. Thanks.

I searched again and found this one which really looked identical to my
initial issue:
https://community.nxp.com/t5/i-MX-Processors/Why-Imx6q-ethernet-is-too-slow/m-p/918992
Plus one other which I cannot find anymore.

> 
> Best regards,
> Alexander

Thanks,
Miquèl


* Re: Ethernet issue on imx6
  2023-10-13 15:51       ` Andrew Lunn
@ 2023-10-27 20:58         ` Miquel Raynal
  2023-11-17 15:09           ` Miquel Raynal
  0 siblings, 1 reply; 26+ messages in thread
From: Miquel Raynal @ 2023-10-27 20:58 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Stephen Hemminger, Wei Fang, Shenwei Wang, Clark Wang,
	Russell King, davem, edumazet, kuba, pabeni, linux-imx, netdev,
	Thomas Petazzoni, Alexandre Belloni, Maxime Chevallier

Hi Andrew,

andrew@lunn.ch wrote on Fri, 13 Oct 2023 17:51:20 +0200:

> > # ethtool -S eth0
> > NIC statistics:
> >      tx_dropped: 0
> >      tx_packets: 10118
> >      tx_broadcast: 0
> >      tx_multicast: 13
> >      tx_crc_errors: 0
> >      tx_undersize: 0
> >      tx_oversize: 0
> >      tx_fragment: 0
> >      tx_jabber: 0
> >      tx_collision: 0
> >      tx_64byte: 130
> >      tx_65to127byte: 61031
> >      tx_128to255byte: 19
> >      tx_256to511byte: 10
> >      tx_512to1023byte: 5
> >      tx_1024to2047byte: 14459
> >      tx_GTE2048byte: 0
> >      tx_octets: 26219280  
> 
> These values come from the hardware. They should reflect what actually
> made it onto the wire.
> 
> Do the values match what the link peer actually received?
> 
> Also, can you compare them to what iperf says it transmitted.
> 
> From this, we can rule out the industrial cable, and should also be
> able to tell whether the receiver is the problem, not the transmitter.

I've investigated this further and found a strange relationship with
the display subsystem. It seems like there is some congestion happening
at the interconnect level. I wanted to point out that your hints helped
as I observed that the above counters were incrementing as expected,
but the packets were just not sent out. My interpretation is some
kind of uDMA timeout caused by some hardware locking on the NIC by
the IPU which cannot be diagnosed at the ENET level (the interrupt
handler is firing but the skb's are not sent out, but we have no
error status for that).
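
For reference, the counter comparison suggested above can be sketched
as follows. This is only an illustration: `parse_ethtool_stats` and
`tx_shortfall` are hypothetical helper names, the sample text is a
shortened excerpt of the output quoted above, and the receiver-side
packet count is a made-up placeholder that would have to come from the
link partner.

```python
def parse_ethtool_stats(text):
    """Parse `ethtool -S <iface>` output into a {counter_name: int} dict."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        # Skip the "NIC statistics:" header and anything non-numeric.
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def tx_shortfall(stats, peer_rx_packets):
    """Packets the MAC counted as transmitted that the peer did not receive."""
    return stats["tx_packets"] - peer_rx_packets


sample = """\
NIC statistics:
     tx_dropped: 0
     tx_packets: 10118
     tx_crc_errors: 0
     tx_octets: 26219280
"""

stats = parse_ethtool_stats(sample)
# 9000 is a placeholder for the peer's count, not a measured value.
print(tx_shortfall(stats, peer_rx_packets=9000))
```

A positive shortfall while tx_dropped stays at 0 would match what I
describe here: the MAC counts the frames as sent, yet they never reach
the peer.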

Here is the link of the thread I've just started with DRM people in
order to really tackle this issue:
https://lists.freedesktop.org/archives/dri-devel/2023-October/428251.html

Thanks,
Miquèl


* Re: Ethernet issue on imx6
  2023-10-27 20:58         ` Miquel Raynal
@ 2023-11-17 15:09           ` Miquel Raynal
  0 siblings, 0 replies; 26+ messages in thread
From: Miquel Raynal @ 2023-11-17 15:09 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Stephen Hemminger, Wei Fang, Shenwei Wang, Clark Wang,
	Russell King, davem, edumazet, kuba, pabeni, linux-imx, netdev,
	Thomas Petazzoni, Alexandre Belloni, Maxime Chevallier

Hello,

> I've investigated this further and found a strange relationship with
> the display subsystem. It seems like there is some congestion happening
> at the interconnect level. I wanted to point out that your hints helped
> as I observed that the above counters were incrementing as expected,
> but the packets were just not sent out. My interpretation is some
> kind of uDMA timeout caused by some hardware locking on the NIC by
> the IPU which cannot be diagnosed at the ENET level (the interrupt
> handler is firing but the skb's are not sent out, but we have no
> error status for that).
> 
> Here is the link of the thread I've just started with DRM people in
> order to really tackle this issue:
> https://lists.freedesktop.org/archives/dri-devel/2023-October/428251.html

For future reference, the thread mentioned above unfortunately did not
lead to any discussion (I admit it's not a common topic), and further
investigation pointed at the DDR configuration. I had a hard time
linking the misconfigured reset pad of the DDR controller to the
Ethernet drop rate, and I still cannot explain it, but in practice this
very small change apparently had a significant impact and totally
solved our issue:
https://lore.kernel.org/u-boot/20231117150044.1792080-1-miquel.raynal@bootlin.com/

Thanks to all of you for your help and feedback,
Miquèl


Thread overview: 26+ messages
2023-10-12 17:34 Ethernet issue on imx6 Miquel Raynal
2023-10-12 19:39 ` Russell King (Oracle)
2023-10-13  8:40   ` Miquel Raynal
2023-10-13 10:16     ` Wei Fang
2023-10-16 11:49     ` Eric Dumazet
2023-10-16 13:58       ` Miquel Raynal
2023-10-16 15:06         ` Eric Dumazet
2023-10-16 15:36         ` Miquel Raynal
2023-10-16 19:37           ` Eric Dumazet
2023-10-16 21:47             ` Russell King (Oracle)
2023-10-17 11:19             ` Miquel Raynal
2023-10-12 20:46 ` Andrew Lunn
2023-10-12 22:58   ` Stephen Hemminger
2023-10-13  8:27     ` Miquel Raynal
2023-10-13 15:51       ` Andrew Lunn
2023-10-27 20:58         ` Miquel Raynal
2023-11-17 15:09           ` Miquel Raynal
2023-10-16  8:48       ` Alexander Stein
2023-10-16 13:31         ` Miquel Raynal
2023-10-16 14:41           ` Alexander Stein
2023-10-17 10:49             ` Miquel Raynal
2023-10-18  9:08               ` Alexander Stein
2023-10-27 20:58                 ` Miquel Raynal
2023-10-13  8:50 ` James Chapman
2023-10-13 10:37   ` Miquel Raynal
2023-10-13 11:54     ` James Chapman
