linux-omap.vger.kernel.org archive mirror
* am335x: performance issues with FTDI and LOW_LATENCY
@ 2023-02-28  7:59 Yegor Yefremov
  2023-03-06  7:42 ` Tony Lindgren
  0 siblings, 1 reply; 7+ messages in thread
From: Yegor Yefremov @ 2023-02-28  7:59 UTC
  To: Linux-OMAP; +Cc: Bin Liu, Johan Hovold

I have the same am335x-based system running with both 3.18.x and 5.4.x
(6.2.1 too) kernels. In the full setup the system handles 4x FT4232-H
chips.

# lsusb -t
/:  Bus 02.Port 1: Dev 1, Class=root_hub, Driver=musb-hdrc/1p, 480M
    |__ Port 1: Dev 2, If 0, Class=, Driver=hub/4p, 480M
        |__ Port 2: Dev 3, If 0, Class=, Driver=ftdi_sio, 480M
        |__ Port 2: Dev 3, If 1, Class=, Driver=ftdi_sio, 480M
        |__ Port 2: Dev 3, If 2, Class=, Driver=ftdi_sio, 480M
        |__ Port 2: Dev 3, If 3, Class=, Driver=ftdi_sio, 480M
        |__ Port 3: Dev 4, If 1, Class=, Driver=ftdi_sio, 480M
        |__ Port 3: Dev 4, If 2, Class=, Driver=ftdi_sio, 480M
        |__ Port 3: Dev 4, If 0, Class=, Driver=ftdi_sio, 480M
        |__ Port 3: Dev 4, If 3, Class=, Driver=ftdi_sio, 480M
/:  Bus 01.Port 1: Dev 1, Class=root_hub, Driver=musb-hdrc/1p, 480M
    |__ Port 1: Dev 2, If 0, Class=, Driver=hub/4p, 480M
        |__ Port 2: Dev 3, If 0, Class=, Driver=ftdi_sio, 480M
        |__ Port 2: Dev 3, If 3, Class=, Driver=ftdi_sio, 480M
        |__ Port 2: Dev 3, If 1, Class=, Driver=ftdi_sio, 480M
        |__ Port 2: Dev 3, If 2, Class=, Driver=ftdi_sio, 480M
        |__ Port 3: Dev 4, If 0, Class=, Driver=ftdi_sio, 480M
        |__ Port 3: Dev 4, If 1, Class=, Driver=ftdi_sio, 480M
        |__ Port 3: Dev 4, If 2, Class=, Driver=ftdi_sio, 480M
        |__ Port 3: Dev 4, If 3, Class=, Driver=ftdi_sio, 480M

When I open all 16 serial ports with the LOW_LATENCY flag on the
latest kernels, system performance drops dramatically. The effect is
easiest to see with iperf3:

Kernel: 6.2.1
16 serial ports closed: 90.2 Mbits/sec
16 serial ports opened without LOW_LATENCY: 88.3 Mbits/sec
16 serial ports opened with LOW_LATENCY: 12.1 Mbits/sec

Kernel: 3.18.1
16 serial ports closed: 61.1 Mbits/sec
16 serial ports opened without LOW_LATENCY: 53.7 Mbits/sec
16 serial ports opened with LOW_LATENCY: 37.2 Mbits/sec
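
To put these numbers in relative terms, here is a small sketch that
computes the throughput drop between two iperf3 readings. The helper
name drop_percent is made up for illustration; only the Mbits/sec
values come from the measurements above.

```c
/* Relative throughput drop in percent between a baseline reading
 * and a degraded reading (both in the same unit, e.g. Mbits/sec).
 * drop_percent is a hypothetical helper, not part of any tool. */
double drop_percent(double baseline, double degraded)
{
    return (baseline - degraded) / baseline * 100.0;
}
```

Comparing "opened without LOW_LATENCY" against "opened with
LOW_LATENCY", drop_percent(88.3, 12.1) gives about 86% on 6.2.1, while
drop_percent(53.7, 37.2) gives about 31% on 3.18.1.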

Any idea why the performance drop is so big?

Regards,
Yegor

* Re: am335x: performance issues with FTDI and LOW_LATENCY
  2023-02-28  7:59 am335x: performance issues with FTDI and LOW_LATENCY Yegor Yefremov
@ 2023-03-06  7:42 ` Tony Lindgren
  2023-03-07  9:53   ` Yegor Yefremov
  0 siblings, 1 reply; 7+ messages in thread
From: Tony Lindgren @ 2023-03-06  7:42 UTC
  To: Yegor Yefremov; +Cc: Linux-OMAP, Bin Liu, Johan Hovold

* Yegor Yefremov <yegorslists@googlemail.com> [230228 08:01]:
> Any idea why the performance drop is so big?

Maybe lots of interrupts and dma not being used for musb in this case?

Tony

* Re: am335x: performance issues with FTDI and LOW_LATENCY
  2023-03-06  7:42 ` Tony Lindgren
@ 2023-03-07  9:53   ` Yegor Yefremov
  2023-03-09  7:30     ` Tony Lindgren
  0 siblings, 1 reply; 7+ messages in thread
From: Yegor Yefremov @ 2023-03-07  9:53 UTC
  To: Tony Lindgren; +Cc: Linux-OMAP, Bin Liu, Johan Hovold

On Mon, Mar 6, 2023 at 8:42 AM Tony Lindgren <tony@atomide.com> wrote:
>
> * Yegor Yefremov <yegorslists@googlemail.com> [230228 08:01]:
> > Any idea why the performance drop is so big?
>
> Maybe lots of interrupts and dma not being used for musb in this case?

Using "irqtop -d 1", I get the following results:

3.18.1 LATENCY_OFF (16 ports): ca. 1000 IRQs/s INTC 17 47400000.dma-controller
3.18.1 LATENCY_ON (16 ports): ca. 4000 IRQs/s INTC 17 47400000.dma-controller

6.2.1 LATENCY_OFF (16 ports): ca. 300 IRQs/s INTC 17 47400000.dma-controller
6.2.1 LATENCY_ON (16 ports): ca. 1000 IRQs/s INTC 17 47400000.dma-controller
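
Where irqtop is not available, similar rates can be estimated by
sampling /proc/interrupts twice and taking the difference. The sketch
below is only an illustration: the function names irq_line_count and
irq_count_by_name are made up, and it assumes the usual line layout
"<irq>: <per-CPU counters> <chip> ...".

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Sum the per-CPU counters of one /proc/interrupts line, e.g.
 * " 17:   123456   INTC  17 Level   47400000.dma-controller".
 * Returns -1 if the line has no "<label>:" prefix. */
long long irq_line_count(const char *line)
{
    const char *p = strchr(line, ':');
    long long total = 0;

    if (!p)
        return -1;
    p++;
    for (;;) {
        char *end;

        while (*p == ' ' || *p == '\t')
            p++;
        if (!isdigit((unsigned char)*p))
            break;              /* reached the irqchip name */
        unsigned long long v = strtoull(p, &end, 10);
        /* A counter is followed by whitespace; a chip name such as
         * "44e07000.gpio" starts with digits but is not a counter. */
        if (*end != ' ' && *end != '\t' && *end != '\n' && *end != '\0')
            break;
        total += (long long)v;
        p = end;
    }
    return total;
}

/* Return the counter sum of the first line in `path` containing
 * `name`, or -1 if the file cannot be read or no line matches. */
long long irq_count_by_name(const char *path, const char *name)
{
    char line[512];
    long long count = -1;
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        if (strstr(line, name)) {
            count = irq_line_count(line);
            break;
        }
    }
    fclose(f);
    return count;
}
```

Calling irq_count_by_name("/proc/interrupts", "47400000.dma-controller")
twice, one second apart, and subtracting the two results gives an
IRQs/s figure comparable to the ones listed above.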

# zcat /proc/config.gz | grep CPP
CONFIG_USB_TI_CPPI41_DMA=y
CONFIG_TI_CPPI41=y

Looks like 3.18.1 can handle more interrupts than 6.2.1.

Yegor

* Re: am335x: performance issues with FTDI and LOW_LATENCY
  2023-03-07  9:53   ` Yegor Yefremov
@ 2023-03-09  7:30     ` Tony Lindgren
  2023-03-10 22:35       ` Bin Liu
  0 siblings, 1 reply; 7+ messages in thread
From: Tony Lindgren @ 2023-03-09  7:30 UTC
  To: Yegor Yefremov; +Cc: Linux-OMAP, Bin Liu, Johan Hovold

* Yegor Yefremov <yegorslists@googlemail.com> [230307 09:53]:
> On Mon, Mar 6, 2023 at 8:42 AM Tony Lindgren <tony@atomide.com> wrote:
> >
> > * Yegor Yefremov <yegorslists@googlemail.com> [230228 08:01]:
> > > Any idea why the performance drop is so big?
> >
> > Maybe lots of interrupts and dma not being used for musb in this case?
> 
> Using "irqtop -d 1", I get the following results:
> 
> 3.18.1 LATENCY_OFF (16 ports): ca. 1000 IRQs/s INTC 17 47400000.dma-controller
> 3.18.1 LATENCY_ON (16 ports): ca. 4000 IRQs/s INTC 17 47400000.dma-controller
> 
> 6.2.1 LATENCY_OFF (16 ports): ca. 300 IRQs/s INTC 17 47400000.dma-controller
> 6.2.1 LATENCY_ON (16 ports): ca. 1000 IRQs/s INTC 17 47400000.dma-controller

Hmm I wonder what's causing that. Earlier the Ethernet gadget had some
alignment define tweak that made transfers faster. What kind of data
transfer are you testing with?

> #zcat /proc/config.gz | grep CPP
> CONFIG_USB_TI_CPPI41_DMA=y
> CONFIG_TI_CPPI41=y

From what I recall musb still handles short transfers with PIO, I think
this is the case also for cppi41 dma. Sounds like that does not explain
the difference you're seeing between 3.18 and 6.2 kernels though.

Regards,

Tony

* Re: am335x: performance issues with FTDI and LOW_LATENCY
  2023-03-09  7:30     ` Tony Lindgren
@ 2023-03-10 22:35       ` Bin Liu
  2023-03-13  7:08         ` Yegor Yefremov
  0 siblings, 1 reply; 7+ messages in thread
From: Bin Liu @ 2023-03-10 22:35 UTC
  To: Tony Lindgren; +Cc: Yegor Yefremov, Linux-OMAP, Johan Hovold

On Thu, Mar 09, 2023 at 09:30:00AM +0200, Tony Lindgren wrote:
> * Yegor Yefremov <yegorslists@googlemail.com> [230307 09:53]:
> > On Mon, Mar 6, 2023 at 8:42 AM Tony Lindgren <tony@atomide.com> wrote:
> > >
> > > * Yegor Yefremov <yegorslists@googlemail.com> [230228 08:01]:
> > > > Any idea why the performance drop is so big?
> > >
> > > Maybe lots of interrupts and dma not being used for musb in this case?
> > 
> > Using "irqtop -d 1", I get the following results:
> > 
> > 3.18.1 LATENCY_OFF (16 ports): ca. 1000 IRQs/s INTC 17 47400000.dma-controller
> > 3.18.1 LATENCY_ON (16 ports): ca. 4000 IRQs/s INTC 17 47400000.dma-controller
> > 
> > 6.2.1 LATENCY_OFF (16 ports): ca. 300 IRQs/s INTC 17 47400000.dma-controller
> > 6.2.1 LATENCY_ON (16 ports): ca. 1000 IRQs/s INTC 17 47400000.dma-controller
> 
> Hmm I wonder what's causing that. Earlier the Ethernet gadget had some
> alignment define tweak that made transfers faster. What kind of data
> transfer are you testing with?
> 
> > #zcat /proc/config.gz | grep CPP
> > CONFIG_USB_TI_CPPI41_DMA=y
> > CONFIG_TI_CPPI41=y
> 
> From what I recall musb still handles short transfers with PIO, I think
> this is the case also for cppi41 dma. Sounds like that does not explain
> the difference you're seeing between 3.18 and 6.2 kernels though.

I quickly scanned the changes to musb_cppi41.c and dma/cppi41.c between
v3.18 and v5.4, but nothing stands out. I wonder if this is caused by
something outside the USB subsystem...

-Bin.

* Re: am335x: performance issues with FTDI and LOW_LATENCY
  2023-03-10 22:35       ` Bin Liu
@ 2023-03-13  7:08         ` Yegor Yefremov
  2023-03-23  7:06           ` Tony Lindgren
  0 siblings, 1 reply; 7+ messages in thread
From: Yegor Yefremov @ 2023-03-13  7:08 UTC
  To: Bin Liu, Tony Lindgren, Yegor Yefremov, Linux-OMAP, Johan Hovold

On Fri, Mar 10, 2023 at 11:35 PM Bin Liu <b-liu@ti.com> wrote:
>
> On Thu, Mar 09, 2023 at 09:30:00AM +0200, Tony Lindgren wrote:
> > * Yegor Yefremov <yegorslists@googlemail.com> [230307 09:53]:
> > > On Mon, Mar 6, 2023 at 8:42 AM Tony Lindgren <tony@atomide.com> wrote:
> > > >
> > > > * Yegor Yefremov <yegorslists@googlemail.com> [230228 08:01]:
> > > > > Any idea why the performance drop is so big?
> > > >
> > > > Maybe lots of interrupts and dma not being used for musb in this case?
> > >
> > > Using "irqtop -d 1", I get the following results:
> > >
> > > 3.18.1 LATENCY_OFF (16 ports): ca. 1000 IRQs/s INTC 17 47400000.dma-controller
> > > 3.18.1 LATENCY_ON (16 ports): ca. 4000 IRQs/s INTC 17 47400000.dma-controller
> > >
> > > 6.2.1 LATENCY_OFF (16 ports): ca. 300 IRQs/s INTC 17 47400000.dma-controller
> > > 6.2.1 LATENCY_ON (16 ports): ca. 1000 IRQs/s INTC 17 47400000.dma-controller
> >
> > Hmm I wonder what's causing that. Earlier the Ethernet gadget had some
> > alignment define tweak that made transfers faster. What kind of data
> > transfer are you testing with?
> >
> > > #zcat /proc/config.gz | grep CPP
> > > CONFIG_USB_TI_CPPI41_DMA=y
> > > CONFIG_TI_CPPI41=y
> >
> > From what I recall musb still handles short transfers with PIO, I think
> > this is the case also for cppi41 dma. Sounds like that does not explain
> > the difference you're seeing between 3.18 and 6.2 kernels though.
>
> I quickly scanned the changes to musb_cppi41.c and dma/cppi41.c between
> v3.18 and v5.4, but nothing stands out. I wonder if this is caused by
> something outside the USB subsystem...

As for the network transfer, here is some background info. The devices
use SNMP (also internally) to handle device configuration data. The
issue was first noticed when devices with 8 serial ports responded very
slowly when opening their web interface (on a 16-port device, opening a
web page took more than 2 minutes). Profiling showed the system was
busy handling UDP transactions (internal UDP requests to collect data
for the web interface).

The devices that use only OMAP UARTs (one- and two-port devices)
didn't show this behavior, so the cause was narrowed down to the FTDIs.
To examine their impact on the system without our firmware, I wrote a
small program that opens as many ports as I need and optionally sets
the LOW_LATENCY flag. iperf3 with default settings (TCP) clearly shows
the influence of the LOW_LATENCY flag.
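
The test program itself was not posted. As a minimal sketch of what
such a program could do per port, the flag can be set through the
generic TIOCGSERIAL/TIOCSSERIAL ioctls with ASYNC_LOW_LATENCY. The
function names set_low_latency and open_port are made up for this
example, and the device path is an assumption. As far as I know,
ftdi_sio reacts to this flag by lowering the chip's latency timer to
1 ms, which increases the number of USB transfers per port.

```c
#include <fcntl.h>
#include <linux/serial.h>   /* struct serial_struct, ASYNC_LOW_LATENCY */
#include <sys/ioctl.h>      /* TIOCGSERIAL, TIOCSSERIAL */
#include <unistd.h>

/* Set or clear ASYNC_LOW_LATENCY on an already opened tty.
 * Returns 0 on success, -1 on error (errno is set by ioctl). */
int set_low_latency(int fd, int enable)
{
    struct serial_struct ss;

    if (ioctl(fd, TIOCGSERIAL, &ss) < 0)
        return -1;
    if (enable)
        ss.flags |= ASYNC_LOW_LATENCY;
    else
        ss.flags &= ~ASYNC_LOW_LATENCY;
    return ioctl(fd, TIOCSSERIAL, &ss) < 0 ? -1 : 0;
}

/* Open one port and apply the flag.
 * Returns the file descriptor, or -1 on error. */
int open_port(const char *path, int low_latency)
{
    int fd = open(path, O_RDWR | O_NOCTTY | O_NONBLOCK);

    if (fd < 0)
        return -1;
    if (set_low_latency(fd, low_latency) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A test run would call open_port() in a loop over the ports to examine
(for example /dev/ttyUSB0 to /dev/ttyUSB15, assuming that naming) and
keep the descriptors open while iperf3 is running.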

"modprobe mtd_speedtest" shows 50% performance degradation with 16
ports open and the LOW_LATENCY flag.

Any idea how I can get more info about what's going on in the kernel?

Yegor

* Re: am335x: performance issues with FTDI and LOW_LATENCY
  2023-03-13  7:08         ` Yegor Yefremov
@ 2023-03-23  7:06           ` Tony Lindgren
  0 siblings, 0 replies; 7+ messages in thread
From: Tony Lindgren @ 2023-03-23  7:06 UTC
  To: Yegor Yefremov; +Cc: Bin Liu, Linux-OMAP, Johan Hovold

* Yegor Yefremov <yegorslists@googlemail.com> [230313 07:09]:
> On Fri, Mar 10, 2023 at 11:35 PM Bin Liu <b-liu@ti.com> wrote:
> >
> > On Thu, Mar 09, 2023 at 09:30:00AM +0200, Tony Lindgren wrote:
> > > * Yegor Yefremov <yegorslists@googlemail.com> [230307 09:53]:
> > > > On Mon, Mar 6, 2023 at 8:42 AM Tony Lindgren <tony@atomide.com> wrote:
> > > > >
> > > > > * Yegor Yefremov <yegorslists@googlemail.com> [230228 08:01]:
> > > > > > Any idea why the performance drop is so big?
> > > > >
> > > > > Maybe lots of interrupts and dma not being used for musb in this case?
> > > >
> > > > Using "irqtop -d 1", I get the following results:
> > > >
> > > > 3.18.1 LATENCY_OFF (16 ports): ca. 1000 IRQs/s INTC 17 47400000.dma-controller
> > > > 3.18.1 LATENCY_ON (16 ports): ca. 4000 IRQs/s INTC 17 47400000.dma-controller
> > > >
> > > > 6.2.1 LATENCY_OFF (16 ports): ca. 300 IRQs/s INTC 17 47400000.dma-controller
> > > > 6.2.1 LATENCY_ON (16 ports): ca. 1000 IRQs/s INTC 17 47400000.dma-controller
> > >
> > > Hmm I wonder what's causing that. Earlier the Ethernet gadget had some
> > > alignment define tweak that made transfers faster. What kind of data
> > > transfer are you testing with?
> > >
> > > > #zcat /proc/config.gz | grep CPP
> > > > CONFIG_USB_TI_CPPI41_DMA=y
> > > > CONFIG_TI_CPPI41=y
> > >
> > > From what I recall musb still handles short transfers with PIO, I think
> > > this is the case also for cppi41 dma. Sounds like that does not explain
> > > the difference you're seeing between 3.18 and 6.2 kernels though.
> >
> > I quickly scanned the changes to musb_cppi41.c and dma/cppi41.c between
> > v3.18 and v5.4, but nothing stands out. I wonder if this is caused by
> > something outside the USB subsystem...
> 
> As for the network transfer, here is some background info. The devices
> use SNMP (also internally) to handle device configuration data. The
> issue was first noticed when devices with 8 serial ports responded very
> slowly when opening their web interface (on a 16-port device, opening a
> web page took more than 2 minutes). Profiling showed the system was
> busy handling UDP transactions (internal UDP requests to collect data
> for the web interface).
> 
> The devices that use only OMAP UARTs (one- and two-port devices)
> didn't show this behavior, so the cause was narrowed down to the FTDIs.
> To examine their impact on the system without our firmware, I wrote a
> small program that opens as many ports as I need and optionally sets
> the LOW_LATENCY flag. iperf3 with default settings (TCP) clearly shows
> the influence of the LOW_LATENCY flag.
> 
> "modprobe mtd_speedtest" shows 50% performance degradation with 16
> ports open and the LOW_LATENCY flag.
> 
> Any idea how I can get more info about what's going on in the kernel?

Maybe try adding some trace_printk() to the code.

I'd check the fifo read/write for PIO, those should end up using
ldmia/stmia via the related read and write functions.

And maybe threaded IRQ related changes have caused longer latencies
for PIO transfers? Maybe check DMA related transfers and see if they
too are slower now?

Regards,

Tony
