Question for AMD/Xilinx Zynq PREEMPT_RT configuration check, CAN latency measurement and FOSDEM 2025

public inbox for linux-rt-users@vger.kernel.org

* Question for AMD/Xilinx Zynq PREEMPT_RT configuration check, CAN latency measurement and FOSDEM 2025
@ 2025-01-28 15:29 Pavel Pisa
  2025-01-29 10:17 ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 8+ messages in thread
From: Pavel Pisa @ 2025-01-28 15:29 UTC (permalink / raw)
  To: linux-rt-users, Carsten Emde; +Cc: linux-can, Oliver Hartkopp, Jan Altenberg

[-- Attachment #1: Type: text/plain, Size: 3533 bytes --]

Dear real time community,

we are long-time users of PREEMPT_RT on several platforms and work
on CAN/CAN FD and motion control support for Linux and other RTOSes.

We have contributed CAN latency testing for decades, and we have been
running our latest CAN latency testing solution on Linux mainline
and PREEMPT_RT kernels continuously since March 2023

  https://canbus.pages.fel.cvut.cz/#can-bus-channels-mutual-latency-testing

  https://canbus.pages.fel.cvut.cz/can-latester/

We use the attached configuration file.

Please check whether you find any problematic choices.
The cyclictest runs worked well, and we have even delivered two systems
to the OSADL QA real-time farm

  https://www.osadl.org/?id=4109

The maximal latency is under 200 usec. We have even run pysimCoder

  https://github.com/robertobucher/pysimCoder

and Matlab/Simulink generated code at 4 kHz for PMSM control etc.

  https://github.com/aa4cc/ert_linux

All works well.

However, the CAN/CAN FD communication latency measured on the CTU CAN FD IP
core is far from optimal. Some runs under load show latencies of
10 msec. Our own CAN FD stack for RTEMS stays, without exception,
under 60 usec on the same hardware.

I understand that the Linux socket layer and networking
stack are complex, and that much optimization work lies ahead.
We will be happy to contribute where we can, and to find time
and even some resources to engage more students.

But I would like to be sure that the bad results are not
caused by our mistakes in configuration.

I will be happy to meet you and discuss Linux and other
control and real-time areas at FOSDEM 2025.

I was interested in presenting our students' projects there:
an open-source motion control system

  https://gitlab.fel.cvut.cz/otrees/motion/samocon

and an online aid/website for training, exercises and education
in computer architectures, built around our QtRvSim simulator

  https://comparch.edu.cvut.cz/online-tools/webeval/

But these talks have not been accepted.

I also offered a talk about PREEMPT_RT and our control
and CAN projects, because I had prepared one for our
local community in Czech. That talk has been accepted,
and it seems that no developer more familiar
with Linux RT has submitted a talk, so I will try
to cover some of the PREEMPT_RT history as well.
I would be happy to receive feedback and suggestions
for corrections.

Original talk in Czech

  https://talks.openalt.cz/openalt-2024/talk/3XTMDF/

Slides in English which I want to update/correct for FOSDEM

  https://talks.openalt.cz/media/openalt-2024/submissions/3XTMDF/resources/openalt24_linux_for_rt-reduced_FbZPuS0.pdf

FOSDEM 2025 talk abstract

https://fosdem.org/2025/schedule/event/fosdem-2025-5411-linux-kernel-mainline-real-time-history-support-and-experience-based-on-robotic-and-automotive-projects/

Best wishes,

                Pavel

                Pavel Pisa
    phone:      +420 603531357
    e-mail:     pisa@cmp.felk.cvut.cz
    Department of Control Engineering FEE CVUT
    Karlovo namesti 13, 121 35, Prague 2
    university: http://control.fel.cvut.cz/
    personal:   http://cmp.felk.cvut.cz/~pisa
    company:    https://pikron.com/ PiKRON s.r.o.
    Kankovskeho 1235, 182 00 Praha 8, Czech Republic
    projects:   https://www.openhub.net/accounts/ppisa
    social:     https://social.kernel.org/ppisa
    CAN related:http://canbus.pages.fel.cvut.cz/
    RISC-V education: https://comparch.edu.cvut.cz/
    Open Technologies Research Education and Exchange Services
    https://gitlab.fel.cvut.cz/otrees/org/-/wikis/home

[-- Attachment #2: linux-6.13.0-rc6-rt3-microzed-config.gz --]
[-- Type: application/x-gzip, Size: 61183 bytes --]

* Re: Question for AMD/Xilinx Zynq PREEMPT_RT configuration check, CAN latency measurement and FOSDEM 2025
  2025-01-28 15:29 Question for AMD/Xilinx Zynq PREEMPT_RT configuration check, CAN latency measurement and FOSDEM 2025 Pavel Pisa
@ 2025-01-29 10:17 ` Sebastian Andrzej Siewior
  2025-01-29 12:04   ` Pavel Pisa
  0 siblings, 1 reply; 8+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-01-29 10:17 UTC (permalink / raw)
  To: Pavel Pisa
  Cc: linux-rt-users, Carsten Emde, linux-can, Oliver Hartkopp,
	Jan Altenberg

On 2025-01-28 16:29:27 [+0100], Pavel Pisa wrote:
> Please check if you find some problematic choices.

I didn't find anything obviously wrong. Assuming your CPU is busy in
general, you could remove NO_HZ in favour of PERIODIC. This should,
however, not be the cause of the spikes you describe below.
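
A minimal sketch for checking the tick mode in a kernel configuration,
assuming the gzip-compressed config attached to the first message.
show_tick_options is a hypothetical helper name, and the option list is
an assumption about what "NO_HZ" and "PERIODIC" refer to
(the CONFIG_NO_HZ_* family and CONFIG_HZ_PERIODIC).

```shell
#!/bin/bash
# Print the tick-mode and PREEMPT_RT options found in a kernel config that
# is read as plain text on standard input. Lines of the form
# "# CONFIG_X is not set" are printed as well, so a disabled option is visible.
# The option list is an assumption, not taken from the thread.
show_tick_options() {
  grep -E '^(# )?CONFIG_(HZ_PERIODIC|NO_HZ|NO_HZ_COMMON|NO_HZ_IDLE|NO_HZ_FULL|PREEMPT_RT)[= ]'
}

# Usage (file name as attached to the original message):
#   zcat linux-6.13.0-rc6-rt3-microzed-config.gz | show_tick_options
```

Reading from standard input keeps the helper usable with /proc/config.gz
as well, where the running kernel provides it.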

> The cyclic test worked well, and we have even delivered two systems
> to OSADL QA real-time farm 
> 
>   https://www.osadl.org/?id=4109

It shows "IRQ work interrupts". Not sure what causes them.

> However, the CAN/CAN FD communication latency measured on the CTU CAN FD IP
> core is far from optimal. Some runs under load with
> 10 msec latency. Our own CAN FD stack for RTEMS keeps with no exception
> under 60 usec on the same hardware.
> 
> I understand that the Linux socket layer and networking
> stack are complex, and many optimizations are ahead.
> We will be happy to contribute where we can and find time
> and even some resources to engage more students etc...
> 
> But I would like to be sure that the bad results are not
> caused by our mistakes in configuration.

You have CAN and "regular networking". My guess would be that regular
networking blocks BH and so your CAN. You could try to have all other
interrupts serviced on CPU0 and move CAN to CPU1. If that is the cause,
this should improve things. Other than that, I would suggest getting some
tracing to see what delays your CAN interrupts and/or handling in general.

> I will be happy to meet you and discuss Linux and other
> control and real-time areas at FOSDEM 2025.

I should be able to make it.

…
> Slides in English which I want to update/correct for FOSDEM
> 
>   https://talks.openalt.cz/media/openalt-2024/submissions/3XTMDF/resources/openalt24_linux_for_rt-reduced_FbZPuS0.pdf

Looks good. If you want additional history points, I have some at
	https://files.speakerdeck.com/presentations/0620b5b3a00b42fc91fba6cc4092d278/KR_2024_PREEMPT_RT_over_the_years.pdf
	Slides 11 - 21.

However, you already have most of the pieces.

> Best wishes,
> 
>                 Pavel
> 

Sebastian

* Re: Question for AMD/Xilinx Zynq PREEMPT_RT configuration check, CAN latency measurement and FOSDEM 2025
  2025-01-29 10:17 ` Sebastian Andrzej Siewior
@ 2025-01-29 12:04   ` Pavel Pisa
  2025-01-29 14:40     ` Sebastian Andrzej Siewior
  2025-03-28 12:04     ` CAN latency measurement on AMD/Xilinx Zynq with PREEMPT_RT - added threaded NAPI configuration Pavel Pisa
  0 siblings, 2 replies; 8+ messages in thread
From: Pavel Pisa @ 2025-01-29 12:04 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: linux-rt-users, Carsten Emde, linux-can, Oliver Hartkopp,
	Jan Altenberg, Pavel Hronek

Hello Sebastian,

On Wednesday 29 of January 2025 11:17:09 Sebastian Andrzej Siewior wrote:
> On 2025-01-28 16:29:27 [+0100], Pavel Pisa wrote:
> > Please check if you find some problematic choices.
>
> I didn't find anything obviously wrong. Assuming your CPU is busy in
> general you could remove NO_HZ in favour of PERIODIC. This is however
> not to cause spikes you describe below.

Great, thanks very much for the expert review.

> > The cyclic test worked well, and we have even delivered two systems
> > to OSADL QA real-time farm
> >
> >   https://www.osadl.org/?id=4109
>
> It shows "IRQ work interrupts". Not sure what causes them.

I am not sure either. That list is from an old kernel
in the long-term testing setup at OSADL.

The current one shows no IRQ work interrupts
after the last reboot and an overnight test

Linux mzapo 6.13.0-rc6-rt3-dut #1 SMP PREEMPT_RT
Wed Jan 29 04:46:40 CET 2025 armv7l GNU/Linux

           CPU0       CPU1
 24:          0          0 GIC-0  27 Edge      gt
 25:     700822     327164 GIC-0  29 Edge      twd
 26:        300          0 GIC-0  59 Level     xuartps
 29:          0          0 GIC-0  45 Level     f8003000.dmac
 30:          0          0 GIC-0  46 Level     f8003000.dmac
 31:          0          0 GIC-0  47 Level     f8003000.dmac
 32:          0          0 GIC-0  48 Level     f8003000.dmac
 33:          0          0 GIC-0  49 Level     f8003000.dmac
 34:          0          0 GIC-0  72 Level     f8003000.dmac
 35:          0          0 GIC-0  73 Level     f8003000.dmac
 36:          0          0 GIC-0  74 Level     f8003000.dmac
 37:          0          0 GIC-0  75 Level     f8003000.dmac
 40:     460330          0 GIC-0  54 Level     end0
 41:          0          0 GIC-0  53 Level     e0002000.usb
 42:        356          0 GIC-0  56 Level     mmc0
 43:          0          0 GIC-0  43 Level     ttc_clockevent
 44:         25          0 GIC-0  39 Level     f8007100.adc
 45:          0          0 GIC-0  37 Level     arm-pmu
 46:          0          0 GIC-0  38 Level     arm-pmu
 47:        128          0 GIC-0  40 Level     f8007000.devcfg
 48:     314697          0 GIC-0  61 Level     can2
 49:     314597          0 GIC-0  62 Level     can3
 50:     314759          0 GIC-0  63 Level     can4
 51:     311516          0 GIC-0  64 Level     can5
IPI0:          0          0  CPU wakeup interrupts
IPI1:          0          0  Timer broadcast interrupts
IPI2:      17849     292126  Rescheduling interrupts
IPI3:       5923      11315  Function call interrupts
IPI4:          0          0  CPU stop interrupts
IPI5:     271078      74040  IRQ work interrupts
IPI6:          0          0  completion interrupts
Err:          0

So this does not seem to be the cause.

> > However, the CAN/CAN FD communication latency measured on the CTU CAN FD
> > IP core is far from optimal. Some runs under load with
> > 10 msec latency. Our own CAN FD stack for RTEMS keeps with no exception
> > under 60 usec on the same hardware.
> >
> > I understand that the Linux socket layer and networking
> > stack are complex, and many optimizations are ahead.
> > We will be happy to contribute where we can and find time
> > and even some resources to engage more students etc...
> >
> > But I would like to be sure that the bad results are not
> > caused by our mistakes in configuration.
>
> You have CAN and "regular networking". My guess would be that regular
> networking blocks blocks BH and so your CAN. You could try to have all
> interrupts serviced on CPU0 and move CAN to CPU1. If so this should 
> improve then. Other than that, I would suggest to get some tracing to
> see what delays your CAN interrupts and/ or handling in general.

Yes, I think that the design mixing regular network packet
processing with CAN is the problem. We even test with a setup where
the CAN interrupt priority is boosted to 90

    echo "-> Rise CAN irq priorities"
    PIDS=$(ps -e | grep -E irq/[0-9]+-can[3-4] | tr -s ' ' | cut -d ' ' -f2)
    TXPID=$(ps -e | grep -E irq/[0-9]+-can2 | tr -s ' ' | cut -d ' ' -f2)
    chrt -f --pid 80 $TXPID
    for pid in $PIDS ; do
        chrt -f --pid 85 $pid
    done

ps Hxa --sort rtprio -o pid,policy,rtprio,state,tname,time,command

...
   70 FF      50 S ?        00:00:00 [irq/37-f8003000.dmac]
   71 FF      50 S ?        00:00:38 [irq/40-eth%d]
...
  405 FF      50 S ?        00:00:00 [irq/26-xuartps]
  355 FF      90 S ?        00:00:06 [irq/48-can2]
  361 FF      90 S ?        00:00:13 [irq/49-can3]
  366 FF      90 S ?        00:00:07 [irq/50-can4]
  371 FF      90 S ?        00:00:06 [irq/51-can5]
   22 FF      99 S ?        00:00:00 [migration/0]
   27 FF      99 S ?        00:00:00 [migration/1]

Even this setup is problematic under load.
The situations with CAN IRQ priority 50 and 90 can be compared
by clicking on the "RT priority set" option

https://canbus.pages.fel.cvut.cz/can-latester/inspect.html?kernel=rt&prio=1&load=1&flood=1&fd=1

The switch between the in-kernel CAN gateway and the user-space one
is controlled by "Kernel GW".

The user-space CAN gateway is run with priority 80

  chrt -r 80 ugw -f can3 can2

I have spotted an interesting trend after

  run-250103-045322-hist+6.13.0-rc1-rt1-g5374fecd2695+flood-prio-fd-load.json

that the user gateway case, a simple copy of frames from can3 to can2,
has never exceeded 1.4 ms for almost one month.
It could be interesting to correlate that with kernel changes.

We use branch

  for-kbuild-bot/current-stable

from

  git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git

to run daily testing. We can consider something different as well,
but this choice was driven by our interest in something
functional each day and ahead of mainline merges, to
catch some problems in advance.

It is interesting that the in-kernel gateway is significantly worse
now. It does not have the overhead of switching to user space. But I am
not sure whether it is invoked in some kernel worker which
has a lower or the same real-time priority as Ethernet networking.

In general, I think that the problem is that incoming
packets (CAN and Ethernet) load the same per-CPU
worker. There are also backlog_napi threads per CPU

   46 TS       - S ?        00:00:00 [backlog_napi/0]
   47 TS       - S ?        00:00:00 [backlog_napi/1]

They even have TS priority. If I remember well, an option has been
added to allocate a separate RX packet processing
thread (instead of the default per-CPU one) for a given interface.
But I have no experience with such a configuration.

Do you, or somebody else, have an idea how to achieve
that, and whether it is legitimate to boost the priority of such
a kernel thread? It could help, because my general experience
with PREEMPT_RT, even on this target, is very positive
for tasks mapping HW directly and doing RT control.
The same holds for the latency tester: spikes under load stay
at 250 usec or less.

> > I will be happy to meet you and discuss Linux and other
> > control and real-time areas at FOSDEM 2025.
>
> I should be able to make it.

Great, I would be happy to meet at FOSDEM or discuss
these topics later at some other event.

> > Slides in English which I want to update/correct for FOSDEM
> >
> >   https://talks.openalt.cz/media/openalt-2024/submissions/3XTMDF/resources/openalt24_linux_for_rt-reduced_FbZPuS0.pdf
>
> looks good. If you want additional history points, I have some at
> 	https://files.speakerdeck.com/presentations/0620b5b3a00b42fc91fba6cc4092d278/KR_2024_PREEMPT_RT_over_the_years.pdf
> 	Slide 11 - 21.

Thanks very much for the input.

> However you have most of the pieces so.
>

Best wishes,

                Pavel

* Re: Question for AMD/Xilinx Zynq PREEMPT_RT configuration check, CAN latency measurement and FOSDEM 2025
  2025-01-29 12:04   ` Pavel Pisa
@ 2025-01-29 14:40     ` Sebastian Andrzej Siewior
  2025-03-28 12:04     ` CAN latency measurement on AMD/Xilinx Zynq with PREEMPT_RT - added threaded NAPI configuration Pavel Pisa
  1 sibling, 0 replies; 8+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-01-29 14:40 UTC (permalink / raw)
  To: Pavel Pisa
  Cc: linux-rt-users, Carsten Emde, linux-can, Oliver Hartkopp,
	Jan Altenberg, Pavel Hronek

On 2025-01-29 13:04:15 [+0100], Pavel Pisa wrote:
> Hello Sebastian,
Hi Pavel,

> The actual one show none IRQ work interrupts
> after last reboot and overnigh test
> 
> Linux mzapo 6.13.0-rc6-rt3-dut #1 SMP PREEMPT_RT
> Wed Jan 29 04:46:40 CET 2025 armv7l GNU/Linux
…
>            CPU0       CPU1
>  48:     314697          0 GIC-0  61 Level     can2
>  49:     314597          0 GIC-0  62 Level     can3
>  50:     314759          0 GIC-0  63 Level     can4
>  51:     311516          0 GIC-0  64 Level     can5
> IPI0:          0          0  CPU wakeup interrupts
> IPI1:          0          0  Timer broadcast interrupts
> IPI2:      17849     292126  Rescheduling interrupts
> IPI3:       5923      11315  Function call interrupts
> IPI4:          0          0  CPU stop interrupts
> IPI5:     271078      74040  IRQ work interrupts
> IPI6:          0          0  completion interrupts
> Err:          0
> 
> So this seems as no cause.

None you say? I see 271078 on CPU0 and 74040 on the other one.
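
The per-CPU counts can be pulled out of /proc/interrupts with a short
filter. This is only a sketch: irq_work_counts is a hypothetical helper
name, and it assumes the row is labelled "IRQ work interrupts" as in the
listing quoted above.

```shell
#!/bin/bash
# Print the per-CPU counters of the "IRQ work interrupts" row.
# Input: text in /proc/interrupts format on standard input.
# Output: one "CPUn count" pair per line.
irq_work_counts() {
  awk '/IRQ work interrupts/ {
         # Field 1 is the row label (for example "IPI5:"); the numeric
         # fields that follow are the per-CPU counters.
         for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++)
           printf "CPU%d %s\n", i - 2, $i
       }'
}

# Usage on the device under test:
#   irq_work_counts < /proc/interrupts
```

Taking two snapshots some time apart and comparing them shows whether
the counters are still growing during a test run.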

> Yes, I think that design mixing regular networking packet
> processing with CAN is the problem. We test even with setup where
> CAN interrupts priority is boosted to 90
> 
>     echo "-> Rise CAN irq priorities"
>     PIDS=$(ps -e | grep -E irq/[0-9]+-can[3-4] | tr -s ' ' | cut -d ' ' -f2)
>     TXPID=$(ps -e | grep -E irq/[0-9]+-can2 | tr -s ' ' | cut -d ' ' -f2)
>     chrt -f --pid 80 $TXPID
>     for pid in $PIDS ; do
>         chrt -f --pid 85 $pid
>     done

but boosting the prio does not help because lock contention leads to PI
and forces its way through. The problem is that networking will
continue.

You need to go to /proc/irq/${can_irq} and push the affinity to CPU1.
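
A minimal sketch of that affinity change, assuming the interface names
from the /proc/interrupts listing above. The helper names cpu_to_mask,
irq_of and pin_irq are hypothetical; smp_affinity takes a hexadecimal
CPU bitmask, so CPU1 corresponds to the value 2.

```shell
#!/bin/bash
# Convert a CPU number into the hexadecimal bitmask format used by
# /proc/irq/<n>/smp_affinity (CPU0 -> 1, CPU1 -> 2, CPU2 -> 4, ...).
cpu_to_mask() {
  printf '%x\n' $(( 1 << $1 ))
}

# Print the IRQ number of the row whose last field equals the given name.
# Input: text in /proc/interrupts format on standard input.
irq_of() {
  awk -v name="$1" '$NF == name { sub(/:$/, "", $1); print $1 }'
}

# Write the mask for one CPU into the smp_affinity file of one IRQ.
# The third argument exists only so that the function can be tried on a
# scratch directory; on a real system it defaults to /proc/irq and the
# call needs root.
pin_irq() {
  local irq=$1 cpu=$2 root=${3:-/proc/irq}
  cpu_to_mask "$cpu" > "$root/$irq/smp_affinity"
}

# Usage on the device under test (moves the can3 interrupt to CPU1):
#   pin_irq "$(irq_of can3 < /proc/interrupts)" 1
```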

> Even this setup is problematic under load.

I would expect no change.

> to run daily testing. We can consider even something different,
> but this choice has been given by interest in something
> functional for each day and ahead of mainline merges to
> catch some problems in advance.

Oh okay.

> It is interesting than in kernel gateway is significantly worse
> now. It does not overhead of switching to userspace. But I am not
> sure if it is not invoked in some kernel worker which
> has lower or same real time priority than Ethenet networking.
> 
> In general, I think that the problem is that incommin
> packets (CAN and Ethernet) load the same per CPU
> worker. There are even backlog_napi threads per CPU
> 
>    46 TS       - S ?        00:00:00 [backlog_napi/0]
>    47 TS       - S ?        00:00:00 [backlog_napi/1]
> 
> It has even TS priority. If I remember well, there has been
> added option to allocate separate RX packets processing
> therad (instead for default per CPU one) for given interface.
> But I have no experience with such configuration.

backlog NAPI is used by devices which don't bring their own NAPI.

> Do you have or somebody else have idea how to achieve
> that and if it is legal to boost such kernel therad
> priority. It could help, because my general experience
> with PREEMPT_RT even on this target is very positive
> for tasks mapping HW directly and doing RT control.
> Same for latency tester. No spikes under load over
> 250 usec or less.

I wouldn't boost it unconditionally. If you enable tracing with
sched_switch, interrupts and maybe net, then you should see what the flow
of the CAN skb is. I don't know if it touches backlog_napi. Ideally it
shouldn't. There shouldn't be anything that could interfere with it, such
as Ethernet traffic (say ssh) or local sockets.
Once you see the regular flow, you should be able to see what blocks it
once you stop the trace during a spike.
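
A sketch of such a tracing session through the tracefs files, assuming
tracefs is mounted at /sys/kernel/tracing and that "interrupts" and "net"
map to the irq and net event groups. The function names are hypothetical,
and the tracefs root is a parameter only so that the functions can be
tried on a scratch directory.

```shell
#!/bin/bash
# Enable the sched_switch event plus the irq and net event groups and
# start tracing. Needs root when run against the real tracefs.
start_can_trace() {
  local t=${1:-/sys/kernel/tracing}
  echo 0 > "$t/tracing_on"                        # pause while configuring
  : > "$t/trace"                                  # clear the ring buffer
  echo 1 > "$t/events/sched/sched_switch/enable"
  echo 1 > "$t/events/irq/enable"
  echo 1 > "$t/events/net/enable"
  echo 1 > "$t/tracing_on"
}

# Stop tracing so that the events leading up to a spike stay in the buffer.
stop_can_trace() {
  local t=${1:-/sys/kernel/tracing}
  echo 0 > "$t/tracing_on"
}

# Usage on the device under test:
#   start_can_trace
#   ... wait until the latency tester reports a spike ...
#   stop_can_trace
#   cat /sys/kernel/tracing/trace > /tmp/can-spike-trace.txt
```

Stopping the trace as soon as a spike is detected matters because the
ring buffer is overwritten continuously while tracing is on.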

> Best wishes,
> 
>                 Pavel

Sebastian

* CAN latency measurement on AMD/Xilinx Zynq with PREEMPT_RT - added threaded NAPI configuration
  2025-01-29 12:04   ` Pavel Pisa
  2025-01-29 14:40     ` Sebastian Andrzej Siewior
@ 2025-03-28 12:04     ` Pavel Pisa
  2025-04-17  8:12       ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 8+ messages in thread
From: Pavel Pisa @ 2025-03-28 12:04 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior, Marc Kleine-Budde
  Cc: linux-rt-users, Carsten Emde, linux-can, Oliver Hartkopp,
	Jan Altenberg, Pavel Hronek

[-- Attachment #1: Type: text/plain, Size: 6629 bytes --]

Hello Marc and Sebastian,

thanks for the suggestions and discussion at FOSDEM 2025.

We are slow, sorry; I have teaching, and we have worked
on a new NuttX boot loader, pysimCoder, RTEMS, etc.

But there is some partial progress for our Linux
CAN/CAN FD latency benchmarking

  https://canbus.pages.fel.cvut.cz/#can-bus-channels-mutual-latency-testing

On Saturday 01 of February 2025 14:56:49 you wrote:
> On 27.01.2025 00:27:02, Pavel Pisa wrote:
> > May it be that there is even some problem in our RT kernel
> > configuration. But I have not found which option could
> > be problematic. I would send config files if you want to look
> > at them.

Thanks to Sebastian for the review, which did not reveal
anything suspicious.

> Try switching the CAN interfaces to threaded NAPI:
>
>     echo 1 | sudo tee /sys/class/net/canX/threaded
>
> and configure the priorities of the can interface NAPI thread. You might
> also switch the Ethernet interfaces to threaded NAPI and give them a
> different/lower prio.

Pavel Hronek has found some time during his Erasmus stay in Italy
and added threaded NAPI to the options of our CAN testing matrix.

The scripts that set parameters for the CAN interfaces on the device
under test (DUT) are located in the device-scripts directory
of our latency automation repository
 
  https://gitlab.fel.cvut.cz/canbus/can-benchmark/can-latester-automation/-/tree/master/device-scripts

The scripts are attached as well for easier review.

The set-can-threaded.sh script sets

  echo $1 > /sys/class/net/canX/threaded

and this Monday (March 24) I added an increase of the napi/canX-Y
thread priority to it

  chrt -f --pid 80 $pid

The logs for this setup are selected by the "Threaded NAPI" option.

The classical boost of the CAN interface IRQ threads
is under the "RT priority set" switch, which locates
irq/[0-9]+-can[0-9] and sets

  chrt -f --pid 90 $pid
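
A possible, more defensive variant of the thread lookup used by both
scripts, given as a sketch only. The attached scripts take the PID with
"cut -d ' ' -f2", which works as long as ps pads the PID column with
leading spaces. The version below asks ps for the PID and name columns
explicitly and matches on the name field. pids_matching and set_fifo_prio
are hypothetical names, and the sketch assumes the thread names are
reported in full in the comm column.

```shell
#!/bin/bash
# Print the PIDs of all entries whose name matches an extended regex.
# Input: "PID NAME" lines, for example from `ps -eo pid=,comm=`.
pids_matching() {
  awk -v re="$1" '$2 ~ re { print $1 }'
}

# Give SCHED_FIFO with the given priority to every thread whose name
# matches the regex. Needs root.
set_fifo_prio() {
  local prio=$1 re=$2 pid
  ps -eo pid=,comm= | pids_matching "$re" | while read -r pid; do
    echo "Setting RT priority $prio for $pid"
    chrt -f --pid "$prio" "$pid"
  done
}

# Usage, with the same priorities as in the text above:
#   set_fifo_prio 80 '^napi/can[0-9]+-[0-9]+$'
#   set_fifo_prio 90 '^irq/[0-9]+-can[0-9]+$'
```

Quoting the regular expression also keeps the shell from treating the
brackets as a file name pattern.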

The final state of the system in the threaded configuration is shown by

  ps Hxa --sort rtprio -o pid,policy,rtprio,state,tname,time,command

   16 FF       1 S ?        00:00:13 [ktimers/0]
   17 FF       1 I ?        00:00:01 [rcu_preempt]
   18 FF       1 S ?        00:00:00 [rcub/0]
   20 FF       1 S ?        00:00:02 [rcuc/0]
   23 FF       1 S ?        00:00:04 [irq_work/0]
   26 FF       1 S ?        00:00:01 [irq_work/1]
   28 FF       1 S ?        00:00:02 [rcuc/1]
   29 FF       1 S ?        00:00:05 [ktimers/1]
   48 FF      50 S ?        00:00:00 [watchdogd]
   63 FF      50 S ?        00:00:00 [irq/29-f8003000.dmac]
   ...
   72 FF      50 S ?        00:01:19 [irq/40-eth%d]
   75 FF      50 S ?        00:00:00 [irq/41-e0002000.usb]
   78 FF      50 S ?        00:00:00 [irq/44-f8007100.adc]
   80 FF      50 S ?        00:00:00 [irq/42-mmc0]
   81 FF      50 S ?        00:00:00 [irq/42-s-mmc0]
   84 FF      50 S ?        00:00:00 [irq/47-f8007000.devcfg]
  410 FF      50 S ?        00:00:00 [irq/26-xuartps]
 1907 FF      80 S ?        00:00:00 [napi/can2-19]
 1908 FF      80 S ?        00:00:13 [napi/can3-20]
 1909 FF      80 S ?        00:00:09 [napi/can4-21]
 1910 FF      80 S ?        00:00:08 [napi/can5-22]
  355 FF      90 S ?        00:00:12 [irq/48-can2]
  364 FF      90 S ?        00:00:20 [irq/49-can3]
  369 FF      90 S ?        00:00:14 [irq/50-can4]
  376 FF      90 S ?        00:00:14 [irq/51-can5]
   22 FF      99 S ?        00:00:00 [migration/0]
   27 FF      99 S ?        00:00:00 [migration/1]

A significant change was recorded when the priority of the respective
NAPI threads was increased and their policy
changed from the default TS to FF. We are starting to get
maximal times under one millisecond in most cases

CAN FD messages, flood, system under network load,
in-kernel CAN gateway (retransmits all messages to the secondary
interface on the DUT)

  https://canbus.pages.fel.cvut.cz/can-latester/inspect.html?kernel=rt&load=1&prio=1&flood=1&fd=1&thrd=1&kern=1

For standard CAN messages

  https://canbus.pages.fel.cvut.cz/can-latester/inspect.html?kernel=rt&load=1&prio=1&flood=1&fd=0&thrd=1&kern=1

You can switch the other options on and off to see how the latency
profiles vary. The series from the run

  run-250326-05****-hist+6.14.0-rc1-rt1-gecdc0d0bb42d+***.json

is suspicious: it should be after the priority boost, but the maximal
latencies are still high for several combinations. It could be some
interference, though. Our daily mainline and RT kernel data starts from
April 2023 (archives are available under "Data range"), so we will
be more certain about the result of the current setup
after some months.

> MY coworker Lucas mentioned another option would be to stay with
> traditional soft IRQ based NAPI but reduce the NAPI weight for the
> "unimportant" interfaces to 1.

OK, we will keep that on our list of experiments to try.
I hope that our tests can help to improve the situation of CAN users
on GNU/Linux when they use it for control. The result could
even be a manual on how to set up priorities and interfaces for
reliable operation. We already provide tooling to gain
some confidence in a given setup, because all our tooling
can be run by anybody on their own setup.

Maybe we can also catch something which can be improved in the
mainline kernel code...

For sure, deeper ftrace-based analysis is on our list as well,
but it could be part of some student thesis or some joint debug
session with one of you. I will probably not find time
to study that more deeply without guidance soon.

> > RTEMS behavior on same HW experiences maximal latency about 60 usec
> > even with networking and other load, sources, documentation, reports
> >
> >  
> > https://canbus.pages.fel.cvut.cz/#cancan-fd-subsystem-and-drivers-for-rtems
> >
> > The CiA article is already available to public and referenced from our
> > page same as the Michal Lenc's Master's thesis.

An update on the above-mentioned project, our completely new CAN/CAN FD stack
for RTEMS: it was mainlined in January, and the documentation propagated
to the online site in March

  https://ftp.rtems.org/pub/rtems/people/amar/docs1/bsp-howto/can.html

Best wishes,

                Pavel


[-- Attachment #2: set-can-threaded.sh --]
[-- Type: text/x-shellscript, Size: 408 bytes --]

#!/bin/bash

NAPI_PRIO=80

if (( $# != 1 )); then
  echo "Need 1 argument (value 0 or 1)" >&2
  exit 1
fi

for ifc in can2 can3 can4 can5 ; do
  echo $1 > /sys/class/net/$ifc/threaded
done

if [ $1 -eq 1 ] ; then
  PIDS=$(ps -e | grep -E napi/can[0-9]+-[0-9] | tr -s ' ' | cut -d ' ' -f2)
  for pid in $PIDS ; do
    echo "Setting RT priority $NAPI_PRIO for $pid"
    chrt -f --pid $NAPI_PRIO $pid
  done
fi

[-- Attachment #3: set-irq-prio.sh --]
[-- Type: text/x-shellscript, Size: 228 bytes --]

#!/bin/bash
if [ ! -z $1 ] ; then
PRIO=$1
else
PRIO=90
fi

PIDS=$(ps -e | grep -E irq/[0-9]+-can[0-9] | tr -s ' ' | cut -d ' ' -f2)
for pid in $PIDS ; do
	echo "Setting RT priority $PRIO for $pid"
	chrt -f --pid $PRIO $pid
done

* Re: CAN latency measurement on AMD/Xilinx Zynq with PREEMPT_RT - added threaded NAPI configuration
  2025-03-28 12:04     ` CAN latency measurement on AMD/Xilinx Zynq with PREEMPT_RT - added threaded NAPI configuration Pavel Pisa
@ 2025-04-17  8:12       ` Sebastian Andrzej Siewior
  2025-04-18 10:12         ` Pavel Pisa
  0 siblings, 1 reply; 8+ messages in thread
From: Sebastian Andrzej Siewior @ 2025-04-17  8:12 UTC (permalink / raw)
  To: Pavel Pisa
  Cc: Marc Kleine-Budde, linux-rt-users, Carsten Emde, linux-can,
	Oliver Hartkopp, Jan Altenberg, Pavel Hronek

On 2025-03-28 13:04:47 [+0100], Pavel Pisa wrote:
> Hello Marc and Sebastian,
> 
Hi Pavel,

…
> #!/bin/bash
> 
> NAPI_PRIO=80
> 
> if (( $# != 1 )); then
>   echo "Need 1 argument (value 0 or 1)" >&2
>   exit 1
> fi
…
> for ifc in can2 can3 can4 can5 ; do
>   echo $1 > /sys/class/net/$ifc/threaded
> done
> 
> if [ $1 -eq 1 ] ; then
>   PIDS=$(ps -e | grep -E napi/can[0-9]+-[0-9] | tr -s ' ' | cut -d ' ' -f2)
>   for pid in $PIDS ; do
>     echo "Setting RT priority $NAPI_PRIO for $pid"
>     chrt -f --pid $NAPI_PRIO $pid
>   done
> fi

The IRQ thread should be limited to one CPU, the same one that the
IRQ itself is set to. I don't think that this is done for the NAPI thread
automatically, so it is probably free floating in the system.

Sebastian

* Re: CAN latency measurement on AMD/Xilinx Zynq with PREEMPT_RT - added threaded NAPI configuration
  2025-04-17  8:12       ` Sebastian Andrzej Siewior
@ 2025-04-18 10:12         ` Pavel Pisa
  2025-04-18 20:18           ` Oliver Hartkopp
  0 siblings, 1 reply; 8+ messages in thread
From: Pavel Pisa @ 2025-04-18 10:12 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Marc Kleine-Budde, linux-rt-users, Carsten Emde, linux-can,
	Oliver Hartkopp, Jan Altenberg, Pavel Hronek

Hello Sebastian,

On Thursday 17 of April 2025 10:12:54 Sebastian Andrzej Siewior wrote:
> The IRQ thread should be limited to one CPU which is the same where the
> IRQ it itself is set to. I don't think that this done the NAPI thread
> automatically so it is probably free floating in the system.

you are right, I have added

  taskset -p 1 $pid

to can-latester-automation/device-scripts/set-can-threaded.sh

The effect should be visible after some days in the midnight
build and testing records.

Our system is small and simple, and all CAN IRQs are
mapped to CPU0 now. But I have looked for an
easy way to find the affinity of the IRQ thread from
/sys/class/net/canX and have not had much success.
There is queues/rx-0/rps_cpus, but that is probably another
level.

There is an easy way to find the matching kernel driver task
and copy its affinity to the NAPI task. But I am not sure how
far the name match is guaranteed if some interface aliasing
etc. is in effect. I see the following names now

  [irq/48-can2]
  [irq/49-can3]
  [irq/50-can4]
  [irq/51-can5]

and

 [napi/can2-19]
 [napi/can3-20]
 [napi/can4-21]
 [napi/can5-22]
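
The name-based matching described above could look roughly like the
following sketch. It assumes exactly the naming shown in the two lists
(irq/<n>-canX and napi/canX-<id>), which, as noted, may not hold when
interfaces are renamed. pair_can_threads and copy_irq_affinity are
hypothetical names.

```shell
#!/bin/bash
# Pair IRQ threads with NAPI threads by interface name.
# Input: "PID NAME" lines, for example from `ps -eo pid=,comm=`.
# Output: "IFC IRQ_PID NAPI_PID", one line per interface that has both.
pair_can_threads() {
  awk '
    $2 ~ /^irq\/[0-9]+-can[0-9]+$/ {
      n = $2; sub(/^irq\/[0-9]+-/, "", n)      # irq/48-can2  -> can2
      irq[n] = $1
    }
    $2 ~ /^napi\/can[0-9]+-[0-9]+$/ {
      n = $2; sub(/^napi\//, "", n); sub(/-[0-9]+$/, "", n)  # napi/can2-19 -> can2
      napi[n] = $1
    }
    END { for (n in irq) if (n in napi) print n, irq[n], napi[n] }
  ' | sort
}

# Copy the CPU affinity mask of each IRQ thread to its NAPI thread.
# Needs root. The mask is the last word of the `taskset -p PID` output.
copy_irq_affinity() {
  local ifc irq_pid napi_pid mask
  ps -eo pid=,comm= | pair_can_threads | while read -r ifc irq_pid napi_pid; do
    mask=$(taskset -p "$irq_pid" | awk '{ print $NF }')
    echo "$ifc: copying affinity mask $mask from $irq_pid to $napi_pid"
    taskset -p "$mask" "$napi_pid"
  done
}

# Usage on the device under test, after threaded NAPI has been enabled:
#   copy_irq_affinity
```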

One question for Oliver: in which thread/callback context
does the kernel CAN gateway run? I think that it does not
use a separate task. I ask because with threaded NAPI it seems
that a simple user-space "gateway" (a task forwarding all
messages from one interface to another) has more stable
results than routing the messages directly in the kernel.

A side note: the project implementing FlexCAN controller
emulation for QEMU (initial target sabrelite iMX6)
is moving forward. And as Bernhard Beschow has submitted
iMX8 platform support into mainline QEMU, the FlexCAN
emulation support can be extended to it in the future as well.
If somebody is interested, we can join
resources. For example, if some funding is found,
I would discuss whether the student working on the thesis
project, to be finalized by submitting the iMX6 support, would
be willing to continue with support for iMX8 or other targets.

Best wishes,

                Pavel Pisa


* Re: CAN latency measuremet on AMD/Xilinx Zynq with PREEMP_RT - added threaded NAPI configuration
  2025-04-18 10:12         ` Pavel Pisa
@ 2025-04-18 20:18           ` Oliver Hartkopp
  0 siblings, 0 replies; 8+ messages in thread
From: Oliver Hartkopp @ 2025-04-18 20:18 UTC (permalink / raw)
  To: Pavel Pisa, Sebastian Andrzej Siewior
  Cc: Marc Kleine-Budde, linux-rt-users, Carsten Emde, linux-can,
	Jan Altenberg, Pavel Hronek

Hello Pavel,

On 18.04.25 12:12, Pavel Pisa wrote:
> Hello Sebastian,
> 
> On Thursday 17 of April 2025 10:12:54 Sebastian Andrzej Siewior wrote:
>> The IRQ thread should be limited to one CPU, the same one that the
>> IRQ itself is set to. I don't think that this is done for the NAPI thread
>> automatically, so it is probably free-floating in the system.
> 
> You are right, I have added
> 
>    taskset -p 1 $pid
> 
> to can-latester-automation/device-scripts/set-can-threaded.sh
> 
> The effect should be visible in the midnight build and
> testing records after a few days.
> 
> Our system is small and simple, and all CAN IRQs are
> mapped to CPU0 now. I have looked for an easy way to find
> the affinity of the IRQ thread from /sys/class/net/canX
> but have not had much success. There is
> queues/rx-0/rps_cpus, but that is probably a different
> level.
> 
> There is an easy way to find the matching kernel driver task
> and copy its affinity to the NAPI task. But I am not sure how
> far the name match is guaranteed if interface aliasing
> etc. is in effect. I see the following names now
> 
>    [irq/48-can2]
>    [irq/49-can3]
>    [irq/50-can4]
>    [irq/51-can5]
> 
> and
> 
>   [napi/can2-19]
>   [napi/can3-20]
>   [napi/can4-21]
>   [napi/can5-22]
> 
> One question to Oliver: in which thread/callback context
> does the kernel CAN gateway run? I think that it does not
> use a separate task. I ask because, with threaded NAPI, it seems
> that a simple user space "gateway" (a task that forwards all
> messages from one interface to another) has more stable
> results than routing messages directly in the kernel.

The can_gw gets the CAN frames via the NET_RX softirq like all other (CAN)
network protocols.

The original reason for can_gw was that an existing user space gateway
was not able to cope with high CAN traffic loads due to scheduling and
buffer overflows. That was with a standard kernel and might be different
with PREEMPT_RT now.

Best regards,
Oliver


