From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Pavel Pisa <pisa@fel.cvut.cz>
Cc: linux-rt-users@vger.kernel.org, Carsten Emde <c.emde@osadl.org>,
linux-can@vger.kernel.org,
Oliver Hartkopp <socketcan@hartkopp.net>,
Jan Altenberg <Jan.Altenberg@osadl.org>,
Pavel Hronek <hronepa1@fel.cvut.cz>
Subject: Re: Question for AMD/Xilinx Zynq PREEMP_RT configuration check, CAN latency measuremet and FOSDEM 2025
Date: Wed, 29 Jan 2025 15:40:29 +0100
Message-ID: <20250129144029.snKWIeXA@linutronix.de>
In-Reply-To: <202501291304.15901.pisa@fel.cvut.cz>
On 2025-01-29 13:04:15 [+0100], Pavel Pisa wrote:
> Hello Sebastian,
Hi Pavel,
> The current one shows no IRQ work interrupts
> after the last reboot and an overnight test
>
> Linux mzapo 6.13.0-rc6-rt3-dut #1 SMP PREEMPT_RT
> Wed Jan 29 04:46:40 CET 2025 armv7l GNU/Linux
…
>            CPU0       CPU1
>  48:     314697          0  GIC-0  61 Level  can2
>  49:     314597          0  GIC-0  62 Level  can3
>  50:     314759          0  GIC-0  63 Level  can4
>  51:     311516          0  GIC-0  64 Level  can5
> IPI0:         0          0  CPU wakeup interrupts
> IPI1:         0          0  Timer broadcast interrupts
> IPI2:     17849     292126  Rescheduling interrupts
> IPI3:      5923      11315  Function call interrupts
> IPI4:         0          0  CPU stop interrupts
> IPI5:    271078      74040  IRQ work interrupts
> IPI6:         0          0  completion interrupts
> Err:          0
>
> So this does not seem to be the cause.
None you say? I see 271078 on CPU0 and 74040 on the other one.
> Yes, I think that the design mixing regular network packet
> processing with CAN is the problem. We even test with a setup where
> the CAN interrupt priority is boosted to 90
>
> echo "-> Raise CAN irq priorities"
> # Quote the patterns so the shell cannot glob-expand the brackets.
> PIDS=$(ps -e | grep -E 'irq/[0-9]+-can[3-4]' | tr -s ' ' | cut -d ' ' -f2)
> TXPID=$(ps -e | grep -E 'irq/[0-9]+-can2' | tr -s ' ' | cut -d ' ' -f2)
> chrt -f --pid 80 $TXPID
> for pid in $PIDS ; do
> chrt -f --pid 85 $pid
> done
but boosting the priority does not help, because lock contention leads
to priority inheritance (PI), which forces its way through. The problem
is that networking will continue.
You need to go to /proc/irq/${can_irq} and push the affinity to CPU1.
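A minimal sketch of that step, assuming the IRQ is looked up by the name
shown in the last column of /proc/interrupts (can2 ... can5 above). The
helper names irq_of and pin_irq and the INTERRUPTS/IRQ_ROOT variables are
hypothetical; the variables exist only so the functions can be dry-run
against a scratch directory. Writing the real file needs root.

```shell
#!/bin/sh
# Sketch: pin the IRQ of a named interrupt action (e.g. can2) to one CPU.
# INTERRUPTS and IRQ_ROOT default to the real procfs paths (assumed layout).
INTERRUPTS=${INTERRUPTS:-/proc/interrupts}
IRQ_ROOT=${IRQ_ROOT:-/proc/irq}

# Print the IRQ number of the first line whose last column equals $1.
irq_of() {
    awk -v name="$1" '$NF == name { sub(":", "", $1); print $1; exit }' "$INTERRUPTS"
}

# pin_irq NAME CPU: write CPU to the smp_affinity_list of NAME's IRQ.
pin_irq() {
    irq=$(irq_of "$1")
    [ -n "$irq" ] || { echo "no IRQ named $1" >&2; return 1; }
    echo "$2" > "$IRQ_ROOT/$irq/smp_affinity_list"
}
```

For the four controllers listed above this could be called as
`for c in can2 can3 can4 can5; do pin_irq "$c" 1; done`.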
> Even this setup is problematic under load.
I would expect no change.
> to run daily testing. We could even consider something different,
> but this choice was driven by interest in something that is
> functional every day and runs ahead of mainline merges, to
> catch some problems in advance.
Oh okay.
> It is interesting that the in-kernel gateway is significantly worse
> now. It does not have the overhead of switching to userspace. But I
> am not sure whether it is invoked in some kernel worker which
> has a lower or the same real-time priority as Ethernet networking.
>
> In general, I think that the problem is that incoming
> packets (CAN and Ethernet) load the same per-CPU
> worker. There are even backlog_napi threads per CPU
>
> 46 TS - S ? 00:00:00 [backlog_napi/0]
> 47 TS - S ? 00:00:00 [backlog_napi/1]
>
> It even has TS priority. If I remember well, an option has been
> added to allocate a separate RX packet processing
> thread (instead of the default per-CPU one) for a given interface.
> But I have no experience with such a configuration.
backlog NAPI is used by devices which don't bring their own NAPI.
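The per-interface option mentioned in the quote is presumably threaded
NAPI, which is switched on through the `threaded` attribute of the network
device in sysfs. A minimal sketch, assuming that attribute and assuming the
CAN driver registers its own NAPI instance (otherwise the write has nothing
to act on). The function name and the NET_ROOT variable are hypothetical;
NET_ROOT exists only to allow a dry run.

```shell
#!/bin/sh
# Sketch: switch one interface to threaded NAPI.
# NET_ROOT defaults to the real sysfs path (assumed layout).
NET_ROOT=${NET_ROOT:-/sys/class/net}

# enable_threaded_napi IFACE: write 1 to the interface's "threaded" attribute.
enable_threaded_napi() {
    attr="$NET_ROOT/$1/threaded"
    [ -e "$attr" ] || { echo "$1: no threaded attribute" >&2; return 1; }
    echo 1 > "$attr"
}
```

After `enable_threaded_napi can2`, a per-device kernel thread should show
up in `ps -e` (named along the lines of napi/can2-<id>), and that thread
can then be inspected or given a priority with chrt like the irq/ threads.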
> Do you or somebody else have an idea how to achieve
> that, and whether it is legal to boost the priority of such
> a kernel thread? It could help, because my general experience
> with PREEMPT_RT even on this target is very positive
> for tasks mapping HW directly and doing RT control.
> Same for the latency tester. No spikes under load over
> 250 usec or less.
I wouldn't boost it unconditionally. If you enable tracing with
sched_switch, interrupts and maybe net then you should see how the flow
of the CAN skb is. I don't know if it touches backlog_napi. Ideally it
shouldn't. There shouldn't be anything that could interfere with it, such
as Ethernet traffic (say ssh) or local sockets.
Once you see the regular flow you should be able to see what blocks it
once you stop the trace during a spike.
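A minimal sketch of such a tracing session through tracefs. It assumes
tracefs is mounted at /sys/kernel/tracing and that "interrupts" and "net"
map to the `irq` and `net` event groups; the function names and the TRACEFS
variable are hypothetical, and TRACEFS is overridable only for dry runs.

```shell
#!/bin/sh
# Sketch: enable sched_switch plus the irq and net event groups,
# then freeze the buffer when a latency spike is observed.
TRACEFS=${TRACEFS:-/sys/kernel/tracing}

# Enable the events and start recording (needs root on a real system).
trace_start() {
    for ev in sched/sched_switch irq net; do
        echo 1 > "$TRACEFS/events/$ev/enable"
    done
    echo 1 > "$TRACEFS/tracing_on"
}

# Stop recording so the events leading up to the spike stay in the buffer.
trace_stop() {
    echo 0 > "$TRACEFS/tracing_on"
}
```

Call `trace_start` before the test run and `trace_stop` as soon as the
latency tester reports a spike; the buffer can then be saved with
`cat /sys/kernel/tracing/trace > spike.txt` and compared with a trace of
the regular flow.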
> Best wishes,
>
> Pavel
Sebastian