From: Gratian Crisan <gratian.crisan@ni.com>
To: Thomas Gleixner <tglx@linutronix.de>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Steven Rostedt <rostedt@goodmis.org>
Cc: linux-rt-users@vger.kernel.org
Subject: irq thread latency caused by softirq_ctrl.lock contention
Date: Wed, 15 Sep 2021 17:59:50 -0500
Message-ID: <87a6kd78n7.fsf@ni.com>
Hi guys,
We are seeing some pretty significant degradation in some of our
benchmarks when using the 5.10.59-rt52 or 5.14.1-rt19 kernels versus
our previous numbers on 4.14-rt, and I'm looking for thoughts/ideas on
whether anything can be done about it.
The synthetic benchmark in question measures the latency of an
"analog in" -> "analog out" loop. We have two versions of this
benchmark: one that is interrupt based and one that polls on the I/O.
The polling-mode benchmark is comparable between 5.x-rt and 4.14-rt.
However, the most extreme case of the interrupt-based one shows up to
60% latency degradation on the 5.x-rt kernels.
I was able to confirm that the issue is caused by contention on the
'softirq_ctrl.lock' local lock in the softirq code. This results in
event traces with the following pattern:
<idle>-0 [000] 582.023481: sched_switch: prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=ksoftirqd/0 next_pid=12 next_prio=91
ksoftirqd/0-12 [000] 582.023485: irq_handler_entry: irq=102 name=atomicchinchk
ksoftirqd/0-12 [000] 582.023485: irq_handler_exit: irq=102 ret=handled
ksoftirqd/0-12 [000] 582.023486: sched_waking: comm=irq/102-atomicc pid=248 prio=1 target_cpu=000
ksoftirqd/0-12 [000] 582.023488: sched_wakeup: comm=irq/102-atomicc pid=248 prio=1 target_cpu=000
ksoftirqd/0-12 [000] 582.023490: rcu_utilization: Start context switch
ksoftirqd/0-12 [000] 582.023490: rcu_utilization: End context switch
ksoftirqd/0-12 [000] 582.023493: sched_switch: prev_comm=ksoftirqd/0 prev_pid=12 prev_prio=91 prev_state=R+ ==> next_comm=irq/102-atomicc next_pid=248 next_prio=1
irq/102-atomicc-248 [000] 582.023496: sched_pi_setprio: comm=ksoftirqd/0 pid=12 oldprio=91 newprio=1
irq/102-atomicc-248 [000] 582.023499: rcu_utilization: Start context switch
irq/102-atomicc-248 [000] 582.023499: rcu_utilization: End context switch
irq/102-atomicc-248 [000] 582.023501: sched_switch: prev_comm=irq/102-atomicc prev_pid=248 prev_prio=1 prev_state=R ==> next_comm=ksoftirqd/0 next_pid=12 next_prio=1
ksoftirqd/0-12 [000] 582.023503: softirq_entry: vec=7 [action=SCHED]
ksoftirqd/0-12 [000] 582.023508: softirq_exit: vec=7 [action=SCHED]
ksoftirqd/0-12 [000] 582.023510: sched_pi_setprio: comm=ksoftirqd/0 pid=12 oldprio=1 newprio=91
ksoftirqd/0-12 [000] 582.023513: sched_waking: comm=irq/102-atomicc pid=248 prio=1 target_cpu=000
ksoftirqd/0-12 [000] 582.023515: sched_wakeup: comm=irq/102-atomicc pid=248 prio=1 target_cpu=000
ksoftirqd/0-12 [000] 582.023519: rcu_utilization: Start context switch
ksoftirqd/0-12 [000] 582.023519: rcu_utilization: End context switch
ksoftirqd/0-12 [000] 582.023521: sched_switch: prev_comm=ksoftirqd/0 prev_pid=12 prev_prio=91 prev_state=S ==> next_comm=irq/102-atomicc next_pid=248 next_prio=1
irq/102-atomicc-248 [000] 582.023530: sched_waking: comm=S81039292 pid=2240 prio=40 target_cpu=001
irq/102-atomicc-248 [000] 582.023532: sched_wake_idle_without_ipi: cpu=1
irq/102-atomicc-248 [000] 582.023533: sched_wakeup: comm=S81039292 pid=2240 prio=40 target_cpu=001
irq/102-atomicc-248 [000] 582.023539: rcu_utilization: Start context switch
irq/102-atomicc-248 [000] 582.023540: rcu_utilization: End context switch
irq/102-atomicc-248 [000] 582.023542: sched_switch: prev_comm=irq/102-atomicc prev_pid=248 prev_prio=1 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120
The 'irq/102-atomicc-248' irq thread is our high-priority data
acquisition thread. The additional PI boost and context switch seem to
account for the main difference I'm seeing versus 4.14-rt. This race +
PI boost happens with other lower-priority irq threads too, but the
ksoftirqd case is the most common one.
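For reference, the cost of this detour can be pulled straight out of a
trace like the one above. Below is a minimal sketch (plain Python; the
helper name is mine, and it is fed a trimmed copy of the lines shown
earlier) that measures the gap from the irq thread's first wakeup until
it is finally left running, and counts the PI-boost events in between:

```python
import re

# A few of the trace lines from above, trimmed to the events that
# bracket the detour: the first wakeup of the irq thread, the two
# PI-boost events, and the switch that finally leaves it running.
TRACE = """\
ksoftirqd/0-12 [000] 582.023486: sched_waking: comm=irq/102-atomicc pid=248 prio=1 target_cpu=000
irq/102-atomicc-248 [000] 582.023496: sched_pi_setprio: comm=ksoftirqd/0 pid=12 oldprio=91 newprio=1
ksoftirqd/0-12 [000] 582.023510: sched_pi_setprio: comm=ksoftirqd/0 pid=12 oldprio=1 newprio=91
ksoftirqd/0-12 [000] 582.023521: sched_switch: prev_comm=ksoftirqd/0 prev_pid=12 prev_prio=91 prev_state=S ==> next_comm=irq/102-atomicc next_pid=248 next_prio=1
"""

# task [cpu] timestamp: event: args
LINE = re.compile(r"\s*(\S+)\s+\[\d+\]\s+([\d.]+):\s+(\w+):\s+(.*)")

def wakeup_to_run(trace, comm):
    """Return (microseconds from the first sched_waking of `comm` to
    the sched_switch that hands it the CPU, number of sched_pi_setprio
    events seen in between)."""
    woken, boosts = None, 0
    for line in trace.splitlines():
        m = LINE.match(line)
        if not m:
            continue
        ts, event, args = float(m.group(2)), m.group(3), m.group(4)
        if woken is None:
            if event == "sched_waking" and f"comm={comm}" in args:
                woken = ts
        elif event == "sched_pi_setprio":
            boosts += 1
        elif event == "sched_switch" and f"next_comm={comm}" in args:
            return (ts - woken) * 1e6, boosts
    return None, boosts

us, boosts = wakeup_to_run(TRACE, "irq/102-atomicc")
print(f"{us:.1f} us wakeup-to-run, {boosts} PI boost(s)")
# -> 35.0 us wakeup-to-run, 2 PI boost(s)
```

So even in this trimmed sample the irq thread spends ~35 us between
being woken and actually running, with two sched_pi_setprio round
trips in the middle.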
I would appreciate any thoughts on how, or whether, we could improve this.
Thanks,
Gratian