From: "Jorge Ramirez-Ortiz, Foundries" <jorge@foundries.io>
To: paulmck@kernel.org
Cc: josh@joshtriplett.org, rostedt@goodmis.org,
mathieu.desnoyers@efficios.com, jiangshanlai@gmail.com,
joel@joelfernandes.org, rcu@vger.kernel.org, soc@kernel.org,
linux-arm-kernel@lists.infradead.org
Subject: rcu_preempt detected stalls
Date: Tue, 31 Aug 2021 17:21:44 +0200
Message-ID: <20210831152144.GA28128@trex>
Hi,
When enabling CONFIG_PREEMPT and running the stress-ng scheduler class
tests on arm64 (Xilinx ZynqMP and NXP i.MX8MM SoCs), we observe the following:
[ 62.578917] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 62.585015] (detected by 0, t=5253 jiffies, g=3017, q=2972)
[ 62.590663] rcu: All QSes seen, last rcu_preempt kthread activity 5254 (4294907943-4294902689), jiffies_till_next_fqs=1, root ->qsmask 0x0
[ 62.603086] rcu: rcu_preempt kthread starved for 5258 jiffies! g3017 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1
[ 62.613246] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[ 62.622359] rcu: RCU grace-period kthread stack dump:
[ 62.627395] task:rcu_preempt state:R running task stack: 0 pid: 14 ppid: 2 flags:0x00000028
[ 62.637308] Call trace:
[ 62.639748] __switch_to+0x11c/0x190
[ 62.643319] __schedule+0x3b8/0x8d8
[ 62.646796] schedule+0x4c/0x108
[ 62.650018] schedule_timeout+0x1ac/0x358
[ 62.654021] rcu_gp_kthread+0x6a8/0x12b8
[ 62.657933] kthread+0x14c/0x158
[ 62.661153] ret_from_fork+0x10/0x18
[ 62.682919] BUG: scheduling while atomic: stress-ng-hrtim/831/0x00000002
[ 62.689604] Preemption disabled at:
[ 62.689614] [<ffffffc010059418>] irq_enter_rcu+0x30/0x58
[ 62.698393] CPU: 0 PID: 831 Comm: stress-ng-hrtim Not tainted 5.10.42+ #5
[ 62.706296] Hardware name: Zynqmp new (DT)
[ 62.710115] Call trace:
[ 62.712548] dump_backtrace+0x0/0x240
[ 62.716202] show_stack+0x2c/0x38
[ 62.719510] dump_stack+0xcc/0x104
[ 62.722904] __schedule_bug+0x78/0xc8
[ 62.726556] __schedule+0x70c/0x8d8
[ 62.730037] schedule+0x4c/0x108
[ 62.733259] do_notify_resume+0x224/0x5d8
[ 62.737259] work_pending+0xc/0x2a4
The error eventually results in OOM.
RCU priority boosting does work around this issue, but it seems to me
more of a workaround than a fix (otherwise boosting would be enabled
by CONFIG_PREEMPT for arm64, I guess?).
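As a sketch only, boosting is typically switched on with a config fragment along these lines. The symbols are the upstream Kconfig names; the exact options and values used on these boards are not stated here, so treat everything below as an assumption:

```
# Illustrative fragment, not taken from the actual .config of these boards.
CONFIG_PREEMPT=y
# On non-RT kernels RCU_BOOST is only selectable with RCU_EXPERT set.
CONFIG_RCU_EXPERT=y
CONFIG_RCU_BOOST=y
# Delay in milliseconds before a blocked grace period triggers boosting
# (500 is the upstream default).
CONFIG_RCU_BOOST_DELAY=500
```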
The question is: is this an arm64 bug that should be investigated, or
is this some known corner case of running stress-ng that is already
understood?
Thanks,
Jorge