From: "Jorge Ramirez-Ortiz, Foundries" <jorge@foundries.io>
To: paulmck@kernel.org
Cc: josh@joshtriplett.org, rostedt@goodmis.org,
	mathieu.desnoyers@efficios.com, jiangshanlai@gmail.com,
	joel@joelfernandes.org, rcu@vger.kernel.org, soc@kernel.org,
	linux-arm-kernel@lists.infradead.org
Subject: rcu_preempt detected stalls
Date: Tue, 31 Aug 2021 17:21:44 +0200
Message-ID: <20210831152144.GA28128@trex>

Hi

When enabling CONFIG_PREEMPT and running the stress-ng scheduler class
tests on arm64 (Xilinx ZynqMP and NXP i.MX8MM SoCs), we observe the following:

[   62.578917] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:                                                              
[   62.585015]  (detected by 0, t=5253 jiffies, g=3017, q=2972)                                                                   
[   62.590663] rcu: All QSes seen, last rcu_preempt kthread activity 5254 (4294907943-4294902689), jiffies_till_next_fqs=1, root ->qsmask 0x0
[   62.603086] rcu: rcu_preempt kthread starved for 5258 jiffies! g3017 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=1               
[   62.613246] rcu:     Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.                        
[   62.622359] rcu: RCU grace-period kthread stack dump:                                                                          
[   62.627395] task:rcu_preempt     state:R  running task     stack:    0 pid:   14 ppid:     2 flags:0x00000028                  
[   62.637308] Call trace:                                                                                                        
[   62.639748]  __switch_to+0x11c/0x190                                                                                           
[   62.643319]  __schedule+0x3b8/0x8d8                                                                                            
[   62.646796]  schedule+0x4c/0x108                                                                                               
[   62.650018]  schedule_timeout+0x1ac/0x358                                                                                      
[   62.654021]  rcu_gp_kthread+0x6a8/0x12b8                                                                                       
[   62.657933]  kthread+0x14c/0x158                                                                                               
[   62.661153]  ret_from_fork+0x10/0x18                                                                                           
[   62.682919] BUG: scheduling while atomic: stress-ng-hrtim/831/0x00000002                                                       
[   62.689604] Preemption disabled at:                                                                                            
[   62.689614] [<ffffffc010059418>] irq_enter_rcu+0x30/0x58                                                                       
[   62.698393] CPU: 0 PID: 831 Comm: stress-ng-hrtim Not tainted 5.10.42+ #5                                         
[   62.706296] Hardware name: Zynqmp new (DT)                                                                                        
[   62.710115] Call trace:                                                                                                        
[   62.712548]  dump_backtrace+0x0/0x240                                                                                          
[   62.716202]  show_stack+0x2c/0x38                                                                                              
[   62.719510]  dump_stack+0xcc/0x104                                                                                             
[   62.722904]  __schedule_bug+0x78/0xc8                                                                                          
[   62.726556]  __schedule+0x70c/0x8d8                                                                                            
[   62.730037]  schedule+0x4c/0x108                                                                                               
[   62.733259]  do_notify_resume+0x224/0x5d8                                                                                      
[   62.737259]  work_pending+0xc/0x2a4
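
The jiffies figures in the report above can be cross-checked with a short script. This is only a sketch: it assumes the two numbers in parentheses are the current jiffies value and the time of the last grace-period kthread activity, and `ASSUMED_HZ` is a hypothetical value, since CONFIG_HZ is not stated in the report.

```python
import re

# Two lines copied from the stall report above (wrapped line rejoined).
LOG = """\
[   62.585015]  (detected by 0, t=5253 jiffies, g=3017, q=2972)
[   62.590663] rcu: All QSes seen, last rcu_preempt kthread activity 5254 (4294907943-4294902689), jiffies_till_next_fqs=1, root ->qsmask 0x0
"""

ASSUMED_HZ = 250  # assumption: the tick rate is not given in the report


def parse_stall(log):
    """Return (stall, reported idle, recomputed idle), all in jiffies."""
    stall = int(re.search(r"t=(\d+) jiffies", log).group(1))
    m = re.search(r"kthread activity (\d+) \((\d+)-(\d+)\)", log)
    reported, now, last = (int(x) for x in m.groups())
    return stall, reported, now - last


stall, reported, computed = parse_stall(LOG)
print(f"stall: {stall} jiffies, about {stall / ASSUMED_HZ:.1f} s at HZ={ASSUMED_HZ}")
print(f"kthread idle: reported {reported}, recomputed {computed} jiffies")
```

The recomputed difference matches the reported inactivity, so the grace-period kthread appears to have made no progress for roughly the whole stall interval.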

The stall eventually results in an out-of-memory (OOM) condition.

RCU priority boosting does work around this issue, but it seems to me
more of a workaround than a fix (otherwise I would expect boosting to be
enabled by CONFIG_PREEMPT for arm64).
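
For reference, a minimal sketch of a config fragment that turns on the boosting workaround. The option names come from the mainline kernel/rcu/Kconfig; the delay shown is the Kconfig default rather than a value taken from this report, and on 5.10 CONFIG_RCU_BOOST is normally only selectable once CONFIG_RCU_EXPERT is set.

```
# Preemptible kernel, as in the report
CONFIG_PREEMPT=y
# Needed to expose the boosting options on non-RT kernels
CONFIG_RCU_EXPERT=y
# Boost the priority of preempted RCU readers that block a grace period
CONFIG_RCU_BOOST=y
# Milliseconds to wait before boosting (Kconfig default, assumed here)
CONFIG_RCU_BOOST_DELAY=500
```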

The question is: is this an arm64 bug that should be investigated, or
is it a known corner case of running stress-ng that is already
understood?

thanks
Jorge




