* False-positive RCU stall warnings on large systems...
@ 2013-02-19 16:34 Daniel J Blueman
From: Daniel J Blueman @ 2013-02-19 16:34 UTC
  To: Paul E. McKenney; +Cc: Steffen Persvold, LKML

Hi Paul,

On some of our larger servers, with many hundreds of cores and under
heavy load, we see spurious rcu_sched stall warnings [1]. To avoid them
we have to raise the hardcoded RCU_STALL_RAT_DELAY from 2 and
RCU_JIFFIES_TILL_FORCE_QS from 3.

Is there a more sustainable way to handle this than hard-coding the
values, such as making these and the dependent timeouts a fraction of
CONFIG_RCU_CPU_STALL_TIMEOUT?

On the other hand, perhaps this is just caused by clock jitter (e.g. due
to distance from a contended clock source)? In that case, increasing
these values a bit may be adequate in general...
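
As a rough sketch only, the scaling idea above could look like this:
derive each delay from the stall timeout using a divisor, with a floor
so it never drops below the current hardcoded value. The function names,
the divisor, and the HZ value used in the example are assumptions for
illustration, not existing kernel code.

```c
/*
 * Illustrative only: hypothetical helpers, not part of the kernel.
 * They show how a delay could be derived from a stall timeout
 * instead of being a fixed constant.
 */

/* Convert a stall timeout in seconds to jiffies for a given tick rate. */
unsigned long stall_timeout_jiffies(unsigned long timeout_secs,
                                    unsigned long hz)
{
    return timeout_secs * hz;
}

/*
 * Return timeout_jiffies / divisor, but never less than floor.
 * A zero divisor falls back to the floor rather than dividing by zero.
 */
unsigned long scaled_delay(unsigned long timeout_jiffies,
                           unsigned long divisor,
                           unsigned long floor)
{
    unsigned long d = divisor ? timeout_jiffies / divisor : floor;

    return d < floor ? floor : d;
}
```

For example, with a 60-second timeout at an assumed HZ of 1000,
scaled_delay(60000, 10000, 2) gives 6 jiffies, while a much shorter
timeout falls back to the floor of 2.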

Many thanks,
   Daniel

--- [1]

[ 3939.010085] INFO: rcu_sched detected stalls on CPUs/tasks: {} (detected by 1, t=29662 jiffies, g=3053, c=3052, q=598)
[ 3939.020008] INFO: Stall ended before state dump start
-- 
Daniel J Blueman
Principal Software Engineer, Numascale Asia

Thread overview: 6+ messages
2013-02-19 16:34 False-positive RCU stall warnings on large systems Daniel J Blueman
2013-02-19 18:16 ` Paul E. McKenney
2013-02-20  3:35   ` Daniel J Blueman
2013-02-25 16:32     ` Paul E. McKenney
2013-03-05  9:02       ` Daniel J Blueman
2013-03-06 17:03         ` Paul E. McKenney
