Date: Fri, 31 Jul 2015 08:53:43 -0700
From: "Paul E. McKenney"
Reply-To: paulmck@linux.vnet.ibm.com
To: Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, mingo@kernel.org, jiangshanlai@gmail.com,
	dipankar@in.ibm.com, akpm@linux-foundation.org,
	mathieu.desnoyers@efficios.com, josh@joshtriplett.org,
	tglx@linutronix.de, rostedt@goodmis.org, dhowells@redhat.com,
	edumazet@google.com, dvhart@linux.intel.com, fweisbec@gmail.com,
	oleg@redhat.com, bobby.prani@gmail.com, Alexander Gordeev
Subject: Re: [PATCH tip/core/rcu 02/12] rcu: Panic if RCU tree can not accommodate all CPUs
Message-ID: <20150731155343.GP27280@linux.vnet.ibm.com>
References: <20150717223041.GA14464@linux.vnet.ibm.com>
	<1437172263-15466-1-git-send-email-paulmck@linux.vnet.ibm.com>
	<1437172263-15466-2-git-send-email-paulmck@linux.vnet.ibm.com>
	<20150730122835.GX19282@twins.programming.kicks-ass.net>
	<20150730152517.GE27280@linux.vnet.ibm.com>
	<20150730153251.GL25159@twins.programming.kicks-ass.net>
	<20150730155454.GH27280@linux.vnet.ibm.com>
	<20150730162202.GN25159@twins.programming.kicks-ass.net>
In-Reply-To: <20150730162202.GN25159@twins.programming.kicks-ass.net>
List-ID: X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jul 30, 2015 at 06:22:02PM
+0200, Peter Zijlstra wrote:
> On Thu, Jul 30, 2015 at 08:54:54AM -0700, Paul E. McKenney wrote:
>
> > Good point, and it already does, and I clearly was confused, apologies.
> >
> > So the real way to make this happen is (for example) to build
> > with CONFIG_RCU_FANOUT=2 and CONFIG_RCU_FANOUT_LEAF=16 (the
> > default), which could accommodate up to 128 CPUs. Then boot with
> > rcutree.rcu_fanout_leaf=2 on a system with more than 16 CPUs, with
> > rcutree.rcu_fanout_leaf=3 on a system with more than 24 CPUs, and so on.
>
> Ah, runtime overrides and operator error, but then we can WARN(), reset
> the arguments and try again, right? No need to panic the machine and
> fail to boot.

Good point, like the patch below? Which also legitimizes my example
after the fact, as it previously simply prohibited having
rcutree.rcu_fanout_leaf less than CONFIG_RCU_FANOUT_LEAF. :-/

> > Of course, the truly macho way to get this error message is to build
> > with CONFIG_RCU_FANOUT=64 and CONFIG_RCU_FANOUT_LEAF=64, then boot with
> > rcutree.rcu_fanout_leaf=63 on a system with more than 16,515,072 CPUs.
> > Of course, you get serious style points if the system manages to stay
> > up for more than 24 hours without a hardware failure. ;-)
>
> Yes, I'll go power up the nuclear reactor in the basement first :-)

Only one? ;-)

							Thanx, Paul

------------------------------------------------------------------------

rcu: Eliminate panic when silly boot-time fanout specified

This commit loosens rcutree.rcu_fanout_leaf range checks and replaces a
panic() with a fallback to compile-time values. This fallback is
accompanied by a WARN_ON(), and both occur when the
rcutree.rcu_fanout_leaf value is too small to accommodate the number of
CPUs. For example, given the current four-level limit for the rcu_node
tree, a system with more than 16 CPUs built with CONFIG_RCU_FANOUT=2
must have rcutree.rcu_fanout_leaf larger than 2.

Reported-by: Peter Zijlstra
Signed-off-by: Paul E. McKenney

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 01b5b68a237a..2a5d4696bdb9 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -3059,9 +3059,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			cache-to-cache transfer latencies.
 
 	rcutree.rcu_fanout_leaf= [KNL]
-			Increase the number of CPUs assigned to each
-			leaf rcu_node structure. Useful for very large
-			systems.
+			Change the number of CPUs assigned to each
+			leaf rcu_node structure. Useful for very
+			large systems, which will choose the value 64,
+			and for NUMA systems with large remote-access
+			latencies, which will choose a value aligned
+			with the appropriate hardware boundaries.
 
 	rcutree.jiffies_till_sched_qs= [KNL]
 			Set required age in jiffies for a
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index ce43fac5ff91..9f8040396d3e 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4216,13 +4216,12 @@ static void __init rcu_init_geometry(void)
 			rcu_fanout_leaf, nr_cpu_ids);
 
 	/*
-	 * The boot-time rcu_fanout_leaf parameter is only permitted
-	 * to increase the leaf-level fanout, not decrease it. Of course,
-	 * the leaf-level fanout cannot exceed the number of bits in
-	 * the rcu_node masks. Complain and fall back to the compile-
-	 * time values if these limits are exceeded.
+	 * The boot-time rcu_fanout_leaf parameter must be at least two
+	 * and cannot exceed the number of bits in the rcu_node masks.
+	 * Complain and fall back to the compile-time values if this
+	 * limit is exceeded.
 	 */
-	if (rcu_fanout_leaf < RCU_FANOUT_LEAF ||
+	if (rcu_fanout_leaf < 2 ||
 	    rcu_fanout_leaf > sizeof(unsigned long) * 8) {
 		rcu_fanout_leaf = RCU_FANOUT_LEAF;
 		WARN_ON(1);
@@ -4239,10 +4238,13 @@ static void __init rcu_init_geometry(void)
 
 	/*
 	 * The tree must be able to accommodate the configured number of CPUs.
-	 * If this limit is exceeded than we have a serious problem elsewhere.
+	 * If this limit is exceeded, fall back to the compile-time values.
 	 */
-	if (nr_cpu_ids > rcu_capacity[RCU_NUM_LVLS - 1])
-		panic("rcu_init_geometry: rcu_capacity[] is too small");
+	if (nr_cpu_ids > rcu_capacity[RCU_NUM_LVLS - 1]) {
+		rcu_fanout_leaf = RCU_FANOUT_LEAF;
+		WARN_ON(1);
+		return;
+	}
 
 	/* Calculate the number of levels in the tree. */
 	for (i = 0; nr_cpu_ids > rcu_capacity[i]; i++) {
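
The capacity arithmetic behind the examples in this thread can be sketched in standalone C. This is only an illustration under stated assumptions, not the kernel code: the names sketch_capacity(), sketch_pick_leaf(), and SKETCH_NUM_LVLS are hypothetical, and the sketch assumes the four-level limit mentioned in the commit log, modeled as one leaf level plus three interior levels that all share the same fanout.

```c
#include <assert.h>

/* Assumption: four-level limit from the commit log (1 leaf + 3 interior). */
#define SKETCH_NUM_LVLS 4

/*
 * Hypothetical helper: the largest number of CPUs a full-depth tree can
 * cover, given the leaf fanout and the interior fanout.
 */
static unsigned long sketch_capacity(unsigned long leaf, unsigned long fanout)
{
	unsigned long cap = leaf;
	int i;

	/* Each interior level multiplies the capacity by the fanout. */
	for (i = 1; i < SKETCH_NUM_LVLS; i++)
		cap *= fanout;
	return cap;
}

/*
 * Hypothetical helper mirroring the patch's two checks: return the
 * boot-time leaf fanout if it is usable, else the compile-time default.
 * The real code also issues WARN_ON(1) on each fallback path.
 */
static unsigned long sketch_pick_leaf(unsigned long boot_leaf,
				      unsigned long default_leaf,
				      unsigned long fanout,
				      unsigned long ncpus)
{
	/* Range check: at least two, at most the bits in an unsigned long. */
	if (boot_leaf < 2 || boot_leaf > sizeof(unsigned long) * 8)
		return default_leaf;

	/* Capacity check: the tree must be able to hold every CPU. */
	if (ncpus > sketch_capacity(boot_leaf, fanout))
		return default_leaf;

	return boot_leaf;
}
```

Under these assumptions, sketch_capacity(16, 2) gives the 128 CPUs quoted above, sketch_capacity(2, 2) gives 16, and sketch_capacity(63, 64) gives 16,515,072, so sketch_pick_leaf(2, 16, 2, 17) falls back to the default of 16 rather than failing.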