From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754380Ab2EaWEF (ORCPT ); Thu, 31 May 2012 18:04:05 -0400
Received: from e32.co.us.ibm.com ([32.97.110.150]:37024 "EHLO
	e32.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753910Ab2EaWEC (ORCPT );
	Thu, 31 May 2012 18:04:02 -0400
Date: Thu, 31 May 2012 15:02:29 -0700
From: "Paul E. McKenney"
To: Frederic Weisbecker
Cc: LKML, linaro-sched-sig@lists.linaro.org, Alessio Igor Bogani,
	Andrew Morton, Avi Kivity, Chris Metcalf, Christoph Lameter,
	Daniel Lezcano, Geoff Levand, Gilad Ben Yossef, Hakan Akkan,
	Ingo Molnar, Kevin Hilman, Max Krasnyansky, Peter Zijlstra,
	Stephen Hemminger, Steven Rostedt, Sven-Thorsten Dietrich,
	Thomas Gleixner
Subject: Re: [PATCH 11/41] nohz/cpuset: Don't turn off the tick if rcu needs it
Message-ID: <20120531220229.GM2357@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
References: <1335830115-14335-1-git-send-email-fweisbec@gmail.com>
	<1335830115-14335-12-git-send-email-fweisbec@gmail.com>
	<20120522171658.GA8087@linux.vnet.ibm.com>
	<20120523135205.GB1663@somewhere>
	<20120523151537.GA2402@linux.vnet.ibm.com>
	<20120523160629.GJ1663@somewhere>
	<20120523162739.GE2402@linux.vnet.ibm.com>
	<20120531160117.GD27841@somewhere.redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20120531160117.GD27841@somewhere.redhat.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 12053122-3270-0000-0000-000006CDF7AF
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, May 31, 2012 at 06:01:21PM +0200, Frederic Weisbecker wrote:
> On Wed, May 23, 2012 at 09:27:39AM -0700, Paul E. McKenney wrote:
> > On Wed, May 23, 2012 at 06:06:33PM +0200, Frederic Weisbecker wrote:
> > > On Wed, May 23, 2012 at 08:15:42AM -0700, Paul E.
McKenney wrote:
> > > > On Wed, May 23, 2012 at 03:52:09PM +0200, Frederic Weisbecker wrote:
> > > > > > > +#ifdef CONFIG_CPUSETS_NO_HZ
> > > > > > > +static bool can_stop_adaptive_tick(void)
> > > > > > > +{
> > > > > > > +	if (!sched_can_stop_tick())
> > > > > > > +		return false;
> > > > > > > +
> > > > > > > +	/* Is there a grace period to complete ? */
> > > > > > > +	if (rcu_pending(smp_processor_id()))
> > > > > >
> > > > > > You lost me on this one.  Why can't this be rcu_needs_cpu()?
> > > > >
> > > > > We already have an rcu_needs_cpu() check in tick_nohz_stop_sched_tick()
> > > > > that prevents the tick to shut down if the CPU has local callbacks to handle.
> > > > >
> > > > > The rcu_pending() check is there in case some other CPU is waiting for the
> > > > > current one to help completing a grace period, by reporting a quiescent state
> > > > > for example. This happens because we may stop the tick in the kernel, not only
> > > > > userspace. And if we are in the kernel, we still need to be part of the global
> > > > > state machine.
> > > >
> > > > Ah!  But RCU will notice that the CPU is in dyntick-idle mode, and will
> > > > therefore take any needed quiescent-state action on that CPU's behalf.
> > > > So there should be no need to call rcu_pending() anywhere outside of the
> > > > RCU core code.
> > >
> > > No. If the tick is stopped and we are in the kernel, we may be using RCU
> > > anytime, so we need to be part of the RCU core.
> >
> > OK, so the only problem is if we spend a long time CPU-bound in the kernel,
> > where "long" is milliseconds or tens of milliseconds.  In that case, the
> > RCU core will notice that the CPU has not responded but is not idle, for
> > example, in rcu_implicit_dynticks_qs().  It can take action at this point
> > to get the offending CPU to pay attention to RCU.
> >
> > Does this make sense, or am I still missing something?
>
> Yeah that's exactly the purpose of the rcu_pending() check before shutting down
> the tick and the IPI to wake it up.

Hmmm...  We appear to be talking past each other.

If you use rcu_pending(), you defeat CONFIG_RCU_FAST_NO_HZ and thus fail
to shut off the tick in situations where the application does a system
call involving an RCU update every few tens of milliseconds.  This is
not good.

What we should do is call rcu_needs_cpu() instead of rcu_pending().
In the common case of short system calls, this will allow the tick to
be turned off a higher fraction of the time with no penalty.  In the
very unusual case where a system call runs CPU-bound for tens of
milliseconds, RCU's existing force_quiescent_state() machinery can
easily be used to force the CPU to pay attention to RCU.

Make sense, or am I missing something?

(And yes, the CONFIG_RCU_FAST_NO_HZ heuristics likely need to be adjusted
to better support adaptive ticks -- try less hard to retire callbacks,
for example.)

							Thanx, Paul
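
For illustration only, the difference between the two checks can be
sketched as a toy user-space model.  This is not kernel code: every
toy_* name and struct field below is hypothetical, and the model assumes
(per the descriptions in this thread) that rcu_needs_cpu() reports only
locally queued callbacks, while rcu_pending() also reports a grace
period that is waiting on this CPU for a quiescent state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-CPU snapshot; these fields are not real kernel state. */
struct toy_cpu {
	bool sched_ok;      /* scheduler would allow the tick to stop */
	bool has_local_cbs; /* RCU callbacks are queued on this CPU */
	bool gp_wants_qs;   /* a grace period awaits a QS from this CPU */
};

/* Stand-in for rcu_needs_cpu(): only local callbacks matter. */
static bool toy_rcu_needs_cpu(const struct toy_cpu *c)
{
	return c->has_local_cbs;
}

/* Stand-in for rcu_pending(): any RCU work at all for this CPU. */
static bool toy_rcu_pending(const struct toy_cpu *c)
{
	return c->has_local_cbs || c->gp_wants_qs;
}

/* Tick-stop decision as in the quoted patch (rcu_pending() variant). */
bool toy_can_stop_tick_pending(const struct toy_cpu *c)
{
	return c->sched_ok && !toy_rcu_pending(c);
}

/* Tick-stop decision using the rcu_needs_cpu() variant proposed above. */
bool toy_can_stop_tick_needs_cpu(const struct toy_cpu *c)
{
	return c->sched_ok && !toy_rcu_needs_cpu(c);
}

/* Small self-check: the variants differ only when a GP awaits this CPU. */
void toy_self_check(void)
{
	struct toy_cpu waiting = { true, false, true };
	struct toy_cpu busy = { true, true, false };

	assert(!toy_can_stop_tick_pending(&waiting));
	assert(toy_can_stop_tick_needs_cpu(&waiting));
	assert(!toy_can_stop_tick_pending(&busy));
	assert(!toy_can_stop_tick_needs_cpu(&busy));
}
```

In this model the two variants disagree only when some grace period is
waiting on the CPU but no local callbacks are queued, which is the case
where the force_quiescent_state() machinery would be expected to step in.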