From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Fri, 23 Mar 2012 12:23:35 -0700
From: "Paul E. McKenney"
To: Mike Galbraith
Cc: Dimitri Sivanich, linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC] rcu: Limit GP initialization to CPUs that have been online
Message-ID: <20120323192335.GZ2450@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
References: <1331779343.14263.6.camel@marge.simpson.net> <1331780831.14263.15.camel@marge.simpson.net> <20120315175933.GB8705@sgi.com> <1331882861.11010.13.camel@marge.simpson.net> <1331885384.11010.15.camel@marge.simpson.net> <1331887535.11010.18.camel@marge.simpson.net> <20120316172850.GC31290@sgi.com> <1332430533.11517.75.camel@marge.simpson.net> <20120322202418.GA8569@sgi.com> <1332478086.5721.17.camel@marge.simpson.net>
In-Reply-To: <1332478086.5721.17.camel@marge.simpson.net>
User-Agent: Mutt/1.5.21 (2010-09-15)

On Fri, Mar 23, 2012 at 05:48:06AM +0100, Mike Galbraith wrote:
> On Thu, 2012-03-22 at 15:24 -0500, Dimitri Sivanich wrote:
> > On Thu, Mar 22, 2012 at 04:35:33PM +0100, Mike Galbraith wrote:
> >
> > > > This patch also shows great improvement in the two
> > > > rcu_for_each_node_breadth_first() (nothing over 20 usec and most
> > > > less than 10 in initial testing).
> > > >
> > > > However, there are spinlock holdoffs at the following tracebacks (my
> > > > nmi handler does work on the 3.0 kernel):
> > > >
> > > > [  584.157019]  [] nmi+0x20/0x30
> > > > [  584.157023]  [] _raw_spin_lock_irqsave+0x1a/0x30
> > > > [  584.157026]  [] force_qs_rnp+0x58/0x170
> > > > [  584.157030]  [] force_quiescent_state+0x162/0x1d0
> > > > [  584.157033]  [] __rcu_process_callbacks+0x165/0x200
> > > > [  584.157037]  [] rcu_process_callbacks+0x1d/0x80
> > > > [  584.157041]  [] __do_softirq+0xef/0x220
> > > > [  584.157044]  [] call_softirq+0x1c/0x30
> > > > [  584.157048]  [] do_softirq+0x65/0xa0
> > > > [  584.157051]  [] irq_exit+0xb5/0xe0
> > > > [  584.157054]  [] smp_apic_timer_interrupt+0x68/0xa0
> > > > [  584.157057]  [] apic_timer_interrupt+0x13/0x20
> > > > [  584.157061]  [] native_safe_halt+0x2/0x10
> > > > [  584.157064]  [] default_idle+0x145/0x150
> > > > [  584.157067]  [] cpu_idle+0x66/0xc0
> > >
> > > Care to try this?  There's likely a better way to defeat ->qsmask == 0
> > > take/release all locks thingy, however, if Paul can safely bail in
> > > force_qs_rnp() in tweakable latency for big boxen patch, I should be
> > > able to safely (and shamelessly) steal that, and should someone hotplug
> > > a CPU, and we race, do the same thing bail for small boxen.
> >
> > Tested on a 48 cpu UV system with an interrupt latency test on isolated
> > cpus and a moderate to heavy load on the rest of the system.
> >
> > This patch appears to take care of all excessive (> 35 usec) RCU-based
> > latency in the 3.0 kernel on this particular system for this particular
> > setup.  Without the patch, I see many latencies on this system > 150 usec
> > (and some > 200 usec).
>
> Figures.  I bet Paul has a better idea though.  Too bad we can't whack
> those extra barriers, that would likely wipe RCU from your radar.

Sorry for the silence -- was hit by the germs going around.
I do have some concerns about some of the code, but very much appreciate
the two of you continuing on this in my absence!

							Thanx, Paul