From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751298AbcCHSSS (ORCPT ); Tue, 8 Mar 2016 13:18:18 -0500
Received: from e35.co.us.ibm.com ([32.97.110.153]:42981
        "EHLO e35.co.us.ibm.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1750872AbcCHSSH (ORCPT );
        Tue, 8 Mar 2016 13:18:07 -0500
X-IBM-Helo: d03dlp03.boulder.ibm.com
X-IBM-MailFrom: paulmck@linux.vnet.ibm.com
X-IBM-RcptTo: linux-kernel@vger.kernel.org
Date: Tue, 8 Mar 2016 10:18:07 -0800
From: "Paul E. McKenney"
To: Ingo Molnar
Cc: linux-kernel@vger.kernel.org, yang.shi@linaro.org, tj@kernel.org,
        paul.gortmaker@windriver.com, boqun.feng@gmail.com,
        tglx@linutronix.de, gang.chen.5i5j@gmail.com, sj38.park@gmail.com
Subject: Re: [GIT PULL rcu/next] RCU commits for 4.6
Message-ID: <20160308181807.GA29849@linux.vnet.ibm.com>
Reply-To: paulmck@linux.vnet.ibm.com
References: <20160307202516.GA15509@linux.vnet.ibm.com>
        <20160308085342.GA12413@gmail.com>
        <20160308152109.GK3577@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160308152109.GK3577@linux.vnet.ibm.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-TM-AS-MML: disable
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 16030818-0013-0000-0000-00001DBB24D2
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 08, 2016 at 07:21:09AM -0800, Paul E. McKenney wrote:
> On Tue, Mar 08, 2016 at 09:53:42AM +0100, Ingo Molnar wrote:
> > * Paul E. McKenney wrote:

[ . . . ]

> > Pulled, thanks a lot Paul!
> >
> > So I've done the conflict resolutions with tip:smp/hotplug and tip:sched/core
> > myself, and came up with a mostly identical resolution, except this difference
> > with your resolution in wagi.2016.03.01a:
> >
> > --- linux-next/kernel/rcu/tree.c
> > +++ tip/kernel/rcu/tree.c
> > @@ -2046,8 +2046,8 @@ static void rcu_gp_cleanup(struct rcu_st
> >  		/* smp_mb() provided by prior unlock-lock pair. */
> >  		nocb += rcu_future_gp_cleanup(rsp, rnp);
> >  		sq = rcu_nocb_gp_get(rnp);
> > -		raw_spin_unlock_irq_rcu_node(rnp);
> >  		rcu_nocb_gp_cleanup(sq);
> > +		raw_spin_unlock_irq_rcu_node(rnp);
> >  		cond_resched_rcu_qs();
> >  		WRITE_ONCE(rsp->gp_activity, jiffies);
> >  		rcu_gp_slow(rsp, gp_cleanup_delay);
> >
> > but your resolution is better, rcu_nocb_gp_cleanup() can (and should) be done
> > outside of the rcu_node lock.
> >
> > So we have the same resolution now, which is good! ;-)
>
> Glad we were close!
>
> Just for purposes of satisfying curiosity, I am running rcutorture on your
> version. ;-)

And for whatever it is worth, in one of the sixteen rcutorture scenarios
lockdep complained as shown below.

On the other hand, your version quite possibly makes a lost-wakeup bug
happen more frequently. If my current quest to create a torture test
specific to this bug fails, I will revisit your patch. So despite the
lockdep splat, it is quite possible that I will be thanking you for it
at some point. ;-)

							Thanx, Paul

------------------------------------------------------------------------

[ 0.546319] =================================
[ 0.547000] [ INFO: inconsistent lock state ]
[ 0.547000] 4.5.0-rc6+ #1 Not tainted
[ 0.547000] ---------------------------------
[ 0.547000] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[ 0.547000] swapper/0/0 [HC0[0]:SC1[1]:HE0:SE0] takes:
[ 0.547000] (rcu_node_2){+.?...}, at: [] rcu_process_callbacks+0xf4/0x860
[ 0.547000] {SOFTIRQ-ON-W} state was registered at:
[ 0.547000] [] mark_held_locks+0x66/0x90
[ 0.547000] [] trace_hardirqs_on_caller+0xf4/0x1c0
[ 0.547000] [] trace_hardirqs_on+0xd/0x10
[ 0.547000] [] _raw_spin_unlock_irq+0x27/0x50
[ 0.547000] [] swake_up_all+0xb6/0xd0
[ 0.547000] [] rcu_gp_kthread+0x835/0xaf0
[ 0.547000] [] kthread+0xdf/0x100
[ 0.547000] [] ret_from_fork+0x3f/0x70
[ 0.547000] irq event stamp: 34721
[ 0.547000] hardirqs last enabled at (34720): [] note_gp_changes+0x43/0xa0
[ 0.547000] hardirqs last disabled at (34721): [] _raw_spin_lock_irqsave+0x17/0x60
[ 0.547000] softirqs last enabled at (34712): [] _local_bh_enable+0x1c/0x50
[ 0.547000] softirqs last disabled at (34713): [] irq_exit+0xa5/0xb0
[ 0.547000]
[ 0.547000] other info that might help us debug this:
[ 0.547000] Possible unsafe locking scenario:
[ 0.547000]
[ 0.547000] CPU0
[ 0.547000] ----
[ 0.547000] lock(rcu_node_2);
[ 0.547000]
[ 0.547000] lock(rcu_node_2);
[ 0.547000]
[ 0.547000] *** DEADLOCK ***
[ 0.547000]
[ 0.547000] no locks held by swapper/0/0.
[ 0.547000]
[ 0.547000] stack backtrace:
[ 0.547000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.5.0-rc6+ #1
[ 0.547000] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 0.547000] 0000000000000000 ffff88001fc03cc0 ffffffff813730ee ffffffff81e1d500
[ 0.547000] ffffffff83874a20 ffff88001fc03d10 ffffffff8113cf77 0000000000000001
[ 0.547000] ffffffff00000000 ffff880000000000 0000000000000006 ffffffff81e1d500
[ 0.547000] Call Trace:
[ 0.547000] [] dump_stack+0x67/0x99
[ 0.547000] [] print_usage_bug+0x1f2/0x203
[ 0.547000] [] ? check_usage_backwards+0x120/0x120
[ 0.547000] [] mark_lock+0x212/0x2a0
[ 0.547000] [] __lock_acquire+0x397/0x1b50
[ 0.547000] [] ? update_blocked_averages+0x3e/0x4a0
[ 0.547000] [] ? _raw_spin_unlock_irqrestore+0x55/0x70
[ 0.547000] [] ? trace_hardirqs_on_caller+0xa4/0x1c0
[ 0.547000] [] ? rebalance_domains+0x10a/0x3b0
[ 0.547000] [] lock_acquire+0xc5/0x1e0
[ 0.547000] [] ? rcu_process_callbacks+0xf4/0x860
[ 0.547000] [] _raw_spin_lock_irqsave+0x41/0x60
[ 0.547000] [] ? rcu_process_callbacks+0xf4/0x860
[ 0.547000] [] rcu_process_callbacks+0xf4/0x860
[ 0.547000] [] ? run_rebalance_domains+0x1c8/0x1f0
[ 0.547000] [] __do_softirq+0x139/0x490
[ 0.547000] [] irq_exit+0xa5/0xb0
[ 0.547000] [] smp_apic_timer_interrupt+0x3d/0x50
[ 0.547000] [] apic_timer_interrupt+0x89/0x90
[ 0.547000] [] ? default_idle+0x18/0x1a0
[ 0.547000] [] ? default_idle+0x16/0x1a0
[ 0.547000] [] arch_cpu_idle+0xa/0x10
[ 0.547000] [] default_idle_call+0x25/0x40
[ 0.547000] [] cpu_startup_entry+0x298/0x3c0
[ 0.547000] [] rest_init+0x12f/0x140
[ 0.547000] [] ? csum_partial_copy_generic+0x170/0x170
[ 0.547000] [] start_kernel+0x435/0x442
[ 0.547000] [] ? set_init_arg+0x55/0x55
[ 0.547000] [] x86_64_start_reservations+0x2a/0x2c
[ 0.547000] [] x86_64_start_kernel+0xea/0xed
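
------------------------------------------------------------------------

One possible reading of the splat above, as a rough user-space sketch in C.
The {SOFTIRQ-ON-W} registration runs through _raw_spin_unlock_irq() under
swake_up_all(), which suggests that the wakeup (presumably reached via
rcu_nocb_gp_cleanup()) re-enabled interrupts while the rcu_node lock was
still held, so a later softirq on the same CPU could try to take that same
lock. Everything named sim_* below is hypothetical and exists only for this
illustration. None of it is kernel API, and it models only the ordering,
not real locking.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-CPU model; these are NOT kernel data structures. */
struct sim_cpu {
	bool irqs_enabled;	/* models the CPU interrupt-enable flag */
	bool node_lock_held;	/* models holding the rcu_node lock */
	bool violation;		/* irqs were enabled while the lock was held */
};

static void sim_irq_disable(struct sim_cpu *c)
{
	c->irqs_enabled = false;
}

static void sim_irq_enable(struct sim_cpu *c)
{
	c->irqs_enabled = true;
	/* An interrupt arriving now could try to take the held lock. */
	if (c->node_lock_held)
		c->violation = true;
}

/* Models raw_spin_lock_irq_rcu_node(): disable irqs, then take the lock. */
static void sim_node_lock_irq(struct sim_cpu *c)
{
	sim_irq_disable(c);
	assert(!c->node_lock_held);
	c->node_lock_held = true;
}

/* Models raw_spin_unlock_irq_rcu_node(): drop the lock, then enable irqs. */
static void sim_node_unlock_irq(struct sim_cpu *c)
{
	assert(c->node_lock_held);
	c->node_lock_held = false;
	sim_irq_enable(c);
}

/*
 * Models a wakeup helper that takes its own lock with irqs disabled and
 * unconditionally re-enables irqs on unlock (assumption drawn from the
 * _raw_spin_unlock_irq frame under swake_up_all in the trace).
 */
static void sim_wake_up_all(struct sim_cpu *c)
{
	sim_irq_disable(c);
	/* ... wake the waiters ... */
	sim_irq_enable(c);
}

/* Ordering from the tip side of the quoted diff: wake, then unlock. */
static bool sim_wake_inside_lock(void)
{
	struct sim_cpu c = { .irqs_enabled = true };

	sim_node_lock_irq(&c);
	sim_wake_up_all(&c);
	sim_node_unlock_irq(&c);
	return c.violation;
}

/* Ordering from the linux-next side of the quoted diff: unlock, then wake. */
static bool sim_wake_outside_lock(void)
{
	struct sim_cpu c = { .irqs_enabled = true };

	sim_node_lock_irq(&c);
	sim_node_unlock_irq(&c);
	sim_wake_up_all(&c);
	return c.violation;
}
```

In this model sim_wake_inside_lock() returns true and
sim_wake_outside_lock() returns false, which matches the observation in the
quoted text that rcu_nocb_gp_cleanup() should be done outside of the
rcu_node lock. The model says nothing about the lost-wakeup question.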