Date: Fri, 1 May 2015 13:10:05 -0700
From: "Paul E. McKenney"
Reply-To: paulmck@linux.vnet.ibm.com
To: Rik van Riel
Cc: Linux Kernel Mailing List
Subject: Re: RCU recursion? (code inspection)
Message-ID: <20150501201005.GA15557@linux.vnet.ibm.com>
In-Reply-To: <20150501194102.GH5381@linux.vnet.ibm.com>

On Fri, May 01, 2015 at 12:41:02PM -0700, Paul E. McKenney wrote:
> On Fri, May 01, 2015 at 03:18:28PM -0400, Rik van Riel wrote:
> > Hi Paul,
> >
> > While looking at synchronize_rcu(), I noticed that
> > synchronize_rcu_expedited() calls synchronize_sched_expedited(),
> > which can call synchronize_sched() when it is worried about
> > the counter wrapping, which can call synchronize_sched_expedited().
> >
> > The code is sufficiently convoluted that I am unsure whether this
> > recursion can actually happen in practice, but I also did not spot
> > anything that would stop it.
>
> Hmmm...  Sounds like I should take a look!

And good catch!  The following patch should fix this.  Bad one on me,
given that all the other places in synchronize_sched_expedited() that
you would expect to invoke synchronize_sched() instead invoke
wait_rcu_gp(call_rcu_sched)...
							Thanx, Paul

------------------------------------------------------------------------

rcu: Make synchronize_sched_expedited() call wait_rcu_gp()

Currently, synchronize_sched_expedited() will call synchronize_sched()
if there is danger of counter wrap.  But if configuration says to
always do expedited grace periods, synchronize_sched() will just call
synchronize_sched_expedited() right back again.  In theory, the old
expedited operations will complete, the counters will get back in
synch, and the recursion will end.  But we could easily run out of
stack long before that time.

This commit therefore makes synchronize_sched_expedited() invoke the
underlying wait_rcu_gp(call_rcu_sched) instead of synchronize_sched(),
the same as all the other calls out from synchronize_sched_expedited().

This bug was introduced by commit 1924bcb02597 (Avoid counter wrap in
synchronize_sched_expedited()).

Reported-by: Rik van Riel
Signed-off-by: Paul E. McKenney

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index bcc59437fc93..4e6902005228 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3310,7 +3310,7 @@ void synchronize_sched_expedited(void)
 	if (ULONG_CMP_GE((ulong)atomic_long_read(&rsp->expedited_start),
 			 (ulong)atomic_long_read(&rsp->expedited_done) +
 			 ULONG_MAX / 8)) {
-		synchronize_sched();
+		wait_rcu_gp(call_rcu_sched);
 		atomic_long_inc(&rsp->expedited_wrap);
 		return;
 	}