From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1762314AbYDZUxR (ORCPT ); Sat, 26 Apr 2008 16:53:17 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1759027AbYDZUxF (ORCPT ); Sat, 26 Apr 2008 16:53:05 -0400 Received: from e36.co.us.ibm.com ([32.97.110.154]:53576 "EHLO e36.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1758860AbYDZUxC (ORCPT ); Sat, 26 Apr 2008 16:53:02 -0400 Date: Sat, 26 Apr 2008 13:52:57 -0700 From: "Paul E. McKenney" To: Andrew Morton Cc: linux-kernel@vger.kernel.org Subject: Re: [PATCH] fix rcu_try_flip_waitack_needed() to prevent grace-period stall Message-ID: <20080426205257.GA21687@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <20080321203821.GA16316@linux.vnet.ibm.com> <20080426074327.28db1bfc.akpm@linux-foundation.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20080426074327.28db1bfc.akpm@linux-foundation.org> User-Agent: Mutt/1.5.13 (2006-08-11) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sat, Apr 26, 2008 at 07:43:27AM -0700, Andrew Morton wrote: > On Fri, 21 Mar 2008 13:38:21 -0700 "Paul E. McKenney" wrote: > > > The comment was correct -- need to make the code match the comment. > > Without this patch, if a CPU goes dynticks idle (and stays there forever) > > in just the right phase of preemptible-RCU grace-period processing, > > grace periods stall. The offending sequence of events (courtesy > > of Promela/spin, at least after I got the liveness criterion coded > > correctly...) is as follows: > > > > o CPU 0 is in dynticks-idle mode. Its dynticks_progress_counter > > is (say) 10. > > > > o CPU 0 takes an interrupt, so rcu_irq_enter() increments CPU 0's > > dynticks_progress_counter to 11. > > > > o CPU 1 is doing RCU grace-period processing in rcu_try_flip_idle(), > > sees rcu_pending(), so invokes dyntick_save_progress_counter(), > > which in turn takes a snapshot of CPU 0's dynticks_progress_counter > > into CPU 0's rcu_dyntick_snapshot -- now set to 11. CPU 1 then > > updates the RCU grace-period state to rcu_try_flip_waitack(). > > > > o CPU 0 returns from its interrupt, so rcu_irq_exit() increments > > CPU 0's dynticks_progress_counter to 12. > > > > o CPU 1 later invokes rcu_try_flip_waitack(), which notices that > > CPU 0 has not yet responded, and hence in turn invokes > > rcu_try_flip_waitack_needed(). This function examines the > > state of CPU 0's dynticks_progress_counter and rcu_dyntick_snapshot > > variables, which it copies to curr (== 12) and snap (== 11), > > respectively. > > > > Because curr!=snap, the first condition fails. > > > > Because curr-snap is only 1 and snap is odd, the second > > condition fails. > > > > rcu_try_flip_waitack_needed() therefore incorrectly concludes > > that it must wait for CPU 0 to explicitly acknowledge the > > counter flip. > > > > o CPU 0 remains forever in dynticks-idle mode, never taking > > any more hardware interrupts or any NMIs, and never running > > any more tasks. (Of course, -something- will usually eventually > > happen, which might be why we haven't seen this one in the > > wild. Still should be fixed!) > > > > Therefore the grace period never ends. Fix is to make the code match > > the comment, as shown below. With this fix, the above scenario > > would be satisfied with curr being even, and allow the grace period > > to proceed. > > Am having a ton of fun here putting my tree back together after a week's > worth of whee-look-at-all-the-stuff-ive-never-seen-before-which-just-got-merged > discoveries. (Which are not too bad actually) Linus did seem to be a bit more active than usual this time around. ;-) > This patch ran afoul of this change in Linus's tree: > > --- a/kernel/rcupreempt.c > +++ b/kernel/rcupreempt.c > @@ -1007,10 +1007,10 @@ void __synchronize_sched(void) > if (sched_getaffinity(0, &oldmask) < 0) > oldmask = cpu_possible_map; > for_each_online_cpu(cpu) { > - sched_setaffinity(0, cpumask_of_cpu(cpu)); > + sched_setaffinity(0, &cpumask_of_cpu(cpu)); > schedule(); > } > - sched_setaffinity(0, oldmask); > + sched_setaffinity(0, &oldmask); > } > EXPORT_SYMBOL_GPL(__synchronize_sched); > > I fixed it by simply removing the above changed lines. Please check that > the result makes sense and that we don't need to carry the above change > forward in any way? Yes, this change to the sched_setaffinity() API shouldn't affect the rest of the patch in any way. The patch below looks sensible. I guess my next forward-port of the call_rcu_sched() patch should be something I get started on quickly, as it might take longer than usual. ;-) BTW, the above code is removed be the call_rcu_sched() patch in any case. Thanx, Paul > I also removed the Cc:stable from this patch based on your followup > discussion with Peter. > > Thanks. > > > > From: "Paul E. McKenney" > > The comment was correct -- need to make the code match the comment. > Without this patch, if a CPU goes dynticks idle (and stays there forever) > in just the right phase of preemptible-RCU grace-period processing, > grace periods stall. The offending sequence of events (courtesy > of Promela/spin, at least after I got the liveness criterion coded > correctly...) is as follows: > > o CPU 0 is in dynticks-idle mode. Its dynticks_progress_counter > is (say) 10. > > o CPU 0 takes an interrupt, so rcu_irq_enter() increments CPU 0's > dynticks_progress_counter to 11. > > o CPU 1 is doing RCU grace-period processing in rcu_try_flip_idle(), > sees rcu_pending(), so invokes dyntick_save_progress_counter(), > which in turn takes a snapshot of CPU 0's dynticks_progress_counter > into CPU 0's rcu_dyntick_snapshot -- now set to 11. CPU 1 then > updates the RCU grace-period state to rcu_try_flip_waitack(). > > o CPU 0 returns from its interrupt, so rcu_irq_exit() increments > CPU 0's dynticks_progress_counter to 12. > > o CPU 1 later invokes rcu_try_flip_waitack(), which notices that > CPU 0 has not yet responded, and hence in turn invokes > rcu_try_flip_waitack_needed(). This function examines the > state of CPU 0's dynticks_progress_counter and rcu_dyntick_snapshot > variables, which it copies to curr (== 12) and snap (== 11), > respectively. > > Because curr!=snap, the first condition fails. > > Because curr-snap is only 1 and snap is odd, the second > condition fails. > > rcu_try_flip_waitack_needed() therefore incorrectly concludes > that it must wait for CPU 0 to explicitly acknowledge the > counter flip. > > o CPU 0 remains forever in dynticks-idle mode, never taking > any more hardware interrupts or any NMIs, and never running > any more tasks. (Of course, -something- will usually eventually > happen, which might be why we haven't seen this one in the > wild. Still should be fixed!) > > Therefore the grace period never ends. Fix is to make the code match > the comment, as shown below. With this fix, the above scenario > would be satisfied with curr being even, and allow the grace period > to proceed. > > Signed-off-by: Paul E. McKenney > Cc: Peter Zijlstra > Cc: Ingo Molnar > Cc: Josh Triplett > Cc: Dipankar Sarma > Signed-off-by: Andrew Morton > --- > > kernel/rcupreempt.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff -puN kernel/rcupreempt.c~rcu-fix-rcu_try_flip_waitack_needed-to-prevent-grace-period-stall kernel/rcupreempt.c > --- a/kernel/rcupreempt.c~rcu-fix-rcu_try_flip_waitack_needed-to-prevent-grace-period-stall > +++ a/kernel/rcupreempt.c > @@ -567,7 +567,7 @@ rcu_try_flip_waitack_needed(int cpu) > * that this CPU already acknowledged the counter. > */ > > - if ((curr - snap) > 2 || (snap & 0x1) == 0) > + if ((curr - snap) > 2 || (curr & 0x1) == 0) > return 0; > > /* We need this CPU to explicitly acknowledge the counter flip. */ > _ >