From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757930AbYHaTRc (ORCPT ); Sun, 31 Aug 2008 15:17:32 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1756014AbYHaTRY (ORCPT ); Sun, 31 Aug 2008 15:17:24 -0400 Received: from e31.co.us.ibm.com ([32.97.110.149]:60839 "EHLO e31.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755957AbYHaTRX (ORCPT ); Sun, 31 Aug 2008 15:17:23 -0400 Date: Sun, 31 Aug 2008 12:17:21 -0700 From: "Paul E. McKenney" To: Manfred Spraul Cc: linux-kernel@vger.kernel.org, Ingo Molnar , akpm@linux-foundation.org Subject: Re: [PATCH] kernel/cpu.c: Move the CPU_DYING notifiers Message-ID: <20080831191721.GH7015@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <200808311809.m7VI9whf014532@mail.q-ag.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200808311809.m7VI9whf014532@mail.q-ag.de> User-Agent: Mutt/1.5.15+20070412 (2007-04-11) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sun, Aug 31, 2008 at 07:58:49PM +0200, Manfred Spraul wrote: > When a cpu is taken offline, the CPU_DYING notifiers are called on the > dying cpu. According to , the cpu should be "not > running any task, not handling interrupts, soon dead". > > For the current implementation, this is not true: > - __cpu_disable can fail. If it fails, then the cpu will remain alive > and happy. > - At least on x86, __cpu_disable() briefly enables the local interrupts > to handle any outstanding interrupts. > > What about moving CPU_DYING down a few lines, behind the __cpu_disable() > line? > There are only two CPU_DYING handlers in the kernel right now: one in > kvm, one in the scheduler. Both should work with the patch applied > [and: I'm not sure if either one handles a failing __cpu_disable()] > > The patch survives simple offlining a cpu. kvm untested due to lack > of a test setup. Several architectures re-enable interrupts in __cpu_disable() or in functions called from __cpu_disable(), which happens after CPU_DYING, if I understand correctly. :-( Thanx, Paul > Signed-Off-By: Manfred Spraul > --- > kernel/cpu.c | 5 +++-- > 1 files changed, 3 insertions(+), 2 deletions(-) > > diff --git a/kernel/cpu.c b/kernel/cpu.c > index e202a68..5b7c88f 100644 > --- a/kernel/cpu.c > +++ b/kernel/cpu.c > @@ -199,13 +199,14 @@ static int __ref take_cpu_down(void *_param) > struct take_cpu_down_param *param = _param; > int err; > > - raw_notifier_call_chain(&cpu_chain, CPU_DYING | param->mod, > - param->hcpu); > /* Ensure this CPU doesn't handle any more interrupts. */ > err = __cpu_disable(); > if (err < 0) > return err; > > + raw_notifier_call_chain(&cpu_chain, CPU_DYING | param->mod, > + param->hcpu); > + > /* Force idle task to run as soon as we yield: it should > immediately notice cpu is offline and die quickly. */ > sched_idle_next(); > -- > 1.5.5.1 >