From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756689AbbCCUP1 (ORCPT ); Tue, 3 Mar 2015 15:15:27 -0500
Received: from aserp1040.oracle.com ([141.146.126.69]:44476 "EHLO
	aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1756099AbbCCUP0 (ORCPT );
	Tue, 3 Mar 2015 15:15:26 -0500
Message-ID: <54F615D3.2040802@oracle.com>
Date: Tue, 03 Mar 2015 15:13:07 -0500
From: Boris Ostrovsky
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0
MIME-Version: 1.0
To: paulmck@linux.vnet.ibm.com
CC: linux-kernel@vger.kernel.org, mingo@kernel.org, laijs@cn.fujitsu.com,
	dipankar@in.ibm.com, akpm@linux-foundation.org,
	mathieu.desnoyers@efficios.com, josh@joshtriplett.org,
	tglx@linutronix.de, peterz@infradead.org, rostedt@goodmis.org,
	dhowells@redhat.com, edumazet@google.com, dvhart@linux.intel.com,
	fweisbec@gmail.com, oleg@redhat.com, bobby.prani@gmail.com,
	x86@kernel.org, Konrad Rzeszutek Wilk, David Vrabel,
	xen-devel@lists.xenproject.org
Subject: Re: [PATCH tip/core/rcu 02/20] x86: Use common outgoing-CPU-notification code
References: <20150303174144.GA13139@linux.vnet.ibm.com>
	<1425404595-17816-1-git-send-email-paulmck@linux.vnet.ibm.com>
	<1425404595-17816-2-git-send-email-paulmck@linux.vnet.ibm.com>
	<54F608C4.40405@oracle.com>
	<20150303194223.GR15405@linux.vnet.ibm.com>
In-Reply-To: <20150303194223.GR15405@linux.vnet.ibm.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-Source-IP: acsinet21.oracle.com [141.146.126.237]
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 03/03/2015 02:42 PM, Paul E. McKenney wrote:
> On Tue, Mar 03, 2015 at 02:17:24PM -0500, Boris Ostrovsky wrote:
>> On 03/03/2015 12:42 PM, Paul E. McKenney wrote:
>>>   }
>>> @@ -511,7 +508,8 @@ static void xen_cpu_die(unsigned int cpu)
>>>   		schedule_timeout(HZ/10);
>>>   	}
>>> -	cpu_die_common(cpu);
>>> +	(void)cpu_wait_death(cpu, 5);
>>> +	/* FIXME: Are the below calls really safe in case of timeout? */
>>
>>
>> Not for HVM guests (PV guests will only reach this point after
>> target cpu has been marked as down by the hypervisor).
>>
>> We need at least to have a message similar to what native_cpu_die()
>> prints on cpu_wait_death() failure. And I think we should not call
>> the two routines below (three, actually --- there is also
>> xen_teardown_timer() below, which is not part of the diff).
>>
>> -boris
>>
>>
>>>   	xen_smp_intr_free(cpu);
>>>   	xen_uninit_lock_cpu(cpu);
>
> So something like this, then?
>
> 	if (cpu_wait_death(cpu, 5)) {
> 		xen_smp_intr_free(cpu);
> 		xen_uninit_lock_cpu(cpu);
> 		xen_teardown_timer(cpu);
> 	} else pr_err("CPU %u didn't die...\n", cpu);
>
> Easy change for me to make if so!
>
> Or do I need some other check for HVM-vs.-PV guests, and, if so, what
> would that check be? And also if so, is it OK to online a PV guest's
> CPU that timed out during its previous offline?

I believe PV VCPUs will always be CPU_DEAD by the time we get here since
we are (indirectly) waiting for this in the loop at the beginning of
xen_cpu_die():

'while (xen_pv_domain() && HYPERVISOR_vcpu_op(VCPUOP_is_up, cpu, NULL))'
will exit only after 'HYPERVISOR_vcpu_op(VCPUOP_down, smp_processor_id()'
in xen_play_dead(). Which happens after play_dead_common() has marked
the cpu as CPU_DEAD.

So no test is needed.

Thanks.
-boris
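
The ordering argument above can be illustrated with a small user-space model. This is only a sketch, not kernel code: `struct vcpu_model`, `model_play_dead_common()`, `model_vcpu_down()` and `model_pv_cpu_die()` are hypothetical stand-ins for play_dead_common(), the VCPUOP_down hypercall and the wait loop at the top of xen_cpu_die(). It assumes, as described above, that the dying CPU marks itself CPU_DEAD before it issues VCPUOP_down.

```c
#include <stdbool.h>

/* Hypothetical model of the two pieces of state discussed in this thread. */
enum model_cpu_state { MODEL_CPU_ONLINE, MODEL_CPU_DEAD };

struct vcpu_model {
	enum model_cpu_state state;	/* what play_dead_common() would update */
	bool is_up;			/* what VCPUOP_is_up would report */
};

/* Dying CPU, step 1: stands in for play_dead_common() marking CPU_DEAD. */
static void model_play_dead_common(struct vcpu_model *v)
{
	v->state = MODEL_CPU_DEAD;
}

/* Dying CPU, step 2: stands in for the VCPUOP_down hypercall. */
static void model_vcpu_down(struct vcpu_model *v)
{
	v->is_up = false;
}

/*
 * Waiting CPU: poll is_up the way the loop at the top of xen_cpu_die()
 * polls VCPUOP_is_up, letting the dying CPU advance one step per poll.
 * Returns true if the state was already "dead" when the loop exited.
 */
bool model_pv_cpu_die(void)
{
	struct vcpu_model v = { MODEL_CPU_ONLINE, true };
	int step = 0;

	while (v.is_up) {
		if (step == 0)
			model_play_dead_common(&v);	/* mark dead first */
		else
			model_vcpu_down(&v);		/* then go down */
		step++;
	}

	/* The loop can only exit after step 2, which follows step 1. */
	return v.state == MODEL_CPU_DEAD;
}
```

In this model model_pv_cpu_die() returns true, because the only store that ends the polling loop comes after the store that marks the CPU dead. The HVM case is not modelled here, since the loop condition includes xen_pv_domain() and so does not wait for HVM guests.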