From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756689AbbCCUP1 (ORCPT <rfc822;w@1wt.eu>);
	Tue, 3 Mar 2015 15:15:27 -0500
Received: from aserp1040.oracle.com ([141.146.126.69]:44476 "EHLO
	aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1756099AbbCCUP0 (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Tue, 3 Mar 2015 15:15:26 -0500
Message-ID: <54F615D3.2040802@oracle.com>
Date: Tue, 03 Mar 2015 15:13:07 -0500
From: Boris Ostrovsky <boris.ostrovsky@oracle.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0
MIME-Version: 1.0
To: paulmck@linux.vnet.ibm.com
CC: linux-kernel@vger.kernel.org, mingo@kernel.org, laijs@cn.fujitsu.com,
        dipankar@in.ibm.com, akpm@linux-foundation.org,
        mathieu.desnoyers@efficios.com, josh@joshtriplett.org,
        tglx@linutronix.de, peterz@infradead.org, rostedt@goodmis.org,
        dhowells@redhat.com, edumazet@google.com, dvhart@linux.intel.com,
        fweisbec@gmail.com, oleg@redhat.com, bobby.prani@gmail.com,
        x86@kernel.org, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
        David Vrabel <david.vrabel@citrix.com>, xen-devel@lists.xenproject.org
Subject: Re: [PATCH tip/core/rcu 02/20] x86: Use common outgoing-CPU-notification
 code
References: <20150303174144.GA13139@linux.vnet.ibm.com> <1425404595-17816-1-git-send-email-paulmck@linux.vnet.ibm.com> <1425404595-17816-2-git-send-email-paulmck@linux.vnet.ibm.com> <54F608C4.40405@oracle.com> <20150303194223.GR15405@linux.vnet.ibm.com>
In-Reply-To: <20150303194223.GR15405@linux.vnet.ibm.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-Source-IP: acsinet21.oracle.com [141.146.126.237]
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 03/03/2015 02:42 PM, Paul E. McKenney wrote:
> On Tue, Mar 03, 2015 at 02:17:24PM -0500, Boris Ostrovsky wrote:
>> On 03/03/2015 12:42 PM, Paul E. McKenney wrote:
>>>   }
>>> @@ -511,7 +508,8 @@ static void xen_cpu_die(unsigned int cpu)
>>>   		schedule_timeout(HZ/10);
>>>   	}
>>> -	cpu_die_common(cpu);
>>> +	(void)cpu_wait_death(cpu, 5);
>>> +	/* FIXME: Are the below calls really safe in case of timeout? */
>>
>>
>> Not for HVM guests (PV guests will only reach this point after
>> target cpu has been marked as down by the hypervisor).
>>
>> We need at least to have a message similar to what native_cpu_die()
>> prints on cpu_wait_death() failure. And I think we should not call
>> the two routines below (three, actually --- there is also
>> xen_teardown_timer() below, which is not part of the diff).
>>
>> -boris
>>
>>
>>>   	xen_smp_intr_free(cpu);
>>>   	xen_uninit_lock_cpu(cpu);
>
> So something like this, then?
>
> 	if (cpu_wait_death(cpu, 5)) {
> 		xen_smp_intr_free(cpu);
> 		xen_uninit_lock_cpu(cpu);
> 		xen_teardown_timer(cpu);
> 	}

	else
		pr_err("CPU %u didn't die...\n", cpu);
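
Put together with the else branch, the tail of xen_cpu_die() can be
modeled in user space roughly as below. This is only a sketch under
stated assumptions: the three teardown routines are hypothetical
stand-ins that just count calls, cpu_wait_death_model() returns an
outcome supplied by the caller instead of really waiting, and
MODEL_NR_CPUS and xen_cpu_die_tail() are names made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MODEL_NR_CPUS 8	/* hypothetical bound, only for the model */

/* Counts how many teardown routines ran; stands in for real side effects. */
static int teardown_calls;

/* Hypothetical stand-ins for the kernel routines named in the thread. */
static void xen_smp_intr_free(unsigned int cpu)   { (void)cpu; teardown_calls++; }
static void xen_uninit_lock_cpu(unsigned int cpu) { (void)cpu; teardown_calls++; }
static void xen_teardown_timer(unsigned int cpu)  { (void)cpu; teardown_calls++; }

/* Model of cpu_wait_death(): true means the CPU died before the timeout.
 * The outcome is passed in by the caller rather than observed. */
static bool cpu_wait_death_model(unsigned int cpu, bool died)
{
	(void)cpu;
	return died;
}

/* Tail of xen_cpu_die(): tear down only if the CPU really died.
 * Returns true if the teardown routines were called. */
bool xen_cpu_die_tail(unsigned int cpu, bool died)
{
	assert(cpu < MODEL_NR_CPUS);

	if (cpu_wait_death_model(cpu, died)) {
		xen_smp_intr_free(cpu);
		xen_uninit_lock_cpu(cpu);
		xen_teardown_timer(cpu);
		return true;
	}

	/* Timeout: leave per-CPU resources alone and report it. */
	fprintf(stderr, "CPU %u didn't die...\n", cpu);
	return false;
}
```

Calling xen_cpu_die_tail(1, true) runs all three teardown stand-ins,
while xen_cpu_die_tail(1, false) runs none of them and only prints the
message.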


>
> Easy change for me to make if so!
>
> Or do I need some other check for HVM-vs.-PV guests, and, if so, what
> would that check be?  And also if so, is it OK to online a PV guest's
> CPU that timed out during its previous offline?


I believe PV VCPUs will always be CPU_DEAD by the time we get here, since
we are (indirectly) waiting for this in the loop at the beginning of
xen_cpu_die():

'while (xen_pv_domain() && HYPERVISOR_vcpu_op(VCPUOP_is_up, cpu, NULL))'
will exit only after 'HYPERVISOR_vcpu_op(VCPUOP_down,
smp_processor_id(), NULL)' in xen_play_dead(), which happens after
play_dead_common() has marked the cpu as CPU_DEAD.

So no separate HVM-vs.-PV test is needed.
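
The ordering argument can be illustrated with a small single-threaded
model. It is a sketch, not the real code: struct vcpu_model,
play_dead_step() and wait_for_pv_vcpu_down() are hypothetical names, and
the "dying CPU" is advanced one step per poll instead of running
concurrently. The only point is that the polling loop cannot exit before
the dead mark has been set, because the step that clears is_up comes
second.

```c
#include <assert.h>
#include <stdbool.h>

enum vcpu_state { VCPU_ONLINE, VCPU_DEAD };

/* Hypothetical model of one PV VCPU as seen by the waiter. */
struct vcpu_model {
	enum vcpu_state state;	/* what play_dead_common() would mark */
	bool is_up;		/* what VCPUOP_is_up would report */
};

/* Dying-CPU side, in the order described above:
 * step 0 marks the CPU dead, step 1 takes the VCPU down. */
static void play_dead_step(struct vcpu_model *v, int step)
{
	if (step == 0)
		v->state = VCPU_DEAD;	/* play_dead_common() */
	else if (step == 1)
		v->is_up = false;	/* VCPUOP_down */
}

/* Waiter side: poll is_up as the loop in xen_cpu_die() does, letting the
 * dying side make one step of progress per poll. Returns the state seen
 * once the loop has exited. */
enum vcpu_state wait_for_pv_vcpu_down(struct vcpu_model *v)
{
	int step = 0;

	while (v->is_up) {
		assert(step < 2);	/* the model has only two steps */
		play_dead_step(v, step++);
	}
	return v->state;
}
```

Starting from an online, up VCPU, wait_for_pv_vcpu_down() returns
VCPU_DEAD, which is the reason the PV case needs no extra check after
the loop.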

Thanks.
-boris