From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753893AbbEANlq (ORCPT <rfc822;w@1wt.eu>);
	Fri, 1 May 2015 09:41:46 -0400
Received: from aserp1040.oracle.com ([141.146.126.69]:26960 "EHLO
	aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1753625AbbEANlo (ORCPT
	<rfc822;linux-kernel@vger.kernel.org>);
	Fri, 1 May 2015 09:41:44 -0400
Message-ID: <55438217.7@oracle.com>
Date: Fri, 01 May 2015 09:39:35 -0400
From: Boris Ostrovsky <boris.ostrovsky@oracle.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.6.0
MIME-Version: 1.0
To: David Vrabel <david.vrabel@citrix.com>, konrad.wilk@oracle.com
CC: xen-devel@lists.xenproject.org, annie.li@oracle.com,
        linux-kernel@vger.kernel.org
Subject: Re: [Xen-devel] [PATCH v2 1/4] xen/events: Clear cpu_evtchn_mask
 before resuming
References: <1430341815-4935-1-git-send-email-boris.ostrovsky@oracle.com> <1430341815-4935-2-git-send-email-boris.ostrovsky@oracle.com> <55435992.2000202@citrix.com>
In-Reply-To: <55435992.2000202@citrix.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
X-Source-IP: aserv0022.oracle.com [141.146.126.234]
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 05/01/2015 06:46 AM, David Vrabel wrote:
> On 29/04/15 22:10, Boris Ostrovsky wrote:
>> When a guest is resumed, the hypervisor may change event channel
>> assignments. If this happens and the guest uses 2-level events it
>> is possible for the interrupt to be claimed by wrong VCPU since
>> cpu_evtchn_mask bits may be stale. This can happen even though
>> evtchn_2l_bind_to_cpu() attempts to clear old bits: irq_info that
>> is passed in is not necessarily the original one (from pre-migration
>> times) but instead is freshly allocated during resume and so any
>> information about which CPU the channel was bound to is lost.
>>
>> Thus we should clear the mask during resume.
>>
>> We also need to make sure that bits for xenstore and console channels
>> are set when these two subsystems are resumed. While rebind_evtchn_irq()
>> (which is invoked for both of them on a resume) calls irq_set_affinity(),
>> the latter will in fact postpone setting affinity until handling the
>> interrupt. But because cpu_evtchn_mask will have bits for these two
>> cleared we won't be able to take the interrupt.
>>
>> With that in mind, we need to bind those two channels explicitly in
>> rebind_evtchn_irq(). We will keep irq_set_affinity() so that we have a
>> pass through generic irq affinity code later, in case something needs
>> to be updated there as well.
>>
>> (Also replace cpumask_of(0) with cpumask_of(info->cpu) in
>> rebind_evtchn_irq(): it should be set to zero in preceding
>> xen_irq_info_evtchn_setup().)
> [...]
>> @@ -1279,8 +1280,16 @@ void rebind_evtchn_irq(int evtchn, int irq)
>>   
>>   	mutex_unlock(&irq_mapping_update_lock);
>>   
>> -	/* new event channels are always bound to cpu 0 */
>> -	irq_set_affinity(irq, cpumask_of(0));
>> +	bind_vcpu.port = evtchn;
>> +	bind_vcpu.vcpu = info->cpu;
>> +	if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) == 0)
>> +		bind_evtchn_to_cpu(evtchn, info->cpu);
> Isn't the hypercall unnecessary, since this is a new event channel
> that is already bound to VCPU 0 and info->cpu == 0?
>
> I think only the bind_evtchn_to_cpu() call is needed here.


True. However, I added the hypercall to make the routine independent
of what happens elsewhere (the hypervisor binding new channels to
VCPU 0, xen_irq_info_evtchn_setup() initializing info->cpu to zero,
etc.). This way, if either of those changes in the future (unlikely,
but possible), the routine will still work as expected.

That's why I also replaced cpumask_of(0) with cpumask_of(info->cpu) in
the irq_set_affinity() call.
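
To illustrate the point, here is a rough user-space model of the
sequence (not kernel code, and not the patch itself). Everything
prefixed with model_ is hypothetical, as are NR_PORTS, hyp_vcpu[] and
struct irq_info_model. The sketch assumes one 64-bit mask per CPU
standing in for cpu_evtchn_mask, and a plain array standing in for the
hypervisor's port-to-VCPU binding. It shows that after the masks are
cleared on resume nobody can take the event until the port is bound
again, and that the rebind follows info->cpu rather than assuming 0:

```c
/* User-space model only; none of these helpers exist in the kernel. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NR_CPUS  4
#define NR_PORTS 64

/* Stand-in for the per-cpu cpu_evtchn_mask bitmaps (2-level events). */
static unsigned long long cpu_evtchn_mask[NR_CPUS];

/* Stand-in for the hypervisor's idea of which VCPU a port is bound to. */
static int hyp_vcpu[NR_PORTS];

/* Hypothetical cut-down irq_info: just the port and the CPU it targets. */
struct irq_info_model {
	int evtchn;
	int cpu;
};

/* Models EVTCHNOP_bind_vcpu: 0 on success, -1 on out-of-range arguments. */
static int model_bind_vcpu(int port, int vcpu)
{
	if (port < 0 || port >= NR_PORTS || vcpu < 0 || vcpu >= NR_CPUS)
		return -1;
	hyp_vcpu[port] = vcpu;
	return 0;
}

/* Models bind_evtchn_to_cpu(): the port's bit ends up in exactly one mask. */
static void model_bind_evtchn_to_cpu(int port, int cpu)
{
	assert(port >= 0 && port < NR_PORTS && cpu >= 0 && cpu < NR_CPUS);
	for (int i = 0; i < NR_CPUS; i++)
		cpu_evtchn_mask[i] &= ~(1ULL << port);
	cpu_evtchn_mask[cpu] |= 1ULL << port;
}

/* Models the resume step that wipes every (possibly stale) mask. */
static void model_resume_clear_masks(void)
{
	memset(cpu_evtchn_mask, 0, sizeof(cpu_evtchn_mask));
}

/*
 * Models the rebind path discussed above: ask the "hypervisor" to bind
 * the port to info->cpu and only update the local mask if that worked.
 */
static int model_rebind(const struct irq_info_model *info)
{
	if (model_bind_vcpu(info->evtchn, info->cpu) != 0) {
		fprintf(stderr, "Failed binding port %d to cpu %d\n",
			info->evtchn, info->cpu);
		return -1;
	}
	model_bind_evtchn_to_cpu(info->evtchn, info->cpu);
	return 0;
}

/* A CPU can take the event only if it is the target and its bit is set. */
static int model_can_take(int port, int cpu)
{
	return hyp_vcpu[port] == cpu &&
	       ((cpu_evtchn_mask[cpu] >> port) & 1ULL) != 0;
}
```

In this model, calling model_resume_clear_masks() and then
model_rebind() with info.cpu set to a non-zero value leaves the port
deliverable on that CPU only, which is the behaviour the hypercall is
meant to preserve if the "always cpu 0" assumption ever stops holding.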

-boris


>
> If you agree I can remove the hypercall and apply this series.
>
>> +	else
>> +		pr_warn("Failed binding port %d to cpu %d\n",
>> +			evtchn, info->cpu);
>> +
>> +	/* This will be deferred until interrupt is processed */
>> +	irq_set_affinity(irq, cpumask_of(info->cpu));
> David
>