From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ross Lagerwall Subject: Re: [PATCH] xen/events/fifo: Consume unprocessed events when a CPU dies Date: Tue, 30 Jun 2015 10:51:34 +0100 Message-ID: <559266A6.9080105@citrix.com> References: <1434726957-929-1-git-send-email-ross.lagerwall@citrix.com> <55843D1C.9070905@oracle.com> <55843DF9.5090704@citrix.com> <55911BB5.5030901@citrix.com> <559148F1.6000602@oracle.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: Received: from mail6.bemta5.messagelabs.com ([195.245.231.135]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1Z9sCa-0006ml-NS for xen-devel@lists.xenproject.org; Tue, 30 Jun 2015 09:51:40 +0000 In-Reply-To: <559148F1.6000602@oracle.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Boris Ostrovsky , David Vrabel , xen-devel@lists.xenproject.org List-Id: xen-devel@lists.xenproject.org On 06/29/2015 02:32 PM, Boris Ostrovsky wrote: > On 06/29/2015 06:19 AM, Ross Lagerwall wrote: >> On 06/19/2015 05:06 PM, David Vrabel wrote: >>> On 19/06/15 17:02, Boris Ostrovsky wrote: >>>> On 06/19/2015 11:15 AM, Ross Lagerwall wrote: >>>>> When a CPU is offlined, there may be unprocessed events on a port for >>>>> that CPU. If the port is subsequently reused on a different CPU, it >>>>> could be in an unexpected state with the link bit set, resulting in >>>>> interrupts being missed. Fix this by consuming any unprocessed events >>>>> for a particular CPU when that CPU dies. >>>>> >>>>> Signed-off-by: Ross Lagerwall >>>>> --- >>>>> drivers/xen/events/events_fifo.c | 23 ++++++++++++++++++----- >>>>> 1 file changed, 18 insertions(+), 5 deletions(-) >>>>> >>>>> diff --git a/drivers/xen/events/events_fifo.c >>>>> b/drivers/xen/events/events_fifo.c >>>>> index 417415d..1dd0ba12 100644 >>>>> --- a/drivers/xen/events/events_fifo.c >>>>> +++ b/drivers/xen/events/events_fifo.c >>>>> @@ -281,7 +281,8 @@ static void handle_irq_for_port(unsigned port) >>>>> static void consume_one_event(unsigned cpu, >>>>> struct evtchn_fifo_control_block *control_block, >>>>> - unsigned priority, unsigned long *ready) >>>>> + unsigned priority, unsigned long *ready, >>>>> + bool drop) >>>>> { >>>>> struct evtchn_fifo_queue *q = &per_cpu(cpu_queue, cpu); >>>>> uint32_t head; >>>>> @@ -313,13 +314,17 @@ static void consume_one_event(unsigned cpu, >>>>> if (head == 0) >>>>> clear_bit(priority, ready); >>>>> - if (evtchn_fifo_is_pending(port) && >>>>> !evtchn_fifo_is_masked(port)) >>>>> - handle_irq_for_port(port); >>>>> + if (evtchn_fifo_is_pending(port) && >>>>> !evtchn_fifo_is_masked(port)) { >>>>> + if (unlikely(drop)) >>>>> + pr_warn("Dropping pending event for port %u\n", port); >>>> >>>> Maybe pr_info (or pr_notice)? >>> >>> We want a warning here because we think this shouldn't happen -- if it >>> does we actually need to retrigger the event on its new CPU. >>> >>>> Also, why not do this (testing for unprocessed events) in >>>> xen_evtchn_close()? >>> >>> We can't do anything about them when closing because they may be in the >>> middle of a queue. > > (Sorry, I missed this) > > Why can't (actually, why doesn't) the cpu that is being offlined drain > its queue? > Where would this be done? I thought using CPU notifiers was the correct way to hook when a CPU goes down without having to stick fifo event channel code in the core Xen code. -- Ross Lagerwall