From mboxrd@z Thu Jan 1 00:00:00 1970
From: Boris Ostrovsky
Subject: Re: [PATCHv1] xen/events/fifo: Handle linked events when closing a PIRQ port
Date: Wed, 16 Sep 2015 13:34:59 -0400
Message-ID: <55F9A843.6080109@oracle.com>
References: <1439216678-12407-1-git-send-email-david.vrabel@citrix.com> <3b1a3ef1046357c95dd25a8610b1fd11@eikelenboom.it> <55C8DC74.70809@citrix.com> <55F98A3C.1020209@oracle.com> <55F98C21.80805@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta3.messagelabs.com ([195.245.230.39]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1ZcGdD-0002ig-P2 for xen-devel@lists.xenproject.org; Wed, 16 Sep 2015 17:36:31 +0000
In-Reply-To: <55F98C21.80805@citrix.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: David Vrabel, Ross Lagerwall
Cc: linux@eikelenboom.it, xen-devel@lists.xenproject.org
List-Id: xen-devel@lists.xenproject.org

On 09/16/2015 11:34 AM, David Vrabel wrote:
> On 16/09/15 16:26, Boris Ostrovsky wrote:
>> On 08/10/2015 01:16 PM, David Vrabel wrote:
>>> On 10/08/15 17:47, linux@eikelenboom.it wrote:
>>>> On 2015-08-10 16:24, David Vrabel wrote:
>>>>> Commit fcdf31a7c162de0c93a2bee51df4688ab0a348f8 (xen/events/fifo:
>>>>> Handle linked events when closing a port) did not handle closing a
>>>>> port bound to a PIRQ because these are closed from shutdown_pirq()
>>>>> which is called with interrupts disabled.
>>>>>
>>>>> Defer the close to a work queue where we can safely spin waiting for
>>>>> the LINKED bit to clear. For simplicity, the close is always deferred
>>>>> even if it is not required (i.e., we're already in process context).
>>>>>
>>>>> Signed-off-by: David Vrabel
>>>>> Cc: Ross Lagerwall
>>>>> ---
>>>>> Cc: Sander Eikelenboom
>>>> Hi David,
>>>>
>>>> Tested your patch, don't know for sure but this doesn't seem to work
>>>> out.
>>>> I end up with this event channel error on dom0 boot.
>>>>
>>>> Which ends in state:
>>>> Name      ID   Mem  VCPUs  State   Time(s)
>>>> (null)     0  1536      6  r-----    183.8
>>>>
>>>> --
>>>> Sander
>>>>
>>>> (XEN) [2015-08-10 16:35:34.584] PCI add device 0000:0d:00.0
>>>> (XEN) [2015-08-10 16:35:34.891] PCI add device 0000:0c:00.0
>>>> (XEN) [2015-08-10 16:35:35.123] PCI add device 0000:0b:00.0
>>>> (XEN) [2015-08-10 16:35:35.325] PCI add device 0000:0a:00.0
>>>> (XEN) [2015-08-10 16:35:35.574] PCI add device 0000:09:00.0
>>>> (XEN) [2015-08-10 16:35:35.642] PCI add device 0000:09:00.1
>>>> (XEN) [2015-08-10 16:35:35.872] PCI add device 0000:05:00.0
>>>> (XEN) [2015-08-10 16:35:36.044] PCI add device 0000:06:01.0
>>>> (XEN) [2015-08-10 16:35:36.109] PCI add device 0000:06:02.0
>>>> (XEN) [2015-08-10 16:35:36.293] PCI add device 0000:08:00.0
>>>> (XEN) [2015-08-10 16:35:36.603] PCI add device 0000:07:00.0
>>>> (XEN) [2015-08-10 16:35:36.906] PCI add device 0000:04:00.0
>>>> (XEN) [2015-08-10 16:35:37.074] PCI add device 0000:03:06.0
>>>> (XEN) [2015-08-10 16:35:39.456] PCI: Using MCFG for segment 0000 bus 00-ff
>>>> (XEN) [2015-08-10 16:35:49.623] d0: Forcing read-only access to MFN fed00
>>>> (XEN) [2015-08-10 16:35:51.374] event_channel.c:472:d0v0 EVTCHNOP failure: error -17
>>> This didn't happen on the test box I used but I can see it is possible
>>> to rebind a PIRQ whose close is still deferred.
>>>
>>> I'm going to revert fcdf31a7c162de0c93a2bee51df4688ab0a348f8
>>> (xen/events/fifo: Handle linked events when closing a port) for now.
>>
>> Any updates on these two patches? We started seeing this problem (stale
>> events) in our testing when onlining/offlining vcpus in heavily
>> oversubscribed guests.
> This is the last attempt I came up with -- using a tasklet to wait for the
> event to be unlinked. I was worried that tasklets could be punted
> to the ksoftirqd threads but haven't investigated this or come
> up with something else (perhaps a high priority tasklet instead?).

Since you are scheduling the tasklet not from an interrupt handler I
believe it will run out of ksoftirqd in either case. In which case I
think it is still possible that a PIRQ can be rebinded (rebound?) before
the tasklet runs.

-boris
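
To make the window I am worried about concrete, here is a rough user-space model in Python. It is only a sketch and not the Xen or Linux code: the PortTable class and its method names are made up for illustration, and it assumes that binding a PIRQ which still has a port attached fails with EEXIST, which would match the "error -17" in Sander's log.

```python
import errno


class PortTable:
    """Toy model of PIRQ -> event channel port bindings (hypothetical)."""

    def __init__(self):
        self.bound = {}      # pirq -> port currently bound
        self.deferred = []   # pirqs whose close is queued but has not run
        self.next_port = 1

    def bind_pirq(self, pirq):
        # Assumption: a PIRQ that still has a port attached cannot be bound again.
        if pirq in self.bound:
            return -errno.EEXIST
        port = self.next_port
        self.next_port += 1
        self.bound[pirq] = port
        return port

    def shutdown_pirq_deferred(self, pirq):
        # The close is only queued here; the binding stays in place
        # until the deferred work (work queue or tasklet) actually runs.
        self.deferred.append(pirq)

    def run_deferred(self):
        # Stands in for the deferred context finally closing the ports.
        while self.deferred:
            self.bound.pop(self.deferred.pop(0), None)


table = PortTable()
table.bind_pirq(16)
table.shutdown_pirq_deferred(16)
early = table.bind_pirq(16)   # rebind before the deferred close has run
table.run_deferred()
late = table.bind_pirq(16)    # rebind after the deferred close has run
print(early, late)
```

In this model the early rebind fails and the later one succeeds, so whatever mechanism defers the close would also need the rebind path to wait for (or flush) the pending close.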