From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:56221) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gQ3iU-0006Oi-4y for qemu-devel@nongnu.org; Fri, 23 Nov 2018 00:09:22 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gQ3iP-0007Ra-93 for qemu-devel@nongnu.org; Fri, 23 Nov 2018 00:09:22 -0500
Received: from mga06.intel.com ([134.134.136.31]:48517) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gQ3iP-0007Os-09 for qemu-devel@nongnu.org; Fri, 23 Nov 2018 00:09:17 -0500
Date: Fri, 23 Nov 2018 00:04:51 -0500
From: Zhao Yan
Message-ID: <20181123050451.GC31906@joy-OptiPlex-7040>
References: <20181016021439.6212-1-yan.y.zhao@intel.com> <20181018145636.k6ptvn4iszabjhxw@mac.bytemobile.com> <20181122131110.GA31906@joy-OptiPlex-7040> <20181122141805.vyqywi4ep65loye3@mac>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <20181122141805.vyqywi4ep65loye3@mac>
Subject: Re: [Qemu-devel] [PATCH] Xen PCI passthrough: fix passthrough failure when irq map failure
To: Roger Pau Monné
Cc: sstabellini@kernel.org, anthony.perard@citrix.com, xen-devel@lists.xenproject.org, qemu-devel@nongnu.org

On Thu, Nov 22, 2018 at 03:18:05PM +0100, Roger Pau Monné wrote:
> On Thu, Nov 22, 2018 at 08:11:20AM -0500, Zhao Yan wrote:
> > On Thu, Oct 18, 2018 at 03:56:36PM +0100, Roger Pau Monné wrote:
> > > On Thu, Oct 18, 2018 at 08:22:41AM +0000, Zhao, Yan Y wrote:
> > > > Hi
> > > > The background for this patch is that: for some pci device, even it's PCI_INTERRUPT_PIN is not 0, it actually does not support INTx mode, so we should just report error, disable INTx mode and continue the passthrough.
> > > > However, the commit 5a11d0f7 regards this as error condition and let qemu quit passthrough, which is too rigorous.
> > > >
> > > > Error message is below:
> > > > libxl: error: libxl_qmp.c:287:qmp_handle_error_response: Domain 2:received an error message from QMP server: Mapping machine irq 0 to pirq -1 failed: Operation not permitted
> > >
> > > I'm having issues figuring out what's happening here.
> > > s->real_device.irq is 0, yet the PCI config space read of
> > > PCI_INTERRUPT_PIN returns something different than 0.
> > >
> > > AFAICT this is due to some kind of error in Linux, so that even when
> > > the device is supposed to have a valid IRQ the sysfs node it is set to
> > > 0, do you know the actual underlying cause of this?
> > >
> > > Thanks, Roger.
> > Hi Roger
> > Sorry for the later reply, I just missed this mail...
> > On my side, it's because the hardware actually does not support INTx mode,
> > but its configuration space does not report PCI_INTERRUPT_PIN to 0. It's a
> > hardware bug, but previous version of qemu can tolerate it, switch to MSI
> > and make passthrough work.
>
> Then I think it would be better to check both PCI_INTERRUPT_PIN and
> s->real_device.irq before attempting to map the IRQ.
>
> Making the error non-fatal would mean that a device with a valid IRQ
> could fail to be setup correctly but the guest will still be created,
> and things won't go as expected when the guest attempts to use it.
>
> Thanks, Roger.

Hi Roger,

Thanks for your suggestion.

You're right that "s->real_device.irq" needs to be checked before mapping,
for example for being 0. On the other hand, maybe xc_physdev_map_pirq()
itself can serve as the check of "s->real_device.irq"? In our case it
fails and returns -EPERM. The error handling is then still carried out,
i.e. the INTX_DISABLE flag is set, even though the error is not fatal:

    machine_irq = s->real_device.irq;
    rc = xc_physdev_map_pirq(xen_xc, xen_domid, machine_irq, &pirq);
    if (rc < 0) {
        error_setg_errno(errp, errno, "Mapping machine irq %u to"
                         " pirq %i failed", machine_irq, pirq);

        /* Disable PCI intx assertion (turn on bit10 of devctl) */
        cmd |= PCI_COMMAND_INTX_DISABLE;
        machine_irq = 0;
        s->machine_irq = 0;

So, do you think it's all right to just convert the fatal error into a
non-fatal one?

Thanks
Yan
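
For comparison, here is a minimal standalone sketch of the alternative
raised earlier in the thread: check both PCI_INTERRUPT_PIN and the host
IRQ before mapping, and treat a mapping failure as fatal only when the
IRQ looked valid. This is an illustration, not the QEMU code. All
sketch_* names and the enum are hypothetical, the two mapper functions
are stand-ins for xc_physdev_map_pirq(), and the only value taken from
outside this thread is 0x400 for bit 10 of the PCI command register.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 10 of the PCI command register (INTx Disable). */
#define SKETCH_PCI_COMMAND_INTX_DISABLE 0x400

/* Hypothetical outcome codes for this sketch; not QEMU names. */
enum sketch_intx_result {
    SKETCH_INTX_MAPPED,   /* IRQ mapped, INTx usable */
    SKETCH_INTX_DISABLED, /* no usable INTx, continue with MSI only */
    SKETCH_INTX_FATAL     /* a valid-looking IRQ failed to map */
};

/* Hypothetical stand-in for the mapping call. */
typedef int (*sketch_map_fn)(int machine_irq, int *pirq);

/* Stand-in mapper that always succeeds and reuses the IRQ number. */
int sketch_map_ok(int machine_irq, int *pirq)
{
    *pirq = machine_irq;
    return 0;
}

/* Stand-in mapper that always fails, like the -EPERM case in this thread. */
int sketch_map_eperm(int machine_irq, int *pirq)
{
    (void)machine_irq;
    (void)pirq;
    return -EPERM;
}

enum sketch_intx_result
sketch_setup_intx(uint8_t interrupt_pin, int real_irq,
                  uint16_t *cmd, int *pirq, sketch_map_fn map)
{
    assert(cmd != NULL && pirq != NULL && map != NULL);
    *pirq = -1;

    /* No pin advertised, or no host IRQ assigned: skip the mapping
     * entirely, disable INTx and let the passthrough continue. */
    if (interrupt_pin == 0 || real_irq == 0) {
        *cmd |= SKETCH_PCI_COMMAND_INTX_DISABLE;
        return SKETCH_INTX_DISABLED;
    }

    /* The IRQ looked valid, so a mapping failure stays fatal. */
    if (map(real_irq, pirq) < 0) {
        return SKETCH_INTX_FATAL;
    }
    return SKETCH_INTX_MAPPED;
}
```

With this split, a device like the one described above (pin set, host
IRQ 0) never reaches the mapping call, while a device with a nonzero
host IRQ that fails to map is still reported as an error.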