From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:46315) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cAAIG-0002rs-MW for qemu-devel@nongnu.org; Fri, 25 Nov 2016 01:47:34 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cAAID-000265-JT for qemu-devel@nongnu.org; Fri, 25 Nov 2016 01:47:32 -0500 Received: from mx1.redhat.com ([209.132.183.28]:55638) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1cAAID-000258-AQ for qemu-devel@nongnu.org; Fri, 25 Nov 2016 01:47:29 -0500 Date: Fri, 25 Nov 2016 08:47:26 +0200 From: "Michael S. Tsirkin" Message-ID: <20161125084433-mutt-send-email-mst@kernel.org> References: <1479802130-22640-1-git-send-email-peterx@redhat.com> <20161125060705-mutt-send-email-mst@kernel.org> <20161125052836.GE25010@pxdev.xzpeter.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20161125052836.GE25010@pxdev.xzpeter.org> Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [PATCH v2] pci: relax pci_msi_get_message() List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Peter Xu Cc: qemu-devel@nongnu.org, david@gibson.dropbear.id.au On Fri, Nov 25, 2016 at 01:28:36PM +0800, Peter Xu wrote: > On Fri, Nov 25, 2016 at 06:11:01AM +0200, Michael S. Tsirkin wrote: > > On Tue, Nov 22, 2016 at 04:08:50PM +0800, Peter Xu wrote: > > > We are very strict in the past getting MSIs from commit > > > d1f6af6a1 ("kvm-irqchip: simplify kvm_irqchip_add_msi_route"), assu= ming > > > that MSI should be configured before hand when fetching. When we ha= ve > > > unrecognized configurations, we panic the system. However looks lik= e > > > this is too strict to be working on some platform, and issues occur= ed. > > > Firstly it's found on a ppc case and fixed by David in: > > >=20 > > > 6d17a01 vfio/pci: Fix regression in MSI routing configuration > > >=20 > > > However we encountered another case now with windows virtio driver = and > > > reported (and possibly more): > > >=20 > > > http://bugs.debian.org/844361 > > >=20 > > > To make every driver/hardware happy, let's loosen the rule and go b= ack > > > to the original behavior - instead of panic the system, when we try= to > > > fetch MSI without configured MSI/MSI-X system, we just provide an e= mpty > > > message to make drivers happy. > > >=20 > > > Reported-by: Maciej Kotli=C5=84ski > > > Signed-off-by: Peter Xu > > > --- > > > hw/pci/pci.c | 8 +++++--- > > > 1 file changed, 5 insertions(+), 3 deletions(-) > > >=20 > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c > > > index 24fae16..0cd2bb0 100644 > > > --- a/hw/pci/pci.c > > > +++ b/hw/pci/pci.c > > > @@ -2606,9 +2606,11 @@ MSIMessage pci_get_msi_message(PCIDevice *de= v, int vector) > > > } else if (msi_enabled(dev)) { > > > msg =3D msi_get_message(dev, vector); > > > } else { > > > - /* Should never happen */ > > > - error_report("%s: unknown interrupt type", __func__); > > > - abort(); > > > + /* > > > + * Device is not configured with MSI/MSI-X yet, let's prov= ide > > > + * an empty message to make all device drivers happy. > > > + */ > > > + memset(&msg, 0, sizeof(msg)); > > > } > > > return msg; > > > } > >=20 > > Looks like a hack to me. Even if callers happen to treat > > 0 message as a nop, there's no guarantee that's true for > > all platforms. Pls change pci_get_msi_message to > > return an error code, and fix callers to check that. >=20 > Actually I'd say I still cannot understand why some guests will > encounter this issue - the driver just tries to fetch and enable > MSI/MSI-X messages without fully activate MSI/MSI-X in general (that's > why msi_enabled() and msix_enabled() both returned false). >=20 > But indeed some drivers won't work properly after commit d1f6af6a. > David fixed some case for ppc and vfio (6d17a01), but problem still > exist, like Debian's report (http://bugs.debian.org/844361). >=20 > To avoid going into every details of guest drivers, I was thinking > maybe we can just "recover" the behavior to the past: old QEMU (before > d1f6af6a) allows to fetch an meaningless message (e.g., when > msi_get_message() is called, it's possible that MSI is still not > setup, but IMHO that message is meaningless). So here we emulate that > case, and return a meaningless message. From the Debian bug report, we > can see that this does solve the issue (yeah I know this is not even > an excuse to agree on this patch, but again I just think this is > currently the best way to solve the problem...). >=20 > If we change this into return error, then kvm_irqchip_add_msi_route() > might fail, and that's a different behavior again, I don't know > whether it'll bring more troubles (only if I can test every guest > driver with that code, but that's merely impossible). So IMHO the best > way to solve the problem is use the current patch and try to keep the > old behavior. >=20 > I would really appreciate if you (or anyone) has better suggestion. >=20 > Thanks, >=20 > -- peterx You need to figure out the call stack. What does driver do? I'm fine with working around it for now but 1. maybe we should fix the driver anyway 2. we might want to emit an error message --=20 MST