From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:50307) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cAAaG-00050i-Ie for qemu-devel@nongnu.org; Fri, 25 Nov 2016 02:06:10 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cAAaB-0005KV-JQ for qemu-devel@nongnu.org; Fri, 25 Nov 2016 02:06:08 -0500 Received: from mx1.redhat.com ([209.132.183.28]:40732) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1cAAaB-0005IN-AY for qemu-devel@nongnu.org; Fri, 25 Nov 2016 02:06:03 -0500 Date: Fri, 25 Nov 2016 15:05:58 +0800 From: Peter Xu Message-ID: <20161125070558.GG25010@pxdev.xzpeter.org> References: <1479802130-22640-1-git-send-email-peterx@redhat.com> <20161125060705-mutt-send-email-mst@kernel.org> <20161125052836.GE25010@pxdev.xzpeter.org> <20161125084433-mutt-send-email-mst@kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20161125084433-mutt-send-email-mst@kernel.org> Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] [PATCH v2] pci: relax pci_msi_get_message() List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Michael S. Tsirkin" Cc: qemu-devel@nongnu.org, david@gibson.dropbear.id.au On Fri, Nov 25, 2016 at 08:47:26AM +0200, Michael S. Tsirkin wrote: > On Fri, Nov 25, 2016 at 01:28:36PM +0800, Peter Xu wrote: > > On Fri, Nov 25, 2016 at 06:11:01AM +0200, Michael S. Tsirkin wrote: > > > On Tue, Nov 22, 2016 at 04:08:50PM +0800, Peter Xu wrote: > > > > We are very strict in the past getting MSIs from commit > > > > d1f6af6a1 ("kvm-irqchip: simplify kvm_irqchip_add_msi_route"), as= suming > > > > that MSI should be configured before hand when fetching. When we = have > > > > unrecognized configurations, we panic the system. However looks l= ike > > > > this is too strict to be working on some platform, and issues occ= ured. > > > > Firstly it's found on a ppc case and fixed by David in: > > > >=20 > > > > 6d17a01 vfio/pci: Fix regression in MSI routing configuration > > > >=20 > > > > However we encountered another case now with windows virtio drive= r and > > > > reported (and possibly more): > > > >=20 > > > > http://bugs.debian.org/844361 > > > >=20 > > > > To make every driver/hardware happy, let's loosen the rule and go= back > > > > to the original behavior - instead of panic the system, when we t= ry to > > > > fetch MSI without configured MSI/MSI-X system, we just provide an= empty > > > > message to make drivers happy. > > > >=20 > > > > Reported-by: Maciej Kotli=C5=84ski > > > > Signed-off-by: Peter Xu > > > > --- > > > > hw/pci/pci.c | 8 +++++--- > > > > 1 file changed, 5 insertions(+), 3 deletions(-) > > > >=20 > > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c > > > > index 24fae16..0cd2bb0 100644 > > > > --- a/hw/pci/pci.c > > > > +++ b/hw/pci/pci.c > > > > @@ -2606,9 +2606,11 @@ MSIMessage pci_get_msi_message(PCIDevice *= dev, int vector) > > > > } else if (msi_enabled(dev)) { > > > > msg =3D msi_get_message(dev, vector); > > > > } else { > > > > - /* Should never happen */ > > > > - error_report("%s: unknown interrupt type", __func__); > > > > - abort(); > > > > + /* > > > > + * Device is not configured with MSI/MSI-X yet, let's pr= ovide > > > > + * an empty message to make all device drivers happy. > > > > + */ > > > > + memset(&msg, 0, sizeof(msg)); > > > > } > > > > return msg; > > > > } > > >=20 > > > Looks like a hack to me. Even if callers happen to treat > > > 0 message as a nop, there's no guarantee that's true for > > > all platforms. Pls change pci_get_msi_message to > > > return an error code, and fix callers to check that. > >=20 > > Actually I'd say I still cannot understand why some guests will > > encounter this issue - the driver just tries to fetch and enable > > MSI/MSI-X messages without fully activate MSI/MSI-X in general (that'= s > > why msi_enabled() and msix_enabled() both returned false). > >=20 > > But indeed some drivers won't work properly after commit d1f6af6a. > > David fixed some case for ppc and vfio (6d17a01), but problem still > > exist, like Debian's report (http://bugs.debian.org/844361). > >=20 > > To avoid going into every details of guest drivers, I was thinking > > maybe we can just "recover" the behavior to the past: old QEMU (befor= e > > d1f6af6a) allows to fetch an meaningless message (e.g., when > > msi_get_message() is called, it's possible that MSI is still not > > setup, but IMHO that message is meaningless). So here we emulate that > > case, and return a meaningless message. From the Debian bug report, w= e > > can see that this does solve the issue (yeah I know this is not even > > an excuse to agree on this patch, but again I just think this is > > currently the best way to solve the problem...). > >=20 > > If we change this into return error, then kvm_irqchip_add_msi_route() > > might fail, and that's a different behavior again, I don't know > > whether it'll bring more troubles (only if I can test every guest > > driver with that code, but that's merely impossible). So IMHO the bes= t > > way to solve the problem is use the current patch and try to keep the > > old behavior. > >=20 > > I would really appreciate if you (or anyone) has better suggestion. > >=20 > > Thanks, > >=20 > > -- peterx >=20 > You need to figure out the call stack. What does driver do? > I'm fine with working around it for now but > 1. maybe we should fix the driver anyway > 2. we might want to emit an error message >=20 > --=20 > MST Limin posted another comment here: http://bugs.debian.org/844361 We can see whether that bug is finally caused by pci-assign too, and whether that pci-assign fix can solve the problem. If that works, it'll be nice, then we won't need this patch at least for now. Thanks, -- peterx