From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S965085AbdCWOWT (ORCPT ); Thu, 23 Mar 2017 10:22:19 -0400
Received: from verein.lst.de ([213.95.11.211]:49883 "EHLO newverein.lst.de"
    rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
    id S933717AbdCWOWS (ORCPT ); Thu, 23 Mar 2017 10:22:18 -0400
Date: Thu, 23 Mar 2017 15:22:15 +0100
From: Christoph Hellwig
To: Jason Wang
Cc: Laura Abbott , Christoph Hellwig , "Michael S.
 Tsirkin" , Linux Kernel Mailing List , virtualization@lists.linux-foundation.org
Subject: Re: [REGRESSION] 07ec51480b5e ("virtio_pci: use shared interrupts for virtqueues") causes crashes in guest
Message-ID: <20170323142215.GA30988@lst.de>
References: <43ffa887-a20d-f836-cba8-11196924d82d@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To:
User-Agent: Mutt/1.5.17 (2007-11-01)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Mar 23, 2017 at 01:13:50PM +0800, Jason Wang wrote:
>
>
> On 2017-03-23 08:30, Laura Abbott wrote:
>> Hi,
>>
>> Fedora has received multiple reports of crashes when running
>> 4.11 as a guest
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1430297
>> https://bugzilla.redhat.com/show_bug.cgi?id=1434462
>> https://bugzilla.kernel.org/show_bug.cgi?id=194911
>> https://bugzilla.redhat.com/show_bug.cgi?id=1433899
>>
>> The crashes are not always consistent but they are generally
>> some flavor of oops or GPF in virtio related code. Multiple people
>> have done bisections (Thank you Thorsten Leemhuis and
>> Richard W.M. Jones) and found this commit to be at fault
>>
>> 07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507 is the first bad commit
>> commit 07ec51480b5eb1233f8c1b0f5d7a7c8d1247c507
>> Author: Christoph Hellwig
>> Date: Sun Feb 5 18:15:19 2017 +0100
>>
>>     virtio_pci: use shared interrupts for virtqueues
>>
>>     This lets IRQ layer handle dispatching IRQs to separate handlers for the
>>     case where we don't have per-VQ MSI-X vectors, and allows us to greatly
>>     simplify the code based on the assumption that we always have interrupt
>>     vector 0 (legacy INTx or config interrupt for MSI-X) available, and
>>     any other interrupt is request/freed throught the VQ, even if the
>>     actual interrupt line might be shared in some cases.
>>
>>     This allows removing a great deal of variables keeping track of the
>>     interrupt state in struct virtio_pci_device, as we can now simply walk the
>>     list of VQs and deal with per-VQ interrupt handlers there, and only treat
>>     vector 0 special.
>>
>>     Additionally clean up the VQ allocation code to properly unwind on error
>>     instead of having a single global cleanup label, which is error prone,
>>     and in this case also leads to more code.
>>
>>     Signed-off-by: Christoph Hellwig
>>     Signed-off-by: Michael S. Tsirkin
>>
>> :040000 040000 79a8267ffb73f9d244267c5f68365305bddd4696 8832a160b978710bbd24ba6966f462b3faa27fcc M drivers
>>
>> It doesn't revert cleanly so we haven't been able to do a clean
>> test. Any ideas?
>>
>> Thanks,
>> Laura
>
> Hello:
>
> Can you try the attached patch to see if it solves the problem? (At least
> it silent KASan warnings for me).

This looks like a correct fix to me, independent of fixing the original
bug or not:

Reviewed-by: Christoph Hellwig