From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <5406ADB1.8040706@redhat.com>
Date: Wed, 03 Sep 2014 13:57:05 +0800
From: Jason Wang
References: <1409160982-16389-1-git-send-email-mdroth@linux.vnet.ibm.com> <20140902152050.32021.68140@loki> <20140902152546.GA23254@redhat.com> <20140902152736.GA23266@redhat.com> <20140902210315.GA25153@redhat.com> <20140902215125.GC25231@redhat.com> <20140903063541.GD1926@redhat.com>
In-Reply-To: <20140903063541.GD1926@redhat.com>
Subject: Re: [Qemu-devel] [Qemu-stable] Patch Round-up for stable 2.1.1, freeze on 2014-09-03
To: qemu-devel@nongnu.org

On 09/03/2014 02:35 PM, Michael S.
Tsirkin wrote:
> On Wed, Sep 03, 2014 at 02:17:02AM +0400, Andrey Korolyov wrote:
>> On Wed, Sep 3, 2014 at 2:09 AM, Andrey Korolyov wrote:
>>> On Wed, Sep 3, 2014 at 1:51 AM, Michael S. Tsirkin wrote:
>>>> On Wed, Sep 03, 2014 at 01:29:29AM +0400, Andrey Korolyov wrote:
>>>>> On Wed, Sep 3, 2014 at 1:03 AM, Michael S. Tsirkin wrote:
>>>>>>> bad one is the
>>>>>>>
>>>>>>> Author: Jason Wang
>>>>>>> Date: Tue Sep 2 18:07:46 2014 +0300
>>>>>>>
>>>>>>> vhost_net: start/stop guest notifiers properly
>>>>>>
>>>>>>
>>>>>> upstream has this (pull request sent today):
>>>>>> vhost_net: cleanup start/stop condition
>>>>>>
>>>>>> Could you apply it and see if it helps please?
>>>>>>
>>>>>> Michael, if it helps it should be before start/stop guest notifiers
>>>>>> ideally to avoid bisect problems.
>>>>> It is already applied as shown from the list in the previous message
>>>>> (there are some aio fixes too on top of 2.1 I picked before but they
>>>>> should not impact vhost-net interaction in any mean). The symptoms are
>>>>> a bit interesting - VM crashes only at PCI device initalization (e.g.
>>>>> grub stage after reset and initrd unpacking are passing well, but then
>>>>> things getting ugly). I am running 3.14 guest i686-pae kernel from
>>>>> debian backports in guest, so it may be version-specific after all. If
>>>>> it`ll be hard to reproduce, I can try 64bit, expecting same behavior.
>>>>> Please find args in attached file.
>>>>
>>>>
>>>> ok just to make sure - which tree do I clone exactly?
>>>>
>>> https://github.com/mdroth/qemu.git stable-2.1-staging showing same
>>> behavior for me with those patches
>> Forgot to mention important detail - I am playing with -mq now, so
>> actually virtio-net working in a bit different way than it may
>> expected (it also shown in args list from above, but someone may miss
>> it):
>> ...
>> qemu-system-x86_64: unable to start vhost net: 95: falling back on
>> userspace virtio
>> qemu-system-x86_64: unable to start vhost net: 95: falling back on
>> userspace virtio
>> ...
> Okay, so there's some bug in the error handling then.
> I'll dig into it - meanwhile can you please strace
> the binary to figure out which ioctl is failing?
>
> Or just trace it by hand: I am guessing vhost_net_start_one
> is the one failing here, add printfs there and check
> (note to self: we need more error messages in that function).
>

Looks like the issue was caused by this commit:

commit 2e6d46d77ed328d34a94688da8371bcbe243479b
Author: Nikolay Nikolaev
Date:   Tue May 27 15:04:42 2014 +0300

    vhost: add vhost_get_features and vhost_ack_features

It removes the step that initializes acked_features to backend_features.
This leaves acked_features with an unexpected value, which may cause
setting the features to fail. I will post a patch for this.
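
To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not the QEMU code: struct sketch_dev, SKETCH_INVALID_BIT, sketch_ack_features() and sketch_negotiate() are hypothetical names, and the assumption that the ack helper only ORs bits into acked_features is mine. Under that assumption, the result depends on whatever acked_features held before the call unless it is first set to backend_features.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the device state; NOT QEMU's struct vhost_dev. */
struct sketch_dev {
    uint64_t backend_features; /* features offered by the backend */
    uint64_t acked_features;   /* features that will be sent to the backend */
};

/* Hypothetical sentinel that terminates a feature-bit list. */
#define SKETCH_INVALID_BIT 0xff

/*
 * Assumed behavior: for every listed bit that the guest acked, OR it into
 * acked_features. Nothing is ever cleared here, so the starting value of
 * acked_features leaks into the result.
 */
static void sketch_ack_features(struct sketch_dev *dev,
                                const int *feature_bits,
                                uint64_t guest_features)
{
    assert(dev != NULL && feature_bits != NULL);

    for (const int *bit = feature_bits; *bit != SKETCH_INVALID_BIT; bit++) {
        uint64_t mask = UINT64_C(1) << *bit;

        if (guest_features & mask) {
            dev->acked_features |= mask;
        }
    }
}

/*
 * With init_first != 0, acked_features is reset to backend_features before
 * acking (the initialization step discussed in this thread). With
 * init_first == 0, the previous (possibly stale) value is kept.
 */
static uint64_t sketch_negotiate(struct sketch_dev *dev,
                                 const int *feature_bits,
                                 uint64_t guest_features,
                                 int init_first)
{
    assert(dev != NULL);

    if (init_first) {
        dev->acked_features = dev->backend_features;
    }
    sketch_ack_features(dev, feature_bits, guest_features);
    return dev->acked_features;
}
```

With backend_features = 0x4, a stale acked_features of 0xF0, and a guest that acks only bit 0, the uninitialized path yields 0xF1 while the initialized path yields 0x5. A backend that does not support the stray bits could plausibly reject the first value when features are set, which would match the symptom reported here.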