From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:50250)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1ala9z-00064x-4V for qemu-devel@nongnu.org;
	Thu, 31 Mar 2016 06:49:08 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1ala9u-0004yV-V5 for qemu-devel@nongnu.org;
	Thu, 31 Mar 2016 06:49:07 -0400
Received: from mailout2.w1.samsung.com ([210.118.77.12]:18731)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1ala9u-0004y8-MQ for qemu-devel@nongnu.org;
	Thu, 31 Mar 2016 06:49:02 -0400
Received: from eucpsbgm1.samsung.com (unknown [203.254.199.244])
	by mailout2.w1.samsung.com
	(Oracle Communications Messaging Server 7.0.5.31.0 64bit (built May 5 2014))
	with ESMTP id <0O4W0048DGPNJ1B0@mailout2.w1.samsung.com>
	for qemu-devel@nongnu.org; Thu, 31 Mar 2016 11:48:59 +0100 (BST)
References: <1459350849-31989-1-git-send-email-i.maximets@samsung.com>
	<20160330195614-mutt-send-email-mst@redhat.com>
	<56FCBD59.9020203@samsung.com>
	<20160331122012-mutt-send-email-mst@redhat.com>
From: Ilya Maximets
Message-id: <56FD009A.6000700@samsung.com>
Date: Thu, 31 Mar 2016 13:48:58 +0300
MIME-version: 1.0
In-reply-to: <20160331122012-mutt-send-email-mst@redhat.com>
Content-type: text/plain; charset=windows-1252
Content-transfer-encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 0/4] Fix QEMU crash on vhost-user socket disconnect.
To: "Michael S. Tsirkin"
Cc: Jason Wang, qemu-devel@nongnu.org, Dyasly Sergey

On 31.03.2016 12:21, Michael S. Tsirkin wrote:
> On Thu, Mar 31, 2016 at 09:02:01AM +0300, Ilya Maximets wrote:
>> On 30.03.2016 20:01, Michael S. Tsirkin wrote:
>>> On Wed, Mar 30, 2016 at 06:14:05PM +0300, Ilya Maximets wrote:
>>>> Currently QEMU always crashes in following scenario (assume that
>>>> vhost-user application is Open vSwitch with 'dpdkvhostuser' port):
>>>
>>> In fact, wouldn't the right thing to do be stopping the VM?
>>
>> I don't think that is reasonable to stop VM on failure of just one of
>> network interfaces. There may be still working vhost-kernel interfaces.
>> Even connection can still be established if vhost-user interface was in
>> bonding with kernel interface.
>> Anyway user should be able to save all his data before QEMU restart.
>
>
> Unfortunately some guests are known to crash if we defer
> using available buffers indefinitely.

What exactly do you mean?
Anyway, here we have 'crash of some guests' vs. '100% constant QEMU crash'.

>
> If you see bugs let's fix them, but recovering would be
> a new feature, let's defer it until after 2.6.

The segmentation fault is a bug. This series *doesn't introduce any recovery*
features. A disconnected network interface in the guest will never work again.
I just tried to avoid the segmentation fault in this situation. There will be
no difference for the guest between QEMU with these patches and without them.
There just will not be any crashes.

>
>>>
>>>> 1. # Check that link in guest is in a normal state.
>>>>    [guest]# ip link show eth0
>>>>    2: eth0: mtu 1500 qdisc <...>
>>>>        link/ether 00:16:35:af:aa:4b brd ff:ff:ff:ff:ff:ff
>>>>
>>>> 2. # Kill vhost-user application (using SIGSEGV just to be sure).
>>>>    [host]# kill -11 `pgrep ovs-vswitchd`
>>>>
>>>> 3. # Check that guest still thinks that all is good.
>>>>    [guest]# ip link show eth0
>>>>    2: eth0: mtu 1500 qdisc <...>
>>>>        link/ether 00:16:35:af:aa:4b brd ff:ff:ff:ff:ff:ff
>>>>
>>>> 4. # Try to unbind virtio-pci driver and observe QEMU crash.
>>>>    [guest]# echo -n '0000:00:01.0' > /sys/bus/pci/drivers/virtio-pci/unbind
>>>>    qemu: Failed to read msg header. Read 0 instead of 12. Original request 11.
>>>>    qemu: Failed to read msg header. Read 0 instead of 12. Original request 11.
>>>>    qemu: Failed to read msg header. Read 0 instead of 12. Original request 11.
>>>>
>>>>    Child terminated with signal = 0xb (SIGSEGV)
>>>>    GDBserver exiting
>>>>
>>>> After the applying of this patch-set:
>>>>
>>>> 4. # Try to unbind virtio-pci driver. Unbind works fine with only few errors.
>>>>    [guest]# echo -n '0000:00:01.0' > /sys/bus/pci/drivers/virtio-pci/unbind
>>>>    qemu: Failed to read msg header. Read 0 instead of 12. Original request 11.
>>>>    qemu: Failed to read msg header. Read 0 instead of 12. Original request 11.
>>>>
>>>> 5. # Bind virtio-pci driver back.
>>>>    [guest]# echo -n '0000:00:01.0' > /sys/bus/pci/drivers/virtio-pci/bind
>>>>
>>>> 6. # Check link in guest. No crashes here, link in DOWN state.
>>>>    [guest]# ip link show eth0
>>>>    7: eth0: mtu 1500 qdisc <...>
>>>>        link/ether 00:16:35:af:aa:4b brd ff:ff:ff:ff:ff:ff
>>>>
>>>> 7. QEMU may be gracefully restarted to restore communication after restarting
>>>>    of vhost-user application.
>>>>
>>>> Ilya Maximets (4):
>>>>   vhost-user: fix crash on socket disconnect.
>>>>   vhost: prevent double stop of vhost_net device.
>>>>   vhost: check for vhost_net device validity.
>>>>   net: notify about link status only if it changed.
>>>>
>>>>  hw/net/vhost_net.c             | 48 ++++++++++++++++++++++++++++++++++++++----
>>>>  hw/net/virtio-net.c            | 33 ++++++++++++++++++++++-------
>>>>  include/hw/virtio/virtio-net.h |  1 +
>>>>  include/net/vhost-user.h       |  1 +
>>>>  include/net/vhost_net.h        |  1 +
>>>>  net/filter.c                   |  1 +
>>>>  net/net.c                      |  7 +++---
>>>>  net/vhost-user.c               | 43 +++++++++++++++++++++++++++++--------
>>>>  8 files changed, 111 insertions(+), 24 deletions(-)
>>>>
>>>> --
>>>> 2.5.0
>>>
>>>
>
>
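
To illustrate what "avoid the segmentation fault without recovering" means
here, below is a minimal standalone sketch in C. It is not code from the
series and does not use QEMU's real types or functions: demo_backend,
demo_net_dev, demo_net_disconnect(), demo_net_stop() and demo_set_link() are
all hypothetical names made up for this example. It only shows the three
ideas named in the patch titles: check that the device is still valid before
touching it, make a second stop a no-op, and notify about link status only
when it changes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are NOT QEMU's types. */
struct demo_backend {
    int stop_calls;                /* how often the backend was really stopped */
};

struct demo_net_dev {
    struct demo_backend *backend;  /* NULL once the socket peer is gone */
    bool started;
    bool link_down;
    int link_events;               /* link notifications delivered so far */
};

/* Socket disconnect: forget the backend so nobody dereferences it later. */
static void demo_net_disconnect(struct demo_net_dev *dev)
{
    if (dev != NULL) {
        dev->backend = NULL;
        dev->started = false;
    }
}

/*
 * Stop the device. Safe to call twice or after a disconnect.
 * Returns true only if a live backend was actually stopped.
 */
static bool demo_net_stop(struct demo_net_dev *dev)
{
    if (dev == NULL || dev->backend == NULL || !dev->started) {
        return false;              /* nothing to stop, nothing to dereference */
    }
    dev->backend->stop_calls++;
    dev->started = false;
    assert(!dev->started);         /* a stopped device is never left marked started */
    return true;
}

/* Report a link change only if the status really changed. */
static bool demo_set_link(struct demo_net_dev *dev, bool down)
{
    if (dev == NULL || dev->link_down == down) {
        return false;
    }
    dev->link_down = down;
    dev->link_events++;
    return true;
}
```

In this sketch a disconnected device stays disconnected: there is no
reconnect path, and stop requests that arrive after the disconnect simply
return false instead of dereferencing a stale pointer.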