From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:55307)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1ao909-0002eC-GN for qemu-devel@nongnu.org;
	Thu, 07 Apr 2016 08:25:34 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1ao904-00034i-B6 for qemu-devel@nongnu.org;
	Thu, 07 Apr 2016 08:25:33 -0400
Received: from mx1.redhat.com ([209.132.183.28]:47762)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1ao904-00034V-3S for qemu-devel@nongnu.org;
	Thu, 07 Apr 2016 08:25:28 -0400
Date: Thu, 7 Apr 2016 15:25:23 +0300
From: "Michael S. Tsirkin"
Message-ID: <20160407142012-mutt-send-email-mst@redhat.com>
References: <1094233965.209701460027388015.JavaMail.weblogic@eumlwas01>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1094233965.209701460027388015.JavaMail.weblogic@eumlwas01>
Subject: Re: [Qemu-devel] [PATCH 0/4] Fix QEMU crash on vhost-user socket disconnect.
To: Ilya Maximets
Cc: Jason Wang, "qemu-devel@nongnu.org", Sergey Dyasly

On Thu, Apr 07, 2016 at 11:09:48AM +0000, Ilya Maximets wrote:
> > ------- Original Message -------
> > Sender : Michael S. Tsirkin
> > Date : Apr 07, 2016 10:01 (GMT+03:00)
> > Title : Re: Re: [PATCH 0/4] Fix QEMU crash on vhost-user socket disconnect.
> >
> > On Wed, Apr 06, 2016 at 11:52:56PM +0000, Ilya Maximets wrote:
> > > ------- Original Message -------
> > > Sender : Michael S. Tsirkin
> > > Date : Apr 05, 2016 13:46 (GMT+03:00)
> > > Title : Re: [PATCH 0/4] Fix QEMU crash on vhost-user socket disconnect.
> > >
> > > > On Thu, Mar 31, 2016 at 09:02:01AM +0300, Ilya Maximets wrote:
> > > > > On 30.03.2016 20:01, Michael S.
Tsirkin wrote:
> > > > > > On Wed, Mar 30, 2016 at 06:14:05PM +0300, Ilya Maximets wrote:
> > > > > >> Currently QEMU always crashes in following scenario (assume that
> > > > > >> vhost-user application is Open vSwitch with 'dpdkvhostuser' port):
> > > > > >
> > > > > > In fact, wouldn't the right thing to do be stopping the VM?
> > > > >
> > > > > I don't think that is reasonable to stop VM on failure of just one of
> > > > > network interfaces.
> > > >
> > > > We don't start QEMU until vhost user connects, either.
> > > > Making guest run properly with a backend is a much bigger
> > > > project, let's tackle this separately.
> > > >
> > > > Also, I think handling graceful disconnect is the correct first step.
> > > > Handling misbehaving clients is much harder as we have asserts on remote
> > > > obeying protocol rules all over the place.
> > > >
> > > > > There may be still working vhost-kernel interfaces.
> > > > > Even connection can still be established if vhost-user interface was in
> > > > > bonding with kernel interface.
> > > >
> > > > Could not parse this.
> > > >
> > > > > Anyway user should be able to save all his data before QEMU restart.
> > > >
> > > > So reconnect a new backend, and VM will keep going.
> > >
> > > We can't do this because of 2 reasons:
> > > 1. Segmentation fault of QEMU on disconnect. (fixed by this patch-set)
> > > 2. There is no reconnect functionality in current QEMU version.
> > >
> > > So, what are you talking about?
> > >
> > > Best regards, Ilya Maximets.
> >
> > One can currently disconnect vhost user clients without killing
> > guests using migration:
> > - save vm
> > - quit qemu
> > - start new qemu
> > - load vm
> >
> > Or using hotplug
> > - request unplug
> > - wait for guest eject
> > - create new device
>
> This may fix the second reason mentioned above, but in my scenario (described in cover-letter)
> guest will know that backend disconnected only after segmentation fault.
So, there will be
> no opportunity to save or unplug machine.

In cover letter you describe killing vhost user client.
So don't do it then.

> > I would like to make sure we do not create an expectation that guests
> > keep going unconditionally with device present even with backend
> > disconnected. Unfortunately your patchset might create such
> > expectation.
>
> It's already so, because guest will never know about crash of backend until it
> tries to communicate via socket.

I agree the current handling isn't good. I am just reluctant to stick
even more stuff in there before 2.6. We need a plan going forward.
I'll think it over during the weekend.

-- 
MST