Date: Mon, 2 May 2016 10:37:51 -0700
From: Yuanhan Liu
To: "Michael S. Tsirkin"
Cc: Marc-André Lureau, QEMU, Ilya Maximets, jonshin@cisco.com,
    Tetsuya Mukawa, "Xie, Huawei"
Subject: Re: [Qemu-devel] [PATCH 11/18] vhost-user: add shutdown support
Message-ID: <20160502173751.GM5641@yliu-dev.sh.intel.com>
In-Reply-To: <20160502080610-mutt-send-email-mst@redhat.com>

On Mon, May 02, 2016 at 01:45:31PM +0300, Michael S. Tsirkin wrote:
> On Sun, May 01, 2016 at 02:04:42PM -0700, Yuanhan Liu wrote:
> > On Sun, May 01, 2016 at 02:37:19PM +0300, Michael S. Tsirkin wrote:
> > > On Fri, Apr 29, 2016 at 10:48:35AM -0700, Yuanhan Liu wrote:
> > > > > But I would worry first about a backend that crashes: it may
> > > > > corrupt the VM memory too...
> > > >
> > > > Not quite sure I follow this. But the backend only touches the
> > > > virtio-related memory, so it will do no harm to the VM?
> > >
> > > It crashed, so apparently it's not behaving as designed -
> > > how can we be sure it does not harm the VM?
> >
> > Hi Michael,
> >
> > From what I know so far, a crash might corrupt the virtio memory in
> > two ways:
> >
> > - vring_used_elem might be left in a stale state when we are in the
> >   middle of updating the used vring. Say, we might have updated the
> >   "id" field but left "len" untouched.
> >
> > - a vring desc buffer might also be left in a stale state, when we
> >   are copying data into it.
>
> - a random write into VM memory due to a backend bug corrupts it.
>
> > However, the two issues above are not real issues, as used->idx is
> > not updated in either case. Therefore, the corrupted memory will not
> > be processed by the guest. So, no harm I'd say. Or am I missing
> > anything?
>
> Why is the backend crashing? It shouldn't, so maybe this means its
> memory is corrupt. That is the claim.

As long as the virtio memory is not corrupted (or is only corrupted in
the ways described above), there would be no issue. But yes, you made a
good point: we have no guarantee that it wasn't virtio memory corruption
that caused the crash in the first place.
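To make the ordering argument concrete, here is a simplified sketch of
how a backend publishes a used element. The struct and function names
(used_elem, used_ring, publish_used, wmb) are local to this example and
not the actual QEMU/DPDK definitions; only the update ordering itself
follows the virtio spec. Because used->idx is bumped only after the
element is fully written, with a write barrier in between, a crash at
any point before step 3 leaves the guest's view of the ring unchanged:

    /* Simplified sketch of the used-ring update ordering.  The layout
     * follows the virtio spec, but the names are local to this example. */
    #include <stdint.h>

    struct used_elem {
        uint32_t id;    /* head index of the consumed descriptor chain */
        uint32_t len;   /* number of bytes written into the buffers */
    };

    struct used_ring {
        uint16_t flags;
        uint16_t idx;                  /* the only field the guest polls */
        struct used_elem ring[256];    /* 256 = example queue size */
    };

    #define wmb() __sync_synchronize()  /* stand-in for a write barrier */

    static void publish_used(struct used_ring *used, uint16_t qsize,
                             uint32_t head, uint32_t len)
    {
        uint16_t slot = used->idx % qsize;

        /* 1. Fill in the element.  A crash here leaves "id"/"len" stale,
         *    but the guest never looks at this slot because idx has not
         *    been advanced yet. */
        used->ring[slot].id  = head;
        used->ring[slot].len = len;

        /* 2. Make sure the element is visible before the index update. */
        wmb();

        /* 3. Only now does the guest consider the new entry valid. */
        used->idx++;
    }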
> > BTW, Michael, what are your thoughts on the whole reconnect stuff?
> >
> > 	--yliu
>
> My thoughts are that we need to split these patchsets up.
>
> There are several issues here:
>
> 1. Graceful disconnect
> - One should be able to do vmstop, disconnect, connect, then vm start.
>   This looks like a nice intermediate step.
> - Why would we always assume it's always the remote initiating the
>   disconnect? Monitor commands for disconnecting seem called for.

A monitor command is a good suggestion. But I'm wondering why vmstop is
necessary. Basically, a disconnect is like unplugging the cable of a
physical NIC; we don't need to pause the machine before unplugging it.

>
> 2. Unexpected disconnect
> - If buffers are processed in order or flushed before socket close,
>   then on disconnect the state can be recovered from the ring alone.
>   If that is of interest, we might want to add a protocol flag for
>   that. F_DISCONNECT_FLUSH?

Sorry, how is this flag supposed to work here? (I put a rough sketch of
how I imagine such a bit being negotiated at the end of this mail, to
make sure I understand the proposal.)

>   Without this, only graceful disconnect or a reset with the guest's
>   help can be supported.
> - As Marc-André said, without graceful shutdown it is not enough to
>   handle ring state though. We must handle the socket getting
>   disconnected in the middle of send/receive. While it's more work,
>   it does seem like a nice interface to support.

Same as above: what can the backend (or QEMU) do in this case without
the guest's help (a reset)?

> - I understand the concern that existing guests do not handle the
>   NEED_RESET status. One way to fix this would be a new feature bit.
>   If NEED_RESET is not handled,

I could check the code myself, but it might be trivial for you to
answer my question: how do we know that NEED_RESET is not handled?

>   request hot-unplug instead.

Same as above: how would we request a hot-unplug?

>
> 3. Running while disconnected
> - At the moment, we wait on vm start for the remote to connect; if we
>   support disconnecting the backend without stopping, we probably
>   should also support starting it without waiting for a connection.

Agreed. I guess that would answer my first question: we don't actually
need to do vmstop before disconnect.

	--yliu

> - Guests expect tx buffers to be used in a timely manner, thus:
>   - If the vm is running we must use them in qemu, maybe discarding
>     packets in the process.
>     There already are commands for link down; I'm not sure there's
>     value in doing this automatically in qemu.
>   - Alternatively, we should also have an option to stop the vm
>     automatically (like we do on disk errors) to reduce the number of
>     dropped packets.
>
> 4. Reconnecting
> - When acting as a server, we might want an option to go back to the
>   listening state, but it should not be the only option; there should
>   be a monitor command for moving it back to the listening state.
> - When acting as a client, auto-reconnect might be handy sometimes,
>   but it should not be the only option; there should be a way to
>   manually request a connect, possibly to a different target, so a
>   monitor command for re-connecting seems called for.
> - We'll also need monitor events for disconnects so management knows
>   it must re-connect/restart listening.
> - If we stopped the VM, there could be an option to restart it on
>   reconnect.
>
>
> --
> MST
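Coming back to the F_DISCONNECT_FLUSH question above, here is a purely
illustrative sketch of how I imagine a backend would consume such a
protocol feature bit. The name VHOST_USER_PROTOCOL_F_DISCONNECT_FLUSH,
the bit number, and all struct/function names below are made up for
this mail (nothing like this exists in the spec); only the pattern of
testing a bit in the negotiated protocol features mirrors the existing
vhost-user handshake. Please correct me if this is not what you had in
mind:

    /* Hypothetical sketch only: the flag and all names are invented;
     * the negotiation check is the only part borrowed from the existing
     * vhost-user protocol-features handshake. */
    #include <stdbool.h>
    #include <stdint.h>

    #define VHOST_USER_PROTOCOL_F_DISCONNECT_FLUSH  15   /* made-up bit */

    struct backend_dev {
        uint64_t protocol_features;   /* as set by SET_PROTOCOL_FEATURES */
    };

    /* Stand-in for whatever the backend does to complete or discard all
     * in-flight requests and publish the results in the used ring. */
    static void flush_inflight_requests(struct backend_dev *dev)
    {
        (void)dev;
    }

    /* Called by the backend right before it gracefully closes the
     * socket.  If the flag was negotiated, the ring alone should be
     * enough for QEMU to recover the device state on reconnect. */
    static void backend_prepare_disconnect(struct backend_dev *dev)
    {
        bool flush = dev->protocol_features &
                     (1ULL << VHOST_USER_PROTOCOL_F_DISCONNECT_FLUSH);

        if (flush)
            flush_inflight_requests(dev);

        /* ...then close the vhost-user socket... */
    }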