Date: Wed, 13 Apr 2016 05:51:15 -0400 (EDT)
From: Marc-André Lureau
Subject: Re: [Qemu-devel] [PATCH 11/18] vhost-user: add shutdown support
To: Yuanhan Liu
Cc: marcandre lureau, qemu-devel@nongnu.org, "Michael S. Tsirkin", Tetsuya Mukawa, jonshin@cisco.com, Ilya Maximets
Message-ID: <38556601.765791.1460541075761.JavaMail.zimbra@redhat.com>
In-Reply-To: <20160413024931.GM3080@yliu-dev.sh.intel.com>
References: <1459509388-6185-1-git-send-email-marcandre.lureau@redhat.com>
 <1459509388-6185-12-git-send-email-marcandre.lureau@redhat.com>
 <20160413024931.GM3080@yliu-dev.sh.intel.com>

Hi

----- Original Message -----
> Hi Marc,
>
> First of all, sorry again for the late response!
>
> Last time I tried your first version, I found a few issues related
> to reconnect, mainly the acked_features getting lost. While checking your
> new code, I found that you've already solved that, which is great.
>
> So I tried harder this time; your patches work great, except that I
> found a few nits.
>
> On Fri, Apr 01, 2016 at 01:16:21PM +0200, marcandre.lureau@redhat.com wrote:
> > From: Marc-André Lureau
> ...
> > +Slave message types
> > +-------------------
> > +
> > + * VHOST_USER_SLAVE_SHUTDOWN:
> > +      Id: 1
> > +      Master payload: N/A
> > +      Slave payload: u64
> > +
> > +      Request the master to shutdown the slave. A 0 reply is for
> > +      success, in which case the slave may close all connections
> > +      immediately and quit.
>
> Assume we are using ovs + dpdk here, where we could have two
> vhost-user connections. When ovs initiates a restart, it might
> unregister the two connections one by one. In that case, two
> VHOST_USER_SLAVE_SHUTDOWN requests will be sent, and two replies
> will be received. Therefore, I don't think it's proper to ask the
> backend implementation to quit here.
>

On a success reply, the master has sent all the commands needed to finish
the connection, so the slave must flush/finish all pending requests first.
I think this should be enough; otherwise we may need a new explicit message?

> >
> >     switch (msg.request) {
> > +    case VHOST_USER_SLAVE_SHUTDOWN: {
> > +        uint64_t success = 1; /* 0 is for success */
> > +        if (dev->stop) {
> > +            dev->stop(dev);
> > +            success = 0;
> > +        }
> > +        msg.payload.u64 = success;
> > +        msg.size = sizeof(msg.payload.u64);
> > +        size = send(u->slave_fd, &msg, VHOST_USER_HDR_SIZE + msg.size, 0);
> > +        if (size != VHOST_USER_HDR_SIZE + msg.size) {
> > +            error_report("Failed to write reply.");
> > +        }
> > +        break;
>
> You might want to remove the slave_fd from the watch list? We
> might also need to close slave_fd here, assuming that we
> will no longer use it once VHOST_USER_SLAVE_SHUTDOWN is
> received?

Makes sense, I will change that in the next update.
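Something along these lines is what I have in mind (untested sketch, not a
patch: the helper name is made up, and it assumes the slave channel fd lives
in struct vhost_user as u->slave_fd and was registered with
qemu_set_fd_handler(), as in this series):

/* Untested sketch: tear down the slave channel once the shutdown reply
 * has been sent. Assumes the fd was registered with qemu_set_fd_handler()
 * and is stored in u->slave_fd; the helper name is hypothetical. */
static void slave_channel_cleanup(struct vhost_user *u)
{
    if (u->slave_fd >= 0) {
        /* stop watching the fd in the main loop */
        qemu_set_fd_handler(u->slave_fd, NULL, NULL, NULL);
        close(u->slave_fd);
        u->slave_fd = -1;
    }
}

i.e. call it right after sending the reply, so we neither watch nor write
to the fd again.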
> I'm asking because I found a seg fault issue sometimes,
> due to opaque being NULL.
>

I would be interested to see the backtrace or have a reproducer.

thanks