From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:33836)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <mukawa@igel.co.jp>) id 1akMMp-0003SV-6h
	for qemu-devel@nongnu.org; Sun, 27 Mar 2016 21:53:20 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <mukawa@igel.co.jp>) id 1akMMl-0004qE-U3
	for qemu-devel@nongnu.org; Sun, 27 Mar 2016 21:53:19 -0400
Received: from mail-pa0-x22b.google.com ([2607:f8b0:400e:c03::22b]:33308)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <mukawa@igel.co.jp>) id 1akMMl-0004qA-IL
	for qemu-devel@nongnu.org; Sun, 27 Mar 2016 21:53:15 -0400
Received: by mail-pa0-x22b.google.com with SMTP id zm5so1146049pac.0
	for <qemu-devel@nongnu.org>; Sun, 27 Mar 2016 18:53:14 -0700 (PDT)
References: <1441753806-14225-1-git-send-email-marcandre.lureau@redhat.com>
	<20151126121944-mutt-send-email-mst@redhat.com>
	<20160324071001.GA4525@yliu-dev.sh.intel.com>
	<CAJ+F1CL0MQsz5JHhWQDMTR-29hh_D65E5d+FUCdg4YQ8453TLA@mail.gmail.com>
From: Tetsuya Mukawa <mukawa@igel.co.jp>
Message-ID: <56F88E87.2030704@igel.co.jp>
Date: Mon, 28 Mar 2016 10:53:11 +0900
MIME-Version: 1.0
In-Reply-To: <CAJ+F1CL0MQsz5JHhWQDMTR-29hh_D65E5d+FUCdg4YQ8453TLA@mail.gmail.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Subject: Re: [Qemu-devel] [PATCH RFC 00/14] vhost-user: shutdown and
 reconnection
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Cc: =?UTF-8?Q?Marc-Andr=c3=a9_Lureau?= <marcandre.lureau@gmail.com>, QEMU <qemu-devel@nongnu.org>, "Michael S. Tsirkin" <mst@redhat.com>

On 2016/03/26 3:00, Marc-André Lureau wrote:
> Hi
>
> On Thu, Mar 24, 2016 at 8:10 AM, Yuanhan Liu
> <yuanhan.liu@linux.intel.com> wrote:
>>>> The following series starts from the idea that the slave can request a
>>>> "managed" shutdown instead and later recover (I guess the use case for
>>>> this is to allow for example to update static dispatching/filter rules
>>>> etc)
>> What if the backend crashes, so that no such request is sent? And
>> I'm wondering why this request is needed, as we are able to detect
>> the disconnect now (with your patches).
> I don't think trying to handle backend crashes is really a thing we
> need to take care of. If the backend is bad enough to crash, it may as
> well corrupt the guest memory (mst: my understanding of vhost-user is
> that backend must be trusted, or it could just throw garbage in the
> queue descriptors with surprising consequences or elsewhere in the
> guest memory actually, right?).
>
>> BTW, you meant to have QEMU as the server and the backend as the client
>> here, right? Honestly, that's what we thought of, too, at first.
>> However, I'm wondering whether we could still go with QEMU as the client
>> and the backend as the server (the default and the only way DPDK
>> supports), and let QEMU try to reconnect when the backend crashes
>> and restarts. In that case, we need to enable the "reconnect" option
>> for vhost-user, and once I have done that, it basically works in my
>> test:
>>
> Conceptually, I think if we allow the backend to disconnect, it makes
> sense that qemu is actually the socket server. But it doesn't matter
> much, it's simple to teach qemu to reconnect on a timer... So we should
> probably allow both cases anyway.
>
>> - start DPDK vhost-switch example
>>
>> - start QEMU, which will connect to DPDK vhost-user
>>
>>   link is good now.
>>
>> - kill DPDK vhost-switch
>>
>>   link is broken at this stage
>>
>> - start DPDK vhost-switch again
>>
>>   you will find that the link is back again.
>>
>>
>> Does that make sense to you? If so, we may need to make no (or just
>> very few) changes at all to DPDK to get reconnect working.
> The main issue with handling crashes (gone at any time) is that the
> backend may not have time to sync the used idx (at the least). It may
> already have processed incoming packets, so on reconnect, it may
> duplicate the receiving/dispatching work. Similarly, on the backend
> receiving end, some packets may be lost, never received by the VM, and
> later overwritten by the backend after reconnect (for the same used
> idx update reason). This may not be a big deal for unreliable
> protocols, but I am not familiar enough with network usage to know if
> that's fine in all cases. It may be fine for some packets, such as
> udp.
>
> However, in general, vhost-user should not be specific to network
> transmission, and it would be nice to have a reliable way for the
> backend to reconnect. That's what I try to do in this series. I'll
> repost it after I have done more testing.
>
> thanks
>

Hi Yuanhan,

We probably have two options here.
One is to use DEVICE_NEEDS_RESET, or to add a new status such as
QUEUE_NEEDS_RESET to the virtio specification.
In this case we will need to change the virtio-net drivers and the
virtio-net device in QEMU, which may touch a lot of code, but we can
handle an unexpected shutdown of the vhost-user backend.
The other option is Marc's simpler solution. In this case we don't need
to change the virtio-net drivers, but we cannot handle an unexpected
shutdown.
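
To make the first option a bit more concrete, here is a rough sketch of
the DEVICE_NEEDS_RESET flow. The two status bit values are the ones from
the virtio 1.0 specification; everything named sketch_* is hypothetical
and is not existing QEMU or driver code. The idea is only that the
device sets the bit when the backend disappears, raises a configuration
change notification if the driver was already running, and the driver
then resets and reinitializes the device.

```c
#include <stdbool.h>
#include <stdint.h>

/* Device status bits as defined by the virtio 1.0 specification. */
#define VIRTIO_STATUS_DRIVER_OK          0x04
#define VIRTIO_STATUS_DEVICE_NEEDS_RESET 0x40

/* Hypothetical, minimal device model; not a real QEMU structure. */
struct sketch_dev {
    uint8_t status;            /* virtio device status field */
    bool config_irq_pending;   /* configuration change notification queued */
};

/* Device side: called when the vhost-user backend goes away unexpectedly. */
static void sketch_backend_gone(struct sketch_dev *d)
{
    d->status |= VIRTIO_STATUS_DEVICE_NEEDS_RESET;

    /* Only notify a driver that has already finished initialization. */
    if (d->status & VIRTIO_STATUS_DRIVER_OK) {
        d->config_irq_pending = true;
    }
}

/* Driver side: on a config change, check whether a reset is required. */
static bool sketch_driver_must_reset(const struct sketch_dev *d)
{
    return (d->status & VIRTIO_STATUS_DEVICE_NEEDS_RESET) != 0;
}
```

The driver-side check is the part that would require changes in the
virtio-net drivers, which is why this option touches more code.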

Thanks,
Tetsuya