From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:36165)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <maxime.coquelin@redhat.com>) id 1c9FJL-0007NS-W1
	for qemu-devel@nongnu.org; Tue, 22 Nov 2016 12:56:53 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <maxime.coquelin@redhat.com>) id 1c9FJH-0004Kg-T3
	for qemu-devel@nongnu.org; Tue, 22 Nov 2016 12:56:52 -0500
Received: from mx1.redhat.com ([209.132.183.28]:42644)
	by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32)
	(Exim 4.71) (envelope-from <maxime.coquelin@redhat.com>)
	id 1c9FJH-0004KO-Jk
	for qemu-devel@nongnu.org; Tue, 22 Nov 2016 12:56:47 -0500
References: <1479419887-10515-1-git-send-email-maxime.coquelin@redhat.com>
	<f7tzikwirip.fsf@redhat.com>
	<0105e7e3-6003-04c8-483d-30ed1208e5fc@redhat.com>
	<f7tvavkiofh.fsf@redhat.com>
	<20161121182040-mutt-send-email-mst@kernel.org>
	<38ee6d4c-ff87-d8cc-1a1d-f45b948dcdb3@redhat.com>
	<796ff318-c3e9-e885-f345-7f6e48d3bb8b@redhat.com>
	<f7ttwazoaam.fsf@redhat.com>
	<20161122163413-mutt-send-email-mst@kernel.org>
From: Maxime Coquelin <maxime.coquelin@redhat.com>
Message-ID: <2d971a18-4253-1bac-4098-6879c9ccc01f@redhat.com>
Date: Tue, 22 Nov 2016 18:56:43 +0100
MIME-Version: 1.0
In-Reply-To: <20161122163413-mutt-send-email-mst@kernel.org>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Qemu-devel] [RFC v2 0/3] virtio-net: Add support to MTU feature
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: "Michael S. Tsirkin" <mst@redhat.com>, Aaron Conole <aconole@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>, pbonzini@redhat.com, qemu-devel@nongnu.org, yuanhan.liu@linux.intel.com


On 11/22/2016 03:41 PM, Michael S. Tsirkin wrote:
> On Tue, Nov 22, 2016 at 09:32:01AM -0500, Aaron Conole wrote:
>> Maxime Coquelin <maxime.coquelin@redhat.com> writes:
>>
>>> On 11/22/2016 05:07 AM, Jason Wang wrote:
>>>>
>>>>
>>>> On 2016年11月22日 00:23, Michael S. Tsirkin wrote:
>>>>> On Fri, Nov 18, 2016 at 02:21:54PM -0500, Aaron Conole wrote:
>>>>>> Maxime Coquelin <maxime.coquelin@redhat.com> writes:
>>>>>>
>>>>>>> On 11/18/2016 07:15 PM, Aaron Conole wrote:
>>>>>>>> Maxime Coquelin <maxime.coquelin@redhat.com> writes:
>>>>>>>>
>>>>>>>>> This series implements Virtio spec update from Aaron Conole which
>>>>>>>>> defines a way for the host to expose its max MTU to the guest.
>>>>>>>>>
>>>>>>>>> Changes since RFC v1:
>>>>>>>>> ---------------------
>>>>>>>>>   - Rebased on top of v2.8.0-rc0 (2.7.90)
>>>>>>>>>   - Write MTU unconditionally in netcfg to avoid memory leak (Paolo)
>>>>>>>>>   - Add host_mtu property to be able to disable the feature from QEMU
>>>>>>>>>
>>>>>>>>> Maxime Coquelin (3):
>>>>>>>>>    vhost-user: Add new protocol feature MTU
>>>>>>>>>    vhost-net: Add new MTU feature support
>>>>>>>>>    virtio-net: Add MTU feature support
>>>>>>>>>
>>>>>>>>>   hw/net/vhost_net.c             | 11 +++++++++++
>>>>>>>>>   hw/net/virtio-net.c            | 14 ++++++++++++++
>>>>>>>>>   hw/virtio/vhost-user.c         | 11 +++++++++++
>>>>>>>>>   include/hw/virtio/vhost.h      |  1 +
>>>>>>>>>   include/hw/virtio/virtio-net.h |  1 +
>>>>>>>>>   include/net/vhost_net.h        |  2 ++
>>>>>>>>>   6 files changed, 40 insertions(+)
>>>>>>>> I ran this with a VM, but it seems the offered maximum MTU was of value
>>>>>>>> 0 - is this expected with this version?  How can I change the offered
>>>>>>>> value?  Sorry, I'm not as familiar with QEMU/libvirt side of the
>>>>>>>> world.
>>>>>>> The way I implemented it, the MTU value is to be provided by the
>>>>>>> vhost-user process (e.g. OVS/DPDK). I added a Vhost protocol
>>>>>>> feature for this. The sequence is:
>>>>>>> 1. Qemu sends a VHOST_USER_GET_PROTOCOL_FEATURES request
>>>>>>> 2. DPDK replies with its supported features
>>>>>>> 3. If DPDK supports VHOST_USER_PROTOCOL_F_MTU, Qemu sends a
>>>>>>>     VHOST_USER_GET_MTU request
>>>>>>> 4. DPDK replies with the MTU value
>>>>>>>
>>>>>>> Does that make sense?
>>>>>> In the case of a vhost-user backed port, yes (so for instance, if I use
>>>>>> ovs+dpdk vhost-user in client or server mode).  However, what about the
>>>>>> non-dpdk case, where I still use a virtio-net driver in kernel and want
>>>>>> to have it backed with, say, a tap device in the host attached to
>>>>>> virbr0 (or some other bridge).  It should still pull the mtu from that
>>>>>> device and offer it, I think.
>>>>>>
>>>>>>> Another possibility would be that we could directly pass the MTU value
>>>>>>> to Qemu. It may be easier to implement, and to handle migration.
>>>>>>> The problem is that if we do this, it is no longer the vSwitch that
>>>>>>> decides the MTU to set.
>>>>>> Might be better to determine the mtu by looking at what actually
>>>>>> provides the back-end for the networking.
>>>>>>
>>>>>>> Regards,
>>>>>>> Maxime
>>>>> Right. So in case it's not vhost-user, I would say it has to
>>>>> be specified from QEMU command line.
>>>>> It's probably easier to do the same everywhere, and just send
>>>>> the MTU from qemu to backend.
>>>>
>>>> Or vice versa? E.g. qemu needs to be notified if the MTU of the tap or
>>>> macvtap was changed?
>>
>> I think qemu should just query the MTU at start time and then
>> provide that as the value to the VM.  Why specify with command line
>> option?
>
> Passing it on command line is an easy way to make sure we can migrate
> the VM correctly.  Also, if we get it from the backend, this requires
> backend changes.  In particular, tun/macvtap will need new ioctls.  So
> generally that's more work.

I agree this is easier for migration.

>>  Seems to me like an easy way to get out of sync.
>
> If we send it to the backend, that has a chance to check
> mtu and disconnect on error.

For vhost-user backend, we can send it the MTU value with a
vhost-user protocol feature.
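
To make that a bit more concrete, here is a rough sketch of what I have
in mind. It is not the actual QEMU or DPDK code: the mock_backend
struct, the helper names, the bit position used for
VHOST_USER_PROTOCOL_F_MTU and the existence of a "set MTU" request are
all assumptions made for illustration only. QEMU only sends the MTU if
the backend advertises the protocol feature, and the backend gets a
chance to reject a value it cannot support:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit position, chosen only for this sketch. */
#define VHOST_USER_PROTOCOL_F_MTU 4

/* Stand-in for a vhost-user backend such as OVS/DPDK (not a real type). */
struct mock_backend {
    uint64_t protocol_features; /* reply to GET_PROTOCOL_FEATURES */
    uint16_t max_mtu;           /* largest MTU the backend can handle */
    uint16_t mtu;               /* MTU accepted from QEMU, 0 if none yet */
};

/* Backend side of a hypothetical "set MTU" request: check, then store. */
static int mock_backend_set_mtu(struct mock_backend *b, uint16_t mtu)
{
    if (mtu > b->max_mtu) {
        return -1; /* backend refuses; QEMU may treat this as an error */
    }
    b->mtu = mtu;
    return 0;
}

/*
 * QEMU side: send host_mtu to the backend only if the feature was offered.
 * Returns 1 if the MTU was sent and accepted, 0 if the backend does not
 * support the feature (nothing sent), -1 if the backend rejected the value.
 */
static int send_mtu_to_backend(struct mock_backend *b, uint16_t host_mtu)
{
    assert(b != NULL);

    if (!(b->protocol_features & (1ULL << VHOST_USER_PROTOCOL_F_MTU))) {
        return 0;
    }
    if (mock_backend_set_mtu(b, host_mtu) < 0) {
        return -1;
    }
    return 1;
}
```

On a -1 return QEMU could refuse to start or drop the connection, which
is the "check mtu and disconnect on error" behaviour mentioned above. A
backend that does not advertise the feature is simply left alone.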

For tun/macvtap, how would you do that without adding a new ioctl?

>
>>> The spec says the MTU must not be modified by the device once it has
>>> been set.
>>
>> +1 - we don't have an exchange or negotiation mechanism for this.  That
>> would require additional bits and communication, and it seems like it
>> isn't worth it.
>>
>>> I think it would require a device reset if the MTU were to change.
>>
>> It's just too much work for not enough gain.  Guest can change MTU all
>> it wants.  Host should just know what it will limit to the guest from
>> the beginning.  Maybe I'm too simple, though.
>>
>> -Aaron
>
> I agree it's a good start.

Thanks,
Maxime