From mboxrd@z Thu Jan  1 00:00:00 1970
References: <1479419887-10515-1-git-send-email-maxime.coquelin@redhat.com>
 <0105e7e3-6003-04c8-483d-30ed1208e5fc@redhat.com>
 <20161121182040-mutt-send-email-mst@kernel.org>
 <38ee6d4c-ff87-d8cc-1a1d-f45b948dcdb3@redhat.com>
 <796ff318-c3e9-e885-f345-7f6e48d3bb8b@redhat.com>
 <20161122163413-mutt-send-email-mst@kernel.org>
From: Maxime Coquelin
Message-ID: <2d971a18-4253-1bac-4098-6879c9ccc01f@redhat.com>
Date: Tue, 22 Nov 2016 18:56:43 +0100
MIME-Version: 1.0
In-Reply-To: <20161122163413-mutt-send-email-mst@kernel.org>
Content-Type: text/plain; charset=utf-8; format=flowed
Subject: Re: [Qemu-devel] [RFC v2 0/3] virtio-net: Add support to MTU feature
To: "Michael S. Tsirkin", Aaron Conole
Cc: Jason Wang, pbonzini@redhat.com, qemu-devel@nongnu.org,
 yuanhan.liu@linux.intel.com

On 11/22/2016 03:41 PM, Michael S. Tsirkin wrote:
> On Tue, Nov 22, 2016 at 09:32:01AM -0500, Aaron Conole wrote:
>> Maxime Coquelin writes:
>>
>>> On 11/22/2016 05:07 AM, Jason Wang wrote:
>>>>
>>>>
>>>> On 2016-11-22 00:23, Michael S. Tsirkin wrote:
>>>>> On Fri, Nov 18, 2016 at 02:21:54PM -0500, Aaron Conole wrote:
>>>>>> Maxime Coquelin writes:
>>>>>>
>>>>>>> On 11/18/2016 07:15 PM, Aaron Conole wrote:
>>>>>>>> Maxime Coquelin writes:
>>>>>>>>
>>>>>>>>> This series implements Virtio spec update from Aaron Conole which
>>>>>>>>> defines a way for the host to expose its max MTU to the guest.
>>>>>>>>>
>>>>>>>>> Changes since RFC v1:
>>>>>>>>> ---------------------
>>>>>>>>>  - Rebased on top of v2.8.0-rc0 (2.7.90)
>>>>>>>>>  - Write MTU unconditionally in netcfg to avoid memory leak (Paolo)
>>>>>>>>>  - Add host_mtu property to be able to disable the feature from QEMU
>>>>>>>>>
>>>>>>>>> Maxime Coquelin (3):
>>>>>>>>>   vhost-user: Add new protocol feature MTU
>>>>>>>>>   vhost-net: Add new MTU feature support
>>>>>>>>>   virtio-net: Add MTU feature support
>>>>>>>>>
>>>>>>>>>  hw/net/vhost_net.c             | 11 +++++++++++
>>>>>>>>>  hw/net/virtio-net.c            | 14 ++++++++++++++
>>>>>>>>>  hw/virtio/vhost-user.c         | 11 +++++++++++
>>>>>>>>>  include/hw/virtio/vhost.h      |  1 +
>>>>>>>>>  include/hw/virtio/virtio-net.h |  1 +
>>>>>>>>>  include/net/vhost_net.h        |  2 ++
>>>>>>>>>  6 files changed, 40 insertions(+)
>>>>>>>> I ran this with a VM, but it seems the offered maximum MTU was of
>>>>>>>> value 0 - is this expected with this version?  How can I change the
>>>>>>>> offered value?  Sorry, I'm not as familiar with QEMU/libvirt side of
>>>>>>>> the world.
>>>>>>> The way I implemented it, the MTU value is to be provided by the
>>>>>>> vhost-user process (e.g. OVS/DPDK). I added a Vhost protocol
>>>>>>> feature for this. The sequence is:
>>>>>>> 1. Qemu sends a VHOST_USER_GET_PROTOCOL_FEATURES request
>>>>>>> 2. DPDK replies with its supported features
>>>>>>> 3. If DPDK supports VHOST_USER_PROTOCOL_F_MTU, Qemu sends a
>>>>>>>    VHOST_USER_GET_MTU request
>>>>>>> 4. DPDK replies with the MTU value
>>>>>>>
>>>>>>> Does that make sense?
>>>>>> In the case of a vhost-user backed port, yes (so for instance, if I use
>>>>>> ovs+dpdk vhost-user in client or server mode).  However, what about the
>>>>>> non-dpdk case, where I still use a virtio-net driver in kernel and want
>>>>>> to have it backed with, say, a tap device in the host attached to
>>>>>> virbr0 (or some other bridge).  It should still pull the mtu from that
>>>>>> device and offer it, I think.
>>>>>>
>>>>>>> Another possibility would be that we could directly pass the MTU value
>>>>>>> to Qemu. It may be easier to implement, and to handle migration.
>>>>>>> The problem is that if we do this, it is not the vSwitch that decides
>>>>>>> the MTU to set.
>>>>>> Might be better to determine the mtu by looking at what actually
>>>>>> provides the back-end for the networking.
>>>>>>
>>>>>>> Regards,
>>>>>>> Maxime
>>>>> Right. So in case it's not vhost-user, I would say it has to
>>>>> be specified from QEMU command line.
>>>>> It's probably easier to do the same everywhere, and just send
>>>>> the MTU from qemu to backend.
>>>>
>>>> Or vice-versa? E.g. qemu needs to be notified if the MTU of tap or
>>>> macvtap was changed?
>>
>> I think qemu should just query the MTU at start time and then
>> provide that as the value to the VM.  Why specify with command line
>> option?
>
> Passing it on command line is an easy way to make sure we can migrate
> the VM correctly.  Also, if we get it from the backend, this requires
> backend changes. In particular, tun/macvtap will need new ioctls. So
> generally that's more work.

I agree this is easier for migration.

>> Seems to me like an easy way to get out of sync.
>
> If we send it to the backend, that has a chance to check
> mtu and disconnect on error.

For the vhost-user backend, we can send it the MTU value with a vhost-user
protocol feature. For tun/macvtap, how would you do it without adding a new
ioctl?

>
>>> The spec says the MTU must not be modified by the device once it has
>>> been set.
>>
>> +1 - we don't have an exchange or negotiation mechanism for this.  That
>> would require additional bits and communication, and it seems like it
>> isn't worth it.
>>
>>> I think it would require a device reset if the MTU were to change.
>>
>> It's just too much work for not enough gain.  Guest can change MTU all
>> it wants.  Host should just know what it will limit to the guest from
>> the beginning.  Maybe I'm too simple, though.
>>
>> -Aaron
>
> I agree it's a good start.

Thanks,
Maxime