From: Jiri Pirko <jiri@resnulli.us>
To: Alexander Duyck <alexander.duyck@gmail.com>
Cc: "Duyck, Alexander H" <alexander.h.duyck@intel.com>,
virtio-dev@lists.oasis-open.org,
"Michael S. Tsirkin" <mst@redhat.com>,
Jakub Kicinski <kubakici@wp.pl>,
Sridhar Samudrala <sridhar.samudrala@intel.com>,
virtualization@lists.linux-foundation.org,
Siwei Liu <loseweigh@gmail.com>, Netdev <netdev@vger.kernel.org>,
David Miller <davem@davemloft.net>
Subject: Re: [RFC PATCH v3 0/3] Enable virtio_net to act as a backup for a passthru device
Date: Tue, 20 Feb 2018 17:29:33 +0100
Message-ID: <20180220162933.GD2031@nanopsycho>
In-Reply-To: <CAKgT0UdU8PDXduzxp4kKfur-DLeFQSJj7-fhW_eTgVzd+AcViw@mail.gmail.com>
Tue, Feb 20, 2018 at 05:04:29PM CET, alexander.duyck@gmail.com wrote:
>On Tue, Feb 20, 2018 at 2:42 AM, Jiri Pirko <jiri@resnulli.us> wrote:
>> Fri, Feb 16, 2018 at 07:11:19PM CET, sridhar.samudrala@intel.com wrote:
>>>Patch 1 introduces a new feature bit VIRTIO_NET_F_BACKUP that can be
>>>used by hypervisor to indicate that virtio_net interface should act as
>>>a backup for another device with the same MAC address.
>>>
>>>Patch 2 is in response to the community request for a 3 netdev
>>>solution. However, it creates some issues we'll get into in a moment.
>>>It extends virtio_net to use alternate datapath when available and
>>>registered. When BACKUP feature is enabled, virtio_net driver creates
>>>an additional 'bypass' netdev that acts as a master device and controls
>>>2 slave devices. The original virtio_net netdev is registered as
>>>'backup' netdev and a passthru/vf device with the same MAC gets
>>>registered as 'active' netdev. Both 'bypass' and 'backup' netdevs are
>>>associated with the same 'pci' device. The user accesses the network
>>>interface via 'bypass' netdev. The 'bypass' netdev chooses 'active' netdev
>>>as default for transmits when it is available with link up and running.
>>
>> Sorry, but this is ridiculous. You are apparently re-implementing part
>> of bonding driver as a part of NIC driver. Bond and team drivers
>> are mature solutions, well tested, broadly used, with lots of issues
>> resolved in the past. What you try to introduce is a weird shortcut
>> that already has couple of issues as you mentioned and will certainly
>> have many more. Also, I'm pretty sure that in future, someone comes up
>> with ideas like multiple VFs, LACP and similar bonding things.
>
>The problem with the bond and team drivers is they are too large and
>have too many interfaces available for configuration so as a result
>they can really screw this interface up.
What? Too large in which sense? Why is "too many interfaces" a problem?
Also, team has only one interface to userspace team-generic-netlink.
>
>Essentially this is meant to be a bond that is more-or-less managed by
>the host, not the guest. We want the host to be able to configure it
How is it managed by the host? In your usecase the guest has 2 netdevs:
virtio_net, pci vf.
I don't see how host can do any managing of that, other than the
obvious. But still, the active/backup decision is done in guest. This is
a simple bond/team usecase. As I said, there is something needed to be
implemented in userspace in order to handle re-appear of vf netdev.
But that should be fairly easy to do in teamd.
>and have it automatically kick in on the guest. For now we want to
>avoid adding too much complexity as this is meant to be just the first
That's what I fear, "for now"..
>step. Trying to go in and implement the whole solution right from the
>start based on existing drivers is going to be a massive time sink and
>will likely never get completed due to the fact that there is always
>going to be some other thing that will interfere.
"implement the whole solution right from the start based on existing
drivers" - what solution are you talking about? I don't understand this
para.
>
>My personal hope is that we can look at doing a virtio-bond sort of
>device that will handle all this as well as providing a communication
>channel, but that is much further down the road. For now we only have
>a single bit so the goal for now is trying to keep this as simple as
>possible.
Oh. So there is really intention to do re-implementation of bonding
in virtio. That is plain-wrong in my opinion.
Could you just use bond/team, please, and don't reinvent the wheel with
this abomination?
>
>> What is the reason for this abomination? According to:
>> https://marc.info/?l=linux-virtualization&m=151189725224231&w=2
>> The reason is quite weak.
>> User in the vm sees 2 (or more) netdevices, he puts them in bond/team
>> and that's it. This works now! If the vm lacks some userspace features,
>> let's fix it there! For example the MAC changes is something that could
>> be easily handled in teamd userspace daemon.
>
>I think you might have missed the point of this. This is meant to be a
>simple interface so the guest should not be able to change the MAC
>address, and it shouldn't require any userspace daemon to setup or
>tear down. Ideally with this solution the virtio bypass will come up
>and be assigned the name of the original virtio, and the "backup"
>interface will come up and be assigned the name of the original virtio
>with an additional "nbackup" tacked on via the phys_port_name, and
>then whenever a VF is added it will automatically be enslaved by the
>bypass interface, and it will be removed when the VF is hotplugged
>out.
>
>In my mind the difference between this and bond or team is where the
>configuration interface lies. In the case of bond it is in the kernel.
>If my understanding is correct team is mostly in user space. With this
>the configuration interface is really down in the hypervisor and
>requests are communicated up to the guest. I would prefer not to make
>virtio_net dependent on the bonding or team drivers, or worse yet a
>userspace daemon in the guest. For now I would argue we should keep
>this as simple as possible just to support basic live migration. There
>have already been discussions of refactoring this after it is in so
>that we can start to combine the functionality here with what is there
>in bonding/team, but the differences in configuration interface and
>the size of the code bases will make it challenging to outright merge
>this into something like that.