From: "Samudrala, Sridhar" <sridhar.samudrala@intel.com>
To: Alexander Duyck <alexander.duyck@gmail.com>
Cc: Jakub Kicinski <kubakici@wp.pl>,
"Michael S. Tsirkin" <mst@redhat.com>,
Siwei Liu <loseweigh@gmail.com>,
Stephen Hemminger <stephen@networkplumber.org>,
David Miller <davem@davemloft.net>,
Netdev <netdev@vger.kernel.org>,
virtualization@lists.linux-foundation.org,
virtio-dev@lists.oasis-open.org,
"Brandeburg, Jesse" <jesse.brandeburg@intel.com>,
Alexander Duyck <alexander.h.duyck@intel.com>
Subject: Re: [virtio-dev] Re: [RFC PATCH net-next v2 2/2] virtio_net: Extend virtio to use VF datapath when available
Date: Sun, 28 Jan 2018 13:01:59 -0800
Message-ID: <b6bf20f8-8881-4d25-db4c-24f93d5e6cba@intel.com>
In-Reply-To: <CAKgT0UdaKtPe6982TuuGbxhhVgeehwS1aAp=s4sok2vKD6wMJg@mail.gmail.com>

On 1/28/2018 12:18 PM, Alexander Duyck wrote:
> On Sun, Jan 28, 2018 at 11:18 AM, Samudrala, Sridhar
> <sridhar.samudrala@intel.com> wrote:
>> On 1/28/2018 9:35 AM, Alexander Duyck wrote:
>>> On Fri, Jan 26, 2018 at 9:58 PM, Jakub Kicinski <kubakici@wp.pl> wrote:
>>>> On Fri, 26 Jan 2018 21:33:01 -0800, Samudrala, Sridhar wrote:
>>>>>>> 3 netdev model breaks this configuration starting with the creation
>>>>>>> and naming of the 2 devices to udev needing to be aware of master and
>>>>>>> slave virtio-net devices.
>>>>>> I don't understand this comment. There is one virtio-net device and
>>>>>> one "virtio-bond" netdev. And user space has to be aware of the
>>>>>> special automatic arrangement anyway, because it can't touch the VF. It
>>>>>> doesn't make any difference whether it ignores the VF or PV and VF.
>>>>>> It simply can't touch the slaves, no matter how many there are.
>>>>> If the userspace is not expected to touch the slaves, then why do we
>>>>> need to take extra effort to expose a netdev that is just not really useful.
>>>> You said:
>>>> "[user space] needs to be aware of master and slave virtio-net devices."
>>>>
>>>> I'm saying:
>>>> It has to be aware of the special arrangement whether there is an
>>>> explicit bond netdev or not.
>>> To clarify here the kernel should be aware that there is a device that
>>> is an aggregate of 2 other devices. It isn't as if we need to insert
>>> the new device "above" the virtio.
>>>
>>> I have been doing a bit of messing around with a few ideas and it
>>> seems like it would be better if we could replace the virtio interface
>>> with the virtio-bond, renaming my virt-bond concept to this since it
>>> is now supposedly going to live in the virtio driver, interface, and
>>> then push the original virtio down one layer and call it a
>>> virtio-backup. If I am not mistaken we could assign the same dev
>>> pointer used by the virtio netdev to the virtio-bond, and if we
>>> register it first with the "eth%d" name then udev will assume that the
>>> virtio-bond device is the original virtio and all existing scripts
>>> should just work with that. We then would want to change the name of
>>> the virtio interface with the backup feature bit set, maybe call it
>>> something like bkup-00:00:00 where the 00:00:00 would be the last 3
>>> octets of the MAC address. It should solve the issue of inserting an
>>> interface "above" the virtio by making the virtio-bond become the
>>> virtio. The only limitation is that we will probably need to remove
>>> the back-up if the virtio device is removed, however that was already
>>> a limitation of this solution and others like the netvsc solution
>>> anyway.
>>
>> With the 3 netdev model, if we make the master virtio-net associated with
>> the real virtio pci device, I think the userspace scripts would not break.
>> If we go this route, I am still not clear on the purpose of exposing the
>> bkup netdev.
>> Can we start with the 2 netdev model and move to the 3 netdev model later if
>> we find out that there are limitations with the 2 netdev model? I don't think
>> this will break any user API as the userspace is not expected to use the bkup
>> netdev.
> The 2 netdev model breaks a large number of expectations of user
> space. Among them is XDP since we cannot guarantee a symmetric setup
> between any entity and the virtio. How does it make sense that
> enabling XDP on virtio shows zero Rx packets, and in the meantime you
> are getting all of the packets coming in off of the VF?

Sure, we cannot support XDP in this model, and it needs to be disabled.

>
> In addition we would need to rewrite the VLAN and MAC address
> filtering ndo operations since we likely cannot add any VLANs since in
> most cases VFs are VLAN locked due to things like port VLAN and we
> cannot change the MAC address since the whole bonding concept is built
> around it.
>
> The last bit is the netpoll packet routing which the current code
> assumes is using the virtio only, but I don't know if that is a valid
> assumption since the VF is expected to be the default route for
> everything else when it is available.
>
> The point is by the time you are done you will have rewritten pretty
> much all the network device ops. With that being the case why add all
> the code to virtio itself instead of just coming up with a brand new
> set of ndo_ops that belong to this new device, and you could leave the
> existing virtio code in place and save yourself a bunch of time by
> just accessing it as an existing call as a separate netdev.

When the BACKUP feature is enabled, we can simply disable most of these ndo
ops that cannot be supported. I am not sure we need an additional netdev and
ndo_ops. When we can support all these use cases along with live migration, we
can move to the 3 netdev model, and I think we will need a new feature bit so
that the hypervisor can allow the VM to use both datapaths and configure the
PF accordingly.
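
To make the "disable the unsupported ndo ops" idea concrete, here is a minimal
userspace sketch. It is not the virtio_net code: struct sketch_dev, the
sketch_* functions and the SKETCH_F_BACKUP bit position are hypothetical names
and values made up for illustration. Each op simply returns -EOPNOTSUPP once
the BACKUP feature has been negotiated.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bit position, chosen only for this sketch. */
#define SKETCH_F_BACKUP 62

/* Mock device state: only the negotiated feature bits matter here. */
struct sketch_dev {
	uint64_t features;
};

/* True when the (hypothetical) BACKUP feature bit was negotiated. */
static bool sketch_has_backup(const struct sketch_dev *dev)
{
	return (dev->features >> SKETCH_F_BACKUP) & 1ULL;
}

/* XDP setup: refused while BACKUP is negotiated. */
static int sketch_xdp_set(const struct sketch_dev *dev)
{
	if (sketch_has_backup(dev))
		return -EOPNOTSUPP;
	return 0; /* normal path elided in this sketch */
}

/* VLAN filter add: refused while BACKUP is negotiated. */
static int sketch_vlan_rx_add_vid(const struct sketch_dev *dev, uint16_t vid)
{
	(void)vid; /* unused in this sketch */
	if (sketch_has_backup(dev))
		return -EOPNOTSUPP;
	return 0;
}

/* MAC address change: refused while BACKUP is negotiated. */
static int sketch_set_mac_address(const struct sketch_dev *dev,
				  const uint8_t mac[6])
{
	(void)mac; /* unused in this sketch */
	if (sketch_has_backup(dev))
		return -EOPNOTSUPP;
	return 0;
}
```

A device without the bit set keeps its existing behaviour, so the check stays
local to each op and no extra netdev or ops table is needed for this part.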

Thanks,
Sridhar