netdev.vger.kernel.org archive mirror
From: "Samudrala, Sridhar" <sridhar.samudrala@intel.com>
To: Siwei Liu <loseweigh@gmail.com>, David Miller <davem@davemloft.net>
Cc: David Ahern <dsahern@gmail.com>, Jiri Pirko <jiri@resnulli.us>,
	si-wei liu <si-wei.liu@oracle.com>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Stephen Hemminger <stephen@networkplumber.org>,
	Alexander Duyck <alexander.h.duyck@intel.com>,
	"Brandeburg, Jesse" <jesse.brandeburg@intel.com>,
	Jakub Kicinski <kubakici@wp.pl>, Jason Wang <jasowang@redhat.com>,
	Netdev <netdev@vger.kernel.org>,
	virtualization@lists.linux-foundation.org,
	virtio-dev@lists.oasis-open.org
Subject: Re: Re: [RFC PATCH 2/3] netdev: kernel-only IFF_HIDDEN netdevice
Date: Wed, 18 Apr 2018 16:33:34 -0700
Message-ID: <1f3af59f-fd64-cc0d-f9eb-668636c52db4@intel.com>
In-Reply-To: <CADGSJ21nQ1BQctfZcThpmOcOUrpqLfG71jsh2trb1utVjNQH=Q@mail.gmail.com>

On 4/17/2018 5:26 PM, Siwei Liu wrote:
> I ran this by a few folks offline and gathered some good feedback
> that I'd like to share, and thereby revive the discussion.
>
> First of all, as illustrated in the reply below, cloud service
> providers require transparent live migration. Specifically, the main
> target of our case is to support SR-IOV live migration via kernel
> upgrade while keeping the userspace of old distros unmodified. If it's
> because this use case is not appealing enough for the mainline to
> adopt, I will shut up and not continue discussing, although
> technically it's entirely possible (and there's precedent in other
> implementations) to do so to benefit any cloud service provider.
>
> If it's just that the implementation of hiding the netdev itself needs
> to be improved, such as implementing it as an attribute flag or adding a
> linkdump API, that's completely fine and we can look into that. However,
> the specific issue that needs to be understood beforehand is how to make
> transparent SR-IOV able to take over the name (and so inherit all the
> configs) from the lower netdev, which needs some games with uevents and
> name space reservation. So far I don't think it's been well discussed.
>
> One thing in particular I'd like to point out is that the 3-netdev
> model currently fails to address the core problem of live migration:
> migration of hardware-specific features/state, e.g. ethtool configs
> and hardware offloading states. Only general network state (e.g. IP
> address, gateway) associated with the bypass interface can be
> migrated. As follow-up work, the bypass driver can/should be enhanced to
> save and apply those hardware-specific configs before or after
> migration as needed. The transparent 1-netdev model being proposed as
> part of this patch series will be able to solve that problem naturally
> by making all hardware-specific configurations go through the central
> bypass driver, such that hardware configurations can be replayed when
> a new VF or passthrough device gets plugged back in. Although the
> corresponding function hasn't been implemented yet, I'd like to
> remind everyone that this is the core problem any live migration
> proposal should address.
>
> If it would make things clearer to defer netdev hiding until all
> functionality for centralizing and replay is implemented,
> we'd take that advice and move on to implementing those features
> as follow-up patches. Once all needed features are done, we'd resume
> the work on hiding the lower netdev at that point. I think it would be
> best to make everyone understand the big picture in advance before
> going too far.

I think we should get the 3-netdev model integrated and add any additional
ndo_ops/ethtool ops that we would like to support/migrate before looking into
hiding the lower netdevs.
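
To make the "record and replay" idea from the quoted text concrete, here is a
minimal userspace sketch, not kernel code and not what the patch series
implements. The names BypassMaster, LowerDev, set_config, plug_vf and
unplug_vf are all hypothetical; the only assumption taken from the thread is
that hardware-specific settings go through one central upper device so they
can be re-applied when a VF is plugged back in.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class LowerDev:
    """Hypothetical stand-in for a lower netdev (VF or virtio backup)."""
    name: str
    settings: Dict[str, str] = field(default_factory=dict)

    def apply(self, key: str, value: str) -> None:
        # In a real driver this would be an ndo/ethtool op on the device.
        self.settings[key] = value


@dataclass
class BypassMaster:
    """Hypothetical upper device that records config for later replay."""
    backup: LowerDev
    vf: Optional[LowerDev] = None
    recorded: Dict[str, str] = field(default_factory=dict)

    def set_config(self, key: str, value: str) -> None:
        # Every hardware-specific setting is recorded centrally first,
        # then pushed down to the VF if one is currently plugged in.
        self.recorded[key] = value
        if self.vf is not None:
            self.vf.apply(key, value)

    def unplug_vf(self) -> None:
        # Migration starts: the VF goes away, traffic falls back to backup.
        self.vf = None

    def plug_vf(self, vf: LowerDev) -> None:
        # Destination host: replay everything recorded so far on the new VF.
        for key, value in self.recorded.items():
            vf.apply(key, value)
        self.vf = vf

    def active(self) -> LowerDev:
        return self.vf if self.vf is not None else self.backup


master = BypassMaster(backup=LowerDev("virtio0"))
master.plug_vf(LowerDev("vf0"))
master.set_config("tso", "off")
master.unplug_vf()
fallback = master.active()
new_vf = LowerDev("vf1")
master.plug_vf(new_vf)
```

The point of the sketch is only that replay needs a single place where the
settings are recorded; which ops would be recorded, and when they would be
applied relative to migration, is left open here as it is in the thread.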


>
> Thanks, comments welcome.
>
> -Siwei
>
>
> On Mon, Apr 9, 2018 at 11:48 PM, Siwei Liu <loseweigh@gmail.com> wrote:
>> On Sun, Apr 8, 2018 at 9:32 AM, David Miller <davem@davemloft.net> wrote:
>>> From: Siwei Liu <loseweigh@gmail.com>
>>> Date: Fri, 6 Apr 2018 19:32:05 -0700
>>>
>>>> And I assume everyone here understands the use case for live
>>>> migration (in the context of providing cloud service) is very
>>>> different, and we have to hide the netdevs. If not, I'm more than
>>>> happy to clarify.
>>> I think you still need to clarify.
>> OK. The short answer is cloud users really want *transparent* live migration.
>>
>> By being transparent it means they don't and shouldn't care about the
>> existence and the occurrence of live migration, but they do if the
>> userspace toolstack and libraries have to be updated or modified,
>> which means potential dependency breakage for their applications. They
>> don't like any change to the userspace environment (existing apps
>> lift-and-shift, no recompilation, no re-packaging, no re-certification
>> needed), while hardly anyone cares about ABI or API compatibility at
>> the kernel level, as long as their applications don't break.
>>
>> I agree the current bypass solution for SR-IOV live migration requires
>> guest cooperation, though that doesn't mean guest *userspace*
>> cooperation. As a matter of fact, technically it shouldn't involve
>> userspace at all to get SR-IOV migration working. It's the kernel that
>> does the real work. If I understand the goal of this in-kernel
>> approach correctly, it was meant to save userspace from modification
>> or corresponding toolstack support, as those additional 2 interfaces
>> are more a side product of this approach, rather than being necessary
>> for users to be aware of. All the user needs to deal with is one
>> single interface, and that's what they care about. It's more trouble
>> than help when they see 2 extra interfaces present. Management
>> tools in the old distros don't recognize them and try to bring up
>> those extra interfaces on their own. Various odd warnings start to spew
>> out, and there are a lot of caveats for the users to get around...
>>
>> On the other hand, if we "teach" those cloud users to update the
>> userspace toolstack just for trading a feature they don't need, no one
>> is likely going to embrace the change. As such there's just no real
>> value of adopting this in-kernel bypass facility for any cloud service
>> provider. It does not look more appealing than just configuring generic
>> bonding using its own set of daemons or scripts. But again, cloud
>> users don't welcome that facility. And basically it would get to
>> nearly the same set of problems if leaving userspace alone.
>>
>> IMHO we're not hiding the devices; think of it as adding a
>> feature transparent to the user. Those auto-managed slaves are ones users
>> don't care much about. And the user is still able to see and configure the
>> lower netdevs if they really desire to do so. But generally the
>> target user for this feature won't need to know that. Why would they care
>> how many interfaces a VM virtually has rather than how many interfaces
>> are actually _usable_ to them?
>>
>> Thanks,
>> -Siwei
>>
>>
>>> netdevs are netdevs.  If they have special attributes, mark them as
>>> such and the tools base their actions upon that.
>>>
>>> "Hiding", or changing classes, doesn't make any sense to me still.
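
For reference, the alternative quoted above (mark special netdevs with an
attribute and let tools act on it) could look roughly like this from a
management tool's side. This is a sketch under assumptions: DevAttr,
AUTO_MANAGED and devices_to_configure are made-up names, not an existing
kernel flag or netlink attribute, and the device names are illustrative.

```python
from enum import Flag, auto
from typing import List, NamedTuple


class DevAttr(Flag):
    NONE = 0
    AUTO_MANAGED = auto()  # hypothetical marker for kernel-managed lower devs


class NetDev(NamedTuple):
    name: str
    attrs: DevAttr


def devices_to_configure(devs: List[NetDev]) -> List[str]:
    """Return the names a tool should bring up, skipping marked devices."""
    return [d.name for d in devs if DevAttr.AUTO_MANAGED not in d.attrs]


devs = [
    NetDev("eth0", DevAttr.NONE),                # upper device the user sees
    NetDev("eth0_vf", DevAttr.AUTO_MANAGED),     # lower VF, kernel-managed
    NetDev("eth0_backup", DevAttr.AUTO_MANAGED), # lower virtio, kernel-managed
]
selected = devices_to_configure(devs)
```

All three devices stay visible in this model; only a tool that understands
the attribute leaves the lower ones alone, which is exactly the part that
depends on updated userspace.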
