netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Stephen Hemminger <stephen@networkplumber.org>
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: "Samudrala, Sridhar" <sridhar.samudrala@intel.com>,
	kys@microsoft.com, haiyangz@microsoft.com, davem@davemloft.net,
	netdev@vger.kernel.org,
	Stephen Hemminger <sthemmin@microsoft.com>
Subject: Re: [PATCH net] failover: eliminate callback hell
Date: Wed, 6 Jun 2018 15:21:21 -0700	[thread overview]
Message-ID: <20180606152121.597a89ec@xeon-e3> (raw)
In-Reply-To: <20180607002047-mutt-send-email-mst@kernel.org>

On Thu, 7 Jun 2018 00:30:21 +0300
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Wed, Jun 06, 2018 at 02:16:20PM -0700, Stephen Hemminger wrote:
> > On Tue, 5 Jun 2018 23:11:37 -0700
> > "Samudrala, Sridhar" <sridhar.samudrala@intel.com> wrote:
> >   
> > > On 6/5/2018 11:00 PM, Stephen Hemminger wrote:  
> > > > On Tue, 5 Jun 2018 22:39:12 -0700
> > > > "Samudrala, Sridhar" <sridhar.samudrala@intel.com> wrote:
> > > >    
> > > >> On 6/5/2018 8:51 PM, Stephen Hemminger wrote:    
> > > >>> On Tue, 5 Jun 2018 16:52:22 -0700
> > > >>> "Samudrala, Sridhar" <sridhar.samudrala@intel.com> wrote:
> > > >>>       
> > > >>>> On 6/5/2018 2:52 PM, Stephen Hemminger wrote:    
> > > >>>>> On Tue, 5 Jun 2018 22:38:43 +0300
> > > >>>>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > >>>>>          
> > > >>>>>>> See:
> > > >>>>>>>       https://patchwork.ozlabs.org/patch/851711/    
> > > >>>>>> Let me try to summarize that:
> > > >>>>>>
> > > >>>>>> 	You wanted to speed up the delayed link up.  You had an idea to
> > > >>>>>> 	additionally take link up when userspace renames the interface (standby
> > > >>>>>> 	one which is also the failover for netvsc).
> > > >>>>>>
> > > >>>>>> 	But userspace might not do any renames, in which case there will
> > > >>>>>> 	still be the delay, and so this never got applied.
> > > >>>>>>
> > > >>>>>> 	Is this a good summary?
> > > >>>>>>
> > > >>>>>> Davem said delay should go away completely as it's not robust, and I
> > > >>>>>> think I agree.  So I don't think we should make all failover users use
> > > >>>>>> delay. IIUC failover kept a delay option especially for netvsc to
> > > >>>>>> minimize the surprise factor. Hopefully we can come up with
> > > >>>>>> something more robust and drop that option completely.    
> > > >>>>> The timeout was the original solution to how to complete setup after
> > > >>>>> userspace has had a chance to rename the device. Unfortunately, the whole network
> > > >>>>> device initialization (cooperation with udev and userspace) is a a mess because
> > > >>>>> there is no well defined specification, and there are multiple ways userspace
> > > >>>>> does this in old and new distributions.  The timeout has its own issues
> > > >>>>> (how long, handling errors during that window, what if userspace modifies other
> > > >>>>> device state); and open to finding a better solution.
> > > >>>>>
> > > >>>>> My point was that if name change can not be relied on (or used) by netvsc,
> > > >>>>> then we can't allow it for net_failover either.    
> > > >>>> I think the push back was with the usage of the delay, not bringing up the primary/standby
> > > >>>> device in the name change event handler.
> > > >>>> Can't netvsc use this mechanism instead of depending on the delay?
> > > >>>>
> > > >>>>       
> > > >>> The patch that was rejected for netvsc was about using name change.
> > > >>> Also, you can't depend on name change; you still need a timer. Not all distributions
> > > >>> change name of devices. Or user has blocked that by udev rules.    
> > > >> In the net_failover_slave_register() we do a dev_open() and ignore any failure due to
> > > >> EBUSY and do another dev_open() in the name change event handler.
> > > >> If the name is not expected to change, i would think the dev_open() at the time of
> > > >> register will succeed.    
> > > > The problem is your first dev_open will bring device up and lockout
> > > > udev from changing the network device name.    
> > > 
> > > I have tried with/without udev and didn't see any issue with the naming of the primary/standby
> > > devices in my testing.
> > > 
> > > With the 3-netdev failover model, we are only interested in setting the right name for the failover
> > > netdev and that is the reason we do SET_NETDEV_DEV on that netdev. Does it really matter if udev fails
> > > to rename the lower primary/standby netdevs, i don't think it will matter? The user is not expected
> > > to touch the lower netdevs.  
> > 
> > Renaming matters to some users and the udev maintainers. They want the VF to be named enpXXX
> > The primary in virtio case needs to be ens3 with some cloud platforms.  
> 
> Confused. VF can't be a standby, of it's used in a failover it's a
> primary, you can't call it both enpXXX amd ens3. Could you describe the
> use case in a bit more detail?

Sorry, got things backwards.  The primary is VF and it should be possible
to have it renamed based on PCI info.
The standby PV (in 3 dev model) would get renamed by udev to ens3.
So the failover device would just be ethX right?

> > 
> > I think you need to get rid of triggering off of the name change.  
> 
> Worth considering down the road since it's a bit of a hack but it's one
> we won't have trouble supporting, unlike the delayed uplink.

You can't depend on userspace doing rename.

> 
> > Long term, let's open the discussion about allowing network devices to change name when up?
> > Maybe with NETIF_LIVENAME_CHANGE flag?  
> 
> That's probably the cleanest approach assuming it can be made to work
> without races. I suspect we can just live with what we have until then.
> 
> 

  reply	other threads:[~2018-06-06 22:21 UTC|newest]

Thread overview: 52+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-05  3:42 [PATCH net] failover: eliminate callback hell Stephen Hemminger
2018-06-05 17:22 ` Samudrala, Sridhar
2018-06-05 17:45   ` Stephen Hemminger
2018-06-05 18:14     ` David Miller
2018-06-05 18:35 ` Michael S. Tsirkin
2018-06-05 18:53   ` Stephen Hemminger
2018-06-05 19:38     ` Michael S. Tsirkin
2018-06-05 21:52       ` Stephen Hemminger
2018-06-05 23:52         ` Samudrala, Sridhar
2018-06-06  3:51           ` Stephen Hemminger
2018-06-06  5:39             ` Samudrala, Sridhar
2018-06-06  6:00               ` Stephen Hemminger
2018-06-06  6:11                 ` Samudrala, Sridhar
2018-06-06 21:16                   ` Stephen Hemminger
2018-06-06 21:30                     ` Michael S. Tsirkin
2018-06-06 22:21                       ` Stephen Hemminger [this message]
2018-06-11 18:07                         ` Michael S. Tsirkin
2018-06-06 12:19             ` Michael S. Tsirkin
2018-06-06 21:17               ` Stephen Hemminger
2018-06-06  7:25 ` Jiri Pirko
2018-06-06 12:30   ` Michael S. Tsirkin
2018-06-06 21:24     ` Stephen Hemminger
2018-06-06 21:47       ` Michael S. Tsirkin
2018-06-06 22:24         ` Stephen Hemminger
2018-06-07 14:57           ` Michael S. Tsirkin
2018-06-07 15:23             ` Stephen Hemminger
2018-06-06 21:54       ` Samudrala, Sridhar
2018-06-06 22:25         ` Stephen Hemminger
2018-06-07 14:17           ` Alexander Duyck
2018-06-07 14:51             ` Stephen Hemminger
2018-06-07 15:41               ` Michael S. Tsirkin
2018-06-07 16:17                 ` Stephen Hemminger
2018-06-07 17:22                   ` Michael S. Tsirkin
2018-06-08 18:30                     ` Stephen Hemminger
2018-06-08 19:04                       ` Michael S. Tsirkin
2018-06-08 22:54         ` Siwei Liu
2018-06-11 15:17           ` Stephen Hemminger
2018-06-08 22:25       ` Siwei Liu
2018-06-08 23:18         ` Stephen Hemminger
2018-06-08 23:44           ` Siwei Liu
2018-06-09  0:02             ` Stephen Hemminger
2018-06-09  0:42               ` Siwei Liu
2018-06-11 15:22                 ` Stephen Hemminger
2018-06-11 19:23                   ` Siwei Liu
2018-06-11 14:01               ` Michael S. Tsirkin
2018-06-09  1:29             ` Jakub Kicinski
2018-06-11 18:56               ` Siwei Liu
2018-06-12  2:14                 ` Michael S. Tsirkin
2018-06-06 21:26   ` Stephen Hemminger
2018-06-11 18:10 ` Michael S. Tsirkin
2018-06-11 19:34   ` Samudrala, Sridhar
2018-06-12  0:08     ` Samudrala, Sridhar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180606152121.597a89ec@xeon-e3 \
    --to=stephen@networkplumber.org \
    --cc=davem@davemloft.net \
    --cc=haiyangz@microsoft.com \
    --cc=kys@microsoft.com \
    --cc=mst@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=sridhar.samudrala@intel.com \
    --cc=sthemmin@microsoft.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).