From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Michael S. Tsirkin" Subject: Re: [PATCH net] failover: eliminate callback hell Date: Thu, 7 Jun 2018 00:30:21 +0300 Message-ID: <20180607002047-mutt-send-email-mst@kernel.org> References: <20180605211927-mutt-send-email-mst@kernel.org> <20180605115305.502a7ebb@xeon-e3> <20180605221049-mutt-send-email-mst@kernel.org> <20180605145222.477e5ae8@xeon-e3> <20180605205118.439a7873@xeon-e3> <72cf1d79-f311-98e5-23a0-6ee23dc8fd6b@intel.com> <20180605230038.521d5c18@xeon-e3> <20180606141620.0b333dff@xeon-e3> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: "Samudrala, Sridhar" , kys@microsoft.com, haiyangz@microsoft.com, davem@davemloft.net, netdev@vger.kernel.org, Stephen Hemminger To: Stephen Hemminger Return-path: Received: from mx3-rdu2.redhat.com ([66.187.233.73]:54464 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1752035AbeFFVaX (ORCPT ); Wed, 6 Jun 2018 17:30:23 -0400 Content-Disposition: inline In-Reply-To: <20180606141620.0b333dff@xeon-e3> Sender: netdev-owner@vger.kernel.org List-ID: On Wed, Jun 06, 2018 at 02:16:20PM -0700, Stephen Hemminger wrote: > On Tue, 5 Jun 2018 23:11:37 -0700 > "Samudrala, Sridhar" wrote: > > > On 6/5/2018 11:00 PM, Stephen Hemminger wrote: > > > On Tue, 5 Jun 2018 22:39:12 -0700 > > > "Samudrala, Sridhar" wrote: > > > > > >> On 6/5/2018 8:51 PM, Stephen Hemminger wrote: > > >>> On Tue, 5 Jun 2018 16:52:22 -0700 > > >>> "Samudrala, Sridhar" wrote: > > >>> > > >>>> On 6/5/2018 2:52 PM, Stephen Hemminger wrote: > > >>>>> On Tue, 5 Jun 2018 22:38:43 +0300 > > >>>>> "Michael S. Tsirkin" wrote: > > >>>>> > > >>>>>>> See: > > >>>>>>> https://patchwork.ozlabs.org/patch/851711/ > > >>>>>> Let me try to summarize that: > > >>>>>> > > >>>>>> You wanted to speed up the delayed link up. You had an idea to > > >>>>>> additionally take link up when userspace renames the interface (standby > > >>>>>> one which is also the failover for netvsc). > > >>>>>> > > >>>>>> But userspace might not do any renames, in which case there will > > >>>>>> still be the delay, and so this never got applied. > > >>>>>> > > >>>>>> Is this a good summary? > > >>>>>> > > >>>>>> Davem said delay should go away completely as it's not robust, and I > > >>>>>> think I agree. So I don't think we should make all failover users use > > >>>>>> delay. IIUC failover kept a delay option especially for netvsc to > > >>>>>> minimize the surprise factor. Hopefully we can come up with > > >>>>>> something more robust and drop that option completely. > > >>>>> The timeout was the original solution to how to complete setup after > > >>>>> userspace has had a chance to rename the device. Unfortunately, the whole network > > >>>>> device initialization (cooperation with udev and userspace) is a a mess because > > >>>>> there is no well defined specification, and there are multiple ways userspace > > >>>>> does this in old and new distributions. The timeout has its own issues > > >>>>> (how long, handling errors during that window, what if userspace modifies other > > >>>>> device state); and open to finding a better solution. > > >>>>> > > >>>>> My point was that if name change can not be relied on (or used) by netvsc, > > >>>>> then we can't allow it for net_failover either. > > >>>> I think the push back was with the usage of the delay, not bringing up the primary/standby > > >>>> device in the name change event handler. > > >>>> Can't netvsc use this mechanism instead of depending on the delay? > > >>>> > > >>>> > > >>> The patch that was rejected for netvsc was about using name change. > > >>> Also, you can't depend on name change; you still need a timer. Not all distributions > > >>> change name of devices. Or user has blocked that by udev rules. > > >> In the net_failover_slave_register() we do a dev_open() and ignore any failure due to > > >> EBUSY and do another dev_open() in the name change event handler. > > >> If the name is not expected to change, i would think the dev_open() at the time of > > >> register will succeed. > > > The problem is your first dev_open will bring device up and lockout > > > udev from changing the network device name. > > > > I have tried with/without udev and didn't see any issue with the naming of the primary/standby > > devices in my testing. > > > > With the 3-netdev failover model, we are only interested in setting the right name for the failover > > netdev and that is the reason we do SET_NETDEV_DEV on that netdev. Does it really matter if udev fails > > to rename the lower primary/standby netdevs, i don't think it will matter? The user is not expected > > to touch the lower netdevs. > > Renaming matters to some users and the udev maintainers. They want the VF to be named enpXXX > The primary in virtio case needs to be ens3 with some cloud platforms. Confused. VF can't be a standby, of it's used in a failover it's a primary, you can't call it both enpXXX amd ens3. Could you describe the use case in a bit more detail? > > I think you need to get rid of triggering off of the name change. Worth considering down the road since it's a bit of a hack but it's one we won't have trouble supporting, unlike the delayed uplink. > Long term, let's open the discussion about allowing network devices to change name when up? > Maybe with NETIF_LIVENAME_CHANGE flag? That's probably the cleanest approach assuming it can be made to work without races. I suspect we can just live with what we have until then. -- MST