Date: Wed, 27 Mar 2019 12:11:32 +0100
From: Jiri Pirko
To: Si-Wei Liu
Cc: mst@redhat.com, sridhar.samudrala@intel.com, stephen@networkplumber.org, davem@davemloft.net, kubakici@wp.pl, alexander.duyck@gmail.com, netdev@vger.kernel.org, virtualization@lists.linux-foundation.org, liran.alon@oracle.com, boris.ostrovsky@oracle.com, vijay.balakrishna@oracle.com
Subject: Re: [PATCH net v3] failover: allow name change on IFF_UP slave interfaces
Message-ID: <20190327111132.GI6979@nanopsycho>
References: <1553644093-10917-1-git-send-email-si-wei.liu@oracle.com>
In-Reply-To: <1553644093-10917-1-git-send-email-si-wei.liu@oracle.com>
X-Mailing-List: netdev@vger.kernel.org

Wed, Mar 27, 2019 at 12:48:13AM CET, si-wei.liu@oracle.com wrote:
>When a netdev appears through hot plug then gets enslaved by a failover
>master that is already up and running, the slave will be opened
>right away after getting enslaved. Today there's a race that userspace
>(udev) may fail to rename the slave if the kernel (net_failover)
>opens the slave earlier than when the userspace rename happens.
>Unlike bond or team, the primary slave of failover can't be renamed by
>userspace ahead of time, since the kernel initiated auto-enslavement is
>unable to, or rather, is never meant to be synchronized with the rename
>request from userspace.
>
>As the failover slave interfaces are not designed to be operated
>directly by userspace apps: IP configuration, filter rules with
>regard to network traffic passing and etc., should all be done on master
>interface. In general, userspace apps only care about the
>name of master interface, while slave names are less important as long
>as admin users can see reliable names that may carry
>other information describing the netdev. For e.g., they can infer that
>"ens3nsby" is a standby slave of "ens3", while for a
>name like "eth0" they can't tell which master it belongs to.
>
>Historically the name of IFF_UP interface can't be changed because
>there might be admin script or management software that is already
>relying on such behavior and assumes that the slave name can't be
>changed once UP. But failover is special: with the in-kernel
>auto-enslavement mechanism, the userspace expectation for device
>enumeration and bring-up order is already broken. Previously initramfs
>and various userspace config tools were modified to bypass failover
>slaves because of auto-enslavement and duplicate MAC address. Similarly,
>in case that users care about seeing reliable slave name, the new type
>of failover slaves needs to be taken care of specifically in userspace
>anyway.
>
>It's less risky to lift up the rename restriction on failover slave
>which is already UP. Although it's possible this change may potentially
>break userspace component (most likely configuration scripts or
>management software) that assumes slave name can't be changed while
>UP, it's relatively a limited and controllable set among all userspace
>components, which can be fixed specifically to listen for the rename
>and/or link down/up events on failover slaves. Userspace component
>interacting with slaves is expected to be changed to operate on failover
>master interface instead, as the failover slave is dynamic in nature
>which may come and go at any point. The goal is to make the role of
>failover slaves less relevant, and userspace components should only
>deal with failover master in the long run.
>
>Fixes: 30c8bd5aa8b2 ("net: Introduce generic failover module")
>Signed-off-by: Si-Wei Liu
>Reviewed-by: Liran Alon
>
>--
>v1 -> v2:
>- Drop configurable module parameter (Sridhar)
>
>v2 -> v3:
>- Drop additional IFF_SLAVE_RENAME_OK flag (Sridhar)
>- Send down and up events around rename (Michael S. Tsirkin)
>---
> net/core/dev.c | 37 ++++++++++++++++++++++++++++++++++---
> 1 file changed, 34 insertions(+), 3 deletions(-)
>
>diff --git a/net/core/dev.c b/net/core/dev.c
>index 722d50d..3e0cd80 100644
>--- a/net/core/dev.c
>+++ b/net/core/dev.c
>@@ -1171,6 +1171,7 @@ int dev_get_valid_name(struct net *net, struct net_device *dev,
> int dev_change_name(struct net_device *dev, const char *newname)
> {
> 	unsigned char old_assign_type;
>+	bool reopen_needed = false;
> 	char oldname[IFNAMSIZ];
> 	int err = 0;
> 	int ret;
>@@ -1180,8 +1181,24 @@ int dev_change_name(struct net_device *dev, const char *newname)
> 	BUG_ON(!dev_net(dev));
>
> 	net = dev_net(dev);
>-	if (dev->flags & IFF_UP)
>-		return -EBUSY;
>+
>+	/* Allow failover slave to rename even when
>+	 * it is up and running.
>+	 *
>+	 * Failover slaves are special, since userspace
>+	 * might rename the slave after the interface
>+	 * has been brought up and running due to
>+	 * auto-enslavement.
>+	 *
>+	 * Failover users don't actually care about slave
>+	 * name change, as they are only expected to operate
>+	 * on master interface directly.
>+	 */
>+	if (dev->flags & IFF_UP) {
>+		if (likely(!(dev->priv_flags & IFF_FAILOVER_SLAVE)))
>+			return -EBUSY;
>+		reopen_needed = true;
>+	}
>
> 	write_seqcount_begin(&devnet_rename_seq);
>
>@@ -1198,6 +1215,9 @@ int dev_change_name(struct net_device *dev, const char *newname)
> 		return err;
> 	}
>
>+	if (reopen_needed)
>+		dev_close(dev);

Ugh. Don't dev_close/dev_open on name change.

>+
> 	if (oldname[0] && !strchr(oldname, '%'))
> 		netdev_info(dev, "renamed from %s\n", oldname);
>
>@@ -1210,7 +1230,9 @@ int dev_change_name(struct net_device *dev, const char *newname)
> 		memcpy(dev->name, oldname, IFNAMSIZ);
> 		dev->name_assign_type = old_assign_type;
> 		write_seqcount_end(&devnet_rename_seq);
>-		return ret;
>+		if (err >= 0)
>+			err = ret;
>+		goto reopen;
> 	}
>
> 	write_seqcount_end(&devnet_rename_seq);
>@@ -1246,6 +1268,15 @@ int dev_change_name(struct net_device *dev, const char *newname)
> 		}
> 	}
>
>+reopen:
>+	if (reopen_needed) {
>+		ret = dev_open(dev);
>+		if (ret) {
>+			pr_err("%s: reopen device failed: %d\n",
>+			       dev->name, ret);
>+		}
>+	}
>+
> 	return err;
> }
>
>--
>1.8.3.1
>
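
For illustration only, the rename gate that the quoted patch changes can be modelled in plain userspace C. This is a rough sketch, not the kernel code: `struct fake_netdev`, `fake_change_name()` and the `FAKE_IFF_*` values are hypothetical stand-ins, and `IFNAMSIZ` is defined locally as 16 rather than taken from kernel headers. It keeps only the flag check (an up device is refused with -EBUSY unless it is a failover slave) and deliberately leaves out the dev_close()/dev_open() step objected to above.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define IFNAMSIZ 16

/* Stand-in flag values for this sketch; not the kernel's definitions. */
#define FAKE_IFF_UP             0x1u
#define FAKE_IFF_FAILOVER_SLAVE 0x2u

/* Hypothetical, heavily reduced model of a net_device. */
struct fake_netdev {
	char name[IFNAMSIZ];
	unsigned int flags;      /* models dev->flags */
	unsigned int priv_flags; /* models dev->priv_flags */
};

/*
 * Rename gate only: returns 0 on success, -EINVAL for an empty or
 * over-long name, and -EBUSY when the device is up and is not a
 * failover slave. The name is left untouched on any error.
 */
static int fake_change_name(struct fake_netdev *dev, const char *newname)
{
	size_t len = strlen(newname);

	if (len == 0 || len >= IFNAMSIZ)
		return -EINVAL;

	if ((dev->flags & FAKE_IFF_UP) &&
	    !(dev->priv_flags & FAKE_IFF_FAILOVER_SLAVE))
		return -EBUSY;

	memcpy(dev->name, newname, len + 1); /* copy including the NUL */
	return 0;
}
```

With this model, renaming an up device that is not a failover slave returns -EBUSY and keeps the old name, while an up failover slave can be renamed to something like "ens3nsby" without being closed and reopened.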