Subject: Re: [PATCH net v2] failover: allow name change on IFF_UP slave interfaces
To: "Samudrala, Sridhar", "Michael S. Tsirkin", Alexander Duyck, Stephen Hemminger, Jakub Kicinski, Jiri Pirko, David Miller, Netdev, virtualization@lists.linux-foundation.org
References: <1551928112-32109-1-git-send-email-si-wei.liu@oracle.com> <5131f60c-5910-01e4-936a-6d8f4e086dd3@intel.com>
Cc: liran.alon@oracle.com, boris.ostrovsky@oracle.com, vijay.balakrishna@oracle.com
From: si-wei liu
Organization: Oracle Corporation
Date: Wed, 6 Mar 2019 20:54:44 -0800
In-Reply-To: <5131f60c-5910-01e4-936a-6d8f4e086dd3@intel.com>
X-Mailing-List: netdev@vger.kernel.org

On 3/6/2019 8:13 PM, Samudrala, Sridhar wrote:
>
> On 3/6/2019 7:08 PM, Si-Wei Liu wrote:
>> When a netdev appears through hot plug and then gets enslaved by a
>> failover master that is already up and running, the slave will be
>> opened right away after getting enslaved. Today there's a race in
>> which userspace (udev) may fail to rename the slave if the kernel
>> (net_failover) opens the slave before the userspace rename happens.
>> Unlike bond or team, the primary slave of failover can't be renamed by
>> userspace ahead of time, since the kernel-initiated auto-enslavement is
>> unable to, or rather, is never meant to be synchronized with the rename
>> request from userspace.
>>
>> The failover slave interfaces are not designed to be operated
>> directly by userspace apps: IP configuration, filter rules with
>> regard to network traffic passing, etc. should all be done on the
>> master interface. In general, userspace apps only care about the
>> name of the master interface, while slave names are less important as
>> long as admin users can see reliable names that may carry
>> other information describing the netdev. For example, they can infer
>> that "ens3nsby" is a standby slave of "ens3", while for a
>> name like "eth0" they can't tell which master it belongs to.
>>
>> Historically, the name of an IFF_UP interface can't be changed because
>> there might be admin scripts or management software already
>> relying on such behavior and assuming that the slave name can't be
>> changed once UP. But failover is special: with the in-kernel
>> auto-enslavement mechanism, the userspace expectation for device
>> enumeration and bring-up order is already broken. Previously initramfs
>> and various userspace config tools were modified to bypass failover
>> slaves because of auto-enslavement and duplicate MAC address.
>> Similarly, in case users care about seeing reliable slave names, the
>> new type of failover slaves needs to be taken care of specifically in
>> userspace anyway.
>>
>> It's less risky to lift the rename restriction on a failover slave
>> that is already UP. Although this change may potentially break
>> userspace components (most likely configuration scripts or
>> management software) that assume the slave name can't be changed while
>> UP, that is a relatively limited and controllable set among all
>> userspace components, which can be fixed specifically to work with the
>> new naming behavior of failover slaves. Userspace components
>> interacting with slaves should be changed to operate on the failover
>> master instead, as the failover slave is dynamic in nature and may
>> come and go at any point. The goal is to make the role of failover
>> slaves less relevant, and all userspace should only deal with the
>> master in the long run.
>>
>> Fixes: 30c8bd5aa8b2 ("net: Introduce generic failover module")
>> Signed-off-by: Si-Wei Liu
>> Reviewed-by: Liran Alon
>> Acked-by: Michael S. Tsirkin
>>
>> ---
>> v1 -> v2:
>> - Drop configurable module parameter (Sridhar)
>>
>>
>>  include/linux/netdevice.h | 3 +++
>>  net/core/dev.c | 3 ++-
>>  net/core/failover.c | 6 +++---
>>  3 files changed, 8 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> index 857f8ab..6d9e4e0 100644
>> --- a/include/linux/netdevice.h
>> +++ b/include/linux/netdevice.h
>> @@ -1487,6 +1487,7 @@ struct net_device_ops {
>>   * @IFF_NO_RX_HANDLER: device doesn't support the rx_handler hook
>>   * @IFF_FAILOVER: device is a failover master device
>>   * @IFF_FAILOVER_SLAVE: device is lower dev of a failover master device
>> + * @IFF_SLAVE_RENAME_OK: rename is allowed while slave device is running
>>   */
>>  enum netdev_priv_flags {
>>  	IFF_802_1Q_VLAN = 1<<0,
>> @@ -1518,6 +1519,7 @@ enum netdev_priv_flags {
>>  	IFF_NO_RX_HANDLER = 1<<26,
>>  	IFF_FAILOVER = 1<<27,
>>  	IFF_FAILOVER_SLAVE = 1<<28,
>> +	IFF_SLAVE_RENAME_OK = 1<<29,
>>  };
>>  #define IFF_802_1Q_VLAN IFF_802_1Q_VLAN
>> @@ -1548,6 +1550,7 @@ enum netdev_priv_flags {
>>  #define IFF_NO_RX_HANDLER IFF_NO_RX_HANDLER
>>  #define IFF_FAILOVER IFF_FAILOVER
>>  #define IFF_FAILOVER_SLAVE IFF_FAILOVER_SLAVE
>> +#define IFF_SLAVE_RENAME_OK IFF_SLAVE_RENAME_OK
>>  /**
>>   * struct net_device - The DEVICE structure.
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 722d50d..ae070de 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -1180,7 +1180,8 @@ int dev_change_name(struct net_device *dev, const char *newname)
>>  	BUG_ON(!dev_net(dev));
>>  	net = dev_net(dev);
>> -	if (dev->flags & IFF_UP)
>> +	if (dev->flags & IFF_UP &&
>> +	    !(dev->priv_flags & IFF_SLAVE_RENAME_OK))
>>  		return -EBUSY;
>
> Without the configurable module parameter, I think we don't even need
> the new SLAVE_RENAME_OK private flag.
> Can't we simply check for IFF_FAILOVER_SLAVE?

I'd prefer to keep this flag for now, even without the configurable
module parameter.
It has clear semantics that help decouple the behavior from any
specific link type, and it may benefit other auto-enslaved netdevs as
well.

-Siwei

>
>>  	write_seqcount_begin(&devnet_rename_seq);
>> diff --git a/net/core/failover.c b/net/core/failover.c
>> index 4a92a98..34c5c87 100644
>> --- a/net/core/failover.c
>> +++ b/net/core/failover.c
>> @@ -80,14 +80,14 @@ static int failover_slave_register(struct net_device *slave_dev)
>>  		goto err_upper_link;
>>  	}
>> -	slave_dev->priv_flags |= IFF_FAILOVER_SLAVE;
>> +	slave_dev->priv_flags |= (IFF_FAILOVER_SLAVE | IFF_SLAVE_RENAME_OK);
>>  	if (fops && fops->slave_register &&
>>  	    !fops->slave_register(slave_dev, failover_dev))
>>  		return NOTIFY_OK;
>>  	netdev_upper_dev_unlink(slave_dev, failover_dev);
>> -	slave_dev->priv_flags &= ~IFF_FAILOVER_SLAVE;
>> +	slave_dev->priv_flags &= ~(IFF_FAILOVER_SLAVE | IFF_SLAVE_RENAME_OK);
>>  err_upper_link:
>>  	netdev_rx_handler_unregister(slave_dev);
>>  done:
>> @@ -121,7 +121,7 @@ int failover_slave_unregister(struct net_device *slave_dev)
>>  	netdev_rx_handler_unregister(slave_dev);
>>  	netdev_upper_dev_unlink(slave_dev, failover_dev);
>> -	slave_dev->priv_flags &= ~IFF_FAILOVER_SLAVE;
>> +	slave_dev->priv_flags &= ~(IFF_FAILOVER_SLAVE | IFF_SLAVE_RENAME_OK);
>>  	if (fops && fops->slave_unregister &&
>>  	    !fops->slave_unregister(slave_dev, failover_dev))
>>