Netdev List
 help / color / mirror / Atom feed
From: Jakub Kicinski <kuba@kernel.org>
To: davem@davemloft.net
Cc: netdev@vger.kernel.org, edumazet@google.com, pabeni@redhat.com,
	andrew+netdev@lunn.ch, horms@kernel.org,
	michael.chan@broadcom.com, hkallweit1@gmail.com,
	maxime.chevallier@bootlin.com, joshwash@google.com,
	tariqt@nvidia.com, alexanderduyck@fb.com, willemb@google.com,
	jacob.e.keller@intel.com, kory.maincent@bootlin.com,
	sdf.kernel@gmail.com, jakub@cloudflare.com, nb@tipi-net.de,
	Jakub Kicinski <kuba@kernel.org>
Subject: [PATCH net-next v2 11/12] net: ethtool: optionally skip rtnl_lock on IOCTL path
Date: Thu,  4 Jun 2026 17:29:11 -0700	[thread overview]
Message-ID: <20260605002912.3456868-12-kuba@kernel.org> (raw)
In-Reply-To: <20260605002912.3456868-1-kuba@kernel.org>

Convert the IOCTL path similarly to how we converted Netlink.
The device lookup gets a little hairy. We could take rtnl_lock
unconditionally and drop it before calling the driver (this would
avoid the reference + liveness check). But I think being able
to make progress even if rtnl is dead-locked is quite useful.

First extra concern is handling features. List all the cmds which
modify features and always take rtnl_lock. We could fold this list
into ethtool_ioctl_needs_rtnl() but seems cleaner to keep
ethtool_ioctl_needs_rtnl() driver-related. If a driver changed
features and we were not holding rtnl_lock - warn about it.
It can only happen on buggy ops locked drivers (buggy because
they should have set appropriate "I need rtnl for op X" bit).

Second wrinkle is the PHY ID hack which drops the locks while
sleeping. Convert its static "busy" variable which used to
be protected by rtnl_lock to a field in struct ethtool_netdev_state.
This feature is about identifying an adapter or a port within
a system, so being able to blink multiple LEDs at the same
time is likely not very useful in practice. But it's the simplest
fix, we can add a mutex if someone thinks a system should only
be ID'ing one port at a time.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 include/linux/ethtool.h |   2 +
 net/ethtool/ioctl.c     | 106 ++++++++++++++++++++++++++++++----------
 2 files changed, 82 insertions(+), 26 deletions(-)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 74c8109b0cf3..da29017c757f 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -1375,6 +1375,7 @@ int ethtool_virtdev_set_link_ksettings(struct net_device *dev,
  *			within RTNL.
  * @rss_indir_user_size: Number of user provided entries for the default
  *			 (context 0) indirection table.
+ * @phys_id_busy:	Loop blinking the device LED is running.
  * @wol_enabled:	Wake-on-LAN is enabled
  * @module_fw_flash_in_progress: Module firmware flashing is in progress.
  */
@@ -1382,6 +1383,7 @@ struct ethtool_netdev_state {
 	struct xarray		rss_ctx;
 	struct mutex		rss_lock;
 	u32			rss_indir_user_size;
+	unsigned		phys_id_busy:1;
 	unsigned		wol_enabled:1;
 	unsigned		module_fw_flash_in_progress:1;
 };
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index c32c099845e3..a7bff829b758 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -544,7 +544,7 @@ static int ethtool_get_link_ksettings(struct net_device *dev,
 	int err = 0;
 	struct ethtool_link_ksettings link_ksettings;
 
-	ASSERT_RTNL();
+	netdev_assert_locked_ops_compat(dev);
 	if (!dev->ethtool_ops->get_link_ksettings)
 		return -EOPNOTSUPP;
 
@@ -601,7 +601,7 @@ static int ethtool_set_link_ksettings(struct net_device *dev,
 	struct ethtool_link_ksettings link_ksettings = {};
 	int err;
 
-	ASSERT_RTNL();
+	netdev_assert_locked_ops_compat(dev);
 
 	if (!dev->ethtool_ops->set_link_ksettings)
 		return -EOPNOTSUPP;
@@ -675,7 +675,7 @@ static int ethtool_get_settings(struct net_device *dev, void __user *useraddr)
 	struct ethtool_cmd cmd;
 	int err;
 
-	ASSERT_RTNL();
+	netdev_assert_locked_ops_compat(dev);
 	if (!dev->ethtool_ops->get_link_ksettings)
 		return -EOPNOTSUPP;
 
@@ -711,7 +711,7 @@ static int ethtool_set_settings(struct net_device *dev, void __user *useraddr)
 	struct ethtool_cmd cmd;
 	int ret;
 
-	ASSERT_RTNL();
+	netdev_assert_locked_ops_compat(dev);
 
 	if (copy_from_user(&cmd, useraddr, sizeof(cmd)))
 		return -EFAULT;
@@ -2452,10 +2452,10 @@ void ethtool_puts(u8 **data, const char *str)
 }
 EXPORT_SYMBOL(ethtool_puts);
 
-static int ethtool_phys_id(struct net_device *dev, void __user *useraddr)
+static int ethtool_phys_id(struct net_device *dev, void __user *useraddr,
+			   bool has_rtnl_lock)
 {
 	struct ethtool_value id;
-	static bool busy;
 	const struct ethtool_ops *ops = dev->ethtool_ops;
 	netdevice_tracker dev_tracker;
 	int rc;
@@ -2463,7 +2463,7 @@ static int ethtool_phys_id(struct net_device *dev, void __user *useraddr)
 	if (!ops->set_phys_id)
 		return -EOPNOTSUPP;
 
-	if (busy)
+	if (dev->ethtool->phys_id_busy)
 		return -EBUSY;
 
 	if (copy_from_user(&id, useraddr, sizeof(id)))
@@ -2473,13 +2473,14 @@ static int ethtool_phys_id(struct net_device *dev, void __user *useraddr)
 	if (rc < 0)
 		return rc;
 
-	/* Drop the RTNL lock while waiting, but prevent reentry or
+	/* Drop the locks while waiting, but prevent reentry or
 	 * removal of the device.
 	 */
-	busy = true;
+	dev->ethtool->phys_id_busy = true;
 	netdev_hold(dev, &dev_tracker, GFP_KERNEL);
 	netdev_unlock_ops(dev);
-	rtnl_unlock();
+	if (has_rtnl_lock)
+		rtnl_unlock();
 
 	if (rc == 0) {
 		/* Driver will handle this itself */
@@ -2492,22 +2493,25 @@ static int ethtool_phys_id(struct net_device *dev, void __user *useraddr)
 		u64 i = 0;
 
 		do {
-			rtnl_lock();
+			if (has_rtnl_lock)
+				rtnl_lock();
 			netdev_lock_ops(dev);
 			rc = ops->set_phys_id(dev,
 				    (i++ & 1) ? ETHTOOL_ID_OFF : ETHTOOL_ID_ON);
 			netdev_unlock_ops(dev);
-			rtnl_unlock();
+			if (has_rtnl_lock)
+				rtnl_unlock();
 			if (rc)
 				break;
 			schedule_timeout_interruptible(interval);
 		} while (!signal_pending(current) && (!id.data || i < count));
 	}
 
-	rtnl_lock();
+	if (has_rtnl_lock)
+		rtnl_lock();
 	netdev_lock_ops(dev);
 	netdev_put(dev, &dev_tracker);
-	busy = false;
+	dev->ethtool->phys_id_busy = false;
 
 	(void) ops->set_phys_id(dev, ETHTOOL_ID_INACTIVE);
 	return rc;
@@ -3260,7 +3264,8 @@ static int ethtool_set_fecparam(struct net_device *dev, void __user *useraddr)
 static int
 dev_ethtool_locked(struct net *net, struct net_device *dev,
 		   void __user *useraddr,
-		   u32 ethcmd, struct ethtool_devlink_compat *devlink_state)
+		   u32 ethcmd, struct ethtool_devlink_compat *devlink_state,
+		   bool has_rtnl_lock)
 {
 	u32 sub_cmd;
 	int rc;
@@ -3316,6 +3321,8 @@ dev_ethtool_locked(struct net *net, struct net_device *dev,
 			return -EPERM;
 	}
 
+	netdev_assert_locked_ops_compat(dev);
+
 	if (dev->dev.parent)
 		pm_runtime_get_sync(dev->dev.parent);
 
@@ -3403,7 +3410,7 @@ dev_ethtool_locked(struct net *net, struct net_device *dev,
 		rc = ethtool_get_strings(dev, useraddr);
 		break;
 	case ETHTOOL_PHYS_ID:
-		rc = ethtool_phys_id(dev, useraddr);
+		rc = ethtool_phys_id(dev, useraddr, has_rtnl_lock);
 		break;
 	case ETHTOOL_GSTATS:
 		rc = ethtool_get_stats(dev, useraddr);
@@ -3550,8 +3557,20 @@ dev_ethtool_locked(struct net *net, struct net_device *dev,
 	if (dev->ethtool_ops->complete)
 		dev->ethtool_ops->complete(dev);
 
-	if (old_features != dev->features)
-		netdev_features_change(dev);
+	switch (ethcmd) {
+	case ETHTOOL_PHYS_ID:
+		/* Don't check features if operation drops the locks.
+		 * Someone else may have changed features in parallel.
+		 */
+		break;
+	default:
+		if (old_features != dev->features) {
+			if (has_rtnl_lock)
+				netdev_features_change(dev);
+			else
+				netdev_WARN(dev, "ethtool cmd %u changed features without rtnl_lock", ethcmd);
+		}
+	}
 out:
 	if (dev->dev.parent)
 		pm_runtime_put(dev->dev.parent);
@@ -3559,25 +3578,60 @@ dev_ethtool_locked(struct net *net, struct net_device *dev,
 	return rc;
 }
 
+/* Commands that may toggle dev->features in net/ethtool/ioctl.c and so
+ * call into __netdev_update_features(), which still requires rtnl_lock.
+ * Driver-decided SET commands that may chain into rtnl-only helpers are
+ * covered by ethtool_ioctl_needs_rtnl()/ETHTOOL_OP_NEEDS_RTNL_*.
+ */
+static bool ethtool_cmd_changes_features(u32 ethcmd)
+{
+	switch (ethcmd) {
+	case ETHTOOL_SFEATURES:
+	case ETHTOOL_SFLAGS:
+	case ETHTOOL_STXCSUM:
+	case ETHTOOL_SRXCSUM:
+	case ETHTOOL_SSG:
+	case ETHTOOL_STSO:
+	case ETHTOOL_SGSO:
+	case ETHTOOL_SGRO:
+		return true;
+	}
+	return false;
+}
+
 static int
 __dev_ethtool(struct net *net, struct ifreq *ifr, void __user *useraddr,
 	      u32 ethcmd, struct ethtool_devlink_compat *devlink_state)
 {
+	netdevice_tracker dev_tracker;
 	struct net_device *dev;
+	bool need_rtnl;
 	int rc;
 
-	rtnl_lock();
-	dev = __dev_get_by_name(net, ifr->ifr_name);
-	if (!dev) {
+	dev = netdev_get_by_name(net, ifr->ifr_name, &dev_tracker, GFP_KERNEL);
+	if (!dev)
+		return -ENODEV;
+
+	need_rtnl = !netdev_need_ops_lock(dev) ||
+		    ethtool_cmd_changes_features(ethcmd) ||
+		    ethtool_ioctl_needs_rtnl(dev, ethcmd);
+	if (need_rtnl)
+		rtnl_lock();
+	netdev_lock_ops(dev);
+	if (dev->reg_state > NETREG_REGISTERED ||
+	    dev->moving_ns || !net_eq(dev_net(dev), net)) {
 		rc = -ENODEV;
-		goto exit_rtnl_unlock;
+		goto exit_ops_unlock;
 	}
 
-	netdev_lock_ops(dev);
-	rc = dev_ethtool_locked(net, dev, useraddr, ethcmd, devlink_state);
+	rc = dev_ethtool_locked(net, dev, useraddr, ethcmd, devlink_state,
+				need_rtnl);
+
+exit_ops_unlock:
 	netdev_unlock_ops(dev);
-exit_rtnl_unlock:
-	rtnl_unlock();
+	if (need_rtnl)
+		rtnl_unlock();
+	netdev_put(dev, &dev_tracker);
 
 	return rc;
 }
-- 
2.54.0


  parent reply	other threads:[~2026-06-05  0:29 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-05  0:29 [PATCH net-next v2 00/12] net: ethtool: let ops locked drivers run without rtnl_lock Jakub Kicinski
2026-06-05  0:29 ` [PATCH net-next v2 01/12] net: ethtool: serialize broadcast notification sequence allocation Jakub Kicinski
2026-06-05  0:29 ` [PATCH net-next v2 02/12] net: ethtool: relax ethnl_req_get_phydev() locking assertion Jakub Kicinski
2026-06-05  0:29 ` [PATCH net-next v2 03/12] net: ethtool: make dev->hwprov ops-protected Jakub Kicinski
2026-06-05  0:29 ` [PATCH net-next v2 04/12] net: ethtool: optionally skip rtnl_lock on Netlink path for GET ops Jakub Kicinski
2026-06-05  0:29 ` [PATCH net-next v2 05/12] net: ethtool: optionally skip rtnl_lock on Netlink path for SET ops Jakub Kicinski
2026-06-05  0:29 ` [PATCH net-next v2 06/12] net: ethtool: optionally skip rtnl_lock in cable test handlers Jakub Kicinski
2026-06-05  0:29 ` [PATCH net-next v2 07/12] net: ethtool: optionally skip rtnl_lock in ethnl_tsinfo_dumpit() Jakub Kicinski
2026-06-05  0:29 ` [PATCH net-next v2 08/12] net: ethtool: optionally skip rtnl_lock in ethnl_act_module_fw_flash() Jakub Kicinski
2026-06-05  0:29 ` [PATCH net-next v2 09/12] net: ethtool: optionally skip rtnl_lock in RSS context handlers Jakub Kicinski
2026-06-05  0:29 ` [PATCH net-next v2 10/12] net: ethtool: ioctl: concentrate the locking Jakub Kicinski
2026-06-05  0:29 ` Jakub Kicinski [this message]
2026-06-05  0:29 ` [PATCH net-next v2 12/12] docs: net: ethtool: document ops-locked drivers and op_needs_rtnl Jakub Kicinski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260605002912.3456868-12-kuba@kernel.org \
    --to=kuba@kernel.org \
    --cc=alexanderduyck@fb.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=hkallweit1@gmail.com \
    --cc=horms@kernel.org \
    --cc=jacob.e.keller@intel.com \
    --cc=jakub@cloudflare.com \
    --cc=joshwash@google.com \
    --cc=kory.maincent@bootlin.com \
    --cc=maxime.chevallier@bootlin.com \
    --cc=michael.chan@broadcom.com \
    --cc=nb@tipi-net.de \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sdf.kernel@gmail.com \
    --cc=tariqt@nvidia.com \
    --cc=willemb@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox