netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next v2 00/11] net: use netdev->lock to protect NAPI
@ 2025-01-15  3:53 Jakub Kicinski
  2025-01-15  3:53 ` [PATCH net-next v2 01/11] net: add netdev_lock() / netdev_unlock() helpers Jakub Kicinski
                   ` (11 more replies)
  0 siblings, 12 replies; 34+ messages in thread
From: Jakub Kicinski @ 2025-01-15  3:53 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
	Jakub Kicinski

We recently added a lock member to struct net_device, with a vague
plan to start using it to protect netdev-local state, removing
the need to take rtnl_lock for new configuration APIs.

Lay some groundwork and use this lock for protecting NAPI APIs.

v2:
 - reorder patches 2 and 3
 - add missing READ_ONCE()
 - fix up the kdoc to please Sphinx / htmldocs
 - use napi_disabled_locked() in via-velocity
 - update the comment on dev_isalive()
v1: https://lore.kernel.org/20250114035118.110297-1-kuba@kernel.org

Jakub Kicinski (11):
  net: add netdev_lock() / netdev_unlock() helpers
  net: make netdev_lock() protect netdev->reg_state
  net: add helpers for lookup and walking netdevs under netdev_lock()
  net: add netdev->up protected by netdev_lock()
  net: protect netdev->napi_list with netdev_lock()
  net: protect NAPI enablement with netdev_lock()
  net: make netdev netlink ops hold netdev_lock()
  net: protect threaded status of NAPI with netdev_lock()
  net: protect napi->irq with netdev_lock()
  net: protect NAPI config fields with netdev_lock()
  netdev-genl: remove rtnl_lock protection from NAPI ops

 include/linux/netdevice.h                   | 118 +++++++++++--
 net/core/dev.h                              |  29 +++-
 drivers/net/ethernet/amd/pcnet32.c          |  11 +-
 drivers/net/ethernet/intel/iavf/iavf_main.c |  84 ++++-----
 drivers/net/ethernet/marvell/mvneta.c       |   5 +-
 drivers/net/ethernet/via/via-velocity.c     |   6 +-
 drivers/net/netdevsim/ethtool.c             |   4 +-
 net/core/dev.c                              | 183 ++++++++++++++++++--
 net/core/net-sysfs.c                        |  39 ++++-
 net/core/netdev-genl.c                      |  56 +++---
 net/shaper/shaper.c                         |   6 +-
 11 files changed, 420 insertions(+), 121 deletions(-)

-- 
2.48.0


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH net-next v2 01/11] net: add netdev_lock() / netdev_unlock() helpers
  2025-01-15  3:53 [PATCH net-next v2 00/11] net: use netdev->lock to protect NAPI Jakub Kicinski
@ 2025-01-15  3:53 ` Jakub Kicinski
  2025-01-15  8:21   ` Kuniyuki Iwashima
  2025-01-15  8:36   ` Przemek Kitszel
  2025-01-15  3:53 ` [PATCH net-next v2 02/11] net: make netdev_lock() protect netdev->reg_state Jakub Kicinski
                   ` (10 subsequent siblings)
  11 siblings, 2 replies; 34+ messages in thread
From: Jakub Kicinski @ 2025-01-15  3:53 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
	Jakub Kicinski, anthony.l.nguyen, przemyslaw.kitszel, jiri

Add helpers for locking the netdev instance, use it in drivers
and the shaper code. This will make grepping for the lock usage
much easier, as we extend the lock to cover more fields.

Reviewed-by: Joe Damato <jdamato@fastly.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
---
CC: anthony.l.nguyen@intel.com
CC: przemyslaw.kitszel@intel.com
CC: jiri@resnulli.us
---
 include/linux/netdevice.h                   | 23 ++++++-
 drivers/net/ethernet/intel/iavf/iavf_main.c | 74 ++++++++++-----------
 drivers/net/netdevsim/ethtool.c             |  4 +-
 net/shaper/shaper.c                         |  6 +-
 4 files changed, 63 insertions(+), 44 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index bced03fb349e..891c5bdb894c 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2444,8 +2444,12 @@ struct net_device {
 	u32			napi_defer_hard_irqs;
 
 	/**
-	 * @lock: protects @net_shaper_hierarchy, feel free to use for other
-	 * netdev-scope protection. Ordering: take after rtnl_lock.
+	 * @lock: netdev-scope lock, protects a small selection of fields.
+	 * Should always be taken using netdev_lock() / netdev_unlock() helpers.
+	 * Drivers are free to use it for other protection.
+	 *
+	 * Protects: @net_shaper_hierarchy.
+	 * Ordering: take after rtnl_lock.
 	 */
 	struct mutex		lock;
 
@@ -2671,6 +2675,21 @@ void netif_queue_set_napi(struct net_device *dev, unsigned int queue_index,
 			  enum netdev_queue_type type,
 			  struct napi_struct *napi);
 
+static inline void netdev_lock(struct net_device *dev)
+{
+	mutex_lock(&dev->lock);
+}
+
+static inline void netdev_unlock(struct net_device *dev)
+{
+	mutex_unlock(&dev->lock);
+}
+
+static inline void netdev_assert_locked(struct net_device *dev)
+{
+	lockdep_assert_held(&dev->lock);
+}
+
 static inline void netif_napi_set_irq(struct napi_struct *napi, int irq)
 {
 	napi->irq = irq;
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index 7740f446c73f..ab908d620285 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -1977,7 +1977,7 @@ static void iavf_finish_config(struct work_struct *work)
 	 * The dev->lock is needed to update the queue number
 	 */
 	rtnl_lock();
-	mutex_lock(&adapter->netdev->lock);
+	netdev_lock(adapter->netdev);
 	mutex_lock(&adapter->crit_lock);
 
 	if ((adapter->flags & IAVF_FLAG_SETUP_NETDEV_FEATURES) &&
@@ -1997,7 +1997,7 @@ static void iavf_finish_config(struct work_struct *work)
 		netif_set_real_num_tx_queues(adapter->netdev, pairs);
 
 		if (adapter->netdev->reg_state != NETREG_REGISTERED) {
-			mutex_unlock(&adapter->netdev->lock);
+			netdev_unlock(adapter->netdev);
 			netdev_released = true;
 			err = register_netdevice(adapter->netdev);
 			if (err) {
@@ -2027,7 +2027,7 @@ static void iavf_finish_config(struct work_struct *work)
 out:
 	mutex_unlock(&adapter->crit_lock);
 	if (!netdev_released)
-		mutex_unlock(&adapter->netdev->lock);
+		netdev_unlock(adapter->netdev);
 	rtnl_unlock();
 }
 
@@ -2724,10 +2724,10 @@ static void iavf_watchdog_task(struct work_struct *work)
 	struct iavf_hw *hw = &adapter->hw;
 	u32 reg_val;
 
-	mutex_lock(&netdev->lock);
+	netdev_lock(netdev);
 	if (!mutex_trylock(&adapter->crit_lock)) {
 		if (adapter->state == __IAVF_REMOVE) {
-			mutex_unlock(&netdev->lock);
+			netdev_unlock(netdev);
 			return;
 		}
 
@@ -2741,35 +2741,35 @@ static void iavf_watchdog_task(struct work_struct *work)
 	case __IAVF_STARTUP:
 		iavf_startup(adapter);
 		mutex_unlock(&adapter->crit_lock);
-		mutex_unlock(&netdev->lock);
+		netdev_unlock(netdev);
 		queue_delayed_work(adapter->wq, &adapter->watchdog_task,
 				   msecs_to_jiffies(30));
 		return;
 	case __IAVF_INIT_VERSION_CHECK:
 		iavf_init_version_check(adapter);
 		mutex_unlock(&adapter->crit_lock);
-		mutex_unlock(&netdev->lock);
+		netdev_unlock(netdev);
 		queue_delayed_work(adapter->wq, &adapter->watchdog_task,
 				   msecs_to_jiffies(30));
 		return;
 	case __IAVF_INIT_GET_RESOURCES:
 		iavf_init_get_resources(adapter);
 		mutex_unlock(&adapter->crit_lock);
-		mutex_unlock(&netdev->lock);
+		netdev_unlock(netdev);
 		queue_delayed_work(adapter->wq, &adapter->watchdog_task,
 				   msecs_to_jiffies(1));
 		return;
 	case __IAVF_INIT_EXTENDED_CAPS:
 		iavf_init_process_extended_caps(adapter);
 		mutex_unlock(&adapter->crit_lock);
-		mutex_unlock(&netdev->lock);
+		netdev_unlock(netdev);
 		queue_delayed_work(adapter->wq, &adapter->watchdog_task,
 				   msecs_to_jiffies(1));
 		return;
 	case __IAVF_INIT_CONFIG_ADAPTER:
 		iavf_init_config_adapter(adapter);
 		mutex_unlock(&adapter->crit_lock);
-		mutex_unlock(&netdev->lock);
+		netdev_unlock(netdev);
 		queue_delayed_work(adapter->wq, &adapter->watchdog_task,
 				   msecs_to_jiffies(1));
 		return;
@@ -2781,7 +2781,7 @@ static void iavf_watchdog_task(struct work_struct *work)
 			 * as it can loop forever
 			 */
 			mutex_unlock(&adapter->crit_lock);
-			mutex_unlock(&netdev->lock);
+			netdev_unlock(netdev);
 			return;
 		}
 		if (++adapter->aq_wait_count > IAVF_AQ_MAX_ERR) {
@@ -2790,7 +2790,7 @@ static void iavf_watchdog_task(struct work_struct *work)
 			adapter->flags |= IAVF_FLAG_PF_COMMS_FAILED;
 			iavf_shutdown_adminq(hw);
 			mutex_unlock(&adapter->crit_lock);
-			mutex_unlock(&netdev->lock);
+			netdev_unlock(netdev);
 			queue_delayed_work(adapter->wq,
 					   &adapter->watchdog_task, (5 * HZ));
 			return;
@@ -2798,7 +2798,7 @@ static void iavf_watchdog_task(struct work_struct *work)
 		/* Try again from failed step*/
 		iavf_change_state(adapter, adapter->last_state);
 		mutex_unlock(&adapter->crit_lock);
-		mutex_unlock(&netdev->lock);
+		netdev_unlock(netdev);
 		queue_delayed_work(adapter->wq, &adapter->watchdog_task, HZ);
 		return;
 	case __IAVF_COMM_FAILED:
@@ -2811,7 +2811,7 @@ static void iavf_watchdog_task(struct work_struct *work)
 			iavf_change_state(adapter, __IAVF_INIT_FAILED);
 			adapter->flags &= ~IAVF_FLAG_PF_COMMS_FAILED;
 			mutex_unlock(&adapter->crit_lock);
-			mutex_unlock(&netdev->lock);
+			netdev_unlock(netdev);
 			return;
 		}
 		reg_val = rd32(hw, IAVF_VFGEN_RSTAT) &
@@ -2831,14 +2831,14 @@ static void iavf_watchdog_task(struct work_struct *work)
 		adapter->aq_required = 0;
 		adapter->current_op = VIRTCHNL_OP_UNKNOWN;
 		mutex_unlock(&adapter->crit_lock);
-		mutex_unlock(&netdev->lock);
+		netdev_unlock(netdev);
 		queue_delayed_work(adapter->wq,
 				   &adapter->watchdog_task,
 				   msecs_to_jiffies(10));
 		return;
 	case __IAVF_RESETTING:
 		mutex_unlock(&adapter->crit_lock);
-		mutex_unlock(&netdev->lock);
+		netdev_unlock(netdev);
 		queue_delayed_work(adapter->wq, &adapter->watchdog_task,
 				   HZ * 2);
 		return;
@@ -2869,7 +2869,7 @@ static void iavf_watchdog_task(struct work_struct *work)
 	case __IAVF_REMOVE:
 	default:
 		mutex_unlock(&adapter->crit_lock);
-		mutex_unlock(&netdev->lock);
+		netdev_unlock(netdev);
 		return;
 	}
 
@@ -2881,14 +2881,14 @@ static void iavf_watchdog_task(struct work_struct *work)
 		dev_err(&adapter->pdev->dev, "Hardware reset detected\n");
 		iavf_schedule_reset(adapter, IAVF_FLAG_RESET_PENDING);
 		mutex_unlock(&adapter->crit_lock);
-		mutex_unlock(&netdev->lock);
+		netdev_unlock(netdev);
 		queue_delayed_work(adapter->wq,
 				   &adapter->watchdog_task, HZ * 2);
 		return;
 	}
 
 	mutex_unlock(&adapter->crit_lock);
-	mutex_unlock(&netdev->lock);
+	netdev_unlock(netdev);
 restart_watchdog:
 	if (adapter->state >= __IAVF_DOWN)
 		queue_work(adapter->wq, &adapter->adminq_task);
@@ -3015,12 +3015,12 @@ static void iavf_reset_task(struct work_struct *work)
 	/* When device is being removed it doesn't make sense to run the reset
 	 * task, just return in such a case.
 	 */
-	mutex_lock(&netdev->lock);
+	netdev_lock(netdev);
 	if (!mutex_trylock(&adapter->crit_lock)) {
 		if (adapter->state != __IAVF_REMOVE)
 			queue_work(adapter->wq, &adapter->reset_task);
 
-		mutex_unlock(&netdev->lock);
+		netdev_unlock(netdev);
 		return;
 	}
 
@@ -3068,7 +3068,7 @@ static void iavf_reset_task(struct work_struct *work)
 			reg_val);
 		iavf_disable_vf(adapter);
 		mutex_unlock(&adapter->crit_lock);
-		mutex_unlock(&netdev->lock);
+		netdev_unlock(netdev);
 		return; /* Do not attempt to reinit. It's dead, Jim. */
 	}
 
@@ -3209,7 +3209,7 @@ static void iavf_reset_task(struct work_struct *work)
 
 	wake_up(&adapter->reset_waitqueue);
 	mutex_unlock(&adapter->crit_lock);
-	mutex_unlock(&netdev->lock);
+	netdev_unlock(netdev);
 
 	return;
 reset_err:
@@ -3220,7 +3220,7 @@ static void iavf_reset_task(struct work_struct *work)
 	iavf_disable_vf(adapter);
 
 	mutex_unlock(&adapter->crit_lock);
-	mutex_unlock(&netdev->lock);
+	netdev_unlock(netdev);
 	dev_err(&adapter->pdev->dev, "failed to allocate resources during reinit\n");
 }
 
@@ -3692,10 +3692,10 @@ static int __iavf_setup_tc(struct net_device *netdev, void *type_data)
 	if (test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section))
 		return 0;
 
-	mutex_lock(&netdev->lock);
+	netdev_lock(netdev);
 	netif_set_real_num_rx_queues(netdev, total_qps);
 	netif_set_real_num_tx_queues(netdev, total_qps);
-	mutex_unlock(&netdev->lock);
+	netdev_unlock(netdev);
 
 	return ret;
 }
@@ -4365,7 +4365,7 @@ static int iavf_open(struct net_device *netdev)
 		return -EIO;
 	}
 
-	mutex_lock(&netdev->lock);
+	netdev_lock(netdev);
 	while (!mutex_trylock(&adapter->crit_lock)) {
 		/* If we are in __IAVF_INIT_CONFIG_ADAPTER state the crit_lock
 		 * is already taken and iavf_open is called from an upper
@@ -4373,7 +4373,7 @@ static int iavf_open(struct net_device *netdev)
 		 * We have to leave here to avoid dead lock.
 		 */
 		if (adapter->state == __IAVF_INIT_CONFIG_ADAPTER) {
-			mutex_unlock(&netdev->lock);
+			netdev_unlock(netdev);
 			return -EBUSY;
 		}
 
@@ -4424,7 +4424,7 @@ static int iavf_open(struct net_device *netdev)
 	iavf_irq_enable(adapter, true);
 
 	mutex_unlock(&adapter->crit_lock);
-	mutex_unlock(&netdev->lock);
+	netdev_unlock(netdev);
 
 	return 0;
 
@@ -4437,7 +4437,7 @@ static int iavf_open(struct net_device *netdev)
 	iavf_free_all_tx_resources(adapter);
 err_unlock:
 	mutex_unlock(&adapter->crit_lock);
-	mutex_unlock(&netdev->lock);
+	netdev_unlock(netdev);
 
 	return err;
 }
@@ -4459,12 +4459,12 @@ static int iavf_close(struct net_device *netdev)
 	u64 aq_to_restore;
 	int status;
 
-	mutex_lock(&netdev->lock);
+	netdev_lock(netdev);
 	mutex_lock(&adapter->crit_lock);
 
 	if (adapter->state <= __IAVF_DOWN_PENDING) {
 		mutex_unlock(&adapter->crit_lock);
-		mutex_unlock(&netdev->lock);
+		netdev_unlock(netdev);
 		return 0;
 	}
 
@@ -4498,7 +4498,7 @@ static int iavf_close(struct net_device *netdev)
 	iavf_free_traffic_irqs(adapter);
 
 	mutex_unlock(&adapter->crit_lock);
-	mutex_unlock(&netdev->lock);
+	netdev_unlock(netdev);
 
 	/* We explicitly don't free resources here because the hardware is
 	 * still active and can DMA into memory. Resources are cleared in
@@ -5375,7 +5375,7 @@ static int iavf_suspend(struct device *dev_d)
 
 	netif_device_detach(netdev);
 
-	mutex_lock(&netdev->lock);
+	netdev_lock(netdev);
 	mutex_lock(&adapter->crit_lock);
 
 	if (netif_running(netdev)) {
@@ -5387,7 +5387,7 @@ static int iavf_suspend(struct device *dev_d)
 	iavf_reset_interrupt_capability(adapter);
 
 	mutex_unlock(&adapter->crit_lock);
-	mutex_unlock(&netdev->lock);
+	netdev_unlock(netdev);
 
 	return 0;
 }
@@ -5486,7 +5486,7 @@ static void iavf_remove(struct pci_dev *pdev)
 	if (netdev->reg_state == NETREG_REGISTERED)
 		unregister_netdev(netdev);
 
-	mutex_lock(&netdev->lock);
+	netdev_lock(netdev);
 	mutex_lock(&adapter->crit_lock);
 	dev_info(&adapter->pdev->dev, "Removing device\n");
 	iavf_change_state(adapter, __IAVF_REMOVE);
@@ -5523,7 +5523,7 @@ static void iavf_remove(struct pci_dev *pdev)
 	mutex_destroy(&hw->aq.asq_mutex);
 	mutex_unlock(&adapter->crit_lock);
 	mutex_destroy(&adapter->crit_lock);
-	mutex_unlock(&netdev->lock);
+	netdev_unlock(netdev);
 
 	iounmap(hw->hw_addr);
 	pci_release_regions(pdev);
diff --git a/drivers/net/netdevsim/ethtool.c b/drivers/net/netdevsim/ethtool.c
index 5fe1eaef99b5..3f44a11aec83 100644
--- a/drivers/net/netdevsim/ethtool.c
+++ b/drivers/net/netdevsim/ethtool.c
@@ -103,10 +103,10 @@ nsim_set_channels(struct net_device *dev, struct ethtool_channels *ch)
 	struct netdevsim *ns = netdev_priv(dev);
 	int err;
 
-	mutex_lock(&dev->lock);
+	netdev_lock(dev);
 	err = netif_set_real_num_queues(dev, ch->combined_count,
 					ch->combined_count);
-	mutex_unlock(&dev->lock);
+	netdev_unlock(dev);
 	if (err)
 		return err;
 
diff --git a/net/shaper/shaper.c b/net/shaper/shaper.c
index 15463062fe7b..7101a48bce54 100644
--- a/net/shaper/shaper.c
+++ b/net/shaper/shaper.c
@@ -40,7 +40,7 @@ static void net_shaper_lock(struct net_shaper_binding *binding)
 {
 	switch (binding->type) {
 	case NET_SHAPER_BINDING_TYPE_NETDEV:
-		mutex_lock(&binding->netdev->lock);
+		netdev_lock(binding->netdev);
 		break;
 	}
 }
@@ -49,7 +49,7 @@ static void net_shaper_unlock(struct net_shaper_binding *binding)
 {
 	switch (binding->type) {
 	case NET_SHAPER_BINDING_TYPE_NETDEV:
-		mutex_unlock(&binding->netdev->lock);
+		netdev_unlock(binding->netdev);
 		break;
 	}
 }
@@ -1398,7 +1398,7 @@ void net_shaper_set_real_num_tx_queues(struct net_device *dev,
 	/* Only drivers implementing shapers support ensure
 	 * the lock is acquired in advance.
 	 */
-	lockdep_assert_held(&dev->lock);
+	netdev_assert_locked(dev);
 
 	/* Take action only when decreasing the tx queue number. */
 	for (i = txq; i < dev->real_num_tx_queues; ++i) {
-- 
2.48.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH net-next v2 02/11] net: make netdev_lock() protect netdev->reg_state
  2025-01-15  3:53 [PATCH net-next v2 00/11] net: use netdev->lock to protect NAPI Jakub Kicinski
  2025-01-15  3:53 ` [PATCH net-next v2 01/11] net: add netdev_lock() / netdev_unlock() helpers Jakub Kicinski
@ 2025-01-15  3:53 ` Jakub Kicinski
  2025-01-15  8:30   ` Kuniyuki Iwashima
  2025-01-15  3:53 ` [PATCH net-next v2 03/11] net: add helpers for lookup and walking netdevs under netdev_lock() Jakub Kicinski
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 34+ messages in thread
From: Jakub Kicinski @ 2025-01-15  3:53 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
	Jakub Kicinski

Protect writes to netdev->reg_state with netdev_lock().
From now on holding netdev_lock() is sufficient to prevent
the net_device from getting unregistered, so code which
wants to hold just a single netdev around no longer needs
to hold rtnl_lock.

We do not protect the NETREG_UNREGISTERED -> NETREG_RELEASED
transition. We'd need to move mutex_destroy(netdev->lock)
to .release, but the real reason is that trying to stop
the unregistration process mid-way would be unsafe / crazy.
Taking references on such devices is not safe, either.
So the intended semantics are to lock REGISTERED devices.

Reviewed-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
v2:
 - reorder with next patch
v1: https://lore.kernel.org/20250114035118.110297-4-kuba@kernel.org
---
 include/linux/netdevice.h | 2 +-
 net/core/dev.c            | 6 ++++++
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 891c5bdb894c..30963c5d409b 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2448,7 +2448,7 @@ struct net_device {
 	 * Should always be taken using netdev_lock() / netdev_unlock() helpers.
 	 * Drivers are free to use it for other protection.
 	 *
-	 * Protects: @net_shaper_hierarchy.
+	 * Protects: @reg_state, @net_shaper_hierarchy.
 	 * Ordering: take after rtnl_lock.
 	 */
 	struct mutex		lock;
diff --git a/net/core/dev.c b/net/core/dev.c
index fda4e1039bf0..6603c08768f6 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10668,7 +10668,9 @@ int register_netdevice(struct net_device *dev)
 
 	ret = netdev_register_kobject(dev);
 
+	netdev_lock(dev);
 	WRITE_ONCE(dev->reg_state, ret ? NETREG_UNREGISTERED : NETREG_REGISTERED);
+	netdev_unlock(dev);
 
 	if (ret)
 		goto err_uninit_notify;
@@ -10942,7 +10944,9 @@ void netdev_run_todo(void)
 			continue;
 		}
 
+		netdev_lock(dev);
 		WRITE_ONCE(dev->reg_state, NETREG_UNREGISTERED);
+		netdev_unlock(dev);
 		linkwatch_sync_dev(dev);
 	}
 
@@ -11548,7 +11552,9 @@ void unregister_netdevice_many_notify(struct list_head *head,
 	list_for_each_entry(dev, head, unreg_list) {
 		/* And unlink it from device chain. */
 		unlist_netdevice(dev);
+		netdev_lock(dev);
 		WRITE_ONCE(dev->reg_state, NETREG_UNREGISTERING);
+		netdev_unlock(dev);
 	}
 	flush_all_backlogs();
 
-- 
2.48.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH net-next v2 03/11] net: add helpers for lookup and walking netdevs under netdev_lock()
  2025-01-15  3:53 [PATCH net-next v2 00/11] net: use netdev->lock to protect NAPI Jakub Kicinski
  2025-01-15  3:53 ` [PATCH net-next v2 01/11] net: add netdev_lock() / netdev_unlock() helpers Jakub Kicinski
  2025-01-15  3:53 ` [PATCH net-next v2 02/11] net: make netdev_lock() protect netdev->reg_state Jakub Kicinski
@ 2025-01-15  3:53 ` Jakub Kicinski
  2025-01-15  8:41   ` Kuniyuki Iwashima
  2025-01-15  3:53 ` [PATCH net-next v2 04/11] net: add netdev->up protected by netdev_lock() Jakub Kicinski
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 34+ messages in thread
From: Jakub Kicinski @ 2025-01-15  3:53 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
	Jakub Kicinski

Add helpers for accessing netdevs under netdev_lock().
There's some careful handling needed to find the device and lock it
safely, without it getting unregistered, and without taking rtnl_lock
(the latter being the whole point of the new locking, after all).

Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
v2:
 - add missing READ_ONCE() when under RCU
v1: https://lore.kernel.org/20250114035118.110297-3-kuba@kernel.org
---
 net/core/dev.h |  16 +++++++
 net/core/dev.c | 110 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 126 insertions(+)

diff --git a/net/core/dev.h b/net/core/dev.h
index d8966847794c..25ae732c0775 100644
--- a/net/core/dev.h
+++ b/net/core/dev.h
@@ -2,6 +2,7 @@
 #ifndef _NET_CORE_DEV_H
 #define _NET_CORE_DEV_H
 
+#include <linux/cleanup.h>
 #include <linux/types.h>
 #include <linux/rwsem.h>
 #include <linux/netdevice.h>
@@ -23,8 +24,23 @@ struct sd_flow_limit {
 extern int netdev_flow_limit_table_len;
 
 struct napi_struct *netdev_napi_by_id(struct net *net, unsigned int napi_id);
+struct napi_struct *
+netdev_napi_by_id_lock(struct net *net, unsigned int napi_id);
 struct net_device *dev_get_by_napi_id(unsigned int napi_id);
 
+struct net_device *netdev_get_by_index_lock(struct net *net, int ifindex);
+struct net_device *__netdev_put_lock(struct net_device *dev);
+struct net_device *
+netdev_xa_find_lock(struct net *net, struct net_device *dev,
+		    unsigned long *index);
+
+DEFINE_FREE(netdev_unlock, struct net_device *, if (_T) netdev_unlock(_T));
+
+#define for_each_netdev_lock_scoped(net, var_name, ifindex)		\
+	for (struct net_device *var_name __free(netdev_unlock) = NULL;	\
+	     (var_name = netdev_xa_find_lock(net, var_name, &ifindex)); \
+	     ifindex++)
+
 #ifdef CONFIG_PROC_FS
 int __init dev_proc_init(void);
 #else
diff --git a/net/core/dev.c b/net/core/dev.c
index 6603c08768f6..c871e2697fb6 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -783,6 +783,49 @@ struct napi_struct *netdev_napi_by_id(struct net *net, unsigned int napi_id)
 	return napi;
 }
 
+/**
+ *	netdev_napi_by_id_lock() - find a device by NAPI ID and lock it
+ *	@net: the applicable net namespace
+ *	@napi_id: ID of a NAPI of a target device
+ *
+ *	Find a NAPI instance with @napi_id. Lock its device.
+ *	The device must be in %NETREG_REGISTERED state for lookup to succeed.
+ *	netdev_unlock() must be called to release it.
+ *
+ *	Return: pointer to NAPI, its device with lock held, NULL if not found.
+ */
+struct napi_struct *
+netdev_napi_by_id_lock(struct net *net, unsigned int napi_id)
+{
+	struct napi_struct *napi;
+	struct net_device *dev;
+
+	rcu_read_lock();
+	napi = netdev_napi_by_id(net, napi_id);
+	if (!napi || READ_ONCE(napi->dev->reg_state) != NETREG_REGISTERED) {
+		rcu_read_unlock();
+		return NULL;
+	}
+
+	dev = napi->dev;
+	dev_hold(dev);
+	rcu_read_unlock();
+
+	dev = __netdev_put_lock(dev);
+	if (!dev)
+		return NULL;
+
+	rcu_read_lock();
+	napi = netdev_napi_by_id(net, napi_id);
+	if (napi && napi->dev != dev)
+		napi = NULL;
+	rcu_read_unlock();
+
+	if (!napi)
+		netdev_unlock(dev);
+	return napi;
+}
+
 /**
  *	__dev_get_by_name	- find a device by its name
  *	@net: the applicable net namespace
@@ -971,6 +1014,73 @@ struct net_device *dev_get_by_napi_id(unsigned int napi_id)
 	return napi ? napi->dev : NULL;
 }
 
+/* Release the held reference on the net_device, and if the net_device
+ * is still registered try to lock the instance lock. If device is being
+ * unregistered NULL will be returned (but the reference has been released,
+ * either way!)
+ *
+ * This helper is intended for locking net_device after it has been looked up
+ * using a lockless lookup helper. Lock prevents the instance from going away.
+ */
+struct net_device *__netdev_put_lock(struct net_device *dev)
+{
+	netdev_lock(dev);
+	if (dev->reg_state > NETREG_REGISTERED) {
+		netdev_unlock(dev);
+		dev_put(dev);
+		return NULL;
+	}
+	dev_put(dev);
+	return dev;
+}
+
+/**
+ *	netdev_get_by_index_lock() - find a device by its ifindex
+ *	@net: the applicable net namespace
+ *	@ifindex: index of device
+ *
+ *	Search for an interface by index. If a valid device
+ *	with @ifindex is found it will be returned with netdev->lock held.
+ *	netdev_unlock() must be called to release it.
+ *
+ *	Return: pointer to a device with lock held, NULL if not found.
+ */
+struct net_device *netdev_get_by_index_lock(struct net *net, int ifindex)
+{
+	struct net_device *dev;
+
+	dev = dev_get_by_index(net, ifindex);
+	if (!dev)
+		return NULL;
+
+	return __netdev_put_lock(dev);
+}
+
+struct net_device *
+netdev_xa_find_lock(struct net *net, struct net_device *dev,
+		    unsigned long *index)
+{
+	if (dev)
+		netdev_unlock(dev);
+
+	do {
+		rcu_read_lock();
+		dev = xa_find(&net->dev_by_index, index, ULONG_MAX, XA_PRESENT);
+		if (!dev) {
+			rcu_read_unlock();
+			return NULL;
+		}
+		dev_hold(dev);
+		rcu_read_unlock();
+
+		dev = __netdev_put_lock(dev);
+		if (dev)
+			return dev;
+
+		(*index)++;
+	} while (true);
+}
+
 static DEFINE_SEQLOCK(netdev_rename_lock);
 
 void netdev_copy_name(struct net_device *dev, char *name)
-- 
2.48.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH net-next v2 04/11] net: add netdev->up protected by netdev_lock()
  2025-01-15  3:53 [PATCH net-next v2 00/11] net: use netdev->lock to protect NAPI Jakub Kicinski
                   ` (2 preceding siblings ...)
  2025-01-15  3:53 ` [PATCH net-next v2 03/11] net: add helpers for lookup and walking netdevs under netdev_lock() Jakub Kicinski
@ 2025-01-15  3:53 ` Jakub Kicinski
  2025-01-15  8:45   ` Kuniyuki Iwashima
  2025-01-15  3:53 ` [PATCH net-next v2 05/11] net: protect netdev->napi_list with netdev_lock() Jakub Kicinski
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 34+ messages in thread
From: Jakub Kicinski @ 2025-01-15  3:53 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
	Jakub Kicinski

Some uAPI (netdev netlink) hide net_device's sub-objects while
the interface is down to ensure uniform behavior across drivers.
To remove the rtnl_lock dependency from those uAPIs we need a way
to safely tell if the device is down or up.

Add an indication of whether device is open or closed, protected
by netdev->lock. The semantics are the same as IFF_UP, but taking
netdev_lock around every write to ->flags would be a lot of code
churn.

We don't want to blanket the entire open / close path by netdev_lock,
because it will prevent us from applying it to specific structures -
core helpers won't be able to take that lock from any function
called by the drivers on open/close paths.

So the state of the flag is "pessimistic", as in it may report false
negatives, but never false positives.

Reviewed-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
v2:
 - rework the kdoc to please Sphinx
v1:  https://lore.kernel.org/20250114035118.110297-5-kuba@kernel.org
---
 include/linux/netdevice.h | 14 +++++++++++++-
 net/core/dev.h            | 12 ++++++++++++
 net/core/dev.c            |  4 ++--
 3 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 30963c5d409b..fdf3a8d93185 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2443,12 +2443,24 @@ struct net_device {
 	unsigned long		gro_flush_timeout;
 	u32			napi_defer_hard_irqs;
 
+	/**
+	 * @up: copy of @state's IFF_UP, but safe to read with just @lock.
+	 *	May report false negatives while the device is being opened
+	 *	or closed (@lock does not protect .ndo_open, or .ndo_close).
+	 */
+	bool			up;
+
 	/**
 	 * @lock: netdev-scope lock, protects a small selection of fields.
 	 * Should always be taken using netdev_lock() / netdev_unlock() helpers.
 	 * Drivers are free to use it for other protection.
 	 *
-	 * Protects: @reg_state, @net_shaper_hierarchy.
+	 * Protects:
+	 *	@net_shaper_hierarchy, @reg_state
+	 *
+	 * Partially protects (writers must hold both @lock and rtnl_lock):
+	 *	@up
+	 *
 	 * Ordering: take after rtnl_lock.
 	 */
 	struct mutex		lock;
diff --git a/net/core/dev.h b/net/core/dev.h
index 25ae732c0775..ef37e2dd44f4 100644
--- a/net/core/dev.h
+++ b/net/core/dev.h
@@ -128,6 +128,18 @@ void __dev_notify_flags(struct net_device *dev, unsigned int old_flags,
 void unregister_netdevice_many_notify(struct list_head *head,
 				      u32 portid, const struct nlmsghdr *nlh);
 
+static inline void netif_set_up(struct net_device *dev, bool value)
+{
+	if (value)
+		dev->flags |= IFF_UP;
+	else
+		dev->flags &= ~IFF_UP;
+
+	netdev_lock(dev);
+	dev->up = value;
+	netdev_unlock(dev);
+}
+
 static inline void netif_set_gso_max_size(struct net_device *dev,
 					  unsigned int size)
 {
diff --git a/net/core/dev.c b/net/core/dev.c
index c871e2697fb6..4cba553a4742 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1618,7 +1618,7 @@ static int __dev_open(struct net_device *dev, struct netlink_ext_ack *extack)
 	if (ret)
 		clear_bit(__LINK_STATE_START, &dev->state);
 	else {
-		dev->flags |= IFF_UP;
+		netif_set_up(dev, true);
 		dev_set_rx_mode(dev);
 		dev_activate(dev);
 		add_device_randomness(dev->dev_addr, dev->addr_len);
@@ -1697,7 +1697,7 @@ static void __dev_close_many(struct list_head *head)
 		if (ops->ndo_stop)
 			ops->ndo_stop(dev);
 
-		dev->flags &= ~IFF_UP;
+		netif_set_up(dev, false);
 		netpoll_poll_enable(dev);
 	}
 }
-- 
2.48.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH net-next v2 05/11] net: protect netdev->napi_list with netdev_lock()
  2025-01-15  3:53 [PATCH net-next v2 00/11] net: use netdev->lock to protect NAPI Jakub Kicinski
                   ` (3 preceding siblings ...)
  2025-01-15  3:53 ` [PATCH net-next v2 04/11] net: add netdev->up protected by netdev_lock() Jakub Kicinski
@ 2025-01-15  3:53 ` Jakub Kicinski
  2025-01-15  8:57   ` Kuniyuki Iwashima
  2025-01-15  3:53 ` [PATCH net-next v2 06/11] net: protect NAPI enablement " Jakub Kicinski
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 34+ messages in thread
From: Jakub Kicinski @ 2025-01-15  3:53 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
	Jakub Kicinski

Hold netdev->lock when NAPIs are getting added or removed.
This will allow safe access to NAPI instances of a net_device
without rtnl_lock.

Create a family of helpers which assume the lock is already taken.
Switch iavf to them, as it makes extensive use of netdev->lock,
already.

Reviewed-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 include/linux/netdevice.h                   | 54 ++++++++++++++++++---
 drivers/net/ethernet/intel/iavf/iavf_main.c |  6 +--
 net/core/dev.c                              | 15 ++++--
 3 files changed, 60 insertions(+), 15 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index fdf3a8d93185..e8c8a5ea7326 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2456,7 +2456,7 @@ struct net_device {
 	 * Drivers are free to use it for other protection.
 	 *
 	 * Protects:
-	 *	@net_shaper_hierarchy, @reg_state
+	 *	@napi_list, @net_shaper_hierarchy, @reg_state
 	 *
 	 * Partially protects (writers must hold both @lock and rtnl_lock):
 	 *	@up
@@ -2712,8 +2712,19 @@ static inline void netif_napi_set_irq(struct napi_struct *napi, int irq)
  */
 #define NAPI_POLL_WEIGHT 64
 
-void netif_napi_add_weight(struct net_device *dev, struct napi_struct *napi,
-			   int (*poll)(struct napi_struct *, int), int weight);
+void netif_napi_add_weight_locked(struct net_device *dev,
+				  struct napi_struct *napi,
+				  int (*poll)(struct napi_struct *, int),
+				  int weight);
+
+static inline void
+netif_napi_add_weight(struct net_device *dev, struct napi_struct *napi,
+		      int (*poll)(struct napi_struct *, int), int weight)
+{
+	netdev_lock(dev);
+	netif_napi_add_weight_locked(dev, napi, poll, weight);
+	netdev_unlock(dev);
+}
 
 /**
  * netif_napi_add() - initialize a NAPI context
@@ -2731,6 +2742,13 @@ netif_napi_add(struct net_device *dev, struct napi_struct *napi,
 	netif_napi_add_weight(dev, napi, poll, NAPI_POLL_WEIGHT);
 }
 
+static inline void
+netif_napi_add_locked(struct net_device *dev, struct napi_struct *napi,
+		      int (*poll)(struct napi_struct *, int))
+{
+	netif_napi_add_weight_locked(dev, napi, poll, NAPI_POLL_WEIGHT);
+}
+
 static inline void
 netif_napi_add_tx_weight(struct net_device *dev,
 			 struct napi_struct *napi,
@@ -2741,6 +2759,15 @@ netif_napi_add_tx_weight(struct net_device *dev,
 	netif_napi_add_weight(dev, napi, poll, weight);
 }
 
+static inline void
+netif_napi_add_config_locked(struct net_device *dev, struct napi_struct *napi,
+			     int (*poll)(struct napi_struct *, int), int index)
+{
+	napi->index = index;
+	napi->config = &dev->napi_config[index];
+	netif_napi_add_weight_locked(dev, napi, poll, NAPI_POLL_WEIGHT);
+}
+
 /**
  * netif_napi_add_config - initialize a NAPI context with persistent config
  * @dev: network device
@@ -2752,9 +2779,9 @@ static inline void
 netif_napi_add_config(struct net_device *dev, struct napi_struct *napi,
 		      int (*poll)(struct napi_struct *, int), int index)
 {
-	napi->index = index;
-	napi->config = &dev->napi_config[index];
-	netif_napi_add_weight(dev, napi, poll, NAPI_POLL_WEIGHT);
+	netdev_lock(dev);
+	netif_napi_add_config_locked(dev, napi, poll, index);
+	netdev_unlock(dev);
 }
 
 /**
@@ -2774,6 +2801,8 @@ static inline void netif_napi_add_tx(struct net_device *dev,
 	netif_napi_add_tx_weight(dev, napi, poll, NAPI_POLL_WEIGHT);
 }
 
+void __netif_napi_del_locked(struct napi_struct *napi);
+
 /**
  *  __netif_napi_del - remove a NAPI context
  *  @napi: NAPI context
@@ -2782,7 +2811,18 @@ static inline void netif_napi_add_tx(struct net_device *dev,
  * containing @napi. Drivers might want to call this helper to combine
  * all the needed RCU grace periods into a single one.
  */
-void __netif_napi_del(struct napi_struct *napi);
+static inline void __netif_napi_del(struct napi_struct *napi)
+{
+	netdev_lock(napi->dev);
+	__netif_napi_del_locked(napi);
+	netdev_unlock(napi->dev);
+}
+
+static inline void netif_napi_del_locked(struct napi_struct *napi)
+{
+	__netif_napi_del_locked(napi);
+	synchronize_net();
+}
 
 /**
  *  netif_napi_del - remove a NAPI context
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index ab908d620285..2db97c5d9f9e 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -1800,8 +1800,8 @@ static int iavf_alloc_q_vectors(struct iavf_adapter *adapter)
 		q_vector->v_idx = q_idx;
 		q_vector->reg_idx = q_idx;
 		cpumask_copy(&q_vector->affinity_mask, cpu_possible_mask);
-		netif_napi_add(adapter->netdev, &q_vector->napi,
-			       iavf_napi_poll);
+		netif_napi_add_locked(adapter->netdev, &q_vector->napi,
+				      iavf_napi_poll);
 	}
 
 	return 0;
@@ -1827,7 +1827,7 @@ static void iavf_free_q_vectors(struct iavf_adapter *adapter)
 	for (q_idx = 0; q_idx < num_q_vectors; q_idx++) {
 		struct iavf_q_vector *q_vector = &adapter->q_vectors[q_idx];
 
-		netif_napi_del(&q_vector->napi);
+		netif_napi_del_locked(&q_vector->napi);
 	}
 	kfree(adapter->q_vectors);
 	adapter->q_vectors = NULL;
diff --git a/net/core/dev.c b/net/core/dev.c
index 4cba553a4742..7511207057e5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6909,9 +6909,12 @@ netif_napi_dev_list_add(struct net_device *dev, struct napi_struct *napi)
 	list_add_rcu(&napi->dev_list, higher); /* adds after higher */
 }
 
-void netif_napi_add_weight(struct net_device *dev, struct napi_struct *napi,
-			   int (*poll)(struct napi_struct *, int), int weight)
+void netif_napi_add_weight_locked(struct net_device *dev,
+				  struct napi_struct *napi,
+				  int (*poll)(struct napi_struct *, int),
+				  int weight)
 {
+	netdev_assert_locked(dev);
 	if (WARN_ON(test_and_set_bit(NAPI_STATE_LISTED, &napi->state)))
 		return;
 
@@ -6952,7 +6955,7 @@ void netif_napi_add_weight(struct net_device *dev, struct napi_struct *napi,
 		dev->threaded = false;
 	netif_napi_set_irq(napi, -1);
 }
-EXPORT_SYMBOL(netif_napi_add_weight);
+EXPORT_SYMBOL(netif_napi_add_weight_locked);
 
 void napi_disable(struct napi_struct *n)
 {
@@ -7023,8 +7026,10 @@ static void flush_gro_hash(struct napi_struct *napi)
 }
 
 /* Must be called in process context */
-void __netif_napi_del(struct napi_struct *napi)
+void __netif_napi_del_locked(struct napi_struct *napi)
 {
+	netdev_assert_locked(napi->dev);
+
 	if (!test_and_clear_bit(NAPI_STATE_LISTED, &napi->state))
 		return;
 
@@ -7044,7 +7049,7 @@ void __netif_napi_del(struct napi_struct *napi)
 		napi->thread = NULL;
 	}
 }
-EXPORT_SYMBOL(__netif_napi_del);
+EXPORT_SYMBOL(__netif_napi_del_locked);
 
 static int __napi_poll(struct napi_struct *n, bool *repoll)
 {
-- 
2.48.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH net-next v2 06/11] net: protect NAPI enablement with netdev_lock()
  2025-01-15  3:53 [PATCH net-next v2 00/11] net: use netdev->lock to protect NAPI Jakub Kicinski
                   ` (4 preceding siblings ...)
  2025-01-15  3:53 ` [PATCH net-next v2 05/11] net: protect netdev->napi_list with netdev_lock() Jakub Kicinski
@ 2025-01-15  3:53 ` Jakub Kicinski
  2025-01-15  9:02   ` Kuniyuki Iwashima
  2025-01-21  8:32   ` Dan Carpenter
  2025-01-15  3:53 ` [PATCH net-next v2 07/11] net: make netdev netlink ops hold netdev_lock() Jakub Kicinski
                   ` (5 subsequent siblings)
  11 siblings, 2 replies; 34+ messages in thread
From: Jakub Kicinski @ 2025-01-15  3:53 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
	Jakub Kicinski, Francois Romieu, pcnet32, anthony.l.nguyen,
	przemyslaw.kitszel, marcin.s.wojtas

Wrap napi_enable() / napi_disable() with netdev_lock().
Provide the "already locked" flavor of the API.

iavf needs the usual adjustment. A number of drivers call
napi_enable() under a spin lock, so they have to be modified
to take netdev_lock() first, then spin lock then call
napi_enable_locked().

Protecting napi_enable() implies that napi->napi_id is protected
by netdev_lock().

Acked-by: Francois Romieu <romieu@fr.zoreil.com> # via-velocity
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
v2:
 - use napi_disabled_locked() in via-velocity
v1: https://lore.kernel.org/20250114035118.110297-7-kuba@kernel.org

CC: pcnet32@frontier.com
CC: anthony.l.nguyen@intel.com
CC: przemyslaw.kitszel@intel.com
CC: marcin.s.wojtas@gmail.com
CC: romieu@fr.zoreil.com
---
 include/linux/netdevice.h                   | 11 ++----
 drivers/net/ethernet/amd/pcnet32.c          | 11 +++++-
 drivers/net/ethernet/intel/iavf/iavf_main.c |  4 +-
 drivers/net/ethernet/marvell/mvneta.c       |  5 ++-
 drivers/net/ethernet/via/via-velocity.c     |  6 ++-
 net/core/dev.c                              | 41 +++++++++++++++++----
 6 files changed, 56 insertions(+), 22 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index e8c8a5ea7326..4ab33fbadd9f 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -382,7 +382,7 @@ struct napi_struct {
 	struct sk_buff		*skb;
 	struct list_head	rx_list; /* Pending GRO_NORMAL skbs */
 	int			rx_count; /* length of rx_list */
-	unsigned int		napi_id;
+	unsigned int		napi_id; /* protected by netdev_lock */
 	struct hrtimer		timer;
 	struct task_struct	*thread;
 	unsigned long		gro_flush_timeout;
@@ -570,16 +570,11 @@ static inline bool napi_complete(struct napi_struct *n)
 
 int dev_set_threaded(struct net_device *dev, bool threaded);
 
-/**
- *	napi_disable - prevent NAPI from scheduling
- *	@n: NAPI context
- *
- * Stop NAPI from being scheduled on this context.
- * Waits till any outstanding processing completes.
- */
 void napi_disable(struct napi_struct *n);
+void napi_disable_locked(struct napi_struct *n);
 
 void napi_enable(struct napi_struct *n);
+void napi_enable_locked(struct napi_struct *n);
 
 /**
  *	napi_synchronize - wait until NAPI is not running
diff --git a/drivers/net/ethernet/amd/pcnet32.c b/drivers/net/ethernet/amd/pcnet32.c
index 72db9f9e7bee..c6bd803f5b0c 100644
--- a/drivers/net/ethernet/amd/pcnet32.c
+++ b/drivers/net/ethernet/amd/pcnet32.c
@@ -462,7 +462,7 @@ static void pcnet32_netif_start(struct net_device *dev)
 	val = lp->a->read_csr(ioaddr, CSR3);
 	val &= 0x00ff;
 	lp->a->write_csr(ioaddr, CSR3, val);
-	napi_enable(&lp->napi);
+	napi_enable_locked(&lp->napi);
 }
 
 /*
@@ -889,6 +889,7 @@ static int pcnet32_set_ringparam(struct net_device *dev,
 	if (netif_running(dev))
 		pcnet32_netif_stop(dev);
 
+	netdev_lock(dev);
 	spin_lock_irqsave(&lp->lock, flags);
 	lp->a->write_csr(ioaddr, CSR0, CSR0_STOP);	/* stop the chip */
 
@@ -920,6 +921,7 @@ static int pcnet32_set_ringparam(struct net_device *dev,
 	}
 
 	spin_unlock_irqrestore(&lp->lock, flags);
+	netdev_unlock(dev);
 
 	netif_info(lp, drv, dev, "Ring Param Settings: RX: %d, TX: %d\n",
 		   lp->rx_ring_size, lp->tx_ring_size);
@@ -985,6 +987,7 @@ static int pcnet32_loopback_test(struct net_device *dev, uint64_t * data1)
 	if (netif_running(dev))
 		pcnet32_netif_stop(dev);
 
+	netdev_lock(dev);
 	spin_lock_irqsave(&lp->lock, flags);
 	lp->a->write_csr(ioaddr, CSR0, CSR0_STOP);	/* stop the chip */
 
@@ -1122,6 +1125,7 @@ static int pcnet32_loopback_test(struct net_device *dev, uint64_t * data1)
 		lp->a->write_bcr(ioaddr, 20, 4);	/* return to 16bit mode */
 	}
 	spin_unlock_irqrestore(&lp->lock, flags);
+	netdev_unlock(dev);
 
 	return rc;
 }				/* end pcnet32_loopback_test  */
@@ -2101,6 +2105,7 @@ static int pcnet32_open(struct net_device *dev)
 		return -EAGAIN;
 	}
 
+	netdev_lock(dev);
 	spin_lock_irqsave(&lp->lock, flags);
 	/* Check for a valid station address */
 	if (!is_valid_ether_addr(dev->dev_addr)) {
@@ -2266,7 +2271,7 @@ static int pcnet32_open(struct net_device *dev)
 		goto err_free_ring;
 	}
 
-	napi_enable(&lp->napi);
+	napi_enable_locked(&lp->napi);
 
 	/* Re-initialize the PCNET32, and start it when done. */
 	lp->a->write_csr(ioaddr, 1, (lp->init_dma_addr & 0xffff));
@@ -2300,6 +2305,7 @@ static int pcnet32_open(struct net_device *dev)
 		     lp->a->read_csr(ioaddr, CSR0));
 
 	spin_unlock_irqrestore(&lp->lock, flags);
+	netdev_unlock(dev);
 
 	return 0;		/* Always succeed */
 
@@ -2315,6 +2321,7 @@ static int pcnet32_open(struct net_device *dev)
 
 err_free_irq:
 	spin_unlock_irqrestore(&lp->lock, flags);
+	netdev_unlock(dev);
 	free_irq(dev->irq, dev);
 	return rc;
 }
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index 2db97c5d9f9e..cbfaaa5b7d02 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -1180,7 +1180,7 @@ static void iavf_napi_enable_all(struct iavf_adapter *adapter)
 
 		q_vector = &adapter->q_vectors[q_idx];
 		napi = &q_vector->napi;
-		napi_enable(napi);
+		napi_enable_locked(napi);
 	}
 }
 
@@ -1196,7 +1196,7 @@ static void iavf_napi_disable_all(struct iavf_adapter *adapter)
 
 	for (q_idx = 0; q_idx < q_vectors; q_idx++) {
 		q_vector = &adapter->q_vectors[q_idx];
-		napi_disable(&q_vector->napi);
+		napi_disable_locked(&q_vector->napi);
 	}
 }
 
diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index fe6261b81540..cc97474852ef 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -4392,6 +4392,7 @@ static int mvneta_cpu_online(unsigned int cpu, struct hlist_node *node)
 	if (pp->neta_armada3700)
 		return 0;
 
+	netdev_lock(port->napi.dev);
 	spin_lock(&pp->lock);
 	/*
 	 * Configuring the driver for a new CPU while the driver is
@@ -4418,7 +4419,7 @@ static int mvneta_cpu_online(unsigned int cpu, struct hlist_node *node)
 
 	/* Mask all ethernet port interrupts */
 	on_each_cpu(mvneta_percpu_mask_interrupt, pp, true);
-	napi_enable(&port->napi);
+	napi_enable_locked(&port->napi);
 
 	/*
 	 * Enable per-CPU interrupts on the CPU that is
@@ -4439,6 +4440,8 @@ static int mvneta_cpu_online(unsigned int cpu, struct hlist_node *node)
 		    MVNETA_CAUSE_LINK_CHANGE);
 	netif_tx_start_all_queues(pp->dev);
 	spin_unlock(&pp->lock);
+	netdev_unlock(port->napi.dev);
+
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/via/via-velocity.c b/drivers/net/ethernet/via/via-velocity.c
index dd4a07c97eee..5aa93144a4f5 100644
--- a/drivers/net/ethernet/via/via-velocity.c
+++ b/drivers/net/ethernet/via/via-velocity.c
@@ -2320,7 +2320,8 @@ static int velocity_change_mtu(struct net_device *dev, int new_mtu)
 		if (ret < 0)
 			goto out_free_tmp_vptr_1;
 
-		napi_disable(&vptr->napi);
+		netdev_lock(dev);
+		napi_disable_locked(&vptr->napi);
 
 		spin_lock_irqsave(&vptr->lock, flags);
 
@@ -2342,12 +2343,13 @@ static int velocity_change_mtu(struct net_device *dev, int new_mtu)
 
 		velocity_give_many_rx_descs(vptr);
 
-		napi_enable(&vptr->napi);
+		napi_enable_locked(&vptr->napi);
 
 		mac_enable_int(vptr->mac_regs);
 		netif_start_queue(dev);
 
 		spin_unlock_irqrestore(&vptr->lock, flags);
+		netdev_unlock(dev);
 
 		velocity_free_rings(tmp_vptr);
 
diff --git a/net/core/dev.c b/net/core/dev.c
index 7511207057e5..9cf93868ac7e 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6957,11 +6957,13 @@ void netif_napi_add_weight_locked(struct net_device *dev,
 }
 EXPORT_SYMBOL(netif_napi_add_weight_locked);
 
-void napi_disable(struct napi_struct *n)
+void napi_disable_locked(struct napi_struct *n)
 {
 	unsigned long val, new;
 
 	might_sleep();
+	netdev_assert_locked(n->dev);
+
 	set_bit(NAPI_STATE_DISABLE, &n->state);
 
 	val = READ_ONCE(n->state);
@@ -6984,16 +6986,25 @@ void napi_disable(struct napi_struct *n)
 
 	clear_bit(NAPI_STATE_DISABLE, &n->state);
 }
-EXPORT_SYMBOL(napi_disable);
+EXPORT_SYMBOL(napi_disable_locked);
 
 /**
- *	napi_enable - enable NAPI scheduling
- *	@n: NAPI context
+ * napi_disable() - prevent NAPI from scheduling
+ * @n: NAPI context
  *
- * Resume NAPI from being scheduled on this context.
- * Must be paired with napi_disable.
+ * Stop NAPI from being scheduled on this context.
+ * Waits till any outstanding processing completes.
+ * Takes netdev_lock() for associated net_device.
  */
-void napi_enable(struct napi_struct *n)
+void napi_disable(struct napi_struct *n)
+{
+	netdev_lock(n->dev);
+	napi_disable_locked(n);
+	netdev_unlock(n->dev);
+}
+EXPORT_SYMBOL(napi_disable);
+
+void napi_enable_locked(struct napi_struct *n)
 {
 	unsigned long new, val = READ_ONCE(n->state);
 
@@ -7010,6 +7021,22 @@ void napi_enable(struct napi_struct *n)
 			new |= NAPIF_STATE_THREADED;
 	} while (!try_cmpxchg(&n->state, &val, new));
 }
+EXPORT_SYMBOL(napi_enable_locked);
+
+/**
+ * napi_enable() - enable NAPI scheduling
+ * @n: NAPI context
+ *
+ * Enable scheduling of a NAPI instance.
+ * Must be paired with napi_disable().
+ * Takes netdev_lock() for associated net_device.
+ */
+void napi_enable(struct napi_struct *n)
+{
+	netdev_lock(n->dev);
+	napi_enable_locked(n);
+	netdev_unlock(n->dev);
+}
 EXPORT_SYMBOL(napi_enable);
 
 static void flush_gro_hash(struct napi_struct *napi)
-- 
2.48.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH net-next v2 07/11] net: make netdev netlink ops hold netdev_lock()
  2025-01-15  3:53 [PATCH net-next v2 00/11] net: use netdev->lock to protect NAPI Jakub Kicinski
                   ` (5 preceding siblings ...)
  2025-01-15  3:53 ` [PATCH net-next v2 06/11] net: protect NAPI enablement " Jakub Kicinski
@ 2025-01-15  3:53 ` Jakub Kicinski
  2025-01-15  9:07   ` Kuniyuki Iwashima
  2025-01-15  3:53 ` [PATCH net-next v2 08/11] net: protect threaded status of NAPI with netdev_lock() Jakub Kicinski
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 34+ messages in thread
From: Jakub Kicinski @ 2025-01-15  3:53 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
	Jakub Kicinski

In prep for dropping rtnl_lock, start locking netdev->lock in netlink
genl ops. We need to be using netdev->up instead of flags & IFF_UP.

We can remove the RCU lock protection for the NAPI since NAPI list
is protected by netdev->lock already.

Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 net/core/dev.h         |  1 -
 net/core/dev.c         |  3 ++-
 net/core/netdev-genl.c | 46 +++++++++++++++++++++++-------------------
 3 files changed, 27 insertions(+), 23 deletions(-)

diff --git a/net/core/dev.h b/net/core/dev.h
index ef37e2dd44f4..a5b166bbd169 100644
--- a/net/core/dev.h
+++ b/net/core/dev.h
@@ -23,7 +23,6 @@ struct sd_flow_limit {
 
 extern int netdev_flow_limit_table_len;
 
-struct napi_struct *netdev_napi_by_id(struct net *net, unsigned int napi_id);
 struct napi_struct *
 netdev_napi_by_id_lock(struct net *net, unsigned int napi_id);
 struct net_device *dev_get_by_napi_id(unsigned int napi_id);
diff --git a/net/core/dev.c b/net/core/dev.c
index 9cf93868ac7e..9734c3f5b862 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -767,7 +767,8 @@ static struct napi_struct *napi_by_id(unsigned int napi_id)
 }
 
 /* must be called under rcu_read_lock(), as we dont take a reference */
-struct napi_struct *netdev_napi_by_id(struct net *net, unsigned int napi_id)
+static struct napi_struct *
+netdev_napi_by_id(struct net *net, unsigned int napi_id)
 {
 	struct napi_struct *napi;
 
diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
index c59619a2ec23..810a446ab62c 100644
--- a/net/core/netdev-genl.c
+++ b/net/core/netdev-genl.c
@@ -167,7 +167,7 @@ netdev_nl_napi_fill_one(struct sk_buff *rsp, struct napi_struct *napi,
 	void *hdr;
 	pid_t pid;
 
-	if (!(napi->dev->flags & IFF_UP))
+	if (!napi->dev->up)
 		return 0;
 
 	hdr = genlmsg_iput(rsp, info);
@@ -230,17 +230,16 @@ int netdev_nl_napi_get_doit(struct sk_buff *skb, struct genl_info *info)
 		return -ENOMEM;
 
 	rtnl_lock();
-	rcu_read_lock();
 
-	napi = netdev_napi_by_id(genl_info_net(info), napi_id);
+	napi = netdev_napi_by_id_lock(genl_info_net(info), napi_id);
 	if (napi) {
 		err = netdev_nl_napi_fill_one(rsp, napi, info);
+		netdev_unlock(napi->dev);
 	} else {
 		NL_SET_BAD_ATTR(info->extack, info->attrs[NETDEV_A_NAPI_ID]);
 		err = -ENOENT;
 	}
 
-	rcu_read_unlock();
 	rtnl_unlock();
 
 	if (err) {
@@ -266,7 +265,7 @@ netdev_nl_napi_dump_one(struct net_device *netdev, struct sk_buff *rsp,
 	unsigned int prev_id;
 	int err = 0;
 
-	if (!(netdev->flags & IFF_UP))
+	if (!netdev->up)
 		return err;
 
 	prev_id = UINT_MAX;
@@ -303,13 +302,15 @@ int netdev_nl_napi_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
 
 	rtnl_lock();
 	if (ifindex) {
-		netdev = __dev_get_by_index(net, ifindex);
-		if (netdev)
+		netdev = netdev_get_by_index_lock(net, ifindex);
+		if (netdev) {
 			err = netdev_nl_napi_dump_one(netdev, skb, info, ctx);
-		else
+			netdev_unlock(netdev);
+		} else {
 			err = -ENODEV;
+		}
 	} else {
-		for_each_netdev_dump(net, netdev, ctx->ifindex) {
+		for_each_netdev_lock_scoped(net, netdev, ctx->ifindex) {
 			err = netdev_nl_napi_dump_one(netdev, skb, info, ctx);
 			if (err < 0)
 				break;
@@ -358,17 +359,16 @@ int netdev_nl_napi_set_doit(struct sk_buff *skb, struct genl_info *info)
 	napi_id = nla_get_u32(info->attrs[NETDEV_A_NAPI_ID]);
 
 	rtnl_lock();
-	rcu_read_lock();
 
-	napi = netdev_napi_by_id(genl_info_net(info), napi_id);
+	napi = netdev_napi_by_id_lock(genl_info_net(info), napi_id);
 	if (napi) {
 		err = netdev_nl_napi_set_config(napi, info);
+		netdev_unlock(napi->dev);
 	} else {
 		NL_SET_BAD_ATTR(info->extack, info->attrs[NETDEV_A_NAPI_ID]);
 		err = -ENOENT;
 	}
 
-	rcu_read_unlock();
 	rtnl_unlock();
 
 	return err;
@@ -442,7 +442,7 @@ netdev_nl_queue_fill(struct sk_buff *rsp, struct net_device *netdev, u32 q_idx,
 {
 	int err;
 
-	if (!(netdev->flags & IFF_UP))
+	if (!netdev->up)
 		return -ENOENT;
 
 	err = netdev_nl_queue_validate(netdev, q_idx, q_type);
@@ -474,11 +474,13 @@ int netdev_nl_queue_get_doit(struct sk_buff *skb, struct genl_info *info)
 
 	rtnl_lock();
 
-	netdev = __dev_get_by_index(genl_info_net(info), ifindex);
-	if (netdev)
+	netdev = netdev_get_by_index_lock(genl_info_net(info), ifindex);
+	if (netdev) {
 		err = netdev_nl_queue_fill(rsp, netdev, q_id, q_type, info);
-	else
+		netdev_unlock(netdev);
+	} else {
 		err = -ENODEV;
+	}
 
 	rtnl_unlock();
 
@@ -499,7 +501,7 @@ netdev_nl_queue_dump_one(struct net_device *netdev, struct sk_buff *rsp,
 {
 	int err = 0;
 
-	if (!(netdev->flags & IFF_UP))
+	if (!netdev->up)
 		return err;
 
 	for (; ctx->rxq_idx < netdev->real_num_rx_queues; ctx->rxq_idx++) {
@@ -532,13 +534,15 @@ int netdev_nl_queue_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
 
 	rtnl_lock();
 	if (ifindex) {
-		netdev = __dev_get_by_index(net, ifindex);
-		if (netdev)
+		netdev = netdev_get_by_index_lock(net, ifindex);
+		if (netdev) {
 			err = netdev_nl_queue_dump_one(netdev, skb, info, ctx);
-		else
+			netdev_unlock(netdev);
+		} else {
 			err = -ENODEV;
+		}
 	} else {
-		for_each_netdev_dump(net, netdev, ctx->ifindex) {
+		for_each_netdev_lock_scoped(net, netdev, ctx->ifindex) {
 			err = netdev_nl_queue_dump_one(netdev, skb, info, ctx);
 			if (err < 0)
 				break;
-- 
2.48.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH net-next v2 08/11] net: protect threaded status of NAPI with netdev_lock()
  2025-01-15  3:53 [PATCH net-next v2 00/11] net: use netdev->lock to protect NAPI Jakub Kicinski
                   ` (6 preceding siblings ...)
  2025-01-15  3:53 ` [PATCH net-next v2 07/11] net: make netdev netlink ops hold netdev_lock() Jakub Kicinski
@ 2025-01-15  3:53 ` Jakub Kicinski
  2025-01-15  9:09   ` Kuniyuki Iwashima
  2025-01-15  3:53 ` [PATCH net-next v2 09/11] net: protect napi->irq " Jakub Kicinski
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 34+ messages in thread
From: Jakub Kicinski @ 2025-01-15  3:53 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
	Jakub Kicinski, leitao

Now that NAPI instances can't come and go without holding
netdev->lock we can trivially switch from rtnl_lock() to
netdev_lock() for setting netdev->threaded via sysfs.

Note that since we do not lock netdev_lock around sysfs
calls in the core we don't have to "trylock" like we do
with rtnl_lock.

Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
v2:
 - update the comment on dev_isalive()
v1: https://lore.kernel.org/20250114035118.110297-9-kuba@kernel.org

CC: leitao@debian.org
---
 include/linux/netdevice.h | 13 +++++++++++--
 net/core/dev.c            |  2 ++
 net/core/net-sysfs.c      | 34 ++++++++++++++++++++++++++++++++--
 3 files changed, 45 insertions(+), 4 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 4ab33fbadd9f..bf3da95c9350 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -384,7 +384,7 @@ struct napi_struct {
 	int			rx_count; /* length of rx_list */
 	unsigned int		napi_id; /* protected by netdev_lock */
 	struct hrtimer		timer;
-	struct task_struct	*thread;
+	struct task_struct	*thread; /* protected by netdev_lock */
 	unsigned long		gro_flush_timeout;
 	unsigned long		irq_suspend_timeout;
 	u32			defer_hard_irqs;
@@ -2451,11 +2451,13 @@ struct net_device {
 	 * Drivers are free to use it for other protection.
 	 *
 	 * Protects:
-	 *	@napi_list, @net_shaper_hierarchy, @reg_state
+	 *	@napi_list, @net_shaper_hierarchy, @reg_state, @threaded
 	 *
 	 * Partially protects (writers must hold both @lock and rtnl_lock):
 	 *	@up
 	 *
+	 * Also protects some fields in struct napi_struct.
+	 *
 	 * Ordering: take after rtnl_lock.
 	 */
 	struct mutex		lock;
@@ -2697,6 +2699,13 @@ static inline void netdev_assert_locked(struct net_device *dev)
 	lockdep_assert_held(&dev->lock);
 }
 
+static inline void netdev_assert_locked_or_invisible(struct net_device *dev)
+{
+	if (dev->reg_state == NETREG_REGISTERED ||
+	    dev->reg_state == NETREG_UNREGISTERING)
+		netdev_assert_locked(dev);
+}
+
 static inline void netif_napi_set_irq(struct napi_struct *napi, int irq)
 {
 	napi->irq = irq;
diff --git a/net/core/dev.c b/net/core/dev.c
index 9734c3f5b862..d90bb100285d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6784,6 +6784,8 @@ int dev_set_threaded(struct net_device *dev, bool threaded)
 	struct napi_struct *napi;
 	int err = 0;
 
+	netdev_assert_locked_or_invisible(dev);
+
 	if (dev->threaded == threaded)
 		return 0;
 
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 2d9afc6e2161..9365a7185a1d 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -36,7 +36,7 @@ static const char fmt_uint[] = "%u\n";
 static const char fmt_ulong[] = "%lu\n";
 static const char fmt_u64[] = "%llu\n";
 
-/* Caller holds RTNL or RCU */
+/* Caller holds RTNL, netdev->lock or RCU */
 static inline int dev_isalive(const struct net_device *dev)
 {
 	return READ_ONCE(dev->reg_state) <= NETREG_REGISTERED;
@@ -108,6 +108,36 @@ static ssize_t netdev_store(struct device *dev, struct device_attribute *attr,
 	return ret;
 }
 
+/* Same as netdev_store() but takes netdev_lock() instead of rtnl_lock() */
+static ssize_t
+netdev_lock_store(struct device *dev, struct device_attribute *attr,
+		  const char *buf, size_t len,
+		  int (*set)(struct net_device *, unsigned long))
+{
+	struct net_device *netdev = to_net_dev(dev);
+	struct net *net = dev_net(netdev);
+	unsigned long new;
+	int ret;
+
+	if (!ns_capable(net->user_ns, CAP_NET_ADMIN))
+		return -EPERM;
+
+	ret = kstrtoul(buf, 0, &new);
+	if (ret)
+		return ret;
+
+	netdev_lock(netdev);
+
+	if (dev_isalive(netdev)) {
+		ret = (*set)(netdev, new);
+		if (ret == 0)
+			ret = len;
+	}
+	netdev_unlock(netdev);
+
+	return ret;
+}
+
 NETDEVICE_SHOW_RO(dev_id, fmt_hex);
 NETDEVICE_SHOW_RO(dev_port, fmt_dec);
 NETDEVICE_SHOW_RO(addr_assign_type, fmt_dec);
@@ -638,7 +668,7 @@ static ssize_t threaded_store(struct device *dev,
 			      struct device_attribute *attr,
 			      const char *buf, size_t len)
 {
-	return netdev_store(dev, attr, buf, len, modify_napi_threaded);
+	return netdev_lock_store(dev, attr, buf, len, modify_napi_threaded);
 }
 static DEVICE_ATTR_RW(threaded);
 
-- 
2.48.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH net-next v2 09/11] net: protect napi->irq with netdev_lock()
  2025-01-15  3:53 [PATCH net-next v2 00/11] net: use netdev->lock to protect NAPI Jakub Kicinski
                   ` (7 preceding siblings ...)
  2025-01-15  3:53 ` [PATCH net-next v2 08/11] net: protect threaded status of NAPI with netdev_lock() Jakub Kicinski
@ 2025-01-15  3:53 ` Jakub Kicinski
  2025-01-15  9:12   ` Kuniyuki Iwashima
  2025-01-15  3:53 ` [PATCH net-next v2 10/11] net: protect NAPI config fields " Jakub Kicinski
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 34+ messages in thread
From: Jakub Kicinski @ 2025-01-15  3:53 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
	Jakub Kicinski

Take netdev_lock() in netif_napi_set_irq(). All NAPI "control fields"
are now protected by that lock (most of the other ones are set during
napi add/del). The napi_hash_node is fully protected by the hash
spin lock, but close enough for the kdoc...

Reviewed-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 include/linux/netdevice.h | 10 +++++++++-
 net/core/dev.c            |  2 +-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index bf3da95c9350..390fb70667bf 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -388,6 +388,7 @@ struct napi_struct {
 	unsigned long		gro_flush_timeout;
 	unsigned long		irq_suspend_timeout;
 	u32			defer_hard_irqs;
+	/* all fields past this point are write-protected by netdev_lock */
 	/* control-path-only fields follow */
 	struct list_head	dev_list;
 	struct hlist_node	napi_hash_node;
@@ -2706,11 +2707,18 @@ static inline void netdev_assert_locked_or_invisible(struct net_device *dev)
 		netdev_assert_locked(dev);
 }
 
-static inline void netif_napi_set_irq(struct napi_struct *napi, int irq)
+static inline void netif_napi_set_irq_locked(struct napi_struct *napi, int irq)
 {
 	napi->irq = irq;
 }
 
+static inline void netif_napi_set_irq(struct napi_struct *napi, int irq)
+{
+	netdev_lock(napi->dev);
+	netif_napi_set_irq_locked(napi, irq);
+	netdev_unlock(napi->dev);
+}
+
 /* Default NAPI poll() weight
  * Device drivers are strongly advised to not use bigger value
  */
diff --git a/net/core/dev.c b/net/core/dev.c
index d90bb100285d..495ceefcdc34 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6956,7 +6956,7 @@ void netif_napi_add_weight_locked(struct net_device *dev,
 	 */
 	if (dev->threaded && napi_kthread_create(napi))
 		dev->threaded = false;
-	netif_napi_set_irq(napi, -1);
+	netif_napi_set_irq_locked(napi, -1);
 }
 EXPORT_SYMBOL(netif_napi_add_weight_locked);
 
-- 
2.48.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH net-next v2 10/11] net: protect NAPI config fields with netdev_lock()
  2025-01-15  3:53 [PATCH net-next v2 00/11] net: use netdev->lock to protect NAPI Jakub Kicinski
                   ` (8 preceding siblings ...)
  2025-01-15  3:53 ` [PATCH net-next v2 09/11] net: protect napi->irq " Jakub Kicinski
@ 2025-01-15  3:53 ` Jakub Kicinski
  2025-01-15  9:15   ` Kuniyuki Iwashima
  2025-01-15  3:53 ` [PATCH net-next v2 11/11] netdev-genl: remove rtnl_lock protection from NAPI ops Jakub Kicinski
  2025-01-16  3:30 ` [PATCH net-next v2 00/11] net: use netdev->lock to protect NAPI patchwork-bot+netdevbpf
  11 siblings, 1 reply; 34+ messages in thread
From: Jakub Kicinski @ 2025-01-15  3:53 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
	Jakub Kicinski

Protect the following members of netdev and napi by netdev_lock:
 - defer_hard_irqs,
 - gro_flush_timeout,
 - irq_suspend_timeout.

The first two are written via sysfs (which this patch switches
to new lock), and netdev genl which holds both netdev and rtnl locks.

irq_suspend_timeout is only written by netdev genl.

Reviewed-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 include/linux/netdevice.h | 7 ++++---
 net/core/net-sysfs.c      | 5 +++--
 2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 390fb70667bf..6f1e5725aca9 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -384,11 +384,11 @@ struct napi_struct {
 	int			rx_count; /* length of rx_list */
 	unsigned int		napi_id; /* protected by netdev_lock */
 	struct hrtimer		timer;
-	struct task_struct	*thread; /* protected by netdev_lock */
+	/* all fields past this point are write-protected by netdev_lock */
+	struct task_struct	*thread;
 	unsigned long		gro_flush_timeout;
 	unsigned long		irq_suspend_timeout;
 	u32			defer_hard_irqs;
-	/* all fields past this point are write-protected by netdev_lock */
 	/* control-path-only fields follow */
 	struct list_head	dev_list;
 	struct hlist_node	napi_hash_node;
@@ -2452,7 +2452,8 @@ struct net_device {
 	 * Drivers are free to use it for other protection.
 	 *
 	 * Protects:
-	 *	@napi_list, @net_shaper_hierarchy, @reg_state, @threaded
+	 *	@gro_flush_timeout, @napi_defer_hard_irqs, @napi_list,
+	 *	@net_shaper_hierarchy, @reg_state, @threaded
 	 *
 	 * Partially protects (writers must hold both @lock and rtnl_lock):
 	 *	@up
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 9365a7185a1d..07cb99b114bd 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -450,7 +450,7 @@ static ssize_t gro_flush_timeout_store(struct device *dev,
 	if (!capable(CAP_NET_ADMIN))
 		return -EPERM;
 
-	return netdev_store(dev, attr, buf, len, change_gro_flush_timeout);
+	return netdev_lock_store(dev, attr, buf, len, change_gro_flush_timeout);
 }
 NETDEVICE_SHOW_RW(gro_flush_timeout, fmt_ulong);
 
@@ -470,7 +470,8 @@ static ssize_t napi_defer_hard_irqs_store(struct device *dev,
 	if (!capable(CAP_NET_ADMIN))
 		return -EPERM;
 
-	return netdev_store(dev, attr, buf, len, change_napi_defer_hard_irqs);
+	return netdev_lock_store(dev, attr, buf, len,
+				 change_napi_defer_hard_irqs);
 }
 NETDEVICE_SHOW_RW(napi_defer_hard_irqs, fmt_uint);
 
-- 
2.48.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH net-next v2 11/11] netdev-genl: remove rtnl_lock protection from NAPI ops
  2025-01-15  3:53 [PATCH net-next v2 00/11] net: use netdev->lock to protect NAPI Jakub Kicinski
                   ` (9 preceding siblings ...)
  2025-01-15  3:53 ` [PATCH net-next v2 10/11] net: protect NAPI config fields " Jakub Kicinski
@ 2025-01-15  3:53 ` Jakub Kicinski
  2025-01-15  9:18   ` Kuniyuki Iwashima
  2025-01-16  3:30 ` [PATCH net-next v2 00/11] net: use netdev->lock to protect NAPI patchwork-bot+netdevbpf
  11 siblings, 1 reply; 34+ messages in thread
From: Jakub Kicinski @ 2025-01-15  3:53 UTC (permalink / raw)
  To: davem
  Cc: netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
	Jakub Kicinski

NAPI lifetime, visibility and config are all fully under
netdev_lock protection now.

Reviewed-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 net/core/netdev-genl.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
index 810a446ab62c..715f85c6b62e 100644
--- a/net/core/netdev-genl.c
+++ b/net/core/netdev-genl.c
@@ -229,8 +229,6 @@ int netdev_nl_napi_get_doit(struct sk_buff *skb, struct genl_info *info)
 	if (!rsp)
 		return -ENOMEM;
 
-	rtnl_lock();
-
 	napi = netdev_napi_by_id_lock(genl_info_net(info), napi_id);
 	if (napi) {
 		err = netdev_nl_napi_fill_one(rsp, napi, info);
@@ -240,8 +238,6 @@ int netdev_nl_napi_get_doit(struct sk_buff *skb, struct genl_info *info)
 		err = -ENOENT;
 	}
 
-	rtnl_unlock();
-
 	if (err) {
 		goto err_free_msg;
 	} else if (!rsp->len) {
@@ -300,7 +296,6 @@ int netdev_nl_napi_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
 	if (info->attrs[NETDEV_A_NAPI_IFINDEX])
 		ifindex = nla_get_u32(info->attrs[NETDEV_A_NAPI_IFINDEX]);
 
-	rtnl_lock();
 	if (ifindex) {
 		netdev = netdev_get_by_index_lock(net, ifindex);
 		if (netdev) {
@@ -317,7 +312,6 @@ int netdev_nl_napi_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
 			ctx->napi_id = 0;
 		}
 	}
-	rtnl_unlock();
 
 	return err;
 }
@@ -358,8 +352,6 @@ int netdev_nl_napi_set_doit(struct sk_buff *skb, struct genl_info *info)
 
 	napi_id = nla_get_u32(info->attrs[NETDEV_A_NAPI_ID]);
 
-	rtnl_lock();
-
 	napi = netdev_napi_by_id_lock(genl_info_net(info), napi_id);
 	if (napi) {
 		err = netdev_nl_napi_set_config(napi, info);
@@ -369,8 +361,6 @@ int netdev_nl_napi_set_doit(struct sk_buff *skb, struct genl_info *info)
 		err = -ENOENT;
 	}
 
-	rtnl_unlock();
-
 	return err;
 }
 
-- 
2.48.0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v2 01/11] net: add netdev_lock() / netdev_unlock() helpers
  2025-01-15  3:53 ` [PATCH net-next v2 01/11] net: add netdev_lock() / netdev_unlock() helpers Jakub Kicinski
@ 2025-01-15  8:21   ` Kuniyuki Iwashima
  2025-01-15  8:36   ` Przemek Kitszel
  1 sibling, 0 replies; 34+ messages in thread
From: Kuniyuki Iwashima @ 2025-01-15  8:21 UTC (permalink / raw)
  To: kuba
  Cc: andrew+netdev, anthony.l.nguyen, davem, edumazet, horms, jdamato,
	jiri, netdev, pabeni, przemyslaw.kitszel, Kuniyuki Iwashima

From: Jakub Kicinski <kuba@kernel.org>
Date: Tue, 14 Jan 2025 19:53:09 -0800
> Add helpers for locking the netdev instance, use it in drivers
> and the shaper code. This will make grepping for the lock usage
> much easier, as we extend the lock to cover more fields.
> 
> Reviewed-by: Joe Damato <jdamato@fastly.com>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> Reviewed-by: Eric Dumazet <edumazet@google.com>

Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v2 02/11] net: make netdev_lock() protect netdev->reg_state
  2025-01-15  3:53 ` [PATCH net-next v2 02/11] net: make netdev_lock() protect netdev->reg_state Jakub Kicinski
@ 2025-01-15  8:30   ` Kuniyuki Iwashima
  2025-01-15 14:18     ` Jakub Kicinski
  0 siblings, 1 reply; 34+ messages in thread
From: Kuniyuki Iwashima @ 2025-01-15  8:30 UTC (permalink / raw)
  To: kuba
  Cc: andrew+netdev, davem, edumazet, horms, jdamato, netdev, pabeni,
	Kuniyuki Iwashima

From: Jakub Kicinski <kuba@kernel.org>
Date: Tue, 14 Jan 2025 19:53:10 -0800
> Protect writes to netdev->reg_state with netdev_lock().
> From now on holding netdev_lock() is sufficient to prevent
> the net_device from getting unregistered, so code which
> wants to hold just a single netdev around no longer needs
> to hold rtnl_lock.
> 
> We do not protect the NETREG_UNREGISTERED -> NETREG_RELEASED
> transition. We'd need to move mutex_destroy(netdev->lock)
> to .release, but the real reason is that trying to stop
> the unregistration process mid-way would be unsafe / crazy.
> Taking references on such devices is not safe, either.
> So the intended semantics are to lock REGISTERED devices.
> 
> Reviewed-by: Joe Damato <jdamato@fastly.com>
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> v2:
>  - reorder with next patch
> v1: https://lore.kernel.org/20250114035118.110297-4-kuba@kernel.org
> ---
>  include/linux/netdevice.h | 2 +-
>  net/core/dev.c            | 6 ++++++
>  2 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 891c5bdb894c..30963c5d409b 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -2448,7 +2448,7 @@ struct net_device {
>  	 * Should always be taken using netdev_lock() / netdev_unlock() helpers.
>  	 * Drivers are free to use it for other protection.
>  	 *
> -	 * Protects: @net_shaper_hierarchy.
> +	 * Protects: @reg_state, @net_shaper_hierarchy.
>  	 * Ordering: take after rtnl_lock.
>  	 */
>  	struct mutex		lock;
> diff --git a/net/core/dev.c b/net/core/dev.c
> index fda4e1039bf0..6603c08768f6 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -10668,7 +10668,9 @@ int register_netdevice(struct net_device *dev)
>  
>  	ret = netdev_register_kobject(dev);
>  
> +	netdev_lock(dev);
>  	WRITE_ONCE(dev->reg_state, ret ? NETREG_UNREGISTERED : NETREG_REGISTERED);
> +	netdev_unlock(dev);

Do we need the lock before list_netdevice() ?

It's not a big deal, so

Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v2 01/11] net: add netdev_lock() / netdev_unlock() helpers
  2025-01-15  3:53 ` [PATCH net-next v2 01/11] net: add netdev_lock() / netdev_unlock() helpers Jakub Kicinski
  2025-01-15  8:21   ` Kuniyuki Iwashima
@ 2025-01-15  8:36   ` Przemek Kitszel
  2025-01-15  9:24     ` Ido Schimmel
  2025-01-15 14:08     ` Jakub Kicinski
  1 sibling, 2 replies; 34+ messages in thread
From: Przemek Kitszel @ 2025-01-15  8:36 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, jiri, anthony.l.nguyen, edumazet, pabeni, andrew+netdev,
	horms, jdamato, davem

On 1/15/25 04:53, Jakub Kicinski wrote:
> Add helpers for locking the netdev instance, use it in drivers
> and the shaper code. This will make grepping for the lock usage
> much easier, as we extend the lock to cover more fields.
> 
> Reviewed-by: Joe Damato <jdamato@fastly.com>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> ---
> CC: anthony.l.nguyen@intel.com
> CC: przemyslaw.kitszel@intel.com
> CC: jiri@resnulli.us
> ---
>   include/linux/netdevice.h                   | 23 ++++++-
>   drivers/net/ethernet/intel/iavf/iavf_main.c | 74 ++++++++++-----------
>   drivers/net/netdevsim/ethtool.c             |  4 +-
>   net/shaper/shaper.c                         |  6 +-
>   4 files changed, 63 insertions(+), 44 deletions(-)

Thank you,
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>

and Ack for iavf too

> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index bced03fb349e..891c5bdb894c 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -2444,8 +2444,12 @@ struct net_device {
>   	u32			napi_defer_hard_irqs;
>   
>   	/**
> -	 * @lock: protects @net_shaper_hierarchy, feel free to use for other
> -	 * netdev-scope protection. Ordering: take after rtnl_lock.
> +	 * @lock: netdev-scope lock, protects a small selection of fields.
> +	 * Should always be taken using netdev_lock() / netdev_unlock() helpers.
> +	 * Drivers are free to use it for other protection.

As with devl_lock(), would be good to specify the ordering for those who
happen to take both. My guess is that devl_lock() is after netdev_lock()

> +	 *
> +	 * Protects: @net_shaper_hierarchy.
> +	 * Ordering: take after rtnl_lock.
>   	 */
>   	struct mutex		lock;
>   


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v2 03/11] net: add helpers for lookup and walking netdevs under netdev_lock()
  2025-01-15  3:53 ` [PATCH net-next v2 03/11] net: add helpers for lookup and walking netdevs under netdev_lock() Jakub Kicinski
@ 2025-01-15  8:41   ` Kuniyuki Iwashima
  0 siblings, 0 replies; 34+ messages in thread
From: Kuniyuki Iwashima @ 2025-01-15  8:41 UTC (permalink / raw)
  To: kuba
  Cc: andrew+netdev, davem, edumazet, horms, jdamato, netdev, pabeni,
	Kuniyuki Iwashima

From: Jakub Kicinski <kuba@kernel.org>
Date: Tue, 14 Jan 2025 19:53:11 -0800
> Add helpers for accessing netdevs under netdev_lock().
> There's some careful handling needed to find the device and lock it
> safely, without it getting unregistered, and without taking rtnl_lock
> (the latter being the whole point of the new locking, after all).
> 
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v2 04/11] net: add netdev->up protected by netdev_lock()
  2025-01-15  3:53 ` [PATCH net-next v2 04/11] net: add netdev->up protected by netdev_lock() Jakub Kicinski
@ 2025-01-15  8:45   ` Kuniyuki Iwashima
  0 siblings, 0 replies; 34+ messages in thread
From: Kuniyuki Iwashima @ 2025-01-15  8:45 UTC (permalink / raw)
  To: kuba
  Cc: andrew+netdev, davem, edumazet, horms, jdamato, netdev, pabeni,
	Kuniyuki Iwashima

From: Jakub Kicinski <kuba@kernel.org>
Date: Tue, 14 Jan 2025 19:53:12 -0800
> Some uAPI (netdev netlink) hide net_device's sub-objects while
> the interface is down to ensure uniform behavior across drivers.
> To remove the rtnl_lock dependency from those uAPIs we need a way
> to safely tell if the device is down or up.
> 
> Add an indication of whether device is open or closed, protected
> by netdev->lock. The semantics are the same as IFF_UP, but taking
> netdev_lock around every write to ->flags would be a lot of code
> churn.
> 
> We don't want to blanket the entire open / close path by netdev_lock,
> because it will prevent us from applying it to specific structures -
> core helpers won't be able to take that lock from any function
> called by the drivers on open/close paths.
> 
> So the state of the flag is "pessimistic", as in it may report false
> negatives, but never false positives.
> 
> Reviewed-by: Joe Damato <jdamato@fastly.com>
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v2 05/11] net: protect netdev->napi_list with netdev_lock()
  2025-01-15  3:53 ` [PATCH net-next v2 05/11] net: protect netdev->napi_list with netdev_lock() Jakub Kicinski
@ 2025-01-15  8:57   ` Kuniyuki Iwashima
  2025-01-17 22:21     ` Eric Dumazet
  0 siblings, 1 reply; 34+ messages in thread
From: Kuniyuki Iwashima @ 2025-01-15  8:57 UTC (permalink / raw)
  To: kuba
  Cc: andrew+netdev, davem, edumazet, horms, jdamato, netdev, pabeni,
	Kuniyuki Iwashima

From: Jakub Kicinski <kuba@kernel.org>
Date: Tue, 14 Jan 2025 19:53:13 -0800
> Hold netdev->lock when NAPIs are getting added or removed.
> This will allow safe access to NAPI instances of a net_device
> without rtnl_lock.
> 
> Create a family of helpers which assume the lock is already taken.
> Switch iavf to them, as it makes extensive use of netdev->lock,
> already.
> 
> Reviewed-by: Joe Damato <jdamato@fastly.com>
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v2 06/11] net: protect NAPI enablement with netdev_lock()
  2025-01-15  3:53 ` [PATCH net-next v2 06/11] net: protect NAPI enablement " Jakub Kicinski
@ 2025-01-15  9:02   ` Kuniyuki Iwashima
  2025-01-21  8:32   ` Dan Carpenter
  1 sibling, 0 replies; 34+ messages in thread
From: Kuniyuki Iwashima @ 2025-01-15  9:02 UTC (permalink / raw)
  To: kuba
  Cc: andrew+netdev, anthony.l.nguyen, davem, edumazet, horms, jdamato,
	marcin.s.wojtas, netdev, pabeni, pcnet32, przemyslaw.kitszel,
	romieu, Kuniyuki Iwashima

From: Jakub Kicinski <kuba@kernel.org>
Date: Tue, 14 Jan 2025 19:53:14 -0800
> Wrap napi_enable() / napi_disable() with netdev_lock().
> Provide the "already locked" flavor of the API.
> 
> iavf needs the usual adjustment. A number of drivers call
> napi_enable() under a spin lock, so they have to be modified
> to take netdev_lock() first, then spin lock then call
> napi_enable_locked().
> 
> Protecting napi_enable() implies that napi->napi_id is protected
> by netdev_lock().
> 
> Acked-by: Francois Romieu <romieu@fr.zoreil.com> # via-velocity
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v2 07/11] net: make netdev netlink ops hold netdev_lock()
  2025-01-15  3:53 ` [PATCH net-next v2 07/11] net: make netdev netlink ops hold netdev_lock() Jakub Kicinski
@ 2025-01-15  9:07   ` Kuniyuki Iwashima
  0 siblings, 0 replies; 34+ messages in thread
From: Kuniyuki Iwashima @ 2025-01-15  9:07 UTC (permalink / raw)
  To: kuba
  Cc: andrew+netdev, davem, edumazet, horms, jdamato, netdev, pabeni,
	Kuniyuki Iwashima

From: Jakub Kicinski <kuba@kernel.org>
Date: Tue, 14 Jan 2025 19:53:15 -0800
> In prep for dropping rtnl_lock, start locking netdev->lock in netlink
> genl ops. We need to be using netdev->up instead of flags & IFF_UP.
> 
> We can remove the RCU lock protection for the NAPI since NAPI list
> is protected by netdev->lock already.
> 
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v2 08/11] net: protect threaded status of NAPI with netdev_lock()
  2025-01-15  3:53 ` [PATCH net-next v2 08/11] net: protect threaded status of NAPI with netdev_lock() Jakub Kicinski
@ 2025-01-15  9:09   ` Kuniyuki Iwashima
  0 siblings, 0 replies; 34+ messages in thread
From: Kuniyuki Iwashima @ 2025-01-15  9:09 UTC (permalink / raw)
  To: kuba
  Cc: andrew+netdev, davem, edumazet, horms, jdamato, leitao, netdev,
	pabeni, Kuniyuki Iwashima

From: Jakub Kicinski <kuba@kernel.org>
Date: Tue, 14 Jan 2025 19:53:16 -0800
> Now that NAPI instances can't come and go without holding
> netdev->lock we can trivially switch from rtnl_lock() to
> netdev_lock() for setting netdev->threaded via sysfs.
> 
> Note that since we do not lock netdev_lock around sysfs
> calls in the core we don't have to "trylock" like we do
> with rtnl_lock.
> 
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v2 09/11] net: protect napi->irq with netdev_lock()
  2025-01-15  3:53 ` [PATCH net-next v2 09/11] net: protect napi->irq " Jakub Kicinski
@ 2025-01-15  9:12   ` Kuniyuki Iwashima
  0 siblings, 0 replies; 34+ messages in thread
From: Kuniyuki Iwashima @ 2025-01-15  9:12 UTC (permalink / raw)
  To: kuba
  Cc: andrew+netdev, davem, edumazet, horms, jdamato, netdev, pabeni,
	Kuniyuki Iwashima

From: Jakub Kicinski <kuba@kernel.org>
Date: Tue, 14 Jan 2025 19:53:17 -0800
> Take netdev_lock() in netif_napi_set_irq(). All NAPI "control fields"
> are now protected by that lock (most of the other ones are set during
> napi add/del). The napi_hash_node is fully protected by the hash
> spin lock, but close enough for the kdoc...
> 
> Reviewed-by: Joe Damato <jdamato@fastly.com>
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v2 10/11] net: protect NAPI config fields with netdev_lock()
  2025-01-15  3:53 ` [PATCH net-next v2 10/11] net: protect NAPI config fields " Jakub Kicinski
@ 2025-01-15  9:15   ` Kuniyuki Iwashima
  0 siblings, 0 replies; 34+ messages in thread
From: Kuniyuki Iwashima @ 2025-01-15  9:15 UTC (permalink / raw)
  To: kuba
  Cc: andrew+netdev, davem, edumazet, horms, jdamato, netdev, pabeni,
	Kuniyuki Iwashima

From: Jakub Kicinski <kuba@kernel.org>
Date: Tue, 14 Jan 2025 19:53:18 -0800
> Protect the following members of netdev and napi by netdev_lock:
>  - defer_hard_irqs,
>  - gro_flush_timeout,
>  - irq_suspend_timeout.
> 
> The first two are written via sysfs (which this patch switches
> to new lock), and netdev genl which holds both netdev and rtnl locks.
> 
> irq_suspend_timeout is only written by netdev genl.
> 
> Reviewed-by: Joe Damato <jdamato@fastly.com>
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>


^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v2 11/11] netdev-genl: remove rtnl_lock protection from NAPI ops
  2025-01-15  3:53 ` [PATCH net-next v2 11/11] netdev-genl: remove rtnl_lock protection from NAPI ops Jakub Kicinski
@ 2025-01-15  9:18   ` Kuniyuki Iwashima
  0 siblings, 0 replies; 34+ messages in thread
From: Kuniyuki Iwashima @ 2025-01-15  9:18 UTC (permalink / raw)
  To: kuba
  Cc: andrew+netdev, davem, edumazet, horms, jdamato, netdev, pabeni,
	Kuniyuki Iwashima

From: Jakub Kicinski <kuba@kernel.org>
Date: Tue, 14 Jan 2025 19:53:19 -0800
> NAPI lifetime, visibility and config are all fully under
> netdev_lock protection now.
> 
> Reviewed-by: Joe Damato <jdamato@fastly.com>
> Reviewed-by: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>

Thanks for the series!

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v2 01/11] net: add netdev_lock() / netdev_unlock() helpers
  2025-01-15  8:36   ` Przemek Kitszel
@ 2025-01-15  9:24     ` Ido Schimmel
  2025-01-15 14:08     ` Jakub Kicinski
  1 sibling, 0 replies; 34+ messages in thread
From: Ido Schimmel @ 2025-01-15  9:24 UTC (permalink / raw)
  To: Przemek Kitszel
  Cc: Jakub Kicinski, netdev, jiri, anthony.l.nguyen, edumazet, pabeni,
	andrew+netdev, horms, jdamato, davem

On Wed, Jan 15, 2025 at 09:36:11AM +0100, Przemek Kitszel wrote:
> On 1/15/25 04:53, Jakub Kicinski wrote:
> > Add helpers for locking the netdev instance, use it in drivers
> > and the shaper code. This will make grepping for the lock usage
> > much easier, as we extend the lock to cover more fields.
> > 
> > Reviewed-by: Joe Damato <jdamato@fastly.com>
> > Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> > Reviewed-by: Eric Dumazet <edumazet@google.com>
> > ---
> > CC: anthony.l.nguyen@intel.com
> > CC: przemyslaw.kitszel@intel.com
> > CC: jiri@resnulli.us
> > ---
> >   include/linux/netdevice.h                   | 23 ++++++-
> >   drivers/net/ethernet/intel/iavf/iavf_main.c | 74 ++++++++++-----------
> >   drivers/net/netdevsim/ethtool.c             |  4 +-
> >   net/shaper/shaper.c                         |  6 +-
> >   4 files changed, 63 insertions(+), 44 deletions(-)
> 
> Thank you,
> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
> 
> and Ack for iavf too
> 
> > 
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index bced03fb349e..891c5bdb894c 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -2444,8 +2444,12 @@ struct net_device {
> >   	u32			napi_defer_hard_irqs;
> >   	/**
> > -	 * @lock: protects @net_shaper_hierarchy, feel free to use for other
> > -	 * netdev-scope protection. Ordering: take after rtnl_lock.
> > +	 * @lock: netdev-scope lock, protects a small selection of fields.
> > +	 * Should always be taken using netdev_lock() / netdev_unlock() helpers.
> > +	 * Drivers are free to use it for other protection.
> 
> As with devl_lock(), would be good to specify the ordering for those who
> happen to take both. My guess is that devl_lock() is after netdev_lock()

devl_lock() protects the entire devlink instance and net devices are
registered under the instance lock, so I expect the order to be:

devl_lock() -> rtnl_lock() -> netdev_lock()

> 
> > +	 *
> > +	 * Protects: @net_shaper_hierarchy.
> > +	 * Ordering: take after rtnl_lock.
> >   	 */
> >   	struct mutex		lock;
> 
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v2 01/11] net: add netdev_lock() / netdev_unlock() helpers
  2025-01-15  8:36   ` Przemek Kitszel
  2025-01-15  9:24     ` Ido Schimmel
@ 2025-01-15 14:08     ` Jakub Kicinski
  2025-01-15 14:24       ` Przemek Kitszel
  1 sibling, 1 reply; 34+ messages in thread
From: Jakub Kicinski @ 2025-01-15 14:08 UTC (permalink / raw)
  To: Przemek Kitszel
  Cc: netdev, jiri, anthony.l.nguyen, edumazet, pabeni, andrew+netdev,
	horms, jdamato, davem

On Wed, 15 Jan 2025 09:36:11 +0100 Przemek Kitszel wrote:
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index bced03fb349e..891c5bdb894c 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -2444,8 +2444,12 @@ struct net_device {
> >   	u32			napi_defer_hard_irqs;
> >   
> >   	/**
> > -	 * @lock: protects @net_shaper_hierarchy, feel free to use for other
> > -	 * netdev-scope protection. Ordering: take after rtnl_lock.
> > +	 * @lock: netdev-scope lock, protects a small selection of fields.
> > +	 * Should always be taken using netdev_lock() / netdev_unlock() helpers.
> > +	 * Drivers are free to use it for other protection.  
> 
> As with devl_lock(), would be good to specify the ordering for those who
> happen to take both. My guess is that devl_lock() is after netdev_lock()

The ordering is transitive, since devl_ is before rtnl_ there is 
no ambiguity. Or so I think :)

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v2 02/11] net: make netdev_lock() protect netdev->reg_state
  2025-01-15  8:30   ` Kuniyuki Iwashima
@ 2025-01-15 14:18     ` Jakub Kicinski
  0 siblings, 0 replies; 34+ messages in thread
From: Jakub Kicinski @ 2025-01-15 14:18 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: andrew+netdev, davem, edumazet, horms, jdamato, netdev, pabeni

On Wed, 15 Jan 2025 17:30:23 +0900 Kuniyuki Iwashima wrote:
> > @@ -10668,7 +10668,9 @@ int register_netdevice(struct net_device *dev)
> >  
> >  	ret = netdev_register_kobject(dev);
> >  
> > +	netdev_lock(dev);
> >  	WRITE_ONCE(dev->reg_state, ret ? NETREG_UNREGISTERED : NETREG_REGISTERED);
> > +	netdev_unlock(dev);  
> 
> Do we need the lock before list_netdevice() ?

Fair point, we don't. I couldn't decide whether it's more logical 
to skip the locking since device is not listed, or lock it, just
because we say that @reg_state is supposed to be write protected.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v2 01/11] net: add netdev_lock() / netdev_unlock() helpers
  2025-01-15 14:08     ` Jakub Kicinski
@ 2025-01-15 14:24       ` Przemek Kitszel
  0 siblings, 0 replies; 34+ messages in thread
From: Przemek Kitszel @ 2025-01-15 14:24 UTC (permalink / raw)
  To: Jakub Kicinski, Ido Schimmel
  Cc: netdev, jiri, anthony.l.nguyen, edumazet, pabeni, andrew+netdev,
	horms, jdamato, davem

On 1/15/25 15:08, Jakub Kicinski wrote:
> On Wed, 15 Jan 2025 09:36:11 +0100 Przemek Kitszel wrote:
>>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>> index bced03fb349e..891c5bdb894c 100644
>>> --- a/include/linux/netdevice.h
>>> +++ b/include/linux/netdevice.h
>>> @@ -2444,8 +2444,12 @@ struct net_device {
>>>    	u32			napi_defer_hard_irqs;
>>>    
>>>    	/**
>>> -	 * @lock: protects @net_shaper_hierarchy, feel free to use for other
>>> -	 * netdev-scope protection. Ordering: take after rtnl_lock.
>>> +	 * @lock: netdev-scope lock, protects a small selection of fields.
>>> +	 * Should always be taken using netdev_lock() / netdev_unlock() helpers.
>>> +	 * Drivers are free to use it for other protection.
>>
>> As with devl_lock(), would be good to specify the ordering for those who
>> happen to take both. My guess is that devl_lock() is after netdev_lock()
> 
> The ordering is transitive, since devl_ is before rtnl_ there is
> no ambiguity. Or so I think :)
> 

sure thing, sorry for not checking prior to asking :)
thanks to both you and Ido for answering!

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v2 00/11] net: use netdev->lock to protect NAPI
  2025-01-15  3:53 [PATCH net-next v2 00/11] net: use netdev->lock to protect NAPI Jakub Kicinski
                   ` (10 preceding siblings ...)
  2025-01-15  3:53 ` [PATCH net-next v2 11/11] netdev-genl: remove rtnl_lock protection from NAPI ops Jakub Kicinski
@ 2025-01-16  3:30 ` patchwork-bot+netdevbpf
  11 siblings, 0 replies; 34+ messages in thread
From: patchwork-bot+netdevbpf @ 2025-01-16  3:30 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, jdamato

Hello:

This series was applied to netdev/net-next.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Tue, 14 Jan 2025 19:53:08 -0800 you wrote:
> We recently added a lock member to struct net_device, with a vague
> plan to start using it to protect netdev-local state, removing
> the need to take rtnl_lock for new configuration APIs.
> 
> Lay some groundwork and use this lock for protecting NAPI APIs.
> 
> v2:
>  - reorder patches 2 and 3
>  - add missing READ_ONCE()
>  - fix up the kdoc to please Sphinx / htmldocs
>  - use napi_disabled_locked() in via-velocity
>  - update the comment on dev_isalive()
> v1: https://lore.kernel.org/20250114035118.110297-1-kuba@kernel.org
> 
> [...]

Here is the summary with links:
  - [net-next,v2,01/11] net: add netdev_lock() / netdev_unlock() helpers
    (no matching commit)
  - [net-next,v2,02/11] net: make netdev_lock() protect netdev->reg_state
    https://git.kernel.org/netdev/net-next/c/5fda3f35349b
  - [net-next,v2,03/11] net: add helpers for lookup and walking netdevs under netdev_lock()
    (no matching commit)
  - [net-next,v2,04/11] net: add netdev->up protected by netdev_lock()
    (no matching commit)
  - [net-next,v2,05/11] net: protect netdev->napi_list with netdev_lock()
    (no matching commit)
  - [net-next,v2,06/11] net: protect NAPI enablement with netdev_lock()
    (no matching commit)
  - [net-next,v2,07/11] net: make netdev netlink ops hold netdev_lock()
    (no matching commit)
  - [net-next,v2,08/11] net: protect threaded status of NAPI with netdev_lock()
    https://git.kernel.org/netdev/net-next/c/1bb86cf8f44b
  - [net-next,v2,09/11] net: protect napi->irq with netdev_lock()
    https://git.kernel.org/netdev/net-next/c/53ed30800d3f
  - [net-next,v2,10/11] net: protect NAPI config fields with netdev_lock()
    https://git.kernel.org/netdev/net-next/c/e7ed2ba757bf
  - [net-next,v2,11/11] netdev-genl: remove rtnl_lock protection from NAPI ops
    https://git.kernel.org/netdev/net-next/c/062e78917222

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v2 05/11] net: protect netdev->napi_list with netdev_lock()
  2025-01-15  8:57   ` Kuniyuki Iwashima
@ 2025-01-17 22:21     ` Eric Dumazet
  2025-01-17 22:52       ` Jakub Kicinski
  0 siblings, 1 reply; 34+ messages in thread
From: Eric Dumazet @ 2025-01-17 22:21 UTC (permalink / raw)
  To: Kuniyuki Iwashima
  Cc: kuba, andrew+netdev, davem, horms, jdamato, netdev, pabeni

On Wed, Jan 15, 2025 at 9:57 AM Kuniyuki Iwashima <kuniyu@amazon.com> wrote:
>
> From: Jakub Kicinski <kuba@kernel.org>
> Date: Tue, 14 Jan 2025 19:53:13 -0800
> > Hold netdev->lock when NAPIs are getting added or removed.
> > This will allow safe access to NAPI instances of a net_device
> > without rtnl_lock.
> >
> > Create a family of helpers which assume the lock is already taken.
> > Switch iavf to them, as it makes extensive use of netdev->lock,
> > already.
> >
> > Reviewed-by: Joe Damato <jdamato@fastly.com>
> > Reviewed-by: Eric Dumazet <edumazet@google.com>
> > Signed-off-by: Jakub Kicinski <kuba@kernel.org>
>
> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>

Jakub, has anyone sent the following fix yet ?

CONFIG_DEBUG_MUTEXES=y should show a splat I think otherwise.

[1]
DEBUG_LOCKS_WARN_ON(lock->magic != lock)
WARNING: CPU: 0 PID: 5971 at kernel/locking/mutex.c:564
__mutex_lock_common kernel/locking/mutex.c:564 [inline]
WARNING: CPU: 0 PID: 5971 at kernel/locking/mutex.c:564
__mutex_lock+0xdac/0xee0 kernel/locking/mutex.c:735
Modules linked in:
CPU: 0 UID: 0 PID: 5971 Comm: syz-executor Not tainted
6.13.0-rc7-syzkaller-01131-g8d20dcda404d #0
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 12/27/2024
RIP: 0010:__mutex_lock_common kernel/locking/mutex.c:564 [inline]
RIP: 0010:__mutex_lock+0xdac/0xee0 kernel/locking/mutex.c:735
Code: 0f b6 04 38 84 c0 0f 85 1a 01 00 00 83 3d 6f 40 4c 04 00 75 19
90 48 c7 c7 60 84 0a 8c 48 c7 c6 00 85 0a 8c e8 f5 dc 91 f5 90 <0f> 0b
90 90 90 e9 c7 f3 ff ff 90 0f 0b 90 e9 29 f8 ff ff 90 0f 0b
RSP: 0018:ffffc90003317580 EFLAGS: 00010246
RAX: ee0f97edaf7b7d00 RBX: ffff8880299f8cb0 RCX: ffff8880323c9e00
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffc90003317710 R08: ffffffff81602ac2 R09: 1ffff110170c519a
R10: dffffc0000000000 R11: ffffed10170c519b R12: 0000000000000000
R13: 0000000000000000 R14: 1ffff92000662ec4 R15: dffffc0000000000
FS: 000055557a046500(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fd581d46ff8 CR3: 000000006f870000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
netdev_lock include/linux/netdevice.h:2691 [inline]
__netif_napi_del include/linux/netdevice.h:2829 [inline]
netif_napi_del include/linux/netdevice.h:2848 [inline]
free_netdev+0x2d9/0x610 net/core/dev.c:11621
netdev_run_todo+0xf21/0x10d0 net/core/dev.c:11189
nsim_destroy+0x3c3/0x620 drivers/net/netdevsim/netdev.c:1028
__nsim_dev_port_del+0x14b/0x1b0 drivers/net/netdevsim/dev.c:1428
nsim_dev_port_del_all drivers/net/netdevsim/dev.c:1440 [inline]
nsim_dev_reload_destroy+0x28a/0x490 drivers/net/netdevsim/dev.c:1661
nsim_drv_remove+0x58/0x160 drivers/net/netdevsim/dev.c:1676
device_remove drivers/base/dd.c:567 [inline]
__device_release_driver drivers/base/dd.c:1273 [inline]
device_release_driver_internal+0x4a9/0x7c0 drivers/base/dd.c:1296
bus_remove_device+0x34f/0x420 drivers/base/bus.c:576
device_del+0x57a/0x9b0 drivers/base/core.c:3854
device_unregister+0x20/0xc0 drivers/base/core.c:3895
nsim_bus_dev_del drivers/net/netdevsim/bus.c:462 [inline]
del_device_store+0x363/0x480 drivers/net/netdevsim/bus.c:226
kernfs_fop_write_iter+0x3a0/0x500 fs/kernfs/file.c:334
new_sync_write fs/read_write.c:586 [inline]
vfs_write+0xaeb/0xd30 fs/read_write.c:679
ksys_write+0x18f/0x2b0 fs/read_write.c:731
do_syscall_x64 arch/x86/entry/common.c:52 [inline]

diff --git a/net/core/dev.c b/net/core/dev.c
index fe5f5855593db34cb4bc31e6a637b59b9041bb73..fab4899b83f745a3c13c982775e287b1ff2f547d
100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -11593,8 +11593,6 @@ void free_netdev(struct net_device *dev)
                return;
        }

-       mutex_destroy(&dev->lock);
-
        kfree(dev->ethtool);
        netif_free_tx_queues(dev);
        netif_free_rx_queues(dev);
@@ -11621,6 +11619,8 @@ void free_netdev(struct net_device *dev)

        netdev_free_phy_link_topology(dev);

+       mutex_destroy(&dev->lock);
+
        /*  Compatibility with error handling in drivers */
        if (dev->reg_state == NETREG_UNINITIALIZED ||
            dev->reg_state == NETREG_DUMMY) {

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v2 05/11] net: protect netdev->napi_list with netdev_lock()
  2025-01-17 22:21     ` Eric Dumazet
@ 2025-01-17 22:52       ` Jakub Kicinski
  0 siblings, 0 replies; 34+ messages in thread
From: Jakub Kicinski @ 2025-01-17 22:52 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Kuniyuki Iwashima, andrew+netdev, davem, horms, jdamato, netdev,
	pabeni

On Fri, 17 Jan 2025 23:21:43 +0100 Eric Dumazet wrote:
> > > Reviewed-by: Joe Damato <jdamato@fastly.com>
> > > Reviewed-by: Eric Dumazet <edumazet@google.com>
> > > Signed-off-by: Jakub Kicinski <kuba@kernel.org>  
> >
> > Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>  
> 
> Jakub, has anyone sent the following fix yet ?
> 
> CONFIG_DEBUG_MUTEXES=y should show a splat I think otherwise.

Ugh, no, I didn't see such a fix, looks good!

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v2 06/11] net: protect NAPI enablement with netdev_lock()
  2025-01-15  3:53 ` [PATCH net-next v2 06/11] net: protect NAPI enablement " Jakub Kicinski
  2025-01-15  9:02   ` Kuniyuki Iwashima
@ 2025-01-21  8:32   ` Dan Carpenter
  2025-01-21  8:50     ` David Laight
  2025-01-21 15:27     ` Dan Carpenter
  1 sibling, 2 replies; 34+ messages in thread
From: Dan Carpenter @ 2025-01-21  8:32 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
	Francois Romieu, pcnet32, anthony.l.nguyen, przemyslaw.kitszel,
	marcin.s.wojtas

On Tue, Jan 14, 2025 at 07:53:14PM -0800, Jakub Kicinski wrote:
> Wrap napi_enable() / napi_disable() with netdev_lock().
> Provide the "already locked" flavor of the API.
> 
> iavf needs the usual adjustment. A number of drivers call
> napi_enable() under a spin lock, so they have to be modified
> to take netdev_lock() first, then spin lock then call
> napi_enable_locked().

You missed some.

drivers/net/ethernet/broadcom/tg3.c:7427 tg3_napi_enable() warn: sleeping in atomic context
drivers/net/ethernet/nvidia/forcedeth.c:5597 nv_open() warn: sleeping in atomic context
drivers/net/ethernet/realtek/8139too.c:1697 rtl8139_tx_timeout_task() warn: sleeping in atomic context
drivers/net/ethernet/sun/niu.c:6530 niu_reset_task() warn: sleeping in atomic context
drivers/net/ethernet/sun/niu.c:9944 niu_resume() warn: sleeping in atomic context
drivers/net/ethernet/via/via-rhine.c:1740 rhine_reset_task() warn: sleeping in atomic context
drivers/net/ethernet/via/via-rhine.c:2545 rhine_resume() warn: sleeping in atomic context
drivers/net/wireless/mediatek/mt76/mt7603/mac.c:1483 mt7603_mac_watchdog_reset() warn: sleeping in atomic context
drivers/net/wireless/mediatek/mt76/mt7615/pci.c:169 mt7615_pci_resume() warn: sleeping in atomic context
drivers/net/wireless/mediatek/mt76/mt7615/pci_mac.c:266 mt7615_mac_reset_work() warn: sleeping in atomic context
drivers/net/wireless/mediatek/mt76/mt76x02_mmio.c:508 mt76x02_watchdog_reset() warn: sleeping in atomic context
drivers/net/wireless/mediatek/mt76/mt76x0/pci.c:288 mt76x0e_resume() warn: sleeping in atomic context
drivers/net/wireless/mediatek/mt76/mt76x2/pci.c:156 mt76x2e_resume() warn: sleeping in atomic context
drivers/net/wireless/mediatek/mt76/mt7915/mac.c:1362 mt7915_mac_restart() warn: sleeping in atomic context
drivers/net/wireless/mediatek/mt76/mt7915/mac.c:1575 mt7915_mac_reset_work() warn: sleeping in atomic context
drivers/net/wireless/mediatek/mt76/mt7921/pci.c:528 mt7921_pci_resume() warn: sleeping in atomic context
drivers/net/wireless/mediatek/mt76/mt7921/pci_mac.c:86 mt7921e_mac_reset() warn: sleeping in atomic context
drivers/net/wireless/mediatek/mt76/mt7925/pci.c:561 mt7925_pci_resume() warn: sleeping in atomic context
drivers/net/wireless/mediatek/mt76/mt7925/pci_mac.c:106 mt7925e_mac_reset() warn: sleeping in atomic context
drivers/net/wireless/mediatek/mt76/mt7996/mac.c:1705 mt7996_mac_restart() warn: sleeping in atomic context
drivers/net/wireless/mediatek/mt76/mt7996/mac.c:1967 mt7996_mac_reset_work() warn: sleeping in atomic context

The big one is the local_bh_disable() the mediatek driver.  Let me break
it down a bit more below.

regards,
dan carpenter

=========================================================
drivers/net/ethernet/broadcom/tg3.c:7427 tg3_napi_enable() warn: sleeping in atomic context
static void tg3_napi_enable(struct tg3 *tp)
{
	int txq_idx = 0, rxq_idx = 0;
	struct tg3_napi *tnapi;
	int i;

	for (i = 0; i < tp->irq_cnt; i++) {
		tnapi = &tp->napi[i];
---> 		napi_enable(&tnapi->napi);
		if (tnapi->tx_buffers) {
			netif_queue_set_napi(tp->dev, txq_idx,
					     NETDEV_QUEUE_TYPE_TX,
					     &tnapi->napi);
			txq_idx++;
		}
		if (tnapi->rx_rcb) {
			netif_queue_set_napi(tp->dev, rxq_idx,
					     NETDEV_QUEUE_TYPE_RX,
					     &tnapi->napi);
			rxq_idx++;
		}
	}
}

This is the trickiest one.  The spinlock is two steps away in
the tg3_full_lock().

tg3_reset_task() <- disables preempt
tg3_set_ringparam() <- disables preempt
tg3_set_pauseparam() <- disables preempt
tg3_self_test() <- disables preempt
tg3_change_mtu() <- disables preempt
tg3_resume() <- disables preempt
tg3_io_resume() <- disables preempt
-> tg3_netif_start()
   -> tg3_napi_enable()

=========================================================
drivers/net/ethernet/nvidia/forcedeth.c:5597 nv_open() warn: sleeping in atomic context
	spin_lock_irq(&np->lock);
        ^^^^^^^^^^^^^^^^^^^^^^^^
	writel(NVREG_MCASTADDRA_FORCE, base + NvRegMulticastAddrA);
	writel(0, base + NvRegMulticastAddrB);
	writel(NVREG_MCASTMASKA_NONE, base + NvRegMulticastMaskA);
	writel(NVREG_MCASTMASKB_NONE, base + NvRegMulticastMaskB);
	writel(NVREG_PFF_ALWAYS|NVREG_PFF_MYADDR, base + NvRegPacketFilterFlags);
	/* One manual link speed update: Interrupts are enabled, future link
	 * speed changes cause interrupts and are handled by nv_link_irq().
	 */
	readl(base + NvRegMIIStatus);
	writel(NVREG_MIISTAT_MASK_ALL, base + NvRegMIIStatus);

	/* set linkspeed to invalid value, thus force nv_update_linkspeed
	 * to init hw */
	np->linkspeed = 0;
	ret = nv_update_linkspeed(dev);
	nv_start_rxtx(dev);
	netif_start_queue(dev);
---> 	nv_napi_enable(dev);

	if (ret) {
		netif_carrier_on(dev);
	} else {
		netdev_info(dev, "no link during initialization\n");
		netif_carrier_off(dev);
	}
	if (oom)
		mod_timer(&np->oom_kick, jiffies + OOM_REFILL);

	/* start statistics timer */
	if (np->driver_data & (DEV_HAS_STATISTICS_V1|DEV_HAS_STATISTICS_V2|DEV_HAS_STATISTICS_V3))
		mod_timer(&np->stats_poll,
			round_jiffies(jiffies + STATS_INTERVAL));

	spin_unlock_irq(&np->lock);

	/* If the loopback feature was set while the device was down, make sure
	 * that it's set correctly now.
=========================================================
drivers/net/ethernet/realtek/8139too.c:1697 rtl8139_tx_timeout_task() warn: sleeping in atomic context
	spin_lock_bh(&tp->rx_lock);
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
	/* Disable interrupts by clearing the interrupt mask. */
	RTL_W16 (IntrMask, 0x0000);

	/* Stop a shared interrupt from scavenging while we are. */
	spin_lock_irq(&tp->lock);
	rtl8139_tx_clear (tp);
	spin_unlock_irq(&tp->lock);

	/* ...and finally, reset everything */
---> 	napi_enable(&tp->napi);
	rtl8139_hw_start(dev);
	netif_wake_queue(dev);

	spin_unlock_bh(&tp->rx_lock);
}
=========================================================
drivers/net/ethernet/sun/niu.c:6530 niu_reset_task() warn: sleeping in atomic context
	spin_lock_irqsave(&np->lock, flags);
        ^^^^^^^^^^^^^^^^^^
	err = niu_init_hw(np);
	if (!err) {
		np->timer.expires = jiffies + HZ;
		add_timer(&np->timer);
---> 		niu_netif_start(np);
	}

	spin_unlock_irqrestore(&np->lock, flags);
=========================================================
drivers/net/ethernet/sun/niu.c:9944 niu_resume() warn: sleeping in atomic context
	spin_lock_irqsave(&np->lock, flags);
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
	err = niu_init_hw(np);
	if (!err) {
		np->timer.expires = jiffies + HZ;
		add_timer(&np->timer);
---> 		niu_netif_start(np);
	}

	spin_unlock_irqrestore(&np->lock, flags);

	return err;
}
=========================================================
drivers/net/ethernet/via/via-rhine.c:1740 rhine_reset_task() warn: sleeping in atomic context
	spin_lock_bh(&rp->lock);
        ^^^^^^^^^^^^^^^^^^^^^^^
	/* clear all descriptors */
	free_tbufs(dev);
	alloc_tbufs(dev);

	rhine_reset_rbufs(rp);

	/* Reinitialize the hardware. */
	rhine_chip_reset(dev);
---> 	init_registers(dev);

	spin_unlock_bh(&rp->lock);

	netif_trans_update(dev); /* prevent tx timeout */
	dev->stats.tx_errors++;
	netif_wake_queue(dev);

out_unlock:
	mutex_unlock(&rp->task_lock);
}
=========================================================
drivers/net/ethernet/via/via-rhine.c:2545 rhine_resume() warn: sleeping in atomic context
static int rhine_resume(struct device *device)
{
	struct net_device *dev = dev_get_drvdata(device);
	struct rhine_private *rp = netdev_priv(dev);

	if (!netif_running(dev))
		return 0;

	enable_mmio(rp->pioaddr, rp->quirks);
	rhine_power_init(dev);
	free_tbufs(dev);
	alloc_tbufs(dev);
	rhine_reset_rbufs(rp);
	rhine_task_enable(rp);
	spin_lock_bh(&rp->lock);
        ^^^^^^^^^^^^^^^^^^^^^^
---> 	init_registers(dev);
	spin_unlock_bh(&rp->lock);

	netif_device_attach(dev);

	return 0;
}
=========================================================
drivers/net/wireless/mediatek/mt76/mt7603/mac.c:1483 mt7603_mac_watchdog_reset() warn: sleeping in atomic context
	tasklet_enable(&dev->mt76.pre_tbtt_tasklet);
	mt7603_beacon_set_timer(dev, -1, beacon_int);

	local_bh_disable();
        ^^^^^^^^^^^^^^^^^^
---> 	napi_enable(&dev->mt76.tx_napi);
	napi_schedule(&dev->mt76.tx_napi);

	napi_enable(&dev->mt76.napi[0]);
	napi_schedule(&dev->mt76.napi[0]);

	napi_enable(&dev->mt76.napi[1]);
	napi_schedule(&dev->mt76.napi[1]);
	local_bh_enable();

	ieee80211_wake_queues(dev->mt76.hw);
	mt76_txq_schedule_all(&dev->mphy);
}
=========================================================
drivers/net/wireless/mediatek/mt76/mt7615/pci.c:169 mt7615_pci_resume() warn: sleeping in atomic context
	if (pdma_reset)
		dev_err(mdev->dev, "PDMA engine must be reinitialized\n");

	mt76_worker_enable(&mdev->tx_worker);
	local_bh_disable();
        ^^^^^^^^^^^^^^^^^^^
	mt76_for_each_q_rx(mdev, i) {
---> 		napi_enable(&mdev->napi[i]);
		napi_schedule(&mdev->napi[i]);
	}
	napi_enable(&mdev->tx_napi);
	napi_schedule(&mdev->tx_napi);
	local_bh_enable();

	if (!test_bit(MT76_STATE_SUSPEND, &dev->mphy.state) &&
	    mt7615_firmware_offload(dev))
		err = mt76_connac_mcu_set_hif_suspend(mdev, false, true);

	return err;
}
=========================================================
drivers/net/wireless/mediatek/mt76/mt7615/pci_mac.c:266 mt7615_mac_reset_work() warn: sleeping in atomic context
	clear_bit(MT76_MCU_RESET, &dev->mphy.state);
	clear_bit(MT76_RESET, &dev->mphy.state);
	if (phy2)
		clear_bit(MT76_RESET, &phy2->mt76->state);

	mt76_worker_enable(&dev->mt76.tx_worker);

	local_bh_disable();
        ^^^^^^^^^^^^^^^^^^^
---> 	napi_enable(&dev->mt76.tx_napi);
	napi_schedule(&dev->mt76.tx_napi);

	mt76_for_each_q_rx(&dev->mt76, i) {
		napi_enable(&dev->mt76.napi[i]);
		napi_schedule(&dev->mt76.napi[i]);
	}
	local_bh_enable();

	ieee80211_wake_queues(mt76_hw(dev));
	if (ext_phy)
		ieee80211_wake_queues(ext_phy->hw);

	mt7615_hif_int_event_trigger(dev, MT_MCU_INT_EVENT_RESET_DONE);
	mt7615_wait_reset_state(dev, MT_MCU_CMD_NORMAL_STATE);

	mt7615_update_beacons(dev);

	mt7615_mutex_release(dev);

=========================================================
drivers/net/wireless/mediatek/mt76/mt76x02_mmio.c:508 mt76x02_watchdog_reset() warn: sleeping in atomic context
	mutex_unlock(&dev->mt76.mutex);

	clear_bit(MT76_RESET, &dev->mphy.state);

	mt76_worker_enable(&dev->mt76.tx_worker);
	tasklet_enable(&dev->mt76.pre_tbtt_tasklet);

	local_bh_disable();
        ^^^^^^^^^^^^^^^^
---> 	napi_enable(&dev->mt76.tx_napi);
	napi_schedule(&dev->mt76.tx_napi);

	mt76_for_each_q_rx(&dev->mt76, i) {
		napi_enable(&dev->mt76.napi[i]);
		napi_schedule(&dev->mt76.napi[i]);
	}
	local_bh_enable();

	if (restart) {
		set_bit(MT76_RESTART, &dev->mphy.state);
		mt76x02_mcu_function_select(dev, Q_SELECT, 1);
		ieee80211_restart_hw(dev->mt76.hw);
	} else {
		ieee80211_wake_queues(dev->mt76.hw);
		mt76_txq_schedule_all(&dev->mphy);
	}
}
=========================================================
drivers/net/wireless/mediatek/mt76/mt76x0/pci.c:288 mt76x0e_resume() warn: sleeping in atomic context
	err = pci_set_power_state(pdev, PCI_D0);
	if (err)
		return err;

	pci_restore_state(pdev);

	mt76_worker_enable(&mdev->tx_worker);

	local_bh_disable();
        ^^^^^^^^^^^^^^^^^^
	mt76_for_each_q_rx(mdev, i) {
		mt76_queue_rx_reset(dev, i);
---> 		napi_enable(&mdev->napi[i]);
		napi_schedule(&mdev->napi[i]);
	}

	napi_enable(&mdev->tx_napi);
	napi_schedule(&mdev->tx_napi);
	local_bh_enable();

	return mt76x0e_init_hardware(dev, true);
}
=========================================================
drivers/net/wireless/mediatek/mt76/mt76x2/pci.c:156 mt76x2e_resume() warn: sleeping in atomic context
static int __maybe_unused
mt76x2e_resume(struct pci_dev *pdev)
{
	struct mt76_dev *mdev = pci_get_drvdata(pdev);
	struct mt76x02_dev *dev = container_of(mdev, struct mt76x02_dev, mt76);
	int i, err;

	err = pci_set_power_state(pdev, PCI_D0);
	if (err)
		return err;

	pci_restore_state(pdev);

	mt76_worker_enable(&mdev->tx_worker);

	local_bh_disable();
        ^^^^^^^^^^^^^^^^^^
	mt76_for_each_q_rx(mdev, i) {
---> 		napi_enable(&mdev->napi[i]);
		napi_schedule(&mdev->napi[i]);
	}
	napi_enable(&mdev->tx_napi);
	napi_schedule(&mdev->tx_napi);
	local_bh_enable();

	return mt76x2_resume_device(dev);
}
=========================================================
drivers/net/wireless/mediatek/mt76/mt7915/mac.c:1362 mt7915_mac_restart() warn: sleeping in atomic context
	/* token reinit */
	mt76_connac2_tx_token_put(&dev->mt76);
	idr_init(&dev->mt76.token);

	mt7915_dma_reset(dev, true);

	local_bh_disable();
        ^^^^^^^^^^^^^^^^^^
	mt76_for_each_q_rx(mdev, i) {
		if (mdev->q_rx[i].ndesc) {
---> 			napi_enable(&dev->mt76.napi[i]);
			napi_schedule(&dev->mt76.napi[i]);
		}
	}
	local_bh_enable();
	clear_bit(MT76_MCU_RESET, &dev->mphy.state);
	clear_bit(MT76_STATE_MCU_RUNNING, &dev->mphy.state);

	mt76_wr(dev, MT_INT_MASK_CSR, dev->mt76.mmio.irqmask);
	mt76_wr(dev, MT_INT_SOURCE_CSR, ~0);

	if (dev->hif2) {
		mt76_wr(dev, MT_INT1_MASK_CSR, dev->mt76.mmio.irqmask);
		mt76_wr(dev, MT_INT1_SOURCE_CSR, ~0);
	}
	if (dev_is_pci(mdev->dev)) {
		mt76_wr(dev, MT_PCIE_MAC_INT_ENABLE, 0xff);
		if (dev->hif2) {
			mt76_wr(dev, MT_PCIE_RECOG_ID,
				dev->hif2->index | MT_PCIE_RECOG_ID_SEM);
=========================================================
drivers/net/wireless/mediatek/mt76/mt7915/mac.c:1575 mt7915_mac_reset_work() warn: sleeping in atomic context
	/* enable DMA Tx/Rx and interrupt */
	mt7915_dma_start(dev, false, false);

	clear_bit(MT76_MCU_RESET, &dev->mphy.state);
	clear_bit(MT76_RESET, &dev->mphy.state);
	if (phy2)
		clear_bit(MT76_RESET, &phy2->mt76->state);

	local_bh_disable();
        ^^^^^^^^^^^^^^^^^^
	mt76_for_each_q_rx(&dev->mt76, i) {
---> 		napi_enable(&dev->mt76.napi[i]);
		napi_schedule(&dev->mt76.napi[i]);
	}
	local_bh_enable();

	tasklet_schedule(&dev->mt76.irq_tasklet);

	mt76_worker_enable(&dev->mt76.tx_worker);

	local_bh_disable();
	napi_enable(&dev->mt76.tx_napi);
	napi_schedule(&dev->mt76.tx_napi);
	local_bh_enable();

	ieee80211_wake_queues(mt76_hw(dev));
	if (ext_phy)
		ieee80211_wake_queues(ext_phy->hw);

	mutex_unlock(&dev->mt76.mutex);
=========================================================
drivers/net/wireless/mediatek/mt76/mt7921/pci.c:528 mt7921_pci_resume() warn: sleeping in atomic context
	mt76_set(dev, MT_MCU2HOST_SW_INT_ENA, MT_MCU_CMD_WAKE_RX_PCIE);

	/* put dma enabled */
	mt76_set(dev, MT_WFDMA0_GLO_CFG,
		 MT_WFDMA0_GLO_CFG_TX_DMA_EN | MT_WFDMA0_GLO_CFG_RX_DMA_EN);

	mt76_worker_enable(&mdev->tx_worker);

	local_bh_disable();
        ^^^^^^^^^^^^^^^^^^
	mt76_for_each_q_rx(mdev, i) {
---> 		napi_enable(&mdev->napi[i]);
		napi_schedule(&mdev->napi[i]);
	}
	napi_enable(&mdev->tx_napi);
	napi_schedule(&mdev->tx_napi);
	local_bh_enable();

	/* restore previous ds setting */
	if (!pm->ds_enable)
		mt76_connac_mcu_set_deep_sleep(&dev->mt76, false);

	err = mt76_connac_mcu_set_hif_suspend(mdev, false, true);
	if (err < 0)
		goto failed;

	mt7921_regd_update(dev);
	err = mt7921_mcu_radio_led_ctrl(dev, EXT_CMD_RADIO_ON_LED);
failed:
	pm->suspended = false;
=========================================================
drivers/net/wireless/mediatek/mt76/mt7921/pci_mac.c:86 mt7921e_mac_reset() warn: sleeping in atomic context
	napi_disable(&dev->mt76.napi[MT_RXQ_MCU_WA]);
	napi_disable(&dev->mt76.tx_napi);

	mt76_connac2_tx_token_put(&dev->mt76);
	idr_init(&dev->mt76.token);

	mt792x_wpdma_reset(dev, true);

	local_bh_disable();
        ^^^^^^^^^^^^^^^^^^
	mt76_for_each_q_rx(&dev->mt76, i) {
---> 		napi_enable(&dev->mt76.napi[i]);
		napi_schedule(&dev->mt76.napi[i]);
	}
	local_bh_enable();

	dev->fw_assert = false;
	clear_bit(MT76_MCU_RESET, &dev->mphy.state);

	mt76_wr(dev, dev->irq_map->host_irq_enable,
		dev->irq_map->tx.all_complete_mask |
		MT_INT_RX_DONE_ALL | MT_INT_MCU_CMD);
	mt76_wr(dev, MT_PCIE_MAC_INT_ENABLE, 0xff);

	err = mt7921e_driver_own(dev);
	if (err)
		goto out;

	err = mt7921_run_firmware(dev);
	if (err)
		goto out;
=========================================================
drivers/net/wireless/mediatek/mt76/mt7925/pci.c:561 mt7925_pci_resume() warn: sleeping in atomic context
	mt76_set(dev, MT_MCU2HOST_SW_INT_ENA, MT_MCU_CMD_WAKE_RX_PCIE);

	/* put dma enabled */
	mt76_set(dev, MT_WFDMA0_GLO_CFG,
		 MT_WFDMA0_GLO_CFG_TX_DMA_EN | MT_WFDMA0_GLO_CFG_RX_DMA_EN);

	mt76_worker_enable(&mdev->tx_worker);

	local_bh_disable();
        ^^^^^^^^^^^^^^^^^^
	mt76_for_each_q_rx(mdev, i) {
---> 		napi_enable(&mdev->napi[i]);
		napi_schedule(&mdev->napi[i]);
	}
	napi_enable(&mdev->tx_napi);
	napi_schedule(&mdev->tx_napi);
	local_bh_enable();

	mt76_connac_mcu_set_hif_suspend(mdev, false, false);
	ret = wait_event_timeout(dev->wait,
				 dev->hif_resumed, 3 * HZ);
	if (!ret) {
		err = -ETIMEDOUT;
		goto failed;
	}

	/* restore previous ds setting */
=========================================================
drivers/net/wireless/mediatek/mt76/mt7925/pci_mac.c:106 mt7925e_mac_reset() warn: sleeping in atomic context
	mt7925_tx_token_put(dev);
	idr_init(&dev->mt76.token);

	mt792x_wpdma_reset(dev, true);

	local_bh_disable();
        ^^^^^^^^^^^^^^^^
	mt76_for_each_q_rx(&dev->mt76, i) {
---> 		napi_enable(&dev->mt76.napi[i]);
		napi_schedule(&dev->mt76.napi[i]);
	}
	napi_enable(&dev->mt76.tx_napi);
	napi_schedule(&dev->mt76.tx_napi);
	local_bh_enable();

	dev->fw_assert = false;
	clear_bit(MT76_MCU_RESET, &dev->mphy.state);

	mt76_wr(dev, dev->irq_map->host_irq_enable,
		dev->irq_map->tx.all_complete_mask |
		MT_INT_RX_DONE_ALL | MT_INT_MCU_CMD);
	mt76_wr(dev, MT_PCIE_MAC_INT_ENABLE, 0xff);

	err = mt792xe_mcu_fw_pmctrl(dev);
=========================================================
drivers/net/wireless/mediatek/mt76/mt7996/mac.c:1705 mt7996_mac_restart() warn: sleeping in atomic context
	napi_disable(&dev->mt76.tx_napi);

	/* token reinit */
	mt7996_tx_token_put(dev);
	idr_init(&dev->mt76.token);

	mt7996_dma_reset(dev, true);

	local_bh_disable();
        ^^^^^^^^^^^^^^^^^^
	mt76_for_each_q_rx(mdev, i) {
		if (mtk_wed_device_active(&dev->mt76.mmio.wed) &&
		    mt76_queue_is_wed_rro(&mdev->q_rx[i]))
			continue;

		if (mdev->q_rx[i].ndesc) {
---> 			napi_enable(&dev->mt76.napi[i]);
			napi_schedule(&dev->mt76.napi[i]);
		}
	}
	local_bh_enable();
	clear_bit(MT76_MCU_RESET, &dev->mphy.state);
	clear_bit(MT76_STATE_MCU_RUNNING, &dev->mphy.state);

	mt76_wr(dev, MT_INT_MASK_CSR, dev->mt76.mmio.irqmask);
	mt76_wr(dev, MT_INT_SOURCE_CSR, ~0);
	if (dev->hif2) {
		mt76_wr(dev, MT_INT1_MASK_CSR, dev->mt76.mmio.irqmask);
		mt76_wr(dev, MT_INT1_SOURCE_CSR, ~0);
	}
=========================================================
drivers/net/wireless/mediatek/mt76/mt7996/mac.c:1967 mt7996_mac_reset_work() warn: sleeping in atomic context
	clear_bit(MT76_MCU_RESET, &dev->mphy.state);
	clear_bit(MT76_RESET, &dev->mphy.state);
	if (phy2)
		clear_bit(MT76_RESET, &phy2->mt76->state);
	if (phy3)
		clear_bit(MT76_RESET, &phy3->mt76->state);

	local_bh_disable();
        ^^^^^^^^^^^^^^^^^^^
	mt76_for_each_q_rx(&dev->mt76, i) {
		if (mtk_wed_device_active(&dev->mt76.mmio.wed) &&
		    mt76_queue_is_wed_rro(&dev->mt76.q_rx[i]))
			continue;

---> 		napi_enable(&dev->mt76.napi[i]);
		napi_schedule(&dev->mt76.napi[i]);
	}
	local_bh_enable();

	tasklet_schedule(&dev->mt76.irq_tasklet);

	mt76_worker_enable(&dev->mt76.tx_worker);

	local_bh_disable();
	napi_enable(&dev->mt76.tx_napi);
	napi_schedule(&dev->mt76.tx_napi);
	local_bh_enable();

	ieee80211_wake_queues(mt76_hw(dev));
	if (phy2)
		ieee80211_wake_queues(phy2->mt76->hw);
	if (phy3)
		ieee80211_wake_queues(phy3->mt76->hw);




^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v2 06/11] net: protect NAPI enablement with netdev_lock()
  2025-01-21  8:32   ` Dan Carpenter
@ 2025-01-21  8:50     ` David Laight
  2025-01-21 15:27     ` Dan Carpenter
  1 sibling, 0 replies; 34+ messages in thread
From: David Laight @ 2025-01-21  8:50 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Jakub Kicinski, davem, netdev, edumazet, pabeni, andrew+netdev,
	horms, jdamato, Francois Romieu, pcnet32, anthony.l.nguyen,
	przemyslaw.kitszel, marcin.s.wojtas

On Tue, 21 Jan 2025 11:32:24 +0300
Dan Carpenter <dan.carpenter@linaro.org> wrote:

> On Tue, Jan 14, 2025 at 07:53:14PM -0800, Jakub Kicinski wrote:
> > Wrap napi_enable() / napi_disable() with netdev_lock().
> > Provide the "already locked" flavor of the API.
> > 
> > iavf needs the usual adjustment. A number of drivers call
> > napi_enable() under a spin lock, so they have to be modified
> > to take netdev_lock() first, then spin lock then call
> > napi_enable_locked().  
> 
> You missed some.
> 
> drivers/net/ethernet/broadcom/tg3.c:7427 tg3_napi_enable() warn: sleeping in atomic context
> drivers/net/ethernet/nvidia/forcedeth.c:5597 nv_open() warn: sleeping in atomic context
...

Looks like the whole patch is very fragile.
You really need to keep the existing function names having their existing semantics.
Add a new function, change all the code, then delete the old function.

It also looks as though drivers will end up holding netdev_lock() for long
periods just so they can do a napi_enable() much later on.

	David

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH net-next v2 06/11] net: protect NAPI enablement with netdev_lock()
  2025-01-21  8:32   ` Dan Carpenter
  2025-01-21  8:50     ` David Laight
@ 2025-01-21 15:27     ` Dan Carpenter
  1 sibling, 0 replies; 34+ messages in thread
From: Dan Carpenter @ 2025-01-21 15:27 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, jdamato,
	Francois Romieu, pcnet32, anthony.l.nguyen, przemyslaw.kitszel,
	marcin.s.wojtas

There were four more from my arm32 build.

regards,
dan carpenter

===============================================================
drivers/net/ethernet/nvidia/forcedeth.c:1127 nv_napi_enable() warn: sleeping in atomic context
nv_open() <- disables preempt
-> nv_napi_enable()

===============================================================
drivers/net/ethernet/sun/niu.c:6089 niu_enable_napi() warn: sleeping in atomic context
niu_reset_task() <- disables preempt
niu_resume() <- disables preempt
-> niu_netif_start()
   -> niu_enable_napi()

===============================================================
drivers/net/ethernet/sun/niu.c:6444 niu_netif_start() warn: sleeping in atomic context
niu_reset_task() <- disables preempt
niu_resume() <- disables preempt
-> niu_netif_start()

===============================================================
drivers/net/ethernet/via/via-rhine.c:1571 init_registers() warn: sleeping in atomic context
netdev_open() <- disables preempt
tx_timeout() <- disables preempt
w840_resume() <- disables preempt
rhine_reset_task() <- disables preempt
rhine_resume() <- disables preempt
netdev_timer() <- disables preempt
ns_tx_timeout() <- disables preempt
natsemi_resume() <- disables preempt
-> init_registers()


^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2025-01-21 15:27 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-15  3:53 [PATCH net-next v2 00/11] net: use netdev->lock to protect NAPI Jakub Kicinski
2025-01-15  3:53 ` [PATCH net-next v2 01/11] net: add netdev_lock() / netdev_unlock() helpers Jakub Kicinski
2025-01-15  8:21   ` Kuniyuki Iwashima
2025-01-15  8:36   ` Przemek Kitszel
2025-01-15  9:24     ` Ido Schimmel
2025-01-15 14:08     ` Jakub Kicinski
2025-01-15 14:24       ` Przemek Kitszel
2025-01-15  3:53 ` [PATCH net-next v2 02/11] net: make netdev_lock() protect netdev->reg_state Jakub Kicinski
2025-01-15  8:30   ` Kuniyuki Iwashima
2025-01-15 14:18     ` Jakub Kicinski
2025-01-15  3:53 ` [PATCH net-next v2 03/11] net: add helpers for lookup and walking netdevs under netdev_lock() Jakub Kicinski
2025-01-15  8:41   ` Kuniyuki Iwashima
2025-01-15  3:53 ` [PATCH net-next v2 04/11] net: add netdev->up protected by netdev_lock() Jakub Kicinski
2025-01-15  8:45   ` Kuniyuki Iwashima
2025-01-15  3:53 ` [PATCH net-next v2 05/11] net: protect netdev->napi_list with netdev_lock() Jakub Kicinski
2025-01-15  8:57   ` Kuniyuki Iwashima
2025-01-17 22:21     ` Eric Dumazet
2025-01-17 22:52       ` Jakub Kicinski
2025-01-15  3:53 ` [PATCH net-next v2 06/11] net: protect NAPI enablement " Jakub Kicinski
2025-01-15  9:02   ` Kuniyuki Iwashima
2025-01-21  8:32   ` Dan Carpenter
2025-01-21  8:50     ` David Laight
2025-01-21 15:27     ` Dan Carpenter
2025-01-15  3:53 ` [PATCH net-next v2 07/11] net: make netdev netlink ops hold netdev_lock() Jakub Kicinski
2025-01-15  9:07   ` Kuniyuki Iwashima
2025-01-15  3:53 ` [PATCH net-next v2 08/11] net: protect threaded status of NAPI with netdev_lock() Jakub Kicinski
2025-01-15  9:09   ` Kuniyuki Iwashima
2025-01-15  3:53 ` [PATCH net-next v2 09/11] net: protect napi->irq " Jakub Kicinski
2025-01-15  9:12   ` Kuniyuki Iwashima
2025-01-15  3:53 ` [PATCH net-next v2 10/11] net: protect NAPI config fields " Jakub Kicinski
2025-01-15  9:15   ` Kuniyuki Iwashima
2025-01-15  3:53 ` [PATCH net-next v2 11/11] netdev-genl: remove rtnl_lock protection from NAPI ops Jakub Kicinski
2025-01-15  9:18   ` Kuniyuki Iwashima
2025-01-16  3:30 ` [PATCH net-next v2 00/11] net: use netdev->lock to protect NAPI patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).