* [RFC net-next 0/4] net: Hold netdev instance lock during ndo operations
@ 2025-02-04 23:00 Stanislav Fomichev
2025-02-04 23:00 ` [RFC net-next 1/4] net: Hold netdev instance lock during ndo_open/ndo_stop Stanislav Fomichev
` (4 more replies)
0 siblings, 5 replies; 10+ messages in thread
From: Stanislav Fomichev @ 2025-02-04 23:00 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, kuba, pabeni, Saeed Mahameed
As the gradual purging of rtnl continues, start grabbing netdev
instance lock in more places so we can get to the state where
most paths are working without rtnl. Start with requiring the
drivers that use shaper api (and later queue mgmt api) to work
with both rtnl and netdev instance lock. Eventually we might
attempt to drop rtnl. This mostly affects iavf, gve, bnxt and
netdev sim (as the drivers that implement shaper/queue mgmt)
so those drivers are converted in the process.
This is part one of the process, the next step is to do similar locking
for the rest of ndo handlers that are being called from sysfs/ethtool/netlink.
Cc: Saeed Mahameed <saeed@kernel.org>
Stanislav Fomichev (4):
net: Hold netdev instance lock during ndo_open/ndo_stop
net: Hold netdev instance lock during ndo_setup_tc
net: Hold netdev instance lock for more NDOs
net: Hold netdev instance lock during queue operations
Documentation/networking/netdevices.rst | 52 +++++++++++++-----
drivers/net/bonding/bond_main.c | 9 ++--
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 54 +++++++++++++++----
.../net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 11 ++--
.../net/ethernet/broadcom/bnxt/bnxt_sriov.c | 2 +
drivers/net/ethernet/google/gve/gve_main.c | 8 +--
drivers/net/ethernet/google/gve/gve_utils.c | 8 +--
drivers/net/ethernet/intel/iavf/iavf_main.c | 16 +++---
drivers/net/netdevsim/netdev.c | 36 ++++++++-----
include/linux/netdevice.h | 27 ++++++++++
net/8021q/vlan_dev.c | 4 +-
net/core/dev.c | 34 ++++++++++++
net/core/dev.h | 6 ++-
net/core/dev_ioctl.c | 44 ++++++++++-----
net/core/netdev_rx_queue.c | 5 ++
net/dsa/user.c | 5 +-
net/ieee802154/socket.c | 2 +
net/netfilter/nf_flow_table_offload.c | 2 +-
net/netfilter/nf_tables_offload.c | 2 +-
net/phonet/pn_dev.c | 2 +
net/sched/cls_api.c | 2 +-
net/sched/sch_api.c | 13 ++---
net/sched/sch_cbs.c | 9 +---
net/sched/sch_etf.c | 9 +---
net/sched/sch_ets.c | 10 +---
net/sched/sch_fifo.c | 10 +---
net/sched/sch_gred.c | 5 +-
net/sched/sch_htb.c | 2 +-
net/sched/sch_mq.c | 5 +-
net/sched/sch_mqprio.c | 6 +--
net/sched/sch_prio.c | 5 +-
net/sched/sch_red.c | 8 +--
net/sched/sch_taprio.c | 16 ++----
net/sched/sch_tbf.c | 10 +---
34 files changed, 266 insertions(+), 173 deletions(-)
--
2.48.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* [RFC net-next 1/4] net: Hold netdev instance lock during ndo_open/ndo_stop
2025-02-04 23:00 [RFC net-next 0/4] net: Hold netdev instance lock during ndo operations Stanislav Fomichev
@ 2025-02-04 23:00 ` Stanislav Fomichev
2025-02-05 19:32 ` Joe Damato
2025-02-04 23:00 ` [RFC net-next 2/4] net: Hold netdev instance lock during ndo_setup_tc Stanislav Fomichev
` (3 subsequent siblings)
4 siblings, 1 reply; 10+ messages in thread
From: Stanislav Fomichev @ 2025-02-04 23:00 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, kuba, pabeni, Saeed Mahameed
For the drivers that use shaper API, switch to the mode where
core stack holds the netdev lock. This affects two drivers:
* iavf - already grabs netdev lock in ndo_open/ndo_stop, so mostly
remove these
* netdevsim - switch to _locked APIs to avoid deadlock
Cc: Saeed Mahameed <saeed@kernel.org>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
Documentation/networking/netdevices.rst | 6 ++++--
drivers/net/ethernet/intel/iavf/iavf_main.c | 14 ++++++-------
drivers/net/netdevsim/netdev.c | 14 ++++++++-----
include/linux/netdevice.h | 23 +++++++++++++++++++++
net/core/dev.c | 12 +++++++++++
net/core/dev.h | 6 ++++--
6 files changed, 58 insertions(+), 17 deletions(-)
diff --git a/Documentation/networking/netdevices.rst b/Documentation/networking/netdevices.rst
index 1d37038e9fbe..78213e476ce6 100644
--- a/Documentation/networking/netdevices.rst
+++ b/Documentation/networking/netdevices.rst
@@ -210,11 +210,13 @@ packets is preferred.
struct net_device synchronization rules
=======================================
ndo_open:
- Synchronization: rtnl_lock() semaphore.
+ Synchronization: rtnl_lock() semaphore. In addition, netdev instance
+ lock if the driver implements shaper API.
Context: process
ndo_stop:
- Synchronization: rtnl_lock() semaphore.
+ Synchronization: rtnl_lock() semaphore. In addition, netdev instance
+ lock if the driver implements shaper API.
Context: process
Note: netif_running() is guaranteed false
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index 2d7a18fcc3be..176f9bb871d0 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -4375,22 +4375,21 @@ static int iavf_open(struct net_device *netdev)
struct iavf_adapter *adapter = netdev_priv(netdev);
int err;
+ netdev_assert_locked(netdev);
+
if (adapter->flags & IAVF_FLAG_PF_COMMS_FAILED) {
dev_err(&adapter->pdev->dev, "Unable to open device due to PF driver failure.\n");
return -EIO;
}
- netdev_lock(netdev);
while (!mutex_trylock(&adapter->crit_lock)) {
/* If we are in __IAVF_INIT_CONFIG_ADAPTER state the crit_lock
* is already taken and iavf_open is called from an upper
* device's notifier reacting on NETDEV_REGISTER event.
* We have to leave here to avoid dead lock.
*/
- if (adapter->state == __IAVF_INIT_CONFIG_ADAPTER) {
- netdev_unlock(netdev);
+ if (adapter->state == __IAVF_INIT_CONFIG_ADAPTER)
return -EBUSY;
- }
usleep_range(500, 1000);
}
@@ -4439,7 +4438,6 @@ static int iavf_open(struct net_device *netdev)
iavf_irq_enable(adapter, true);
mutex_unlock(&adapter->crit_lock);
- netdev_unlock(netdev);
return 0;
@@ -4452,7 +4450,6 @@ static int iavf_open(struct net_device *netdev)
iavf_free_all_tx_resources(adapter);
err_unlock:
mutex_unlock(&adapter->crit_lock);
- netdev_unlock(netdev);
return err;
}
@@ -4474,12 +4471,12 @@ static int iavf_close(struct net_device *netdev)
u64 aq_to_restore;
int status;
- netdev_lock(netdev);
+ netdev_assert_locked(netdev);
+
mutex_lock(&adapter->crit_lock);
if (adapter->state <= __IAVF_DOWN_PENDING) {
mutex_unlock(&adapter->crit_lock);
- netdev_unlock(netdev);
return 0;
}
@@ -4532,6 +4529,7 @@ static int iavf_close(struct net_device *netdev)
if (!status)
netdev_warn(netdev, "Device resources not yet released\n");
+ netdev_lock(netdev);
mutex_lock(&adapter->crit_lock);
adapter->aq_required |= aq_to_restore;
mutex_unlock(&adapter->crit_lock);
diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
index 42f247cbdcee..efec03b34c9f 100644
--- a/drivers/net/netdevsim/netdev.c
+++ b/drivers/net/netdevsim/netdev.c
@@ -401,7 +401,7 @@ static int nsim_init_napi(struct netdevsim *ns)
for (i = 0; i < dev->num_rx_queues; i++) {
rq = ns->rq[i];
- netif_napi_add_config(dev, &rq->napi, nsim_poll, i);
+ netif_napi_add_config_locked(dev, &rq->napi, nsim_poll, i);
}
for (i = 0; i < dev->num_rx_queues; i++) {
@@ -421,7 +421,7 @@ static int nsim_init_napi(struct netdevsim *ns)
}
for (i = 0; i < dev->num_rx_queues; i++)
- __netif_napi_del(&ns->rq[i]->napi);
+ __netif_napi_del_locked(&ns->rq[i]->napi);
return err;
}
@@ -435,7 +435,7 @@ static void nsim_enable_napi(struct netdevsim *ns)
struct nsim_rq *rq = ns->rq[i];
netif_queue_set_napi(dev, i, NETDEV_QUEUE_TYPE_RX, &rq->napi);
- napi_enable(&rq->napi);
+ napi_enable_locked(&rq->napi);
}
}
@@ -444,6 +444,8 @@ static int nsim_open(struct net_device *dev)
struct netdevsim *ns = netdev_priv(dev);
int err;
+ netdev_assert_locked(dev);
+
err = nsim_init_napi(ns);
if (err)
return err;
@@ -461,8 +463,8 @@ static void nsim_del_napi(struct netdevsim *ns)
for (i = 0; i < dev->num_rx_queues; i++) {
struct nsim_rq *rq = ns->rq[i];
- napi_disable(&rq->napi);
- __netif_napi_del(&rq->napi);
+ napi_disable_locked(&rq->napi);
+ __netif_napi_del_locked(&rq->napi);
}
synchronize_net();
@@ -477,6 +479,8 @@ static int nsim_stop(struct net_device *dev)
struct netdevsim *ns = netdev_priv(dev);
struct netdevsim *peer;
+ netdev_assert_locked(dev);
+
netif_carrier_off(dev);
peer = rtnl_dereference(ns->peer);
if (peer)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 2a59034a5fa2..962774cbce55 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2717,6 +2717,29 @@ static inline void netdev_assert_locked_or_invisible(struct net_device *dev)
netdev_assert_locked(dev);
}
+static inline bool need_netdev_ops_lock(struct net_device *dev)
+{
+ bool ret = false;
+
+#if IS_ENABLED(CONFIG_NET_SHAPER)
+ ret |= !!(dev)->netdev_ops->net_shaper_ops;
+#endif
+
+ return ret;
+}
+
+static inline void netdev_lock_ops(struct net_device *dev)
+{
+ if (need_netdev_ops_lock(dev))
+ netdev_lock(dev);
+}
+
+static inline void netdev_unlock_ops(struct net_device *dev)
+{
+ if (need_netdev_ops_lock(dev))
+ netdev_unlock(dev);
+}
+
static inline void netif_napi_set_irq_locked(struct napi_struct *napi, int irq)
{
napi->irq = irq;
diff --git a/net/core/dev.c b/net/core/dev.c
index c0021cbd28fc..fda42b2415fc 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1595,6 +1595,8 @@ static int __dev_open(struct net_device *dev, struct netlink_ext_ack *extack)
if (ret)
return ret;
+ netdev_lock_ops(dev);
+
set_bit(__LINK_STATE_START, &dev->state);
if (ops->ndo_validate_addr)
@@ -1614,6 +1616,8 @@ static int __dev_open(struct net_device *dev, struct netlink_ext_ack *extack)
add_device_randomness(dev->dev_addr, dev->addr_len);
}
+ netdev_unlock_ops(dev);
+
return ret;
}
@@ -1684,11 +1688,19 @@ static void __dev_close_many(struct list_head *head)
* We allow it to be called even after a DETACH hot-plug
* event.
*/
+
+ /* TODO: move the lock up before clearing __LINK_STATE_START.
+ * Generates spurious lockdep warning.
+ */
+ netdev_lock_ops(dev);
+
if (ops->ndo_stop)
ops->ndo_stop(dev);
netif_set_up(dev, false);
netpoll_poll_enable(dev);
+
+ netdev_unlock_ops(dev);
}
}
diff --git a/net/core/dev.h b/net/core/dev.h
index a5b166bbd169..18070c0452e3 100644
--- a/net/core/dev.h
+++ b/net/core/dev.h
@@ -134,9 +134,11 @@ static inline void netif_set_up(struct net_device *dev, bool value)
else
dev->flags &= ~IFF_UP;
- netdev_lock(dev);
+ if (!need_netdev_ops_lock(dev))
+ netdev_lock(dev);
dev->up = value;
- netdev_unlock(dev);
+ if (!need_netdev_ops_lock(dev))
+ netdev_unlock(dev);
}
static inline void netif_set_gso_max_size(struct net_device *dev,
--
2.48.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [RFC net-next 2/4] net: Hold netdev instance lock during ndo_setup_tc
2025-02-04 23:00 [RFC net-next 0/4] net: Hold netdev instance lock during ndo operations Stanislav Fomichev
2025-02-04 23:00 ` [RFC net-next 1/4] net: Hold netdev instance lock during ndo_open/ndo_stop Stanislav Fomichev
@ 2025-02-04 23:00 ` Stanislav Fomichev
2025-02-04 23:00 ` [RFC net-next 3/4] net: Hold netdev instance lock for more NDOs Stanislav Fomichev
` (2 subsequent siblings)
4 siblings, 0 replies; 10+ messages in thread
From: Stanislav Fomichev @ 2025-02-04 23:00 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, kuba, pabeni, Saeed Mahameed
Introduce new dev_setup_tc that handles the details and call it from
all qdiscs/classifiers. The instance lock is still applied only to
the drivers that implement shaper API so only iavf is affected.
Cc: Saeed Mahameed <saeed@kernel.org>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
Documentation/networking/netdevices.rst | 4 ++++
drivers/net/ethernet/intel/iavf/iavf_main.c | 2 --
include/linux/netdevice.h | 2 ++
net/core/dev.c | 22 +++++++++++++++++++++
net/dsa/user.c | 5 +----
net/netfilter/nf_flow_table_offload.c | 2 +-
net/netfilter/nf_tables_offload.c | 2 +-
net/sched/cls_api.c | 2 +-
net/sched/sch_api.c | 13 +++---------
net/sched/sch_cbs.c | 9 ++-------
net/sched/sch_etf.c | 9 ++-------
net/sched/sch_ets.c | 10 ++--------
net/sched/sch_fifo.c | 10 ++--------
net/sched/sch_gred.c | 5 +----
net/sched/sch_htb.c | 2 +-
net/sched/sch_mq.c | 5 +----
net/sched/sch_mqprio.c | 6 ++----
net/sched/sch_prio.c | 5 +----
net/sched/sch_red.c | 8 ++------
net/sched/sch_taprio.c | 16 +++------------
net/sched/sch_tbf.c | 10 ++--------
21 files changed, 56 insertions(+), 93 deletions(-)
diff --git a/Documentation/networking/netdevices.rst b/Documentation/networking/netdevices.rst
index 78213e476ce6..c6087d92d740 100644
--- a/Documentation/networking/netdevices.rst
+++ b/Documentation/networking/netdevices.rst
@@ -257,6 +257,10 @@ struct net_device synchronization rules
Synchronization: rtnl_lock() semaphore, or RCU.
Context: atomic (can't sleep under RCU)
+ndo_setup_tc:
+ Synchronization: rtnl_lock() semaphore. In addition, netdev instance
+ lock if the driver implements shaper API.
+
ndo_start_xmit:
Synchronization: __netif_tx_lock spinlock.
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index 176f9bb871d0..4fe481433842 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -3707,10 +3707,8 @@ static int __iavf_setup_tc(struct net_device *netdev, void *type_data)
if (test_bit(__IAVF_IN_REMOVE_TASK, &adapter->crit_section))
return 0;
- netdev_lock(netdev);
netif_set_real_num_rx_queues(netdev, total_qps);
netif_set_real_num_tx_queues(netdev, total_qps);
- netdev_unlock(netdev);
return ret;
}
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 962774cbce55..6f2eb129ef3e 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3315,6 +3315,8 @@ int dev_alloc_name(struct net_device *dev, const char *name);
int dev_open(struct net_device *dev, struct netlink_ext_ack *extack);
void dev_close(struct net_device *dev);
void dev_close_many(struct list_head *head, bool unlink);
+int dev_setup_tc(struct net_device *dev, enum tc_setup_type type,
+ void *type_data);
void dev_disable_lro(struct net_device *dev);
int dev_loopback_xmit(struct net *net, struct sock *sk, struct sk_buff *newskb);
u16 dev_pick_tx_zero(struct net_device *dev, struct sk_buff *skb,
diff --git a/net/core/dev.c b/net/core/dev.c
index fda42b2415fc..e55da80d24e2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1754,6 +1754,28 @@ void dev_close(struct net_device *dev)
}
EXPORT_SYMBOL(dev_close);
+int dev_setup_tc(struct net_device *dev, enum tc_setup_type type,
+ void *type_data)
+{
+ const struct net_device_ops *ops = dev->netdev_ops;
+
+ ASSERT_RTNL();
+
+ if (tc_can_offload(dev) && ops->ndo_setup_tc) {
+ int ret = -ENODEV;
+
+ if (netif_device_present(dev)) {
+ netdev_lock_ops(dev);
+ ret = ops->ndo_setup_tc(dev, type, type_data);
+ netdev_unlock_ops(dev);
+ }
+
+ return ret;
+ }
+
+ return -EOPNOTSUPP;
+}
+EXPORT_SYMBOL(dev_setup_tc);
/**
* dev_disable_lro - disable Large Receive Offload on a device
diff --git a/net/dsa/user.c b/net/dsa/user.c
index 291ab1b4acc4..f2ac7662e4cc 100644
--- a/net/dsa/user.c
+++ b/net/dsa/user.c
@@ -1729,10 +1729,7 @@ static int dsa_user_setup_ft_block(struct dsa_switch *ds, int port,
{
struct net_device *conduit = dsa_port_to_conduit(dsa_to_port(ds, port));
- if (!conduit->netdev_ops->ndo_setup_tc)
- return -EOPNOTSUPP;
-
- return conduit->netdev_ops->ndo_setup_tc(conduit, TC_SETUP_FT, type_data);
+ return dev_setup_tc(conduit, TC_SETUP_FT, type_data);
}
static int dsa_user_setup_tc(struct net_device *dev, enum tc_setup_type type,
diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c
index e06bc36f49fe..0ec4abded10d 100644
--- a/net/netfilter/nf_flow_table_offload.c
+++ b/net/netfilter/nf_flow_table_offload.c
@@ -1175,7 +1175,7 @@ static int nf_flow_table_offload_cmd(struct flow_block_offload *bo,
nf_flow_table_block_offload_init(bo, dev_net(dev), cmd, flowtable,
extack);
down_write(&flowtable->flow_block_lock);
- err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_FT, bo);
+ err = dev_setup_tc(dev, TC_SETUP_FT, bo);
up_write(&flowtable->flow_block_lock);
if (err < 0)
return err;
diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
index 64675f1c7f29..b761899c143c 100644
--- a/net/netfilter/nf_tables_offload.c
+++ b/net/netfilter/nf_tables_offload.c
@@ -390,7 +390,7 @@ static int nft_block_offload_cmd(struct nft_base_chain *chain,
nft_flow_block_offload_init(&bo, dev_net(dev), cmd, chain, &extack);
- err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
+ err = dev_setup_tc(dev, TC_SETUP_BLOCK, &bo);
if (err < 0)
return err;
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 8e47e5355be6..082b355be8e6 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -835,7 +835,7 @@ static int tcf_block_offload_cmd(struct tcf_block *block,
if (dev->netdev_ops->ndo_setup_tc) {
int err;
- err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
+ err = dev_setup_tc(dev, TC_SETUP_BLOCK, &bo);
if (err < 0) {
if (err != -EOPNOTSUPP)
NL_SET_ERR_MSG(extack, "Driver ndo_setup_tc failed");
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index e3e91cf867eb..6d0728cfc19f 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -833,10 +833,8 @@ int qdisc_offload_dump_helper(struct Qdisc *sch, enum tc_setup_type type,
int err;
sch->flags &= ~TCQ_F_OFFLOADED;
- if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc)
- return 0;
- err = dev->netdev_ops->ndo_setup_tc(dev, type, type_data);
+ err = dev_setup_tc(dev, type, type_data);
if (err == -EOPNOTSUPP)
return 0;
@@ -855,10 +853,7 @@ void qdisc_offload_graft_helper(struct net_device *dev, struct Qdisc *sch,
bool any_qdisc_is_offloaded;
int err;
- if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc)
- return;
-
- err = dev->netdev_ops->ndo_setup_tc(dev, type, type_data);
+ err = dev_setup_tc(dev, type, type_data);
/* Don't report error if the graft is part of destroy operation. */
if (!err || !new || new == &noop_qdisc)
@@ -880,7 +875,6 @@ void qdisc_offload_query_caps(struct net_device *dev,
enum tc_setup_type type,
void *caps, size_t caps_len)
{
- const struct net_device_ops *ops = dev->netdev_ops;
struct tc_query_caps_base base = {
.type = type,
.caps = caps,
@@ -888,8 +882,7 @@ void qdisc_offload_query_caps(struct net_device *dev,
memset(caps, 0, caps_len);
- if (ops->ndo_setup_tc)
- ops->ndo_setup_tc(dev, TC_QUERY_CAPS, &base);
+ dev_setup_tc(dev, TC_QUERY_CAPS, &base);
}
EXPORT_SYMBOL(qdisc_offload_query_caps);
diff --git a/net/sched/sch_cbs.c b/net/sched/sch_cbs.c
index 8c9a0400c862..492a98fa209b 100644
--- a/net/sched/sch_cbs.c
+++ b/net/sched/sch_cbs.c
@@ -251,7 +251,6 @@ static void cbs_disable_offload(struct net_device *dev,
struct cbs_sched_data *q)
{
struct tc_cbs_qopt_offload cbs = { };
- const struct net_device_ops *ops;
int err;
if (!q->offload)
@@ -260,14 +259,10 @@ static void cbs_disable_offload(struct net_device *dev,
q->enqueue = cbs_enqueue_soft;
q->dequeue = cbs_dequeue_soft;
- ops = dev->netdev_ops;
- if (!ops->ndo_setup_tc)
- return;
-
cbs.queue = q->queue;
cbs.enable = 0;
- err = ops->ndo_setup_tc(dev, TC_SETUP_QDISC_CBS, &cbs);
+ err = dev_setup_tc(dev, TC_SETUP_QDISC_CBS, &cbs);
if (err < 0)
pr_warn("Couldn't disable CBS offload for queue %d\n",
cbs.queue);
@@ -294,7 +289,7 @@ static int cbs_enable_offload(struct net_device *dev, struct cbs_sched_data *q,
cbs.idleslope = opt->idleslope;
cbs.sendslope = opt->sendslope;
- err = ops->ndo_setup_tc(dev, TC_SETUP_QDISC_CBS, &cbs);
+ err = dev_setup_tc(dev, TC_SETUP_QDISC_CBS, &cbs);
if (err < 0) {
NL_SET_ERR_MSG(extack, "Specified device failed to setup cbs hardware offload");
return err;
diff --git a/net/sched/sch_etf.c b/net/sched/sch_etf.c
index c74d778c32a1..f183c2e9a4e8 100644
--- a/net/sched/sch_etf.c
+++ b/net/sched/sch_etf.c
@@ -297,20 +297,15 @@ static void etf_disable_offload(struct net_device *dev,
struct etf_sched_data *q)
{
struct tc_etf_qopt_offload etf = { };
- const struct net_device_ops *ops;
int err;
if (!q->offload)
return;
- ops = dev->netdev_ops;
- if (!ops->ndo_setup_tc)
- return;
-
etf.queue = q->queue;
etf.enable = 0;
- err = ops->ndo_setup_tc(dev, TC_SETUP_QDISC_ETF, &etf);
+ err = dev_setup_tc(dev, TC_SETUP_QDISC_ETF, &etf);
if (err < 0)
pr_warn("Couldn't disable ETF offload for queue %d\n",
etf.queue);
@@ -331,7 +326,7 @@ static int etf_enable_offload(struct net_device *dev, struct etf_sched_data *q,
etf.queue = q->queue;
etf.enable = 1;
- err = ops->ndo_setup_tc(dev, TC_SETUP_QDISC_ETF, &etf);
+ err = dev_setup_tc(dev, TC_SETUP_QDISC_ETF, &etf);
if (err < 0) {
NL_SET_ERR_MSG(extack, "Specified device failed to setup ETF hardware offload");
return err;
diff --git a/net/sched/sch_ets.c b/net/sched/sch_ets.c
index 516038a44163..1757dd3b0552 100644
--- a/net/sched/sch_ets.c
+++ b/net/sched/sch_ets.c
@@ -117,9 +117,6 @@ static void ets_offload_change(struct Qdisc *sch)
unsigned int weight;
unsigned int i;
- if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc)
- return;
-
qopt.command = TC_ETS_REPLACE;
qopt.handle = sch->handle;
qopt.parent = sch->parent;
@@ -142,7 +139,7 @@ static void ets_offload_change(struct Qdisc *sch)
qopt.replace_params.weights[i] = weight;
}
- dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_ETS, &qopt);
+ dev_setup_tc(dev, TC_SETUP_QDISC_ETS, &qopt);
}
static void ets_offload_destroy(struct Qdisc *sch)
@@ -150,13 +147,10 @@ static void ets_offload_destroy(struct Qdisc *sch)
struct net_device *dev = qdisc_dev(sch);
struct tc_ets_qopt_offload qopt;
- if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc)
- return;
-
qopt.command = TC_ETS_DESTROY;
qopt.handle = sch->handle;
qopt.parent = sch->parent;
- dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_ETS, &qopt);
+ dev_setup_tc(dev, TC_SETUP_QDISC_ETS, &qopt);
}
static void ets_offload_graft(struct Qdisc *sch, struct Qdisc *new,
diff --git a/net/sched/sch_fifo.c b/net/sched/sch_fifo.c
index b50b2c2cc09b..4729a090c876 100644
--- a/net/sched/sch_fifo.c
+++ b/net/sched/sch_fifo.c
@@ -58,13 +58,10 @@ static void fifo_offload_init(struct Qdisc *sch)
struct net_device *dev = qdisc_dev(sch);
struct tc_fifo_qopt_offload qopt;
- if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc)
- return;
-
qopt.command = TC_FIFO_REPLACE;
qopt.handle = sch->handle;
qopt.parent = sch->parent;
- dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_FIFO, &qopt);
+ dev_setup_tc(dev, TC_SETUP_QDISC_FIFO, &qopt);
}
static void fifo_offload_destroy(struct Qdisc *sch)
@@ -72,13 +69,10 @@ static void fifo_offload_destroy(struct Qdisc *sch)
struct net_device *dev = qdisc_dev(sch);
struct tc_fifo_qopt_offload qopt;
- if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc)
- return;
-
qopt.command = TC_FIFO_DESTROY;
qopt.handle = sch->handle;
qopt.parent = sch->parent;
- dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_FIFO, &qopt);
+ dev_setup_tc(dev, TC_SETUP_QDISC_FIFO, &qopt);
}
static int fifo_offload_dump(struct Qdisc *sch)
diff --git a/net/sched/sch_gred.c b/net/sched/sch_gred.c
index ab6234b4fcd5..d1a29842a2b4 100644
--- a/net/sched/sch_gred.c
+++ b/net/sched/sch_gred.c
@@ -314,9 +314,6 @@ static void gred_offload(struct Qdisc *sch, enum tc_gred_command command)
struct net_device *dev = qdisc_dev(sch);
struct tc_gred_qopt_offload *opt = table->opt;
- if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc)
- return;
-
memset(opt, 0, sizeof(*opt));
opt->command = command;
opt->handle = sch->handle;
@@ -348,7 +345,7 @@ static void gred_offload(struct Qdisc *sch, enum tc_gred_command command)
opt->set.qstats = &sch->qstats;
}
- dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_GRED, opt);
+ dev_setup_tc(dev, TC_SETUP_QDISC_GRED, opt);
}
static int gred_offload_dump_stats(struct Qdisc *sch)
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index c31bc5489bdd..ed406b3ceb7b 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -1041,7 +1041,7 @@ static void htb_work_func(struct work_struct *work)
static int htb_offload(struct net_device *dev, struct tc_htb_qopt_offload *opt)
{
- return dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_HTB, opt);
+ return dev_setup_tc(dev, TC_SETUP_QDISC_HTB, opt);
}
static int htb_init(struct Qdisc *sch, struct nlattr *opt,
diff --git a/net/sched/sch_mq.c b/net/sched/sch_mq.c
index c860119a8f09..a5ba59728d63 100644
--- a/net/sched/sch_mq.c
+++ b/net/sched/sch_mq.c
@@ -29,10 +29,7 @@ static int mq_offload(struct Qdisc *sch, enum tc_mq_command cmd)
.handle = sch->handle,
};
- if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc)
- return -EOPNOTSUPP;
-
- return dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_MQ, &opt);
+ return dev_setup_tc(dev, TC_SETUP_QDISC_MQ, &opt);
}
static int mq_offload_stats(struct Qdisc *sch)
diff --git a/net/sched/sch_mqprio.c b/net/sched/sch_mqprio.c
index 51d4013b6121..2c9a90b83c2b 100644
--- a/net/sched/sch_mqprio.c
+++ b/net/sched/sch_mqprio.c
@@ -67,8 +67,7 @@ static int mqprio_enable_offload(struct Qdisc *sch,
mqprio_fp_to_offload(priv->fp, &mqprio);
- err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_MQPRIO,
- &mqprio);
+ err = dev_setup_tc(dev, TC_SETUP_QDISC_MQPRIO, &mqprio);
if (err)
return err;
@@ -86,8 +85,7 @@ static void mqprio_disable_offload(struct Qdisc *sch)
switch (priv->mode) {
case TC_MQPRIO_MODE_DCB:
case TC_MQPRIO_MODE_CHANNEL:
- dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_MQPRIO,
- &mqprio);
+ dev_setup_tc(dev, TC_SETUP_QDISC_MQPRIO, &mqprio);
break;
}
}
diff --git a/net/sched/sch_prio.c b/net/sched/sch_prio.c
index cc30f7a32f1a..276e290b4071 100644
--- a/net/sched/sch_prio.c
+++ b/net/sched/sch_prio.c
@@ -145,9 +145,6 @@ static int prio_offload(struct Qdisc *sch, struct tc_prio_qopt *qopt)
.parent = sch->parent,
};
- if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc)
- return -EOPNOTSUPP;
-
if (qopt) {
opt.command = TC_PRIO_REPLACE;
opt.replace_params.bands = qopt->bands;
@@ -158,7 +155,7 @@ static int prio_offload(struct Qdisc *sch, struct tc_prio_qopt *qopt)
opt.command = TC_PRIO_DESTROY;
}
- return dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_PRIO, &opt);
+ return dev_setup_tc(dev, TC_SETUP_QDISC_PRIO, &opt);
}
static void
diff --git a/net/sched/sch_red.c b/net/sched/sch_red.c
index ef8a2afed26b..235ec57b29b8 100644
--- a/net/sched/sch_red.c
+++ b/net/sched/sch_red.c
@@ -192,9 +192,6 @@ static int red_offload(struct Qdisc *sch, bool enable)
.parent = sch->parent,
};
- if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc)
- return -EOPNOTSUPP;
-
if (enable) {
opt.command = TC_RED_REPLACE;
opt.set.min = q->parms.qth_min >> q->parms.Wlog;
@@ -209,7 +206,7 @@ static int red_offload(struct Qdisc *sch, bool enable)
opt.command = TC_RED_DESTROY;
}
- return dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_RED, &opt);
+ return dev_setup_tc(dev, TC_SETUP_QDISC_RED, &opt);
}
static void red_destroy(struct Qdisc *sch)
@@ -460,8 +457,7 @@ static int red_dump_stats(struct Qdisc *sch, struct gnet_dump *d)
.xstats = &q->stats,
},
};
- dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_RED,
- &hw_stats_request);
+ dev_setup_tc(dev, TC_SETUP_QDISC_RED, &hw_stats_request);
}
st.early = q->stats.prob_drop + q->stats.forced_drop;
st.pdrop = q->stats.pdrop;
diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c
index a68e17891b0b..18b7e59df786 100644
--- a/net/sched/sch_taprio.c
+++ b/net/sched/sch_taprio.c
@@ -1539,7 +1539,7 @@ static int taprio_enable_offload(struct net_device *dev,
for (tc = 0; tc < TC_MAX_QUEUE; tc++)
offload->max_sdu[tc] = q->max_sdu[tc];
- err = ops->ndo_setup_tc(dev, TC_SETUP_QDISC_TAPRIO, offload);
+ err = dev_setup_tc(dev, TC_SETUP_QDISC_TAPRIO, offload);
if (err < 0) {
NL_SET_ERR_MSG_WEAK(extack,
"Device failed to setup taprio offload");
@@ -1579,7 +1579,7 @@ static int taprio_disable_offload(struct net_device *dev,
}
offload->cmd = TAPRIO_CMD_DESTROY;
- err = ops->ndo_setup_tc(dev, TC_SETUP_QDISC_TAPRIO, offload);
+ err = dev_setup_tc(dev, TC_SETUP_QDISC_TAPRIO, offload);
if (err < 0) {
NL_SET_ERR_MSG(extack,
"Device failed to disable offload");
@@ -2314,24 +2314,14 @@ static int taprio_dump_xstats(struct Qdisc *sch, struct gnet_dump *d,
struct tc_taprio_qopt_stats *stats)
{
struct net_device *dev = qdisc_dev(sch);
- const struct net_device_ops *ops;
struct sk_buff *skb = d->skb;
struct nlattr *xstats;
int err;
- ops = qdisc_dev(sch)->netdev_ops;
-
- /* FIXME I could use qdisc_offload_dump_helper(), but that messes
- * with sch->flags depending on whether the device reports taprio
- * stats, and I'm not sure whether that's a good idea, considering
- * that stats are optional to the offload itself
- */
- if (!ops->ndo_setup_tc)
- return 0;
memset(stats, 0xff, sizeof(*stats));
- err = ops->ndo_setup_tc(dev, TC_SETUP_QDISC_TAPRIO, offload);
+ err = dev_setup_tc(dev, TC_SETUP_QDISC_TAPRIO, offload);
if (err == -EOPNOTSUPP)
return 0;
if (err)
diff --git a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c
index dc26b22d53c7..5a2133716446 100644
--- a/net/sched/sch_tbf.c
+++ b/net/sched/sch_tbf.c
@@ -145,9 +145,6 @@ static void tbf_offload_change(struct Qdisc *sch)
struct net_device *dev = qdisc_dev(sch);
struct tc_tbf_qopt_offload qopt;
- if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc)
- return;
-
qopt.command = TC_TBF_REPLACE;
qopt.handle = sch->handle;
qopt.parent = sch->parent;
@@ -155,7 +152,7 @@ static void tbf_offload_change(struct Qdisc *sch)
qopt.replace_params.max_size = q->max_size;
qopt.replace_params.qstats = &sch->qstats;
- dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_TBF, &qopt);
+ dev_setup_tc(dev, TC_SETUP_QDISC_TBF, &qopt);
}
static void tbf_offload_destroy(struct Qdisc *sch)
@@ -163,13 +160,10 @@ static void tbf_offload_destroy(struct Qdisc *sch)
struct net_device *dev = qdisc_dev(sch);
struct tc_tbf_qopt_offload qopt;
- if (!tc_can_offload(dev) || !dev->netdev_ops->ndo_setup_tc)
- return;
-
qopt.command = TC_TBF_DESTROY;
qopt.handle = sch->handle;
qopt.parent = sch->parent;
- dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_QDISC_TBF, &qopt);
+ dev_setup_tc(dev, TC_SETUP_QDISC_TBF, &qopt);
}
static int tbf_offload_dump(struct Qdisc *sch)
--
2.48.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [RFC net-next 3/4] net: Hold netdev instance lock for more NDOs
2025-02-04 23:00 [RFC net-next 0/4] net: Hold netdev instance lock during ndo operations Stanislav Fomichev
2025-02-04 23:00 ` [RFC net-next 1/4] net: Hold netdev instance lock during ndo_open/ndo_stop Stanislav Fomichev
2025-02-04 23:00 ` [RFC net-next 2/4] net: Hold netdev instance lock during ndo_setup_tc Stanislav Fomichev
@ 2025-02-04 23:00 ` Stanislav Fomichev
2025-02-04 23:00 ` [RFC net-next 4/4] net: Hold netdev instance lock during queue operations Stanislav Fomichev
2025-02-14 2:10 ` [RFC net-next 0/4] net: Hold netdev instance lock during ndo operations Saeed Mahameed
4 siblings, 0 replies; 10+ messages in thread
From: Stanislav Fomichev @ 2025-02-04 23:00 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, kuba, pabeni, Saeed Mahameed
Convert all ndo_eth_ioctl invocations to dev_eth_ioctl which does the
locking. Reflow some of the dev_siocxxx to drop else clause.
Fix tabs vs spaces in neighboring lines while at it..
Remove rtnl_lock from ndo_get_stats and clarify that read path can
race with the write path. Still shaper-only drivers (iavf/netdevim).
Cc: Saeed Mahameed <saeed@kernel.org>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
Documentation/networking/netdevices.rst | 28 +++++++++-------
drivers/net/bonding/bond_main.c | 9 +++--
include/linux/netdevice.h | 2 ++
net/8021q/vlan_dev.c | 4 +--
net/core/dev_ioctl.c | 44 +++++++++++++++++--------
net/ieee802154/socket.c | 2 ++
net/phonet/pn_dev.c | 2 ++
7 files changed, 58 insertions(+), 33 deletions(-)
diff --git a/Documentation/networking/netdevices.rst b/Documentation/networking/netdevices.rst
index c6087d92d740..3ed1bf322a5c 100644
--- a/Documentation/networking/netdevices.rst
+++ b/Documentation/networking/netdevices.rst
@@ -221,40 +221,46 @@ struct net_device synchronization rules
Note: netif_running() is guaranteed false
ndo_do_ioctl:
- Synchronization: rtnl_lock() semaphore.
+ Synchronization: rtnl_lock() semaphore. In addition, netdev instance
+ lock if the driver implements shaper API.
Context: process
- This is only called by network subsystems internally,
- not by user space calling ioctl as it was in before
- linux-5.14.
+ This is only called by network subsystems internally,
+ not by user space calling ioctl as it was in before
+ linux-5.14.
ndo_siocbond:
- Synchronization: rtnl_lock() semaphore.
+ Synchronization: rtnl_lock() semaphore. In addition, netdev instance
+ lock if the driver implements shaper API.
Context: process
- Used by the bonding driver for the SIOCBOND family of
- ioctl commands.
+ Used by the bonding driver for the SIOCBOND family of
+ ioctl commands.
ndo_siocwandev:
- Synchronization: rtnl_lock() semaphore.
+ Synchronization: rtnl_lock() semaphore. In addition, netdev instance
+ lock if the driver implements shaper API.
Context: process
Used by the drivers/net/wan framework to handle
the SIOCWANDEV ioctl with the if_settings structure.
ndo_siocdevprivate:
- Synchronization: rtnl_lock() semaphore.
+ Synchronization: rtnl_lock() semaphore. In addition, netdev instance
+ lock if the driver implements shaper API.
Context: process
This is used to implement SIOCDEVPRIVATE ioctl helpers.
These should not be added to new drivers, so don't use.
ndo_eth_ioctl:
- Synchronization: rtnl_lock() semaphore.
+ Synchronization: rtnl_lock() semaphore. In addition, netdev instance
+ lock if the driver implements shaper API.
Context: process
ndo_get_stats:
- Synchronization: rtnl_lock() semaphore, or RCU.
+ Synchronization: RCU (can be called concurrently with the stats
+ update path).
Context: atomic (can't sleep under RCU)
ndo_setup_tc:
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index e45bba240cbc..025d605166c3 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -858,7 +858,6 @@ static int bond_check_dev_link(struct bonding *bond,
struct net_device *slave_dev, int reporting)
{
const struct net_device_ops *slave_ops = slave_dev->netdev_ops;
- int (*ioctl)(struct net_device *, struct ifreq *, int);
struct ifreq ifr;
struct mii_ioctl_data *mii;
@@ -874,8 +873,7 @@ static int bond_check_dev_link(struct bonding *bond,
BMSR_LSTATUS : 0;
/* Ethtool can't be used, fallback to MII ioctls. */
- ioctl = slave_ops->ndo_eth_ioctl;
- if (ioctl) {
+ if (slave_ops->ndo_eth_ioctl) {
/* TODO: set pointer to correct ioctl on a per team member
* bases to make this more efficient. that is, once
* we determine the correct ioctl, we will always
@@ -891,9 +889,10 @@ static int bond_check_dev_link(struct bonding *bond,
/* Yes, the mii is overlaid on the ifreq.ifr_ifru */
strscpy_pad(ifr.ifr_name, slave_dev->name, IFNAMSIZ);
mii = if_mii(&ifr);
- if (ioctl(slave_dev, &ifr, SIOCGMIIPHY) == 0) {
+
+ if (dev_eth_ioctl(slave_dev, &ifr, SIOCGMIIPHY) == 0) {
mii->reg_num = MII_BMSR;
- if (ioctl(slave_dev, &ifr, SIOCGMIIREG) == 0)
+ if (dev_eth_ioctl(slave_dev, &ifr, SIOCGMIIREG) == 0)
return mii->val_out & BMSR_LSTATUS;
}
}
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 6f2eb129ef3e..e49b818054b9 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -4159,6 +4159,8 @@ int put_user_ifreq(struct ifreq *ifr, void __user *arg);
int dev_ioctl(struct net *net, unsigned int cmd, struct ifreq *ifr,
void __user *data, bool *need_copyout);
int dev_ifconf(struct net *net, struct ifconf __user *ifc);
+int dev_eth_ioctl(struct net_device *dev,
+ struct ifreq *ifr, unsigned int cmd);
int generic_hwtstamp_get_lower(struct net_device *dev,
struct kernel_hwtstamp_config *kernel_cfg);
int generic_hwtstamp_set_lower(struct net_device *dev,
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 91d134961357..ee3283400716 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -377,7 +377,6 @@ static int vlan_hwtstamp_set(struct net_device *dev,
static int vlan_dev_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
{
struct net_device *real_dev = vlan_dev_priv(dev)->real_dev;
- const struct net_device_ops *ops = real_dev->netdev_ops;
struct ifreq ifrr;
int err = -EOPNOTSUPP;
@@ -388,8 +387,7 @@ static int vlan_dev_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd)
case SIOCGMIIPHY:
case SIOCGMIIREG:
case SIOCSMIIREG:
- if (netif_device_present(real_dev) && ops->ndo_eth_ioctl)
- err = ops->ndo_eth_ioctl(real_dev, &ifrr, cmd);
+ err = dev_eth_ioctl(real_dev, &ifrr, cmd);
break;
}
diff --git a/net/core/dev_ioctl.c b/net/core/dev_ioctl.c
index 4c2098ac9d72..8dc2c323fe58 100644
--- a/net/core/dev_ioctl.c
+++ b/net/core/dev_ioctl.c
@@ -240,19 +240,23 @@ int net_hwtstamp_validate(const struct kernel_hwtstamp_config *cfg)
return 0;
}
-static int dev_eth_ioctl(struct net_device *dev,
- struct ifreq *ifr, unsigned int cmd)
+int dev_eth_ioctl(struct net_device *dev,
+ struct ifreq *ifr, unsigned int cmd)
{
const struct net_device_ops *ops = dev->netdev_ops;
+ int ret = -ENODEV;
if (!ops->ndo_eth_ioctl)
return -EOPNOTSUPP;
- if (!netif_device_present(dev))
- return -ENODEV;
+ netdev_lock_ops(dev);
+ if (netif_device_present(dev))
+ ret = ops->ndo_eth_ioctl(dev, ifr, cmd);
+ netdev_unlock_ops(dev);
- return ops->ndo_eth_ioctl(dev, ifr, cmd);
+ return ret;
}
+EXPORT_SYMBOL(dev_eth_ioctl);
/**
* dev_get_hwtstamp_phylib() - Get hardware timestamping settings of NIC
@@ -504,10 +508,14 @@ static int dev_siocbond(struct net_device *dev,
const struct net_device_ops *ops = dev->netdev_ops;
if (ops->ndo_siocbond) {
+ int ret = -ENODEV;
+
+ netdev_lock_ops(dev);
if (netif_device_present(dev))
- return ops->ndo_siocbond(dev, ifr, cmd);
- else
- return -ENODEV;
+ ret = ops->ndo_siocbond(dev, ifr, cmd);
+ netdev_unlock_ops(dev);
+
+ return ret;
}
return -EOPNOTSUPP;
@@ -519,10 +527,14 @@ static int dev_siocdevprivate(struct net_device *dev, struct ifreq *ifr,
const struct net_device_ops *ops = dev->netdev_ops;
if (ops->ndo_siocdevprivate) {
+ int ret = -ENODEV;
+
+ netdev_lock_ops(dev);
if (netif_device_present(dev))
- return ops->ndo_siocdevprivate(dev, ifr, data, cmd);
- else
- return -ENODEV;
+ ret = ops->ndo_siocdevprivate(dev, ifr, data, cmd);
+ netdev_unlock_ops(dev);
+
+ return ret;
}
return -EOPNOTSUPP;
@@ -533,10 +545,14 @@ static int dev_siocwandev(struct net_device *dev, struct if_settings *ifs)
const struct net_device_ops *ops = dev->netdev_ops;
if (ops->ndo_siocwandev) {
+ int ret = -ENODEV;
+
+ netdev_lock_ops(dev);
if (netif_device_present(dev))
- return ops->ndo_siocwandev(dev, ifs);
- else
- return -ENODEV;
+ ret = ops->ndo_siocwandev(dev, ifs);
+ netdev_unlock_ops(dev);
+
+ return ret;
}
return -EOPNOTSUPP;
diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c
index 18d267921bb5..b86d090a2ba5 100644
--- a/net/ieee802154/socket.c
+++ b/net/ieee802154/socket.c
@@ -139,8 +139,10 @@ static int ieee802154_dev_ioctl(struct sock *sk, struct ifreq __user *arg,
if (!dev)
return -ENODEV;
+ netdev_lock_ops(dev);
if (dev->type == ARPHRD_IEEE802154 && dev->netdev_ops->ndo_do_ioctl)
ret = dev->netdev_ops->ndo_do_ioctl(dev, &ifr, cmd);
+ netdev_unlock_ops(dev);
if (!ret && put_user_ifreq(&ifr, arg))
ret = -EFAULT;
diff --git a/net/phonet/pn_dev.c b/net/phonet/pn_dev.c
index 5c36bae37b8f..945f2315499a 100644
--- a/net/phonet/pn_dev.c
+++ b/net/phonet/pn_dev.c
@@ -246,8 +246,10 @@ static int phonet_device_autoconf(struct net_device *dev)
if (!dev->netdev_ops->ndo_siocdevprivate)
return -EOPNOTSUPP;
+ netdev_lock_ops(dev);
ret = dev->netdev_ops->ndo_siocdevprivate(dev, (struct ifreq *)&req,
NULL, SIOCPNGAUTOCONF);
+ netdev_unlock_ops(dev);
if (ret < 0)
return ret;
--
2.48.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [RFC net-next 4/4] net: Hold netdev instance lock during queue operations
2025-02-04 23:00 [RFC net-next 0/4] net: Hold netdev instance lock during ndo operations Stanislav Fomichev
` (2 preceding siblings ...)
2025-02-04 23:00 ` [RFC net-next 3/4] net: Hold netdev instance lock for more NDOs Stanislav Fomichev
@ 2025-02-04 23:00 ` Stanislav Fomichev
2025-02-14 2:10 ` [RFC net-next 0/4] net: Hold netdev instance lock during ndo operations Saeed Mahameed
4 siblings, 0 replies; 10+ messages in thread
From: Stanislav Fomichev @ 2025-02-04 23:00 UTC (permalink / raw)
To: netdev; +Cc: davem, edumazet, kuba, pabeni, Saeed Mahameed
For the drivers that use queue management API, switch to the mode where
core stack holds the netdev instance lock. This affects the following drivers:
- bnxt
- gve
- netdevsim
Originally I locked only start/stop, but switched to holding the
lock over all iterations to make them look atomic to the device
(feels like it should be easier to reason about).
Cc: Saeed Mahameed <saeed@kernel.org>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
Documentation/networking/netdevices.rst | 30 ++++++++---
drivers/net/ethernet/broadcom/bnxt/bnxt.c | 54 +++++++++++++++----
.../net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 11 ++--
.../net/ethernet/broadcom/bnxt/bnxt_sriov.c | 2 +
drivers/net/ethernet/google/gve/gve_main.c | 8 +--
drivers/net/ethernet/google/gve/gve_utils.c | 8 +--
drivers/net/netdevsim/netdev.c | 22 ++++----
include/linux/netdevice.h | 2 +-
net/core/netdev_rx_queue.c | 5 ++
9 files changed, 103 insertions(+), 39 deletions(-)
diff --git a/Documentation/networking/netdevices.rst b/Documentation/networking/netdevices.rst
index 3ed1bf322a5c..1ebd3bf011f3 100644
--- a/Documentation/networking/netdevices.rst
+++ b/Documentation/networking/netdevices.rst
@@ -211,18 +211,18 @@ struct net_device synchronization rules
=======================================
ndo_open:
Synchronization: rtnl_lock() semaphore. In addition, netdev instance
- lock if the driver implements shaper API.
+ lock if the driver implements queue management or shaper API.
Context: process
ndo_stop:
Synchronization: rtnl_lock() semaphore. In addition, netdev instance
- lock if the driver implements shaper API.
+ lock if the driver implements queue management or shaper API.
Context: process
Note: netif_running() is guaranteed false
ndo_do_ioctl:
Synchronization: rtnl_lock() semaphore. In addition, netdev instance
- lock if the driver implements shaper API.
+ lock if the driver implements queue management or shaper API.
Context: process
This is only called by network subsystems internally,
@@ -231,7 +231,7 @@ struct net_device synchronization rules
ndo_siocbond:
Synchronization: rtnl_lock() semaphore. In addition, netdev instance
- lock if the driver implements shaper API.
+ lock if the driver implements queue management or shaper API.
Context: process
Used by the bonding driver for the SIOCBOND family of
@@ -239,7 +239,7 @@ struct net_device synchronization rules
ndo_siocwandev:
Synchronization: rtnl_lock() semaphore. In addition, netdev instance
- lock if the driver implements shaper API.
+ lock if the driver implements queue management or shaper API.
Context: process
Used by the drivers/net/wan framework to handle
@@ -247,7 +247,7 @@ struct net_device synchronization rules
ndo_siocdevprivate:
Synchronization: rtnl_lock() semaphore. In addition, netdev instance
- lock if the driver implements shaper API.
+ lock if the driver implements queue management or shaper API.
Context: process
This is used to implement SIOCDEVPRIVATE ioctl helpers.
@@ -255,7 +255,7 @@ struct net_device synchronization rules
ndo_eth_ioctl:
Synchronization: rtnl_lock() semaphore. In addition, netdev instance
- lock if the driver implements shaper API.
+ lock if the driver implements queue management or shaper API.
Context: process
ndo_get_stats:
@@ -265,7 +265,7 @@ struct net_device synchronization rules
ndo_setup_tc:
Synchronization: rtnl_lock() semaphore. In addition, netdev instance
- lock if the driver implements shaper API.
+ lock if the driver implements queue management or shaper API.
ndo_start_xmit:
Synchronization: __netif_tx_lock spinlock.
@@ -310,6 +310,20 @@ struct napi_struct synchronization rules
softirq
will be called with interrupts disabled by netconsole.
+struct netdev_queue_mgmt_ops synchronization rules
+==================================================
+ndo_queue_mem_alloc:
+ Synchronization: Netdev instance lock.
+
+ndo_queue_mem_free:
+ Synchronization: Netdev instance lock.
+
+ndo_queue_start:
+ Synchronization: Netdev instance lock.
+
+ndo_queue_stop:
+ Synchronization: Netdev instance lock.
+
NETDEV_INTERNAL symbol namespace
================================
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 7b8b5b39c7bb..a16dcccfb482 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -11301,7 +11301,7 @@ static int bnxt_request_irq(struct bnxt *bp)
if (rc)
break;
- netif_napi_set_irq(&bp->bnapi[i]->napi, irq->vector);
+ netif_napi_set_irq_locked(&bp->bnapi[i]->napi, irq->vector);
irq->requested = 1;
if (zalloc_cpumask_var(&irq->cpu_mask, GFP_KERNEL)) {
@@ -11337,9 +11337,9 @@ static void bnxt_del_napi(struct bnxt *bp)
for (i = 0; i < bp->cp_nr_rings; i++) {
struct bnxt_napi *bnapi = bp->bnapi[i];
- __netif_napi_del(&bnapi->napi);
+ __netif_napi_del_locked(&bnapi->napi);
}
- /* We called __netif_napi_del(), we need
+ /* We called __netif_napi_del_locked(), we need
* to respect an RCU grace period before freeing napi structures.
*/
synchronize_net();
@@ -11352,18 +11352,20 @@ static void bnxt_init_napi(struct bnxt *bp)
struct bnxt_napi *bnapi;
int i;
+ netdev_assert_locked(bp->dev);
+
if (bp->flags & BNXT_FLAG_CHIP_P5_PLUS)
poll_fn = bnxt_poll_p5;
else if (BNXT_CHIP_TYPE_NITRO_A0(bp))
cp_nr_rings--;
for (i = 0; i < cp_nr_rings; i++) {
bnapi = bp->bnapi[i];
- netif_napi_add_config(bp->dev, &bnapi->napi, poll_fn,
- bnapi->index);
+ netif_napi_add_config_locked(bp->dev, &bnapi->napi, poll_fn,
+ bnapi->index);
}
if (BNXT_CHIP_TYPE_NITRO_A0(bp)) {
bnapi = bp->bnapi[cp_nr_rings];
- netif_napi_add(bp->dev, &bnapi->napi, bnxt_poll_nitroa0);
+ netif_napi_add_locked(bp->dev, &bnapi->napi, bnxt_poll_nitroa0);
}
}
@@ -11375,6 +11377,8 @@ static void bnxt_disable_napi(struct bnxt *bp)
test_and_set_bit(BNXT_STATE_NAPI_DISABLED, &bp->state))
return;
+ netdev_assert_locked(bp->dev);
+
for (i = 0; i < bp->cp_nr_rings; i++) {
struct bnxt_napi *bnapi = bp->bnapi[i];
struct bnxt_cp_ring_info *cpr;
@@ -11384,7 +11388,7 @@ static void bnxt_disable_napi(struct bnxt *bp)
cpr->sw_stats->tx.tx_resets++;
if (bnapi->in_reset)
cpr->sw_stats->rx.rx_resets++;
- napi_disable(&bnapi->napi);
+ napi_disable_locked(&bnapi->napi);
}
}
@@ -11392,6 +11396,8 @@ static void bnxt_enable_napi(struct bnxt *bp)
{
int i;
+ netdev_assert_locked(bp->dev);
+
clear_bit(BNXT_STATE_NAPI_DISABLED, &bp->state);
for (i = 0; i < bp->cp_nr_rings; i++) {
struct bnxt_napi *bnapi = bp->bnapi[i];
@@ -11406,7 +11412,7 @@ static void bnxt_enable_napi(struct bnxt *bp)
INIT_WORK(&cpr->dim.work, bnxt_dim_work);
cpr->dim.mode = DIM_CQ_PERIOD_MODE_START_FROM_EQE;
}
- napi_enable(&bnapi->napi);
+ napi_enable_locked(&bnapi->napi);
}
}
@@ -13291,11 +13297,17 @@ static netdev_features_t bnxt_fix_features(struct net_device *dev,
static int bnxt_reinit_features(struct bnxt *bp, bool irq_re_init,
bool link_re_init, u32 flags, bool update_tpa)
{
+ int rc;
+
+ netdev_lock(bp->dev);
bnxt_close_nic(bp, irq_re_init, link_re_init);
bp->flags = flags;
if (update_tpa)
bnxt_set_ring_params(bp);
- return bnxt_open_nic(bp, irq_re_init, link_re_init);
+ rc = bnxt_open_nic(bp, irq_re_init, link_re_init);
+ netdev_unlock(bp->dev);
+
+ return rc;
}
static int bnxt_set_features(struct net_device *dev, netdev_features_t features)
@@ -13754,11 +13766,13 @@ static void bnxt_rtnl_lock_sp(struct bnxt *bp)
*/
clear_bit(BNXT_STATE_IN_SP_TASK, &bp->state);
rtnl_lock();
+ netdev_lock(bp->dev);
}
static void bnxt_rtnl_unlock_sp(struct bnxt *bp)
{
set_bit(BNXT_STATE_IN_SP_TASK, &bp->state);
+ netdev_unlock(bp->dev);
rtnl_unlock();
}
@@ -14622,8 +14636,10 @@ static void bnxt_fw_reset_task(struct work_struct *work)
}
bp->fw_reset_timestamp = jiffies;
rtnl_lock();
+ netdev_lock(bp->dev);
if (test_bit(BNXT_STATE_ABORT_ERR, &bp->state)) {
bnxt_fw_reset_abort(bp, rc);
+ netdev_unlock(bp->dev);
rtnl_unlock();
goto ulp_start;
}
@@ -14635,6 +14651,7 @@ static void bnxt_fw_reset_task(struct work_struct *work)
bp->fw_reset_state = BNXT_FW_RESET_STATE_ENABLE_DEV;
tmo = bp->fw_reset_min_dsecs * HZ / 10;
}
+ netdev_unlock(bp->dev);
rtnl_unlock();
bnxt_queue_fw_reset_work(bp, tmo);
return;
@@ -14713,7 +14730,9 @@ static void bnxt_fw_reset_task(struct work_struct *work)
bnxt_queue_fw_reset_work(bp, HZ / 10);
return;
}
+ netdev_lock(dev);
rc = bnxt_open(bp->dev);
+ netdev_unlock(dev);
if (rc) {
netdev_err(bp->dev, "bnxt_open() failed during FW reset\n");
bnxt_fw_reset_abort(bp, rc);
@@ -14868,10 +14887,12 @@ static int bnxt_change_mac_addr(struct net_device *dev, void *p)
eth_hw_addr_set(dev, addr->sa_data);
bnxt_clear_usr_fltrs(bp, true);
+ netdev_lock(dev);
if (netif_running(dev)) {
bnxt_close_nic(bp, false, false);
rc = bnxt_open_nic(bp, false, false);
}
+ netdev_unlock(dev);
return rc;
}
@@ -14880,6 +14901,9 @@ static int bnxt_change_mac_addr(struct net_device *dev, void *p)
static int bnxt_change_mtu(struct net_device *dev, int new_mtu)
{
struct bnxt *bp = netdev_priv(dev);
+ int rc = 0;
+
+ netdev_lock(dev);
if (netif_running(dev))
bnxt_close_nic(bp, true, false);
@@ -14896,9 +14920,11 @@ static int bnxt_change_mtu(struct net_device *dev, int new_mtu)
bnxt_set_ring_params(bp);
if (netif_running(dev))
- return bnxt_open_nic(bp, true, false);
+ rc = bnxt_open_nic(bp, true, false);
- return 0;
+ netdev_unlock(dev);
+
+ return rc;
}
int bnxt_setup_mq_tc(struct net_device *dev, u8 tc)
@@ -16426,6 +16452,7 @@ static int bnxt_suspend(struct device *device)
bnxt_ulp_stop(bp);
rtnl_lock();
+ netdev_lock(dev);
if (netif_running(dev)) {
netif_device_detach(dev);
rc = bnxt_close(dev);
@@ -16434,6 +16461,7 @@ static int bnxt_suspend(struct device *device)
bnxt_ptp_clear(bp);
pci_disable_device(bp->pdev);
bnxt_free_ctx_mem(bp, false);
+ netdev_unlock(dev);
rtnl_unlock();
return rc;
}
@@ -16445,6 +16473,7 @@ static int bnxt_resume(struct device *device)
int rc = 0;
rtnl_lock();
+ netdev_lock(dev);
rc = pci_enable_device(bp->pdev);
if (rc) {
netdev_err(dev, "Cannot re-enable PCI device during resume, err = %d\n",
@@ -16487,6 +16516,7 @@ static int bnxt_resume(struct device *device)
}
resume_exit:
+ netdev_unlock(dev);
rtnl_unlock();
bnxt_ulp_start(bp, rc);
if (!rc)
@@ -16655,6 +16685,7 @@ static void bnxt_io_resume(struct pci_dev *pdev)
netdev_info(bp->dev, "PCI Slot Resume\n");
rtnl_lock();
+ netdev_lock(dev);
err = bnxt_hwrm_func_qcaps(bp);
if (!err) {
@@ -16667,6 +16698,7 @@ static void bnxt_io_resume(struct pci_dev *pdev)
if (!err)
netif_device_attach(netdev);
+ netdev_unlock(dev);
rtnl_unlock();
bnxt_ulp_start(bp, err);
if (!err)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index 9c5820839514..2246afcdcea2 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -4929,10 +4929,12 @@ static void bnxt_self_test(struct net_device *dev, struct ethtool_test *etest,
return;
}
+ netdev_lock(dev);
+
memset(buf, 0, sizeof(u64) * bp->num_tests);
if (!netif_running(dev)) {
etest->flags |= ETH_TEST_FL_FAILED;
- return;
+ goto unlock;
}
if ((etest->flags & ETH_TEST_FL_EXTERNAL_LB) &&
@@ -4943,7 +4945,7 @@ static void bnxt_self_test(struct net_device *dev, struct ethtool_test *etest,
if (bp->pf.active_vfs || !BNXT_SINGLE_PF(bp)) {
etest->flags |= ETH_TEST_FL_FAILED;
netdev_warn(dev, "Offline tests cannot be run with active VFs or on shared PF\n");
- return;
+ goto unlock;
}
offline = true;
}
@@ -4965,7 +4967,7 @@ static void bnxt_self_test(struct net_device *dev, struct ethtool_test *etest,
rc = bnxt_half_open_nic(bp);
if (rc) {
etest->flags |= ETH_TEST_FL_FAILED;
- return;
+ goto unlock;
}
buf[BNXT_MACLPBK_TEST_IDX] = 1;
if (bp->mac_flags & BNXT_MAC_FL_NO_MAC_LPBK)
@@ -5017,6 +5019,9 @@ static void bnxt_self_test(struct net_device *dev, struct ethtool_test *etest,
etest->flags |= ETH_TEST_FL_FAILED;
}
}
+
+unlock:
+ netdev_unlock(dev);
}
static int bnxt_reset(struct net_device *dev, u32 *flags)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
index 12b6ed51fd88..dc61cf63fe3f 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
@@ -946,7 +946,9 @@ void bnxt_sriov_disable(struct bnxt *bp)
/* Reclaim all resources for the PF. */
rtnl_lock();
+ netdev_lock(bp->dev);
bnxt_restore_pf_fw_resources(bp);
+ netdev_unlock(bp->dev);
rtnl_unlock();
}
diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c
index 533e659b15b3..cf9bd08d04b2 100644
--- a/drivers/net/ethernet/google/gve/gve_main.c
+++ b/drivers/net/ethernet/google/gve/gve_main.c
@@ -1886,7 +1886,7 @@ static void gve_turndown(struct gve_priv *priv)
netif_queue_set_napi(priv->dev, idx,
NETDEV_QUEUE_TYPE_TX, NULL);
- napi_disable(&block->napi);
+ napi_disable_locked(&block->napi);
}
for (idx = 0; idx < priv->rx_cfg.num_queues; idx++) {
int ntfy_idx = gve_rx_idx_to_ntfy(priv, idx);
@@ -1897,7 +1897,7 @@ static void gve_turndown(struct gve_priv *priv)
netif_queue_set_napi(priv->dev, idx, NETDEV_QUEUE_TYPE_RX,
NULL);
- napi_disable(&block->napi);
+ napi_disable_locked(&block->napi);
}
/* Stop tx queues */
@@ -1925,7 +1925,7 @@ static void gve_turnup(struct gve_priv *priv)
if (!gve_tx_was_added_to_block(priv, idx))
continue;
- napi_enable(&block->napi);
+ napi_enable_locked(&block->napi);
if (idx < priv->tx_cfg.num_queues)
netif_queue_set_napi(priv->dev, idx,
@@ -1953,7 +1953,7 @@ static void gve_turnup(struct gve_priv *priv)
if (!gve_rx_was_added_to_block(priv, idx))
continue;
- napi_enable(&block->napi);
+ napi_enable_locked(&block->napi);
netif_queue_set_napi(priv->dev, idx, NETDEV_QUEUE_TYPE_RX,
&block->napi);
diff --git a/drivers/net/ethernet/google/gve/gve_utils.c b/drivers/net/ethernet/google/gve/gve_utils.c
index 30fef100257e..fa21d240806b 100644
--- a/drivers/net/ethernet/google/gve/gve_utils.c
+++ b/drivers/net/ethernet/google/gve/gve_utils.c
@@ -110,13 +110,15 @@ void gve_add_napi(struct gve_priv *priv, int ntfy_idx,
{
struct gve_notify_block *block = &priv->ntfy_blocks[ntfy_idx];
- netif_napi_add(priv->dev, &block->napi, gve_poll);
- netif_napi_set_irq(&block->napi, block->irq);
+ netdev_assert_locked(priv->dev);
+ netif_napi_add_locked(priv->dev, &block->napi, gve_poll);
+ netif_napi_set_irq_locked(&block->napi, block->irq);
}
void gve_remove_napi(struct gve_priv *priv, int ntfy_idx)
{
struct gve_notify_block *block = &priv->ntfy_blocks[ntfy_idx];
- netif_napi_del(&block->napi);
+ netdev_assert_locked(priv->dev);
+ netif_napi_del_locked(&block->napi);
}
diff --git a/drivers/net/netdevsim/netdev.c b/drivers/net/netdevsim/netdev.c
index efec03b34c9f..8faa5d22289c 100644
--- a/drivers/net/netdevsim/netdev.c
+++ b/drivers/net/netdevsim/netdev.c
@@ -661,7 +661,7 @@ nsim_queue_mem_alloc(struct net_device *dev, void *per_queue_mem, int idx)
goto err_free;
if (!ns->rq_reset_mode)
- netif_napi_add_config(dev, &qmem->rq->napi, nsim_poll, idx);
+ netif_napi_add_config_locked(dev, &qmem->rq->napi, nsim_poll, idx);
return 0;
@@ -678,7 +678,7 @@ static void nsim_queue_mem_free(struct net_device *dev, void *per_queue_mem)
page_pool_destroy(qmem->pp);
if (qmem->rq) {
if (!ns->rq_reset_mode)
- netif_napi_del(&qmem->rq->napi);
+ netif_napi_del_locked(&qmem->rq->napi);
page_pool_destroy(qmem->rq->page_pool);
nsim_queue_free(qmem->rq);
}
@@ -690,9 +690,11 @@ nsim_queue_start(struct net_device *dev, void *per_queue_mem, int idx)
struct nsim_queue_mem *qmem = per_queue_mem;
struct netdevsim *ns = netdev_priv(dev);
+ netdev_assert_locked(dev);
+
if (ns->rq_reset_mode == 1) {
ns->rq[idx]->page_pool = qmem->pp;
- napi_enable(&ns->rq[idx]->napi);
+ napi_enable_locked(&ns->rq[idx]->napi);
return 0;
}
@@ -700,15 +702,15 @@ nsim_queue_start(struct net_device *dev, void *per_queue_mem, int idx)
* here we want to test various call orders.
*/
if (ns->rq_reset_mode == 2) {
- netif_napi_del(&ns->rq[idx]->napi);
- netif_napi_add_config(dev, &qmem->rq->napi, nsim_poll, idx);
+ netif_napi_del_locked(&ns->rq[idx]->napi);
+ netif_napi_add_config_locked(dev, &qmem->rq->napi, nsim_poll, idx);
} else if (ns->rq_reset_mode == 3) {
- netif_napi_add_config(dev, &qmem->rq->napi, nsim_poll, idx);
- netif_napi_del(&ns->rq[idx]->napi);
+ netif_napi_add_config_locked(dev, &qmem->rq->napi, nsim_poll, idx);
+ netif_napi_del_locked(&ns->rq[idx]->napi);
}
ns->rq[idx] = qmem->rq;
- napi_enable(&ns->rq[idx]->napi);
+ napi_enable_locked(&ns->rq[idx]->napi);
return 0;
}
@@ -718,7 +720,9 @@ static int nsim_queue_stop(struct net_device *dev, void *per_queue_mem, int idx)
struct nsim_queue_mem *qmem = per_queue_mem;
struct netdevsim *ns = netdev_priv(dev);
- napi_disable(&ns->rq[idx]->napi);
+ netdev_assert_locked(dev);
+
+ napi_disable_locked(&ns->rq[idx]->napi);
if (ns->rq_reset_mode == 1) {
qmem->pp = ns->rq[idx]->page_pool;
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index e49b818054b9..cafa587233fd 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2719,7 +2719,7 @@ static inline void netdev_assert_locked_or_invisible(struct net_device *dev)
static inline bool need_netdev_ops_lock(struct net_device *dev)
{
- bool ret = false;
+ bool ret = !!(dev)->queue_mgmt_ops;
#if IS_ENABLED(CONFIG_NET_SHAPER)
ret |= !!(dev)->netdev_ops->net_shaper_ops;
diff --git a/net/core/netdev_rx_queue.c b/net/core/netdev_rx_queue.c
index db82786fa0c4..04a681aef907 100644
--- a/net/core/netdev_rx_queue.c
+++ b/net/core/netdev_rx_queue.c
@@ -30,6 +30,8 @@ int netdev_rx_queue_restart(struct net_device *dev, unsigned int rxq_idx)
goto err_free_new_mem;
}
+ netdev_lock(dev);
+
err = dev->queue_mgmt_ops->ndo_queue_mem_alloc(dev, new_mem, rxq_idx);
if (err)
goto err_free_old_mem;
@@ -48,6 +50,8 @@ int netdev_rx_queue_restart(struct net_device *dev, unsigned int rxq_idx)
dev->queue_mgmt_ops->ndo_queue_mem_free(dev, old_mem);
+ netdev_unlock(dev);
+
kvfree(old_mem);
kvfree(new_mem);
@@ -72,6 +76,7 @@ int netdev_rx_queue_restart(struct net_device *dev, unsigned int rxq_idx)
dev->queue_mgmt_ops->ndo_queue_mem_free(dev, new_mem);
err_free_old_mem:
+ netdev_unlock(dev);
kvfree(old_mem);
err_free_new_mem:
--
2.48.1
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [RFC net-next 1/4] net: Hold netdev instance lock during ndo_open/ndo_stop
2025-02-04 23:00 ` [RFC net-next 1/4] net: Hold netdev instance lock during ndo_open/ndo_stop Stanislav Fomichev
@ 2025-02-05 19:32 ` Joe Damato
2025-02-05 21:31 ` Stanislav Fomichev
0 siblings, 1 reply; 10+ messages in thread
From: Joe Damato @ 2025-02-05 19:32 UTC (permalink / raw)
To: Stanislav Fomichev; +Cc: netdev, davem, edumazet, kuba, pabeni, Saeed Mahameed
On Tue, Feb 04, 2025 at 03:00:54PM -0800, Stanislav Fomichev wrote:
> For the drivers that use shaper API, switch to the mode where
> core stack holds the netdev lock. This affects two drivers:
>
> * iavf - already grabs netdev lock in ndo_open/ndo_stop, so mostly
> remove these
> * netdevsim - switch to _locked APIs to avoid deadlock
>
> Cc: Saeed Mahameed <saeed@kernel.org>
> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
> ---
> Documentation/networking/netdevices.rst | 6 ++++--
> drivers/net/ethernet/intel/iavf/iavf_main.c | 14 ++++++-------
> drivers/net/netdevsim/netdev.c | 14 ++++++++-----
> include/linux/netdevice.h | 23 +++++++++++++++++++++
> net/core/dev.c | 12 +++++++++++
> net/core/dev.h | 6 ++++--
> 6 files changed, 58 insertions(+), 17 deletions(-)
[...]
> @@ -4474,12 +4471,12 @@ static int iavf_close(struct net_device *netdev)
> u64 aq_to_restore;
> int status;
>
> - netdev_lock(netdev);
> + netdev_assert_locked(netdev);
> +
> mutex_lock(&adapter->crit_lock);
>
> if (adapter->state <= __IAVF_DOWN_PENDING) {
> mutex_unlock(&adapter->crit_lock);
> - netdev_unlock(netdev);
> return 0;
> }
>
> @@ -4532,6 +4529,7 @@ static int iavf_close(struct net_device *netdev)
> if (!status)
> netdev_warn(netdev, "Device resources not yet released\n");
>
> + netdev_lock(netdev);
I'm probably just misreading the rest of the patch, but I was just
wondering: is this netdev_lock call here intentional? I am asking
because I thought the lock was taken in ndo_stop before this is
called?
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC net-next 1/4] net: Hold netdev instance lock during ndo_open/ndo_stop
2025-02-05 19:32 ` Joe Damato
@ 2025-02-05 21:31 ` Stanislav Fomichev
2025-02-05 22:32 ` Joe Damato
0 siblings, 1 reply; 10+ messages in thread
From: Stanislav Fomichev @ 2025-02-05 21:31 UTC (permalink / raw)
To: Joe Damato, Stanislav Fomichev, netdev, davem, edumazet, kuba,
pabeni, Saeed Mahameed
On 02/05, Joe Damato wrote:
> On Tue, Feb 04, 2025 at 03:00:54PM -0800, Stanislav Fomichev wrote:
> > For the drivers that use shaper API, switch to the mode where
> > core stack holds the netdev lock. This affects two drivers:
> >
> > * iavf - already grabs netdev lock in ndo_open/ndo_stop, so mostly
> > remove these
> > * netdevsim - switch to _locked APIs to avoid deadlock
> >
> > Cc: Saeed Mahameed <saeed@kernel.org>
> > Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
> > ---
> > Documentation/networking/netdevices.rst | 6 ++++--
> > drivers/net/ethernet/intel/iavf/iavf_main.c | 14 ++++++-------
> > drivers/net/netdevsim/netdev.c | 14 ++++++++-----
> > include/linux/netdevice.h | 23 +++++++++++++++++++++
> > net/core/dev.c | 12 +++++++++++
> > net/core/dev.h | 6 ++++--
> > 6 files changed, 58 insertions(+), 17 deletions(-)
>
> [...]
>
> > @@ -4474,12 +4471,12 @@ static int iavf_close(struct net_device *netdev)
> > u64 aq_to_restore;
> > int status;
> >
> > - netdev_lock(netdev);
> > + netdev_assert_locked(netdev);
> > +
> > mutex_lock(&adapter->crit_lock);
> >
> > if (adapter->state <= __IAVF_DOWN_PENDING) {
> > mutex_unlock(&adapter->crit_lock);
> > - netdev_unlock(netdev);
> > return 0;
> > }
> >
> > @@ -4532,6 +4529,7 @@ static int iavf_close(struct net_device *netdev)
> > if (!status)
> > netdev_warn(netdev, "Device resources not yet released\n");
> >
> > + netdev_lock(netdev);
>
> I'm probably just misreading the rest of the patch, but I was just
> wondering: is this netdev_lock call here intentional? I am asking
> because I thought the lock was taken in ndo_stop before this is
> called?
Yes, this part is a bit confusing. Existing iavf_close looks like
this:
iavf_close() {
netdev_lock()
..
netdev_unlock()
wait_event_timeout(down_waitqueue)
}
I change it to the following:
netdev_lock()
iavf_close() {
..
netdev_unlock()
wait_event_timeout(down_waitqueue)
netdev_lock()
}
netdev_unlock()
And the diff is confusing because I reuse existing netdev_lock call,
so it looks like I only add netdev_unlock...
I don't think I can hold instance lock during wait_event_timeout because
the wake_up(down_waitqueue) callers grab netdev instance lock as well.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC net-next 1/4] net: Hold netdev instance lock during ndo_open/ndo_stop
2025-02-05 21:31 ` Stanislav Fomichev
@ 2025-02-05 22:32 ` Joe Damato
0 siblings, 0 replies; 10+ messages in thread
From: Joe Damato @ 2025-02-05 22:32 UTC (permalink / raw)
To: Stanislav Fomichev
Cc: Stanislav Fomichev, netdev, davem, edumazet, kuba, pabeni,
Saeed Mahameed
On Wed, Feb 05, 2025 at 01:31:40PM -0800, Stanislav Fomichev wrote:
> On 02/05, Joe Damato wrote:
> > On Tue, Feb 04, 2025 at 03:00:54PM -0800, Stanislav Fomichev wrote:
> > > For the drivers that use shaper API, switch to the mode where
> > > core stack holds the netdev lock. This affects two drivers:
> > >
> > > * iavf - already grabs netdev lock in ndo_open/ndo_stop, so mostly
> > > remove these
> > > * netdevsim - switch to _locked APIs to avoid deadlock
> > >
> > > Cc: Saeed Mahameed <saeed@kernel.org>
> > > Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
> > > ---
> > > Documentation/networking/netdevices.rst | 6 ++++--
> > > drivers/net/ethernet/intel/iavf/iavf_main.c | 14 ++++++-------
> > > drivers/net/netdevsim/netdev.c | 14 ++++++++-----
> > > include/linux/netdevice.h | 23 +++++++++++++++++++++
> > > net/core/dev.c | 12 +++++++++++
> > > net/core/dev.h | 6 ++++--
> > > 6 files changed, 58 insertions(+), 17 deletions(-)
> >
> > [...]
> >
> > > @@ -4474,12 +4471,12 @@ static int iavf_close(struct net_device *netdev)
> > > u64 aq_to_restore;
> > > int status;
> > >
> > > - netdev_lock(netdev);
> > > + netdev_assert_locked(netdev);
> > > +
> > > mutex_lock(&adapter->crit_lock);
> > >
> > > if (adapter->state <= __IAVF_DOWN_PENDING) {
> > > mutex_unlock(&adapter->crit_lock);
> > > - netdev_unlock(netdev);
> > > return 0;
> > > }
> > >
> > > @@ -4532,6 +4529,7 @@ static int iavf_close(struct net_device *netdev)
> > > if (!status)
> > > netdev_warn(netdev, "Device resources not yet released\n");
> > >
> > > + netdev_lock(netdev);
> >
> > I'm probably just misreading the rest of the patch, but I was just
> > wondering: is this netdev_lock call here intentional? I am asking
> > because I thought the lock was taken in ndo_stop before this is
> > called?
>
> Yes, this part is a bit confusing. Existing iavf_close looks like
> this:
>
> iavf_close() {
> netdev_lock()
> ..
> netdev_unlock()
> wait_event_timeout(down_waitqueue)
> }
>
> I change it to the following:
>
> netdev_lock()
> iavf_close() {
> ..
> netdev_unlock()
> wait_event_timeout(down_waitqueue)
> netdev_lock()
> }
> netdev_unlock()
>
> And the diff is confusing because I reuse existing netdev_lock call,
> so it looks like I only add netdev_unlock...
Ah, I see. Thanks for explaining that; I looked at it a second time
more closely and I see that you are correct and I did miss the
existing netdev_lock call.
> I don't think I can hold instance lock during wait_event_timeout because
> the wake_up(down_waitqueue) callers grab netdev instance lock as well.
Agreed.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC net-next 0/4] net: Hold netdev instance lock during ndo operations
2025-02-04 23:00 [RFC net-next 0/4] net: Hold netdev instance lock during ndo operations Stanislav Fomichev
` (3 preceding siblings ...)
2025-02-04 23:00 ` [RFC net-next 4/4] net: Hold netdev instance lock during queue operations Stanislav Fomichev
@ 2025-02-14 2:10 ` Saeed Mahameed
2025-02-14 2:55 ` Stanislav Fomichev
4 siblings, 1 reply; 10+ messages in thread
From: Saeed Mahameed @ 2025-02-14 2:10 UTC (permalink / raw)
To: Stanislav Fomichev; +Cc: netdev, davem, edumazet, kuba, pabeni
On 04 Feb 15:00, Stanislav Fomichev wrote:
>As the gradual purging of rtnl continues, start grabbing netdev
>instance lock in more places so we can get to the state where
>most paths are working without rtnl. Start with requiring the
>drivers that use shaper api (and later queue mgmt api) to work
>with both rtnl and netdev instance lock. Eventually we might
>attempt to drop rtnl. This mostly affects iavf, gve, bnxt and
>netdev sim (as the drivers that implement shaper/queue mgmt)
>so those drivers are converted in the process.
>
>This is part one of the process, the next step is to do similar locking
>for the rest of ndo handlers that are being called from sysfs/ethtool/netlink.
Hi Stan, thanks for the patch, sorry I didn't have the time that week to
look at it and it fill between the cracks, I've glanced through the patches
quickly and they seem reasonable. but obviously we need much more,
so what's the plan? currently I am not able to personally work on
this.
Also the locking scheme is still not well define with this opt-in idea the
locking shceme is actually still not clear to me? for me it should be as easy
as netdev_lock protects all paths including, ndos/ioctl/netlinks/etc .. paths
that will access the netdev's underlying driver queues.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [RFC net-next 0/4] net: Hold netdev instance lock during ndo operations
2025-02-14 2:10 ` [RFC net-next 0/4] net: Hold netdev instance lock during ndo operations Saeed Mahameed
@ 2025-02-14 2:55 ` Stanislav Fomichev
0 siblings, 0 replies; 10+ messages in thread
From: Stanislav Fomichev @ 2025-02-14 2:55 UTC (permalink / raw)
To: Saeed Mahameed; +Cc: Stanislav Fomichev, netdev, davem, edumazet, kuba, pabeni
On 02/13, Saeed Mahameed wrote:
> On 04 Feb 15:00, Stanislav Fomichev wrote:
> > As the gradual purging of rtnl continues, start grabbing netdev
> > instance lock in more places so we can get to the state where
> > most paths are working without rtnl. Start with requiring the
> > drivers that use shaper api (and later queue mgmt api) to work
> > with both rtnl and netdev instance lock. Eventually we might
> > attempt to drop rtnl. This mostly affects iavf, gve, bnxt and
> > netdev sim (as the drivers that implement shaper/queue mgmt)
> > so those drivers are converted in the process.
> >
> > This is part one of the process, the next step is to do similar locking
> > for the rest of ndo handlers that are being called from sysfs/ethtool/netlink.
>
> Hi Stan, thanks for the patch, sorry I didn't have the time that week to
> look at it and it fill between the cracks, I've glanced through the patches
> quickly and they seem reasonable. but obviously we need much more, so what's
> the plan? currently I am not able to personally work on
> this.
>
> Also the locking scheme is still not well define with this opt-in idea the
> locking shceme is actually still not clear to me? for me it should be as easy
> as netdev_lock protects all paths including, ndos/ioctl/netlinks/etc .. paths
> that will access the netdev's underlying driver queues.
Hey Saeed,
There is a follow up which locks the other paths:
https://lore.kernel.org/netdev/20250210192043.439074-1-sdf@fomichev.me/
I'm gonna try to do a v2 tomorrow to address Jakub's feedback. There is no
opt-in currently; any driver that uses shaper/queue-mgmt API will have
its ndo running with the instance lock. Can you try to run your mlx5
queue-mgmt changes on top of it? (v1 should be good enough, no need to
wait for v2)
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-02-14 2:55 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-04 23:00 [RFC net-next 0/4] net: Hold netdev instance lock during ndo operations Stanislav Fomichev
2025-02-04 23:00 ` [RFC net-next 1/4] net: Hold netdev instance lock during ndo_open/ndo_stop Stanislav Fomichev
2025-02-05 19:32 ` Joe Damato
2025-02-05 21:31 ` Stanislav Fomichev
2025-02-05 22:32 ` Joe Damato
2025-02-04 23:00 ` [RFC net-next 2/4] net: Hold netdev instance lock during ndo_setup_tc Stanislav Fomichev
2025-02-04 23:00 ` [RFC net-next 3/4] net: Hold netdev instance lock for more NDOs Stanislav Fomichev
2025-02-04 23:00 ` [RFC net-next 4/4] net: Hold netdev instance lock during queue operations Stanislav Fomichev
2025-02-14 2:10 ` [RFC net-next 0/4] net: Hold netdev instance lock during ndo operations Saeed Mahameed
2025-02-14 2:55 ` Stanislav Fomichev
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).