* [PATCH net-next 0/2] net: do not rely on rtnl in netdev_nl_napi_get_xxx()
@ 2024-10-09 23:27 Eric Dumazet
2024-10-09 23:27 ` [PATCH net-next 1/2] netdev-genl: do not use rtnl in netdev_nl_napi_get_doit() Eric Dumazet
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Eric Dumazet @ 2024-10-09 23:27 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, netdev, eric.dumazet, Eric Dumazet
With upcoming per netns RTNL, rtnl use in netdev_nl_napi_get_doit()
and netdev_nl_napi_get_dumpit() is a bit problematic.
They can be changed to only rely on RCU.
Eric Dumazet (2):
netdev-genl: do not use rtnl in netdev_nl_napi_get_doit()
netdev-genl: do not use rtnl in netdev_nl_napi_get_dumpit()
include/linux/netdevice.h | 3 ++-
net/core/dev.c | 21 ++++++++++++---------
net/core/netdev-genl.c | 31 ++++++++++++++++---------------
3 files changed, 30 insertions(+), 25 deletions(-)
--
2.47.0.rc0.187.ge670bccf7e-goog
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH net-next 1/2] netdev-genl: do not use rtnl in netdev_nl_napi_get_doit()
2024-10-09 23:27 [PATCH net-next 0/2] net: do not rely on rtnl in netdev_nl_napi_get_xxx() Eric Dumazet
@ 2024-10-09 23:27 ` Eric Dumazet
2024-10-09 23:27 ` [PATCH net-next 2/2] netdev-genl: do not use rtnl in netdev_nl_napi_get_dumpit() Eric Dumazet
2024-10-10 15:55 ` [PATCH net-next 0/2] net: do not rely on rtnl in netdev_nl_napi_get_xxx() Jakub Kicinski
2 siblings, 0 replies; 4+ messages in thread
From: Eric Dumazet @ 2024-10-09 23:27 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, netdev, eric.dumazet, Eric Dumazet
With upcoming per netns RTNL, rtnl use in netdev_nl_napi_get_doit()
is a bit problematic.
Prepare netdev_nl_napi_fill_one() to not rely on RTNL:
1) netif_napi_set_irq() uses WRITE_ONCE(napi->irq, ...)
2) napi_kthread_create() uses WRITE_ONCE(napi->thread, ...)
3) Add napi->thread_pid_nr to avoid race in netdev_nl_napi_fill_one
and __netif_napi_del()
4) netdev_nl_napi_fill_one() uses corresponding READ_ONCE()
5) netdev_nl_napi_get_doit() can use RCU instead of RTNL
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
include/linux/netdevice.h | 3 ++-
net/core/dev.c | 21 ++++++++++++---------
net/core/netdev-genl.c | 21 +++++++++++----------
3 files changed, 25 insertions(+), 20 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 3baf8e539b6f33caaf83961c4cf619b799e5e41d..64a5e4927901740db8dbc255ed19faca96820333 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -377,6 +377,7 @@ struct napi_struct {
struct list_head dev_list;
struct hlist_node napi_hash_node;
int irq;
+ pid_t thread_pid_nr;
};
enum {
@@ -2618,7 +2619,7 @@ void netif_queue_set_napi(struct net_device *dev, unsigned int queue_index,
static inline void netif_napi_set_irq(struct napi_struct *napi, int irq)
{
- napi->irq = irq;
+ WRITE_ONCE(napi->irq, irq);
}
/* Default NAPI poll() weight
diff --git a/net/core/dev.c b/net/core/dev.c
index ea5fbcd133ae4c743545945def00790ec74e2bb6..77c39a95e74df2485777bc008a507bdcc4e75a00 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1423,21 +1423,23 @@ static int napi_threaded_poll(void *data);
static int napi_kthread_create(struct napi_struct *n)
{
- int err = 0;
+ struct task_struct *thread;
/* Create and wake up the kthread once to put it in
* TASK_INTERRUPTIBLE mode to avoid the blocked task
* warning and work with loadavg.
*/
- n->thread = kthread_run(napi_threaded_poll, n, "napi/%s-%d",
+ thread = kthread_run(napi_threaded_poll, n, "napi/%s-%d",
n->dev->name, n->napi_id);
- if (IS_ERR(n->thread)) {
- err = PTR_ERR(n->thread);
+ if (IS_ERR(thread)) {
+ int err = PTR_ERR(thread);
+
pr_err("kthread_run failed with err %d\n", err);
- n->thread = NULL;
+ return err;
}
-
- return err;
+ WRITE_ONCE(n->thread, thread);
+ WRITE_ONCE(n->thread_pid_nr, task_pid_nr(thread));
+ return 0;
}
static int __dev_open(struct net_device *dev, struct netlink_ext_ack *extack)
@@ -6668,6 +6670,7 @@ void netif_napi_add_weight(struct net_device *dev, struct napi_struct *napi,
set_bit(NAPI_STATE_SCHED, &napi->state);
set_bit(NAPI_STATE_NPSVC, &napi->state);
list_add_rcu(&napi->dev_list, &dev->napi_list);
+ netif_napi_set_irq(napi, -1);
napi_hash_add(napi);
napi_get_frags_check(napi);
/* Create kthread for this napi if dev->threaded is set.
@@ -6676,7 +6679,6 @@ void netif_napi_add_weight(struct net_device *dev, struct napi_struct *napi,
*/
if (dev->threaded && napi_kthread_create(napi))
dev->threaded = false;
- netif_napi_set_irq(napi, -1);
}
EXPORT_SYMBOL(netif_napi_add_weight);
@@ -6753,7 +6755,8 @@ void __netif_napi_del(struct napi_struct *napi)
if (napi->thread) {
kthread_stop(napi->thread);
- napi->thread = NULL;
+ WRITE_ONCE(napi->thread, NULL);
+ WRITE_ONCE(napi->thread_pid_nr, 0);
}
}
EXPORT_SYMBOL(__netif_napi_del);
diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
index 1cb954f2d39e8248bffd854cdf27eceb25293425..0dcfe3527c122884c5713e56d5e27d4e638d936f 100644
--- a/net/core/netdev-genl.c
+++ b/net/core/netdev-genl.c
@@ -163,10 +163,11 @@ netdev_nl_napi_fill_one(struct sk_buff *rsp, struct napi_struct *napi,
{
void *hdr;
pid_t pid;
+ int irq;
if (WARN_ON_ONCE(!napi->dev))
return -EINVAL;
- if (!(napi->dev->flags & IFF_UP))
+ if (!(READ_ONCE(napi->dev->flags) & IFF_UP))
return 0;
hdr = genlmsg_iput(rsp, info);
@@ -177,17 +178,17 @@ netdev_nl_napi_fill_one(struct sk_buff *rsp, struct napi_struct *napi,
nla_put_u32(rsp, NETDEV_A_NAPI_ID, napi->napi_id))
goto nla_put_failure;
- if (nla_put_u32(rsp, NETDEV_A_NAPI_IFINDEX, napi->dev->ifindex))
+ if (nla_put_u32(rsp, NETDEV_A_NAPI_IFINDEX,
+ READ_ONCE(napi->dev->ifindex)))
goto nla_put_failure;
- if (napi->irq >= 0 && nla_put_u32(rsp, NETDEV_A_NAPI_IRQ, napi->irq))
+ irq = READ_ONCE(napi->irq);
+ if (irq >= 0 && nla_put_u32(rsp, NETDEV_A_NAPI_IRQ, irq))
goto nla_put_failure;
- if (napi->thread) {
- pid = task_pid_nr(napi->thread);
- if (nla_put_u32(rsp, NETDEV_A_NAPI_PID, pid))
- goto nla_put_failure;
- }
+ pid = READ_ONCE(napi->thread_pid_nr);
+ if (pid && nla_put_u32(rsp, NETDEV_A_NAPI_PID, pid))
+ goto nla_put_failure;
genlmsg_end(rsp, hdr);
@@ -214,7 +215,7 @@ int netdev_nl_napi_get_doit(struct sk_buff *skb, struct genl_info *info)
if (!rsp)
return -ENOMEM;
- rtnl_lock();
+ rcu_read_lock();
napi = napi_by_id(napi_id);
if (napi) {
@@ -224,7 +225,7 @@ int netdev_nl_napi_get_doit(struct sk_buff *skb, struct genl_info *info)
err = -ENOENT;
}
- rtnl_unlock();
+ rcu_read_unlock();
if (err)
goto err_free_msg;
--
2.47.0.rc0.187.ge670bccf7e-goog
^ permalink raw reply related [flat|nested] 4+ messages in thread
* [PATCH net-next 2/2] netdev-genl: do not use rtnl in netdev_nl_napi_get_dumpit()
2024-10-09 23:27 [PATCH net-next 0/2] net: do not rely on rtnl in netdev_nl_napi_get_xxx() Eric Dumazet
2024-10-09 23:27 ` [PATCH net-next 1/2] netdev-genl: do not use rtnl in netdev_nl_napi_get_doit() Eric Dumazet
@ 2024-10-09 23:27 ` Eric Dumazet
2024-10-10 15:55 ` [PATCH net-next 0/2] net: do not rely on rtnl in netdev_nl_napi_get_xxx() Jakub Kicinski
2 siblings, 0 replies; 4+ messages in thread
From: Eric Dumazet @ 2024-10-09 23:27 UTC (permalink / raw)
To: David S . Miller, Jakub Kicinski, Paolo Abeni
Cc: Kuniyuki Iwashima, netdev, eric.dumazet, Eric Dumazet
Both netdev_nl_napi_dump_one() and netdev_nl_napi_get_dumpit()
can use RCU instead of RTNL to dump napi related information,
after prior patch prepared netdev_nl_napi_fill_one().
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/core/netdev-genl.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c
index 0dcfe3527c122884c5713e56d5e27d4e638d936f..22f766619630f3dc43e3b0ed1708fa9ef38a5451 100644
--- a/net/core/netdev-genl.c
+++ b/net/core/netdev-genl.c
@@ -245,10 +245,10 @@ netdev_nl_napi_dump_one(struct net_device *netdev, struct sk_buff *rsp,
struct napi_struct *napi;
int err = 0;
- if (!(netdev->flags & IFF_UP))
+ if (!(READ_ONCE(netdev->flags) & IFF_UP))
return err;
- list_for_each_entry(napi, &netdev->napi_list, dev_list) {
+ list_for_each_entry_rcu(napi, &netdev->napi_list, dev_list) {
if (ctx->napi_id && napi->napi_id >= ctx->napi_id)
continue;
@@ -272,9 +272,9 @@ int netdev_nl_napi_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
if (info->attrs[NETDEV_A_NAPI_IFINDEX])
ifindex = nla_get_u32(info->attrs[NETDEV_A_NAPI_IFINDEX]);
- rtnl_lock();
+ rcu_read_lock();
if (ifindex) {
- netdev = __dev_get_by_index(net, ifindex);
+ netdev = dev_get_by_index_rcu(net, ifindex);
if (netdev)
err = netdev_nl_napi_dump_one(netdev, skb, info, ctx);
else
@@ -287,7 +287,7 @@ int netdev_nl_napi_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
ctx->napi_id = 0;
}
}
- rtnl_unlock();
+ rcu_read_unlock();
return err;
}
--
2.47.0.rc0.187.ge670bccf7e-goog
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH net-next 0/2] net: do not rely on rtnl in netdev_nl_napi_get_xxx()
2024-10-09 23:27 [PATCH net-next 0/2] net: do not rely on rtnl in netdev_nl_napi_get_xxx() Eric Dumazet
2024-10-09 23:27 ` [PATCH net-next 1/2] netdev-genl: do not use rtnl in netdev_nl_napi_get_doit() Eric Dumazet
2024-10-09 23:27 ` [PATCH net-next 2/2] netdev-genl: do not use rtnl in netdev_nl_napi_get_dumpit() Eric Dumazet
@ 2024-10-10 15:55 ` Jakub Kicinski
2 siblings, 0 replies; 4+ messages in thread
From: Jakub Kicinski @ 2024-10-10 15:55 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S . Miller, Paolo Abeni, Kuniyuki Iwashima, netdev,
eric.dumazet
On Wed, 9 Oct 2024 23:27:25 +0000 Eric Dumazet wrote:
> With upcoming per netns RTNL, rtnl use in netdev_nl_napi_get_doit()
> and netdev_nl_napi_get_dumpit() is a bit problematic.
>
> They can be changed to only rely on RCU.
Oh, you already posted this.
I was hoping we can start moving NAPIs and queues under dev->lock,
gradually move all local netdev state under that lock. RCU only
gets us so far when we have to cross reference objects (e.g. link
queue to NAPI). Even with NAPI SET not sure if RCU will work given
the writes have to happen into the instance and the "config struct".
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2024-10-10 15:56 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-09 23:27 [PATCH net-next 0/2] net: do not rely on rtnl in netdev_nl_napi_get_xxx() Eric Dumazet
2024-10-09 23:27 ` [PATCH net-next 1/2] netdev-genl: do not use rtnl in netdev_nl_napi_get_doit() Eric Dumazet
2024-10-09 23:27 ` [PATCH net-next 2/2] netdev-genl: do not use rtnl in netdev_nl_napi_get_dumpit() Eric Dumazet
2024-10-10 15:55 ` [PATCH net-next 0/2] net: do not rely on rtnl in netdev_nl_napi_get_xxx() Jakub Kicinski
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).