* [RFC PATCH net-next] netpoll: hold RCU while walking napi_list
@ 2026-06-27 10:12 Runyu Xiao
2026-06-27 21:21 ` Jakub Kicinski
0 siblings, 1 reply; 5+ messages in thread
From: Runyu Xiao @ 2026-06-27 10:12 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni
Cc: horms, leitao, sashal, bigeasy, netdev, linux-kernel, runyu.xiao,
jianhao.xu
poll_napi() walks dev->napi_list with list_for_each_entry_rcu(). Some
netpoll send paths are already inside an RCU read-side section, but the
helper itself does not document or enforce that contract.

CONFIG_PROVE_RCU_LIST reports the poll_napi() traversal when the helper
is exercised directly from netpoll_poll_dev(). The current source has
important lifetime defenses around NAPI deletion and netpoll device
close, so this is not presented as a proven use-after-free. The issue is
that the RCU-list reader contract is implicit at the helper boundary.

Take rcu_read_lock() locally while walking the NAPI list. This keeps the
contract with netif_napi_del() and synchronize_net() explicit and avoids
relying on every current or future caller to provide the read-side
section.

This was found by our static analysis tool and then manually reviewed
against the current tree. CONFIG_PROVE_RCU_LIST was used as
target-matched triage evidence; the RFC is limited to making the
helper's RCU-list reader contract explicit.

This is an RFC because maintainers may prefer to express the existing
netpoll dev_lock/NAPI-list lifetime contract instead of adding a local
RCU reader around the polling loop.

Signed-off-by: Runyu Xiao <runyu.xiao@seu.edu.cn>
---
net/core/netpoll.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 5af14f14a362..2e13ca0d09fe 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -165,12 +165,14 @@ static void poll_napi(struct net_device *dev)
 	struct napi_struct *napi;
 	int cpu = smp_processor_id();
 
+	rcu_read_lock();
 	list_for_each_entry_rcu(napi, &dev->napi_list, dev_list) {
 		if (cmpxchg(&napi->poll_owner, -1, cpu) == -1) {
 			poll_one_napi(napi);
 			smp_store_release(&napi->poll_owner, -1);
 		}
 	}
+	rcu_read_unlock();
 }
 
 void netpoll_poll_dev(struct net_device *dev)
--
2.34.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [RFC PATCH net-next] netpoll: hold RCU while walking napi_list
2026-06-27 10:12 [RFC PATCH net-next] netpoll: hold RCU while walking napi_list Runyu Xiao
@ 2026-06-27 21:21 ` Jakub Kicinski
2026-06-28 5:04 ` Runyu Xiao
0 siblings, 1 reply; 5+ messages in thread
From: Jakub Kicinski @ 2026-06-27 21:21 UTC (permalink / raw)
To: Runyu Xiao
Cc: davem, edumazet, pabeni, horms, leitao, sashal, bigeasy, netdev,
linux-kernel, jianhao.xu
On Sat, 27 Jun 2026 18:12:28 +0800 Runyu Xiao wrote:
> CONFIG_PROVE_RCU_LIST reports the poll_napi() traversal when the helper
> is exercised directly from netpoll_poll_dev(). The current source has
> important lifetime defenses around NAPI deletion and netpoll device
> close, so this is not presented as a proven use-after-free. The issue is
> that the RCU-list reader contract is implicit at the helper boundary.
Please provide the stack trace from the report, rather than just saying
that you can trigger it.
--
pw-bot: rfc
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC PATCH net-next] netpoll: hold RCU while walking napi_list
2026-06-27 21:21 ` Jakub Kicinski
@ 2026-06-28 5:04 ` Runyu Xiao
2026-06-29 11:17 ` Breno Leitao
0 siblings, 1 reply; 5+ messages in thread
From: Runyu Xiao @ 2026-06-28 5:04 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, edumazet, pabeni, horms, leitao, sashal, bigeasy, netdev,
linux-kernel, jianhao.xu
Hi,
On Sat, 27 Jun 2026 14:21:05 -0700 Jakub Kicinski wrote:
> Please provide the stack trace from the report, rather than just saying
> that you can trigger it.
Sure, sorry for not including it in the RFC. The warning was from the
reviewed reproducer used for the CONFIG_PROVE_RCU_LIST triage, not from
a production crash. The relevant part of the dmesg is:
WARNING: suspicious RCU usage
6.1.66 #3 Tainted: G O
-----------------------------
/home/ubuntu22/msv_workspace/shared/vuln_msv.c:45 RCU-list traversed in non-reader section!!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
no locks held by insmod/190.
stack backtrace:
CPU: 1 PID: 190 Comm: insmod Tainted: G O 6.1.66 #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x45/0x5d
lockdep_rcu_suspicious.cold+0x2d/0x64
poll_napi.constprop.0+0x43/0x71 [vuln_msv]
netpoll_poll_dev.constprop.0+0x27/0x36 [vuln_msv]
? 0xffffffffc0005000
rcu_list_msv_init+0xe2/0x1000 [vuln_msv]
do_one_initcall+0x56/0x250
do_init_module+0x47/0x1c0
__do_sys_finit_module+0xa6/0x100
do_syscall_64+0x3b/0x90
entry_SYSCALL_64_after_hwframe+0x64/0xce
</TASK>
The reproducer keeps the shape intentionally small: netpoll_poll_dev()
is exercised directly and calls poll_napi(), which walks dev->napi_list
with list_for_each_entry_rcu() outside an explicit RCU read-side section.
It does not model a concurrent NAPI free.
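For illustration only, here is a minimal userspace sketch of what the
check reacts to. It is not kernel code and it is not the reproducer:
every model_* name is hypothetical, and the list is a plain singly
linked list rather than a struct list_head. It assumes the check warns
when the caller is neither inside a read-side section nor vouching for
some other protection, which is how the optional last argument of
list_for_each_entry_rcu() is meant to be used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace model only: none of these names are kernel APIs. */
static int model_rcu_depth;	/* nesting depth of the modelled read-side section */
static int model_warnings;	/* number of "suspicious RCU usage" reports */

static void model_rcu_read_lock(void)
{
	model_rcu_depth++;
}

static void model_rcu_read_unlock(void)
{
	assert(model_rcu_depth > 0);
	model_rcu_depth--;
}

struct model_napi {
	int id;
	struct model_napi *next;
};

/*
 * Walk the list the way a checked iterator would: report once if the
 * caller is neither inside a read-side section nor able to vouch for
 * another form of protection through held_other_protection.
 * Returns the number of entries visited.
 */
static int model_poll_napi(const struct model_napi *head,
			   bool held_other_protection)
{
	int visited = 0;

	if (!held_other_protection && model_rcu_depth == 0) {
		model_warnings++;
		fprintf(stderr, "RCU-list traversed in non-reader section!!\n");
	}
	for (const struct model_napi *n = head; n; n = n->next)
		visited++;
	return visited;
}

/* Runs the three cases and returns how many of them produced a report. */
static int model_run_cases(void)
{
	struct model_napi b = { .id = 2, .next = NULL };
	struct model_napi a = { .id = 1, .next = &b };

	model_warnings = 0;

	/* Case 1: bare call, the shape described above: one report. */
	model_poll_napi(&a, false);

	/* Case 2: explicit read-side section, as in the RFC: no report. */
	model_rcu_read_lock();
	model_poll_napi(&a, false);
	model_rcu_read_unlock();

	/* Case 3: caller vouches for other protection: no report. */
	model_poll_napi(&a, true);

	return model_warnings;
}
```

Calling model_run_cases() returns 1, because only the bare call is
reported. Like the reproducer, the sketch has no concurrent writer, so
it only shows where the check fires; it says nothing about whether an
in-tree caller can reach poll_napi() without protection.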
Thanks.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC PATCH net-next] netpoll: hold RCU while walking napi_list
2026-06-28 5:04 ` Runyu Xiao
@ 2026-06-29 11:17 ` Breno Leitao
2026-06-29 22:58 ` Jakub Kicinski
0 siblings, 1 reply; 5+ messages in thread
From: Breno Leitao @ 2026-06-29 11:17 UTC (permalink / raw)
To: Runyu Xiao
Cc: Jakub Kicinski, davem, edumazet, pabeni, horms, sashal, bigeasy,
netdev, linux-kernel, jianhao.xu
Hello,
On Sun, Jun 28, 2026 at 01:04:17PM +0800, Runyu Xiao wrote:
> Hi,
>
> On Sat, 27 Jun 2026 14:21:05 -0700 Jakub Kicinski wrote:
> > Please provide the stack trace from the report, rather than just saying
> > that you can trigger it.
I am really surprised to see this warning. I've been running this code with
CONFIG_PROVE_RCU_LIST for ages, and I haven't seen anything similar.
> Sure, sorry for not including it in the RFC. The warning was from the
> reviewed reproducer used for the CONFIG_PROVE_RCU_LIST triage, not from
> a production crash. The relevant part of the dmesg is:
Reading it, it does not come from the kernel's netpoll code at
all -- it comes from an out-of-tree module (!?)
> WARNING: suspicious RCU usage
> 6.1.66 #3 Tainted: G O
> -----------------------------
> /home/ubuntu22/msv_workspace/shared/vuln_msv.c:45 RCU-list traversed in non-reader section!!
>
> other info that might help us debug this:
>
> rcu_scheduler_active = 2, debug_locks = 1
> no locks held by insmod/190.
>
> stack backtrace:
> CPU: 1 PID: 190 Comm: insmod Tainted: G O 6.1.66 #3
Have you tested it on a more modern kernel?
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> Call Trace:
> <TASK>
> dump_stack_lvl+0x45/0x5d
> lockdep_rcu_suspicious.cold+0x2d/0x64
> poll_napi.constprop.0+0x43/0x71 [vuln_msv]
> netpoll_poll_dev.constprop.0+0x27/0x36 [vuln_msv]
> ? 0xffffffffc0005000
> rcu_list_msv_init+0xe2/0x1000 [vuln_msv]
What is `vuln_msv` exactly?
Could you reproduce this from an in-kernel path instead -- a real
netpoll/netconsole/bonding caller, with the frames resolving to the kernel
rather than [vuln_msv]?
Meanwhile, NAK until the above is clarified.
--
pw-bot: rejected
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [RFC PATCH net-next] netpoll: hold RCU while walking napi_list
2026-06-29 11:17 ` Breno Leitao
@ 2026-06-29 22:58 ` Jakub Kicinski
0 siblings, 0 replies; 5+ messages in thread
From: Jakub Kicinski @ 2026-06-29 22:58 UTC (permalink / raw)
To: Breno Leitao
Cc: Runyu Xiao, davem, edumazet, pabeni, horms, sashal, bigeasy,
netdev, linux-kernel, jianhao.xu
On Mon, 29 Jun 2026 04:17:10 -0700 Breno Leitao wrote:
> > Sure, sorry for not including it in the RFC. The warning was from the
> > reviewed reproducer used for the CONFIG_PROVE_RCU_LIST triage, not from
> > a production crash. The relevant part of the dmesg is:
>
> Reading it, it does not come from the kernel's netpoll code at
> all -- it comes from an out-of-tree module (!?)
Yes, like Breno says, we need an in-kernel path that can trigger the
issue. Out-of-tree reproducers can be useful to validate the fix, but
they are not useful to prove that a problem actually exists.
We try to avoid defensive programming in the kernel.
^ permalink raw reply [flat|nested] 5+ messages in thread