* [PATCH net] l2tp: use refcount_inc_not_zero in l2tp_session_get_by_ifname
@ 2026-05-23 2:34 Michael Bommarito
2026-05-26 6:44 ` James Chapman
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Michael Bommarito @ 2026-05-23 2:34 UTC (permalink / raw)
To: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
Cc: James Chapman, Tom Parkin, Guillaume Nault, Simon Horman,
Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt,
Kees Cook, netdev, linux-rt-devel, linux-kernel
A reader in l2tp_session_get_by_ifname() can return a pointer to a
session whose refcount has reached zero. The getter takes its
reference with plain refcount_inc(), but every other session getter
in the same file (l2tp_v2_session_get, l2tp_v3_session_get, and the
corresponding _get_next variants) uses refcount_inc_not_zero()
because the IDR/RCU lookup can race with refcount_dec_and_test() ->
l2tp_session_free() -> kfree_rcu(). The ifname getter is the only
outlier; the inconsistency was raised on-list after 979c017803c4
("l2tp: use list_del_rcu in l2tp_session_unhash").
A reader inside rcu_read_lock_bh() that matches session->ifname can
be preempted between the strcmp() and the refcount_inc(). If the
last reference drops on another CPU in that window, the reader's
refcount_inc() runs on a counter that has reached zero. refcount_t
catches the addition-on-zero, prints "refcount_t: addition on 0;
use-after-free", saturates the counter, and returns the saturated
pointer to the caller. Session memory is held live by the in-flight
RCU read section, but the kfree_rcu() callback queued from
l2tp_session_free() will free it once the grace period closes; a
caller that dereferences the returned session past that point hits
a slab-use-after-free. On PREEMPT_RT local_bh_disable() is a per-CPU
sleeping lock and the preemption window is real; on stock PREEMPT
kernels local_bh_disable() is a preempt_count increment that closes
the cross-CPU race in practice (see below).
Use refcount_inc_not_zero() and continue the list walk on failure,
matching the other session getters in the file. The ifname getter
is the only session getter in net/l2tp/ that still uses the bare
refcount_inc() pattern; this change restores file-internal
consistency. The success path is unchanged.
Fixes: abe7a1a7d0b6 ("l2tp: improve tunnel/session refcount helpers")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
---
Distro reachability:
l2tp_core and l2tp_netlink autoload via the genl family alias
net-pf-16-proto-16-family-l2tp on first AF_NETLINK CTRL_CMD_GETFAMILY
lookup; pppol2tp autoloads on demand via net-l2tp-type-7 from
L2TP_CMD_SESSION_CREATE with pw_type=L2TP_PWTYPE_PPP. Neither
autoload requires CAP_SYS_MODULE. SESSION_GET / SESSION_CREATE /
SESSION_DELETE are GENL_UNS_ADMIN_PERM, so CAP_NET_ADMIN in the
netns user namespace suffices and is reachable from `unshare -Urn`
on Debian 11/12, Ubuntu 22.04 LTS / 23.10+ / 24.04+, Arch, Alpine;
RHEL/Fedora blacklist l2tp_netlink by default but on hosts running
NetworkManager-l2tp, xl2tpd, or any L2TPv3 endpoint the gate is open
there too. PREEMPT_RT ships on Ubuntu Pro Realtime, RHEL 9 RT, SUSE
RT, and Yocto-RT.
Reproduced on x86_64 QEMU/KVM. Under tracefs-kprobe widening at the
refcount-inc instruction, a PREEMPT_RT build hits "refcount_t:
addition on 0; use-after-free" through l2tp_session_get_by_ifname at
533 s; patched runs 1800 s clean. A stock PREEMPT (non-RT) build
with the same widening ran 1800 s without firing: local_bh_disable()
on non-RT prevents preemption inside the RCU section so the
cross-CPU window does not open at observable rates.
net/l2tp/l2tp_core.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 1455f67e01ddb..0000000000000 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -441,12 +441,13 @@ struct l2tp_session *l2tp_session_get_by_ifname(const struct net *net,
idr_for_each_entry_ul(&pn->l2tp_tunnel_idr, tunnel, tmp, tunnel_id) {
if (tunnel) {
list_for_each_entry_rcu(session, &tunnel->session_list, list) {
- if (!strcmp(session->ifname, ifname)) {
- refcount_inc(&session->ref_count);
- rcu_read_unlock_bh();
+ if (strcmp(session->ifname, ifname))
+ continue;
+ if (!refcount_inc_not_zero(&session->ref_count))
+ continue;
+ rcu_read_unlock_bh();
- return session;
- }
+ return session;
}
}
}
--
2.46.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH net] l2tp: use refcount_inc_not_zero in l2tp_session_get_by_ifname
2026-05-23 2:34 [PATCH net] l2tp: use refcount_inc_not_zero in l2tp_session_get_by_ifname Michael Bommarito
@ 2026-05-26 6:44 ` James Chapman
2026-05-26 16:52 ` Simon Horman
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: James Chapman @ 2026-05-26 6:44 UTC (permalink / raw)
To: Michael Bommarito, David S . Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: Tom Parkin, Guillaume Nault, Simon Horman,
Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt,
Kees Cook, netdev, linux-rt-devel, linux-kernel
On 23/05/2026 03:34, Michael Bommarito wrote:
> A reader in l2tp_session_get_by_ifname() can return a pointer to a
> session whose refcount has reached zero. The getter takes its
> reference with plain refcount_inc(), but every other session getter
> in the same file (l2tp_v2_session_get, l2tp_v3_session_get, and the
> corresponding _get_next variants) uses refcount_inc_not_zero()
> because the IDR/RCU lookup can race with refcount_dec_and_test() ->
> l2tp_session_free() -> kfree_rcu(). The ifname getter is the only
> outlier; the inconsistency was raised on-list after 979c017803c4
> ("l2tp: use list_del_rcu in l2tp_session_unhash").
>
> A reader inside rcu_read_lock_bh() that matches session->ifname can
> be preempted between the strcmp() and the refcount_inc(). If the
> last reference drops on another CPU in that window, the reader's
> refcount_inc() runs on a counter that has reached zero. refcount_t
> catches the addition-on-zero, prints "refcount_t: addition on 0;
> use-after-free", saturates the counter, and returns the saturated
> pointer to the caller. Session memory is held live by the in-flight
> RCU read section, but the kfree_rcu() callback queued from
> l2tp_session_free() will free it once the grace period closes; a
> caller that dereferences the returned session past that point hits
> a slab-use-after-free. On PREEMPT_RT local_bh_disable() is a per-CPU
> sleeping lock and the preemption window is real; on stock PREEMPT
> kernels local_bh_disable() is a preempt_count increment that closes
> the cross-CPU race in practice (see below).
>
> Use refcount_inc_not_zero() and continue the list walk on failure,
> matching the other session getters in the file. The ifname getter
> is the only session getter in net/l2tp/ that still uses the bare
> refcount_inc() pattern; this change restores file-internal
> consistency. The success path is unchanged.
>
> Fixes: abe7a1a7d0b6 ("l2tp: improve tunnel/session refcount helpers")
> Cc: stable@vger.kernel.org
> Assisted-by: Claude:claude-opus-4-7
> Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
> ---
> Distro reachability:
>
> l2tp_core and l2tp_netlink autoload via the genl family alias
> net-pf-16-proto-16-family-l2tp on first AF_NETLINK CTRL_CMD_GETFAMILY
> lookup; pppol2tp autoloads on demand via net-l2tp-type-7 from
> L2TP_CMD_SESSION_CREATE with pw_type=L2TP_PWTYPE_PPP. Neither
> autoload requires CAP_SYS_MODULE. SESSION_GET / SESSION_CREATE /
> SESSION_DELETE are GENL_UNS_ADMIN_PERM, so CAP_NET_ADMIN in the
> netns user namespace suffices and is reachable from `unshare -Urn`
> on Debian 11/12, Ubuntu 22.04 LTS / 23.10+ / 24.04+, Arch, Alpine;
> RHEL/Fedora blacklist l2tp_netlink by default but on hosts running
> NetworkManager-l2tp, xl2tpd, or any L2TPv3 endpoint the gate is open
> there too. PREEMPT_RT ships on Ubuntu Pro Realtime, RHEL 9 RT, SUSE
> RT, and Yocto-RT.
>
> Reproduced on x86_64 QEMU/KVM. Under tracefs-kprobe widening at the
> refcount-inc instruction, a PREEMPT_RT build hits "refcount_t:
> addition on 0; use-after-free" through l2tp_session_get_by_ifname at
> 533 s; patched runs 1800 s clean. A stock PREEMPT (non-RT) build
> with the same widening ran 1800 s without firing: local_bh_disable()
> on non-RT prevents preemption inside the RCU section so the
> cross-CPU window does not open at observable rates.
>
> net/l2tp/l2tp_core.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
> index 1455f67e01ddb..0000000000000 100644
> --- a/net/l2tp/l2tp_core.c
> +++ b/net/l2tp/l2tp_core.c
> @@ -441,12 +441,13 @@ struct l2tp_session *l2tp_session_get_by_ifname(const struct net *net,
> idr_for_each_entry_ul(&pn->l2tp_tunnel_idr, tunnel, tmp, tunnel_id) {
> if (tunnel) {
> list_for_each_entry_rcu(session, &tunnel->session_list, list) {
> - if (!strcmp(session->ifname, ifname)) {
> - refcount_inc(&session->ref_count);
> - rcu_read_unlock_bh();
> + if (strcmp(session->ifname, ifname))
> + continue;
> + if (!refcount_inc_not_zero(&session->ref_count))
> + continue;
> + rcu_read_unlock_bh();
>
> - return session;
> - }
> + return session;
> }
> }
> }
> --
> 2.46.0
>
Reviewed-by: James Chapman <jchapman@katalix.com>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH net] l2tp: use refcount_inc_not_zero in l2tp_session_get_by_ifname
2026-05-23 2:34 [PATCH net] l2tp: use refcount_inc_not_zero in l2tp_session_get_by_ifname Michael Bommarito
2026-05-26 6:44 ` James Chapman
@ 2026-05-26 16:52 ` Simon Horman
2026-05-27 1:00 ` patchwork-bot+netdevbpf
2026-05-27 7:16 ` Sebastian Andrzej Siewior
3 siblings, 0 replies; 5+ messages in thread
From: Simon Horman @ 2026-05-26 16:52 UTC (permalink / raw)
To: Michael Bommarito
Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
James Chapman, Tom Parkin, Guillaume Nault,
Sebastian Andrzej Siewior, Clark Williams, Steven Rostedt,
Kees Cook, netdev, linux-rt-devel, linux-kernel
On Fri, May 22, 2026 at 10:34:23PM -0400, Michael Bommarito wrote:
> A reader in l2tp_session_get_by_ifname() can return a pointer to a
> session whose refcount has reached zero. The getter takes its
> reference with plain refcount_inc(), but every other session getter
> in the same file (l2tp_v2_session_get, l2tp_v3_session_get, and the
> corresponding _get_next variants) uses refcount_inc_not_zero()
> because the IDR/RCU lookup can race with refcount_dec_and_test() ->
> l2tp_session_free() -> kfree_rcu(). The ifname getter is the only
> outlier; the inconsistency was raised on-list after 979c017803c4
> ("l2tp: use list_del_rcu in l2tp_session_unhash").
>
> A reader inside rcu_read_lock_bh() that matches session->ifname can
> be preempted between the strcmp() and the refcount_inc(). If the
> last reference drops on another CPU in that window, the reader's
> refcount_inc() runs on a counter that has reached zero. refcount_t
> catches the addition-on-zero, prints "refcount_t: addition on 0;
> use-after-free", saturates the counter, and returns the saturated
> pointer to the caller. Session memory is held live by the in-flight
> RCU read section, but the kfree_rcu() callback queued from
> l2tp_session_free() will free it once the grace period closes; a
> caller that dereferences the returned session past that point hits
> a slab-use-after-free. On PREEMPT_RT local_bh_disable() is a per-CPU
> sleeping lock and the preemption window is real; on stock PREEMPT
> kernels local_bh_disable() is a preempt_count increment that closes
> the cross-CPU race in practice (see below).
>
> Use refcount_inc_not_zero() and continue the list walk on failure,
> matching the other session getters in the file. The ifname getter
> is the only session getter in net/l2tp/ that still uses the bare
> refcount_inc() pattern; this change restores file-internal
> consistency. The success path is unchanged.
>
> Fixes: abe7a1a7d0b6 ("l2tp: improve tunnel/session refcount helpers")
> Cc: stable@vger.kernel.org
> Assisted-by: Claude:claude-opus-4-7
> Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH net] l2tp: use refcount_inc_not_zero in l2tp_session_get_by_ifname
2026-05-23 2:34 [PATCH net] l2tp: use refcount_inc_not_zero in l2tp_session_get_by_ifname Michael Bommarito
2026-05-26 6:44 ` James Chapman
2026-05-26 16:52 ` Simon Horman
@ 2026-05-27 1:00 ` patchwork-bot+netdevbpf
2026-05-27 7:16 ` Sebastian Andrzej Siewior
3 siblings, 0 replies; 5+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-05-27 1:00 UTC (permalink / raw)
To: Michael Bommarito
Cc: davem, edumazet, kuba, pabeni, jchapman, tparkin, gnault, horms,
bigeasy, clrkwllms, rostedt, kees, netdev, linux-rt-devel,
linux-kernel
Hello:
This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Fri, 22 May 2026 22:34:23 -0400 you wrote:
> A reader in l2tp_session_get_by_ifname() can return a pointer to a
> session whose refcount has reached zero. The getter takes its
> reference with plain refcount_inc(), but every other session getter
> in the same file (l2tp_v2_session_get, l2tp_v3_session_get, and the
> corresponding _get_next variants) uses refcount_inc_not_zero()
> because the IDR/RCU lookup can race with refcount_dec_and_test() ->
> l2tp_session_free() -> kfree_rcu(). The ifname getter is the only
> outlier; the inconsistency was raised on-list after 979c017803c4
> ("l2tp: use list_del_rcu in l2tp_session_unhash").
>
> [...]
Here is the summary with links:
- [net] l2tp: use refcount_inc_not_zero in l2tp_session_get_by_ifname
https://git.kernel.org/netdev/net/c/05f95729ca84
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH net] l2tp: use refcount_inc_not_zero in l2tp_session_get_by_ifname
2026-05-23 2:34 [PATCH net] l2tp: use refcount_inc_not_zero in l2tp_session_get_by_ifname Michael Bommarito
` (2 preceding siblings ...)
2026-05-27 1:00 ` patchwork-bot+netdevbpf
@ 2026-05-27 7:16 ` Sebastian Andrzej Siewior
3 siblings, 0 replies; 5+ messages in thread
From: Sebastian Andrzej Siewior @ 2026-05-27 7:16 UTC (permalink / raw)
To: Michael Bommarito
Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
James Chapman, Tom Parkin, Guillaume Nault, Simon Horman,
Clark Williams, Steven Rostedt, Kees Cook, netdev, linux-rt-devel,
linux-kernel
sorry for being late…
On 2026-05-22 22:34:23 [-0400], Michael Bommarito wrote:
…
> a slab-use-after-free. On PREEMPT_RT local_bh_disable() is a per-CPU
> sleeping lock and the preemption window is real;
No, it is not but there is a preemption window, yes.
> on stock PREEMPT
> kernels local_bh_disable() is a preempt_count increment that closes
> the cross-CPU race in practice (see below).
It might be that the window is not wide open. I don't see why it should
not trigger on SMP.
> Use refcount_inc_not_zero() and continue the list walk on failure,
> matching the other session getters in the file. The ifname getter
> is the only session getter in net/l2tp/ that still uses the bare
> refcount_inc() pattern; this change restores file-internal
> consistency. The success path is unchanged.
This is the right change.
> Fixes: abe7a1a7d0b6 ("l2tp: improve tunnel/session refcount helpers")
This simply removes the wrapper but the logic is the same. Wouldn't
commit 2777e2ab5a9cf ("l2tp: take a reference on sessions used in
genetlink handlers") be where it was introduced?
> Cc: stable@vger.kernel.org
> Assisted-by: Claude:claude-opus-4-7
> Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com>
…
Sebastian
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-05-27 7:16 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-23 2:34 [PATCH net] l2tp: use refcount_inc_not_zero in l2tp_session_get_by_ifname Michael Bommarito
2026-05-26 6:44 ` James Chapman
2026-05-26 16:52 ` Simon Horman
2026-05-27 1:00 ` patchwork-bot+netdevbpf
2026-05-27 7:16 ` Sebastian Andrzej Siewior
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox