public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH net] xfrm: always flush state and policy upon NETDEV_DOWN/NETDEV_UNREGISTER events
@ 2026-01-27 15:32 Tetsuo Handa
  2026-01-28 10:24 ` Leon Romanovsky
  0 siblings, 1 reply; 12+ messages in thread
From: Tetsuo Handa @ 2026-01-27 15:32 UTC (permalink / raw)
  To: Steffen Klassert, Herbert Xu, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Simon Horman, Ilan Tayari,
	Guy Shapiro, Yossi Kuperman, Network Development, Sabrina Dubroca,
	Leon Romanovsky

syzbot is reporting that "struct xfrm_state" refcount is leaking.

  unregister_netdevice: waiting for netdevsim0 to become free. Usage count = 2
  ref_tracker: netdev@ffff888052f24618 has 1/1 users at
       __netdev_tracker_alloc include/linux/netdevice.h:4400 [inline]
       netdev_tracker_alloc include/linux/netdevice.h:4412 [inline]
       xfrm_dev_state_add+0x3a5/0x1080 net/xfrm/xfrm_device.c:316
       xfrm_state_construct net/xfrm/xfrm_user.c:986 [inline]
       xfrm_add_sa+0x34ff/0x5fa0 net/xfrm/xfrm_user.c:1022
       xfrm_user_rcv_msg+0x58e/0xc00 net/xfrm/xfrm_user.c:3507
       netlink_rcv_skb+0x158/0x420 net/netlink/af_netlink.c:2550
       xfrm_netlink_rcv+0x71/0x90 net/xfrm/xfrm_user.c:3529
       netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
       netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1344
       netlink_sendmsg+0x8c8/0xdd0 net/netlink/af_netlink.c:1894
       sock_sendmsg_nosec net/socket.c:727 [inline]
       __sock_sendmsg net/socket.c:742 [inline]
       ____sys_sendmsg+0xa5d/0xc30 net/socket.c:2592
       ___sys_sendmsg+0x134/0x1d0 net/socket.c:2646
       __sys_sendmsg+0x16d/0x220 net/socket.c:2678
       do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
       do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

This is because currently xfrm_dev_down() (which is called upon NETDEV_DOWN
event and NETDEV_UNREGISTER event) can release the reference to
"struct net_device" taken by xfrm_dev_state_add() only if
the NETIF_F_HW_ESP bit is set, but the NETIF_F_HW_ESP bit can be cleared
by "ethtool -K $dev esp-hw-offload off" command.
In other words, we cannot guess whether xfrm_dev_state_add() has taken a
reference to "struct net_device" based on whether the NETIF_F_HW_ESP bit
is currently set.

For recording why this patch does not re-introduce xfrm_dev_unregister(),
my guessed history about this module's NETDEV_UNREGISTER handler is shown
below.

Commit d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API")
introduced xfrm_dev_state_add(). That commit called xfrm_dev_state_add()
 from xfrm_state_construct(), and introduced the NETDEV_UNREGISTER case
to xfrm_dev_event(). But that commit implemented xfrm_dev_unregister() as
a no-op, and implemented xfrm_dev_down() to call xfrm_dev_state_flush()
only if (dev->features & NETIF_F_HW_ESP) != 0. I guess that that commit
expected that NETDEV_DOWN event is fired before NETDEV_UNREGISTER event
fires, and also assumed that xfrm_dev_state_add() is called only if
(dev->features & NETIF_F_HW_ESP) != 0.

Commit ec30d78c14a8 ("xfrm: add xdst pcpu cache") added
xfrm_policy_cache_flush() call to xfrm_dev_unregister(), but
commit e4db5b61c572 ("xfrm: policy: remove pcpu policy cache") removed
xfrm_policy_cache_flush() call from xfrm_dev_unregister() and also
removed the NETDEV_UNREGISTER case from xfrm_dev_event() because
xfrm_dev_unregister() again became no-op.

Commit 03891f820c21 ("xfrm: handle NETDEV_UNREGISTER for xfrm device")
re-introduced the NETDEV_UNREGISTER case to xfrm_dev_event(), but that
commit for unknown reason chose to share xfrm_dev_down() between the
NETDEV_DOWN case and the NETDEV_UNREGISTER case. Therefore, I assumed
that doing the same behavior for both cases is desirable. If something
is wrong with this choice, please re-introduce xfrm_dev_unregister().

Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
Fixes: d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API")
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
Previous discussion is https://lkml.kernel.org/r/924f9cf5-599a-48f0-b1e3-94cd971965b0@I-love.SAKURA.ne.jp

 net/xfrm/xfrm_device.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
index 52ae0e034d29..26e62b6a9db5 100644
--- a/net/xfrm/xfrm_device.c
+++ b/net/xfrm/xfrm_device.c
@@ -536,10 +536,8 @@ static int xfrm_api_check(struct net_device *dev)
 
 static int xfrm_dev_down(struct net_device *dev)
 {
-	if (dev->features & NETIF_F_HW_ESP) {
-		xfrm_dev_state_flush(dev_net(dev), dev, true);
-		xfrm_dev_policy_flush(dev_net(dev), dev, true);
-	}
+	xfrm_dev_state_flush(dev_net(dev), dev, true);
+	xfrm_dev_policy_flush(dev_net(dev), dev, true);
 
 	return NOTIFY_DONE;
 }
-- 
2.47.3

^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2026-02-01 14:17 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-27 15:32 [PATCH net] xfrm: always flush state and policy upon NETDEV_DOWN/NETDEV_UNREGISTER events Tetsuo Handa
2026-01-28 10:24 ` Leon Romanovsky
2026-01-28 10:44   ` Tetsuo Handa
2026-01-28 12:35     ` Leon Romanovsky
2026-01-29  8:06       ` Tetsuo Handa
2026-01-29  9:09         ` Leon Romanovsky
2026-01-29 10:16           ` Tetsuo Handa
2026-01-29 10:32             ` Tetsuo Handa
2026-01-29 16:05             ` Sabrina Dubroca
2026-02-01 13:12               ` Leon Romanovsky
2026-02-01 14:17                 ` Tetsuo Handa
2026-01-29 15:59         ` Sabrina Dubroca

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox