public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
* xfrm: question regarding NETDEV_UNREGISTER handling
@ 2026-01-17 16:03 Tetsuo Handa
  2026-01-19 11:19 ` [PATCH] xfrm: force flush upon NETDEV_UNREGISTER event Tetsuo Handa
  0 siblings, 1 reply; 11+ messages in thread
From: Tetsuo Handa @ 2026-01-17 16:03 UTC (permalink / raw)
  To: Aviad Yehezkel, Aviv Heller, Boris Pismenny, David S. Miller,
	Florian Westphal, Guy Shapiro, Ilan Tayari, Kristian Evensen,
	Leon Romanovsky, Leon Romanovsky, Raed Salem, Raed Salem,
	Saeed Mahameed, Steffen Klassert, Yossi Kuperman
  Cc: Network Development

syzbot is reporting that "struct xfrm_state" refcount is leaking.

  unregister_netdevice: waiting for netdevsim0 to become free. Usage count = 2
  ref_tracker: netdev@ffff888052f24618 has 1/1 users at
       __netdev_tracker_alloc include/linux/netdevice.h:4400 [inline]
       netdev_tracker_alloc include/linux/netdevice.h:4412 [inline]
       xfrm_dev_state_add+0x3a5/0x1080 net/xfrm/xfrm_device.c:316
       xfrm_state_construct net/xfrm/xfrm_user.c:986 [inline]
       xfrm_add_sa+0x34ff/0x5fa0 net/xfrm/xfrm_user.c:1022
       xfrm_user_rcv_msg+0x58e/0xc00 net/xfrm/xfrm_user.c:3507
       netlink_rcv_skb+0x158/0x420 net/netlink/af_netlink.c:2550
       xfrm_netlink_rcv+0x71/0x90 net/xfrm/xfrm_user.c:3529
       netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
       netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1344
       netlink_sendmsg+0x8c8/0xdd0 net/netlink/af_netlink.c:1894
       sock_sendmsg_nosec net/socket.c:727 [inline]
       __sock_sendmsg net/socket.c:742 [inline]
       ____sys_sendmsg+0xa5d/0xc30 net/socket.c:2592
       ___sys_sendmsg+0x134/0x1d0 net/socket.c:2646
       __sys_sendmsg+0x16d/0x220 net/socket.c:2678
       do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
       do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

Commit d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API") introduced
xfrm_dev_state_add() which grabs a reference to "struct net_device".
That commit called xfrm_dev_state_add() from xfrm_state_construct() and
introduced the NETDEV_UNREGISTER case to xfrm_dev_event(), but
xfrm_dev_unregister() was a no-op (rather than doing necessary actions
for releasing "struct xfrm_state" which will in turn drop the reference
to "struct net_device").

Commit 152afb9b45a8 ("xfrm: Indicate xfrm_state offload errors") added
proper error code propagation, and commit cc01572e2fb0 ("xfrm: Add SA to
hardware at the end of xfrm_state_construct()") moved the location to call
xfrm_dev_state_add(), but these commits did not touch NETDEV_UNREGISTER
handling.

Commit ec30d78c14a8 ("xfrm: add xdst pcpu cache") added
xfrm_policy_cache_flush() call to xfrm_dev_unregister(), but
commit e4db5b61c572 ("xfrm: policy: remove pcpu policy cache") removed
xfrm_policy_cache_flush() call from xfrm_dev_unregister() and also
removed the NETDEV_UNREGISTER case from xfrm_dev_event() because
xfrm_dev_unregister() became no-op.

Commit 03891f820c21 ("xfrm: handle NETDEV_UNREGISTER for xfrm device")
re-introduced the NETDEV_UNREGISTER case to xfrm_dev_event(), but that
commit chose to do the same thing for NETDEV_DOWN case and
NETDEV_UNREGISTER case. Since xfrm_dev_down() is no-op unless
(dev->features & NETIF_F_HW_ESP) != 0, no necessary actions are done for
releasing "struct xfrm_state" (for at least !NETIF_F_HW_ESP case).

Commit 919e43fad516 ("xfrm: add an interface to offload policy") added
xfrm_dev_policy_flush() to xfrm_dev_down(). I don't know whether this commit
is relevant or not.

But I feel that calling xfrm_dev_state_add() and  xfrm_dev_policy_add()
are possible for !NETIF_F_HW_ESP case. If my feeling is correct, we have
never implemented proper NETDEV_UNREGISTER handling for releasing
"struct xfrm_state". What actions are needed for properly releasing
"struct xfrm_state" from NETDEV_UNREGISTER handler?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] xfrm: force flush upon NETDEV_UNREGISTER event
  2026-01-17 16:03 xfrm: question regarding NETDEV_UNREGISTER handling Tetsuo Handa
@ 2026-01-19 11:19 ` Tetsuo Handa
  2026-01-22  8:24   ` Tetsuo Handa
  0 siblings, 1 reply; 11+ messages in thread
From: Tetsuo Handa @ 2026-01-19 11:19 UTC (permalink / raw)
  To: Aviad Yehezkel, Aviv Heller, Boris Pismenny, David S. Miller,
	Florian Westphal, Guy Shapiro, Ilan Tayari, Kristian Evensen,
	Leon Romanovsky, Leon Romanovsky, Raed Salem, Raed Salem,
	Saeed Mahameed, Steffen Klassert, Yossi Kuperman
  Cc: Network Development

syzbot is reporting that "struct xfrm_state" refcount is leaking.

  unregister_netdevice: waiting for netdevsim0 to become free. Usage count = 2
  ref_tracker: netdev@ffff888052f24618 has 1/1 users at
       __netdev_tracker_alloc include/linux/netdevice.h:4400 [inline]
       netdev_tracker_alloc include/linux/netdevice.h:4412 [inline]
       xfrm_dev_state_add+0x3a5/0x1080 net/xfrm/xfrm_device.c:316
       xfrm_state_construct net/xfrm/xfrm_user.c:986 [inline]
       xfrm_add_sa+0x34ff/0x5fa0 net/xfrm/xfrm_user.c:1022
       xfrm_user_rcv_msg+0x58e/0xc00 net/xfrm/xfrm_user.c:3507
       netlink_rcv_skb+0x158/0x420 net/netlink/af_netlink.c:2550
       xfrm_netlink_rcv+0x71/0x90 net/xfrm/xfrm_user.c:3529
       netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
       netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1344
       netlink_sendmsg+0x8c8/0xdd0 net/netlink/af_netlink.c:1894
       sock_sendmsg_nosec net/socket.c:727 [inline]
       __sock_sendmsg net/socket.c:742 [inline]
       ____sys_sendmsg+0xa5d/0xc30 net/socket.c:2592
       ___sys_sendmsg+0x134/0x1d0 net/socket.c:2646
       __sys_sendmsg+0x16d/0x220 net/socket.c:2678
       do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
       do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

Currently, the NETDEV_UNREGISTER case in xfrm_dev_event() is no-op
when (dev->features & NETIF_F_HW_ESP) == 0. Since xfrm_dev_state_add()
and xfrm_dev_policy_add() take a reference to "struct net_device", the
corresponding NETDEV_UNREGISTER handler must release that reference.
Flush dev state and dev policy, without checking whether to flush, when
NETDEV_UNREGISTER event fires.

Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
Fixes: d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API")
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
WARNING: This patch is just an analogy case of net/can/j1939 module.
This patch is completely untested and might not solve this problem, for
reproducer is not available for this problem. I appreciate if someone
can write a test code for this problem.

 drivers/net/bonding/bond_main.c |  2 +-
 include/net/xfrm.h              |  5 ++---
 net/xfrm/xfrm_device.c          | 15 ++++++++++++---
 net/xfrm/xfrm_policy.c          |  4 ++--
 net/xfrm/xfrm_state.c           |  4 ++--
 5 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 3d56339a8a10..bbb6bc4b30cd 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3824,7 +3824,7 @@ static int bond_master_netdev_event(unsigned long event,
 	case NETDEV_UNREGISTER:
 		bond_remove_proc_entry(event_bond);
 #ifdef CONFIG_XFRM_OFFLOAD
-		xfrm_dev_state_flush(dev_net(bond_dev), bond_dev, true);
+		xfrm_dev_state_flush(dev_net(bond_dev), bond_dev, true, false);
 #endif /* CONFIG_XFRM_OFFLOAD */
 		break;
 	case NETDEV_REGISTER:
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 0a14daaa5dd4..b19e7b1fbda2 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -1765,9 +1765,8 @@ struct xfrmk_spdinfo {
 struct xfrm_state *xfrm_find_acq_byseq(struct net *net, u32 mark, u32 seq, u32 pcpu_num);
 int xfrm_state_delete(struct xfrm_state *x);
 int xfrm_state_flush(struct net *net, u8 proto, bool task_valid);
-int xfrm_dev_state_flush(struct net *net, struct net_device *dev, bool task_valid);
-int xfrm_dev_policy_flush(struct net *net, struct net_device *dev,
-			  bool task_valid);
+int xfrm_dev_state_flush(struct net *net, struct net_device *dev, bool task_valid, bool force);
+int xfrm_dev_policy_flush(struct net *net, struct net_device *dev, bool task_valid, bool force);
 void xfrm_sad_getinfo(struct net *net, struct xfrmk_sadinfo *si);
 void xfrm_spd_getinfo(struct net *net, struct xfrmk_spdinfo *si);
 u32 xfrm_replay_seqhi(struct xfrm_state *x, __be32 net_seq);
diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
index 52ae0e034d29..ec094aeb1604 100644
--- a/net/xfrm/xfrm_device.c
+++ b/net/xfrm/xfrm_device.c
@@ -537,13 +537,21 @@ static int xfrm_api_check(struct net_device *dev)
 static int xfrm_dev_down(struct net_device *dev)
 {
 	if (dev->features & NETIF_F_HW_ESP) {
-		xfrm_dev_state_flush(dev_net(dev), dev, true);
-		xfrm_dev_policy_flush(dev_net(dev), dev, true);
+		xfrm_dev_state_flush(dev_net(dev), dev, true, false);
+		xfrm_dev_policy_flush(dev_net(dev), dev, true, false);
 	}
 
 	return NOTIFY_DONE;
 }
 
+static int xfrm_dev_unregister(struct net_device *dev)
+{
+	xfrm_dev_state_flush(dev_net(dev), dev, true, true);
+	xfrm_dev_policy_flush(dev_net(dev), dev, true, true);
+
+	return NOTIFY_DONE;
+}
+
 static int xfrm_dev_event(struct notifier_block *this, unsigned long event, void *ptr)
 {
 	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
@@ -556,8 +564,9 @@ static int xfrm_dev_event(struct notifier_block *this, unsigned long event, void
 		return xfrm_api_check(dev);
 
 	case NETDEV_DOWN:
-	case NETDEV_UNREGISTER:
 		return xfrm_dev_down(dev);
+	case NETDEV_UNREGISTER:
+		return xfrm_dev_unregister(dev);
 	}
 	return NOTIFY_DONE;
 }
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 62486f866975..a451dff25c52 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -1855,14 +1855,14 @@ int xfrm_policy_flush(struct net *net, u8 type, bool task_valid)
 EXPORT_SYMBOL(xfrm_policy_flush);
 
 int xfrm_dev_policy_flush(struct net *net, struct net_device *dev,
-			  bool task_valid)
+			  bool task_valid, bool forced)
 {
 	int dir, err = 0, cnt = 0;
 	struct xfrm_policy *pol;
 
 	spin_lock_bh(&net->xfrm.xfrm_policy_lock);
 
-	err = xfrm_dev_policy_flush_secctx_check(net, dev, task_valid);
+	err = forced ? 0 : xfrm_dev_policy_flush_secctx_check(net, dev, task_valid);
 	if (err)
 		goto out;
 
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index 98b362d51836..29a124291331 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -958,7 +958,7 @@ int xfrm_state_flush(struct net *net, u8 proto, bool task_valid)
 }
 EXPORT_SYMBOL(xfrm_state_flush);
 
-int xfrm_dev_state_flush(struct net *net, struct net_device *dev, bool task_valid)
+int xfrm_dev_state_flush(struct net *net, struct net_device *dev, bool task_valid, bool forced)
 {
 	struct xfrm_state *x;
 	struct hlist_node *tmp;
@@ -966,7 +966,7 @@ int xfrm_dev_state_flush(struct net *net, struct net_device *dev, bool task_vali
 	int i, err = 0, cnt = 0;
 
 	spin_lock_bh(&net->xfrm.xfrm_state_lock);
-	err = xfrm_dev_state_flush_secctx_check(net, dev, task_valid);
+	err = forced ? 0 : xfrm_dev_state_flush_secctx_check(net, dev, task_valid);
 	if (err)
 		goto out;
 
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] xfrm: force flush upon NETDEV_UNREGISTER event
  2026-01-19 11:19 ` [PATCH] xfrm: force flush upon NETDEV_UNREGISTER event Tetsuo Handa
@ 2026-01-22  8:24   ` Tetsuo Handa
  2026-01-22 11:15     ` Steffen Klassert
  0 siblings, 1 reply; 11+ messages in thread
From: Tetsuo Handa @ 2026-01-22  8:24 UTC (permalink / raw)
  To: Aviad Yehezkel, Aviv Heller, Boris Pismenny, David S. Miller,
	Florian Westphal, Guy Shapiro, Ilan Tayari, Kristian Evensen,
	Leon Romanovsky, Leon Romanovsky, Raed Salem, Raed Salem,
	Saeed Mahameed, Steffen Klassert, Yossi Kuperman
  Cc: Network Development, linux-security-module

A debug patch in linux-next-20260121
( https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit?id=fc0f090e41e652d158f946c616cdd82baed3c8f4 )
has demonstrated that calling xfrm_dev_state_flush()/xfrm_dev_policy_flush()
when (dev->features & NETIF_F_HW_ESP) == 0 helps releasing "struct net_device" refcount.

  unregister_netdevice: waiting for netdevsim0 to become free. Usage count = 2
  ref_tracker: netdev@ffff88805d3c0628 has 1/1 users at
       xfrm_dev_state_add+0x6f4/0xc40 net/xfrm/xfrm_device.c:316
       xfrm_state_construct net/xfrm/xfrm_user.c:986 [inline]
       xfrm_add_sa+0x34ca/0x4230 net/xfrm/xfrm_user.c:1022
       xfrm_user_rcv_msg+0x746/0xb20 net/xfrm/xfrm_user.c:3507
       netlink_rcv_skb+0x232/0x4b0 net/netlink/af_netlink.c:2550
       xfrm_netlink_rcv+0x79/0x90 net/xfrm/xfrm_user.c:3529
       netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
       netlink_unicast+0x80f/0x9b0 net/netlink/af_netlink.c:1344
       netlink_sendmsg+0x813/0xb40 net/netlink/af_netlink.c:1894
       sock_sendmsg_nosec+0x18f/0x1d0 net/socket.c:737
       __sock_sendmsg net/socket.c:752 [inline]
       ____sys_sendmsg+0x589/0x8c0 net/socket.c:2610
       ___sys_sendmsg+0x2a5/0x360 net/socket.c:2664
       __sys_sendmsg net/socket.c:2696 [inline]
       __do_sys_sendmsg net/socket.c:2701 [inline]
       __se_sys_sendmsg net/socket.c:2699 [inline]
       __x64_sys_sendmsg+0x1bd/0x2a0 net/socket.c:2699
       do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
       do_syscall_64+0xe2/0xf80 arch/x86/entry/syscall_64.c:94
       entry_SYSCALL_64_after_hwframe+0x77/0x7f
  
  infiniband: balance for netdevsim0@ib_gid_table_entry is 0
  ***** Releasing 1 refcount on 0000000000000000
  ***** Refcount for netdevsim0 changed from 2 to 1

The bond_master_netdev_event(NETDEV_UNREGISTER) case is already calling
xfrm_dev_state_flush() without checking (dev->features & NETIF_F_HW_ESP) != 0.
Therefore, I assume that (dev->features & NETIF_F_HW_ESP) != 0 check in
xfrm_dev_down() is wrong, and I would like to propose

 static int xfrm_dev_down(struct net_device *dev)
 {
- 	if (dev->features & NETIF_F_HW_ESP) {
-		xfrm_dev_state_flush(dev_net(dev), dev, true);
-		xfrm_dev_policy_flush(dev_net(dev), dev, true);
- 	}
+	xfrm_dev_state_flush(dev_net(dev), dev, true);
+	xfrm_dev_policy_flush(dev_net(dev), dev, true);
 
 	return NOTIFY_DONE;
 }

change as a fix for "unregister_netdevice: waiting for netdevsim0 to become free. Usage count = 2"
problem.



But I have a question regarding security_xfrm_state_delete()/security_xfrm_policy_delete().

xfrm_dev_state_flush_secctx_check() calls security_xfrm_state_delete() which can make
xfrm_dev_state_flush() no-op by returning an error value.
xfrm_dev_policy_flush_secctx_check() calls security_xfrm_policy_delete() which can make
xfrm_dev_policy_flush() no-op by returning an error value.

Since xfrm_dev_state_flush()/xfrm_dev_policy_flush() are called by NETDEV_UNREGISTER
event (which is a signal for releasing all resources that prevent "struct net_device"
references from dropping), making xfrm_dev_state_flush()/xfrm_dev_policy_flush() no-op (by
allowing security_xfrm_state_delete()/security_xfrm_policy_delete() to return an error) is
a denial-of-service bug.

Therefore, I wonder what are security_xfrm_state_delete() and security_xfrm_policy_delete()
for. Can I kill xfrm_dev_state_flush_secctx_check() and xfrm_dev_policy_flush_secctx_check() ?


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] xfrm: force flush upon NETDEV_UNREGISTER event
  2026-01-22  8:24   ` Tetsuo Handa
@ 2026-01-22 11:15     ` Steffen Klassert
  2026-01-22 11:28       ` Tetsuo Handa
  2026-01-26 22:41       ` Paul Moore
  0 siblings, 2 replies; 11+ messages in thread
From: Steffen Klassert @ 2026-01-22 11:15 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Aviad Yehezkel, Aviv Heller, Boris Pismenny, David S. Miller,
	Florian Westphal, Guy Shapiro, Ilan Tayari, Kristian Evensen,
	Leon Romanovsky, Leon Romanovsky, Raed Salem, Raed Salem,
	Saeed Mahameed, Yossi Kuperman, Network Development,
	linux-security-module

On Thu, Jan 22, 2026 at 05:24:22PM +0900, Tetsuo Handa wrote:
> A debug patch in linux-next-20260121
> ( https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit?id=fc0f090e41e652d158f946c616cdd82baed3c8f4 )
> has demonstrated that calling xfrm_dev_state_flush()/xfrm_dev_policy_flush()
> when (dev->features & NETIF_F_HW_ESP) == 0 helps releasing "struct net_device" refcount.
> 
>   unregister_netdevice: waiting for netdevsim0 to become free. Usage count = 2
>   ref_tracker: netdev@ffff88805d3c0628 has 1/1 users at
>        xfrm_dev_state_add+0x6f4/0xc40 net/xfrm/xfrm_device.c:316
>        xfrm_state_construct net/xfrm/xfrm_user.c:986 [inline]
>        xfrm_add_sa+0x34ca/0x4230 net/xfrm/xfrm_user.c:1022
>        xfrm_user_rcv_msg+0x746/0xb20 net/xfrm/xfrm_user.c:3507
>        netlink_rcv_skb+0x232/0x4b0 net/netlink/af_netlink.c:2550
>        xfrm_netlink_rcv+0x79/0x90 net/xfrm/xfrm_user.c:3529
>        netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
>        netlink_unicast+0x80f/0x9b0 net/netlink/af_netlink.c:1344
>        netlink_sendmsg+0x813/0xb40 net/netlink/af_netlink.c:1894
>        sock_sendmsg_nosec+0x18f/0x1d0 net/socket.c:737
>        __sock_sendmsg net/socket.c:752 [inline]
>        ____sys_sendmsg+0x589/0x8c0 net/socket.c:2610
>        ___sys_sendmsg+0x2a5/0x360 net/socket.c:2664
>        __sys_sendmsg net/socket.c:2696 [inline]
>        __do_sys_sendmsg net/socket.c:2701 [inline]
>        __se_sys_sendmsg net/socket.c:2699 [inline]
>        __x64_sys_sendmsg+0x1bd/0x2a0 net/socket.c:2699
>        do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>        do_syscall_64+0xe2/0xf80 arch/x86/entry/syscall_64.c:94
>        entry_SYSCALL_64_after_hwframe+0x77/0x7f
>   
>   infiniband: balance for netdevsim0@ib_gid_table_entry is 0
>   ***** Releasing 1 refcount on 0000000000000000
>   ***** Refcount for netdevsim0 changed from 2 to 1
> 
> The bond_master_netdev_event(NETDEV_UNREGISTER) case is already calling
> xfrm_dev_state_flush() without checking (dev->features & NETIF_F_HW_ESP) != 0.
> Therefore, I assume that (dev->features & NETIF_F_HW_ESP) != 0 check in
> xfrm_dev_down() is wrong, and I would like to propose
> 
>  static int xfrm_dev_down(struct net_device *dev)
>  {
> - 	if (dev->features & NETIF_F_HW_ESP) {
> -		xfrm_dev_state_flush(dev_net(dev), dev, true);
> -		xfrm_dev_policy_flush(dev_net(dev), dev, true);
> - 	}
> +	xfrm_dev_state_flush(dev_net(dev), dev, true);
> +	xfrm_dev_policy_flush(dev_net(dev), dev, true);
>  
>  	return NOTIFY_DONE;
>  }

Hm, I'd say we should not try to offload to a device that does
not support NETIF_F_HW_ESP.

> 
> change as a fix for "unregister_netdevice: waiting for netdevsim0 to become free. Usage count = 2"
> problem.
> 
> 
> 
> But I have a question regarding security_xfrm_state_delete()/security_xfrm_policy_delete().
> 
> xfrm_dev_state_flush_secctx_check() calls security_xfrm_state_delete() which can make
> xfrm_dev_state_flush() no-op by returning an error value.
> xfrm_dev_policy_flush_secctx_check() calls security_xfrm_policy_delete() which can make
> xfrm_dev_policy_flush() no-op by returning an error value.
> 
> Since xfrm_dev_state_flush()/xfrm_dev_policy_flush() are called by NETDEV_UNREGISTER
> event (which is a signal for releasing all resources that prevent "struct net_device"
> references from dropping), making xfrm_dev_state_flush()/xfrm_dev_policy_flush() no-op (by
> allowing security_xfrm_state_delete()/security_xfrm_policy_delete() to return an error) is
> a denial-of-service bug.

This means that the calling task doesn't have the permission to delete the
state, some LSM has a policy the does not grant this permission.

> 
> Therefore, I wonder what are security_xfrm_state_delete() and security_xfrm_policy_delete()
> for. Can I kill xfrm_dev_state_flush_secctx_check() and xfrm_dev_policy_flush_secctx_check() ?

This might violate a LSM policy then.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] xfrm: force flush upon NETDEV_UNREGISTER event
  2026-01-22 11:15     ` Steffen Klassert
@ 2026-01-22 11:28       ` Tetsuo Handa
  2026-01-22 11:32         ` Steffen Klassert
  2026-01-26 22:41       ` Paul Moore
  1 sibling, 1 reply; 11+ messages in thread
From: Tetsuo Handa @ 2026-01-22 11:28 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Boris Pismenny, David S. Miller, Florian Westphal,
	Kristian Evensen, Leon Romanovsky, Leon Romanovsky, Raed Salem,
	Raed Salem, Saeed Mahameed, Yossi Kuperman, Network Development,
	linux-security-module, Aviad Yehezkel

On 2026/01/22 20:15, Steffen Klassert wrote:
> Hm, I'd say we should not try to offload to a device that does
> not support NETIF_F_HW_ESP.

I was about to post the patch below, but you are suggesting that "do not allow calling
xfrm_dev_state_add()/xfrm_dev_policy_add() if (dev->features & NETIF_F_HW_ESP) == 0" ?

[PATCH] xfrm: always flush state and policy upon NETDEV_DOWN/NETDEV_UNREGISTER events

syzbot is reporting that "struct xfrm_state" refcount is leaking.

  unregister_netdevice: waiting for netdevsim0 to become free. Usage count = 2
  ref_tracker: netdev@ffff888052f24618 has 1/1 users at
       __netdev_tracker_alloc include/linux/netdevice.h:4400 [inline]
       netdev_tracker_alloc include/linux/netdevice.h:4412 [inline]
       xfrm_dev_state_add+0x3a5/0x1080 net/xfrm/xfrm_device.c:316
       xfrm_state_construct net/xfrm/xfrm_user.c:986 [inline]
       xfrm_add_sa+0x34ff/0x5fa0 net/xfrm/xfrm_user.c:1022
       xfrm_user_rcv_msg+0x58e/0xc00 net/xfrm/xfrm_user.c:3507
       netlink_rcv_skb+0x158/0x420 net/netlink/af_netlink.c:2550
       xfrm_netlink_rcv+0x71/0x90 net/xfrm/xfrm_user.c:3529
       netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
       netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1344
       netlink_sendmsg+0x8c8/0xdd0 net/netlink/af_netlink.c:1894
       sock_sendmsg_nosec net/socket.c:727 [inline]
       __sock_sendmsg net/socket.c:742 [inline]
       ____sys_sendmsg+0xa5d/0xc30 net/socket.c:2592
       ___sys_sendmsg+0x134/0x1d0 net/socket.c:2646
       __sys_sendmsg+0x16d/0x220 net/socket.c:2678
       do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
       do_syscall_64+0xcd/0xf80 arch/x86/entry/syscall_64.c:94
       entry_SYSCALL_64_after_hwframe+0x77/0x7f

Since xfrm_dev_state_add() takes a reference to "struct net_device",
the corresponding NETDEV_UNREGISTER handler must release that reference.

Commit d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API")
introduced xfrm_dev_state_add() which grabs a reference to
"struct net_device". That commit called xfrm_dev_state_add() from
xfrm_state_construct() and introduced the NETDEV_UNREGISTER case to
xfrm_dev_event(). But that commit implemented xfrm_dev_unregister() as
a no-op, and implemented xfrm_dev_down() to call xfrm_dev_state_flush()
only if (dev->features & NETIF_F_HW_ESP) != 0. Maybe that commit expected
that NETDEV_DOWN event is fired before NETDEV_UNREGISTER event fires, and
also assumed that xfrm_dev_state_add() is called only if
(dev->features & NETIF_F_HW_ESP) != 0.

Commit ec30d78c14a8 ("xfrm: add xdst pcpu cache") added
xfrm_policy_cache_flush() call to xfrm_dev_unregister(), but
commit e4db5b61c572 ("xfrm: policy: remove pcpu policy cache") removed
xfrm_policy_cache_flush() call from xfrm_dev_unregister() and also
removed the NETDEV_UNREGISTER case from xfrm_dev_event() because
xfrm_dev_unregister() again became no-op.

Commit 03891f820c21 ("xfrm: handle NETDEV_UNREGISTER for xfrm device")
re-introduced the NETDEV_UNREGISTER case to xfrm_dev_event(), but that
commit for unknown reason chose to share xfrm_dev_down() between
the NETDEV_DOWN case and the NETDEV_UNREGISTER case. But since syzbot is
demonstrating that it is possible to call xfrm_dev_state_add() even if
(dev->features & NETIF_F_HW_ESP) == 0, we need to make sure that
netdev_put() from xfrm_dev_state_free() from xfrm_dev_state_flush() is
called upon NETDEV_UNREGISTER event.

Assuming that it is correct behavior to call netdev_put() upon NETDEV_DOWN
event even if (dev->features & NETIF_F_HW_ESP) == 0, this patch updates
xfrm_dev_down() rather than re-introducing xfrm_dev_unregister().

Reported-by: syzbot+881d65229ca4f9ae8c84@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=881d65229ca4f9ae8c84
Fixes: d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API")
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
Since no reproducer is available for this problem, I can't ask syzbot to
test this change. But I confirmed using linux-next tree that calling
xfrm_dev_state_flush() upon NETDEV_UNREGISTER event even if
(dev->features & NETIF_F_HW_ESP) == 0 solved this problem.

 net/xfrm/xfrm_device.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
index 52ae0e034d29..26e62b6a9db5 100644
--- a/net/xfrm/xfrm_device.c
+++ b/net/xfrm/xfrm_device.c
@@ -536,10 +536,8 @@ static int xfrm_api_check(struct net_device *dev)
 
 static int xfrm_dev_down(struct net_device *dev)
 {
-	if (dev->features & NETIF_F_HW_ESP) {
-		xfrm_dev_state_flush(dev_net(dev), dev, true);
-		xfrm_dev_policy_flush(dev_net(dev), dev, true);
-	}
+	xfrm_dev_state_flush(dev_net(dev), dev, true);
+	xfrm_dev_policy_flush(dev_net(dev), dev, true);
 
 	return NOTIFY_DONE;
 }
-- 
2.47.3


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] xfrm: force flush upon NETDEV_UNREGISTER event
  2026-01-22 11:28       ` Tetsuo Handa
@ 2026-01-22 11:32         ` Steffen Klassert
  2026-01-22 13:07           ` Tetsuo Handa
  0 siblings, 1 reply; 11+ messages in thread
From: Steffen Klassert @ 2026-01-22 11:32 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Boris Pismenny, David S. Miller, Florian Westphal,
	Kristian Evensen, Leon Romanovsky, Leon Romanovsky, Raed Salem,
	Raed Salem, Saeed Mahameed, Yossi Kuperman, Network Development,
	linux-security-module, Aviad Yehezkel

On Thu, Jan 22, 2026 at 08:28:31PM +0900, Tetsuo Handa wrote:
> On 2026/01/22 20:15, Steffen Klassert wrote:
> > Hm, I'd say we should not try to offload to a device that does
> > not support NETIF_F_HW_ESP.
> 
> I was about to post the patch below, but you are suggesting that "do not allow calling
> xfrm_dev_state_add()/xfrm_dev_policy_add() if (dev->features & NETIF_F_HW_ESP) == 0" ?

As said, I think this is the correct way to do it. But let's wait
on opinions from the hardware people.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] xfrm: force flush upon NETDEV_UNREGISTER event
  2026-01-22 11:32         ` Steffen Klassert
@ 2026-01-22 13:07           ` Tetsuo Handa
  2026-01-26 11:07             ` Leon Romanovsky
  2026-01-26 14:16             ` Sabrina Dubroca
  0 siblings, 2 replies; 11+ messages in thread
From: Tetsuo Handa @ 2026-01-22 13:07 UTC (permalink / raw)
  To: Steffen Klassert, linux-security-module
  Cc: Boris Pismenny, David S. Miller, Florian Westphal,
	Kristian Evensen, Leon Romanovsky, Leon Romanovsky, Raed Salem,
	Raed Salem, Saeed Mahameed, Yossi Kuperman, Network Development,
	Aviad Yehezkel, Herbert Xu

On 2026/01/22 20:32, Steffen Klassert wrote:
> On Thu, Jan 22, 2026 at 08:28:31PM +0900, Tetsuo Handa wrote:
>> On 2026/01/22 20:15, Steffen Klassert wrote:
>>> Hm, I'd say we should not try to offload to a device that does
>>> not support NETIF_F_HW_ESP.
>>
>> I was about to post the patch below, but you are suggesting that "do not allow calling
>> xfrm_dev_state_add()/xfrm_dev_policy_add() if (dev->features & NETIF_F_HW_ESP) == 0" ?
> 
> As said, I think this is the correct way to do it. But let's wait
> on opinions from the hardware people.

OK. I guess something like below.

 net/xfrm/xfrm_device.c |   10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
index 52ae0e034d29..19aa61609d24 100644
--- a/net/xfrm/xfrm_device.c
+++ b/net/xfrm/xfrm_device.c
@@ -292,6 +292,13 @@ int xfrm_dev_state_add(struct net *net, struct xfrm_state *x,
 		dst_release(dst);
 	}
 
+	if (!(dev->features & NETIF_F_HW_ESP)) {
+		NL_SET_ERR_MSG(extack, "Device doesn't support offload");
+		xso->dev = NULL;
+		dev_put(dev);
+		return -EINVAL;
+	}
+
 	if (!dev->xfrmdev_ops || !dev->xfrmdev_ops->xdo_dev_state_add) {
 		xso->dev = NULL;
 		dev_put(dev);
@@ -367,7 +374,8 @@ int xfrm_dev_policy_add(struct net *net, struct xfrm_policy *xp,
 	if (!dev)
 		return -EINVAL;
 
-	if (!dev->xfrmdev_ops || !dev->xfrmdev_ops->xdo_dev_policy_add) {
+	if (!dev->xfrmdev_ops || !dev->xfrmdev_ops->xdo_dev_policy_add ||
+	    !(dev->features & NETIF_F_HW_ESP)) {
 		xdo->dev = NULL;
 		dev_put(dev);
 		NL_SET_ERR_MSG(extack, "Policy offload is not supported");



On 2026/01/22 20:15, Steffen Klassert wrote:
>> But I have a question regarding security_xfrm_state_delete()/security_xfrm_policy_delete().
>>
>> xfrm_dev_state_flush_secctx_check() calls security_xfrm_state_delete() which can make
>> xfrm_dev_state_flush() no-op by returning an error value.
>> xfrm_dev_policy_flush_secctx_check() calls security_xfrm_policy_delete() which can make
>> xfrm_dev_policy_flush() no-op by returning an error value.
>>
>> Since xfrm_dev_state_flush()/xfrm_dev_policy_flush() are called by NETDEV_UNREGISTER
>> event (which is a signal for releasing all resources that prevent "struct net_device"
>> references from dropping), making xfrm_dev_state_flush()/xfrm_dev_policy_flush() no-op (by
>> allowing security_xfrm_state_delete()/security_xfrm_policy_delete() to return an error) is
>> a denial-of-service bug.
> 
> This means that the calling task doesn't have the permission to delete the
> state, some LSM has a policy the does not grant this permission.

But NETDEV_UNREGISTER event can fire without explicit request from a user.
Roughly speaking, current behavior is that

  while (security_xfrm_state_delete() != 0) {
    schedule_timeout_uninterruptible(10 * HZ);
    pr_emerg("unregister_netdevice: waiting for %s to become free. Usage count = %d\n",
             dev->name, netdev_refcnt_read(dev));
  }
  while (security_xfrm_policy_delete() != 0) {
    schedule_timeout_uninterruptible(10 * HZ);
    pr_emerg("unregister_netdevice: waiting for %s to become free. Usage count = %d\n",
             dev->name, netdev_refcnt_read(dev));
  }

might be executed upon e.g. termination of a userspace process.

> 
>>
>> Therefore, I wonder what are security_xfrm_state_delete() and security_xfrm_policy_delete()
>> for. Can I kill xfrm_dev_state_flush_secctx_check() and xfrm_dev_policy_flush_secctx_check() ?
> 
> This might violate a LSM policy then.

But LSM policy that results in system hung upon automatic cleanup logic is so stupid.
I want to kill xfrm_dev_state_flush_secctx_check() and xfrm_dev_policy_flush_secctx_check()
in order to eliminate possibility of system hung.


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] xfrm: force flush upon NETDEV_UNREGISTER event
  2026-01-22 13:07           ` Tetsuo Handa
@ 2026-01-26 11:07             ` Leon Romanovsky
  2026-01-26 15:57               ` Tetsuo Handa
  2026-01-26 14:16             ` Sabrina Dubroca
  1 sibling, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2026-01-26 11:07 UTC (permalink / raw)
  To: Tetsuo Handa, Steffen Klassert
  Cc: linux-security-module, Boris Pismenny, David S. Miller,
	Florian Westphal, Kristian Evensen, Raed Salem, Raed Salem,
	Saeed Mahameed, Yossi Kuperman, Network Development,
	Aviad Yehezkel, Herbert Xu

On Thu, Jan 22, 2026 at 10:07:46PM +0900, Tetsuo Handa wrote:
> On 2026/01/22 20:32, Steffen Klassert wrote:
> > On Thu, Jan 22, 2026 at 08:28:31PM +0900, Tetsuo Handa wrote:
> >> On 2026/01/22 20:15, Steffen Klassert wrote:
> >>> Hm, I'd say we should not try to offload to a device that does
> >>> not support NETIF_F_HW_ESP.
> >>
> >> I was about to post the patch below, but you are suggesting that "do not allow calling
> >> xfrm_dev_state_add()/xfrm_dev_policy_add() if (dev->features & NETIF_F_HW_ESP) == 0" ?
> > 
> > As said, I think this is the correct way to do it. But let's wait
> > on opinions from the hardware people.
> 
> OK. I guess something like below.
> 
>  net/xfrm/xfrm_device.c |   10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
> index 52ae0e034d29..19aa61609d24 100644
> --- a/net/xfrm/xfrm_device.c
> +++ b/net/xfrm/xfrm_device.c
> @@ -292,6 +292,13 @@ int xfrm_dev_state_add(struct net *net, struct xfrm_state *x,
>  		dst_release(dst);
>  	}
>  
> +	if (!(dev->features & NETIF_F_HW_ESP)) {
> +		NL_SET_ERR_MSG(extack, "Device doesn't support offload");
> +		xso->dev = NULL;
> +		dev_put(dev);
> +		return -EINVAL;
> +	}

Steffen, Tetsuo

If by "HW people" you mean me, we always set NETIF_F_HW_ESP when adding
the .xfrm_dev_*_add() callbacks.

  1334 void mlx5e_ipsec_build_netdev(struct mlx5e_priv *priv)
  1335 {
  1336         struct mlx5_core_dev *mdev = priv->mdev;
  1337         struct net_device *netdev = priv->netdev;
  1338
  1339         if (!mlx5_ipsec_device_caps(mdev))
  1340                 return;
  1341
  1342         mlx5_core_info(mdev,
  1343                        "mlx5e: IPSec ESP acceleration enabled\n");
  1344
  1345         netdev->xfrmdev_ops = &mlx5e_ipsec_xfrmdev_ops;
  1346         netdev->features |= NETIF_F_HW_ESP;
  1347         netdev->hw_enc_features |= NETIF_F_HW_ESP;

So we are left with two possibilities: either the device registered XFRM
ops without setting NETIF_F_HW_ESP, or netdev->features was modified
without clearing the xfrmdev_ops pointer.

Which device is triggering the syzcaller crash?

Thanks

> +
>  	if (!dev->xfrmdev_ops || !dev->xfrmdev_ops->xdo_dev_state_add) {
>  		xso->dev = NULL;
>  		dev_put(dev);
> @@ -367,7 +374,8 @@ int xfrm_dev_policy_add(struct net *net, struct xfrm_policy *xp,
>  	if (!dev)
>  		return -EINVAL;
>  
> -	if (!dev->xfrmdev_ops || !dev->xfrmdev_ops->xdo_dev_policy_add) {
> +	if (!dev->xfrmdev_ops || !dev->xfrmdev_ops->xdo_dev_policy_add ||
> +	    !(dev->features & NETIF_F_HW_ESP)) {
>  		xdo->dev = NULL;
>  		dev_put(dev);
>  		NL_SET_ERR_MSG(extack, "Policy offload is not supported");
> 
> 
> 
> On 2026/01/22 20:15, Steffen Klassert wrote:
> >> But I have a question regarding security_xfrm_state_delete()/security_xfrm_policy_delete().
> >>
> >> xfrm_dev_state_flush_secctx_check() calls security_xfrm_state_delete() which can make
> >> xfrm_dev_state_flush() no-op by returning an error value.
> >> xfrm_dev_policy_flush_secctx_check() calls security_xfrm_policy_delete() which can make
> >> xfrm_dev_policy_flush() no-op by returning an error value.
> >>
> >> Since xfrm_dev_state_flush()/xfrm_dev_policy_flush() are called by NETDEV_UNREGISTER
> >> event (which is a signal for releasing all resources that prevent "struct net_device"
> >> references from dropping), making xfrm_dev_state_flush()/xfrm_dev_policy_flush() no-op (by
> >> allowing security_xfrm_state_delete()/security_xfrm_policy_delete() to return an error) is
> >> a denial-of-service bug.
> > 
> > This means that the calling task doesn't have the permission to delete the
> > state, some LSM has a policy the does not grant this permission.
> 
> But NETDEV_UNREGISTER event can fire without explicit request from a user.
> Roughly speaking, current behavior is that
> 
>   while (security_xfrm_state_delete() != 0) {
>     schedule_timeout_uninterruptible(10 * HZ);
>     pr_emerg("unregister_netdevice: waiting for %s to become free. Usage count = %d\n",
>              dev->name, netdev_refcnt_read(dev));
>   }
>   while (security_xfrm_policy_delete() != 0) {
>     schedule_timeout_uninterruptible(10 * HZ);
>     pr_emerg("unregister_netdevice: waiting for %s to become free. Usage count = %d\n",
>              dev->name, netdev_refcnt_read(dev));
>   }
> 
> might be executed upon e.g. termination of a userspace process.
> 
> > 
> >>
> >> Therefore, I wonder what are security_xfrm_state_delete() and security_xfrm_policy_delete()
> >> for. Can I kill xfrm_dev_state_flush_secctx_check() and xfrm_dev_policy_flush_secctx_check() ?
> > 
> > This might violate a LSM policy then.
> 
> But LSM policy that results in system hung upon automatic cleanup logic is so stupid.
> I want to kill xfrm_dev_state_flush_secctx_check() and xfrm_dev_policy_flush_secctx_check()
> in order to eliminate possibility of system hung.
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] xfrm: force flush upon NETDEV_UNREGISTER event
  2026-01-22 13:07           ` Tetsuo Handa
  2026-01-26 11:07             ` Leon Romanovsky
@ 2026-01-26 14:16             ` Sabrina Dubroca
  1 sibling, 0 replies; 11+ messages in thread
From: Sabrina Dubroca @ 2026-01-26 14:16 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Steffen Klassert, linux-security-module, Boris Pismenny,
	David S. Miller, Florian Westphal, Kristian Evensen,
	Leon Romanovsky, Leon Romanovsky, Raed Salem, Raed Salem,
	Saeed Mahameed, Yossi Kuperman, Network Development,
	Aviad Yehezkel, Herbert Xu

2026-01-22, 22:07:46 +0900, Tetsuo Handa wrote:
> On 2026/01/22 20:32, Steffen Klassert wrote:
> > On Thu, Jan 22, 2026 at 08:28:31PM +0900, Tetsuo Handa wrote:
> >> On 2026/01/22 20:15, Steffen Klassert wrote:
> >>> Hm, I'd say we should not try to offload to a device that does
> >>> not support NETIF_F_HW_ESP.
> >>
> >> I was about to post the patch below, but you are suggesting that "do not allow calling
> >> xfrm_dev_state_add()/xfrm_dev_policy_add() if (dev->features & NETIF_F_HW_ESP) == 0" ?
> > 
> > As said, I think this is the correct way to do it. But let's wait
> > on opinions from the hardware people.

But the current behavior ("ignore NETIF_F_HW_ESP and call
xdo_dev_state_add for new states anyway") has been established for
multiple years. Changing that now seems a bit risky.

> OK. I guess something like below.
> 
>  net/xfrm/xfrm_device.c |   10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
> index 52ae0e034d29..19aa61609d24 100644
> --- a/net/xfrm/xfrm_device.c
> +++ b/net/xfrm/xfrm_device.c
> @@ -292,6 +292,13 @@ int xfrm_dev_state_add(struct net *net, struct xfrm_state *x,
>  		dst_release(dst);
>  	}
>  
> +	if (!(dev->features & NETIF_F_HW_ESP)) {
> +		NL_SET_ERR_MSG(extack, "Device doesn't support offload");
> +		xso->dev = NULL;
> +		dev_put(dev);
> +		return -EINVAL;
> +	}

I'm not sure we want to make state creation fail in this case...

> +
>  	if (!dev->xfrmdev_ops || !dev->xfrmdev_ops->xdo_dev_state_add) {

while it will succeed (just without offload) in that case.

>  		xso->dev = NULL;
>  		dev_put(dev);
> @@ -367,7 +374,8 @@ int xfrm_dev_policy_add(struct net *net, struct xfrm_policy *xp,
>  	if (!dev)
>  		return -EINVAL;
>  
> -	if (!dev->xfrmdev_ops || !dev->xfrmdev_ops->xdo_dev_policy_add) {
> +	if (!dev->xfrmdev_ops || !dev->xfrmdev_ops->xdo_dev_policy_add ||
> +	    !(dev->features & NETIF_F_HW_ESP)) {
>  		xdo->dev = NULL;
>  		dev_put(dev);
>  		NL_SET_ERR_MSG(extack, "Policy offload is not supported");

-- 
Sabrina

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] xfrm: force flush upon NETDEV_UNREGISTER event
  2026-01-26 11:07             ` Leon Romanovsky
@ 2026-01-26 15:57               ` Tetsuo Handa
  0 siblings, 0 replies; 11+ messages in thread
From: Tetsuo Handa @ 2026-01-26 15:57 UTC (permalink / raw)
  To: Leon Romanovsky, Steffen Klassert, Sabrina Dubroca
  Cc: Boris Pismenny, David S. Miller, Florian Westphal,
	Kristian Evensen, Raed Salem, Raed Salem, Saeed Mahameed,
	Yossi Kuperman, Network Development, Aviad Yehezkel, Herbert Xu

On 2026/01/26 20:07, Leon Romanovsky wrote:
> Steffen, Tetsuo
> 
> If by "HW people" you mean me, we always set NETIF_F_HW_ESP when adding
> the .xfrm_dev_*_add() callbacks.
> 
>   1334 void mlx5e_ipsec_build_netdev(struct mlx5e_priv *priv)
>   1335 {
>   1336         struct mlx5_core_dev *mdev = priv->mdev;
>   1337         struct net_device *netdev = priv->netdev;
>   1338
>   1339         if (!mlx5_ipsec_device_caps(mdev))
>   1340                 return;
>   1341
>   1342         mlx5_core_info(mdev,
>   1343                        "mlx5e: IPSec ESP acceleration enabled\n");
>   1344
>   1345         netdev->xfrmdev_ops = &mlx5e_ipsec_xfrmdev_ops;
>   1346         netdev->features |= NETIF_F_HW_ESP;
>   1347         netdev->hw_enc_features |= NETIF_F_HW_ESP;
> 
> So we are left with two possibilities: either the device registered XFRM
> ops without setting NETIF_F_HW_ESP, or netdev->features was modified
> without clearing the xfrmdev_ops pointer.

I didn't know that netdev->features can be updated via "struct net_device_ops"->
ndo_set_features() callback using "ethtool -K".

Since netdev->features is not "constant after creation", how can
the (dev->features & NETIF_F_HW_ESP) check in xfrm_dev_down() make sense?

Sabrina Dubroca commented that the current behavior ("ignore NETIF_F_HW_ESP and
call xdo_dev_state_add for new states anyway") has been established for multiple
years. Then, I think that we can't count on NETIF_F_HW_ESP bit. I think that
we can't guarantee that dev_put() from xfrm_dev_down() will be called unless
https://lkml.kernel.org/r/447378de-3cc9-44f5-872e-a1fc477f591e@I-love.SAKURA.ne.jp
is applied.

> Which device is triggering the syzcaller crash?

In a off-list mail, Leon Romanovsky commented that netdevsim0 device did not
clear its xfrmdev_ops pointer when NETIF_F_HW_ESP bit was removed.

Indeed, it seems that nsim_set_features() does not update xfrmdev_ops pointer
when toggling esp-hw-offload feature, but my question above remains regardless of
xfrmdev_ops pointer.

[  185.475627] [   T1489] nsim_set_features dev->xfrmdev_ops=000000000f684dab dev->features=41800008001c869 features=41000008001c869
[  237.302143] [   T1510] nsim_set_features dev->xfrmdev_ops=000000000f684dab dev->features=41000008001c869 features=41800008001c869
[  258.573477] [   T1531] nsim_set_features dev->xfrmdev_ops=000000000f684dab dev->features=41800008001c869 features=41c00008001c869
[  263.853173] [   T1571] nsim_set_features dev->xfrmdev_ops=000000000f684dab dev->features=41c00008001c869 features=41800008001c869
[  295.907167] [   T1612] nsim_set_features dev->xfrmdev_ops=000000000f684dab dev->features=41800008001c869 features=41c00008001c869


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] xfrm: force flush upon NETDEV_UNREGISTER event
  2026-01-22 11:15     ` Steffen Klassert
  2026-01-22 11:28       ` Tetsuo Handa
@ 2026-01-26 22:41       ` Paul Moore
  1 sibling, 0 replies; 11+ messages in thread
From: Paul Moore @ 2026-01-26 22:41 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Tetsuo Handa, Aviad Yehezkel, Aviv Heller, Boris Pismenny,
	David S. Miller, Florian Westphal, Guy Shapiro, Ilan Tayari,
	Kristian Evensen, Leon Romanovsky, Leon Romanovsky, Raed Salem,
	Raed Salem, Saeed Mahameed, Yossi Kuperman, Network Development,
	linux-security-module

On Thu, Jan 22, 2026 at 7:00 AM Steffen Klassert
<steffen.klassert@secunet.com> wrote:
> On Thu, Jan 22, 2026 at 05:24:22PM +0900, Tetsuo Handa wrote:

...

> > Therefore, I wonder what are security_xfrm_state_delete() and security_xfrm_policy_delete()
> > for. Can I kill xfrm_dev_state_flush_secctx_check() and xfrm_dev_policy_flush_secctx_check() ?
>
> This might violate a LSM policy then.

Exactly.  SELinux is currently the only LSM that enforces any access
controls on the XFRM/IPsec code, but it does use both of these LSM
hooks to authorize deletion of SPD/SA objects.

-- 
paul-moore.com

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2026-01-26 22:41 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-01-17 16:03 xfrm: question regarding NETDEV_UNREGISTER handling Tetsuo Handa
2026-01-19 11:19 ` [PATCH] xfrm: force flush upon NETDEV_UNREGISTER event Tetsuo Handa
2026-01-22  8:24   ` Tetsuo Handa
2026-01-22 11:15     ` Steffen Klassert
2026-01-22 11:28       ` Tetsuo Handa
2026-01-22 11:32         ` Steffen Klassert
2026-01-22 13:07           ` Tetsuo Handa
2026-01-26 11:07             ` Leon Romanovsky
2026-01-26 15:57               ` Tetsuo Handa
2026-01-26 14:16             ` Sabrina Dubroca
2026-01-26 22:41       ` Paul Moore

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox