* [PATCH net] net: drop secpath extension before skb deferral free
@ 2024-05-13 10:02 Jianbo Liu
From: Jianbo Liu @ 2024-05-13 10:02 UTC
  To: netdev, edumazet; +Cc: Jianbo Liu, Leon Romanovsky

Since commit 68822bdf76f1 ("net: generalize skb freeing deferral to
per-cpu lists"), an skb can be queued on a remote CPU's list for
deferred freeing.

The remote CPU is kicked when the queue reaches half capacity. As
mentioned in that commit, triggering NET_RX_SOFTIRQ on the remote CPU
this way seems very unlikely. That does not appear to hold in
practice, and we saw evidence of it: an skb is not freed immediately
and can even be kept for a long time. This becomes more likely as the
number of CPU cores grows.
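
As a rough illustration of the kick condition described above, here is
a minimal userspace sketch. It is not kernel code: struct defer_queue,
defer_enqueue() and kicks_after() are hypothetical names, and the limit
of 64 used in the note below is only an assumed value. The sketch
mirrors one thing, the comparison against defer_max >> 1 made before
the count is incremented.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of one CPU's deferral queue. */
struct defer_queue {
	unsigned int defer_count; /* skbs currently waiting to be freed */
	unsigned int kicks;       /* kicks sent to the remote CPU so far */
};

/*
 * Queue one skb. Returns false when the queue is full, in which case
 * the caller would free the skb itself. A kick is sent only when the
 * count seen before this enqueue equals exactly half of defer_max.
 */
static bool defer_enqueue(struct defer_queue *q, unsigned int defer_max)
{
	if (q->defer_count >= defer_max)
		return false;
	if (q->defer_count == (defer_max >> 1))
		q->kicks++;
	q->defer_count++;
	assert(q->defer_count <= defer_max);
	return true;
}

/* Queue n skbs on a fresh queue and report how many kicks were sent. */
static unsigned int kicks_after(unsigned int n, unsigned int defer_max)
{
	struct defer_queue q = { 0, 0 };

	for (unsigned int i = 0; i < n; i++)
		defer_enqueue(&q, defer_max);
	return q.kicks;
}
```

In this model, with an assumed limit of 64, kicks_after(32, 64) is 0
and kicks_after(33, 64) is 1, so up to 32 skbs can sit on a queue
without any kick being sent.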

Because the skb is not freed, its extension is not freed either. An
error occurred when unloading the driver after running TCP traffic
over IPsec, with both crypto offload and packet offload. With crypto
offload, however, the failure was rare and significantly harder to
reproduce.

 unregister_netdevice: waiting for eth2 to become free. Usage count = 2
 ref_tracker: eth%d@000000007421424b has 1/1 users at
      xfrm_dev_state_add+0xe5/0x4d0
      xfrm_add_sa+0xc5c/0x11e0
      xfrm_user_rcv_msg+0xfa/0x240
      netlink_rcv_skb+0x54/0x100
      xfrm_netlink_rcv+0x31/0x40
      netlink_unicast+0x1fc/0x2c0
      netlink_sendmsg+0x232/0x4a0
      __sock_sendmsg+0x38/0x60
      ____sys_sendmsg+0x1e3/0x200
      ___sys_sendmsg+0x80/0xc0
      __sys_sendmsg+0x51/0x90
      do_syscall_64+0x40/0xe0
      entry_SYSCALL_64_after_hwframe+0x46/0x4e

The ref_tracker shows that the netdev is held when the offloaded xfrm
state is first added to hardware. When a packet is received, the
secpath extension, which stores the xfrm state, is added to the skb by
IPsec offload, so the xfrm state is held by the received skb. The
state can't be flushed until the skb is dequeued from the defer list
and the skb and its extension are actually freed. The netdev, in turn,
can't be unregistered because it is still referenced by the xfrm state.

To fix this issue, drop the secpath extension before the skb is queued
to the defer list, so that xfrm state destruction is not blocked.
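
The reference chain can be sketched in userspace as follows. This is a
simplified model under stated assumptions, not the kernel
implementation: all struct and function names are hypothetical, and
reference counting is reduced to plain integers. It only shows why
dropping the secpath before deferral lets the state, and then the
netdev, be released while the skb is still pending.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel objects. */
struct netdev_model { int refcnt; };
struct xfrm_state_model { int refcnt; struct netdev_model *dev; };
struct skb_model { struct xfrm_state_model *secpath; };

/* Drop one state reference; the last one also releases the netdev. */
static void state_put(struct xfrm_state_model *x)
{
	if (--x->refcnt == 0)
		x->dev->refcnt--;
}

/* Receive path: the secpath extension takes a reference on the state. */
static void skb_attach_secpath(struct skb_model *skb,
			       struct xfrm_state_model *x)
{
	x->refcnt++;
	skb->secpath = x;
}

/* Models dropping the secpath extension from the skb. */
static void skb_drop_secpath(struct skb_model *skb)
{
	if (skb->secpath) {
		state_put(skb->secpath);
		skb->secpath = NULL;
	}
}

/*
 * Returns the netdev reference count after the state is deleted while
 * the skb is still sitting, unfreed, on a remote defer list.
 */
static int dev_refs_after_sa_delete(bool drop_before_defer)
{
	struct netdev_model dev = { .refcnt = 1 };  /* registration ref */
	struct xfrm_state_model x = { .refcnt = 1, .dev = &dev };
	struct skb_model skb = { .secpath = NULL };

	dev.refcnt++;                   /* state added to hardware */
	skb_attach_secpath(&skb, &x);   /* offloaded packet received */
	if (drop_before_defer)
		skb_drop_secpath(&skb); /* the proposed fix */
	/* The skb is now queued for deferred free and not yet freed. */
	state_put(&x);                  /* state deleted or flushed */
	return dev.refcnt;
}
```

In this model dev_refs_after_sa_delete(false) returns 2, because the
pending skb still pins the state, while dev_refs_after_sa_delete(true)
returns 1, leaving only the registration reference.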

Fixes: 68822bdf76f1 ("net: generalize skb freeing deferral to per-cpu lists")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
---
 net/core/skbuff.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index b99127712e67..d7f5024f3c08 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -7025,6 +7025,10 @@ nodefer:	__kfree_skb(skb);
 	if (READ_ONCE(sd->defer_count) >= defer_max)
 		goto nodefer;
 
+#ifdef CONFIG_XFRM
+	skb_ext_del(skb, SKB_EXT_SEC_PATH);
+#endif
+
 	spin_lock_bh(&sd->defer_lock);
 	/* Send an IPI every time queue reaches half capacity. */
 	kick = sd->defer_count == (defer_max >> 1);
-- 
2.38.1



Thread overview: 19+ messages
2024-05-13 10:02 [PATCH net] net: drop secpath extension before skb deferral free Jianbo Liu
2024-05-13 10:29 ` Eric Dumazet
2024-05-14  7:37   ` Jianbo Liu
2024-05-14  8:51     ` Eric Dumazet
2024-05-15  3:10       ` Jianbo Liu
2024-05-20 10:06       ` Jianbo Liu
2024-05-21 10:15         ` Steffen Klassert
2024-05-22  9:34         ` Steffen Klassert
2024-05-22 11:06           ` Eric Dumazet
2024-05-23  2:22           ` Jianbo Liu
2024-05-23  6:44             ` Steffen Klassert
2024-05-23  6:57               ` Jianbo Liu
2024-05-23 10:00                 ` Steffen Klassert
2024-05-23 15:26                   ` Jianbo Liu
2024-05-27  7:40                     ` Steffen Klassert
2024-05-28  8:44                       ` Steffen Klassert
2024-05-28  9:02                         ` Jianbo Liu
2024-05-28  9:26                           ` Steffen Klassert
2024-05-26 10:57               ` Leon Romanovsky
