* [PATCH net] xfrm: hold skb->dev across async IPv6 transport reinject
@ 2026-03-20 7:30 Qi Tang
2026-03-20 7:44 ` Steffen Klassert
0 siblings, 1 reply; 7+ messages in thread
From: Qi Tang @ 2026-03-20 7:30 UTC (permalink / raw)
To: netdev
Cc: steffen.klassert, herbert, davem, edumazet, kuba, pabeni, horms,
stable
xfrm_trans_queue() queues transport-mode packets for async reinject via
xfrm_trans_reinject(). The queued skb may still be reinjected after the
originating device teardown has started.
Keep the device alive across the async reinject window by taking a netdev
reference when queueing the skb and dropping it after the reinject callback
completes.
Fixes: acf568ee859f ("xfrm: Reinject transport-mode packets through tasklet")
Cc: stable@vger.kernel.org
Signed-off-by: Qi Tang <tpluszz77@gmail.com>
---
net/xfrm/xfrm_input.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 4ed346e682c7..4b5147cb44b7 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -40,6 +40,7 @@ struct xfrm_trans_cb {
} header;
int (*finish)(struct net *net, struct sock *sk, struct sk_buff *skb);
struct net *net;
+ struct net_device *dev;
};
#define XFRM_TRANS_SKB_CB(__skb) ((struct xfrm_trans_cb *)&((__skb)->cb[0]))
@@ -784,9 +785,13 @@ static void xfrm_trans_reinject(struct work_struct *work)
spin_unlock_bh(&trans->queue_lock);
local_bh_disable();
- while ((skb = __skb_dequeue(&queue)))
- XFRM_TRANS_SKB_CB(skb)->finish(XFRM_TRANS_SKB_CB(skb)->net,
- NULL, skb);
+ while ((skb = __skb_dequeue(&queue))) {
+ struct xfrm_trans_cb *cb = XFRM_TRANS_SKB_CB(skb);
+ struct net_device *dev = cb->dev;
+
+ cb->finish(cb->net, NULL, skb);
+ dev_put(dev);
+ }
local_bh_enable();
}
@@ -805,6 +810,8 @@ int xfrm_trans_queue_net(struct net *net, struct sk_buff *skb,
XFRM_TRANS_SKB_CB(skb)->finish = finish;
XFRM_TRANS_SKB_CB(skb)->net = net;
+ XFRM_TRANS_SKB_CB(skb)->dev = skb->dev;
+ dev_hold(XFRM_TRANS_SKB_CB(skb)->dev);
spin_lock_bh(&trans->queue_lock);
__skb_queue_tail(&trans->queue, skb);
spin_unlock_bh(&trans->queue_lock);
--
2.43.0
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH net] xfrm: hold skb->dev across async IPv6 transport reinject
2026-03-20 7:30 [PATCH net] xfrm: hold skb->dev across async IPv6 transport reinject Qi Tang
@ 2026-03-20 7:44 ` Steffen Klassert
2026-03-26 12:41 ` Qi Tang
0 siblings, 1 reply; 7+ messages in thread
From: Steffen Klassert @ 2026-03-20 7:44 UTC (permalink / raw)
To: Qi Tang; +Cc: netdev, herbert, davem, edumazet, kuba, pabeni, horms, stable
On Fri, Mar 20, 2026 at 03:30:23PM +0800, Qi Tang wrote:
> xfrm_trans_queue() queues transport-mode packets for async reinject via
> xfrm_trans_reinject(). The queued skb may still be reinjected after the
> originating device teardown has started.
>
> Keep the device alive across the async reinject window by taking a netdev
> reference when queueing the skb and dropping it after the reinject callback
> completes.
>
> Fixes: acf568ee859f ("xfrm: Reinject transport-mode packets through tasklet")
> Cc: stable@vger.kernel.org
> Signed-off-by: Qi Tang <tpluszz77@gmail.com>
> ---
> net/xfrm/xfrm_input.c | 13 ++++++++++---
> 1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
> index 4ed346e682c7..4b5147cb44b7 100644
> --- a/net/xfrm/xfrm_input.c
> +++ b/net/xfrm/xfrm_input.c
> @@ -40,6 +40,7 @@ struct xfrm_trans_cb {
> } header;
> int (*finish)(struct net *net, struct sock *sk, struct sk_buff *skb);
> struct net *net;
> + struct net_device *dev;
> };
>
> #define XFRM_TRANS_SKB_CB(__skb) ((struct xfrm_trans_cb *)&((__skb)->cb[0]))
> @@ -784,9 +785,13 @@ static void xfrm_trans_reinject(struct work_struct *work)
> spin_unlock_bh(&trans->queue_lock);
>
> local_bh_disable();
> - while ((skb = __skb_dequeue(&queue)))
> - XFRM_TRANS_SKB_CB(skb)->finish(XFRM_TRANS_SKB_CB(skb)->net,
> - NULL, skb);
> + while ((skb = __skb_dequeue(&queue))) {
> + struct xfrm_trans_cb *cb = XFRM_TRANS_SKB_CB(skb);
> + struct net_device *dev = cb->dev;
> +
> + cb->finish(cb->net, NULL, skb);
> + dev_put(dev);
> + }
> local_bh_enable();
> }
>
> @@ -805,6 +810,8 @@ int xfrm_trans_queue_net(struct net *net, struct sk_buff *skb,
>
> XFRM_TRANS_SKB_CB(skb)->finish = finish;
> XFRM_TRANS_SKB_CB(skb)->net = net;
> + XFRM_TRANS_SKB_CB(skb)->dev = skb->dev;
> + dev_hold(XFRM_TRANS_SKB_CB(skb)->dev);
While this looks correct, we take and drop yet another reference
here. This is an expensive operation. We do dev_hold already
in xfrm_input and it seems we just drop it too early.
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH net] xfrm: hold skb->dev across async IPv6 transport reinject
2026-03-20 7:44 ` Steffen Klassert
@ 2026-03-26 12:41 ` Qi Tang
2026-03-31 7:59 ` Steffen Klassert
0 siblings, 1 reply; 7+ messages in thread
From: Qi Tang @ 2026-03-26 12:41 UTC (permalink / raw)
To: steffen.klassert; +Cc: davem, edumazet, herbert, horms, kuba, netdev, pabeni
I reworked the fix direction to avoid taking another netdev reference.
The current idea is to reuse the existing dev_hold() in xfrm_input(),
carry that reference across the async transport reinject path, and drop
it only once the skb is either consumed earlier or the async reinject
callback has completed.
Very roughly, the change would look like this:
- in xfrm_input(), do not drop the existing dev reference immediately
on async resume
- add a dedicated queueing path for skbs that already carry that
input-side reference
- in transport_finish(), switch from NF_HOOK() to nf_hook() so the
ret != 1 path can drop the carried reference when the skb is consumed
before okfn runs
- in xfrm_trans_reinject(), drop the transferred reference after the
callback completes
Something along these lines:
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
@@
if (encap_type == -1) {
async = 1;
- dev_put(skb->dev);
seq = XFRM_SKB_CB(skb)->seq.input.low;
...
}
@@
+int xfrm_trans_queue_in(struct net *net, struct sk_buff *skb, ...)
+{
+ ...
+ XFRM_TRANS_SKB_CB(skb)->dev_ref_held = true;
+ ...
+}
diff --git a/net/ipv6/xfrm6_input.c b/net/ipv6/xfrm6_input.c
@@
- NF_HOOK(..., xfrm6_transport_finish2);
+ ret = nf_hook(..., async ? xfrm6_transport_finish2_async :
+ xfrm6_transport_finish2);
+ if (ret != 1) {
+ if (async)
+ dev_put(dev);
+ return 0;
+ }
+ (async ? xfrm6_transport_finish2_async :
+ xfrm6_transport_finish2)(net, NULL, skb);
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
@@
while ((skb = __skb_dequeue(&queue))) {
+ bool dev_ref_held = cb->dev_ref_held;
+ struct net_device *dev = dev_ref_held ? skb->dev : NULL;
cb->finish(cb->net, NULL, skb);
+ if (dev_ref_held)
+ dev_put(dev);
}
Does this sound like a reasonable direction for a v2?
^ permalink raw reply [flat|nested] 7+ messages in thread* Re: [PATCH net] xfrm: hold skb->dev across async IPv6 transport reinject
2026-03-26 12:41 ` Qi Tang
@ 2026-03-31 7:59 ` Steffen Klassert
0 siblings, 0 replies; 7+ messages in thread
From: Steffen Klassert @ 2026-03-31 7:59 UTC (permalink / raw)
To: Qi Tang; +Cc: davem, edumazet, herbert, horms, kuba, netdev, pabeni
On Thu, Mar 26, 2026 at 08:41:31PM +0800, Qi Tang wrote:
> I reworked the fix direction to avoid taking another netdev reference.
>
> The current idea is to reuse the existing dev_hold() in xfrm_input(),
> carry that reference across the async transport reinject path, and drop
> it only once the skb is either consumed earlier or the async reinject
> callback has completed.
>
> Very roughly, the change would look like this:
>
> - in xfrm_input(), do not drop the existing dev reference immediately
> on async resume
> - add a dedicated queueing path for skbs that already carry that
> input-side reference
> - in transport_finish(), switch from NF_HOOK() to nf_hook() so the
> ret != 1 path can drop the carried reference when the skb is consumed
> before okfn runs
> - in xfrm_trans_reinject(), drop the transferred reference after the
> callback completes
>
> Something along these lines:
>
> diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
> @@
> if (encap_type == -1) {
> async = 1;
> - dev_put(skb->dev);
> seq = XFRM_SKB_CB(skb)->seq.input.low;
> ...
> }
>
> @@
> +int xfrm_trans_queue_in(struct net *net, struct sk_buff *skb, ...)
> +{
> + ...
> + XFRM_TRANS_SKB_CB(skb)->dev_ref_held = true;
> + ...
> +}
>
> diff --git a/net/ipv6/xfrm6_input.c b/net/ipv6/xfrm6_input.c
> @@
> - NF_HOOK(..., xfrm6_transport_finish2);
> + ret = nf_hook(..., async ? xfrm6_transport_finish2_async :
> + xfrm6_transport_finish2);
> + if (ret != 1) {
> + if (async)
> + dev_put(dev);
> + return 0;
> + }
> + (async ? xfrm6_transport_finish2_async :
> + xfrm6_transport_finish2)(net, NULL, skb);
>
> diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
> @@
> while ((skb = __skb_dequeue(&queue))) {
> + bool dev_ref_held = cb->dev_ref_held;
> + struct net_device *dev = dev_ref_held ? skb->dev : NULL;
> cb->finish(cb->net, NULL, skb);
> + if (dev_ref_held)
> + dev_put(dev);
> }
>
> Does this sound like a reasonable direction for a v2?
Please send the actual patch for review. It is hard to
guess from such snippets if that's the correct way to
go.
Thanks!
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net] xfrm: hold skb->dev across async IPv6 transport reinject
@ 2026-03-20 8:32 steffen-ai
2026-03-20 8:37 ` Steffen Klassert
0 siblings, 1 reply; 7+ messages in thread
From: steffen-ai @ 2026-03-20 8:32 UTC (permalink / raw)
To: Qi Tang
Cc: steffen.klassert, netdev, herbert, davem, edumazet, kuba, pabeni,
horms, stable, steffen
Subject: Re: [PATCH net] xfrm: hold skb->dev across async IPv6 transport reinject
In-Reply-To: <20260320073023.21873-1-tpluszz77@gmail.com>
To: Qi Tang <tpluszz77@gmail.com>
Cc: netdev@vger.kernel.org, steffen.klassert@secunet.com, herbert@gondor.apana.org.au, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, horms@kernel.org, stable@vger.kernel.org
The fix looks correct. A few comments:
1. The approach of holding a netdev reference at enqueue time and
releasing it after async reinject properly addresses the UAF window.
The ordering in xfrm_trans_reinject() is key - caching the dev
pointer before calling finish() avoids touching skb->cb after a
callback that may consume/free the skb.
2. One minor note: the net tree generally prefers netdev_hold() /
netdev_put() for reference tracking, but dev_hold() / dev_put()
is functionally equivalent here.
3. The BUILD_BUG_ON() in xfrm_trans_queue_net() will catch any cb
size overflow, so that's covered.
No functional issues observed.
Reviewed-by: Steffen Klassert <steffen.klassert@secunet.com>
Model: openai-codex/gpt-5.3-codex
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH net] xfrm: hold skb->dev across async IPv6 transport reinject
2026-03-20 8:32 steffen-ai
@ 2026-03-20 8:37 ` Steffen Klassert
0 siblings, 0 replies; 7+ messages in thread
From: Steffen Klassert @ 2026-03-20 8:37 UTC (permalink / raw)
To: Qi Tang
Cc: Qi Tang, netdev, herbert, davem, edumazet, kuba, pabeni, horms,
stable, steffen
Please ignore this. It was sent accidentally to the list.
Sorry.
On Fri, Mar 20, 2026 at 08:32:12AM +0000, steffen-ai@klassert.de wrote:
> Subject: Re: [PATCH net] xfrm: hold skb->dev across async IPv6 transport reinject
> In-Reply-To: <20260320073023.21873-1-tpluszz77@gmail.com>
> To: Qi Tang <tpluszz77@gmail.com>
> Cc: netdev@vger.kernel.org, steffen.klassert@secunet.com, herbert@gondor.apana.org.au, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, horms@kernel.org, stable@vger.kernel.org
...
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH net] xfrm: hold skb->dev across async IPv6 transport reinject
@ 2026-03-19 16:56 Qi Tang
0 siblings, 0 replies; 7+ messages in thread
From: Qi Tang @ 2026-03-19 16:56 UTC (permalink / raw)
To: tpluszz77; +Cc: stable
xfrm_trans_queue() queues transport-mode packets for async reinject via
xfrm_trans_reinject(). The queued skb keeps only a bare skb->dev pointer.
If the originating netns is torn down before the workqueue callback runs,
ip6_rcv_finish() can still dereference skb->dev after the device
has already been released by default_device_exit_batch().
Fix this by taking a netdev reference when queueing the skb and dropping it
after the reinject callback completes.
This was reproduced with KASAN under QEMU:
BUG: KASAN: slab-use-after-free in ip6_rcv_finish+0x17c/0x1b0
Workqueue: events xfrm_trans_reinject
Call Trace:
ip6_rcv_finish+0x17c/0x1b0
xfrm_trans_reinject+0x292/0x440
process_one_work+0x63c/0x1100
worker_thread+0x62d/0xef0
kthread+0x368/0x480
ret_from_fork+0x529/0x750
Allocated by task 112:
alloc_netdev_mqs+0x82/0x1180
rtnl_create_link+0xaa4/0xe30
rtnl_newlink+0xa98/0x1f90
Freed by task 12:
device_release+0x9b/0x210
netdev_run_todo+0x497/0xcf0
default_device_exit_batch+0x735/0xab0
cleanup_net+0x3c7/0x860
The issue can be reproduced by repeatedly:
- creating an IPv6 veth pair and network namespace,
- installing IPv6 transport-mode XFRM state and policy,
- sending IPv6 traffic to queue async reinject work, and
- tearing the namespace and device down in parallel.
When unprivileged user namespaces are enabled, this can be triggered by a
non-root user after entering its own user/net namespace with unshare -Urn.
Fixes: acf568ee859f ("xfrm: Reinject transport-mode packets through tasklet")
Cc: stable@vger.kernel.org
Signed-off-by: Qi Tang <tpluszz77@gmail.com>
---
net/xfrm/xfrm_input.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 4ed346e682c7..4b5147cb44b7 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -40,6 +40,7 @@ struct xfrm_trans_cb {
} header;
int (*finish)(struct net *net, struct sock *sk, struct sk_buff *skb);
struct net *net;
+ struct net_device *dev;
};
#define XFRM_TRANS_SKB_CB(__skb) ((struct xfrm_trans_cb *)&((__skb)->cb[0]))
@@ -784,9 +785,13 @@ static void xfrm_trans_reinject(struct work_struct *work)
spin_unlock_bh(&trans->queue_lock);
local_bh_disable();
- while ((skb = __skb_dequeue(&queue)))
- XFRM_TRANS_SKB_CB(skb)->finish(XFRM_TRANS_SKB_CB(skb)->net,
- NULL, skb);
+ while ((skb = __skb_dequeue(&queue))) {
+ struct xfrm_trans_cb *cb = XFRM_TRANS_SKB_CB(skb);
+ struct net_device *dev = cb->dev;
+
+ cb->finish(cb->net, NULL, skb);
+ dev_put(dev);
+ }
local_bh_enable();
}
@@ -805,6 +810,8 @@ int xfrm_trans_queue_net(struct net *net, struct sk_buff *skb,
XFRM_TRANS_SKB_CB(skb)->finish = finish;
XFRM_TRANS_SKB_CB(skb)->net = net;
+ XFRM_TRANS_SKB_CB(skb)->dev = skb->dev;
+ dev_hold(XFRM_TRANS_SKB_CB(skb)->dev);
spin_lock_bh(&trans->queue_lock);
__skb_queue_tail(&trans->queue, skb);
spin_unlock_bh(&trans->queue_lock);
--
2.43.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
end of thread, other threads:[~2026-03-31 7:59 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-20 7:30 [PATCH net] xfrm: hold skb->dev across async IPv6 transport reinject Qi Tang
2026-03-20 7:44 ` Steffen Klassert
2026-03-26 12:41 ` Qi Tang
2026-03-31 7:59 ` Steffen Klassert
-- strict thread matches above, loose matches on Subject: below --
2026-03-20 8:32 steffen-ai
2026-03-20 8:37 ` Steffen Klassert
2026-03-19 16:56 Qi Tang
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.