* [PATCH 6.12.y v2] xfrm: hold dev ref until after transport_finish NF_HOOK
@ 2026-06-11 12:11 Simon Liebold
2026-06-11 15:26 ` Sasha Levin
0 siblings, 1 reply; 3+ messages in thread
From: Simon Liebold @ 2026-06-11 12:11 UTC (permalink / raw)
To: Steffen Klassert, Herbert Xu, David S . Miller, David Ahern,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, netdev,
linux-kernel, stable, Simon Liebold
Cc: Qi Tang, Florian Westphal, Simon Liebold
From: Qi Tang <tpluszz77@gmail.com>
[ Upstream commit 1c428b03840094410c5fb6a5db30640486bbbfcb ]
After async crypto completes, xfrm_input_resume() calls dev_put()
immediately on re-entry before the skb reaches transport_finish.
The skb->dev pointer is then used inside NF_HOOK and its okfn,
which can race with device teardown.
Remove the dev_put from the async resumption entry and instead
drop the reference after the NF_HOOK call in transport_finish,
using a saved device pointer since NF_HOOK may consume the skb.
This covers NF_DROP, NF_QUEUE and NF_STOLEN paths that skip
the okfn.
For non-transport exits (decaps, gro, drop) and secondary
async return points, release the reference inline when
async is set.
Suggested-by: Florian Westphal <fw@strlen.de>
Fixes: acf568ee859f ("xfrm: Reinject transport-mode packets through tasklet")
Cc: stable@vger.kernel.org
Signed-off-by: Qi Tang <tpluszz77@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
[ net/xfrm/xfrm_input.c: dev_hold/dev_put are unconditional here rather
than inside !crypto_done as in mainline, and the dev_put in the
encap_type == -1 async-resumption block does not exist. Adapted by
taking a fresh dev_hold (when async && !xfrm_gro) immediately before
transport_finish, which releases it after NF_HOOK. The per-iteration
dev_hold/dev_put pair at loop-top/resume: is left unchanged.]
Signed-off-by: Simon Liebold <simonlie@amazon.de>
---
Notes:
v2: Restore unconditional dev_put at resume: and instead take a fresh
dev_hold immediately before transport_finish (when async && !xfrm_gro),
avoiding the reference leak on nested transport-mode that v1's
suppressed resume: dev_put caused.
Prerequisite b05d42eefac7 ("xfrm: hold device only for the asynchronous
decryption") was not backported as it restructures the lock ordering and
resume: label semantics of the decryption loop, requiring non-trivial
adaptation beyond what a minimal stable fix warrants.
I will send patches for 5.10.y through 6.6.y once we have reached a conclusion on this patch.
net/ipv4/xfrm4_input.c | 5 ++++-
net/ipv6/xfrm6_input.c | 5 ++++-
net/xfrm/xfrm_input.c | 5 ++++-
3 files changed, 12 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/xfrm4_input.c b/net/ipv4/xfrm4_input.c
index 12a1a0f421956..adf21d6b6076c 100644
--- a/net/ipv4/xfrm4_input.c
+++ b/net/ipv4/xfrm4_input.c
@@ -50,6 +50,7 @@ int xfrm4_transport_finish(struct sk_buff *skb, int async)
{
struct xfrm_offload *xo = xfrm_offload(skb);
struct iphdr *iph = ip_hdr(skb);
+ struct net_device *dev = skb->dev;
iph->protocol = XFRM_MODE_SKB_CB(skb)->protocol;
@@ -73,8 +74,10 @@ int xfrm4_transport_finish(struct sk_buff *skb, int async)
}
NF_HOOK(NFPROTO_IPV4, NF_INET_PRE_ROUTING,
- dev_net(skb->dev), NULL, skb, skb->dev, NULL,
+ dev_net(dev), NULL, skb, dev, NULL,
xfrm4_rcv_encap_finish);
+ if (async)
+ dev_put(dev);
return 0;
}
diff --git a/net/ipv6/xfrm6_input.c b/net/ipv6/xfrm6_input.c
index 9005fc156a20e..699a001ac1662 100644
--- a/net/ipv6/xfrm6_input.c
+++ b/net/ipv6/xfrm6_input.c
@@ -43,6 +43,7 @@ static int xfrm6_transport_finish2(struct net *net, struct sock *sk,
int xfrm6_transport_finish(struct sk_buff *skb, int async)
{
struct xfrm_offload *xo = xfrm_offload(skb);
+ struct net_device *dev = skb->dev;
int nhlen = -skb_network_offset(skb);
skb_network_header(skb)[IP6CB(skb)->nhoff] =
@@ -68,8 +69,10 @@ int xfrm6_transport_finish(struct sk_buff *skb, int async)
}
NF_HOOK(NFPROTO_IPV6, NF_INET_PRE_ROUTING,
- dev_net(skb->dev), NULL, skb, skb->dev, NULL,
+ dev_net(dev), NULL, skb, dev, NULL,
xfrm6_transport_finish2);
+ if (async)
+ dev_put(dev);
return 0;
}
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 8edcb32735e59..0288d98e66ee4 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -726,8 +726,11 @@ int xfrm_input(struct sk_buff *skb, int nexthdr, __be32 spi, int encap_type)
err = -EAFNOSUPPORT;
rcu_read_lock();
afinfo = xfrm_state_afinfo_get_rcu(x->props.family);
- if (likely(afinfo))
+ if (likely(afinfo)) {
+ if (async && !xfrm_gro)
+ dev_hold(skb->dev);
err = afinfo->transport_finish(skb, xfrm_gro || async);
+ }
rcu_read_unlock();
if (xfrm_gro) {
sp = skb_sec_path(skb);
base-commit: 1d3a00d3bacff25652c96e1527610c69e91f7c38
--
2.50.1
Amazon Web Services Development Center Germany GmbH
Tamara-Danz-Str. 13
10243 Berlin
Geschaeftsfuehrung: Christof Hellmis, Andreas Stieger
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597
* Re: [PATCH 6.12.y v2] xfrm: hold dev ref until after transport_finish NF_HOOK
2026-06-11 12:11 [PATCH 6.12.y v2] xfrm: hold dev ref until after transport_finish NF_HOOK Simon Liebold
@ 2026-06-11 15:26 ` Sasha Levin
2026-06-11 15:44 ` Sasha Levin
0 siblings, 1 reply; 3+ messages in thread
From: Sasha Levin @ 2026-06-11 15:26 UTC (permalink / raw)
To: Steffen Klassert, Herbert Xu, David S . Miller, David Ahern,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, netdev,
linux-kernel, stable, Simon Liebold
Cc: Sasha Levin, Qi Tang, Florian Westphal, Simon Liebold
On Thu, Jun 11, 2026 at 12:11:27PM +0000, Simon Liebold wrote:
> [ Upstream commit 1c428b03840094410c5fb6a5db30640486bbbfcb ]
>
> After async crypto completes, xfrm_input_resume() calls dev_put()
> immediately on re-entry before the skb reaches transport_finish.
Queued for 6.12, thanks.
--
Thanks,
Sasha
* Re: [PATCH 6.12.y v2] xfrm: hold dev ref until after transport_finish NF_HOOK
2026-06-11 15:26 ` Sasha Levin
@ 2026-06-11 15:44 ` Sasha Levin
0 siblings, 0 replies; 3+ messages in thread
From: Sasha Levin @ 2026-06-11 15:44 UTC (permalink / raw)
To: Steffen Klassert, Herbert Xu, David S . Miller, David Ahern,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Simon Horman, netdev,
linux-kernel, stable, Simon Liebold
Cc: Qi Tang, Florian Westphal, Simon Liebold
On Thu, Jun 11, 2026 at 11:26:20AM -0400, Sasha Levin wrote:
>On Thu, Jun 11, 2026 at 12:11:27PM +0000, Simon Liebold wrote:
>> [ Upstream commit 1c428b03840094410c5fb6a5db30640486bbbfcb ]
>>
>> After async crypto completes, xfrm_input_resume() calls dev_put()
>> immediately on re-entry before the skb reaches transport_finish.
>
>Queued for 6.12, thanks.
Ugh... Looking at it again, I've dropped it.
The problem is the assumption that "the dev_put in the encap_type == -1
async-resumption block does not exist" in 6.12.y. It's true there is no dev_put
inside the 'if (encap_type == -1)' block, but that is only because the early
drop lives somewhere else here: it's the dev_put right at the 'resume:' label.
Look at where 'resume:' sits relative to the per-iteration dev_put:
  mainline (post-fix):                  6.12.y:

    dev_hold(skb->dev);                   dev_hold(skb->dev);
    nexthdr = ...input(x, skb);           nexthdr = ...input(x, skb);
    if (nexthdr == -EINPROGRESS) {        if (nexthdr == -EINPROGRESS)
        if (async)                            return 0;
            dev_put(...);               resume:
        return 0;                         dev_put(skb->dev);  <-- early drop
    }
    dev_put(skb->dev);
  resume:                               [async re-entry does goto resume,
    ...                                  so this dev_put runs immediately]
In mainline the fix works because 'resume:' is *after* the per-iteration
dev_put, so when xfrm_input_resume() re-enters and does 'goto resume', the
async ref taken at the loop-top dev_hold is *not* dropped - it is held
continuously until after the NF_HOOK (plus the inline 'if (async) dev_put()' it
adds at the decaps/gro/drop/secondary-EINPROGRESS exits).
In 6.12.y 'resume:' is *before* that dev_put, so the async 'goto resume' hits
'dev_put(skb->dev)' straight away and drops the ref at the very start of resume
processing. The fresh 'dev_hold(skb->dev)' added before transport_finish does
not save it:
- between the early dev_put and the re-hold, skb->dev is held by no
xfrm reference at all - the exact window device teardown can race; and
- 'dev_hold(skb->dev)' itself dereferences skb->dev to bump the
refcount, so if the device was already freed in that window the
re-hold is itself a use-after-free.
So this is a lifetime bug, not a refcount-balance bug: every hold still has a
matching put, but the reference no longer covers the critical window.
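To make the difference concrete, here is a small user-space model of the two
orderings. It is only a sketch, not kernel code: struct model_dev, the model_*
helpers and both run_* functions are hypothetical names made up for the
illustration. It also assumes a simplified teardown in which the device is
freed as soon as unregistration has started and the modelled refcount is zero,
and it counts only the reference taken on the xfrm input path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a net_device: a refcount plus teardown state. */
struct model_dev {
    int refcnt;          /* references currently held */
    bool unregistering;  /* teardown has started */
    bool freed;          /* teardown finished because no reference was left */
};

struct model_result {
    bool use_after_free; /* the device was touched after it was freed */
    int final_refcnt;    /* 0 means holds and puts balanced */
};

/* Assumed teardown rule: free once unregistering and unreferenced. */
static void model_try_free(struct model_dev *d)
{
    if (d->unregistering && d->refcnt == 0)
        d->freed = true;
}

/* Any access to the device; records a use-after-free if it is gone. */
static void model_touch(const struct model_dev *d, struct model_result *r)
{
    if (d->freed)
        r->use_after_free = true;
}

static void model_hold(struct model_dev *d, struct model_result *r)
{
    /* Bumping the refcount dereferences the device itself. */
    model_touch(d, r);
    d->refcnt++;
}

static void model_put(struct model_dev *d)
{
    assert(d->refcnt > 0);
    d->refcnt--;
    model_try_free(d);
}

static void model_teardown(struct model_dev *d)
{
    d->unregistering = true;
    model_try_free(d);
}

/* 6.12.y with this patch: 'resume:' sits before the per-iteration put. */
struct model_result run_stable_v2_ordering(bool teardown_in_window)
{
    /* refcnt = 1 models the loop-top hold taken before async crypto. */
    struct model_dev dev = { .refcnt = 1 };
    struct model_result r = { false, 0 };

    model_put(&dev);             /* async re-entry: goto resume, early drop */
    if (teardown_in_window)
        model_teardown(&dev);    /* device teardown races here */
    model_hold(&dev, &r);        /* fresh hold before transport_finish */
    model_touch(&dev, &r);       /* NF_HOOK uses the device */
    model_put(&dev);             /* put after NF_HOOK */
    r.final_refcnt = dev.refcnt;
    return r;
}

/* Mainline ordering: 'resume:' sits after the per-iteration put. */
struct model_result run_mainline_ordering(bool teardown_in_window)
{
    struct model_dev dev = { .refcnt = 1 };
    struct model_result r = { false, 0 };

    /* async re-entry: goto resume drops nothing, the hold stays in place */
    if (teardown_in_window)
        model_teardown(&dev);    /* refcnt is 1, so nothing is freed yet */
    model_touch(&dev, &r);       /* NF_HOOK uses the device */
    model_put(&dev);             /* put after NF_HOOK, last use is behind us */
    r.final_refcnt = dev.refcnt;
    return r;
}
```

With teardown_in_window set, run_stable_v2_ordering() ends with final_refcnt
at 0 and use_after_free set, while run_mainline_ordering() ends with
final_refcnt at 0 and use_after_free clear. The counts balance in both; only
the coverage of the window differs.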
--
Thanks,
Sasha