* [PATCH net-next] xfrm: fix xfrm_dev_offload_ok() returning true for software SAs
@ 2026-05-27 14:09 Petr Wozniak
2026-05-27 16:01 ` Sabrina Dubroca
0 siblings, 1 reply; 4+ messages in thread
From: Petr Wozniak @ 2026-05-27 14:09 UTC (permalink / raw)
To: netdev; +Cc: steffen.klassert, herbert, davem, petr.wozniak
xfrm_dev_offload_ok() reaches the ok: label when the SA has no
offload device (dev == NULL, SW SA) because the guard condition
"(!dev || dev == xfrm_dst_path(dst)->dev)" evaluates true for
dev == NULL. The explicit "if (!dev) return true" at ok: then
incorrectly routes the SW SA onto the GSO offload path.
This triggers the following chain on platforms with an async crypto
engine (e.g. MediaTek EIP-197):
xfrm_output() -> secpath_set() -> esp4_gso_encap() (no encryption,
marks skb) -> validate_xmit_xfrm() -> esp_xmit() -> async crypto
-> -EINPROGRESS -> validate_xmit_xfrm returns NULL
On real netdevs (noqueue absent) sch_direct_xmit() handles NULL
gracefully and async crypto completes the packet via
xfrm_dev_resume(). On bridge netdevs (noqueue qdisc) the direct
branch of __dev_queue_xmit() initialises rc to -ENOMEM and never
overwrites it when skb is NULL, propagating -ENOMEM to userspace as
"sendmsg: Out of memory" on every packet.
On synchronous platforms (x86/ARM without a dedicated crypto engine)
crypto_aead_encrypt() completes inline, -EINPROGRESS never occurs,
and the bug is latent.
Fix: return false for dev == NULL so SW SAs use the normal inline
path in xfrm_output_one() via esp4_output(). On -EINPROGRESS,
xfrm_output_resume() converts it to 0 and the async callback
re-enters xfrm_output_resume() to deliver the encrypted packet
correctly. HW offload SAs (dev != NULL) are unaffected.
Fixes: 9e1437937bdd ("xfrm: Add xfrm_dev_offload_ok helper")
Signed-off-by: Petr Wozniak <petr.wozniak@gmail.com>
---
net/xfrm/xfrm_device.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
index 550457e4c..5454be0b2 100644
--- a/net/xfrm/xfrm_device.c
+++ b/net/xfrm/xfrm_device.c
@@ -434,7 +434,7 @@ bool xfrm_dev_offload_ok(struct sk_buff *skb, struct xfrm_state *x)
ok:
if (!dev)
- return true;
+ return false;
check_tunnel_size = x->xso.type == XFRM_DEV_OFFLOAD_PACKET &&
x->props.mode == XFRM_MODE_TUNNEL;
--
2.51.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH net-next] xfrm: fix xfrm_dev_offload_ok() returning true for software SAs
2026-05-27 14:09 Petr Wozniak
@ 2026-05-27 16:01 ` Sabrina Dubroca
0 siblings, 0 replies; 4+ messages in thread
From: Sabrina Dubroca @ 2026-05-27 16:01 UTC (permalink / raw)
To: Petr Wozniak; +Cc: netdev, steffen.klassert, herbert, davem
2026-05-27, 16:09:48 +0200, Petr Wozniak wrote:
> xfrm_dev_offload_ok() reaches the ok: label when the SA has no
> offload device (dev == NULL, SW SA) because the guard condition
> "(!dev || dev == xfrm_dst_path(dst)->dev)" evaluates true for
> dev == NULL. The explicit "if (!dev) return true" at ok: then
> incorrectly routes the SW SA onto the GSO offload path.
Incorrectly? IPsec in SW with GSO is a valid setup. I think you're
breaking that with your patch.
--
Sabrina
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH net-next] xfrm: fix xfrm_dev_offload_ok() returning true for software SAs
[not found] <20260527140948.78243-1-petr.wozniak@gmail.com>
@ 2026-05-27 16:47 ` Petr Wozniak
2026-05-27 18:41 ` Sabrina Dubroca
0 siblings, 1 reply; 4+ messages in thread
From: Petr Wozniak @ 2026-05-27 16:47 UTC (permalink / raw)
To: sd; +Cc: Petr Wozniak, netdev, steffen.klassert, herbert, davem
2026-05-27, Sabrina Dubroca wrote:
> Incorrectly? IPsec in SW with GSO is a valid setup. I think you're
> breaking that with your patch.
Fair point — SW IPsec with GSO is intentional and the patch is too broad.
The actual observable bug on this platform (MT7988A, EIP-197 async crypto):
xfrm_dev_offload_ok() → true (SW SA, dev == NULL)
→ esp4_gso_encap() marks the skb
→ validate_xmit_xfrm() → esp_xmit() → async crypto → -EINPROGRESS
→ validate_xmit_xfrm() returns NULL
On bridge interfaces (noqueue qdisc), __dev_queue_xmit() takes the
direct branch, initialises rc = -ENOMEM and never overwrites it
when skb is NULL → ENOMEM on every packet.
On real netdevs with a qdisc, sch_direct_xmit() handles NULL
gracefully and async completion via xfrm_dev_resume() delivers
the packet correctly.
Where would you suggest the actual fix should go — in the
bridge/noqueue path, or in validate_xmit_xfrm() / sch_direct_xmit()?
Petr
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH net-next] xfrm: fix xfrm_dev_offload_ok() returning true for software SAs
2026-05-27 16:47 ` [PATCH net-next] xfrm: fix xfrm_dev_offload_ok() returning true for software SAs Petr Wozniak
@ 2026-05-27 18:41 ` Sabrina Dubroca
0 siblings, 0 replies; 4+ messages in thread
From: Sabrina Dubroca @ 2026-05-27 18:41 UTC (permalink / raw)
To: Petr Wozniak; +Cc: netdev, steffen.klassert, herbert, davem
2026-05-27, 18:47:17 +0200, Petr Wozniak wrote:
> 2026-05-27, Sabrina Dubroca wrote:
> > Incorrectly? IPsec in SW with GSO is a valid setup. I think you're
> > breaking that with your patch.
>
> Fair point — SW IPsec with GSO is intentional and the patch is too broad.
(this whole email looks like direct c/p from an LLM...)
> The actual observable bug on this platform (MT7988A, EIP-197 async crypto):
>
> xfrm_dev_offload_ok() → true (SW SA, dev == NULL)
> → esp4_gso_encap() marks the skb
> → validate_xmit_xfrm() → esp_xmit() → async crypto → -EINPROGRESS
> → validate_xmit_xfrm() returns NULL
>
> On bridge interfaces (noqueue qdisc), __dev_queue_xmit() takes the
> direct branch, initialises rc = -ENOMEM and never overwrites it
> when skb is NULL → ENOMEM on every packet.
>
> On real netdevs with a qdisc, sch_direct_xmit() handles NULL
> gracefully and async completion via xfrm_dev_resume() delivers
> the packet correctly.
But the packet is delivered correctly also in the bridge case, right?
Once we enter async crypto, the original datapath becomes irrelevant.
> Where would you suggest the actual fix should go — in the
> bridge/noqueue path, or in validate_xmit_xfrm() / sch_direct_xmit()?
It looks like commit f53c723902d1 ("net: Add asynchronous callbacks
for xfrm on layer 2.") updated the meaning of validate_xmit_xfrm()
returning NULL from "packet was dropped" to "packet was stolen (maybe
dropped, maybe still going)" and this isn't handled well.
I guess validate_xmit_xfrm() could propagate -EINPROGRESS via
ERR_PTR(), and validate_xmit_skb() could return that to its callers.
validate_xmit_skb_list() would need to check IS_ERR_OR_NULL(skb)
instead of !skb, and __dev_queue_xmit() could now differentiate the
EINPROGRESS case from a drop.
--
Sabrina
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-05-27 18:42 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20260527140948.78243-1-petr.wozniak@gmail.com>
2026-05-27 16:47 ` [PATCH net-next] xfrm: fix xfrm_dev_offload_ok() returning true for software SAs Petr Wozniak
2026-05-27 18:41 ` Sabrina Dubroca
2026-05-27 14:09 Petr Wozniak
2026-05-27 16:01 ` Sabrina Dubroca
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox