* [PATCH] net: Expand headroom to send fragmented packets in bridge fragment forward
@ 2025-04-09 7:33 Huajian Yang
2025-04-09 9:18 ` Florian Westphal
0 siblings, 1 reply; 4+ messages in thread
From: Huajian Yang @ 2025-04-09 7:33 UTC
To: pablo
Cc: kadlec, razor, idosch, davem, dsahern, edumazet, kuba, pabeni,
horms, netfilter-devel, coreteam, bridge, netdev, linux-kernel,
Huajian Yang
The NF_CONNTRACK_BRIDGE config option changes the way fragments are
processed. Without it, the bridge does not know that a packet is
fragmented and forwards it directly. Once NF_CONNTRACK_BRIDGE is enabled,
nf_br_ip_fragment() checks the packet and refragments it using its
frag list.

Some network devices cannot ping with large packets when attached to a
bridge, although the same ping succeeds when NF_CONNTRACK_BRIDGE is not
enabled.

In nf_br_ip_fragment(), checking the headroom before sending is
certainly necessary, but it is unreasonable to drop an skb outright
just because its headroom is insufficient.

Use skb_copy_expand() to expand the headroom of the skb instead of
dropping it.
Signed-off-by: Huajian Yang <huajianyang@asrmicro.com>
---
net/bridge/netfilter/nf_conntrack_bridge.c | 14 ++++++++++++--
net/ipv6/netfilter.c | 14 ++++++++++++--
2 files changed, 24 insertions(+), 4 deletions(-)
diff --git a/net/bridge/netfilter/nf_conntrack_bridge.c b/net/bridge/netfilter/nf_conntrack_bridge.c
index 816bb0fde718..b8fb81a49377 100644
--- a/net/bridge/netfilter/nf_conntrack_bridge.c
+++ b/net/bridge/netfilter/nf_conntrack_bridge.c
@@ -62,7 +62,7 @@ static int nf_br_ip_fragment(struct net *net, struct sock *sk,
if (first_len - hlen > mtu ||
skb_headroom(skb) < ll_rs)
- goto blackhole;
+ goto expand_headroom;
if (skb_cloned(skb))
goto slow_path;
@@ -70,7 +70,7 @@ static int nf_br_ip_fragment(struct net *net, struct sock *sk,
skb_walk_frags(skb, frag) {
if (frag->len > mtu ||
skb_headroom(frag) < hlen + ll_rs)
- goto blackhole;
+ goto expand_headroom;
if (skb_shared(frag))
goto slow_path;
@@ -97,6 +97,16 @@ static int nf_br_ip_fragment(struct net *net, struct sock *sk,
return err;
}
+
+expand_headroom:
+ struct sk_buff *expand_skb;
+
+ expand_skb = skb_copy_expand(skb, ll_rs, skb_tailroom(skb), GFP_ATOMIC);
+ if (unlikely(!expand_skb))
+ goto blackhole;
+ kfree_skb(skb);
+ skb = expand_skb;
+
slow_path:
/* This is a linearized skbuff, the original geometry is lost for us.
* This may also be a clone skbuff, we could preserve the geometry for
diff --git a/net/ipv6/netfilter.c b/net/ipv6/netfilter.c
index 581ce055bf52..619d4b97581b 100644
--- a/net/ipv6/netfilter.c
+++ b/net/ipv6/netfilter.c
@@ -166,7 +166,7 @@ int br_ip6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
if (first_len - hlen > mtu ||
skb_headroom(skb) < (hroom + sizeof(struct frag_hdr)))
- goto blackhole;
+ goto expand_headroom;
if (skb_cloned(skb))
goto slow_path;
@@ -174,7 +174,7 @@ int br_ip6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
skb_walk_frags(skb, frag2) {
if (frag2->len > mtu ||
skb_headroom(frag2) < (hlen + hroom + sizeof(struct frag_hdr)))
- goto blackhole;
+ goto expand_headroom;
/* Partially cloned skb? */
if (skb_shared(frag2))
@@ -208,6 +208,16 @@ int br_ip6_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
kfree_skb_list(iter.frag);
return err;
}
+
+expand_headroom:
+ struct sk_buff *expand_skb;
+
+ expand_skb = skb_copy_expand(skb, hroom, skb_tailroom(skb), GFP_ATOMIC);
+ if (unlikely(!expand_skb))
+ goto blackhole;
+ kfree_skb(skb);
+ skb = expand_skb;
+
slow_path:
/* This is a linearized skbuff, the original geometry is lost for us.
* This may also be a clone skbuff, we could preserve the geometry for
--
2.48.1
* Re: [PATCH] net: Expand headroom to send fragmented packets in bridge fragment forward
2025-04-09 7:33 [PATCH] net: Expand headroom to send fragmented packets in bridge fragment forward Huajian Yang
@ 2025-04-09 9:18 ` Florian Westphal
2025-04-09 10:27 ` Reply: " Yang Huajian(杨华健)
0 siblings, 1 reply; 4+ messages in thread
From: Florian Westphal @ 2025-04-09 9:18 UTC
To: Huajian Yang
Cc: pablo, kadlec, razor, idosch, davem, dsahern, edumazet, kuba,
pabeni, horms, netfilter-devel, coreteam, bridge, netdev,
linux-kernel
Huajian Yang <huajianyang@asrmicro.com> wrote:
> The NF_CONNTRACK_BRIDGE config option changes the way fragments are
> processed. Without it, the bridge does not know that a packet is
> fragmented and forwards it directly. Once NF_CONNTRACK_BRIDGE is enabled,
> nf_br_ip_fragment() checks the packet and refragments it using its
> frag list.
>
> Some network devices cannot ping with large packets when attached to a
> bridge, although the same ping succeeds when NF_CONNTRACK_BRIDGE is not
> enabled.
Can you add a new test to tools/testing/selftests/net/netfilter/ that
demonstrates this problem?
> In nf_br_ip_fragment(), checking the headroom before sending is
> certainly necessary, but it is unreasonable to drop an skb outright
> just because its headroom is insufficient.
Are we talking about
if (first_len - hlen > mtu
or
skb_headroom(skb) < ll_rs)
?
>
> if (first_len - hlen > mtu ||
> skb_headroom(skb) < ll_rs)
> - goto blackhole;
> + goto expand_headroom;
I guess this should be
if (first_len - hlen > mtu)
goto blackhole;
if (skb_headroom(skb) < ll_rs)
goto expand_headroom;
... but I'm not sure what the actual problem is.
> +expand_headroom:
> + struct sk_buff *expand_skb;
> +
> + expand_skb = skb_copy_expand(skb, ll_rs, skb_tailroom(skb), GFP_ATOMIC);
> + if (unlikely(!expand_skb))
> + goto blackhole;
Why does this need to make a full skb copy?
Should that be using skb_expand_head()?
> slow_path:
Actually, can't you just (re)use the slowpath for the skb_headroom < ll_rs
case instead of adding headroom expansion?
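For illustration only, the split check suggested above, combined with reusing the slowpath, could be modelled in user space roughly as follows. This is a sketch, not kernel code: `struct head_state`, `enum frag_path` and `choose_path()` are hypothetical names, only the checks on the head skb are modelled, and the per-fragment checks inside skb_walk_frags() are left out.

```c
/* Possible outcomes of the checks on the head skb. */
enum frag_path {
	FRAG_BLACKHOLE,	/* drop the packet */
	FRAG_SLOW_PATH,	/* build new fragments with fresh allocations */
	FRAG_FAST_PATH,	/* reuse the existing frag list as-is */
};

/* Hypothetical summary of the head skb state the checks look at. */
struct head_state {
	unsigned int first_len;	/* length of the head skb, IP header included */
	unsigned int hlen;	/* IP header length; assumed <= first_len */
	unsigned int headroom;	/* what skb_headroom() would report */
	int cloned;		/* what skb_cloned() would report */
};

/*
 * An oversized first fragment is still dropped, but insufficient
 * headroom falls back to the slow path instead of the blackhole.
 */
static enum frag_path choose_path(const struct head_state *s,
				  unsigned int mtu, unsigned int ll_rs)
{
	if (s->first_len - s->hlen > mtu)
		return FRAG_BLACKHOLE;
	if (s->headroom < ll_rs)
		return FRAG_SLOW_PATH;
	if (s->cloned)
		return FRAG_SLOW_PATH;
	return FRAG_FAST_PATH;
}
```

The only behavioural difference from the existing check is the second branch: a head skb with too little headroom is refragmented rather than dropped.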
* Reply: [PATCH] net: Expand headroom to send fragmented packets in bridge fragment forward
2025-04-09 9:18 ` Florian Westphal
@ 2025-04-09 10:27 ` Yang Huajian(杨华健)
2025-04-09 10:45 ` Florian Westphal
0 siblings, 1 reply; 4+ messages in thread
From: Yang Huajian(杨华健) @ 2025-04-09 10:27 UTC
To: Florian Westphal
Cc: pablo@netfilter.org, kadlec@netfilter.org, razor@blackwall.org,
idosch@nvidia.com, davem@davemloft.net, dsahern@kernel.org,
edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
horms@kernel.org, netfilter-devel@vger.kernel.org,
coreteam@netfilter.org, bridge@lists.linux.dev,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Thank you for your reply!
> > Some network devices cannot ping with large packets when attached to a
> > bridge, although the same ping succeeds when NF_CONNTRACK_BRIDGE is not enabled.
> Can you add a new test to tools/testing/selftests/net/netfilter/ that demonstrates this problem?
I may not be able to demonstrate this problem with a shell script;
I actually discovered it while debugging a wifi network device.
That netdevice sets a large needed_headroom (80), so ll_rs exceeds the skb headroom and the code jumps to blackhole.
The problem is easy to reproduce: configure needed_headroom on a netdevice,
add that netdevice to a bridge, and test bridge forwarding.
Pinging with large packets reproduces the failure (the ping succeeds when NF_CONNTRACK_BRIDGE is not enabled).
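To make the numbers concrete, here is a minimal user-space sketch of how ll_rs grows with needed_headroom, assuming ll_rs is taken from LL_RESERVED_SPACE(skb->dev) and that the macro rounds hard_header_len + needed_headroom down to a multiple of HH_DATA_MOD (16) and then adds HH_DATA_MOD. `struct model_dev` and both functions are hypothetical stand-ins, and the hard_header_len of 14 used in the note below is an assumption (plain Ethernet), not a value from the device in question.

```c
#include <stdbool.h>

/* Mirrors HH_DATA_MOD from include/linux/netdevice.h. */
#define HH_DATA_MOD 16u

/* Hypothetical stand-in for the two struct net_device fields that matter. */
struct model_dev {
	unsigned int hard_header_len;	/* e.g. 14 for Ethernet */
	unsigned int needed_headroom;	/* extra headroom requested by the driver */
};

/*
 * User-space model of LL_RESERVED_SPACE(dev): round the sum down to a
 * multiple of HH_DATA_MOD, then add one more HH_DATA_MOD.
 */
static unsigned int model_ll_reserved_space(const struct model_dev *dev)
{
	unsigned int need = dev->hard_header_len + dev->needed_headroom;

	return (need & ~(HH_DATA_MOD - 1)) + HH_DATA_MOD;
}

/* Headroom precondition that the head skb must meet to avoid the drop. */
static bool model_headroom_ok(unsigned int skb_headroom,
			      const struct model_dev *dev)
{
	return skb_headroom >= model_ll_reserved_space(dev);
}
```

Under these assumptions, a device with hard_header_len 14 and needed_headroom 0 needs 16 bytes of headroom, while the same device with needed_headroom 80 needs 96, so an skb that arrived with, say, 64 bytes of headroom passes the first check but fails the second.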
> I guess this should be
>
> if (first_len - hlen > mtu)
> goto blackhole;
> if (skb_headroom(skb) < ll_rs)
> goto expand_headroom;
> ... but I'm not sure what the actual problem is.
Yes, your guess is correct!
The actual problem: I think it is unreasonable to drop an skb outright just because its headroom is insufficient.
> Why does this need to make a full skb copy?
> Should that be using skb_expand_head()?
Using skb_expand_head() has the same effect.
> Actually, can't you just (re)use the slowpath for the skb_headroom < ll_rs case instead of adding headroom expansion?
I tested it just now, and reusing the slowpath works.
But this change may not cover all cases if the netdevice really needs this headroom.
Best Regards,
Huajian
* Re: Reply: [PATCH] net: Expand headroom to send fragmented packets in bridge fragment forward
2025-04-09 10:27 ` Reply: " Yang Huajian(杨华健)
@ 2025-04-09 10:45 ` Florian Westphal
0 siblings, 0 replies; 4+ messages in thread
From: Florian Westphal @ 2025-04-09 10:45 UTC
To: Yang Huajian(杨华健)
Cc: Florian Westphal, pablo@netfilter.org, kadlec@netfilter.org,
razor@blackwall.org, idosch@nvidia.com, davem@davemloft.net,
dsahern@kernel.org, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, horms@kernel.org,
netfilter-devel@vger.kernel.org, coreteam@netfilter.org,
bridge@lists.linux.dev, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
Yang Huajian(杨华健) <huajianyang@asrmicro.com> wrote:
> > if (skb_headroom(skb) < ll_rs)
> > goto expand_headroom;
>
> > ... but I'm not sure what the actual problem is.
>
> Yes, your guess is correct!
>
> The actual problem: I think it is unreasonable to drop an skb outright just because its headroom is insufficient.
>
> > Why does this need to make a full skb copy?
> > Should that be using skb_expand_head()?
>
> Using skb_expand_head has the same effect.
> > Actually, can't you just (re)use the slowpath for the skb_headroom < ll_rs case instead of adding headroom expansion?
>
> I tested it just now, and reusing the slowpath works.
> But this change may not cover all cases if the netdevice really needs this headroom.
The slowpath considers headroom requirements, see ip_frag_next():
skb2 = alloc_skb(len + state->hlen + state->ll_rs, GFP_ATOMIC);
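As a rough user-space illustration of why that allocation satisfies the device's headroom requirement, the sketch below models one slow-path fragment: a buffer of len + hlen + ll_rs bytes whose first ll_rs bytes are reserved as headroom. It assumes the allocation is followed by a reserve of ll_rs bytes, as with skb_reserve(). `struct model_skb` and the `model_*` helpers are hypothetical; a real alloc_skb() may also hand back more room than requested because of alignment.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for the sk_buff fields used here. */
struct model_skb {
	unsigned char *head;	/* start of the allocated buffer */
	unsigned char *data;	/* start of the packet data */
	size_t len;		/* bytes of packet data */
	size_t size;		/* total buffer size */
};

/* Bytes available in front of the packet data. */
static size_t model_headroom(const struct model_skb *skb)
{
	return (size_t)(skb->data - skb->head);
}

/*
 * Model of one slow-path fragment: allocate len + hlen + ll_rs bytes,
 * reserve ll_rs bytes as headroom, then account for header plus payload.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int model_frag_alloc(struct model_skb *skb, size_t len,
			    size_t hlen, size_t ll_rs)
{
	size_t size = len + hlen + ll_rs;

	skb->head = malloc(size);
	if (!skb->head)
		return -1;
	skb->size = size;
	skb->data = skb->head + ll_rs;	/* reserve step */
	skb->len = len + hlen;		/* put step */
	return 0;
}

/* Release the buffer and reset the model to an empty state. */
static void model_frag_free(struct model_skb *skb)
{
	free(skb->head);
	skb->head = NULL;
	skb->data = NULL;
	skb->len = 0;
	skb->size = 0;
}
```

Because every new fragment starts with ll_rs bytes reserved, the headroom of the original skb no longer matters once the slow path is taken.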
You should wait for more feedback and then send a v2 tomorrow.
Thanks!