From: Jakub Kicinski <kuba@kernel.org>
To: "Maciej Żenczykowski" <maze@google.com>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
davem@davemloft.net, netdev@vger.kernel.org, edumazet@google.com,
pabeni@redhat.com, andrew+netdev@lunn.ch, horms@kernel.org,
martin.lau@linux.dev, daniel@iogearbox.net,
john.fastabend@gmail.com, eddyz87@gmail.com, sdf@fomichev.me,
haoluo@google.com, willemb@google.com,
william.xuanziyang@huawei.com, alan.maguire@oracle.com,
bpf@vger.kernel.org
Subject: Re: [PATCH net] net: clear the dst when changing skb protocol
Date: Fri, 6 Jun 2025 10:11:06 -0700 [thread overview]
Message-ID: <20250606101106.133cb314@kernel.org> (raw)
In-Reply-To: <CANP3RGfXNrL7b+BPUCPc_=iiExtxZVxLhpQR=vyzgksuuLYkeA@mail.gmail.com>
On Fri, 6 Jun 2025 11:40:31 +0200 Maciej Żenczykowski wrote:
> Hopefully this is helpful?
So IIUC this is what we should do?
Cover the cases where we are 99% sure dropping dst is right without
being overeager ?
---->8-----
diff --git a/net/core/filter.c b/net/core/filter.c
index 327ca73f9cd7..d5917d6446f2 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3401,18 +3401,24 @@ BPF_CALL_3(bpf_skb_change_proto, struct sk_buff *, skb, __be16, proto,
* care of stores.
*
* Currently, additional options and extension header space are
* not supported, but flags register is reserved so we can adapt
* that. For offloads, we mark packet as dodgy, so that headers
* need to be verified first.
*/
ret = bpf_skb_proto_xlat(skb, proto);
+ if (ret)
+ return ret;
+
bpf_compute_data_pointers(skb);
- return ret;
+ if (skb_valid_dst(skb))
+ skb_dst_drop(skb);
+
+ return 0;
}
static const struct bpf_func_proto bpf_skb_change_proto_proto = {
.func = bpf_skb_change_proto,
.gpl_only = false,
.ret_type = RET_INTEGER,
.arg1_type = ARG_PTR_TO_CTX,
.arg2_type = ARG_ANYTHING,
@@ -3549,16 +3555,19 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
/* Match skb->protocol to new outer l3 protocol */
if (skb->protocol == htons(ETH_P_IP) &&
flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV6)
skb->protocol = htons(ETH_P_IPV6);
else if (skb->protocol == htons(ETH_P_IPV6) &&
flags & BPF_F_ADJ_ROOM_ENCAP_L3_IPV4)
skb->protocol = htons(ETH_P_IP);
+
+ if (skb_valid_dst(skb))
+ skb_dst_drop(skb);
}
if (skb_is_gso(skb)) {
struct skb_shared_info *shinfo = skb_shinfo(skb);
/* Header must be checked, and gso_segs recomputed. */
shinfo->gso_type |= gso_type;
shinfo->gso_segs = 0;
@@ -3576,16 +3585,17 @@ static int bpf_skb_net_grow(struct sk_buff *skb, u32 off, u32 len_diff,
}
return 0;
}
static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
u64 flags)
{
+ bool decap = flags & BPF_F_ADJ_ROOM_DECAP_L3_MASK;
int ret;
if (unlikely(flags & ~(BPF_F_ADJ_ROOM_FIXED_GSO |
BPF_F_ADJ_ROOM_DECAP_L3_MASK |
BPF_F_ADJ_ROOM_NO_CSUM_RESET)))
return -EINVAL;
if (skb_is_gso(skb) && !skb_is_gso_tcp(skb)) {
@@ -3598,23 +3608,28 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff,
ret = skb_unclone(skb, GFP_ATOMIC);
if (unlikely(ret < 0))
return ret;
ret = bpf_skb_net_hdr_pop(skb, off, len_diff);
if (unlikely(ret < 0))
return ret;
- /* Match skb->protocol to new outer l3 protocol */
- if (skb->protocol == htons(ETH_P_IP) &&
- flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
- skb->protocol = htons(ETH_P_IPV6);
- else if (skb->protocol == htons(ETH_P_IPV6) &&
- flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
- skb->protocol = htons(ETH_P_IP);
+ if (decap) {
+ /* Match skb->protocol to new outer l3 protocol */
+ if (skb->protocol == htons(ETH_P_IP) &&
+ flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV6)
+ skb->protocol = htons(ETH_P_IPV6);
+ else if (skb->protocol == htons(ETH_P_IPV6) &&
+ flags & BPF_F_ADJ_ROOM_DECAP_L3_IPV4)
+ skb->protocol = htons(ETH_P_IP);
+
+ if (skb_valid_dst(skb))
+ skb_dst_drop(skb);
+ }
if (skb_is_gso(skb)) {
struct skb_shared_info *shinfo = skb_shinfo(skb);
/* Due to header shrink, MSS can be upgraded. */
if (!(flags & BPF_F_ADJ_ROOM_FIXED_GSO))
skb_increase_gso_size(shinfo, len_diff);
next prev parent reply other threads:[~2025-06-06 17:11 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-06-04 21:06 [PATCH net] net: clear the dst when changing skb protocol Jakub Kicinski
2025-06-04 21:21 ` Maciej Żenczykowski
2025-06-05 13:22 ` Jakub Kicinski
2025-06-05 13:50 ` Maciej Żenczykowski
2025-06-05 14:01 ` Jakub Kicinski
2025-06-06 0:09 ` Willem de Bruijn
2025-06-06 0:31 ` Jakub Kicinski
2025-06-06 9:40 ` Maciej Żenczykowski
2025-06-06 17:11 ` Jakub Kicinski [this message]
2025-06-06 17:36 ` Maciej Żenczykowski
2025-06-06 15:55 ` Willem de Bruijn
2025-06-06 17:09 ` Jakub Kicinski
2025-06-04 21:45 ` Daniel Borkmann
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250606101106.133cb314@kernel.org \
--to=kuba@kernel.org \
--cc=alan.maguire@oracle.com \
--cc=andrew+netdev@lunn.ch \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=eddyz87@gmail.com \
--cc=edumazet@google.com \
--cc=haoluo@google.com \
--cc=horms@kernel.org \
--cc=john.fastabend@gmail.com \
--cc=martin.lau@linux.dev \
--cc=maze@google.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=sdf@fomichev.me \
--cc=willemb@google.com \
--cc=willemdebruijn.kernel@gmail.com \
--cc=william.xuanziyang@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).