Re: [PATCH bpf-next v5 1/4] bpf: add bpf_icmp_send kfunc

BPF List
 help / color / mirror / Atom feed

From: sashiko-bot@kernel.org
To: "Mahe Tardy" <mahe.tardy@gmail.com>
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH bpf-next v5 1/4] bpf: add bpf_icmp_send kfunc
Date: Fri, 15 May 2026 20:23:57 +0000	[thread overview]
Message-ID: <20260515202358.20252C2BCB0@smtp.kernel.org> (raw)
In-Reply-To: <20260515194746.50920-2-mahe.tardy@gmail.com>

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
- [Critical] Calling pskb_network_may_pull() on the original skb before cloning modifies the SKB layout, causing BPF packet pointer Use-After-Free and panicking on shared SKBs.
- [High] IPv4 ICMP generation fails silently in TC ingress due to a missing routing table entry.
- [Medium] Unsafe use of a per-CPU recursion protection flag without disabling preemption or bottom-halves causes spurious failures.
--

commit a8e749b3f92e491c40d461bbeec717cff3d92cd2
Author: Mahe Tardy <mahe.tardy@gmail.com>
One line subject: bpf: add bpf_icmp_send kfunc

This commit adds the bpf_icmp_send kfunc, allowing cgroup_skb and tc
programs to generate ICMP error messages instead of silently dropping
packets. It clones the skb to preserve the original buffer, checks bounds
with pskb_network_may_pull, and relies on icmp_send or icmpv6_send to
transmit the error message.

> diff --git a/net/core/filter.c b/net/core/filter.c
> index 9590877b0714f..a1fe20b45f29a 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c

[ ... ]

> @@ -12464,6 +12466,95 @@ __bpf_kfunc int bpf_xdp_pull_data(struct xdp_md *x, u32 len)
>  	return 0;
>  }
>  
> +static DEFINE_PER_CPU(bool, bpf_icmp_send_in_progress);
> +
> +/**
> + * bpf_icmp_send - Send an ICMP control message
> + * @skb: Packet that triggered the control message
> + * @type: ICMP type (only ICMP_DEST_UNREACH/ICMPV6_DEST_UNREACH supported)

[ ... ]

> +__bpf_kfunc int bpf_icmp_send(struct __sk_buff *__skb, int type, int code)
> +{
> +	struct sk_buff *skb = (struct sk_buff *)__skb;
> +	struct sk_buff *nskb;
> +	bool *in_progress;
> +
> +	in_progress = this_cpu_ptr(&bpf_icmp_send_in_progress);
> +	if (*in_progress)
> +		return -EBUSY;

Since cgroup_skb egress programs execute in process context, they can be
interrupted by network RX softirqs on the same CPU.

If a task sets *in_progress to true and is immediately preempted by a softirq
that runs a TC ingress program calling bpf_icmp_send(), will it read the same
per-CPU flag and spuriously fail with -EBUSY?

Does this critical section need to run with bottom-halves disabled?

> +
> +	switch (skb->protocol) {
> +#if IS_ENABLED(CONFIG_INET)
> +	case htons(ETH_P_IP):
> +		if (type != ICMP_DEST_UNREACH)
> +			return -EOPNOTSUPP;
> +		if (code < 0 || code > NR_ICMP_UNREACH)
> +			return -EINVAL;
> +
> +		if (!pskb_network_may_pull(skb, sizeof(struct iphdr)))
> +			return -EBADMSG;

Does calling pskb_network_may_pull() on the original skb before cloning modify
the packet layout? If the headers span into paged fragments, this could
trigger reallocation of the linear data buffer via pskb_expand_head().

Since this kfunc lacks flags like KF_INVALIDATE_PKT_PTRS, could the BPF
verifier allow programs to retain and dereference stale direct packet
pointers?

Additionally, if the original skb is shared (which is common in tc and
cgroup_skb), would pskb_expand_head() hit a BUG_ON(skb_shared(skb)) and panic
the kernel? Should the packet be cloned first before pulling headers?

> +
> +		nskb = skb_clone(skb, GFP_ATOMIC);
> +		if (!nskb)
> +			return -ENOMEM;
> +
> +		memset(IPCB(nskb), 0, sizeof(struct inet_skb_parm));
> +
> +		*in_progress = true;
> +		icmp_send(nskb, type, code, 0);

The commit message notes this is "mostly for tc ingress progs". Since TC
ingress happens before IP routing, skb_rtable(nskb) will be NULL here.

Looking at __icmp_send(), it checks for a routing entry and aborts early if
missing:

net/ipv4/icmp.c:__icmp_send() {
        ...
        rt = skb_rtable(skb_in);
        if (!rt)
                goto out;
        ...
}

Will this cause IPv4 ICMP generation to silently fail for TC ingress programs?

> +		*in_progress = false;
> +		kfree_skb(nskb);
> +		break;
> +#endif

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260515194746.50920-1-mahe.tardy@gmail.com?part=1

next prev parent reply	other threads:[~2026-05-15 20:23 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-15 19:47 [PATCH bpf-next v5 0/4] bpf: add icmp_send kfunc Mahe Tardy
2026-05-15 19:47 ` [PATCH bpf-next v5 1/4] bpf: add bpf_icmp_send kfunc Mahe Tardy
2026-05-15 20:23   ` sashiko-bot [this message]
2026-05-15 19:47 ` [PATCH bpf-next v5 2/4] selftests/bpf: add bpf_icmp_send kfunc tests Mahe Tardy
2026-05-15 20:36   ` sashiko-bot
2026-05-15 19:47 ` [PATCH bpf-next v5 3/4] selftests/bpf: add bpf_icmp_send kfunc IPv6 tests Mahe Tardy
2026-05-15 20:44   ` sashiko-bot
2026-05-15 19:47 ` [PATCH bpf-next v5 4/4] selftests/bpf: add bpf_icmp_send recursion test Mahe Tardy
2026-05-15 21:00   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260515202358.20252C2BCB0@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=mahe.tardy@gmail.com \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox