All of lore.kernel.org
 help / color / mirror / Atom feed
From: sashiko-bot@kernel.org
To: "Mahe Tardy" <mahe.tardy@gmail.com>
Cc: bpf@vger.kernel.org
Subject: Re: [PATCH bpf-next v5 1/4] bpf: add bpf_icmp_send kfunc
Date: Fri, 15 May 2026 20:23:57 +0000	[thread overview]
Message-ID: <20260515202358.20252C2BCB0@smtp.kernel.org> (raw)
In-Reply-To: <20260515194746.50920-2-mahe.tardy@gmail.com>

Thank you for your contribution! Sashiko AI review found 3 potential issue(s) to consider:
- [Critical] Calling pskb_network_may_pull() on the original skb before cloning modifies the SKB layout, causing BPF packet pointer Use-After-Free and panicking on shared SKBs.
- [High] IPv4 ICMP generation fails silently in TC ingress due to a missing routing table entry.
- [Medium] Unsafe use of a per-CPU recursion protection flag without disabling preemption or bottom-halves causes spurious failures.
--

commit a8e749b3f92e491c40d461bbeec717cff3d92cd2
Author: Mahe Tardy <mahe.tardy@gmail.com>
One line subject: bpf: add bpf_icmp_send kfunc

This commit adds the bpf_icmp_send kfunc, allowing cgroup_skb and tc
programs to generate ICMP error messages instead of silently dropping
packets. It clones the skb to preserve the original buffer, checks bounds
with pskb_network_may_pull, and relies on icmp_send or icmpv6_send to
transmit the error message.

> diff --git a/net/core/filter.c b/net/core/filter.c
> index 9590877b0714f..a1fe20b45f29a 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c

[ ... ]

> @@ -12464,6 +12466,95 @@ __bpf_kfunc int bpf_xdp_pull_data(struct xdp_md *x, u32 len)
>  	return 0;
>  }
>  
> +static DEFINE_PER_CPU(bool, bpf_icmp_send_in_progress);
> +
> +/**
> + * bpf_icmp_send - Send an ICMP control message
> + * @skb: Packet that triggered the control message
> + * @type: ICMP type (only ICMP_DEST_UNREACH/ICMPV6_DEST_UNREACH supported)

[ ... ]

> +__bpf_kfunc int bpf_icmp_send(struct __sk_buff *__skb, int type, int code)
> +{
> +	struct sk_buff *skb = (struct sk_buff *)__skb;
> +	struct sk_buff *nskb;
> +	bool *in_progress;
> +
> +	in_progress = this_cpu_ptr(&bpf_icmp_send_in_progress);
> +	if (*in_progress)
> +		return -EBUSY;

Since cgroup_skb egress programs execute in process context, they can be
interrupted by network RX softirqs on the same CPU.

If a task sets *in_progress to true and is immediately preempted by a softirq
that runs a TC ingress program calling bpf_icmp_send(), will it read the same
per-CPU flag and spuriously fail with -EBUSY?

Does this critical section need to run with bottom-halves disabled?

> +
> +	switch (skb->protocol) {
> +#if IS_ENABLED(CONFIG_INET)
> +	case htons(ETH_P_IP):
> +		if (type != ICMP_DEST_UNREACH)
> +			return -EOPNOTSUPP;
> +		if (code < 0 || code > NR_ICMP_UNREACH)
> +			return -EINVAL;
> +
> +		if (!pskb_network_may_pull(skb, sizeof(struct iphdr)))
> +			return -EBADMSG;

Does calling pskb_network_may_pull() on the original skb before cloning modify
the packet layout? If the headers span into paged fragments, this could
trigger reallocation of the linear data buffer via pskb_expand_head().

Since this kfunc lacks flags like KF_INVALIDATE_PKT_PTRS, could the BPF
verifier allow programs to retain and dereference stale direct packet
pointers?

Additionally, if the original skb is shared (which is common in tc and
cgroup_skb), would pskb_expand_head() hit a BUG_ON(skb_shared(skb)) and panic
the kernel? Should the packet be cloned first before pulling headers?

> +
> +		nskb = skb_clone(skb, GFP_ATOMIC);
> +		if (!nskb)
> +			return -ENOMEM;
> +
> +		memset(IPCB(nskb), 0, sizeof(struct inet_skb_parm));
> +
> +		*in_progress = true;
> +		icmp_send(nskb, type, code, 0);

The commit message notes this is "mostly for tc ingress progs". Since TC
ingress happens before IP routing, skb_rtable(nskb) will be NULL here.

Looking at __icmp_send(), it checks for a routing entry and aborts early if
missing:

net/ipv4/icmp.c:__icmp_send() {
        ...
        rt = skb_rtable(skb_in);
        if (!rt)
                goto out;
        ...
}

Will this cause IPv4 ICMP generation to silently fail for TC ingress programs?

> +		*in_progress = false;
> +		kfree_skb(nskb);
> +		break;
> +#endif

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260515194746.50920-1-mahe.tardy@gmail.com?part=1

  reply	other threads:[~2026-05-15 20:23 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-15 19:47 [PATCH bpf-next v5 0/4] bpf: add icmp_send kfunc Mahe Tardy
2026-05-15 19:47 ` [PATCH bpf-next v5 1/4] bpf: add bpf_icmp_send kfunc Mahe Tardy
2026-05-15 20:23   ` sashiko-bot [this message]
2026-05-15 19:47 ` [PATCH bpf-next v5 2/4] selftests/bpf: add bpf_icmp_send kfunc tests Mahe Tardy
2026-05-15 20:36   ` sashiko-bot
2026-05-15 19:47 ` [PATCH bpf-next v5 3/4] selftests/bpf: add bpf_icmp_send kfunc IPv6 tests Mahe Tardy
2026-05-15 20:44   ` sashiko-bot
2026-05-15 19:47 ` [PATCH bpf-next v5 4/4] selftests/bpf: add bpf_icmp_send recursion test Mahe Tardy
2026-05-15 21:00   ` sashiko-bot
2026-05-16  9:54   ` kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260515202358.20252C2BCB0@smtp.kernel.org \
    --to=sashiko-bot@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=mahe.tardy@gmail.com \
    --cc=sashiko-reviews@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.