From: Martin KaFai Lau <martin.lau@linux.dev>
To: Mahe Tardy <mahe.tardy@gmail.com>
Cc: alexei.starovoitov@gmail.com, andrii@kernel.org, ast@kernel.org,
	bpf@vger.kernel.org, coreteam@netfilter.org,
	daniel@iogearbox.net, fw@strlen.de, john.fastabend@gmail.com,
	netdev@vger.kernel.org, netfilter-devel@vger.kernel.org,
	oe-kbuild-all@lists.linux.dev, pablo@netfilter.org,
	lkp@intel.com
Subject: Re: [PATCH bpf-next v3 3/4] bpf: add bpf_icmp_send_unreach cgroup_skb kfunc
Date: Tue, 29 Jul 2025 16:13:06 -0700
Message-ID: <df4b0996-3e88-4ea4-983b-82866455a6fc@linux.dev>
In-Reply-To: <aIidIq2EM--Ugp6f@gmail.com>

On 7/29/25 3:06 AM, Mahe Tardy wrote:
> On Mon, Jul 28, 2025 at 06:05:26PM -0700, Martin KaFai Lau wrote:
>> On 7/28/25 2:43 AM, Mahe Tardy wrote:
>>> This is needed in the context of Tetragon to provide improved feedback
>>> (in contrast to just dropping packets) to east-west traffic when blocked
>>> by policies using cgroup_skb programs.
>>>
>>> This reuses concepts from the netfilter reject target codepath, with the
>>> differences that:
>>> * Packets are cloned since the BPF user can still return SK_PASS from
>>>     the cgroup_skb progs and the current skb needs to stay untouched
>>
>> This needs more details. Which field(s) of the skb are changed by the kfunc,
>> the skb_dst_set in ip[6]_route_reply_fetch_dst() and/or the code path in the
>> icmp[v6]_send() ?
> 
> Okay I can add that: "ip[6]_route_reply_fetch_dst sets the dst of the skb
> by using the saddr as a daddr and routing it". I don't think
> icmp[v6]_send touches the skb?

I also don't think icmp[v6]_send touches the skb. I am still not sure if 
ip[6]_route_reply_fetch_dst is needed.

> 
>>
>>>     (cgroup_skb hooks only allow read-only skb payload).
>>> * Since cgroup_skb programs are called late in the stack, checksums do
>>>     not need to be computed or verified, and IPv4 fragmentation does not
>>>     need to be checked (ip_local_deliver should take care of that
>>>     earlier).
>>>
>>> Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
>>> ---
>>>    net/core/filter.c | 61 +++++++++++++++++++++++++++++++++++++++++++++++
>>>    1 file changed, 61 insertions(+)
>>>
>>> diff --git a/net/core/filter.c b/net/core/filter.c
>>> index 7a72f766aacf..050872324575 100644
>>> --- a/net/core/filter.c
>>> +++ b/net/core/filter.c
>>> @@ -85,6 +85,10 @@
>>>    #include <linux/un.h>
>>>    #include <net/xdp_sock_drv.h>
>>>    #include <net/inet_dscp.h>
>>> +#include <linux/icmp.h>
>>> +#include <net/icmp.h>
>>> +#include <net/route.h>
>>> +#include <net/ip6_route.h>
>>>
>>>    #include "dev.h"
>>>
>>> @@ -12148,6 +12152,53 @@ __bpf_kfunc int bpf_sock_ops_enable_tx_tstamp(struct bpf_sock_ops_kern *skops,
>>>    	return 0;
>>>    }
>>>
>>> +__bpf_kfunc int bpf_icmp_send_unreach(struct __sk_buff *__skb, int code)
>>> +{
>>> +	struct sk_buff *skb = (struct sk_buff *)__skb;
>>> +	struct sk_buff *nskb;
>>> +
>>> +	switch (skb->protocol) {
>>> +	case htons(ETH_P_IP):
>>> +		if (code < 0 || code > NR_ICMP_UNREACH)
>>> +			return -EINVAL;
>>> +
>>> +		nskb = skb_clone(skb, GFP_ATOMIC);
>>> +		if (!nskb)
>>> +			return -ENOMEM;
>>> +
>>> +		if (ip_route_reply_fetch_dst(nskb) < 0) {
>>> +			kfree_skb(nskb);
>>> +			return -EHOSTUNREACH;
>>> +		}
>>> +
>>> +		icmp_send(nskb, ICMP_DEST_UNREACH, code, 0);
>>> +		kfree_skb(nskb);
>>> +		break;
>>> +#if IS_ENABLED(CONFIG_IPV6)
>>> +	case htons(ETH_P_IPV6):
>>> +		if (code < 0 || code > ICMPV6_REJECT_ROUTE)
>>> +			return -EINVAL;
>>> +
>>> +		nskb = skb_clone(skb, GFP_ATOMIC);
>>> +		if (!nskb)
>>> +			return -ENOMEM;
>>> +
>>> +		if (ip6_route_reply_fetch_dst(nskb) < 0) {
>>
>> From a very quick look at icmpv6_send(), it does its own route lookup. I
>> haven't looked at the v4 yet.
>>
>> I am likely missing some details. Can you explain why it needs to do a
>> lookup before calling icmpv6_send()?
> 
> From my understanding, I need to do this to invert the daddr with the
> saddr to send the unreach message back to the sender.

Looking at how fl6.{daddr,saddr} are filled and passed to
icmpv6_route_lookup() in icmpv6_send(), icmpv6_send() should already do the
reverse route lookup. I also don't see icmpv6_send() using the skb_dst() of
the original skb. Did I misread the code? Does the kfunc not work without
ip[6]_route_reply_fetch_dst()? Again, I have not checked the v4 icmp_send().
fwiw, the selftest should have both v4 and v6 tests.
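
To make the point about the reversed lookup concrete, here is a minimal
user-space sketch. The struct and function names are hypothetical stand-ins,
not the kernel's types, and the real icmpv6_send() does more than this. It
only illustrates that a reply lookup key can be derived by reading the
offending packet's header, without first setting a dst on the skb.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; not struct ipv6hdr or struct flowi6. */
struct pkt_hdr6 {
	uint8_t saddr[16];	/* source of the offending packet */
	uint8_t daddr[16];	/* destination of the offending packet */
};

struct reply_flow6 {
	uint8_t saddr[16];	/* source the error reply would use */
	uint8_t daddr[16];	/* destination the error reply is routed to */
};

/*
 * Build the lookup key for the error reply by swapping the addresses of
 * the offending packet. The header is only read, never modified.
 */
static void build_reply_flow6(const struct pkt_hdr6 *hdr,
			      struct reply_flow6 *fl)
{
	assert(hdr && fl);
	memset(fl, 0, sizeof(*fl));
	memcpy(fl->daddr, hdr->saddr, sizeof(fl->daddr));
	memcpy(fl->saddr, hdr->daddr, sizeof(fl->saddr));
}
```

If the real code path behaves like this sketch, the reply is routed back to
the sender from the header alone, which is why the extra
ip[6]_route_reply_fetch_dst() call may be unnecessary.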

Note that at cgroup/egress, the skb->_skb_refdst should have been set.

The same should be true at cgroup/ingress for inet protocols, but it seems
BPF_CGROUP_RUN_PROG_INET_INGRESS is no longer called only from INET code
paths; e.g. sk_filter() can be called from af_netlink. That looks like a bug.

> 
>>
>>> +			kfree_skb(nskb);
>>> +			return -EHOSTUNREACH;
>>> +		}
>>> +
>>> +		icmpv6_send(nskb, ICMPV6_DEST_UNREACH, code, 0);
>>> +		kfree_skb(nskb);
>>> +		break;
>>> +#endif
>>> +	default:
>>> +		return -EPROTONOSUPPORT;
>>> +	}
>>> +
>>> +	return SK_DROP;
>>> +}
>>> +
> 


Thread overview: 55+ messages
2025-07-10 10:26 [PATCH bpf-next v1 0/4] bpf: add icmp_send_unreach kfunc Mahe Tardy
2025-07-10 10:26 ` [PATCH bpf-next v1 1/4] net: move netfilter nf_reject_fill_skb_dst to core ipv4 Mahe Tardy
2025-07-10 10:26 ` [PATCH bpf-next v1 2/4] net: move netfilter nf_reject6_fill_skb_dst to core ipv6 Mahe Tardy
2025-07-10 22:02   ` kernel test robot
2025-07-10 10:26 ` [PATCH bpf-next v1 3/4] bpf: add bpf_icmp_send_unreach cgroup_skb kfunc Mahe Tardy
2025-07-10 16:07   ` Alexei Starovoitov
2025-07-11 10:57     ` Mahe Tardy
2025-07-25 18:53     ` [PATCH bpf-next v1 0/4] bpf: add icmp_send_unreach kfunc Mahe Tardy
2025-07-25 18:53       ` [PATCH bpf-next v2 1/4] net: move netfilter nf_reject_fill_skb_dst to core ipv4 Mahe Tardy
2025-07-25 18:53       ` [PATCH bpf-next v2 2/4] net: move netfilter nf_reject6_fill_skb_dst to core ipv6 Mahe Tardy
2025-07-25 18:53       ` [PATCH bpf-next v2 3/4] bpf: add bpf_icmp_send_unreach cgroup_skb kfunc Mahe Tardy
2025-07-27  1:49         ` kernel test robot
2025-07-28  9:43           ` [PATCH bpf-next v3 0/4] bpf: add icmp_send_unreach kfunc Mahe Tardy
2025-07-28  9:43             ` [PATCH bpf-next v3 1/4] net: move netfilter nf_reject_fill_skb_dst to core ipv4 Mahe Tardy
2025-07-28  9:43             ` [PATCH bpf-next v3 2/4] net: move netfilter nf_reject6_fill_skb_dst to core ipv6 Mahe Tardy
2025-07-28  9:43             ` [PATCH bpf-next v3 3/4] bpf: add bpf_icmp_send_unreach cgroup_skb kfunc Mahe Tardy
2025-07-28 20:10               ` kernel test robot
2025-07-29  1:05               ` Martin KaFai Lau
2025-07-29 10:06                 ` Mahe Tardy
2025-07-29 23:13                   ` Martin KaFai Lau [this message]
2025-07-28  9:43             ` [PATCH bpf-next v3 4/4] selftests/bpf: add icmp_send_unreach kfunc tests Mahe Tardy
2025-07-28 15:40               ` Yonghong Song
2025-07-28 15:59                 ` Mahe Tardy
2025-07-29  1:18               ` Martin KaFai Lau
2025-07-29  9:09                 ` Mahe Tardy
2025-07-29 23:27                   ` Martin KaFai Lau
2025-07-30  0:01                     ` Martin KaFai Lau
2025-07-30  0:32                       ` Martin KaFai Lau
2025-08-05 23:26               ` Jordan Rife
2025-07-29  1:21             ` [PATCH bpf-next v3 0/4] bpf: add icmp_send_unreach kfunc Martin KaFai Lau
2025-07-29  9:53               ` Mahe Tardy
2025-07-30  1:54                 ` Martin KaFai Lau
2025-08-01 18:50                   ` Mahe Tardy
2026-04-20 10:58                     ` [PATCH bpf-next v4 0/6] " Mahe Tardy
2026-04-20 10:58                       ` [PATCH bpf-next v4 1/6] net: move netfilter nf_reject_fill_skb_dst to core ipv4 Mahe Tardy
2026-04-20 11:36                         ` bot+bpf-ci
2026-04-20 13:04                           ` Mahe Tardy
2026-04-21 11:13                         ` sashiko-bot
2026-04-20 10:58                       ` [PATCH bpf-next v4 2/6] net: move netfilter nf_reject6_fill_skb_dst to core ipv6 Mahe Tardy
2026-04-21 11:13                         ` sashiko-bot
2026-04-20 10:58                       ` [PATCH bpf-next v4 3/6] bpf: add bpf_icmp_send_unreach kfunc Mahe Tardy
2026-04-20 11:36                         ` bot+bpf-ci
2026-04-20 13:07                           ` Mahe Tardy
2026-04-21 11:13                         ` sashiko-bot
2026-04-20 10:58                       ` [PATCH bpf-next v4 4/6] selftests/bpf: add icmp_send_unreach kfunc tests Mahe Tardy
2026-04-20 11:36                         ` bot+bpf-ci
2026-04-20 13:08                           ` Mahe Tardy
2026-04-21 11:13                         ` sashiko-bot
2026-04-20 10:58                       ` [PATCH bpf-next v4 5/6] selftests/bpf: add icmp_send_unreach kfunc IPv6 tests Mahe Tardy
2026-04-21 11:13                         ` sashiko-bot
2026-04-20 10:58                       ` [PATCH bpf-next v4 6/6] selftests/bpf: add icmp_send_unreach_recursion test Mahe Tardy
2026-04-21 11:13                         ` sashiko-bot
2025-07-25 18:53       ` [PATCH bpf-next v2 4/4] selftests/bpf: add icmp_send_unreach kfunc tests Mahe Tardy
2025-07-11  0:32   ` [PATCH bpf-next v1 3/4] bpf: add bpf_icmp_send_unreach cgroup_skb kfunc kernel test robot
2025-07-10 10:26 ` [PATCH bpf-next v1 4/4] selftests/bpf: add icmp_send_unreach kfunc tests Mahe Tardy
