From: Mahe Tardy
To: bpf@vger.kernel.org
Cc: martin.lau@linux.dev, daniel@iogearbox.net, john.fastabend@gmail.com,
	ast@kernel.org, andrii@kernel.org, yonghong.song@linux.dev,
	jordan@jrife.io, netdev@vger.kernel.org,
	netfilter-devel@vger.kernel.org, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, Mahe Tardy
Subject: [PATCH bpf-next v7 3/7] bpf: add bpf_icmp_send kfunc
Date: Tue, 26 May 2026 15:37:04 +0000
Message-Id: <20260526153708.279717-4-mahe.tardy@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20260526153708.279717-1-mahe.tardy@gmail.com>
References: <20260526153708.279717-1-mahe.tardy@gmail.com>
X-Mailing-List: netdev@vger.kernel.org

This is needed in the context of Tetragon to provide improved feedback
(in contrast to just dropping packets) to east-west traffic when blocked
by policies using cgroup_skb programs. We also extend this kfunc to tc
programs as a convenience.

This reuses concepts from the netfilter reject target codepath, with the
following differences:
* Packets are cloned, since the BPF user can still let the packet pass
  (SK_PASS from the cgroup_skb progs, for example) and the current skb
  needs to stay untouched (cgroup_skb hooks only allow read-only access
  to the skb payload).
* We protect against recursion, since the kfunc, by generating an ICMP
  error message, could retrigger the BPF prog that invoked it.

For now, we support cgroup_skb and tc program types.

For cgroup_skb and tc egress, almost everything should work as is.
However, for tc ingress:
- The packet will not be routed yet: we need to set the net device for
  icmp_send, hence the call to ip[6]_route_reply_fill_dst.
- Fragments could trigger the hook: icmp_send will only reply to
  fragment 0.
- We ensure the IP header is linearized before processing, and zero out
  the skb control block after cloning to prevent
  icmp_send()/icmpv6_send() from misinterpreting garbage data as IP
  options.

Only ICMP_DEST_UNREACH and ICMPV6_DEST_UNREACH are currently supported.
The interface accepts a type parameter to facilitate future extension to
other ICMP control message types.

Signed-off-by: Mahe Tardy
---
 net/core/filter.c | 109 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 109 insertions(+)

diff --git a/net/core/filter.c b/net/core/filter.c
index 9590877b0714..6db0bdd71c6f 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -84,6 +84,8 @@
 #include
 #include
 #include
+#include
+#include
 
 #include "dev.h"
 
@@ -12464,6 +12466,101 @@ __bpf_kfunc int bpf_xdp_pull_data(struct xdp_md *x, u32 len)
 	return 0;
 }
 
+/**
+ * bpf_icmp_send - Send an ICMP control message
+ * @skb_ctx: Packet that triggered the control message
+ * @type: ICMP type (only ICMP_DEST_UNREACH/ICMPV6_DEST_UNREACH supported)
+ * @code: ICMP code (0-15 for IPv4, 0-6 for IPv6)
+ *
+ * Sends an ICMP control message in response to the packet. The original packet
+ * is cloned before sending the ICMP message, so the BPF program can still let
+ * the packet pass if desired.
+ *
+ * Currently only ICMP_DEST_UNREACH (IPv4) and ICMPV6_DEST_UNREACH (IPv6) are
+ * supported.
+ *
+ * Return: 0 on success, negative error code on failure:
+ *         -EINVAL: Invalid code parameter
+ *         -EBADMSG: Packet too short or malformed
+ *         -ENOMEM: Memory allocation failed
+ *         -EBUSY: Recursion detected
+ *         -EHOSTUNREACH: Routing failed
+ *         -EPROTONOSUPPORT: Non-IP protocol
+ *         -EOPNOTSUPP: Unsupported ICMP type
+ */
+__bpf_kfunc int bpf_icmp_send(struct __sk_buff *skb_ctx, int type, int code)
+{
+	struct sk_buff *skb = (struct sk_buff *)skb_ctx;
+	struct sk_buff *nskb;
+	struct sock *sk;
+
+	sk = skb_to_full_sk(skb);
+	if (sk && sk->sk_kern_sock &&
+	    (sk->sk_protocol == IPPROTO_ICMP || sk->sk_protocol == IPPROTO_ICMPV6))
+		return -EBUSY;
+
+	switch (skb->protocol) {
+#if IS_ENABLED(CONFIG_INET)
+	case htons(ETH_P_IP):
+		if (type != ICMP_DEST_UNREACH)
+			return -EOPNOTSUPP;
+		if (code < 0 || code > NR_ICMP_UNREACH)
+			return -EINVAL;
+
+		nskb = skb_clone(skb, GFP_ATOMIC);
+		if (!nskb)
+			return -ENOMEM;
+
+		if (!pskb_network_may_pull(nskb, sizeof(struct iphdr))) {
+			kfree_skb(nskb);
+			return -EBADMSG;
+		}
+
+		if (!skb_dst(nskb) && ip_route_reply_fill_dst(nskb) < 0) {
+			kfree_skb(nskb);
+			return -EHOSTUNREACH;
+		}
+
+		memset(IPCB(nskb), 0, sizeof(struct inet_skb_parm));
+
+		icmp_send(nskb, type, code, 0);
+		consume_skb(nskb);
+		break;
+#endif
+#if IS_ENABLED(CONFIG_IPV6)
+	case htons(ETH_P_IPV6):
+		if (type != ICMPV6_DEST_UNREACH)
+			return -EOPNOTSUPP;
+		if (code < 0 || code > ICMPV6_REJECT_ROUTE)
+			return -EINVAL;
+
+		nskb = skb_clone(skb, GFP_ATOMIC);
+		if (!nskb)
+			return -ENOMEM;
+
+		if (!pskb_network_may_pull(nskb, sizeof(struct ipv6hdr))) {
+			kfree_skb(nskb);
+			return -EBADMSG;
+		}
+
+		if (!skb_dst(nskb) && ip6_route_reply_fill_dst(nskb) < 0) {
+			kfree_skb(nskb);
+			return -EHOSTUNREACH;
+		}
+
+		memset(IP6CB(nskb), 0, sizeof(struct inet6_skb_parm));
+
+		icmpv6_send(nskb, type, code, 0);
+		consume_skb(nskb);
+		break;
+#endif
+	default:
+		return -EPROTONOSUPPORT;
+	}
+
+	return 0;
+}
+
 __bpf_kfunc_end_defs();
 
 int bpf_dynptr_from_skb_rdonly(struct __sk_buff *skb, u64 flags,
@@ -12506,6 +12603,10 @@ BTF_KFUNCS_START(bpf_kfunc_check_set_sock_ops)
 BTF_ID_FLAGS(func, bpf_sock_ops_enable_tx_tstamp)
 BTF_KFUNCS_END(bpf_kfunc_check_set_sock_ops)
 
+BTF_KFUNCS_START(bpf_kfunc_check_set_icmp_send)
+BTF_ID_FLAGS(func, bpf_icmp_send)
+BTF_KFUNCS_END(bpf_kfunc_check_set_icmp_send)
+
 static const struct btf_kfunc_id_set bpf_kfunc_set_skb = {
 	.owner = THIS_MODULE,
 	.set = &bpf_kfunc_check_set_skb,
@@ -12536,6 +12637,11 @@ static const struct btf_kfunc_id_set bpf_kfunc_set_sock_ops = {
 	.set = &bpf_kfunc_check_set_sock_ops,
 };
 
+static const struct btf_kfunc_id_set bpf_kfunc_set_icmp_send = {
+	.owner = THIS_MODULE,
+	.set = &bpf_kfunc_check_set_icmp_send,
+};
+
 static int __init bpf_kfunc_init(void)
 {
 	int ret;
@@ -12557,6 +12663,9 @@ static int __init bpf_kfunc_init(void)
 	ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_CGROUP_SOCK_ADDR,
 					       &bpf_kfunc_set_sock_addr);
 	ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_kfunc_set_tcp_reqsk);
+	ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_CGROUP_SKB, &bpf_kfunc_set_icmp_send);
+	ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_kfunc_set_icmp_send);
+	ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_ACT, &bpf_kfunc_set_icmp_send);
 	return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SOCK_OPS, &bpf_kfunc_set_sock_ops);
 }
 late_initcall(bpf_kfunc_init);
-- 
2.34.1
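
Note for readers (not part of the patch): the argument checks in
bpf_icmp_send() can be modelled in plain userspace C, which may help
when reasoning about which (protocol, type, code) combinations return
which error. This is only a sketch under stated assumptions: the name
model_icmp_send_check() is hypothetical, the EtherType is passed in host
byte order, both CONFIG_INET and CONFIG_IPV6 are assumed enabled, and
the recursion, clone, pull and routing steps (-EBUSY, -ENOMEM, -EBADMSG,
-EHOSTUNREACH) are left out. The constants are redefined locally with
the values used by the Linux UAPI headers.

```c
#include <assert.h>
#include <errno.h>

/* Local copies of the Linux UAPI values; redefined so the sketch is
 * self-contained and does not depend on kernel headers. */
#define ETH_P_IP            0x0800
#define ETH_P_IPV6          0x86DD
#define ICMP_DEST_UNREACH   3
#define NR_ICMP_UNREACH     15
#define ICMPV6_DEST_UNREACH 1
#define ICMPV6_REJECT_ROUTE 6

/* Hypothetical userspace model of the type/code validation only.
 * eth_proto is the EtherType in host byte order (assumption).
 * Returns 0 if the kfunc would go on to clone and send, otherwise the
 * negative errno that the validation step would return. */
int model_icmp_send_check(unsigned int eth_proto, int type, int code)
{
	switch (eth_proto) {
	case ETH_P_IP:
		if (type != ICMP_DEST_UNREACH)
			return -EOPNOTSUPP;
		if (code < 0 || code > NR_ICMP_UNREACH)
			return -EINVAL;
		return 0;
	case ETH_P_IPV6:
		if (type != ICMPV6_DEST_UNREACH)
			return -EOPNOTSUPP;
		if (code < 0 || code > ICMPV6_REJECT_ROUTE)
			return -EINVAL;
		return 0;
	default:
		return -EPROTONOSUPPORT;
	}
}

/* Usage sketch: one accepted case and one rejected case per branch. */
void model_icmp_send_selfcheck(void)
{
	assert(model_icmp_send_check(ETH_P_IP, ICMP_DEST_UNREACH, 3) == 0);
	assert(model_icmp_send_check(ETH_P_IP, ICMP_DEST_UNREACH, 16) == -EINVAL);
	assert(model_icmp_send_check(ETH_P_IPV6, ICMPV6_DEST_UNREACH, 7) == -EINVAL);
	assert(model_icmp_send_check(0x0806, ICMP_DEST_UNREACH, 0) == -EPROTONOSUPPORT);
}
```

The type check runs before the code check, so an unsupported type with
an out-of-range code reports -EOPNOTSUPP rather than -EINVAL, matching
the order of the checks in the patch.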