* Re: [PATCH] MAINTAINERS: Orphan SUNPLUS ETHERNET DRIVER
From: patchwork-bot+netdevbpf @ 2026-06-24 2:20 UTC
To: Wells Lu
Cc: kuba, netdev, linux-kernel, shital.gandhi45, andrew, davem,
edumazet, pabeni, horms, shitalkumar.gandhi
In-Reply-To: <20260622180721.28334-1-wellslutw@gmail.com>
Hello:
This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Tue, 23 Jun 2026 02:07:21 +0800 you wrote:
> I have left Sunplus and no longer have access to the relevant hardware
> to test or maintain this driver. Mark the driver as orphaned.
>
> Signed-off-by: Wells Lu <wellslutw@gmail.com>
> ---
> MAINTAINERS | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
Here is the summary with links:
- MAINTAINERS: Orphan SUNPLUS ETHERNET DRIVER
https://git.kernel.org/netdev/net/c/89adcf17ee7a
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
* Re: [PATCH net] net: au1000: move free_irq out of the close-time spinlocked section
From: patchwork-bot+netdevbpf @ 2026-06-24 2:20 UTC
To: Runyu Xiao
Cc: andrew+netdev, davem, edumazet, kuba, pabeni, netdev,
linux-kernel, stable
In-Reply-To: <20260619151816.1144289-1-runyu.xiao@seu.edu.cn>
Hello:
This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Fri, 19 Jun 2026 23:18:16 +0800 you wrote:
> au1000_close() calls free_irq() while aup->lock is still held with
> spin_lock_irqsave(). free_irq() can sleep because it takes the IRQ
> descriptor request mutex, so it does not belong inside the close-time
> spinlocked section.
>
> This was found by our static analysis tool and then confirmed by manual
> review of the in-tree au1000_close() .ndo_stop path. The reviewed path
> keeps aup->lock held across the MAC reset, queue stop and
> free_irq(dev->irq, dev).
>
> [...]
Here is the summary with links:
- [net] net: au1000: move free_irq out of the close-time spinlocked section
https://git.kernel.org/netdev/net/c/f48763beab4e
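The locking rule behind this fix can be sketched in userspace. This is a minimal model, not the applied patch: every `model_*` name is hypothetical, a flag stands in for aup->lock, and a counter records each may-sleep call made while the lock is held. It assumes the fix simply moves the free_irq() call after the unlock; the actual commit may order the steps differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these names exist in the kernel. */
struct model_dev {
    bool lock_held;         /* stands in for aup->lock being held */
    bool irq_registered;    /* stands in for a requested IRQ line */
    bool queue_stopped;     /* stands in for the stopped TX queue */
    int atomic_sleep_bugs;  /* may-sleep calls made with the lock held */
};

static void model_lock(struct model_dev *d)   { d->lock_held = true; }
static void model_unlock(struct model_dev *d) { d->lock_held = false; }

/* Stand-in for free_irq(): it may sleep, so it must not run under the lock. */
static void model_free_irq(struct model_dev *d)
{
    assert(d != NULL);
    if (d->lock_held)
        d->atomic_sleep_bugs++;
    d->irq_registered = false;
}

/* Ordering described in the report: free_irq() inside the locked section. */
static void model_close_buggy(struct model_dev *d)
{
    model_lock(d);
    d->queue_stopped = true;
    model_free_irq(d);
    model_unlock(d);
}

/* Assumed fixed ordering: drop the lock first, then release the IRQ. */
static void model_close_fixed(struct model_dev *d)
{
    model_lock(d);
    d->queue_stopped = true;
    model_unlock(d);
    model_free_irq(d);
}
```

In the model, the buggy ordering records one violation per close and the fixed ordering records none, while both leave the queue stopped and the IRQ released.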
* Re: [PATCH net v4] net: dsa: Fix skb ownership in taggers
From: Jakub Kicinski @ 2026-06-24 2:15 UTC
To: linusw
Cc: Jakub Kicinski, andrew, olteanv, davem, edumazet, pabeni, horms,
florian.fainelli, jonas.gorski, hauke, kurt, woojung.huh,
UNGLinuxDriver, chester.a.unal, daniel, matthias.bgg,
angelogioacchino.delregno, wei.fang, xiaoning.wang, clement.leger,
george.mccollister, mmyangfl, netdev, sashiko-bot
In-Reply-To: <20260622-dsa-fix-free-skb-v4-1-7aea01bf4036@kernel.org>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net: dsa: Fix skb ownership in taggers
This change reworks the DSA tagger ->xmit() and ->rcv() ownership rules
so that taggers consume the skb on failure rather than leaving the drop
to dsa_user_xmit() and dsa_switch_rcv(). It removes the kfree_skb()
calls in those two callers and adds matching kfree_skb() in every
tagger error path that retains skb ownership.
Closes: https://lore.kernel.org/r/20260610153952.1685895-1-kuba@kernel.org/
> diff --git a/net/dsa/tag.c b/net/dsa/tag.c
> index 79ad105902d9..107e93250b94 100644
> --- a/net/dsa/tag.c
> +++ b/net/dsa/tag.c
> @@ -79,15 +79,16 @@ static int dsa_switch_rcv(struct sk_buff *skb, struct net_device *dev,
>         if (likely(skb->dev)) {
>             dsa_default_offload_fwd_mark(skb);
>             nskb = skb;
> +       } else {
> +           /* Just drop the skb if we can't find the user */
> +           kfree_skb(skb);
>         }
>     } else {
>         nskb = cpu_dp->rcv(skb, dev);
>     }
>
> -   if (!nskb) {
> -       kfree_skb(skb);
> +   if (!nskb)
>         return 0;
> -   }
[Critical]
While the ->rcv() path in dsa_switch_rcv() is now consistent with the
new "consumer drops on failure" rule, the dsa_software_vlan_untag()
call later in the same function appears to still follow the old rule.
Can that produce a double-free?
The relevant block, unchanged by this patch, is:
    if (unlikely(cpu_dp->ds->untag_bridge_pvid ||
                 cpu_dp->ds->untag_vlan_aware_bridge_pvid)) {
        nskb = dsa_software_vlan_untag(skb);
        if (!nskb) {
            kfree_skb(skb);
            return 0;
        }
        skb = nskb;
    }
The only NULL return inside dsa_software_vlan_untag() is the
skb_vlan_untag() path in net/dsa/tag.h:
    if (!skb_vlan_tag_present(skb) && skb->protocol == htons(proto)) {
        skb = skb_vlan_untag(skb);
        if (!skb)
            return NULL;
    }
skb_vlan_untag() in net/core/skbuff.c already frees on every error
path before returning NULL:
    err_free:
        kfree_skb(skb);
        return NULL;
That is reachable on a malformed VLAN-tagged frame, for example
through the pskb_may_pull(skb, VLAN_HLEN + sizeof(unsigned short))
check, the skb_share_check() failure, or skb_reorder_vlan_header().
So when dsa_software_vlan_untag() returns NULL, the original skb
has already been freed by skb_vlan_untag(), and the kfree_skb(skb)
in dsa_switch_rcv() then operates on freed memory. Is this the
same anti-pattern the commit message describes as "conceptually
wrong: the pattern elsewhere in the networking code is that
consumers drop their skb:s on failure", and should this call site
also be converted (drop the kfree_skb(skb) here, since the helper
already consumes on failure)?
The path is reachable from any driver that sets
ds->untag_bridge_pvid or ds->untag_vlan_aware_bridge_pvid.
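The suspected double-free can be illustrated with a small userspace model. This is only a sketch under stated assumptions: all `model_*` names are hypothetical, and "freeing" merely bumps a counter so that a second free can be observed safely. The helper consumes the buffer on failure, as skb_vlan_untag() is described as doing above. The two callers then contrast the old rule (the caller frees on NULL) with the new one (the consumer has already dropped the buffer).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for an skb; free_count > 1 models a double-free. */
struct model_skb {
    int free_count;
    bool malformed;   /* forces the untag helper down its error path */
};

static void model_kfree_skb(struct model_skb *skb)
{
    assert(skb != NULL);
    skb->free_count++;
}

/* Consumer-drops-on-failure helper, modelled on the description above. */
static struct model_skb *model_vlan_untag(struct model_skb *skb)
{
    if (skb->malformed) {
        model_kfree_skb(skb);
        return NULL;
    }
    return skb;
}

/* Old rule: the caller frees on NULL, which is a second free here. */
static int model_rcv_old(struct model_skb *skb)
{
    struct model_skb *nskb = model_vlan_untag(skb);

    if (!nskb) {
        model_kfree_skb(skb);
        return 0;
    }
    return 1;
}

/* New rule: the helper already consumed the buffer, so just bail out. */
static int model_rcv_new(struct model_skb *skb)
{
    struct model_skb *nskb = model_vlan_untag(skb);

    if (!nskb)
        return 0;
    return 1;
}
```

With a malformed buffer, the old-rule caller ends with a free count of two and the new-rule caller with one, which is the difference the review asks about.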
--
pw-bot: cr
* Re: [PATCH bpf-next v8 3/7] bpf: add bpf_icmp_send kfunc
From: Emil Tsalapatis @ 2026-06-24 2:09 UTC
To: Mahe Tardy, bpf
Cc: andrii, ast, daniel, edumazet, john.fastabend, jordan, kuba,
martin.lau, netdev, netfilter-devel, pabeni, yonghong.song
In-Reply-To: <20260622120515.137082-4-mahe.tardy@gmail.com>
On Mon Jun 22, 2026 at 8:05 AM EDT, Mahe Tardy wrote:
> This is needed in the context of Tetragon to provide improved feedback
> (in contrast to just dropping packets) to east-west traffic when blocked
> by policies using cgroup_skb programs. We also extend this kfunc to tc
> program as a convenience.
>
> This reuses concepts from netfilter reject target codepath with the
> differences that:
> * Packets are cloned since the BPF user can still let the packet pass
> (SK_PASS from the cgroup_skb progs for example) and the current skb
> needs to stay untouched (cgroup_skb hooks only allow read-only skb
> payload).
> * We protect against recursion since the kfunc, by generating an ICMP
> error message, could retrigger the BPF prog that invoked it.
>
> For now, we support cgroup_skb and tc program types. For cgroup_skb and
> tc egress, almost everything should be good. However for tc ingress:
> - packet will not be routed yet: need to set the net device for
> icmp_send, thus the call to ip[6]_route_reply_fill_dst.
> - fragments could trigger hook: icmp_send will only reply to fragment 0.
> - ensure the IP header is linearized before processing, and zero out
> the SKB control block after cloning to prevent icmp_send()/icmpv6_send()
> from misinterpreting garbage data as IP options.
>
> Only ICMP_DEST_UNREACH and ICMPV6_DEST_UNREACH are currently supported.
> The interface accepts a type parameter to facilitate future extension to
> other ICMP control message types.
>
> Reviewed-by: Jordan Rife <jordan@jrife.io>
> Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
> ---
> net/core/filter.c | 109 ++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 109 insertions(+)
>
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 2e96b4b847ce..fc69a14650e4 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -84,6 +84,8 @@
> #include <linux/un.h>
> #include <net/xdp_sock_drv.h>
> #include <net/inet_dscp.h>
> +#include <linux/icmpv6.h>
> +#include <net/icmp.h>
>
> #include "dev.h"
>
> @@ -12546,6 +12548,101 @@ __bpf_kfunc int bpf_xdp_pull_data(struct xdp_md *x, u32 len)
> return 0;
> }
>
> +/**
> + * bpf_icmp_send - Send an ICMP control message
> + * @skb_ctx: Packet that triggered the control message
> + * @type: ICMP type (only ICMP_DEST_UNREACH/ICMPV6_DEST_UNREACH supported)
> + * @code: ICMP code (0-15 for IPv4, 0-6 for IPv6)
> + *
> + * Sends an ICMP control message in response to the packet. The original packet
> + * is cloned before sending the ICMP message, so the BPF program can still let
> + * the packet pass if desired.
> + *
> + * Currently only ICMP_DEST_UNREACH (IPv4) and ICMPV6_DEST_UNREACH (IPv6) are
> + * supported.
> + *
> + * Return: 0 on success, negative error code on failure:
> + * -EINVAL: Invalid code parameter
> + * -EBADMSG: Packet too short or malformed
> + * -ENOMEM: Memory allocation failed
> + * -EBUSY: Recursion detected
> + * -EHOSTUNREACH: Routing failed
> + * -EPROTONOSUPPORT: Non-IP protocol
> + * -EOPNOTSUPP: Unsupported ICMP type
> + */
> +__bpf_kfunc int bpf_icmp_send(struct __sk_buff *skb_ctx, int type, int code)
> +{
> + struct sk_buff *skb = (struct sk_buff *)skb_ctx;
> + struct sk_buff *nskb;
> + struct sock *sk;
> +
> + sk = skb_to_full_sk(skb);
> + if (sk && sk->sk_kern_sock &&
> + (sk->sk_protocol == IPPROTO_ICMP || sk->sk_protocol == IPPROTO_ICMPV6))
> + return -EBUSY;
> +
> + switch (skb->protocol) {
> +#if IS_ENABLED(CONFIG_INET)
> + case htons(ETH_P_IP):
> + if (type != ICMP_DEST_UNREACH)
> + return -EOPNOTSUPP;
> + if (code < 0 || code > NR_ICMP_UNREACH)
> + return -EINVAL;
> +
> + nskb = skb_clone(skb, GFP_ATOMIC);
> + if (!nskb)
> + return -ENOMEM;
> +
> + if (!pskb_network_may_pull(nskb, sizeof(struct iphdr))) {
> + kfree_skb(nskb);
> + return -EBADMSG;
> + }
> +
> + if (!skb_dst(nskb) && ip_route_reply_fill_dst(nskb) < 0) {
> + kfree_skb(nskb);
> + return -EHOSTUNREACH;
> + }
> +
> + memset(IPCB(nskb), 0, sizeof(struct inet_skb_parm));
> +
> + icmp_send(nskb, type, code, 0);
> + consume_skb(nskb);
> + break;
> +#endif
> +#if IS_ENABLED(CONFIG_IPV6)
> + case htons(ETH_P_IPV6):
> + if (type != ICMPV6_DEST_UNREACH)
> + return -EOPNOTSUPP;
> + if (code < 0 || code > ICMPV6_REJECT_ROUTE)
> + return -EINVAL;
> +
> + nskb = skb_clone(skb, GFP_ATOMIC);
> + if (!nskb)
> + return -ENOMEM;
> +
> + if (!pskb_network_may_pull(nskb, sizeof(struct ipv6hdr))) {
Minor nit: this can also fail with SKB_DROP_REASON_NOMEM. That is only
possible if the IP header is not in the linear area, which may well be
impossible (?), but do we want to distinguish the two cases with
pskb_network_may_pull_reason()?
> + kfree_skb(nskb);
> + return -EBADMSG;
> + }
> +
> + if (!skb_dst(nskb) && ip6_route_reply_fill_dst(nskb) < 0) {
> + kfree_skb(nskb);
> + return -EHOSTUNREACH;
> + }
> +
> + memset(IP6CB(nskb), 0, sizeof(struct inet6_skb_parm));
> +
> + icmpv6_send(nskb, type, code, 0);
> + consume_skb(nskb);
> + break;
> +#endif
> + default:
> + return -EPROTONOSUPPORT;
> + }
> +
> + return 0;
> +}
> +
> __bpf_kfunc_end_defs();
>
> int bpf_dynptr_from_skb_rdonly(struct __sk_buff *skb, u64 flags,
> @@ -12588,6 +12685,10 @@ BTF_KFUNCS_START(bpf_kfunc_check_set_sock_ops)
> BTF_ID_FLAGS(func, bpf_sock_ops_enable_tx_tstamp)
> BTF_KFUNCS_END(bpf_kfunc_check_set_sock_ops)
>
> +BTF_KFUNCS_START(bpf_kfunc_check_set_icmp_send)
> +BTF_ID_FLAGS(func, bpf_icmp_send)
> +BTF_KFUNCS_END(bpf_kfunc_check_set_icmp_send)
> +
> static const struct btf_kfunc_id_set bpf_kfunc_set_skb = {
> .owner = THIS_MODULE,
> .set = &bpf_kfunc_check_set_skb,
> @@ -12618,6 +12719,11 @@ static const struct btf_kfunc_id_set bpf_kfunc_set_sock_ops = {
> .set = &bpf_kfunc_check_set_sock_ops,
> };
>
> +static const struct btf_kfunc_id_set bpf_kfunc_set_icmp_send = {
> + .owner = THIS_MODULE,
> + .set = &bpf_kfunc_check_set_icmp_send,
> +};
> +
> static int __init bpf_kfunc_init(void)
> {
> int ret;
> @@ -12639,6 +12745,9 @@ static int __init bpf_kfunc_init(void)
> ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_CGROUP_SOCK_ADDR,
> &bpf_kfunc_set_sock_addr);
> ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_kfunc_set_tcp_reqsk);
> + ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_CGROUP_SKB, &bpf_kfunc_set_icmp_send);
> + ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_CLS, &bpf_kfunc_set_icmp_send);
> + ret = ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SCHED_ACT, &bpf_kfunc_set_icmp_send);
Based on Sashiko's feedback, and since we mostly care about cgroup_skb,
should we just make it exclusive to cgroup_skb programs and drop CLS_ACT?
> return ret ?: register_btf_kfunc_id_set(BPF_PROG_TYPE_SOCK_OPS, &bpf_kfunc_set_sock_ops);
> }
> late_initcall(bpf_kfunc_init);
> --
> 2.34.1
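The argument checks in the quoted kfunc, and the distinction raised in the inline comment about pull failures, can be modelled in userspace as follows. This is a sketch, not the kernel code: the `MODEL_*` constants are local copies of the Linux uapi values so the block builds on its own, the protocol is taken in host byte order for simplicity, and the `model_pull_result` enum with its errno mapping is a hypothetical illustration of reporting an allocation failure separately from a short packet.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Local copies of uapi values so the sketch builds in userspace. */
#define MODEL_ETH_P_IP             0x0800
#define MODEL_ETH_P_IPV6           0x86DD
#define MODEL_ICMP_DEST_UNREACH    3
#define MODEL_NR_ICMP_UNREACH      15
#define MODEL_ICMPV6_DEST_UNREACH  1
#define MODEL_ICMPV6_REJECT_ROUTE  6

/* Mirrors the type/code checks of the quoted kfunc (host byte order). */
static int model_icmp_send_check(uint16_t proto, int type, int code)
{
    switch (proto) {
    case MODEL_ETH_P_IP:
        if (type != MODEL_ICMP_DEST_UNREACH)
            return -EOPNOTSUPP;
        if (code < 0 || code > MODEL_NR_ICMP_UNREACH)
            return -EINVAL;
        return 0;
    case MODEL_ETH_P_IPV6:
        if (type != MODEL_ICMPV6_DEST_UNREACH)
            return -EOPNOTSUPP;
        if (code < 0 || code > MODEL_ICMPV6_REJECT_ROUTE)
            return -EINVAL;
        return 0;
    default:
        return -EPROTONOSUPPORT;
    }
}

/* Hypothetical outcome of a header pull that reports why it failed. */
enum model_pull_result {
    MODEL_PULL_OK,
    MODEL_PULL_TOO_SMALL,
    MODEL_PULL_NOMEM,
};

/* One possible mapping: only a short packet is reported as -EBADMSG. */
static int model_pull_errno(enum model_pull_result res)
{
    assert(res >= MODEL_PULL_OK && res <= MODEL_PULL_NOMEM);
    switch (res) {
    case MODEL_PULL_OK:
        return 0;
    case MODEL_PULL_NOMEM:
        return -ENOMEM;
    default:
        return -EBADMSG;
    }
}
```

Whether the extra distinction is worth it depends on whether the allocation-failure case is reachable at all, which is the open question in the review.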
* Re: [PATCH 0/3] SM8450 IPA support
From: Esteban Urrutia @ 2026-06-24 1:57 UTC
To: Alex Elder, Bjorn Andersson, Konrad Dybcio, Rob Herring,
Krzysztof Kozlowski, Conor Dooley, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alex Elder
Cc: linux-arm-msm, devicetree, linux-kernel, netdev
In-Reply-To: <959db395-ae71-4a50-bd46-ac5add545a52@riscstar.com>
On 6/23/26 11:56 AM, Alex Elder wrote:
> I assume you have implemented this based on what you found in
> some downstream code. And if so, could you please indicate
> where to find that (so I can do some cross-referencing myself).
> I no longer have access to any Qualcomm internal documentation.
Hello. Yes, that is the case. Here is what I used:
1. My personal findings regarding IPA:
https://gist.github.com/esteuwu/bd49ed67ed9290f41612bdae1cacb5bc
Note that these may contain errors, since I mostly got here by
cross-checking values.
2. SM8450 downstream device tree:
https://github.com/LineageOS/android_kernel_qcom_sm8450-devicetrees/blob/lineage-20/qcom/waipio.dtsi#L3304
3. SM8475 downstream device tree:
https://github.com/LineageOS/android_kernel_qcom_sm8450-devicetrees/blob/lineage-20/qcom/cape.dtsi#L2624
It's worth mentioning that the IPA SRAM size differs between SM8450 and
SM8475, so I used the smaller SRAM size to support SM8475 as well. That is
why I included SM8475's downstream device tree too.
4. SM8450/SM8475 downstream IPA driver:
https://github.com/LineageOS/android_kernel_qcom_sm8450-modules/tree/lineage-20/qcom/opensource/dataipa
Most of my cross-checking came from the source code in this folder.
Finally, for some values such as qmap, aggregation, tre_count and
event_count, I had to cross-check against the folder where all the
ipa_data-vX.Y.c files reside, since I couldn't find any reference to these
values in downstream code.
Regards,
Esteban
* Re: [PATCH 1/3] arm64: dts: qcom: sm8450: Add IPA support
From: Esteban Urrutia @ 2026-06-24 1:52 UTC
To: Konrad Dybcio, Bjorn Andersson, Konrad Dybcio, Rob Herring,
Krzysztof Kozlowski, Conor Dooley, Andrew Lunn, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Alex Elder
Cc: linux-arm-msm, devicetree, linux-kernel, netdev
In-Reply-To: <806046b2-20ed-437e-a7e6-b3c0699f5a2d@oss.qualcomm.com>
On 6/23/26 5:37 AM, Konrad Dybcio wrote:
> size = 0xb0000 for the RAM and uC regions that the driver seems
> to poke at (at a glance anyway..)
Sorry, I don't quite understand. Could you please clarify?
> base=0x1468_0000
> size=0x40_000
Noted, will fix in v2.
Regards,
Esteban
* Re: [PATCH bpf 1/2] bpf, sockmap: Don't leak UDP socks on lookup-bind-release
From: Jiayuan Chen @ 2026-06-24 1:36 UTC
To: Michal Luczaj, John Fastabend, Jakub Sitnicki, Jiayuan Chen,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Alexei Starovoitov, Cong Wang, Daniel Borkmann,
Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Martin KaFai Lau, Song Liu, Yonghong Song, Jiri Olsa,
Emil Tsalapatis, Shuah Khan
Cc: netdev, bpf, linux-kernel, linux-kselftest
In-Reply-To: <20260623-sockmap-lookup-udp-leak-v1-1-05804f9308e4@rbox.co>
On 6/24/26 2:03 AM, Michal Luczaj wrote:
> UDP sockets get SOCK_RCU_FREE set when (auto-)bound. This means
> sk_is_refcounted(unbound) = true, while sk_is_refcounted(bound) = false.
>
> Because sockmap accepts unbound UDP sockets, a BPF program can increment a
> socket's refcount via lookup. If the socket is subsequently bound, the
> transition from unbound to bound causes bpf_sk_release() to skip the
> decrement of the refcount, causing a memory leak.
>
> unreferenced object 0xffff88810bc2eb40 (size 1984):
> comm "test_progs", pid 2451, jiffies 4295320596
> hex dump (first 32 bytes):
> 7f 00 00 01 7f 00 00 01 d2 04 1b b7 04 d2 00 00 ................
> 02 00 01 40 00 00 00 00 00 00 00 00 00 00 00 00 ...@............
> backtrace (crc bdee079d):
> kmem_cache_alloc_noprof+0x557/0x660
> sk_prot_alloc+0x69/0x240
> sk_alloc+0x30/0x460
> inet_create+0x2ce/0xf80
> __sock_create+0x25b/0x5c0
> __sys_socket+0x119/0x1d0
> __x64_sys_socket+0x72/0xd0
> do_syscall_64+0xa1/0x5f0
> entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> Maintain balanced refcounts across sk lookup/release: (re-)set
> SOCK_RCU_FREE on proto update to treat the socket (whether bound or
> unbound) as not requiring a refcount increment on (a RCU protected) lookup.
>
> Fixes: 0c48eefae712 ("sock_map: Lift socket state restriction for datagram sockets")
> Signed-off-by: Michal Luczaj <mhal@rbox.co>
Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>
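The refcount imbalance in the quoted changelog can be reproduced in a few lines of userspace C. This is a simplified model and not the kernel logic: all `model_*` names are hypothetical, one boolean stands in for SOCK_RCU_FREE, and the "is refcounted" test is reduced to that flag alone. The `fixed` parameter models the described remedy of setting the flag when the socket's proto is updated on insertion into the map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for a UDP socket. */
struct model_sock {
    int refcnt;
    bool rcu_free;   /* stands in for the SOCK_RCU_FREE flag */
};

/* Simplified: a socket needs a reference only while the flag is clear. */
static bool model_is_refcounted(const struct model_sock *sk)
{
    assert(sk != NULL);
    return !sk->rcu_free;
}

static void model_lookup(struct model_sock *sk)
{
    if (model_is_refcounted(sk))
        sk->refcnt++;
}

static void model_release(struct model_sock *sk)
{
    if (model_is_refcounted(sk))
        sk->refcnt--;
}

/* Binding sets the flag, which flips the answer of model_is_refcounted(). */
static void model_bind(struct model_sock *sk)
{
    sk->rcu_free = true;
}

/* Proto update on map insertion; with the fix, the flag is set up front. */
static void model_map_update(struct model_sock *sk, bool fixed)
{
    if (fixed)
        sk->rcu_free = true;
}

/* Runs lookup, bind, release and returns the number of leaked references. */
static int model_lookup_bind_release(bool fixed)
{
    struct model_sock sk = { .refcnt = 1, .rcu_free = false };

    model_map_update(&sk, fixed);
    model_lookup(&sk);
    model_bind(&sk);
    model_release(&sk);
    return sk.refcnt - 1;
}
```

Without the fix the sequence leaks one reference, because lookup and release disagree on whether the socket is refcounted. With the flag set at map update time, both sides agree and nothing leaks.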
* Re: [PATCH bpf-next v2] bpf, unix: Guard sk_msg-dependent code behind CONFIG_NET_SOCK_MSG
From: Jiayuan Chen @ 2026-06-24 1:32 UTC
To: Alexei Starovoitov, Jakub Sitnicki
Cc: Amery Hung, Kuniyuki Iwashima, bpf, Alexei Starovoitov,
Daniel Borkmann, Jakub Kicinski, John Fastabend,
Network Development, kernel-team
In-Reply-To: <CAADnVQKr1XisnigNsBw7CsXxY3Xn5KOGtX_YDdXmNMZyJy4_Cw@mail.gmail.com>
On 6/24/26 5:26 AM, Alexei Starovoitov wrote:
> On Tue, Jun 23, 2026 at 1:36 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>> On Tue, Jun 23, 2026 at 01:22 PM -07, Amery Hung wrote:
>>> On Tue, Jun 23, 2026 at 1:04 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>>>> On Tue, Jun 23, 2026 at 12:33 PM -07, Alexei Starovoitov wrote:
>>>>> On Tue, Jun 23, 2026 at 12:31 PM Kuniyuki Iwashima <kuniyu@google.com> wrote:
>>>>>> On Tue, Jun 23, 2026 at 12:21 PM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>>>>>>> On Tue, Jun 23, 2026 at 09:08 AM -07, Kuniyuki Iwashima wrote:
>>>>>>>> On Tue, Jun 23, 2026 at 4:20 AM Jakub Sitnicki <jakub@cloudflare.com> wrote:
>>>>>>>>> Prepare to decouple BPF_SYSCALL config option from NET_SOCK_MSG. When
>>>>>>>>> completed all code paths related to sockmap-based redirects should be
>>>>>>>>> guarded by BPF_SYSCALL && NET_SOCK_MSG to allow users to opt out by
>>>>>>>>> disabling NET_SOCK_MSG. The implementation of sockmap as a container for
>>>>>>>>> socket references would remain under BPF_SYSCALL.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
>>>>>>>>> ---
>>>>>>>>> Changes in v2:
>>>>>>>>> - Handle prot->recvmsg being NULL (Sashiko)
>>>>>>>>> - Elaborate on the end goal in description
>>>>>>>>> - Link to v1: https://patch.msgid.link/20260622-bpf-sk_msg-split-unix-v1-1-d7e0cb7bb03b@cloudflare.com
>>>>>>>>> ---
>>>>>>>>> net/unix/af_unix.c | 4 ++--
>>>>>>>>> net/unix/unix_bpf.c | 6 ++++++
>>>>>>>>> 2 files changed, 8 insertions(+), 2 deletions(-)
>>>>>>>>>
>>>>>>>>> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
>>>>>>>>> index f7a9d55eee8a..84c11c60c75f 100644
>>>>>>>>> --- a/net/unix/af_unix.c
>>>>>>>>> +++ b/net/unix/af_unix.c
>>>>>>>>> @@ -2675,7 +2675,7 @@ static int unix_dgram_recvmsg(struct socket *sock, struct msghdr *msg, size_t si
>>>>>>>>> #ifdef CONFIG_BPF_SYSCALL
>>>>>>>>> const struct proto *prot = READ_ONCE(sk->sk_prot);
>>>>>>>>>
>>>>>>>>> - if (prot != &unix_dgram_proto)
>>>>>>>>> + if (prot->recvmsg)
>>>>>>>> There is no reason to have this dead branch when
>>>>>>>> CONFIG_BPF_SYSCALL && !NET_SOCK_MSG.
>>>>>>>>
>>>>>>>> Let's compile out all sockmap code when both configs
>>>>>>>> are not enabled.
>>>>>>>>
>>>>>>>> Since AF_UNIX differs from TCP/UDP, it can take the
>>>>>>>> simpler approach.
>>>>>>> Okay, will put the whole file behind hidden config option like so:
>>>>>>>
>>>>>>> --- a/net/unix/Kconfig
>>>>>>> +++ b/net/unix/Kconfig
>>>>>>> @@ -30,3 +30,8 @@ config UNIX_DIAG
>>>>>>> help
>>>>>>> Support for UNIX socket monitoring interface used by the ss tool.
>>>>>>> If unsure, say Y.
>>>>>>> +
>>>>>>> +config UNIX_BPF
>>>>>> Maybe UNIX_BPF_SOCKMAP or something.
>>>>>> bpf_iter is supported without this config.
>>>>> I don't like where it's going.
>>>>> I strongly dislike new config knobs.
>>>>> I'd rather remove existing knobs.
>>>>> What is the motivation?
>>>> The goal is to compile out sockmap bits that use sk_msg.
>>>> NET_SOCK_MSG is a natural, existing candidate.
>>>> New knob wasn't my idea.
>>> I'm also missing the big picture here.
>>>
>>> sockmap already holds socket references today. You can store and look
>>> up sockets without attaching any verdict/parser program, and no
>>> redirect happens. So if the goal is to use sockmap purely as a socket
>>> container without the sk_msg fast-path overhead, what does a
>>> compile-time NET_SOCK_MSG knob add over the runtime checks?
>> Sure, let me clarify. It's about the maintenance overhead.
>>
>> sockmap-based redirects are a rather niche feature with few users, for
>> which we've been getting quite a few bug reports since AI came along.
>>
>> We're not using it internally at Cloudflare, so I don't really have a
>> good reason to justify time spent on these bug reports.
>>
>> Hence the move to put sockmap-based redirect behind a config option,
>> which you can enable at your own risk. Or which we can deprecate, but
>> that's not really my call.
Hi Alexei and Jakub,
skmsg is actually still pretty useful for gateways.
I started with bpf by integrating skmsg into nginx as a module, and envoy
has something similar.
The usual setup is cgroup/sk for L4 bypass (reject SYN), and skmsg for L7,
redirecting between local apps by looking at the payload. So there are
real users.
> This is wishful thinking that a config knob will stop
> the bug reports.
> Just disable it for real instead.
About the AI bug reports - yeah, I've seen them too. I think they just
come from the complexity of networking plus how programmable bpf is.
Reviewing AI-written patches is often painful: the commit message is
frequently wrong, and once it took me a whole day just to reproduce and
confirm the issue. But I do believe these reports will converge eventually.
>>> I am also not sure if NET_SOCK_MSG is right. It is broader than
>>> "sockmap redirect". It is selected by TLS and {INET,INET6}_ESPINTCP.
>>> Because those select it, it can't be toggled independently.
>> Once the sockmap redirect bits are behind _some_ config option, it will
>> be easy to replace it with a more granular one that depends on
>> NET_SOCK_MSG. But we're not there yet. One step at a time.
> No. That's not workable.
>
>>> Could you share the concrete use case you have in mind, and whether
>>> this came out of an earlier discussion or thread upstream?
>> This is a follow up from discussions at BPF summit with Alexei & John.
> Not quite. The discussion was to disable pieces of sockmap
> that are causing trouble.
> Not to move them under config knobs, but disable them.
Agreed, just like removing skmsg from KTLS, which is rarely used.
I think the motivation of this patch - making the boundary between skmsg
and sockmap clear - is worthwhile.
I hope skmsg does not end up disabled by default.
I don't work on that upper-layer software anymore, but I really don't
want my ex-colleagues to upgrade their kernel some day, find the feature
I wrote broken, and come curse me :) (selfish)
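For reference, the dispatch that the quoted v2 hunk touches can be modelled in userspace like this. It is a sketch under assumptions, not the af_unix code: all `model_*` names are hypothetical, and it assumes (from the v2 changelog note about prot->recvmsg being NULL) that the plain proto carries no recvmsg hook while a sockmap-replaced proto does.

```c
#include <assert.h>
#include <stddef.h>

struct model_sock;
typedef int (*model_recvmsg_fn)(struct model_sock *sk);

/* Userspace stand-ins for struct proto and struct sock. */
struct model_proto {
    model_recvmsg_fn recvmsg;   /* NULL when no override is installed */
};

struct model_sock {
    const struct model_proto *prot;
};

/* Return codes only identify which path ran: 1 = default, 2 = override. */
static int model_default_recvmsg(struct model_sock *sk) { (void)sk; return 1; }
static int model_sockmap_recvmsg(struct model_sock *sk) { (void)sk; return 2; }

static const struct model_proto model_plain_proto = { .recvmsg = NULL };
static const struct model_proto model_sockmap_proto = {
    .recvmsg = model_sockmap_recvmsg,
};

/* Dispatch on the presence of a hook instead of comparing proto pointers. */
static int model_dgram_recvmsg(struct model_sock *sk)
{
    const struct model_proto *prot = sk->prot;

    assert(prot != NULL);
    if (prot->recvmsg)
        return prot->recvmsg(sk);
    return model_default_recvmsg(sk);
}
```

When no code can ever install the hook, the branch is dead, which is the objection raised in the thread for the CONFIG_BPF_SYSCALL && !NET_SOCK_MSG build.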
* Re: [PATCH bpf-next v8 2/7] net: move netfilter nf_reject6_fill_skb_dst to core ipv6
From: Emil Tsalapatis @ 2026-06-24 0:16 UTC
To: Mahe Tardy, bpf
Cc: andrii, ast, daniel, edumazet, john.fastabend, jordan, kuba,
martin.lau, netdev, netfilter-devel, pabeni, yonghong.song
In-Reply-To: <20260622120515.137082-3-mahe.tardy@gmail.com>
On Mon Jun 22, 2026 at 8:05 AM EDT, Mahe Tardy wrote:
> Move and rename nf_reject6_fill_skb_dst from
> ipv6/netfilter/nf_reject_ipv6 to ip6_route_reply_fill_dst in
> ipv6/route.c so that it can be reused in the following patches by BPF
> kfuncs.
>
> Netfilter uses nf_ip6_route that is almost a transparent wrapper around
> ip6_route_output so this patch inlines it.
>
> Reviewed-by: Jordan Rife <jordan@jrife.io>
> Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
> ---
> include/net/ip6_route.h | 2 ++
> net/ipv6/netfilter/nf_reject_ipv6.c | 17 +----------------
> net/ipv6/route.c | 18 ++++++++++++++++++
> 3 files changed, 21 insertions(+), 16 deletions(-)
>
> diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h
> index 09ffe0f13ce7..eb5a60d3babe 100644
> --- a/include/net/ip6_route.h
> +++ b/include/net/ip6_route.h
> @@ -100,6 +100,8 @@ static inline struct dst_entry *ip6_route_output(struct net *net,
> return ip6_route_output_flags(net, sk, fl6, 0);
> }
>
> +int ip6_route_reply_fill_dst(struct sk_buff *skb);
> +
> /* Only conditionally release dst if flags indicates
> * !RT6_LOOKUP_F_DST_NOREF or dst is in uncached_list.
> */
> diff --git a/net/ipv6/netfilter/nf_reject_ipv6.c b/net/ipv6/netfilter/nf_reject_ipv6.c
> index ef5b7e85cffa..7d2f577e72b8 100644
> --- a/net/ipv6/netfilter/nf_reject_ipv6.c
> +++ b/net/ipv6/netfilter/nf_reject_ipv6.c
> @@ -293,21 +293,6 @@ nf_reject_ip6_tcphdr_put(struct sk_buff *nskb,
> sizeof(struct tcphdr), 0));
> }
>
> -static int nf_reject6_fill_skb_dst(struct sk_buff *skb_in)
> -{
> - struct dst_entry *dst = NULL;
> - struct flowi fl;
> -
> - memset(&fl, 0, sizeof(struct flowi));
> - fl.u.ip6.daddr = ipv6_hdr(skb_in)->saddr;
> - nf_ip6_route(dev_net(skb_in->dev), &dst, &fl, false);
> - if (!dst)
> - return -1;
> -
> - skb_dst_set(skb_in, dst);
> - return 0;
> -}
> -
> void nf_send_reset6(struct net *net, struct sock *sk, struct sk_buff *oldskb,
> int hook)
> {
> @@ -440,7 +425,7 @@ void nf_send_unreach6(struct net *net, struct sk_buff *skb_in,
> if (hooknum == NF_INET_LOCAL_OUT && skb_in->dev == NULL)
> skb_in->dev = net->loopback_dev;
>
> - if (!skb_dst(skb_in) && nf_reject6_fill_skb_dst(skb_in) < 0)
> + if (!skb_dst(skb_in) && ip6_route_reply_fill_dst(skb_in) < 0)
> return;
>
> icmpv6_send(skb_in, ICMPV6_DEST_UNREACH, code, 0);
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index 6361ad2fcf77..0fa56c801178 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -2732,6 +2732,24 @@ struct dst_entry *ip6_route_output_flags(struct net *net,
> }
> EXPORT_SYMBOL_GPL(ip6_route_output_flags);
>
> +int ip6_route_reply_fill_dst(struct sk_buff *skb)
> +{
> + struct dst_entry *result;
> + struct flowi6 fl = {
> + .daddr = ipv6_hdr(skb)->saddr
> + };
> + int err;
> +
> + result = ip6_route_output(dev_net(skb->dev), NULL, &fl);
> + err = result->error;
> + if (err)
> + dst_release(result);
> + else
> + skb_dst_set(skb, result);
> + return err;
> +}
> +EXPORT_SYMBOL_GPL(ip6_route_reply_fill_dst);
> +
> struct dst_entry *ip6_blackhole_route(struct net *net, struct dst_entry *dst_orig)
> {
> struct rt6_info *rt, *ort = dst_rt6_info(dst_orig);
> --
> 2.34.1
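The error convention used by the new IPv6 helper (a lookup that always returns a route object and reports failure through its error field) can be sketched in userspace. All `model_*` names are hypothetical, and the two static route objects only exist to make the reference counting observable; this illustrates the release-on-error pattern and does not model the routing code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for a dst entry: an error code plus a refcount. */
struct model_dst {
    int error;
    int refcnt;
};

struct model_skb {
    struct model_dst *dst;
    bool src_reachable;   /* decides which route the lookup returns */
};

static struct model_dst model_ok_route = { .error = 0, .refcnt = 0 };
static struct model_dst model_unreach_route = {
    .error = -ENETUNREACH, .refcnt = 0,
};

/* Lookup never returns NULL and always takes a reference. */
static struct model_dst *model_route_output6(bool reachable)
{
    struct model_dst *dst =
        reachable ? &model_ok_route : &model_unreach_route;

    dst->refcnt++;
    return dst;
}

static void model_dst_release(struct model_dst *dst)
{
    assert(dst->refcnt > 0);
    dst->refcnt--;
}

/* Same shape as the quoted helper: release on error, attach on success. */
static int model_route_reply_fill_dst6(struct model_skb *skb)
{
    struct model_dst *result = model_route_output6(skb->src_reachable);
    int err = result->error;

    if (err)
        model_dst_release(result);
    else
        skb->dst = result;
    return err;
}
```

The point of the pattern is that the error path must drop the reference the lookup took, otherwise every failed reply would leak one.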
* Re: [PATCH bpf-next v8 1/7] net: move netfilter nf_reject_fill_skb_dst to core ipv4
From: Emil Tsalapatis @ 2026-06-24 0:09 UTC
To: Mahe Tardy, bpf
Cc: andrii, ast, daniel, edumazet, john.fastabend, jordan, kuba,
martin.lau, netdev, netfilter-devel, pabeni, yonghong.song
In-Reply-To: <20260622120515.137082-2-mahe.tardy@gmail.com>
On Mon Jun 22, 2026 at 8:05 AM EDT, Mahe Tardy wrote:
> Move and rename nf_reject_fill_skb_dst from
> ipv4/netfilter/nf_reject_ipv4 to ip_route_reply_fill_dst in ipv4/route.c
> so that it can be reused in the following patches by BPF kfuncs.
>
> Netfilter uses nf_ip_route that is almost a transparent wrapper around
> ip_route_output_key so this patch inlines it.
>
> Reviewed-by: Jordan Rife <jordan@jrife.io>
> Signed-off-by: Mahe Tardy <mahe.tardy@gmail.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
> ---
> include/net/route.h | 1 +
> net/ipv4/netfilter/nf_reject_ipv4.c | 19 ++-----------------
> net/ipv4/route.c | 15 +++++++++++++++
> 3 files changed, 18 insertions(+), 17 deletions(-)
>
> diff --git a/include/net/route.h b/include/net/route.h
> index f90106f383c5..300d292cd9a1 100644
> --- a/include/net/route.h
> +++ b/include/net/route.h
> @@ -173,6 +173,7 @@ struct rtable *ip_route_output_flow(struct net *, struct flowi4 *flp,
> const struct sock *sk);
> struct dst_entry *ipv4_blackhole_route(struct net *net,
> struct dst_entry *dst_orig);
> +int ip_route_reply_fill_dst(struct sk_buff *skb);
>
> static inline struct rtable *ip_route_output_key(struct net *net, struct flowi4 *flp)
> {
> diff --git a/net/ipv4/netfilter/nf_reject_ipv4.c b/net/ipv4/netfilter/nf_reject_ipv4.c
> index fecf6621f679..c1c0724e4d4d 100644
> --- a/net/ipv4/netfilter/nf_reject_ipv4.c
> +++ b/net/ipv4/netfilter/nf_reject_ipv4.c
> @@ -252,21 +252,6 @@ static void nf_reject_ip_tcphdr_put(struct sk_buff *nskb, const struct sk_buff *
> nskb->csum_offset = offsetof(struct tcphdr, check);
> }
>
> -static int nf_reject_fill_skb_dst(struct sk_buff *skb_in)
> -{
> - struct dst_entry *dst = NULL;
> - struct flowi fl;
> -
> - memset(&fl, 0, sizeof(struct flowi));
> - fl.u.ip4.daddr = ip_hdr(skb_in)->saddr;
> - nf_ip_route(dev_net(skb_in->dev), &dst, &fl, false);
> - if (!dst)
> - return -1;
> -
> - skb_dst_set(skb_in, dst);
> - return 0;
> -}
> -
> /* Send RST reply */
> void nf_send_reset(struct net *net, struct sock *sk, struct sk_buff *oldskb,
> int hook)
> @@ -279,7 +264,7 @@ void nf_send_reset(struct net *net, struct sock *sk, struct sk_buff *oldskb,
> if (!oth)
> return;
>
> - if (!skb_dst(oldskb) && nf_reject_fill_skb_dst(oldskb) < 0)
> + if (!skb_dst(oldskb) && ip_route_reply_fill_dst(oldskb) < 0)
> return;
>
> if (skb_rtable(oldskb)->rt_flags & (RTCF_BROADCAST | RTCF_MULTICAST))
> @@ -352,7 +337,7 @@ void nf_send_unreach(struct sk_buff *skb_in, int code, int hook)
> if (iph->frag_off & htons(IP_OFFSET))
> return;
>
> - if (!skb_dst(skb_in) && nf_reject_fill_skb_dst(skb_in) < 0)
> + if (!skb_dst(skb_in) && ip_route_reply_fill_dst(skb_in) < 0)
> return;
>
> if (skb_csum_unnecessary(skb_in) ||
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index 3f3de5164d6e..f24609933fbe 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -2942,6 +2942,21 @@ struct rtable *ip_route_output_flow(struct net *net, struct flowi4 *flp4,
> }
> EXPORT_SYMBOL_GPL(ip_route_output_flow);
>
> +int ip_route_reply_fill_dst(struct sk_buff *skb)
> +{
> + struct rtable *rt;
> + struct flowi4 fl4 = {
> + .daddr = ip_hdr(skb)->saddr
> + };
> +
> + rt = ip_route_output_key(dev_net(skb->dev), &fl4);
> + if (IS_ERR(rt))
> + return PTR_ERR(rt);
> + skb_dst_set(skb, &rt->dst);
> + return 0;
> +}
> +EXPORT_SYMBOL_GPL(ip_route_reply_fill_dst);
> +
> /* called with rcu_read_lock held */
> static int rt_fill_info(struct net *net, __be32 dst, __be32 src,
> struct rtable *rt, u32 table_id, dscp_t dscp,
> --
> 2.34.1
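The IPv4 helper instead relies on the error-pointer convention (IS_ERR/PTR_ERR), where a failed lookup returns a negative errno encoded in the pointer value. A minimal userspace sketch of that convention follows. The `model_*` helpers are hypothetical re-implementations for illustration, and the lookup rule (address 0 is unreachable) is invented purely so that both paths can be exercised.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno in a pointer, and decode it again. */
static void *model_err_ptr(long err) { return (void *)(intptr_t)err; }
static long model_ptr_err(const void *p) { return (long)(intptr_t)p; }

/* Pointers in the top MODEL_MAX_ERRNO values are treated as errors. */
static int model_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)(-MODEL_MAX_ERRNO);
}

struct model_rtable { int id; };

struct model_skb {
    uint32_t saddr;                   /* source address of the packet */
    const struct model_rtable *dst;   /* attached route, if any */
};

static struct model_rtable model_route = { .id = 1 };

/* Invented rule for the sketch: address 0 has no route. */
static struct model_rtable *model_route_output_key(uint32_t daddr)
{
    if (daddr == 0)
        return model_err_ptr(-ENETUNREACH);
    return &model_route;
}

/* Same shape as the quoted helper: route the reply toward the source. */
static int model_route_reply_fill_dst(struct model_skb *skb)
{
    struct model_rtable *rt;

    assert(skb != NULL);
    rt = model_route_output_key(skb->saddr);
    if (model_is_err(rt))
        return (int)model_ptr_err(rt);
    skb->dst = rt;
    return 0;
}
```

Compared with the IPv6 variant, nothing has to be released on failure here, because an error pointer never carries a reference.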
* [PATCH net] nfc: nci: fix uninit-value in the RF discover/activated NTF handlers
From: Samuel Page @ 2026-06-23 23:41 UTC
To: David Heidelberg
Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, oe-linux-nfc, netdev, linux-kernel, stable
nci_rf_discover_ntf_packet() and nci_rf_intf_activated_ntf_packet() each
parse a notification into an on-stack struct (nci_rf_discover_ntf /
nci_rf_intf_activated_ntf) that is not initialised. The technology- and
activation-specific parameters are only extracted when the corresponding
length field is non-zero, so a notification that reports a zero length
leaves the relevant union uninitialised - and the handlers then read it:
  - discover: with rf_tech_specific_params_len == 0, nci_add_new_protocol()
    reads the uninitialised rf_tech_specific_params union (nfca_poll->
    nfcid1_len is used as a branch condition and a memcpy length) into
    ndev->targets;
  - activated: with rf_tech_specific_params_len == 0 the same union is read
    via nci_target_auto_activated(); with activation_params_len == 0 the
    activation_params union is read by nci_store_ats_nfc_iso_dep() into
    ndev->target_ats.
In each case the uninitialised bytes are subsequently exposed to user
space (NFC_CMD_GET_TARGET / NFC_ATTR_TARGET_ATS).
BUG: KMSAN: uninit-value in nci_add_new_protocol+0x624/0x6c0
nci_add_new_protocol+0x624/0x6c0
nci_ntf_packet+0x25b2/0x3c30
nci_rx_work+0x318/0x5d0
process_scheduled_works+0x84b/0x17a0
worker_thread+0xc10/0x11b0
kthread+0x376/0x500
Local variable ntf.i created at:
nci_ntf_packet+0xbc2/0x3c30
Zero-initialise both on-stack notifications so the unions read back as
zero when the corresponding parameters are absent.
Fixes: 019c4fbaa790 ("NFC: Add NCI multiple targets support")
Fixes: e8c0dacd9836 ("NFC: Update names and structs to NCI spec 1.0 d18")
Link: https://lore.kernel.org/netdev/20260623172109.1105965-2-horms@kernel.org/
Cc: stable@vger.kernel.org
Assisted-by: Bynario AI
Signed-off-by: Samuel Page <sam@bynar.io>
---
net/nfc/nci/ntf.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/nfc/nci/ntf.c b/net/nfc/nci/ntf.c
index c96512bb8653..274d9a4202c9 100644
--- a/net/nfc/nci/ntf.c
+++ b/net/nfc/nci/ntf.c
@@ -440,7 +440,7 @@ void nci_clear_target_list(struct nci_dev *ndev)
static int nci_rf_discover_ntf_packet(struct nci_dev *ndev,
const struct sk_buff *skb)
{
- struct nci_rf_discover_ntf ntf;
+ struct nci_rf_discover_ntf ntf = {};
const __u8 *data;
bool add_target = true;
@@ -688,7 +688,7 @@ static int nci_rf_intf_activated_ntf_packet(struct nci_dev *ndev,
const struct sk_buff *skb)
{
struct nci_conn_info *conn_info;
- struct nci_rf_intf_activated_ntf ntf;
+ struct nci_rf_intf_activated_ntf ntf = {};
const __u8 *data;
int err = NCI_STATUS_OK;
base-commit: a986fde914d88af47eb78fd29c5d1af7952c3500
--
2.54.0
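
For readers following along outside the kernel tree, the effect of the zero-initialisation can be sketched in plain userspace C. Everything named demo_* below is hypothetical and heavily simplified; it is not the NCI code. The sketch also uses memset() inside the parser for portability, whereas the patch above initialises the struct at its declaration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the NCI structures. */
struct demo_nfca_poll {
	uint8_t nfcid1_len;
	uint8_t nfcid1[10];
};

struct demo_discover_ntf {
	uint8_t rf_tech_specific_params_len;
	union {
		struct demo_nfca_poll nfca_poll;
		uint8_t raw[16];
	} rf_tech_specific_params;
};

/*
 * Assumed wire format for the sketch: one params-length byte; if it is
 * non-zero, an nfcid1 length byte followed by that many nfcid1 bytes.
 * Returns 0 on success, -1 on a truncated or oversized input.
 */
static int demo_parse_ntf(const uint8_t *data, size_t len,
			  struct demo_discover_ntf *ntf)
{
	struct demo_nfca_poll *poll = &ntf->rf_tech_specific_params.nfca_poll;

	/* Zero everything first so absent parameters read back as zero. */
	memset(ntf, 0, sizeof(*ntf));

	if (len < 1)
		return -1;
	ntf->rf_tech_specific_params_len = data[0];
	if (ntf->rf_tech_specific_params_len == 0)
		return 0;	/* union stays zeroed, nothing to extract */

	if (len < 2 || data[1] > sizeof(poll->nfcid1) ||
	    len < 2 + (size_t)data[1])
		return -1;
	poll->nfcid1_len = data[1];
	memcpy(poll->nfcid1, data + 2, poll->nfcid1_len);
	return 0;
}
```

With a zero params length, a later reader of nfcid1_len sees 0 rather than stale stack bytes, which is the property the fix relies on.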
^ permalink raw reply related
* Re: [PATCH net v2] net: usb: lan78xx: restore VLAN and hash filters after link up
From: patchwork-bot+netdevbpf @ 2026-06-23 23:30 UTC (permalink / raw)
To: Nicolai Buchwitz
Cc: Thangaraj.S, Rengarajan.S, UNGLinuxDriver, Woojung.Huh,
andrew+netdev, davem, edumazet, kuba, pabeni, schuchmann, netdev,
linux-usb, linux-kernel
In-Reply-To: <20260622102911.484045-1-nb@tipi-net.de>
Hello:
This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:
On Mon, 22 Jun 2026 12:29:11 +0200 you wrote:
> Configured VLANs intermittently stop receiving traffic after a link
> down/up cycle, e.g. when the network cable is unplugged and plugged back
> in. VLAN filtering stays enabled but all VLAN-tagged frames are dropped
> until a VLAN is added or removed again.
>
> The LAN7801 datasheet (DS00002123E) states:
>
> [...]
Here is the summary with links:
- [net,v2] net: usb: lan78xx: restore VLAN and hash filters after link up
https://git.kernel.org/netdev/net/c/5c12248673c7
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply
* Re: [PATCH 1/7] xfrm: use compat translator only for u64 alignment mismatch
From: patchwork-bot+netdevbpf @ 2026-06-23 23:30 UTC (permalink / raw)
To: Steffen Klassert; +Cc: davem, kuba, herbert, netdev
In-Reply-To: <20260622075726.29685-2-steffen.klassert@secunet.com>
Hello:
This series was applied to netdev/net.git (main)
by Steffen Klassert <steffen.klassert@secunet.com>:
On Mon, 22 Jun 2026 09:57:03 +0200 you wrote:
> From: Sanman Pradhan <psanman@juniper.net>
>
> The XFRM compat layer (CONFIG_XFRM_USER_COMPAT) translates 32-bit xfrm
> netlink and setsockopt messages into the native 64-bit layout. It is
> only needed on architectures where the 32-bit and 64-bit ABIs disagree
> on u64 alignment, which the kernel encodes as COMPAT_FOR_U64_ALIGNMENT.
>
> [...]
Here is the summary with links:
- [1/7] xfrm: use compat translator only for u64 alignment mismatch
https://git.kernel.org/netdev/net/c/355fbcbdc253
- [2/7] net: af_key: initialize alg_key_len for IPComp states
https://git.kernel.org/netdev/net/c/d129c3177d7b
- [3/7] xfrm: Fix dev use-after-free in xfrm async resumption
https://git.kernel.org/netdev/net/c/8045c0df98d4
- [4/7] xfrm: Fix xfrm state cache insertion race
https://git.kernel.org/netdev/net/c/ddd3d0132920
- [5/7] xfrm: annotate data-races around xfrm_policy_count[] and xfrm_policy_default[]
https://git.kernel.org/netdev/net/c/68de007d5ac9
- [6/7] espintcp: use sk_msg_free_partial to fix partial send
https://git.kernel.org/netdev/net/c/007800408002
- [7/7] xfrm: validate selector family and prefixlen during match
https://git.kernel.org/netdev/net/c/40f0b1047918
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply
* Re: [PATCH bpf v4] selftests/bpf: Cover partial copy of non-linear test_run output
From: Emil Tsalapatis @ 2026-06-23 23:18 UTC (permalink / raw)
To: Sun Jian, bpf
Cc: netdev, linux-kselftest, linux-kernel, ast, daniel, andrii,
martin.lau, paul.chaignon
In-Reply-To: <20260623014027.402820-1-sun.jian.kdev@gmail.com>
On Mon Jun 22, 2026 at 9:40 PM EDT, Sun Jian wrote:
> prog_run_opts already verifies that BPF_PROG_TEST_RUN returns -ENOSPC
> for a short data_out buffer while still reporting the full output size
> through data_size_out.
>
> Add the same coverage for non-linear test_run output. Use pass-through
> TC and XDP programs with a 9000-byte packet, a 64-byte linear data area,
> and a 100-byte data_out buffer. The expected output spans both the linear
> data and the first fragment.
>
> Verify that test_run returns -ENOSPC, reports the full packet length
> through data_size_out, and copies the packet prefix into data_out for
> both non-linear skb and XDP frags paths.
>
> Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
> ---
>
> v4:
> - Send only the selftest patch; the fix patch has been applied to bpf/master.
> - Initialize data_out buffers to avoid reading uninitialized stack memory if
> bpf_prog_test_run_opts() fails unexpectedly.
>
> .../selftests/bpf/prog_tests/prog_run_opts.c | 70 +++++++++++++++++++
> .../selftests/bpf/progs/test_pkt_access.c | 12 ++++
> 2 files changed, 82 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/prog_run_opts.c b/tools/testing/selftests/bpf/prog_tests/prog_run_opts.c
> index 01f1d1b6715a..beb6fa78fd94 100644
> --- a/tools/testing/selftests/bpf/prog_tests/prog_run_opts.c
> +++ b/tools/testing/selftests/bpf/prog_tests/prog_run_opts.c
> @@ -4,6 +4,10 @@
>
> #include "test_pkt_access.skel.h"
>
> +#define NONLINEAR_PKT_LEN 9000
> +#define NONLINEAR_LINEAR_DATA_LEN 64
> +#define SHORT_OUT_LEN 100
> +
> static const __u32 duration;
>
> static void check_run_cnt(int prog_fd, __u64 run_cnt)
> @@ -20,6 +24,69 @@ static void check_run_cnt(int prog_fd, __u64 run_cnt)
> "incorrect number of repetitions, want %llu have %llu\n", run_cnt, info.run_cnt);
> }
>
> +static void init_pkt(__u8 *pkt, size_t len)
> +{
> + size_t i;
> +
> + for (i = 0; i < len; i++)
> + pkt[i] = i & 0xff;
> +}
> +
> +static void test_skb_nonlinear_data_out_partial(struct test_pkt_access *skel)
> +{
> + LIBBPF_OPTS(bpf_test_run_opts, topts);
> + __u8 pkt[NONLINEAR_PKT_LEN];
> + __u8 out[SHORT_OUT_LEN] = {};
> + struct __sk_buff skb = {};
> + int prog_fd, err;
> +
> + init_pkt(pkt, sizeof(pkt));
> +
> + skb.data_end = NONLINEAR_LINEAR_DATA_LEN;
> +
> + topts.data_in = pkt;
> + topts.data_size_in = sizeof(pkt);
> + topts.data_out = out;
> + topts.data_size_out = sizeof(out);
> + topts.ctx_in = &skb;
> + topts.ctx_size_in = sizeof(skb);
> +
> + prog_fd = bpf_program__fd(skel->progs.tc_pass_prog);
> + err = bpf_prog_test_run_opts(prog_fd, &topts);
> +
> + ASSERT_EQ(err, -ENOSPC, "skb_partial_err");
> + ASSERT_EQ(topts.data_size_out, sizeof(pkt), "skb_partial_size");
> + ASSERT_OK(memcmp(out, pkt, sizeof(out)), "skb_partial_data");
> +}
> +
> +static void test_xdp_nonlinear_data_out_partial(struct test_pkt_access *skel)
> +{
> + LIBBPF_OPTS(bpf_test_run_opts, topts);
> + __u8 pkt[NONLINEAR_PKT_LEN];
> + __u8 out[SHORT_OUT_LEN] = {};
> + struct xdp_md ctx = {};
> + int prog_fd, err;
> +
> + init_pkt(pkt, sizeof(pkt));
> +
> + ctx.data = 0;
> + ctx.data_end = NONLINEAR_LINEAR_DATA_LEN;
> +
> + topts.data_in = pkt;
> + topts.data_size_in = sizeof(pkt);
> + topts.data_out = out;
> + topts.data_size_out = sizeof(out);
> + topts.ctx_in = &ctx;
> + topts.ctx_size_in = sizeof(ctx);
> +
> + prog_fd = bpf_program__fd(skel->progs.xdp_frags_pass_prog);
> + err = bpf_prog_test_run_opts(prog_fd, &topts);
> +
> + ASSERT_EQ(err, -ENOSPC, "xdp_partial_err");
> + ASSERT_EQ(topts.data_size_out, sizeof(pkt), "xdp_partial_size");
> + ASSERT_OK(memcmp(out, pkt, sizeof(out)), "xdp_partial_data");
> +}
> +
> void test_prog_run_opts(void)
> {
> struct test_pkt_access *skel;
> @@ -69,6 +136,9 @@ void test_prog_run_opts(void)
> run_cnt += topts.repeat;
> check_run_cnt(prog_fd, run_cnt);
>
> + test_skb_nonlinear_data_out_partial(skel);
> + test_xdp_nonlinear_data_out_partial(skel);
> +
> cleanup:
> if (skel)
> test_pkt_access__destroy(skel);
> diff --git a/tools/testing/selftests/bpf/progs/test_pkt_access.c b/tools/testing/selftests/bpf/progs/test_pkt_access.c
> index bce7173152c6..cd284401eebd 100644
> --- a/tools/testing/selftests/bpf/progs/test_pkt_access.c
> +++ b/tools/testing/selftests/bpf/progs/test_pkt_access.c
> @@ -150,3 +150,15 @@ int test_pkt_access(struct __sk_buff *skb)
>
> return TC_ACT_UNSPEC;
> }
> +
> +SEC("tc")
> +int tc_pass_prog(struct __sk_buff *skb)
> +{
> + return TC_ACT_OK;
> +}
> +
> +SEC("xdp.frags")
> +int xdp_frags_pass_prog(struct xdp_md *ctx)
> +{
> + return XDP_PASS;
> +}
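
The contract the quoted commit message describes (copy what fits, report the full size, return -ENOSPC) can be modelled in a few lines of userspace C. This is only a sketch of the expected behaviour; demo_copy_out() is a hypothetical helper and not the kernel's test_run implementation.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of the contract under test: copy as much output as
 * fits into dst, always report the full output size through size_out,
 * and signal truncation with -ENOSPC.
 */
static int demo_copy_out(const uint8_t *src, size_t src_len,
			 uint8_t *dst, size_t dst_cap, size_t *size_out)
{
	size_t n = src_len < dst_cap ? src_len : dst_cap;

	memcpy(dst, src, n);		/* prefix only when dst is short */
	*size_out = src_len;		/* full length, even on truncation */
	return src_len > dst_cap ? -ENOSPC : 0;
}
```

Called with a 9000-byte source and a 100-byte destination, the model returns -ENOSPC, sets size_out to 9000 and leaves the first 100 source bytes in the destination, which matches the three assertions in each new selftest.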
^ permalink raw reply
* Re: [PATCH RFC 5/8] clk: sunxi-ng: a733: Add bus clocks support
From: Andre Przywara @ 2026-06-23 22:35 UTC (permalink / raw)
To: Junhui Liu, Michael Turquette, Stephen Boyd, Rob Herring,
Krzysztof Kozlowski, Conor Dooley, Chen-Yu Tsai, Jernej Skrabec,
Samuel Holland, Philipp Zabel, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexandre Ghiti, Richard Cochran
Cc: linux-clk, devicetree, linux-arm-kernel, linux-sunxi,
linux-kernel, linux-riscv, netdev
In-Reply-To: <20260310-a733-clk-v1-5-36b4e9b24457@pigmoral.tech>
Hi,
On 3/10/26 08:33, Junhui Liu wrote:
> Add the essential bus clocks in the Allwinner A733 CCU, including AHB,
> APB0, APB1, APB_UART, NSI, and MBUS. These buses are necessary for many
> other functional modules. Additionally clocks such as trace, gic and
> cpu_peri are also added as they fall within the register address range
> of the bus clocks, even though they are not strictly bus clocks.
>
> The MBUS clock is marked as critical to ensure the memory bus remains
> operational at all times. For the NSI and MBUS clocks, the hardware
> requires an update bit (bit 27) to be set so that the configuration
> takes effect and the updated parameters can be correctly read back.
>
> Signed-off-by: Junhui Liu <junhui.liu@pigmoral.tech>
> ---
> drivers/clk/sunxi-ng/ccu-sun60i-a733.c | 131 +++++++++++++++++++++++++++++++++
> 1 file changed, 131 insertions(+)
>
> diff --git a/drivers/clk/sunxi-ng/ccu-sun60i-a733.c b/drivers/clk/sunxi-ng/ccu-sun60i-a733.c
> index cf819504c51f..68457813dbbb 100644
> --- a/drivers/clk/sunxi-ng/ccu-sun60i-a733.c
> +++ b/drivers/clk/sunxi-ng/ccu-sun60i-a733.c
> @@ -19,6 +19,7 @@
> #include "ccu_common.h"
>
> #include "ccu_div.h"
> +#include "ccu_mp.h"
> #include "ccu_mult.h"
> #include "ccu_nkmp.h"
> #include "ccu_nm.h"
> @@ -65,6 +66,16 @@ static const struct clk_hw *pll_ref_hws[] = {
> &pll_ref_clk.common.hw
> };
>
> +/*
> + * There is a non-software-configurable mux selecting between the DCXO and the
> + * PLL_REF in hardware, whose output is fed to the sys-24M clock. Although both
> + * sys-24M and pll-ref are fixed at 24 MHz, define a 1:1 fixed factor clock to
> + * provide logical separation:
> + * - pll-ref is dedicated to feeding other PLLs
> + * - sys-24M serves as reference clock for downstream functional modules
> + */
> +static CLK_FIXED_FACTOR_HWS(sys_24M_clk, "sys-24M", pll_ref_hws, 1, 1, 0);
> +
> #define SUN60I_A733_PLL_DDR_REG 0x020
> static struct ccu_nkmp pll_ddr_clk = {
> .enable = BIT(27),
> @@ -371,6 +382,107 @@ static SUNXI_CCU_M_HWS(pll_de_4x_clk, "pll-de-4x", pll_de_hws,
> static SUNXI_CCU_M_HWS(pll_de_3x_clk, "pll-de-3x", pll_de_hws,
> SUN60I_A733_PLL_DE_REG, 16, 3, 0);
>
> +/**************************************************************************
> + * bus clocks *
> + **************************************************************************/
> +
> +static const struct clk_parent_data ahb_apb_parents[] = {
> + { .hw = &sys_24M_clk.hw },
> + { .fw_name = "losc" },
> + { .fw_name = "iosc" },
> + { .hw = &pll_periph0_600M_clk.hw },
> +};
> +
> +static SUNXI_CCU_M_DATA_WITH_MUX(ahb_clk, "ahb", ahb_apb_parents, 0x500,
> + 0, 5, /* M */
> + 24, 2, /* mux */
> + 0);
> +
> +static SUNXI_CCU_M_DATA_WITH_MUX(apb0_clk, "apb0", ahb_apb_parents, 0x510,
> + 0, 5, /* M */
> + 24, 2, /* mux */
> + 0);
> +
> +static SUNXI_CCU_M_DATA_WITH_MUX(apb1_clk, "apb1", ahb_apb_parents, 0x518,
> + 0, 5, /* M */
> + 24, 2, /* mux */
> + 0);
> +
> +static const struct clk_parent_data apb_uart_parents[] = {
> + { .hw = &sys_24M_clk.hw },
> + { .fw_name = "losc" },
> + { .fw_name = "iosc" },
> + { .hw = &pll_periph0_600M_clk.hw },
> + { .hw = &pll_periph0_480M_clk.common.hw },
> +};
> +static SUNXI_CCU_M_DATA_WITH_MUX(apb_uart_clk, "apb-uart", apb_uart_parents, 0x538,
> + 0, 5, /* M */
> + 24, 3, /* mux */
> + 0);
> +
> +static const struct clk_parent_data trace_parents[] = {
> + { .hw = &sys_24M_clk.hw },
> + { .fw_name = "losc" },
> + { .fw_name = "iosc" },
> + { .hw = &pll_periph0_300M_clk.hw },
> + { .hw = &pll_periph0_400M_clk.hw },
> +};
> +static SUNXI_CCU_M_DATA_WITH_MUX_GATE(trace_clk, "trace", trace_parents, 0x540,
> + 0, 5, /* M */
> + 24, 3, /* mux */
> + BIT(31), /* gate */
> + 0);
> +
> +static const struct clk_parent_data gic_cpu_peri_parents[] = {
> + { .hw = &sys_24M_clk.hw },
> + { .fw_name = "losc" },
> + { .hw = &pll_periph0_600M_clk.hw },
> + { .hw = &pll_periph0_480M_clk.common.hw },
> + { .hw = &pll_periph0_400M_clk.hw },
> +};
> +static SUNXI_CCU_M_DATA_WITH_MUX_GATE(gic_clk, "gic", gic_cpu_peri_parents, 0x560,
Do we really want to model the GIC clock? The A523 has one as well, and
we don't describe it there. And while the GICv3 binding describes a
clock property, the Linux driver completely ignores that.
So if I see this correctly, this clock would become unused, and would be
turned off, killing the GIC? So we would at least need a CLK_IS_CRITICAL
flag?
But it's a good reminder to lift this clock to something PLL based, in
U-Boot's SPL, because I guess the 24MHz are rather slow.
> + 0, 5, /* M */
> + 24, 3, /* mux */
> + BIT(31), /* gate */
> + 0);
> +
> +static SUNXI_CCU_M_DATA_WITH_MUX_GATE(cpu_peri_clk, "cpu-peri", gic_cpu_peri_parents, 0x568,
What is this clock about? I don't see it referenced by any peripheral in
the manual.
> + 0, 5, /* M */
> + 24, 3, /* mux */
> + BIT(31), /* gate */
> + 0);
> +
> +static const struct clk_parent_data nsi_parents[] = {
> + { .hw = &sys_24M_clk.hw },
> + { .hw = &pll_ddr_clk.common.hw },
> + { .hw = &pll_periph0_800M_clk.common.hw },
> + { .hw = &pll_periph0_600M_clk.hw },
> + { .hw = &pll_periph0_480M_clk.common.hw },
> + { .hw = &pll_de_3x_clk.common.hw },
> +};
> +static SUNXI_CCU_MP_DATA_WITH_MUX_GATE_FEAT(nsi_clk, "nsi", nsi_parents, 0x580,
Same question as for the GIC: do we need this in the kernel, and do
we need to prevent it from being turned off?
> + 0, 5, /* M */
> + 0, 0, /* no P */
> + 24, 3, /* mux */
> + BIT(31), /* gate */
> + 0, CCU_FEATURE_UPDATE_BIT);
> +
> +static const struct clk_parent_data mbus_parents[] = {
> + { .hw = &sys_24M_clk.hw },
> + { .hw = &pll_periph1_600M_clk.hw },
> + { .hw = &pll_ddr_clk.common.hw },
> + { .hw = &pll_periph1_480M_clk.common.hw },
> + { .hw = &pll_periph1_400M_clk.hw },
> + { .hw = &pll_npu_clk.common.hw },
> +};
> +static SUNXI_CCU_MP_DATA_WITH_MUX_GATE_FEAT(mbus_clk, "mbus", mbus_parents, 0x588,
> + 0, 5, /* M */
> + 0, 0, /* no P */
> + 24, 3, /* mux */
> + BIT(31), /* gate */
> + CLK_IS_CRITICAL,
> + CCU_FEATURE_UPDATE_BIT);
> +
> /*
> * Contains all clocks that are controlled by a hardware register. They
> * have a (sunxi) .common member, which needs to be initialised by the common
> @@ -407,11 +519,21 @@ static struct ccu_common *sun60i_a733_ccu_clks[] = {
> &pll_de_clk.common,
> &pll_de_4x_clk.common,
> &pll_de_3x_clk.common,
> + &ahb_clk.common,
> + &apb0_clk.common,
> + &apb1_clk.common,
> + &apb_uart_clk.common,
> + &trace_clk.common,
> + &gic_clk.common,
> + &cpu_peri_clk.common,
> + &nsi_clk.common,
> + &mbus_clk.common,
> };
>
> static struct clk_hw_onecell_data sun60i_a733_hw_clks = {
> .hws = {
> [CLK_PLL_REF] = &pll_ref_clk.common.hw,
> + [CLK_SYS_24M] = &sys_24M_clk.hw,
> [CLK_PLL_DDR] = &pll_ddr_clk.common.hw,
> [CLK_PLL_PERIPH0_4X] = &pll_periph0_4x_clk.common.hw,
> [CLK_PLL_PERIPH0_2X] = &pll_periph0_2x_clk.common.hw,
> @@ -453,6 +575,15 @@ static struct clk_hw_onecell_data sun60i_a733_hw_clks = {
> [CLK_PLL_DE] = &pll_de_clk.common.hw,
> [CLK_PLL_DE_4X] = &pll_de_4x_clk.common.hw,
> [CLK_PLL_DE_3X] = &pll_de_3x_clk.common.hw,
> + [CLK_AHB] = &ahb_clk.common.hw,
> + [CLK_APB0] = &apb0_clk.common.hw,
> + [CLK_APB1] = &apb1_clk.common.hw,
> + [CLK_APB_UART] = &apb_uart_clk.common.hw,
> + [CLK_TRACE] = &trace_clk.common.hw,
> + [CLK_GIC] = &gic_clk.common.hw,
> + [CLK_CPU_PERI] = &cpu_peri_clk.common.hw,
> + [CLK_NSI] = &nsi_clk.common.hw,
> + [CLK_MBUS] = &mbus_clk.common.hw,
> },
> .num = CLK_FANOUT3 + 1,
> };
>
^ permalink raw reply
* Re: [PATCH net] nfc: nci: fix out-of-bounds write in nci_target_auto_activated()
From: Sam P @ 2026-06-23 22:33 UTC (permalink / raw)
To: Simon Horman
Cc: david, davem, edumazet, kuba, pabeni, oe-linux-nfc, netdev,
linux-kernel, stable
In-Reply-To: <20260623172109.1105965-2-horms@kernel.org>
On 23/06/2026 19:21, Simon Horman wrote:
>> diff --git a/net/nfc/nci/ntf.c b/net/nfc/nci/ntf.c
>> index c96512bb86531..566ca839fa488 100644
>> --- a/net/nfc/nci/ntf.c
>> +++ b/net/nfc/nci/ntf.c
>> @@ -603,6 +603,12 @@ static void nci_target_auto_activated(struct nci_dev *ndev,
>> struct nfc_target *target;
>> int rc;
>>
>> + /* This is a new target, check if we've enough room */
>> + if (ndev->n_targets == NCI_MAX_DISCOVERED_TARGETS) {
>> + pr_debug("not enough room, ignoring new target...\n");
>> + return;
>
> [Severity: High]
> Does this early return cause the state machine to stall?
>
> Looking at nci_rf_intf_activated_ntf_packet(), the state is transitioned
> to NCI_POLL_ACTIVE right before this function is called:
>
> net/nfc/nci/ntf.c:nci_rf_intf_activated_ntf_packet() {
> ...
> atomic_set(&ndev->state, NCI_POLL_ACTIVE);
> if (err == NCI_STATUS_OK)
> nci_target_auto_activated(ndev, &ntf);
> ...
> }
>
> Because of the early return here, nfc_targets_found() is skipped. Does this
> leave the state wedged in NCI_POLL_ACTIVE, permanently blocking future
> attempts to stop polling since nci_stop_poll() requires the state to be
> NCI_DISCOVERY?
>
>> + }
>> +
>> target = &ndev->targets[ndev->n_targets];
>
> [ ... ]
Thanks for sharing the review.
I don't believe the early return can stall the state machine, as
nci_target_auto_activated() does not touch ndev->state.
nci_rf_intf_activated_ntf_packet() sets ndev->state = NCI_POLL_ACTIVE
before it calls nci_target_auto_activated(), so the transition has
already happened by the time the new check runs. Neither
nci_target_auto_activated() nor nfc_targets_found() modifies ndev->state;
nfc_targets_found() only clears the nfc_dev dev->polling flag under
device_lock, which is independent of the NCI state machine that
nci_stop_poll() tests. So ndev->state is NCI_POLL_ACTIVE after an
auto-activation whether the target is appended or the guard returns
early; the state behaviour is the same with or without this patch.
The check is the same n_targets == NCI_MAX_DISCOVERED_TARGETS guard the
sibling nci_add_new_target() (the RF_DISCOVER_NTF path) has used since
the Fixes: commit; it only drops the out-of-spec 11th+ target, which a
conformant controller shouldn't produce. The device still returns to
NCI_DISCOVERY/NCI_IDLE through the normal RF_DEACTIVATE path.
> [Severity: Critical]
> This isn't a bug introduced by this patch, but can the ntf union copy
> uninitialized stack memory into persistent device structures?
>
> In net/nfc/nci/ntf.c:nci_rf_discover_ntf_packet(), the local ntf struct
> is declared on the stack:
>
> struct nci_rf_discover_ntf ntf;
>
> If the incoming packet has ntf.rf_tech_specific_params_len == 0, the
> extraction routines are bypassed, leaving the rf_tech_specific_params union
> uninitialized. The struct is then passed to nci_add_new_target():
>
> if (add_target == true)
> nci_add_new_target(ndev, &ntf);
>
> This eventually calls nci_add_new_protocol(), which blindly reads from this
> union:
>
> nfca_poll = (struct rf_tech_specific_params_nfca_poll *)params;
> target->sens_res = nfca_poll->sens_res;
> target->sel_res = nfca_poll->sel_res;
> target->nfcid1_len = nfca_poll->nfcid1_len;
>
> Could this uninitialized stack garbage be copied into ndev->targets, and
> later broadcasted to userspace via Netlink in nfc_targets_found()?
I managed to reproduce this issue locally with KMSAN. I can queue up a
separate patch for it; ntf just needs to be zero-initialised.
> [Severity: Critical]
> This isn't a bug introduced by this patch, but does the general bytes length
> calculation underflow?
>
> In net/nfc/nci/ntf.c:nci_store_general_bytes_nfc_dep():
>
> ndev->remote_gb_len = min_t(__u8,
> (ntf->activation_params.listen_nfc_dep.atr_req_len -
> NFC_ATR_REQ_GT_OFFSET), NFC_ATR_REQ_GB_MAXSIZE);
>
> memcpy(ndev->remote_gb,
> (ntf->activation_params.listen_nfc_dep.atr_req +
> NFC_ATR_REQ_GT_OFFSET), ndev->remote_gb_len);
>
> The length is computed by subtracting NFC_ATR_REQ_GT_OFFSET (14) from
> atr_req_len. If atr_req_len is less than 14, this subtraction will
> underflow.
>
> Because of the min_t cast, this large underflowed value is bounded to
> NFC_ATR_REQ_GB_MAXSIZE (48). The memcpy() then copies 48 bytes starting
> from atr_req + 14 into ndev->remote_gb.
>
> Since the source buffer is part of the ntf union allocated on the stack
> in nci_rf_intf_activated_ntf_packet(), could this leak up to 48 bytes of
> uninitialized kernel stack memory to userspace?
This one is already being fixed on the list:
[PATCH net v4 1/5] nfc: nci: fix u8 underflow in
nci_store_general_bytes_nfc_dep (Lekë Hapçiu)
https://lore.kernel.org/all/20260424180151.3808557-2-snowwlake@icloud.com/
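
As an aside, the arithmetic in the quoted report is easy to check in userspace. The sketch below only mirrors the u8 truncation described above, using the two constants quoted in the review (14 and 48). The demo_* names are hypothetical, and the guarded variant is one possible shape of a check, not necessarily what the on-list fix does.

```c
#include <stdint.h>

#define DEMO_ATR_REQ_GT_OFFSET	14	/* value quoted in the review */
#define DEMO_ATR_REQ_GB_MAXSIZE	48	/* value quoted in the review */

/* Mirrors min_t(__u8, a, b): both operands are converted to u8 first. */
static uint8_t demo_min_u8(uint8_t a, uint8_t b)
{
	return a < b ? a : b;
}

/* Unchecked length computation, as described in the review. */
static uint8_t demo_gb_len_unchecked(uint8_t atr_req_len)
{
	/* For atr_req_len < 14 the difference wraps to a large u8 value. */
	return demo_min_u8((uint8_t)(atr_req_len - DEMO_ATR_REQ_GT_OFFSET),
			   DEMO_ATR_REQ_GB_MAXSIZE);
}

/* One possible guard (an assumption, not the on-list fix). */
static uint8_t demo_gb_len_checked(uint8_t atr_req_len)
{
	if (atr_req_len < DEMO_ATR_REQ_GT_OFFSET)
		return 0;
	return demo_gb_len_unchecked(atr_req_len);
}
```

With atr_req_len = 10, the difference is -4, which becomes 252 as a u8 and is then clamped to 48, so the unchecked variant asks for a 48-byte copy from a 10-byte request.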
> [Severity: High]
> This isn't a bug introduced by this patch, but does the lack of skb->len
> validation in NCI response handlers allow out-of-bounds reads?
>
> In nci_core_init_rsp_packet_v1() in net/nfc/nci/rsp.c, fields are read
> and arrays are copied from the payload:
>
> ndev->num_supported_rf_interfaces =
> rsp_1->num_supported_rf_interfaces;
>
> memcpy(ndev->supported_rf_interfaces, rsp_1->supported_rf_interfaces,
> ndev->num_supported_rf_interfaces);
>
> Similarly in nci_core_init_rsp_packet_v2():
>
> while (rf_interface_idx < ndev->num_supported_rf_interfaces) {
> ndev->supported_rf_interfaces[rf_interface_idx++] =
> *supported_rf_interface++;
> }
>
> Since the NCI control header is stripped via skb_pull(), leaving the
> payload length in skb->len, what happens if an NFCC sends a short response
> packet (e.g. plen = 1)? Could this cause the handlers to read adjacent
> kernel slab memory?
I've since submitted a fix for this issue:
[PATCH net] nfc: nci: fix uninit-value in
nci_core_init_rsp_packet()
https://lore.kernel.org/netdev/20260623222402.175798-1-sam@bynar.io/
> [Severity: High]
> This isn't a bug introduced by this patch, but is there a missing boundary
> validation when extracting activation parameters?
>
> In net/nfc/nci/ntf.c:nci_extract_activation_params_iso_dep():
>
> nfca_poll->rats_res_len = min_t(__u8, *data++, NFC_ATS_MAXSIZE);
> if (nfca_poll->rats_res_len > 0) {
> memcpy(nfca_poll->rats_res, data, nfca_poll->rats_res_len);
> }
>
> And in nci_extract_activation_params_nfc_dep():
>
> poll->atr_res_len = min_t(__u8, *data++, NFC_ATR_RES_MAXSIZE - 2);
> if (poll->atr_res_len > 0)
> memcpy(poll->atr_res, data, poll->atr_res_len);
>
> These inner lengths are bounded by a maximum size like NFC_ATS_MAXSIZE, but
> they are not validated against the remaining bytes in the SKB (skb->len).
>
> If a malformed packet provides a large inner length but a short payload,
> could this memcpy() read out of bounds into the device structure,
> potentially leaking adjacent slab memory to userspace later?
This one is already being fixed on the list too:
[PATCH v3] nfc: nci: add data_len bound checks to activation
parameter extractors (Bryam Vargas)
https://lore.kernel.org/all/20260612-b4-disp-6d52d8b0-v3-1-e26221f8826d@proton.me/
> [Severity: High]
> This isn't a bug introduced by this patch, but is there a potential
> priority inversion or timeout deadlock involving the device lock?
>
> Functions like nfc_stop_poll() in net/nfc/core.c acquire the device lock:
>
> device_lock(&dev->dev);
> ...
> if (dev->ops->stop_poll)
> dev->ops->stop_poll(dev);
>
> This eventually calls nci_request(), which synchronously waits for a
> completion signaled by the nci_rx_work thread. However, if an NTF packet
> is received first, nci_rx_work processes it and invokes
> nfc_targets_found(), which also attempts to acquire the device lock:
>
> device_lock(&dev->dev);
>
> Since the calling thread already holds the device lock, nci_rx_work blocks
> indefinitely. Because the RX worker is blocked, it cannot process the
> pending RSP, causing nci_request() to time out and fail. Could this
> deadlock the RX thread?
No patch for this one, although I'm not sure how accurate it is.
Thanks,
Sam
^ permalink raw reply
* [PATCH net] nfc: nci: fix uninit-value in nci_core_init_rsp_packet()
From: Samuel Page @ 2026-06-23 22:24 UTC (permalink / raw)
To: David Heidelberg
Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, oe-linux-nfc, netdev, linux-kernel, stable
The CORE_INIT_RSP handlers walk the response using length fields taken
from the packet itself, without checking they stay within skb->len:
- v1 computes
rsp_2 = skb->data + 6 + rsp_1->num_supported_rf_interfaces;
from the on-wire (unclamped) interface count and then dereferences
rsp_2, and memcpy()s the advertised interfaces - both can run past the
received data;
- v2 walks supported_rf_interfaces[], advancing the cursor by an
in-packet rf_extension_cnt with no bound.
A short CORE_INIT_RSP therefore makes the parser read past the packet
(into the uninitialised tail of the RX skb); the values are stored into
struct nci_dev and consumed while bringing the device up:
BUG: KMSAN: uninit-value in nci_dev_up+0x10f3/0x1720
nci_dev_up+0x10f3/0x1720
nfc_dev_up+0x187/0x380
nfc_genl_dev_up+0xdc/0x1a0
genl_rcv_msg+0x5d4/0x9e0
netlink_rcv_skb+0x28f/0x530
Uninit was stored to memory at:
nci_rsp_packet+0x68f/0x2310
nci_rx_work+0x25f/0x5d0
Uninit was created at:
__alloc_skb+0x540/0xd40
virtual_ncidev_write+0x65/0x210
Bound both parsers to skb->len before dereferencing the variable-length
parts, rejecting truncated responses with NCI_STATUS_SYNTAX_ERROR.
Fixes: 6a2968aaf50c ("NFC: basic NCI protocol implementation")
Fixes: bcd684aace34 ("net/nfc/nci: Support NCI 2.x initial sequence")
Cc: stable@vger.kernel.org
Assisted-by: Bynario AI
Signed-off-by: Samuel Page <sam@bynar.io>
---
net/nfc/nci/rsp.c | 26 ++++++++++++++++++++++++--
1 file changed, 24 insertions(+), 2 deletions(-)
diff --git a/net/nfc/nci/rsp.c b/net/nfc/nci/rsp.c
index 9eeb862825c5..cdcd23c8ca95 100644
--- a/net/nfc/nci/rsp.c
+++ b/net/nfc/nci/rsp.c
@@ -50,6 +50,9 @@ static u8 nci_core_init_rsp_packet_v1(struct nci_dev *ndev,
const struct nci_core_init_rsp_1 *rsp_1 = (void *)skb->data;
const struct nci_core_init_rsp_2 *rsp_2;
+ if (skb->len < sizeof(*rsp_1))
+ return NCI_STATUS_SYNTAX_ERROR;
+
pr_debug("status 0x%x\n", rsp_1->status);
if (rsp_1->status != NCI_STATUS_OK)
@@ -58,6 +61,15 @@ static u8 nci_core_init_rsp_packet_v1(struct nci_dev *ndev,
ndev->nfcc_features = __le32_to_cpu(rsp_1->nfcc_features);
ndev->num_supported_rf_interfaces = rsp_1->num_supported_rf_interfaces;
+ /*
+ * supported_rf_interfaces[] and the trailing nci_core_init_rsp_2 are
+ * addressed using the on-wire (unclamped) interface count, so the
+ * response must be long enough for both before they are dereferenced.
+ */
+ if (skb->len < sizeof(*rsp_1) +
+ rsp_1->num_supported_rf_interfaces + sizeof(*rsp_2))
+ return NCI_STATUS_SYNTAX_ERROR;
+
ndev->num_supported_rf_interfaces =
min((int)ndev->num_supported_rf_interfaces,
NCI_MAX_SUPPORTED_RF_INTERFACES);
@@ -88,9 +100,13 @@ static u8 nci_core_init_rsp_packet_v2(struct nci_dev *ndev,
{
const struct nci_core_init_rsp_nci_ver2 *rsp = (void *)skb->data;
const u8 *supported_rf_interface = rsp->supported_rf_interfaces;
+ const u8 *end = skb->data + skb->len;
u8 rf_interface_idx = 0;
u8 rf_extension_cnt = 0;
+ if (skb->len < sizeof(*rsp))
+ return NCI_STATUS_SYNTAX_ERROR;
+
pr_debug("status %x\n", rsp->status);
if (rsp->status != NCI_STATUS_OK)
@@ -104,10 +120,16 @@ static u8 nci_core_init_rsp_packet_v2(struct nci_dev *ndev,
NCI_MAX_SUPPORTED_RF_INTERFACES);
while (rf_interface_idx < ndev->num_supported_rf_interfaces) {
- ndev->supported_rf_interfaces[rf_interface_idx++] = *supported_rf_interface++;
+ /* one interface byte + one extension-count byte must be present */
+ if (end - supported_rf_interface < 2)
+ return NCI_STATUS_SYNTAX_ERROR;
+ ndev->supported_rf_interfaces[rf_interface_idx++] =
+ *supported_rf_interface++;
- /* skip rf extension parameters */
+ /* skip rf extension parameters, bounded by the packet */
rf_extension_cnt = *supported_rf_interface++;
+ if (rf_extension_cnt > end - supported_rf_interface)
+ return NCI_STATUS_SYNTAX_ERROR;
supported_rf_interface += rf_extension_cnt;
}
base-commit: a986fde914d88af47eb78fd29c5d1af7952c3500
--
2.54.0
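
The v2 bounds checks follow a common pattern for walking length-prefixed entries: keep an end pointer and compare against it before every read or skip. A standalone userspace sketch of that pattern is below. All demo_* names and the cap of 4 are hypothetical; the entry layout (interface byte, extension count, extension bytes) is taken from the loop in the diff.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_RF_INTERFACES 4	/* hypothetical cap for the sketch */

/*
 * Walk `count` entries of the form
 *   [interface byte][extension count N][N extension bytes]
 * inside buf[0..len). Returns the number of interfaces stored in out[],
 * or -1 if any entry would run past the end of the buffer.
 */
static int demo_walk_rf_interfaces(const uint8_t *buf, size_t len,
				   unsigned int count, uint8_t *out)
{
	const uint8_t *p = buf;
	const uint8_t *end = buf + len;
	unsigned int idx = 0;

	if (count > DEMO_MAX_RF_INTERFACES)
		count = DEMO_MAX_RF_INTERFACES;

	while (idx < count) {
		uint8_t ext_cnt;

		/* interface byte + extension-count byte must both be present */
		if (end - p < 2)
			return -1;
		out[idx++] = *p++;

		/* skip the extension bytes only if they fit in the buffer */
		ext_cnt = *p++;
		if (ext_cnt > end - p)
			return -1;
		p += ext_cnt;
	}
	return (int)idx;
}
```

Comparing against the remaining length (end - p) rather than adding to the cursor first avoids forming a pointer past the buffer before the check runs.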
^ permalink raw reply related
* [PATCH net 14/14] netfilter: nf_conntrack_helper: cap maximum number of expectation at helper registration
From: Pablo Neira Ayuso @ 2026-06-23 22:15 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms
In-Reply-To: <20260623221548.701545-1-pablo@netfilter.org>
On helper registration, the maximum number of expectations cannot exceed
NF_CT_EXPECT_MAX_CNT (255), but zero can be specified, in which case
nf_conntrack_expect_max applies. Turn zero into NF_CT_EXPECT_MAX_CNT;
otherwise, expectation LRU eviction on insertion is disabled.
Moreover, extend this sanity check to all expectation classes.
The max_expected policy has only been tunable since userspace helpers
became available, so set the Fixes: tag to the commit that added that
infrastructure.
Remove the check for p->max_expected, given that this field must always
be non-zero after this patch.
Fixes: 12f7a505331e ("netfilter: add user-space connection tracking helper infrastructure")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nf_conntrack_expect.c | 3 +--
net/netfilter/nf_conntrack_helper.c | 9 +++++++--
2 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/net/netfilter/nf_conntrack_expect.c b/net/netfilter/nf_conntrack_expect.c
index 113bb1cb1683..38630c5e006f 100644
--- a/net/netfilter/nf_conntrack_expect.c
+++ b/net/netfilter/nf_conntrack_expect.c
@@ -496,8 +496,7 @@ static inline int __nf_ct_expect_check(struct nf_conntrack_expect *expect,
lockdep_is_held(&nf_conntrack_expect_lock));
if (helper) {
p = &helper->expect_policy[expect->class];
- if (p->max_expected &&
- master_help->expecting[expect->class] >= p->max_expected)
+ if (master_help->expecting[expect->class] >= p->max_expected)
evict_oldest_expect(master_help, expect, p);
} else {
const struct nf_conntrack_expect_policy default_exp_policy = {
diff --git a/net/netfilter/nf_conntrack_helper.c b/net/netfilter/nf_conntrack_helper.c
index 8b94001c2430..500509b17663 100644
--- a/net/netfilter/nf_conntrack_helper.c
+++ b/net/netfilter/nf_conntrack_helper.c
@@ -374,8 +374,13 @@ int __nf_conntrack_helper_register(struct nf_conntrack_helper *me)
if (!nf_ct_helper_hash)
return -ENOENT;
- if (me->expect_policy->max_expected > NF_CT_EXPECT_MAX_CNT)
- return -EINVAL;
+ for (i = 0; i <= me->expect_class_max; i++) {
+ if (!me->expect_policy[i].max_expected)
+ me->expect_policy[i].max_expected = NF_CT_EXPECT_MAX_CNT;
+
+ if (me->expect_policy[i].max_expected > NF_CT_EXPECT_MAX_CNT)
+ return -EINVAL;
+ }
mutex_lock(&nf_ct_helper_mutex);
for (i = 0; i < nf_ct_helper_hsize; i++) {
--
2.47.3
^ permalink raw reply related
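The per-class normalization in the patch above can be sketched in userspace C. This is a minimal illustration, not the kernel code: `struct expect_policy`, `EXPECT_MAX_CNT` and `normalize_policies()` are hypothetical stand-ins for `struct nf_conntrack_expect_policy`, `NF_CT_EXPECT_MAX_CNT` and the loop added to `__nf_conntrack_helper_register()`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define EXPECT_MAX_CNT 255u	/* mirrors NF_CT_EXPECT_MAX_CNT (255) from the patch */

/* Hypothetical stand-in for struct nf_conntrack_expect_policy. */
struct expect_policy {
	unsigned int max_expected;
};

/*
 * Normalize every expectation class: zero becomes the cap, anything above
 * the cap is rejected. class_max is the highest valid index, so the loop
 * bound is inclusive, as in the patch.
 */
static int normalize_policies(struct expect_policy *policy,
			      unsigned int class_max)
{
	unsigned int i;

	assert(policy != NULL);

	for (i = 0; i <= class_max; i++) {
		if (policy[i].max_expected == 0)
			policy[i].max_expected = EXPECT_MAX_CNT;

		if (policy[i].max_expected > EXPECT_MAX_CNT)
			return -EINVAL;
	}

	return 0;
}
```

Once every class holds a non-zero limit, the caller no longer needs to special-case zero before comparing the current count against `max_expected`, which is why the patch drops that test from `__nf_ct_expect_check()`.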
* [PATCH net 13/14] netfilter: nft_ct: expectation timeouts are passed in milliseconds
From: Pablo Neira Ayuso @ 2026-06-23 22:15 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms
In-Reply-To: <20260623221548.701545-1-pablo@netfilter.org>
From: Florian Westphal <fw@strlen.de>
Userspace passes '5000' in case user asks for 5 seconds.
Allowing for sub-second expectation lifetimes makes sense to me, so
fix up the kernel side instead of munging nft to send a value rounded
up to the next second.
Also note that this violates nft convention of passing integers in
network byte order, but we can't change this anymore.
Fixes: 857b46027d6f ("netfilter: nft_ct: add ct expectations support")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nft_ct.c | 21 ++++++++++++++++++---
1 file changed, 18 insertions(+), 3 deletions(-)
diff --git a/net/netfilter/nft_ct.c b/net/netfilter/nft_ct.c
index 958054dd2e2e..03a88c77e0f0 100644
--- a/net/netfilter/nft_ct.c
+++ b/net/netfilter/nft_ct.c
@@ -1215,11 +1215,23 @@ struct nft_ct_expect_obj {
u32 timeout;
};
+static int nft_ct_expect_timeout_get(const struct nlattr *attr, u32 *val)
+{
+ unsigned long jiffies_val = msecs_to_jiffies(nla_get_u32(attr));
+
+ if (jiffies_val > UINT_MAX)
+ return -ERANGE;
+
+ *val = jiffies_val;
+ return 0;
+}
+
static int nft_ct_expect_obj_init(const struct nft_ctx *ctx,
const struct nlattr * const tb[],
struct nft_object *obj)
{
struct nft_ct_expect_obj *priv = nft_obj_data(obj);
+ int err;
if (!tb[NFTA_CT_EXPECT_L4PROTO] ||
!tb[NFTA_CT_EXPECT_DPORT] ||
@@ -1254,8 +1266,11 @@ static int nft_ct_expect_obj_init(const struct nft_ctx *ctx,
return -EOPNOTSUPP;
}
+ err = nft_ct_expect_timeout_get(tb[NFTA_CT_EXPECT_TIMEOUT], &priv->timeout);
+ if (err)
+ return err;
+
priv->dport = nla_get_be16(tb[NFTA_CT_EXPECT_DPORT]);
- priv->timeout = nla_get_u32(tb[NFTA_CT_EXPECT_TIMEOUT]);
priv->size = nla_get_u8(tb[NFTA_CT_EXPECT_SIZE]);
return nf_ct_netns_get(ctx->net, ctx->family);
@@ -1275,7 +1290,7 @@ static int nft_ct_expect_obj_dump(struct sk_buff *skb,
if (nla_put_be16(skb, NFTA_CT_EXPECT_L3PROTO, htons(priv->l3num)) ||
nla_put_u8(skb, NFTA_CT_EXPECT_L4PROTO, priv->l4proto) ||
nla_put_be16(skb, NFTA_CT_EXPECT_DPORT, priv->dport) ||
- nla_put_u32(skb, NFTA_CT_EXPECT_TIMEOUT, priv->timeout) ||
+ nla_put_u32(skb, NFTA_CT_EXPECT_TIMEOUT, jiffies_to_msecs(priv->timeout)) ||
nla_put_u8(skb, NFTA_CT_EXPECT_SIZE, priv->size))
return -1;
@@ -1325,7 +1340,7 @@ static void nft_ct_expect_obj_eval(struct nft_object *obj,
&ct->tuplehash[!dir].tuple.src.u3,
&ct->tuplehash[!dir].tuple.dst.u3,
priv->l4proto, NULL, &priv->dport);
- exp->timeout += priv->timeout * HZ;
+ exp->timeout += priv->timeout;
if (nf_ct_expect_related(exp, 0) != 0)
regs->verdict.code = NF_DROP;
--
2.47.3
^ permalink raw reply related
* [PATCH net 11/14] netfilter: nf_conntrack_expect: store master_tuple in expectation
From: Pablo Neira Ayuso @ 2026-06-23 22:15 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms
In-Reply-To: <20260623221548.701545-1-pablo@netfilter.org>
Store the master conntrack tuple in the expectation, since exp->master
might refer to a different conntrack when accessed from an RCU read-side
critical section due to typesafe RCU rules.
Fixes: 02a3231b6d82 ("netfilter: nf_conntrack_expect: store netns and zone in expectation")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
include/net/netfilter/nf_conntrack_expect.h | 1 +
net/netfilter/nf_conntrack_broadcast.c | 1 +
net/netfilter/nf_conntrack_expect.c | 2 ++
net/netfilter/nf_conntrack_netlink.c | 10 ++++------
4 files changed, 8 insertions(+), 6 deletions(-)
diff --git a/include/net/netfilter/nf_conntrack_expect.h b/include/net/netfilter/nf_conntrack_expect.h
index be4a120d549e..c024345c9bd8 100644
--- a/include/net/netfilter/nf_conntrack_expect.h
+++ b/include/net/netfilter/nf_conntrack_expect.h
@@ -26,6 +26,7 @@ struct nf_conntrack_expect {
possible_net_t net;
/* We expect this tuple, with the following mask */
+ struct nf_conntrack_tuple master_tuple;
struct nf_conntrack_tuple tuple;
struct nf_conntrack_tuple_mask mask;
diff --git a/net/netfilter/nf_conntrack_broadcast.c b/net/netfilter/nf_conntrack_broadcast.c
index 400119b6320e..bf78828c7549 100644
--- a/net/netfilter/nf_conntrack_broadcast.c
+++ b/net/netfilter/nf_conntrack_broadcast.c
@@ -62,6 +62,7 @@ int nf_conntrack_broadcast_help(struct sk_buff *skb,
if (exp == NULL)
goto out;
+ exp->master_tuple = ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple;
exp->tuple = ct->tuplehash[IP_CT_DIR_REPLY].tuple;
helper = rcu_dereference(help->helper);
diff --git a/net/netfilter/nf_conntrack_expect.c b/net/netfilter/nf_conntrack_expect.c
index 49e18eda037e..9454913e1b33 100644
--- a/net/netfilter/nf_conntrack_expect.c
+++ b/net/netfilter/nf_conntrack_expect.c
@@ -355,6 +355,8 @@ void nf_ct_expect_init(struct nf_conntrack_expect *exp, unsigned int class,
exp->tuple.src.l3num = family;
exp->tuple.dst.protonum = proto;
+ exp->master_tuple = ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple;
+
if (saddr) {
memcpy(&exp->tuple.src.u3, saddr, len);
if (sizeof(exp->tuple.src.u3) > len)
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index cb38ef42e9e6..4217715d42dc 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -3002,7 +3002,6 @@ ctnetlink_exp_dump_expect(struct sk_buff *skb,
const struct nf_conntrack_expect *exp)
{
__s32 timeout = (__s32)(READ_ONCE(exp->timeout) - nfct_time_stamp) / HZ;
- struct nf_conn *master = exp->master;
struct nf_conntrack_helper *helper;
#if IS_ENABLED(CONFIG_NF_NAT)
struct nlattr *nest_parms;
@@ -3017,9 +3016,7 @@ ctnetlink_exp_dump_expect(struct sk_buff *skb,
goto nla_put_failure;
if (ctnetlink_exp_dump_mask(skb, &exp->tuple, &exp->mask) < 0)
goto nla_put_failure;
- if (ctnetlink_exp_dump_tuple(skb,
- &master->tuplehash[IP_CT_DIR_ORIGINAL].tuple,
- CTA_EXPECT_MASTER) < 0)
+ if (ctnetlink_exp_dump_tuple(skb, &exp->master_tuple, CTA_EXPECT_MASTER) < 0)
goto nla_put_failure;
#if IS_ENABLED(CONFIG_NF_NAT)
@@ -3032,9 +3029,9 @@ ctnetlink_exp_dump_expect(struct sk_buff *skb,
if (nla_put_be32(skb, CTA_EXPECT_NAT_DIR, htonl(exp->dir)))
goto nla_put_failure;
- nat_tuple.src.l3num = nf_ct_l3num(master);
+ nat_tuple.src.l3num = exp->master_tuple.src.l3num;
nat_tuple.src.u3 = exp->saved_addr;
- nat_tuple.dst.protonum = nf_ct_protonum(master);
+ nat_tuple.dst.protonum = exp->master_tuple.dst.protonum;
nat_tuple.src.u = exp->saved_proto;
if (ctnetlink_exp_dump_tuple(skb, &nat_tuple,
@@ -3576,6 +3573,7 @@ ctnetlink_alloc_expect(const struct nlattr * const cda[], struct nf_conn *ct,
#endif
rcu_assign_pointer(exp->helper, helper);
rcu_assign_pointer(exp->assign_helper, assign_helper);
+ exp->master_tuple = ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple;
exp->tuple = *tuple;
exp->mask.src.u3 = mask->src.u3;
exp->mask.src.u.all = mask->src.u.all;
--
2.47.3
^ permalink raw reply related
* [PATCH net 12/14] netfilter: nf_conntrack_expect: run expectation eviction with no helper
From: Pablo Neira Ayuso @ 2026-06-23 22:15 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms
In-Reply-To: <20260623221548.701545-1-pablo@netfilter.org>
Run expectation eviction if no helper is specified to deal with the
nft_ct expectation support.
Cap the maximum expectation limit per master conntrack to
NF_CT_EXPECT_MAX_CNT (255).
Fixes: 857b46027d6f ("netfilter: nft_ct: add ct expectations support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nf_conntrack_expect.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/net/netfilter/nf_conntrack_expect.c b/net/netfilter/nf_conntrack_expect.c
index 9454913e1b33..113bb1cb1683 100644
--- a/net/netfilter/nf_conntrack_expect.c
+++ b/net/netfilter/nf_conntrack_expect.c
@@ -499,6 +499,13 @@ static inline int __nf_ct_expect_check(struct nf_conntrack_expect *expect,
if (p->max_expected &&
master_help->expecting[expect->class] >= p->max_expected)
evict_oldest_expect(master_help, expect, p);
+ } else {
+ const struct nf_conntrack_expect_policy default_exp_policy = {
+ .max_expected = NF_CT_EXPECT_MAX_CNT,
+ };
+
+ if (master_help->expecting[expect->class] >= default_exp_policy.max_expected)
+ evict_oldest_expect(master_help, expect, &default_exp_policy);
}
cnet = nf_ct_pernet(net);
--
2.47.3
^ permalink raw reply related
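Evict-on-insert with a fixed cap, as the patch above applies when no helper is present, can be sketched with a small FIFO. This is an assumption-laden illustration: the kernel keeps expectations on a list and, as its name suggests, `evict_oldest_expect()` drops the oldest one. The ring buffer, `RING_SLOTS` and `ring_insert()` below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SLOTS 8	/* hypothetical storage size for this sketch */

struct expect_ring {
	int ids[RING_SLOTS];
	size_t head;		/* index of the oldest entry */
	size_t count;		/* entries currently stored */
	size_t max_expected;	/* cap, must be 1..RING_SLOTS */
};

/*
 * Insert id; if the cap is already reached, evict the oldest entry first.
 * Returns 1 and stores the evicted id in *evicted when an eviction happened,
 * 0 otherwise.
 */
static int ring_insert(struct expect_ring *r, int id, int *evicted)
{
	int did_evict = 0;

	assert(r != NULL);
	assert(r->max_expected >= 1 && r->max_expected <= RING_SLOTS);

	if (r->count >= r->max_expected) {
		if (evicted)
			*evicted = r->ids[r->head];
		r->head = (r->head + 1) % RING_SLOTS;
		r->count--;
		did_evict = 1;
	}

	r->ids[(r->head + r->count) % RING_SLOTS] = id;
	r->count++;

	return did_evict;
}
```

With a non-zero cap the insertion can always make room for the new entry, so the number of live entries per master stays bounded.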
* [PATCH net 10/14] netfilter: conntrack: add deprecation warnings for irc and pptp trackers
From: Pablo Neira Ayuso @ 2026-06-23 22:15 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms
In-Reply-To: <20260623221548.701545-1-pablo@netfilter.org>
From: Florian Westphal <fw@strlen.de>
IRC Direct client-to-client requires plaintext. IRC over TLS should be
preferred, making this helper ineffective. Add a deprecation warning and
update the help text to better reflect that this is needed for the DCC
extension, not IRC itself.
PPTP is esoteric these days and it is the only helper that requires the
destroy callback in the conntrack helper API.
Removal would simplify the conntrack core.
Both helpers are IPv4 only.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
include/net/netfilter/nf_conntrack_helper.h | 4 ++++
net/netfilter/Kconfig | 11 ++++++-----
net/netfilter/nf_conntrack_irc.c | 2 ++
net/netfilter/nf_conntrack_pptp.c | 2 ++
4 files changed, 14 insertions(+), 5 deletions(-)
diff --git a/include/net/netfilter/nf_conntrack_helper.h b/include/net/netfilter/nf_conntrack_helper.h
index 81025101f86d..c761cd8158b2 100644
--- a/include/net/netfilter/nf_conntrack_helper.h
+++ b/include/net/netfilter/nf_conntrack_helper.h
@@ -114,6 +114,10 @@ int nf_conntrack_helpers_register(struct nf_conntrack_helper *, unsigned int,
void nf_conntrack_helpers_unregister(struct nf_conntrack_helper **,
unsigned int);
+#define nf_conntrack_helper_deprecated(name) \
+ pr_warn("The %s conntrack helper is scheduled for removal.\n" \
+ "Please contact the netfilter-devel mailing list if you still need this.\n", name)
+
struct nf_conn_help *nf_ct_helper_ext_add(struct nf_conn *ct, gfp_t gfp);
int __nf_ct_try_assign_helper(struct nf_conn *ct, struct nf_conn *tmpl,
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 665f8008cc4b..4c04cd8d40a2 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -256,8 +256,7 @@ config NF_CONNTRACK_H323
To compile it as a module, choose M here. If unsure, say N.
config NF_CONNTRACK_IRC
- tristate "IRC protocol support"
- default m if NETFILTER_ADVANCED=n
+ tristate "IRC DCC protocol support (obsolete)"
help
There is a commonly-used extension to IRC called
Direct Client-to-Client Protocol (DCC). This enables users to send
@@ -267,6 +266,8 @@ config NF_CONNTRACK_IRC
using NAT, this extension will enable you to send files and initiate
chats. Note that you do NOT need this extension to get files or
have others initiate chats, or everything else in IRC.
+ DCC tracking behind NAT requires plaintext (unencrypted) IRC, so
+ this helper is of limited use these days.
To compile it as a module, choose M here. If unsure, say N.
@@ -308,17 +309,17 @@ config NF_CONNTRACK_SNMP
To compile it as a module, choose M here. If unsure, say N.
config NF_CONNTRACK_PPTP
- tristate "PPtP protocol support"
+ tristate "PPtP protocol support (deprecated)"
depends on NETFILTER_ADVANCED
select NF_CT_PROTO_GRE
help
This module adds support for PPTP (Point to Point Tunnelling
Protocol, RFC2637) connection tracking and NAT.
- If you are running PPTP sessions over a stateful firewall or NAT
+ If you are still running PPTP sessions over a stateful firewall or NAT
box, you may want to enable this feature.
- Please note that not all PPTP modes of operation are supported yet.
+ Please note that not all PPTP modes of operation are supported.
Specifically these limitations exist:
- Blindly assumes that control connections are always established
in PNS->PAC direction. This is a violation of RFC2637.
diff --git a/net/netfilter/nf_conntrack_irc.c b/net/netfilter/nf_conntrack_irc.c
index 0c117b8492e9..193ab34db795 100644
--- a/net/netfilter/nf_conntrack_irc.c
+++ b/net/netfilter/nf_conntrack_irc.c
@@ -262,6 +262,8 @@ static int __init nf_conntrack_irc_init(void)
{
int i, ret;
+ nf_conntrack_helper_deprecated(HELPER_NAME);
+
if (max_dcc_channels < 1) {
pr_err("max_dcc_channels must not be zero\n");
return -EINVAL;
diff --git a/net/netfilter/nf_conntrack_pptp.c b/net/netfilter/nf_conntrack_pptp.c
index 776505a78e64..80fc14c87ddc 100644
--- a/net/netfilter/nf_conntrack_pptp.c
+++ b/net/netfilter/nf_conntrack_pptp.c
@@ -545,6 +545,8 @@ static int __init nf_conntrack_pptp_init(void)
pptp.destroy = gre_pptp_destroy_siblings;
+ nf_conntrack_helper_deprecated(pptp.name);
+
return nf_conntrack_helper_register(&pptp, &pptp_ptr);
}
--
2.47.3
^ permalink raw reply related
* [PATCH net 09/14] netfilter: ctnetlink: do not allow to reset helper on existing conntrack
From: Pablo Neira Ayuso @ 2026-06-23 22:15 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms
In-Reply-To: <20260623221548.701545-1-pablo@netfilter.org>
This feature allows resetting the helper of an existing conntrack, but it
is not safe. Fixing it requires a synchronize_rcu() call after resetting
the helper, which is going to be expensive for a large batch of conntrack
entries. It also requires calling the .destroy callback to release the
GRE/PPTP mappings.
This feature antedates the creation of the conntrack-tools and I cannot
find a good use-case for this. Given that I cannot find any user in the
netfilter.org userspace tree, I prefer to remove this feature.
Fixes: c1d10adb4a52 ("[NETFILTER]: Add ctnetlink port for nf_conntrack")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nf_conntrack_netlink.c | 13 -------------
1 file changed, 13 deletions(-)
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 4e78d2482989..cb38ef42e9e6 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -1953,19 +1953,6 @@ static int ctnetlink_change_helper(struct nf_conn *ct,
return err;
}
- if (!strcmp(helpname, "") && help) {
- helper = rcu_dereference(help->helper);
- if (helper) {
- /* we had a helper before ... */
- nf_ct_remove_expectations(ct);
- RCU_INIT_POINTER(help->helper, NULL);
- if (refcount_dec_and_test(&helper->ct_refcnt))
- kfree_rcu(helper, rcu);
- }
- rcu_read_unlock();
- return 0;
- }
-
helper = __nf_conntrack_helper_find(helpname, nf_ct_l3num(ct),
nf_ct_protonum(ct));
if (helper == NULL) {
--
2.47.3
^ permalink raw reply related
* [PATCH net 08/14] selftests: nft_queue.sh: add a bridge queue test
From: Pablo Neira Ayuso @ 2026-06-23 22:15 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms
In-Reply-To: <20260623221548.701545-1-pablo@netfilter.org>
From: Florian Westphal <fw@strlen.de>
Add a test that queues from the bridge family.
This was lacking: we queued from inet for ipv4 and ipv6 but
we had no bridge queue test so far.
Given that the kernel MUST validate that the in/out ports are still part
of a bridge device on reinject, add a test case for this before
adding this check.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
.../selftests/net/netfilter/nft_queue.sh | 66 ++++++++++++++++---
1 file changed, 58 insertions(+), 8 deletions(-)
diff --git a/tools/testing/selftests/net/netfilter/nft_queue.sh b/tools/testing/selftests/net/netfilter/nft_queue.sh
index d80390848e85..7c857a2e0f34 100755
--- a/tools/testing/selftests/net/netfilter/nft_queue.sh
+++ b/tools/testing/selftests/net/netfilter/nft_queue.sh
@@ -85,11 +85,12 @@ ip -net "$ns3" route add default via 10.0.3.1
ip -net "$ns3" route add default via dead:3::1
load_ruleset() {
- local name=$1
- local prio=$2
+ local family=$1
+ local name=$2
+ local prio=$3
ip netns exec "$nsrouter" nft -f /dev/stdin <<EOF
-table inet $name {
+table $family $name {
chain nfq {
ip protocol icmp queue bypass
icmpv6 type { "echo-request", "echo-reply" } queue num 1 bypass
@@ -228,6 +229,7 @@ nf_queue_wait()
test_queue()
{
local expected="$1"
+ local family="$2"
local last=""
# spawn nf_queue listeners
@@ -255,11 +257,13 @@ test_queue()
if [ x"$last" != x"$expected packets total" ]; then
echo "FAIL: Expected $expected packets total, but got $last" 1>&2
ip netns exec "$nsrouter" nft list ruleset
+ echo -n "$TMPFILE0: ";cat "$TMPFILE0"
+ echo -n "$TMPFILE1: ";cat "$TMPFILE1"
exit 1
fi
done
- echo "PASS: Expected and received $last"
+ echo "PASS: Expected and received $last ($family)"
}
listener_ready()
@@ -400,6 +404,8 @@ EOF
kill "$nfqpid"
echo "PASS: icmp+nfqueue via vrf"
+ ip -net "$ns1" link del tvrf
+ ip netns exec "$ns1" nft flush ruleset
}
sctp_listener_ready()
@@ -814,12 +820,53 @@ EOF
check_tainted "queue program exiting while packets queued"
}
+test_queue_bridge()
+{
+ ip -net "$nsrouter" addr flush dev veth0
+ ip -net "$nsrouter" addr flush dev veth1
+
+ ip -net "$nsrouter" link add br0 type bridge
+ ip -net "$nsrouter" link set veth0 master br0
+ ip -net "$nsrouter" link set veth1 master br0
+
+ ip -net "$nsrouter" link set br0 up
+
+ ip -net "$nsrouter" addr add 10.0.2.1/16 dev br0
+ ip -net "$nsrouter" addr add dead:2::1/64 dev br0 nodad
+
+ ip -net "$ns1" addr flush dev eth0
+ ip -net "$ns2" addr flush dev eth0
+
+ ip -net "$ns1" addr add 10.0.1.1/16 dev eth0
+ ip -net "$ns1" addr add dead:2::2/64 dev eth0 nodad
+
+ ip -net "$ns2" addr add 10.0.2.99/16 dev eth0
+ ip -net "$ns2" addr add dead:2::99/64 dev eth0 nodad
+
+ ip netns exec "$nsrouter" nft flush ruleset
+
+ ip netns exec "$nsrouter" sysctl net.ipv6.conf.all.forwarding=0 > /dev/null
+ ip netns exec "$nsrouter" sysctl net.ipv4.conf.veth0.forwarding=0 > /dev/null
+ ip netns exec "$nsrouter" sysctl net.ipv4.conf.veth1.forwarding=0 > /dev/null
+
+ if ! test_ping;then
+ echo "FAIL: netns bridge connectivity" 1>&2
+ exit $ret
+ fi
+
+ load_ruleset "bridge" "filter" 10
+ test_queue 10 "bridge"
+
+ load_ruleset "bridge" "filter2" 20
+ test_queue 20 "bridge"
+}
+
ip netns exec "$nsrouter" sysctl net.ipv6.conf.all.forwarding=1 > /dev/null
ip netns exec "$nsrouter" sysctl net.ipv4.conf.veth0.forwarding=1 > /dev/null
ip netns exec "$nsrouter" sysctl net.ipv4.conf.veth1.forwarding=1 > /dev/null
ip netns exec "$nsrouter" sysctl net.ipv4.conf.veth2.forwarding=1 > /dev/null
-load_ruleset "filter" 0
+load_ruleset "inet" "filter" 0
if test_ping; then
# queue bypass works (rules were skipped, no listener)
@@ -842,11 +889,11 @@ load_counter_ruleset 10
# 1x icmp prerouting,forward,postrouting -> 3 queue events (6 incl. reply).
# 1x icmp prerouting,input,output postrouting -> 4 queue events incl. reply.
# so we expect that userspace program receives 10 packets.
-test_queue 10
+test_queue 10 "inet"
# same. We queue to a second program as well.
-load_ruleset "filter2" 20
-test_queue 20
+load_ruleset "inet" "filter2" 20
+test_queue 20 "inet"
ip netns exec "$ns1" nft flush ruleset
test_tcp_forward
@@ -863,4 +910,7 @@ test_queue_stress
test_icmp_vrf
test_queue_removal
+# turns router into a bridge
+test_queue_bridge
+
exit $ret
--
2.47.3
^ permalink raw reply related
* [PATCH net 07/14] netfilter: nft_compat: ebtables emulation must reject non-bridge targets
From: Pablo Neira Ayuso @ 2026-06-23 22:15 UTC (permalink / raw)
To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms
In-Reply-To: <20260623221548.701545-1-pablo@netfilter.org>
From: Florian Westphal <fw@strlen.de>
xtables targets return netfilter verdicts: NF_ACCEPT, NF_DROP, and so
on. ebtables targets return incompatible verdicts: EBT_ACCEPT,
EBT_DROP, ... We cannot allow fallback to NFPROTO_UNSPEC.
ebtables doesn't permit this since
11ff7288beb2 ("netfilter: ebtables: reject non-bridge targets")
but that commit missed the nft_compat layer.
Reported-by: Ren Wei <n05ec@lzu.edu.cn>
Reported-by: Wyatt Feng <bronzed_45_vested@icloud.com>
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Fixes: 0ca743a55991 ("netfilter: nf_tables: add compatibility layer for x_tables")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
net/netfilter/nft_compat.c | 24 +++++++++++++++++++++---
1 file changed, 21 insertions(+), 3 deletions(-)
diff --git a/net/netfilter/nft_compat.c b/net/netfilter/nft_compat.c
index 0caa9304d2d0..63864b928259 100644
--- a/net/netfilter/nft_compat.c
+++ b/net/netfilter/nft_compat.c
@@ -397,6 +397,22 @@ static int nft_target_validate(const struct nft_ctx *ctx,
return 0;
}
+static int nft_target_bridge_validate(const struct nft_ctx *ctx,
+ const struct nft_expr *expr)
+{
+ struct xt_target *target = expr->ops->data;
+
+ /* Do not allow UNSPEC to stand-in for NFPROTO_BRIDGE
+ * targets: they are incompatible. ebtables targets return
+ * EBT_ACCEPT, DROP and so on which are not compatible with
+ * NF_ACCEPT, NF_DROP and so on.
+ */
+ if (target->family != NFPROTO_BRIDGE)
+ return -ENOENT;
+
+ return nft_target_validate(ctx, expr);
+}
+
static void __nft_match_eval(const struct nft_expr *expr,
struct nft_regs *regs,
const struct nft_pktinfo *pkt,
@@ -932,13 +948,15 @@ nft_target_select_ops(const struct nft_ctx *ctx,
ops->init = nft_target_init;
ops->destroy = nft_target_destroy;
ops->dump = nft_target_dump;
- ops->validate = nft_target_validate;
ops->data = target;
- if (family == NFPROTO_BRIDGE)
+ if (family == NFPROTO_BRIDGE) {
ops->eval = nft_target_eval_bridge;
- else
+ ops->validate = nft_target_bridge_validate;
+ } else {
ops->eval = nft_target_eval_xt;
+ ops->validate = nft_target_validate;
+ }
return ops;
err:
--
2.47.3
^ permalink raw reply related
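The verdict mismatch that motivates the patch above can be made concrete with a short sketch. The numeric values of `NF_DROP`, `NF_ACCEPT`, `EBT_ACCEPT`, `EBT_DROP`, `NFPROTO_UNSPEC` and `NFPROTO_BRIDGE` are copied from the kernel uapi headers; `struct target`, `bridge_target_validate()` and `ebt_to_nf()` are hypothetical and only model the family check and a verdict translation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Values as defined in the kernel uapi headers. */
#define NF_DROP		0
#define NF_ACCEPT	1
#define EBT_ACCEPT	(-1)
#define EBT_DROP	(-2)

#define NFPROTO_UNSPEC	0
#define NFPROTO_BRIDGE	7

/* Hypothetical stand-in for struct xt_target. */
struct target {
	int family;
};

/* Reject any target that was not registered for the bridge family. */
static int bridge_target_validate(const struct target *t)
{
	assert(t != NULL);

	if (t->family != NFPROTO_BRIDGE)
		return -ENOENT;

	return 0;
}

/*
 * Hypothetical translator from an ebtables verdict to a netfilter verdict.
 * A value such as NF_DROP (0) is not an ebtables verdict at all, so it is
 * reported as an error instead of being mapped to anything.
 */
static int ebt_to_nf(int ebt_verdict, int *nf_verdict)
{
	assert(nf_verdict != NULL);

	switch (ebt_verdict) {
	case EBT_ACCEPT:
		*nf_verdict = NF_ACCEPT;
		return 0;
	case EBT_DROP:
		*nf_verdict = NF_DROP;
		return 0;
	default:
		return -EINVAL;
	}
}
```

Because the two verdict sets do not overlap, a target written against one convention cannot be evaluated safely under the other, which is why the check belongs at validation time rather than at packet time.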