Netdev List


* Re: [PATCH net] net: do not acquire dev->tx_global_lock in netdev_watchdog_up()
From: patchwork-bot+netdevbpf @ 2026-06-23 21:50 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: davem, kuba, pabeni, horms, netdev, eric.dumazet, m.szyprowski
In-Reply-To: <20260622110108.69541-1-edumazet@google.com>

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Mon, 22 Jun 2026 11:01:08 +0000 you wrote:
> Marek Szyprowski reported a deadlock during system resume when virtio_net
> driver is used.
> 
> The deadlock occurs because netif_device_attach() is called while holding
> dev->tx_global_lock (via netif_tx_lock_bh() in virtnet_restore_up()).
> netif_device_attach() calls __netdev_watchdog_up(), which now also tries
> to acquire dev->tx_global_lock to synchronize with dev_watchdog().
> 
> [...]

Here is the summary with links:
  - [net] net: do not acquire dev->tx_global_lock in netdev_watchdog_up()
    https://git.kernel.org/netdev/net/c/d09a78a2a469

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH net] veth: fix NAPI leak in XDP enable error path
From: patchwork-bot+netdevbpf @ 2026-06-23 21:50 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: davem, kuba, pabeni, horms, netdev, eric.dumazet, groeck,
	bjorn.topel, daniel, ilias.apalodimas, mst, tariqt
In-Reply-To: <20260622111825.88337-1-edumazet@google.com>

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Mon, 22 Jun 2026 11:18:25 +0000 you wrote:
> During XDP enablement in veth, if xdp_rxq_info_reg() or
> xdp_rxq_info_reg_mem_model() fails, the driver rolls back the changes.
> 
> However, the rollback loop:
> 	for (i--; i >= start; i--) {
> 
> decrements the loop index 'i' before the first iteration. This
> correctly skips unregistering the rxq for the failed index 'i' (as
> registration failed or was already cleaned up), but it also
> erroneously skips calling netif_napi_del() for rq[i].xdp_napi.
> 
> [...]

Here is the summary with links:
  - [net] veth: fix NAPI leak in XDP enable error path
    https://git.kernel.org/netdev/net/c/6739027cb72d

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
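
For readers following the archive, the rollback off-by-one described in the
quoted commit message can be modelled in a few lines of plain C. This is only
a user-space sketch: struct demo_rq, demo_enable_range() and the fix_leak flag
are hypothetical names invented here, and the fix_leak branch is one possible
remedy, not necessarily what the applied commit does.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_NQ 4

/* Hypothetical per-queue state standing in for the driver's rq[] entries. */
struct demo_rq {
	bool napi_added;     /* models a NAPI instance having been added */
	bool rxq_registered; /* models a successful rxq registration */
};

/*
 * Enable queues [start, end). Registration fails at index fail_at
 * (pass a negative value for "never fails"). With fix_leak == false the
 * rollback mirrors the loop quoted above and forgets the NAPI of the
 * failing index. Returns 0 on success, -1 after rollback.
 */
static int demo_enable_range(struct demo_rq *rq, int start, int end,
			     int fail_at, bool fix_leak)
{
	int i;

	assert(start >= 0 && end <= DEMO_NQ && start <= end);

	for (i = start; i < end; i++) {
		rq[i].napi_added = true;
		if (i == fail_at)
			goto rollback;
		rq[i].rxq_registered = true;
	}
	return 0;

rollback:
	if (fix_leak)
		rq[i].napi_added = false; /* undo the half-initialized entry */
	for (i--; i >= start; i--) {
		rq[i].rxq_registered = false;
		rq[i].napi_added = false;
	}
	return -1;
}

/* Count queues whose NAPI was added but never removed. */
static int demo_leaked_napi(const struct demo_rq *rq)
{
	int i, n = 0;

	for (i = 0; i < DEMO_NQ; i++)
		n += rq[i].napi_added ? 1 : 0;
	return n;
}
```

Calling demo_enable_range() with fail_at set to a valid index and fix_leak
false leaves exactly one entry with napi_added still set, which is the leak
the patch title refers to.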




* Re: [PATCH net v2] net: ti: icssg: Fix XSK zero copy TX during application wakeup
From: patchwork-bot+netdevbpf @ 2026-06-23 21:50 UTC (permalink / raw)
  To: Meghana Malladi
  Cc: diogo.ivo, vadim.fedorenko, haokexin, devnexen, horms,
	jacob.e.keller, pabeni, kuba, edumazet, davem, andrew+netdev,
	linux-kernel, netdev, linux-arm-kernel, srk, vigneshr, rogerq,
	danishanwar
In-Reply-To: <20260618100348.2209907-1-m-malladi@ti.com>

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Thu, 18 Jun 2026 15:33:48 +0530 you wrote:
> emac_xsk_xmit_zc() handles tx xmit for zero copy and gets called
> inside napi context. User application wakes up the kernel while
> initiating the transmit which triggers napi to start processing
> the tx packets. The num_tx check inside emac_tx_complete_packets()
> returns early if no packet transfer happened, hindering the call
> to emac_xsk_xmit_zc(). Remove this check to let application
> wakeup initiate zero copy xmit traffic.
> 
> [...]

Here is the summary with links:
  - [net,v2] net: ti: icssg: Fix XSK zero copy TX during application wakeup
    https://git.kernel.org/netdev/net/c/d95ea4bc09e8

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: s2io: driver still in use - please reconsider removal
From: Michael Pratte @ 2026-06-23 22:11 UTC (permalink / raw)
  To: Ethan Nelson-Moore
  Cc: Jakub Kicinski, Paolo Abeni, Eric Dumazet, Andrew Lunn,
	Simon Horman, David S . Miller, netdev
In-Reply-To: <CADkSEUhkWRzW+-39JmkDSjUi8Qfwrr0qsVJFDxZhpNCBSenMyw@mail.gmail.com>

On Tue, Jun 23, 2026, Ethan Nelson-Moore wrote:
> Are you using the card for actual work, or are you just testing it out
> of curiosity? What kernel version were you running before you upgraded
> to a current kernel?

A mix of both. I run and maintain a lot of older hardware, some for
work and some for fun, and a 10G card that works in PCI-X has been
hard to find. I brought this one up directly on 6.6, so I hadn't run
it on an older kernel; I bisected afterward and found it last worked
in 4.1.

> Given that the driver has not been working for almost 11 years and you
> are seemingly the first person to notice, I would like to respectfully
> disagree with this assertion.

I might well be the only one still using it. But it's a one-line fix,
and I'm willing to maintain it going forward if that helps keep it in
tree.

Thanks,
Michael


* [PATCH net 00/14] Netfilter fixes for net
From: Pablo Neira Ayuso @ 2026-06-23 22:15 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms

Hi,

The following patchset contains Netfilter fixes for net:

1) Add a workaround to avoid a possible crash if nf_nat and nft_chain_nat are
   compiled built-in and nf_nat fails to register, allowing nft_chain_nat to
   access the incorrect pernetns area. This crash is specific to all-built-in
   compilation. From Mathias Krause.

2) Revisit conncount GC optimization for confirmed conntracks, skip GC round
   if IPS_ASSURED is set. This addresses a corner-case scenario involving
   locally generated traffic. No crash, just a functionality fix.
   From Fernando F. Mancera.

3) Validate iph->ihl in flowtable IPIP tunnel support, from Lorenzo Bianconi.
   This is a sanity check that bounces malformed IPIP packets back to the
   classic forwarding path.

4) Kdoc fixes for x_tables.h, from Randy Dunlap.

5) Use info->options so nft_synproxy_tcp_options() stays on the same local
   snapshot, otherwise the eval path can observe an inconsistent mix of mss
   and timestamps. From Runyu Xiao.

6) Add conntrack_sctp_collision.sh to cover for SCTP INIT collisions.
   From Yi Chen.

7) Do not allow NFPROTO_UNSPEC targets if family is NFPROTO_BRIDGE in
   nft_compat. Otherwise, nonsensical targets such as xt_nat can be used,
   leading to a crash. From Florian Westphal.

8) Add a selftest for queueing from the bridge family. From Florian Westphal.

9) Do not allow resetting a conntrack helper via ctnetlink. This feature
   predates the creation of conntrack-tools and it is not used. I don't
   have a use case for it, so I prefer to remove it rather than fix it.

10) Add deprecation warning for IPv4 only conntrack helpers for PPTP
    and IRC. From Florian Westphal.

11) Store the master tuple in the expectation object and use it,
    otherwise SLAB_TYPESAFE_RCU rules allow to display incorrect
    master tuple information through ctnetlink.

12) Run expectation eviction when inserting an expectation with no
    helper, this is a fix for the nft_ct custom expectation support.

13) Fix nft_ct custom expectation timeouts, userspace provides a
    timeout in milliseconds but kernel assumes this comes in seconds.
    From Florian Westphal.

14) Cap maximum number of expectations per class to 255 expectations
    per master conntrack at helper registration. This is a fix to
    restrict the maximum number of expectations per master conntrack
    which can be an issue for the new lazy GC expectation approach.

Please, pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf.git nf-26-06-23

Thanks.

P.S: Sashiko has been reporting "Failed to apply" with recent patches,
     I suspect it relies on Linus' tree, which does not yet contain
     the patches that were recently included in the last PR.
     If it fails to deliver a report, I can provide a list of links
     to the reviews that sashiko provided when patches were posted to
     the netfilter-devel mailing list.

----------------------------------------------------------------

The following changes since commit a986fde914d88af47eb78fd29c5d1af7952c3500:

  bnx2x: fix potential memory leak in bnx2x_alloc_mem_bp() (2026-06-22 18:39:12 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf.git tags/nf-26-06-23

for you to fetch changes up to 397c8300972f6e1486fd1afd99a044648a401cd5:

  netfilter: nf_conntrack_helper: cap maximum number of expectation at helper registration (2026-06-23 13:10:48 +0200)

----------------------------------------------------------------
netfilter pull request 26-06-23

----------------------------------------------------------------
Fernando Fernandez Mancera (1):
      netfilter: nf_conncount: prevent connlimit drops for early confirmed ct

Florian Westphal (4):
      netfilter: nft_compat: ebtables emulation must reject non-bridge targets
      selftests: nft_queue.sh: add a bridge queue test
      netfilter: conntrack: add deprecation warnings for irc and pptp trackers
      netfilter: nft_ct: expectation timeouts are passed in milliseconds

Lorenzo Bianconi (1):
      netfilter: flowtable: Validate iph->ihl in nf_flow_ip4_tunnel_proto()

Mathias Krause (1):
      netfilter: nf_nat: avoid invalid nat_net pointer use on failed nf_nat_init()

Pablo Neira Ayuso (4):
      netfilter: ctnetlink: do not allow to reset helper on existing conntrack
      netfilter: nf_conntrack_expect: store master_tuple in expectation
      netfilter: nf_conntrack_expect: run expectation eviction with no helper
      netfilter: nf_conntrack_helper: cap maximum number of expectation at helper registration

Randy Dunlap (1):
      netfilter: x_tables.h: fix all kernel-doc warnings

Runyu Xiao (1):
      netfilter: nft_synproxy: stop bypassing the priv->info snapshot

Yi Chen (1):
      selftests: netfilter: conntrack_sctp_collision.sh: Introduce SCTP INIT collision test

 include/linux/netfilter/x_tables.h                 | 29 +++++--
 include/net/netfilter/nf_conntrack_expect.h        |  1 +
 include/net/netfilter/nf_conntrack_helper.h        |  4 +
 net/netfilter/Kconfig                              | 11 +--
 net/netfilter/nf_conncount.c                       | 11 ++-
 net/netfilter/nf_conntrack_broadcast.c             |  1 +
 net/netfilter/nf_conntrack_expect.c                | 12 ++-
 net/netfilter/nf_conntrack_helper.c                |  9 ++-
 net/netfilter/nf_conntrack_irc.c                   |  2 +
 net/netfilter/nf_conntrack_netlink.c               | 23 +-----
 net/netfilter/nf_conntrack_pptp.c                  |  2 +
 net/netfilter/nf_flow_table_ip.c                   |  8 +-
 net/netfilter/nf_nat_core.c                        | 10 +++
 net/netfilter/nft_compat.c                         | 24 +++++-
 net/netfilter/nft_ct.c                             | 21 ++++-
 net/netfilter/nft_synproxy.c                       |  9 +--
 .../net/netfilter/conntrack_sctp_collision.sh      | 89 ++++++++++++++++------
 tools/testing/selftests/net/netfilter/nft_queue.sh | 66 ++++++++++++++--
 18 files changed, 246 insertions(+), 86 deletions(-)


* [PATCH net 01/14] netfilter: nf_nat: avoid invalid nat_net pointer use on failed nf_nat_init()
From: Pablo Neira Ayuso @ 2026-06-23 22:15 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms
In-Reply-To: <20260623221548.701545-1-pablo@netfilter.org>

From: Mathias Krause <minipli@grsecurity.net>

We ran into the KASAN splat below, which is mostly uninteresting apart
from having nf_nat_register_fn() in the call chain as a cause for the
offending access:

==================================================================
BUG: KASAN: slab-out-of-bounds in nf_nat_register_fn+0x5f9/0x640
Read of size 8 at addr ffff890031e54c20 by task iptables/9510

CPU: 0 UID: 0 PID: 9510 Comm: iptables Not tainted 6.18.18-grsec-full-20260320181326 #1 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Call Trace:
 <TASK>
 […] dump_stack_lvl+0xee/0x160 ffff88004117eeb8
 […] print_report+0x6e/0x640 ffff88004117eee0
 […] ? __phys_addr+0x8e/0x140 ffff88004117eef0
 […] ? kasan_addr_to_slab+0x51/0xe0 ffff88004117ef08
 […] ? complete_report_info+0xec/0x1c0 ffff88004117ef20
 […] ? nf_nat_register_fn+0x5f9/0x640 ffff88004117ef48
 […] kasan_report+0xbc/0x140 ffff88004117ef50
 […] ? nf_nat_register_fn+0x5f9/0x640 ffff88004117ef90
 […] nf_nat_register_fn+0x5f9/0x640 ffff88004117eff8
 […] ? nf_nat_icmp_reply_translation+0x6e0/0x6e0 ffff88004117f070
 […] nf_tables_register_hook.part.0+0xa0/0x220 ffff88004117f080
 […] nf_tables_addchain.constprop.0+0x1054/0x1fc0 ffff88004117f0b8
 […] ? nft_chain_lookup.part.0+0x4ce/0xac0 ffff88004117f130
 […] ? nf_tables_abort+0x3d80/0x3d80 ffff88004117f190
 […] ? nf_tables_dumpreset_obj+0x100/0x100 ffff88004117f1c8
 […] ? nft_table_lookup.part.0+0x255/0x300 ffff88004117f310
 […] ? nf_tables_newchain+0x21a4/0x2fa0 ffff88004117f358
 […] nf_tables_newchain+0x21a4/0x2fa0 ffff88004117f360
 […] ? nf_tables_addchain.constprop.0+0x1fc0/0x1fc0 ffff88004117f458
 […] ? nla_get_range_signed+0x4a0/0x4a0 ffff88004117f488
 […] ? lock_acquire+0x16f/0x320 ffff88004117f490
 […] ? find_held_lock+0x3b/0xe0 ffff88004117f4b0
 […] ? __nla_parse+0x45/0x80 ffff88004117f500
 […] nfnetlink_rcv_batch+0xbca/0x19a0 ffff88004117f550
 […] ? nfnetlink_net_exit_batch+0x120/0x120 ffff88004117f618
 […] ? __sanitizer_cov_trace_switch+0x63/0xe0 ffff88004117f720
 […] ? gr_acl_handle_mmap+0x1c4/0x320 ffff88004117f7c0
 […] ? nla_get_range_signed+0x4a0/0x4a0 ffff88004117f7e8
 […] ? gr_is_capable+0x6f/0xe0 ffff88004117f830
 […] ? __nla_parse+0x45/0x80 ffff88004117f860
 […] ? skb_pull+0x103/0x1a0 ffff88004117f880
 […] nfnetlink_rcv+0x3db/0x4a0 ffff88004117f8b0
 […] ? nfnetlink_rcv_batch+0x19a0/0x19a0 ffff88004117f8d8
 […] ? netlink_lookup+0xe2/0x240 ffff88004117f900
 […] netlink_unicast+0x74b/0xb00 ffff88004117f930
 […] ? netlink_attachskb+0xb20/0xb20 ffff88004117f980
 […] ? __check_object_size+0x3e/0xaa0 ffff88004117f998
 […] ? security_netlink_send+0x51/0x160 ffff88004117f9c8
 […] netlink_sendmsg+0xa03/0x1200 ffff88004117f9f8
 […] ? netlink_unicast+0xb00/0xb00 ffff88004117fa70
 […] ? netlink_unicast+0xb00/0xb00 ffff88004117fac8
 […] ? ____sys_sendmsg+0xe2a/0x1040 ffff88004117faf8
 […] ____sys_sendmsg+0xe2a/0x1040 ffff88004117fb00
 […] ? kernel_recvmsg+0x300/0x300 ffff88004117fb60
 […] ? reacquire_held_locks+0xe9/0x260 ffff88004117fbc8
 […] ___sys_sendmsg+0x138/0x200 ffff88004117fbf8
 […] ? do_recvmmsg+0x7e0/0x7e0 ffff88004117fc30
 […] ? lockdep_hardirqs_on_prepare+0x101/0x1e0 ffff88004117fc50
 […] ? lock_acquire+0x16f/0x320 ffff88004117fd20
 […] ? lock_acquire+0x16f/0x320 ffff88004117fd58
 […] ? find_held_lock+0x3b/0xe0 ffff88004117fd70
 […] __sys_sendmsg+0x17a/0x260 ffff88004117fdc8
 […] ? __sys_sendmsg_sock+0x80/0x80 ffff88004117fdf0
 […] ? syscall_trace_enter+0x15e/0x2c0 ffff88004117fe98
 […] do_syscall_64+0x7d/0x400 ffff88004117fec8
 […] entry_SYSCALL_64_safe_stack+0x4a/0x60 ffff88004117fef8
 </TASK>
==================================================================

The out-of-bounds report, though, is a red herring as it is for an
access that shouldn't have happened in the first place.

When nf_nat_init() fails to register its BPF kfuncs, it'll unwind and,
among others, call unregister_pernet_subsys() to deregister its per-net
ops. This makes the previously allocated net id available for reuse by
the next caller of register_pernet_subsys(), in our case, synproxy.
However, 'nat_net_id' will still hold the previously allocated value.

If nf_nat.o gets built as a module, all this doesn't matter. A failed
initialization routine makes the module fail to load and any dependent
module won't be able to load either. However, if nf_nat.o is built-in,
a failing init won't /completely/ make its functionality unavailable to
dependent modules, namely the code and static data is still there, free
to be called by modules like nft_chain_nat.ko.

Case in point, nft_chain_nat registers hooks that'll call into nf_nat
which, in our case, failed to initialize and therefore won't have a
valid net id nor related net_nat object any more.

Code in nf_nat, namely nf_nat_register_fn() and nf_nat_unregister_fn(),
still making use of the reallocated net id, leads to a type confusion as
the call to net_generic() will no longer return memory belonging to an
object suited to fit 'struct nat_net' but 'struct synproxy_net' instead.
The latter is only 24 bytes on 64-bit systems, much smaller than struct
nat_net which is 176 bytes, perfectly explaining the OOB KASAN report.

Detect and handle a failed nf_nat_init() by testing the 'nf_nat_hook'
pointer which will be reset to NULL on initialization errors to prevent
the usage of an invalid nat_net pointer.

As this check is only needed when nf_nat.o is built-in, guard it by
'#ifndef MODULE...'.

Fixes: cbc1dd5b659f ("netfilter: nf_nat: Fix possible memory leak in nf_nat_init()")
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_nat_core.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index 2bbf5163c0e2..63ff6b4d5d21 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -1181,6 +1181,16 @@ int nf_nat_register_fn(struct net *net, u8 pf, const struct nf_hook_ops *ops,
 	struct nf_hook_ops *nat_ops;
 	int i, ret;
 
+#ifndef MODULE
+	/* If nf_nat_core is built-in and nf_nat_init() fails, dependent
+	 * modules like nft_chain_nat.ko may still call this function.
+	 * However, nat_net would be invalid, likely pointing to some other
+	 * per-net structure.
+	 */
+	if (WARN_ON_ONCE(!nf_nat_hook))
+		return -EOPNOTSUPP;
+#endif
+
 	if (WARN_ON_ONCE(pf >= ARRAY_SIZE(nat_net->nat_proto_net)))
 		return -EINVAL;
 
-- 
2.47.3
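
The id-reuse type confusion described in the commit message above can be
illustrated with a toy user-space model. Everything named demo_* is
hypothetical and invented for this sketch; it does not reproduce the kernel's
net_generic() machinery, and the "lowest free id" allocation is an assumption
made only so the reuse is visible. The 176 and 24 byte sizes are the ones
quoted in the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_IDS 4

/* Toy stand-in for per-net generic storage: slot[id] holds one object. */
struct demo_net {
	void *slot[DEMO_MAX_IDS];
	size_t size[DEMO_MAX_IDS]; /* size of the object in the slot, 0 if free */
};

/* Hand out the lowest free id, so a released id is reused by the next caller. */
static int demo_register(struct demo_net *net, void *obj, size_t size)
{
	int id;

	assert(size > 0);
	for (id = 0; id < DEMO_MAX_IDS; id++) {
		if (net->size[id] == 0) {
			net->slot[id] = obj;
			net->size[id] = size;
			return id;
		}
	}
	return -1;
}

/* Release an id, as an init error path unwinding its registration would. */
static void demo_unregister(struct demo_net *net, int id)
{
	net->slot[id] = NULL;
	net->size[id] = 0;
}

/*
 * Guarded lookup: refuse to hand out a pointer when the owner of the id
 * never finished initializing. This plays the role of the nf_nat_hook
 * NULL test added by the patch.
 */
static void *demo_lookup(const struct demo_net *net, int id,
			 int owner_initialized)
{
	if (!owner_initialized)
		return NULL;
	return net->slot[id];
}
```

With a stale id kept around after demo_unregister(), an unguarded lookup
returns the smaller object registered later under the same id, which is the
shape of the out-of-bounds read KASAN reported.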



* [PATCH net 02/14] netfilter: nf_conncount: prevent connlimit drops for early confirmed ct
From: Pablo Neira Ayuso @ 2026-06-23 22:15 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms
In-Reply-To: <20260623221548.701545-1-pablo@netfilter.org>

From: Fernando Fernandez Mancera <fmancera@suse.de>

Commit 69894e5b4c5e ("netfilter: nft_connlimit: update the count if add
was skipped") introduced a regression where packets for valid
connections are dropped when using connlimit for soft-limiting
scenarios.

The issue occurs when a new connection reuses a socket currently in
the TIME_WAIT state. In this scenario, the connection tracking entry
is evaluated as already confirmed. Previously, __nf_conncount_add()
assumed that if a connection was confirmed and did not originate from
the loopback interface, it should skip the addition and return -EEXIST.

Skipping the addition triggers a garbage collection run that cleans up
the TIME_WAIT connection. Consequently, the active connection count
drops to 0, which xt_connlimit mishandles, leading to the false rejection
of the perfectly valid new connection.

Fix this by replacing the interface check with protocol-agnostic state
checks. We now skip the tree insertion and preserve the lockless garbage
collection optimization only if the connection is IPS_ASSURED. This
allows early-confirmed setup packets (such as reused TIME_WAIT sockets
or locally generated SYN-ACKs) to be properly evaluated and counted
without falsely dropping. The goto check_connections path is maintained
to ensure these setup packets are deduplicated correctly.

This has been tested with slowhttptest and HTTP server configured
locally to ensure we are not breaking soft-limiting scenarios for local
or external connections. In addition, it was tested with an OVS zone
limit too.

Fixes: 69894e5b4c5e ("netfilter: nft_connlimit: update the count if add was skipped")
Reported-by: Alejandro Olivan Alvarez <alejandro.olivan.alvarez@gmail.com>
Closes: https://lore.kernel.org/netfilter-devel/177349610461.3071718.4083978280323144323@eldamar.lan/
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_conncount.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index dd67004a5cc0..91582069f6d2 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -183,17 +183,16 @@ static int __nf_conncount_add(struct net *net,
 		return -ENOENT;

 	if (ct && nf_ct_is_confirmed(ct)) {
-		/* local connections are confirmed in postrouting so confirmation
-		 * might have happened before hitting connlimit
+		/* Connection is confirmed but might still be in the setup phase.
+		 * Only skip the tracking if it is fully assured. This guarantees
+		 * that setup packets or retransmissions are properly counted and
+		 * deduplicated.
 		 */
-		if (skb->skb_iif != LOOPBACK_IFINDEX) {
+		if (test_bit(IPS_ASSURED_BIT, &ct->status)) {
 			err = -EEXIST;
 			goto out_put;
 		}

-		/* this is likely a local connection, skip optimization to avoid
-		 * adding duplicates from a 'packet train'
-		 */
 		goto check_connections;
 	}

-- 
2.47.3
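
As a reading aid for the hunk above, the branch taken for a conntrack entry
before and after this patch can be written as two small decision functions.
This is a simplified sketch: the enum and function names are hypothetical,
and DEMO_FALLTHROUGH merely stands for "continue past the block shown in the
diff", whose details are not part of this patch.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_action {
	DEMO_SKIP_EEXIST,       /* return -EEXIST, keep the GC optimization */
	DEMO_CHECK_CONNECTIONS, /* goto check_connections, deduplicate */
	DEMO_FALLTHROUGH,       /* not confirmed: continue past this block */
};

/* Behaviour before the patch: the decision depends on the input interface. */
static enum demo_action demo_old(bool confirmed, bool assured,
				 bool from_loopback)
{
	(void)assured;
	if (!confirmed)
		return DEMO_FALLTHROUGH;
	return from_loopback ? DEMO_CHECK_CONNECTIONS : DEMO_SKIP_EEXIST;
}

/* Behaviour after the patch: the decision depends on IPS_ASSURED only. */
static enum demo_action demo_new(bool confirmed, bool assured,
				 bool from_loopback)
{
	(void)from_loopback;
	if (!confirmed)
		return DEMO_FALLTHROUGH;
	return assured ? DEMO_SKIP_EEXIST : DEMO_CHECK_CONNECTIONS;
}
```

The TIME_WAIT reuse case from the commit message corresponds to
confirmed == true, assured == false, from_loopback == false: demo_old()
skips the addition while demo_new() goes through check_connections.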


* [PATCH net 03/14] netfilter: flowtable: Validate iph->ihl in nf_flow_ip4_tunnel_proto()
From: Pablo Neira Ayuso @ 2026-06-23 22:15 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms
In-Reply-To: <20260623221548.701545-1-pablo@netfilter.org>

From: Lorenzo Bianconi <lorenzo@kernel.org>

Add sanity check for iph->ihl field in nf_flow_ip4_tunnel_proto() before
using it to compute the header size, avoiding out-of-bounds access with
malformed IP headers.
While at it, use iph->protocol instead of the hardcoded IPPROTO_IPIP
constant when setting ctx->tun.proto and reference ctx->tun.hdr_size
when updating ctx->offset.

Fixes: ab427db178858 ("netfilter: flowtable: Add IPIP rx sw acceleration")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_flow_table_ip.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/nf_flow_table_ip.c b/net/netfilter/nf_flow_table_ip.c
index e7a3fb2b2d94..29e93ac1e2e4 100644
--- a/net/netfilter/nf_flow_table_ip.c
+++ b/net/netfilter/nf_flow_table_ip.c
@@ -326,8 +326,10 @@ static bool nf_flow_ip4_tunnel_proto(struct nf_flowtable_ctx *ctx,
 		return false;
 
 	iph = (struct iphdr *)(skb_network_header(skb) + ctx->offset);
-	size = iph->ihl << 2;
+	if (iph->ihl < 5)
+		return false;
 
+	size = iph->ihl << 2;
 	if (ip_is_fragment(iph) || unlikely(ip_has_options(size)))
 		return false;
 
@@ -335,9 +337,9 @@ static bool nf_flow_ip4_tunnel_proto(struct nf_flowtable_ctx *ctx,
 		return false;
 
 	if (iph->protocol == IPPROTO_IPIP) {
-		ctx->tun.proto = IPPROTO_IPIP;
+		ctx->tun.proto = iph->protocol;
 		ctx->tun.hdr_size = size;
-		ctx->offset += size;
+		ctx->offset += ctx->tun.hdr_size;
 	}
 
 	return true;
-- 
2.47.3
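
The reason for rejecting iph->ihl < 5 before computing ihl << 2 can be shown
with a standalone user-space parser. This is a sketch under stated
assumptions: demo_ipv4_hdr_size() is a hypothetical helper, it reads the
version/IHL byte directly instead of using struct iphdr, and its final
bounds check against the buffer length is an addition for the sketch, not
something taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Derive the IPv4 header size from the first header byte (version in the
 * high nibble, IHL in the low nibble). Returns false for an empty buffer,
 * a non-IPv4 version, an IHL below the 5-word minimum, or a header that
 * would not fit in the buffer.
 */
static bool demo_ipv4_hdr_size(const uint8_t *hdr, size_t len,
			       unsigned int *size)
{
	unsigned int ihl;

	assert(hdr != NULL && size != NULL);

	if (len < 1)
		return false;
	if ((hdr[0] >> 4) != 4)
		return false;

	ihl = hdr[0] & 0x0f;
	if (ihl < 5)		/* shorter than the fixed 20-byte header */
		return false;

	*size = ihl << 2;	/* IHL counts 32-bit words */
	return *size <= len;
}
```

An IHL of 2 would otherwise yield a claimed header size of 8 bytes, smaller
than the fixed IPv4 header, and later offsets computed from it would point
into the middle of the header.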



* [PATCH net 04/14] netfilter: x_tables.h: fix all kernel-doc warnings
From: Pablo Neira Ayuso @ 2026-06-23 22:15 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms
In-Reply-To: <20260623221548.701545-1-pablo@netfilter.org>

From: Randy Dunlap <rdunlap@infradead.org>

- use correct names in kernel-doc comments
- add missing struct members to kernel-doc comments

Warning: include/linux/netfilter/x_tables.h:41 struct member 'targinfo' not described in 'xt_action_param'
Warning: include/linux/netfilter/x_tables.h:41 Excess struct member 'targetinfo' description in 'xt_action_param'
Warning: include/linux/netfilter/x_tables.h:90 struct member 'family' not described in 'xt_mtchk_param'
Warning: include/linux/netfilter/x_tables.h:90 struct member 'nft_compat' not described in 'xt_mtchk_param'
Warning: include/linux/netfilter/x_tables.h:101 expecting prototype for struct xt_mdtor_param. Prototype was for struct xt_mtdtor_param instead

Warning: include/linux/netfilter/x_tables.h:121 struct member 'net' not described in 'xt_tgchk_param'
Warning: include/linux/netfilter/x_tables.h:121 struct member 'table' not described in 'xt_tgchk_param'
Warning: include/linux/netfilter/x_tables.h:121 struct member 'target' not described in 'xt_tgchk_param'
Warning: include/linux/netfilter/x_tables.h:121 struct member 'targinfo' not described in 'xt_tgchk_param'
Warning: include/linux/netfilter/x_tables.h:121 struct member 'hook_mask' not described in 'xt_tgchk_param'
Warning: include/linux/netfilter/x_tables.h:121 struct member 'family' not described in 'xt_tgchk_param'
Warning: include/linux/netfilter/x_tables.h:121 struct member 'nft_compat' not described in 'xt_tgchk_param'

Warning: include/linux/netfilter/x_tables.h:345 expecting prototype for xt_recseq(). Prototype was for DECLARE_PER_CPU() instead

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/linux/netfilter/x_tables.h | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h
index 20d70dddbe50..25062f4a0dd5 100644
--- a/include/linux/netfilter/x_tables.h
+++ b/include/linux/netfilter/x_tables.h
@@ -18,7 +18,7 @@
  * @match:	the match extension
  * @target:	the target extension
  * @matchinfo:	per-match data
- * @targetinfo:	per-target data
+ * @targinfo:	per-target data
  * @state:	pointer to hook state this packet came from
  * @fragoff:	packet is a fragment, this is the data offset
  * @thoff:	position of transport header relative to skb->data
@@ -77,7 +77,9 @@ static inline u_int8_t xt_family(const struct xt_action_param *par)
  * @match:	struct xt_match through which this function was invoked
  * @matchinfo:	per-match data
  * @hook_mask:	via which hooks the new rule is reachable
- * Other fields as above.
+ * @family:	actual NFPROTO_* through which the function is invoked
+ *		(helpful when match->family == NFPROTO_UNSPEC)
+ * @nft_compat:	running from the nft compat layer if true
  */
 struct xt_mtchk_param {
 	struct net *net;
@@ -91,8 +93,13 @@ struct xt_mtchk_param {
 };
 
 /**
- * struct xt_mdtor_param - match destructor parameters
- * Fields as above.
+ * struct xt_mtdtor_param - match destructor parameters
+ *
+ * @net:	network namespace through which the check was invoked
+ * @match:	struct xt_match through which this function was invoked
+ * @matchinfo:	per-match data
+ * @family:	actual NFPROTO_* through which the function is invoked
+ *		(helpful when match->family == NFPROTO_UNSPEC)
  */
 struct xt_mtdtor_param {
 	struct net *net;
@@ -105,10 +112,16 @@ struct xt_mtdtor_param {
  * struct xt_tgchk_param - parameters for target extensions'
  * checkentry functions
  *
+ * @net:	network namespace through which the check was invoked
+ * @table:	table the rule is tried to be inserted into
  * @entryinfo:	the family-specific rule data
  * 		(struct ipt_entry, ip6t_entry, arpt_entry, ebt_entry)
- *
- * Other fields see above.
+ * @target:	the target extension
+ * @targinfo:	per-target data
+ * @hook_mask:	via which hooks the new rule is reachable
+ * @family:	actual NFPROTO_* through which the function is invoked
+ *		(helpful when match->family == NFPROTO_UNSPEC)
+ * @nft_compat:	running from the nft compat layer if true
  */
 struct xt_tgchk_param {
 	struct net *net;
@@ -336,9 +349,9 @@ struct xt_table_info *xt_alloc_table_info(unsigned int size);
 void xt_free_table_info(struct xt_table_info *info);
 
 /**
- * xt_recseq - recursive seqcount for netfilter use
+ * var xt_recseq - recursive seqcount for netfilter use
  *
- * Packet processing changes the seqcount only if no recursion happened
+ * Packet processing changes the seqcount only if no recursion happened.
  * get_counters() can use read_seqcount_begin()/read_seqcount_retry(),
  * because we use the normal seqcount convention :
  * Low order bit set to 1 if a writer is active.
-- 
2.47.3



* [PATCH net 05/14] netfilter: nft_synproxy: stop bypassing the priv->info snapshot
From: Pablo Neira Ayuso @ 2026-06-23 22:15 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms
In-Reply-To: <20260623221548.701545-1-pablo@netfilter.org>

From: Runyu Xiao <runyu.xiao@seu.edu.cn>

nft_synproxy_eval_v4() and nft_synproxy_eval_v6() already take a
whole-object READ_ONCE() snapshot of the shared priv->info state before
building the SYNACK reply, but nft_synproxy_tcp_options() still masks
opts->options with priv->info.options from the live shared object.

When a named synproxy object is updated concurrently with SYN traffic,
the eval path can then mix mss and timestamp handling from the local
snapshot with an options mask taken from a newer configuration, so one
SYNACK no longer reflects a coherent synproxy configuration.

Use info->options so nft_synproxy_tcp_options() stays on the same local
snapshot that the eval path already copied from priv->info.

Fixes: ee394f96ad75 ("netfilter: nft_synproxy: add synproxy stateful object support")
Signed-off-by: Runyu Xiao <runyu.xiao@seu.edu.cn>
Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nft_synproxy.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/net/netfilter/nft_synproxy.c b/net/netfilter/nft_synproxy.c
index 7641f249614c..9ed288c9d168 100644
--- a/net/netfilter/nft_synproxy.c
+++ b/net/netfilter/nft_synproxy.c
@@ -24,14 +24,13 @@ static const struct nla_policy nft_synproxy_policy[NFTA_SYNPROXY_MAX + 1] = {
 static void nft_synproxy_tcp_options(struct synproxy_options *opts,
 				     const struct tcphdr *tcp,
 				     struct synproxy_net *snet,
-				     struct nf_synproxy_info *info,
-				     const struct nft_synproxy *priv)
+				     struct nf_synproxy_info *info)
 {
 	this_cpu_inc(snet->stats->syn_received);
 	if (tcp->ece && tcp->cwr)
 		opts->options |= NF_SYNPROXY_OPT_ECN;
 
-	opts->options &= priv->info.options;
+	opts->options &= info->options;
 	opts->mss_encode = opts->mss_option;
 	opts->mss_option = info->mss;
 	if (opts->options & NF_SYNPROXY_OPT_TIMESTAMP)
@@ -56,7 +55,7 @@ static void nft_synproxy_eval_v4(const struct nft_synproxy *priv,
 
 	if (tcp->syn) {
 		/* Initial SYN from client */
-		nft_synproxy_tcp_options(opts, tcp, snet, &info, priv);
+		nft_synproxy_tcp_options(opts, tcp, snet, &info);
 		synproxy_send_client_synack(net, skb, tcp, opts);
 		consume_skb(skb);
 		regs->verdict.code = NF_STOLEN;
@@ -87,7 +86,7 @@ static void nft_synproxy_eval_v6(const struct nft_synproxy *priv,
 
 	if (tcp->syn) {
 		/* Initial SYN from client */
-		nft_synproxy_tcp_options(opts, tcp, snet, &info, priv);
+		nft_synproxy_tcp_options(opts, tcp, snet, &info);
 		synproxy_send_client_synack_ipv6(net, skb, tcp, opts);
 		consume_skb(skb);
 		regs->verdict.code = NF_STOLEN;
-- 
2.47.3



* [PATCH net 06/14] selftests: netfilter: conntrack_sctp_collision.sh: Introduce SCTP INIT collision test
From: Pablo Neira Ayuso @ 2026-06-23 22:15 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms
In-Reply-To: <20260623221548.701545-1-pablo@netfilter.org>

From: Yi Chen <yiche.cy@gmail.com>

The existing test covered a scenario where a delayed INIT_ACK chunk
updates the vtag in conntrack after the association has already been
established.

A similar issue can occur with a delayed SCTP INIT chunk.

Add a new simultaneous-open test case where the client's INIT is
delayed, allowing conntrack to establish the association based on
the server-initiated handshake.

When the stale INIT arrives later, it may get recorded and cause a
following INIT_ACK from the peer to be accepted instead of dropped.
This INIT_ACK overwrites the vtag in conntrack, causing subsequent
SCTP DATA chunks to be considered as invalid and then dropped by
nft rules matching on ct state invalid.

This test verifies such stale INIT chunks do not cause problems.

Signed-off-by: Yi Chen <yiche.cy@gmail.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 .../net/netfilter/conntrack_sctp_collision.sh | 89 ++++++++++++++-----
 1 file changed, 67 insertions(+), 22 deletions(-)

diff --git a/tools/testing/selftests/net/netfilter/conntrack_sctp_collision.sh b/tools/testing/selftests/net/netfilter/conntrack_sctp_collision.sh
index d860f7d9744b..7261975957ef 100755
--- a/tools/testing/selftests/net/netfilter/conntrack_sctp_collision.sh
+++ b/tools/testing/selftests/net/netfilter/conntrack_sctp_collision.sh
@@ -2,18 +2,32 @@
 # SPDX-License-Identifier: GPL-2.0
 #
 # Testing For SCTP COLLISION SCENARIO as Below:
-#
+# 1. Stale INIT_ACK capture:
 #   14:35:47.655279 IP CLIENT_IP.PORT > SERVER_IP.PORT: sctp (1) [INIT] [init tag: 2017837359]
 #   14:35:48.353250 IP SERVER_IP.PORT > CLIENT_IP.PORT: sctp (1) [INIT] [init tag: 1187206187]
 #   14:35:48.353275 IP CLIENT_IP.PORT > SERVER_IP.PORT: sctp (1) [INIT ACK] [init tag: 2017837359]
 #   14:35:48.353283 IP SERVER_IP.PORT > CLIENT_IP.PORT: sctp (1) [COOKIE ECHO]
 #   14:35:48.353977 IP CLIENT_IP.PORT > SERVER_IP.PORT: sctp (1) [COOKIE ACK]
 #   14:35:48.855335 IP SERVER_IP.PORT > CLIENT_IP.PORT: sctp (1) [INIT ACK] [init tag: 164579970]
+#   (Delayed)
+#
+# 2. Stale INIT capture:
+#   14:35:48.353250 IP SERVER_IP.PORT > CLIENT_IP.PORT: sctp (1) [INIT] [init tag: 1187206187]
+#   14:35:48.353275 IP CLIENT_IP.PORT > SERVER_IP.PORT: sctp (1) [INIT ACK] [init tag: 2017837359]
+#   14:35:48.353283 IP SERVER_IP.PORT > CLIENT_IP.PORT: sctp (1) [COOKIE ECHO]
+#   14:35:48.353977 IP CLIENT_IP.PORT > SERVER_IP.PORT: sctp (1) [COOKIE ACK]
+#   14:35:47.655279 IP CLIENT_IP.PORT > SERVER_IP.PORT: sctp (1) [INIT] [init tag: 2017837359]
+#   (Delayed)
+#   14:35:48.855335 IP SERVER_IP.PORT > CLIENT_IP.PORT: sctp (1) [INIT ACK] [init tag: 164579970]
 #
 # TOPO: SERVER_NS (link0)<--->(link1) ROUTER_NS (link2)<--->(link3) CLIENT_NS
 
 source lib.sh
 
+checktool "nft --version" "run test without nft"
+checktool "tc -h" "run test without tc"
+checktool "modprobe -q sctp" "load sctp module"
+
 CLIENT_IP="198.51.200.1"
 CLIENT_PORT=1234
 
@@ -24,7 +38,8 @@ CLIENT_GW="198.51.200.2"
 SERVER_GW="198.51.100.2"
 
 # setup the topo
-setup() {
+topo_setup() {
+	# setup_ns cleans up existing net namespaces first.
 	setup_ns CLIENT_NS SERVER_NS ROUTER_NS
 	ip -n "$SERVER_NS" link add link0 type veth peer name link1 netns "$ROUTER_NS"
 	ip -n "$CLIENT_NS" link add link3 type veth peer name link2 netns "$ROUTER_NS"
@@ -38,35 +53,53 @@ setup() {
 	ip -n "$ROUTER_NS" addr add $SERVER_GW/24 dev link1
 	ip -n "$ROUTER_NS" addr add $CLIENT_GW/24 dev link2
 	ip net exec "$ROUTER_NS" sysctl -wq net.ipv4.ip_forward=1
+	sysctl -wq net.netfilter.nf_log_all_netns=1
 
 	ip -n "$CLIENT_NS" link set link3 up
 	ip -n "$CLIENT_NS" addr add $CLIENT_IP/24 dev link3
 	ip -n "$CLIENT_NS" route add $SERVER_IP dev link3 via $CLIENT_GW
+}
+
+conf_delay()
+{
+	# simulate the delay on OVS upcall by setting up a delay for INIT_ACK/INIT with
+	local ns=$1
+	local link=$2
+	local chunk_type=$3
 
-	# simulate the delay on OVS upcall by setting up a delay for INIT_ACK with
-	# tc on $SERVER_NS side
-	tc -n "$SERVER_NS" qdisc add dev link0 root handle 1: htb r2q 64
-	tc -n "$SERVER_NS" class add dev link0 parent 1: classid 1:1 htb rate 100mbit
-	tc -n "$SERVER_NS" filter add dev link0 parent 1: protocol ip u32 match ip protocol 132 \
-		0xff match u8 2 0xff at 32 flowid 1:1
-	if ! tc -n "$SERVER_NS" qdisc add dev link0 parent 1:1 handle 10: netem delay 1200ms; then
+	# use a smaller number for assoc's max_retrans to reproduce the issue
+	ip net exec "$CLIENT_NS" sysctl -wq net.sctp.association_max_retrans=3
+
+	tc -n "$ns" qdisc add dev "$link" root handle 1: htb r2q 64
+	tc -n "$ns" class add dev "$link" parent 1: classid 1:1 htb rate 100mbit
+	tc -n "$ns" filter add dev "$link" parent 1: protocol ip \
+		u32 match ip protocol 132 0xff match u8 "$chunk_type" 0xff at 32 flowid 1:1
+	if ! tc -n "$ns" qdisc add dev "$link" parent 1:1 handle 10: netem delay 1200ms; then
 		echo "SKIP: Cannot add netem qdisc"
-		exit $ksft_skip
+		return $ksft_skip
 	fi
 
 	# simulate the ctstate check on OVS nf_conntrack
-	ip net exec "$ROUTER_NS" iptables -A FORWARD -m state --state INVALID,UNTRACKED -j DROP
-	ip net exec "$ROUTER_NS" iptables -A INPUT -p sctp -j DROP
-
-	# use a smaller number for assoc's max_retrans to reproduce the issue
-	modprobe -q sctp
-	ip net exec "$CLIENT_NS" sysctl -wq net.sctp.association_max_retrans=3
+	ip net exec "$ROUTER_NS" nft -f - <<-EOF
+	table ip t {
+		chain forward {
+			type filter hook forward priority filter; policy accept;
+			meta l4proto icmp counter accept
+			ct state new counter accept
+			ct state established,related counter accept
+			ct state invalid log flags all counter drop comment \
+			"Expect to drop stale INIT/INIT_ACK chunks"
+			counter
+		}
+	}
+	EOF
+	return 0
 }
 
 cleanup() {
-	ip net exec "$CLIENT_NS" pkill sctp_collision >/dev/null 2>&1
-	ip net exec "$SERVER_NS" pkill sctp_collision >/dev/null 2>&1
+	# cleanup_all_ns terminates running processes in the namespaces.
 	cleanup_all_ns
+	sysctl -wq net.netfilter.nf_log_all_netns=0
 }
 
 do_test() {
@@ -81,7 +114,19 @@ do_test() {
 
 # run the test case
 trap cleanup EXIT
-setup && \
-echo "Test for SCTP Collision in nf_conntrack:" && \
-do_test && echo "PASS!"
-exit $?
+
+echo "Test for SCTP INIT_ACK Collision in nf_conntrack:"
+topo_setup || exit $?
+conf_delay $SERVER_NS link0 2 || exit $?
+
+if ! do_test; then
+	exit $ksft_fail
+fi
+
+echo "Test for SCTP INIT Collision in nf_conntrack:"
+topo_setup || exit $?
+conf_delay $CLIENT_NS link3 1 || exit $?
+
+if ! do_test; then
+	exit $ksft_fail
+fi
-- 
2.47.3


^ permalink raw reply related

* [PATCH net 07/14] netfilter: nft_compat: ebtables emulation must reject non-bridge targets
From: Pablo Neira Ayuso @ 2026-06-23 22:15 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms
In-Reply-To: <20260623221548.701545-1-pablo@netfilter.org>

From: Florian Westphal <fw@strlen.de>

xtables targets return netfilter verdicts: NF_ACCEPT, NF_DROP, and so
on.  ebtables targets return incompatible verdicts: EBT_ACCEPT,
EBT_DROP, ...   We cannot allow fallback to NFPROTO_UNSPEC.

ebtables doesn't permit this since
11ff7288beb2 ("netfilter: ebtables: reject non-bridge targets")
but that commit missed the nft_compat layer.
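
A minimal userspace sketch of why the fallback is unsafe and what the
added check does. The verdict and family constants match the kernel UAPI
headers; struct model_target and model_bridge_validate() are hypothetical
stand-ins for illustration, not the kernel's code:

```c
#include <assert.h>
#include <errno.h>

/* Verdict encodings from the kernel UAPI headers. */
#define NF_DROP      0
#define NF_ACCEPT    1
#define EBT_ACCEPT  (-1)
#define EBT_DROP    (-2)

/* Protocol family numbers from the kernel UAPI headers. */
enum { NFPROTO_UNSPEC = 0, NFPROTO_BRIDGE = 7 };

/* The two verdict namespaces do not overlap, so one cannot stand in
 * for the other.
 */
static_assert(EBT_ACCEPT != NF_ACCEPT, "verdict encodings differ");
static_assert(EBT_DROP != NF_DROP, "verdict encodings differ");

/* Hypothetical stand-in for struct xt_target: only the family matters. */
struct model_target {
	int family;
};

/* Models the idea of nft_target_bridge_validate(): a target used from
 * the bridge family must itself be registered for NFPROTO_BRIDGE.
 */
static int model_bridge_validate(const struct model_target *target)
{
	if (target->family != NFPROTO_BRIDGE)
		return -ENOENT;

	return 0;
}
```

In this model an NFPROTO_UNSPEC target is rejected with -ENOENT, while a
target registered for NFPROTO_BRIDGE passes.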

Reported-by: Ren Wei <n05ec@lzu.edu.cn>
Reported-by: Wyatt Feng <bronzed_45_vested@icloud.com>
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Fixes: 0ca743a55991 ("netfilter: nf_tables: add compatibility layer for x_tables")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nft_compat.c | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/nft_compat.c b/net/netfilter/nft_compat.c
index 0caa9304d2d0..63864b928259 100644
--- a/net/netfilter/nft_compat.c
+++ b/net/netfilter/nft_compat.c
@@ -397,6 +397,22 @@ static int nft_target_validate(const struct nft_ctx *ctx,
 	return 0;
 }
 
+static int nft_target_bridge_validate(const struct nft_ctx *ctx,
+				      const struct nft_expr *expr)
+{
+	struct xt_target *target = expr->ops->data;
+
+	/* Do not allow UNSPEC to stand-in for NFPROTO_BRIDGE
+	 * targets: they are incompatible.  ebtables targets return
+	 * EBT_ACCEPT, DROP and so on which are not compatible with
+	 * NF_ACCEPT, NF_DROP and so on.
+	 */
+	if (target->family != NFPROTO_BRIDGE)
+		return -ENOENT;
+
+	return nft_target_validate(ctx, expr);
+}
+
 static void __nft_match_eval(const struct nft_expr *expr,
 			     struct nft_regs *regs,
 			     const struct nft_pktinfo *pkt,
@@ -932,13 +948,15 @@ nft_target_select_ops(const struct nft_ctx *ctx,
 	ops->init = nft_target_init;
 	ops->destroy = nft_target_destroy;
 	ops->dump = nft_target_dump;
-	ops->validate = nft_target_validate;
 	ops->data = target;
 
-	if (family == NFPROTO_BRIDGE)
+	if (family == NFPROTO_BRIDGE) {
 		ops->eval = nft_target_eval_bridge;
-	else
+		ops->validate = nft_target_bridge_validate;
+	} else {
 		ops->eval = nft_target_eval_xt;
+		ops->validate = nft_target_validate;
+	}
 
 	return ops;
 err:
-- 
2.47.3


^ permalink raw reply related

* [PATCH net 08/14] selftests: nft_queue.sh: add a bridge queue test
From: Pablo Neira Ayuso @ 2026-06-23 22:15 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms
In-Reply-To: <20260623221548.701545-1-pablo@netfilter.org>

From: Florian Westphal <fw@strlen.de>

Add a test that queues from the bridge family.
This was lacking: we queued from inet for ipv4 and ipv6, but
we had no bridge queue test so far.

Given that the kernel MUST validate on reinject that the in/out ports
are still part of a bridge device, add a test case for this before
adding the check.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 .../selftests/net/netfilter/nft_queue.sh      | 66 ++++++++++++++++---
 1 file changed, 58 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/net/netfilter/nft_queue.sh b/tools/testing/selftests/net/netfilter/nft_queue.sh
index d80390848e85..7c857a2e0f34 100755
--- a/tools/testing/selftests/net/netfilter/nft_queue.sh
+++ b/tools/testing/selftests/net/netfilter/nft_queue.sh
@@ -85,11 +85,12 @@ ip -net "$ns3" route add default via 10.0.3.1
 ip -net "$ns3" route add default via dead:3::1
 
 load_ruleset() {
-	local name=$1
-	local prio=$2
+	local family=$1
+	local name=$2
+	local prio=$3
 
 ip netns exec "$nsrouter" nft -f /dev/stdin <<EOF
-table inet $name {
+table $family $name {
 	chain nfq {
 		ip protocol icmp queue bypass
 		icmpv6 type { "echo-request", "echo-reply" } queue num 1 bypass
@@ -228,6 +229,7 @@ nf_queue_wait()
 test_queue()
 {
 	local expected="$1"
+	local family="$2"
 	local last=""
 
 	# spawn nf_queue listeners
@@ -255,11 +257,13 @@ test_queue()
 		if [ x"$last" != x"$expected packets total" ]; then
 			echo "FAIL: Expected $expected packets total, but got $last" 1>&2
 			ip netns exec "$nsrouter" nft list ruleset
+			echo -n "$TMPFILE0: ";cat "$TMPFILE0"
+			echo -n "$TMPFILE1: ";cat "$TMPFILE1"
 			exit 1
 		fi
 	done
 
-	echo "PASS: Expected and received $last"
+	echo "PASS: Expected and received $last ($family)"
 }
 
 listener_ready()
@@ -400,6 +404,8 @@ EOF
 
 	kill "$nfqpid"
 	echo "PASS: icmp+nfqueue via vrf"
+	ip -net "$ns1" link del tvrf
+	ip netns exec "$ns1" nft flush ruleset
 }
 
 sctp_listener_ready()
@@ -814,12 +820,53 @@ EOF
 	check_tainted "queue program exiting while packets queued"
 }
 
+test_queue_bridge()
+{
+	ip -net "$nsrouter" addr flush dev veth0
+	ip -net "$nsrouter" addr flush dev veth1
+
+	ip -net "$nsrouter" link add br0 type bridge
+	ip -net "$nsrouter" link set veth0 master br0
+	ip -net "$nsrouter" link set veth1 master br0
+
+	ip -net "$nsrouter" link set br0 up
+
+	ip -net "$nsrouter" addr add 10.0.2.1/16 dev br0
+	ip -net "$nsrouter" addr add dead:2::1/64 dev br0 nodad
+
+	ip -net "$ns1" addr flush dev eth0
+	ip -net "$ns2" addr flush dev eth0
+
+	ip -net "$ns1" addr add 10.0.1.1/16 dev eth0
+	ip -net "$ns1" addr add dead:2::2/64 dev eth0 nodad
+
+	ip -net "$ns2" addr add 10.0.2.99/16 dev eth0
+	ip -net "$ns2" addr add dead:2::99/64 dev eth0 nodad
+
+	ip netns exec "$nsrouter" nft flush ruleset
+
+	ip netns exec "$nsrouter" sysctl net.ipv6.conf.all.forwarding=0 > /dev/null
+	ip netns exec "$nsrouter" sysctl net.ipv4.conf.veth0.forwarding=0 > /dev/null
+	ip netns exec "$nsrouter" sysctl net.ipv4.conf.veth1.forwarding=0 > /dev/null
+
+	if ! test_ping;then
+		echo "FAIL: netns bridge connectivity" 1>&2
+		exit $ret
+	fi
+
+	load_ruleset "bridge" "filter" 10
+	test_queue 10 "bridge"
+
+	load_ruleset "bridge" "filter2" 20
+	test_queue 20 "bridge"
+}
+
 ip netns exec "$nsrouter" sysctl net.ipv6.conf.all.forwarding=1 > /dev/null
 ip netns exec "$nsrouter" sysctl net.ipv4.conf.veth0.forwarding=1 > /dev/null
 ip netns exec "$nsrouter" sysctl net.ipv4.conf.veth1.forwarding=1 > /dev/null
 ip netns exec "$nsrouter" sysctl net.ipv4.conf.veth2.forwarding=1 > /dev/null
 
-load_ruleset "filter" 0
+load_ruleset "inet" "filter" 0
 
 if test_ping; then
 	# queue bypass works (rules were skipped, no listener)
@@ -842,11 +889,11 @@ load_counter_ruleset 10
 # 1x icmp prerouting,forward,postrouting -> 3 queue events (6 incl. reply).
 # 1x icmp prerouting,input,output postrouting -> 4 queue events incl. reply.
 # so we expect that userspace program receives 10 packets.
-test_queue 10
+test_queue 10 "inet"
 
 # same.  We queue to a second program as well.
-load_ruleset "filter2" 20
-test_queue 20
+load_ruleset "inet" "filter2" 20
+test_queue 20 "inet"
 ip netns exec "$ns1" nft flush ruleset
 
 test_tcp_forward
@@ -863,4 +910,7 @@ test_queue_stress
 test_icmp_vrf
 test_queue_removal
 
+# turns router into a bridge
+test_queue_bridge
+
 exit $ret
-- 
2.47.3


^ permalink raw reply related

* [PATCH net 09/14] netfilter: ctnetlink: do not allow to reset helper on existing conntrack
From: Pablo Neira Ayuso @ 2026-06-23 22:15 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms
In-Reply-To: <20260623221548.701545-1-pablo@netfilter.org>

This feature allows resetting the helper of an existing conntrack, but it
is not safe. Doing so safely requires a synchronize_rcu() call after
resetting the helper, which is going to be expensive for a large batch of
conntrack entries. A fix would also need to call the .destroy callback to
release the GRE/PPTP mappings.

This feature antedates the creation of the conntrack-tools and I cannot
find a good use-case for this. Given that I cannot find any user in the
netfilter.org userspace tree, I prefer to remove this feature.

Fixes: c1d10adb4a52 ("[NETFILTER]: Add ctnetlink port for nf_conntrack")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_conntrack_netlink.c | 13 -------------
 1 file changed, 13 deletions(-)

diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 4e78d2482989..cb38ef42e9e6 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -1953,19 +1953,6 @@ static int ctnetlink_change_helper(struct nf_conn *ct,
 		return err;
 	}
 
-	if (!strcmp(helpname, "") && help) {
-		helper = rcu_dereference(help->helper);
-		if (helper) {
-			/* we had a helper before ... */
-			nf_ct_remove_expectations(ct);
-			RCU_INIT_POINTER(help->helper, NULL);
-			if (refcount_dec_and_test(&helper->ct_refcnt))
-				kfree_rcu(helper, rcu);
-		}
-		rcu_read_unlock();
-		return 0;
-	}
-
 	helper = __nf_conntrack_helper_find(helpname, nf_ct_l3num(ct),
 					    nf_ct_protonum(ct));
 	if (helper == NULL) {
-- 
2.47.3


^ permalink raw reply related

* [PATCH net 10/14] netfilter: conntrack: add deprecation warnings for irc and pptp trackers
From: Pablo Neira Ayuso @ 2026-06-23 22:15 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms
In-Reply-To: <20260623221548.701545-1-pablo@netfilter.org>

From: Florian Westphal <fw@strlen.de>

IRC Direct client-to-client requires plaintext.  IRC over TLS should be
preferred, making this helper ineffective.  Add a deprecation warning and
update the help text to better reflect that this is needed for the DCC
extension, not IRC itself.

PPTP is esoteric these days and it is the only helper that requires the
destroy callback in the conntrack helper API.

Removal would simplify the conntrack core.

Both helpers are IPv4 only.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_conntrack_helper.h |  4 ++++
 net/netfilter/Kconfig                       | 11 ++++++-----
 net/netfilter/nf_conntrack_irc.c            |  2 ++
 net/netfilter/nf_conntrack_pptp.c           |  2 ++
 4 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack_helper.h b/include/net/netfilter/nf_conntrack_helper.h
index 81025101f86d..c761cd8158b2 100644
--- a/include/net/netfilter/nf_conntrack_helper.h
+++ b/include/net/netfilter/nf_conntrack_helper.h
@@ -114,6 +114,10 @@ int nf_conntrack_helpers_register(struct nf_conntrack_helper *, unsigned int,
 void nf_conntrack_helpers_unregister(struct nf_conntrack_helper **,
 				     unsigned int);
 
+#define nf_conntrack_helper_deprecated(name) \
+	pr_warn("The %s conntrack helper is scheduled for removal.\n"	\
+		"Please contact the netfilter-devel mailing list if you still need this.\n", name)
+
 struct nf_conn_help *nf_ct_helper_ext_add(struct nf_conn *ct, gfp_t gfp);
 
 int __nf_ct_try_assign_helper(struct nf_conn *ct, struct nf_conn *tmpl,
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 665f8008cc4b..4c04cd8d40a2 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -256,8 +256,7 @@ config NF_CONNTRACK_H323
 	  To compile it as a module, choose M here.  If unsure, say N.
 
 config NF_CONNTRACK_IRC
-	tristate "IRC protocol support"
-	default m if NETFILTER_ADVANCED=n
+	tristate "IRC DCC protocol support (obsolete)"
 	help
 	  There is a commonly-used extension to IRC called
 	  Direct Client-to-Client Protocol (DCC).  This enables users to send
@@ -267,6 +266,8 @@ config NF_CONNTRACK_IRC
 	  using NAT, this extension will enable you to send files and initiate
 	  chats.  Note that you do NOT need this extension to get files or
 	  have others initiate chats, or everything else in IRC.
+	  DCC tracking behind NAT requires plaintext (unencrypted) IRC, so
+	  this helper is of limited use these days.
 
 	  To compile it as a module, choose M here.  If unsure, say N.
 
@@ -308,17 +309,17 @@ config NF_CONNTRACK_SNMP
 	  To compile it as a module, choose M here.  If unsure, say N.
 
 config NF_CONNTRACK_PPTP
-	tristate "PPtP protocol support"
+	tristate "PPtP protocol support (deprecated)"
 	depends on NETFILTER_ADVANCED
 	select NF_CT_PROTO_GRE
 	help
 	  This module adds support for PPTP (Point to Point Tunnelling
 	  Protocol, RFC2637) connection tracking and NAT.
 
-	  If you are running PPTP sessions over a stateful firewall or NAT
+	  If you are still running PPTP sessions over a stateful firewall or NAT
 	  box, you may want to enable this feature.
 
-	  Please note that not all PPTP modes of operation are supported yet.
+	  Please note that not all PPTP modes of operation are supported.
 	  Specifically these limitations exist:
 	    - Blindly assumes that control connections are always established
 	      in PNS->PAC direction. This is a violation of RFC2637.
diff --git a/net/netfilter/nf_conntrack_irc.c b/net/netfilter/nf_conntrack_irc.c
index 0c117b8492e9..193ab34db795 100644
--- a/net/netfilter/nf_conntrack_irc.c
+++ b/net/netfilter/nf_conntrack_irc.c
@@ -262,6 +262,8 @@ static int __init nf_conntrack_irc_init(void)
 {
 	int i, ret;
 
+	nf_conntrack_helper_deprecated(HELPER_NAME);
+
 	if (max_dcc_channels < 1) {
 		pr_err("max_dcc_channels must not be zero\n");
 		return -EINVAL;
diff --git a/net/netfilter/nf_conntrack_pptp.c b/net/netfilter/nf_conntrack_pptp.c
index 776505a78e64..80fc14c87ddc 100644
--- a/net/netfilter/nf_conntrack_pptp.c
+++ b/net/netfilter/nf_conntrack_pptp.c
@@ -545,6 +545,8 @@ static int __init nf_conntrack_pptp_init(void)
 
 	pptp.destroy = gre_pptp_destroy_siblings;
 
+	nf_conntrack_helper_deprecated(pptp.name);
+
 	return nf_conntrack_helper_register(&pptp, &pptp_ptr);
 }
 
-- 
2.47.3


^ permalink raw reply related

* [PATCH net 12/14] netfilter: nf_conntrack_expect: run expectation eviction with no helper
From: Pablo Neira Ayuso @ 2026-06-23 22:15 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms
In-Reply-To: <20260623221548.701545-1-pablo@netfilter.org>

Run expectation eviction if no helper is specified to deal with the
nft_ct expectation support.

Cap the maximum expectation limit per master conntrack to
NF_CT_EXPECT_MAX_CNT (255).
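
A simplified userspace model of the no-helper path, assuming eviction
always finds an entry to drop (in the kernel it may not, and the insertion
can then fail). struct model_master and model_expect_insert() are
hypothetical names used only for this sketch:

```c
#include <assert.h>

#define NF_CT_EXPECT_MAX_CNT 255  /* value stated in the commit message */

/* Hypothetical per-master bookkeeping for a single expectation class. */
struct model_master {
	unsigned int expecting;  /* expectations currently attached */
	unsigned int evicted;    /* how many were evicted so far */
};

/* Insert one expectation with no helper attached: once the default cap
 * is reached, the oldest entry is evicted before the new one is added.
 */
static void model_expect_insert(struct model_master *m)
{
	assert(m->expecting <= NF_CT_EXPECT_MAX_CNT);

	if (m->expecting >= NF_CT_EXPECT_MAX_CNT) {
		m->expecting--;
		m->evicted++;
	}
	m->expecting++;
}
```

With this model, the number of expectations per master conntrack never
grows beyond NF_CT_EXPECT_MAX_CNT, no matter how many are inserted.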

Fixes: 857b46027d6f ("netfilter: nft_ct: add ct expectations support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_conntrack_expect.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/netfilter/nf_conntrack_expect.c b/net/netfilter/nf_conntrack_expect.c
index 9454913e1b33..113bb1cb1683 100644
--- a/net/netfilter/nf_conntrack_expect.c
+++ b/net/netfilter/nf_conntrack_expect.c
@@ -499,6 +499,13 @@ static inline int __nf_ct_expect_check(struct nf_conntrack_expect *expect,
 		if (p->max_expected &&
 		    master_help->expecting[expect->class] >= p->max_expected)
 			evict_oldest_expect(master_help, expect, p);
+	} else {
+		const struct nf_conntrack_expect_policy default_exp_policy = {
+			.max_expected = NF_CT_EXPECT_MAX_CNT,
+		};
+
+		if (master_help->expecting[expect->class] >= default_exp_policy.max_expected)
+			evict_oldest_expect(master_help, expect, &default_exp_policy);
 	}
 
 	cnet = nf_ct_pernet(net);
-- 
2.47.3


^ permalink raw reply related

* [PATCH net 11/14] netfilter: nf_conntrack_expect: store master_tuple in expectation
From: Pablo Neira Ayuso @ 2026-06-23 22:15 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms
In-Reply-To: <20260623221548.701545-1-pablo@netfilter.org>

Store the master conntrack tuple in the expectation, since exp->master
might refer to a different conntrack when accessed from an RCU read-side
critical section, due to typesafe RCU rules.

Fixes: 02a3231b6d82 ("netfilter: nf_conntrack_expect: store netns and zone in expectation")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_conntrack_expect.h |  1 +
 net/netfilter/nf_conntrack_broadcast.c      |  1 +
 net/netfilter/nf_conntrack_expect.c         |  2 ++
 net/netfilter/nf_conntrack_netlink.c        | 10 ++++------
 4 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack_expect.h b/include/net/netfilter/nf_conntrack_expect.h
index be4a120d549e..c024345c9bd8 100644
--- a/include/net/netfilter/nf_conntrack_expect.h
+++ b/include/net/netfilter/nf_conntrack_expect.h
@@ -26,6 +26,7 @@ struct nf_conntrack_expect {
 	possible_net_t net;
 
 	/* We expect this tuple, with the following mask */
+	struct nf_conntrack_tuple master_tuple;
 	struct nf_conntrack_tuple tuple;
 	struct nf_conntrack_tuple_mask mask;
 
diff --git a/net/netfilter/nf_conntrack_broadcast.c b/net/netfilter/nf_conntrack_broadcast.c
index 400119b6320e..bf78828c7549 100644
--- a/net/netfilter/nf_conntrack_broadcast.c
+++ b/net/netfilter/nf_conntrack_broadcast.c
@@ -62,6 +62,7 @@ int nf_conntrack_broadcast_help(struct sk_buff *skb,
 	if (exp == NULL)
 		goto out;
 
+	exp->master_tuple	  = ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple;
 	exp->tuple                = ct->tuplehash[IP_CT_DIR_REPLY].tuple;
 
 	helper = rcu_dereference(help->helper);
diff --git a/net/netfilter/nf_conntrack_expect.c b/net/netfilter/nf_conntrack_expect.c
index 49e18eda037e..9454913e1b33 100644
--- a/net/netfilter/nf_conntrack_expect.c
+++ b/net/netfilter/nf_conntrack_expect.c
@@ -355,6 +355,8 @@ void nf_ct_expect_init(struct nf_conntrack_expect *exp, unsigned int class,
 	exp->tuple.src.l3num = family;
 	exp->tuple.dst.protonum = proto;
 
+	exp->master_tuple = ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple;
+
 	if (saddr) {
 		memcpy(&exp->tuple.src.u3, saddr, len);
 		if (sizeof(exp->tuple.src.u3) > len)
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index cb38ef42e9e6..4217715d42dc 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -3002,7 +3002,6 @@ ctnetlink_exp_dump_expect(struct sk_buff *skb,
 			  const struct nf_conntrack_expect *exp)
 {
 	__s32 timeout = (__s32)(READ_ONCE(exp->timeout) - nfct_time_stamp) / HZ;
-	struct nf_conn *master = exp->master;
 	struct nf_conntrack_helper *helper;
 #if IS_ENABLED(CONFIG_NF_NAT)
 	struct nlattr *nest_parms;
@@ -3017,9 +3016,7 @@ ctnetlink_exp_dump_expect(struct sk_buff *skb,
 		goto nla_put_failure;
 	if (ctnetlink_exp_dump_mask(skb, &exp->tuple, &exp->mask) < 0)
 		goto nla_put_failure;
-	if (ctnetlink_exp_dump_tuple(skb,
-				 &master->tuplehash[IP_CT_DIR_ORIGINAL].tuple,
-				 CTA_EXPECT_MASTER) < 0)
+	if (ctnetlink_exp_dump_tuple(skb, &exp->master_tuple, CTA_EXPECT_MASTER) < 0)
 		goto nla_put_failure;
 
 #if IS_ENABLED(CONFIG_NF_NAT)
@@ -3032,9 +3029,9 @@ ctnetlink_exp_dump_expect(struct sk_buff *skb,
 		if (nla_put_be32(skb, CTA_EXPECT_NAT_DIR, htonl(exp->dir)))
 			goto nla_put_failure;
 
-		nat_tuple.src.l3num = nf_ct_l3num(master);
+		nat_tuple.src.l3num = exp->master_tuple.src.l3num;
 		nat_tuple.src.u3 = exp->saved_addr;
-		nat_tuple.dst.protonum = nf_ct_protonum(master);
+		nat_tuple.dst.protonum = exp->master_tuple.dst.protonum;
 		nat_tuple.src.u = exp->saved_proto;
 
 		if (ctnetlink_exp_dump_tuple(skb, &nat_tuple,
@@ -3576,6 +3573,7 @@ ctnetlink_alloc_expect(const struct nlattr * const cda[], struct nf_conn *ct,
 #endif
 	rcu_assign_pointer(exp->helper, helper);
 	rcu_assign_pointer(exp->assign_helper, assign_helper);
+	exp->master_tuple = ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple;
 	exp->tuple = *tuple;
 	exp->mask.src.u3 = mask->src.u3;
 	exp->mask.src.u.all = mask->src.u.all;
-- 
2.47.3


^ permalink raw reply related

* [PATCH net 13/14] netfilter: nft_ct: expectation timeouts are passed in milliseconds
From: Pablo Neira Ayuso @ 2026-06-23 22:15 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms
In-Reply-To: <20260623221548.701545-1-pablo@netfilter.org>

From: Florian Westphal <fw@strlen.de>

Userspace passes '5000' in case user asks for 5 seconds.

Allowing for sub-second expectation lifetimes makes sense to me, so
fix up the kernel side instead of munging nft to send a value rounded
up to the next second.

Also note that this violates nft convention of passing integers in
network byte order, but we can't change this anymore.
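
A rough userspace sketch of the unit mismatch, under stated assumptions:
MODEL_HZ is an assumed tick rate, and model_msecs_to_jiffies() is a
simplified round-up model, not the kernel's msecs_to_jiffies(). All
model_* names are hypothetical:

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdint.h>

/* Assumed tick rate for this sketch; the real HZ is a kernel config option. */
#define MODEL_HZ 250ULL

/* Simplified model of msecs_to_jiffies(): round up to the next tick. */
static uint64_t model_msecs_to_jiffies(uint32_t msecs)
{
	return ((uint64_t)msecs * MODEL_HZ + 999) / 1000;
}

/* Models the shape of nft_ct_expect_timeout_get(): convert once at
 * object init and reject values that do not fit the u32 field.
 * With MODEL_HZ <= 1000 the range check cannot trigger in this model;
 * it is kept only to mirror the patch.
 */
static int model_timeout_get(uint32_t msecs, uint32_t *val)
{
	uint64_t j = model_msecs_to_jiffies(msecs);

	assert(val);

	if (j > UINT_MAX)
		return -ERANGE;

	*val = (uint32_t)j;
	return 0;
}

/* Old behaviour: the raw attribute was multiplied by HZ at eval time,
 * i.e. it was treated as a number of seconds.
 */
static uint64_t model_old_timeout(uint32_t raw)
{
	return (uint64_t)raw * MODEL_HZ;
}
```

Under these assumptions, a userspace value of 5000 becomes 1250 jiffies
(5 seconds) with the new conversion, whereas the old multiplication
yields 1250000 jiffies (5000 seconds).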

Fixes: 857b46027d6f ("netfilter: nft_ct: add ct expectations support")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nft_ct.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/nft_ct.c b/net/netfilter/nft_ct.c
index 958054dd2e2e..03a88c77e0f0 100644
--- a/net/netfilter/nft_ct.c
+++ b/net/netfilter/nft_ct.c
@@ -1215,11 +1215,23 @@ struct nft_ct_expect_obj {
 	u32		timeout;
 };
 
+static int nft_ct_expect_timeout_get(const struct nlattr *attr, u32 *val)
+{
+	unsigned long jiffies_val = msecs_to_jiffies(nla_get_u32(attr));
+
+	if (jiffies_val > UINT_MAX)
+		return -ERANGE;
+
+	*val = jiffies_val;
+	return 0;
+}
+
 static int nft_ct_expect_obj_init(const struct nft_ctx *ctx,
 				  const struct nlattr * const tb[],
 				  struct nft_object *obj)
 {
 	struct nft_ct_expect_obj *priv = nft_obj_data(obj);
+	int err;
 
 	if (!tb[NFTA_CT_EXPECT_L4PROTO] ||
 	    !tb[NFTA_CT_EXPECT_DPORT] ||
@@ -1254,8 +1266,11 @@ static int nft_ct_expect_obj_init(const struct nft_ctx *ctx,
 		return -EOPNOTSUPP;
 	}
 
+	err = nft_ct_expect_timeout_get(tb[NFTA_CT_EXPECT_TIMEOUT], &priv->timeout);
+	if (err)
+		return err;
+
 	priv->dport = nla_get_be16(tb[NFTA_CT_EXPECT_DPORT]);
-	priv->timeout = nla_get_u32(tb[NFTA_CT_EXPECT_TIMEOUT]);
 	priv->size = nla_get_u8(tb[NFTA_CT_EXPECT_SIZE]);
 
 	return nf_ct_netns_get(ctx->net, ctx->family);
@@ -1275,7 +1290,7 @@ static int nft_ct_expect_obj_dump(struct sk_buff *skb,
 	if (nla_put_be16(skb, NFTA_CT_EXPECT_L3PROTO, htons(priv->l3num)) ||
 	    nla_put_u8(skb, NFTA_CT_EXPECT_L4PROTO, priv->l4proto) ||
 	    nla_put_be16(skb, NFTA_CT_EXPECT_DPORT, priv->dport) ||
-	    nla_put_u32(skb, NFTA_CT_EXPECT_TIMEOUT, priv->timeout) ||
+	    nla_put_u32(skb, NFTA_CT_EXPECT_TIMEOUT, jiffies_to_msecs(priv->timeout)) ||
 	    nla_put_u8(skb, NFTA_CT_EXPECT_SIZE, priv->size))
 		return -1;
 
@@ -1325,7 +1340,7 @@ static void nft_ct_expect_obj_eval(struct nft_object *obj,
 		          &ct->tuplehash[!dir].tuple.src.u3,
 		          &ct->tuplehash[!dir].tuple.dst.u3,
 		          priv->l4proto, NULL, &priv->dport);
-	exp->timeout += priv->timeout * HZ;
+	exp->timeout += priv->timeout;
 
 	if (nf_ct_expect_related(exp, 0) != 0)
 		regs->verdict.code = NF_DROP;
-- 
2.47.3


^ permalink raw reply related

* [PATCH net 14/14] netfilter: nf_conntrack_helper: cap maximum number of expectation at helper registration
From: Pablo Neira Ayuso @ 2026-06-23 22:15 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw, horms
In-Reply-To: <20260623221548.701545-1-pablo@netfilter.org>

On helper registration, the maximum number of expectations cannot exceed
NF_CT_EXPECT_MAX_CNT (255), but zero can be specified, in which case
nf_conntrack_expect_max applies. Turn zero into NF_CT_EXPECT_MAX_CNT;
otherwise, expectation LRU eviction on insertion is disabled.

Moreover, extend this sanity check to all expectation classes.

The max_expected policy has only been tunable since userspace helpers
became available, so set the Fixes: tag to the commit that added that
infrastructure.

Remove the check for p->max_expected given this field must always
be non-zero after this patch.
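
The registration-time normalization can be sketched in userspace as
follows. struct model_policy, struct model_helper and MODEL_MAX_CLASSES
are hypothetical simplifications; only the loop mirrors the logic
described above:

```c
#include <assert.h>
#include <errno.h>

#define NF_CT_EXPECT_MAX_CNT 255  /* value stated in the commit message */
#define MODEL_MAX_CLASSES    4    /* assumed array size for this sketch */

struct model_policy {
	unsigned int max_expected;
};

struct model_helper {
	struct model_policy expect_policy[MODEL_MAX_CLASSES];
	unsigned int expect_class_max;  /* index of the last valid class */
};

/* Walk every expectation class: zero becomes the cap, and anything
 * above the cap is rejected.
 */
static int model_helper_register_check(struct model_helper *me)
{
	unsigned int i;

	assert(me->expect_class_max < MODEL_MAX_CLASSES);

	for (i = 0; i <= me->expect_class_max; i++) {
		if (!me->expect_policy[i].max_expected)
			me->expect_policy[i].max_expected = NF_CT_EXPECT_MAX_CNT;

		if (me->expect_policy[i].max_expected > NF_CT_EXPECT_MAX_CNT)
			return -EINVAL;
	}

	return 0;
}
```

After a successful check every class has a non-zero max_expected, which
is what lets the eviction path compare against it unconditionally.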

Fixes: 12f7a505331e ("netfilter: add user-space connection tracking helper infrastructure")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_conntrack_expect.c | 3 +--
 net/netfilter/nf_conntrack_helper.c | 9 +++++++--
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/nf_conntrack_expect.c b/net/netfilter/nf_conntrack_expect.c
index 113bb1cb1683..38630c5e006f 100644
--- a/net/netfilter/nf_conntrack_expect.c
+++ b/net/netfilter/nf_conntrack_expect.c
@@ -496,8 +496,7 @@ static inline int __nf_ct_expect_check(struct nf_conntrack_expect *expect,
 					   lockdep_is_held(&nf_conntrack_expect_lock));
 	if (helper) {
 		p = &helper->expect_policy[expect->class];
-		if (p->max_expected &&
-		    master_help->expecting[expect->class] >= p->max_expected)
+		if (master_help->expecting[expect->class] >= p->max_expected)
 			evict_oldest_expect(master_help, expect, p);
 	} else {
 		const struct nf_conntrack_expect_policy default_exp_policy = {
diff --git a/net/netfilter/nf_conntrack_helper.c b/net/netfilter/nf_conntrack_helper.c
index 8b94001c2430..500509b17663 100644
--- a/net/netfilter/nf_conntrack_helper.c
+++ b/net/netfilter/nf_conntrack_helper.c
@@ -374,8 +374,13 @@ int __nf_conntrack_helper_register(struct nf_conntrack_helper *me)
 	if (!nf_ct_helper_hash)
 		return -ENOENT;
 
-	if (me->expect_policy->max_expected > NF_CT_EXPECT_MAX_CNT)
-		return -EINVAL;
+	for (i = 0; i <= me->expect_class_max; i++) {
+		if (!me->expect_policy[i].max_expected)
+			me->expect_policy[i].max_expected = NF_CT_EXPECT_MAX_CNT;
+
+		if (me->expect_policy[i].max_expected > NF_CT_EXPECT_MAX_CNT)
+			return -EINVAL;
+	}
 
 	mutex_lock(&nf_ct_helper_mutex);
 	for (i = 0; i < nf_ct_helper_hsize; i++) {
-- 
2.47.3


^ permalink raw reply related

* [PATCH net] nfc: nci: fix uninit-value in nci_core_init_rsp_packet()
From: Samuel Page @ 2026-06-23 22:24 UTC (permalink / raw)
  To: David Heidelberg
  Cc: David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, oe-linux-nfc, netdev, linux-kernel, stable

The CORE_INIT_RSP handlers walk the response using length fields taken
from the packet itself, without checking they stay within skb->len:

 - v1 computes
	rsp_2 = skb->data + 6 + rsp_1->num_supported_rf_interfaces;
   from the on-wire (unclamped) interface count and then dereferences
   rsp_2, and memcpy()s the advertised interfaces - both can run past the
   received data;
 - v2 walks supported_rf_interfaces[], advancing the cursor by an
   in-packet rf_extension_cnt with no bound.

A short CORE_INIT_RSP therefore makes the parser read past the packet
(into the uninitialised tail of the RX skb); the values are stored into
struct nci_dev and consumed while bringing the device up:

  BUG: KMSAN: uninit-value in nci_dev_up+0x10f3/0x1720
   nci_dev_up+0x10f3/0x1720
   nfc_dev_up+0x187/0x380
   nfc_genl_dev_up+0xdc/0x1a0
   genl_rcv_msg+0x5d4/0x9e0
   netlink_rcv_skb+0x28f/0x530
  Uninit was stored to memory at:
   nci_rsp_packet+0x68f/0x2310
   nci_rx_work+0x25f/0x5d0
  Uninit was created at:
   __alloc_skb+0x540/0xd40
   virtual_ncidev_write+0x65/0x210

Bound both parsers to skb->len before dereferencing the variable-length
parts, rejecting truncated responses with NCI_STATUS_SYNTAX_ERROR.

Fixes: 6a2968aaf50c ("NFC: basic NCI protocol implementation")
Fixes: bcd684aace34 ("net/nfc/nci: Support NCI 2.x initial sequence")
Cc: stable@vger.kernel.org
Assisted-by: Bynario AI
Signed-off-by: Samuel Page <sam@bynar.io>
---
 net/nfc/nci/rsp.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/net/nfc/nci/rsp.c b/net/nfc/nci/rsp.c
index 9eeb862825c5..cdcd23c8ca95 100644
--- a/net/nfc/nci/rsp.c
+++ b/net/nfc/nci/rsp.c
@@ -50,6 +50,9 @@ static u8 nci_core_init_rsp_packet_v1(struct nci_dev *ndev,
 	const struct nci_core_init_rsp_1 *rsp_1 = (void *)skb->data;
 	const struct nci_core_init_rsp_2 *rsp_2;
 
+	if (skb->len < sizeof(*rsp_1))
+		return NCI_STATUS_SYNTAX_ERROR;
+
 	pr_debug("status 0x%x\n", rsp_1->status);
 
 	if (rsp_1->status != NCI_STATUS_OK)
@@ -58,6 +61,15 @@ static u8 nci_core_init_rsp_packet_v1(struct nci_dev *ndev,
 	ndev->nfcc_features = __le32_to_cpu(rsp_1->nfcc_features);
 	ndev->num_supported_rf_interfaces = rsp_1->num_supported_rf_interfaces;
 
+	/*
+	 * supported_rf_interfaces[] and the trailing nci_core_init_rsp_2 are
+	 * addressed using the on-wire (unclamped) interface count, so the
+	 * response must be long enough for both before they are dereferenced.
+	 */
+	if (skb->len < sizeof(*rsp_1) +
+	    rsp_1->num_supported_rf_interfaces + sizeof(*rsp_2))
+		return NCI_STATUS_SYNTAX_ERROR;
+
 	ndev->num_supported_rf_interfaces =
 		min((int)ndev->num_supported_rf_interfaces,
 		    NCI_MAX_SUPPORTED_RF_INTERFACES);
@@ -88,9 +100,13 @@ static u8 nci_core_init_rsp_packet_v2(struct nci_dev *ndev,
 {
 	const struct nci_core_init_rsp_nci_ver2 *rsp = (void *)skb->data;
 	const u8 *supported_rf_interface = rsp->supported_rf_interfaces;
+	const u8 *end = skb->data + skb->len;
 	u8 rf_interface_idx = 0;
 	u8 rf_extension_cnt = 0;
 
+	if (skb->len < sizeof(*rsp))
+		return NCI_STATUS_SYNTAX_ERROR;
+
 	pr_debug("status %x\n", rsp->status);
 
 	if (rsp->status != NCI_STATUS_OK)
@@ -104,10 +120,16 @@ static u8 nci_core_init_rsp_packet_v2(struct nci_dev *ndev,
 		    NCI_MAX_SUPPORTED_RF_INTERFACES);
 
 	while (rf_interface_idx < ndev->num_supported_rf_interfaces) {
-		ndev->supported_rf_interfaces[rf_interface_idx++] = *supported_rf_interface++;
+		/* one interface byte + one extension-count byte must be present */
+		if (end - supported_rf_interface < 2)
+			return NCI_STATUS_SYNTAX_ERROR;
+		ndev->supported_rf_interfaces[rf_interface_idx++] =
+			*supported_rf_interface++;
 
-		/* skip rf extension parameters */
+		/* skip rf extension parameters, bounded by the packet */
 		rf_extension_cnt = *supported_rf_interface++;
+		if (rf_extension_cnt > end - supported_rf_interface)
+			return NCI_STATUS_SYNTAX_ERROR;
 		supported_rf_interface += rf_extension_cnt;
 	}
 

base-commit: a986fde914d88af47eb78fd29c5d1af7952c3500
-- 
2.54.0
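
The cursor-versus-end checks that the v2 hunk adds can be sketched outside the kernel. This is a minimal user-space model under stated assumptions, not the driver code: parse_ifaces(), MAX_IFACES and the PARSE_* codes are hypothetical names invented for illustration, and the record layout (one interface byte, one extension-count byte, then that many extension bytes) is taken from the comments in the diff above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit and status codes, for illustration only. */
#define MAX_IFACES		4
#define PARSE_OK		0
#define PARSE_SYNTAX_ERROR	1

/*
 * Walk records of the form [interface][ext_cnt][ext_cnt bytes ...]
 * without ever reading at or beyond buf + len.
 */
static int parse_ifaces(const uint8_t *buf, size_t len, uint8_t count,
			uint8_t out[MAX_IFACES])
{
	const uint8_t *p = buf;
	const uint8_t *end = buf + len;
	uint8_t idx = 0;

	/* clamp the advertised count to the destination array size */
	if (count > MAX_IFACES)
		count = MAX_IFACES;

	while (idx < count) {
		uint8_t ext_cnt;

		/* interface byte and extension-count byte must both be present */
		if (end - p < 2)
			return PARSE_SYNTAX_ERROR;
		out[idx++] = *p++;

		/* the extension bytes must fit in what remains of the buffer */
		ext_cnt = *p++;
		if (ext_cnt > end - p)
			return PARSE_SYNTAX_ERROR;
		p += ext_cnt;
	}

	return PARSE_OK;
}
```

Comparing against the remaining length (end - p) rather than adding the count to the cursor first keeps the check itself from forming an out-of-range pointer.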


^ permalink raw reply related

* Re: [PATCH net] nfc: nci: fix out-of-bounds write in nci_target_auto_activated()
From: Sam P @ 2026-06-23 22:33 UTC (permalink / raw)
  To: Simon Horman
  Cc: david, davem, edumazet, kuba, pabeni, oe-linux-nfc, netdev,
	linux-kernel, stable
In-Reply-To: <20260623172109.1105965-2-horms@kernel.org>

On 23/06/2026 19:21, Simon Horman wrote:
>> diff --git a/net/nfc/nci/ntf.c b/net/nfc/nci/ntf.c
>> index c96512bb86531..566ca839fa488 100644
>> --- a/net/nfc/nci/ntf.c
>> +++ b/net/nfc/nci/ntf.c
>> @@ -603,6 +603,12 @@ static void nci_target_auto_activated(struct nci_dev *ndev,
>>       struct nfc_target *target;
>>       int rc;
>>
>> +    /* This is a new target, check if we've enough room */
>> +    if (ndev->n_targets == NCI_MAX_DISCOVERED_TARGETS) {
>> +        pr_debug("not enough room, ignoring new target...\n");
>> +        return;
>
> [Severity: High]
> Does this early return cause the state machine to stall?
>
> Looking at nci_rf_intf_activated_ntf_packet(), the state is transitioned
> to NCI_POLL_ACTIVE right before this function is called:
>
> net/nfc/nci/ntf.c:nci_rf_intf_activated_ntf_packet() {
>      ...
>      atomic_set(&ndev->state, NCI_POLL_ACTIVE);
>      if (err == NCI_STATUS_OK)
>          nci_target_auto_activated(ndev, &ntf);
>      ...
> }
>
> Because of the early return here, nfc_targets_found() is skipped. Does this
> leave the state wedged in NCI_POLL_ACTIVE, permanently blocking future
> attempts to stop polling since nci_stop_poll() requires the state to be
> NCI_DISCOVERY?
>
>> +    }
>> +
>>       target = &ndev->targets[ndev->n_targets];
>
> [ ... ]

Thanks for sharing the review.

I don't believe the early return can stall the state machine, as
nci_target_auto_activated() does not touch ndev->state.

nci_rf_intf_activated_ntf_packet() sets ndev->state = NCI_POLL_ACTIVE
before it calls nci_target_auto_activated(), so the transition has
already happened by the time the new check runs. Neither
nci_target_auto_activated() nor nfc_targets_found() modifies ndev->state;
nfc_targets_found() only clears the nfc_dev dev->polling flag under
device_lock, which is independent of the NCI state machine that
nci_stop_poll() tests. So ndev->state is NCI_POLL_ACTIVE after an
auto-activation whether the target is appended or the guard returns
early; the state behaviour is the same with or without this patch.

The check is the same n_targets == NCI_MAX_DISCOVERED_TARGETS guard the
sibling nci_add_new_target() (the RF_DISCOVER_NTF path) has used since
the Fixes: commit; it only drops the out-of-spec 11th+ target, which a
conformant controller shouldn't produce. The device still returns to
NCI_DISCOVERY/NCI_IDLE through the normal RF_DEACTIVATE path.

> [Severity: Critical]
> This isn't a bug introduced by this patch, but can the ntf union copy
> uninitialized stack memory into persistent device structures?
>
> In net/nfc/nci/ntf.c:nci_rf_discover_ntf_packet(), the local ntf struct
> is declared on the stack:
>
>      struct nci_rf_discover_ntf ntf;
>
> If the incoming packet has ntf.rf_tech_specific_params_len == 0, the
> extraction routines are bypassed, leaving the rf_tech_specific_params union
> uninitialized. The struct is then passed to nci_add_new_target():
>
>      if (add_target == true)
>          nci_add_new_target(ndev, &ntf);
>
> This eventually calls nci_add_new_protocol(), which blindly reads from this
> union:
>
>      nfca_poll = (struct rf_tech_specific_params_nfca_poll *)params;
>      target->sens_res = nfca_poll->sens_res;
>      target->sel_res = nfca_poll->sel_res;
>      target->nfcid1_len = nfca_poll->nfcid1_len;
>
> Could this uninitialized stack garbage be copied into ndev->targets, and
> later broadcasted to userspace via Netlink in nfc_targets_found()?

I managed to reproduce this issue locally with KMSAN. I can queue up a
separate patch for it; ntf just needs to be zero-initialized.
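
As a rough illustration of why zero-initialization is enough here, the following user-space sketch models a notification whose parameter union is only filled when the on-wire length is non-zero. struct demo_ntf and demo_parse() are hypothetical stand-ins, not the NCI structures, and the eventual kernel patch may look different.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a notification with an optional parameter block. */
struct demo_ntf {
	uint8_t params_len;
	union {
		struct {
			uint16_t sens_res;
			uint8_t sel_res;
			uint8_t nfcid1_len;
		} nfca;
		uint8_t raw[16];
	} params;
};

static void demo_parse(struct demo_ntf *ntf, const uint8_t *data, size_t len)
{
	/* Define every byte up front so a skipped extraction leaves zeros. */
	memset(ntf, 0, sizeof(*ntf));

	if (len < 1)
		return;
	ntf->params_len = data[0];

	/* The union is only filled when parameters are actually present. */
	if (ntf->params_len >= 4 && len >= 5) {
		ntf->params.nfca.sens_res = (uint16_t)(data[1] | (data[2] << 8));
		ntf->params.nfca.sel_res = data[3];
		ntf->params.nfca.nfcid1_len = data[4];
	}
}
```

With the memset() in place, a consumer that reads the union unconditionally sees zeros rather than whatever was previously on the stack.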

> [Severity: Critical]
> This isn't a bug introduced by this patch, but does the general bytes length
> calculation underflow?
>
> In net/nfc/nci/ntf.c:nci_store_general_bytes_nfc_dep():
>
>      ndev->remote_gb_len = min_t(__u8,
>          (ntf->activation_params.listen_nfc_dep.atr_req_len -
>           NFC_ATR_REQ_GT_OFFSET), NFC_ATR_REQ_GB_MAXSIZE);
>
>      memcpy(ndev->remote_gb,
>          (ntf->activation_params.listen_nfc_dep.atr_req +
>           NFC_ATR_REQ_GT_OFFSET), ndev->remote_gb_len);
>
> The length is computed by subtracting NFC_ATR_REQ_GT_OFFSET (14) from
> atr_req_len. If atr_req_len is less than 14, this subtraction will
> underflow.
>
> Because of the min_t cast, this large underflowed value is bounded to
> NFC_ATR_REQ_GB_MAXSIZE (48). The memcpy() then copies 48 bytes starting
> from atr_req + 14 into ndev->remote_gb.
>
> Since the source buffer is part of the ntf union allocated on the stack
> in nci_rf_intf_activated_ntf_packet(), could this leak up to 48 bytes of
> uninitialized kernel stack memory to userspace?

This one is already being fixed on the list:

[PATCH net v4 1/5] nfc: nci: fix u8 underflow in
nci_store_general_bytes_nfc_dep  (Lekë Hapçiu)
https://lore.kernel.org/all/20260424180151.3808557-2-snowwlake@icloud.com/
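
For readers following along, the underflow described in the quoted review can be reproduced with a small user-space model. The constants 14 and 48 are the values quoted above for NFC_ATR_REQ_GT_OFFSET and NFC_ATR_REQ_GB_MAXSIZE; the function names are hypothetical, and the guard shown is only one possible shape of a fix, not necessarily what the linked series does.

```c
#include <stdint.h>

#define GT_OFFSET	14	/* value quoted for NFC_ATR_REQ_GT_OFFSET */
#define GB_MAXSIZE	48	/* value quoted for NFC_ATR_REQ_GB_MAXSIZE */

/* Models min_t(__u8, atr_req_len - GT_OFFSET, GB_MAXSIZE). */
static uint8_t gb_len_unchecked(uint8_t atr_req_len)
{
	/* The subtraction happens in int; the cast back to u8 wraps for < 14. */
	uint8_t diff = (uint8_t)(atr_req_len - GT_OFFSET);

	return diff < GB_MAXSIZE ? diff : GB_MAXSIZE;
}

/* Same computation, but a too-short ATR_REQ yields no general bytes. */
static uint8_t gb_len_checked(uint8_t atr_req_len)
{
	if (atr_req_len < GT_OFFSET)
		return 0;

	return gb_len_unchecked(atr_req_len);
}
```

For an atr_req_len of 10, the unchecked variant wraps to 252 and is then clamped to 48, which is how a short request ends up copying the maximum number of bytes.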

> [Severity: High]
> This isn't a bug introduced by this patch, but does the lack of skb->len
> validation in NCI response handlers allow out-of-bounds reads?
>
> In nci_core_init_rsp_packet_v1() in net/nfc/nci/rsp.c, fields are read
> and arrays are copied from the payload:
>
>      ndev->num_supported_rf_interfaces =
>          rsp_1->num_supported_rf_interfaces;
>
>      memcpy(ndev->supported_rf_interfaces, rsp_1->supported_rf_interfaces,
>             ndev->num_supported_rf_interfaces);
>
> Similarly in nci_core_init_rsp_packet_v2():
>
>      while (rf_interface_idx < ndev->num_supported_rf_interfaces) {
>          ndev->supported_rf_interfaces[rf_interface_idx++] =
>              *supported_rf_interface++;
>      }
>
> Since the NCI control header is stripped via skb_pull(), leaving the
> payload length in skb->len, what happens if an NFCC sends a short response
> packet (e.g. plen = 1)? Could this cause the handlers to read adjacent
> kernel slab memory?

I've since submitted a fix for this issue:

[PATCH net] nfc: nci: fix uninit-value in
nci_core_init_rsp_packet()
https://lore.kernel.org/netdev/20260623222402.175798-1-sam@bynar.io/

> [Severity: High]
> This isn't a bug introduced by this patch, but is there a missing boundary
> validation when extracting activation parameters?
>
> In net/nfc/nci/ntf.c:nci_extract_activation_params_iso_dep():
>
>      nfca_poll->rats_res_len = min_t(__u8, *data++, NFC_ATS_MAXSIZE);
>      if (nfca_poll->rats_res_len > 0) {
>          memcpy(nfca_poll->rats_res, data, nfca_poll->rats_res_len);
>      }
>
> And in nci_extract_activation_params_nfc_dep():
>
>      poll->atr_res_len = min_t(__u8, *data++, NFC_ATR_RES_MAXSIZE - 2);
>      if (poll->atr_res_len > 0)
>          memcpy(poll->atr_res, data, poll->atr_res_len);
>
> These inner lengths are bounded by a maximum size like NFC_ATS_MAXSIZE, but
> they are not validated against the remaining bytes in the SKB (skb->len).
>
> If a malformed packet provides a large inner length but a short payload,
> could this memcpy() read out of bounds into the device structure,
> potentially leaking adjacent slab memory to userspace later?

This one is already being fixed on the list too:

[PATCH v3] nfc: nci: add data_len bound checks to activation
parameter extractors  (Bryam Vargas)
https://lore.kernel.org/all/20260612-b4-disp-6d52d8b0-v3-1-e26221f8826d@proton.me/

> [Severity: High]
> This isn't a bug introduced by this patch, but is there a potential
> priority inversion or timeout deadlock involving the device lock?
>
> Functions like nfc_stop_poll() in net/nfc/core.c acquire the device lock:
>
>      device_lock(&dev->dev);
>      ...
>      if (dev->ops->stop_poll)
>          dev->ops->stop_poll(dev);
>
> This eventually calls nci_request(), which synchronously waits for a
> completion signaled by the nci_rx_work thread. However, if an NTF packet
> is received first, nci_rx_work processes it and invokes
> nfc_targets_found(), which also attempts to acquire the device lock:
>
>      device_lock(&dev->dev);
>
> Since the calling thread already holds the device lock, nci_rx_work blocks
> indefinitely. Because the RX worker is blocked, it cannot process the
> pending RSP, causing nci_request() to time out and fail. Could this
> deadlock the RX thread?

There is no patch for this one yet, and I'm not sure how accurate the report is.

Thanks,
Sam

^ permalink raw reply

* Re: [PATCH RFC 5/8] clk: sunxi-ng: a733: Add bus clocks support
From: Andre Przywara @ 2026-06-23 22:35 UTC (permalink / raw)
  To: Junhui Liu, Michael Turquette, Stephen Boyd, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Chen-Yu Tsai, Jernej Skrabec,
	Samuel Holland, Philipp Zabel, Paul Walmsley, Palmer Dabbelt,
	Albert Ou, Alexandre Ghiti, Richard Cochran
  Cc: linux-clk, devicetree, linux-arm-kernel, linux-sunxi,
	linux-kernel, linux-riscv, netdev
In-Reply-To: <20260310-a733-clk-v1-5-36b4e9b24457@pigmoral.tech>

Hi,

On 3/10/26 08:33, Junhui Liu wrote:
> Add the essential bus clocks in the Allwinner A733 CCU, including AHB,
> APB0, APB1, APB_UART, NSI, and MBUS. These buses are necessary for many
> other functional modules. Additionally clocks such as trace, gic and
> cpu_peri are also added as they fall within the register address range
> of the bus clocks, even though they are not strictly bus clocks.
> 
> The MBUS clock is marked as critical to ensure the memory bus remains
> operational at all times. For the NSI and MBUS clocks, the hardware
> requires an update bit (bit 27) to be set so that the configuration
> takes effect and the updated parameters can be correctly read back.
> 
> Signed-off-by: Junhui Liu <junhui.liu@pigmoral.tech>
> ---
>   drivers/clk/sunxi-ng/ccu-sun60i-a733.c | 131 +++++++++++++++++++++++++++++++++
>   1 file changed, 131 insertions(+)
> 
> diff --git a/drivers/clk/sunxi-ng/ccu-sun60i-a733.c b/drivers/clk/sunxi-ng/ccu-sun60i-a733.c
> index cf819504c51f..68457813dbbb 100644
> --- a/drivers/clk/sunxi-ng/ccu-sun60i-a733.c
> +++ b/drivers/clk/sunxi-ng/ccu-sun60i-a733.c
> @@ -19,6 +19,7 @@
>   #include "ccu_common.h"
>   
>   #include "ccu_div.h"
> +#include "ccu_mp.h"
>   #include "ccu_mult.h"
>   #include "ccu_nkmp.h"
>   #include "ccu_nm.h"
> @@ -65,6 +66,16 @@ static const struct clk_hw *pll_ref_hws[] = {
>   	&pll_ref_clk.common.hw
>   };
>   
> +/*
> + * There is a non-software-configurable mux selecting between the DCXO and the
> + * PLL_REF in hardware, whose output is fed to the sys-24M clock. Although both
> + * sys-24M and pll-ref are fixed at 24 MHz, define a 1:1 fixed factor clock to
> + * provide logical separation:
> + * - pll-ref is dedicated to feeding other PLLs
> + * - sys-24M serves as reference clock for downstream functional modules
> + */
> +static CLK_FIXED_FACTOR_HWS(sys_24M_clk, "sys-24M", pll_ref_hws, 1, 1, 0);
> +
>   #define SUN60I_A733_PLL_DDR_REG		0x020
>   static struct ccu_nkmp pll_ddr_clk = {
>   	.enable		= BIT(27),
> @@ -371,6 +382,107 @@ static SUNXI_CCU_M_HWS(pll_de_4x_clk, "pll-de-4x", pll_de_hws,
>   static SUNXI_CCU_M_HWS(pll_de_3x_clk, "pll-de-3x", pll_de_hws,
>   		       SUN60I_A733_PLL_DE_REG, 16, 3, 0);
>   
> +/**************************************************************************
> + *                           bus clocks                                   *
> + **************************************************************************/
> +
> +static const struct clk_parent_data ahb_apb_parents[] = {
> +	{ .hw = &sys_24M_clk.hw },
> +	{ .fw_name = "losc" },
> +	{ .fw_name = "iosc" },
> +	{ .hw = &pll_periph0_600M_clk.hw },
> +};
> +
> +static SUNXI_CCU_M_DATA_WITH_MUX(ahb_clk, "ahb", ahb_apb_parents, 0x500,
> +				 0, 5,		/* M */
> +				 24, 2,		/* mux */
> +				 0);
> +
> +static SUNXI_CCU_M_DATA_WITH_MUX(apb0_clk, "apb0", ahb_apb_parents, 0x510,
> +				 0, 5,		/* M */
> +				 24, 2,		/* mux */
> +				 0);
> +
> +static SUNXI_CCU_M_DATA_WITH_MUX(apb1_clk, "apb1", ahb_apb_parents, 0x518,
> +				 0, 5,		/* M */
> +				 24, 2,		/* mux */
> +				 0);
> +
> +static const struct clk_parent_data apb_uart_parents[] = {
> +	{ .hw = &sys_24M_clk.hw },
> +	{ .fw_name = "losc" },
> +	{ .fw_name = "iosc" },
> +	{ .hw = &pll_periph0_600M_clk.hw },
> +	{ .hw = &pll_periph0_480M_clk.common.hw },
> +};
> +static SUNXI_CCU_M_DATA_WITH_MUX(apb_uart_clk, "apb-uart", apb_uart_parents, 0x538,
> +				 0, 5,		/* M */
> +				 24, 3,		/* mux */
> +				 0);
> +
> +static const struct clk_parent_data trace_parents[] = {
> +	{ .hw = &sys_24M_clk.hw },
> +	{ .fw_name = "losc" },
> +	{ .fw_name = "iosc" },
> +	{ .hw = &pll_periph0_300M_clk.hw },
> +	{ .hw = &pll_periph0_400M_clk.hw },
> +};
> +static SUNXI_CCU_M_DATA_WITH_MUX_GATE(trace_clk, "trace", trace_parents, 0x540,
> +				 0, 5,		/* M */
> +				 24, 3,		/* mux */
> +				 BIT(31),	/* gate */
> +				 0);
> +
> +static const struct clk_parent_data gic_cpu_peri_parents[] = {
> +	{ .hw = &sys_24M_clk.hw },
> +	{ .fw_name = "losc" },
> +	{ .hw = &pll_periph0_600M_clk.hw },
> +	{ .hw = &pll_periph0_480M_clk.common.hw },
> +	{ .hw = &pll_periph0_400M_clk.hw },
> +};
> +static SUNXI_CCU_M_DATA_WITH_MUX_GATE(gic_clk, "gic", gic_cpu_peri_parents, 0x560,

Do we really want to model the GIC clock? The A523 has one as well, and
we don't describe it there. And while the GICv3 binding describes a
clock property, the Linux driver completely ignores that.
So if I see this correctly, this clock would become unused, and would be 
turned off, killing the GIC? So we would at least need a CLK_IS_CRITICAL 
flag?

But it's a good reminder to lift this clock to something PLL-based in
U-Boot's SPL, because I guess 24 MHz is rather slow.

> +				      0, 5,	/* M */
> +				      24, 3,	/* mux */
> +				      BIT(31),	/* gate */
> +				      0);
> +
> +static SUNXI_CCU_M_DATA_WITH_MUX_GATE(cpu_peri_clk, "cpu-peri", gic_cpu_peri_parents, 0x568,

What is this clock about? I don't see it referenced by any peripheral in 
the manual.

> +				      0, 5,	/* M */
> +				      24, 3,	/* mux */
> +				      BIT(31),	/* gate */
> +				      0);
> +
> +static const struct clk_parent_data nsi_parents[] = {
> +	{ .hw = &sys_24M_clk.hw },
> +	{ .hw = &pll_ddr_clk.common.hw },
> +	{ .hw = &pll_periph0_800M_clk.common.hw },
> +	{ .hw = &pll_periph0_600M_clk.hw },
> +	{ .hw = &pll_periph0_480M_clk.common.hw },
> +	{ .hw = &pll_de_3x_clk.common.hw },
> +};
> +static SUNXI_CCU_MP_DATA_WITH_MUX_GATE_FEAT(nsi_clk, "nsi", nsi_parents, 0x580,

Similar question as for the GIC: do we need this in the kernel, and do
we need to prevent it from being turned off?

> +					    0, 5,	/* M */
> +					    0, 0,	/* no P */
> +					    24, 3,	/* mux */
> +					    BIT(31),	/* gate */
> +					    0, CCU_FEATURE_UPDATE_BIT);
> +
> +static const struct clk_parent_data mbus_parents[] = {
> +	{ .hw = &sys_24M_clk.hw },
> +	{ .hw = &pll_periph1_600M_clk.hw },
> +	{ .hw = &pll_ddr_clk.common.hw },
> +	{ .hw = &pll_periph1_480M_clk.common.hw },
> +	{ .hw = &pll_periph1_400M_clk.hw },
> +	{ .hw = &pll_npu_clk.common.hw },
> +};
> +static SUNXI_CCU_MP_DATA_WITH_MUX_GATE_FEAT(mbus_clk, "mbus", mbus_parents, 0x588,
> +					    0, 5,	/* M */
> +					    0, 0,	/* no P */
> +					    24, 3,	/* mux */
> +					    BIT(31),	/* gate */
> +					    CLK_IS_CRITICAL,
> +					    CCU_FEATURE_UPDATE_BIT);
> +
>   /*
>    * Contains all clocks that are controlled by a hardware register. They
>    * have a (sunxi) .common member, which needs to be initialised by the common
> @@ -407,11 +519,21 @@ static struct ccu_common *sun60i_a733_ccu_clks[] = {
>   	&pll_de_clk.common,
>   	&pll_de_4x_clk.common,
>   	&pll_de_3x_clk.common,
> +	&ahb_clk.common,
> +	&apb0_clk.common,
> +	&apb1_clk.common,
> +	&apb_uart_clk.common,
> +	&trace_clk.common,
> +	&gic_clk.common,
> +	&cpu_peri_clk.common,
> +	&nsi_clk.common,
> +	&mbus_clk.common,
>   };
>   
>   static struct clk_hw_onecell_data sun60i_a733_hw_clks = {
>   	.hws	= {
>   		[CLK_PLL_REF]		= &pll_ref_clk.common.hw,
> +		[CLK_SYS_24M]		= &sys_24M_clk.hw,
>   		[CLK_PLL_DDR]		= &pll_ddr_clk.common.hw,
>   		[CLK_PLL_PERIPH0_4X]	= &pll_periph0_4x_clk.common.hw,
>   		[CLK_PLL_PERIPH0_2X]	= &pll_periph0_2x_clk.common.hw,
> @@ -453,6 +575,15 @@ static struct clk_hw_onecell_data sun60i_a733_hw_clks = {
>   		[CLK_PLL_DE]		= &pll_de_clk.common.hw,
>   		[CLK_PLL_DE_4X]		= &pll_de_4x_clk.common.hw,
>   		[CLK_PLL_DE_3X]		= &pll_de_3x_clk.common.hw,
> +		[CLK_AHB]		= &ahb_clk.common.hw,
> +		[CLK_APB0]		= &apb0_clk.common.hw,
> +		[CLK_APB1]		= &apb1_clk.common.hw,
> +		[CLK_APB_UART]		= &apb_uart_clk.common.hw,
> +		[CLK_TRACE]		= &trace_clk.common.hw,
> +		[CLK_GIC]		= &gic_clk.common.hw,
> +		[CLK_CPU_PERI]		= &cpu_peri_clk.common.hw,
> +		[CLK_NSI]		= &nsi_clk.common.hw,
> +		[CLK_MBUS]		= &mbus_clk.common.hw,
>   	},
>   	.num	= CLK_FANOUT3 + 1,
>   };
> 


^ permalink raw reply

* Re: [PATCH bpf v4] selftests/bpf: Cover partial copy of non-linear test_run output
From: Emil Tsalapatis @ 2026-06-23 23:18 UTC (permalink / raw)
  To: Sun Jian, bpf
  Cc: netdev, linux-kselftest, linux-kernel, ast, daniel, andrii,
	martin.lau, paul.chaignon
In-Reply-To: <20260623014027.402820-1-sun.jian.kdev@gmail.com>

On Mon Jun 22, 2026 at 9:40 PM EDT, Sun Jian wrote:
> prog_run_opts already verifies that BPF_PROG_TEST_RUN returns -ENOSPC
> for a short data_out buffer while still reporting the full output size
> through data_size_out.
>
> Add the same coverage for non-linear test_run output. Use pass-through
> TC and XDP programs with a 9000-byte packet, a 64-byte linear data area,
> and a 100-byte data_out buffer. The expected output spans both the linear
> data and the first fragment.
>
> Verify that test_run returns -ENOSPC, reports the full packet length
> through data_size_out, and copies the packet prefix into data_out for
> both non-linear skb and XDP frags paths.
>
> Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>

Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>

> ---
>
> v4:
> - Send only the selftest patch; the fix patch has been applied to bpf/master.
> - Initialize data_out buffers to avoid reading uninitialized stack memory if
>   bpf_prog_test_run_opts() fails unexpectedly.
>
>  .../selftests/bpf/prog_tests/prog_run_opts.c  | 70 +++++++++++++++++++
>  .../selftests/bpf/progs/test_pkt_access.c     | 12 ++++
>  2 files changed, 82 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/prog_run_opts.c b/tools/testing/selftests/bpf/prog_tests/prog_run_opts.c
> index 01f1d1b6715a..beb6fa78fd94 100644
> --- a/tools/testing/selftests/bpf/prog_tests/prog_run_opts.c
> +++ b/tools/testing/selftests/bpf/prog_tests/prog_run_opts.c
> @@ -4,6 +4,10 @@
>  
>  #include "test_pkt_access.skel.h"
>  
> +#define NONLINEAR_PKT_LEN 9000
> +#define NONLINEAR_LINEAR_DATA_LEN 64
> +#define SHORT_OUT_LEN 100
> +
>  static const __u32 duration;
>  
>  static void check_run_cnt(int prog_fd, __u64 run_cnt)
> @@ -20,6 +24,69 @@ static void check_run_cnt(int prog_fd, __u64 run_cnt)
>  	      "incorrect number of repetitions, want %llu have %llu\n", run_cnt, info.run_cnt);
>  }
>  
> +static void init_pkt(__u8 *pkt, size_t len)
> +{
> +	size_t i;
> +
> +	for (i = 0; i < len; i++)
> +		pkt[i] = i & 0xff;
> +}
> +
> +static void test_skb_nonlinear_data_out_partial(struct test_pkt_access *skel)
> +{
> +	LIBBPF_OPTS(bpf_test_run_opts, topts);
> +	__u8 pkt[NONLINEAR_PKT_LEN];
> +	__u8 out[SHORT_OUT_LEN] = {};
> +	struct __sk_buff skb = {};
> +	int prog_fd, err;
> +
> +	init_pkt(pkt, sizeof(pkt));
> +
> +	skb.data_end = NONLINEAR_LINEAR_DATA_LEN;
> +
> +	topts.data_in = pkt;
> +	topts.data_size_in = sizeof(pkt);
> +	topts.data_out = out;
> +	topts.data_size_out = sizeof(out);
> +	topts.ctx_in = &skb;
> +	topts.ctx_size_in = sizeof(skb);
> +
> +	prog_fd = bpf_program__fd(skel->progs.tc_pass_prog);
> +	err = bpf_prog_test_run_opts(prog_fd, &topts);
> +
> +	ASSERT_EQ(err, -ENOSPC, "skb_partial_err");
> +	ASSERT_EQ(topts.data_size_out, sizeof(pkt), "skb_partial_size");
> +	ASSERT_OK(memcmp(out, pkt, sizeof(out)), "skb_partial_data");
> +}
> +
> +static void test_xdp_nonlinear_data_out_partial(struct test_pkt_access *skel)
> +{
> +	LIBBPF_OPTS(bpf_test_run_opts, topts);
> +	__u8 pkt[NONLINEAR_PKT_LEN];
> +	__u8 out[SHORT_OUT_LEN] = {};
> +	struct xdp_md ctx = {};
> +	int prog_fd, err;
> +
> +	init_pkt(pkt, sizeof(pkt));
> +
> +	ctx.data = 0;
> +	ctx.data_end = NONLINEAR_LINEAR_DATA_LEN;
> +
> +	topts.data_in = pkt;
> +	topts.data_size_in = sizeof(pkt);
> +	topts.data_out = out;
> +	topts.data_size_out = sizeof(out);
> +	topts.ctx_in = &ctx;
> +	topts.ctx_size_in = sizeof(ctx);
> +
> +	prog_fd = bpf_program__fd(skel->progs.xdp_frags_pass_prog);
> +	err = bpf_prog_test_run_opts(prog_fd, &topts);
> +
> +	ASSERT_EQ(err, -ENOSPC, "xdp_partial_err");
> +	ASSERT_EQ(topts.data_size_out, sizeof(pkt), "xdp_partial_size");
> +	ASSERT_OK(memcmp(out, pkt, sizeof(out)), "xdp_partial_data");
> +}
> +
>  void test_prog_run_opts(void)
>  {
>  	struct test_pkt_access *skel;
> @@ -69,6 +136,9 @@ void test_prog_run_opts(void)
>  	run_cnt += topts.repeat;
>  	check_run_cnt(prog_fd, run_cnt);
>  
> +	test_skb_nonlinear_data_out_partial(skel);
> +	test_xdp_nonlinear_data_out_partial(skel);
> +
>  cleanup:
>  	if (skel)
>  		test_pkt_access__destroy(skel);
> diff --git a/tools/testing/selftests/bpf/progs/test_pkt_access.c b/tools/testing/selftests/bpf/progs/test_pkt_access.c
> index bce7173152c6..cd284401eebd 100644
> --- a/tools/testing/selftests/bpf/progs/test_pkt_access.c
> +++ b/tools/testing/selftests/bpf/progs/test_pkt_access.c
> @@ -150,3 +150,15 @@ int test_pkt_access(struct __sk_buff *skb)
>  
>  	return TC_ACT_UNSPEC;
>  }
> +
> +SEC("tc")
> +int tc_pass_prog(struct __sk_buff *skb)
> +{
> +	return TC_ACT_OK;
> +}
> +
> +SEC("xdp.frags")
> +int xdp_frags_pass_prog(struct xdp_md *ctx)
> +{
> +	return XDP_PASS;
> +}
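
The contract these tests check (copy what fits, report the full length, return -ENOSPC) can be modelled in user space as below. This is a sketch of the behaviour the commit message describes, not the kernel's test_run code; copy_out_partial() and its parameters are hypothetical names.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy as much of src as fits into dst, always report the full source
 * length through size_out, and signal truncation with -ENOSPC.
 */
static int copy_out_partial(void *dst, size_t dst_size,
			    const void *src, size_t src_len,
			    size_t *size_out)
{
	size_t n = src_len < dst_size ? src_len : dst_size;

	memcpy(dst, src, n);
	*size_out = src_len;

	return src_len > dst_size ? -ENOSPC : 0;
}
```

A caller that gets -ENOSPC can therefore still inspect the copied prefix and use the reported size to retry with a larger buffer.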


^ permalink raw reply

* Re: [PATCH 1/7] xfrm: use compat translator only for u64 alignment mismatch
From: patchwork-bot+netdevbpf @ 2026-06-23 23:30 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: davem, kuba, herbert, netdev
In-Reply-To: <20260622075726.29685-2-steffen.klassert@secunet.com>

Hello:

This series was applied to netdev/net.git (main)
by Steffen Klassert <steffen.klassert@secunet.com>:

On Mon, 22 Jun 2026 09:57:03 +0200 you wrote:
> From: Sanman Pradhan <psanman@juniper.net>
> 
> The XFRM compat layer (CONFIG_XFRM_USER_COMPAT) translates 32-bit xfrm
> netlink and setsockopt messages into the native 64-bit layout. It is
> only needed on architectures where the 32-bit and 64-bit ABIs disagree
> on u64 alignment, which the kernel encodes as COMPAT_FOR_U64_ALIGNMENT.
> 
> [...]

Here is the summary with links:
  - [1/7] xfrm: use compat translator only for u64 alignment mismatch
    https://git.kernel.org/netdev/net/c/355fbcbdc253
  - [2/7] net: af_key: initialize alg_key_len for IPComp states
    https://git.kernel.org/netdev/net/c/d129c3177d7b
  - [3/7] xfrm: Fix dev use-after-free in xfrm async resumption
    https://git.kernel.org/netdev/net/c/8045c0df98d4
  - [4/7] xfrm: Fix xfrm state cache insertion race
    https://git.kernel.org/netdev/net/c/ddd3d0132920
  - [5/7] xfrm: annotate data-races around xfrm_policy_count[] and xfrm_policy_default[]
    https://git.kernel.org/netdev/net/c/68de007d5ac9
  - [6/7] espintcp: use sk_msg_free_partial to fix partial send
    https://git.kernel.org/netdev/net/c/007800408002
  - [7/7] xfrm: validate selector family and prefixlen during match
    https://git.kernel.org/netdev/net/c/40f0b1047918

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH net v2] net: usb: lan78xx: restore VLAN and hash filters after link up
From: patchwork-bot+netdevbpf @ 2026-06-23 23:30 UTC (permalink / raw)
  To: Nicolai Buchwitz
  Cc: Thangaraj.S, Rengarajan.S, UNGLinuxDriver, Woojung.Huh,
	andrew+netdev, davem, edumazet, kuba, pabeni, schuchmann, netdev,
	linux-usb, linux-kernel
In-Reply-To: <20260622102911.484045-1-nb@tipi-net.de>

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Mon, 22 Jun 2026 12:29:11 +0200 you wrote:
> Configured VLANs intermittently stop receiving traffic after a link
> down/up cycle, e.g. when the network cable is unplugged and plugged back
> in. VLAN filtering stays enabled but all VLAN-tagged frames are dropped
> until a VLAN is added or removed again.
> 
> The LAN7801 datasheet (DS00002123E) states:
> 
> [...]

Here is the summary with links:
  - [net,v2] net: usb: lan78xx: restore VLAN and hash filters after link up
    https://git.kernel.org/netdev/net/c/5c12248673c7

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply
