netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/8] Netfilter updates for net-next
@ 2020-11-04 14:11 Pablo Neira Ayuso
  2020-11-05  2:18 ` Jakub Kicinski
  0 siblings, 1 reply; 18+ messages in thread
From: Pablo Neira Ayuso @ 2020-11-04 14:11 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba

Hi,

The following patchset contains Netfilter updates for net-next:

1) Move existing bridge packet reject infra to nf_reject_{ipv4,ipv6}.c
   from Jose M. Guisado.

2) Consolidate nft_reject_inet initialization and dump, also from Jose.

3) Add the netdev reject action, from Jose.

4) Allow to combine the exist flag and the destroy command in ipset,
   from Joszef Kadlecsik.

5) Expose bucket size parameter for hashtables, also from Jozsef.

6) Expose the init value for reproducible ipset listings, from Jozsef.

7) Use __printf attribute in nft_request_module, from Andrew Lunn.

8) Allow to use reject from the inet ingress chain.

Please, pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git

Thanks.

----------------------------------------------------------------

The following changes since commit 37d38ece9b898ea183db9e5a6582651e6ed64c9a:

  net/mac8390: discard unnecessary breaks (2020-10-29 19:03:46 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git HEAD

for you to fetch changes up to 117ca1f8920cf4087bf82f44bd2a51b49d6aae63:

  netfilter: nft_reject_inet: allow to use reject from inet ingress (2020-11-01 12:52:17 +0100)

----------------------------------------------------------------
Andrew Lunn (1):
      netfilter: nftables: Add __printf() attribute

Jose M. Guisado Gomez (3):
      netfilter: nf_reject: add reject skbuff creation helpers
      netfilter: nft_reject: unify reject init and dump into nft_reject
      netfilter: nft_reject: add reject verdict support for netdev

Jozsef Kadlecsik (3):
      netfilter: ipset: Support the -exist flag with the destroy command
      netfilter: ipset: Add bucketsize parameter to all hash types
      netfilter: ipset: Expose the initval hash parameter to userspace

Pablo Neira Ayuso (1):
      netfilter: nft_reject_inet: allow to use reject from inet ingress

 include/linux/netfilter/ipset/ip_set.h       |   5 +
 include/net/netfilter/ipv4/nf_reject.h       |  10 ++
 include/net/netfilter/ipv6/nf_reject.h       |   9 +
 include/uapi/linux/netfilter/ipset/ip_set.h  |   6 +-
 net/bridge/netfilter/Kconfig                 |   2 +-
 net/bridge/netfilter/nft_reject_bridge.c     | 255 +--------------------------
 net/ipv4/netfilter/nf_reject_ipv4.c          | 128 +++++++++++++-
 net/ipv6/netfilter/nf_reject_ipv6.c          | 139 ++++++++++++++-
 net/netfilter/Kconfig                        |  10 ++
 net/netfilter/Makefile                       |   1 +
 net/netfilter/ipset/ip_set_core.c            |   6 +-
 net/netfilter/ipset/ip_set_hash_gen.h        |  45 +++--
 net/netfilter/ipset/ip_set_hash_ip.c         |   7 +-
 net/netfilter/ipset/ip_set_hash_ipmac.c      |   6 +-
 net/netfilter/ipset/ip_set_hash_ipmark.c     |   7 +-
 net/netfilter/ipset/ip_set_hash_ipport.c     |   7 +-
 net/netfilter/ipset/ip_set_hash_ipportip.c   |   7 +-
 net/netfilter/ipset/ip_set_hash_ipportnet.c  |   7 +-
 net/netfilter/ipset/ip_set_hash_mac.c        |   6 +-
 net/netfilter/ipset/ip_set_hash_net.c        |   7 +-
 net/netfilter/ipset/ip_set_hash_netiface.c   |   7 +-
 net/netfilter/ipset/ip_set_hash_netnet.c     |   7 +-
 net/netfilter/ipset/ip_set_hash_netport.c    |   7 +-
 net/netfilter/ipset/ip_set_hash_netportnet.c |   7 +-
 net/netfilter/nf_tables_api.c                |   3 +-
 net/netfilter/nft_reject.c                   |  12 +-
 net/netfilter/nft_reject_inet.c              |  68 ++-----
 net/netfilter/nft_reject_netdev.c            | 189 ++++++++++++++++++++
 28 files changed, 615 insertions(+), 355 deletions(-)
 create mode 100644 net/netfilter/nft_reject_netdev.c

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next 0/8] Netfilter updates for net-next
  2020-11-04 14:11 Pablo Neira Ayuso
@ 2020-11-05  2:18 ` Jakub Kicinski
  0 siblings, 0 replies; 18+ messages in thread
From: Jakub Kicinski @ 2020-11-05  2:18 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, davem, netdev

On Wed,  4 Nov 2020 15:11:41 +0100 Pablo Neira Ayuso wrote:
> 1) Move existing bridge packet reject infra to nf_reject_{ipv4,ipv6}.c
>    from Jose M. Guisado.
> 
> 2) Consolidate nft_reject_inet initialization and dump, also from Jose.
> 
> 3) Add the netdev reject action, from Jose.
> 
> 4) Allow to combine the exist flag and the destroy command in ipset,
>    from Joszef Kadlecsik.
> 
> 5) Expose bucket size parameter for hashtables, also from Jozsef.
> 
> 6) Expose the init value for reproducible ipset listings, from Jozsef.
> 
> 7) Use __printf attribute in nft_request_module, from Andrew Lunn.
> 
> 8) Allow to use reject from the inet ingress chain.

Pulled, thanks!

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH net-next 0/8] Netfilter updates for net-next
@ 2021-08-30  9:38 Pablo Neira Ayuso
  0 siblings, 0 replies; 18+ messages in thread
From: Pablo Neira Ayuso @ 2021-08-30  9:38 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba

Hi,

The following patchset contains Netfilter updates for net-next:

1) Clean up and consolidate ct ecache infrastructure by merging ct and
   expect notifiers, from Florian Westphal.

2) Missing counters and timestamp in nfnetlink_queue and _log conntrack
   information.

3) Missing error check for xt_register_template() in iptables mangle,
   as a incremental fix for the previous pull request, also from
   Florian Westphal.

4) Add netfilter hooks for the SRv6 lightweigh tunnel driver, from
   Ryoga Sato. The hooks are enabled via nf_hooks_lwtunnel sysctl
   to make sure existing netfilter rulesets do not break. There is
   a static key to disable the hooks by default.

   The pktgen_bench_xmit_mode_netif_receive.sh shows no noticeable
   impact in the seg6_input path for non-netfilter users: similar
   numbers with and without this patch.

   This is a sample of the perf report output:

    11.67%  kpktgend_0       [ipv6]                    [k] ipv6_get_saddr_eval
     7.89%  kpktgend_0       [ipv6]                    [k] __ipv6_addr_label
     7.52%  kpktgend_0       [ipv6]                    [k] __ipv6_dev_get_saddr
     6.63%  kpktgend_0       [kernel.vmlinux]          [k] asm_exc_nmi
     4.74%  kpktgend_0       [ipv6]                    [k] fib6_node_lookup_1
     3.48%  kpktgend_0       [kernel.vmlinux]          [k] pskb_expand_head
     3.33%  kpktgend_0       [ipv6]                    [k] ip6_rcv_core.isra.29
     3.33%  kpktgend_0       [ipv6]                    [k] seg6_do_srh_encap
     2.53%  kpktgend_0       [ipv6]                    [k] ipv6_dev_get_saddr
     2.45%  kpktgend_0       [ipv6]                    [k] fib6_table_lookup
     2.24%  kpktgend_0       [kernel.vmlinux]          [k] ___cache_free
     2.16%  kpktgend_0       [ipv6]                    [k] ip6_pol_route
     2.11%  kpktgend_0       [kernel.vmlinux]          [k] __ipv6_addr_type

Please, pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git

Thanks.

----------------------------------------------------------------

The following changes since commit 87e5ef4b19cec86c861e3ebab3a5d840ecc2f4a4:

  mctp: Remove the repeated declaration (2021-08-25 11:23:14 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git HEAD

for you to fetch changes up to 7a3f5b0de3647c854e34269c3332d7a1e902901a:

  netfilter: add netfilter hooks to SRv6 data plane (2021-08-30 01:51:36 +0200)

----------------------------------------------------------------
Florian Westphal (5):
      netfilter: ecache: remove one indent level
      netfilter: ecache: remove another indent level
      netfilter: ecache: add common helper for nf_conntrack_eventmask_report
      netfilter: ecache: prepare for event notifier merge
      netfilter: ecache: remove nf_exp_event_notifier structure

Lukas Bulwahn (1):
      netfilter: x_tables: handle xt_register_template() returning an error value

Pablo Neira Ayuso (1):
      netfilter: ctnetlink: missing counters and timestamp in nfnetlink_{log,queue}

Ryoga Saito (1):
      netfilter: add netfilter hooks to SRv6 data plane

 Documentation/networking/nf_conntrack-sysctl.rst |   7 +
 include/net/lwtunnel.h                           |   3 +
 include/net/netfilter/nf_conntrack_ecache.h      |  32 ++--
 include/net/netfilter/nf_hooks_lwtunnel.h        |   7 +
 include/net/netns/conntrack.h                    |   1 -
 net/core/lwtunnel.c                              |   3 +
 net/ipv4/netfilter/iptable_mangle.c              |   2 +
 net/ipv6/seg6_iptunnel.c                         |  75 +++++++-
 net/ipv6/seg6_local.c                            | 111 ++++++++----
 net/netfilter/Makefile                           |   3 +
 net/netfilter/nf_conntrack_ecache.c              | 211 +++++++++--------------
 net/netfilter/nf_conntrack_netlink.c             |  56 ++----
 net/netfilter/nf_conntrack_standalone.c          |  15 ++
 net/netfilter/nf_hooks_lwtunnel.c                |  53 ++++++
 14 files changed, 345 insertions(+), 234 deletions(-)
 create mode 100644 include/net/netfilter/nf_hooks_lwtunnel.h
 create mode 100644 net/netfilter/nf_hooks_lwtunnel.c

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH net-next 0/8] Netfilter updates for net-next
@ 2023-12-22 11:57 Pablo Neira Ayuso
  0 siblings, 0 replies; 18+ messages in thread
From: Pablo Neira Ayuso @ 2023-12-22 11:57 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw

Hi,

The following patchset contains Netfilter updates for net-next:

1) Add locking for NFT_MSG_GETSETELEM_RESET requests, to address a
   race scenario with two concurrent processes running a dump-and-reset
   which exposes negative counters to userspace, from Phil Sutter.

2) Use GFP_KERNEL in pipapo GC, from Florian Westphal.

3) Reorder nf_flowtable struct members, place the read-mostly parts
   accessed by the datapath first. From Florian Westphal.

4) Set on dead flag for NFT_MSG_NEWSET in abort path,
   from Florian Westphal.

5) Support filtering zone in ctnetlink, from Felix Huettner.

6) Bail out if user tries to redefine an existing chain with different
   type in nf_tables.

Please, pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next.git nf-next-23-12-22

Thanks.

----------------------------------------------------------------

The following changes since commit 56794e5358542b7c652f202946e53bfd2373b5e0:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net (2023-12-21 22:17:23 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next.git tags/nf-next-23-12-22

for you to fetch changes up to aaba7ddc8507f4ad5bbd07988573967632bc2385:

  netfilter: nf_tables: validate chain type update if available (2023-12-22 12:15:28 +0100)

----------------------------------------------------------------
netfilter pull request 23-12-22

----------------------------------------------------------------
Felix Huettner (1):
      netfilter: ctnetlink: support filtering by zone

Florian Westphal (3):
      netfilter: nft_set_pipapo: prefer gfp_kernel allocation
      netfilter: flowtable: reorder nf_flowtable struct members
      netfilter: nf_tables: mark newset as dead on transaction abort

Pablo Neira Ayuso (1):
      netfilter: nf_tables: validate chain type update if available

Phil Sutter (3):
      netfilter: nf_tables: Pass const set to nft_get_set_elem
      netfilter: nf_tables: Introduce nft_set_dump_ctx_init()
      netfilter: nf_tables: Add locking for NFT_MSG_GETSETELEM_RESET requests

 include/net/netfilter/nf_flow_table.h              |   9 +-
 net/netfilter/nf_conntrack_netlink.c               |  12 +-
 net/netfilter/nf_tables_api.c                      | 147 +++++--
 net/netfilter/nft_set_pipapo.c                     |   2 +-
 tools/testing/selftests/netfilter/.gitignore       |   2 +
 tools/testing/selftests/netfilter/Makefile         |   3 +-
 .../selftests/netfilter/conntrack_dump_flush.c     | 430 +++++++++++++++++++++
 7 files changed, 567 insertions(+), 38 deletions(-)
 create mode 100644 tools/testing/selftests/netfilter/conntrack_dump_flush.c

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH net-next 0/8] netfilter: updates for net-next
@ 2025-09-01  8:08 Florian Westphal
  2025-09-01  8:08 ` [PATCH net-next 1/8] netfilter: ebtables: Use vmalloc_array() to improve code Florian Westphal
                   ` (8 more replies)
  0 siblings, 9 replies; 18+ messages in thread
From: Florian Westphal @ 2025-09-01  8:08 UTC (permalink / raw)
  To: netdev
  Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
	netfilter-devel, pablo

Hi,

The following patchset contains Netfilter fixes for *net-next*:

1) prefer vmalloc_array in ebtables, from  Qianfeng Rong.
2) Use csum_replace4 instead of open-coding it, from Christophe Leroy.
3+4) Get rid of GFP_ATOMIC in transaction object allocations, those
     cause silly failures with large sets under memory pressure, from
     myself.
5) Introduce new NFTA_DEVICE_PREFIX attribute in nftables netlink api,
   re-using old NFTA_DEVICE_NAME led to confusion with different
   kernel/userspace versions.  This refines the wildcard interface
   support added in 6.16 release.  From Phil Sutter.
6) Remove test for AVX cpu feature in nftables pipapo set type,
   testing for AVX2 feature is sufficient.
7) Unexport a few function in nf_reject infra: no external callers.
8) Extend payload offset to u16, this was restricted to values <=255
   so far, from Fernando Fernandez Mancera.

Please, pull these changes from:
The following changes since commit 864ecc4a6dade82d3f70eab43dad0e277aa6fc78:

  Merge branch 'net-add-rcu-safety-to-dst-dev' (2025-08-29 19:36:34 -0700)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next.git tags/nf-next-25-09-01

for you to fetch changes up to 0618948e58e09e1ebf59078bf5b7841bbd1ce1d2:

  netfilter: nft_payload: extend offset to 65535 bytes (2025-09-01 09:53:17 +0200)

----------------------------------------------------------------
netfilter pull request nf-next-25-09-01

----------------------------------------------------------------
Christophe Leroy (1):
  netfilter: nft_payload: Use csum_replace4() instead of opencoding

Fernando Fernandez Mancera (1):
  netfilter: nft_payload: extend offset to 65535 bytes

Florian Westphal (4):
  netfilter: nf_tables: allow iter callbacks to sleep
  netfilter: nf_tables: all transaction allocations can now sleep
  netfilter: nft_set_pipapo: remove redundant test for avx feature bit
  netfilter: nf_reject: remove unneeded exports

Phil Sutter (1):
  netfilter: nf_tables: Introduce NFTA_DEVICE_PREFIX

Qianfeng Rong (1):
  netfilter: ebtables: Use vmalloc_array() to improve code

 include/net/netfilter/ipv4/nf_reject.h   |   8 --
 include/net/netfilter/ipv6/nf_reject.h   |  10 ---
 include/net/netfilter/nf_tables.h        |   2 +
 include/net/netfilter/nf_tables_core.h   |   2 +-
 include/uapi/linux/netfilter/nf_tables.h |   2 +
 net/bridge/netfilter/ebtables.c          |  14 ++--
 net/ipv4/netfilter/nf_reject_ipv4.c      |  27 +++---
 net/ipv6/netfilter/nf_reject_ipv6.c      |  37 ++++++---
 net/netfilter/nf_tables_api.c            |  89 +++++++++++---------
 net/netfilter/nft_payload.c              |  20 +++--
 net/netfilter/nft_set_hash.c             | 100 ++++++++++++++++++++++-
 net/netfilter/nft_set_pipapo.c           |   3 +-
 net/netfilter/nft_set_pipapo_avx2.c      |   2 +-
 net/netfilter/nft_set_rbtree.c           |  35 ++++++--
 14 files changed, 242 insertions(+), 109 deletions(-)

-- 
2.49.1


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH net-next 1/8] netfilter: ebtables: Use vmalloc_array() to improve code
  2025-09-01  8:08 [PATCH net-next 0/8] netfilter: updates for net-next Florian Westphal
@ 2025-09-01  8:08 ` Florian Westphal
  2025-09-01  8:08 ` [PATCH net-next 2/8] netfilter: nft_payload: Use csum_replace4() instead of opencoding Florian Westphal
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Florian Westphal @ 2025-09-01  8:08 UTC (permalink / raw)
  To: netdev
  Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
	netfilter-devel, pablo

From: Qianfeng Rong <rongqianfeng@vivo.com>

Remove array_size() calls and replace vmalloc() with vmalloc_array() to
simplify the code.  vmalloc_array() is also optimized better, uses fewer
instructions, and handles overflow more concisely[1].

[1]: https://lore.kernel.org/lkml/abc66ec5-85a4-47e1-9759-2f60ab111971@vivo.com/
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/bridge/netfilter/ebtables.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 3e67d4aff419..5697e3949a36 100644
--- a/net/bridge/netfilter/ebtables.c
+++ b/net/bridge/netfilter/ebtables.c
@@ -920,8 +920,8 @@ static int translate_table(struct net *net, const char *name,
 		 * if an error occurs
 		 */
 		newinfo->chainstack =
-			vmalloc(array_size(nr_cpu_ids,
-					   sizeof(*(newinfo->chainstack))));
+			vmalloc_array(nr_cpu_ids,
+				      sizeof(*(newinfo->chainstack)));
 		if (!newinfo->chainstack)
 			return -ENOMEM;
 		for_each_possible_cpu(i) {
@@ -938,7 +938,7 @@ static int translate_table(struct net *net, const char *name,
 			}
 		}
 
-		cl_s = vmalloc(array_size(udc_cnt, sizeof(*cl_s)));
+		cl_s = vmalloc_array(udc_cnt, sizeof(*cl_s));
 		if (!cl_s)
 			return -ENOMEM;
 		i = 0; /* the i'th udc */
@@ -1018,8 +1018,8 @@ static int do_replace_finish(struct net *net, struct ebt_replace *repl,
 	 * the check on the size is done later, when we have the lock
 	 */
 	if (repl->num_counters) {
-		unsigned long size = repl->num_counters * sizeof(*counterstmp);
-		counterstmp = vmalloc(size);
+		counterstmp = vmalloc_array(repl->num_counters,
+					    sizeof(*counterstmp));
 		if (!counterstmp)
 			return -ENOMEM;
 	}
@@ -1386,7 +1386,7 @@ static int do_update_counters(struct net *net, const char *name,
 	if (num_counters == 0)
 		return -EINVAL;
 
-	tmp = vmalloc(array_size(num_counters, sizeof(*tmp)));
+	tmp = vmalloc_array(num_counters, sizeof(*tmp));
 	if (!tmp)
 		return -ENOMEM;
 
@@ -1526,7 +1526,7 @@ static int copy_counters_to_user(struct ebt_table *t,
 	if (num_counters != nentries)
 		return -EINVAL;
 
-	counterstmp = vmalloc(array_size(nentries, sizeof(*counterstmp)));
+	counterstmp = vmalloc_array(nentries, sizeof(*counterstmp));
 	if (!counterstmp)
 		return -ENOMEM;
 
-- 
2.49.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next 2/8] netfilter: nft_payload: Use csum_replace4() instead of opencoding
  2025-09-01  8:08 [PATCH net-next 0/8] netfilter: updates for net-next Florian Westphal
  2025-09-01  8:08 ` [PATCH net-next 1/8] netfilter: ebtables: Use vmalloc_array() to improve code Florian Westphal
@ 2025-09-01  8:08 ` Florian Westphal
  2025-09-01  8:08 ` [PATCH net-next 3/8] netfilter: nf_tables: allow iter callbacks to sleep Florian Westphal
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Florian Westphal @ 2025-09-01  8:08 UTC (permalink / raw)
  To: netdev
  Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
	netfilter-devel, pablo

From: Christophe Leroy <christophe.leroy@csgroup.eu>

Open coded calculation can be avoided and replaced by the
equivalent csum_replace4() in nft_csum_replace().

Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nft_payload.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/nft_payload.c b/net/netfilter/nft_payload.c
index 7dfc5343dae4..059b28ffad0e 100644
--- a/net/netfilter/nft_payload.c
+++ b/net/netfilter/nft_payload.c
@@ -684,7 +684,7 @@ static const struct nft_expr_ops nft_payload_inner_ops = {
 
 static inline void nft_csum_replace(__sum16 *sum, __wsum fsum, __wsum tsum)
 {
-	*sum = csum_fold(csum_add(csum_sub(~csum_unfold(*sum), fsum), tsum));
+	csum_replace4(sum, (__force __be32)fsum, (__force __be32)tsum);
 	if (*sum == 0)
 		*sum = CSUM_MANGLED_0;
 }
-- 
2.49.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next 3/8] netfilter: nf_tables: allow iter callbacks to sleep
  2025-09-01  8:08 [PATCH net-next 0/8] netfilter: updates for net-next Florian Westphal
  2025-09-01  8:08 ` [PATCH net-next 1/8] netfilter: ebtables: Use vmalloc_array() to improve code Florian Westphal
  2025-09-01  8:08 ` [PATCH net-next 2/8] netfilter: nft_payload: Use csum_replace4() instead of opencoding Florian Westphal
@ 2025-09-01  8:08 ` Florian Westphal
  2025-09-01  8:08 ` [PATCH net-next 4/8] netfilter: nf_tables: all transaction allocations can now sleep Florian Westphal
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Florian Westphal @ 2025-09-01  8:08 UTC (permalink / raw)
  To: netdev
  Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
	netfilter-devel, pablo

Quoting Sven Auhagen:
  we do see on occasions that we get the following error message, more so on
  x86 systems than on arm64:

  Error: Could not process rule: Cannot allocate memory delete table inet filter

  It is not a consistent error and does not happen all the time.
  We are on Kernel 6.6.80, seems to me like we have something along the lines
  of the nf_tables: allow clone callbacks to sleep problem using GFP_ATOMIC.

As hinted at by Sven, this is because of GFP_ATOMIC allocations during
set flush.

When set is flushed, all elements are deactivated. This triggers a set
walk and each element gets added to the transaction list.

The rbtree and rhashtable sets don't allow the iter callback to sleep:
rbtree walk acquires read side of an rwlock with bh disabled, rhashtable
walk happens with rcu read lock held.

Rbtree is simple enough to resolve:
When the walk context is ITER_READ, no change is needed (the iter
callback must not deactivate elements; we're not in a transaction).

When the iter type is ITER_UPDATE, the rwlock isn't needed because the
caller holds the transaction mutex, this prevents any and all changes to
the ruleset, including add/remove of set elements.

Rhashtable is slightly more complex.
When the iter type is ITER_READ, no change is needed, like rbtree.

For ITER_UPDATE, we hold transaction mutex which prevents elements from
getting free'd, even outside of rcu read lock section.

So build a temporary list of all elements while doing the rcu iteration
and then call the iterator in a second pass.

The disadvantage is the need to iterate twice, but this cost comes with
the benefit to allow the iter callback to use GFP_KERNEL allocations in
a followup patch.

The new list based logic makes it necessary to catch recursive calls to
the same set earlier.

Such walk -> iter -> walk recursion for the same set can happen during
ruleset validation in case userspace gave us a bogus (cyclic) ruleset
where verdict map m jumps to chain that sooner or later also calls
"vmap @m".

Before the new ->in_update_walk test, the ruleset is rejected because the
infinite recursion causes ctx->level to exceed the allowed maximum.

But with the new logic added here, elements would get skipped:

nft_rhash_walk_update would see elements that are on the walk_list of
an older stack frame.

As all recursive calls into same map results in -EMLINK, we can avoid this
problem by using the new in_update_walk flag and reject immediately.

Next patch converts the problematic GFP_ATOMIC allocations.

Reported-by: Sven Auhagen <Sven.Auhagen@belden.com>
Closes: https://lore.kernel.org/netfilter-devel/BY1PR18MB5874110CAFF1ED098D0BC4E7E07BA@BY1PR18MB5874.namprd18.prod.outlook.com/
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/netfilter/nf_tables.h |   2 +
 net/netfilter/nft_set_hash.c      | 100 +++++++++++++++++++++++++++++-
 net/netfilter/nft_set_rbtree.c    |  35 ++++++++---
 3 files changed, 126 insertions(+), 11 deletions(-)

diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index 891e43a01bdc..e2128663b160 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -556,6 +556,7 @@ struct nft_set_elem_expr {
  * 	@size: maximum set size
  *	@field_len: length of each field in concatenation, bytes
  *	@field_count: number of concatenated fields in element
+ *	@in_update_walk: true during ->walk() in transaction phase
  *	@use: number of rules references to this set
  * 	@nelems: number of elements
  * 	@ndeact: number of deactivated elements queued for removal
@@ -590,6 +591,7 @@ struct nft_set {
 	u32				size;
 	u8				field_len[NFT_REG32_COUNT];
 	u8				field_count;
+	bool				in_update_walk;
 	u32				use;
 	atomic_t			nelems;
 	u32				ndeact;
diff --git a/net/netfilter/nft_set_hash.c b/net/netfilter/nft_set_hash.c
index 266d0c637225..ba01ce75d6de 100644
--- a/net/netfilter/nft_set_hash.c
+++ b/net/netfilter/nft_set_hash.c
@@ -30,6 +30,7 @@ struct nft_rhash {
 struct nft_rhash_elem {
 	struct nft_elem_priv		priv;
 	struct rhash_head		node;
+	struct llist_node		walk_node;
 	u32				wq_gc_seq;
 	struct nft_set_ext		ext;
 };
@@ -144,6 +145,7 @@ nft_rhash_update(struct nft_set *set, const u32 *key,
 		goto err1;
 
 	he = nft_elem_priv_cast(elem_priv);
+	init_llist_node(&he->walk_node);
 	prev = rhashtable_lookup_get_insert_key(&priv->ht, &arg, &he->node,
 						nft_rhash_params);
 	if (IS_ERR(prev))
@@ -180,6 +182,7 @@ static int nft_rhash_insert(const struct net *net, const struct nft_set *set,
 	};
 	struct nft_rhash_elem *prev;
 
+	init_llist_node(&he->walk_node);
 	prev = rhashtable_lookup_get_insert_key(&priv->ht, &arg, &he->node,
 						nft_rhash_params);
 	if (IS_ERR(prev))
@@ -261,12 +264,12 @@ static bool nft_rhash_delete(const struct nft_set *set,
 	return true;
 }
 
-static void nft_rhash_walk(const struct nft_ctx *ctx, struct nft_set *set,
-			   struct nft_set_iter *iter)
+static void nft_rhash_walk_ro(const struct nft_ctx *ctx, struct nft_set *set,
+			      struct nft_set_iter *iter)
 {
 	struct nft_rhash *priv = nft_set_priv(set);
-	struct nft_rhash_elem *he;
 	struct rhashtable_iter hti;
+	struct nft_rhash_elem *he;
 
 	rhashtable_walk_enter(&priv->ht, &hti);
 	rhashtable_walk_start(&hti);
@@ -295,6 +298,97 @@ static void nft_rhash_walk(const struct nft_ctx *ctx, struct nft_set *set,
 	rhashtable_walk_exit(&hti);
 }
 
+static void nft_rhash_walk_update(const struct nft_ctx *ctx,
+				  struct nft_set *set,
+				  struct nft_set_iter *iter)
+{
+	struct nft_rhash *priv = nft_set_priv(set);
+	struct nft_rhash_elem *he, *tmp;
+	struct llist_node *first_node;
+	struct rhashtable_iter hti;
+	LLIST_HEAD(walk_list);
+
+	lockdep_assert_held(&nft_pernet(ctx->net)->commit_mutex);
+
+	if (set->in_update_walk) {
+		/* This can happen with bogus rulesets during ruleset validation
+		 * when a verdict map causes a jump back to the same map.
+		 *
+		 * Without this extra check the walk_next loop below will see
+		 * elems on the callers walk_list and skip (not validate) them.
+		 */
+		iter->err = -EMLINK;
+		return;
+	}
+
+	/* walk happens under RCU.
+	 *
+	 * We create a snapshot list so ->iter callback can sleep.
+	 * commit_mutex is held, elements can ...
+	 * .. be added in parallel from dataplane (dynset)
+	 * .. be marked as dead in parallel from dataplane (dynset).
+	 * .. be queued for removal in parallel (gc timeout).
+	 * .. not be freed: transaction mutex is held.
+	 */
+	rhashtable_walk_enter(&priv->ht, &hti);
+	rhashtable_walk_start(&hti);
+
+	while ((he = rhashtable_walk_next(&hti))) {
+		if (IS_ERR(he)) {
+			if (PTR_ERR(he) != -EAGAIN) {
+				iter->err = PTR_ERR(he);
+				break;
+			}
+
+			continue;
+		}
+
+		/* rhashtable resized during walk, skip */
+		if (llist_on_list(&he->walk_node))
+			continue;
+
+		llist_add(&he->walk_node, &walk_list);
+	}
+	rhashtable_walk_stop(&hti);
+	rhashtable_walk_exit(&hti);
+
+	first_node = __llist_del_all(&walk_list);
+	set->in_update_walk = true;
+	llist_for_each_entry_safe(he, tmp, first_node, walk_node) {
+		if (iter->err == 0) {
+			iter->err = iter->fn(ctx, set, iter, &he->priv);
+			if (iter->err == 0)
+				iter->count++;
+		}
+
+		/* all entries must be cleared again, else next ->walk iteration
+		 * will skip entries.
+		 */
+		init_llist_node(&he->walk_node);
+	}
+	set->in_update_walk = false;
+}
+
+static void nft_rhash_walk(const struct nft_ctx *ctx, struct nft_set *set,
+			   struct nft_set_iter *iter)
+{
+	switch (iter->type) {
+	case NFT_ITER_UPDATE:
+		/* only relevant for netlink dumps which use READ type */
+		WARN_ON_ONCE(iter->skip != 0);
+
+		nft_rhash_walk_update(ctx, set, iter);
+		break;
+	case NFT_ITER_READ:
+		nft_rhash_walk_ro(ctx, set, iter);
+		break;
+	default:
+		iter->err = -EINVAL;
+		WARN_ON_ONCE(1);
+		break;
+	}
+}
+
 static bool nft_rhash_expr_needs_gc_run(const struct nft_set *set,
 					struct nft_set_ext *ext)
 {
diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
index 938a257c069e..b311b66df3e9 100644
--- a/net/netfilter/nft_set_rbtree.c
+++ b/net/netfilter/nft_set_rbtree.c
@@ -584,15 +584,14 @@ nft_rbtree_deactivate(const struct net *net, const struct nft_set *set,
 	return NULL;
 }
 
-static void nft_rbtree_walk(const struct nft_ctx *ctx,
-			    struct nft_set *set,
-			    struct nft_set_iter *iter)
+static void nft_rbtree_do_walk(const struct nft_ctx *ctx,
+			       struct nft_set *set,
+			       struct nft_set_iter *iter)
 {
 	struct nft_rbtree *priv = nft_set_priv(set);
 	struct nft_rbtree_elem *rbe;
 	struct rb_node *node;
 
-	read_lock_bh(&priv->lock);
 	for (node = rb_first(&priv->root); node != NULL; node = rb_next(node)) {
 		rbe = rb_entry(node, struct nft_rbtree_elem, node);
 
@@ -600,14 +599,34 @@ static void nft_rbtree_walk(const struct nft_ctx *ctx,
 			goto cont;
 
 		iter->err = iter->fn(ctx, set, iter, &rbe->priv);
-		if (iter->err < 0) {
-			read_unlock_bh(&priv->lock);
+		if (iter->err < 0)
 			return;
-		}
 cont:
 		iter->count++;
 	}
-	read_unlock_bh(&priv->lock);
+}
+
+static void nft_rbtree_walk(const struct nft_ctx *ctx,
+			    struct nft_set *set,
+			    struct nft_set_iter *iter)
+{
+	struct nft_rbtree *priv = nft_set_priv(set);
+
+	switch (iter->type) {
+	case NFT_ITER_UPDATE:
+		lockdep_assert_held(&nft_pernet(ctx->net)->commit_mutex);
+		nft_rbtree_do_walk(ctx, set, iter);
+		break;
+	case NFT_ITER_READ:
+		read_lock_bh(&priv->lock);
+		nft_rbtree_do_walk(ctx, set, iter);
+		read_unlock_bh(&priv->lock);
+		break;
+	default:
+		iter->err = -EINVAL;
+		WARN_ON_ONCE(1);
+		break;
+	}
 }
 
 static void nft_rbtree_gc_remove(struct net *net, struct nft_set *set,
-- 
2.49.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next 4/8] netfilter: nf_tables: all transaction allocations can now sleep
  2025-09-01  8:08 [PATCH net-next 0/8] netfilter: updates for net-next Florian Westphal
                   ` (2 preceding siblings ...)
  2025-09-01  8:08 ` [PATCH net-next 3/8] netfilter: nf_tables: allow iter callbacks to sleep Florian Westphal
@ 2025-09-01  8:08 ` Florian Westphal
  2025-09-01  8:08 ` [PATCH net-next 5/8] netfilter: nf_tables: Introduce NFTA_DEVICE_PREFIX Florian Westphal
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Florian Westphal @ 2025-09-01  8:08 UTC (permalink / raw)
  To: netdev
  Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
	netfilter-devel, pablo

Now that nft_setelem_flush is not called with rcu read lock held or
disabled softinterrupts anymore this can now use GFP_KERNEL too.

This is the last atomic allocation of transaction elements, so remove
all gfp_t arguments and the wrapper function.

This makes attempts to delete large sets much more reliable, before
this was prone to transient memory allocation failures.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nf_tables_api.c | 47 ++++++++++++++---------------------
 1 file changed, 19 insertions(+), 28 deletions(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 58c5425d61c2..54519b3d2868 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -151,12 +151,12 @@ static void nft_ctx_init(struct nft_ctx *ctx,
 	bitmap_zero(ctx->reg_inited, NFT_REG32_NUM);
 }
 
-static struct nft_trans *nft_trans_alloc_gfp(const struct nft_ctx *ctx,
-					     int msg_type, u32 size, gfp_t gfp)
+static struct nft_trans *nft_trans_alloc(const struct nft_ctx *ctx,
+					 int msg_type, u32 size)
 {
 	struct nft_trans *trans;
 
-	trans = kzalloc(size, gfp);
+	trans = kzalloc(size, GFP_KERNEL);
 	if (trans == NULL)
 		return NULL;
 
@@ -172,12 +172,6 @@ static struct nft_trans *nft_trans_alloc_gfp(const struct nft_ctx *ctx,
 	return trans;
 }
 
-static struct nft_trans *nft_trans_alloc(const struct nft_ctx *ctx,
-					 int msg_type, u32 size)
-{
-	return nft_trans_alloc_gfp(ctx, msg_type, size, GFP_KERNEL);
-}
-
 static struct nft_trans_binding *nft_trans_get_binding(struct nft_trans *trans)
 {
 	switch (trans->msg_type) {
@@ -442,8 +436,7 @@ static bool nft_trans_collapse_set_elem_allowed(const struct nft_trans_elem *a,
 
 static bool nft_trans_collapse_set_elem(struct nftables_pernet *nft_net,
 					struct nft_trans_elem *tail,
-					struct nft_trans_elem *trans,
-					gfp_t gfp)
+					struct nft_trans_elem *trans)
 {
 	unsigned int nelems, old_nelems = tail->nelems;
 	struct nft_trans_elem *new_trans;
@@ -466,9 +459,11 @@ static bool nft_trans_collapse_set_elem(struct nftables_pernet *nft_net,
 	/* krealloc might free tail which invalidates list pointers */
 	list_del_init(&tail->nft_trans.list);
 
-	new_trans = krealloc(tail, struct_size(tail, elems, nelems), gfp);
+	new_trans = krealloc(tail, struct_size(tail, elems, nelems),
+			     GFP_KERNEL);
 	if (!new_trans) {
-		list_add_tail(&tail->nft_trans.list, &nft_net->commit_list);
+		list_add_tail(&tail->nft_trans.list,
+			      &nft_net->commit_list);
 		return false;
 	}
 
@@ -484,7 +479,7 @@ static bool nft_trans_collapse_set_elem(struct nftables_pernet *nft_net,
 }
 
 static bool nft_trans_try_collapse(struct nftables_pernet *nft_net,
-				   struct nft_trans *trans, gfp_t gfp)
+				   struct nft_trans *trans)
 {
 	struct nft_trans *tail;
 
@@ -501,7 +496,7 @@ static bool nft_trans_try_collapse(struct nftables_pernet *nft_net,
 	case NFT_MSG_DELSETELEM:
 		return nft_trans_collapse_set_elem(nft_net,
 						   nft_trans_container_elem(tail),
-						   nft_trans_container_elem(trans), gfp);
+						   nft_trans_container_elem(trans));
 	}
 
 	return false;
@@ -537,17 +532,14 @@ static void nft_trans_commit_list_add_tail(struct net *net, struct nft_trans *tr
 	}
 }
 
-static void nft_trans_commit_list_add_elem(struct net *net, struct nft_trans *trans,
-					   gfp_t gfp)
+static void nft_trans_commit_list_add_elem(struct net *net, struct nft_trans *trans)
 {
 	struct nftables_pernet *nft_net = nft_pernet(net);
 
 	WARN_ON_ONCE(trans->msg_type != NFT_MSG_NEWSETELEM &&
 		     trans->msg_type != NFT_MSG_DELSETELEM);
 
-	might_alloc(gfp);
-
-	if (nft_trans_try_collapse(nft_net, trans, gfp)) {
+	if (nft_trans_try_collapse(nft_net, trans)) {
 		kfree(trans);
 		return;
 	}
@@ -7549,7 +7541,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 						}
 
 						ue->priv = elem_priv;
-						nft_trans_commit_list_add_elem(ctx->net, trans, GFP_KERNEL);
+						nft_trans_commit_list_add_elem(ctx->net, trans);
 						goto err_elem_free;
 					}
 				}
@@ -7573,7 +7565,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 	}
 
 	nft_trans_container_elem(trans)->elems[0].priv = elem.priv;
-	nft_trans_commit_list_add_elem(ctx->net, trans, GFP_KERNEL);
+	nft_trans_commit_list_add_elem(ctx->net, trans);
 	return 0;
 
 err_set_full:
@@ -7839,7 +7831,7 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set,
 	nft_setelem_data_deactivate(ctx->net, set, elem.priv);
 
 	nft_trans_container_elem(trans)->elems[0].priv = elem.priv;
-	nft_trans_commit_list_add_elem(ctx->net, trans, GFP_KERNEL);
+	nft_trans_commit_list_add_elem(ctx->net, trans);
 	return 0;
 
 fail_ops:
@@ -7864,9 +7856,8 @@ static int nft_setelem_flush(const struct nft_ctx *ctx,
 	if (!nft_set_elem_active(ext, iter->genmask))
 		return 0;
 
-	trans = nft_trans_alloc_gfp(ctx, NFT_MSG_DELSETELEM,
-				    struct_size_t(struct nft_trans_elem, elems, 1),
-				    GFP_ATOMIC);
+	trans = nft_trans_alloc(ctx, NFT_MSG_DELSETELEM,
+				struct_size_t(struct nft_trans_elem, elems, 1));
 	if (!trans)
 		return -ENOMEM;
 
@@ -7877,7 +7868,7 @@ static int nft_setelem_flush(const struct nft_ctx *ctx,
 	nft_trans_elem_set(trans) = set;
 	nft_trans_container_elem(trans)->nelems = 1;
 	nft_trans_container_elem(trans)->elems[0].priv = elem_priv;
-	nft_trans_commit_list_add_elem(ctx->net, trans, GFP_ATOMIC);
+	nft_trans_commit_list_add_elem(ctx->net, trans);
 
 	return 0;
 }
@@ -7894,7 +7885,7 @@ static int __nft_set_catchall_flush(const struct nft_ctx *ctx,
 
 	nft_setelem_data_deactivate(ctx->net, set, elem_priv);
 	nft_trans_container_elem(trans)->elems[0].priv = elem_priv;
-	nft_trans_commit_list_add_elem(ctx->net, trans, GFP_KERNEL);
+	nft_trans_commit_list_add_elem(ctx->net, trans);
 
 	return 0;
 }
-- 
2.49.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next 5/8] netfilter: nf_tables: Introduce NFTA_DEVICE_PREFIX
  2025-09-01  8:08 [PATCH net-next 0/8] netfilter: updates for net-next Florian Westphal
                   ` (3 preceding siblings ...)
  2025-09-01  8:08 ` [PATCH net-next 4/8] netfilter: nf_tables: all transaction allocations can now sleep Florian Westphal
@ 2025-09-01  8:08 ` Florian Westphal
  2025-09-01 20:46   ` Jakub Kicinski
  2025-09-01  8:08 ` [PATCH net-next 6/8] netfilter: nft_set_pipapo: remove redundant test for avx feature bit Florian Westphal
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 18+ messages in thread
From: Florian Westphal @ 2025-09-01  8:08 UTC (permalink / raw)
  To: netdev
  Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
	netfilter-devel, pablo

From: Phil Sutter <phil@nwl.cc>

This new attribute is supposed to be used instead of NFTA_DEVICE_NAME
for simple wildcard interface specs. It holds a NUL-terminated string
representing an interface name prefix to match on.

While kernel code to distinguish full names from prefixes in
NFTA_DEVICE_NAME is simpler than this solution, reusing the existing
attribute with different semantics leads to confusion between different
versions of kernel and user space though:

* With old kernels, wildcards submitted by user space are accepted yet
  silently treated as regular names.
* With old user space, wildcards submitted by kernel may cause crashes
  since libnftnl expects NUL-termination when there is none.

Using a distinct attribute type sanitizes these situations as the
receiving part detects and rejects the unexpected attribute nested in
*_HOOK_DEVS attributes.

Fixes: 6d07a289504a ("netfilter: nf_tables: Support wildcard netdev hook specs")
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/uapi/linux/netfilter/nf_tables.h |  2 ++
 net/netfilter/nf_tables_api.c            | 42 +++++++++++++++++-------
 2 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index 2beb30be2c5f..8e0eb832bc01 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -1784,10 +1784,12 @@ enum nft_synproxy_attributes {
  * enum nft_device_attributes - nf_tables device netlink attributes
  *
  * @NFTA_DEVICE_NAME: name of this device (NLA_STRING)
+ * @NFTA_DEVICE_PREFIX: device name prefix, a simple wildcard (NLA_STRING)
  */
 enum nft_devices_attributes {
 	NFTA_DEVICE_UNSPEC,
 	NFTA_DEVICE_NAME,
+	NFTA_DEVICE_PREFIX,
 	__NFTA_DEVICE_MAX
 };
 #define NFTA_DEVICE_MAX		(__NFTA_DEVICE_MAX - 1)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 54519b3d2868..68e273d8821a 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -1951,6 +1951,18 @@ static int nft_dump_stats(struct sk_buff *skb, struct nft_stats __percpu *stats)
 	return -ENOSPC;
 }
 
+static bool hook_is_prefix(struct nft_hook *hook)
+{
+	return strlen(hook->ifname) >= hook->ifnamelen;
+}
+
+static int nft_nla_put_hook_dev(struct sk_buff *skb, struct nft_hook *hook)
+{
+	int attr = hook_is_prefix(hook) ? NFTA_DEVICE_PREFIX : NFTA_DEVICE_NAME;
+
+	return nla_put_string(skb, attr, hook->ifname);
+}
+
 static int nft_dump_basechain_hook(struct sk_buff *skb,
 				   const struct net *net, int family,
 				   const struct nft_base_chain *basechain,
@@ -1982,16 +1994,15 @@ static int nft_dump_basechain_hook(struct sk_buff *skb,
 			if (!first)
 				first = hook;
 
-			if (nla_put(skb, NFTA_DEVICE_NAME,
-				    hook->ifnamelen, hook->ifname))
+			if (nft_nla_put_hook_dev(skb, hook))
 				goto nla_put_failure;
 			n++;
 		}
 		nla_nest_end(skb, nest_devs);
 
 		if (n == 1 &&
-		    nla_put(skb, NFTA_HOOK_DEV,
-			    first->ifnamelen, first->ifname))
+		    !hook_is_prefix(first) &&
+		    nla_put_string(skb, NFTA_HOOK_DEV, first->ifname))
 			goto nla_put_failure;
 	}
 	nla_nest_end(skb, nest);
@@ -2302,7 +2313,8 @@ void nf_tables_chain_destroy(struct nft_chain *chain)
 }
 
 static struct nft_hook *nft_netdev_hook_alloc(struct net *net,
-					      const struct nlattr *attr)
+					      const struct nlattr *attr,
+					      bool prefix)
 {
 	struct nf_hook_ops *ops;
 	struct net_device *dev;
@@ -2319,7 +2331,8 @@ static struct nft_hook *nft_netdev_hook_alloc(struct net *net,
 	if (err < 0)
 		goto err_hook_free;
 
-	hook->ifnamelen = nla_len(attr);
+	/* include the terminating NUL-char when comparing non-prefixes */
+	hook->ifnamelen = strlen(hook->ifname) + !prefix;
 
 	/* nf_tables_netdev_event() is called under rtnl_mutex, this is
 	 * indirectly serializing all the other holders of the commit_mutex with
@@ -2366,14 +2379,22 @@ static int nf_tables_parse_netdev_hooks(struct net *net,
 	struct nft_hook *hook, *next;
 	const struct nlattr *tmp;
 	int rem, n = 0, err;
+	bool prefix;
 
 	nla_for_each_nested(tmp, attr, rem) {
-		if (nla_type(tmp) != NFTA_DEVICE_NAME) {
+		switch (nla_type(tmp)) {
+		case NFTA_DEVICE_NAME:
+			prefix = false;
+			break;
+		case NFTA_DEVICE_PREFIX:
+			prefix = true;
+			break;
+		default:
 			err = -EINVAL;
 			goto err_hook;
 		}
 
-		hook = nft_netdev_hook_alloc(net, tmp);
+		hook = nft_netdev_hook_alloc(net, tmp, prefix);
 		if (IS_ERR(hook)) {
 			NL_SET_BAD_ATTR(extack, tmp);
 			err = PTR_ERR(hook);
@@ -2419,7 +2440,7 @@ static int nft_chain_parse_netdev(struct net *net, struct nlattr *tb[],
 	int err;
 
 	if (tb[NFTA_HOOK_DEV]) {
-		hook = nft_netdev_hook_alloc(net, tb[NFTA_HOOK_DEV]);
+		hook = nft_netdev_hook_alloc(net, tb[NFTA_HOOK_DEV], false);
 		if (IS_ERR(hook)) {
 			NL_SET_BAD_ATTR(extack, tb[NFTA_HOOK_DEV]);
 			return PTR_ERR(hook);
@@ -9449,8 +9470,7 @@ static int nf_tables_fill_flowtable_info(struct sk_buff *skb, struct net *net,
 
 	list_for_each_entry_rcu(hook, hook_list, list,
 				lockdep_commit_lock_is_held(net)) {
-		if (nla_put(skb, NFTA_DEVICE_NAME,
-			    hook->ifnamelen, hook->ifname))
+		if (nft_nla_put_hook_dev(skb, hook))
 			goto nla_put_failure;
 	}
 	nla_nest_end(skb, nest_devs);
-- 
2.49.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next 6/8] netfilter: nft_set_pipapo: remove redundant test for avx feature bit
  2025-09-01  8:08 [PATCH net-next 0/8] netfilter: updates for net-next Florian Westphal
                   ` (4 preceding siblings ...)
  2025-09-01  8:08 ` [PATCH net-next 5/8] netfilter: nf_tables: Introduce NFTA_DEVICE_PREFIX Florian Westphal
@ 2025-09-01  8:08 ` Florian Westphal
  2025-09-01  8:08 ` [PATCH net-next 7/8] netfilter: nf_reject: remove unneeded exports Florian Westphal
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Florian Westphal @ 2025-09-01  8:08 UTC (permalink / raw)
  To: netdev
  Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
	netfilter-devel, pablo

Sebastian points out that avx2 depends on avx, see check_cpufeature_deps()
in arch/x86/kernel/cpu/cpuid-deps.c:
avx2 feature bit will be cleared when avx isn't available.

No functional change intended.

Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nft_set_pipapo.c      | 3 +--
 net/netfilter/nft_set_pipapo_avx2.c | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c
index b385cfcf886f..4b64c3bd8e70 100644
--- a/net/netfilter/nft_set_pipapo.c
+++ b/net/netfilter/nft_set_pipapo.c
@@ -530,8 +530,7 @@ static struct nft_pipapo_elem *pipapo_get(const struct nft_pipapo_match *m,
 	local_bh_disable();
 
 #if defined(CONFIG_X86_64) && !defined(CONFIG_UML)
-	if (boot_cpu_has(X86_FEATURE_AVX2) && boot_cpu_has(X86_FEATURE_AVX) &&
-	    irq_fpu_usable()) {
+	if (boot_cpu_has(X86_FEATURE_AVX2) && irq_fpu_usable()) {
 		e = pipapo_get_avx2(m, data, genmask, tstamp);
 		local_bh_enable();
 		return e;
diff --git a/net/netfilter/nft_set_pipapo_avx2.c b/net/netfilter/nft_set_pipapo_avx2.c
index 29326f3fcaf3..7559306d0aed 100644
--- a/net/netfilter/nft_set_pipapo_avx2.c
+++ b/net/netfilter/nft_set_pipapo_avx2.c
@@ -1099,7 +1099,7 @@ bool nft_pipapo_avx2_estimate(const struct nft_set_desc *desc, u32 features,
 	    desc->field_count < NFT_PIPAPO_MIN_FIELDS)
 		return false;
 
-	if (!boot_cpu_has(X86_FEATURE_AVX2) || !boot_cpu_has(X86_FEATURE_AVX))
+	if (!boot_cpu_has(X86_FEATURE_AVX2))
 		return false;
 
 	est->size = pipapo_estimate_size(desc);
-- 
2.49.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next 7/8] netfilter: nf_reject: remove unneeded exports
  2025-09-01  8:08 [PATCH net-next 0/8] netfilter: updates for net-next Florian Westphal
                   ` (5 preceding siblings ...)
  2025-09-01  8:08 ` [PATCH net-next 6/8] netfilter: nft_set_pipapo: remove redundant test for avx feature bit Florian Westphal
@ 2025-09-01  8:08 ` Florian Westphal
  2025-09-01  8:08 ` [PATCH net-next 8/8] netfilter: nft_payload: extend offset to 65535 bytes Florian Westphal
  2025-09-02 10:53 ` [PATCH net-next 0/8] netfilter: updates for net-next Florian Westphal
  8 siblings, 0 replies; 18+ messages in thread
From: Florian Westphal @ 2025-09-01  8:08 UTC (permalink / raw)
  To: netdev
  Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
	netfilter-devel, pablo

These functions have no external callers and can be static.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/netfilter/ipv4/nf_reject.h |  8 ------
 include/net/netfilter/ipv6/nf_reject.h | 10 -------
 net/ipv4/netfilter/nf_reject_ipv4.c    | 27 ++++++++++++-------
 net/ipv6/netfilter/nf_reject_ipv6.c    | 37 +++++++++++++++++---------
 4 files changed, 42 insertions(+), 40 deletions(-)

diff --git a/include/net/netfilter/ipv4/nf_reject.h b/include/net/netfilter/ipv4/nf_reject.h
index c653fcb88354..09de2f2686b5 100644
--- a/include/net/netfilter/ipv4/nf_reject.h
+++ b/include/net/netfilter/ipv4/nf_reject.h
@@ -10,14 +10,6 @@
 void nf_send_unreach(struct sk_buff *skb_in, int code, int hook);
 void nf_send_reset(struct net *net, struct sock *, struct sk_buff *oldskb,
 		   int hook);
-const struct tcphdr *nf_reject_ip_tcphdr_get(struct sk_buff *oldskb,
-					     struct tcphdr *_oth, int hook);
-struct iphdr *nf_reject_iphdr_put(struct sk_buff *nskb,
-				  const struct sk_buff *oldskb,
-				  __u8 protocol, int ttl);
-void nf_reject_ip_tcphdr_put(struct sk_buff *nskb, const struct sk_buff *oldskb,
-			     const struct tcphdr *oth);
-
 struct sk_buff *nf_reject_skb_v4_unreach(struct net *net,
                                          struct sk_buff *oldskb,
                                          const struct net_device *dev,
diff --git a/include/net/netfilter/ipv6/nf_reject.h b/include/net/netfilter/ipv6/nf_reject.h
index d729344ba644..94ec0b9f2838 100644
--- a/include/net/netfilter/ipv6/nf_reject.h
+++ b/include/net/netfilter/ipv6/nf_reject.h
@@ -9,16 +9,6 @@ void nf_send_unreach6(struct net *net, struct sk_buff *skb_in, unsigned char cod
 		      unsigned int hooknum);
 void nf_send_reset6(struct net *net, struct sock *sk, struct sk_buff *oldskb,
 		    int hook);
-const struct tcphdr *nf_reject_ip6_tcphdr_get(struct sk_buff *oldskb,
-					      struct tcphdr *otcph,
-					      unsigned int *otcplen, int hook);
-struct ipv6hdr *nf_reject_ip6hdr_put(struct sk_buff *nskb,
-				     const struct sk_buff *oldskb,
-				     __u8 protocol, int hoplimit);
-void nf_reject_ip6_tcphdr_put(struct sk_buff *nskb,
-			      const struct sk_buff *oldskb,
-			      const struct tcphdr *oth, unsigned int otcplen);
-
 struct sk_buff *nf_reject_skb_v6_tcp_reset(struct net *net,
 					   struct sk_buff *oldskb,
 					   const struct net_device *dev,
diff --git a/net/ipv4/netfilter/nf_reject_ipv4.c b/net/ipv4/netfilter/nf_reject_ipv4.c
index 0d3cb2ba6fc8..05631abe3f0d 100644
--- a/net/ipv4/netfilter/nf_reject_ipv4.c
+++ b/net/ipv4/netfilter/nf_reject_ipv4.c
@@ -12,6 +12,15 @@
 #include <linux/netfilter_ipv4.h>
 #include <linux/netfilter_bridge.h>
 
+static struct iphdr *nf_reject_iphdr_put(struct sk_buff *nskb,
+					 const struct sk_buff *oldskb,
+					 __u8 protocol, int ttl);
+static void nf_reject_ip_tcphdr_put(struct sk_buff *nskb, const struct sk_buff *oldskb,
+				    const struct tcphdr *oth);
+static const struct tcphdr *
+nf_reject_ip_tcphdr_get(struct sk_buff *oldskb,
+			struct tcphdr *_oth, int hook);
+
 static int nf_reject_iphdr_validate(struct sk_buff *skb)
 {
 	struct iphdr *iph;
@@ -136,8 +145,9 @@ struct sk_buff *nf_reject_skb_v4_unreach(struct net *net,
 }
 EXPORT_SYMBOL_GPL(nf_reject_skb_v4_unreach);
 
-const struct tcphdr *nf_reject_ip_tcphdr_get(struct sk_buff *oldskb,
-					     struct tcphdr *_oth, int hook)
+static const struct tcphdr *
+nf_reject_ip_tcphdr_get(struct sk_buff *oldskb,
+			struct tcphdr *_oth, int hook)
 {
 	const struct tcphdr *oth;
 
@@ -163,11 +173,10 @@ const struct tcphdr *nf_reject_ip_tcphdr_get(struct sk_buff *oldskb,
 
 	return oth;
 }
-EXPORT_SYMBOL_GPL(nf_reject_ip_tcphdr_get);
 
-struct iphdr *nf_reject_iphdr_put(struct sk_buff *nskb,
-				  const struct sk_buff *oldskb,
-				  __u8 protocol, int ttl)
+static struct iphdr *nf_reject_iphdr_put(struct sk_buff *nskb,
+					 const struct sk_buff *oldskb,
+					 __u8 protocol, int ttl)
 {
 	struct iphdr *niph, *oiph = ip_hdr(oldskb);
 
@@ -188,10 +197,9 @@ struct iphdr *nf_reject_iphdr_put(struct sk_buff *nskb,
 
 	return niph;
 }
-EXPORT_SYMBOL_GPL(nf_reject_iphdr_put);
 
-void nf_reject_ip_tcphdr_put(struct sk_buff *nskb, const struct sk_buff *oldskb,
-			  const struct tcphdr *oth)
+static void nf_reject_ip_tcphdr_put(struct sk_buff *nskb, const struct sk_buff *oldskb,
+				    const struct tcphdr *oth)
 {
 	struct iphdr *niph = ip_hdr(nskb);
 	struct tcphdr *tcph;
@@ -218,7 +226,6 @@ void nf_reject_ip_tcphdr_put(struct sk_buff *nskb, const struct sk_buff *oldskb,
 	nskb->csum_start = (unsigned char *)tcph - nskb->head;
 	nskb->csum_offset = offsetof(struct tcphdr, check);
 }
-EXPORT_SYMBOL_GPL(nf_reject_ip_tcphdr_put);
 
 static int nf_reject_fill_skb_dst(struct sk_buff *skb_in)
 {
diff --git a/net/ipv6/netfilter/nf_reject_ipv6.c b/net/ipv6/netfilter/nf_reject_ipv6.c
index cb2d38e80de9..6b022449f867 100644
--- a/net/ipv6/netfilter/nf_reject_ipv6.c
+++ b/net/ipv6/netfilter/nf_reject_ipv6.c
@@ -12,6 +12,19 @@
 #include <linux/netfilter_ipv6.h>
 #include <linux/netfilter_bridge.h>
 
+static struct ipv6hdr *
+nf_reject_ip6hdr_put(struct sk_buff *nskb,
+		     const struct sk_buff *oldskb,
+		     __u8 protocol, int hoplimit);
+static void
+nf_reject_ip6_tcphdr_put(struct sk_buff *nskb,
+			 const struct sk_buff *oldskb,
+			 const struct tcphdr *oth, unsigned int otcplen);
+static const struct tcphdr *
+nf_reject_ip6_tcphdr_get(struct sk_buff *oldskb,
+			 struct tcphdr *otcph,
+			 unsigned int *otcplen, int hook);
+
 static bool nf_reject_v6_csum_ok(struct sk_buff *skb, int hook)
 {
 	const struct ipv6hdr *ip6h = ipv6_hdr(skb);
@@ -146,9 +159,10 @@ struct sk_buff *nf_reject_skb_v6_unreach(struct net *net,
 }
 EXPORT_SYMBOL_GPL(nf_reject_skb_v6_unreach);
 
-const struct tcphdr *nf_reject_ip6_tcphdr_get(struct sk_buff *oldskb,
-					      struct tcphdr *otcph,
-					      unsigned int *otcplen, int hook)
+static const struct tcphdr *
+nf_reject_ip6_tcphdr_get(struct sk_buff *oldskb,
+			 struct tcphdr *otcph,
+			 unsigned int *otcplen, int hook)
 {
 	const struct ipv6hdr *oip6h = ipv6_hdr(oldskb);
 	u8 proto;
@@ -192,11 +206,11 @@ const struct tcphdr *nf_reject_ip6_tcphdr_get(struct sk_buff *oldskb,
 
 	return otcph;
 }
-EXPORT_SYMBOL_GPL(nf_reject_ip6_tcphdr_get);
 
-struct ipv6hdr *nf_reject_ip6hdr_put(struct sk_buff *nskb,
-				     const struct sk_buff *oldskb,
-				     __u8 protocol, int hoplimit)
+static struct ipv6hdr *
+nf_reject_ip6hdr_put(struct sk_buff *nskb,
+		     const struct sk_buff *oldskb,
+		     __u8 protocol, int hoplimit)
 {
 	struct ipv6hdr *ip6h;
 	const struct ipv6hdr *oip6h = ipv6_hdr(oldskb);
@@ -216,11 +230,11 @@ struct ipv6hdr *nf_reject_ip6hdr_put(struct sk_buff *nskb,
 
 	return ip6h;
 }
-EXPORT_SYMBOL_GPL(nf_reject_ip6hdr_put);
 
-void nf_reject_ip6_tcphdr_put(struct sk_buff *nskb,
-			      const struct sk_buff *oldskb,
-			      const struct tcphdr *oth, unsigned int otcplen)
+static void
+nf_reject_ip6_tcphdr_put(struct sk_buff *nskb,
+			 const struct sk_buff *oldskb,
+			 const struct tcphdr *oth, unsigned int otcplen)
 {
 	struct tcphdr *tcph;
 
@@ -248,7 +262,6 @@ void nf_reject_ip6_tcphdr_put(struct sk_buff *nskb,
 				      csum_partial(tcph,
 						   sizeof(struct tcphdr), 0));
 }
-EXPORT_SYMBOL_GPL(nf_reject_ip6_tcphdr_put);
 
 static int nf_reject6_fill_skb_dst(struct sk_buff *skb_in)
 {
-- 
2.49.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next 8/8] netfilter: nft_payload: extend offset to 65535 bytes
  2025-09-01  8:08 [PATCH net-next 0/8] netfilter: updates for net-next Florian Westphal
                   ` (6 preceding siblings ...)
  2025-09-01  8:08 ` [PATCH net-next 7/8] netfilter: nf_reject: remove unneeded exports Florian Westphal
@ 2025-09-01  8:08 ` Florian Westphal
  2025-09-02 10:53 ` [PATCH net-next 0/8] netfilter: updates for net-next Florian Westphal
  8 siblings, 0 replies; 18+ messages in thread
From: Florian Westphal @ 2025-09-01  8:08 UTC (permalink / raw)
  To: netdev
  Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
	netfilter-devel, pablo

From: Fernando Fernandez Mancera <fmancera@suse.de>

In some situations 255 bytes offset is not enough to match or manipulate
the desired packet field. Increase the offset limit to 65535 or U16_MAX.

In addition, the nla policy maximum value is not set anymore as it is
limited to s16. Instead, the maximum value is checked during the payload
expression initialization function.

Tested with the nft command line tool.

table ip filter {
	chain output {
		@nh,2040,8 set 0xff
		@nh,524280,8 set 0xff
		@nh,524280,8 0xff
		@nh,2040,8 0xff
	}
}

Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/netfilter/nf_tables_core.h |  2 +-
 net/netfilter/nft_payload.c            | 18 +++++++++++-------
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/include/net/netfilter/nf_tables_core.h b/include/net/netfilter/nf_tables_core.h
index 6c2f483d9828..7644cfe9267d 100644
--- a/include/net/netfilter/nf_tables_core.h
+++ b/include/net/netfilter/nf_tables_core.h
@@ -73,7 +73,7 @@ struct nft_ct {
 
 struct nft_payload {
 	enum nft_payload_bases	base:8;
-	u8			offset;
+	u16			offset;
 	u8			len;
 	u8			dreg;
 };
diff --git a/net/netfilter/nft_payload.c b/net/netfilter/nft_payload.c
index 059b28ffad0e..b0214418f75a 100644
--- a/net/netfilter/nft_payload.c
+++ b/net/netfilter/nft_payload.c
@@ -40,7 +40,7 @@ static bool nft_payload_rebuild_vlan_hdr(const struct sk_buff *skb, int mac_off,
 
 /* add vlan header into the user buffer for if tag was removed by offloads */
 static bool
-nft_payload_copy_vlan(u32 *d, const struct sk_buff *skb, u8 offset, u8 len)
+nft_payload_copy_vlan(u32 *d, const struct sk_buff *skb, u16 offset, u8 len)
 {
 	int mac_off = skb_mac_header(skb) - skb->data;
 	u8 *vlanh, *dst_u8 = (u8 *) d;
@@ -212,7 +212,7 @@ static const struct nla_policy nft_payload_policy[NFTA_PAYLOAD_MAX + 1] = {
 	[NFTA_PAYLOAD_SREG]		= { .type = NLA_U32 },
 	[NFTA_PAYLOAD_DREG]		= { .type = NLA_U32 },
 	[NFTA_PAYLOAD_BASE]		= { .type = NLA_U32 },
-	[NFTA_PAYLOAD_OFFSET]		= NLA_POLICY_MAX(NLA_BE32, 255),
+	[NFTA_PAYLOAD_OFFSET]		= { .type = NLA_BE32 },
 	[NFTA_PAYLOAD_LEN]		= NLA_POLICY_MAX(NLA_BE32, 255),
 	[NFTA_PAYLOAD_CSUM_TYPE]	= { .type = NLA_U32 },
 	[NFTA_PAYLOAD_CSUM_OFFSET]	= NLA_POLICY_MAX(NLA_BE32, 255),
@@ -797,7 +797,7 @@ static int nft_payload_csum_inet(struct sk_buff *skb, const u32 *src,
 
 struct nft_payload_set {
 	enum nft_payload_bases	base:8;
-	u8			offset;
+	u16			offset;
 	u8			len;
 	u8			sreg;
 	u8			csum_type;
@@ -812,7 +812,7 @@ struct nft_payload_vlan_hdr {
 };
 
 static bool
-nft_payload_set_vlan(const u32 *src, struct sk_buff *skb, u8 offset, u8 len,
+nft_payload_set_vlan(const u32 *src, struct sk_buff *skb, u16 offset, u8 len,
 		     int *vlan_hlen)
 {
 	struct nft_payload_vlan_hdr *vlanh;
@@ -940,14 +940,18 @@ static int nft_payload_set_init(const struct nft_ctx *ctx,
 				const struct nft_expr *expr,
 				const struct nlattr * const tb[])
 {
+	u32 csum_offset, offset, csum_type = NFT_PAYLOAD_CSUM_NONE;
 	struct nft_payload_set *priv = nft_expr_priv(expr);
-	u32 csum_offset, csum_type = NFT_PAYLOAD_CSUM_NONE;
 	int err;
 
 	priv->base        = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_BASE]));
-	priv->offset      = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_OFFSET]));
 	priv->len         = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_LEN]));
 
+	err = nft_parse_u32_check(tb[NFTA_PAYLOAD_OFFSET], U16_MAX, &offset);
+	if (err < 0)
+		return err;
+	priv->offset = offset;
+
 	if (tb[NFTA_PAYLOAD_CSUM_TYPE])
 		csum_type = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_CSUM_TYPE]));
 	if (tb[NFTA_PAYLOAD_CSUM_OFFSET]) {
@@ -1069,7 +1073,7 @@ nft_payload_select_ops(const struct nft_ctx *ctx,
 	if (tb[NFTA_PAYLOAD_DREG] == NULL)
 		return ERR_PTR(-EINVAL);
 
-	err = nft_parse_u32_check(tb[NFTA_PAYLOAD_OFFSET], U8_MAX, &offset);
+	err = nft_parse_u32_check(tb[NFTA_PAYLOAD_OFFSET], U16_MAX, &offset);
 	if (err < 0)
 		return ERR_PTR(err);
 
-- 
2.49.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next 5/8] netfilter: nf_tables: Introduce NFTA_DEVICE_PREFIX
  2025-09-01  8:08 ` [PATCH net-next 5/8] netfilter: nf_tables: Introduce NFTA_DEVICE_PREFIX Florian Westphal
@ 2025-09-01 20:46   ` Jakub Kicinski
  2025-09-01 21:12     ` Pablo Neira Ayuso
  0 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2025-09-01 20:46 UTC (permalink / raw)
  To: Florian Westphal
  Cc: netdev, Paolo Abeni, David S. Miller, Eric Dumazet,
	netfilter-devel, pablo

On Mon,  1 Sep 2025 10:08:39 +0200 Florian Westphal wrote:
> This new attribute is supposed to be used instead of NFTA_DEVICE_NAME
> for simple wildcard interface specs. It holds a NUL-terminated string
> representing an interface name prefix to match on.
> 
> While kernel code to distinguish full names from prefixes in
> NFTA_DEVICE_NAME is simpler than this solution, reusing the existing
> attribute with different semantics leads to confusion between different
> versions of kernel and user space though:
> 
> * With old kernels, wildcards submitted by user space are accepted yet
>   silently treated as regular names.
> * With old user space, wildcards submitted by kernel may cause crashes
>   since libnftnl expects NUL-termination when there is none.
> 
> Using a distinct attribute type sanitizes these situations as the
> receiving part detects and rejects the unexpected attribute nested in
> *_HOOK_DEVS attributes.
> 
> Fixes: 6d07a289504a ("netfilter: nf_tables: Support wildcard netdev hook specs")

Why is this not targeting net? The sooner we adjust the uAPI the better.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next 5/8] netfilter: nf_tables: Introduce NFTA_DEVICE_PREFIX
  2025-09-01 20:46   ` Jakub Kicinski
@ 2025-09-01 21:12     ` Pablo Neira Ayuso
  2025-09-02  0:04       ` Florian Westphal
  0 siblings, 1 reply; 18+ messages in thread
From: Pablo Neira Ayuso @ 2025-09-01 21:12 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Florian Westphal, netdev, Paolo Abeni, David S. Miller,
	Eric Dumazet, netfilter-devel

On Mon, Sep 01, 2025 at 01:46:02PM -0700, Jakub Kicinski wrote:
> On Mon,  1 Sep 2025 10:08:39 +0200 Florian Westphal wrote:
> > This new attribute is supposed to be used instead of NFTA_DEVICE_NAME
> > for simple wildcard interface specs. It holds a NUL-terminated string
> > representing an interface name prefix to match on.
> > 
> > While kernel code to distinguish full names from prefixes in
> > NFTA_DEVICE_NAME is simpler than this solution, reusing the existing
> > attribute with different semantics leads to confusion between different
> > versions of kernel and user space though:
> > 
> > * With old kernels, wildcards submitted by user space are accepted yet
> >   silently treated as regular names.
> > * With old user space, wildcards submitted by kernel may cause crashes
> >   since libnftnl expects NUL-termination when there is none.
> > 
> > Using a distinct attribute type sanitizes these situations as the
> > receiving part detects and rejects the unexpected attribute nested in
> > *_HOOK_DEVS attributes.
> > 
> > Fixes: 6d07a289504a ("netfilter: nf_tables: Support wildcard netdev hook specs")
> 
> Why is this not targeting net? The sooner we adjust the uAPI the better.

I think there were doubts that was possible at this stage.

But I agree, it is a bit late but better fix it there.

Florian?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next 5/8] netfilter: nf_tables: Introduce NFTA_DEVICE_PREFIX
  2025-09-01 21:12     ` Pablo Neira Ayuso
@ 2025-09-02  0:04       ` Florian Westphal
  2025-09-02 13:03         ` Paolo Abeni
  0 siblings, 1 reply; 18+ messages in thread
From: Florian Westphal @ 2025-09-02  0:04 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: Jakub Kicinski, netdev, Paolo Abeni, David S. Miller,
	Eric Dumazet, netfilter-devel

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On Mon, Sep 01, 2025 at 01:46:02PM -0700, Jakub Kicinski wrote:
> > Why is this not targeting net? The sooner we adjust the uAPI the better.

I considered it a new feature rather than a bug fix:

Userspace can't rely on the existing api because kernels before
6.16 don't special-case the names provided, and nftables doesn't
make use of the 6.16 "accident" (the attempt to re-use the existing
device name attribute for this).

The corresponding userspace changes (v4 uses the new attribute)
haven't been merged yet.

But sure, getting rid of the "accident" faster makes sense,
thanks for suggesting this.

> I think there were doubts that was possible at this stage.
> But I agree, it is a bit late but better fix it there.

Alright, I'll send a new nf-next PR with this one dropped in a few hours
and a separate nf.git PR with this patch included.

If you prefer to just apply this series sand this patch that worls for
me too, I'll just follow this changesets' patchwork state.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next 0/8] netfilter: updates for net-next
  2025-09-01  8:08 [PATCH net-next 0/8] netfilter: updates for net-next Florian Westphal
                   ` (7 preceding siblings ...)
  2025-09-01  8:08 ` [PATCH net-next 8/8] netfilter: nft_payload: extend offset to 65535 bytes Florian Westphal
@ 2025-09-02 10:53 ` Florian Westphal
  8 siblings, 0 replies; 18+ messages in thread
From: Florian Westphal @ 2025-09-02 10:53 UTC (permalink / raw)
  To: netdev
  Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
	netfilter-devel, pablo

Florian Westphal <fw@strlen.de> wrote:
> The following patchset contains Netfilter fixes for *net-next*:
> 
> 1) prefer vmalloc_array in ebtables, from  Qianfeng Rong.
> 2) Use csum_replace4 instead of open-coding it, from Christophe Leroy.
> 3+4) Get rid of GFP_ATOMIC in transaction object allocations, those
>      cause silly failures with large sets under memory pressure, from
>      myself.
> 5) Introduce new NFTA_DEVICE_PREFIX attribute in nftables netlink api,
>    re-using old NFTA_DEVICE_NAME led to confusion with different
>    kernel/userspace versions.  This refines the wildcard interface
>    support added in 6.16 release.  From Phil Sutter.

As per discussion I'll route patch 5 via net tree instead, so:

pw-bot: changes-requested

A new nf-next -> net-next MR will follow.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next 5/8] netfilter: nf_tables: Introduce NFTA_DEVICE_PREFIX
  2025-09-02  0:04       ` Florian Westphal
@ 2025-09-02 13:03         ` Paolo Abeni
  0 siblings, 0 replies; 18+ messages in thread
From: Paolo Abeni @ 2025-09-02 13:03 UTC (permalink / raw)
  To: Florian Westphal, Pablo Neira Ayuso
  Cc: Jakub Kicinski, netdev, David S. Miller, Eric Dumazet,
	netfilter-devel

On 9/2/25 2:04 AM, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
>> On Mon, Sep 01, 2025 at 01:46:02PM -0700, Jakub Kicinski wrote:
>>> Why is this not targeting net? The sooner we adjust the uAPI the better.
> 
> I considered it a new feature rather than a bug fix:
> 
> Userspace can't rely on the existing api because kernels before
> 6.16 don't special-case the names provided, and nftables doesn't
> make use of the 6.16 "accident" (the attempt to re-use the existing
> device name attribute for this).
> 
> The corresponding userspace changes (v4 uses the new attribute)
> haven't been merged yet.
> 
> But sure, getting rid of the "accident" faster makes sense,
> thanks for suggesting this.
> 
>> I think there were doubts that was possible at this stage.
>> But I agree, it is a bit late but better fix it there.
> 
> Alright, I'll send a new nf-next PR with this one dropped in a few hours
> and a separate nf.git PR with this patch included.

I agree that this latter option is preferable.

Thanks,

Paolo


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2025-09-02 13:03 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-01  8:08 [PATCH net-next 0/8] netfilter: updates for net-next Florian Westphal
2025-09-01  8:08 ` [PATCH net-next 1/8] netfilter: ebtables: Use vmalloc_array() to improve code Florian Westphal
2025-09-01  8:08 ` [PATCH net-next 2/8] netfilter: nft_payload: Use csum_replace4() instead of opencoding Florian Westphal
2025-09-01  8:08 ` [PATCH net-next 3/8] netfilter: nf_tables: allow iter callbacks to sleep Florian Westphal
2025-09-01  8:08 ` [PATCH net-next 4/8] netfilter: nf_tables: all transaction allocations can now sleep Florian Westphal
2025-09-01  8:08 ` [PATCH net-next 5/8] netfilter: nf_tables: Introduce NFTA_DEVICE_PREFIX Florian Westphal
2025-09-01 20:46   ` Jakub Kicinski
2025-09-01 21:12     ` Pablo Neira Ayuso
2025-09-02  0:04       ` Florian Westphal
2025-09-02 13:03         ` Paolo Abeni
2025-09-01  8:08 ` [PATCH net-next 6/8] netfilter: nft_set_pipapo: remove redundant test for avx feature bit Florian Westphal
2025-09-01  8:08 ` [PATCH net-next 7/8] netfilter: nf_reject: remove unneeded exports Florian Westphal
2025-09-01  8:08 ` [PATCH net-next 8/8] netfilter: nft_payload: extend offset to 65535 bytes Florian Westphal
2025-09-02 10:53 ` [PATCH net-next 0/8] netfilter: updates for net-next Florian Westphal
  -- strict thread matches above, loose matches on Subject: below --
2023-12-22 11:57 [PATCH net-next 0/8] Netfilter " Pablo Neira Ayuso
2021-08-30  9:38 Pablo Neira Ayuso
2020-11-04 14:11 Pablo Neira Ayuso
2020-11-05  2:18 ` Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).