netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/9] Netfilter updates for net-next
@ 2024-08-22 22:19 Pablo Neira Ayuso
  2024-08-22 22:19 ` [PATCH net-next 1/9] netfilter: nfnetlink_queue: unbreak SCTP traffic Pablo Neira Ayuso
                   ` (8 more replies)
  0 siblings, 9 replies; 18+ messages in thread
From: Pablo Neira Ayuso @ 2024-08-22 22:19 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw

Hi,

The following batch contains Netfilter updates for net-next:

Patch #1 fix checksum calculation in nfnetlink_queue with SCTP,
	 segment GSO packet since skb_zerocopy() does not support
	 GSO_BY_FRAGS, from Antonio Ojea.

Patch #2 extend nfnetlink_queue coverage to handle SCTP packets,
	 from Antonio Ojea.

Patch #3 uses consume_skb() instead of kfree_skb() in nfnetlink,
         from Donald Hunter.

Patch #4 adds a dedicate commit list for sets to speed up
	 intra-transaction lookups, from Florian Westphal.

Patch #5 skips removal of element from abort path for the pipapo
         backend, ditching the shadow copy of this datastructure
	 is sufficient.

Patch #6 moves nf_ct_netns_get() out of nf_conncount_init() to
	 let users of conncoiunt decide when to enable conntrack,
	 this is needed by openvswitch, from Xin Long.

Patch #7 pass context to all nft_parse_register_load() in
	 preparation for the next patch.

Patches #8 and #9 reject loads from uninitialized registers from
	 control plane to remove register initialization from
	 datapath. From Florian Westphal.

Please, pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next.git nf-next-24-08-23

Thanks.

----------------------------------------------------------------

The following changes since commit 1bf8e07c382bd4f04ede81ecc05267a8ffd60999:

  dt-binding: ptp: fsl,ptp: add pci1957,ee02 compatible string for fsl,enetc-ptp (2024-08-19 09:48:53 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next.git tags/nf-next-24-08-23

for you to fetch changes up to c88baabf16d1ef74ab8832de9761226406af5507:

  netfilter: nf_tables: don't initialize registers in nft_do_chain() (2024-08-20 12:37:25 +0200)

----------------------------------------------------------------
netfilter pull request 24-08-23

----------------------------------------------------------------
Antonio Ojea (2):
      netfilter: nfnetlink_queue: unbreak SCTP traffic
      selftests: netfilter: nft_queue.sh: sctp coverage

Donald Hunter (1):
      netfilter: nfnetlink: convert kfree_skb to consume_skb

Florian Westphal (4):
      netfilter: nf_tables: store new sets in dedicated list
      netfilter: nf_tables: pass context structure to nft_parse_register_load
      netfilter: nf_tables: allow loads only when register is initialized
      netfilter: nf_tables: don't initialize registers in nft_do_chain()

Pablo Neira Ayuso (1):
      netfilter: nf_tables: do not remove elements if set backend implements .abort

Xin Long (1):
      netfilter: move nf_ct_netns_get out of nf_conncount_init

 include/net/netfilter/nf_conntrack_count.h         |  6 +-
 include/net/netfilter/nf_tables.h                  |  6 +-
 net/bridge/netfilter/nft_meta_bridge.c             |  2 +-
 net/core/dev.c                                     |  1 +
 net/ipv4/netfilter/nft_dup_ipv4.c                  |  4 +-
 net/ipv6/netfilter/nft_dup_ipv6.c                  |  4 +-
 net/netfilter/nf_conncount.c                       | 15 +---
 net/netfilter/nf_tables_api.c                      | 75 +++++++++++++++----
 net/netfilter/nf_tables_core.c                     |  2 +-
 net/netfilter/nfnetlink.c                          | 14 ++--
 net/netfilter/nfnetlink_queue.c                    | 12 ++-
 net/netfilter/nft_bitwise.c                        |  4 +-
 net/netfilter/nft_byteorder.c                      |  2 +-
 net/netfilter/nft_cmp.c                            |  6 +-
 net/netfilter/nft_ct.c                             |  2 +-
 net/netfilter/nft_dup_netdev.c                     |  2 +-
 net/netfilter/nft_dynset.c                         |  4 +-
 net/netfilter/nft_exthdr.c                         |  2 +-
 net/netfilter/nft_fwd_netdev.c                     |  6 +-
 net/netfilter/nft_hash.c                           |  2 +-
 net/netfilter/nft_lookup.c                         |  2 +-
 net/netfilter/nft_masq.c                           |  4 +-
 net/netfilter/nft_meta.c                           |  2 +-
 net/netfilter/nft_nat.c                            |  8 +-
 net/netfilter/nft_objref.c                         |  2 +-
 net/netfilter/nft_payload.c                        |  2 +-
 net/netfilter/nft_queue.c                          |  2 +-
 net/netfilter/nft_range.c                          |  2 +-
 net/netfilter/nft_redir.c                          |  4 +-
 net/netfilter/nft_tproxy.c                         |  4 +-
 net/netfilter/xt_connlimit.c                       | 15 +++-
 net/openvswitch/conntrack.c                        |  5 +-
 tools/testing/selftests/net/netfilter/config       |  2 +
 tools/testing/selftests/net/netfilter/nft_queue.sh | 85 +++++++++++++++++++++-
 34 files changed, 226 insertions(+), 84 deletions(-)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH net-next 1/9] netfilter: nfnetlink_queue: unbreak SCTP traffic
  2024-08-22 22:19 [PATCH net-next 0/9] Netfilter updates for net-next Pablo Neira Ayuso
@ 2024-08-22 22:19 ` Pablo Neira Ayuso
  2024-08-26 15:50   ` patchwork-bot+netdevbpf
  2024-08-22 22:19 ` [PATCH net-next 2/9] selftests: netfilter: nft_queue.sh: sctp coverage Pablo Neira Ayuso
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 18+ messages in thread
From: Pablo Neira Ayuso @ 2024-08-22 22:19 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw

From: Antonio Ojea <aojea@google.com>

when packet is enqueued with nfqueue and GSO is enabled, checksum
calculation has to take into account the protocol, as SCTP uses a
32 bits CRC checksum.

Enter skb_gso_segment() path in case of SCTP GSO packets because
skb_zerocopy() does not support for GSO_BY_FRAGS.

Joint work with Pablo.

Signed-off-by: Antonio Ojea <aojea@google.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/core/dev.c                  |  1 +
 net/netfilter/nfnetlink_queue.c | 12 ++++++++++--
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index e7260889d4cb..8384282acadf 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3386,6 +3386,7 @@ int skb_crc32c_csum_help(struct sk_buff *skb)
 out:
 	return ret;
 }
+EXPORT_SYMBOL(skb_crc32c_csum_help);
 
 __be16 skb_network_protocol(struct sk_buff *skb, int *depth)
 {
diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
index e0716da256bf..d2773ce9b585 100644
--- a/net/netfilter/nfnetlink_queue.c
+++ b/net/netfilter/nfnetlink_queue.c
@@ -540,6 +540,14 @@ static int nfqnl_put_bridge(struct nf_queue_entry *entry, struct sk_buff *skb)
 	return -1;
 }
 
+static int nf_queue_checksum_help(struct sk_buff *entskb)
+{
+	if (skb_csum_is_sctp(entskb))
+		return skb_crc32c_csum_help(entskb);
+
+	return skb_checksum_help(entskb);
+}
+
 static struct sk_buff *
 nfqnl_build_packet_message(struct net *net, struct nfqnl_instance *queue,
 			   struct nf_queue_entry *entry,
@@ -602,7 +610,7 @@ nfqnl_build_packet_message(struct net *net, struct nfqnl_instance *queue,
 	case NFQNL_COPY_PACKET:
 		if (!(queue->flags & NFQA_CFG_F_GSO) &&
 		    entskb->ip_summed == CHECKSUM_PARTIAL &&
-		    skb_checksum_help(entskb))
+		    nf_queue_checksum_help(entskb))
 			return NULL;
 
 		data_len = READ_ONCE(queue->copy_range);
@@ -1014,7 +1022,7 @@ nfqnl_enqueue_packet(struct nf_queue_entry *entry, unsigned int queuenum)
 		break;
 	}
 
-	if ((queue->flags & NFQA_CFG_F_GSO) || !skb_is_gso(skb))
+	if (!skb_is_gso(skb) || ((queue->flags & NFQA_CFG_F_GSO) && !skb_is_gso_sctp(skb)))
 		return __nfqnl_enqueue_packet(net, queue, entry);
 
 	nf_bridge_adjust_skb_data(skb);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next 2/9] selftests: netfilter: nft_queue.sh: sctp coverage
  2024-08-22 22:19 [PATCH net-next 0/9] Netfilter updates for net-next Pablo Neira Ayuso
  2024-08-22 22:19 ` [PATCH net-next 1/9] netfilter: nfnetlink_queue: unbreak SCTP traffic Pablo Neira Ayuso
@ 2024-08-22 22:19 ` Pablo Neira Ayuso
  2024-08-27  2:25   ` Jakub Kicinski
  2024-08-22 22:19 ` [PATCH net-next 3/9] netfilter: nfnetlink: convert kfree_skb to consume_skb Pablo Neira Ayuso
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 18+ messages in thread
From: Pablo Neira Ayuso @ 2024-08-22 22:19 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw

From: Antonio Ojea <aojea@google.com>

Test that nfqueue with and without GSO process SCTP packets correctly.

Joint work with Florian and Pablo.

Signed-off-by: Antonio Ojea <aojea@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 tools/testing/selftests/net/netfilter/config  |  2 +
 .../selftests/net/netfilter/nft_queue.sh      | 85 ++++++++++++++++++-
 2 files changed, 86 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/net/netfilter/config b/tools/testing/selftests/net/netfilter/config
index 63ef80ef47a4..b2dd4db45215 100644
--- a/tools/testing/selftests/net/netfilter/config
+++ b/tools/testing/selftests/net/netfilter/config
@@ -87,3 +87,5 @@ CONFIG_XFRM_USER=m
 CONFIG_XFRM_STATISTICS=y
 CONFIG_NET_PKTGEN=m
 CONFIG_TUN=m
+CONFIG_INET_DIAG=m
+CONFIG_SCTP_DIAG=m
diff --git a/tools/testing/selftests/net/netfilter/nft_queue.sh b/tools/testing/selftests/net/netfilter/nft_queue.sh
index c61d23a8c88d..f3bdeb1271eb 100755
--- a/tools/testing/selftests/net/netfilter/nft_queue.sh
+++ b/tools/testing/selftests/net/netfilter/nft_queue.sh
@@ -25,6 +25,9 @@ cleanup()
 }
 
 checktool "nft --version" "test without nft tool"
+checktool "socat -h" "run test without socat"
+
+modprobe -q sctp
 
 trap cleanup EXIT
 
@@ -265,7 +268,6 @@ test_tcp_forward()
 
 test_tcp_localhost()
 {
-	dd conv=sparse status=none if=/dev/zero bs=1M count=200 of="$TMPINPUT"
 	timeout 5 ip netns exec "$nsrouter" socat -u TCP-LISTEN:12345 STDOUT >/dev/null &
 	local rpid=$!
 
@@ -375,6 +377,82 @@ EOF
 	wait 2>/dev/null
 }
 
+sctp_listener_ready()
+{
+	ss -S -N "$1" -ln -o "sport = :12345" | grep -q 12345
+}
+
+test_sctp_forward()
+{
+	ip netns exec "$nsrouter" nft -f /dev/stdin <<EOF
+flush ruleset
+table inet sctpq {
+        chain forward {
+        type filter hook forward priority 0; policy accept;
+                sctp dport 12345 queue num 10
+        }
+}
+EOF
+	timeout 60 ip netns exec "$ns2" socat -u SCTP-LISTEN:12345 STDOUT > "$TMPFILE1" &
+	local rpid=$!
+
+	busywait "$BUSYWAIT_TIMEOUT" sctp_listener_ready "$ns2"
+
+	ip netns exec "$nsrouter" ./nf_queue -q 10 -G -t "$timeout" &
+	local nfqpid=$!
+
+	ip netns exec "$ns1" socat -u STDIN SCTP:10.0.2.99:12345 <"$TMPINPUT" >/dev/null
+
+	if ! ip netns exec "$nsrouter" nft delete table inet sctpq; then
+		echo "FAIL:  Could not delete sctpq table"
+		exit 1
+	fi
+
+	wait "$rpid" && echo "PASS: sctp and nfqueue in forward chain"
+
+	if ! diff -u "$TMPINPUT" "$TMPFILE1" ; then
+		echo "FAIL: lost packets?!" 1>&2
+		exit 1
+	fi
+}
+
+test_sctp_output()
+{
+        ip netns exec "$ns1" nft -f /dev/stdin <<EOF
+table inet sctpq {
+        chain output {
+        type filter hook output priority 0; policy accept;
+                sctp dport 12345 queue num 11
+        }
+}
+EOF
+	# reduce test file size, software segmentation causes sk wmem increase.
+	dd conv=sparse status=none if=/dev/zero bs=1M count=50 of="$TMPINPUT"
+
+	timeout 60 ip netns exec "$ns2" socat -u SCTP-LISTEN:12345 STDOUT > "$TMPFILE1" &
+	local rpid=$!
+
+	busywait "$BUSYWAIT_TIMEOUT" sctp_listener_ready "$ns2"
+
+	ip netns exec "$ns1" ./nf_queue -q 11 -t "$timeout" &
+	local nfqpid=$!
+
+	ip netns exec "$ns1" socat -u STDIN SCTP:10.0.2.99:12345 <"$TMPINPUT" >/dev/null
+
+	if ! ip netns exec "$ns1" nft delete table inet sctpq; then
+		echo "FAIL:  Could not delete sctpq table"
+		exit 1
+	fi
+
+	# must wait before checking completeness of output file.
+	wait "$rpid" && echo "PASS: sctp and nfqueue in output chain with GSO"
+
+	if ! diff -u "$TMPINPUT" "$TMPFILE1" ; then
+		echo "FAIL: lost packets?!" 1>&2
+		exit 1
+	fi
+}
+
 test_queue_removal()
 {
 	read tainted_then < /proc/sys/kernel/tainted
@@ -443,11 +521,16 @@ test_queue 10
 # same.  We queue to a second program as well.
 load_ruleset "filter2" 20
 test_queue 20
+ip netns exec "$ns1" nft flush ruleset
 
 test_tcp_forward
 test_tcp_localhost
 test_tcp_localhost_connectclose
 test_tcp_localhost_requeue
+test_sctp_forward
+test_sctp_output
+
+# should be last, adds vrf device in ns1 and changes routes
 test_icmp_vrf
 test_queue_removal
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next 3/9] netfilter: nfnetlink: convert kfree_skb to consume_skb
  2024-08-22 22:19 [PATCH net-next 0/9] Netfilter updates for net-next Pablo Neira Ayuso
  2024-08-22 22:19 ` [PATCH net-next 1/9] netfilter: nfnetlink_queue: unbreak SCTP traffic Pablo Neira Ayuso
  2024-08-22 22:19 ` [PATCH net-next 2/9] selftests: netfilter: nft_queue.sh: sctp coverage Pablo Neira Ayuso
@ 2024-08-22 22:19 ` Pablo Neira Ayuso
  2024-08-22 22:19 ` [PATCH net-next 4/9] netfilter: nf_tables: store new sets in dedicated list Pablo Neira Ayuso
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Pablo Neira Ayuso @ 2024-08-22 22:19 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw

From: Donald Hunter <donald.hunter@gmail.com>

Use consume_skb in the batch code path to avoid generating spurious
NOT_SPECIFIED skb drop reasons.

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nfnetlink.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/net/netfilter/nfnetlink.c b/net/netfilter/nfnetlink.c
index 932b3ddb34f1..7784ec094097 100644
--- a/net/netfilter/nfnetlink.c
+++ b/net/netfilter/nfnetlink.c
@@ -402,27 +402,27 @@ static void nfnetlink_rcv_batch(struct sk_buff *skb, struct nlmsghdr *nlh,
 		{
 			nfnl_unlock(subsys_id);
 			netlink_ack(oskb, nlh, -EOPNOTSUPP, NULL);
-			return kfree_skb(skb);
+			return consume_skb(skb);
 		}
 	}
 
 	if (!ss->valid_genid || !ss->commit || !ss->abort) {
 		nfnl_unlock(subsys_id);
 		netlink_ack(oskb, nlh, -EOPNOTSUPP, NULL);
-		return kfree_skb(skb);
+		return consume_skb(skb);
 	}
 
 	if (!try_module_get(ss->owner)) {
 		nfnl_unlock(subsys_id);
 		netlink_ack(oskb, nlh, -EOPNOTSUPP, NULL);
-		return kfree_skb(skb);
+		return consume_skb(skb);
 	}
 
 	if (!ss->valid_genid(net, genid)) {
 		module_put(ss->owner);
 		nfnl_unlock(subsys_id);
 		netlink_ack(oskb, nlh, -ERESTART, NULL);
-		return kfree_skb(skb);
+		return consume_skb(skb);
 	}
 
 	nfnl_unlock(subsys_id);
@@ -567,7 +567,7 @@ static void nfnetlink_rcv_batch(struct sk_buff *skb, struct nlmsghdr *nlh,
 	if (status & NFNL_BATCH_REPLAY) {
 		ss->abort(net, oskb, NFNL_ABORT_AUTOLOAD);
 		nfnl_err_reset(&err_list);
-		kfree_skb(skb);
+		consume_skb(skb);
 		module_put(ss->owner);
 		goto replay;
 	} else if (status == NFNL_BATCH_DONE) {
@@ -593,7 +593,7 @@ static void nfnetlink_rcv_batch(struct sk_buff *skb, struct nlmsghdr *nlh,
 		err = ss->abort(net, oskb, abort_action);
 		if (err == -EAGAIN) {
 			nfnl_err_reset(&err_list);
-			kfree_skb(skb);
+			consume_skb(skb);
 			module_put(ss->owner);
 			status |= NFNL_BATCH_FAILURE;
 			goto replay_abort;
@@ -601,7 +601,7 @@ static void nfnetlink_rcv_batch(struct sk_buff *skb, struct nlmsghdr *nlh,
 	}
 
 	nfnl_err_deliver(&err_list, oskb);
-	kfree_skb(skb);
+	consume_skb(skb);
 	module_put(ss->owner);
 }
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next 4/9] netfilter: nf_tables: store new sets in dedicated list
  2024-08-22 22:19 [PATCH net-next 0/9] Netfilter updates for net-next Pablo Neira Ayuso
                   ` (2 preceding siblings ...)
  2024-08-22 22:19 ` [PATCH net-next 3/9] netfilter: nfnetlink: convert kfree_skb to consume_skb Pablo Neira Ayuso
@ 2024-08-22 22:19 ` Pablo Neira Ayuso
  2024-08-22 22:19 ` [PATCH net-next 5/9] netfilter: nf_tables: do not remove elements if set backend implements .abort Pablo Neira Ayuso
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Pablo Neira Ayuso @ 2024-08-22 22:19 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw

From: Florian Westphal <fw@strlen.de>

nft_set_lookup_byid() is very slow when transaction becomes large, due to
walk of the transaction list.

Add a dedicated list that contains only the new sets.

Before: nft -f ruleset 0.07s user 0.00s system 0% cpu 1:04.84 total
After: nft -f ruleset 0.07s user 0.00s system 0% cpu 30.115 total

.. where ruleset contains ~10 sets with ~100k elements.
The above number is for a combined flush+reload of the ruleset.

With previous flush, even the first NEWELEM has to walk through a few
hundred thousands of DELSET(ELEM) transactions before the first NEWSET
object. To cope with random-order-newset-newsetelem we'd need to replace
commit_set_list with a hashtable.

Expectation is that a NEWELEM operation refers to the most recently added
set, so last entry of the dedicated list should be the set we want.

NB: This is not a bug fix per se (functionality is fine), but with
larger transaction batches list search takes forever, so it would be
nice to speed this up for -stable too, hence adding a "fixes" tag.

Fixes: 958bee14d071 ("netfilter: nf_tables: use new transaction infrastructure to handle sets")
Reported-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_tables.h |  2 ++
 net/netfilter/nf_tables_api.c     | 29 ++++++++++++++++++++---------
 2 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index 1bfdd16890fa..2be4738eae1c 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -1674,6 +1674,7 @@ struct nft_trans_rule {
 
 struct nft_trans_set {
 	struct nft_trans_binding	nft_trans_binding;
+	struct list_head		list_trans_newset;
 	struct nft_set			*set;
 	u32				set_id;
 	u32				gc_int;
@@ -1875,6 +1876,7 @@ static inline int nft_request_module(struct net *net, const char *fmt, ...) { re
 struct nftables_pernet {
 	struct list_head	tables;
 	struct list_head	commit_list;
+	struct list_head	commit_set_list;
 	struct list_head	binding_list;
 	struct list_head	module_list;
 	struct list_head	notify_list;
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 0a2f79346958..3ea5d0163510 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -393,6 +393,7 @@ static void nft_trans_commit_list_add_tail(struct net *net, struct nft_trans *tr
 {
 	struct nftables_pernet *nft_net = nft_pernet(net);
 	struct nft_trans_binding *binding;
+	struct nft_trans_set *trans_set;
 
 	list_add_tail(&trans->list, &nft_net->commit_list);
 
@@ -402,9 +403,13 @@ static void nft_trans_commit_list_add_tail(struct net *net, struct nft_trans *tr
 
 	switch (trans->msg_type) {
 	case NFT_MSG_NEWSET:
+		trans_set = nft_trans_container_set(trans);
+
 		if (!nft_trans_set_update(trans) &&
 		    nft_set_is_anonymous(nft_trans_set(trans)))
 			list_add_tail(&binding->binding_list, &nft_net->binding_list);
+
+		list_add_tail(&trans_set->list_trans_newset, &nft_net->commit_set_list);
 		break;
 	case NFT_MSG_NEWCHAIN:
 		if (!nft_trans_chain_update(trans) &&
@@ -611,6 +616,7 @@ static int __nft_trans_set_add(const struct nft_ctx *ctx, int msg_type,
 
 	trans_set = nft_trans_container_set(trans);
 	INIT_LIST_HEAD(&trans_set->nft_trans_binding.binding_list);
+	INIT_LIST_HEAD(&trans_set->list_trans_newset);
 
 	if (msg_type == NFT_MSG_NEWSET && ctx->nla[NFTA_SET_ID] && !desc) {
 		nft_trans_set_id(trans) =
@@ -4485,17 +4491,16 @@ static struct nft_set *nft_set_lookup_byid(const struct net *net,
 {
 	struct nftables_pernet *nft_net = nft_pernet(net);
 	u32 id = ntohl(nla_get_be32(nla));
-	struct nft_trans *trans;
+	struct nft_trans_set *trans;
 
-	list_for_each_entry(trans, &nft_net->commit_list, list) {
-		if (trans->msg_type == NFT_MSG_NEWSET) {
-			struct nft_set *set = nft_trans_set(trans);
+	/* its likely the id we need is at the tail, not at start */
+	list_for_each_entry_reverse(trans, &nft_net->commit_set_list, list_trans_newset) {
+		struct nft_set *set = trans->set;
 
-			if (id == nft_trans_set_id(trans) &&
-			    set->table == table &&
-			    nft_active_genmask(set, genmask))
-				return set;
-		}
+		if (id == trans->set_id &&
+		    set->table == table &&
+		    nft_active_genmask(set, genmask))
+			return set;
 	}
 	return ERR_PTR(-ENOENT);
 }
@@ -10447,6 +10452,7 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb)
 				nft_flow_rule_destroy(nft_trans_flow_rule(trans));
 			break;
 		case NFT_MSG_NEWSET:
+			list_del(&nft_trans_container_set(trans)->list_trans_newset);
 			if (nft_trans_set_update(trans)) {
 				struct nft_set *set = nft_trans_set(trans);
 
@@ -10755,6 +10761,7 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action)
 			nft_trans_destroy(trans);
 			break;
 		case NFT_MSG_NEWSET:
+			list_del(&nft_trans_container_set(trans)->list_trans_newset);
 			if (nft_trans_set_update(trans)) {
 				nft_trans_destroy(trans);
 				break;
@@ -10850,6 +10857,8 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action)
 		}
 	}
 
+	WARN_ON_ONCE(!list_empty(&nft_net->commit_set_list));
+
 	nft_set_abort_update(&set_update_list);
 
 	synchronize_rcu();
@@ -11519,6 +11528,7 @@ static int __net_init nf_tables_init_net(struct net *net)
 
 	INIT_LIST_HEAD(&nft_net->tables);
 	INIT_LIST_HEAD(&nft_net->commit_list);
+	INIT_LIST_HEAD(&nft_net->commit_set_list);
 	INIT_LIST_HEAD(&nft_net->binding_list);
 	INIT_LIST_HEAD(&nft_net->module_list);
 	INIT_LIST_HEAD(&nft_net->notify_list);
@@ -11549,6 +11559,7 @@ static void __net_exit nf_tables_exit_net(struct net *net)
 	gc_seq = nft_gc_seq_begin(nft_net);
 
 	WARN_ON_ONCE(!list_empty(&nft_net->commit_list));
+	WARN_ON_ONCE(!list_empty(&nft_net->commit_set_list));
 
 	if (!list_empty(&nft_net->module_list))
 		nf_tables_module_autoload_cleanup(net);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next 5/9] netfilter: nf_tables: do not remove elements if set backend implements .abort
  2024-08-22 22:19 [PATCH net-next 0/9] Netfilter updates for net-next Pablo Neira Ayuso
                   ` (3 preceding siblings ...)
  2024-08-22 22:19 ` [PATCH net-next 4/9] netfilter: nf_tables: store new sets in dedicated list Pablo Neira Ayuso
@ 2024-08-22 22:19 ` Pablo Neira Ayuso
  2024-08-22 22:19 ` [PATCH net-next 6/9] netfilter: move nf_ct_netns_get out of nf_conncount_init Pablo Neira Ayuso
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Pablo Neira Ayuso @ 2024-08-22 22:19 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw

pipapo set backend maintains two copies of the datastructure, removing
the elements from the copy that is going to be discarded slows down
the abort path significantly, from several minutes to few seconds after
this patch.

This patch was previously reverted by

  f86fb94011ae ("netfilter: nf_tables: revert do not remove elements if set backend implements .abort")

but it is now possible since recent work by Florian Westphal to perform
on-demand clone from insert/remove path:

  532aec7e878b ("netfilter: nft_set_pipapo: remove dirty flag")
  3f1d886cc7c3 ("netfilter: nft_set_pipapo: move cloning of match info to insert/removal path")
  a238106703ab ("netfilter: nft_set_pipapo: prepare pipapo_get helper for on-demand clone")
  c5444786d0ea ("netfilter: nft_set_pipapo: merge deactivate helper into caller")
  6c108d9bee44 ("netfilter: nft_set_pipapo: prepare walk function for on-demand clone")
  8b8a2417558c ("netfilter: nft_set_pipapo: prepare destroy function for on-demand clone")
  80efd2997fb9 ("netfilter: nft_set_pipapo: make pipapo_clone helper return NULL")
  a590f4760922 ("netfilter: nft_set_pipapo: move prove_locking helper around")

after this series, the clone is fully released once aborted, no need to
take it back to previous state. Thus, no stale reference to elements can
occur.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_tables_api.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 3ea5d0163510..c85d037a363b 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -10789,7 +10789,10 @@ static int __nf_tables_abort(struct net *net, enum nfnl_abort_action action)
 				break;
 			}
 			te = nft_trans_container_elem(trans);
-			nft_setelem_remove(net, te->set, te->elem_priv);
+			if (!te->set->ops->abort ||
+			    nft_setelem_is_catchall(te->set, te->elem_priv))
+				nft_setelem_remove(net, te->set, te->elem_priv);
+
 			if (!nft_setelem_is_catchall(te->set, te->elem_priv))
 				atomic_dec(&te->set->nelems);
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next 6/9] netfilter: move nf_ct_netns_get out of nf_conncount_init
  2024-08-22 22:19 [PATCH net-next 0/9] Netfilter updates for net-next Pablo Neira Ayuso
                   ` (4 preceding siblings ...)
  2024-08-22 22:19 ` [PATCH net-next 5/9] netfilter: nf_tables: do not remove elements if set backend implements .abort Pablo Neira Ayuso
@ 2024-08-22 22:19 ` Pablo Neira Ayuso
  2024-08-22 22:19 ` [PATCH net-next 7/9] netfilter: nf_tables: pass context structure to nft_parse_register_load Pablo Neira Ayuso
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 18+ messages in thread
From: Pablo Neira Ayuso @ 2024-08-22 22:19 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw

From: Xin Long <lucien.xin@gmail.com>

This patch is to move nf_ct_netns_get() out of nf_conncount_init()
and let the consumers of nf_conncount decide if they want to turn
on netfilter conntrack.

It makes nf_conncount more flexible to be used in other places and
avoids netfilter conntrack turned on when using it in openvswitch
conntrack.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_conntrack_count.h |  6 ++----
 net/netfilter/nf_conncount.c               | 15 +++------------
 net/netfilter/xt_connlimit.c               | 15 +++++++++++++--
 net/openvswitch/conntrack.c                |  5 ++---
 4 files changed, 20 insertions(+), 21 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack_count.h b/include/net/netfilter/nf_conntrack_count.h
index e227d997fc71..1b58b5b91ff6 100644
--- a/include/net/netfilter/nf_conntrack_count.h
+++ b/include/net/netfilter/nf_conntrack_count.h
@@ -15,10 +15,8 @@ struct nf_conncount_list {
 	unsigned int count;	/* length of list */
 };
 
-struct nf_conncount_data *nf_conncount_init(struct net *net, unsigned int family,
-					    unsigned int keylen);
-void nf_conncount_destroy(struct net *net, unsigned int family,
-			  struct nf_conncount_data *data);
+struct nf_conncount_data *nf_conncount_init(struct net *net, unsigned int keylen);
+void nf_conncount_destroy(struct net *net, struct nf_conncount_data *data);
 
 unsigned int nf_conncount_count(struct net *net,
 				struct nf_conncount_data *data,
diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index 34ba14e59e95..4890af4dc263 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -522,11 +522,10 @@ unsigned int nf_conncount_count(struct net *net,
 }
 EXPORT_SYMBOL_GPL(nf_conncount_count);
 
-struct nf_conncount_data *nf_conncount_init(struct net *net, unsigned int family,
-					    unsigned int keylen)
+struct nf_conncount_data *nf_conncount_init(struct net *net, unsigned int keylen)
 {
 	struct nf_conncount_data *data;
-	int ret, i;
+	int i;
 
 	if (keylen % sizeof(u32) ||
 	    keylen / sizeof(u32) > MAX_KEYLEN ||
@@ -539,12 +538,6 @@ struct nf_conncount_data *nf_conncount_init(struct net *net, unsigned int family
 	if (!data)
 		return ERR_PTR(-ENOMEM);
 
-	ret = nf_ct_netns_get(net, family);
-	if (ret < 0) {
-		kfree(data);
-		return ERR_PTR(ret);
-	}
-
 	for (i = 0; i < ARRAY_SIZE(data->root); ++i)
 		data->root[i] = RB_ROOT;
 
@@ -581,13 +574,11 @@ static void destroy_tree(struct rb_root *r)
 	}
 }
 
-void nf_conncount_destroy(struct net *net, unsigned int family,
-			  struct nf_conncount_data *data)
+void nf_conncount_destroy(struct net *net, struct nf_conncount_data *data)
 {
 	unsigned int i;
 
 	cancel_work_sync(&data->gc_work);
-	nf_ct_netns_put(net, family);
 
 	for (i = 0; i < ARRAY_SIZE(data->root); ++i)
 		destroy_tree(&data->root[i]);
diff --git a/net/netfilter/xt_connlimit.c b/net/netfilter/xt_connlimit.c
index 5d04ef80a61d..0e762277bcf8 100644
--- a/net/netfilter/xt_connlimit.c
+++ b/net/netfilter/xt_connlimit.c
@@ -86,6 +86,7 @@ static int connlimit_mt_check(const struct xt_mtchk_param *par)
 {
 	struct xt_connlimit_info *info = par->matchinfo;
 	unsigned int keylen;
+	int ret;
 
 	keylen = sizeof(u32);
 	if (par->family == NFPROTO_IPV6)
@@ -93,8 +94,17 @@ static int connlimit_mt_check(const struct xt_mtchk_param *par)
 	else
 		keylen += sizeof(struct in_addr);
 
+	ret = nf_ct_netns_get(par->net, par->family);
+	if (ret < 0) {
+		pr_info_ratelimited("cannot load conntrack support for proto=%u\n",
+				    par->family);
+		return ret;
+	}
+
 	/* init private data */
-	info->data = nf_conncount_init(par->net, par->family, keylen);
+	info->data = nf_conncount_init(par->net, keylen);
+	if (IS_ERR(info->data))
+		nf_ct_netns_put(par->net, par->family);
 
 	return PTR_ERR_OR_ZERO(info->data);
 }
@@ -103,7 +113,8 @@ static void connlimit_mt_destroy(const struct xt_mtdtor_param *par)
 {
 	const struct xt_connlimit_info *info = par->matchinfo;
 
-	nf_conncount_destroy(par->net, par->family, info->data);
+	nf_conncount_destroy(par->net, info->data);
+	nf_ct_netns_put(par->net, par->family);
 }
 
 static struct xt_match connlimit_mt_reg __read_mostly = {
diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index a3da5ee34f92..3bb4810234aa 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -1608,8 +1608,7 @@ static int ovs_ct_limit_init(struct net *net, struct ovs_net *ovs_net)
 	for (i = 0; i < CT_LIMIT_HASH_BUCKETS; i++)
 		INIT_HLIST_HEAD(&ovs_net->ct_limit_info->limits[i]);
 
-	ovs_net->ct_limit_info->data =
-		nf_conncount_init(net, NFPROTO_INET, sizeof(u32));
+	ovs_net->ct_limit_info->data = nf_conncount_init(net, sizeof(u32));
 
 	if (IS_ERR(ovs_net->ct_limit_info->data)) {
 		err = PTR_ERR(ovs_net->ct_limit_info->data);
@@ -1626,7 +1625,7 @@ static void ovs_ct_limit_exit(struct net *net, struct ovs_net *ovs_net)
 	const struct ovs_ct_limit_info *info = ovs_net->ct_limit_info;
 	int i;
 
-	nf_conncount_destroy(net, NFPROTO_INET, info->data);
+	nf_conncount_destroy(net, info->data);
 	for (i = 0; i < CT_LIMIT_HASH_BUCKETS; ++i) {
 		struct hlist_head *head = &info->limits[i];
 		struct ovs_ct_limit *ct_limit;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next 7/9] netfilter: nf_tables: pass context structure to nft_parse_register_load
  2024-08-22 22:19 [PATCH net-next 0/9] Netfilter updates for net-next Pablo Neira Ayuso
                   ` (5 preceding siblings ...)
  2024-08-22 22:19 ` [PATCH net-next 6/9] netfilter: move nf_ct_netns_get out of nf_conncount_init Pablo Neira Ayuso
@ 2024-08-22 22:19 ` Pablo Neira Ayuso
  2024-08-22 22:19 ` [PATCH net-next 8/9] netfilter: nf_tables: allow loads only when register is initialized Pablo Neira Ayuso
  2024-08-22 22:19 ` [PATCH net-next 9/9] netfilter: nf_tables: don't initialize registers in nft_do_chain() Pablo Neira Ayuso
  8 siblings, 0 replies; 18+ messages in thread
From: Pablo Neira Ayuso @ 2024-08-22 22:19 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw

From: Florian Westphal <fw@strlen.de>

Mechanical transformation, no logical changes intended.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_tables.h      | 3 ++-
 net/bridge/netfilter/nft_meta_bridge.c | 2 +-
 net/ipv4/netfilter/nft_dup_ipv4.c      | 4 ++--
 net/ipv6/netfilter/nft_dup_ipv6.c      | 4 ++--
 net/netfilter/nf_tables_api.c          | 3 ++-
 net/netfilter/nft_bitwise.c            | 4 ++--
 net/netfilter/nft_byteorder.c          | 2 +-
 net/netfilter/nft_cmp.c                | 6 +++---
 net/netfilter/nft_ct.c                 | 2 +-
 net/netfilter/nft_dup_netdev.c         | 2 +-
 net/netfilter/nft_dynset.c             | 4 ++--
 net/netfilter/nft_exthdr.c             | 2 +-
 net/netfilter/nft_fwd_netdev.c         | 6 +++---
 net/netfilter/nft_hash.c               | 2 +-
 net/netfilter/nft_lookup.c             | 2 +-
 net/netfilter/nft_masq.c               | 4 ++--
 net/netfilter/nft_meta.c               | 2 +-
 net/netfilter/nft_nat.c                | 8 ++++----
 net/netfilter/nft_objref.c             | 2 +-
 net/netfilter/nft_payload.c            | 2 +-
 net/netfilter/nft_queue.c              | 2 +-
 net/netfilter/nft_range.c              | 2 +-
 net/netfilter/nft_redir.c              | 4 ++--
 net/netfilter/nft_tproxy.c             | 4 ++--
 24 files changed, 40 insertions(+), 38 deletions(-)

diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index 2be4738eae1c..573995e7a27e 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -254,7 +254,8 @@ static inline enum nft_registers nft_type_to_reg(enum nft_data_types type)
 int nft_parse_u32_check(const struct nlattr *attr, int max, u32 *dest);
 int nft_dump_register(struct sk_buff *skb, unsigned int attr, unsigned int reg);
 
-int nft_parse_register_load(const struct nlattr *attr, u8 *sreg, u32 len);
+int nft_parse_register_load(const struct nft_ctx *ctx,
+			    const struct nlattr *attr, u8 *sreg, u32 len);
 int nft_parse_register_store(const struct nft_ctx *ctx,
 			     const struct nlattr *attr, u8 *dreg,
 			     const struct nft_data *data,
diff --git a/net/bridge/netfilter/nft_meta_bridge.c b/net/bridge/netfilter/nft_meta_bridge.c
index bd4d1b4d745f..4d8e15927217 100644
--- a/net/bridge/netfilter/nft_meta_bridge.c
+++ b/net/bridge/netfilter/nft_meta_bridge.c
@@ -142,7 +142,7 @@ static int nft_meta_bridge_set_init(const struct nft_ctx *ctx,
 	}
 
 	priv->len = len;
-	err = nft_parse_register_load(tb[NFTA_META_SREG], &priv->sreg, len);
+	err = nft_parse_register_load(ctx, tb[NFTA_META_SREG], &priv->sreg, len);
 	if (err < 0)
 		return err;
 
diff --git a/net/ipv4/netfilter/nft_dup_ipv4.c b/net/ipv4/netfilter/nft_dup_ipv4.c
index a522c3a3be52..ef5dd88107dd 100644
--- a/net/ipv4/netfilter/nft_dup_ipv4.c
+++ b/net/ipv4/netfilter/nft_dup_ipv4.c
@@ -40,13 +40,13 @@ static int nft_dup_ipv4_init(const struct nft_ctx *ctx,
 	if (tb[NFTA_DUP_SREG_ADDR] == NULL)
 		return -EINVAL;
 
-	err = nft_parse_register_load(tb[NFTA_DUP_SREG_ADDR], &priv->sreg_addr,
+	err = nft_parse_register_load(ctx, tb[NFTA_DUP_SREG_ADDR], &priv->sreg_addr,
 				      sizeof(struct in_addr));
 	if (err < 0)
 		return err;
 
 	if (tb[NFTA_DUP_SREG_DEV])
-		err = nft_parse_register_load(tb[NFTA_DUP_SREG_DEV],
+		err = nft_parse_register_load(ctx, tb[NFTA_DUP_SREG_DEV],
 					      &priv->sreg_dev, sizeof(int));
 
 	return err;
diff --git a/net/ipv6/netfilter/nft_dup_ipv6.c b/net/ipv6/netfilter/nft_dup_ipv6.c
index c82f3fdd4a65..492a811828a7 100644
--- a/net/ipv6/netfilter/nft_dup_ipv6.c
+++ b/net/ipv6/netfilter/nft_dup_ipv6.c
@@ -38,13 +38,13 @@ static int nft_dup_ipv6_init(const struct nft_ctx *ctx,
 	if (tb[NFTA_DUP_SREG_ADDR] == NULL)
 		return -EINVAL;
 
-	err = nft_parse_register_load(tb[NFTA_DUP_SREG_ADDR], &priv->sreg_addr,
+	err = nft_parse_register_load(ctx, tb[NFTA_DUP_SREG_ADDR], &priv->sreg_addr,
 				      sizeof(struct in6_addr));
 	if (err < 0)
 		return err;
 
 	if (tb[NFTA_DUP_SREG_DEV])
-		err = nft_parse_register_load(tb[NFTA_DUP_SREG_DEV],
+		err = nft_parse_register_load(ctx, tb[NFTA_DUP_SREG_DEV],
 					      &priv->sreg_dev, sizeof(int));
 
 	return err;
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index c85d037a363b..d05fc17bb9b7 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -11038,7 +11038,8 @@ static int nft_validate_register_load(enum nft_registers reg, unsigned int len)
 	return 0;
 }
 
-int nft_parse_register_load(const struct nlattr *attr, u8 *sreg, u32 len)
+int nft_parse_register_load(const struct nft_ctx *ctx,
+			    const struct nlattr *attr, u8 *sreg, u32 len)
 {
 	u32 reg;
 	int err;
diff --git a/net/netfilter/nft_bitwise.c b/net/netfilter/nft_bitwise.c
index ca857afbf061..7de95674fd8c 100644
--- a/net/netfilter/nft_bitwise.c
+++ b/net/netfilter/nft_bitwise.c
@@ -171,7 +171,7 @@ static int nft_bitwise_init(const struct nft_ctx *ctx,
 
 	priv->len = len;
 
-	err = nft_parse_register_load(tb[NFTA_BITWISE_SREG], &priv->sreg,
+	err = nft_parse_register_load(ctx, tb[NFTA_BITWISE_SREG], &priv->sreg,
 				      priv->len);
 	if (err < 0)
 		return err;
@@ -365,7 +365,7 @@ static int nft_bitwise_fast_init(const struct nft_ctx *ctx,
 	struct nft_bitwise_fast_expr *priv = nft_expr_priv(expr);
 	int err;
 
-	err = nft_parse_register_load(tb[NFTA_BITWISE_SREG], &priv->sreg,
+	err = nft_parse_register_load(ctx, tb[NFTA_BITWISE_SREG], &priv->sreg,
 				      sizeof(u32));
 	if (err < 0)
 		return err;
diff --git a/net/netfilter/nft_byteorder.c b/net/netfilter/nft_byteorder.c
index f6e791a68101..2f82a444d21b 100644
--- a/net/netfilter/nft_byteorder.c
+++ b/net/netfilter/nft_byteorder.c
@@ -139,7 +139,7 @@ static int nft_byteorder_init(const struct nft_ctx *ctx,
 
 	priv->len = len;
 
-	err = nft_parse_register_load(tb[NFTA_BYTEORDER_SREG], &priv->sreg,
+	err = nft_parse_register_load(ctx, tb[NFTA_BYTEORDER_SREG], &priv->sreg,
 				      priv->len);
 	if (err < 0)
 		return err;
diff --git a/net/netfilter/nft_cmp.c b/net/netfilter/nft_cmp.c
index cd4652259095..2605f43737bc 100644
--- a/net/netfilter/nft_cmp.c
+++ b/net/netfilter/nft_cmp.c
@@ -83,7 +83,7 @@ static int nft_cmp_init(const struct nft_ctx *ctx, const struct nft_expr *expr,
 	if (err < 0)
 		return err;
 
-	err = nft_parse_register_load(tb[NFTA_CMP_SREG], &priv->sreg, desc.len);
+	err = nft_parse_register_load(ctx, tb[NFTA_CMP_SREG], &priv->sreg, desc.len);
 	if (err < 0)
 		return err;
 
@@ -222,7 +222,7 @@ static int nft_cmp_fast_init(const struct nft_ctx *ctx,
 	if (err < 0)
 		return err;
 
-	err = nft_parse_register_load(tb[NFTA_CMP_SREG], &priv->sreg, desc.len);
+	err = nft_parse_register_load(ctx, tb[NFTA_CMP_SREG], &priv->sreg, desc.len);
 	if (err < 0)
 		return err;
 
@@ -323,7 +323,7 @@ static int nft_cmp16_fast_init(const struct nft_ctx *ctx,
 	if (err < 0)
 		return err;
 
-	err = nft_parse_register_load(tb[NFTA_CMP_SREG], &priv->sreg, desc.len);
+	err = nft_parse_register_load(ctx, tb[NFTA_CMP_SREG], &priv->sreg, desc.len);
 	if (err < 0)
 		return err;
 
diff --git a/net/netfilter/nft_ct.c b/net/netfilter/nft_ct.c
index 452ed94c3a4d..67a41cd2baaf 100644
--- a/net/netfilter/nft_ct.c
+++ b/net/netfilter/nft_ct.c
@@ -606,7 +606,7 @@ static int nft_ct_set_init(const struct nft_ctx *ctx,
 	}
 
 	priv->len = len;
-	err = nft_parse_register_load(tb[NFTA_CT_SREG], &priv->sreg, len);
+	err = nft_parse_register_load(ctx, tb[NFTA_CT_SREG], &priv->sreg, len);
 	if (err < 0)
 		goto err1;
 
diff --git a/net/netfilter/nft_dup_netdev.c b/net/netfilter/nft_dup_netdev.c
index e5739a59ebf1..0573f96ce079 100644
--- a/net/netfilter/nft_dup_netdev.c
+++ b/net/netfilter/nft_dup_netdev.c
@@ -40,7 +40,7 @@ static int nft_dup_netdev_init(const struct nft_ctx *ctx,
 	if (tb[NFTA_DUP_SREG_DEV] == NULL)
 		return -EINVAL;
 
-	return nft_parse_register_load(tb[NFTA_DUP_SREG_DEV], &priv->sreg_dev,
+	return nft_parse_register_load(ctx, tb[NFTA_DUP_SREG_DEV], &priv->sreg_dev,
 				       sizeof(int));
 }
 
diff --git a/net/netfilter/nft_dynset.c b/net/netfilter/nft_dynset.c
index b4ada3ab2167..6920df754265 100644
--- a/net/netfilter/nft_dynset.c
+++ b/net/netfilter/nft_dynset.c
@@ -215,7 +215,7 @@ static int nft_dynset_init(const struct nft_ctx *ctx,
 			return err;
 	}
 
-	err = nft_parse_register_load(tb[NFTA_DYNSET_SREG_KEY], &priv->sreg_key,
+	err = nft_parse_register_load(ctx, tb[NFTA_DYNSET_SREG_KEY], &priv->sreg_key,
 				      set->klen);
 	if (err < 0)
 		return err;
@@ -226,7 +226,7 @@ static int nft_dynset_init(const struct nft_ctx *ctx,
 		if (set->dtype == NFT_DATA_VERDICT)
 			return -EOPNOTSUPP;
 
-		err = nft_parse_register_load(tb[NFTA_DYNSET_SREG_DATA],
+		err = nft_parse_register_load(ctx, tb[NFTA_DYNSET_SREG_DATA],
 					      &priv->sreg_data, set->dlen);
 		if (err < 0)
 			return err;
diff --git a/net/netfilter/nft_exthdr.c b/net/netfilter/nft_exthdr.c
index 6eb571d0c3fd..6bfd33516241 100644
--- a/net/netfilter/nft_exthdr.c
+++ b/net/netfilter/nft_exthdr.c
@@ -588,7 +588,7 @@ static int nft_exthdr_tcp_set_init(const struct nft_ctx *ctx,
 	priv->flags  = flags;
 	priv->op     = op;
 
-	return nft_parse_register_load(tb[NFTA_EXTHDR_SREG], &priv->sreg,
+	return nft_parse_register_load(ctx, tb[NFTA_EXTHDR_SREG], &priv->sreg,
 				       priv->len);
 }
 
diff --git a/net/netfilter/nft_fwd_netdev.c b/net/netfilter/nft_fwd_netdev.c
index 358e742afad7..c83a794025f9 100644
--- a/net/netfilter/nft_fwd_netdev.c
+++ b/net/netfilter/nft_fwd_netdev.c
@@ -52,7 +52,7 @@ static int nft_fwd_netdev_init(const struct nft_ctx *ctx,
 	if (tb[NFTA_FWD_SREG_DEV] == NULL)
 		return -EINVAL;
 
-	return nft_parse_register_load(tb[NFTA_FWD_SREG_DEV], &priv->sreg_dev,
+	return nft_parse_register_load(ctx, tb[NFTA_FWD_SREG_DEV], &priv->sreg_dev,
 				       sizeof(int));
 }
 
@@ -178,12 +178,12 @@ static int nft_fwd_neigh_init(const struct nft_ctx *ctx,
 		return -EOPNOTSUPP;
 	}
 
-	err = nft_parse_register_load(tb[NFTA_FWD_SREG_DEV], &priv->sreg_dev,
+	err = nft_parse_register_load(ctx, tb[NFTA_FWD_SREG_DEV], &priv->sreg_dev,
 				      sizeof(int));
 	if (err < 0)
 		return err;
 
-	return nft_parse_register_load(tb[NFTA_FWD_SREG_ADDR], &priv->sreg_addr,
+	return nft_parse_register_load(ctx, tb[NFTA_FWD_SREG_ADDR], &priv->sreg_addr,
 				       addr_len);
 }
 
diff --git a/net/netfilter/nft_hash.c b/net/netfilter/nft_hash.c
index 868d68302d22..5d034bbb6913 100644
--- a/net/netfilter/nft_hash.c
+++ b/net/netfilter/nft_hash.c
@@ -92,7 +92,7 @@ static int nft_jhash_init(const struct nft_ctx *ctx,
 
 	priv->len = len;
 
-	err = nft_parse_register_load(tb[NFTA_HASH_SREG], &priv->sreg, len);
+	err = nft_parse_register_load(ctx, tb[NFTA_HASH_SREG], &priv->sreg, len);
 	if (err < 0)
 		return err;
 
diff --git a/net/netfilter/nft_lookup.c b/net/netfilter/nft_lookup.c
index f3080fa1b226..580e4b1deb9b 100644
--- a/net/netfilter/nft_lookup.c
+++ b/net/netfilter/nft_lookup.c
@@ -113,7 +113,7 @@ static int nft_lookup_init(const struct nft_ctx *ctx,
 	if (IS_ERR(set))
 		return PTR_ERR(set);
 
-	err = nft_parse_register_load(tb[NFTA_LOOKUP_SREG], &priv->sreg,
+	err = nft_parse_register_load(ctx, tb[NFTA_LOOKUP_SREG], &priv->sreg,
 				      set->klen);
 	if (err < 0)
 		return err;
diff --git a/net/netfilter/nft_masq.c b/net/netfilter/nft_masq.c
index 8a14aaca93bb..cb43c72a8c2a 100644
--- a/net/netfilter/nft_masq.c
+++ b/net/netfilter/nft_masq.c
@@ -52,13 +52,13 @@ static int nft_masq_init(const struct nft_ctx *ctx,
 		priv->flags = ntohl(nla_get_be32(tb[NFTA_MASQ_FLAGS]));
 
 	if (tb[NFTA_MASQ_REG_PROTO_MIN]) {
-		err = nft_parse_register_load(tb[NFTA_MASQ_REG_PROTO_MIN],
+		err = nft_parse_register_load(ctx, tb[NFTA_MASQ_REG_PROTO_MIN],
 					      &priv->sreg_proto_min, plen);
 		if (err < 0)
 			return err;
 
 		if (tb[NFTA_MASQ_REG_PROTO_MAX]) {
-			err = nft_parse_register_load(tb[NFTA_MASQ_REG_PROTO_MAX],
+			err = nft_parse_register_load(ctx, tb[NFTA_MASQ_REG_PROTO_MAX],
 						      &priv->sreg_proto_max,
 						      plen);
 			if (err < 0)
diff --git a/net/netfilter/nft_meta.c b/net/netfilter/nft_meta.c
index 9139ce38ea7b..0214ad1ced2f 100644
--- a/net/netfilter/nft_meta.c
+++ b/net/netfilter/nft_meta.c
@@ -657,7 +657,7 @@ int nft_meta_set_init(const struct nft_ctx *ctx,
 	}
 
 	priv->len = len;
-	err = nft_parse_register_load(tb[NFTA_META_SREG], &priv->sreg, len);
+	err = nft_parse_register_load(ctx, tb[NFTA_META_SREG], &priv->sreg, len);
 	if (err < 0)
 		return err;
 
diff --git a/net/netfilter/nft_nat.c b/net/netfilter/nft_nat.c
index 808f5802c270..983dd937fe02 100644
--- a/net/netfilter/nft_nat.c
+++ b/net/netfilter/nft_nat.c
@@ -214,13 +214,13 @@ static int nft_nat_init(const struct nft_ctx *ctx, const struct nft_expr *expr,
 	priv->family = family;
 
 	if (tb[NFTA_NAT_REG_ADDR_MIN]) {
-		err = nft_parse_register_load(tb[NFTA_NAT_REG_ADDR_MIN],
+		err = nft_parse_register_load(ctx, tb[NFTA_NAT_REG_ADDR_MIN],
 					      &priv->sreg_addr_min, alen);
 		if (err < 0)
 			return err;
 
 		if (tb[NFTA_NAT_REG_ADDR_MAX]) {
-			err = nft_parse_register_load(tb[NFTA_NAT_REG_ADDR_MAX],
+			err = nft_parse_register_load(ctx, tb[NFTA_NAT_REG_ADDR_MAX],
 						      &priv->sreg_addr_max,
 						      alen);
 			if (err < 0)
@@ -234,13 +234,13 @@ static int nft_nat_init(const struct nft_ctx *ctx, const struct nft_expr *expr,
 
 	plen = sizeof_field(struct nf_nat_range, min_proto.all);
 	if (tb[NFTA_NAT_REG_PROTO_MIN]) {
-		err = nft_parse_register_load(tb[NFTA_NAT_REG_PROTO_MIN],
+		err = nft_parse_register_load(ctx, tb[NFTA_NAT_REG_PROTO_MIN],
 					      &priv->sreg_proto_min, plen);
 		if (err < 0)
 			return err;
 
 		if (tb[NFTA_NAT_REG_PROTO_MAX]) {
-			err = nft_parse_register_load(tb[NFTA_NAT_REG_PROTO_MAX],
+			err = nft_parse_register_load(ctx, tb[NFTA_NAT_REG_PROTO_MAX],
 						      &priv->sreg_proto_max,
 						      plen);
 			if (err < 0)
diff --git a/net/netfilter/nft_objref.c b/net/netfilter/nft_objref.c
index 509011b1ef59..09da7a3f9f96 100644
--- a/net/netfilter/nft_objref.c
+++ b/net/netfilter/nft_objref.c
@@ -143,7 +143,7 @@ static int nft_objref_map_init(const struct nft_ctx *ctx,
 	if (!(set->flags & NFT_SET_OBJECT))
 		return -EINVAL;
 
-	err = nft_parse_register_load(tb[NFTA_OBJREF_SET_SREG], &priv->sreg,
+	err = nft_parse_register_load(ctx, tb[NFTA_OBJREF_SET_SREG], &priv->sreg,
 				      set->klen);
 	if (err < 0)
 		return err;
diff --git a/net/netfilter/nft_payload.c b/net/netfilter/nft_payload.c
index 50429cbd42da..330609a76fb2 100644
--- a/net/netfilter/nft_payload.c
+++ b/net/netfilter/nft_payload.c
@@ -981,7 +981,7 @@ static int nft_payload_set_init(const struct nft_ctx *ctx,
 	}
 	priv->csum_type = csum_type;
 
-	return nft_parse_register_load(tb[NFTA_PAYLOAD_SREG], &priv->sreg,
+	return nft_parse_register_load(ctx, tb[NFTA_PAYLOAD_SREG], &priv->sreg,
 				       priv->len);
 }
 
diff --git a/net/netfilter/nft_queue.c b/net/netfilter/nft_queue.c
index b2b8127c8d43..44e6817e6e29 100644
--- a/net/netfilter/nft_queue.c
+++ b/net/netfilter/nft_queue.c
@@ -136,7 +136,7 @@ static int nft_queue_sreg_init(const struct nft_ctx *ctx,
 	struct nft_queue *priv = nft_expr_priv(expr);
 	int err;
 
-	err = nft_parse_register_load(tb[NFTA_QUEUE_SREG_QNUM],
+	err = nft_parse_register_load(ctx, tb[NFTA_QUEUE_SREG_QNUM],
 				      &priv->sreg_qnum, sizeof(u32));
 	if (err < 0)
 		return err;
diff --git a/net/netfilter/nft_range.c b/net/netfilter/nft_range.c
index 51ae64cd268f..ea382f7bbd78 100644
--- a/net/netfilter/nft_range.c
+++ b/net/netfilter/nft_range.c
@@ -83,7 +83,7 @@ static int nft_range_init(const struct nft_ctx *ctx, const struct nft_expr *expr
 		goto err2;
 	}
 
-	err = nft_parse_register_load(tb[NFTA_RANGE_SREG], &priv->sreg,
+	err = nft_parse_register_load(ctx, tb[NFTA_RANGE_SREG], &priv->sreg,
 				      desc_from.len);
 	if (err < 0)
 		goto err2;
diff --git a/net/netfilter/nft_redir.c b/net/netfilter/nft_redir.c
index a58bd8d291ff..6568cc264078 100644
--- a/net/netfilter/nft_redir.c
+++ b/net/netfilter/nft_redir.c
@@ -51,13 +51,13 @@ static int nft_redir_init(const struct nft_ctx *ctx,
 
 	plen = sizeof_field(struct nf_nat_range, min_proto.all);
 	if (tb[NFTA_REDIR_REG_PROTO_MIN]) {
-		err = nft_parse_register_load(tb[NFTA_REDIR_REG_PROTO_MIN],
+		err = nft_parse_register_load(ctx, tb[NFTA_REDIR_REG_PROTO_MIN],
 					      &priv->sreg_proto_min, plen);
 		if (err < 0)
 			return err;
 
 		if (tb[NFTA_REDIR_REG_PROTO_MAX]) {
-			err = nft_parse_register_load(tb[NFTA_REDIR_REG_PROTO_MAX],
+			err = nft_parse_register_load(ctx, tb[NFTA_REDIR_REG_PROTO_MAX],
 						      &priv->sreg_proto_max,
 						      plen);
 			if (err < 0)
diff --git a/net/netfilter/nft_tproxy.c b/net/netfilter/nft_tproxy.c
index 71412adb73d4..1b691393d8b1 100644
--- a/net/netfilter/nft_tproxy.c
+++ b/net/netfilter/nft_tproxy.c
@@ -254,14 +254,14 @@ static int nft_tproxy_init(const struct nft_ctx *ctx,
 	}
 
 	if (tb[NFTA_TPROXY_REG_ADDR]) {
-		err = nft_parse_register_load(tb[NFTA_TPROXY_REG_ADDR],
+		err = nft_parse_register_load(ctx, tb[NFTA_TPROXY_REG_ADDR],
 					      &priv->sreg_addr, alen);
 		if (err < 0)
 			return err;
 	}
 
 	if (tb[NFTA_TPROXY_REG_PORT]) {
-		err = nft_parse_register_load(tb[NFTA_TPROXY_REG_PORT],
+		err = nft_parse_register_load(ctx, tb[NFTA_TPROXY_REG_PORT],
 					      &priv->sreg_port, sizeof(u16));
 		if (err < 0)
 			return err;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next 8/9] netfilter: nf_tables: allow loads only when register is initialized
  2024-08-22 22:19 [PATCH net-next 0/9] Netfilter updates for net-next Pablo Neira Ayuso
                   ` (6 preceding siblings ...)
  2024-08-22 22:19 ` [PATCH net-next 7/9] netfilter: nf_tables: pass context structure to nft_parse_register_load Pablo Neira Ayuso
@ 2024-08-22 22:19 ` Pablo Neira Ayuso
  2024-08-22 22:19 ` [PATCH net-next 9/9] netfilter: nf_tables: don't initialize registers in nft_do_chain() Pablo Neira Ayuso
  8 siblings, 0 replies; 18+ messages in thread
From: Pablo Neira Ayuso @ 2024-08-22 22:19 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw

From: Florian Westphal <fw@strlen.de>

Reject rules where a load occurs from a register that has not seen a store
early in the same rule.

commit 4c905f6740a3 ("netfilter: nf_tables: initialize registers in
nft_do_chain()")
had to add a unconditional memset to the nftables register space to avoid
leaking stack information to userspace.

This memset shows up in benchmarks.  After this change, this commit can
be reverted again.

Note that this breaks userspace compatibility, because theoretically
you can do

  rule 1: reg2 := meta load iif, reg2  == 1 jump ...
  rule 2: reg2 == 2 jump ...   // read access with no store in this rule

... after this change this is rejected.

Neither nftables nor iptables-nft generate such rules, each rule is
always standalone.

This resuts in a small increase of nft_ctx structure by sizeof(long).

To cope with hypothetical rulesets like the example above one could emit
on-demand "reg[x] = 0" store when generating the datapath blob in
nf_tables_commit_chain_prepare().

A patch that does this is linked to below.

For now, lets disable this.  In nf_tables, a rule is the smallest
unit that can be replaced from userspace, i.e. a hypothetical ruleset
that relies on earlier initialisations of registers can't be changed
at will as register usage would need to be coordinated.

Link: https://lore.kernel.org/netfilter-devel/20240627135330.17039-4-fw@strlen.de/
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_tables.h |  1 +
 net/netfilter/nf_tables_api.c     | 38 +++++++++++++++++++++++++++----
 2 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index 573995e7a27e..1cc33d946d41 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -221,6 +221,7 @@ struct nft_ctx {
 	u8				family;
 	u8				level;
 	bool				report;
+	DECLARE_BITMAP(reg_inited, NFT_REG32_NUM);
 };
 
 enum nft_data_desc_flags {
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index d05fc17bb9b7..904f2e25b4a4 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -146,6 +146,8 @@ static void nft_ctx_init(struct nft_ctx *ctx,
 	ctx->report	= nlmsg_report(nlh);
 	ctx->flags	= nlh->nlmsg_flags;
 	ctx->seq	= nlh->nlmsg_seq;
+
+	bitmap_zero(ctx->reg_inited, NFT_REG32_NUM);
 }
 
 static struct nft_trans *nft_trans_alloc_gfp(const struct nft_ctx *ctx,
@@ -11041,8 +11043,8 @@ static int nft_validate_register_load(enum nft_registers reg, unsigned int len)
 int nft_parse_register_load(const struct nft_ctx *ctx,
 			    const struct nlattr *attr, u8 *sreg, u32 len)
 {
-	u32 reg;
-	int err;
+	int err, invalid_reg;
+	u32 reg, next_register;
 
 	err = nft_parse_register(attr, &reg);
 	if (err < 0)
@@ -11052,11 +11054,36 @@ int nft_parse_register_load(const struct nft_ctx *ctx,
 	if (err < 0)
 		return err;
 
+	next_register = DIV_ROUND_UP(len, NFT_REG32_SIZE) + reg;
+
+	/* Can't happen: nft_validate_register_load() should have failed */
+	if (WARN_ON_ONCE(next_register > NFT_REG32_NUM))
+		return -EINVAL;
+
+	/* find first register that did not see an earlier store. */
+	invalid_reg = find_next_zero_bit(ctx->reg_inited, NFT_REG32_NUM, reg);
+
+	/* invalid register within the range that we're loading from? */
+	if (invalid_reg < next_register)
+		return -ENODATA;
+
 	*sreg = reg;
 	return 0;
 }
 EXPORT_SYMBOL_GPL(nft_parse_register_load);
 
+static void nft_saw_register_store(const struct nft_ctx *__ctx,
+				   int reg, unsigned int len)
+{
+	unsigned int registers = DIV_ROUND_UP(len, NFT_REG32_SIZE);
+	struct nft_ctx *ctx = (struct nft_ctx *)__ctx;
+
+	if (WARN_ON_ONCE(len == 0 || reg < 0))
+		return;
+
+	bitmap_set(ctx->reg_inited, reg, registers);
+}
+
 static int nft_validate_register_store(const struct nft_ctx *ctx,
 				       enum nft_registers reg,
 				       const struct nft_data *data,
@@ -11078,7 +11105,7 @@ static int nft_validate_register_store(const struct nft_ctx *ctx,
 				return err;
 		}
 
-		return 0;
+		break;
 	default:
 		if (type != NFT_DATA_VALUE)
 			return -EINVAL;
@@ -11091,8 +11118,11 @@ static int nft_validate_register_store(const struct nft_ctx *ctx,
 		    sizeof_field(struct nft_regs, data))
 			return -ERANGE;
 
-		return 0;
+		break;
 	}
+
+	nft_saw_register_store(ctx, reg, len);
+	return 0;
 }
 
 int nft_parse_register_store(const struct nft_ctx *ctx,
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH net-next 9/9] netfilter: nf_tables: don't initialize registers in nft_do_chain()
  2024-08-22 22:19 [PATCH net-next 0/9] Netfilter updates for net-next Pablo Neira Ayuso
                   ` (7 preceding siblings ...)
  2024-08-22 22:19 ` [PATCH net-next 8/9] netfilter: nf_tables: allow loads only when register is initialized Pablo Neira Ayuso
@ 2024-08-22 22:19 ` Pablo Neira Ayuso
  8 siblings, 0 replies; 18+ messages in thread
From: Pablo Neira Ayuso @ 2024-08-22 22:19 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev, kuba, pabeni, edumazet, fw

From: Florian Westphal <fw@strlen.de>

revert commit 4c905f6740a3 ("netfilter: nf_tables: initialize registers in
nft_do_chain()").

Previous patch makes sure that loads from uninitialized registers are
detected from the control plane. in this case rule blob auto-zeroes
registers.  Thus the explicit zeroing is not needed anymore.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_tables_core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/nf_tables_core.c b/net/netfilter/nf_tables_core.c
index a48d5f0e2f3e..75598520b0fa 100644
--- a/net/netfilter/nf_tables_core.c
+++ b/net/netfilter/nf_tables_core.c
@@ -256,7 +256,7 @@ nft_do_chain(struct nft_pktinfo *pkt, void *priv)
 	const struct net *net = nft_net(pkt);
 	const struct nft_expr *expr, *last;
 	const struct nft_rule_dp *rule;
-	struct nft_regs regs = {};
+	struct nft_regs regs;
 	unsigned int stackptr = 0;
 	struct nft_jumpstack jumpstack[NFT_JUMP_STACK_SIZE];
 	bool genbit = READ_ONCE(net->nft.gencursor);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next 1/9] netfilter: nfnetlink_queue: unbreak SCTP traffic
  2024-08-22 22:19 ` [PATCH net-next 1/9] netfilter: nfnetlink_queue: unbreak SCTP traffic Pablo Neira Ayuso
@ 2024-08-26 15:50   ` patchwork-bot+netdevbpf
  0 siblings, 0 replies; 18+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-08-26 15:50 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: netfilter-devel, davem, netdev, kuba, pabeni, edumazet, fw

Hello:

This series was applied to netdev/net-next.git (main)
by Pablo Neira Ayuso <pablo@netfilter.org>:

On Fri, 23 Aug 2024 00:19:31 +0200 you wrote:
> From: Antonio Ojea <aojea@google.com>
> 
> when packet is enqueued with nfqueue and GSO is enabled, checksum
> calculation has to take into account the protocol, as SCTP uses a
> 32 bits CRC checksum.
> 
> Enter skb_gso_segment() path in case of SCTP GSO packets because
> skb_zerocopy() does not support for GSO_BY_FRAGS.
> 
> [...]

Here is the summary with links:
  - [net-next,1/9] netfilter: nfnetlink_queue: unbreak SCTP traffic
    https://git.kernel.org/netdev/net-next/c/26a77d02891a
  - [net-next,2/9] selftests: netfilter: nft_queue.sh: sctp coverage
    https://git.kernel.org/netdev/net-next/c/4e97d521c2be
  - [net-next,3/9] netfilter: nfnetlink: convert kfree_skb to consume_skb
    https://git.kernel.org/netdev/net-next/c/e2444c1d4639
  - [net-next,4/9] netfilter: nf_tables: store new sets in dedicated list
    https://git.kernel.org/netdev/net-next/c/c1aa38866b9c
  - [net-next,5/9] netfilter: nf_tables: do not remove elements if set backend implements .abort
    https://git.kernel.org/netdev/net-next/c/c9526aeb4998
  - [net-next,6/9] netfilter: move nf_ct_netns_get out of nf_conncount_init
    https://git.kernel.org/netdev/net-next/c/d5283b47e225
  - [net-next,7/9] netfilter: nf_tables: pass context structure to nft_parse_register_load
    https://git.kernel.org/netdev/net-next/c/7ea0522ef81a
  - [net-next,8/9] netfilter: nf_tables: allow loads only when register is initialized
    https://git.kernel.org/netdev/net-next/c/14fb07130c7d
  - [net-next,9/9] netfilter: nf_tables: don't initialize registers in nft_do_chain()
    https://git.kernel.org/netdev/net-next/c/c88baabf16d1

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next 2/9] selftests: netfilter: nft_queue.sh: sctp coverage
  2024-08-22 22:19 ` [PATCH net-next 2/9] selftests: netfilter: nft_queue.sh: sctp coverage Pablo Neira Ayuso
@ 2024-08-27  2:25   ` Jakub Kicinski
  2024-08-27  9:00     ` [PATCH net-next] selftests: netfilter: nft_queue.sh: reduce test file size for debug build Florian Westphal
  0 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2024-08-27  2:25 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: netfilter-devel, davem, netdev, pabeni, edumazet, fw,
	Antonio Ojea

On Fri, 23 Aug 2024 00:19:32 +0200 Pablo Neira Ayuso wrote:
> From: Antonio Ojea <aojea@google.com>
> 
> Test that nfqueue with and without GSO process SCTP packets correctly.
> 
> Joint work with Florian and Pablo.

This is unhappy on a debug kernel in netdev CI:

# 2024/08/26 17:38:31 socat[12451] E write(7, 0x563837a56000, 8192): Cannot send after transport endpoint shutdown
# Binary files /tmp/tmp.ZZAJtF9z9R and /tmp/tmp.hS8W1Te84V differ
# FAIL: lost packets?!

https://netdev-3.bots.linux.dev/vmksft-nf-dbg/results/745281/2-nft-queue-sh/stdout

Works fine on a non-debug kernel tho:

https://netdev-3.bots.linux.dev/vmksft-nf/results/745282/5-nft-queue-sh/stdout

so gotta be some race.

Please follow up soon, I'll disable this test case for now.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH net-next] selftests: netfilter: nft_queue.sh: reduce test file size for debug build
  2024-08-27  2:25   ` Jakub Kicinski
@ 2024-08-27  9:00     ` Florian Westphal
  2024-08-28 14:48       ` Pablo Neira Ayuso
                         ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Florian Westphal @ 2024-08-27  9:00 UTC (permalink / raw)
  To: netdev
  Cc: netfilter-devel, Jakub Kicinski, Paolo Abeni, Eric Dumazet,
	David S. Miller, Florian Westphal

The sctp selftest is very slow on debug kernels.

Reported-by: Jakub Kicinski <kuba@kernel.org>
Closes: https://lore.kernel.org/netdev/20240826192500.32efa22c@kernel.org/
Fixes: 4e97d521c2be ("selftests: netfilter: nft_queue.sh: sctp coverage")
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 Lets see if CI is happy after this tweak.

 tools/testing/selftests/net/netfilter/nft_queue.sh | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/net/netfilter/nft_queue.sh b/tools/testing/selftests/net/netfilter/nft_queue.sh
index f3bdeb1271eb..9e5f423bff09 100755
--- a/tools/testing/selftests/net/netfilter/nft_queue.sh
+++ b/tools/testing/selftests/net/netfilter/nft_queue.sh
@@ -39,7 +39,9 @@ TMPFILE2=$(mktemp)
 TMPFILE3=$(mktemp)
 
 TMPINPUT=$(mktemp)
-dd conv=sparse status=none if=/dev/zero bs=1M count=200 of="$TMPINPUT"
+COUNT=200
+[ "$KSFT_MACHINE_SLOW" = "yes" ] && COUNT=25
+dd conv=sparse status=none if=/dev/zero bs=1M count=$COUNT of="$TMPINPUT"
 
 if ! ip link add veth0 netns "$nsrouter" type veth peer name eth0 netns "$ns1" > /dev/null 2>&1; then
     echo "SKIP: No virtual ethernet pair device support in kernel"
-- 
2.46.0


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next] selftests: netfilter: nft_queue.sh: reduce test file size for debug build
  2024-08-27  9:00     ` [PATCH net-next] selftests: netfilter: nft_queue.sh: reduce test file size for debug build Florian Westphal
@ 2024-08-28 14:48       ` Pablo Neira Ayuso
  2024-08-28 22:49       ` Jakub Kicinski
  2024-08-29  8:40       ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 18+ messages in thread
From: Pablo Neira Ayuso @ 2024-08-28 14:48 UTC (permalink / raw)
  To: Florian Westphal
  Cc: netdev, netfilter-devel, Jakub Kicinski, Paolo Abeni,
	Eric Dumazet, David S. Miller

On Tue, Aug 27, 2024 at 11:00:12AM +0200, Florian Westphal wrote:
> The sctp selftest is very slow on debug kernels.
> 
> Reported-by: Jakub Kicinski <kuba@kernel.org>
> Closes: https://lore.kernel.org/netdev/20240826192500.32efa22c@kernel.org/
> Fixes: 4e97d521c2be ("selftests: netfilter: nft_queue.sh: sctp coverage")
> Signed-off-by: Florian Westphal <fw@strlen.de>

Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>

> ---
>  Lets see if CI is happy after this tweak.
> 
>  tools/testing/selftests/net/netfilter/nft_queue.sh | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/net/netfilter/nft_queue.sh b/tools/testing/selftests/net/netfilter/nft_queue.sh
> index f3bdeb1271eb..9e5f423bff09 100755
> --- a/tools/testing/selftests/net/netfilter/nft_queue.sh
> +++ b/tools/testing/selftests/net/netfilter/nft_queue.sh
> @@ -39,7 +39,9 @@ TMPFILE2=$(mktemp)
>  TMPFILE3=$(mktemp)
>  
>  TMPINPUT=$(mktemp)
> -dd conv=sparse status=none if=/dev/zero bs=1M count=200 of="$TMPINPUT"
> +COUNT=200
> +[ "$KSFT_MACHINE_SLOW" = "yes" ] && COUNT=25
> +dd conv=sparse status=none if=/dev/zero bs=1M count=$COUNT of="$TMPINPUT"
>  
>  if ! ip link add veth0 netns "$nsrouter" type veth peer name eth0 netns "$ns1" > /dev/null 2>&1; then
>      echo "SKIP: No virtual ethernet pair device support in kernel"
> -- 
> 2.46.0
> 
> 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next] selftests: netfilter: nft_queue.sh: reduce test file size for debug build
  2024-08-27  9:00     ` [PATCH net-next] selftests: netfilter: nft_queue.sh: reduce test file size for debug build Florian Westphal
  2024-08-28 14:48       ` Pablo Neira Ayuso
@ 2024-08-28 22:49       ` Jakub Kicinski
  2024-08-29  8:01         ` Florian Westphal
  2024-08-29  8:40       ` patchwork-bot+netdevbpf
  2 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2024-08-28 22:49 UTC (permalink / raw)
  To: Florian Westphal
  Cc: netdev, netfilter-devel, Paolo Abeni, Eric Dumazet,
	David S. Miller

On Tue, 27 Aug 2024 11:00:12 +0200 Florian Westphal wrote:
> The sctp selftest is very slow on debug kernels.

I think there may be something more going on here? :(

For reference net-next-2024-08-27--12-00 is when this fix got queued:

https://netdev.bots.linux.dev/contest.html?executor=vmksft-nf-dbg&test=nft-queue-sh

Since then we still see occasional flakes. But take a look at 
the runtime. If it's happy the test case takes under a minute.
When it's unhappy it times out (after 5 minutes). I'll increase
the timeout to 10 minutes, but 1min vs 5min feels like it may
be getting stuck rather than being slow..

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next] selftests: netfilter: nft_queue.sh: reduce test file size for debug build
  2024-08-28 22:49       ` Jakub Kicinski
@ 2024-08-29  8:01         ` Florian Westphal
  2024-08-29  8:35           ` Paolo Abeni
  0 siblings, 1 reply; 18+ messages in thread
From: Florian Westphal @ 2024-08-29  8:01 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Florian Westphal, netdev, netfilter-devel, Paolo Abeni,
	Eric Dumazet, David S. Miller

Jakub Kicinski <kuba@kernel.org> wrote:
> On Tue, 27 Aug 2024 11:00:12 +0200 Florian Westphal wrote:
> > The sctp selftest is very slow on debug kernels.
> 
> I think there may be something more going on here? :(
> 
> For reference net-next-2024-08-27--12-00 is when this fix got queued:
> 
> https://netdev.bots.linux.dev/contest.html?executor=vmksft-nf-dbg&test=nft-queue-sh
> 
> Since then we still see occasional flakes. But take a look at 
> the runtime. If it's happy the test case takes under a minute.
> When it's unhappy it times out (after 5 minutes). I'll increase
> the timeout to 10 minutes, but 1min vs 5min feels like it may
> be getting stuck rather than being slow..

Yes, its stuck.  Only reason I could imagine is that there is a 2s
delay between starting the nf_queue test prog and the first packet
getting sent.  That would make the listener exit early and then
socat sender would hang.

I'll test following tomorrow on an old / slow machine:

diff --git a/tools/testing/selftests/net/netfilter/nft_queue.sh b/tools/testing/selftests/net/netfilter/nft_queue.sh
--- a/tools/testing/selftests/net/netfilter/nft_queue.sh
+++ b/tools/testing/selftests/net/netfilter/nft_queue.sh
@@ -39,7 +39,10 @@ TMPFILE2=$(mktemp)
 TMPFILE3=$(mktemp)
 
 TMPINPUT=$(mktemp)
-dd conv=sparse status=none if=/dev/zero bs=1M count=200 of="$TMPINPUT"
+
+COUNT=200
+[ "$KSFT_MACHINE_SLOW" = "yes" ] && COUNT=25
+dd conv=sparse status=none if=/dev/zero bs=1M count=$COUNT of="$TMPINPUT"
 
 if ! ip link add veth0 netns "$nsrouter" type veth peer name eth0 netns "$ns1" > /dev/null 2>&1; then
     echo "SKIP: No virtual ethernet pair device support in kernel"
@@ -398,7 +401,7 @@ EOF
 
 	busywait "$BUSYWAIT_TIMEOUT" sctp_listener_ready "$ns2"
 
-	ip netns exec "$nsrouter" ./nf_queue -q 10 -G -t "$timeout" &
+	ip netns exec "$nsrouter" ./nf_queue -q 10 -G &
 	local nfqpid=$!
 
 	ip netns exec "$ns1" socat -u STDIN SCTP:10.0.2.99:12345 <"$TMPINPUT" >/dev/null
@@ -409,6 +412,7 @@ EOF
 	fi
 
 	wait "$rpid" && echo "PASS: sctp and nfqueue in forward chain"
+	kill "$nfqpid"
 
 	if ! diff -u "$TMPINPUT" "$TMPFILE1" ; then
 		echo "FAIL: lost packets?!" 1>&2
@@ -434,7 +438,7 @@ EOF
 
 	busywait "$BUSYWAIT_TIMEOUT" sctp_listener_ready "$ns2"
 
-	ip netns exec "$ns1" ./nf_queue -q 11 -t "$timeout" &
+	ip netns exec "$ns1" ./nf_queue -q 11 &
 	local nfqpid=$!
 
 	ip netns exec "$ns1" socat -u STDIN SCTP:10.0.2.99:12345 <"$TMPINPUT" >/dev/null
@@ -446,6 +450,7 @@ EOF
 
 	# must wait before checking completeness of output file.
 	wait "$rpid" && echo "PASS: sctp and nfqueue in output chain with GSO"
+	kill "$nfqpid"
 
 	if ! diff -u "$TMPINPUT" "$TMPFILE1" ; then
 		echo "FAIL: lost packets?!" 1>&2
-- 
2.46.0



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next] selftests: netfilter: nft_queue.sh: reduce test file size for debug build
  2024-08-29  8:01         ` Florian Westphal
@ 2024-08-29  8:35           ` Paolo Abeni
  0 siblings, 0 replies; 18+ messages in thread
From: Paolo Abeni @ 2024-08-29  8:35 UTC (permalink / raw)
  To: Florian Westphal, Jakub Kicinski
  Cc: netdev, netfilter-devel, Eric Dumazet, David S. Miller

On 8/29/24 10:01, Florian Westphal wrote:
> Jakub Kicinski <kuba@kernel.org> wrote:
>> On Tue, 27 Aug 2024 11:00:12 +0200 Florian Westphal wrote:
>>> The sctp selftest is very slow on debug kernels.
>>
>> I think there may be something more going on here? :(
>>
>> For reference net-next-2024-08-27--12-00 is when this fix got queued:
>>
>> https://netdev.bots.linux.dev/contest.html?executor=vmksft-nf-dbg&test=nft-queue-sh
>>
>> Since then we still see occasional flakes. But take a look at
>> the runtime. If it's happy the test case takes under a minute.
>> When it's unhappy it times out (after 5 minutes). I'll increase
>> the timeout to 10 minutes, but 1min vs 5min feels like it may
>> be getting stuck rather than being slow..
> 
> Yes, its stuck.  Only reason I could imagine is that there is a 2s
> delay between starting the nf_queue test prog and the first packet
> getting sent.  That would make the listener exit early and then
> socat sender would hang.

As the root cause for this latter hang-up looks unrelated, and this 
patch is improving the current CI status, I'll apply it as-is.

The other issue will be fixed by a separated patch.

Thanks,

Paolo



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH net-next] selftests: netfilter: nft_queue.sh: reduce test file size for debug build
  2024-08-27  9:00     ` [PATCH net-next] selftests: netfilter: nft_queue.sh: reduce test file size for debug build Florian Westphal
  2024-08-28 14:48       ` Pablo Neira Ayuso
  2024-08-28 22:49       ` Jakub Kicinski
@ 2024-08-29  8:40       ` patchwork-bot+netdevbpf
  2 siblings, 0 replies; 18+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-08-29  8:40 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netdev, netfilter-devel, kuba, pabeni, edumazet, davem

Hello:

This patch was applied to netdev/net-next.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Tue, 27 Aug 2024 11:00:12 +0200 you wrote:
> The sctp selftest is very slow on debug kernels.
> 
> Reported-by: Jakub Kicinski <kuba@kernel.org>
> Closes: https://lore.kernel.org/netdev/20240826192500.32efa22c@kernel.org/
> Fixes: 4e97d521c2be ("selftests: netfilter: nft_queue.sh: sctp coverage")
> Signed-off-by: Florian Westphal <fw@strlen.de>
> 
> [...]

Here is the summary with links:
  - [net-next] selftests: netfilter: nft_queue.sh: reduce test file size for debug build
    https://git.kernel.org/netdev/net-next/c/0a8b08c554da

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2024-08-29  8:40 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-22 22:19 [PATCH net-next 0/9] Netfilter updates for net-next Pablo Neira Ayuso
2024-08-22 22:19 ` [PATCH net-next 1/9] netfilter: nfnetlink_queue: unbreak SCTP traffic Pablo Neira Ayuso
2024-08-26 15:50   ` patchwork-bot+netdevbpf
2024-08-22 22:19 ` [PATCH net-next 2/9] selftests: netfilter: nft_queue.sh: sctp coverage Pablo Neira Ayuso
2024-08-27  2:25   ` Jakub Kicinski
2024-08-27  9:00     ` [PATCH net-next] selftests: netfilter: nft_queue.sh: reduce test file size for debug build Florian Westphal
2024-08-28 14:48       ` Pablo Neira Ayuso
2024-08-28 22:49       ` Jakub Kicinski
2024-08-29  8:01         ` Florian Westphal
2024-08-29  8:35           ` Paolo Abeni
2024-08-29  8:40       ` patchwork-bot+netdevbpf
2024-08-22 22:19 ` [PATCH net-next 3/9] netfilter: nfnetlink: convert kfree_skb to consume_skb Pablo Neira Ayuso
2024-08-22 22:19 ` [PATCH net-next 4/9] netfilter: nf_tables: store new sets in dedicated list Pablo Neira Ayuso
2024-08-22 22:19 ` [PATCH net-next 5/9] netfilter: nf_tables: do not remove elements if set backend implements .abort Pablo Neira Ayuso
2024-08-22 22:19 ` [PATCH net-next 6/9] netfilter: move nf_ct_netns_get out of nf_conncount_init Pablo Neira Ayuso
2024-08-22 22:19 ` [PATCH net-next 7/9] netfilter: nf_tables: pass context structure to nft_parse_register_load Pablo Neira Ayuso
2024-08-22 22:19 ` [PATCH net-next 8/9] netfilter: nf_tables: allow loads only when register is initialized Pablo Neira Ayuso
2024-08-22 22:19 ` [PATCH net-next 9/9] netfilter: nf_tables: don't initialize registers in nft_do_chain() Pablo Neira Ayuso

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).