netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Florian Westphal <fw@strlen.de>
To: <netdev@vger.kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>,
	<netfilter-devel@vger.kernel.org>,
	pablo@netfilter.org
Subject: [PATCH net 0/7] netfilter: updates for net
Date: Wed, 10 Sep 2025 21:03:01 +0200	[thread overview]
Message-ID: <20250910190308.13356-1-fw@strlen.de> (raw)

Hi,

The following patchset contains Netfilter fixes for *net*:

WARNING: This results in a conflict on net -> net-next merge.
Merge resolution walkthrough is at the end of this cover letter, see
MERGE WALKTHROUGH.

  Merge branch 'mptcp-misc-fixes-for-v6-17-rc6' (2025-09-09 18:39:55 -0700)

are available in the Git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf.git tags/nf-25-09-10-v2

for you to fetch changes up to 37a9675e61a2a2a721a28043ffdf2c8ec81eba37:

  MAINTAINERS: add Phil as netfilter reviewer (2025-09-10 20:32:46 +0200)

First patch adds a lockdep annotation for a false-positive splat.
Last patch adds formal reviewer tag for Phil Sutter to MAINTAINERS.

Rest of the patches resolve spurious false negative results during set
lookups while another CPU is processing a transaction.

This has been broken at least since v4.18 when an unconditional
synchronize_rcu call was removed from the commit phase of nf_tables.

Quoting from Stefan Hanreichs original report:

 It seems like we've found an issue with atomicity when reloading
 nftables rulesets. Sometimes there is a small window where rules
 containing sets do not seem to apply to incoming traffic, due to the set
 apparently being empty for a short amount of time when flushing / adding
 elements.

Exanple ruleset:
table ip filter {
  set match {
    type ipv4_addr
    flags interval
    elements = { 0.0.0.0-192.168.2.19, 192.168.2.21-255.255.255.255 }
  }

  chain pre {
    type filter hook prerouting priority filter; policy accept;
    ip saddr @match accept
    counter comment "must never match"
  }
}

Reproducer transaction:
while true:
nft -f -<<EOF
 flush set ip filter match
 create element ip filter match { \
    0.0.0.0-192.168.2.19, 192.168.2.21-255.255.255.255 }
EOF
done

Then create traffic. to/from e.g. 192.168.2.1 to 192.168.3.10.
Once in a while the counter will increment even though the
'ip saddr @match' rule should have accepted the packet.

See individual patches for details.

Thanks to Stefan Hanreich for an initial description and reproducer for
this bug and to Pablo Neira Ayuso for reviewing earlier iterations of
the patchset.

Florian Westphal (7):
  netfilter: nft_set_bitmap: fix lockdep splat due to missing annotation
  netfilter: nft_set_pipapo: don't check genbit from packetpath lookups
  netfilter: nft_set_rbtree: continue traversal if element is inactive
  netfilter: nf_tables: place base_seq in struct net
  netfilter: nf_tables: make nft_set_do_lookup available unconditionally
  netfilter: nf_tables: restart set lookup on base_seq change
  MAINTAINERS: add Phil as netfilter reviewer

 MAINTAINERS                            |  1 +
 include/net/netfilter/nf_tables.h      |  1 -
 include/net/netfilter/nf_tables_core.h | 10 +---
 include/net/netns/nftables.h           |  1 +
 net/netfilter/nf_tables_api.c          | 66 +++++++++++++-------------
 net/netfilter/nft_lookup.c             | 46 ++++++++++++++++--
 net/netfilter/nft_set_bitmap.c         |  3 +-
 net/netfilter/nft_set_pipapo.c         | 20 +++++++-
 net/netfilter/nft_set_pipapo_avx2.c    |  4 +-
 net/netfilter/nft_set_rbtree.c         |  6 +--
 10 files changed, 103 insertions(+), 55 deletions(-)

MERGE WALKTHROUGH:

When merging this to net-next, you should see following:

CONFLICT (content): Merge conflict in net/netfilter/nft_set_pipapo.c
CONFLICT (content): Merge conflict in net/netfilter/nft_set_pipapo_avx2.c

Instructions for net/netfilter/nft_set_pipapo.c:

@@@ -562,7 -539,7 +578,11 @@@ nft_pipapo_lookup(const struct net *net
        const struct nft_pipapo_elem *e;
  
        m = rcu_dereference(priv->match);
++<<<<<<< HEAD
 +      e = pipapo_get_slow(m, (const u8 *)key, genmask, get_jiffies_64());
++=======
+       e = pipapo_get(m, (const u8 *)key, NFT_GENMASK_ANY, get_jiffies_64());
++>>>>>>> 352fd037254683c940630a6c5c8aa8c8ca38ae88
  
        return e ? &e->ext : NULL;
  }

Take the HEAD chunk, with 'genmask' replaced by NFT_GENMASK_ANY, i.e.:

e = pipapo_get_slow(m, (const u8 *)key, NFT_GENMASK_ANY, get_jiffies_64());

Instructions for net/netfilter/nft_set_pipapo_avx2.c:

++<<<<<<< HEAD
++=======
+       const struct nft_pipapo_match *m;
++>>>>>>> 352fd037254683c940630a6c5c8aa8c8ca38ae88

Take the HEAD chunk, i.e. delete 'const struct nft_pipapo_match *m;':
In -next, this is passed as function argument.

++<<<<<<< HEAD
 +              if (ret < 0) {
 +                      scratch->map_index = map_index;
 +                      kernel_fpu_end();
 +                      __local_unlock_nested_bh(&scratch->bh_lock);
 +                      return NULL;
++=======
+               if (ret < 0)
+                       goto out;
+ 
+               if (last) {
+                       const struct nft_set_ext *e = &f->mt[ret].e->ext;
+ 
+                       if (unlikely(nft_set_elem_expired(e)))
+                               goto next_match;
+ 
+                       ext = e;
+                       goto out;
++>>>>>>> 352fd037254683c940630a6c5c8aa8c8ca38ae88

Take the HEAD chunk and discard the other; including if (last) { branch.

Then, in nft_pipapo_avx2_lookup(), make this change:

@@ -1274,9 +1273,8 @@
 nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set,
                       const u32 *key)
 {
        struct nft_pipapo *priv = nft_set_priv(set);
-       u8 genmask = nft_genmask_cur(net);
        const struct nft_pipapo_match *m;
        const u8 *rp = (const u8 *)key;
        const struct nft_pipapo_elem *e;

@@ -1292,9 +1290,9 @@
        }

        m = rcu_dereference(priv->match);

-       e = pipapo_get_avx2(m, rp, genmask, get_jiffies_64());
+       e = pipapo_get_avx2(m, rp, NFT_GENMASK_ANY, get_jiffies_64());
        local_bh_enable();

        return e ? &e->ext : NULL;

After this change, you are done.
The expected diff vs the net-next main branch in these two files is:

--- a/net/netfilter/nft_set_pipapo.c
+++ b/net/netfilter/nft_set_pipapo.c
@@ -549,6 +549,23 @@ static struct nft_pipapo_elem *pipapo_get(const struct nft_pipapo_match *m,
  *
  * This function is called from the data path.  It will search for
  * an element matching the given key in the current active copy.
+ * Unlike other set types, this uses NFT_GENMASK_ANY instead of
+ * nft_genmask_cur().
[trimmed rest of comment]
  *
  * Return: ntables API extension pointer or NULL if no match.
  */
@@ -557,12 +574,11 @@ nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
                  const u32 *key)
 {
        struct nft_pipapo *priv = nft_set_priv(set);
-       u8 genmask = nft_genmask_cur(net);
        const struct nft_pipapo_match *m;
        const struct nft_pipapo_elem *e;

        m = rcu_dereference(priv->match);
-       e = pipapo_get_slow(m, (const u8 *)key, genmask, get_jiffies_64());
+       e = pipapo_get_slow(m, (const u8 *)key, NFT_GENMASK_ANY, get_jiffies_64());

        return e ? &e->ext : NULL;
 }

--- a/net/netfilter/nft_set_pipapo_avx2.c
+++ b/net/netfilter/nft_set_pipapo_avx2.c
@@ -1275,7 +1275,6 @@ nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set,
                       const u32 *key)
 {
        struct nft_pipapo *priv = nft_set_priv(set);
-       u8 genmask = nft_genmask_cur(net);
        const struct nft_pipapo_match *m;
        const u8 *rp = (const u8 *)key;
        const struct nft_pipapo_elem *e;
@@ -1293,7 +1292,7 @@ nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set,

        m = rcu_dereference(priv->match);

-       e = pipapo_get_avx2(m, rp, genmask, get_jiffies_64());
+       e = pipapo_get_avx2(m, rp, NFT_GENMASK_ANY, get_jiffies_64());
        local_bh_enable();

        return e ? &e->ext : NULL;
-- 
2.49.1

             reply	other threads:[~2025-09-10 19:03 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-10 19:03 Florian Westphal [this message]
2025-09-10 19:03 ` [PATCH net 1/7] netfilter: nft_set_bitmap: fix lockdep splat due to missing annotation Florian Westphal
2025-09-11  2:40   ` patchwork-bot+netdevbpf
2025-09-10 19:03 ` [PATCH net 2/7] netfilter: nft_set_pipapo: don't check genbit from packetpath lookups Florian Westphal
2025-09-10 19:03 ` [PATCH net 3/7] netfilter: nft_set_rbtree: continue traversal if element is inactive Florian Westphal
2025-09-10 19:03 ` [PATCH net 4/7] netfilter: nf_tables: place base_seq in struct net Florian Westphal
2025-09-10 19:03 ` [PATCH net 5/7] netfilter: nf_tables: make nft_set_do_lookup available unconditionally Florian Westphal
2025-09-10 19:03 ` [PATCH net 6/7] netfilter: nf_tables: restart set lookup on base_seq change Florian Westphal
2025-09-10 19:03 ` [PATCH net 7/7] MAINTAINERS: add Phil as netfilter reviewer Florian Westphal
2025-09-11  7:16 ` [PATCH net 0/7] netfilter: updates for net: manual merge Matthieu Baerts
  -- strict thread matches above, loose matches on Subject: below --
2023-10-12  8:57 [PATCH net 0/7] netfilter updates for net Florian Westphal
2023-05-10  8:33 [PATCH net 0/7] Netfilter " Pablo Neira Ayuso

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250910190308.13356-1-fw@strlen.de \
    --to=fw@strlen.de \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=kuba@kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=netfilter-devel@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=pablo@netfilter.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).