* [PATCH net 00/10] netfilter: updates for net
@ 2026-02-17 16:32 Florian Westphal
2026-02-17 16:32 ` [PATCH net 01/10] netfilter: annotate NAT helper hook pointers with __rcu Florian Westphal
` (9 more replies)
0 siblings, 10 replies; 12+ messages in thread
From: Florian Westphal @ 2026-02-17 16:32 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
Hi,
The following patchset contains Netfilter fixes for *net*:
1) Add missing __rcu annotations to NAT helper hook pointers in Amanda, FTP,
IRC, SNMP and TFTP helpers. From Sun Jian.
2-4):
- Add global spinlock to serialize nft_counter fetch+reset operations.
- Use atomic64_xchg() for nft_quota reset instead of read+subtract pattern.
Note AI review detects a race in this change but it isn't new. The
'racing' bit only exists to prevent constant stream of 'quota expired'
notifications.
- Revert commit_mutex usage in nf_tables reset path, it caused
circular lock dependency. All from Brian Witte.
5) Fix uninitialized l3num value in nf_conntrack_h323 helper.
6) Fix musl libc compatibility in netfilter_bridge.h UAPI header. This
change isn't nice (UAPI headers should not include libc headers), but
as-is musl builds may fail due to redefinition of struct ethhdr.
7) Fix protocol checksum validation in IPVS for IPv6 with extension headers,
from Julian Anastasov.
8) Fix device reference leak in IPVS when netdev goes down. Also from
Julian.
9) Remove WARN_ON_ONCE when accessing forward path array, this can
trigger with sufficiently long forward paths. From Pablo Neira Ayuso.
10) Fix use-after-free in nf_tables_addchain() error path, from Inseo An.
Please, pull these changes from:
The following changes since commit 77c5e3fdd2793f478e6fdae55c9ea85b21d06f8f:
Merge branch 'selftests-forwarding-fix-br_netfilter-related-test-failures' (2026-02-17 13:34:41 +0100)
are available in the Git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf.git tags/nf-26-02-17
for you to fetch changes up to 71e99ee20fc3f662555118cf1159443250647533:
netfilter: nf_tables: fix use-after-free in nf_tables_addchain() (2026-02-17 15:04:20 +0100)
----------------------------------------------------------------
netfilter pull request nf-26-02-17
----------------------------------------------------------------
Brian Witte (3):
netfilter: nft_counter: serialize reset with spinlock
netfilter: nft_quota: use atomic64_xchg for reset
netfilter: nf_tables: revert commit_mutex usage in reset path
Florian Westphal (1):
netfilter: nf_conntrack_h323: don't pass uninitialised l3num value
Inseo An (1):
netfilter: nf_tables: fix use-after-free in nf_tables_addchain()
Julian Anastasov (2):
ipvs: skip ipv6 extension headers for csum checks
ipvs: do not keep dest_dst if dev is going down
Pablo Neira Ayuso (1):
net: remove WARN_ON_ONCE when accessing forward path array
Phil Sutter (1):
include: uapi: netfilter_bridge.h: Cover for musl libc
Sun Jian (1):
netfilter: annotate NAT helper hook pointers with __rcu
include/linux/netfilter/nf_conntrack_amanda.h | 2 +-
include/linux/netfilter/nf_conntrack_ftp.h | 2 +-
include/linux/netfilter/nf_conntrack_irc.h | 2 +-
include/linux/netfilter/nf_conntrack_snmp.h | 2 +-
include/linux/netfilter/nf_conntrack_tftp.h | 2 +-
include/uapi/linux/netfilter_bridge.h | 4 +
net/core/dev.c | 2 +-
net/netfilter/ipvs/ip_vs_proto_sctp.c | 18 +-
net/netfilter/ipvs/ip_vs_proto_tcp.c | 21 +-
net/netfilter/ipvs/ip_vs_proto_udp.c | 20 +-
net/netfilter/ipvs/ip_vs_xmit.c | 46 +++-
net/netfilter/nf_conntrack_amanda.c | 14 +-
net/netfilter/nf_conntrack_ftp.c | 14 +-
net/netfilter/nf_conntrack_h323_main.c | 10 +-
net/netfilter/nf_conntrack_irc.c | 13 +-
net/netfilter/nf_conntrack_snmp.c | 8 +-
net/netfilter/nf_conntrack_tftp.c | 7 +-
net/netfilter/nf_tables_api.c | 249 +++---------------
net/netfilter/nft_counter.c | 20 +-
net/netfilter/nft_quota.c | 13 +-
20 files changed, 166 insertions(+), 303 deletions(-)
--
2.52.0
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH net 01/10] netfilter: annotate NAT helper hook pointers with __rcu
2026-02-17 16:32 [PATCH net 00/10] netfilter: updates for net Florian Westphal
@ 2026-02-17 16:32 ` Florian Westphal
2026-02-19 1:20 ` patchwork-bot+netdevbpf
2026-02-17 16:32 ` [PATCH net 02/10] netfilter: nft_counter: serialize reset with spinlock Florian Westphal
` (8 subsequent siblings)
9 siblings, 1 reply; 12+ messages in thread
From: Florian Westphal @ 2026-02-17 16:32 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
From: Sun Jian <sun.jian.kdev@gmail.com>
The NAT helper hook pointers are updated and dereferenced under RCU rules,
but lack the proper __rcu annotation.
This makes sparse report address space mismatches when the hooks are used
with rcu_dereference().
Add the missing __rcu annotations to the global hook pointer declarations
and definitions in Amanda, FTP, IRC, SNMP and TFTP.
No functional change intended.
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
include/linux/netfilter/nf_conntrack_amanda.h | 2 +-
include/linux/netfilter/nf_conntrack_ftp.h | 2 +-
include/linux/netfilter/nf_conntrack_irc.h | 2 +-
include/linux/netfilter/nf_conntrack_snmp.h | 2 +-
include/linux/netfilter/nf_conntrack_tftp.h | 2 +-
net/netfilter/nf_conntrack_amanda.c | 14 +++++++-------
net/netfilter/nf_conntrack_ftp.c | 14 +++++++-------
net/netfilter/nf_conntrack_irc.c | 13 +++++++------
net/netfilter/nf_conntrack_snmp.c | 8 ++++----
net/netfilter/nf_conntrack_tftp.c | 7 ++++---
10 files changed, 34 insertions(+), 32 deletions(-)
diff --git a/include/linux/netfilter/nf_conntrack_amanda.h b/include/linux/netfilter/nf_conntrack_amanda.h
index 6f0ac896fcc9..dfe89f38d1f7 100644
--- a/include/linux/netfilter/nf_conntrack_amanda.h
+++ b/include/linux/netfilter/nf_conntrack_amanda.h
@@ -7,7 +7,7 @@
#include <linux/skbuff.h>
#include <net/netfilter/nf_conntrack_expect.h>
-extern unsigned int (*nf_nat_amanda_hook)(struct sk_buff *skb,
+extern unsigned int (__rcu *nf_nat_amanda_hook)(struct sk_buff *skb,
enum ip_conntrack_info ctinfo,
unsigned int protoff,
unsigned int matchoff,
diff --git a/include/linux/netfilter/nf_conntrack_ftp.h b/include/linux/netfilter/nf_conntrack_ftp.h
index 0e38302820b9..f31292642035 100644
--- a/include/linux/netfilter/nf_conntrack_ftp.h
+++ b/include/linux/netfilter/nf_conntrack_ftp.h
@@ -26,7 +26,7 @@ struct nf_ct_ftp_master {
/* For NAT to hook in when we find a packet which describes what other
* connection we should expect. */
-extern unsigned int (*nf_nat_ftp_hook)(struct sk_buff *skb,
+extern unsigned int (__rcu *nf_nat_ftp_hook)(struct sk_buff *skb,
enum ip_conntrack_info ctinfo,
enum nf_ct_ftp_type type,
unsigned int protoff,
diff --git a/include/linux/netfilter/nf_conntrack_irc.h b/include/linux/netfilter/nf_conntrack_irc.h
index d02255f721e1..4f3ca5621998 100644
--- a/include/linux/netfilter/nf_conntrack_irc.h
+++ b/include/linux/netfilter/nf_conntrack_irc.h
@@ -8,7 +8,7 @@
#define IRC_PORT 6667
-extern unsigned int (*nf_nat_irc_hook)(struct sk_buff *skb,
+extern unsigned int (__rcu *nf_nat_irc_hook)(struct sk_buff *skb,
enum ip_conntrack_info ctinfo,
unsigned int protoff,
unsigned int matchoff,
diff --git a/include/linux/netfilter/nf_conntrack_snmp.h b/include/linux/netfilter/nf_conntrack_snmp.h
index 87e4f33eb55f..99107e4f5234 100644
--- a/include/linux/netfilter/nf_conntrack_snmp.h
+++ b/include/linux/netfilter/nf_conntrack_snmp.h
@@ -5,7 +5,7 @@
#include <linux/netfilter.h>
#include <linux/skbuff.h>
-extern int (*nf_nat_snmp_hook)(struct sk_buff *skb,
+extern int (__rcu *nf_nat_snmp_hook)(struct sk_buff *skb,
unsigned int protoff,
struct nf_conn *ct,
enum ip_conntrack_info ctinfo);
diff --git a/include/linux/netfilter/nf_conntrack_tftp.h b/include/linux/netfilter/nf_conntrack_tftp.h
index dc4c1b9beac0..1490b68dd7d1 100644
--- a/include/linux/netfilter/nf_conntrack_tftp.h
+++ b/include/linux/netfilter/nf_conntrack_tftp.h
@@ -19,7 +19,7 @@ struct tftphdr {
#define TFTP_OPCODE_ACK 4
#define TFTP_OPCODE_ERROR 5
-extern unsigned int (*nf_nat_tftp_hook)(struct sk_buff *skb,
+extern unsigned int (__rcu *nf_nat_tftp_hook)(struct sk_buff *skb,
enum ip_conntrack_info ctinfo,
struct nf_conntrack_expect *exp);
diff --git a/net/netfilter/nf_conntrack_amanda.c b/net/netfilter/nf_conntrack_amanda.c
index 7be4c35e4795..c0132559f6af 100644
--- a/net/netfilter/nf_conntrack_amanda.c
+++ b/net/netfilter/nf_conntrack_amanda.c
@@ -37,13 +37,13 @@ MODULE_PARM_DESC(master_timeout, "timeout for the master connection");
module_param(ts_algo, charp, 0400);
MODULE_PARM_DESC(ts_algo, "textsearch algorithm to use (default kmp)");
-unsigned int (*nf_nat_amanda_hook)(struct sk_buff *skb,
- enum ip_conntrack_info ctinfo,
- unsigned int protoff,
- unsigned int matchoff,
- unsigned int matchlen,
- struct nf_conntrack_expect *exp)
- __read_mostly;
+unsigned int (__rcu *nf_nat_amanda_hook)(struct sk_buff *skb,
+ enum ip_conntrack_info ctinfo,
+ unsigned int protoff,
+ unsigned int matchoff,
+ unsigned int matchlen,
+ struct nf_conntrack_expect *exp)
+ __read_mostly;
EXPORT_SYMBOL_GPL(nf_nat_amanda_hook);
enum amanda_strings {
diff --git a/net/netfilter/nf_conntrack_ftp.c b/net/netfilter/nf_conntrack_ftp.c
index 617f744a2e3a..5e00f9123c38 100644
--- a/net/netfilter/nf_conntrack_ftp.c
+++ b/net/netfilter/nf_conntrack_ftp.c
@@ -43,13 +43,13 @@ module_param_array(ports, ushort, &ports_c, 0400);
static bool loose;
module_param(loose, bool, 0600);
-unsigned int (*nf_nat_ftp_hook)(struct sk_buff *skb,
- enum ip_conntrack_info ctinfo,
- enum nf_ct_ftp_type type,
- unsigned int protoff,
- unsigned int matchoff,
- unsigned int matchlen,
- struct nf_conntrack_expect *exp);
+unsigned int (__rcu *nf_nat_ftp_hook)(struct sk_buff *skb,
+ enum ip_conntrack_info ctinfo,
+ enum nf_ct_ftp_type type,
+ unsigned int protoff,
+ unsigned int matchoff,
+ unsigned int matchlen,
+ struct nf_conntrack_expect *exp);
EXPORT_SYMBOL_GPL(nf_nat_ftp_hook);
static int try_rfc959(const char *, size_t, struct nf_conntrack_man *,
diff --git a/net/netfilter/nf_conntrack_irc.c b/net/netfilter/nf_conntrack_irc.c
index 5703846bea3b..b8e6d724acd1 100644
--- a/net/netfilter/nf_conntrack_irc.c
+++ b/net/netfilter/nf_conntrack_irc.c
@@ -30,12 +30,13 @@ static unsigned int dcc_timeout __read_mostly = 300;
static char *irc_buffer;
static DEFINE_SPINLOCK(irc_buffer_lock);
-unsigned int (*nf_nat_irc_hook)(struct sk_buff *skb,
- enum ip_conntrack_info ctinfo,
- unsigned int protoff,
- unsigned int matchoff,
- unsigned int matchlen,
- struct nf_conntrack_expect *exp) __read_mostly;
+unsigned int (__rcu *nf_nat_irc_hook)(struct sk_buff *skb,
+ enum ip_conntrack_info ctinfo,
+ unsigned int protoff,
+ unsigned int matchoff,
+ unsigned int matchlen,
+ struct nf_conntrack_expect *exp)
+ __read_mostly;
EXPORT_SYMBOL_GPL(nf_nat_irc_hook);
#define HELPER_NAME "irc"
diff --git a/net/netfilter/nf_conntrack_snmp.c b/net/netfilter/nf_conntrack_snmp.c
index daacf2023fa5..387dd6e58f88 100644
--- a/net/netfilter/nf_conntrack_snmp.c
+++ b/net/netfilter/nf_conntrack_snmp.c
@@ -25,10 +25,10 @@ static unsigned int timeout __read_mostly = 30;
module_param(timeout, uint, 0400);
MODULE_PARM_DESC(timeout, "timeout for master connection/replies in seconds");
-int (*nf_nat_snmp_hook)(struct sk_buff *skb,
- unsigned int protoff,
- struct nf_conn *ct,
- enum ip_conntrack_info ctinfo);
+int (__rcu *nf_nat_snmp_hook)(struct sk_buff *skb,
+ unsigned int protoff,
+ struct nf_conn *ct,
+ enum ip_conntrack_info ctinfo);
EXPORT_SYMBOL_GPL(nf_nat_snmp_hook);
static int snmp_conntrack_help(struct sk_buff *skb, unsigned int protoff,
diff --git a/net/netfilter/nf_conntrack_tftp.c b/net/netfilter/nf_conntrack_tftp.c
index 80ee53f29f68..89e9914e5d03 100644
--- a/net/netfilter/nf_conntrack_tftp.c
+++ b/net/netfilter/nf_conntrack_tftp.c
@@ -32,9 +32,10 @@ static unsigned int ports_c;
module_param_array(ports, ushort, &ports_c, 0400);
MODULE_PARM_DESC(ports, "Port numbers of TFTP servers");
-unsigned int (*nf_nat_tftp_hook)(struct sk_buff *skb,
- enum ip_conntrack_info ctinfo,
- struct nf_conntrack_expect *exp) __read_mostly;
+unsigned int (__rcu *nf_nat_tftp_hook)(struct sk_buff *skb,
+ enum ip_conntrack_info ctinfo,
+ struct nf_conntrack_expect *exp)
+ __read_mostly;
EXPORT_SYMBOL_GPL(nf_nat_tftp_hook);
static int tftp_help(struct sk_buff *skb,
--
2.52.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net 02/10] netfilter: nft_counter: serialize reset with spinlock
2026-02-17 16:32 [PATCH net 00/10] netfilter: updates for net Florian Westphal
2026-02-17 16:32 ` [PATCH net 01/10] netfilter: annotate NAT helper hook pointers with __rcu Florian Westphal
@ 2026-02-17 16:32 ` Florian Westphal
2026-02-17 16:32 ` [PATCH net 03/10] netfilter: nft_quota: use atomic64_xchg for reset Florian Westphal
` (7 subsequent siblings)
9 siblings, 0 replies; 12+ messages in thread
From: Florian Westphal @ 2026-02-17 16:32 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
From: Brian Witte <brianwitte@mailfence.com>
Add a global static spinlock to serialize counter fetch+reset
operations, preventing concurrent dump-and-reset from underrunning
values.
The lock is taken before fetching the total so that two parallel
resets cannot both read the same counter values and then both
subtract them.
A global lock is used for simplicity since resets are infrequent.
If this becomes a bottleneck, it can be replaced with a per-net
lock later.
Fixes: bd662c4218f9 ("netfilter: nf_tables: Add locking for NFT_MSG_GETOBJ_RESET requests")
Fixes: 3d483faa6663 ("netfilter: nf_tables: Add locking for NFT_MSG_GETSETELEM_RESET requests")
Fixes: 3cb03edb4de3 ("netfilter: nf_tables: Add locking for NFT_MSG_GETRULE_RESET requests")
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Brian Witte <brianwitte@mailfence.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nft_counter.c | 20 ++++++++++++++++----
1 file changed, 16 insertions(+), 4 deletions(-)
diff --git a/net/netfilter/nft_counter.c b/net/netfilter/nft_counter.c
index 0d70325280cc..169ae93688bc 100644
--- a/net/netfilter/nft_counter.c
+++ b/net/netfilter/nft_counter.c
@@ -32,6 +32,9 @@ struct nft_counter_percpu_priv {
static DEFINE_PER_CPU(struct u64_stats_sync, nft_counter_sync);
+/* control plane only: sync fetch+reset */
+static DEFINE_SPINLOCK(nft_counter_lock);
+
static inline void nft_counter_do_eval(struct nft_counter_percpu_priv *priv,
struct nft_regs *regs,
const struct nft_pktinfo *pkt)
@@ -148,13 +151,25 @@ static void nft_counter_fetch(struct nft_counter_percpu_priv *priv,
}
}
+static void nft_counter_fetch_and_reset(struct nft_counter_percpu_priv *priv,
+ struct nft_counter_tot *total)
+{
+ spin_lock(&nft_counter_lock);
+ nft_counter_fetch(priv, total);
+ nft_counter_reset(priv, total);
+ spin_unlock(&nft_counter_lock);
+}
+
static int nft_counter_do_dump(struct sk_buff *skb,
struct nft_counter_percpu_priv *priv,
bool reset)
{
struct nft_counter_tot total;
- nft_counter_fetch(priv, &total);
+ if (unlikely(reset))
+ nft_counter_fetch_and_reset(priv, &total);
+ else
+ nft_counter_fetch(priv, &total);
if (nla_put_be64(skb, NFTA_COUNTER_BYTES, cpu_to_be64(total.bytes),
NFTA_COUNTER_PAD) ||
@@ -162,9 +177,6 @@ static int nft_counter_do_dump(struct sk_buff *skb,
NFTA_COUNTER_PAD))
goto nla_put_failure;
- if (reset)
- nft_counter_reset(priv, &total);
-
return 0;
nla_put_failure:
--
2.52.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net 03/10] netfilter: nft_quota: use atomic64_xchg for reset
2026-02-17 16:32 [PATCH net 00/10] netfilter: updates for net Florian Westphal
2026-02-17 16:32 ` [PATCH net 01/10] netfilter: annotate NAT helper hook pointers with __rcu Florian Westphal
2026-02-17 16:32 ` [PATCH net 02/10] netfilter: nft_counter: serialize reset with spinlock Florian Westphal
@ 2026-02-17 16:32 ` Florian Westphal
2026-02-17 16:32 ` [PATCH net 04/10] netfilter: nf_tables: revert commit_mutex usage in reset path Florian Westphal
` (6 subsequent siblings)
9 siblings, 0 replies; 12+ messages in thread
From: Florian Westphal @ 2026-02-17 16:32 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
From: Brian Witte <brianwitte@mailfence.com>
Use atomic64_xchg() to atomically read and zero the consumed value
on reset, which is simpler than the previous read+sub pattern and
doesn't require lock serialization.
Fixes: bd662c4218f9 ("netfilter: nf_tables: Add locking for NFT_MSG_GETOBJ_RESET requests")
Fixes: 3d483faa6663 ("netfilter: nf_tables: Add locking for NFT_MSG_GETSETELEM_RESET requests")
Fixes: 3cb03edb4de3 ("netfilter: nf_tables: Add locking for NFT_MSG_GETRULE_RESET requests")
Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Brian Witte <brianwitte@mailfence.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nft_quota.c | 13 +++++++------
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/net/netfilter/nft_quota.c b/net/netfilter/nft_quota.c
index df0798da2329..cb6c0e04ff67 100644
--- a/net/netfilter/nft_quota.c
+++ b/net/netfilter/nft_quota.c
@@ -140,11 +140,16 @@ static int nft_quota_do_dump(struct sk_buff *skb, struct nft_quota *priv,
u64 consumed, consumed_cap, quota;
u32 flags = priv->flags;
- /* Since we inconditionally increment consumed quota for each packet
+ /* Since we unconditionally increment consumed quota for each packet
* that we see, don't go over the quota boundary in what we send to
* userspace.
*/
- consumed = atomic64_read(priv->consumed);
+ if (reset) {
+ consumed = atomic64_xchg(priv->consumed, 0);
+ clear_bit(NFT_QUOTA_DEPLETED_BIT, &priv->flags);
+ } else {
+ consumed = atomic64_read(priv->consumed);
+ }
quota = atomic64_read(&priv->quota);
if (consumed >= quota) {
consumed_cap = quota;
@@ -160,10 +165,6 @@ static int nft_quota_do_dump(struct sk_buff *skb, struct nft_quota *priv,
nla_put_be32(skb, NFTA_QUOTA_FLAGS, htonl(flags)))
goto nla_put_failure;
- if (reset) {
- atomic64_sub(consumed, priv->consumed);
- clear_bit(NFT_QUOTA_DEPLETED_BIT, &priv->flags);
- }
return 0;
nla_put_failure:
--
2.52.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net 04/10] netfilter: nf_tables: revert commit_mutex usage in reset path
2026-02-17 16:32 [PATCH net 00/10] netfilter: updates for net Florian Westphal
` (2 preceding siblings ...)
2026-02-17 16:32 ` [PATCH net 03/10] netfilter: nft_quota: use atomic64_xchg for reset Florian Westphal
@ 2026-02-17 16:32 ` Florian Westphal
2026-02-17 16:32 ` [PATCH net 05/10] netfilter: nf_conntrack_h323: don't pass uninitialised l3num value Florian Westphal
` (5 subsequent siblings)
9 siblings, 0 replies; 12+ messages in thread
From: Florian Westphal @ 2026-02-17 16:32 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
From: Brian Witte <brianwitte@mailfence.com>
It causes circular lock dependency between commit_mutex, nfnl_subsys_ipset
and nlk_cb_mutex when nft reset, ipset list, and iptables-nft with '-m set'
rule run at the same time.
Previous patches made it safe to run individual reset handlers concurrently
so commit_mutex is no longer required to prevent this.
Fixes: bd662c4218f9 ("netfilter: nf_tables: Add locking for NFT_MSG_GETOBJ_RESET requests")
Fixes: 3d483faa6663 ("netfilter: nf_tables: Add locking for NFT_MSG_GETSETELEM_RESET requests")
Fixes: 3cb03edb4de3 ("netfilter: nf_tables: Add locking for NFT_MSG_GETRULE_RESET requests")
Link: https://lore.kernel.org/all/aUh_3mVRV8OrGsVo@strlen.de/
Reported-by: <syzbot+ff16b505ec9152e5f448@syzkaller.appspotmail.com>
Closes: https://syzkaller.appspot.com/bug?extid=ff16b505ec9152e5f448
Signed-off-by: Brian Witte <brianwitte@mailfence.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nf_tables_api.c | 248 ++++++----------------------------
1 file changed, 42 insertions(+), 206 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 1ed034a47bd0..819056ea1ce1 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -3901,23 +3901,6 @@ static int nf_tables_dump_rules(struct sk_buff *skb,
return skb->len;
}
-static int nf_tables_dumpreset_rules(struct sk_buff *skb,
- struct netlink_callback *cb)
-{
- struct nftables_pernet *nft_net = nft_pernet(sock_net(skb->sk));
- int ret;
-
- /* Mutex is held is to prevent that two concurrent dump-and-reset calls
- * do not underrun counters and quotas. The commit_mutex is used for
- * the lack a better lock, this is not transaction path.
- */
- mutex_lock(&nft_net->commit_mutex);
- ret = nf_tables_dump_rules(skb, cb);
- mutex_unlock(&nft_net->commit_mutex);
-
- return ret;
-}
-
static int nf_tables_dump_rules_start(struct netlink_callback *cb)
{
struct nft_rule_dump_ctx *ctx = (void *)cb->ctx;
@@ -3937,16 +3920,10 @@ static int nf_tables_dump_rules_start(struct netlink_callback *cb)
return -ENOMEM;
}
}
- return 0;
-}
-
-static int nf_tables_dumpreset_rules_start(struct netlink_callback *cb)
-{
- struct nft_rule_dump_ctx *ctx = (void *)cb->ctx;
-
- ctx->reset = true;
+ if (NFNL_MSG_TYPE(cb->nlh->nlmsg_type) == NFT_MSG_GETRULE_RESET)
+ ctx->reset = true;
- return nf_tables_dump_rules_start(cb);
+ return 0;
}
static int nf_tables_dump_rules_done(struct netlink_callback *cb)
@@ -4012,6 +3989,8 @@ static int nf_tables_getrule(struct sk_buff *skb, const struct nfnl_info *info,
u32 portid = NETLINK_CB(skb).portid;
struct net *net = info->net;
struct sk_buff *skb2;
+ bool reset = false;
+ char *buf;
if (info->nlh->nlmsg_flags & NLM_F_DUMP) {
struct netlink_dump_control c = {
@@ -4025,47 +4004,16 @@ static int nf_tables_getrule(struct sk_buff *skb, const struct nfnl_info *info,
return nft_netlink_dump_start_rcu(info->sk, skb, info->nlh, &c);
}
- skb2 = nf_tables_getrule_single(portid, info, nla, false);
- if (IS_ERR(skb2))
- return PTR_ERR(skb2);
-
- return nfnetlink_unicast(skb2, net, portid);
-}
-
-static int nf_tables_getrule_reset(struct sk_buff *skb,
- const struct nfnl_info *info,
- const struct nlattr * const nla[])
-{
- struct nftables_pernet *nft_net = nft_pernet(info->net);
- u32 portid = NETLINK_CB(skb).portid;
- struct net *net = info->net;
- struct sk_buff *skb2;
- char *buf;
-
- if (info->nlh->nlmsg_flags & NLM_F_DUMP) {
- struct netlink_dump_control c = {
- .start= nf_tables_dumpreset_rules_start,
- .dump = nf_tables_dumpreset_rules,
- .done = nf_tables_dump_rules_done,
- .module = THIS_MODULE,
- .data = (void *)nla,
- };
-
- return nft_netlink_dump_start_rcu(info->sk, skb, info->nlh, &c);
- }
-
- if (!try_module_get(THIS_MODULE))
- return -EINVAL;
- rcu_read_unlock();
- mutex_lock(&nft_net->commit_mutex);
- skb2 = nf_tables_getrule_single(portid, info, nla, true);
- mutex_unlock(&nft_net->commit_mutex);
- rcu_read_lock();
- module_put(THIS_MODULE);
+ if (NFNL_MSG_TYPE(info->nlh->nlmsg_type) == NFT_MSG_GETRULE_RESET)
+ reset = true;
+ skb2 = nf_tables_getrule_single(portid, info, nla, reset);
if (IS_ERR(skb2))
return PTR_ERR(skb2);
+ if (!reset)
+ return nfnetlink_unicast(skb2, net, portid);
+
buf = kasprintf(GFP_ATOMIC, "%.*s:%u",
nla_len(nla[NFTA_RULE_TABLE]),
(char *)nla_data(nla[NFTA_RULE_TABLE]),
@@ -6324,6 +6272,10 @@ static int nf_tables_dump_set(struct sk_buff *skb, struct netlink_callback *cb)
nla_nest_end(skb, nest);
nlmsg_end(skb, nlh);
+ if (dump_ctx->reset && args.iter.count > args.iter.skip)
+ audit_log_nft_set_reset(table, cb->seq,
+ args.iter.count - args.iter.skip);
+
rcu_read_unlock();
if (args.iter.err && args.iter.err != -EMSGSIZE)
@@ -6339,26 +6291,6 @@ static int nf_tables_dump_set(struct sk_buff *skb, struct netlink_callback *cb)
return -ENOSPC;
}
-static int nf_tables_dumpreset_set(struct sk_buff *skb,
- struct netlink_callback *cb)
-{
- struct nftables_pernet *nft_net = nft_pernet(sock_net(skb->sk));
- struct nft_set_dump_ctx *dump_ctx = cb->data;
- int ret, skip = cb->args[0];
-
- mutex_lock(&nft_net->commit_mutex);
-
- ret = nf_tables_dump_set(skb, cb);
-
- if (cb->args[0] > skip)
- audit_log_nft_set_reset(dump_ctx->ctx.table, cb->seq,
- cb->args[0] - skip);
-
- mutex_unlock(&nft_net->commit_mutex);
-
- return ret;
-}
-
static int nf_tables_dump_set_start(struct netlink_callback *cb)
{
struct nft_set_dump_ctx *dump_ctx = cb->data;
@@ -6602,8 +6534,13 @@ static int nf_tables_getsetelem(struct sk_buff *skb,
{
struct netlink_ext_ack *extack = info->extack;
struct nft_set_dump_ctx dump_ctx;
+ int rem, err = 0, nelems = 0;
+ struct net *net = info->net;
struct nlattr *attr;
- int rem, err = 0;
+ bool reset = false;
+
+ if (NFNL_MSG_TYPE(info->nlh->nlmsg_type) == NFT_MSG_GETSETELEM_RESET)
+ reset = true;
if (info->nlh->nlmsg_flags & NLM_F_DUMP) {
struct netlink_dump_control c = {
@@ -6613,7 +6550,7 @@ static int nf_tables_getsetelem(struct sk_buff *skb,
.module = THIS_MODULE,
};
- err = nft_set_dump_ctx_init(&dump_ctx, skb, info, nla, false);
+ err = nft_set_dump_ctx_init(&dump_ctx, skb, info, nla, reset);
if (err)
return err;
@@ -6624,75 +6561,21 @@ static int nf_tables_getsetelem(struct sk_buff *skb,
if (!nla[NFTA_SET_ELEM_LIST_ELEMENTS])
return -EINVAL;
- err = nft_set_dump_ctx_init(&dump_ctx, skb, info, nla, false);
+ err = nft_set_dump_ctx_init(&dump_ctx, skb, info, nla, reset);
if (err)
return err;
nla_for_each_nested(attr, nla[NFTA_SET_ELEM_LIST_ELEMENTS], rem) {
- err = nft_get_set_elem(&dump_ctx.ctx, dump_ctx.set, attr, false);
- if (err < 0) {
- NL_SET_BAD_ATTR(extack, attr);
- break;
- }
- }
-
- return err;
-}
-
-static int nf_tables_getsetelem_reset(struct sk_buff *skb,
- const struct nfnl_info *info,
- const struct nlattr * const nla[])
-{
- struct nftables_pernet *nft_net = nft_pernet(info->net);
- struct netlink_ext_ack *extack = info->extack;
- struct nft_set_dump_ctx dump_ctx;
- int rem, err = 0, nelems = 0;
- struct nlattr *attr;
-
- if (info->nlh->nlmsg_flags & NLM_F_DUMP) {
- struct netlink_dump_control c = {
- .start = nf_tables_dump_set_start,
- .dump = nf_tables_dumpreset_set,
- .done = nf_tables_dump_set_done,
- .module = THIS_MODULE,
- };
-
- err = nft_set_dump_ctx_init(&dump_ctx, skb, info, nla, true);
- if (err)
- return err;
-
- c.data = &dump_ctx;
- return nft_netlink_dump_start_rcu(info->sk, skb, info->nlh, &c);
- }
-
- if (!nla[NFTA_SET_ELEM_LIST_ELEMENTS])
- return -EINVAL;
-
- if (!try_module_get(THIS_MODULE))
- return -EINVAL;
- rcu_read_unlock();
- mutex_lock(&nft_net->commit_mutex);
- rcu_read_lock();
-
- err = nft_set_dump_ctx_init(&dump_ctx, skb, info, nla, true);
- if (err)
- goto out_unlock;
-
- nla_for_each_nested(attr, nla[NFTA_SET_ELEM_LIST_ELEMENTS], rem) {
- err = nft_get_set_elem(&dump_ctx.ctx, dump_ctx.set, attr, true);
+ err = nft_get_set_elem(&dump_ctx.ctx, dump_ctx.set, attr, reset);
if (err < 0) {
NL_SET_BAD_ATTR(extack, attr);
break;
}
nelems++;
}
- audit_log_nft_set_reset(dump_ctx.ctx.table, nft_base_seq(info->net), nelems);
-
-out_unlock:
- rcu_read_unlock();
- mutex_unlock(&nft_net->commit_mutex);
- rcu_read_lock();
- module_put(THIS_MODULE);
+ if (reset)
+ audit_log_nft_set_reset(dump_ctx.ctx.table, nft_base_seq(net),
+ nelems);
return err;
}
@@ -8564,19 +8447,6 @@ static int nf_tables_dump_obj(struct sk_buff *skb, struct netlink_callback *cb)
return skb->len;
}
-static int nf_tables_dumpreset_obj(struct sk_buff *skb,
- struct netlink_callback *cb)
-{
- struct nftables_pernet *nft_net = nft_pernet(sock_net(skb->sk));
- int ret;
-
- mutex_lock(&nft_net->commit_mutex);
- ret = nf_tables_dump_obj(skb, cb);
- mutex_unlock(&nft_net->commit_mutex);
-
- return ret;
-}
-
static int nf_tables_dump_obj_start(struct netlink_callback *cb)
{
struct nft_obj_dump_ctx *ctx = (void *)cb->ctx;
@@ -8593,16 +8463,10 @@ static int nf_tables_dump_obj_start(struct netlink_callback *cb)
if (nla[NFTA_OBJ_TYPE])
ctx->type = ntohl(nla_get_be32(nla[NFTA_OBJ_TYPE]));
- return 0;
-}
-
-static int nf_tables_dumpreset_obj_start(struct netlink_callback *cb)
-{
- struct nft_obj_dump_ctx *ctx = (void *)cb->ctx;
-
- ctx->reset = true;
+ if (NFNL_MSG_TYPE(cb->nlh->nlmsg_type) == NFT_MSG_GETOBJ_RESET)
+ ctx->reset = true;
- return nf_tables_dump_obj_start(cb);
+ return 0;
}
static int nf_tables_dump_obj_done(struct netlink_callback *cb)
@@ -8664,42 +8528,16 @@ nf_tables_getobj_single(u32 portid, const struct nfnl_info *info,
static int nf_tables_getobj(struct sk_buff *skb, const struct nfnl_info *info,
const struct nlattr * const nla[])
{
- u32 portid = NETLINK_CB(skb).portid;
- struct sk_buff *skb2;
-
- if (info->nlh->nlmsg_flags & NLM_F_DUMP) {
- struct netlink_dump_control c = {
- .start = nf_tables_dump_obj_start,
- .dump = nf_tables_dump_obj,
- .done = nf_tables_dump_obj_done,
- .module = THIS_MODULE,
- .data = (void *)nla,
- };
-
- return nft_netlink_dump_start_rcu(info->sk, skb, info->nlh, &c);
- }
-
- skb2 = nf_tables_getobj_single(portid, info, nla, false);
- if (IS_ERR(skb2))
- return PTR_ERR(skb2);
-
- return nfnetlink_unicast(skb2, info->net, portid);
-}
-
-static int nf_tables_getobj_reset(struct sk_buff *skb,
- const struct nfnl_info *info,
- const struct nlattr * const nla[])
-{
- struct nftables_pernet *nft_net = nft_pernet(info->net);
u32 portid = NETLINK_CB(skb).portid;
struct net *net = info->net;
struct sk_buff *skb2;
+ bool reset = false;
char *buf;
if (info->nlh->nlmsg_flags & NLM_F_DUMP) {
struct netlink_dump_control c = {
- .start = nf_tables_dumpreset_obj_start,
- .dump = nf_tables_dumpreset_obj,
+ .start = nf_tables_dump_obj_start,
+ .dump = nf_tables_dump_obj,
.done = nf_tables_dump_obj_done,
.module = THIS_MODULE,
.data = (void *)nla,
@@ -8708,18 +8546,16 @@ static int nf_tables_getobj_reset(struct sk_buff *skb,
return nft_netlink_dump_start_rcu(info->sk, skb, info->nlh, &c);
}
- if (!try_module_get(THIS_MODULE))
- return -EINVAL;
- rcu_read_unlock();
- mutex_lock(&nft_net->commit_mutex);
- skb2 = nf_tables_getobj_single(portid, info, nla, true);
- mutex_unlock(&nft_net->commit_mutex);
- rcu_read_lock();
- module_put(THIS_MODULE);
+ if (NFNL_MSG_TYPE(info->nlh->nlmsg_type) == NFT_MSG_GETOBJ_RESET)
+ reset = true;
+ skb2 = nf_tables_getobj_single(portid, info, nla, reset);
if (IS_ERR(skb2))
return PTR_ERR(skb2);
+ if (!reset)
+ return nfnetlink_unicast(skb2, net, NETLINK_CB(skb).portid);
+
buf = kasprintf(GFP_ATOMIC, "%.*s:%u",
nla_len(nla[NFTA_OBJ_TABLE]),
(char *)nla_data(nla[NFTA_OBJ_TABLE]),
@@ -10037,7 +9873,7 @@ static const struct nfnl_callback nf_tables_cb[NFT_MSG_MAX] = {
.policy = nft_rule_policy,
},
[NFT_MSG_GETRULE_RESET] = {
- .call = nf_tables_getrule_reset,
+ .call = nf_tables_getrule,
.type = NFNL_CB_RCU,
.attr_count = NFTA_RULE_MAX,
.policy = nft_rule_policy,
@@ -10091,7 +9927,7 @@ static const struct nfnl_callback nf_tables_cb[NFT_MSG_MAX] = {
.policy = nft_set_elem_list_policy,
},
[NFT_MSG_GETSETELEM_RESET] = {
- .call = nf_tables_getsetelem_reset,
+ .call = nf_tables_getsetelem,
.type = NFNL_CB_RCU,
.attr_count = NFTA_SET_ELEM_LIST_MAX,
.policy = nft_set_elem_list_policy,
@@ -10137,7 +9973,7 @@ static const struct nfnl_callback nf_tables_cb[NFT_MSG_MAX] = {
.policy = nft_obj_policy,
},
[NFT_MSG_GETOBJ_RESET] = {
- .call = nf_tables_getobj_reset,
+ .call = nf_tables_getobj,
.type = NFNL_CB_RCU,
.attr_count = NFTA_OBJ_MAX,
.policy = nft_obj_policy,
--
2.52.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net 05/10] netfilter: nf_conntrack_h323: don't pass uninitialised l3num value
2026-02-17 16:32 [PATCH net 00/10] netfilter: updates for net Florian Westphal
` (3 preceding siblings ...)
2026-02-17 16:32 ` [PATCH net 04/10] netfilter: nf_tables: revert commit_mutex usage in reset path Florian Westphal
@ 2026-02-17 16:32 ` Florian Westphal
2026-02-17 16:32 ` [PATCH net 06/10] include: uapi: netfilter_bridge.h: Cover for musl libc Florian Westphal
` (4 subsequent siblings)
9 siblings, 0 replies; 12+ messages in thread
From: Florian Westphal @ 2026-02-17 16:32 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
Mihail Milev reports: Error: UNINIT (CWE-457):
net/netfilter/nf_conntrack_h323_main.c:1189:2: var_decl:
Declaring variable "tuple" without initializer.
net/netfilter/nf_conntrack_h323_main.c:1197:2:
uninit_use_in_call: Using uninitialized value "tuple.src.l3num" when calling "__nf_ct_expect_find".
net/netfilter/nf_conntrack_expect.c:142:2:
read_value: Reading value "tuple->src.l3num" when calling "nf_ct_expect_dst_hash".
1195| tuple.dst.protonum = IPPROTO_TCP;
1196|
1197|-> exp = __nf_ct_expect_find(net, nf_ct_zone(ct), &tuple);
1198| if (exp && exp->master == ct)
1199| return exp;
Switch this to a C99 initialiser and set the l3num value.
Fixes: f587de0e2feb ("[NETFILTER]: nf_conntrack/nf_nat: add H.323 helper port")
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nf_conntrack_h323_main.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/net/netfilter/nf_conntrack_h323_main.c b/net/netfilter/nf_conntrack_h323_main.c
index 17f1f453d481..a2a0e22ccee1 100644
--- a/net/netfilter/nf_conntrack_h323_main.c
+++ b/net/netfilter/nf_conntrack_h323_main.c
@@ -1187,13 +1187,13 @@ static struct nf_conntrack_expect *find_expect(struct nf_conn *ct,
{
struct net *net = nf_ct_net(ct);
struct nf_conntrack_expect *exp;
- struct nf_conntrack_tuple tuple;
+ struct nf_conntrack_tuple tuple = {
+ .src.l3num = nf_ct_l3num(ct),
+ .dst.protonum = IPPROTO_TCP,
+ .dst.u.tcp.port = port,
+ };
- memset(&tuple.src.u3, 0, sizeof(tuple.src.u3));
- tuple.src.u.tcp.port = 0;
memcpy(&tuple.dst.u3, addr, sizeof(tuple.dst.u3));
- tuple.dst.u.tcp.port = port;
- tuple.dst.protonum = IPPROTO_TCP;
exp = __nf_ct_expect_find(net, nf_ct_zone(ct), &tuple);
if (exp && exp->master == ct)
--
2.52.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net 06/10] include: uapi: netfilter_bridge.h: Cover for musl libc
2026-02-17 16:32 [PATCH net 00/10] netfilter: updates for net Florian Westphal
` (4 preceding siblings ...)
2026-02-17 16:32 ` [PATCH net 05/10] netfilter: nf_conntrack_h323: don't pass uninitialised l3num value Florian Westphal
@ 2026-02-17 16:32 ` Florian Westphal
2026-02-17 16:32 ` [PATCH net 07/10] ipvs: skip ipv6 extension headers for csum checks Florian Westphal
` (3 subsequent siblings)
9 siblings, 0 replies; 12+ messages in thread
From: Florian Westphal @ 2026-02-17 16:32 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
From: Phil Sutter <phil@nwl.cc>
Musl defines its own struct ethhdr and thus defines __UAPI_DEF_ETHHDR to
zero. To avoid struct redefinition errors, user space is therefore
supposed to include netinet/if_ether.h before (or instead of)
linux/if_ether.h. To relieve them from this burden, include the libc
header here if not building for kernel space.
Reported-by: Alyssa Ross <hi@alyssa.is>
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
include/uapi/linux/netfilter_bridge.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/include/uapi/linux/netfilter_bridge.h b/include/uapi/linux/netfilter_bridge.h
index f6e8d1e05c97..758de72b2764 100644
--- a/include/uapi/linux/netfilter_bridge.h
+++ b/include/uapi/linux/netfilter_bridge.h
@@ -5,6 +5,10 @@
/* bridge-specific defines for netfilter.
*/
+#ifndef __KERNEL__
+#include <netinet/if_ether.h> /* for __UAPI_DEF_ETHHDR if defined */
+#endif
+
#include <linux/in.h>
#include <linux/netfilter.h>
#include <linux/if_ether.h>
--
2.52.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net 07/10] ipvs: skip ipv6 extension headers for csum checks
2026-02-17 16:32 [PATCH net 00/10] netfilter: updates for net Florian Westphal
` (5 preceding siblings ...)
2026-02-17 16:32 ` [PATCH net 06/10] include: uapi: netfilter_bridge.h: Cover for musl libc Florian Westphal
@ 2026-02-17 16:32 ` Florian Westphal
2026-02-17 16:32 ` [PATCH net 08/10] ipvs: do not keep dest_dst if dev is going down Florian Westphal
` (2 subsequent siblings)
9 siblings, 0 replies; 12+ messages in thread
From: Florian Westphal @ 2026-02-17 16:32 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
From: Julian Anastasov <ja@ssi.bg>
Protocol checksum validation fails for IPv6 if there are extension
headers before the protocol header. iph->len already contains its
offset, so use it to fix the problem.
Fixes: 2906f66a5682 ("ipvs: SCTP Trasport Loadbalancing Support")
Fixes: 0bbdd42b7efa ("IPVS: Extend protocol DNAT/SNAT and state handlers")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/ipvs/ip_vs_proto_sctp.c | 18 ++++++------------
net/netfilter/ipvs/ip_vs_proto_tcp.c | 21 +++++++--------------
net/netfilter/ipvs/ip_vs_proto_udp.c | 20 +++++++-------------
3 files changed, 20 insertions(+), 39 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_proto_sctp.c b/net/netfilter/ipvs/ip_vs_proto_sctp.c
index 83e452916403..63c78a1f3918 100644
--- a/net/netfilter/ipvs/ip_vs_proto_sctp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_sctp.c
@@ -10,7 +10,8 @@
#include <net/ip_vs.h>
static int
-sctp_csum_check(int af, struct sk_buff *skb, struct ip_vs_protocol *pp);
+sctp_csum_check(int af, struct sk_buff *skb, struct ip_vs_protocol *pp,
+ unsigned int sctphoff);
static int
sctp_conn_schedule(struct netns_ipvs *ipvs, int af, struct sk_buff *skb,
@@ -108,7 +109,7 @@ sctp_snat_handler(struct sk_buff *skb, struct ip_vs_protocol *pp,
int ret;
/* Some checks before mangling */
- if (!sctp_csum_check(cp->af, skb, pp))
+ if (!sctp_csum_check(cp->af, skb, pp, sctphoff))
return 0;
/* Call application helper if needed */
@@ -156,7 +157,7 @@ sctp_dnat_handler(struct sk_buff *skb, struct ip_vs_protocol *pp,
int ret;
/* Some checks before mangling */
- if (!sctp_csum_check(cp->af, skb, pp))
+ if (!sctp_csum_check(cp->af, skb, pp, sctphoff))
return 0;
/* Call application helper if needed */
@@ -185,19 +186,12 @@ sctp_dnat_handler(struct sk_buff *skb, struct ip_vs_protocol *pp,
}
static int
-sctp_csum_check(int af, struct sk_buff *skb, struct ip_vs_protocol *pp)
+sctp_csum_check(int af, struct sk_buff *skb, struct ip_vs_protocol *pp,
+ unsigned int sctphoff)
{
- unsigned int sctphoff;
struct sctphdr *sh;
__le32 cmp, val;
-#ifdef CONFIG_IP_VS_IPV6
- if (af == AF_INET6)
- sctphoff = sizeof(struct ipv6hdr);
- else
-#endif
- sctphoff = ip_hdrlen(skb);
-
sh = (struct sctphdr *)(skb->data + sctphoff);
cmp = sh->checksum;
val = sctp_compute_cksum(skb, sctphoff);
diff --git a/net/netfilter/ipvs/ip_vs_proto_tcp.c b/net/netfilter/ipvs/ip_vs_proto_tcp.c
index f68a1533ee45..8cc0a8ce6241 100644
--- a/net/netfilter/ipvs/ip_vs_proto_tcp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_tcp.c
@@ -28,7 +28,8 @@
#include <net/ip_vs.h>
static int
-tcp_csum_check(int af, struct sk_buff *skb, struct ip_vs_protocol *pp);
+tcp_csum_check(int af, struct sk_buff *skb, struct ip_vs_protocol *pp,
+ unsigned int tcphoff);
static int
tcp_conn_schedule(struct netns_ipvs *ipvs, int af, struct sk_buff *skb,
@@ -165,7 +166,7 @@ tcp_snat_handler(struct sk_buff *skb, struct ip_vs_protocol *pp,
int ret;
/* Some checks before mangling */
- if (!tcp_csum_check(cp->af, skb, pp))
+ if (!tcp_csum_check(cp->af, skb, pp, tcphoff))
return 0;
/* Call application helper if needed */
@@ -243,7 +244,7 @@ tcp_dnat_handler(struct sk_buff *skb, struct ip_vs_protocol *pp,
int ret;
/* Some checks before mangling */
- if (!tcp_csum_check(cp->af, skb, pp))
+ if (!tcp_csum_check(cp->af, skb, pp, tcphoff))
return 0;
/*
@@ -300,17 +301,9 @@ tcp_dnat_handler(struct sk_buff *skb, struct ip_vs_protocol *pp,
static int
-tcp_csum_check(int af, struct sk_buff *skb, struct ip_vs_protocol *pp)
+tcp_csum_check(int af, struct sk_buff *skb, struct ip_vs_protocol *pp,
+ unsigned int tcphoff)
{
- unsigned int tcphoff;
-
-#ifdef CONFIG_IP_VS_IPV6
- if (af == AF_INET6)
- tcphoff = sizeof(struct ipv6hdr);
- else
-#endif
- tcphoff = ip_hdrlen(skb);
-
switch (skb->ip_summed) {
case CHECKSUM_NONE:
skb->csum = skb_checksum(skb, tcphoff, skb->len - tcphoff, 0);
@@ -321,7 +314,7 @@ tcp_csum_check(int af, struct sk_buff *skb, struct ip_vs_protocol *pp)
if (csum_ipv6_magic(&ipv6_hdr(skb)->saddr,
&ipv6_hdr(skb)->daddr,
skb->len - tcphoff,
- ipv6_hdr(skb)->nexthdr,
+ IPPROTO_TCP,
skb->csum)) {
IP_VS_DBG_RL_PKT(0, af, pp, skb, 0,
"Failed checksum for");
diff --git a/net/netfilter/ipvs/ip_vs_proto_udp.c b/net/netfilter/ipvs/ip_vs_proto_udp.c
index 0f0107c80dd2..f9de632e38cd 100644
--- a/net/netfilter/ipvs/ip_vs_proto_udp.c
+++ b/net/netfilter/ipvs/ip_vs_proto_udp.c
@@ -24,7 +24,8 @@
#include <net/ip6_checksum.h>
static int
-udp_csum_check(int af, struct sk_buff *skb, struct ip_vs_protocol *pp);
+udp_csum_check(int af, struct sk_buff *skb, struct ip_vs_protocol *pp,
+ unsigned int udphoff);
static int
udp_conn_schedule(struct netns_ipvs *ipvs, int af, struct sk_buff *skb,
@@ -154,7 +155,7 @@ udp_snat_handler(struct sk_buff *skb, struct ip_vs_protocol *pp,
int ret;
/* Some checks before mangling */
- if (!udp_csum_check(cp->af, skb, pp))
+ if (!udp_csum_check(cp->af, skb, pp, udphoff))
return 0;
/*
@@ -237,7 +238,7 @@ udp_dnat_handler(struct sk_buff *skb, struct ip_vs_protocol *pp,
int ret;
/* Some checks before mangling */
- if (!udp_csum_check(cp->af, skb, pp))
+ if (!udp_csum_check(cp->af, skb, pp, udphoff))
return 0;
/*
@@ -296,17 +297,10 @@ udp_dnat_handler(struct sk_buff *skb, struct ip_vs_protocol *pp,
static int
-udp_csum_check(int af, struct sk_buff *skb, struct ip_vs_protocol *pp)
+udp_csum_check(int af, struct sk_buff *skb, struct ip_vs_protocol *pp,
+ unsigned int udphoff)
{
struct udphdr _udph, *uh;
- unsigned int udphoff;
-
-#ifdef CONFIG_IP_VS_IPV6
- if (af == AF_INET6)
- udphoff = sizeof(struct ipv6hdr);
- else
-#endif
- udphoff = ip_hdrlen(skb);
uh = skb_header_pointer(skb, udphoff, sizeof(_udph), &_udph);
if (uh == NULL)
@@ -324,7 +318,7 @@ udp_csum_check(int af, struct sk_buff *skb, struct ip_vs_protocol *pp)
if (csum_ipv6_magic(&ipv6_hdr(skb)->saddr,
&ipv6_hdr(skb)->daddr,
skb->len - udphoff,
- ipv6_hdr(skb)->nexthdr,
+ IPPROTO_UDP,
skb->csum)) {
IP_VS_DBG_RL_PKT(0, af, pp, skb, 0,
"Failed checksum for");
--
2.52.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net 08/10] ipvs: do not keep dest_dst if dev is going down
2026-02-17 16:32 [PATCH net 00/10] netfilter: updates for net Florian Westphal
` (6 preceding siblings ...)
2026-02-17 16:32 ` [PATCH net 07/10] ipvs: skip ipv6 extension headers for csum checks Florian Westphal
@ 2026-02-17 16:32 ` Florian Westphal
2026-02-17 16:32 ` [PATCH net 09/10] net: remove WARN_ON_ONCE when accessing forward path array Florian Westphal
2026-02-17 16:32 ` [PATCH net 10/10] netfilter: nf_tables: fix use-after-free in nf_tables_addchain() Florian Westphal
9 siblings, 0 replies; 12+ messages in thread
From: Florian Westphal @ 2026-02-17 16:32 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
From: Julian Anastasov <ja@ssi.bg>
There is race between the netdev notifier ip_vs_dst_event()
and the code that caches dst with dev that is going down.
As the FIB can be notified for the closed device after our
handler finishes, it is possible valid route to be returned
and cached resuling in a leaked dev reference until the dest
is not removed.
To prevent new dest_dst to be attached to dest just after the
handler dropped the old one, add a netif_running() check
to make sure the notifier handler is not currently running
for device that is closing.
Fixes: 7a4f0761fce3 ("IPVS: init and cleanup restructuring")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/ipvs/ip_vs_xmit.c | 46 ++++++++++++++++++++++++++-------
1 file changed, 36 insertions(+), 10 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c
index f861d116cc33..4389bfe3050d 100644
--- a/net/netfilter/ipvs/ip_vs_xmit.c
+++ b/net/netfilter/ipvs/ip_vs_xmit.c
@@ -294,6 +294,12 @@ static inline bool decrement_ttl(struct netns_ipvs *ipvs,
return true;
}
+/* rt has device that is down */
+static bool rt_dev_is_down(const struct net_device *dev)
+{
+ return dev && !netif_running(dev);
+}
+
/* Get route to destination or remote server */
static int
__ip_vs_get_out_rt(struct netns_ipvs *ipvs, int skb_af, struct sk_buff *skb,
@@ -309,9 +315,11 @@ __ip_vs_get_out_rt(struct netns_ipvs *ipvs, int skb_af, struct sk_buff *skb,
if (dest) {
dest_dst = __ip_vs_dst_check(dest);
- if (likely(dest_dst))
+ if (likely(dest_dst)) {
rt = dst_rtable(dest_dst->dst_cache);
- else {
+ if (ret_saddr)
+ *ret_saddr = dest_dst->dst_saddr.ip;
+ } else {
dest_dst = ip_vs_dest_dst_alloc();
spin_lock_bh(&dest->dst_lock);
if (!dest_dst) {
@@ -327,14 +335,22 @@ __ip_vs_get_out_rt(struct netns_ipvs *ipvs, int skb_af, struct sk_buff *skb,
ip_vs_dest_dst_free(dest_dst);
goto err_unreach;
}
- __ip_vs_dst_set(dest, dest_dst, &rt->dst, 0);
+ /* It is forbidden to attach dest->dest_dst if
+ * device is going down.
+ */
+ if (!rt_dev_is_down(dst_dev_rcu(&rt->dst)))
+ __ip_vs_dst_set(dest, dest_dst, &rt->dst, 0);
+ else
+ noref = 0;
spin_unlock_bh(&dest->dst_lock);
IP_VS_DBG(10, "new dst %pI4, src %pI4, refcnt=%d\n",
&dest->addr.ip, &dest_dst->dst_saddr.ip,
rcuref_read(&rt->dst.__rcuref));
+ if (ret_saddr)
+ *ret_saddr = dest_dst->dst_saddr.ip;
+ if (!noref)
+ ip_vs_dest_dst_free(dest_dst);
}
- if (ret_saddr)
- *ret_saddr = dest_dst->dst_saddr.ip;
} else {
noref = 0;
@@ -471,9 +487,11 @@ __ip_vs_get_out_rt_v6(struct netns_ipvs *ipvs, int skb_af, struct sk_buff *skb,
if (dest) {
dest_dst = __ip_vs_dst_check(dest);
- if (likely(dest_dst))
+ if (likely(dest_dst)) {
rt = dst_rt6_info(dest_dst->dst_cache);
- else {
+ if (ret_saddr)
+ *ret_saddr = dest_dst->dst_saddr.in6;
+ } else {
u32 cookie;
dest_dst = ip_vs_dest_dst_alloc();
@@ -494,14 +512,22 @@ __ip_vs_get_out_rt_v6(struct netns_ipvs *ipvs, int skb_af, struct sk_buff *skb,
}
rt = dst_rt6_info(dst);
cookie = rt6_get_cookie(rt);
- __ip_vs_dst_set(dest, dest_dst, &rt->dst, cookie);
+ /* It is forbidden to attach dest->dest_dst if
+ * device is going down.
+ */
+ if (!rt_dev_is_down(dst_dev_rcu(&rt->dst)))
+ __ip_vs_dst_set(dest, dest_dst, &rt->dst, cookie);
+ else
+ noref = 0;
spin_unlock_bh(&dest->dst_lock);
IP_VS_DBG(10, "new dst %pI6, src %pI6, refcnt=%d\n",
&dest->addr.in6, &dest_dst->dst_saddr.in6,
rcuref_read(&rt->dst.__rcuref));
+ if (ret_saddr)
+ *ret_saddr = dest_dst->dst_saddr.in6;
+ if (!noref)
+ ip_vs_dest_dst_free(dest_dst);
}
- if (ret_saddr)
- *ret_saddr = dest_dst->dst_saddr.in6;
} else {
noref = 0;
dst = __ip_vs_route_output_v6(net, daddr, ret_saddr, do_xfrm,
--
2.52.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net 09/10] net: remove WARN_ON_ONCE when accessing forward path array
2026-02-17 16:32 [PATCH net 00/10] netfilter: updates for net Florian Westphal
` (7 preceding siblings ...)
2026-02-17 16:32 ` [PATCH net 08/10] ipvs: do not keep dest_dst if dev is going down Florian Westphal
@ 2026-02-17 16:32 ` Florian Westphal
2026-02-17 16:32 ` [PATCH net 10/10] netfilter: nf_tables: fix use-after-free in nf_tables_addchain() Florian Westphal
9 siblings, 0 replies; 12+ messages in thread
From: Florian Westphal @ 2026-02-17 16:32 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
From: Pablo Neira Ayuso <pablo@netfilter.org>
Although unlikely, recent support for IPIP tunnels increases chances of
reaching this WARN_ON_ONCE if userspace manages to build a sufficiently
long forward path.
Remove it.
Fixes: ddb94eafab8b ("net: resolve forwarding path from virtual netdevice and HW destination address")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/core/dev.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index 96f89eb797e8..096b3ff13f6b 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -744,7 +744,7 @@ static struct net_device_path *dev_fwd_path(struct net_device_path_stack *stack)
{
int k = stack->num_paths++;
- if (WARN_ON_ONCE(k >= NET_DEVICE_PATH_STACK_MAX))
+ if (k >= NET_DEVICE_PATH_STACK_MAX)
return NULL;
return &stack->path[k];
--
2.52.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH net 10/10] netfilter: nf_tables: fix use-after-free in nf_tables_addchain()
2026-02-17 16:32 [PATCH net 00/10] netfilter: updates for net Florian Westphal
` (8 preceding siblings ...)
2026-02-17 16:32 ` [PATCH net 09/10] net: remove WARN_ON_ONCE when accessing forward path array Florian Westphal
@ 2026-02-17 16:32 ` Florian Westphal
9 siblings, 0 replies; 12+ messages in thread
From: Florian Westphal @ 2026-02-17 16:32 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
From: Inseo An <y0un9sa@gmail.com>
nf_tables_addchain() publishes the chain to table->chains via
list_add_tail_rcu() (in nft_chain_add()) before registering hooks.
If nf_tables_register_hook() then fails, the error path calls
nft_chain_del() (list_del_rcu()) followed by nf_tables_chain_destroy()
with no RCU grace period in between.
This creates two use-after-free conditions:
1) Control-plane: nf_tables_dump_chains() traverses table->chains
under rcu_read_lock(). A concurrent dump can still be walking
the chain when the error path frees it.
2) Packet path: for NFPROTO_INET, nf_register_net_hook() briefly
installs the IPv4 hook before IPv6 registration fails. Packets
entering nft_do_chain() via the transient IPv4 hook can still be
dereferencing chain->blob_gen_X when the error path frees the
chain.
Add synchronize_rcu() between nft_chain_del() and the chain destroy
so that all RCU readers -- both dump threads and in-flight packet
evaluation -- have finished before the chain is freed.
Fixes: 91c7b38dc9f0 ("netfilter: nf_tables: use new transaction infrastructure to handle chain")
Signed-off-by: Inseo An <y0un9sa@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nf_tables_api.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 819056ea1ce1..0c5a4855b97d 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -2823,6 +2823,7 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 policy,
err_register_hook:
nft_chain_del(chain);
+ synchronize_rcu();
err_chain_add:
nft_trans_destroy(trans);
err_trans:
--
2.52.0
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH net 01/10] netfilter: annotate NAT helper hook pointers with __rcu
2026-02-17 16:32 ` [PATCH net 01/10] netfilter: annotate NAT helper hook pointers with __rcu Florian Westphal
@ 2026-02-19 1:20 ` patchwork-bot+netdevbpf
0 siblings, 0 replies; 12+ messages in thread
From: patchwork-bot+netdevbpf @ 2026-02-19 1:20 UTC (permalink / raw)
To: Florian Westphal
Cc: netdev, pabeni, davem, edumazet, kuba, netfilter-devel, pablo
Hello:
This series was applied to netdev/net.git (main)
by Florian Westphal <fw@strlen.de>:
On Tue, 17 Feb 2026 17:32:24 +0100 you wrote:
> From: Sun Jian <sun.jian.kdev@gmail.com>
>
> The NAT helper hook pointers are updated and dereferenced under RCU rules,
> but lack the proper __rcu annotation.
>
> This makes sparse report address space mismatches when the hooks are used
> with rcu_dereference().
>
> [...]
Here is the summary with links:
- [net,01/10] netfilter: annotate NAT helper hook pointers with __rcu
https://git.kernel.org/netdev/net/c/07919126ecfc
- [net,02/10] netfilter: nft_counter: serialize reset with spinlock
https://git.kernel.org/netdev/net/c/779c60a5190c
- [net,03/10] netfilter: nft_quota: use atomic64_xchg for reset
https://git.kernel.org/netdev/net/c/30c4d7fb59ac
- [net,04/10] netfilter: nf_tables: revert commit_mutex usage in reset path
https://git.kernel.org/netdev/net/c/7f261bb906bf
- [net,05/10] netfilter: nf_conntrack_h323: don't pass uninitialised l3num value
https://git.kernel.org/netdev/net/c/a6d28eb8efe9
- [net,06/10] include: uapi: netfilter_bridge.h: Cover for musl libc
https://git.kernel.org/netdev/net/c/4edd4ba71ce0
- [net,07/10] ipvs: skip ipv6 extension headers for csum checks
https://git.kernel.org/netdev/net/c/05cfe9863ef0
- [net,08/10] ipvs: do not keep dest_dst if dev is going down
https://git.kernel.org/netdev/net/c/8fde939b0206
- [net,09/10] net: remove WARN_ON_ONCE when accessing forward path array
https://git.kernel.org/netdev/net/c/008e7a7c293b
- [net,10/10] netfilter: nf_tables: fix use-after-free in nf_tables_addchain()
https://git.kernel.org/netdev/net/c/71e99ee20fc3
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2026-02-19 1:20 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-17 16:32 [PATCH net 00/10] netfilter: updates for net Florian Westphal
2026-02-17 16:32 ` [PATCH net 01/10] netfilter: annotate NAT helper hook pointers with __rcu Florian Westphal
2026-02-19 1:20 ` patchwork-bot+netdevbpf
2026-02-17 16:32 ` [PATCH net 02/10] netfilter: nft_counter: serialize reset with spinlock Florian Westphal
2026-02-17 16:32 ` [PATCH net 03/10] netfilter: nft_quota: use atomic64_xchg for reset Florian Westphal
2026-02-17 16:32 ` [PATCH net 04/10] netfilter: nf_tables: revert commit_mutex usage in reset path Florian Westphal
2026-02-17 16:32 ` [PATCH net 05/10] netfilter: nf_conntrack_h323: don't pass uninitialised l3num value Florian Westphal
2026-02-17 16:32 ` [PATCH net 06/10] include: uapi: netfilter_bridge.h: Cover for musl libc Florian Westphal
2026-02-17 16:32 ` [PATCH net 07/10] ipvs: skip ipv6 extension headers for csum checks Florian Westphal
2026-02-17 16:32 ` [PATCH net 08/10] ipvs: do not keep dest_dst if dev is going down Florian Westphal
2026-02-17 16:32 ` [PATCH net 09/10] net: remove WARN_ON_ONCE when accessing forward path array Florian Westphal
2026-02-17 16:32 ` [PATCH net 10/10] netfilter: nf_tables: fix use-after-free in nf_tables_addchain() Florian Westphal
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox