* [PATCH net 0/9] netfilter: updates for net
@ 2026-06-30 4:52 Florian Westphal
0 siblings, 0 replies; 11+ messages in thread
From: Florian Westphal @ 2026-06-30 4:52 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
Hi,
The following patchset contains Netfilter fixes for *net*.
Due to bug volume the plan is to make a second *net* pull request
this Friday.
1) Zero nf_conntrack_expect at allocation to prevent uninitialized data
leaks to userspace. Add missing exp->dir initialization.
2) Prevent out-of-bounds writes in nft_set_pipapo caused by inconsistent
clones during allocation failures. Fail operations if the clone enters an
error state. This was a day-0 bug.
3) Fix use-after-free race between ipset dump and array resizing. Protect
array pointer access with rcu_read_lock(). From Xiang Mei. Bug existed
since v4.20.
4) Validate skb_dst() exists before access in nf_conntrack_sip.
This prevents a crash when called from tc ingress or openvswitch.
From Pablo Neira Ayuso. Bug added in 4.3 when ovs gained support
for conntrack helpers.
5) Cap the maximum number of expectations to NF_CT_EXPECT_MAX_CNT during
userspace helper policy updates. Also from Pablo.
6) Prevent NULL pointer dereference in nft_fib on netdev egress hooks. Add
nft_fib_netdev_validate() to restrict fib expressions to appropriate
netdev hooks. Restrict nft_fib_validate() to IPv4, IPv6, and INET
protocols. From Theodor Arsenij Larionov-Trichkine.
Bug was exposed in v5.16 when egress hooks got added.
7) Restrict nfnetlink_queue writes to network headers. Validate IP/IPv6
header length and disable extension headers or IP option modifications.
Disable bridge modification for now; it's unlikely anyone is using this.
8) Restrict arbitrary writes to link-layer and network headers in nftables.
Prevent link-layer modifications from spilling into network headers.
Prevent writes to IP version and length fields.
9) Restrict L3 checksum update offset to IPv4. Else csum offset can be
used to munge arbitrary header offsets, rendering the previous change moot.
These three patches are follow-ups to a 7.1 change that disabled
header rewrite ability in unprivileged network namespaces.
Unprivileged netns support is not yet re-enabled here.
Please, pull these changes from:
The following changes since commit 1398b1014909618f65ff6bcebcb2ee5ccd44fdc0:
MAINTAINERS: Update Jason Wang's email address (2026-06-29 19:09:00 -0700)
are available in the Git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf.git tags/nf-26-06-30
for you to fetch changes up to e2c4a0c805f7be21c8288e8562145a6691e11559:
netfilter: nftables: restrict checkum update offset (2026-06-30 06:37:12 +0200)
----------------------------------------------------------------
netfilter pull request nf-26-06-30
----------------------------------------------------------------
Florian Westphal (5):
netfilter: nf_conntrack_expect: zero at allocation time
netfilter: nft_set_pipapo: don't leak bad clone into future transaction
netfilter: nfnetlink_queue: restrict writes to network header
netfilter: nftables: restrict linklayer and network header writes
netfilter: nftables: restrict checkum update offset
Pablo Neira Ayuso (2):
netfilter: nf_conntrack_sip: validate skb_dst() before accessing it
netfilter: nfnetlink_cthelper: cap to maximum number of expectation per master
Theodor Arsenij Larionov-Trichkine (1):
netfilter: nft_fib: reject fib expression on the netdev egress hook
Xiang Mei (1):
netfilter: ipset: fix race between dump and ip_set_list resize
net/netfilter/ipset/ip_set_core.c | 8 +-
net/netfilter/nf_conntrack_expect.c | 3 +-
net/netfilter/nf_conntrack_netlink.c | 11 +-
net/netfilter/nf_conntrack_sip.c | 7 +-
net/netfilter/nfnetlink_cthelper.c | 2 +
net/netfilter/nfnetlink_queue.c | 170 +++++++++++++++++
net/netfilter/nft_fib.c | 9 +
net/netfilter/nft_fib_netdev.c | 29 ++-
net/netfilter/nft_payload.c | 270 +++++++++++++++++++++++++++
net/netfilter/nft_set_pipapo.c | 34 +++-
net/netfilter/nft_set_pipapo.h | 8 +
11 files changed, 531 insertions(+), 20 deletions(-)
--
2.53.0
* [PATCH net 0/9] netfilter: updates for net
@ 2026-07-03 12:57 Florian Westphal
2026-07-03 12:57 ` [PATCH net 1/9] netfilter: nf_nat_sip: reload possible stale data pointer Florian Westphal
` (8 more replies)
0 siblings, 9 replies; 11+ messages in thread
From: Florian Westphal @ 2026-07-03 12:57 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
Hi,
The following patchset contains Netfilter fixes for *net*, all
for ancient problems. Patch 7 raised drive-by sashiko findings,
but those are not related to the change itself.
1) Rebuild the nf_nat_sip data pointer to prevent stale access after SKB
reallocation. Restrict UDP mangling to UDP streams to avoid TCP packet
corruption.
2) Prevent undefined behavior in xt_u32 caused by invalid shift counts.
From Wyatt Feng.
3) Use u64 variables to prevent incorrect comparisons on links exceeding
34 Gbps in xt_rateest. From Feng Wu.
4) Cap the number of expectations per master during nfnetlink_cthelper
updates. From Pablo Neira Ayuso.
5) Mark malformed IPv6 extension headers for hotdrop in ip6tables.
From Zhixing Chen.
6) Skip the end element of an open interval during the get command when its
closest match is the interval's start element. Also from Pablo Neira Ayuso.
7) Fix PMTU calculation for GUE/GRE tunnels in IPVS during ICMP fragmentation
error handling. Include additional tunnel header length when computing the
new MTU. From Yizhou Zhao.
8) Reset full ip_vs_seq structures in ip_vs_conn_new. Also from Yizhou Zhao.
9) Reject invalid shift parameters in xt_connmark. Also from Wyatt Feng.
Please, pull these changes from:
The following changes since commit d335dcc6f521571d57117b8deeebc940836e5450:
gue: validate REMCSUM private option length (2026-07-03 09:34:53 +0100)
are available in the Git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf.git tags/nf-26-07-03
for you to fetch changes up to 1b47026fb4b35bac850ad6e8a4ad7fc018e09ebc:
netfilter: xt_connmark: reject invalid shift parameters (2026-07-03 14:45:21 +0200)
----------------------------------------------------------------
netfilter pull request nf-26-07-03
----------------------------------------------------------------
Feng Wu (1):
netfilter: xt_rateest: fix u64 truncation in xt_rateest_mt()
Florian Westphal (1):
netfilter: nf_nat_sip: reload possible stale data pointer
Pablo Neira Ayuso (2):
netfilter: nfnetlink_cthelper: cap to maximum number of expectation
per master on updates
netfilter: nft_set_rbtree: get command skips end element with open
interval
Wyatt Feng (2):
netfilter: xt_u32: reject invalid shift counts
netfilter: xt_connmark: reject invalid shift parameters
Yizhou Zhao (2):
ipvs: fix PMTU for GUE/GRE tunnel ICMP errors
ipvs: reset full ip_vs_seq structs in ip_vs_conn_new
Zhixing Chen (1):
netfilter: ip6tables: mark malformed IPv6 extension headers for hotdrop
net/ipv6/netfilter/ip6t_ah.c | 5 +++++
net/ipv6/netfilter/ip6t_hbh.c | 1 +
net/ipv6/netfilter/ip6t_rt.c | 3 ++-
net/netfilter/ipvs/ip_vs_conn.c | 4 ++--
net/netfilter/ipvs/ip_vs_core.c | 6 +++---
net/netfilter/nf_nat_sip.c | 11 +++++++++++
net/netfilter/nf_tables_api.c | 3 +++
net/netfilter/nfnetlink_cthelper.c | 2 ++
net/netfilter/nft_set_rbtree.c | 8 ++++++--
net/netfilter/xt_connmark.c | 14 ++++++++++++--
net/netfilter/xt_rateest.c | 2 +-
net/netfilter/xt_u32.c | 12 +++++++++++-
12 files changed, 59 insertions(+), 12 deletions(-)
--
2.54.0
* [PATCH net 1/9] netfilter: nf_nat_sip: reload possible stale data pointer
2026-07-03 12:57 [PATCH net 0/9] netfilter: updates for net Florian Westphal
@ 2026-07-03 12:57 ` Florian Westphal
2026-07-03 12:57 ` [PATCH net 2/9] netfilter: xt_u32: reject invalid shift counts Florian Westphal
` (7 subsequent siblings)
8 siblings, 0 replies; 11+ messages in thread
From: Florian Westphal @ 2026-07-03 12:57 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
quoting sashiko:
------------------------------------------------------------------------
[..] noticed a potential memory bug and header corruption involving the
SIP NAT helper.
In net/netfilter/nf_nat_sip.c:nf_nat_sip():
if (skb_ensure_writable(skb, skb->len)) {
nf_ct_helper_log(skb, ct, "cannot mangle packet");
return NF_DROP;
}
uh = (void *)skb->data + protoff;
uh->dest = ct_sip_info->forced_dport;
if (!nf_nat_mangle_udp_packet(skb, ct, ctinfo, protoff,
0, 0, NULL, 0)) {
If a cloned or fragmented SKB is reallocated by skb_ensure_writable(), the
old data buffer is freed. However, nf_nat_sip() fails to update *dptr to
point to the new buffer.
It also appears to use nf_nat_mangle_udp_packet() on what could be a TCP
packet, which would overwrite the sequence number with a checksum update.
------------------------------------------------------------------------
nf_conntrack_sip linearizes skbs, hence no fragmented skb can be seen.
But clones are possible, so rebuild dptr.
Disable nf_nat_mangle_udp_packet() branch for TCP streams.
It doesn't look like this can ever happen, else we should have received
bug reports about this, so just check the conntrack is UDP and drop
otherwise.
The calling conntrack_sip sets ->forced_dport for SIP_HDR_VIA_UDP messages,
so I don't think this is ever expected to be true for a TCP stream.
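The pointer-reload idea can be illustrated outside the kernel. The following is a minimal userspace sketch, not the kernel code: struct buf, buf_make_writable() and reload_dptr() are hypothetical stand-ins for the skb, skb_ensure_writable() and the dptr fixup. It only shows why the offset has to be saved before the buffer may move.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an skb: one heap buffer that may move. */
struct buf {
	char *data;
	size_t len;
};

/* Mimics a "make writable" step that copies the payload into a new
 * buffer and frees the old one. Returns 0 on success, -1 on failure. */
static int buf_make_writable(struct buf *b)
{
	char *copy = malloc(b->len);

	if (!copy)
		return -1;
	memcpy(copy, b->data, b->len);
	free(b->data);
	b->data = copy;
	return 0;
}

/* Save the offset of *dptr, let the buffer move, then rebuild *dptr. */
static int reload_dptr(struct buf *b, const char **dptr)
{
	ptrdiff_t doff = *dptr - b->data;

	if (doff <= 0 || (size_t)doff >= b->len)
		return -1;	/* pointer does not lie inside the payload */
	if (buf_make_writable(b))
		return -1;
	*dptr = b->data + doff;	/* old pointer is stale, rebuild from offset */
	return 0;
}
```

Keeping an offset rather than a pointer across the call is what makes the fixup independent of whether the buffer actually moved.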
Fixes: 7266507d8999 ("netfilter: nf_ct_sip: support Cisco 7941/7945 IP phones")
Cc: stable@vger.kernel.org
Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nf_nat_sip.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/net/netfilter/nf_nat_sip.c b/net/netfilter/nf_nat_sip.c
index 67c04d8143ab..aea02f6aff09 100644
--- a/net/netfilter/nf_nat_sip.c
+++ b/net/netfilter/nf_nat_sip.c
@@ -289,13 +289,24 @@ static unsigned int nf_nat_sip(struct sk_buff *skb, unsigned int protoff,
/* Mangle destination port for Cisco phones, then fix up checksums */
if (dir == IP_CT_DIR_REPLY && ct_sip_info->forced_dport) {
+ int doff = *dptr - (const char *)skb->data;
struct udphdr *uh;
+ if (doff <= 0) {
+ DEBUG_NET_WARN_ON_ONCE(1);
+ return NF_DROP;
+ }
+
+ /* ct_sip_info->forced_dport only expected with UDP */
+ if (nf_ct_protonum(ct) != IPPROTO_UDP)
+ return NF_DROP;
+
if (skb_ensure_writable(skb, skb->len)) {
nf_ct_helper_log(skb, ct, "cannot mangle packet");
return NF_DROP;
}
+ *dptr = skb->data + doff;
uh = (void *)skb->data + protoff;
uh->dest = ct_sip_info->forced_dport;
--
2.54.0
* [PATCH net 2/9] netfilter: xt_u32: reject invalid shift counts
2026-07-03 12:57 [PATCH net 0/9] netfilter: updates for net Florian Westphal
2026-07-03 12:57 ` [PATCH net 1/9] netfilter: nf_nat_sip: reload possible stale data pointer Florian Westphal
@ 2026-07-03 12:57 ` Florian Westphal
2026-07-03 12:57 ` [PATCH net 3/9] netfilter: xt_rateest: fix u64 truncation in xt_rateest_mt() Florian Westphal
` (6 subsequent siblings)
8 siblings, 0 replies; 11+ messages in thread
From: Florian Westphal @ 2026-07-03 12:57 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
From: Wyatt Feng <bronzed_45_vested@icloud.com>
u32_match_it() executes rule-supplied shift operands on a 32-bit
value. A malformed u32 rule can provide a shift count of 32 or more,
triggering an undefined shift out-of-bounds during packet evaluation.
Validate XT_U32_LEFTSH and XT_U32_RIGHTSH operands in
u32_mt_checkentry() and reject malformed rules before they reach the
packet path.
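As a rough illustration of validating at rule-load time, here is a standalone sketch. The enum and function names are hypothetical and do not mirror the xt_u32 definitions; the point is only that a count of 32 or more is rejected before any shift on a 32-bit value is executed.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical operator set, loosely modelled on the u32 match. */
enum shift_op { OP_AND, OP_LEFTSH, OP_RIGHTSH };

/* Load-time check: shifting a 32-bit value by >= 32 is undefined in C. */
static int check_shift(enum shift_op op, uint32_t count)
{
	if ((op == OP_LEFTSH || op == OP_RIGHTSH) && count >= 32)
		return -EINVAL;
	return 0;
}

/* Packet-path evaluation; only safe for operands accepted above. */
static uint32_t apply_op(uint32_t val, enum shift_op op, uint32_t number)
{
	switch (op) {
	case OP_LEFTSH:
		return val << number;
	case OP_RIGHTSH:
		return val >> number;
	default:
		return val & number;
	}
}
```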
Fixes: 1b50b8a371e9 ("[NETFILTER]: Add u32 match")
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Yifan Wu <yifanwucs@gmail.com>
Reported-by: Juefei Pu <tomapufckgml@gmail.com>
Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
Reported-by: Xin Liu <bird@lzu.edu.cn>
Assisted-by: Codex:GPT-5.4
Signed-off-by: Wyatt Feng <bronzed_45_vested@icloud.com>
Signed-off-by: Ren Wei <n05ec@lzu.edu.cn>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/xt_u32.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/net/netfilter/xt_u32.c b/net/netfilter/xt_u32.c
index 117d4615d668..ec1a21e3b6e2 100644
--- a/net/netfilter/xt_u32.c
+++ b/net/netfilter/xt_u32.c
@@ -100,7 +100,7 @@ static int u32_mt_checkentry(const struct xt_mtchk_param *par)
{
const struct xt_u32 *data = par->matchinfo;
const struct xt_u32_test *ct;
- unsigned int i;
+ unsigned int i, j;
if (data->ntests > ARRAY_SIZE(data->tests))
return -EINVAL;
@@ -111,6 +111,16 @@ static int u32_mt_checkentry(const struct xt_mtchk_param *par)
if (ct->nnums > ARRAY_SIZE(ct->location) ||
ct->nvalues > ARRAY_SIZE(ct->value))
return -EINVAL;
+
+ for (j = 1; j < ct->nnums; ++j) {
+ switch (ct->location[j].nextop) {
+ case XT_U32_LEFTSH:
+ case XT_U32_RIGHTSH:
+ if (ct->location[j].number >= 32)
+ return -EINVAL;
+ break;
+ }
+ }
}
return 0;
--
2.54.0
* [PATCH net 3/9] netfilter: xt_rateest: fix u64 truncation in xt_rateest_mt()
2026-07-03 12:57 [PATCH net 0/9] netfilter: updates for net Florian Westphal
2026-07-03 12:57 ` [PATCH net 1/9] netfilter: nf_nat_sip: reload possible stale data pointer Florian Westphal
2026-07-03 12:57 ` [PATCH net 2/9] netfilter: xt_u32: reject invalid shift counts Florian Westphal
@ 2026-07-03 12:57 ` Florian Westphal
2026-07-03 12:57 ` [PATCH net 4/9] netfilter: nfnetlink_cthelper: cap to maximum number of expectation per master on updates Florian Westphal
` (5 subsequent siblings)
8 siblings, 0 replies; 11+ messages in thread
From: Florian Westphal @ 2026-07-03 12:57 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
From: Feng Wu <wufengwufengwufeng@gmail.com>
On links faster than ~34 Gbps, where byte rate may exceed 2^32-1
(~ 4.3 GBps), the comparison result becomes incorrect because the
truncated value no longer reflects the actual estimator rate.
Fix by changing the local variables to u64.
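The effect of the truncation can be reproduced with plain integers. This is a sketch with hypothetical helper names, not the xt_rateest code: a rate of 5,000,000,000 bytes/s wraps to 705,032,704 when stored in 32 bits, so it compares as smaller than 1,000,000,000 bytes/s.

```c
#include <stdbool.h>
#include <stdint.h>

/* Old shape: 64-bit rates are truncated to 32 bits before comparing. */
static bool rate_gt_truncated(uint64_t rate1, uint64_t rate2)
{
	uint32_t bps1 = (uint32_t)rate1;
	uint32_t bps2 = (uint32_t)rate2;

	return bps1 > bps2;
}

/* Fixed shape: keep the full 64-bit values. */
static bool rate_gt_full(uint64_t rate1, uint64_t rate2)
{
	uint64_t bps1 = rate1;
	uint64_t bps2 = rate2;

	return bps1 > bps2;
}
```

The ~34 Gbps threshold follows from 2^32 bytes/s * 8 bits = 34,359,738,368 bit/s.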
Fixes: 1c0d32fde5bd ("net_sched: gen_estimator: complete rewrite of rate estimators")
Signed-off-by: Feng Wu <wufengwufengwufeng@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/xt_rateest.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/netfilter/xt_rateest.c b/net/netfilter/xt_rateest.c
index b1d736c15fcb..7c05b6342578 100644
--- a/net/netfilter/xt_rateest.c
+++ b/net/netfilter/xt_rateest.c
@@ -16,7 +16,7 @@ xt_rateest_mt(const struct sk_buff *skb, struct xt_action_param *par)
{
const struct xt_rateest_match_info *info = par->matchinfo;
struct gnet_stats_rate_est64 sample = {0};
- u_int32_t bps1, bps2, pps1, pps2;
+ u64 bps1, bps2, pps1, pps2;
bool ret = true;
gen_estimator_read(&info->est1->rate_est, &sample);
--
2.54.0
* [PATCH net 4/9] netfilter: nfnetlink_cthelper: cap to maximum number of expectation per master on updates
2026-07-03 12:57 [PATCH net 0/9] netfilter: updates for net Florian Westphal
` (2 preceding siblings ...)
2026-07-03 12:57 ` [PATCH net 3/9] netfilter: xt_rateest: fix u64 truncation in xt_rateest_mt() Florian Westphal
@ 2026-07-03 12:57 ` Florian Westphal
2026-07-03 12:57 ` [PATCH net 5/9] netfilter: ip6tables: mark malformed IPv6 extension headers for hotdrop Florian Westphal
` (4 subsequent siblings)
8 siblings, 0 replies; 11+ messages in thread
From: Florian Westphal @ 2026-07-03 12:57 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
From: Pablo Neira Ayuso <pablo@netfilter.org>
Really cap it to NF_CT_EXPECT_MAX_CNT (255) on updates.
The commit ("netfilter: nfnetlink_cthelper: cap to maximum number of
expectation per master") only covers creation of helpers, not updates.
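A standalone sketch of the resulting update-path logic, under the assumption that a value of 0 is treated as "use the maximum", as in the diff below. EXPECT_MAX_CNT and normalize_max_expected() are illustrative names; only the value 255 comes from the commit message.

```c
#include <errno.h>
#include <stdint.h>

/* Mirrors NF_CT_EXPECT_MAX_CNT (255) as stated in the commit message. */
#define EXPECT_MAX_CNT 255u

/* Hypothetical helper: normalise a userspace-supplied limit.
 * 0 is replaced by the maximum; larger values are rejected. */
static int normalize_max_expected(uint32_t requested, uint32_t *out)
{
	if (requested == 0)
		requested = EXPECT_MAX_CNT;
	if (requested > EXPECT_MAX_CNT)
		return -EINVAL;
	*out = requested;
	return 0;
}
```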
Fixes: 397c8300972f ("netfilter: nf_conntrack_helper: cap maximum number of expectation at helper registration")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nfnetlink_cthelper.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/net/netfilter/nfnetlink_cthelper.c b/net/netfilter/nfnetlink_cthelper.c
index 2cbcca9110db..f062ac210343 100644
--- a/net/netfilter/nfnetlink_cthelper.c
+++ b/net/netfilter/nfnetlink_cthelper.c
@@ -316,6 +316,8 @@ nfnl_cthelper_update_policy_one(const struct nf_conntrack_expect_policy *policy,
new_policy->max_expected =
ntohl(nla_get_be32(tb[NFCTH_POLICY_EXPECT_MAX]));
+ if (!new_policy->max_expected)
+ new_policy->max_expected = NF_CT_EXPECT_MAX_CNT;
if (new_policy->max_expected > NF_CT_EXPECT_MAX_CNT)
return -EINVAL;
--
2.54.0
* [PATCH net 5/9] netfilter: ip6tables: mark malformed IPv6 extension headers for hotdrop
2026-07-03 12:57 [PATCH net 0/9] netfilter: updates for net Florian Westphal
` (3 preceding siblings ...)
2026-07-03 12:57 ` [PATCH net 4/9] netfilter: nfnetlink_cthelper: cap to maximum number of expectation per master on updates Florian Westphal
@ 2026-07-03 12:57 ` Florian Westphal
2026-07-03 12:57 ` [PATCH net 6/9] netfilter: nft_set_rbtree: get command skips end element with open interval Florian Westphal
` (3 subsequent siblings)
8 siblings, 0 replies; 11+ messages in thread
From: Florian Westphal @ 2026-07-03 12:57 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
From: Zhixing Chen <running910@gmail.com>
The ah, hbh and rt matches check that the fixed extension header is
present, then use the header length field to derive the advertised
extension header length for matching.
For the ah match, add the missing advertised-length check. For hbh
and rt, update the existing advertised-length checks. In all three
cases, set hotdrop to true before returning false when the advertised
extension header length exceeds the available skb data.
Returning false treats the packet as a rule mismatch. Set hotdrop to
true and drop malformed packets so they cannot bypass rules intended
to drop packets with these IPv6 extension headers.
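The difference between "no match" and "hotdrop" can be sketched in isolation. The struct and function names below are hypothetical; opt_len() follows the IPv6 rule that Hdr Ext Len counts 8-octet units beyond the first 8 octets.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: a mismatch lets later rules run, while
 * hotdrop tells the caller to drop the packet immediately. */
struct match_result {
	bool matched;
	bool hotdrop;
};

/* Advertised length of an options-style extension header. */
static size_t opt_len(uint8_t hdrlen)
{
	return ((size_t)hdrlen + 1) * 8;
}

/* Hotdrop when the header claims more bytes than the packet holds. */
static struct match_result check_ext_hdr(size_t pkt_len, size_t hdr_off,
					 size_t advertised_len)
{
	struct match_result r = { false, false };

	if (hdr_off > pkt_len || pkt_len - hdr_off < advertised_len) {
		r.hotdrop = true;	/* malformed: do not treat as mismatch */
		return r;
	}
	r.matched = true;	/* real matches would inspect fields here */
	return r;
}
```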
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Zhixing Chen <running910@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/ipv6/netfilter/ip6t_ah.c | 5 +++++
net/ipv6/netfilter/ip6t_hbh.c | 1 +
net/ipv6/netfilter/ip6t_rt.c | 3 ++-
3 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/net/ipv6/netfilter/ip6t_ah.c b/net/ipv6/netfilter/ip6t_ah.c
index 70da2f2ce064..1258783ed876 100644
--- a/net/ipv6/netfilter/ip6t_ah.c
+++ b/net/ipv6/netfilter/ip6t_ah.c
@@ -56,6 +56,11 @@ static bool ah_mt6(const struct sk_buff *skb, struct xt_action_param *par)
}
hdrlen = ipv6_authlen(ah);
+ if (skb->len - ptr < hdrlen) {
+ /* Packet smaller than its length field */
+ par->hotdrop = true;
+ return false;
+ }
pr_debug("IPv6 AH LEN %u %u ", hdrlen, ah->hdrlen);
pr_debug("RES %04X ", ah->reserved);
diff --git a/net/ipv6/netfilter/ip6t_hbh.c b/net/ipv6/netfilter/ip6t_hbh.c
index 450dd53846a2..6d1a5d2026a6 100644
--- a/net/ipv6/netfilter/ip6t_hbh.c
+++ b/net/ipv6/netfilter/ip6t_hbh.c
@@ -75,6 +75,7 @@ hbh_mt6(const struct sk_buff *skb, struct xt_action_param *par)
hdrlen = ipv6_optlen(oh);
if (skb->len - ptr < hdrlen) {
/* Packet smaller than it's length field */
+ par->hotdrop = true;
return false;
}
diff --git a/net/ipv6/netfilter/ip6t_rt.c b/net/ipv6/netfilter/ip6t_rt.c
index 5561bd9cea81..278b52752f36 100644
--- a/net/ipv6/netfilter/ip6t_rt.c
+++ b/net/ipv6/netfilter/ip6t_rt.c
@@ -56,7 +56,8 @@ static bool rt_mt6(const struct sk_buff *skb, struct xt_action_param *par)
hdrlen = ipv6_optlen(rh);
if (skb->len - ptr < hdrlen) {
- /* Pcket smaller than its length field */
+ /* Packet smaller than its length field */
+ par->hotdrop = true;
return false;
}
--
2.54.0
* [PATCH net 6/9] netfilter: nft_set_rbtree: get command skips end element with open interval
2026-07-03 12:57 [PATCH net 0/9] netfilter: updates for net Florian Westphal
` (4 preceding siblings ...)
2026-07-03 12:57 ` [PATCH net 5/9] netfilter: ip6tables: mark malformed IPv6 extension headers for hotdrop Florian Westphal
@ 2026-07-03 12:57 ` Florian Westphal
2026-07-03 12:57 ` [PATCH net 7/9] ipvs: fix PMTU for GUE/GRE tunnel ICMP errors Florian Westphal
` (2 subsequent siblings)
8 siblings, 0 replies; 11+ messages in thread
From: Florian Westphal @ 2026-07-03 12:57 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
From: Pablo Neira Ayuso <pablo@netfilter.org>
The get command on intervals provides partial matches such as subranges
for usability reasons. However, an open interval has no closing end
element. If the closing element matches within the range of the open
interval, i.e. its closest match is the start element of the open range,
then return 0 but offer no matching element to userspace through
netlink as a special case. Userspace provides at least a matching start
element in this case and the closing end element matching the open
interval is ignored.
Another possibility is to report the matching start element of the open
interval for this end interval. However, this results in duplicate
matches being listed in userspace because userspace does not expect a
start element as response to an end element.
Fixes: 2aa34191f06f ("netfilter: nft_set_rbtree: use binary search array in get command")
Reported-by: Melbin K Mathew <mlbnkm1@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/nf_tables_api.c | 3 +++
net/netfilter/nft_set_rbtree.c | 8 ++++++--
2 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 4884f7f7aaee..a9eaf9455c77 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -6563,6 +6563,9 @@ static int nft_get_set_elem(struct nft_ctx *ctx, const struct nft_set *set,
if (err < 0)
return err;
+ if (!elem.priv)
+ return 0;
+
err = -ENOMEM;
skb = nlmsg_new(NLMSG_GOODSIZE, GFP_ATOMIC);
if (skb == NULL)
diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
index 018bbb6df4ce..6222e9bb57bc 100644
--- a/net/netfilter/nft_set_rbtree.c
+++ b/net/netfilter/nft_set_rbtree.c
@@ -184,10 +184,14 @@ nft_rbtree_get(const struct net *net, const struct nft_set *set,
if (!interval || nft_set_elem_expired(interval->from))
return ERR_PTR(-ENOENT);
- if (flags & NFT_SET_ELEM_INTERVAL_END)
+ if (flags & NFT_SET_ELEM_INTERVAL_END) {
+ if (!interval->to)
+ return NULL;
+
rbe = container_of(interval->to, struct nft_rbtree_elem, ext);
- else
+ } else {
rbe = container_of(interval->from, struct nft_rbtree_elem, ext);
+ }
return &rbe->priv;
}
--
2.54.0
* [PATCH net 7/9] ipvs: fix PMTU for GUE/GRE tunnel ICMP errors
2026-07-03 12:57 [PATCH net 0/9] netfilter: updates for net Florian Westphal
` (5 preceding siblings ...)
2026-07-03 12:57 ` [PATCH net 6/9] netfilter: nft_set_rbtree: get command skips end element with open interval Florian Westphal
@ 2026-07-03 12:57 ` Florian Westphal
2026-07-03 12:57 ` [PATCH net 8/9] ipvs: reset full ip_vs_seq structs in ip_vs_conn_new Florian Westphal
2026-07-03 12:57 ` [PATCH net 9/9] netfilter: xt_connmark: reject invalid shift parameters Florian Westphal
8 siblings, 0 replies; 11+ messages in thread
From: Florian Westphal @ 2026-07-03 12:57 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
From: Yizhou Zhao <zhaoyz24@mails.tsinghua.edu.cn>
When an ICMP Fragmentation Needed error is received for a tunneled IPVS
connection, ip_vs_in_icmp() recomputes the MTU that the original packet
can use by subtracting the tunnel overhead from the reported next-hop
MTU.
The current code always subtracts sizeof(struct iphdr), which is only
the IPIP overhead. For GUE and GRE tunnels, ipvs_udp_decap() and
ipvs_gre_decap() already compute the additional tunnel header length,
but that value is scoped to the decapsulation block and is lost before
the ICMP_FRAG_NEEDED handling. As a result, the ICMP error sent back to
the client advertises an MTU that is too large, so PMTUD can fail to
converge for GUE/GRE-tunneled real servers.
With a reported next-hop MTU of 1400, a GUE tunnel currently returns
1380 to the client. The correct value is 1368:
1400 - sizeof(struct iphdr) - sizeof(struct udphdr) -
sizeof(struct guehdr)
Hoist the tunnel header length into the main ip_vs_in_icmp() scope and
subtract sizeof(struct iphdr) + ulen in the Fragmentation Needed path.
The IPIP path keeps ulen as 0, so its existing 1400 - 20 = 1380 result
is unchanged.
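The arithmetic above can be checked with a small sketch. The header sizes are assumptions consistent with the 1368 figure (20-byte IPv4 header without options, 8-byte UDP header, 4-byte GUE header), and inner_mtu() is a hypothetical helper that mirrors the shape of the patched condition.

```c
/* Assumed header sizes: IPv4 without options, UDP, base GUE header. */
enum {
	IPHDR_LEN = 20,
	UDPHDR_LEN = 8,
	GUEHDR_LEN = 4,
	MIN_MTU = 68,
};

/* Subtract outer IP plus extra tunnel header length (ulen) from the
 * reported next-hop MTU, unless the result would drop to 68 or below. */
static unsigned int inner_mtu(unsigned int mtu, unsigned int ulen)
{
	if (mtu > MIN_MTU + IPHDR_LEN + ulen)
		mtu -= IPHDR_LEN + ulen;
	return mtu;
}
```

With ulen = 0 this reduces to the IPIP case, which is why that path keeps its existing result.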
Fixes: 508f744c0de3 ("ipvs: strip udp tunnel headers from icmp errors")
Cc: stable@vger.kernel.org
Reported-by: Yizhou Zhao <zhaoyz24@mails.tsinghua.edu.cn>
Reported-by: Yuxiang Yang <yangyx22@mails.tsinghua.edu.cn>
Reported-by: Ao Wang <wangao@seu.edu.cn>
Reported-by: Xuewei Feng <fengxw06@126.com>
Reported-by: Qi Li <qli01@tsinghua.edu.cn>
Reported-by: Ke Xu <xuke@tsinghua.edu.cn>
Assisted-by: Claude-Code:GLM-5.2
Signed-off-by: Yizhou Zhao <zhaoyz24@mails.tsinghua.edu.cn>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/ipvs/ip_vs_core.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index d40b404c1bf6..906f2c361676 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -1767,6 +1767,7 @@ ip_vs_in_icmp(struct netns_ipvs *ipvs, struct sk_buff *skb, int *related,
bool tunnel, new_cp = false;
union nf_inet_addr *raddr;
char *outer_proto = "IPIP";
+ int ulen = 0;
*related = 1;
@@ -1831,7 +1832,6 @@ ip_vs_in_icmp(struct netns_ipvs *ipvs, struct sk_buff *skb, int *related,
/* Error for our tunnel must arrive at LOCAL_IN */
(skb_rtable(skb)->rt_flags & RTCF_LOCAL)) {
__u8 iproto;
- int ulen;
/* Non-first fragment has no UDP/GRE header */
if (unlikely(cih->frag_off & htons(IP_OFFSET)))
@@ -1936,8 +1936,8 @@ ip_vs_in_icmp(struct netns_ipvs *ipvs, struct sk_buff *skb, int *related,
if (dest_dst)
mtu = dst_mtu(dest_dst->dst_cache);
}
- if (mtu > 68 + sizeof(struct iphdr))
- mtu -= sizeof(struct iphdr);
+ if (mtu > 68 + sizeof(struct iphdr) + ulen)
+ mtu -= sizeof(struct iphdr) + ulen;
info = htonl(mtu);
}
/* Strip outer IP, ICMP and IPIP/UDP/GRE, go to IP header of
--
2.54.0
* [PATCH net 8/9] ipvs: reset full ip_vs_seq structs in ip_vs_conn_new
2026-07-03 12:57 [PATCH net 0/9] netfilter: updates for net Florian Westphal
` (6 preceding siblings ...)
2026-07-03 12:57 ` [PATCH net 7/9] ipvs: fix PMTU for GUE/GRE tunnel ICMP errors Florian Westphal
@ 2026-07-03 12:57 ` Florian Westphal
2026-07-03 12:57 ` [PATCH net 9/9] netfilter: xt_connmark: reject invalid shift parameters Florian Westphal
8 siblings, 0 replies; 11+ messages in thread
From: Florian Westphal @ 2026-07-03 12:57 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
From: Yizhou Zhao <zhaoyz24@mails.tsinghua.edu.cn>
Commit 9a05475cebdd ("ipvs: avoid kmem_cache_zalloc in
ip_vs_conn_new") changed ip_vs_conn_new() to allocate an ip_vs_conn
object with kmem_cache_alloc(). The function then initializes many
fields explicitly, but only resets in_seq.delta and out_seq.delta in the
two struct ip_vs_seq members.
That leaves init_seq and previous_delta uninitialized. This is normally
harmless while the corresponding IP_VS_CONN_F_IN_SEQ or
IP_VS_CONN_F_OUT_SEQ flag is clear. For connections learned from a sync
message, however, ip_vs_proc_conn() preserves those flags from
IP_VS_CONN_F_BACKUP_MASK and passes opt=NULL when the message omits
IPVS_OPT_SEQ_DATA. In that case the new connection can be hashed with
SEQ flags set but with the rest of in_seq/out_seq still containing stale
slab data.
When a packet for such a connection is later handled by an IPVS
application helper, vs_fix_seq() and vs_fix_ack_seq() use
previous_delta and init_seq to rewrite TCP sequence numbers. A malformed
sync message can therefore make forwarded packets carry stale slab bytes
in their TCP seq/ack numbers, and can also corrupt the forwarded TCP
flow.
Reset both struct ip_vs_seq members completely before publishing the
connection. This matches the existing "reset struct ip_vs_seq" comment
and keeps the sequence-adjustment gates inactive unless valid sequence
data is installed later.
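A userspace sketch of the difference between resetting one field and resetting the whole struct. struct seq_state is a hypothetical mirror that only contains the three fields named in this message; the 0xAA fill stands in for stale slab contents.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the fields named above. */
struct seq_state {
	uint32_t init_seq;
	uint32_t delta;
	uint32_t previous_delta;
};

/* Old behaviour: only delta is cleared, the rest keeps stale bytes. */
static void reset_delta_only(struct seq_state *s)
{
	s->delta = 0;
}

/* New behaviour: clear the whole struct before it is published. */
static void reset_full(struct seq_state *s)
{
	memset(s, 0, sizeof(*s));
}
```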
Fixes: 9a05475cebdd ("ipvs: avoid kmem_cache_zalloc in ip_vs_conn_new")
Cc: stable@vger.kernel.org
Reported-by: Yizhou Zhao <zhaoyz24@mails.tsinghua.edu.cn>
Reported-by: Yuxiang Yang <yangyx22@mails.tsinghua.edu.cn>
Reported-by: Ao Wang <wangao@seu.edu.cn>
Reported-by: Xuewei Feng <fengxw06@126.com>
Reported-by: Qi Li <qli01@tsinghua.edu.cn>
Reported-by: Ke Xu <xuke@tsinghua.edu.cn>
Assisted-by: Claude-Code:GLM-5.2
Signed-off-by: Yizhou Zhao <zhaoyz24@mails.tsinghua.edu.cn>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/ipvs/ip_vs_conn.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
index cb36641f8d1c..6ed2622363f0 100644
--- a/net/netfilter/ipvs/ip_vs_conn.c
+++ b/net/netfilter/ipvs/ip_vs_conn.c
@@ -1420,8 +1420,8 @@ ip_vs_conn_new(const struct ip_vs_conn_param *p, int dest_af,
cp->app = NULL;
cp->app_data = NULL;
/* reset struct ip_vs_seq */
- cp->in_seq.delta = 0;
- cp->out_seq.delta = 0;
+ memset(&cp->in_seq, 0, sizeof(cp->in_seq));
+ memset(&cp->out_seq, 0, sizeof(cp->out_seq));
if (unlikely(flags & IP_VS_CONN_F_NO_CPORT)) {
int af_id = ip_vs_af_index(cp->af);
--
2.54.0
* [PATCH net 9/9] netfilter: xt_connmark: reject invalid shift parameters
2026-07-03 12:57 [PATCH net 0/9] netfilter: updates for net Florian Westphal
` (7 preceding siblings ...)
2026-07-03 12:57 ` [PATCH net 8/9] ipvs: reset full ip_vs_seq structs in ip_vs_conn_new Florian Westphal
@ 2026-07-03 12:57 ` Florian Westphal
8 siblings, 0 replies; 11+ messages in thread
From: Florian Westphal @ 2026-07-03 12:57 UTC (permalink / raw)
To: netdev
Cc: Paolo Abeni, David S. Miller, Eric Dumazet, Jakub Kicinski,
netfilter-devel, pablo
From: Wyatt Feng <bronzed_45_vested@icloud.com>
Revision 2 of the CONNMARK target accepts user-controlled shift
parameters and applies them to 32-bit mark values in
connmark_tg_shift().
A shift_bits value of 32 or more triggers an undefined-shift bug when
the rule is evaluated. Invalid shift_dir values are also accepted and
silently fall back to the left-shift path.
Reject invalid revision-2 shift parameters in connmark_tg_check() so
malformed rules fail at installation time, before they can reach the
packet path.
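A minimal standalone sketch of such an install-time check, covering both the direction and the bit count. The enum values and the function name are hypothetical and only assume that left and right are the two lowest direction values.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical direction values: anything above SHIFT_RIGHT is invalid. */
enum { SHIFT_LEFT = 0, SHIFT_RIGHT = 1 };

/* Reject unknown directions and shift counts that are undefined for a
 * 32-bit mark, so evaluation never sees them. */
static int check_shift_params(uint8_t shift_dir, uint8_t shift_bits)
{
	if (shift_dir > SHIFT_RIGHT || shift_bits >= 32)
		return -EINVAL;
	return 0;
}
```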
Fixes: 472a73e00757 ("netfilter: xt_conntrack: Support bit-shifting for CONNMARK & MARK targets.")
Reported-by: Yuan Tan <yuantan098@gmail.com>
Reported-by: Zhengchuan Liang <zcliangcn@gmail.com>
Reported-by: Xin Liu <dstsmallbird@foxmail.com>
Assisted-by: Codex:GPT-5.4
Signed-off-by: Wyatt Feng <bronzed_45_vested@icloud.com>
Reviewed-by: Ren Wei <enjou1224z@gmail.com>
Reviewed-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
net/netfilter/xt_connmark.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/xt_connmark.c b/net/netfilter/xt_connmark.c
index 4277084de2e7..2cf27f7d59b9 100644
--- a/net/netfilter/xt_connmark.c
+++ b/net/netfilter/xt_connmark.c
@@ -112,6 +112,16 @@ static int connmark_tg_check(const struct xt_tgchk_param *par)
return ret;
}
+static int connmark_tg_check_v2(const struct xt_tgchk_param *par)
+{
+ const struct xt_connmark_tginfo2 *info = par->targinfo;
+
+ if (info->shift_dir > D_SHIFT_RIGHT || info->shift_bits >= 32)
+ return -EINVAL;
+
+ return connmark_tg_check(par);
+}
+
static void connmark_tg_destroy(const struct xt_tgdtor_param *par)
{
nf_ct_netns_put(par->net, par->family);
@@ -162,7 +172,7 @@ static struct xt_target connmark_tg_reg[] __read_mostly = {
.name = "CONNMARK",
.revision = 2,
.family = NFPROTO_IPV4,
- .checkentry = connmark_tg_check,
+ .checkentry = connmark_tg_check_v2,
.target = connmark_tg_v2,
.targetsize = sizeof(struct xt_connmark_tginfo2),
.destroy = connmark_tg_destroy,
@@ -183,7 +193,7 @@ static struct xt_target connmark_tg_reg[] __read_mostly = {
.name = "CONNMARK",
.revision = 2,
.family = NFPROTO_IPV6,
- .checkentry = connmark_tg_check,
+ .checkentry = connmark_tg_check_v2,
.target = connmark_tg_v2,
.targetsize = sizeof(struct xt_connmark_tginfo2),
.destroy = connmark_tg_destroy,
--
2.54.0