From: Sasha Levin <sashal@kernel.org>
To: patches@lists.linux.dev, stable@vger.kernel.org
Cc: Pablo Neira Ayuso <pablo@netfilter.org>,
Florian Westphal <fw@strlen.de>, Sasha Levin <sashal@kernel.org>,
davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
pabeni@redhat.com, netfilter-devel@vger.kernel.org,
coreteam@netfilter.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: [PATCH AUTOSEL 6.18] netfilter: ctnetlink: ensure safe access to master conntrack
Date: Mon, 20 Apr 2026 09:21:13 -0400 [thread overview]
Message-ID: <20260420132314.1023554-279-sashal@kernel.org> (raw)
In-Reply-To: <20260420132314.1023554-1-sashal@kernel.org>
From: Pablo Neira Ayuso <pablo@netfilter.org>
[ Upstream commit bffcaad9afdfe45d7fc777397d3b83c1e3ebffe5 ]
Holding reference on the expectation is not sufficient, the master
conntrack object can just go away, making exp->master invalid.
To access exp->master safely:
- Grab the nf_conntrack_expect_lock, this gets serialized with
clean_from_lists() which also holds this lock when the master
conntrack goes away.
- Hold reference on master conntrack via nf_conntrack_find_get().
Not so easy since the master tuple to look up for the master conntrack
is not available in the existing problematic paths.
This patch goes for extending the nf_conntrack_expect_lock section
to address this issue for simplicity, in the cases that are described
below this is just slightly extending the lock section.
The add expectation command already holds a reference to the master
conntrack from ctnetlink_create_expect().
However, the delete expectation command needs to grab the spinlock
before looking up for the expectation. Expand the existing spinlock
section to address this to cover the expectation lookup. Note that,
the nf_ct_expect_iterate_net() calls already grabs the spinlock while
iterating over the expectation table, which is correct.
The get expectation command needs to grab the spinlock to ensure master
conntrack does not go away. This also expands the existing spinlock
section to cover the expectation lookup too. I needed to move the
netlink skb allocation out of the spinlock to keep it GFP_KERNEL.
For the expectation events, the IPEXP_DESTROY event is already delivered
under the spinlock, just move the delivery of IPEXP_NEW under the
spinlock too because the master conntrack event cache is reached through
exp->master.
While at it, add lockdep notations to help identify what codepaths need
to grab the spinlock.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
Error: Failed to generate final synthesis
include/net/netfilter/nf_conntrack_core.h | 5 ++++
net/netfilter/nf_conntrack_ecache.c | 2 ++
net/netfilter/nf_conntrack_expect.c | 10 +++++++-
net/netfilter/nf_conntrack_netlink.c | 28 +++++++++++++++--------
4 files changed, 35 insertions(+), 10 deletions(-)
diff --git a/include/net/netfilter/nf_conntrack_core.h b/include/net/netfilter/nf_conntrack_core.h
index 3384859a89210..8883575adcc1e 100644
--- a/include/net/netfilter/nf_conntrack_core.h
+++ b/include/net/netfilter/nf_conntrack_core.h
@@ -83,6 +83,11 @@ void nf_conntrack_lock(spinlock_t *lock);
extern spinlock_t nf_conntrack_expect_lock;
+static inline void lockdep_nfct_expect_lock_held(void)
+{
+ lockdep_assert_held(&nf_conntrack_expect_lock);
+}
+
/* ctnetlink code shared by both ctnetlink and nf_conntrack_bpf */
static inline void __nf_ct_set_timeout(struct nf_conn *ct, u64 timeout)
diff --git a/net/netfilter/nf_conntrack_ecache.c b/net/netfilter/nf_conntrack_ecache.c
index 81baf20826046..9df159448b897 100644
--- a/net/netfilter/nf_conntrack_ecache.c
+++ b/net/netfilter/nf_conntrack_ecache.c
@@ -247,6 +247,8 @@ void nf_ct_expect_event_report(enum ip_conntrack_expect_events event,
struct nf_ct_event_notifier *notify;
struct nf_conntrack_ecache *e;
+ lockdep_nfct_expect_lock_held();
+
rcu_read_lock();
notify = rcu_dereference(net->ct.nf_conntrack_event_cb);
if (!notify)
diff --git a/net/netfilter/nf_conntrack_expect.c b/net/netfilter/nf_conntrack_expect.c
index 2234c444a320e..24d0576d84b7f 100644
--- a/net/netfilter/nf_conntrack_expect.c
+++ b/net/netfilter/nf_conntrack_expect.c
@@ -51,6 +51,7 @@ void nf_ct_unlink_expect_report(struct nf_conntrack_expect *exp,
struct net *net = nf_ct_exp_net(exp);
struct nf_conntrack_net *cnet;
+ lockdep_nfct_expect_lock_held();
WARN_ON(!master_help);
WARN_ON(timer_pending(&exp->timeout));
@@ -118,6 +119,8 @@ nf_ct_exp_equal(const struct nf_conntrack_tuple *tuple,
bool nf_ct_remove_expect(struct nf_conntrack_expect *exp)
{
+ lockdep_nfct_expect_lock_held();
+
if (timer_delete(&exp->timeout)) {
nf_ct_unlink_expect(exp);
nf_ct_expect_put(exp);
@@ -177,6 +180,8 @@ nf_ct_find_expectation(struct net *net,
struct nf_conntrack_expect *i, *exp = NULL;
unsigned int h;
+ lockdep_nfct_expect_lock_held();
+
if (!cnet->expect_count)
return NULL;
@@ -459,6 +464,8 @@ static inline int __nf_ct_expect_check(struct nf_conntrack_expect *expect,
unsigned int h;
int ret = 0;
+ lockdep_nfct_expect_lock_held();
+
if (!master_help) {
ret = -ESHUTDOWN;
goto out;
@@ -515,8 +522,9 @@ int nf_ct_expect_related_report(struct nf_conntrack_expect *expect,
nf_ct_expect_insert(expect);
- spin_unlock_bh(&nf_conntrack_expect_lock);
nf_ct_expect_event_report(IPEXP_NEW, expect, portid, report);
+ spin_unlock_bh(&nf_conntrack_expect_lock);
+
return 0;
out:
spin_unlock_bh(&nf_conntrack_expect_lock);
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 879413b9fa06a..becffc15e7579 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -3337,31 +3337,37 @@ static int ctnetlink_get_expect(struct sk_buff *skb,
if (err < 0)
return err;
+ skb2 = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+ if (!skb2)
+ return -ENOMEM;
+
+ spin_lock_bh(&nf_conntrack_expect_lock);
exp = nf_ct_expect_find_get(info->net, &zone, &tuple);
- if (!exp)
+ if (!exp) {
+ spin_unlock_bh(&nf_conntrack_expect_lock);
+ kfree_skb(skb2);
return -ENOENT;
+ }
if (cda[CTA_EXPECT_ID]) {
__be32 id = nla_get_be32(cda[CTA_EXPECT_ID]);
if (id != nf_expect_get_id(exp)) {
nf_ct_expect_put(exp);
+ spin_unlock_bh(&nf_conntrack_expect_lock);
+ kfree_skb(skb2);
return -ENOENT;
}
}
- skb2 = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
- if (!skb2) {
- nf_ct_expect_put(exp);
- return -ENOMEM;
- }
-
rcu_read_lock();
err = ctnetlink_exp_fill_info(skb2, NETLINK_CB(skb).portid,
info->nlh->nlmsg_seq, IPCTNL_MSG_EXP_NEW,
exp);
rcu_read_unlock();
nf_ct_expect_put(exp);
+ spin_unlock_bh(&nf_conntrack_expect_lock);
+
if (err <= 0) {
kfree_skb(skb2);
return -ENOMEM;
@@ -3408,22 +3414,26 @@ static int ctnetlink_del_expect(struct sk_buff *skb,
if (err < 0)
return err;
+ spin_lock_bh(&nf_conntrack_expect_lock);
+
/* bump usage count to 2 */
exp = nf_ct_expect_find_get(info->net, &zone, &tuple);
- if (!exp)
+ if (!exp) {
+ spin_unlock_bh(&nf_conntrack_expect_lock);
return -ENOENT;
+ }
if (cda[CTA_EXPECT_ID]) {
__be32 id = nla_get_be32(cda[CTA_EXPECT_ID]);
if (id != nf_expect_get_id(exp)) {
nf_ct_expect_put(exp);
+ spin_unlock_bh(&nf_conntrack_expect_lock);
return -ENOENT;
}
}
/* after list removal, usage count == 1 */
- spin_lock_bh(&nf_conntrack_expect_lock);
if (timer_delete(&exp->timeout)) {
nf_ct_unlink_expect_report(exp, NETLINK_CB(skb).portid,
nlmsg_report(info->nlh));
--
2.53.0
next prev parent reply other threads:[~2026-04-20 13:32 UTC|newest]
Thread overview: 62+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20260420132314.1023554-1-sashal@kernel.org>
2026-04-20 13:16 ` [PATCH AUTOSEL 7.0-5.10] FDDI: defxx: Rate-limit memory allocation errors Sasha Levin
2026-04-20 13:16 ` [PATCH AUTOSEL 6.18] xsk: fix XDP_UMEM_SG_FLAG issues Sasha Levin
2026-04-20 13:17 ` [PATCH AUTOSEL 7.0-5.10] net: rose: reject truncated CLEAR_REQUEST frames in state machines Sasha Levin
2026-04-20 13:17 ` [PATCH AUTOSEL 6.18] netfilter: nfnetlink_queue: nfqnl_instance GFP_ATOMIC -> GFP_KERNEL_ACCOUNT allocation Sasha Levin
2026-04-20 13:17 ` [PATCH AUTOSEL 7.0-6.18] net: mana: hardening: Validate adapter_mtu from MANA_QUERY_DEV_CONFIG Sasha Levin
2026-04-20 13:17 ` [PATCH AUTOSEL 7.0-5.10] enic: add V2 SR-IOV VF device ID Sasha Levin
2026-04-20 13:17 ` [PATCH AUTOSEL 7.0-6.6] ipv6: move IFA_F_PERMANENT percpu allocation in process scope Sasha Levin
2026-04-20 13:17 ` [PATCH AUTOSEL 6.18] netfilter: nfnetlink_log: initialize nfgenmsg in NLMSG_DONE terminator Sasha Levin
2026-04-20 13:17 ` [PATCH AUTOSEL 6.18] net: increase IP_TUNNEL_RECURSION_LIMIT to 5 Sasha Levin
2026-04-20 13:17 ` [PATCH AUTOSEL 7.0-6.1] net: lan743x: fix SGMII detection on PCI1xxxx B0+ during warm reset Sasha Levin
2026-04-20 13:17 ` [PATCH AUTOSEL 7.0-5.10] vmxnet3: Suppress page allocation warning for massive Rx Data ring Sasha Levin
2026-04-20 13:17 ` [PATCH AUTOSEL 6.18] xfrm: Wait for RCU readers during policy netns exit Sasha Levin
2026-04-20 13:18 ` [PATCH AUTOSEL 6.18] ixgbe: stop re-reading flash on every get_drvinfo for e610 Sasha Levin
2026-04-20 13:18 ` [PATCH AUTOSEL 6.18] devlink: Fix incorrect skb socket family dumping Sasha Levin
2026-04-20 13:18 ` [PATCH AUTOSEL 7.0-6.12] net: sfp: add quirk for ZOERAX SFP-2.5G-T Sasha Levin
2026-04-20 13:18 ` [PATCH AUTOSEL 7.0-6.18] ipv6: discard fragment queue earlier if there is malformed datagram Sasha Levin
2026-04-20 13:18 ` [PATCH AUTOSEL 6.18] af_unix: read UNIX_DIAG_VFS data under unix_state_lock Sasha Levin
2026-04-20 13:18 ` [PATCH AUTOSEL 6.18] ipv4: nexthop: allocate skb dynamically in rtm_get_nexthop() Sasha Levin
2026-04-20 13:18 ` [PATCH AUTOSEL 6.18] xfrm: fix refcount leak in xfrm_migrate_policy_find Sasha Levin
2026-04-20 13:18 ` [PATCH AUTOSEL 6.18] selftests: net: bridge_vlan_mcast: wait for h1 before querier check Sasha Levin
2026-04-20 13:18 ` [PATCH AUTOSEL 6.18] xsk: tighten UMEM headroom validation to account for tailroom and min frame Sasha Levin
2026-04-20 13:18 ` [PATCH AUTOSEL 7.0-5.15] gve: fix SW coalescing when hw-GRO is used Sasha Levin
2026-04-20 13:19 ` [PATCH AUTOSEL 6.18] netfilter: ip6t_eui64: reject invalid MAC header for all packets Sasha Levin
2026-04-20 13:19 ` [PATCH AUTOSEL 6.18] l2tp: Drop large packets with UDP encap Sasha Levin
2026-04-20 13:19 ` [PATCH AUTOSEL 7.0-5.10] net: ethernet: ravb: Disable interrupts when closing device Sasha Levin
2026-04-20 13:19 ` [PATCH AUTOSEL 7.0] dsa: tag_mxl862xx: set dsa_default_offload_fwd_mark() Sasha Levin
2026-04-20 13:34 ` Daniel Golle
2026-04-20 13:19 ` [PATCH AUTOSEL 7.0-6.1] ipv4: validate IPV4_DEVCONF attributes properly Sasha Levin
2026-04-20 13:19 ` [PATCH AUTOSEL 6.18] ipv4: nexthop: avoid duplicate NHA_HW_STATS_ENABLE on nexthop group dump Sasha Levin
2026-04-20 13:19 ` [PATCH AUTOSEL 6.18] net: ipa: fix event ring index not programmed for IPA v5.0+ Sasha Levin
2026-04-20 13:19 ` [PATCH AUTOSEL 7.0-5.10] net: core: allow netdev_upper_get_next_dev_rcu from bh context Sasha Levin
2026-04-20 13:19 ` [PATCH AUTOSEL 6.18] net: txgbe: leave space for null terminators on property_entry Sasha Levin
2026-04-20 13:19 ` [PATCH AUTOSEL 7.0-5.10] net: initialize sk_rx_queue_mapping in sk_clone() Sasha Levin
2026-04-20 13:19 ` [PATCH AUTOSEL 7.0-6.19] gve: Advertise NETIF_F_GRO_HW instead of NETIF_F_LRO Sasha Levin
2026-04-20 13:19 ` [PATCH AUTOSEL 6.18] netfilter: conntrack: add missing netlink policy validations Sasha Levin
2026-04-20 13:19 ` [PATCH AUTOSEL 6.18] rtnetlink: add missing netlink_ns_capable() check for peer netns Sasha Levin
2026-04-20 13:19 ` [PATCH AUTOSEL 6.18] ipv6: ioam: fix potential NULL dereferences in __ioam6_fill_trace_data() Sasha Levin
2026-04-20 13:20 ` [PATCH AUTOSEL 7.0-6.1] net: sched: cls_u32: Avoid memcpy() false-positive warning in u32_init_knode() Sasha Levin
2026-04-20 13:20 ` [PATCH AUTOSEL 6.18] xsk: respect tailroom for ZC setups Sasha Levin
2026-04-20 13:20 ` [PATCH AUTOSEL 7.0-6.18] tcp: use WRITE_ONCE() for tsoffset in tcp_v6_connect() Sasha Levin
2026-04-20 13:20 ` [PATCH AUTOSEL 6.18] net: mdio: realtek-rtl9300: use scoped device_for_each_child_node loop Sasha Levin
2026-04-20 13:20 ` [PATCH AUTOSEL 7.0-6.12] net: ethernet: mtk_eth_soc: avoid writing to ESW registers on MT7628 Sasha Levin
2026-04-20 13:20 ` [PATCH AUTOSEL 6.18] ipvs: fix NULL deref in ip_vs_add_service error path Sasha Levin
2026-04-20 13:20 ` [PATCH AUTOSEL 7.0-6.18] net: hsr: emit notification for PRP slave2 changed hw addr on port deletion Sasha Levin
2026-04-20 13:20 ` [PATCH AUTOSEL 7.0-5.10] net: hamradio: scc: validate bufsize in SIOCSCCSMEM ioctl Sasha Levin
2026-04-20 13:20 ` [PATCH AUTOSEL 6.18] xfrm: account XFRMA_IF_ID in aevent size calculation Sasha Levin
2026-04-20 13:20 ` [PATCH AUTOSEL 6.18] netfilter: nft_set_pipapo_avx2: don't return non-matching entry on expiry Sasha Levin
2026-04-20 13:20 ` [PATCH AUTOSEL 6.18] bridge: guard local VLAN-0 FDB helpers against NULL vlan group Sasha Levin
2026-04-20 13:21 ` [PATCH AUTOSEL 7.0-5.10] net: hamradio: bpqether: validate frame length in bpq_rcv() Sasha Levin
2026-04-20 13:21 ` Sasha Levin [this message]
2026-04-20 13:21 ` [PATCH AUTOSEL 7.0-6.18] hinic3: Add msg_send_lock for message sending concurrecy Sasha Levin
2026-04-20 13:21 ` [PATCH AUTOSEL 7.0] netfilter: require Ethernet MAC header before using eth_hdr() Sasha Levin
2026-04-20 13:21 ` [PATCH AUTOSEL 6.18] net: sched: act_csum: validate nested VLAN headers Sasha Levin
2026-04-20 13:21 ` [PATCH AUTOSEL 6.18] net: ipa: fix GENERIC_CMD register field masks for IPA v5.0+ Sasha Levin
2026-04-20 13:21 ` [PATCH AUTOSEL 6.18] dt-bindings: net: Fix Tegra234 MGBE PTP clock Sasha Levin
2026-04-20 13:21 ` [PATCH AUTOSEL 6.18] net: ioam6: fix OOB and missing lock Sasha Levin
2026-04-20 13:21 ` [PATCH AUTOSEL 6.18] ipv4: icmp: fix null-ptr-deref in icmp_build_probe() Sasha Levin
2026-04-20 13:21 ` [PATCH AUTOSEL 6.18] nfc: s3fwrn5: allocate rx skb before consuming bytes Sasha Levin
2026-04-20 13:21 ` [PATCH AUTOSEL 6.18] xsk: validate MTU against usable frame size on bind Sasha Levin
2026-04-20 13:21 ` [PATCH AUTOSEL 6.18] xfrm_user: fix info leak in build_mapping() Sasha Levin
2026-04-20 13:22 ` [PATCH AUTOSEL 6.18] net: lapbether: handle NETDEV_PRE_TYPE_CHANGE Sasha Levin
2026-04-20 13:22 ` [PATCH AUTOSEL 6.18] net: airoha: Fix memory leak in airoha_qdma_rx_process() Sasha Levin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260420132314.1023554-279-sashal@kernel.org \
--to=sashal@kernel.org \
--cc=coreteam@netfilter.org \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=fw@strlen.de \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=netfilter-devel@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=pablo@netfilter.org \
--cc=patches@lists.linux.dev \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox