From: Jakub Kicinski <kuba@kernel.org>
To: pablo@netfilter.org
Cc: Jakub Kicinski <kuba@kernel.org>,
netfilter-devel@vger.kernel.org, davem@davemloft.net,
netdev@vger.kernel.org, pabeni@redhat.com, edumazet@google.com,
fw@strlen.de, horms@kernel.org
Subject: Re: [PATCH net 11/11] ipvs: fix the spin_lock usage for RT build
Date: Mon, 27 Apr 2026 19:13:43 -0700 [thread overview]
Message-ID: <20260428021343.793695-1-kuba@kernel.org> (raw)
In-Reply-To: <20260424190513.32823-12-pablo@netfilter.org>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
ipvs: fix the spin_lock usage for RT build
This commit fixes spin_lock sleeping issues on PREEMPT_RT by adjusting the
use of cp->lock in combination with hlist_bl bit locks. It removes the
spin_lock from ip_vs_conn_hash() and ip_vs_conn_unlink(), and changes the
lock ordering in ip_vs_conn_fill_cport().
Link: https://sashiko.dev/#/patchset/20260415200216.79699-1-ja%40ssi.bg
Link: https://sashiko.dev/#/patchset/20260420165539.85174-4-ja%40ssi.bg
Link: https://sashiko.dev/#/patchset/20260422135823.50489-4-ja%40ssi.bg
> diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
> index 84a4921a7865a9..9e23cda8482588 100644
> --- a/net/netfilter/ipvs/ip_vs_conn.c
> +++ b/net/netfilter/ipvs/ip_vs_conn.c
[ ... ]
> @@ -704,14 +702,23 @@ void ip_vs_conn_fill_cport(struct ip_vs_conn *cp, __be16 cport)
> hlist_bl_unlock(head);
> write_seqcount_end(&t->seqc[hash_key & t->seqc_mask]);
> preempt_enable_nested();
> - spin_unlock_bh(&t->lock[hash_key & t->lock_mask].l);
> + spin_unlock(&t->lock[hash_key & t->lock_mask].l);
> + spin_unlock_bh(&cp->lock);
> hash_key = hash_key_new;
> goto retry;
> }
>
> - spin_lock(&cp->lock);
> - if ((cp->flags & IP_VS_CONN_F_NO_CPORT) &&
> - (cp->flags & IP_VS_CONN_F_HASHED)) {
> + /* Fill cport once, even if multiple packets try to do it */
> + if (cp->flags & IP_VS_CONN_F_NO_CPORT && (!cp->cport || by_me)) {
Does removing the IP_VS_CONN_F_HASHED check here lead to a NULL pointer
dereference for IP_VS_CONN_F_ONE_PACKET connections?
Connections flagged with IP_VS_CONN_F_ONE_PACKET bypass hashing:
ip_vs_conn_hash() {
	if (cp->flags & IP_VS_CONN_F_ONE_PACKET)
		return 0;
	...
}
Because they bypass hashing, their hash list nodes (hn->node) are never
linked, leaving hn->node.pprev as NULL. If such a connection is created
with a source port of 0, it gets the IP_VS_CONN_F_NO_CPORT flag.
When ip_vs_conn_fill_cport() is later invoked to populate the port, the
missing IP_VS_CONN_F_HASHED check allows execution to proceed into the
rehashing path.
> + /* If we race with resizing make sure cport is set for dir 1 */
> + if (!cp->cport) {
> + cp->cport = cport;
> + by_me = true;
> + }
> + if (!dir) {
> + atomic_dec(&ipvs->no_cport_conns[af_id]);
> + cp->flags &= ~IP_VS_CONN_F_NO_CPORT;
> + }
> /* We do not recalc hash_key_r under lock, we assume the
> * parameters in cp do not change, i.e. cport is
> * the only possible change.
> @@ -726,21 +733,17 @@ void ip_vs_conn_fill_cport(struct ip_vs_conn *cp, __be16 cport)
> hlist_bl_del_rcu(&hn->node);
If head != head2 here, hlist_bl_del_rcu(&hn->node) is called. Since
hn->node.pprev is NULL, wouldn't this cause a panic when __hlist_bl_del()
executes *pprev = next?
Is it possible to retain the IP_VS_CONN_F_HASHED check or otherwise
ensure unhashed nodes aren't unlinked?
> hlist_bl_add_head_rcu(&hn->node, head_new);
> }
> - if (!dir) {
> - atomic_dec(&ipvs->no_cport_conns[af_id]);
> - cp->flags &= ~IP_VS_CONN_F_NO_CPORT;
> - cp->cport = cport;
> - }
> }
--
pw-bot: cr
Thread overview: 14+ messages
2026-04-24 19:05 [PATCH net 00/11] Netfilter/IPVS fixes for net Pablo Neira Ayuso
2026-04-24 19:05 ` [PATCH net 01/11] netfilter: arp_tables: fix IEEE1394 ARP payload parsing Pablo Neira Ayuso
2026-04-24 19:05 ` [PATCH net 02/11] netfilter: nf_tables: use list_del_rcu for netlink hooks Pablo Neira Ayuso
2026-04-24 19:05 ` [PATCH net 03/11] rculist: add list_splice_rcu() for private lists Pablo Neira Ayuso
2026-04-24 19:05 ` [PATCH net 04/11] netfilter: nf_tables: join hook list via splice_list_rcu() in commit phase Pablo Neira Ayuso
2026-04-24 19:05 ` [PATCH net 05/11] netfilter: nf_tables: add hook transactions for device deletions Pablo Neira Ayuso
2026-04-24 19:05 ` [PATCH net 06/11] netfilter: xt_policy: fix strict mode inbound policy matching Pablo Neira Ayuso
2026-04-24 19:05 ` [PATCH net 07/11] netfilter: reject zero shift in nft_bitwise Pablo Neira Ayuso
2026-04-24 19:05 ` [PATCH net 08/11] netfilter: nf_conntrack_sip: don't use simple_strtoul Pablo Neira Ayuso
2026-04-24 19:05 ` [PATCH net 09/11] ipvs: fixes for the new ip_vs_status info Pablo Neira Ayuso
2026-04-24 19:05 ` [PATCH net 10/11] ipvs: fix races around the conn_lfactor and svc_lfactor sysctl vars Pablo Neira Ayuso
2026-04-24 19:05 ` [PATCH net 11/11] ipvs: fix the spin_lock usage for RT build Pablo Neira Ayuso
2026-04-28 2:13 ` Jakub Kicinski
2026-04-28 2:13 ` Jakub Kicinski [this message]