From: xuanqiang.luo@linux.dev
To: edumazet@google.com, kuniyu@google.com
Cc: kerneljasonxing@gmail.com, davem@davemloft.net, kuba@kernel.org,
	netdev@vger.kernel.org, Xuanqiang Luo <luoxuanqiang@kylinos.cn>
Subject: [PATCH net-next v5 0/3] net: Avoid ehash lookup races
Date: Wed, 24 Sep 2025 09:50:31 +0800
Message-ID: <20250924015034.587056-1-xuanqiang.luo@linux.dev>

From: Xuanqiang Luo <luoxuanqiang@kylinos.cn>

After replacing R/W locks with RCU in commit 3ab5aee7fe84 ("net: Convert
TCP & DCCP hash tables to use RCU / hlist_nulls"), a race window emerged
during the switch from reqsk/sk to sk/tw.
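As a rough illustration of how such a window can arise, here is a simplified, single-threaded user-space sketch. It is not kernel code: struct demo_sock and the demo_*() helpers are hypothetical names made up for this example, and the "concurrent" lookup is simulated by calling it between the two steps of an unlink-then-link swap.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of one hash chain; 'key' stands in for the 4-tuple. */
struct demo_sock {
	int key;
	struct demo_sock *next;
};

/* Walk the chain and return the first entry matching 'key', or NULL. */
static struct demo_sock *demo_lookup(struct demo_sock *head, int key)
{
	for (struct demo_sock *p = head; p; p = p->next)
		if (p->key == key)
			return p;
	return NULL;
}

/* Step 1 of a two-step swap: unlink 'victim' from the chain. */
static void demo_unlink(struct demo_sock **head, struct demo_sock *victim)
{
	for (struct demo_sock **pp = head; *pp; pp = &(*pp)->next) {
		if (*pp == victim) {
			*pp = victim->next;
			return;
		}
	}
}

/* Step 2 of a two-step swap: link 'n' at the head of the chain. */
static void demo_link(struct demo_sock **head, struct demo_sock *n)
{
	n->next = *head;
	*head = n;
}

/*
 * Swap old_sk for new_sk in two steps and return what a lookup running
 * between the steps would find (simulated here by a direct call).
 */
static struct demo_sock *demo_two_step_swap(struct demo_sock **head,
					    struct demo_sock *old_sk,
					    struct demo_sock *new_sk)
{
	struct demo_sock *seen;

	/* Both entries describe the same connection. */
	assert(old_sk->key == new_sk->key);

	demo_unlink(head, old_sk);
	seen = demo_lookup(*head, old_sk->key);	/* the race window */
	demo_link(head, new_sk);
	return seen;
}
```

With a chain that holds only old_sk, demo_two_step_swap() returns NULL: a lookup that lands inside the window finds neither entry, even though the connection exists the whole time.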

Now that both timewait sock (tw) and full sock (sk) reside on the same
ehash chain, it is appropriate to introduce hlist_nulls replace
operations, to eliminate the race conditions caused by this window.
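The idea of a replace operation can be sketched in user space as follows. This is a minimal approximation under stated assumptions, not the code from patch 1: struct toy_node, struct toy_head and the toy_*() helpers are hypothetical names, and the GCC/Clang __atomic_store_n() builtin stands in for the kernel's publish primitive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for a nulls-terminated hlist (names are hypothetical). */
struct toy_node {
	struct toy_node *next;	/* next node, or an odd "nulls" marker */
	struct toy_node **pprev;	/* address of the pointer that points at us */
};

struct toy_head {
	struct toy_node *first;
};

/* A nulls marker is an odd value, so it can never be a real node address. */
static inline struct toy_node *toy_make_nulls(uintptr_t bucket)
{
	return (struct toy_node *)((bucket << 1) | 1);
}

static inline int toy_is_nulls(const struct toy_node *p)
{
	return (int)((uintptr_t)p & 1);
}

static inline void toy_init_head(struct toy_head *h, uintptr_t bucket)
{
	h->first = toy_make_nulls(bucket);
}

/* Insert 'n' at the head of the chain. */
static inline void toy_add_head(struct toy_node *n, struct toy_head *h)
{
	struct toy_node *first = h->first;

	n->next = first;
	n->pprev = &h->first;
	__atomic_store_n(&h->first, n, __ATOMIC_RELEASE);
	if (!toy_is_nulls(first))
		first->pprev = &n->next;
}

/* Swap n_old for n_new in place; n_old must currently be on a chain. */
static inline void toy_replace(struct toy_node *n_old, struct toy_node *n_new)
{
	struct toy_node *next = n_old->next;

	assert(n_old->pprev != NULL);

	/* Fully initialise n_new before it becomes reachable. */
	n_new->next = next;
	n_new->pprev = n_old->pprev;
	/*
	 * One release store redirects the predecessor's pointer from n_old
	 * to n_new, so a walk from the head sees one or the other.
	 */
	__atomic_store_n(n_new->pprev, n_new, __ATOMIC_RELEASE);
	if (!toy_is_nulls(next))
		next->pprev = &n_new->next;
}

/* Count the nodes a walk from the head would visit. */
static inline int toy_count(const struct toy_head *h)
{
	int n = 0;

	for (const struct toy_node *p = h->first; !toy_is_nulls(p); p = p->next)
		n++;
	return n;
}
```

Because the chain length never drops during toy_replace(), there is no intermediate state in which the entry is missing from the chain, which is the property a separate unlink followed by a link cannot offer.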

I previously sent a different version of this fix that tried to close the
window with a lock. That approach turned out to have problems, so this
series switches to the "replace" method instead.
For details, refer to:
https://lore.kernel.org/netdev/20250903024406.2418362-1-xuanqiang.luo@linux.dev/

This type of issue had already been discussed several times before I ran
into it. For those interested, the earlier threads are listed here as
background:
1. https://lore.kernel.org/lkml/20230118015941.1313-1-kerneljasonxing@gmail.com/
2. https://lore.kernel.org/netdev/20230606064306.9192-1-duanmuquan@baidu.com/

---

Changes:
  v5:
    * Patch 1
	* Rename __hlist_nulls_replace_rcu() to hlist_nulls_replace_rcu()
	  and update the description of hlist_nulls_replace_init_rcu().
    * Patch 2
	* Remove __sk_nulls_replace_node_init_rcu() and inline it into
	  sk_nulls_replace_node_init_rcu().
	* Use DEBUG_NET_WARN_ON_ONCE() instead of WARN_ON().
    * Patch 3
	* Move smp_wmb() after setting the refcount.

  v4: https://lore.kernel.org/all/20250920105945.538042-1-xuanqiang.luo@linux.dev/
    * Patch 1
	* Use WRITE_ONCE() for ->next in __hlist_nulls_replace_rcu(), and
	  explain why in the commit message.
	* Remove the node hash check in hlist_nulls_replace_init_rcu() to
	  avoid redundancy. Also remove the return value, as it serves no
	  purpose in this patch series.
    * Patch 3
	* Remove the check of hlist_nulls_replace_init_rcu() return value
	  in inet_twsk_hashdance_schedule() as it is unnecessary.
	  Thanks to Kuni for clarifying this.

  v3: https://lore.kernel.org/all/20250916103054.719584-1-xuanqiang.luo@linux.dev/
    * Add more background information on this type of issue to the cover
      letter.

  v2: https://lore.kernel.org/all/20250916064614.605075-1-xuanqiang.luo@linux.dev/
    * Patch 1
	* Use WRITE_ONCE() to initialize old->pprev.
    * Patch 2 & 3
	* Optimize the sk hashed check. Thanks to Kuni for pointing it out!

  v1: https://lore.kernel.org/all/20250915070308.111816-1-xuanqiang.luo@linux.dev/

Xuanqiang Luo (3):
  rculist: Add hlist_nulls_replace_rcu() and
    hlist_nulls_replace_init_rcu()
  inet: Avoid ehash lookup race in inet_ehash_insert()
  inet: Avoid ehash lookup race in inet_twsk_hashdance_schedule()

 include/linux/rculist_nulls.h | 52 +++++++++++++++++++++++++++++++++++
 include/net/sock.h            | 14 ++++++++++
 net/ipv4/inet_hashtables.c    |  4 ++-
 net/ipv4/inet_timewait_sock.c | 24 +++++++---------
 4 files changed, 79 insertions(+), 15 deletions(-)

-- 
2.25.1


Thread overview: 8+ messages
2025-09-24  1:50 xuanqiang.luo [this message]
2025-09-24  1:50 ` [PATCH net-next v5 1/3] rculist: Add hlist_nulls_replace_rcu() and hlist_nulls_replace_init_rcu() xuanqiang.luo
2025-09-25  0:54   ` Jakub Kicinski
2025-09-25  1:50     ` luoxuanqiang
2025-09-24  1:50 ` [PATCH net-next v5 2/3] inet: Avoid ehash lookup race in inet_ehash_insert() xuanqiang.luo
2025-09-24  1:50 ` [PATCH net-next v5 3/3] inet: Avoid ehash lookup race in inet_twsk_hashdance_schedule() xuanqiang.luo
2025-09-25  0:55   ` Jakub Kicinski
2025-09-25  1:51     ` luoxuanqiang