public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Kuniyuki Iwashima <kuniyu@amazon.com>,
	Jakub Kicinski <kuba@kernel.org>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.15 58/78] Revert "tcp: avoid the lookup process failing to get sk in ehash table"
Date: Tue, 25 Jul 2023 12:46:49 +0200	[thread overview]
Message-ID: <20230725104453.506004557@linuxfoundation.org> (raw)
In-Reply-To: <20230725104451.275227789@linuxfoundation.org>

From: Kuniyuki Iwashima <kuniyu@amazon.com>

[ Upstream commit 81b3ade5d2b98ad6e0a473b0e1e420a801275592 ]

This reverts commit 3f4ca5fafc08881d7a57daa20449d171f2887043.

Commit 3f4ca5fafc08 ("tcp: avoid the lookup process failing to get sk in
ehash table") reversed the order in how a socket is inserted into ehash
to fix an issue that ehash-lookup could fail when reqsk/full sk/twsk are
swapped.  However, it introduced another lookup failure.

The full socket in ehash is allocated from a slab with SLAB_TYPESAFE_BY_RCU
and does not have SOCK_RCU_FREE, so the socket could be reused even while
it is being referenced on another CPU doing RCU lookup.

Let's say a socket is reused and inserted into the same hash bucket during
lookup.  After the blamed commit, a new socket is inserted at the end of
the list.  If that happens, we will skip sockets placed after the previous
position of the reused socket, resulting in ehash lookup failure.

As described in Documentation/RCU/rculist_nulls.rst, we should insert a
new socket at the head of the list to avoid such an issue.

This issue, the swap-lookup-failure, and another variant reported in [0]
can all be handled properly by adding a locked ehash lookup suggested by
Eric Dumazet [1].

However, this issue could occur for every packet, thus more likely than
the other two races, so let's revert the change for now.

Link: https://lore.kernel.org/netdev/20230606064306.9192-1-duanmuquan@baidu.com/ [0]
Link: https://lore.kernel.org/netdev/CANn89iK8snOz8TYOhhwfimC7ykYA78GA3Nyv8x06SZYa1nKdyA@mail.gmail.com/ [1]
Fixes: 3f4ca5fafc08 ("tcp: avoid the lookup process failing to get sk in ehash table")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230717215918.15723-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/ipv4/inet_hashtables.c    | 17 ++---------------
 net/ipv4/inet_timewait_sock.c |  8 ++++----
 2 files changed, 6 insertions(+), 19 deletions(-)

diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index 39b3db5b61190..2936676f86eb8 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -571,20 +571,8 @@ bool inet_ehash_insert(struct sock *sk, struct sock *osk, bool *found_dup_sk)
 	spin_lock(lock);
 	if (osk) {
 		WARN_ON_ONCE(sk->sk_hash != osk->sk_hash);
-		ret = sk_hashed(osk);
-		if (ret) {
-			/* Before deleting the node, we insert a new one to make
-			 * sure that the look-up-sk process would not miss either
-			 * of them and that at least one node would exist in ehash
-			 * table all the time. Otherwise there's a tiny chance
-			 * that lookup process could find nothing in ehash table.
-			 */
-			__sk_nulls_add_node_tail_rcu(sk, list);
-			sk_nulls_del_node_init_rcu(osk);
-		}
-		goto unlock;
-	}
-	if (found_dup_sk) {
+		ret = sk_nulls_del_node_init_rcu(osk);
+	} else if (found_dup_sk) {
 		*found_dup_sk = inet_ehash_lookup_by_sk(sk, list);
 		if (*found_dup_sk)
 			ret = false;
@@ -593,7 +581,6 @@ bool inet_ehash_insert(struct sock *sk, struct sock *osk, bool *found_dup_sk)
 	if (ret)
 		__sk_nulls_add_node_rcu(sk, list);
 
-unlock:
 	spin_unlock(lock);
 
 	return ret;
diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c
index fe6340c363b43..437afe392e667 100644
--- a/net/ipv4/inet_timewait_sock.c
+++ b/net/ipv4/inet_timewait_sock.c
@@ -81,10 +81,10 @@ void inet_twsk_put(struct inet_timewait_sock *tw)
 }
 EXPORT_SYMBOL_GPL(inet_twsk_put);
 
-static void inet_twsk_add_node_tail_rcu(struct inet_timewait_sock *tw,
-					struct hlist_nulls_head *list)
+static void inet_twsk_add_node_rcu(struct inet_timewait_sock *tw,
+				   struct hlist_nulls_head *list)
 {
-	hlist_nulls_add_tail_rcu(&tw->tw_node, list);
+	hlist_nulls_add_head_rcu(&tw->tw_node, list);
 }
 
 static void inet_twsk_add_bind_node(struct inet_timewait_sock *tw,
@@ -120,7 +120,7 @@ void inet_twsk_hashdance(struct inet_timewait_sock *tw, struct sock *sk,
 
 	spin_lock(lock);
 
-	inet_twsk_add_node_tail_rcu(tw, &ehead->chain);
+	inet_twsk_add_node_rcu(tw, &ehead->chain);
 
 	/* Step 3: Remove SK from hash chain */
 	if (__sk_nulls_del_node_init_rcu(sk))
-- 
2.39.2




  parent reply	other threads:[~2023-07-25 11:11 UTC|newest]

Thread overview: 86+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-25 10:45 [PATCH 5.15 00/78] 5.15.123-rc1 review Greg Kroah-Hartman
2023-07-25 10:45 ` [PATCH 5.15 01/78] ALSA: hda/realtek - remove 3k pull low procedure Greg Kroah-Hartman
2023-07-25 10:45 ` [PATCH 5.15 02/78] ALSA: hda/realtek: Add quirk for Clevo NS70AU Greg Kroah-Hartman
2023-07-25 10:45 ` [PATCH 5.15 03/78] ALSA: hda/realtek: Enable Mute LED on HP Laptop 15s-eq2xxx Greg Kroah-Hartman
2023-07-25 10:45 ` [PATCH 5.15 04/78] keys: Fix linking a duplicate key to a keyrings assoc_array Greg Kroah-Hartman
2023-07-25 10:45 ` [PATCH 5.15 05/78] perf probe: Add test for regression introduced by switch to die_get_decl_file() Greg Kroah-Hartman
2023-07-25 10:45 ` [PATCH 5.15 06/78] btrfs: fix warning when putting transaction with qgroups enabled after abort Greg Kroah-Hartman
2023-07-25 10:45 ` [PATCH 5.15 07/78] fuse: revalidate: dont invalidate if interrupted Greg Kroah-Hartman
2023-07-25 10:45 ` [PATCH 5.15 08/78] btrfs: zoned: fix memory leak after finding block group with super blocks Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 09/78] fuse: ioctl: translate ENOSYS in outarg Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 10/78] selftests: tc: set timeout to 15 minutes Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 11/78] selftests: tc: add ct action kconfig dep Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 12/78] regmap: Drop initial version of maximum transfer length fixes Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 13/78] regmap: Account for register length in SMBus I/O limits Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 14/78] can: bcm: Fix UAF in bcm_proc_show() Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 15/78] selftests: tc: add ConnTrack procfs kconfig Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 16/78] drm/client: Fix memory leak in drm_client_target_cloned Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 17/78] drm/client: Fix memory leak in drm_client_modeset_probe Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 18/78] drm/amd/display: Disable MPC split by default on special asic Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 19/78] drm/amd/display: Keep PHY active for DP displays on DCN31 Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 20/78] ASoC: fsl_sai: Disable bit clock with transmitter Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 21/78] ASoC: codecs: wcd938x: fix missing clsh ctrl error handling Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 22/78] ASoC: codecs: wcd-mbhc-v2: fix resource leaks on component remove Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 23/78] ASoC: codecs: wcd938x: " Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 24/78] ASoC: codecs: wcd938x: fix missing mbhc init error handling Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 25/78] ASoC: codecs: wcd934x: fix resource leaks on component remove Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 26/78] ASoC: codecs: wcd938x: fix codec initialisation race Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 27/78] ASoC: codecs: wcd938x: fix soundwire " Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 28/78] ext4: correct inline offset when handling xattrs in inode body Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 29/78] [PATCH AUTOSEL 4.14 1/9] drm/radeon: Fix integer overflow in radeon_cs_parser_init Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 30/78] [PATCH AUTOSEL 4.14 2/9] ALSA: emu10k1: roll up loops in DSP setup code for Audigy Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 31/78] [PATCH AUTOSEL 4.14 3/9] quota: Properly disable quotas when add_dquot_ref() fails Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 32/78] [PATCH AUTOSEL 4.14 4/9] quota: fix warning in dqgrab() Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 33/78] [PATCH AUTOSEL 4.14 5/9] udf: Fix uninitialized array access for some pathnames Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 34/78] [PATCH AUTOSEL 4.14 6/9] fs: jfs: Fix UBSAN: array-index-out-of-bounds in dbAllocDmapLev Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 35/78] [PATCH AUTOSEL 4.14 7/9] MIPS: dec: prom: Address -Warray-bounds warning Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 36/78] [PATCH AUTOSEL 4.14 8/9] FS: JFS: Fix null-ptr-deref Read in txBegin Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 37/78] [PATCH AUTOSEL 4.14 9/9] FS: JFS: Check for read-only mounted filesystem " Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 38/78] spi: bcm63xx: fix max prepend length Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 39/78] fbdev: imxfb: warn about invalid left/right margin Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 40/78] perf build: Fix library not found error when using CSLIBS Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 41/78] pinctrl: amd: Use amd_pinconf_set() for all config options Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 42/78] net: ethernet: ti: cpsw_ale: Fix cpsw_ale_get_field()/cpsw_ale_set_field() Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 43/78] bridge: Add extack warning when enabling STP in netns Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 44/78] ethernet: use eth_hw_addr_set() instead of ether_addr_copy() Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 45/78] of: net: add a helper for loading netdev->dev_addr Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 46/78] ethernet: use of_get_ethdev_address() Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 47/78] net: ethernet: mtk_eth_soc: handle probe deferral Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 48/78] net: sched: cls_bpf: Undo tcf_bind_filter in case of an error Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 49/78] iavf: Fix use-after-free in free_netdev Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 50/78] iavf: Fix out-of-bounds when setting channels on remove Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 51/78] security: keys: Modify mismatched function name Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 52/78] octeontx2-pf: Dont allocate BPIDs for LBK interfaces Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 53/78] bpf: Fix subprog idx logic in check_max_stack_depth Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 54/78] igc: Prevent garbled TX queue with XDP ZEROCOPY Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 55/78] tcp: annotate data-races around tcp_rsk(req)->ts_recent Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 56/78] net: ipv4: Use kfree_sensitive instead of kfree Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 57/78] net:ipv6: check return value of pskb_trim() Greg Kroah-Hartman
2023-07-25 10:46 ` Greg Kroah-Hartman [this message]
2023-07-25 10:46 ` [PATCH 5.15 59/78] fbdev: au1200fb: Fix missing IRQ check in au1200fb_drv_probe Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 60/78] llc: Dont drop packet from non-root netns Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 61/78] netfilter: nf_tables: fix spurious set element insertion failure Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 62/78] netfilter: nft_set_pipapo: fix improper element removal Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 63/78] netfilter: nf_tables: skip bound chain in netns release path Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 64/78] netfilter: nf_tables: skip bound chain on rule flush Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 65/78] tcp: annotate data-races around tp->tcp_tx_delay Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 66/78] tcp: annotate data-races around tp->keepalive_time Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 67/78] tcp: annotate data-races around tp->keepalive_intvl Greg Kroah-Hartman
2023-07-25 10:46 ` [PATCH 5.15 68/78] tcp: annotate data-races around tp->keepalive_probes Greg Kroah-Hartman
2023-07-25 10:47 ` [PATCH 5.15 69/78] tcp: annotate data-races around icsk->icsk_syn_retries Greg Kroah-Hartman
2023-07-25 10:47 ` [PATCH 5.15 70/78] tcp: annotate data-races around tp->linger2 Greg Kroah-Hartman
2023-07-25 10:47 ` [PATCH 5.15 71/78] tcp: annotate data-races around rskq_defer_accept Greg Kroah-Hartman
2023-07-25 10:47 ` [PATCH 5.15 72/78] tcp: annotate data-races around tp->notsent_lowat Greg Kroah-Hartman
2023-07-25 10:47 ` [PATCH 5.15 73/78] tcp: annotate data-races around icsk->icsk_user_timeout Greg Kroah-Hartman
2023-07-25 10:47 ` [PATCH 5.15 74/78] tcp: annotate data-races around fastopenq.max_qlen Greg Kroah-Hartman
2023-07-25 10:47 ` [PATCH 5.15 75/78] net: phy: prevent stale pointer dereference in phy_init() Greg Kroah-Hartman
2023-07-25 10:47 ` [PATCH 5.15 76/78] jbd2: recheck chechpointing non-dirty buffer Greg Kroah-Hartman
2023-07-25 10:47 ` [PATCH 5.15 77/78] tracing/histograms: Return an error if we fail to add histogram to hist_vars list Greg Kroah-Hartman
2023-07-25 10:47 ` [PATCH 5.15 78/78] nixge: fix mac address error handling again Greg Kroah-Hartman
2023-07-25 16:27 ` [PATCH 5.15 00/78] 5.15.123-rc1 review Jon Hunter
2023-07-25 18:08 ` SeongJae Park
2023-07-25 20:20 ` Shuah Khan
2023-07-25 21:52 ` Florian Fainelli
2023-07-26 14:19 ` Naresh Kamboju
2023-07-27  0:02 ` Guenter Roeck
2023-07-27  0:10 ` Ron Economos

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230725104453.506004557@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=kuba@kernel.org \
    --cc=kuniyu@amazon.com \
    --cc=patches@lists.linux.dev \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox