public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Mike Snitzer <snitzer@kernel.org>,
	Jeff Layton <jlayton@kernel.org>,
	Anna Schumaker <anna.schumaker@oracle.com>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.15 15/76] nfs: avoid i_lock contention in nfs_clear_invalid_mapping
Date: Tue, 12 Nov 2024 11:20:40 +0100	[thread overview]
Message-ID: <20241112101840.362126459@linuxfoundation.org> (raw)
In-Reply-To: <20241112101839.777512218@linuxfoundation.org>

5.15-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Mike Snitzer <snitzer@kernel.org>

[ Upstream commit 867da60d463bb2a3e28c9235c487e56e96cffa00 ]

Multi-threaded buffered reads to the same file exposed significant
inode spinlock contention in nfs_clear_invalid_mapping().

Eliminate this spinlock contention by checking flags without locking,
instead using smp_rmb and smp_load_acquire accordingly, but then take
spinlock and double-check these inode flags.

Also refactor nfs_set_cache_invalid() slightly to use
smp_store_release() to pair with nfs_clear_invalid_mapping()'s
smp_load_acquire().

While this fix is beneficial for all multi-threaded buffered reads
issued by an NFS client, this issue was identified in the context of
surprisingly low LOCALIO performance with 4K multi-threaded buffered
read IO.  This fix dramatically speeds up LOCALIO performance:

before: read: IOPS=1583k, BW=6182MiB/s (6482MB/s)(121GiB/20002msec)
after:  read: IOPS=3046k, BW=11.6GiB/s (12.5GB/s)(232GiB/20001msec)

Fixes: 17dfeb911339 ("NFS: Fix races in nfs_revalidate_mapping")
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/nfs/inode.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index edf59f809ded9..eb549a66a748e 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -212,12 +212,15 @@ void nfs_set_cache_invalid(struct inode *inode, unsigned long flags)
 		nfs_fscache_invalidate(inode);
 	flags &= ~(NFS_INO_REVAL_PAGECACHE | NFS_INO_REVAL_FORCED);
 
-	nfsi->cache_validity |= flags;
+	flags |= nfsi->cache_validity;
+	if (inode->i_mapping->nrpages == 0)
+		flags &= ~NFS_INO_INVALID_DATA;
 
-	if (inode->i_mapping->nrpages == 0) {
-		nfsi->cache_validity &= ~NFS_INO_INVALID_DATA;
-		nfs_ooo_clear(nfsi);
-	} else if (nfsi->cache_validity & NFS_INO_INVALID_DATA) {
+	/* pairs with nfs_clear_invalid_mapping()'s smp_load_acquire() */
+	smp_store_release(&nfsi->cache_validity, flags);
+
+	if (inode->i_mapping->nrpages == 0 ||
+	    nfsi->cache_validity & NFS_INO_INVALID_DATA) {
 		nfs_ooo_clear(nfsi);
 	}
 	trace_nfs_set_cache_invalid(inode, 0);
@@ -1350,6 +1353,13 @@ int nfs_clear_invalid_mapping(struct address_space *mapping)
 					 nfs_wait_bit_killable, TASK_KILLABLE);
 		if (ret)
 			goto out;
+		smp_rmb(); /* pairs with smp_wmb() below */
+		if (test_bit(NFS_INO_INVALIDATING, bitlock))
+			continue;
+		/* pairs with nfs_set_cache_invalid()'s smp_store_release() */
+		if (!(smp_load_acquire(&nfsi->cache_validity) & NFS_INO_INVALID_DATA))
+			goto out;
+		/* Slow-path that double-checks with spinlock held */
 		spin_lock(&inode->i_lock);
 		if (test_bit(NFS_INO_INVALIDATING, bitlock)) {
 			spin_unlock(&inode->i_lock);
-- 
2.43.0




  parent reply	other threads:[~2024-11-12 10:23 UTC|newest]

Thread overview: 85+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-12 10:20 [PATCH 5.15 00/76] 5.15.172-rc1 review Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 01/76] arm64: dts: rockchip: Fix rt5651 compatible value on rk3399-sapphire-excavator Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 02/76] arm64: dts: rockchip: Remove hdmis 2nd interrupt on rk3328 Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 03/76] arm64: dts: rockchip: Fix bluetooth properties on Rock960 boards Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 04/76] arm64: dts: rockchip: Remove #cooling-cells from fan on Theobroma lion Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 05/76] arm64: dts: rockchip: Fix LED triggers on rk3308-roc-cc Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 06/76] arm64: dts: imx8mp: correct sdhc ipg clk Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 07/76] ARM: dts: rockchip: fix rk3036 acodec node Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 08/76] ARM: dts: rockchip: drop grf reference from rk3036 hdmi Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 09/76] ARM: dts: rockchip: Fix the spi controller on rk3036 Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 10/76] ARM: dts: rockchip: Fix the realtek audio codec on rk3036-kylin Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 11/76] HID: core: zero-initialize the report buffer Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 12/76] NFSv3: only use NFS timeout for MOUNT when protocols are compatible Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 13/76] NFS: Add a tracepoint to show the results of nfs_set_cache_invalid() Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 14/76] NFSv3: handle out-of-order write replies Greg Kroah-Hartman
2024-11-12 10:20 ` Greg Kroah-Hartman [this message]
2024-11-12 10:20 ` [PATCH 5.15 16/76] security/keys: fix slab-out-of-bounds in key_task_permission Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 17/76] net: enetc: set MAC address to the VF net_device Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 18/76] sctp: properly validate chunk size in sctp_sf_ootb() Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 19/76] can: c_can: fix {rx,tx}_errors statistics Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 20/76] i40e: fix race condition by adding filters intermediate sync state Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 21/76] net: hns3: fix kernel crash when uninstalling driver Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 22/76] net: phy: ti: add PHY_RST_AFTER_CLK_EN flag Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 23/76] net: stmmac: Fix unbalanced IRQ wake disable warning on single irq case Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 24/76] net: arc: fix the device for dma_map_single/dma_unmap_single Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 25/76] Revert "ALSA: hda/conexant: Mute speakers at suspend / shutdown" Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 26/76] media: stb0899_algo: initialize cfr before using it Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 27/76] media: dvbdev: prevent the risk of out of memory access Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 28/76] media: dvb_frontend: dont play tricks with underflow values Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 29/76] media: adv7604: prevent underflow condition when reporting colorspace Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 30/76] scsi: sd_zbc: Use kvzalloc() to allocate REPORT ZONES buffer Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 31/76] ALSA: firewire-lib: fix return value on fail in amdtp_tscm_init() Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 32/76] ASoC: stm32: spdifrx: fix dma channel release in stm32_spdifrx_remove Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 33/76] media: s5p-jpeg: prevent buffer overflows Greg Kroah-Hartman
2024-11-12 10:20 ` [PATCH 5.15 34/76] media: cx24116: prevent overflows on SNR calculus Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 35/76] media: pulse8-cec: fix data timestamp at pulse8_setup() Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 36/76] media: v4l2-tpg: prevent the risk of a division by zero Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 37/76] media: v4l2-ctrls-api: fix error handling for v4l2_g_ctrl() Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 38/76] pwm: imx-tpm: Use correct MODULO value for EPWM mode Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 39/76] drm/amdgpu: Adjust debugfs eviction and IB access permissions Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 40/76] drm/amdgpu: add missing size check in amdgpu_debugfs_gprwave_read() Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 41/76] drm/amdgpu: prevent NULL pointer dereference if ATIF is not supported Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 42/76] thermal/drivers/qcom/lmh: Remove false lockdep backtrace Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 43/76] dm cache: correct the number of origin blocks to match the target length Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 44/76] dm cache: fix out-of-bounds access to the dirty bitset when resizing Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 45/76] dm cache: optimize dirty bit checking with find_next_bit " Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 46/76] dm cache: fix potential out-of-bounds access on the first resume Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 47/76] dm-unstriped: cast an operand to sector_t to prevent potential uint32_t overflow Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 48/76] ALSA: usb-audio: Add quirk for HP 320 FHD Webcam Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 49/76] posix-cpu-timers: Clear TICK_DEP_BIT_POSIX_TIMER on clone Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 50/76] io_uring: rename kiocb_end_write() local helper Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 51/76] fs: create kiocb_{start,end}_write() helpers Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 52/76] io_uring: use " Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 53/76] io_uring/rw: fix missing NOWAIT check for O_DIRECT start write Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 54/76] nfs: Fix KMSAN warning in decode_getfattr_attrs() Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 55/76] btrfs: reinitialize delayed ref list after deleting it from the list Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 56/76] net: bridge: xmit: make sure we have at least eth header len bytes Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 57/76] ice: Add a per-VF limit on number of FDIR filters Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 58/76] net: do not delay dst_entries_add() in dst_release() Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 59/76] media: uvcvideo: Skip parsing frames of type UVC_VS_UNDEFINED in uvc_parse_format Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 60/76] fs/proc: fix compile warning about variable vmcore_mmap_ops Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 61/76] usb: musb: sunxi: Fix accessing an released usb phy Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 62/76] usb: dwc3: fix fault at system suspend if device was already runtime suspended Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 63/76] usb: typec: fix potential out of bounds in ucsi_ccg_update_set_new_cam_cmd() Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 64/76] USB: serial: io_edgeport: fix use after free in debug printk Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 65/76] USB: serial: qcserial: add support for Sierra Wireless EM86xx Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 66/76] USB: serial: option: add Fibocom FG132 0x0112 composition Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 67/76] USB: serial: option: add Quectel RG650V Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 68/76] irqchip/gic-v3: Force propagation of the active state with a read-back Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 69/76] ocfs2: remove entry once instead of null-ptr-dereference in ocfs2_xa_remove() Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 70/76] ucounts: fix counter leak in inc_rlimit_get_ucounts() Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 71/76] ALSA: usb-audio: Support jack detection on Dell dock Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 72/76] ALSA: usb-audio: Add quirks for Dell WD19 dock Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 73/76] ACPI: PRM: Clean up guid type in struct prm_handler_info Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 74/76] hv_sock: Initializing vsk->trans to NULL to prevent a dangling pointer Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 75/76] vsock/virtio: Initialization of the dangling pointer occurring in vsk->trans Greg Kroah-Hartman
2024-11-12 10:21 ` [PATCH 5.15 76/76] ALSA: usb-audio: Add endianness annotations Greg Kroah-Hartman
2024-11-12 16:05 ` [PATCH 5.15 00/76] 5.15.172-rc1 review Harshit Mogalapalli
2024-11-12 23:22 ` Shuah Khan
2024-11-13  0:45 ` Ron Economos
2024-11-13  0:56 ` Florian Fainelli
2024-11-13 11:28 ` Naresh Kamboju
2024-11-13 13:30 ` Mark Brown
2024-11-13 19:58 ` Jon Hunter
2024-11-14 10:49 ` [PATCH 5.15] " Hardik Garg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20241112101840.362126459@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=anna.schumaker@oracle.com \
    --cc=jlayton@kernel.org \
    --cc=patches@lists.linux.dev \
    --cc=sashal@kernel.org \
    --cc=snitzer@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox