From: Aubrey Li <aubrey.li@linux.intel.com>
To: Matthew Wilcox <willy@infradead.org>,
Andrew Morton <akpm@linux-foundation.org>,
Nanhai Zou <nanhai.zou@intel.com>,
Gang Deng <gang.deng@intel.com>,
Tianyou Li <tianyou.li@intel.com>,
Vinicius Gomes <vinicius.gomes@intel.com>,
Tim Chen <tim.c.chen@linux.intel.com>,
Chen Yu <yu.c.chen@intel.com>
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org,
Aubrey Li <aubrey.li@linux.intel.com>
Subject: [PATCH] mm/readahead: Skip fully overlapped range
Date: Tue, 23 Sep 2025 11:59:46 +0800
Message-ID: <20250923035946.2560876-1-aubrey.li@linux.intel.com>

The RocksDB sequential read benchmark shows severe lock contention under
high concurrency. Multiple threads may issue readahead on the same file
simultaneously, which leads to heavy contention on the xas spinlock in
filemap_add_folio(). Perf profiling shows 30%~60% of CPU time spent
there.

To mitigate this, skip a readahead request if its range is fully covered
by an ongoing readahead. This avoids redundant work and significantly
reduces lock contention. In a one-second sample, the number of contention
events on the xas spinlock dropped from 138,314 to 2,144, roughly
doubling benchmark throughput.

                                w/o patch       w/ patch
RocksDB-readseq (ops/sec)
(32-threads)                    1.2M            2.4M

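For illustration only (this is not part of the patch), the "fully covered"
condition can be written as a standalone userspace predicate. The names
struct ra_window, ra_fully_covered() and ra_fully_covered_example() are
hypothetical; only the comparison itself is taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two file_ra_state fields the patch uses. */
struct ra_window {
	unsigned long start;	/* first page index of the ongoing readahead */
	unsigned long size;	/* number of pages in the ongoing readahead */
};

/*
 * Return true when [index, index + nr) lies entirely inside
 * [w->start, w->start + w->size). Same comparison as in the patch.
 */
bool ra_fully_covered(const struct ra_window *w,
		      unsigned long index, unsigned long nr)
{
	return index >= w->start && index + nr <= w->start + w->size;
}

/* Small usage example with an assumed ongoing window of pages [100, 164). */
void ra_fully_covered_example(void)
{
	struct ra_window w = { .start = 100, .size = 64 };

	assert(ra_fully_covered(&w, 120, 16));	/* inside the window */
	assert(!ra_fully_covered(&w, 90, 16));	/* starts before it */
	assert(!ra_fully_covered(&w, 150, 32));	/* runs past its end */
}
```

With the window reset to start = 0 and size = 0, no request of one or more
pages satisfies the comparison, so such requests are never skipped.
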
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Vinicius Gomes <vinicius.gomes@intel.com>
Cc: Tianyou Li <tianyou.li@intel.com>
Cc: Chen Yu <yu.c.chen@intel.com>
Suggested-by: Nanhai Zou <nanhai.zou@intel.com>
Tested-by: Gang Deng <gang.deng@intel.com>
Signed-off-by: Aubrey Li <aubrey.li@linux.intel.com>
---
mm/readahead.c | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)
diff --git a/mm/readahead.c b/mm/readahead.c
index 20d36d6b055e..57ae1a137730 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -337,7 +337,7 @@ void force_page_cache_ra(struct readahead_control *ractl,
struct address_space *mapping = ractl->mapping;
struct file_ra_state *ra = ractl->ra;
struct backing_dev_info *bdi = inode_to_bdi(mapping->host);
- unsigned long max_pages;
+ unsigned long max_pages, index;
if (unlikely(!mapping->a_ops->read_folio && !mapping->a_ops->readahead))
return;
@@ -348,6 +348,19 @@ void force_page_cache_ra(struct readahead_control *ractl,
*/
max_pages = max_t(unsigned long, bdi->io_pages, ra->ra_pages);
nr_to_read = min_t(unsigned long, nr_to_read, max_pages);
+
+ index = readahead_index(ractl);
+ /*
+ * Skip this readahead if the requested range is fully covered
+ * by the ongoing readahead range. This typically occurs in
+ * concurrent scenarios.
+ */
+ if (index >= ra->start && index + nr_to_read <= ra->start + ra->size)
+ return;
+
+ ra->start = index;
+ ra->size = nr_to_read;
+
while (nr_to_read) {
unsigned long this_chunk = (2 * 1024 * 1024) / PAGE_SIZE;
@@ -357,6 +370,10 @@ void force_page_cache_ra(struct readahead_control *ractl,
nr_to_read -= this_chunk;
}
+
+ /* Reset readahead state to allow the next readahead */
+ ra->start = 0;
+ ra->size = 0;
}
/*
--
2.43.0
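
The sequence the diff introduces (check, publish the window, read in
chunks, reset) can be traced with a small single-threaded userspace model.
This is only a sketch under stated assumptions: ra_begin(), ra_chunks(),
ra_end() and ra_lifecycle_example() are hypothetical names, a 4 KiB page
size is assumed, and the clamp of the final chunk is assumed because it is
not visible in the hunks. It does not model how concurrent updates to the
two fields are ordered.

```c
#include <assert.h>
#include <stdbool.h>

/* 2 MiB per chunk, assuming 4 KiB pages (PAGE_SIZE is arch-dependent). */
#define CHUNK_PAGES ((2UL * 1024 * 1024) / 4096)

/* Hypothetical stand-in for ra->start and ra->size. */
struct ra_window {
	unsigned long start;
	unsigned long size;
};

/* Skip if fully covered; otherwise publish the new window and proceed. */
bool ra_begin(struct ra_window *w, unsigned long index, unsigned long nr)
{
	if (index >= w->start && index + nr <= w->start + w->size)
		return false;
	w->start = index;
	w->size = nr;
	return true;
}

/* Count how many chunks the read loop would issue for nr pages. */
unsigned long ra_chunks(unsigned long nr)
{
	unsigned long chunks = 0;

	while (nr) {
		unsigned long this_chunk = CHUNK_PAGES;

		if (this_chunk > nr)	/* assumed clamp of the last chunk */
			this_chunk = nr;
		nr -= this_chunk;
		chunks++;
	}
	return chunks;
}

/* Reset the window so that the next request is not skipped. */
void ra_end(struct ra_window *w)
{
	w->start = 0;
	w->size = 0;
}

void ra_lifecycle_example(void)
{
	struct ra_window w = { 0, 0 };

	assert(ra_begin(&w, 1000, 1280));	/* publishes [1000, 2280) */
	assert(!ra_begin(&w, 1200, 256));	/* fully covered: skipped */
	assert(ra_begin(&w, 2000, 512));	/* partial overlap: proceeds */
	assert(w.start == 2000 && w.size == 512);
	assert(ra_chunks(1280) == 3);		/* 512 + 512 + 256 pages */
	ra_end(&w);
	assert(ra_begin(&w, 1200, 256));	/* after reset: proceeds */
}
```

In this model a partially overlapping request is not skipped and replaces
the published window, and after the reset a previously covered request
proceeds again.
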