Message-ID: <381c2557-a714-41f6-8dd2-e0df1ca65919@kernel.org>
Date: Mon, 1 Dec 2025 13:24:44 -0800
From: Chao Yu
To: Jaegeuk Kim, linux-kernel@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net, linux-mm@kvack.org, Matthew Wilcox
Cc: chao@kernel.org
Subject: Re: [f2fs-dev] [PATCH 1/4] mm/readahead: fix the broken readahead for POSIX_FADV_WILLNEED
References: <20251201210152.909339-1-jaegeuk@kernel.org> <20251201210152.909339-2-jaegeuk@kernel.org>
In-Reply-To: <20251201210152.909339-2-jaegeuk@kernel.org>

On 2025/12/2 05:01, Jaegeuk Kim via Linux-f2fs-devel wrote:
> This patch fixes the broken readahead flow for POSIX_FADV_WILLNEED. The
> problem is that, in force_page_cache_ra(), nr_to_read is cut down by the
> code below:
>
>     max_pages = max_t(unsigned long, bdi->io_pages, ra->ra_pages);
>     nr_to_read = min_t(unsigned long, nr_to_read, max_pages);
>
> In other words, we cannot read ahead more than max_pages, which is most
> likely in the range of 2MB to 16MB. Note that it doesn't make sense to
> set ra->ra_pages to the entire file size. Instead, let's fix this logic.
>
> Before:
> f2fs_fadvise: dev = (252,16), ino = 14, i_size = 4294967296 offset:0, len:4294967296, advise:3
> page_cache_ra_unbounded: dev=252:16 ino=e index=0 nr_to_read=512 lookahead_size=0
> page_cache_ra_unbounded: dev=252:16 ino=e index=512 nr_to_read=512 lookahead_size=0
> page_cache_ra_unbounded: dev=252:16 ino=e index=1024 nr_to_read=512 lookahead_size=0
> page_cache_ra_unbounded: dev=252:16 ino=e index=1536 nr_to_read=512 lookahead_size=0
>
> After:
> f2fs_fadvise: dev = (252,16), ino = 14, i_size = 4294967296 offset:0, len:4294967296, advise:3
> page_cache_ra_unbounded: dev=252:16 ino=e index=0 nr_to_read=2048 lookahead_size=0
> page_cache_ra_unbounded: dev=252:16 ino=e index=2048 nr_to_read=2048 lookahead_size=0
> page_cache_ra_unbounded: dev=252:16 ino=e index=4096 nr_to_read=2048 lookahead_size=0
> page_cache_ra_unbounded: dev=252:16 ino=e index=6144 nr_to_read=2048 lookahead_size=0
> page_cache_ra_unbounded: dev=252:16 ino=e index=8192 nr_to_read=2048 lookahead_size=0
> page_cache_ra_unbounded: dev=252:16 ino=e index=10240 nr_to_read=2048 lookahead_size=0
> page_cache_ra_unbounded: dev=252:16 ino=e index=12288 nr_to_read=2048 lookahead_size=0
> page_cache_ra_unbounded: dev=252:16 ino=e index=14336 nr_to_read=2048 lookahead_size=0
> page_cache_ra_unbounded: dev=252:16 ino=e index=16384 nr_to_read=2048 lookahead_size=0
> page_cache_ra_unbounded: dev=252:16 ino=e index=18432 nr_to_read=2048 lookahead_size=0
> page_cache_ra_unbounded: dev=252:16 ino=e index=20480 nr_to_read=2048 lookahead_size=0
> page_cache_ra_unbounded: dev=252:16 ino=e index=22528 nr_to_read=2048 lookahead_size=0
> page_cache_ra_unbounded: dev=252:16 ino=e index=24576 nr_to_read=2048 lookahead_size=0
> ...
> page_cache_ra_unbounded: dev=252:16 ino=e index=1042432 nr_to_read=2048 lookahead_size=0
> page_cache_ra_unbounded: dev=252:16 ino=e index=1044480 nr_to_read=2048 lookahead_size=0
> page_cache_ra_unbounded: dev=252:16 ino=e index=1046528 nr_to_read=2048 lookahead_size=0
>
> Cc: linux-mm@kvack.org
> Cc: Matthew Wilcox (Oracle)
> Signed-off-by: Jaegeuk Kim
> ---
>  mm/readahead.c | 27 ++++++++++++---------------
>  1 file changed, 12 insertions(+), 15 deletions(-)
>
> diff --git a/mm/readahead.c b/mm/readahead.c
> index 3a4b5d58eeb6..c0db049a5b7b 100644
> --- a/mm/readahead.c
> +++ b/mm/readahead.c
> @@ -311,7 +311,7 @@ EXPORT_SYMBOL_GPL(page_cache_ra_unbounded);
>   * behaviour which would occur if page allocations are causing VM writeback.
>   * We really don't want to intermingle reads and writes like that.
>   */
> -static void do_page_cache_ra(struct readahead_control *ractl,
> +static int do_page_cache_ra(struct readahead_control *ractl,
>  		unsigned long nr_to_read, unsigned long lookahead_size)
>  {
>  	struct inode *inode = ractl->mapping->host;
> @@ -320,45 +320,42 @@ static void do_page_cache_ra(struct readahead_control *ractl,
>  	pgoff_t end_index;	/* The last page we want to read */
>
>  	if (isize == 0)
> -		return;
> +		return -EINVAL;
>
>  	end_index = (isize - 1) >> PAGE_SHIFT;
>  	if (index > end_index)
> -		return;
> +		return -EINVAL;
>  	/* Don't read past the page containing the last byte of the file */
>  	if (nr_to_read > end_index - index)
>  		nr_to_read = end_index - index + 1;
>
>  	page_cache_ra_unbounded(ractl, nr_to_read, lookahead_size);
> +	return 0;
>  }
>
>  /*
> - * Chunk the readahead into 2 megabyte units, so that we don't pin too much
> - * memory at once.
> + * Chunk the readahead per the block device capacity, and read all nr_to_read.
>   */
>  void force_page_cache_ra(struct readahead_control *ractl,
>  		unsigned long nr_to_read)
>  {
>  	struct address_space *mapping = ractl->mapping;
> -	struct file_ra_state *ra = ractl->ra;
>  	struct backing_dev_info *bdi = inode_to_bdi(mapping->host);
> -	unsigned long max_pages;
> +	unsigned long this_chunk;
>
>  	if (unlikely(!mapping->a_ops->read_folio && !mapping->a_ops->readahead))
>  		return;
>
>  	/*
> -	 * If the request exceeds the readahead window, allow the read to
> -	 * be up to the optimal hardware IO size
> +	 * Consier the optimal hardware IO size for readahead chunk.

s/Consier/Consider

Thanks,

>  	 */
> -	max_pages = max_t(unsigned long, bdi->io_pages, ra->ra_pages);
> -	nr_to_read = min_t(unsigned long, nr_to_read, max_pages);
> +	this_chunk = max_t(unsigned long, bdi->io_pages, ractl->ra->ra_pages);
> +
>  	while (nr_to_read) {
> -		unsigned long this_chunk = (2 * 1024 * 1024) / PAGE_SIZE;
> +		this_chunk = min_t(unsigned long, this_chunk, nr_to_read);
>
> -		if (this_chunk > nr_to_read)
> -			this_chunk = nr_to_read;
> -		do_page_cache_ra(ractl, this_chunk, 0);
> +		if (do_page_cache_ra(ractl, this_chunk, 0))
> +			break;
>
>  		nr_to_read -= this_chunk;
>  	}