[PATCH v1 0/2] Tiny optimization for large read operations

linux-fsdevel.vger.kernel.org archive mirror

* [PATCH v1 0/2] Tiny optimization for large read operations
@ 2025-07-28  8:39 Chi Zhiling
  2025-07-28  8:39 ` [PATCH v1 1/2] mm/filemap: Do not use is_partially_uptodate for entire folio Chi Zhiling
  2025-07-28  8:39 ` [PATCH v1 2/2] mm/filemap: Skip non-uptodate folio if there are available folios Chi Zhiling
  0 siblings, 2 replies; 3+ messages in thread
From: Chi Zhiling @ 2025-07-28  8:39 UTC
  To: willy, akpm; +Cc: linux-fsdevel, linux-mm, linux-kernel, Chi Zhiling

From: Chi Zhiling <chizhiling@kylinos.cn>

This series contains two patches:

1. Skip calling is_partially_uptodate() when the range covers the entire
folio, to save time. I have reviewed the mpage and iomap implementations
and didn't spot any issues, but this change likely needs a more thorough
review.

2. Skip calling filemap_update_page() if there are ready folios in the
batch. This might save a few milliseconds in practice, but I didn't
observe measurable improvements in my tests.

Changes from RFC:
- Update the commits.
- Switch to the new solution provided by Matthew Wilcox.

RFC:
https://lore.kernel.org/linux-fsdevel/20250723101825.607184-1-chizhiling@163.com/

Chi Zhiling (2):
  mm/filemap: Do not use is_partially_uptodate for entire folio
  mm/filemap: Skip non-uptodate folio if there are available folios

 mm/filemap.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

-- 
2.43.0



* [PATCH v1 1/2] mm/filemap: Do not use is_partially_uptodate for entire folio
  2025-07-28  8:39 [PATCH v1 0/2] Tiny optimization for large read operations Chi Zhiling
@ 2025-07-28  8:39 ` Chi Zhiling
  2025-07-28  8:39 ` [PATCH v1 2/2] mm/filemap: Skip non-uptodate folio if there are available folios Chi Zhiling
  1 sibling, 0 replies; 3+ messages in thread
From: Chi Zhiling @ 2025-07-28  8:39 UTC
  To: willy, akpm; +Cc: linux-fsdevel, linux-mm, linux-kernel, Chi Zhiling

From: Chi Zhiling <chizhiling@kylinos.cn>

When a folio is not marked uptodate, at least some of its data is not
up to date. Calling is_partially_uptodate() to recheck a range that
covers the entire folio is therefore redundant.

If all data in a folio is actually up to date but the folio lacks the
uptodate flag, it is still treated as non-uptodate in many other
places, so filemap needs no special-case handling for it.

Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
---
 mm/filemap.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/filemap.c b/mm/filemap.c
index 0e103fc99a8e..00c30f7f7dc3 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2447,6 +2447,9 @@ static bool filemap_range_uptodate(struct address_space *mapping,
 		pos -= folio_pos(folio);
 	}
 
+	if (pos == 0 && count >= folio_size(folio))
+		return false;
+
 	return mapping->a_ops->is_partially_uptodate(folio, pos, count);
 }
 
-- 
2.43.0
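
The check added above can be modelled in user space. This is only a sketch: struct model_folio, model_is_partially_uptodate() and model_range_uptodate() are hypothetical stand-ins, not kernel APIs, and the model covers only the tail of filemap_range_uptodate() for a folio already known to lack the uptodate flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio whose uptodate flag is clear. */
struct model_folio {
	size_t size;		/* folio size in bytes */
	size_t valid_bytes;	/* bytes [0, valid_bytes) hold up-to-date data */
};

/* Stand-in for ->is_partially_uptodate(): is [pos, pos + count) valid? */
static bool model_is_partially_uptodate(const struct model_folio *folio,
					size_t pos, size_t count)
{
	return pos + count <= folio->valid_bytes;
}

/*
 * Models only the tail of filemap_range_uptodate() for a folio that is
 * already known to lack the uptodate flag; pos is relative to the folio.
 */
static bool model_range_uptodate(const struct model_folio *folio,
				 size_t pos, size_t count)
{
	assert(folio->size > 0 && folio->valid_bytes <= folio->size);

	/* The range covers the whole folio: the missing flag is the answer. */
	if (pos == 0 && count >= folio->size)
		return false;

	return model_is_partially_uptodate(folio, pos, count);
}
```

In this model, a folio whose data is entirely valid but whose flag is clear is still reported as not uptodate for a whole-folio range, which is the behaviour the commit message argues is acceptable. Sub-folio ranges still go through the partial check.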



* [PATCH v1 2/2] mm/filemap: Skip non-uptodate folio if there are available folios
  2025-07-28  8:39 [PATCH v1 0/2] Tiny optimization for large read operations Chi Zhiling
  2025-07-28  8:39 ` [PATCH v1 1/2] mm/filemap: Do not use is_partially_uptodate for entire folio Chi Zhiling
@ 2025-07-28  8:39 ` Chi Zhiling
  1 sibling, 0 replies; 3+ messages in thread
From: Chi Zhiling @ 2025-07-28  8:39 UTC
  To: willy, akpm; +Cc: linux-fsdevel, linux-mm, linux-kernel, Chi Zhiling

From: Chi Zhiling <chizhiling@kylinos.cn>

When a read exceeds the maximum IO size, the operation is split into
multiple IO requests, but the data isn't copied to userspace
immediately after each IO completes.

For example, when reading 2560k of data from a device with a 1280k
maximum IO size, the following sequence occurs:

1. read 1280k
2. copy 41 pages and issue readahead for the next 1280k
3. copy 31 pages to user buffer
4. wait for the next 1280k
5. copy 8 pages to user buffer
6. copy 20 folios (64k each) to user buffer

The 8 pages in step 5 are copied only after the second 1280k completes
(step 4), because filemap_update_page() waits for a non-uptodate folio.
Copying these 8 pages before the second 1280k completes would reduce
the latency of this read operation.

After applying the patch, these 8 pages are copied before the next IO
completes:

1. read 1280k
2. copy 41 pages and issue readahead for the next 1280k
3. copy 31 pages to user buffer
4. copy 8 pages to user buffer
5. wait for the next 1280k
6. copy 20 folios (64k each) to user buffer
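
As a sanity check on the step counts, the arithmetic can be sketched as below. The 16 KiB base page size is an assumption: the message does not state the page size, but with that value the 41 + 31 + 8 pages add up to the first 1280k, and the 20 folios of 64k add up to the second. All names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_KIB	16	/* assumption: base page size, not stated in the message */
#define MODEL_FOLIO_KIB	64	/* folio size given in the last step */

static size_t model_pages_kib(size_t pages)
{
	return pages * MODEL_PAGE_KIB;
}

static size_t model_folios_kib(size_t folios)
{
	return folios * MODEL_FOLIO_KIB;
}

/* Total data copied to the user buffer across all copy steps, in KiB. */
static size_t model_total_kib(void)
{
	size_t first_io = model_pages_kib(41 + 31 + 8);	/* data from the first IO */
	size_t second_io = model_folios_kib(20);	/* data from the second IO */

	/* Both halves should match the 1280k maximum IO size. */
	assert(first_io == second_io);
	return first_io + second_io;
}
```

Under that assumption the 8 pages belong to the first IO, which is why they can be copied without waiting for the second one.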

This patch drops the code that sets IOCB_NOWAIT for AIO in
filemap_get_pages(), which is fine because filemap_read() will set it
again for AIO.

The final solution was provided by Matthew Wilcox:
Link: https://lore.kernel.org/linux-fsdevel/aIDy076Sxt544qja@casper.infradead.org/

Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
---
 mm/filemap.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 00c30f7f7dc3..d2e07184b281 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2623,9 +2623,10 @@ static int filemap_get_pages(struct kiocb *iocb, size_t count,
 			goto err;
 	}
 	if (!folio_test_uptodate(folio)) {
-		if ((iocb->ki_flags & IOCB_WAITQ) &&
-		    folio_batch_count(fbatch) > 1)
-			iocb->ki_flags |= IOCB_NOWAIT;
+		if (folio_batch_count(fbatch) > 1) {
+			err = -EAGAIN;
+			goto err;
+		}
 		err = filemap_update_page(iocb, mapping, count, folio,
 					  need_uptodate);
 		if (err)
-- 
2.43.0
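
The decision this patch changes can be modelled in user space as follows. This is a sketch under assumptions: model_get_pages() and enum model_result are hypothetical names, and the model assumes the existing err path (not shown in the diff) drops the last folio and returns success when other folios remain in the batch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_result {
	MODEL_COPY_NOW,		/* *nr_ready folios can be copied immediately */
	MODEL_MUST_WAIT,	/* only the non-uptodate folio exists: update it */
};

/*
 * batch_count:   folios in the batch, including the last one
 * last_uptodate: whether the last folio in the batch is uptodate
 * nr_ready:      out parameter, folios the caller can copy right away
 */
static enum model_result model_get_pages(size_t batch_count,
					 bool last_uptodate,
					 size_t *nr_ready)
{
	assert(batch_count > 0 && nr_ready != NULL);

	if (last_uptodate) {
		*nr_ready = batch_count;
		return MODEL_COPY_NOW;
	}

	if (batch_count > 1) {
		/* Patched path: -EAGAIN drops the last folio, the rest are copied. */
		*nr_ready = batch_count - 1;
		return MODEL_COPY_NOW;
	}

	/* Single folio: the kernel would call filemap_update_page() here. */
	*nr_ready = 0;
	return MODEL_MUST_WAIT;
}
```

With a batch of nine folios where only the last is not uptodate, the model hands back eight folios to copy instead of waiting, which corresponds to copying the 8 pages before the second IO completes.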

