linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [RFC] iomap: use huge zero folio in iomap_dio_zero
@ 2024-05-07 18:38 Ritesh Harjani
  2024-05-08 11:42 ` Christoph Hellwig
  0 siblings, 1 reply; 25+ messages in thread
From: Ritesh Harjani @ 2024-05-07 18:38 UTC (permalink / raw)
  To: Christoph Hellwig, Pankaj Raghav (Samsung)
  Cc: hch, willy, mcgrof, akpm, brauner, chandan.babu, david, djwong,
	gost.dev, hare, john.g.garry, linux-block, linux-fsdevel,
	linux-mm, linux-xfs, p.raghav, ziy

Christoph Hellwig <hch@infradead.org> writes:

> On Tue, May 07, 2024 at 04:58:12PM +0200, Pankaj Raghav (Samsung) wrote:
>> +	if (len > PAGE_SIZE) {
>> +		folio = mm_get_huge_zero_folio(current->mm);
>
> I don't think the mm_struct based interfaces work well here, as I/O
> completions don't come in through the same mm.  You'll want to use

But right now iomap_dio_zero() is only called from the submission
context right i.e. iomap_dio_bio_iter(). Could you please explain the
dependency with the completion context to have same mm_struct here?

> lower level interfaces like get_huge_zero_page and use them at
> mount time.
>

Even so, should we not check whether allocation of hugepage is of any
value or not depending upon how large the length or (blocksize in case of
mount time) really is.
i.e. say if the len for zeroing is just 2 times the PAGE_SIZE, then it
doesn't really make sense to allocate a 2MB hugepage and sometimes 16MB
hugepage on some archs (like Power with hash mmu).

maybe something like if len > 16 * pagesize?

>> +		if (!folio)
>> +			folio = zero_page_folio;
>
> And then don't bother with a fallback.

The hugepage allocation can still fail during mount time (if we mount
late when the system memory is already fragmented). So we might still
need a fallback to ZERO_PAGE(0), right?

-ritesh

^ permalink raw reply	[flat|nested] 25+ messages in thread
* [PATCH v5 07/11] iomap: fix iomap_dio_zero() for fs bs > system page size
@ 2024-05-03  9:53 Luis Chamberlain
  2024-05-07 14:58 ` [RFC] iomap: use huge zero folio in iomap_dio_zero Pankaj Raghav (Samsung)
  0 siblings, 1 reply; 25+ messages in thread
From: Luis Chamberlain @ 2024-05-03  9:53 UTC (permalink / raw)
  To: akpm, willy, djwong, brauner, david, chandan.babu
  Cc: hare, ritesh.list, john.g.garry, ziy, linux-fsdevel, linux-xfs,
	linux-mm, linux-block, gost.dev, p.raghav, kernel, mcgrof

From: Pankaj Raghav <p.raghav@samsung.com>

iomap_dio_zero() will pad a fs block with zeroes if the direct IO size
< fs block size. iomap_dio_zero() has an implicit assumption that fs block
size < page_size. This is true for most filesystems at the moment.

If the block size > page size, this will send the contents of the page
next to zero page(as len > PAGE_SIZE) to the underlying block device,
causing FS corruption.

iomap is a generic infrastructure and it should not make any assumptions
about the fs block size and the page size of the system.

Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Hannes Reinecke <hare@suse.de>
---
 fs/iomap/direct-io.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index f3b43d223a46..5f481068de5b 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -239,14 +239,23 @@ static void iomap_dio_zero(const struct iomap_iter *iter, struct iomap_dio *dio,
 	struct page *page = ZERO_PAGE(0);
 	struct bio *bio;
 
-	bio = iomap_dio_alloc_bio(iter, dio, 1, REQ_OP_WRITE | REQ_SYNC | REQ_IDLE);
+	WARN_ON_ONCE(len > (BIO_MAX_VECS * PAGE_SIZE));
+
+	bio = iomap_dio_alloc_bio(iter, dio, BIO_MAX_VECS,
+				  REQ_OP_WRITE | REQ_SYNC | REQ_IDLE);
 	fscrypt_set_bio_crypt_ctx(bio, inode, pos >> inode->i_blkbits,
 				  GFP_KERNEL);
+
 	bio->bi_iter.bi_sector = iomap_sector(&iter->iomap, pos);
 	bio->bi_private = dio;
 	bio->bi_end_io = iomap_dio_bio_end_io;
 
-	__bio_add_page(bio, page, len, 0);
+	while (len) {
+		unsigned int io_len = min_t(unsigned int, len, PAGE_SIZE);
+
+		__bio_add_page(bio, page, io_len, 0);
+		len -= io_len;
+	}
 	iomap_dio_submit_bio(iter, dio, bio, pos);
 }
 
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2024-05-17 13:30 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-05-07 18:38 [RFC] iomap: use huge zero folio in iomap_dio_zero Ritesh Harjani
2024-05-08 11:42 ` Christoph Hellwig
  -- strict thread matches above, loose matches on Subject: below --
2024-05-03  9:53 [PATCH v5 07/11] iomap: fix iomap_dio_zero() for fs bs > system page size Luis Chamberlain
2024-05-07 14:58 ` [RFC] iomap: use huge zero folio in iomap_dio_zero Pankaj Raghav (Samsung)
2024-05-07 15:11   ` Zi Yan
2024-05-07 16:11   ` Christoph Hellwig
2024-05-08 11:39     ` Pankaj Raghav (Samsung)
2024-05-08 11:43       ` Christoph Hellwig
2024-05-09 12:31         ` Pankaj Raghav (Samsung)
2024-05-09 12:46           ` Christoph Hellwig
2024-05-09 12:55             ` Pankaj Raghav (Samsung)
2024-05-09 12:58               ` Christoph Hellwig
2024-05-09 14:32                 ` Darrick J. Wong
2024-05-09 15:05                   ` Christoph Hellwig
2024-05-09 15:08                     ` Darrick J. Wong
2024-05-09 15:09                       ` Christoph Hellwig
2024-05-15  0:50   ` Matthew Wilcox
2024-05-15  2:34     ` Keith Busch
2024-05-15  4:04       ` Matthew Wilcox
2024-05-15 15:59         ` Pankaj Raghav (Samsung)
2024-05-15 18:03           ` Matthew Wilcox
2024-05-16 15:02             ` Pankaj Raghav (Samsung)
2024-05-17 12:36               ` Hannes Reinecke
2024-05-17 12:56                 ` Hannes Reinecke
2024-05-17 13:30                 ` Matthew Wilcox
2024-05-15 11:48     ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).