From: Christoph Hellwig <hch@lst.de>
To: Matthew Wilcox <willy@infradead.org>
Cc: "Pankaj Raghav (Samsung)" <kernel@pankajraghav.com>,
hch@lst.de, mcgrof@kernel.org, akpm@linux-foundation.org,
brauner@kernel.org, chandan.babu@oracle.com, david@fromorbit.com,
djwong@kernel.org, gost.dev@samsung.com, hare@suse.de,
john.g.garry@oracle.com, linux-block@vger.kernel.org,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
linux-xfs@vger.kernel.org, p.raghav@samsung.com,
ritesh.list@gmail.com, ziy@nvidia.com
Subject: Re: [RFC] iomap: use huge zero folio in iomap_dio_zero
Date: Wed, 15 May 2024 13:48:50 +0200
Message-ID: <20240515114850.GB1938@lst.de>
In-Reply-To: <ZkQG7bdFStBLFv3g@casper.infradead.org>
On Wed, May 15, 2024 at 01:50:53AM +0100, Matthew Wilcox wrote:
> On Tue, May 07, 2024 at 04:58:12PM +0200, Pankaj Raghav (Samsung) wrote:
> > Instead of looping with ZERO_PAGE, use a huge zero folio to zero pad the
> > block. Fallback to ZERO_PAGE if mm_get_huge_zero_folio() fails.
>
> So the block people say we're doing this all wrong. We should be
> issuing a REQ_OP_WRITE_ZEROES bio, and the block layer will take care of
> using the ZERO_PAGE if the hardware doesn't natively support
> WRITE_ZEROES or a DISCARD that zeroes or ...
Not sure who "the block people" are, but while this sounds smart
it actually is a really bad idea.
Think about what we are doing here: we zero parts of a file system
block as part of a direct I/O write operation. So the amount is
relatively small and it is part of a fast path I/O operation. It
also will most likely land on the same indirection entry on the
device as the data being written.
If you use a Write Zeroes command, it will go down a separate slow
path in the device instead of using the highly optimized write path
and slow the whole operation down. Even worse, there is a chance that
it will increase write amplification because there are now two
separate operations instead of one merged one (either a block layer
or device merge).
And I'm not sure what "block layer person" still doesn't understand
that discards do not zero data, but maybe we'll need yet another
education campaign there.
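
To put a rough number on "relatively small", here is a minimal user-space
sketch of the padding arithmetic for a sub-block direct I/O write. It is not
the kernel's iomap code: the function names, the 4 KiB page size and the
2 MiB huge zero folio size are all assumptions made for illustration, and the
block size is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes for illustration only; real values depend on the platform. */
#define SKETCH_PAGE_SIZE      ((size_t)4096)
#define SKETCH_HUGE_ZERO_SIZE ((size_t)2 * 1024 * 1024)

/* Hypothetical helper: true if v is a non-zero power of two. */
int sketch_is_pow2(size_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/* Bytes that must be zeroed in front of a write starting at pos. */
size_t sketch_head_pad(unsigned long long pos, size_t block_size)
{
	assert(sketch_is_pow2(block_size));
	return (size_t)(pos & (block_size - 1));
}

/* Bytes that must be zeroed behind a write ending at end (exclusive). */
size_t sketch_tail_pad(unsigned long long end, size_t block_size)
{
	size_t off;

	assert(sketch_is_pow2(block_size));
	off = (size_t)(end & (block_size - 1));
	return off ? block_size - off : 0;
}

/*
 * Number of segments needed to cover len bytes of zeroes when each
 * segment can reference at most src_size bytes of a shared zero buffer.
 */
size_t sketch_zero_segments(size_t len, size_t src_size)
{
	assert(src_size != 0);
	return (len + src_size - 1) / src_size;
}
```

For a 4 KiB write at offset 4096 into a 64 KiB file system block, this gives
4096 bytes of head padding and 57344 bytes of tail padding. Under the assumed
sizes, the tail padding needs 14 segments when looping over a 4 KiB zero page
and one segment with a 2 MiB zero folio. Either way the zeroes stay within
the one file system block being written, which is the case the message argues
should go down the regular write path together with the data.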