From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: "Theodore Ts'o" <tytso@mit.edu>
Cc: Dave Chinner <david@fromorbit.com>, Jens Axboe <axboe@kernel.dk>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
linux-fsdevel@vger.kernel.org,
linux-ext4 <linux-ext4@vger.kernel.org>
Subject: Re: BLKZEROOUT + pread should return zeroes, right?
Date: Tue, 14 Oct 2014 18:25:34 -0700
Message-ID: <20141015012534.GB12013@birch.djwong.org>
In-Reply-To: <20141014063210.GK9738@thunk.org>
On Tue, Oct 14, 2014 at 02:32:10AM -0400, Theodore Ts'o wrote:
> The bottom line is for most of the use cases we are talking about,
> we're only zero'ing one or two 4k blocks at a time, so I've never been
> convinced that it's worth it to use BLKZEROOUT.
>
> We could add page cache coherency features to BLKZEROOUT, but I'm not
> entirely sure it's worth the effort. No user space program would be
> able to take advantage of adding coherency for several years, or
Well then let's change BLKZEROOUT to require O_DIRECT instead of hiding the
coherency problem, and introduce BLKZEROOUT_INV which issues the zero out and
then takes care of page cache coherency.
(Or at least the first part...)
> adding feature tests, etc., and is it worth the upside of being able
> to use WRITE SAME for a few 4k or 8k writes? (Which the vast majority
> of storage devices don't support anyway....)
I've converted mke2fs and e2fsck to use BLKZEROOUT to zero the journal and the
inode tables when they want something to really be zero, and ext2fs_fallocate
uses it to zero the fallocated range. I suspect those three will zero long
runs of sectors each call.
As for WRITE_SAME support, if it's there, why ignore it? The ioctl exists;
someone else is bound to use it sooner or later.
A further optimization to mke2fs would be to detect that we've run
discard-with-zeroes and therefore can skip issuing subsequent zeroouts on the
same ranges, but I'm not yet convinced that discard-zeroes-data does what it
purports to do. If it /does/ work reliably, though, ext2fs_zero_blocks() could
be rerouted to use discard instead. Really, my reason for wanting to use
zeroout is that it guarantees the zero-read behavior afterwards, so it ought to
be less problematic than discard has been.
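As a sketch of the detection step, a tool could read the queue/discard_zeroes_data attribute in sysfs before deciding to skip a zeroout. The function names below are hypothetical, and the attribute only reports what the device claims, which is exactly the part I don't trust, so treat this as a hint and not as proof.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parse the contents of a queue/discard_zeroes_data attribute.
 * Returns 1 if the device claims discarded blocks read back as zeroes,
 * 0 if it does not, and -1 if the text cannot be parsed. */
int parse_discard_zeroes_data(const char *text)
{
    if (text == NULL)
        return -1;
    if (strcmp(text, "1\n") == 0 || strcmp(text, "1") == 0)
        return 1;
    if (strcmp(text, "0\n") == 0 || strcmp(text, "0") == 0)
        return 0;
    return -1;
}

/* Hypothetical helper: `disk` is the kernel name of a whole disk, such as
 * "sda".  Returns 1, 0, or -1 as above; -1 also covers a missing or
 * unreadable attribute. */
int disk_claims_discard_zeroes(const char *disk)
{
    char path[256];
    char buf[8];
    FILE *fp;
    int n;

    n = snprintf(path, sizeof(path),
                 "/sys/block/%s/queue/discard_zeroes_data", disk);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fgets(buf, sizeof(buf), fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    return parse_discard_zeroes_data(buf);
}
```

A cautious caller would skip the zeroout only when this returns 1 and a discard over the same range has already succeeded, and would keep issuing the zeroout in every other case.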
--D
>
> Cheers,
>
> - Ted