From: Theodore Ts'o <tytso@mit.edu>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-ext4@vger.kernel.org
Subject: Re: [PATCH 15/34] libext2fs: support BLKZEROOUT/FALLOC_FL_ZERO_RANGE in ext2fs_zero_blocks
Date: Sat, 18 Oct 2014 12:32:55 -0400
Message-ID: <20141018163255.GB30124@thunk.org>
In-Reply-To: <20140913221253.13646.7723.stgit@birch.djwong.org>
On Sat, Sep 13, 2014 at 03:12:53PM -0700, Darrick J. Wong wrote:
> Plumb a new call into the IO manager to support translating
> ext2fs_zero_blocks calls into the equivalent kernel-level BLKZEROOUT
> ioctl or FALLOC_FL_ZERO_RANGE fallocate flag primitives when possible.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
> contrib/fallocate.c | 14 +++++++++
I've separated out the contrib/fallocate change and created a separate
commit for it, since it really is a separate change.
What I'd like to see for the zero_blocks change to the io_manager is:
(a) if we try to zero a range past the end of the file, we should just
truncate the file to set i_size. Similarly, if this is a regular
file, we should try to use PUNCH_HOLE. We already try to keep a raw
file system image file to be sparse, so I don't see any real problems
with this.
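
A minimal sketch of what (a) could look like for a regular image file, assuming Linux and glibc. The helper name zero_file_range is hypothetical and is not an existing libext2fs or io_manager call, and the fallback to writing zeros when the file system does not support PUNCH_HOLE is an assumption added here, not something proposed above:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of libext2fs): make the byte range
 * [offset, offset + len) of a regular file read back as zeros while
 * keeping the file sparse.
 */
static int zero_file_range(int fd, off_t offset, off_t len)
{
	static const char zeros[4096];
	struct stat st;
	off_t end = offset + len;

	if (len <= 0)
		return 0;
	if (fstat(fd, &st) < 0)
		return -1;

	/* Past EOF: just set i_size; the newly exposed tail reads as zeros. */
	if (end > st.st_size) {
		if (ftruncate(fd, end) < 0)
			return -1;
		if (offset >= st.st_size)
			return 0;
		len = st.st_size - offset; /* only the old data is left to zero */
	}

	/* Inside the file: deallocate. PUNCH_HOLE requires KEEP_SIZE. */
	if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
		      offset, len) == 0)
		return 0;
	if (errno != EOPNOTSUPP && errno != ENOSYS)
		return -1;

	/* Assumed fallback: no hole punching here, so write zeros instead. */
	while (len > 0) {
		size_t n = len < (off_t)sizeof(zeros) ? (size_t)len : sizeof(zeros);
		ssize_t w = pwrite(fd, zeros, n, offset);

		if (w <= 0)
			return -1;
		offset += w;
		len -= w;
	}
	return 0;
}
```

The sketch only covers regular files; a real io_manager hook would first have to tell a regular file from a block device, for example with S_ISREG() on the fstat() result.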
(b) for a block device, if IO_FLAG_DIRECT_IO is set, it should be safe
to try to use BLKZEROOUT. If not, we can use
posix_fadvise(POSIX_FADV_DONTNEED) and verify that this correctly zaps
the relevant parts of the buffer cache. If it doesn't do the right
thing, we can use BLKFLSBUF, which will zap the entire buffer cache
for the device. Which is pretty heavy weight, but I really think it
only makes sense to use zeroout for zeroing the inode table and the
journal file.
Even if we patch the kernel to make BLKZEROOUT do this automatically,
we can't count on it, and in particular if it turns out we have
to use BLKFLSBUF, we're not going to want to use this for zero'ing a
single 4k block. It doesn't happen that often, and I don't think
there will be much if any measurable difference in performance if we
use WRITE SAME vs. WRITE for a small region.
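
The size argument above amounts to a simple policy check. As an illustration only: the names below and the 1 MiB cutoff are hypothetical, chosen so that a single 4k block always takes the plain write path while inode tables and journals would normally be large enough to qualify for zeroout:

```c
#include <stdint.h>

/* Hypothetical cutoff; the right value would need measurement. */
#define ZEROOUT_MIN_BYTES (1024ULL * 1024ULL)

enum zero_method { ZERO_BY_WRITE, ZERO_BY_ZEROOUT };

/*
 * Pick how to zero a range: small ranges are written normally, so the
 * cost of a possible BLKFLSBUF is never paid for a single block.
 */
static enum zero_method pick_zero_method(uint64_t len_bytes)
{
	return len_bytes >= ZEROOUT_MIN_BYTES ? ZERO_BY_ZEROOUT : ZERO_BY_WRITE;
}
```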
Does this make sense?
- Ted
P.S. Once we do this, when using mke2fs on a file, we should really
use punch_hole and disable lazy_itable_init, to save I/O bandwidth on
VMs running on cloud systems.