From: Dave Chinner <david@fromorbit.com>
To: Richard Laager <rlaager@wiktel.com>
Cc: linux-fsdevel@vger.kernel.org
Subject: Re: fallocate(FALLOC_FL_PUNCH_HOLE)
Date: Wed, 14 Mar 2012 14:27:09 +1100
Message-ID: <20120314032709.GW3592@dastard>
In-Reply-To: <1331410025.8577.68.camel@watermelon.coderich.net>
On Sat, Mar 10, 2012 at 02:07:05PM -0600, Richard Laager wrote:
> I've been working on a discard patch for QEMU.
>
> I have a couple of questions about the semantics of fallocate()'s
> FALLOC_FL_PUNCH_HOLE that are not addressed in the latest man-pages.git.
>
> 1. Upon successful return, are the results guaranteed to be on
>    stable storage?
No.
>    1. If not, is fdatasync() sufficient, or is fsync()
>       required?
Will be on stable storage before fdatasync() returns.
>    2. Does O_DSYNC on open() change any of this?
Will be on stable storage before fallocate() returns.
>    3. Does O_DIRECT on open() change any of this?
Has no effect on behaviour.
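For illustration, a minimal C sketch of the durability answers above, assuming Linux with glibc. The hole punch is followed by fdatasync() so that the result is on stable storage before the helper returns. The helper name punch_hole_durable is hypothetical and is not part of any patch in this thread.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes fallocate() in <fcntl.h> */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>       /* FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE */
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: punch a hole in [offset, offset + len) and make the
 * change durable. Returns 0 on success or a negative errno value.
 */
static int punch_hole_durable(int fd, off_t offset, off_t len)
{
    assert(offset >= 0 && len > 0);

    /* FALLOC_FL_PUNCH_HOLE must be combined with FALLOC_FL_KEEP_SIZE. */
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  offset, len) < 0)
        return -errno;

    /* A successful fallocate() alone does not imply stable storage. */
    if (fdatasync(fd) < 0)
        return -errno;

    return 0;
}
```

If the file was opened with O_DSYNC, the explicit fdatasync() call should be redundant according to the answers above. A caller may also want to treat -EOPNOTSUPP as "this filesystem cannot punch holes" rather than as a hard failure.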
> 2. If I punch a hole in a previously preallocated range, is this...
>    A. required to undo the preallocation?
>    B. permitted, but not required, to undo the preallocation?
>    C. forbidden from undoing the preallocation?
B. Most implementations will give you A, though.
> If the answer to #2 is not C, it would appear there's no atomic way to
> indicate that I'm done with certain data* but I want the filesystem to
> continue to guarantee space for me. Is this correct?
Not through fallocate() right now. XFS has an ioctl that will turn
written ranges and holes back into preallocated space:
XFS_IOC_ZERO_RANGE. I've got a patch that introduces this zeroing
capability to fallocate (see below) which currently works on XFS.
> * so the filesystem can send a TRIM/UNMAP to an underlying SSD.
It does not, however, issue discards on the range, because it is
still allocated space in the filesystem. It could probably be
made to do so, especially as the folks that requested the
XFS_IOC_ZERO_RANGE functionality were asking about extending it to do
this last week.
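As a rough sketch of the non-atomic alternative available through fallocate() today, assuming Linux with glibc: punch the hole, then preallocate the same range again. The helper name discard_keep_reserved is hypothetical. Whether the filesystem issues a discard for the punched blocks depends on the filesystem and its mount options.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes fallocate() in <fcntl.h> */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>       /* FALLOC_FL_PUNCH_HOLE, FALLOC_FL_KEEP_SIZE */
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: drop the data in [offset, offset + len) and then
 * reserve space for the same range again. The two steps are NOT atomic.
 * Returns 0 on success or a negative errno value.
 */
static int discard_keep_reserved(int fd, off_t offset, off_t len)
{
    assert(offset >= 0 && len > 0);

    /* Step 1: free the blocks; the range now reads back as zeroes. */
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  offset, len) < 0)
        return -errno;

    /*
     * Step 2: preallocate the range again without changing the file size.
     * Another writer can consume the freed space between the two calls,
     * so this step can fail with ENOSPC.
     */
    if (fallocate(fd, FALLOC_FL_KEEP_SIZE, offset, len) < 0)
        return -errno;

    return 0;
}
```

The window between the two calls is exactly the gap that a single atomic operation would close; this sketch only shows what an application can do without one.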
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
fs: Introduce FALLOC_FL_ZERO_RANGE
From: Dave Chinner <dchinner@redhat.com>
FALLOC_FL_ZERO_RANGE is the equivalent of an atomic hole-punch +
preallocation. It enables ranges of written data to be turned into
zeroes without requiring IO or having to free and reallocate the
extents in the range given as would occur if we had to punch and
then preallocate them separately. This enables applications to zero
parts of files very quickly without changing the layout of the files
in any way.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
fs/xfs/xfs_file.c | 6 +++++-
include/linux/falloc.h | 1 +
2 files changed, 6 insertions(+), 1 deletions(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 825390e..ce2fd17 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -912,7 +912,9 @@ xfs_file_fallocate(
 	int cmd = XFS_IOC_RESVSP;
 	int attr_flags = XFS_ATTR_NOLOCK;
 
-	if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE))
+	if (mode & ~(FALLOC_FL_KEEP_SIZE |
+		     FALLOC_FL_PUNCH_HOLE |
+		     FALLOC_FL_ZERO_RANGE))
 		return -EOPNOTSUPP;
 
 	bf.l_whence = 0;
@@ -923,6 +925,8 @@ xfs_file_fallocate(
 
 	if (mode & FALLOC_FL_PUNCH_HOLE)
 		cmd = XFS_IOC_UNRESVSP;
+	else if (mode & FALLOC_FL_ZERO_RANGE)
+		cmd = XFS_IOC_ZERO_RANGE;
 
 	/* check the new inode size is valid before allocating */
 	if (!(mode & FALLOC_FL_KEEP_SIZE) &&
diff --git a/include/linux/falloc.h b/include/linux/falloc.h
index 73e0b62..9160c70 100644
--- a/include/linux/falloc.h
+++ b/include/linux/falloc.h
@@ -3,6 +3,7 @@
#define FALLOC_FL_KEEP_SIZE 0x01 /* default is extend size */
#define FALLOC_FL_PUNCH_HOLE 0x02 /* de-allocates range */
+#define FALLOC_FL_ZERO_RANGE 0x04 /* zero/prealloc all blocks in range */
#ifdef __KERNEL__