* [Cluster-devel] [PATCH 1/6] fs: add hole punching to fallocate
[not found] ` <20101109214147.GK3099@thunk.org>
@ 2010-11-09 21:53 ` Jan Kara
0 siblings, 0 replies; 8+ messages in thread
From: Jan Kara @ 2010-11-09 21:53 UTC
To: cluster-devel.redhat.com
On Tue 09-11-10 16:41:47, Ted Ts'o wrote:
> On Tue, Nov 09, 2010 at 03:42:42PM +1100, Dave Chinner wrote:
> > Implementation is up to the filesystem. However, XFS does (b)
> > because:
> >
> > 1) it was extremely simple to implement (one of the
> > advantages of having an exceedingly complex allocation
> > interface to begin with :P)
> > 2) conversion is atomic, fast and reliable
> > 3) it is independent of the underlying storage; and
> > 4) reads of unwritten extents operate at memory speed,
> > not disk speed.
>
> Yeah, I was thinking that using a device-style TRIM might be better
> since future attempts to write to it won't require a separate seek to
> modify the extent tree. But yeah, there are a bunch of advantages of
> simply mutating the extent tree.
>
> While we're on the subject of changes to fallocate, what do people
> think of FALLOC_FL_EXPOSE_OLD_DATA, which requires either root
> privileges or (if capabilities are in use) CAP_DAC_OVERRIDE &&
> CAP_MAC_OVERRIDE && CAP_SYS_ADMIN. This would allow a trusted process
> to fallocate blocks with the extent already marked initialized. I've
> had two requests for such functionality for ext4 already.
>
> (Take for example a trusted cluster filesystem backend that checks the
> object checksum before returning any data to the user; and if the
> check fails the cluster file system will try to use some other replica
> stored on some other server.)
Hmm, could you elaborate a bit? I fail to see how the above fallocate() flag
would help solve this problem... Just curious.
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
* [Cluster-devel] [PATCH 1/6] fs: add hole punching to fallocate
[not found] ` <1289840723-3056-2-git-send-email-josef@redhat.com>
@ 2010-11-16 11:16 ` Jan Kara
2010-11-16 11:43 ` Jan Kara
0 siblings, 1 reply; 8+ messages in thread
From: Jan Kara @ 2010-11-16 11:16 UTC
To: cluster-devel.redhat.com
On Mon 15-11-10 12:05:18, Josef Bacik wrote:
> diff --git a/fs/open.c b/fs/open.c
> index 4197b9e..ab8dedf 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -223,7 +223,7 @@ int do_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
> return -EINVAL;
>
> /* Return error if mode is not supported */
> - if (mode && !(mode & FALLOC_FL_KEEP_SIZE))
> + if (mode && (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE)))
Why not just:
if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE)) ?
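The two forms should be equivalent, because a zero mode has no bits set and so can never have an unknown bit set. A minimal sketch that checks this by brute force follows; the flag values are copied from the patch, while the helper names (rejects_as_posted, rejects_simplified, forms_agree) are made up for illustration and are not part of the kernel.

```c
/* Flag values as defined in the patch under review. */
#define FALLOC_FL_KEEP_SIZE  0x01
#define FALLOC_FL_PUNCH_HOLE 0x02

/* The check as posted: explicit "mode &&" guard in front of the mask test. */
static int rejects_as_posted(int mode)
{
	return mode && (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE));
}

/* The shorter form suggested in the review (normalised to 0 or 1). */
static int rejects_simplified(int mode)
{
	return (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE)) != 0;
}

/* Returns 1 if both forms give the same answer for every mode in [0, limit). */
static int forms_agree(int limit)
{
	for (int mode = 0; mode < limit; mode++)
		if (rejects_as_posted(mode) != rejects_simplified(mode))
			return 0;
	return 1;
}
```

Calling forms_agree() over a small range of modes is enough here, since only the low bits matter for the mask.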
> diff --git a/include/linux/falloc.h b/include/linux/falloc.h
> index 3c15510..851cba2 100644
> --- a/include/linux/falloc.h
> +++ b/include/linux/falloc.h
> @@ -2,6 +2,7 @@
> #define _FALLOC_H_
>
> #define FALLOC_FL_KEEP_SIZE 0x01 /* default is extend size */
> +#define FALLOC_FL_PUNCH_HOLE 0X02 /* de-allocates range */
^ use lowercase 'x' please...
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
* [Cluster-devel] [PATCH 1/6] fs: add hole punching to fallocate
2010-11-16 11:16 ` [Cluster-devel] [PATCH 1/6] fs: add hole punching to fallocate Jan Kara
@ 2010-11-16 11:43 ` Jan Kara
[not found] ` <20101116125249.GB31957@dhcp231-156.rdu.redhat.com>
0 siblings, 1 reply; 8+ messages in thread
From: Jan Kara @ 2010-11-16 11:43 UTC
To: cluster-devel.redhat.com
On Tue 16-11-10 12:16:11, Jan Kara wrote:
> On Mon 15-11-10 12:05:18, Josef Bacik wrote:
> > diff --git a/fs/open.c b/fs/open.c
> > index 4197b9e..ab8dedf 100644
> > --- a/fs/open.c
> > +++ b/fs/open.c
> > @@ -223,7 +223,7 @@ int do_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
> > return -EINVAL;
> >
> > /* Return error if mode is not supported */
> > - if (mode && !(mode & FALLOC_FL_KEEP_SIZE))
> > + if (mode && (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE)))
> Why not just:
> if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE)) ?
And BTW, since FALLOC_FL_PUNCH_HOLE does not change the file size, shouldn't
we enforce that FALLOC_FL_KEEP_SIZE is / is not set? I don't much mind which
way, but leaving it ambiguous (ignored) in the interface usually proves to be
a bad idea later, when we want to extend the interface further...
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
* [Cluster-devel] [PATCH 3/6] Ocfs2: handle hole punching via fallocate properly
[not found] ` <1289840723-3056-4-git-send-email-josef@redhat.com>
@ 2010-11-16 11:50 ` Jan Kara
2010-11-17 23:27 ` Joel Becker
1 sibling, 0 replies; 8+ messages in thread
From: Jan Kara @ 2010-11-16 11:50 UTC
To: cluster-devel.redhat.com
On Mon 15-11-10 12:05:20, Josef Bacik wrote:
> This patch just makes ocfs2 use its UNRESVSP ioctl when we get the hole punch
> flag in fallocate. I didn't test it, but it seems simple enough. Thanks,
>
> Signed-off-by: Josef Bacik <josef@redhat.com>
You might want to directly CC Joel Becker <Joel.Becker@oracle.com> who
maintains OCFS2. Otherwise the patch looks OK so you can add
Acked-by: Jan Kara <jack@suse.cz>
for what it's worth ;).
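For readers skimming the diff, the selection logic the patch adds to ocfs2_fallocate() can be sketched in isolation. This is only an illustration: the enum and struct below are hypothetical stand-ins for OCFS2_IOC_RESVSP64 / OCFS2_IOC_UNRESVSP64 and the change_size variable, and pick_request() does not exist in the kernel.

```c
/* Flag values as defined in patch 1/6 of this series. */
#define FALLOC_FL_KEEP_SIZE  0x01
#define FALLOC_FL_PUNCH_HOLE 0x02

/* Hypothetical stand-ins for OCFS2_IOC_RESVSP64 and OCFS2_IOC_UNRESVSP64. */
enum space_cmd { CMD_RESERVE, CMD_UNRESERVE };

/* Hypothetical bundle of the two values the patch computes from mode. */
struct space_request {
	enum space_cmd cmd;
	int change_size;
};

/* Mirrors the order of the checks in the patched function. */
static struct space_request pick_request(int mode)
{
	struct space_request req = { CMD_RESERVE, 1 };

	if (mode & FALLOC_FL_KEEP_SIZE)
		req.change_size = 0;

	if (mode & FALLOC_FL_PUNCH_HOLE) {
		/* Punching a hole never changes the file size. */
		req.cmd = CMD_UNRESERVE;
		req.change_size = 0;
	}

	return req;
}
```

Note that in this sketch, as in the patch, a hole punch forces change_size to 0 whether or not FALLOC_FL_KEEP_SIZE was passed.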
Honza
> ---
> fs/ocfs2/file.c | 10 ++++++++--
> 1 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
> index 77b4c04..181ae52 100644
> --- a/fs/ocfs2/file.c
> +++ b/fs/ocfs2/file.c
> @@ -1992,6 +1992,7 @@ static long ocfs2_fallocate(struct inode *inode, int mode, loff_t offset,
> struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
> struct ocfs2_space_resv sr;
> int change_size = 1;
> + int cmd = OCFS2_IOC_RESVSP64;
>
> if (!ocfs2_writes_unwritten_extents(osb))
> return -EOPNOTSUPP;
> @@ -2002,12 +2003,17 @@ static long ocfs2_fallocate(struct inode *inode, int mode, loff_t offset,
> if (mode & FALLOC_FL_KEEP_SIZE)
> change_size = 0;
>
> + if (mode & FALLOC_FL_PUNCH_HOLE) {
> + cmd = OCFS2_IOC_UNRESVSP64;
> + change_size = 0;
> + }
> +
> sr.l_whence = 0;
> sr.l_start = (s64)offset;
> sr.l_len = (s64)len;
>
> - return __ocfs2_change_file_space(NULL, inode, offset,
> - OCFS2_IOC_RESVSP64, &sr, change_size);
> + return __ocfs2_change_file_space(NULL, inode, offset, cmd, &sr,
> + change_size);
> }
>
> int ocfs2_check_range_for_refcount(struct inode *inode, loff_t pos,
> --
> 1.6.6.1
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
* [Cluster-devel] [PATCH 4/6] Ext4: fail if we try to use hole punch
[not found] ` <1289840723-3056-5-git-send-email-josef@redhat.com>
@ 2010-11-16 11:52 ` Jan Kara
0 siblings, 0 replies; 8+ messages in thread
From: Jan Kara @ 2010-11-16 11:52 UTC
To: cluster-devel.redhat.com
On Mon 15-11-10 12:05:21, Josef Bacik wrote:
> Ext4 doesn't have the ability to punch holes yet, so make sure we return
> EOPNOTSUPP if we try to use hole punching through fallocate. This support can
> be added later. Thanks,
>
> Signed-off-by: Josef Bacik <josef@redhat.com>
Acked-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/ext4/extents.c | 4 ++++
> 1 files changed, 4 insertions(+), 0 deletions(-)
>
> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> index 0554c48..35bca73 100644
> --- a/fs/ext4/extents.c
> +++ b/fs/ext4/extents.c
> @@ -3622,6 +3622,10 @@ long ext4_fallocate(struct inode *inode, int mode, loff_t offset, loff_t len)
> struct ext4_map_blocks map;
> unsigned int credits, blkbits = inode->i_blkbits;
>
> + /* We only support the FALLOC_FL_KEEP_SIZE mode */
> + if (mode && (mode != FALLOC_FL_KEEP_SIZE))
> + return -EOPNOTSUPP;
> +
> /*
> * currently supporting (pre)allocate mode for extent-based
> * files _only_
> --
> 1.6.6.1
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
* [Cluster-devel] [PATCH 1/6] fs: add hole punching to fallocate
[not found] ` <20101116125249.GB31957@dhcp231-156.rdu.redhat.com>
@ 2010-11-16 13:14 ` Jan Kara
0 siblings, 0 replies; 8+ messages in thread
From: Jan Kara @ 2010-11-16 13:14 UTC
To: cluster-devel.redhat.com
On Tue 16-11-10 07:52:50, Josef Bacik wrote:
> On Tue, Nov 16, 2010 at 12:43:46PM +0100, Jan Kara wrote:
> > On Tue 16-11-10 12:16:11, Jan Kara wrote:
> > > On Mon 15-11-10 12:05:18, Josef Bacik wrote:
> > > > diff --git a/fs/open.c b/fs/open.c
> > > > index 4197b9e..ab8dedf 100644
> > > > --- a/fs/open.c
> > > > +++ b/fs/open.c
> > > > @@ -223,7 +223,7 @@ int do_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
> > > > return -EINVAL;
> > > >
> > > > /* Return error if mode is not supported */
> > > > - if (mode && !(mode & FALLOC_FL_KEEP_SIZE))
> > > > + if (mode && (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE)))
> > > Why not just:
> > > if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE)) ?
> > And BTW, since FALLOC_FL_PUNCH_HOLE does not change the file size, shouldn't
> > we enforce that FALLOC_FL_KEEP_SIZE is / is not set? I don't much mind which
> > way, but leaving it ambiguous (ignored) in the interface usually proves to be
> > a bad idea later, when we want to extend the interface further...
> >
>
> Yeah, I went back and forth on this. KEEP_SIZE won't change the behavior of
> PUNCH_HOLE since PUNCH_HOLE implicitly means keep the size. I figured since it's
> "mode" and not "flags" it would be OK to accept either way, but if you
> prefer that PUNCH_HOLE require KEEP_SIZE to be set then I'm cool with that,
> just let me know one way or the other. Thanks,
I was wondering about 'mode' vs 'flags' as well. The manpage says:
The mode argument determines the operation to be performed on the given
range. Currently only one flag is supported for mode...
So we call it "mode" but speak about "flags"? Seems a bit inconsistent.
I'd lean slightly toward the "flags" side and just make sure that
only one of FALLOC_FL_KEEP_SIZE, FALLOC_FL_PUNCH_HOLE is set (interpreting
FALLOC_FL_KEEP_SIZE as "allocate blocks beyond i_size"). But I'm not sure
what others think.
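To make the proposal concrete, here is a rough sketch of what "only one of the two flags" could look like as a check. It is an illustration, not code from any posted patch: check_mode_exclusive() is a hypothetical name, accepting mode 0 as a plain allocation is an assumption, and -EOPNOTSUPP is borrowed from the existing "mode is not supported" path.

```c
#include <errno.h>

/* Flag values as defined in the patch under review. */
#define FALLOC_FL_KEEP_SIZE  0x01
#define FALLOC_FL_PUNCH_HOLE 0x02

/*
 * Hypothetical validator for the "flags" reading: mode 0, KEEP_SIZE alone
 * and PUNCH_HOLE alone are accepted; both together, or any unknown bit,
 * are rejected.
 */
static int check_mode_exclusive(int mode)
{
	const int known = FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE;

	/* Unknown bits are never acceptable. */
	if (mode & ~known)
		return -EOPNOTSUPP;

	/* The two known flags are treated as mutually exclusive. */
	if ((mode & known) == known)
		return -EOPNOTSUPP;

	return 0;
}
```

Under this reading, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE would be an error rather than a redundant but accepted combination.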
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
* [Cluster-devel] [PATCH 3/6] Ocfs2: handle hole punching via fallocate properly
[not found] ` <1289840723-3056-4-git-send-email-josef@redhat.com>
2010-11-16 11:50 ` [Cluster-devel] [PATCH 3/6] Ocfs2: handle hole punching via fallocate properly Jan Kara
@ 2010-11-17 23:27 ` Joel Becker
1 sibling, 0 replies; 8+ messages in thread
From: Joel Becker @ 2010-11-17 23:27 UTC
To: cluster-devel.redhat.com
On Mon, Nov 15, 2010 at 12:05:20PM -0500, Josef Bacik wrote:
> This patch just makes ocfs2 use its UNRESVSP ioctl when we get the hole punch
> flag in fallocate. I didn't test it, but it seems simple enough. Thanks,
>
> Signed-off-by: Josef Bacik <josef@redhat.com>
Seems reasonable to me.
Acked-by: Joel Becker <joel.becker@oracle.com>
Joel
> ---
> fs/ocfs2/file.c | 10 ++++++++--
> 1 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
> index 77b4c04..181ae52 100644
> --- a/fs/ocfs2/file.c
> +++ b/fs/ocfs2/file.c
> @@ -1992,6 +1992,7 @@ static long ocfs2_fallocate(struct inode *inode, int mode, loff_t offset,
> struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
> struct ocfs2_space_resv sr;
> int change_size = 1;
> + int cmd = OCFS2_IOC_RESVSP64;
>
> if (!ocfs2_writes_unwritten_extents(osb))
> return -EOPNOTSUPP;
> @@ -2002,12 +2003,17 @@ static long ocfs2_fallocate(struct inode *inode, int mode, loff_t offset,
> if (mode & FALLOC_FL_KEEP_SIZE)
> change_size = 0;
>
> + if (mode & FALLOC_FL_PUNCH_HOLE) {
> + cmd = OCFS2_IOC_UNRESVSP64;
> + change_size = 0;
> + }
> +
> sr.l_whence = 0;
> sr.l_start = (s64)offset;
> sr.l_len = (s64)len;
>
> - return __ocfs2_change_file_space(NULL, inode, offset,
> - OCFS2_IOC_RESVSP64, &sr, change_size);
> + return __ocfs2_change_file_space(NULL, inode, offset, cmd, &sr,
> + change_size);
> }
>
> int ocfs2_check_range_for_refcount(struct inode *inode, loff_t pos,
> --
> 1.6.6.1
--
"Born under a bad sign.
I been down since I began to crawl.
If it wasn't for bad luck,
I wouldn't have no luck at all."
Joel Becker
Senior Development Manager
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127
* [Cluster-devel] [PATCH 1/6] fs: add hole punching to fallocate
[not found] ` <1290044780-2902-2-git-send-email-josef@redhat.com>
@ 2010-11-18 23:43 ` Jan Kara
0 siblings, 0 replies; 8+ messages in thread
From: Jan Kara @ 2010-11-18 23:43 UTC
To: cluster-devel.redhat.com
On Wed 17-11-10 20:46:15, Josef Bacik wrote:
> Hole punching has already been implemented by XFS and OCFS2, and has the
> potential to be implemented on both BTRFS and EXT4 so we need a generic way to
> get to this feature. The simplest way in my mind is to add FALLOC_FL_PUNCH_HOLE
> to fallocate() since it already looks like the normal fallocate() operation.
> I've tested this patch with XFS and BTRFS to make sure XFS did what it's
> supposed to do and that BTRFS failed like it was supposed to. Thank you,
Looks nice now. Acked-by: Jan Kara <jack@suse.cz>
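For reference, the mode validation in this version of the patch can be read as a small standalone function. The sketch below lifts the two checks from the quoted diff into a hypothetical helper, check_mode_v2(), which is not a kernel function; the flag values come from the falloc.h hunk.

```c
#include <errno.h>

/* Flag values as defined in the falloc.h hunk of this patch. */
#define FALLOC_FL_KEEP_SIZE  0x01
#define FALLOC_FL_PUNCH_HOLE 0x02

/* Hypothetical helper wrapping the two checks do_fallocate() gains here. */
static int check_mode_v2(int mode)
{
	/* Return error if mode is not supported. */
	if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE))
		return -EOPNOTSUPP;

	/* Punch hole must have keep size set. */
	if ((mode & FALLOC_FL_PUNCH_HOLE) && !(mode & FALLOC_FL_KEEP_SIZE))
		return -EOPNOTSUPP;

	return 0;
}
```

So the combination is no longer ambiguous: FALLOC_FL_PUNCH_HOLE on its own is rejected, and callers have to pass FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE.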
Honza
>
> Signed-off-by: Josef Bacik <josef@redhat.com>
> ---
> fs/open.c | 7 ++++++-
> include/linux/falloc.h | 1 +
> 2 files changed, 7 insertions(+), 1 deletions(-)
>
> diff --git a/fs/open.c b/fs/open.c
> index 4197b9e..5b6ef7e 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -223,7 +223,12 @@ int do_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
> return -EINVAL;
>
> /* Return error if mode is not supported */
> - if (mode && !(mode & FALLOC_FL_KEEP_SIZE))
> + if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE))
> + return -EOPNOTSUPP;
> +
> + /* Punch hole must have keep size set */
> + if ((mode & FALLOC_FL_PUNCH_HOLE) &&
> + !(mode & FALLOC_FL_KEEP_SIZE))
> return -EOPNOTSUPP;
>
> if (!(file->f_mode & FMODE_WRITE))
> diff --git a/include/linux/falloc.h b/include/linux/falloc.h
> index 3c15510..73e0b62 100644
> --- a/include/linux/falloc.h
> +++ b/include/linux/falloc.h
> @@ -2,6 +2,7 @@
> #define _FALLOC_H_
>
> #define FALLOC_FL_KEEP_SIZE 0x01 /* default is extend size */
> +#define FALLOC_FL_PUNCH_HOLE 0x02 /* de-allocates range */
>
> #ifdef __KERNEL__
>
> --
> 1.6.6.1
>
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR