public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* [RFC, PATCH 0/2] xfs: Fast zeroing of allocated space
@ 2010-07-27  5:55 Dave Chinner
  2010-07-27  5:55 ` [PATCH 1/2] xfs: use range primitives for xfs page cache operations Dave Chinner
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Dave Chinner @ 2010-07-27  5:55 UTC (permalink / raw)
  To: xfs

These two patches allow conversion of a written range of a file into
back into unwritten state. The first patch converts page cache
helper functions to use the correct range primitives as we need to
be able to toss (invalidate) pages only within the range specified,
not to to the end of the file as it currently uses.

The actual conversion also preallocates any holes in the range, so
it turns the entire range requested into allocated, unwritten
extents.

I also have a patch that adds this functionality to fallocate(). I
haven't tested that at all, but if this is something we want to
support, I'd suggest that we want fallocate to be able to do it...

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/2] xfs: use range primitives for xfs page cache operations
  2010-07-27  5:55 [RFC, PATCH 0/2] xfs: Fast zeroing of allocated space Dave Chinner
@ 2010-07-27  5:55 ` Dave Chinner
  2010-07-27  6:53   ` Christoph Hellwig
  2010-07-27  5:55 ` [PATCH 2/2] xfs: Introduce XFS_IOC_ZERO_RANGE Dave Chinner
  2010-07-27 20:49 ` [RFC, PATCH 0/2] xfs: Fast zeroing of allocated space Alex Elder
  2 siblings, 1 reply; 7+ messages in thread
From: Dave Chinner @ 2010-07-27  5:55 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

While XFS passes ranges to operate on from the core code, the
functions being called ignore the either the entire range or the end
of the range. This is historical because when the function were
written linux didn't have the necessary range operations. Update the
functions to use the correct operations.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/linux-2.6/xfs_fs_subr.c |   19 ++++++++++---------
 1 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_fs_subr.c b/fs/xfs/linux-2.6/xfs_fs_subr.c
index 1f279b0..e76d31d 100644
--- a/fs/xfs/linux-2.6/xfs_fs_subr.c
+++ b/fs/xfs/linux-2.6/xfs_fs_subr.c
@@ -32,10 +32,7 @@ xfs_tosspages(
 	xfs_off_t	last,
 	int		fiopt)
 {
-	struct address_space *mapping = VFS_I(ip)->i_mapping;
-
-	if (mapping->nrpages)
-		truncate_inode_pages(mapping, first);
+	truncate_inode_pages_range(VFS_I(ip)->i_mapping, first, last);
 }
 
 int
@@ -52,9 +49,10 @@ xfs_flushinval_pages(
 
 	if (mapping->nrpages) {
 		xfs_iflags_clear(ip, XFS_ITRUNCATED);
-		ret = filemap_write_and_wait(mapping);
+		ret = filemap_write_and_wait_range(mapping, first,
+					last == -1 ? LLONG_MAX : last);
 		if (!ret)
-			truncate_inode_pages(mapping, first);
+			truncate_inode_pages_range(mapping, first, last);
 	}
 	return -ret;
 }
@@ -73,7 +71,8 @@ xfs_flush_pages(
 
 	if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)) {
 		xfs_iflags_clear(ip, XFS_ITRUNCATED);
-		ret = -filemap_fdatawrite(mapping);
+		ret = -filemap_fdatawrite_range(mapping, first,
+					last == -1 ? LLONG_MAX : last);
 	}
 	if (flags & XBF_ASYNC)
 		return ret;
@@ -91,7 +90,9 @@ xfs_wait_on_pages(
 {
 	struct address_space *mapping = VFS_I(ip)->i_mapping;
 
-	if (mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
-		return -filemap_fdatawait(mapping);
+	if (mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK)) {
+		return -filemap_fdatawait_range(mapping, first,
+					last == -1 ? ip->i_size - 1 : last);
+	}
 	return 0;
 }
-- 
1.7.1

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/2] xfs: Introduce XFS_IOC_ZERO_RANGE
  2010-07-27  5:55 [RFC, PATCH 0/2] xfs: Fast zeroing of allocated space Dave Chinner
  2010-07-27  5:55 ` [PATCH 1/2] xfs: use range primitives for xfs page cache operations Dave Chinner
@ 2010-07-27  5:55 ` Dave Chinner
  2010-07-27  7:59   ` Christoph Hellwig
  2010-07-27 20:49 ` [RFC, PATCH 0/2] xfs: Fast zeroing of allocated space Alex Elder
  2 siblings, 1 reply; 7+ messages in thread
From: Dave Chinner @ 2010-07-27  5:55 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

XFS_IOC_ZERO_RANGE is the equivalent of an atomic XFS_IOC_UNRESVSP/
XFS_IOC_RESVSP call pair. It enabled ranges of written data to be
turned into zeroes without requiring IO or having to free and
reallocate the extents in the range given as would occur if we had
to punch and then preallocate them separately.  This enables
applications to zero parts of files very quickly without changing
the layout of the files in any way.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/linux-2.6/xfs_ioctl.c   |    3 ++-
 fs/xfs/linux-2.6/xfs_ioctl32.c |    1 +
 fs/xfs/xfs_bmap.c              |   14 +++++++++++---
 fs/xfs/xfs_bmap.h              |    9 ++++++---
 fs/xfs/xfs_fs.h                |    1 +
 fs/xfs/xfs_vnodeops.c          |   13 +++++++++++--
 6 files changed, 32 insertions(+), 9 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_ioctl.c b/fs/xfs/linux-2.6/xfs_ioctl.c
index 237f5ff..a9c4810 100644
--- a/fs/xfs/linux-2.6/xfs_ioctl.c
+++ b/fs/xfs/linux-2.6/xfs_ioctl.c
@@ -1292,7 +1292,8 @@ xfs_file_ioctl(
 	case XFS_IOC_ALLOCSP64:
 	case XFS_IOC_FREESP64:
 	case XFS_IOC_RESVSP64:
-	case XFS_IOC_UNRESVSP64: {
+	case XFS_IOC_UNRESVSP64:
+	case XFS_IOC_ZERO_RANGE: {
 		xfs_flock64_t		bf;
 
 		if (copy_from_user(&bf, arg, sizeof(bf)))
diff --git a/fs/xfs/linux-2.6/xfs_ioctl32.c b/fs/xfs/linux-2.6/xfs_ioctl32.c
index 6c83f7f..464bcc7 100644
--- a/fs/xfs/linux-2.6/xfs_ioctl32.c
+++ b/fs/xfs/linux-2.6/xfs_ioctl32.c
@@ -574,6 +574,7 @@ xfs_file_compat_ioctl(
 	case XFS_IOC_FSGEOMETRY_V1:
 	case XFS_IOC_FSGROWFSDATA:
 	case XFS_IOC_FSGROWFSRT:
+	case XFS_IOC_ZERO_RANGE:
 		return xfs_file_ioctl(filp, cmd, p);
 #else
 	case XFS_IOC_ALLOCSP_32:
diff --git a/fs/xfs/xfs_bmap.c b/fs/xfs/xfs_bmap.c
index 23f14e5..581e646 100644
--- a/fs/xfs/xfs_bmap.c
+++ b/fs/xfs/xfs_bmap.c
@@ -4744,8 +4744,14 @@ xfs_bmapi(
 		 * Check if writing previously allocated but
 		 * unwritten extents.
 		 */
-		if (wr && mval->br_state == XFS_EXT_UNWRITTEN &&
-		    ((flags & (XFS_BMAPI_PREALLOC|XFS_BMAPI_DELAY)) == 0)) {
+		if (wr &&
+		    ((mval->br_state == XFS_EXT_UNWRITTEN &&
+		      ((flags & (XFS_BMAPI_PREALLOC|XFS_BMAPI_DELAY)) == 0)) ||
+		     (mval->br_state == XFS_EXT_NORM &&
+		      ((flags & (XFS_BMAPI_PREALLOC|XFS_BMAPI_CONVERT)) ==
+				(XFS_BMAPI_PREALLOC|XFS_BMAPI_CONVERT))))) {
+
+			printk("extent conversion\n");
 			/*
 			 * Modify (by adding) the state flag, if writing.
 			 */
@@ -4757,7 +4763,9 @@ xfs_bmapi(
 					*firstblock;
 				cur->bc_private.b.flist = flist;
 			}
-			mval->br_state = XFS_EXT_NORM;
+			mval->br_state = (mval->br_state == XFS_EXT_UNWRITTEN)
+						? XFS_EXT_NORM
+						: XFS_EXT_UNWRITTEN;
 			error = xfs_bmap_add_extent(ip, lastx, &cur, mval,
 				firstblock, flist, &tmp_logflags,
 				whichfork, (flags & XFS_BMAPI_RSVBLOCKS));
diff --git a/fs/xfs/xfs_bmap.h b/fs/xfs/xfs_bmap.h
index b13569a..71ec9b6 100644
--- a/fs/xfs/xfs_bmap.h
+++ b/fs/xfs/xfs_bmap.h
@@ -74,9 +74,12 @@ typedef	struct xfs_bmap_free
 #define	XFS_BMAPI_IGSTATE	0x080	/* Ignore state - */
 					/* combine contig. space */
 #define	XFS_BMAPI_CONTIG	0x100	/* must allocate only one extent */
-#define XFS_BMAPI_CONVERT	0x200	/* unwritten extent conversion - */
-					/* need write cache flushing and no */
-					/* additional allocation alignments */
+/*
+ * unwritten extent conversion - this needs write cache flushing and no additional
+ * allocation alignments. When specified with XFS_BMAPI_PREALLOC it converts
+ * from written to unwritten, otherwise convert from unwritten to written.
+ */
+#define XFS_BMAPI_CONVERT	0x200
 
 #define XFS_BMAPI_FLAGS \
 	{ XFS_BMAPI_WRITE,	"WRITE" }, \
diff --git a/fs/xfs/xfs_fs.h b/fs/xfs/xfs_fs.h
index 7cf7220..6af6518 100644
--- a/fs/xfs/xfs_fs.h
+++ b/fs/xfs/xfs_fs.h
@@ -446,6 +446,7 @@ typedef struct xfs_handle {
 /*	XFS_IOC_SETBIOSIZE ---- deprecated 46	   */
 /*	XFS_IOC_GETBIOSIZE ---- deprecated 47	   */
 #define XFS_IOC_GETBMAPX	_IOWR('X', 56, struct getbmap)
+#define XFS_IOC_ZERO_RANGE	_IOW ('X', 57, struct xfs_flock64)
 
 /*
  * ioctl commands that replace IRIX syssgi()'s
diff --git a/fs/xfs/xfs_vnodeops.c b/fs/xfs/xfs_vnodeops.c
index 3ac137d..9f82422 100644
--- a/fs/xfs/xfs_vnodeops.c
+++ b/fs/xfs/xfs_vnodeops.c
@@ -2270,7 +2270,7 @@ xfs_alloc_file_space(
 	count = len;
 	imapp = &imaps[0];
 	nimaps = 1;
-	bmapi_flag = XFS_BMAPI_WRITE | (alloc_type ? XFS_BMAPI_PREALLOC : 0);
+	bmapi_flag = XFS_BMAPI_WRITE | alloc_type;
 	startoffset_fsb	= XFS_B_TO_FSBT(mp, offset);
 	allocatesize_fsb = XFS_B_TO_FSB(mp, count);
 
@@ -2702,7 +2702,9 @@ xfs_change_file_space(
 	xfs_off_t	llen;
 	xfs_trans_t	*tp;
 	struct iattr	iattr;
+	int		prealloc_type;
 
+	printk("type 0x%x\n", bf->l_type);
 	if (!S_ISREG(ip->i_d.di_mode))
 		return XFS_ERROR(EINVAL);
 
@@ -2744,12 +2746,19 @@ xfs_change_file_space(
 	 * size to be changed.
 	 */
 	setprealloc = clrprealloc = 0;
+	prealloc_type = XFS_BMAPI_PREALLOC;
 
 	switch (cmd) {
+	case XFS_IOC_ZERO_RANGE:
+		prealloc_type |= XFS_BMAPI_CONVERT;
+		printk("force prealloc off 0x%llx len 0x%llx\n",
+				startoffset, bf->l_len);
+		xfs_tosspages(ip, startoffset, startoffset + bf->l_len, 0);
+		/* FALLTHRU */
 	case XFS_IOC_RESVSP:
 	case XFS_IOC_RESVSP64:
 		error = xfs_alloc_file_space(ip, startoffset, bf->l_len,
-								1, attr_flags);
+						prealloc_type, attr_flags);
 		if (error)
 			return error;
 		setprealloc = 1;
-- 
1.7.1

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] xfs: use range primitives for xfs page cache operations
  2010-07-27  5:55 ` [PATCH 1/2] xfs: use range primitives for xfs page cache operations Dave Chinner
@ 2010-07-27  6:53   ` Christoph Hellwig
  2010-07-27 12:05     ` Dave Chinner
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2010-07-27  6:53 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Tue, Jul 27, 2010 at 03:55:28PM +1000, Dave Chinner wrote:
> While XFS passes ranges to operate on from the core code, the
> functions being called ignore the either the entire range or the end
> of the range. This is historical because when the function were
> written linux didn't have the necessary range operations. Update the
> functions to use the correct operations.

Assuming you have actually tested this - given that we've ignore
these parameters so long that I'm really fearing some callers have
started to rely on that behaviour.

>  	if (mapping->nrpages) {

I'd drop this check ere as well - no other caller does it.

>  		xfs_iflags_clear(ip, XFS_ITRUNCATED);
> -		ret = filemap_write_and_wait(mapping);
> +		ret = filemap_write_and_wait_range(mapping, first,
> +					last == -1 ? LLONG_MAX : last);
>  		if (!ret)
> -			truncate_inode_pages(mapping, first);
> +			truncate_inode_pages_range(mapping, first, last);
>  	}
>  	return -ret;
>  }
> @@ -73,7 +71,8 @@ xfs_flush_pages(
>  
>  	if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)) {

Same for this check.

>  		xfs_iflags_clear(ip, XFS_ITRUNCATED);
> -		ret = -filemap_fdatawrite(mapping);
> +		ret = -filemap_fdatawrite_range(mapping, first,
> +					last == -1 ? LLONG_MAX : last);

Also for the non-async case we should just use
filemap_write_and_wait_range, and kill off xfs_wait_on_pages.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] xfs: Introduce XFS_IOC_ZERO_RANGE
  2010-07-27  5:55 ` [PATCH 2/2] xfs: Introduce XFS_IOC_ZERO_RANGE Dave Chinner
@ 2010-07-27  7:59   ` Christoph Hellwig
  0 siblings, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2010-07-27  7:59 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Tue, Jul 27, 2010 at 03:55:29PM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> XFS_IOC_ZERO_RANGE is the equivalent of an atomic XFS_IOC_UNRESVSP/
> XFS_IOC_RESVSP call pair. It enabled ranges of written data to be
> turned into zeroes without requiring IO or having to free and
> reallocate the extents in the range given as would occur if we had
> to punch and then preallocate them separately.  This enables
> applications to zero parts of files very quickly without changing
> the layout of the files in any way.

This looks good minus the left over printks.  It'll also needs xfstests
coverage and a description in the xfsctl manpage.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] xfs: use range primitives for xfs page cache operations
  2010-07-27  6:53   ` Christoph Hellwig
@ 2010-07-27 12:05     ` Dave Chinner
  0 siblings, 0 replies; 7+ messages in thread
From: Dave Chinner @ 2010-07-27 12:05 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On Tue, Jul 27, 2010 at 02:53:26AM -0400, Christoph Hellwig wrote:
> On Tue, Jul 27, 2010 at 03:55:28PM +1000, Dave Chinner wrote:
> > While XFS passes ranges to operate on from the core code, the
> > functions being called ignore the either the entire range or the end
> > of the range. This is historical because when the function were
> > written linux didn't have the necessary range operations. Update the
> > functions to use the correct operations.
> 
> Assuming you have actually tested this - given that we've ignore
> these parameters so long that I'm really fearing some callers have
> started to rely on that behaviour.

Several xfstests run on different configs haven't shown any
regressions.

All current callers of xfs_tosspages() trim from and offset to EOF,
so no change in behaviour there.

All of the xfs_flush_pages() icallers except one flush the entire
file (0 to -1), except for a single call in xfs_setattr which
between on disk size and the new filesystem when extending the file
size by truncate.  It will now flush just the necessary range
instead of the whole file.

All of xfs_flushinval_pages() callers flush to the end of file,
so once again there should be no change in behaviour there.

So I don't think there really is much risk here - the only one that
I'd be concerned about is the setattr() call and we have tests
covering that specific case (138, 139 and 140)....

> >  	if (mapping->nrpages) {
> 
> I'd drop this check ere as well - no other caller does it.
> 
> >  		xfs_iflags_clear(ip, XFS_ITRUNCATED);
> > -		ret = filemap_write_and_wait(mapping);
> > +		ret = filemap_write_and_wait_range(mapping, first,
> > +					last == -1 ? LLONG_MAX : last);
> >  		if (!ret)
> > -			truncate_inode_pages(mapping, first);
> > +			truncate_inode_pages_range(mapping, first, last);
> >  	}
> >  	return -ret;
> >  }
> > @@ -73,7 +71,8 @@ xfs_flush_pages(
> >  
> >  	if (mapping_tagged(mapping, PAGECACHE_TAG_DIRTY)) {
> 
> Same for this check.

Ok, killed those checks.

> >  		xfs_iflags_clear(ip, XFS_ITRUNCATED);
> > -		ret = -filemap_fdatawrite(mapping);
> > +		ret = -filemap_fdatawrite_range(mapping, first,
> > +					last == -1 ? LLONG_MAX : last);
> 
> Also for the non-async case we should just use
> filemap_write_and_wait_range, and kill off xfs_wait_on_pages.

That can be done separately, I think. If we don't see any
problems from this patch, then I think we can kill all
the wrappers, not just one of them.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [RFC, PATCH 0/2] xfs: Fast zeroing of allocated space
  2010-07-27  5:55 [RFC, PATCH 0/2] xfs: Fast zeroing of allocated space Dave Chinner
  2010-07-27  5:55 ` [PATCH 1/2] xfs: use range primitives for xfs page cache operations Dave Chinner
  2010-07-27  5:55 ` [PATCH 2/2] xfs: Introduce XFS_IOC_ZERO_RANGE Dave Chinner
@ 2010-07-27 20:49 ` Alex Elder
  2 siblings, 0 replies; 7+ messages in thread
From: Alex Elder @ 2010-07-27 20:49 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Tue, 2010-07-27 at 15:55 +1000, Dave Chinner wrote:
> These two patches allow conversion of a written range of a file into
> back into unwritten state. The first patch converts page cache
> helper functions to use the correct range primitives as we need to
> be able to toss (invalidate) pages only within the range specified,
> not to to the end of the file as it currently uses.
> 
> The actual conversion also preallocates any holes in the range, so
> it turns the entire range requested into allocated, unwritten
> extents.
> 
> I also have a patch that adds this functionality to fallocate(). I
> haven't tested that at all, but if this is something we want to
> support, I'd suggest that we want fallocate to be able to do it...

Nice feature.  Kind of surprising to me how small the set
of changes it requires is.

I agree with Christoph's comments.  These two patches otherwise
look good to me as well.

Reviewed-by: Alex Elder <aelder@sgi.com>


_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2010-07-27 20:46 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-07-27  5:55 [RFC, PATCH 0/2] xfs: Fast zeroing of allocated space Dave Chinner
2010-07-27  5:55 ` [PATCH 1/2] xfs: use range primitives for xfs page cache operations Dave Chinner
2010-07-27  6:53   ` Christoph Hellwig
2010-07-27 12:05     ` Dave Chinner
2010-07-27  5:55 ` [PATCH 2/2] xfs: Introduce XFS_IOC_ZERO_RANGE Dave Chinner
2010-07-27  7:59   ` Christoph Hellwig
2010-07-27 20:49 ` [RFC, PATCH 0/2] xfs: Fast zeroing of allocated space Alex Elder

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox