[PATCH 0/7] lseek SEEK_HOLE / SEEK

linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/7] lseek SEEK_HOLE / SEEK_DATA via iomap
@ 2017-06-16 14:51 Andreas Gruenbacher
  2017-06-16 14:51 ` [PATCH 1/7] GFS2: Make height info part of metapath Andreas Gruenbacher
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Andreas Gruenbacher @ 2017-06-16 14:51 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Andreas Gruenbacher, Bob Peterson, linux-xfs, cluster-devel

Here's an updated patch queue that implements iomap in gfs2 and switches gfs2
fiemap over to use iomap.  Following that is lseek SEEK_HOLE and SEEK_DATA
support for gfs2 and xfs.  Finally, xfs's internal use of lseek SEEK_HOLE /
SEEK_DATA is converted to iomap.

With the approach taken by this patch queue, filesystems that support iomap
still need to implement fiemap and lseek SEEK_HOLE / SEEK_DATA separately.
Those implementations will all be identical except for the filesystem specific
locking / unlocking code at the beginning / end of the operations.

Wouldn't it make sense to add iomap_lock and iomap_unlock operations, expose
the iomap ops at the vfs layer, and implement fiemap and lseek SEEK_HOLE /
SEEK_DATA generically in a next step?

Thanks,
Andreas

Andreas Gruenbacher (4):
  vfs: Add iomap_seek_hole_data helper
  gfs2: Implement lseek SEEK_HOLE / SEEK_DATA via iomap
  xfs: Switch to iomap for lseek SEEK_HOLE / SEEK_DATA
  xfs: Switch to iomap for seeking in quota files

Bob Peterson (3):
  GFS2: Make height info part of metapath
  GFS2: Implement iomap for block_map
  GFS2: Switch fiemap implementation to use iomap

 fs/gfs2/Kconfig       |   1 +
 fs/gfs2/bmap.c        | 288 ++++++++++++++++++++++++++++++------------
 fs/gfs2/bmap.h        |   4 +
 fs/gfs2/file.c        |  14 ++-
 fs/gfs2/inode.c       |  54 ++++++--
 fs/gfs2/inode.h       |   1 +
 fs/gfs2/trace_gfs2.h  |  65 ++++++++++
 fs/iomap.c            |  81 ++++++++++++
 fs/xfs/xfs_dquot.c    |   5 +-
 fs/xfs/xfs_file.c     | 337 +-------------------------------------------------
 fs/xfs/xfs_inode.h    |   2 -
 include/linux/iomap.h |   9 +-
 12 files changed, 429 insertions(+), 432 deletions(-)

-- 
2.7.5

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/7] GFS2: Make height info part of metapath
  2017-06-16 14:51 [PATCH 0/7] lseek SEEK_HOLE / SEEK_DATA via iomap Andreas Gruenbacher
@ 2017-06-16 14:51 ` Andreas Gruenbacher
  2017-06-16 14:51 ` [PATCH 2/7] GFS2: Implement iomap for block_map Andreas Gruenbacher
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Andreas Gruenbacher @ 2017-06-16 14:51 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Bob Peterson, linux-xfs, cluster-devel, Andreas Gruenbacher

From: Bob Peterson <rpeterso@redhat.com>

This patch eliminates height parameters from function gfs2_bmap_alloc.
Function find_metapath determines the metapath's "find height", also
known as the desired height. Function lookup_metapath determines the
metapath's "actual height", previously known as starting height or
sheight. Function gfs2_bmap_alloc now gets both height values from
the metapath. This simplification was done as a step toward switching
the block_map functions to using iomap. The bh_map responsibilities
are also removed from function gfs2_bmap_alloc for the same reason.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/bmap.c | 109 ++++++++++++++++++++++++++++++++-------------------------
 1 file changed, 62 insertions(+), 47 deletions(-)

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index 4d810be..a431afd 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -36,6 +36,8 @@
 struct metapath {
 	struct buffer_head *mp_bh[GFS2_MAX_META_HEIGHT];
 	__u16 mp_list[GFS2_MAX_META_HEIGHT];
+	int mp_fheight; /* find_metapath height */
+	int mp_aheight; /* actual height (lookup height) */
 };
 
 /**
@@ -235,9 +237,9 @@ static void find_metapath(const struct gfs2_sbd *sdp, u64 block,
 {
 	unsigned int i;
 
+	mp->mp_fheight = height;
 	for (i = height; i--;)
 		mp->mp_list[i] = do_div(block, sdp->sd_inptrs);
-
 }
 
 static inline unsigned int metapath_branch_start(const struct metapath *mp)
@@ -344,10 +346,13 @@ static int lookup_metapath(struct gfs2_inode *ip, struct metapath *mp)
 	for (x = 0; x < end_of_metadata; x++) {
 		ret = lookup_mp_height(ip, mp, x);
 		if (ret)
-			return ret;
+			goto out;
 	}
 
-	return ip->i_height;
+	ret = ip->i_height;
+out:
+	mp->mp_aheight = ret;
+	return ret;
 }
 
 /**
@@ -479,10 +484,11 @@ static inline unsigned int hptrs(struct gfs2_sbd *sdp, const unsigned int hgt)
  * @inode: The GFS2 inode
  * @lblock: The logical starting block of the extent
  * @bh_map: This is used to return the mapping details
- * @mp: The metapath
- * @sheight: The starting height (i.e. whats already mapped)
- * @height: The height to build to
+ * @zero_new: True if newly allocated blocks should be zeroed
+ * @mp: The metapath, with proper height information calculated
  * @maxlen: The max number of data blocks to alloc
+ * @dblock: Pointer to return the resulting new block
+ * @dblks: Pointer to return the number of blocks allocated
  *
  * In this routine we may have to alloc:
  *   i) Indirect blocks to grow the metadata tree height
@@ -499,62 +505,63 @@ static inline unsigned int hptrs(struct gfs2_sbd *sdp, const unsigned int hgt)
  */
 
 static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
-			   struct buffer_head *bh_map, struct metapath *mp,
-			   const unsigned int sheight,
-			   const unsigned int height,
-			   const size_t maxlen)
+			   bool zero_new, struct metapath *mp,
+			   const size_t maxlen, sector_t *dblock,
+			   unsigned *dblks)
 {
 	struct gfs2_inode *ip = GFS2_I(inode);
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
 	struct super_block *sb = sdp->sd_vfs;
 	struct buffer_head *dibh = mp->mp_bh[0];
-	u64 bn, dblock = 0;
+	u64 bn;
 	unsigned n, i, blks, alloced = 0, iblks = 0, branch_start = 0;
-	unsigned dblks = 0;
 	unsigned ptrs_per_blk;
-	const unsigned end_of_metadata = height - 1;
+	const unsigned end_of_metadata = mp->mp_fheight - 1;
 	int ret;
 	int eob = 0;
 	enum alloc_state state;
 	__be64 *ptr;
 	__be64 zero_bn = 0;
 
-	BUG_ON(sheight < 1);
+	BUG_ON(mp->mp_aheight < 1);
 	BUG_ON(dibh == NULL);
 
+	*dblock = 0;
+	*dblks = 0;
 	gfs2_trans_add_meta(ip->i_gl, dibh);
 
-	if (height == sheight) {
+	if (mp->mp_fheight == mp->mp_aheight) {
 		struct buffer_head *bh;
 		/* Bottom indirect block exists, find unalloced extent size */
 		ptr = metapointer(end_of_metadata, mp);
 		bh = mp->mp_bh[end_of_metadata];
-		dblks = gfs2_extent_length(bh->b_data, bh->b_size, ptr, maxlen,
-					   &eob);
-		BUG_ON(dblks < 1);
+		*dblks = gfs2_extent_length(bh->b_data, bh->b_size, ptr,
+					    maxlen, &eob);
+		BUG_ON(*dblks < 1);
 		state = ALLOC_DATA;
 	} else {
 		/* Need to allocate indirect blocks */
-		ptrs_per_blk = height > 1 ? sdp->sd_inptrs : sdp->sd_diptrs;
-		dblks = min(maxlen, (size_t)(ptrs_per_blk -
-					     mp->mp_list[end_of_metadata]));
-		if (height == ip->i_height) {
+		ptrs_per_blk = mp->mp_fheight > 1 ? sdp->sd_inptrs :
+			sdp->sd_diptrs;
+		*dblks = min(maxlen, (size_t)(ptrs_per_blk -
+					      mp->mp_list[end_of_metadata]));
+		if (mp->mp_fheight == ip->i_height) {
 			/* Writing into existing tree, extend tree down */
-			iblks = height - sheight;
+			iblks = mp->mp_fheight - mp->mp_aheight;
 			state = ALLOC_GROW_DEPTH;
 		} else {
 			/* Building up tree height */
 			state = ALLOC_GROW_HEIGHT;
-			iblks = height - ip->i_height;
+			iblks = mp->mp_fheight - ip->i_height;
 			branch_start = metapath_branch_start(mp);
-			iblks += (height - branch_start);
+			iblks += (mp->mp_fheight - branch_start);
 		}
 	}
 
 	/* start of the second part of the function (state machine) */
 
-	blks = dblks + iblks;
-	i = sheight;
+	blks = *dblks + iblks;
+	i = mp->mp_aheight;
 	do {
 		int error;
 		n = blks - alloced;
@@ -572,9 +579,10 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
 						 sizeof(struct gfs2_dinode));
 				zero_bn = *ptr;
 			}
-			for (; i - 1 < height - ip->i_height && n > 0; i++, n--)
+			for (; i - 1 < mp->mp_fheight - ip->i_height && n > 0;
+			     i++, n--)
 				gfs2_indirect_init(mp, ip->i_gl, i, 0, bn++);
-			if (i - 1 == height - ip->i_height) {
+			if (i - 1 == mp->mp_fheight - ip->i_height) {
 				i--;
 				gfs2_buffer_copy_tail(mp->mp_bh[i],
 						sizeof(struct gfs2_meta_header),
@@ -586,7 +594,7 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
 					sizeof(struct gfs2_meta_header));
 				*ptr = zero_bn;
 				state = ALLOC_GROW_DEPTH;
-				for(i = branch_start; i < height; i++) {
+				for(i = branch_start; i < mp->mp_fheight; i++) {
 					if (mp->mp_bh[i] == NULL)
 						break;
 					brelse(mp->mp_bh[i]);
@@ -598,44 +606,40 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
 				break;
 		/* Branching from existing tree */
 		case ALLOC_GROW_DEPTH:
-			if (i > 1 && i < height)
+			if (i > 1 && i < mp->mp_fheight)
 				gfs2_trans_add_meta(ip->i_gl, mp->mp_bh[i-1]);
-			for (; i < height && n > 0; i++, n--)
+			for (; i < mp->mp_fheight && n > 0; i++, n--)
 				gfs2_indirect_init(mp, ip->i_gl, i,
 						   mp->mp_list[i-1], bn++);
-			if (i == height)
+			if (i == mp->mp_fheight)
 				state = ALLOC_DATA;
 			if (n == 0)
 				break;
 		/* Tree complete, adding data blocks */
 		case ALLOC_DATA:
-			BUG_ON(n > dblks);
+			BUG_ON(n > *dblks);
 			BUG_ON(mp->mp_bh[end_of_metadata] == NULL);
 			gfs2_trans_add_meta(ip->i_gl, mp->mp_bh[end_of_metadata]);
-			dblks = n;
+			*dblks = n;
 			ptr = metapointer(end_of_metadata, mp);
-			dblock = bn;
+			*dblock = bn;
 			while (n-- > 0)
 				*ptr++ = cpu_to_be64(bn++);
-			if (buffer_zeronew(bh_map)) {
-				ret = sb_issue_zeroout(sb, dblock, dblks,
+			if (zero_new) {
+				ret = sb_issue_zeroout(sb, *dblock, *dblks,
 						       GFP_NOFS);
 				if (ret) {
 					fs_err(sdp,
 					       "Failed to zero data buffers\n");
-					clear_buffer_zeronew(bh_map);
 				}
 			}
 			break;
 		}
-	} while ((state != ALLOC_DATA) || !dblock);
+	} while ((state != ALLOC_DATA) || !(*dblock));
 
-	ip->i_height = height;
+	ip->i_height = mp->mp_fheight;
 	gfs2_add_inode_blocks(&ip->i_inode, alloced);
 	gfs2_dinode_out(ip, mp->mp_bh[0]->b_data);
-	map_bh(bh_map, inode->i_sb, dblock);
-	bh_map->b_size = dblks << inode->i_blkbits;
-	set_buffer_new(bh_map);
 	return 0;
 }
 
@@ -669,6 +673,9 @@ int gfs2_block_map(struct inode *inode, sector_t lblock,
 	unsigned int len;
 	struct buffer_head *bh;
 	u8 height;
+	bool zero_new = false;
+	sector_t dblock = 0;
+	unsigned dblks;
 
 	BUG_ON(maxlen == 0);
 
@@ -692,13 +699,13 @@ int gfs2_block_map(struct inode *inode, sector_t lblock,
 	while (size > arr[height])
 		height++;
 	find_metapath(sdp, lblock, &mp, height);
-	ret = 1;
+	mp.mp_aheight = 1;
 	if (height > ip->i_height || gfs2_is_stuffed(ip))
 		goto do_alloc;
 	ret = lookup_metapath(ip, &mp);
 	if (ret < 0)
 		goto out;
-	if (ret != ip->i_height)
+	if (mp.mp_aheight != ip->i_height)
 		goto do_alloc;
 	ptr = metapointer(ip->i_height - 1, &mp);
 	if (*ptr == 0)
@@ -725,7 +732,15 @@ int gfs2_block_map(struct inode *inode, sector_t lblock,
 	}
 
 	/* At this point ret is the tree depth of already allocated blocks */
-	ret = gfs2_bmap_alloc(inode, lblock, bh_map, &mp, ret, height, maxlen);
+	if (buffer_zeronew(bh_map))
+		zero_new = true;
+	ret = gfs2_bmap_alloc(inode, lblock, zero_new, &mp, maxlen, &dblock,
+			      &dblks);
+	if (ret == 0) {
+		map_bh(bh_map, inode->i_sb, dblock);
+		bh_map->b_size = dblks << inode->i_blkbits;
+		set_buffer_new(bh_map);
+	}
 	goto out;
 }
 
-- 
2.7.5

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/7] GFS2: Implement iomap for block_map
  2017-06-16 14:51 [PATCH 0/7] lseek SEEK_HOLE / SEEK_DATA via iomap Andreas Gruenbacher
  2017-06-16 14:51 ` [PATCH 1/7] GFS2: Make height info part of metapath Andreas Gruenbacher
@ 2017-06-16 14:51 ` Andreas Gruenbacher
  2017-06-16 14:51 ` [PATCH 3/7] GFS2: Switch fiemap implementation to use iomap Andreas Gruenbacher
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Andreas Gruenbacher @ 2017-06-16 14:51 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Bob Peterson, linux-xfs, cluster-devel, Andreas Gruenbacher

From: Bob Peterson <rpeterso@redhat.com>

This patch implements iomap for block mapping, and switches the
block_map function to use it under the covers.

The additional IOMAP_F_BOUNDARY iomap flag indicates when iomap has
reached a "metadata boundary" and fetching the next mapping is likely to
incur an additional I/O.  This flag is used for setting the bh buffer
boundary flag.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/bmap.c        | 249 ++++++++++++++++++++++++++++++++++++--------------
 fs/gfs2/bmap.h        |   4 +
 fs/gfs2/trace_gfs2.h  |  65 +++++++++++++
 include/linux/iomap.h |   3 +-
 4 files changed, 250 insertions(+), 71 deletions(-)

diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
index a431afd..fa33fdc 100644
--- a/fs/gfs2/bmap.c
+++ b/fs/gfs2/bmap.c
@@ -13,6 +13,7 @@
 #include <linux/blkdev.h>
 #include <linux/gfs2_ondisk.h>
 #include <linux/crc32.h>
+#include <linux/iomap.h>
 
 #include "gfs2.h"
 #include "incore.h"
@@ -416,7 +417,6 @@ static inline unsigned int gfs2_extent_length(void *start, unsigned int len, __b
 	const __be64 *first = ptr;
 	u64 d = be64_to_cpu(*ptr);
 
-	*eob = 0;
 	do {
 		ptr++;
 		if (ptr >= end)
@@ -504,10 +504,8 @@ static inline unsigned int hptrs(struct gfs2_sbd *sdp, const unsigned int hgt)
  * Returns: errno on error
  */
 
-static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
-			   bool zero_new, struct metapath *mp,
-			   const size_t maxlen, sector_t *dblock,
-			   unsigned *dblks)
+static int gfs2_iomap_alloc(struct inode *inode, struct iomap *iomap,
+			    unsigned flags, struct metapath *mp)
 {
 	struct gfs2_inode *ip = GFS2_I(inode);
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
@@ -515,36 +513,37 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
 	struct buffer_head *dibh = mp->mp_bh[0];
 	u64 bn;
 	unsigned n, i, blks, alloced = 0, iblks = 0, branch_start = 0;
+	unsigned dblks = 0;
 	unsigned ptrs_per_blk;
 	const unsigned end_of_metadata = mp->mp_fheight - 1;
 	int ret;
-	int eob = 0;
 	enum alloc_state state;
 	__be64 *ptr;
 	__be64 zero_bn = 0;
+	size_t maxlen = iomap->length >> inode->i_blkbits;
 
 	BUG_ON(mp->mp_aheight < 1);
 	BUG_ON(dibh == NULL);
 
-	*dblock = 0;
-	*dblks = 0;
 	gfs2_trans_add_meta(ip->i_gl, dibh);
 
 	if (mp->mp_fheight == mp->mp_aheight) {
 		struct buffer_head *bh;
+		int eob;
+
 		/* Bottom indirect block exists, find unalloced extent size */
 		ptr = metapointer(end_of_metadata, mp);
 		bh = mp->mp_bh[end_of_metadata];
-		*dblks = gfs2_extent_length(bh->b_data, bh->b_size, ptr,
-					    maxlen, &eob);
-		BUG_ON(*dblks < 1);
+		dblks = gfs2_extent_length(bh->b_data, bh->b_size, ptr,
+					   maxlen, &eob);
+		BUG_ON(dblks < 1);
 		state = ALLOC_DATA;
 	} else {
 		/* Need to allocate indirect blocks */
 		ptrs_per_blk = mp->mp_fheight > 1 ? sdp->sd_inptrs :
 			sdp->sd_diptrs;
-		*dblks = min(maxlen, (size_t)(ptrs_per_blk -
-					      mp->mp_list[end_of_metadata]));
+		dblks = min(maxlen, (size_t)(ptrs_per_blk -
+					     mp->mp_list[end_of_metadata]));
 		if (mp->mp_fheight == ip->i_height) {
 			/* Writing into existing tree, extend tree down */
 			iblks = mp->mp_fheight - mp->mp_aheight;
@@ -560,7 +559,7 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
 
 	/* start of the second part of the function (state machine) */
 
-	blks = *dblks + iblks;
+	blks = dblks + iblks;
 	i = mp->mp_aheight;
 	do {
 		int error;
@@ -617,26 +616,28 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
 				break;
 		/* Tree complete, adding data blocks */
 		case ALLOC_DATA:
-			BUG_ON(n > *dblks);
+			BUG_ON(n > dblks);
 			BUG_ON(mp->mp_bh[end_of_metadata] == NULL);
 			gfs2_trans_add_meta(ip->i_gl, mp->mp_bh[end_of_metadata]);
-			*dblks = n;
+			dblks = n;
 			ptr = metapointer(end_of_metadata, mp);
-			*dblock = bn;
+			iomap->blkno = bn;
 			while (n-- > 0)
 				*ptr++ = cpu_to_be64(bn++);
-			if (zero_new) {
-				ret = sb_issue_zeroout(sb, *dblock, *dblks,
-						       GFP_NOFS);
+			if (flags & IOMAP_ZERO) {
+				ret = sb_issue_zeroout(sb, iomap->blkno,
+						       dblks, GFP_NOFS);
 				if (ret) {
 					fs_err(sdp,
 					       "Failed to zero data buffers\n");
+					flags &= ~IOMAP_ZERO;
 				}
 			}
 			break;
 		}
-	} while ((state != ALLOC_DATA) || !(*dblock));
+	} while (iomap->blkno == IOMAP_NULL_BLOCK);
 
+	iomap->length = (u64)dblks << inode->i_blkbits;
 	ip->i_height = mp->mp_fheight;
 	gfs2_add_inode_blocks(&ip->i_inode, alloced);
 	gfs2_dinode_out(ip, mp->mp_bh[0]->b_data);
@@ -644,47 +645,101 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
 }
 
 /**
- * gfs2_block_map - Map a block from an inode to a disk block
- * @inode: The inode
- * @lblock: The logical block number
- * @bh_map: The bh to be mapped
- * @create: True if its ok to alloc blocks to satify the request
+ * hole_size - figure out the size of a hole
+ * @ip: The inode
+ * @lblock: The logical starting block number
+ * @mp: The metapath
  *
- * Sets buffer_mapped() if successful, sets buffer_boundary() if a
- * read of metadata will be required before the next block can be
- * mapped. Sets buffer_new() if new blocks were allocated.
+ * Returns: The hole size in bytes
  *
- * Returns: errno
  */
+static u64 hole_size(struct inode *inode, sector_t lblock, struct metapath *mp)
+{
+	struct gfs2_inode *ip = GFS2_I(inode);
+	struct gfs2_sbd *sdp = GFS2_SB(inode);
+	struct metapath mp_eof;
+	u64 factor = 1;
+	int hgt;
+	u64 holesz = 0;
+	const __be64 *first, *end, *ptr;
+	const struct buffer_head *bh;
+	u64 lblock_stop = (i_size_read(inode) - 1) >> inode->i_blkbits;
+	int zeroptrs;
+	bool done = false;
+
+	/* Get another metapath, to the very last byte */
+	find_metapath(sdp, lblock_stop, &mp_eof, ip->i_height);
+	for (hgt = ip->i_height - 1; hgt >= 0 && !done; hgt--) {
+		bh = mp->mp_bh[hgt];
+		if (bh) {
+			zeroptrs = 0;
+			first = metapointer(hgt, mp);
+			end = (const __be64 *)(bh->b_data + bh->b_size);
+
+			for (ptr = first; ptr < end; ptr++) {
+				if (*ptr) {
+					done = true;
+					break;
+				} else {
+					zeroptrs++;
+				}
+			}
+		} else {
+			zeroptrs = sdp->sd_inptrs;
+		}
+		if (factor * zeroptrs >= lblock_stop - lblock + 1) {
+			holesz = lblock_stop - lblock + 1;
+			break;
+		}
+		holesz += factor * zeroptrs;
 
-int gfs2_block_map(struct inode *inode, sector_t lblock,
-		   struct buffer_head *bh_map, int create)
+		factor *= sdp->sd_inptrs;
+		if (hgt && (mp->mp_list[hgt - 1] < mp_eof.mp_list[hgt - 1]))
+			(mp->mp_list[hgt - 1])++;
+	}
+	return holesz << inode->i_blkbits;
+}
+
+/**
+ * gfs2_get_iomap - Map blocks from an inode to disk blocks
+ * @inode: The inode
+ * @pos: Starting position in bytes
+ * @length: Length to map, in bytes
+ * @flags: iomap flags
+ * @iomap: The iomap structure
+ *
+ * Returns: errno
+ */
+int gfs2_get_iomap(struct inode *inode, loff_t pos, ssize_t length,
+		   unsigned flags, struct iomap *iomap)
 {
 	struct gfs2_inode *ip = GFS2_I(inode);
 	struct gfs2_sbd *sdp = GFS2_SB(inode);
+	struct metapath mp = { .mp_aheight = 1, };
 	unsigned int bsize = sdp->sd_sb.sb_bsize;
-	const size_t maxlen = bh_map->b_size >> inode->i_blkbits;
 	const u64 *arr = sdp->sd_heightsize;
 	__be64 *ptr;
-	u64 size;
-	struct metapath mp;
+	sector_t lblock = pos >> inode->i_blkbits;
+	sector_t lend = (pos + length + sdp->sd_sb.sb_bsize - 1) >> inode->i_blkbits;
 	int ret;
-	int eob;
+	int eob = 0;
 	unsigned int len;
 	struct buffer_head *bh;
 	u8 height;
-	bool zero_new = false;
-	sector_t dblock = 0;
-	unsigned dblks;
 
-	BUG_ON(maxlen == 0);
+	trace_gfs2_iomap_start(ip, pos, length, flags);
+	if (!length) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	iomap->offset = lblock << inode->i_blkbits;
+	iomap->blkno = IOMAP_NULL_BLOCK;
+	iomap->type = IOMAP_HOLE;
+	iomap->length = (u64)(lend - lblock) << inode->i_blkbits;
+	iomap->flags = 0;
+	bmap_lock(ip, 0);
 
-	memset(&mp, 0, sizeof(mp));
-	bmap_lock(ip, create);
-	clear_buffer_mapped(bh_map);
-	clear_buffer_new(bh_map);
-	clear_buffer_boundary(bh_map);
-	trace_gfs2_bmap(ip, bh_map, lblock, create, 1);
 	if (gfs2_is_dir(ip)) {
 		bsize = sdp->sd_jbsize;
 		arr = sdp->sd_jheightsize;
@@ -692,56 +747,110 @@ int gfs2_block_map(struct inode *inode, sector_t lblock,
 
 	ret = gfs2_meta_inode_buffer(ip, &mp.mp_bh[0]);
 	if (ret)
-		goto out;
+		goto out_release;
 
 	height = ip->i_height;
-	size = (lblock + 1) * bsize;
-	while (size > arr[height])
+	while ((lblock + 1) * bsize > arr[height])
 		height++;
 	find_metapath(sdp, lblock, &mp, height);
-	mp.mp_aheight = 1;
 	if (height > ip->i_height || gfs2_is_stuffed(ip))
 		goto do_alloc;
+
 	ret = lookup_metapath(ip, &mp);
 	if (ret < 0)
-		goto out;
+		goto out_release;
+
 	if (mp.mp_aheight != ip->i_height)
 		goto do_alloc;
+
 	ptr = metapointer(ip->i_height - 1, &mp);
 	if (*ptr == 0)
 		goto do_alloc;
-	map_bh(bh_map, inode->i_sb, be64_to_cpu(*ptr));
+
+	iomap->type = IOMAP_MAPPED;
+	iomap->blkno = be64_to_cpu(*ptr);
+
 	bh = mp.mp_bh[ip->i_height - 1];
-	len = gfs2_extent_length(bh->b_data, bh->b_size, ptr, maxlen, &eob);
-	bh_map->b_size = (len << inode->i_blkbits);
+	len = gfs2_extent_length(bh->b_data, bh->b_size, ptr, lend - lblock, &eob);
 	if (eob)
-		set_buffer_boundary(bh_map);
+		iomap->flags |= IOMAP_F_BOUNDARY;
+	iomap->length = (u64)len << inode->i_blkbits;
+
 	ret = 0;
-out:
+
+out_release:
 	release_metapath(&mp);
-	trace_gfs2_bmap(ip, bh_map, lblock, create, ret);
-	bmap_unlock(ip, create);
+	bmap_unlock(ip, 0);
+out:
+	trace_gfs2_iomap_end(ip, iomap, ret);
 	return ret;
 
 do_alloc:
-	/* All allocations are done here, firstly check create flag */
-	if (!create) {
+	if (!(flags & IOMAP_WRITE)) {
+		if (pos >= i_size_read(inode)) {
+			ret = -ENOENT;
+			goto out_release;
+		}
 		BUG_ON(gfs2_is_stuffed(ip));
 		ret = 0;
-		goto out;
+		iomap->length = hole_size(inode, lblock, &mp);
+		goto out_release;
 	}
 
-	/* At this point ret is the tree depth of already allocated blocks */
+	ret = gfs2_iomap_alloc(inode, iomap, flags, &mp);
+	goto out_release;
+}
+
+/**
+ * gfs2_block_map - Map a block from an inode to a disk block
+ * @inode: The inode
+ * @lblock: The logical block number
+ * @bh_map: The bh to be mapped
+ * @create: True if its ok to alloc blocks to satify the request
+ *
+ * Sets buffer_mapped() if successful, sets buffer_boundary() if a
+ * read of metadata will be required before the next block can be
+ * mapped. Sets buffer_new() if new blocks were allocated.
+ *
+ * Returns: errno
+ */
+
+int gfs2_block_map(struct inode *inode, sector_t lblock,
+		   struct buffer_head *bh_map, int create)
+{
+	struct gfs2_inode *ip = GFS2_I(inode);
+	struct gfs2_sbd *sdp = GFS2_SB(inode);
+	struct iomap iomap;
+	int ret, flags = 0;
+
+	clear_buffer_mapped(bh_map);
+	clear_buffer_new(bh_map);
+	clear_buffer_boundary(bh_map);
+	trace_gfs2_bmap(ip, bh_map, lblock, create, 1);
+
+	if (create)
+		flags |= IOMAP_WRITE;
 	if (buffer_zeronew(bh_map))
-		zero_new = true;
-	ret = gfs2_bmap_alloc(inode, lblock, zero_new, &mp, maxlen, &dblock,
-			      &dblks);
-	if (ret == 0) {
-		map_bh(bh_map, inode->i_sb, dblock);
-		bh_map->b_size = dblks << inode->i_blkbits;
+		flags |= IOMAP_ZERO;
+	ret = gfs2_get_iomap(inode, lblock << sdp->sd_sb.sb_bsize_shift,
+			     bh_map->b_size, flags, &iomap);
+	if (ret)
+		goto out;
+
+	iomap.length = round_up(iomap.length, sdp->sd_sb.sb_bsize);
+	bh_map->b_size = iomap.length;
+	if (iomap.flags & IOMAP_F_BOUNDARY)
+		set_buffer_boundary(bh_map);
+	if (iomap.blkno != IOMAP_NULL_BLOCK)
+		map_bh(bh_map, inode->i_sb, iomap.blkno);
+	bh_map->b_size = iomap.length;
+	clear_buffer_zeronew(bh_map);
+	if (iomap.flags & IOMAP_F_NEW)
 		set_buffer_new(bh_map);
-	}
-	goto out;
+
+out:
+	trace_gfs2_bmap(ip, bh_map, lblock, create, ret);
+	return ret;
 }
 
 /*
diff --git a/fs/gfs2/bmap.h b/fs/gfs2/bmap.h
index 81ded5e..e904aed 100644
--- a/fs/gfs2/bmap.h
+++ b/fs/gfs2/bmap.h
@@ -10,6 +10,8 @@
 #ifndef __BMAP_DOT_H__
 #define __BMAP_DOT_H__
 
+#include <linux/iomap.h>
+
 #include "inode.h"
 
 struct inode;
@@ -47,6 +49,8 @@ static inline void gfs2_write_calc_reserv(const struct gfs2_inode *ip,
 extern int gfs2_unstuff_dinode(struct gfs2_inode *ip, struct page *page);
 extern int gfs2_block_map(struct inode *inode, sector_t lblock,
 			  struct buffer_head *bh, int create);
+extern int gfs2_get_iomap(struct inode *inode, loff_t pos, ssize_t length,
+			  unsigned flags, struct iomap *iomap);
 extern int gfs2_extent_map(struct inode *inode, u64 lblock, int *new,
 			   u64 *dblock, unsigned *extlen);
 extern int gfs2_setattr_size(struct inode *inode, u64 size);
diff --git a/fs/gfs2/trace_gfs2.h b/fs/gfs2/trace_gfs2.h
index 49ac55d..3c91ae3 100644
--- a/fs/gfs2/trace_gfs2.h
+++ b/fs/gfs2/trace_gfs2.h
@@ -12,6 +12,7 @@
 #include <linux/gfs2_ondisk.h>
 #include <linux/writeback.h>
 #include <linux/ktime.h>
+#include <linux/iomap.h>
 #include "incore.h"
 #include "glock.h"
 #include "rgrp.h"
@@ -469,6 +470,70 @@ TRACE_EVENT(gfs2_bmap,
 		  __entry->errno)
 );
 
+TRACE_EVENT(gfs2_iomap_start,
+
+	TP_PROTO(const struct gfs2_inode *ip, loff_t pos, ssize_t length,
+		 u16 flags),
+
+	TP_ARGS(ip, pos, length, flags),
+
+	TP_STRUCT__entry(
+		__field(        dev_t,  dev                     )
+		__field(	u64,	inum			)
+		__field(	loff_t, pos			)
+		__field(	ssize_t, length			)
+		__field(	u16,	flags			)
+	),
+
+	TP_fast_assign(
+		__entry->dev            = ip->i_gl->gl_name.ln_sbd->sd_vfs->s_dev;
+		__entry->inum		= ip->i_no_addr;
+		__entry->pos		= pos;
+		__entry->length		= length;
+		__entry->flags		= flags;
+	),
+
+	TP_printk("%u,%u bmap %llu iomap start %llu/%lu flags:%08x",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  (unsigned long long)__entry->inum,
+		  (unsigned long long)__entry->pos,
+		  (unsigned long)__entry->length, (u16)__entry->flags)
+);
+
+TRACE_EVENT(gfs2_iomap_end,
+
+	TP_PROTO(const struct gfs2_inode *ip, struct iomap *iomap, int ret),
+
+	TP_ARGS(ip, iomap, ret),
+
+	TP_STRUCT__entry(
+		__field(        dev_t,  dev                     )
+		__field(	u64,	inum			)
+		__field(	loff_t, offset			)
+		__field(	ssize_t, length			)
+		__field(	u16,	flags			)
+		__field(	u16,	type			)
+		__field(	int,	ret			)
+	),
+
+	TP_fast_assign(
+		__entry->dev            = ip->i_gl->gl_name.ln_sbd->sd_vfs->s_dev;
+		__entry->inum		= ip->i_no_addr;
+		__entry->offset		= iomap->offset;
+		__entry->length		= iomap->length;
+		__entry->flags		= iomap->flags;
+		__entry->type		= iomap->type;
+		__entry->ret		= ret;
+	),
+
+	TP_printk("%u,%u bmap %llu iomap end %llu/%lu ty:%d flags:%08x rc:%d",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  (unsigned long long)__entry->inum,
+		  (unsigned long long)__entry->offset,
+		  (unsigned long)__entry->length, (u16)__entry->type,
+		  (u16)__entry->flags, __entry->ret)
+);
+
 /* Keep track of blocks as they are allocated/freed */
 TRACE_EVENT(gfs2_block_alloc,
 
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index f753e78..9d64933 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -21,7 +21,8 @@ struct vm_fault;
 /*
  * Flags for all iomap mappings:
  */
-#define IOMAP_F_NEW	0x01	/* blocks have been newly allocated */
+#define IOMAP_F_NEW		0x01	/* blocks have been newly allocated */
+#define IOMAP_F_BOUNDARY	0x02	/* mapping ends at metadata boundary */
 
 /*
  * Flags that only need to be reported for IOMAP_REPORT requests:
-- 
2.7.5

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/7] GFS2: Switch fiemap implementation to use iomap
  2017-06-16 14:51 [PATCH 0/7] lseek SEEK_HOLE / SEEK_DATA via iomap Andreas Gruenbacher
  2017-06-16 14:51 ` [PATCH 1/7] GFS2: Make height info part of metapath Andreas Gruenbacher
  2017-06-16 14:51 ` [PATCH 2/7] GFS2: Implement iomap for block_map Andreas Gruenbacher
@ 2017-06-16 14:51 ` Andreas Gruenbacher
  2017-06-16 14:51 ` [PATCH 4/7] vfs: Add iomap_seek_hole_data helper Andreas Gruenbacher
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Andreas Gruenbacher @ 2017-06-16 14:51 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Bob Peterson, linux-xfs, cluster-devel, Andreas Gruenbacher

From: Bob Peterson <rpeterso@redhat.com>

This patch switches GFS2's implementation of fiemap from the old
block_map code to the new iomap interface.

Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/Kconfig |  1 +
 fs/gfs2/inode.c | 22 +++++++++++++++-------
 2 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/fs/gfs2/Kconfig b/fs/gfs2/Kconfig
index 90c6a8f..43c827a 100644
--- a/fs/gfs2/Kconfig
+++ b/fs/gfs2/Kconfig
@@ -4,6 +4,7 @@ config GFS2_FS
 	select FS_POSIX_ACL
 	select CRC32
 	select QUOTACTL
+	select FS_IOMAP
 	help
 	  A cluster filesystem.
 
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 9f605ea..9ed2e91 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -18,7 +18,7 @@
 #include <linux/posix_acl.h>
 #include <linux/gfs2_ondisk.h>
 #include <linux/crc32.h>
-#include <linux/fiemap.h>
+#include <linux/iomap.h>
 #include <linux/security.h>
 #include <linux/uaccess.h>
 
@@ -1996,6 +1996,17 @@ static int gfs2_getattr(const struct path *path, struct kstat *stat,
 	return 0;
 }
 
+static int gfs2_iomap_begin(struct inode *inode, loff_t offset,
+			    loff_t length, unsigned flags,
+			    struct iomap *iomap)
+{
+	return gfs2_get_iomap(inode, offset, length, flags, iomap);
+}
+
+const struct iomap_ops gfs2_iomap_ops = {
+	.iomap_begin = gfs2_iomap_begin,
+};
+
 static int gfs2_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 		       u64 start, u64 len)
 {
@@ -2003,10 +2014,6 @@ static int gfs2_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 	struct gfs2_holder gh;
 	int ret;
 
-	ret = fiemap_check_flags(fieinfo, FIEMAP_FLAG_SYNC);
-	if (ret)
-		return ret;
-
 	inode_lock(inode);
 
 	ret = gfs2_glock_nq_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
@@ -2018,6 +2025,7 @@ static int gfs2_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 		u64 size = i_size_read(inode);
 		u32 flags = FIEMAP_EXTENT_LAST|FIEMAP_EXTENT_NOT_ALIGNED|
 			    FIEMAP_EXTENT_DATA_INLINE;
+
 		phys += sizeof(struct gfs2_dinode);
 		phys += start;
 		if (start + len > size)
@@ -2028,11 +2036,11 @@ static int gfs2_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 		if (ret == 1)
 			ret = 0;
 	} else {
-		ret = __generic_block_fiemap(inode, fieinfo, start, len,
-					     gfs2_block_map);
+		ret = iomap_fiemap(inode, fieinfo, start, len, &gfs2_iomap_ops);
 	}
 
 	gfs2_glock_dq_uninit(&gh);
+
 out:
 	inode_unlock(inode);
 	return ret;
-- 
2.7.5

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/7] vfs: Add iomap_seek_hole_data helper
  2017-06-16 14:51 [PATCH 0/7] lseek SEEK_HOLE / SEEK_DATA via iomap Andreas Gruenbacher
                   ` (2 preceding siblings ...)
  2017-06-16 14:51 ` [PATCH 3/7] GFS2: Switch fiemap implementation to use iomap Andreas Gruenbacher
@ 2017-06-16 14:51 ` Andreas Gruenbacher
  2017-06-19 17:32   ` Darrick J. Wong
  2017-06-16 14:51 ` [PATCH 5/7] gfs2: Implement lseek SEEK_HOLE / SEEK_DATA via iomap Andreas Gruenbacher
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 13+ messages in thread
From: Andreas Gruenbacher @ 2017-06-16 14:51 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Andreas Gruenbacher, Bob Peterson, linux-xfs, cluster-devel

Filesystems can use this for implementing lseek SEEK_HOLE / SEEK_DATA
support via iomap.

The __iomap_seek_hole_data helper takes an additional size argument and
doesn't reposition the file offset; this is for internal use for xfs
quota files which don't maintain the inode size.

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/iomap.c            | 81 +++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/iomap.h |  6 ++++
 2 files changed, 87 insertions(+)

diff --git a/fs/iomap.c b/fs/iomap.c
index 4b10892..cf01694 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -584,6 +584,87 @@ int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fi,
 }
 EXPORT_SYMBOL_GPL(iomap_fiemap);
 
+static loff_t
+iomap_seek_hole_actor(struct inode *inode, loff_t offset, loff_t length,
+		      void *data, struct iomap *iomap)
+{
+	if (iomap->type == IOMAP_HOLE)
+		return 0;
+	return iomap->offset + iomap->length - offset;
+}
+
+static loff_t
+iomap_seek_data_actor(struct inode *inode, loff_t offset, loff_t length,
+		      void *data, struct iomap *iomap)
+{
+	if (iomap->type != IOMAP_HOLE)
+		return 0;
+	return iomap->offset + iomap->length - offset;
+}
+
+loff_t
+__iomap_seek_hole_data(struct inode *inode, loff_t offset, loff_t size,
+		       int whence, const struct iomap_ops *ops)
+{
+	static loff_t (*actor)(struct inode *, loff_t, loff_t, void *,
+			       struct iomap *);
+	loff_t len = size - offset;
+	loff_t ret;
+
+	/* Nothing to be found beyond the end of the file. */
+	if (len <= 0)
+		return -ENXIO;
+
+	switch(whence) {
+	case SEEK_HOLE:
+		actor = iomap_seek_hole_actor;
+		break;
+
+	case SEEK_DATA:
+		actor = iomap_seek_data_actor;
+		break;
+	}
+
+	while (len > 0) {
+		ret = iomap_apply(inode, offset, len, IOMAP_REPORT, ops,
+				  NULL, actor);
+		if (ret <= 0) {
+			if (ret < 0)
+				return ret;
+			break;
+		}
+		offset += ret;
+		len -= ret;
+	}
+
+	if (len <= 0) {
+		/* There is an implicit hole at the end of the file. */
+		if (whence != SEEK_HOLE)
+			offset = -ENXIO;
+
+		/* The last segment can extend beyond the end of the file. */
+		if (offset > size)
+			offset = size;
+	}
+
+	return offset;
+}
+EXPORT_SYMBOL_GPL(__iomap_seek_hole_data);
+
+loff_t
+iomap_seek_hole_data(struct file *file, loff_t offset, int whence,
+		     const struct iomap_ops *ops)
+{
+	struct inode *inode = file->f_mapping->host;
+
+	offset = __iomap_seek_hole_data(inode, offset, i_size_read(inode),
+					whence, ops);
+	if (offset <= 0)
+		return offset;
+	return vfs_setpos(file, offset, inode->i_sb->s_maxbytes);
+}
+EXPORT_SYMBOL_GPL(iomap_seek_hole_data);
+
 /*
  * Private flags for iomap_dio, must not overlap with the public ones in
  * iomap.h:
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 9d64933..49b6ec0 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -85,6 +85,12 @@ int iomap_page_mkwrite(struct vm_fault *vmf, const struct iomap_ops *ops);
 int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 		loff_t start, loff_t len, const struct iomap_ops *ops);
 
+struct file;
+loff_t __iomap_seek_hole_data(struct inode *inode, loff_t pos, loff_t size,
+			      int whence, const struct iomap_ops *ops);
+loff_t iomap_seek_hole_data(struct file *file, loff_t pos,
+			    int whence, const struct iomap_ops *ops);
+
 /*
  * Flags for direct I/O ->end_io:
  */
-- 
2.7.5

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 5/7] gfs2: Implement lseek SEEK_HOLE / SEEK_DATA via iomap
  2017-06-16 14:51 [PATCH 0/7] lseek SEEK_HOLE / SEEK_DATA via iomap Andreas Gruenbacher
                   ` (3 preceding siblings ...)
  2017-06-16 14:51 ` [PATCH 4/7] vfs: Add iomap_seek_hole_data helper Andreas Gruenbacher
@ 2017-06-16 14:51 ` Andreas Gruenbacher
  2017-06-16 14:51 ` [PATCH 6/7] xfs: Switch to iomap for lseek SEEK_HOLE / SEEK_DATA Andreas Gruenbacher
  2017-06-16 14:51 ` [PATCH 7/7] xfs: Switch to iomap for seeking in quota files Andreas Gruenbacher
  6 siblings, 0 replies; 13+ messages in thread
From: Andreas Gruenbacher @ 2017-06-16 14:51 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Andreas Gruenbacher, Bob Peterson, linux-xfs, cluster-devel

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/gfs2/file.c  | 14 +++++++++++---
 fs/gfs2/inode.c | 32 ++++++++++++++++++++++++++++++++
 fs/gfs2/inode.h |  1 +
 3 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index c2062a1..62ce320 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -60,9 +60,7 @@ static loff_t gfs2_llseek(struct file *file, loff_t offset, int whence)
 	loff_t error;
 
 	switch (whence) {
-	case SEEK_END: /* These reference inode->i_size */
-	case SEEK_DATA:
-	case SEEK_HOLE:
+	case SEEK_END:
 		error = gfs2_glock_nq_init(ip->i_gl, LM_ST_SHARED, LM_FLAG_ANY,
 					   &i_gh);
 		if (!error) {
@@ -70,8 +68,18 @@ static loff_t gfs2_llseek(struct file *file, loff_t offset, int whence)
 			gfs2_glock_dq_uninit(&i_gh);
 		}
 		break;
+
+	case SEEK_DATA:
+	case SEEK_HOLE:
+		error = gfs2_seek_hole_data(file, offset, whence);
+		break;
+
 	case SEEK_CUR:
 	case SEEK_SET:
+		/*
+		 * These don't reference inode->i_size and don't depend on the
+		 * block mapping, so we don't need the glock.
+		 */
 		error = generic_file_llseek(file, offset, whence);
 		break;
 	default:
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 9ed2e91..c8885e5 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -2046,6 +2046,38 @@ static int gfs2_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 	return ret;
 }
 
+loff_t gfs2_seek_hole_data(struct file *file, loff_t offset, int whence)
+{
+	struct inode *inode = file->f_mapping->host;
+	struct gfs2_inode *ip = GFS2_I(inode);
+	struct gfs2_holder gh;
+	loff_t ret;
+
+	inode_lock(inode);
+	ret = gfs2_glock_nq_init(ip->i_gl, LM_ST_SHARED, 0, &gh);
+	if (ret)
+		goto out;
+	if (gfs2_is_stuffed(ip)) {
+		u64 size = i_size_read(inode);
+
+		if (offset >= size) {
+			ret = -ENXIO;
+			goto out_dequeue;
+		}
+		ret = offset;
+		if (whence == SEEK_HOLE)
+			ret = size;
+	} else {
+		ret = iomap_seek_hole_data(file, offset, whence, &gfs2_iomap_ops);
+	}
+
+out_dequeue:
+	gfs2_glock_dq_uninit(&gh);
+out:
+	inode_unlock(inode);
+	return ret;
+}
+
 const struct inode_operations gfs2_file_iops = {
 	.permission = gfs2_permission,
 	.setattr = gfs2_setattr,
diff --git a/fs/gfs2/inode.h b/fs/gfs2/inode.h
index aace8ce..c2866ec 100644
--- a/fs/gfs2/inode.h
+++ b/fs/gfs2/inode.h
@@ -109,6 +109,7 @@ extern int gfs2_setattr_simple(struct inode *inode, struct iattr *attr);
 extern struct inode *gfs2_lookup_simple(struct inode *dip, const char *name);
 extern void gfs2_dinode_out(const struct gfs2_inode *ip, void *buf);
 extern int gfs2_open_common(struct inode *inode, struct file *file);
+extern loff_t gfs2_seek_hole_data(struct file *file, loff_t offset, int whence);
 
 extern const struct inode_operations gfs2_file_iops;
 extern const struct inode_operations gfs2_dir_iops;
-- 
2.7.5

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 6/7] xfs: Switch to iomap for lseek SEEK_HOLE / SEEK_DATA
  2017-06-16 14:51 [PATCH 0/7] lseek SEEK_HOLE / SEEK_DATA via iomap Andreas Gruenbacher
                   ` (4 preceding siblings ...)
  2017-06-16 14:51 ` [PATCH 5/7] gfs2: Implement lseek SEEK_HOLE / SEEK_DATA via iomap Andreas Gruenbacher
@ 2017-06-16 14:51 ` Andreas Gruenbacher
  2017-06-17 23:57   ` Dave Chinner
  2017-06-16 14:51 ` [PATCH 7/7] xfs: Switch to iomap for seeking in quota files Andreas Gruenbacher
  6 siblings, 1 reply; 13+ messages in thread
From: Andreas Gruenbacher @ 2017-06-16 14:51 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Andreas Gruenbacher, Bob Peterson, linux-xfs, cluster-devel

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/xfs/xfs_file.c | 17 ++---------------
 1 file changed, 2 insertions(+), 15 deletions(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 5fb5a09..b36dcd7 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1283,28 +1283,15 @@ xfs_seek_hole_data(
 	struct xfs_inode	*ip = XFS_I(inode);
 	struct xfs_mount	*mp = ip->i_mount;
 	uint			lock;
-	loff_t			offset, end;
-	int			error = 0;
+	loff_t			offset;
 
 	if (XFS_FORCED_SHUTDOWN(mp))
 		return -EIO;
 
 	lock = xfs_ilock_data_map_shared(ip);
-
-	end = i_size_read(inode);
-	offset = __xfs_seek_hole_data(inode, start, end, whence);
-	if (offset < 0) {
-		error = offset;
-		goto out_unlock;
-	}
-
-	offset = vfs_setpos(file, offset, inode->i_sb->s_maxbytes);
-
-out_unlock:
+	offset = iomap_seek_hole_data(file, start, whence, &xfs_iomap_ops);
 	xfs_iunlock(ip, lock);
 
-	if (error)
-		return error;
 	return offset;
 }
 
-- 
2.7.5

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 7/7] xfs: Switch to iomap for seeking in quota files
  2017-06-16 14:51 [PATCH 0/7] lseek SEEK_HOLE / SEEK_DATA via iomap Andreas Gruenbacher
                   ` (5 preceding siblings ...)
  2017-06-16 14:51 ` [PATCH 6/7] xfs: Switch to iomap for lseek SEEK_HOLE / SEEK_DATA Andreas Gruenbacher
@ 2017-06-16 14:51 ` Andreas Gruenbacher
  2017-06-16 15:30   ` Darrick J. Wong
  6 siblings, 1 reply; 13+ messages in thread
From: Andreas Gruenbacher @ 2017-06-16 14:51 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Andreas Gruenbacher, Bob Peterson, linux-xfs, cluster-devel

(This patch is largely untested.)

Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
---
 fs/xfs/xfs_dquot.c |   5 +-
 fs/xfs/xfs_file.c  | 320 -----------------------------------------------------
 fs/xfs/xfs_inode.h |   2 -
 3 files changed, 3 insertions(+), 324 deletions(-)

diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
index 9d06cc3..a10bd921 100644
--- a/fs/xfs/xfs_dquot.c
+++ b/fs/xfs/xfs_dquot.c
@@ -39,6 +39,7 @@
 #include "xfs_trace.h"
 #include "xfs_log.h"
 #include "xfs_bmap_btree.h"
+#include "xfs_iomap.h"
 
 /*
  * Lock order:
@@ -726,8 +727,8 @@ xfs_dq_get_next_id(
 	quotip = xfs_quota_inode(mp, type);
 	lock = xfs_ilock_data_map_shared(quotip);
 
-	offset = __xfs_seek_hole_data(VFS_I(quotip), XFS_FSB_TO_B(mp, start),
-				      eof, SEEK_DATA);
+	offset = __iomap_seek_hole_data(VFS_I(quotip), XFS_FSB_TO_B(mp, start), eof,
+					SEEK_DATA, &xfs_iomap_ops);
 	if (offset < 0)
 		error = offset;
 
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index b36dcd7..3acbbaf 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -953,326 +953,6 @@ xfs_file_readdir(
 	return xfs_readdir(ip, ctx, bufsize);
 }
 
-/*
- * This type is designed to indicate the type of offset we would like
- * to search from page cache for xfs_seek_hole_data().
- */
-enum {
-	HOLE_OFF = 0,
-	DATA_OFF,
-};
-
-/*
- * Lookup the desired type of offset from the given page.
- *
- * On success, return true and the offset argument will point to the
- * start of the region that was found.  Otherwise this function will
- * return false and keep the offset argument unchanged.
- */
-STATIC bool
-xfs_lookup_buffer_offset(
-	struct page		*page,
-	loff_t			*offset,
-	unsigned int		type)
-{
-	loff_t			lastoff = page_offset(page);
-	bool			found = false;
-	struct buffer_head	*bh, *head;
-
-	bh = head = page_buffers(page);
-	do {
-		/*
-		 * Unwritten extents that have data in the page
-		 * cache covering them can be identified by the
-		 * BH_Unwritten state flag.  Pages with multiple
-		 * buffers might have a mix of holes, data and
-		 * unwritten extents - any buffer with valid
-		 * data in it should have BH_Uptodate flag set
-		 * on it.
-		 */
-		if (buffer_unwritten(bh) ||
-		    buffer_uptodate(bh)) {
-			if (type == DATA_OFF)
-				found = true;
-		} else {
-			if (type == HOLE_OFF)
-				found = true;
-		}
-
-		if (found) {
-			*offset = lastoff;
-			break;
-		}
-		lastoff += bh->b_size;
-	} while ((bh = bh->b_this_page) != head);
-
-	return found;
-}
-
-/*
- * This routine is called to find out and return a data or hole offset
- * from the page cache for unwritten extents according to the desired
- * type for xfs_seek_hole_data().
- *
- * The argument offset is used to tell where we start to search from the
- * page cache.  Map is used to figure out the end points of the range to
- * lookup pages.
- *
- * Return true if the desired type of offset was found, and the argument
- * offset is filled with that address.  Otherwise, return false and keep
- * offset unchanged.
- */
-STATIC bool
-xfs_find_get_desired_pgoff(
-	struct inode		*inode,
-	struct xfs_bmbt_irec	*map,
-	unsigned int		type,
-	loff_t			*offset)
-{
-	struct xfs_inode	*ip = XFS_I(inode);
-	struct xfs_mount	*mp = ip->i_mount;
-	struct pagevec		pvec;
-	pgoff_t			index;
-	pgoff_t			end;
-	loff_t			endoff;
-	loff_t			startoff = *offset;
-	loff_t			lastoff = startoff;
-	bool			found = false;
-
-	pagevec_init(&pvec, 0);
-
-	index = startoff >> PAGE_SHIFT;
-	endoff = XFS_FSB_TO_B(mp, map->br_startoff + map->br_blockcount);
-	end = (endoff - 1) >> PAGE_SHIFT;
-	do {
-		int		want;
-		unsigned	nr_pages;
-		unsigned int	i;
-
-		want = min_t(pgoff_t, end - index, PAGEVEC_SIZE - 1) + 1;
-		nr_pages = pagevec_lookup(&pvec, inode->i_mapping, index,
-					  want);
-		if (nr_pages == 0)
-			break;
-
-		for (i = 0; i < nr_pages; i++) {
-			struct page	*page = pvec.pages[i];
-			loff_t		b_offset;
-
-			/*
-			 * At this point, the page may be truncated or
-			 * invalidated (changing page->mapping to NULL),
-			 * or even swizzled back from swapper_space to tmpfs
-			 * file mapping. However, page->index will not change
-			 * because we have a reference on the page.
-			 *
-			 * If current page offset is beyond where we've ended,
-			 * we've found a hole.
-			 */
-			if (type == HOLE_OFF && lastoff < endoff &&
-			    lastoff < page_offset(pvec.pages[i])) {
-				found = true;
-				*offset = lastoff;
-				goto out;
-			}
-			/* Searching done if the page index is out of range. */
-			if (page->index > end)
-				goto out;
-
-			lock_page(page);
-			/*
-			 * Page truncated or invalidated(page->mapping == NULL).
-			 * We can freely skip it and proceed to check the next
-			 * page.
-			 */
-			if (unlikely(page->mapping != inode->i_mapping)) {
-				unlock_page(page);
-				continue;
-			}
-
-			if (!page_has_buffers(page)) {
-				unlock_page(page);
-				continue;
-			}
-
-			found = xfs_lookup_buffer_offset(page, &b_offset, type);
-			if (found) {
-				/*
-				 * The found offset may be less than the start
-				 * point to search if this is the first time to
-				 * come here.
-				 */
-				*offset = max_t(loff_t, startoff, b_offset);
-				unlock_page(page);
-				goto out;
-			}
-
-			/*
-			 * We either searching data but nothing was found, or
-			 * searching hole but found a data buffer.  In either
-			 * case, probably the next page contains the desired
-			 * things, update the last offset to it so.
-			 */
-			lastoff = page_offset(page) + PAGE_SIZE;
-			unlock_page(page);
-		}
-
-		/*
-		 * The number of returned pages less than our desired, search
-		 * done.
-		 */
-		if (nr_pages < want)
-			break;
-
-		index = pvec.pages[i - 1]->index + 1;
-		pagevec_release(&pvec);
-	} while (index <= end);
-
-	/* No page at lastoff and we are not done - we found a hole. */
-	if (type == HOLE_OFF && lastoff < endoff) {
-		*offset = lastoff;
-		found = true;
-	}
-out:
-	pagevec_release(&pvec);
-	return found;
-}
-
-/*
- * caller must lock inode with xfs_ilock_data_map_shared,
- * can we craft an appropriate ASSERT?
- *
- * end is because the VFS-level lseek interface is defined such that any
- * offset past i_size shall return -ENXIO, but we use this for quota code
- * which does not maintain i_size, and we want to SEEK_DATA past i_size.
- */
-loff_t
-__xfs_seek_hole_data(
-	struct inode		*inode,
-	loff_t			start,
-	loff_t			end,
-	int			whence)
-{
-	struct xfs_inode	*ip = XFS_I(inode);
-	struct xfs_mount	*mp = ip->i_mount;
-	loff_t			uninitialized_var(offset);
-	xfs_fileoff_t		fsbno;
-	xfs_filblks_t		lastbno;
-	int			error;
-
-	if (start >= end) {
-		error = -ENXIO;
-		goto out_error;
-	}
-
-	/*
-	 * Try to read extents from the first block indicated
-	 * by fsbno to the end block of the file.
-	 */
-	fsbno = XFS_B_TO_FSBT(mp, start);
-	lastbno = XFS_B_TO_FSB(mp, end);
-
-	for (;;) {
-		struct xfs_bmbt_irec	map[2];
-		int			nmap = 2;
-		unsigned int		i;
-
-		error = xfs_bmapi_read(ip, fsbno, lastbno - fsbno, map, &nmap,
-				       XFS_BMAPI_ENTIRE);
-		if (error)
-			goto out_error;
-
-		/* No extents at given offset, must be beyond EOF */
-		if (nmap == 0) {
-			error = -ENXIO;
-			goto out_error;
-		}
-
-		for (i = 0; i < nmap; i++) {
-			offset = max_t(loff_t, start,
-				       XFS_FSB_TO_B(mp, map[i].br_startoff));
-
-			/* Landed in the hole we wanted? */
-			if (whence == SEEK_HOLE &&
-			    map[i].br_startblock == HOLESTARTBLOCK)
-				goto out;
-
-			/* Landed in the data extent we wanted? */
-			if (whence == SEEK_DATA &&
-			    (map[i].br_startblock == DELAYSTARTBLOCK ||
-			     (map[i].br_state == XFS_EXT_NORM &&
-			      !isnullstartblock(map[i].br_startblock))))
-				goto out;
-
-			/*
-			 * Landed in an unwritten extent, try to search
-			 * for hole or data from page cache.
-			 */
-			if (map[i].br_state == XFS_EXT_UNWRITTEN) {
-				if (xfs_find_get_desired_pgoff(inode, &map[i],
-				      whence == SEEK_HOLE ? HOLE_OFF : DATA_OFF,
-							&offset))
-					goto out;
-			}
-		}
-
-		/*
-		 * We only received one extent out of the two requested. This
-		 * means we've hit EOF and didn't find what we are looking for.
-		 */
-		if (nmap == 1) {
-			/*
-			 * If we were looking for a hole, set offset to
-			 * the end of the file (i.e., there is an implicit
-			 * hole at the end of any file).
-		 	 */
-			if (whence == SEEK_HOLE) {
-				offset = end;
-				break;
-			}
-			/*
-			 * If we were looking for data, it's nowhere to be found
-			 */
-			ASSERT(whence == SEEK_DATA);
-			error = -ENXIO;
-			goto out_error;
-		}
-
-		ASSERT(i > 1);
-
-		/*
-		 * Nothing was found, proceed to the next round of search
-		 * if the next reading offset is not at or beyond EOF.
-		 */
-		fsbno = map[i - 1].br_startoff + map[i - 1].br_blockcount;
-		start = XFS_FSB_TO_B(mp, fsbno);
-		if (start >= end) {
-			if (whence == SEEK_HOLE) {
-				offset = end;
-				break;
-			}
-			ASSERT(whence == SEEK_DATA);
-			error = -ENXIO;
-			goto out_error;
-		}
-	}
-
-out:
-	/*
-	 * If at this point we have found the hole we wanted, the returned
-	 * offset may be bigger than the file size as it may be aligned to
-	 * page boundary for unwritten extents.  We need to deal with this
-	 * situation in particular.
-	 */
-	if (whence == SEEK_HOLE)
-		offset = min_t(loff_t, offset, end);
-
-	return offset;
-
-out_error:
-	return error;
-}
-
 STATIC loff_t
 xfs_seek_hole_data(
 	struct file		*file,
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 10e89fc..17c441a 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -445,8 +445,6 @@ int	xfs_zero_eof(struct xfs_inode *ip, xfs_off_t offset,
 		     xfs_fsize_t isize, bool *did_zeroing);
 int	xfs_zero_range(struct xfs_inode *ip, xfs_off_t pos, xfs_off_t count,
 		bool *did_zero);
-loff_t	__xfs_seek_hole_data(struct inode *inode, loff_t start,
-			     loff_t eof, int whence);
 
 
 /* from xfs_iops.c */
-- 
2.7.5

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 7/7] xfs: Switch to iomap for seeking in quota files
  2017-06-16 14:51 ` [PATCH 7/7] xfs: Switch to iomap for seeking in quota files Andreas Gruenbacher
@ 2017-06-16 15:30   ` Darrick J. Wong
  2017-06-20  6:31     ` [Cluster-devel] " Christoph Hellwig
  0 siblings, 1 reply; 13+ messages in thread
From: Darrick J. Wong @ 2017-06-16 15:30 UTC (permalink / raw)
  To: Andreas Gruenbacher; +Cc: linux-fsdevel, Bob Peterson, linux-xfs, cluster-devel

On Fri, Jun 16, 2017 at 04:51:20PM +0200, Andreas Gruenbacher wrote:
> (This patch is largely untested.)

In principle the XFS bits look ok, but I think you'd be better off with
a review from hch and, uh, while you wait, a run through the xfstests
'quota' group and probably 'auto' too?

I also think the two xfs patches would be better organized either as a
pair of patches where one patch gets rid of the __xfs_seek_hole_data
calls and the second removes the __xfs_seek_hole_data implementation;
or a single patch to refactor it out of xfs in a single blow.

--D

> 
> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
> ---
>  fs/xfs/xfs_dquot.c |   5 +-
>  fs/xfs/xfs_file.c  | 320 -----------------------------------------------------
>  fs/xfs/xfs_inode.h |   2 -
>  3 files changed, 3 insertions(+), 324 deletions(-)
> 
> diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c
> index 9d06cc3..a10bd921 100644
> --- a/fs/xfs/xfs_dquot.c
> +++ b/fs/xfs/xfs_dquot.c
> @@ -39,6 +39,7 @@
>  #include "xfs_trace.h"
>  #include "xfs_log.h"
>  #include "xfs_bmap_btree.h"
> +#include "xfs_iomap.h"
>  
>  /*
>   * Lock order:
> @@ -726,8 +727,8 @@ xfs_dq_get_next_id(
>  	quotip = xfs_quota_inode(mp, type);
>  	lock = xfs_ilock_data_map_shared(quotip);
>  
> -	offset = __xfs_seek_hole_data(VFS_I(quotip), XFS_FSB_TO_B(mp, start),
> -				      eof, SEEK_DATA);
> +	offset = __iomap_seek_hole_data(VFS_I(quotip), XFS_FSB_TO_B(mp, start), eof,
> +					SEEK_DATA, &xfs_iomap_ops);
>  	if (offset < 0)
>  		error = offset;
>  
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index b36dcd7..3acbbaf 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -953,326 +953,6 @@ xfs_file_readdir(
>  	return xfs_readdir(ip, ctx, bufsize);
>  }
>  
> -/*
> - * This type is designed to indicate the type of offset we would like
> - * to search from page cache for xfs_seek_hole_data().
> - */
> -enum {
> -	HOLE_OFF = 0,
> -	DATA_OFF,
> -};
> -
> -/*
> - * Lookup the desired type of offset from the given page.
> - *
> - * On success, return true and the offset argument will point to the
> - * start of the region that was found.  Otherwise this function will
> - * return false and keep the offset argument unchanged.
> - */
> -STATIC bool
> -xfs_lookup_buffer_offset(
> -	struct page		*page,
> -	loff_t			*offset,
> -	unsigned int		type)
> -{
> -	loff_t			lastoff = page_offset(page);
> -	bool			found = false;
> -	struct buffer_head	*bh, *head;
> -
> -	bh = head = page_buffers(page);
> -	do {
> -		/*
> -		 * Unwritten extents that have data in the page
> -		 * cache covering them can be identified by the
> -		 * BH_Unwritten state flag.  Pages with multiple
> -		 * buffers might have a mix of holes, data and
> -		 * unwritten extents - any buffer with valid
> -		 * data in it should have BH_Uptodate flag set
> -		 * on it.
> -		 */
> -		if (buffer_unwritten(bh) ||
> -		    buffer_uptodate(bh)) {
> -			if (type == DATA_OFF)
> -				found = true;
> -		} else {
> -			if (type == HOLE_OFF)
> -				found = true;
> -		}
> -
> -		if (found) {
> -			*offset = lastoff;
> -			break;
> -		}
> -		lastoff += bh->b_size;
> -	} while ((bh = bh->b_this_page) != head);
> -
> -	return found;
> -}
> -
> -/*
> - * This routine is called to find out and return a data or hole offset
> - * from the page cache for unwritten extents according to the desired
> - * type for xfs_seek_hole_data().
> - *
> - * The argument offset is used to tell where we start to search from the
> - * page cache.  Map is used to figure out the end points of the range to
> - * lookup pages.
> - *
> - * Return true if the desired type of offset was found, and the argument
> - * offset is filled with that address.  Otherwise, return false and keep
> - * offset unchanged.
> - */
> -STATIC bool
> -xfs_find_get_desired_pgoff(
> -	struct inode		*inode,
> -	struct xfs_bmbt_irec	*map,
> -	unsigned int		type,
> -	loff_t			*offset)
> -{
> -	struct xfs_inode	*ip = XFS_I(inode);
> -	struct xfs_mount	*mp = ip->i_mount;
> -	struct pagevec		pvec;
> -	pgoff_t			index;
> -	pgoff_t			end;
> -	loff_t			endoff;
> -	loff_t			startoff = *offset;
> -	loff_t			lastoff = startoff;
> -	bool			found = false;
> -
> -	pagevec_init(&pvec, 0);
> -
> -	index = startoff >> PAGE_SHIFT;
> -	endoff = XFS_FSB_TO_B(mp, map->br_startoff + map->br_blockcount);
> -	end = (endoff - 1) >> PAGE_SHIFT;
> -	do {
> -		int		want;
> -		unsigned	nr_pages;
> -		unsigned int	i;
> -
> -		want = min_t(pgoff_t, end - index, PAGEVEC_SIZE - 1) + 1;
> -		nr_pages = pagevec_lookup(&pvec, inode->i_mapping, index,
> -					  want);
> -		if (nr_pages == 0)
> -			break;
> -
> -		for (i = 0; i < nr_pages; i++) {
> -			struct page	*page = pvec.pages[i];
> -			loff_t		b_offset;
> -
> -			/*
> -			 * At this point, the page may be truncated or
> -			 * invalidated (changing page->mapping to NULL),
> -			 * or even swizzled back from swapper_space to tmpfs
> -			 * file mapping. However, page->index will not change
> -			 * because we have a reference on the page.
> -			 *
> -			 * If current page offset is beyond where we've ended,
> -			 * we've found a hole.
> -			 */
> -			if (type == HOLE_OFF && lastoff < endoff &&
> -			    lastoff < page_offset(pvec.pages[i])) {
> -				found = true;
> -				*offset = lastoff;
> -				goto out;
> -			}
> -			/* Searching done if the page index is out of range. */
> -			if (page->index > end)
> -				goto out;
> -
> -			lock_page(page);
> -			/*
> -			 * Page truncated or invalidated(page->mapping == NULL).
> -			 * We can freely skip it and proceed to check the next
> -			 * page.
> -			 */
> -			if (unlikely(page->mapping != inode->i_mapping)) {
> -				unlock_page(page);
> -				continue;
> -			}
> -
> -			if (!page_has_buffers(page)) {
> -				unlock_page(page);
> -				continue;
> -			}
> -
> -			found = xfs_lookup_buffer_offset(page, &b_offset, type);
> -			if (found) {
> -				/*
> -				 * The found offset may be less than the start
> -				 * point to search if this is the first time to
> -				 * come here.
> -				 */
> -				*offset = max_t(loff_t, startoff, b_offset);
> -				unlock_page(page);
> -				goto out;
> -			}
> -
> -			/*
> -			 * We either searching data but nothing was found, or
> -			 * searching hole but found a data buffer.  In either
> -			 * case, probably the next page contains the desired
> -			 * things, update the last offset to it so.
> -			 */
> -			lastoff = page_offset(page) + PAGE_SIZE;
> -			unlock_page(page);
> -		}
> -
> -		/*
> -		 * The number of returned pages less than our desired, search
> -		 * done.
> -		 */
> -		if (nr_pages < want)
> -			break;
> -
> -		index = pvec.pages[i - 1]->index + 1;
> -		pagevec_release(&pvec);
> -	} while (index <= end);
> -
> -	/* No page at lastoff and we are not done - we found a hole. */
> -	if (type == HOLE_OFF && lastoff < endoff) {
> -		*offset = lastoff;
> -		found = true;
> -	}
> -out:
> -	pagevec_release(&pvec);
> -	return found;
> -}
> -
> -/*
> - * caller must lock inode with xfs_ilock_data_map_shared,
> - * can we craft an appropriate ASSERT?
> - *
> - * end is because the VFS-level lseek interface is defined such that any
> - * offset past i_size shall return -ENXIO, but we use this for quota code
> - * which does not maintain i_size, and we want to SEEK_DATA past i_size.
> - */
> -loff_t
> -__xfs_seek_hole_data(
> -	struct inode		*inode,
> -	loff_t			start,
> -	loff_t			end,
> -	int			whence)
> -{
> -	struct xfs_inode	*ip = XFS_I(inode);
> -	struct xfs_mount	*mp = ip->i_mount;
> -	loff_t			uninitialized_var(offset);
> -	xfs_fileoff_t		fsbno;
> -	xfs_filblks_t		lastbno;
> -	int			error;
> -
> -	if (start >= end) {
> -		error = -ENXIO;
> -		goto out_error;
> -	}
> -
> -	/*
> -	 * Try to read extents from the first block indicated
> -	 * by fsbno to the end block of the file.
> -	 */
> -	fsbno = XFS_B_TO_FSBT(mp, start);
> -	lastbno = XFS_B_TO_FSB(mp, end);
> -
> -	for (;;) {
> -		struct xfs_bmbt_irec	map[2];
> -		int			nmap = 2;
> -		unsigned int		i;
> -
> -		error = xfs_bmapi_read(ip, fsbno, lastbno - fsbno, map, &nmap,
> -				       XFS_BMAPI_ENTIRE);
> -		if (error)
> -			goto out_error;
> -
> -		/* No extents at given offset, must be beyond EOF */
> -		if (nmap == 0) {
> -			error = -ENXIO;
> -			goto out_error;
> -		}
> -
> -		for (i = 0; i < nmap; i++) {
> -			offset = max_t(loff_t, start,
> -				       XFS_FSB_TO_B(mp, map[i].br_startoff));
> -
> -			/* Landed in the hole we wanted? */
> -			if (whence == SEEK_HOLE &&
> -			    map[i].br_startblock == HOLESTARTBLOCK)
> -				goto out;
> -
> -			/* Landed in the data extent we wanted? */
> -			if (whence == SEEK_DATA &&
> -			    (map[i].br_startblock == DELAYSTARTBLOCK ||
> -			     (map[i].br_state == XFS_EXT_NORM &&
> -			      !isnullstartblock(map[i].br_startblock))))
> -				goto out;
> -
> -			/*
> -			 * Landed in an unwritten extent, try to search
> -			 * for hole or data from page cache.
> -			 */
> -			if (map[i].br_state == XFS_EXT_UNWRITTEN) {
> -				if (xfs_find_get_desired_pgoff(inode, &map[i],
> -				      whence == SEEK_HOLE ? HOLE_OFF : DATA_OFF,
> -							&offset))
> -					goto out;
> -			}
> -		}
> -
> -		/*
> -		 * We only received one extent out of the two requested. This
> -		 * means we've hit EOF and didn't find what we are looking for.
> -		 */
> -		if (nmap == 1) {
> -			/*
> -			 * If we were looking for a hole, set offset to
> -			 * the end of the file (i.e., there is an implicit
> -			 * hole at the end of any file).
> -		 	 */
> -			if (whence == SEEK_HOLE) {
> -				offset = end;
> -				break;
> -			}
> -			/*
> -			 * If we were looking for data, it's nowhere to be found
> -			 */
> -			ASSERT(whence == SEEK_DATA);
> -			error = -ENXIO;
> -			goto out_error;
> -		}
> -
> -		ASSERT(i > 1);
> -
> -		/*
> -		 * Nothing was found, proceed to the next round of search
> -		 * if the next reading offset is not at or beyond EOF.
> -		 */
> -		fsbno = map[i - 1].br_startoff + map[i - 1].br_blockcount;
> -		start = XFS_FSB_TO_B(mp, fsbno);
> -		if (start >= end) {
> -			if (whence == SEEK_HOLE) {
> -				offset = end;
> -				break;
> -			}
> -			ASSERT(whence == SEEK_DATA);
> -			error = -ENXIO;
> -			goto out_error;
> -		}
> -	}
> -
> -out:
> -	/*
> -	 * If at this point we have found the hole we wanted, the returned
> -	 * offset may be bigger than the file size as it may be aligned to
> -	 * page boundary for unwritten extents.  We need to deal with this
> -	 * situation in particular.
> -	 */
> -	if (whence == SEEK_HOLE)
> -		offset = min_t(loff_t, offset, end);
> -
> -	return offset;
> -
> -out_error:
> -	return error;
> -}
> -
>  STATIC loff_t
>  xfs_seek_hole_data(
>  	struct file		*file,
> diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
> index 10e89fc..17c441a 100644
> --- a/fs/xfs/xfs_inode.h
> +++ b/fs/xfs/xfs_inode.h
> @@ -445,8 +445,6 @@ int	xfs_zero_eof(struct xfs_inode *ip, xfs_off_t offset,
>  		     xfs_fsize_t isize, bool *did_zeroing);
>  int	xfs_zero_range(struct xfs_inode *ip, xfs_off_t pos, xfs_off_t count,
>  		bool *did_zero);
> -loff_t	__xfs_seek_hole_data(struct inode *inode, loff_t start,
> -			     loff_t eof, int whence);
>  
>  
>  /* from xfs_iops.c */
> -- 
> 2.7.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 6/7] xfs: Switch to iomap for lseek SEEK_HOLE / SEEK_DATA
  2017-06-16 14:51 ` [PATCH 6/7] xfs: Switch to iomap for lseek SEEK_HOLE / SEEK_DATA Andreas Gruenbacher
@ 2017-06-17 23:57   ` Dave Chinner
  2017-06-19 22:16     ` Andreas Gruenbacher
  0 siblings, 1 reply; 13+ messages in thread
From: Dave Chinner @ 2017-06-17 23:57 UTC (permalink / raw)
  To: Andreas Gruenbacher; +Cc: linux-fsdevel, Bob Peterson, linux-xfs, cluster-devel

On Fri, Jun 16, 2017 at 04:51:19PM +0200, Andreas Gruenbacher wrote:
> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
> ---
>  fs/xfs/xfs_file.c | 17 ++---------------
>  1 file changed, 2 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 5fb5a09..b36dcd7 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -1283,28 +1283,15 @@ xfs_seek_hole_data(
>  	struct xfs_inode	*ip = XFS_I(inode);
>  	struct xfs_mount	*mp = ip->i_mount;
>  	uint			lock;
> -	loff_t			offset, end;
> -	int			error = 0;
> +	loff_t			offset;
>  
>  	if (XFS_FORCED_SHUTDOWN(mp))
>  		return -EIO;
>  
>  	lock = xfs_ilock_data_map_shared(ip);
> -
> -	end = i_size_read(inode);
> -	offset = __xfs_seek_hole_data(inode, start, end, whence);
> -	if (offset < 0) {
> -		error = offset;
> -		goto out_unlock;
> -	}
> -
> -	offset = vfs_setpos(file, offset, inode->i_sb->s_maxbytes);
> -
> -out_unlock:
> +	offset = iomap_seek_hole_data(file, start, whence, &xfs_iomap_ops);
>  	xfs_iunlock(ip, lock);
>  
> -	if (error)
> -		return error;
>  	return offset;
>  }

Doesn't this change the way data in unwritten extents is reported?
i.e. we current report unwritten extents as holes if there is no
data pending writeback over the range in the page cache, and as
data if there is data in the page cache.

I don't see this page cache lookup for unwritten extents implemented
in iomap_seek_hole_data() - it only reports extents mapped as holes
as holes. Hence this will now always report unwritten extents as
data . This strikes me as a regression as we currently report them
as a hole:

$ xfs_io -f -c "truncate 1m" -c "falloc 0 1m" -c "seek -a -r 0" foo
Whence	Result
HOLE	0
$

I'm pretty sure that ext4 has the same behaviour when it comes to
dirty page cache pages over unwritten extents ..

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/7] vfs: Add iomap_seek_hole_data helper
  2017-06-16 14:51 ` [PATCH 4/7] vfs: Add iomap_seek_hole_data helper Andreas Gruenbacher
@ 2017-06-19 17:32   ` Darrick J. Wong
  0 siblings, 0 replies; 13+ messages in thread
From: Darrick J. Wong @ 2017-06-19 17:32 UTC (permalink / raw)
  To: Andreas Gruenbacher; +Cc: linux-fsdevel, Bob Peterson, linux-xfs, cluster-devel

On Fri, Jun 16, 2017 at 04:51:17PM +0200, Andreas Gruenbacher wrote:
> Filesystems can use this for implementing lseek SEEK_HOLE / SEEK_DATA
> support via iomap.
> 
> The __iomap_seek_hole_data helper takes an additional size argument and
> doesn't reposition the file offset; this is for internal use for xfs
> quota files which don't maintain the inode size.
> 
> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
> ---
>  fs/iomap.c            | 81 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/iomap.h |  6 ++++
>  2 files changed, 87 insertions(+)
> 
> diff --git a/fs/iomap.c b/fs/iomap.c
> index 4b10892..cf01694 100644
> --- a/fs/iomap.c
> +++ b/fs/iomap.c
> @@ -584,6 +584,87 @@ int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fi,
>  }
>  EXPORT_SYMBOL_GPL(iomap_fiemap);
>  
> +static loff_t
> +iomap_seek_hole_actor(struct inode *inode, loff_t offset, loff_t length,
> +		      void *data, struct iomap *iomap)
> +{
> +	if (iomap->type == IOMAP_HOLE)

Echoing Dave's comments further on (sorry, I tend to read the patchset but
reply in reverse order, distractions notwithstanding), for SEEK_{HOLE,DATA}
you'd want to test for type == IOMAP_HOLE || type == IOMAP_UNWRITTEN since
unwritten (i.e. fallocated) areas are treated as holes.

--D

> +		return 0;
> +	return iomap->offset + iomap->length - offset;
> +}
> +
> +static loff_t
> +iomap_seek_data_actor(struct inode *inode, loff_t offset, loff_t length,
> +		      void *data, struct iomap *iomap)
> +{
> +	if (iomap->type != IOMAP_HOLE)
> +		return 0;
> +	return iomap->offset + iomap->length - offset;
> +}
> +
> +loff_t
> +__iomap_seek_hole_data(struct inode *inode, loff_t offset, loff_t size,
> +		       int whence, const struct iomap_ops *ops)
> +{
> +	static loff_t (*actor)(struct inode *, loff_t, loff_t, void *,
> +			       struct iomap *);
> +	loff_t len = size - offset;
> +	loff_t ret;
> +
> +	/* Nothing to be found beyond the end of the file. */
> +	if (len <= 0)
> +		return -ENXIO;
> +
> +	switch(whence) {
> +	case SEEK_HOLE:
> +		actor = iomap_seek_hole_actor;
> +		break;
> +
> +	case SEEK_DATA:
> +		actor = iomap_seek_data_actor;
> +		break;
> +	}
> +
> +	while (len > 0) {
> +		ret = iomap_apply(inode, offset, len, IOMAP_REPORT, ops,
> +				  NULL, actor);
> +		if (ret <= 0) {
> +			if (ret < 0)
> +				return ret;
> +			break;
> +		}
> +		offset += ret;
> +		len -= ret;
> +	}
> +
> +	if (len <= 0) {
> +		/* There is an implicit hole at the end of the file. */
> +		if (whence != SEEK_HOLE)
> +			offset = -ENXIO;
> +
> +		/* The last segment can extend beyond the end of the file. */
> +		if (offset > size)
> +			offset = size;
> +	}
> +
> +	return offset;
> +}
> +EXPORT_SYMBOL_GPL(__iomap_seek_hole_data);
> +
> +loff_t
> +iomap_seek_hole_data(struct file *file, loff_t offset, int whence,
> +		     const struct iomap_ops *ops)
> +{
> +	struct inode *inode = file->f_mapping->host;
> +
> +	offset = __iomap_seek_hole_data(inode, offset, i_size_read(inode),
> +					whence, ops);
> +	if (offset <= 0)
> +		return offset;
> +	return vfs_setpos(file, offset, inode->i_sb->s_maxbytes);
> +}
> +EXPORT_SYMBOL_GPL(iomap_seek_hole_data);
> +
>  /*
>   * Private flags for iomap_dio, must not overlap with the public ones in
>   * iomap.h:
> diff --git a/include/linux/iomap.h b/include/linux/iomap.h
> index 9d64933..49b6ec0 100644
> --- a/include/linux/iomap.h
> +++ b/include/linux/iomap.h
> @@ -85,6 +85,12 @@ int iomap_page_mkwrite(struct vm_fault *vmf, const struct iomap_ops *ops);
>  int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
>  		loff_t start, loff_t len, const struct iomap_ops *ops);
>  
> +struct file;
> +loff_t __iomap_seek_hole_data(struct inode *inode, loff_t pos, loff_t size,
> +			      int whence, const struct iomap_ops *ops);
> +loff_t iomap_seek_hole_data(struct file *file, loff_t pos,
> +			    int whence, const struct iomap_ops *ops);
> +
>  /*
>   * Flags for direct I/O ->end_io:
>   */
> -- 
> 2.7.5
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 6/7] xfs: Switch to iomap for lseek SEEK_HOLE / SEEK_DATA
  2017-06-17 23:57   ` Dave Chinner
@ 2017-06-19 22:16     ` Andreas Gruenbacher
  0 siblings, 0 replies; 13+ messages in thread
From: Andreas Gruenbacher @ 2017-06-19 22:16 UTC (permalink / raw)
  To: Dave Chinner
  Cc: linux-fsdevel, Bob Peterson, linux-xfs, cluster-devel,
	Darrick J. Wong

On Sun, Jun 18, 2017 at 1:57 AM, Dave Chinner <david@fromorbit.com> wrote:
> On Fri, Jun 16, 2017 at 04:51:19PM +0200, Andreas Gruenbacher wrote:
>> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
>> ---
>>  fs/xfs/xfs_file.c | 17 ++---------------
>>  1 file changed, 2 insertions(+), 15 deletions(-)
>>
>> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
>> index 5fb5a09..b36dcd7 100644
>> --- a/fs/xfs/xfs_file.c
>> +++ b/fs/xfs/xfs_file.c
>> @@ -1283,28 +1283,15 @@ xfs_seek_hole_data(
>>       struct xfs_inode        *ip = XFS_I(inode);
>>       struct xfs_mount        *mp = ip->i_mount;
>>       uint                    lock;
>> -     loff_t                  offset, end;
>> -     int                     error = 0;
>> +     loff_t                  offset;
>>
>>       if (XFS_FORCED_SHUTDOWN(mp))
>>               return -EIO;
>>
>>       lock = xfs_ilock_data_map_shared(ip);
>> -
>> -     end = i_size_read(inode);
>> -     offset = __xfs_seek_hole_data(inode, start, end, whence);
>> -     if (offset < 0) {
>> -             error = offset;
>> -             goto out_unlock;
>> -     }
>> -
>> -     offset = vfs_setpos(file, offset, inode->i_sb->s_maxbytes);
>> -
>> -out_unlock:
>> +     offset = iomap_seek_hole_data(file, start, whence, &xfs_iomap_ops);
>>       xfs_iunlock(ip, lock);
>>
>> -     if (error)
>> -             return error;
>>       return offset;
>>  }
>
> Doesn't this change the way data in unwritten extents is reported?
> i.e. we current report unwritten extents as holes if there is no
> data pending writeback over the range in the page cache, and as
> data if there is data in the page cache.
>
> I don't see this page cache lookup for unwritten extents implemented
> in iomap_seek_hole_data() - it only reports extents mapped as holes
> as holes. Hence this will now always report unwritten extents as
> data . This strikes me as a regression as we currently report them
> as a hole:
>
> $ xfs_io -f -c "truncate 1m" -c "falloc 0 1m" -c "seek -a -r 0" foo
> Whence  Result
> HOLE    0
> $

Indeed, thanks for pointing this out.

> I'm pretty sure that ext4 has the same behaviour when it comes to
> dirty page cache pages over unwritten extents.

That's correct as well.

The code in xfs and ext4 looks similar enough that we can implement
this generically in the iomap_seek_hole_data helper, I believe.

Thanks,
Andreas

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Cluster-devel] [PATCH 7/7] xfs: Switch to iomap for seeking in quota files
  2017-06-16 15:30   ` Darrick J. Wong
@ 2017-06-20  6:31     ` Christoph Hellwig
  0 siblings, 0 replies; 13+ messages in thread
From: Christoph Hellwig @ 2017-06-20  6:31 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Andreas Gruenbacher, linux-fsdevel, cluster-devel, linux-xfs

On Fri, Jun 16, 2017 at 08:30:29AM -0700, Darrick J. Wong wrote:
> On Fri, Jun 16, 2017 at 04:51:20PM +0200, Andreas Gruenbacher wrote:
> > (This patch is largely untested.)
> 
> In principle the XFS bits look ok, but I think you'd be better off with
> a review from hch and, uh, while you wait, a run through the xfstests
> 'quota' group and probably 'auto' too?

I think for the quota code we should simply not even use seek
hole / data.  Quota files don't have delayed allocations, unwritten
extents or COW forks, so it's just a trivial lookup in the extent
list.  I'll fix that part up to make Andreas' life easier.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2017-06-20  6:31 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-06-16 14:51 [PATCH 0/7] lseek SEEK_HOLE / SEEK_DATA via iomap Andreas Gruenbacher
2017-06-16 14:51 ` [PATCH 1/7] GFS2: Make height info part of metapath Andreas Gruenbacher
2017-06-16 14:51 ` [PATCH 2/7] GFS2: Implement iomap for block_map Andreas Gruenbacher
2017-06-16 14:51 ` [PATCH 3/7] GFS2: Switch fiemap implementation to use iomap Andreas Gruenbacher
2017-06-16 14:51 ` [PATCH 4/7] vfs: Add iomap_seek_hole_data helper Andreas Gruenbacher
2017-06-19 17:32   ` Darrick J. Wong
2017-06-16 14:51 ` [PATCH 5/7] gfs2: Implement lseek SEEK_HOLE / SEEK_DATA via iomap Andreas Gruenbacher
2017-06-16 14:51 ` [PATCH 6/7] xfs: Switch to iomap for lseek SEEK_HOLE / SEEK_DATA Andreas Gruenbacher
2017-06-17 23:57   ` Dave Chinner
2017-06-19 22:16     ` Andreas Gruenbacher
2017-06-16 14:51 ` [PATCH 7/7] xfs: Switch to iomap for seeking in quota files Andreas Gruenbacher
2017-06-16 15:30   ` Darrick J. Wong
2017-06-20  6:31     ` [Cluster-devel] " Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).