From mboxrd@z Thu Jan  1 00:00:00 1970
From: Steven Whitehouse <swhiteho@redhat.com>
Date: Mon, 15 Aug 2016 09:41:54 +0100
Subject: [Cluster-devel] [GFS2 PATCH][v2] GFS2: Add function
 gfs2_get_iomap
In-Reply-To: <225698410.3303717.1471031879076.JavaMail.zimbra@redhat.com>
References: <1935983818.2525484.1470927576083.JavaMail.zimbra@redhat.com>
	<57AC990C.3040909@redhat.com>
	<225698410.3303717.1471031879076.JavaMail.zimbra@redhat.com>
Message-ID: <57B18052.4060101@redhat.com>
List-Id: <cluster-devel.redhat.com>
To: cluster-devel.redhat.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

Hi,


On 12/08/16 20:57, Bob Peterson wrote:
> ----- Original Message -----
> | > +int gfs2_get_iomap(struct inode *inode, loff_t pos, ssize_t length,
> | > +		   struct iomap *iomap)
> | This function should be merged with gfs2_block_map() I think, so that
> | gfs2_block_map just becomes a wrapper (which will eventually go away)
> | for this function. Otherwise we will have two parallel but slightly
> | different implementations of block mapping to maintain.
>
> Yes, I guess that's worth doing. I did this and included the revised
> patch below.
That wasn't quite what I was thinking of... comments below

> | Is there a better way to pass the gh from the begin function to the end
> | function? I'm sure that will work, but it is the kind of thing that
> | might trip up the unwary in the future,
>
> Yeah, I would have passed the gh another way, but Christoph didn't
> leave a good way in the iomap interface or other bits to do it.
> At least not that I can see. It might be worth changing the interface to
> add that flexibility for other file systems that might want to do this
> as well. It would be easy enough to add a (void *)private pointer or
> something that can be passed around. Christoph and Steve: Would you
> prefer I plumb something before I do the GFS2 bits?
>
> For now, here's a replacement patch which still does it the only way possible.
I think there is a possible race in case a single process opens the same 
file twice, unless I'm missing something there? It is an unlikely thing 
for someone to do, but we should allow for it anyway.

> Bob Peterson
> Red Hat File Systems
> ---
> This patch replaces the GFS2 fiemap implementation that used vfs
> function __generic_block_fiemap with a new implementation that uses
> the new iomap-based fiemap interface. This allows GFS2's fiemap to
> skip holes. The time to do filefrag on a file with a 1 petabyte hole
> is reduced from several days or weeks, to milliseconds. Note that
> there are Kconfig changes that affect everyone.
>
> Signed-off-by: Bob Peterson <rpeterso@redhat.com>
> ---
> diff --git a/fs/gfs2/Kconfig b/fs/gfs2/Kconfig
> index 90c6a8f..f8fa955 100644
> --- a/fs/gfs2/Kconfig
> +++ b/fs/gfs2/Kconfig
> @@ -25,6 +25,7 @@ config GFS2_FS
>   config GFS2_FS_LOCKING_DLM
>   	bool "GFS2 DLM locking"
>   	depends on (GFS2_FS!=n) && NET && INET && (IPV6 || IPV6=n) && \
> +		FS_IOMAP && \
>   		CONFIGFS_FS && SYSFS && (DLM=y || DLM=GFS2_FS)
>   	help
>   	  Multiple node locking module for GFS2
> diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c
> index 6e2bec1..839f2a6 100644
> --- a/fs/gfs2/bmap.c
> +++ b/fs/gfs2/bmap.c
> @@ -588,6 +588,66 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
>   }
>   
>   /**
> + * hole_size - figure out the size of a hole
> + * @ip: The inode
> + * @lblock: The logical starting block number
> + * @mp: The metapath
> + *
> + * Returns: The hole size in bytes
> + *
> + */
> +static u64 hole_size(struct inode *inode, sector_t lblock, struct metapath *mp)
> +{
> +	struct gfs2_inode *ip = GFS2_I(inode);
> +	struct gfs2_sbd *sdp = GFS2_SB(inode);
> +	struct metapath mp_eof;
> +	unsigned int end_of_metadata = ip->i_height - 1;
> +	u64 factor = 1;
> +	int hgt = end_of_metadata;
> +	u64 holesz = 0, holestep;
> +	const __be64 *first, *end, *ptr;
> +	const struct buffer_head *bh;
> +	u64 isize = i_size_read(inode);
> +	int zeroptrs;
> +	bool done = false;
> +
> +	/* Get another metapath, to the very last byte */
> +	find_metapath(sdp, (isize - 1) >> inode->i_blkbits, &mp_eof,
> +		      ip->i_height);
> +	for (hgt = end_of_metadata; hgt >= 0 && !done; hgt--) {
> +		bh = mp->mp_bh[hgt];
> +		if (bh) {
> +			zeroptrs = 0;
> +			first = metapointer(hgt, mp);
> +			end = (const __be64 *)(bh->b_data + bh->b_size);
> +
> +			for (ptr = first; ptr < end; ptr++) {
> +				if (*ptr) {
> +					done = true;
> +					break;
> +				} else {
> +					zeroptrs++;
> +				}
> +			}
> +		} else {
> +			zeroptrs = sdp->sd_inptrs;
> +		}
> +		holestep = min(factor * zeroptrs,
> +			       isize - (lblock + (zeroptrs * holesz)));
> +		holesz += holestep;
> +		if (lblock + holesz >= isize)
> +			return holesz << inode->i_blkbits;
> +
> +		factor *= sdp->sd_inptrs;
> +		if (hgt && (mp->mp_list[hgt - 1] < mp_eof.mp_list[hgt - 1]))
> +			(mp->mp_list[hgt - 1])++;
> +	}
> +	return holesz << inode->i_blkbits;
> +}
> +
> +enum gfs2_map_type { maptype_iomap = 1, maptype_bhmap = 2, };
> +
> +/**
>    * gfs2_block_map - Map a block from an inode to a disk block
>    * @inode: The inode
>    * @lblock: The logical block number
> @@ -601,13 +661,14 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock,
>    * Returns: errno
>    */
>   
> -int gfs2_block_map(struct inode *inode, sector_t lblock,
> -		   struct buffer_head *bh_map, int create)
> +static int _gfs2_block_map(struct inode *inode, loff_t pos, ssize_t length,
> +			   int create, enum gfs2_map_type maptype,
> +			   void *iomap_or_bh)
>   {
Lets just call this gfs2_iomap() and make it take a struct iomap only. 
Then we'd have a new function with
the original name:

int gfs2_block_map(struct inode *inode, sector_t lblock,
		   struct buffer_head *bh_map, int create)

{
     struct iomap *iom;

     /* Set up iomap based on bh */

     ret = gfs2_iomap(inode, pos, length, create, &iom);

     /* Copy iomap data back into bh */

    return ret;
}

That way we use iomap as the main structure through our block mapping 
functions, removing the bh from those code paths entirely and 
gfs2_block_map() becomes a function for backwards compatibility that 
we'd eventually be able to remove. It also means that you don't need the 
gfs2_map_type enum either,

Steve.