From mboxrd@z Thu Jan 1 00:00:00 1970 From: Steven Whitehouse Date: Mon, 15 Aug 2016 09:41:54 +0100 Subject: [Cluster-devel] [GFS2 PATCH][v2] GFS2: Add function gfs2_get_iomap In-Reply-To: <225698410.3303717.1471031879076.JavaMail.zimbra@redhat.com> References: <1935983818.2525484.1470927576083.JavaMail.zimbra@redhat.com> <57AC990C.3040909@redhat.com> <225698410.3303717.1471031879076.JavaMail.zimbra@redhat.com> Message-ID: <57B18052.4060101@redhat.com> List-Id: To: cluster-devel.redhat.com MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Hi, On 12/08/16 20:57, Bob Peterson wrote: > ----- Original Message ----- > | > +int gfs2_get_iomap(struct inode *inode, loff_t pos, ssize_t length, > | > + struct iomap *iomap) > | This function should be merged with gfs2_block_map() I think, so that > | gfs2_block_map just becomes a wrapper (which will eventually go away) > | for this function. Otherwise we will have two parallel but slightly > | different implementations of block mapping to maintain. > > Yes, I guess that's worth doing. I did this and included the revised > patch below. That wasn't quite what I was thinking of... comments below > | Is there a better way to pass the gh from the begin function to the end > | function? I'm sure that will work, but it is the kind of thing that > | might trip up the unwary in the future, > > Yeah, I would have passed the gh another way, but Christoph didn't > leave a good way in the iomap interface or other bits to do it. > At least not that I can see. It might be worth changing the interface to > add that flexibility for other file systems that might want to do this > as well. It would be easy enough to add a (void *)private pointer or > something that can be passed around. Christoph and Steve: Would you > prefer I plumb something before I do the GFS2 bits? > > For now, here's a replacement patch which still does it the only way possible. I think there is a possible race in case a single process opens the same file twice, unless I'm missing something there? It is an unlikely thing for someone to do, but we should allow for it anyway. > Bob Peterson > Red Hat File Systems > --- > This patch replaces the GFS2 fiemap implementation that used vfs > function __generic_block_fiemap with a new implementation that uses > the new iomap-based fiemap interface. This allows GFS2's fiemap to > skip holes. The time to do filefrag on a file with a 1 petabyte hole > is reduced from several days or weeks, to milliseconds. Note that > there are Kconfig changes that affect everyone. > > Signed-off-by: Bob Peterson > --- > diff --git a/fs/gfs2/Kconfig b/fs/gfs2/Kconfig > index 90c6a8f..f8fa955 100644 > --- a/fs/gfs2/Kconfig > +++ b/fs/gfs2/Kconfig > @@ -25,6 +25,7 @@ config GFS2_FS > config GFS2_FS_LOCKING_DLM > bool "GFS2 DLM locking" > depends on (GFS2_FS!=n) && NET && INET && (IPV6 || IPV6=n) && \ > + FS_IOMAP && \ > CONFIGFS_FS && SYSFS && (DLM=y || DLM=GFS2_FS) > help > Multiple node locking module for GFS2 > diff --git a/fs/gfs2/bmap.c b/fs/gfs2/bmap.c > index 6e2bec1..839f2a6 100644 > --- a/fs/gfs2/bmap.c > +++ b/fs/gfs2/bmap.c > @@ -588,6 +588,66 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock, > } > > /** > + * hole_size - figure out the size of a hole > + * @ip: The inode > + * @lblock: The logical starting block number > + * @mp: The metapath > + * > + * Returns: The hole size in bytes > + * > + */ > +static u64 hole_size(struct inode *inode, sector_t lblock, struct metapath *mp) > +{ > + struct gfs2_inode *ip = GFS2_I(inode); > + struct gfs2_sbd *sdp = GFS2_SB(inode); > + struct metapath mp_eof; > + unsigned int end_of_metadata = ip->i_height - 1; > + u64 factor = 1; > + int hgt = end_of_metadata; > + u64 holesz = 0, holestep; > + const __be64 *first, *end, *ptr; > + const struct buffer_head *bh; > + u64 isize = i_size_read(inode); > + int zeroptrs; > + bool done = false; > + > + /* Get another metapath, to the very last byte */ > + find_metapath(sdp, (isize - 1) >> inode->i_blkbits, &mp_eof, > + ip->i_height); > + for (hgt = end_of_metadata; hgt >= 0 && !done; hgt--) { > + bh = mp->mp_bh[hgt]; > + if (bh) { > + zeroptrs = 0; > + first = metapointer(hgt, mp); > + end = (const __be64 *)(bh->b_data + bh->b_size); > + > + for (ptr = first; ptr < end; ptr++) { > + if (*ptr) { > + done = true; > + break; > + } else { > + zeroptrs++; > + } > + } > + } else { > + zeroptrs = sdp->sd_inptrs; > + } > + holestep = min(factor * zeroptrs, > + isize - (lblock + (zeroptrs * holesz))); > + holesz += holestep; > + if (lblock + holesz >= isize) > + return holesz << inode->i_blkbits; > + > + factor *= sdp->sd_inptrs; > + if (hgt && (mp->mp_list[hgt - 1] < mp_eof.mp_list[hgt - 1])) > + (mp->mp_list[hgt - 1])++; > + } > + return holesz << inode->i_blkbits; > +} > + > +enum gfs2_map_type { maptype_iomap = 1, maptype_bhmap = 2, }; > + > +/** > * gfs2_block_map - Map a block from an inode to a disk block > * @inode: The inode > * @lblock: The logical block number > @@ -601,13 +661,14 @@ static int gfs2_bmap_alloc(struct inode *inode, const sector_t lblock, > * Returns: errno > */ > > -int gfs2_block_map(struct inode *inode, sector_t lblock, > - struct buffer_head *bh_map, int create) > +static int _gfs2_block_map(struct inode *inode, loff_t pos, ssize_t length, > + int create, enum gfs2_map_type maptype, > + void *iomap_or_bh) > { Lets just call this gfs2_iomap() and make it take a struct iomap only. Then we'd have a new function with the original name: int gfs2_block_map(struct inode *inode, sector_t lblock, struct buffer_head *bh_map, int create) { struct iomap *iom; /* Set up iomap based on bh */ ret = gfs2_iomap(inode, pos, length, create, &iom); /* Copy iomap data back into bh */ return ret; } That way we use iomap as the main structure through our block mapping functions, removing the bh from those code paths entirely and gfs2_block_map() becomes a function for backwards compatibility that we'd eventually be able to remove. It also means that you don't need the gfs2_map_type enum either, Steve.