* [PATCH 1/2] fiemap: add freespace mapping to FS_IOC_FIEMAP
2012-10-18 5:11 [RFC, PATCH 0/2] fiemap: filesystem free space mapping Dave Chinner
@ 2012-10-18 5:11 ` Dave Chinner
2012-11-08 16:50 ` Mark Tinguely
2012-10-18 5:11 ` [PATCH 2/2] xfs: implement FIEMAP_FLAG_FREESPACE_* Dave Chinner
` (3 subsequent siblings)
4 siblings, 1 reply; 15+ messages in thread
From: Dave Chinner @ 2012-10-18 5:11 UTC (permalink / raw)
To: linux-fsdevel; +Cc: xfs
From: Dave Chinner <dchinner@redhat.com>
fiemap is used to map extents of used space on files. It's just an
array of extents, though, so there's no reason it can only index
*used* space.
There is a need for getting freespace layout information into
userspace. For example, defragmentation programs would find it
useful to be able to map the free space in the filesystem to
work out where it is best to move data to defragment it.
Alternatively, knowing where free space is enables us to identify
extents that need to be moved to defragment free space.
Hence, extend fiemap with the FIEMAP_FLAG_FREESPACE flag to indicate
that the caller wants to map free space in the range fm_start bytes
from the start of the filesystem for fm_length bytes.
Because XFS can report extents in size order without needing to
sort, and this information is useful to xfs_fsr, also add
FIEMAP_FLAG_FREESPACE_SIZE to tell the filesystem to return a
freespace map ordered by extent size rather than offset. If there
are multiple extents of the same size, then they are ordered by
offset.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
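For illustration only, here is a minimal userspace sketch of how a
request using the new flags might be set up. The helper name
freesp_request_alloc() is hypothetical and not part of this series, and
the flag values are the ones proposed in this patch, not anything in a
released header. It assumes exactly one ordering flag per request, since
the XFS implementation in 2/2 rejects requests that set both.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <linux/fiemap.h>

/* Flag values as proposed by this patch; not present in released headers. */
#ifndef FIEMAP_FLAG_FREESPACE
#define FIEMAP_FLAG_FREESPACE		0x00000004
#define FIEMAP_FLAG_FREESPACE_SIZE	0x00000008
#endif

/*
 * Hypothetical helper: allocate and fill a fiemap request for a free space
 * query. Returns NULL if 'order' is not exactly one of the two free space
 * flags, or if allocation fails. The caller frees the result.
 */
struct fiemap *
freesp_request_alloc(uint64_t start, uint64_t length, uint32_t order,
		     uint32_t nr_extents)
{
	struct fiemap *fm;

	assert(nr_extents > 0);

	/* offset order and size order cannot be combined in one request */
	if (order != FIEMAP_FLAG_FREESPACE &&
	    order != FIEMAP_FLAG_FREESPACE_SIZE)
		return NULL;

	/* header plus room for nr_extents returned extents, zero filled */
	fm = calloc(1, sizeof(*fm) +
		       nr_extents * sizeof(struct fiemap_extent));
	if (!fm)
		return NULL;

	fm->fm_start = start;		/* byte offset into the filesystem */
	fm->fm_length = length;		/* number of bytes to map */
	fm->fm_flags = order;
	fm->fm_extent_count = nr_extents;
	return fm;
}
```

The filled structure would then be passed to ioctl(fd, FS_IOC_FIEMAP, fm)
on a file descriptor open on the filesystem, which is what the
xfs_spaceman tool in 3/2 does.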
Documentation/filesystems/fiemap.txt | 37 +++++++++++++++++++++++++++++++---
include/linux/fiemap.h | 6 +++++-
2 files changed, 39 insertions(+), 4 deletions(-)
diff --git a/Documentation/filesystems/fiemap.txt b/Documentation/filesystems/fiemap.txt
index 1b805a0..45531ba 100644
--- a/Documentation/filesystems/fiemap.txt
+++ b/Documentation/filesystems/fiemap.txt
@@ -2,9 +2,9 @@
Fiemap Ioctl
============
-The fiemap ioctl is an efficient method for userspace to get file
-extent mappings. Instead of block-by-block mapping (such as bmap), fiemap
-returns a list of extents.
+The fiemap ioctl is an efficient method for userspace to get file or
+filesystem extent mappings. Instead of block-by-block mapping (such as
+bmap), fiemap returns a list of extents.
Request Basics
@@ -58,6 +58,37 @@ If this flag is set, the kernel will sync the file before mapping extents.
If this flag is set, the extents returned will describe the inodes
extended attribute lookup tree, instead of its data tree.
+* FIEMAP_FLAG_FREESPACE
+If this flag is set, the extents returned will describe the
+*filesystem's* free space map, with fm_start specifying the start offset
+into the filesystem's address range (in bytes) of the region to be
+mapped. fm_length is the length in bytes of the range to be mapped. Free
+space extents will be mapped in ascending offset order.
+
+Filesystems with multiple freespace indexes may return
+FIEMAP_EXTENT_LAST at the end of a specific freespace index map. Hence
+FIEMAP_EXTENT_LAST does not mean there is no more free space to be
+mapped, just that the requested range spanned multiple free space
+indexes.
+
+The caller therefore needs to be aware of the underlying filesystem
+implementation and geometry to make correct use of this call. As such,
+this functionality is only intended for use by filesystem management
+utilities (e.g. defragmentation tools) and not general purpose
+applications.
+
+* FIEMAP_FLAG_FREESPACE_SIZE
+If this flag is set, the filesystem freespace tree will be mapped
+similarly to FIEMAP_FLAG_FREESPACE, but extents will be ordered from
+smallest free space extent to largest. Where extents have the same size,
+they will be ordered by ascending offset, as for
+FIEMAP_FLAG_FREESPACE. It is up to the application to track the highest
+offset extent seen by this walk so that if it doesn't see a
+FIEMAP_EXTENT_LAST flag, the application knows what offset to start the
+next mapping from.
+
+The same FIEMAP_EXTENT_LAST caveats apply to this call as to
+FIEMAP_FLAG_FREESPACE.
Extent Mapping
--------------
diff --git a/include/linux/fiemap.h b/include/linux/fiemap.h
index d830747..f4fbb9f 100644
--- a/include/linux/fiemap.h
+++ b/include/linux/fiemap.h
@@ -40,8 +40,12 @@ struct fiemap {
#define FIEMAP_FLAG_SYNC 0x00000001 /* sync file data before map */
#define FIEMAP_FLAG_XATTR 0x00000002 /* map extended attribute tree */
+#define FIEMAP_FLAG_FREESPACE 0x00000004 /* map fs freespace tree */
+#define FIEMAP_FLAG_FREESPACE_SIZE 0x00000008 /* map freespace in size order */
-#define FIEMAP_FLAGS_COMPAT (FIEMAP_FLAG_SYNC | FIEMAP_FLAG_XATTR)
+#define FIEMAP_FLAGS_COMPAT (FIEMAP_FLAG_SYNC | FIEMAP_FLAG_XATTR | \
+ FIEMAP_FLAG_FREESPACE | \
+ FIEMAP_FLAG_FREESPACE_SIZE)
#define FIEMAP_EXTENT_LAST 0x00000001 /* Last extent in file. */
#define FIEMAP_EXTENT_UNKNOWN 0x00000002 /* Data location unknown. */
--
1.7.10
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/2] fiemap: add freespace mapping to FS_IOC_FIEMAP
2012-10-18 5:11 ` [PATCH 1/2] fiemap: add freespace mapping to FS_IOC_FIEMAP Dave Chinner
@ 2012-11-08 16:50 ` Mark Tinguely
2012-11-08 20:56 ` Dave Chinner
0 siblings, 1 reply; 15+ messages in thread
From: Mark Tinguely @ 2012-11-08 16:50 UTC (permalink / raw)
To: Dave Chinner; +Cc: linux-fsdevel, xfs
On 10/18/12 00:11, Dave Chinner wrote:
> From: Dave Chinner<dchinner@redhat.com>
>
> fiemap is used to map extents of used space on files. It's just an
> array of extents, though, so there's no reason it can only index
> *used* space.
>
> There is a need for getting freespace layout information into
> userspace. For example, defragmentation programs would find it
> useful to be able to map the free space in the filesystem to
> work out where it is best to move data to defragment it.
> Alternatively, knowing where free space is enables us to identify
> extents that need to be moved to defragment free space.
>
> Hence, extend fiemap with the FIEMAP_FLAG_FREESPACE to indicate that
> the caller wants to map free space in the range fm_start bytes from
> the start of the filesystem for fm_length bytes.
>
> Because XFS can report extents in size order without needing to
> sort, and this information is useful to xfs_fsr, also add
> FIEMAP_FLAG_FREESPACE_SIZE to tell the filesystem to return a
> freespace map ordered by extent size rather than offset. If there
> are multiple extents of the same size, then they are ordered by
> offset.
>
> Signed-off-by: Dave Chinner<dchinner@redhat.com>
> ---
...
> --------------
> diff --git a/include/linux/fiemap.h b/include/linux/fiemap.h
> index d830747..f4fbb9f 100644
> --- a/include/linux/fiemap.h
> +++ b/include/linux/fiemap.h
include/uabi/linux/fiemap.h
^^^^
other than that, it looks good.
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
* Re: [PATCH 1/2] fiemap: add freespace mapping to FS_IOC_FIEMAP
2012-11-08 16:50 ` Mark Tinguely
@ 2012-11-08 20:56 ` Dave Chinner
2012-11-08 21:01 ` Mark Tinguely
0 siblings, 1 reply; 15+ messages in thread
From: Dave Chinner @ 2012-11-08 20:56 UTC (permalink / raw)
To: Mark Tinguely; +Cc: linux-fsdevel, xfs
On Thu, Nov 08, 2012 at 10:50:49AM -0600, Mark Tinguely wrote:
> On 10/18/12 00:11, Dave Chinner wrote:
> >From: Dave Chinner<dchinner@redhat.com>
> >
> >fiemap is used to map extents of used space on files. It's just an
> >array of extents, though, so there's no reason it can only index
> >*used* space.
> >
> >There is a need for getting freespace layout information into
> >userspace. For example, defragmentation programs would find it
> >useful to be able to map the free space in the filesystem to
> >work out where it is best to move data to defragment it.
> >Alternatively, knowing where free space is enables us to identify
> >extents that need to be moved to defragment free space.
> >
> >Hence, extend fiemap with the FIEMAP_FLAG_FREESPACE to indicate that
> >the caller wants to map free space in the range fm_start bytes from
> >the start of the filesystem for fm_length bytes.
> >
> >Because XFS can report extents in size order without needing to
> >sort, and this information is useful to xfs_fsr, also add
> >FIEMAP_FLAG_FREESPACE_SIZE to tell the filesystem to return a
> >freespace map ordered by extent size rather than offset. If there
> >are multiple extents of the same size, then they are ordered by
> >offset.
> >
> >Signed-off-by: Dave Chinner<dchinner@redhat.com>
> >---
>
> ...
>
> > --------------
> >diff --git a/include/linux/fiemap.h b/include/linux/fiemap.h
> >index d830747..f4fbb9f 100644
> >--- a/include/linux/fiemap.h
> >+++ b/include/linux/fiemap.h
>
> include/uabi/linux/fiemap.h
> ^^^^
> other than that, it looks good.
include/uapi/ actually, but that change was made after I posted the
patches, so it's no surprise that it didn't apply.
As it is, this needs to be redone into an FS_IOC_FIEMAPFS ioctl in
response to other reviews. I've already done that work (a week ago),
I just haven't fully tested it yet so I haven't reported it...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: [PATCH 1/2] fiemap: add freespace mapping to FS_IOC_FIEMAP
2012-11-08 20:56 ` Dave Chinner
@ 2012-11-08 21:01 ` Mark Tinguely
0 siblings, 0 replies; 15+ messages in thread
From: Mark Tinguely @ 2012-11-08 21:01 UTC (permalink / raw)
To: Dave Chinner; +Cc: linux-fsdevel, xfs
On 11/08/12 14:56, Dave Chinner wrote:
> On Thu, Nov 08, 2012 at 10:50:49AM -0600, Mark Tinguely wrote:
>> On 10/18/12 00:11, Dave Chinner wrote:
>>> From: Dave Chinner<dchinner@redhat.com>
>>>
>>> fiemap is used to map extents of used space on files. It's just an
>>> array of extents, though, so there's no reason it can only index
>>> *used* space.
>>>
>>> There is a need for getting freespace layout information into
>>> userspace. For example, defragmentation programs would find it
>>> useful to be able to map the free space in the filesystem to
>>> work out where it is best to move data to defragment it.
>>> Alternatively, knowing where free space is enables us to identify
>>> extents that need to be moved to defragment free space.
>>>
>>> Hence, extend fiemap with the FIEMAP_FLAG_FREESPACE to indicate that
>>> the caller wants to map free space in the range fm_start bytes from
>>> the start of the filesystem for fm_length bytes.
>>>
>>> Because XFS can report extents in size order without needing to
>>> sort, and this information is useful to xfs_fsr, also add
>>> FIEMAP_FLAG_FREESPACE_SIZE to tell the filesystem to return a
>>> freespace map ordered by extent size rather than offset. If there
>>> are multiple extents of the same size, then they are ordered by
>>> offset.
>>>
>>> Signed-off-by: Dave Chinner<dchinner@redhat.com>
>>> ---
>>
>> ...
>>
>>> --------------
>>> diff --git a/include/linux/fiemap.h b/include/linux/fiemap.h
>>> index d830747..f4fbb9f 100644
>>> --- a/include/linux/fiemap.h
>>> +++ b/include/linux/fiemap.h
>>
>> include/uabi/linux/fiemap.h
>> ^^^^
>> other than that, it looks good.
>
> include/uapi/ actually, but that change was made after I posted the
> patches, so it's no surprise that it didn't apply.
>
> As it is, this needs to be redone into an FS_IOC_FIEMAPFS ioctl in
> response to other reviews. I've already done that work (a week ago),
> I just haven't fully tested it yet so I haven't reported it...
>
> Cheers,
>
> Dave.
Okay. Thank-you for the update.
--Mark.
* [PATCH 2/2] xfs: implement FIEMAP_FLAG_FREESPACE_*
2012-10-18 5:11 [RFC, PATCH 0/2] fiemap: filesystem free space mapping Dave Chinner
2012-10-18 5:11 ` [PATCH 1/2] fiemap: add freespace mapping to FS_IOC_FIEMAP Dave Chinner
@ 2012-10-18 5:11 ` Dave Chinner
2012-10-18 5:27 ` [RFC, PATCH 3/2] xfsprogs: space management tool Dave Chinner
` (2 subsequent siblings)
4 siblings, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2012-10-18 5:11 UTC (permalink / raw)
To: linux-fsdevel; +Cc: xfs
From: Dave Chinner <dchinner@redhat.com>
As you wish.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
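As a rough illustration of the addressing this implementation relies on,
the sketch below shows how a byte offset in the filesystem address space
splits into an AG number and an AG-relative block, and how a caller could
compute the byte window covering one AG. The helper names are
hypothetical. It assumes block-aligned offsets and that every AG spans
agblocks * blocksize bytes of address space; the last AG of a real
filesystem may be shorter, which is ignored here. The kernel code does
the equivalent conversion through xfs_daddr_to_agno() and
xfs_daddr_to_agbno().

```c
#include <assert.h>
#include <stdint.h>

/* Byte range covering a single allocation group. */
struct ag_window {
	uint64_t start;		/* byte offset of the AG's first block */
	uint64_t length;	/* size of the AG's address range in bytes */
};

/* Hypothetical helper: byte window to request for AG number 'agno'. */
struct ag_window
ag_byte_window(uint32_t agno, uint32_t agblocks, uint32_t blocksize)
{
	struct ag_window w;

	assert(agblocks > 0 && blocksize > 0);
	w.length = (uint64_t)agblocks * blocksize;
	w.start = w.length * agno;
	return w;
}

/* Hypothetical helper: AG number containing byte offset 'off'. */
uint32_t
byte_to_agno(uint64_t off, uint32_t agblocks, uint32_t blocksize)
{
	return (uint32_t)(off / ((uint64_t)agblocks * blocksize));
}

/* Hypothetical helper: block within its AG for byte offset 'off'. */
uint32_t
byte_to_agbno(uint64_t off, uint32_t agblocks, uint32_t blocksize)
{
	uint64_t aglen = (uint64_t)agblocks * blocksize;

	return (uint32_t)((off % aglen) / blocksize);
}
```

Because the mapping only returns free space from a single AG per call, a
caller that wants the whole filesystem would issue one request per
window, moving on to the next AG once FIEMAP_EXTENT_LAST is seen.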
fs/xfs/xfs_alloc.c | 219 ++++++++++++++++++++++++++++++++++++++++++++++++++++
fs/xfs/xfs_alloc.h | 7 ++
fs/xfs/xfs_iops.c | 12 ++-
3 files changed, 237 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_alloc.c b/fs/xfs/xfs_alloc.c
index 335206a..ee680c9 100644
--- a/fs/xfs/xfs_alloc.c
+++ b/fs/xfs/xfs_alloc.c
@@ -2470,3 +2470,222 @@ error0:
xfs_perag_put(args.pag);
return error;
}
+
+/*
+ * Walk the extents in the tree given by the cursor, and dump them all into the
+ * fieinfo. At the last extent in the tree, set the FIEMAP_EXTENT_LAST flag so
+ * that we return only free space from this tree in a given request.
+ */
+static int
+xfs_alloc_ag_freespace_map(
+ struct xfs_btree_cur *cur,
+ struct fiemap_extent_info *fieinfo,
+ xfs_agblock_t sagbno,
+ xfs_agblock_t eagbno)
+{
+ int error = 0;
+ int i = 1;
+
+ /*
+ * Loop until we have either filled the fiemap or reached the end of
+ * the AG walk.
+ */
+ while (i) {
+ xfs_agblock_t fbno;
+ xfs_extlen_t flen;
+ xfs_daddr_t dbno;
+ xfs_fileoff_t dlen;
+ int flags = 0;
+
+ error = xfs_alloc_get_rec(cur, &fbno, &flen, &i);
+ if (error)
+ break;
+ XFS_WANT_CORRUPTED_RETURN(i == 1);
+
+ /*
+ * move the cursor now to make it easy to continue the loop and
+ * detect the last extent in the lookup.
+ */
+ error = xfs_btree_increment(cur, 0, &i);
+ if (error)
+ break;
+
+ /* range check - must be wholly within requested range */
+ if (fbno < sagbno ||
+ (eagbno != NULLAGBLOCK && fbno + flen > eagbno)) {
+ xfs_warn(cur->bc_mp, "10: %d/%d, %d/%d", sagbno, eagbno, fbno, flen);
+ continue;
+ }
+
+ /*
+ * use daddr format for all range/len calculations as that is
+ * the format the range/len variables are supplied in by
+ * userspace.
+ */
+ dbno = XFS_AGB_TO_DADDR(cur->bc_mp, cur->bc_private.a.agno, fbno);
+ dlen = XFS_FSB_TO_BB(cur->bc_mp, flen);
+
+ if (i == 0)
+ flags |= FIEMAP_EXTENT_LAST;
+ error = -fiemap_fill_next_extent(fieinfo, BBTOB(dbno),
+ BBTOB(dbno), BBTOB(dlen), flags);
+ if (error)
+ break;
+ }
+ return error;
+
+}
+
+/*
+ * Map the freespace from the requested range in the requested order.
+ *
+ * To make things simple, this function will only return the freespace from a
+ * single AG regardless of the size of the map passed in. That AG will be the AG
+ * that the first freespace is found in. In other words, FIEMAP_EXTENT_LAST does
+ * not mean the last freespace extent has been mapped, just that the last extent
+ * in a given freespace index has been mapped. The caller is responsible for
+ * moving the range to the next freespace region if it needs to query for more
+ * information.
+ *
+ * IOWs, the caller is responsible for knowing about the XFS filesystem
+ * structure and how it indexes freespace to use this call effectively.
+ */
+#define XFS_FREESP_FLAGS (FIEMAP_FLAG_FREESPACE | FIEMAP_FLAG_FREESPACE_SIZE)
+int
+xfs_alloc_freespace_map(
+ struct xfs_mount *mp,
+ struct fiemap_extent_info *fieinfo,
+ u64 start,
+ u64 length)
+{
+ struct xfs_btree_cur *cur;
+ struct xfs_buf *agbp;
+ struct xfs_perag *pag;
+ xfs_agnumber_t agno;
+ xfs_agnumber_t sagno;
+ xfs_agblock_t sagbno;
+ xfs_agnumber_t eagno;
+ xfs_agblock_t eagbno;
+ bool bycnt;
+ int error = 0;
+
+ /* can only have one type of mapping */
+ if ((fieinfo->fi_flags & XFS_FREESP_FLAGS) == XFS_FREESP_FLAGS) {
+ xfs_warn(mp, "1: 0x%x\n", fieinfo->fi_flags);
+ return EINVAL;
+ }
+ bycnt = (fieinfo->fi_flags & FIEMAP_FLAG_FREESPACE_SIZE);
+
+ if (XFS_B_TO_FSB(mp, start) >= mp->m_sb.sb_dblocks) {
+ xfs_warn(mp, "2: %lld, %lld/%lld\n", start,
+ XFS_B_TO_FSB(mp, start), mp->m_sb.sb_dblocks);
+ return EINVAL;
+ }
+ if (length < mp->m_sb.sb_blocksize) {
+ xfs_warn(mp, "3: %lld, %d\n", length, mp->m_sb.sb_blocksize);
+ return EINVAL;
+ }
+ if (start + length < start) {
+ xfs_warn(mp, "4: %lld/%lld, %lld", start, length, start + length);
+ return EINVAL;
+ }
+
+ sagno = xfs_daddr_to_agno(mp, BTOBB(start));
+ sagbno = xfs_daddr_to_agbno(mp, BTOBB(start));
+
+ eagno = xfs_daddr_to_agno(mp, BTOBB(start + length));
+ eagbno = xfs_daddr_to_agbno(mp, BTOBB(start + length));
+
+ if (sagno == eagno && sagbno == eagbno) {
+ xfs_warn(mp, "5: %d/%d, %d/%d", sagno, eagno, sagbno, eagbno);
+ return EINVAL;
+ }
+
+ /*
+ * Force out the log. This means any transactions that might have freed
+ * space before we took the AGF buffer lock are now on disk, and the
+ * volatile disk cache is flushed.
+ */
+ xfs_log_force(mp, XFS_LOG_SYNC);
+
+ /*
+ * Do initial lookup in by-bno tree. Keep skipping AGs until we
+ * either find a free space extent or reach the end of the search.
+ */
+ for (agno = sagno; agno < eagno; agno++) {
+ int i;
+ error = 0;
+
+ error = xfs_alloc_read_agf(mp, NULL, agno, 0, &agbp);
+ if (error || !agbp) {
+ xfs_warn(mp, "7: %p, %d", agbp, error);
+ goto next;
+ }
+
+ pag = xfs_perag_get(mp, agno);
+ if (pag->pagf_freeblks <= pag->pagf_flcount) {
+ /* no free space worth reporting */
+ xfs_warn(mp, "6: %d %d", pag->pagf_freeblks,
+ pag->pagf_flcount);
+ goto put_agbp;
+ }
+
+ cur = xfs_allocbt_init_cursor(mp, NULL, agbp, agno,
+ XFS_BTNUM_BNO);
+ error = xfs_alloc_lookup_ge(cur, 0, sagbno, &i);
+ if (error) {
+ xfs_warn(mp, "8: %d/%d, %d/%d", sagno, eagno, sagbno, eagbno);
+ goto del_cursor;
+ }
+ XFS_WANT_CORRUPTED_GOTO(i == 1, del_cursor);
+
+ if (!bycnt) {
+ /*
+ * if we are doing a bno ordered lookup, we can just
+ * loop across the free space extents formatting them
+ * until we get to the end of the AG, eagbno or fill the
+ * fieinfo map.
+ */
+ error = xfs_alloc_ag_freespace_map(cur, fieinfo, sagbno,
+ agno == eagno ? eagbno : NULLAGBLOCK);
+ } else {
+ /*
+ * We are doing a size ordered lookup. We know there is
+ * a free space extent somewhere past our start bno, so
+ * just kill the current cursor and start a size
+ * ordered scan to find all the freespace in the given
+ * range.
+ */
+ xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+ cur = xfs_allocbt_init_cursor(mp, NULL, agbp, agno,
+ XFS_BTNUM_CNT);
+ error = xfs_alloc_lookup_ge(cur, 0, 1, &i);
+ if (error)
+ goto del_cursor;
+ XFS_WANT_CORRUPTED_GOTO(i == 1, del_cursor);
+
+ error = xfs_alloc_ag_freespace_map(cur, fieinfo, sagbno,
+ agno == eagno ? eagbno : NULLAGBLOCK);
+ }
+
+del_cursor:
+ xfs_btree_del_cursor(cur, error < 0 ? XFS_BTREE_ERROR
+ : XFS_BTREE_NOERROR);
+put_agbp:
+ xfs_perag_put(pag);
+ xfs_buf_relse(agbp);
+next:
+ if (error)
+ break;
+ sagbno = 0;
+ }
+
+ /*
+ * negative errno indicates that we hit a FIEMAP_EXTENT_LAST flag. Clear
+ * the error in that case.
+ */
+ if (error < 0)
+ error = 0;
+
+ return error;
+}
diff --git a/fs/xfs/xfs_alloc.h b/fs/xfs/xfs_alloc.h
index feacb06..371b02c 100644
--- a/fs/xfs/xfs_alloc.h
+++ b/fs/xfs/xfs_alloc.h
@@ -231,4 +231,11 @@ xfs_alloc_get_rec(
xfs_extlen_t *len, /* output: length of extent */
int *stat); /* output: success/failure */
+int
+xfs_alloc_freespace_map(
+ struct xfs_mount *mp,
+ struct fiemap_extent_info *fieinfo,
+ u64 start,
+ u64 length);
+
#endif /* __XFS_ALLOC_H__ */
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 4e00cf0..4555525 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -938,7 +938,9 @@ xfs_vn_update_time(
return -xfs_trans_commit(tp, 0);
}
-#define XFS_FIEMAP_FLAGS (FIEMAP_FLAG_SYNC|FIEMAP_FLAG_XATTR)
+#define XFS_FIEMAP_FLAGS (FIEMAP_FLAG_SYNC | FIEMAP_FLAG_XATTR | \
+ FIEMAP_FLAG_FREESPACE | \
+ FIEMAP_FLAG_FREESPACE_SIZE)
/*
* Call fiemap helper to fill in user data.
@@ -997,6 +999,13 @@ xfs_vn_fiemap(
if (error)
return error;
+ if ((fieinfo->fi_flags &
+ (FIEMAP_FLAG_FREESPACE | FIEMAP_FLAG_FREESPACE_SIZE))) {
+ error = xfs_alloc_freespace_map(ip->i_mount, fieinfo,
+ start, length);
+ goto out;
+ }
+
/* Set up bmap header for xfs internal routine */
bm.bmv_offset = BTOBB(start);
/* Special case for whole file */
@@ -1017,6 +1026,7 @@ xfs_vn_fiemap(
bm.bmv_iflags |= BMV_IF_DELALLOC;
error = xfs_getbmap(ip, &bm, xfs_fiemap_format, fieinfo);
+out:
if (error)
return -error;
--
1.7.10
* [RFC, PATCH 3/2] xfsprogs: space management tool
2012-10-18 5:11 [RFC, PATCH 0/2] fiemap: filesystem free space mapping Dave Chinner
2012-10-18 5:11 ` [PATCH 1/2] fiemap: add freespace mapping to FS_IOC_FIEMAP Dave Chinner
2012-10-18 5:11 ` [PATCH 2/2] xfs: implement FIEMAP_FLAG_FREESPACE_* Dave Chinner
@ 2012-10-18 5:27 ` Dave Chinner
2012-10-18 8:10 ` [RFC, PATCH 0/2] fiemap: filesystem free space mapping Andreas Dilger
2012-10-23 12:30 ` Christoph Hellwig
4 siblings, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2012-10-18 5:27 UTC (permalink / raw)
To: linux-fsdevel; +Cc: xfs
From: Dave Chinner <dchinner@redhat.com>
xfs_spaceman is intended as a diagnostic and control tool for space
management operations within XFS. Operations like examining free
space, managing allocation policies, issuing block discards on free
space, etc.
The tool is modelled on the xfs_io interface, allowing both
interactive and command line control of the tool, enabling it to be
used in scripts and automated management tools.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
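A note on the mapping loop in freesp.c: with size-ordered output the
extents in a batch are not sorted by offset, so the tool has to remember
the highest end offset it has seen in order to know where the next
request should start. The sketch below isolates that step. The function
name freesp_next_start() is hypothetical and only mirrors what scan_ag()
does inline; it is not a claim about how a final version would look.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <linux/fiemap.h>

/*
 * Hypothetical helper: scan one batch of returned extents and work out the
 * byte offset the next request should start from. 'cur' is the start offset
 * used for the current request. '*last' is set to 1 if an extent carrying
 * FIEMAP_EXTENT_LAST was seen, in which case the walk of this free space
 * index is finished.
 */
uint64_t
freesp_next_start(const struct fiemap_extent *ext, uint32_t nr,
		  uint64_t cur, int *last)
{
	uint32_t i;

	assert(last != NULL);
	*last = 0;

	for (i = 0; i < nr; i++) {
		uint64_t end = ext[i].fe_logical + ext[i].fe_length;

		/* size order is not offset order, so track the maximum */
		if (end > cur)
			cur = end;

		if (ext[i].fe_flags & FIEMAP_EXTENT_LAST) {
			*last = 1;
			break;
		}
	}
	return cur;
}
```

The same loop works unchanged for offset-ordered output, where the
maximum is simply the end of the final extent in the batch.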
Makefile | 3 +-
spaceman/Makefile | 34 +++++
spaceman/file.c | 149 +++++++++++++++++++++
spaceman/freesp.c | 377 +++++++++++++++++++++++++++++++++++++++++++++++++++++
spaceman/init.c | 117 +++++++++++++++++
spaceman/init.h | 24 ++++
spaceman/space.h | 37 ++++++
7 files changed, 740 insertions(+), 1 deletion(-)
diff --git a/Makefile b/Makefile
index c40fb2c..a81b8b2 100644
--- a/Makefile
+++ b/Makefile
@@ -41,7 +41,7 @@ endif
LIB_SUBDIRS = libxfs libxlog libxcmd libhandle libdisk
TOOL_SUBDIRS = copy db estimate fsck fsr growfs io logprint mkfs quota \
- mdrestore repair rtcp m4 man doc po debian
+ mdrestore repair rtcp m4 man doc po debian spaceman
SUBDIRS = include $(LIB_SUBDIRS) $(TOOL_SUBDIRS)
@@ -62,6 +62,7 @@ io: libxcmd libhandle
mkfs: libxfs
quota: libxcmd
repair: libxfs libxlog
+spaceman: libxcmd
ifneq ($(ENABLE_BLKID), yes)
mkfs: libdisk
diff --git a/spaceman/Makefile b/spaceman/Makefile
new file mode 100644
index 0000000..612d36b
--- /dev/null
+++ b/spaceman/Makefile
@@ -0,0 +1,34 @@
+#
+# Copyright (c) 2012 Red Hat, Inc. All Rights Reserved.
+#
+
+TOPDIR = ..
+include $(TOPDIR)/include/builddefs
+
+LTCOMMAND = xfs_spaceman
+HFILES = init.h space.h
+CFILES = init.c \
+ file.c freesp.c
+
+LLDLIBS = $(LIBXCMD)
+LTDEPENDENCIES = $(LIBXCMD)
+LLDFLAGS = -static
+
+ifeq ($(ENABLE_READLINE),yes)
+LLDLIBS += $(LIBREADLINE) $(LIBTERMCAP)
+endif
+
+ifeq ($(ENABLE_EDITLINE),yes)
+LLDLIBS += $(LIBEDITLINE) $(LIBTERMCAP)
+endif
+
+default: depend $(LTCOMMAND)
+
+include $(BUILDRULES)
+
+install: default
+ $(INSTALL) -m 755 -d $(PKG_SBIN_DIR)
+ $(LTINSTALL) -m 755 $(LTCOMMAND) $(PKG_SBIN_DIR)
+install-dev:
+
+-include .dep
diff --git a/spaceman/file.c b/spaceman/file.c
new file mode 100644
index 0000000..ea4ab0c
--- /dev/null
+++ b/spaceman/file.c
@@ -0,0 +1,149 @@
+/*
+ * Copyright (c) 2004-2005 Silicon Graphics, Inc.
+ * Copyright (c) 2012 Red Hat, Inc.
+ * All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <xfs/xfs.h>
+#include <xfs/command.h>
+#include <xfs/input.h>
+#include <sys/mman.h>
+#include "init.h"
+#include "space.h"
+
+static cmdinfo_t print_cmd;
+
+fileio_t *filetable;
+int filecount;
+fileio_t *file;
+
+static void
+print_fileio(
+ fileio_t *file,
+ int index,
+ int braces)
+{
+ printf(_("%c%03d%c %-14s (%s,%s,%s%s%s)\n"),
+ braces? '[' : ' ', index, braces? ']' : ' ', file->name,
+ file->flags & O_SYNC ? _("sync") : _("non-sync"),
+ file->flags & O_DIRECT ? _("direct") : _("non-direct"),
+ (file->flags & O_ACCMODE) == O_RDONLY ? _("read-only") : _("read-write"),
+ file->flags & O_APPEND ? _(",append-only") : "",
+ file->flags & O_NONBLOCK ? _(",non-block") : "");
+}
+
+int
+filelist_f(void)
+{
+ int i;
+
+ for (i = 0; i < filecount; i++)
+ print_fileio(&filetable[i], i, &filetable[i] == file);
+ return 0;
+}
+
+static int
+print_f(
+ int argc,
+ char **argv)
+{
+ filelist_f();
+ return 0;
+}
+
+int
+openfile(
+ char *path,
+ xfs_fsop_geom_t *geom,
+ int flags,
+ mode_t mode)
+{
+ int fd;
+
+ fd = open(path, flags, mode);
+ if (fd < 0) {
+ if ((errno == EISDIR) && (flags & O_RDWR)) {
+ /* make it as if we asked for O_RDONLY & try again */
+ flags &= ~O_RDWR;
+ flags |= O_RDONLY;
+ fd = open(path, flags, mode);
+ if (fd < 0) {
+ perror(path);
+ return -1;
+ }
+ } else {
+ perror(path);
+ return -1;
+ }
+ }
+
+ if (xfsctl(path, fd, XFS_IOC_FSGEOMETRY, geom) < 0) {
+ perror("XFS_IOC_FSGEOMETRY");
+ close(fd);
+ return -1;
+ }
+ return fd;
+}
+
+int
+addfile(
+ char *name,
+ int fd,
+ xfs_fsop_geom_t *geometry,
+ int flags)
+{
+ char *filename;
+
+ filename = strdup(name);
+ if (!filename) {
+ perror("strdup");
+ close(fd);
+ return -1;
+ }
+
+ /* Extend the table of currently open files */
+ filetable = (fileio_t *)realloc(filetable, /* growing */
+ ++filecount * sizeof(fileio_t));
+ if (!filetable) {
+ perror("realloc");
+ filecount = 0;
+ free(filename);
+ close(fd);
+ return -1;
+ }
+
+ /* Finally, make this the new active open file */
+ file = &filetable[filecount - 1];
+ file->fd = fd;
+ file->flags = flags;
+ file->name = filename;
+ file->geom = *geometry;
+ return 0;
+}
+
+void
+file_init(void)
+{
+ print_cmd.name = "print";
+ print_cmd.altname = "p";
+ print_cmd.cfunc = print_f;
+ print_cmd.argmin = 0;
+ print_cmd.argmax = 0;
+ print_cmd.flags = CMD_FLAG_GLOBAL;
+ print_cmd.oneline = _("list current open files");
+
+ add_command(&print_cmd);
+}
diff --git a/spaceman/freesp.c b/spaceman/freesp.c
new file mode 100644
index 0000000..bfc93c9
--- /dev/null
+++ b/spaceman/freesp.c
@@ -0,0 +1,377 @@
+/*
+ * Copyright (c) 2000-2001,2005 Silicon Graphics, Inc.
+ * Copyright (c) 2012 Red Hat, Inc.
+ * All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <xfs/xfs.h>
+#include <xfs/xfs_types.h>
+#include <xfs/command.h>
+#include <linux/fs.h>
+#include <linux/fiemap.h>
+#include "init.h"
+#include "space.h"
+
+#ifndef FIEMAP_FLAG_FREESPACE
+#define FIEMAP_FLAG_FREESPACE 0x4
+#define FIEMAP_FLAG_FREESPACE_SIZE 0x8
+#endif
+
+typedef struct histent
+{
+ int low;
+ int high;
+ long long count;
+ long long blocks;
+} histent_t;
+
+static int agcount;
+static xfs_agnumber_t *aglist;
+static int countflag;
+static int dumpflag;
+static int equalsize;
+static histent_t *hist;
+static int histcount;
+static int multsize;
+static int seen1;
+static int summaryflag;
+static long long totblocks;
+static long long totexts;
+
+static cmdinfo_t freesp_cmd;
+
+static void
+addhistent(
+ int h)
+{
+ hist = realloc(hist, (histcount + 1) * sizeof(*hist));
+ if (h == 0)
+ h = 1;
+ hist[histcount].low = h;
+ hist[histcount].count = hist[histcount].blocks = 0;
+ histcount++;
+ if (h == 1)
+ seen1 = 1;
+}
+
+static void
+addtohist(
+ xfs_agnumber_t agno,
+ xfs_agblock_t agbno,
+ off64_t len)
+{
+ int i;
+
+ if (dumpflag)
+ printf("%8d %8d %8lld\n", agno, agbno, (long long)len);
+ totexts++;
+ totblocks += len;
+ for (i = 0; i < histcount; i++) {
+ if (hist[i].high >= len) {
+ hist[i].count++;
+ hist[i].blocks += len;
+ break;
+ }
+ }
+}
+
+static int
+hcmp(
+ const void *a,
+ const void *b)
+{
+ return ((histent_t *)a)->low - ((histent_t *)b)->low;
+}
+
+static void
+histinit(
+ int maxlen)
+{
+ int i;
+
+ if (equalsize) {
+ for (i = 1; i < maxlen; i += equalsize)
+ addhistent(i);
+ } else if (multsize) {
+ for (i = 1; i < maxlen; i *= multsize)
+ addhistent(i);
+ } else {
+ if (!seen1)
+ addhistent(1);
+ qsort(hist, histcount, sizeof(*hist), hcmp);
+ }
+ for (i = 0; i < histcount; i++) {
+ if (i < histcount - 1)
+ hist[i].high = hist[i + 1].low - 1;
+ else
+ hist[i].high = maxlen;
+ }
+}
+
+static void
+printhist(void)
+{
+ int i;
+
+ printf("%7s %7s %7s %7s %6s\n",
+ _("from"), _("to"), _("extents"), _("blocks"), _("pct"));
+ for (i = 0; i < histcount; i++) {
+ if (hist[i].count)
+ printf("%7d %7d %7lld %7lld %6.2f\n", hist[i].low,
+ hist[i].high, hist[i].count, hist[i].blocks,
+ hist[i].blocks * 100.0 / totblocks);
+ }
+}
+
+static int
+inaglist(
+ xfs_agnumber_t agno)
+{
+ int i;
+
+ if (agcount == 0)
+ return 1;
+ for (i = 0; i < agcount; i++)
+ if (aglist[i] == agno)
+ return 1;
+ return 0;
+}
+
+#define NR_EXTENTS 128
+
+static void
+scan_ag(
+ xfs_agnumber_t agno)
+{
+ struct fiemap *fiemap;
+ off64_t blocksize = file->geom.blocksize;
+ uint64_t last_logical = agno * file->geom.agblocks * blocksize;
+ uint64_t length = file->geom.agblocks * blocksize;
+ off64_t fsbperag;
+ int fiemap_flags;
+ int last = 0;
+ int map_size;
+
+
+ last_logical = (off64_t)file->geom.agblocks * blocksize * agno;
+ length = (off64_t)file->geom.agblocks * blocksize;
+ fsbperag = (off64_t)file->geom.agblocks * blocksize;
+
+ map_size = sizeof(struct fiemap) +
+ sizeof(struct fiemap_extent) * NR_EXTENTS;
+ fiemap = malloc(map_size);
+ if (!fiemap) {
+ fprintf(stderr, _("%s: fiemap malloc failed.\n"), progname);
+ exitcode = 1;
+ return;
+ }
+ if (countflag)
+ fiemap_flags = FIEMAP_FLAG_FREESPACE_SIZE;
+ else
+ fiemap_flags = FIEMAP_FLAG_FREESPACE;
+
+ while (!last) {
+ xfs_agblock_t agbno;
+ int ret;
+ int i;
+
+ memset(fiemap, 0, map_size);
+ fiemap->fm_flags = fiemap_flags;
+ fiemap->fm_start = last_logical;
+ fiemap->fm_length = length;
+ fiemap->fm_extent_count = NR_EXTENTS;
+
+ ret = ioctl(file->fd, FS_IOC_FIEMAP, (unsigned long)fiemap);
+ if (ret < 0) {
+ fprintf(stderr, "%s: ioctl(FS_IOC_FIEMAP) [\"%s\"]: "
+ "%s\n", progname, file->name, strerror(errno));
+ free(fiemap);
+ exitcode = 1;
+ return;
+ }
+
+ /* No more extents to map, exit */
+ if (!fiemap->fm_mapped_extents)
+ break;
+
+ for (i = 0; i < fiemap->fm_mapped_extents; i++) {
+ struct fiemap_extent *extent;
+ off64_t aglen;
+
+ extent = &fiemap->fm_extents[i];
+
+
+ agbno = (extent->fe_physical - (fsbperag * agno)) /
+ blocksize;
+ aglen = extent->fe_length / blocksize;
+
+ addtohist(agno, agbno, aglen);
+
+ /*
+ * We have to keep track of the highest offset extent we
+ * see when getting size ordered free space, so just do
+ * it for all extents we get.
+ */
+ last_logical = max(last_logical,
+ extent->fe_logical + extent->fe_length);
+
+ if (extent->fe_flags & FIEMAP_EXTENT_LAST) {
+ last = 1;
+ break;
+ }
+ }
+ }
+}
+static void
+aglistadd(
+ char *a)
+{
+ aglist = realloc(aglist, (agcount + 1) * sizeof(*aglist));
+ aglist[agcount] = (xfs_agnumber_t)atoi(a);
+ agcount++;
+}
+
+static int
+init(
+ int argc,
+ char **argv)
+{
+ int c;
+ int speced = 0;
+
+ agcount = countflag = dumpflag = equalsize = multsize = optind = 0;
+ histcount = seen1 = summaryflag = 0;
+ totblocks = totexts = 0;
+ aglist = NULL;
+ hist = NULL;
+ while ((c = getopt(argc, argv, "a:bcde:h:m:s")) != EOF) {
+ switch (c) {
+ case 'a':
+ aglistadd(optarg);
+ break;
+ case 'b':
+ if (speced)
+ return 0;
+ multsize = 2;
+ speced = 1;
+ break;
+ case 'c':
+ countflag = 1;
+ break;
+ case 'd':
+ dumpflag = 1;
+ break;
+ case 'e':
+ if (speced)
+ return 0;
+ equalsize = atoi(optarg);
+ speced = 1;
+ break;
+ case 'h':
+ if (speced && !histcount)
+ return 0;
+ addhistent(atoi(optarg));
+ speced = 1;
+ break;
+ case 'm':
+ if (speced)
+ return 0;
+ multsize = atoi(optarg);
+ speced = 1;
+ break;
+ case 's':
+ summaryflag = 1;
+ break;
+ case '?':
+ return 0;
+ }
+ }
+ if (optind != argc)
+ return 0;
+ if (!speced)
+ multsize = 2;
+ histinit(file->geom.agblocks);
+ return 1;
+}
+
+/*
+ * Report on freespace usage in xfs filesystem.
+ */
+static int
+freesp_f(
+ int argc,
+ char **argv)
+{
+ xfs_agnumber_t agno;
+
+ if (!init(argc, argv))
+ return 0;
+ for (agno = 0; agno < file->geom.agcount; agno++) {
+ if (inaglist(agno))
+ scan_ag(agno);
+ }
+ if (histcount)
+ printhist();
+ if (summaryflag) {
+ printf(_("total free extents %lld\n"), totexts);
+ printf(_("total free blocks %lld\n"), totblocks);
+ printf(_("average free extent size %g\n"),
+ (double)totblocks / (double)totexts);
+ }
+ if (aglist)
+ free(aglist);
+ if (hist)
+ free(hist);
+ return 0;
+}
+
+static void
+freesp_help(void)
+{
+ printf(_(
+"\n"
+"Examine filesystem free space\n"
+"\n"
+"Options: [-bcds] [-a agno] [-e bsize] [-h h1]... [-m bmult]\n"
+"\n"
+" -b -- binary histogram bin size\n"
+" -c -- scan the by-count (size) ordered freespace tree\n"
+" -d -- debug output\n"
+" -s -- emit freespace summary information\n"
+" -a agno -- scan only the given AG agno\n"
+" -e bsize -- use fixed histogram bin size of bsize\n"
+" -h h1 -- use custom histogram bin size of h1. Multiple specifications allowed.\n"
+" -m bmult -- use histogram bin size multiplier of bmult\n"
+"\n"));
+
+}
+
+void
+freesp_init(void)
+{
+ freesp_cmd.name = "freesp";
+ freesp_cmd.altname = "fsp";
+ freesp_cmd.cfunc = freesp_f;
+ freesp_cmd.argmin = 0;
+ freesp_cmd.argmax = -1;
+ freesp_cmd.args = "[-bcds] [-a agno] [-e bsize] [-h h1]... [-m bmult]";
+ freesp_cmd.flags = CMD_FLAG_GLOBAL;
+ freesp_cmd.oneline = _("Examine filesystem free space");
+ freesp_cmd.help = freesp_help;
+
+ add_command(&freesp_cmd);
+}
+
diff --git a/spaceman/init.c b/spaceman/init.c
new file mode 100644
index 0000000..108dcd7
--- /dev/null
+++ b/spaceman/init.c
@@ -0,0 +1,117 @@
+/*
+ * Copyright (c) 2012 Red Hat, Inc.
+ * All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include <xfs/xfs.h>
+#include <xfs/command.h>
+#include <xfs/input.h>
+#include "init.h"
+#include "space.h"
+
+char *progname;
+int exitcode;
+
+void
+usage(void)
+{
+ fprintf(stderr,
+ _("Usage: %s [-c cmd] file\n"),
+ progname);
+ exit(1);
+}
+
+static void
+init_commands(void)
+{
+ file_init();
+ freesp_init();
+ help_init();
+ quit_init();
+}
+
+static int
+init_args_command(
+ int index)
+{
+ if (index >= filecount)
+ return 0;
+ file = &filetable[index++];
+ return index;
+}
+
+static int
+init_check_command(
+ const cmdinfo_t *ct)
+{
+ if (!(ct->flags & CMD_FLAG_GLOBAL))
+ return 0;
+ return 1;
+}
+
+void
+init(
+ int argc,
+ char **argv)
+{
+ int c, flags = 0;
+ mode_t mode = 0600;
+ xfs_fsop_geom_t geometry = { 0 };
+
+ progname = basename(argv[0]);
+ setlocale(LC_ALL, "");
+ bindtextdomain(PACKAGE, LOCALEDIR);
+ textdomain(PACKAGE);
+
+ while ((c = getopt(argc, argv, "c:V")) != EOF) {
+ switch (c) {
+ case 'c':
+ add_user_command(optarg);
+ break;
+ case 'V':
+ printf(_("%s version %s\n"), progname, VERSION);
+ exit(0);
+ default:
+ usage();
+ }
+ }
+
+ while (optind < argc) {
+ if ((c = openfile(argv[optind], &geometry, flags, mode)) < 0)
+ exit(1);
+ if (!platform_test_xfs_fd(c)) {
+ printf(_("Not an XFS filesystem!\n"));
+ exit(1);
+ }
+ if (addfile(argv[optind], c, &geometry, flags) < 0)
+ exit(1);
+ optind++;
+ }
+
+ init_commands();
+ add_args_command(init_args_command);
+ add_check_command(init_check_command);
+}
+
+int
+main(
+ int argc,
+ char **argv)
+{
+ init(argc, argv);
+ command_loop();
+ return exitcode;
+}
diff --git a/spaceman/init.h b/spaceman/init.h
new file mode 100644
index 0000000..ecd0b5d
--- /dev/null
+++ b/spaceman/init.h
@@ -0,0 +1,24 @@
+/*
+ * Copyright (c) 2012 Red Hat, Inc.
+ * All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+extern char *progname;
+extern int exitcode;
+
+#define min(a,b) (((a)<(b))?(a):(b))
+#define max(a,b) (((a)>(b))?(a):(b))
+
diff --git a/spaceman/space.h b/spaceman/space.h
new file mode 100644
index 0000000..c6a63fe
--- /dev/null
+++ b/spaceman/space.h
@@ -0,0 +1,37 @@
+/*
+ * Copyright (c) 2012 Red Hat, Inc.
+ * All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+typedef struct fileio {
+ int fd; /* open file descriptor */
+ int flags; /* flags describing file state */
+ char *name; /* file name at time of open */
+ xfs_fsop_geom_t geom; /* XFS filesystem geometry */
+} fileio_t;
+
+extern fileio_t *filetable; /* open file table */
+extern int filecount; /* number of open files */
+extern fileio_t *file; /* active file in file table */
+extern int filelist_f(void);
+
+extern int openfile(char *, xfs_fsop_geom_t *, int, mode_t);
+extern int addfile(char *, int , xfs_fsop_geom_t *, int);
+
+extern void file_init(void);
+extern void help_init(void);
+extern void quit_init(void);
+extern void freesp_init(void);
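
As a reading aid for the extent loop in scan_ag() above: each pass restarts the
mapping at the highest end offset seen so far, which is what keeps the cursor
correct when extents come back ordered by size rather than by offset. A minimal
sketch of that bookkeeping follows. The struct fs_extent type and the
next_start() helper are hypothetical names used only for illustration; they
stand in for the fe_logical and fe_length fields of struct fiemap_extent and
are not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two struct fiemap_extent fields used here. */
struct fs_extent {
	uint64_t logical;	/* byte offset of the extent start */
	uint64_t length;	/* byte length of the extent */
};

/*
 * Return the offset at which the next mapping request should start:
 * the highest end offset seen so far. Taking the maximum keeps the
 * cursor correct even when extents arrive ordered by size, not offset.
 */
static uint64_t
next_start(uint64_t cur, const struct fs_extent *ext, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++) {
		uint64_t end = ext[i].logical + ext[i].length;

		if (end > cur)
			cur = end;
	}
	return cur;
}
```

The cursor never moves backwards, so a batch that contains only extents below
the current offset leaves it unchanged.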
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [RFC, PATCH 0/2] fiemap: filesystem free space mapping
2012-10-18 5:11 [RFC, PATCH 0/2] fiemap: filesystem free space mapping Dave Chinner
` (2 preceding siblings ...)
2012-10-18 5:27 ` [RFC, PATCH 3/2] xfsprogs: space management tool Dave Chinner
@ 2012-10-18 8:10 ` Andreas Dilger
2012-10-18 21:07 ` Dave Chinner
2012-10-23 12:30 ` Christoph Hellwig
4 siblings, 1 reply; 15+ messages in thread
From: Andreas Dilger @ 2012-10-18 8:10 UTC (permalink / raw)
To: Dave Chinner; +Cc: linux-fsdevel, xfs
On 2012-10-17, at 11:11 PM, Dave Chinner wrote:
> So, I was bored a few days ago, and I was sick of having to run
> xfs_db incorrectly report free space extents when the filesystem is
> mounted, so I decided to extend fiemap to export freespace mappings
> to userspace so I could get the information coherently through the
> mounted filesystem.
>
> Yes, this could probably be considered interface abuse but, well, it
> was simple to do because extent mapping is exactly what fiemap is
> designed to do. Hence I didn't have to write new walkers/formatters
> and I was using code I knew worked correctly.
One question about the usage of this interface - is the ioctl()
called on an open fd for the root inode, or is it called on any
open fd in the filesystem? In some sense, getting the free space
on the root (or preferably block dev inode if that would work)
would make the most sense, since FIEMAP is intended to be related
to a specific file.
That said, it is a lot easier to use if it can be on any open file
handle in the filesystem, and one could consider the free space as
being related to every file in the filesystem (e.g. for the next
block allocation or defrag migration).
> There are two methods of mapping - one is reporting free space in
> ascending extent start offset order, the other in ascending extent
> length order. Both are useful to have (e.g. a defragmenter might want to
> know about the nearest free block to given offset or the largest
> free extent in a given region). Either way, XFS keeps indexes
> ordered in both ways, so they can be exported directly with minimal
> overhead.
>
> The only "interesting" abuse of the interface is really the use of
> FIEMAP_EXTENT_LAST. This means that the last extent in a freespace
> index is being returned, rather than the last freespace extent. This
> is done because filesystems often have multiple free space indexes,
> and it may be difficult to sort/scan over multiple indexes in a
> single map.
I'm not sure I understand the distinction you are trying to convey here.
Could you elaborate?
> This means an application needs to keep track of what freespace has
> been returned to it and adjust its fiemap ranges appropriately, or
> be aware of the underlying filesystem structure to form requests that
> don't span free space indexes. I don't see this as a big problem,
> because any application that is digging in freespace maps needs to
> know how the filesystem is structured to make any sense of the
> information returned. As such, I see this interface purely for
> filesystem diagnostics or utilities tightly bound to the filesystem
> (e.g. xfs_fsr).
>
> I'll attach a patch for a small utility that uses this interface to
> replicate the xfs_db freespace command in a short while so people
> can see how it is used. That should make it easier to comment on. :)
>
> Cheers,
>
> Dave.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
Cheers, Andreas
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC, PATCH 0/2] fiemap: filesystem free space mapping
2012-10-18 8:10 ` [RFC, PATCH 0/2] fiemap: filesystem free space mapping Andreas Dilger
@ 2012-10-18 21:07 ` Dave Chinner
0 siblings, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2012-10-18 21:07 UTC (permalink / raw)
To: Andreas Dilger; +Cc: linux-fsdevel, xfs
On Thu, Oct 18, 2012 at 02:10:59AM -0600, Andreas Dilger wrote:
> On 2012-10-17, at 11:11 PM, Dave Chinner wrote:
> > So, I was bored a few days ago, and I was sick of having to run
> > xfs_db incorrectly report free space extents when the filesystem is
> > mounted, so I decided to extend fiemap to export freespace mappings
> > to userspace so I could get the information coherently through the
> > mounted filesystem.
> >
> > Yes, this could probably be considered interface abuse but, well, it
> > was simple to do because extent mapping is exactly what fiemap is
> > designed to do. Hence I didn't have to write new walkers/formatters
> > and I was using code I knew worked correctly.
>
> One question about the usage of this interface - is the ioctl()
> called on an open fd for the root inode, or is it called on any
> open fd in the filesystem? In some sense, getting the free space
> on the root (or preferably block dev inode if that would work)
> would make the most sense, since FIEMAP is intended to be related
> to a specific file.
fiemap in XFS is currently only hooked up to files, not directories.
I didn't change that, so it needs an open regular file in the
filesystem to work. I need to change that for it to work on
directories - I think that having it work on the root dir of a
filesystem is the right thing to do, but really having it behave
like fstatfs(2) is where it should end up, I think.
> That said, it is a lot easier to use if it can be on any open file
> handle in the filesystem, and one could consider the free space as
> being related to every file in the filesystem (e.g. for the next
> block allocation or defrag migration).
*nod*
> > There are two methods of mapping - one is reporting free space in
> > ascending extent start offset order, the other in ascending extent
> > length order. Both are useful to have (e.g. a defragmenter might want to
> > know about the nearest free block to given offset or the largest
> > free extent in a given region). Either way, XFS keeps indexes
> > ordered in both ways, so they can be exported directly with minimal
> > overhead.
> >
> > The only "interesting" abuse of the interface is really the use of
> > FIEMAP_EXTENT_LAST. This means that the last extent in a freespace
> > index is being returned, rather than the last freespace extent. This
> > is done because filesystems often have multiple free space indexes,
> > and it may be difficult to sort/scan over multiple indexes in a
> > single map.
>
> I'm not sure I understand the distinction you are trying to convey here.
> Could you elaborate?
XFS has multiple Allocation Groups with separate indexes in each AG.
It only makes sense for filesystem tools to be finding free space in
a specific region (i.e. the AG they want to allocate in). xfs_fsr
already controls the AG that the new extents are allocated in, but
it has no idea of whether that is the best AG to relocate the data
to - it just follows the kernel allocation rules based on the
location of the inode. If we want to select a new AG based on, say,
largest free extent size, then we need to know what the largest
sizes in each AG are. Hence we want to know when we reach the end of
an AG index when pulling the freespace data out of the kernel so we
can categorise it by AG.
I suspect a similar thing might be useful for btrfs, with per-device
freespace mappings...
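
To make the per-AG categorisation concrete, here is a minimal sketch of
splitting a physical byte offset into an AG number and a block offset within
that AG. It assumes AGs are laid out back to back, each spanning agblocks
blocks of blocksize bytes, which is the same assumption the freesp tool in
patch 3/2 makes when it computes agbno. The struct ag_addr type and the
phys_to_ag() helper are hypothetical names for illustration only.

```c
#include <stdint.h>

/* Hypothetical result type: an AG number plus a block offset within it. */
struct ag_addr {
	uint32_t agno;	/* allocation group number */
	uint32_t agbno;	/* block number relative to the start of the AG */
};

/*
 * Split a physical byte offset into (agno, agbno), assuming AGs are laid
 * out back to back and each spans agblocks blocks of blocksize bytes.
 */
static struct ag_addr
phys_to_ag(uint64_t physical, uint32_t agblocks, uint32_t blocksize)
{
	uint64_t bytes_per_ag = (uint64_t)agblocks * blocksize;
	struct ag_addr a;

	a.agno = (uint32_t)(physical / bytes_per_ag);
	a.agbno = (uint32_t)((physical % bytes_per_ag) / blocksize);
	return a;
}
```

A tool could bucket returned extents by agno this way and start a new bucket
whenever FIEMAP_EXTENT_LAST marks the end of an AG index.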
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC, PATCH 0/2] fiemap: filesystem free space mapping
2012-10-18 5:11 [RFC, PATCH 0/2] fiemap: filesystem free space mapping Dave Chinner
` (3 preceding siblings ...)
2012-10-18 8:10 ` [RFC, PATCH 0/2] fiemap: filesystem free space mapping Andreas Dilger
@ 2012-10-23 12:30 ` Christoph Hellwig
2012-10-23 21:53 ` Dave Chinner
4 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2012-10-23 12:30 UTC (permalink / raw)
To: Dave Chinner; +Cc: linux-fsdevel, xfs
On Thu, Oct 18, 2012 at 04:11:17PM +1100, Dave Chinner wrote:
> So, I was bored a few days ago, and I was sick of having to run
> xfs_db incorrectly report free space extents when the filesystem is
> mounted, so I decided to extend fiemap to export freespace mappings
> to userspace so I could get the information coherently through the
> mounted filesystem.
>
> Yes, this could probably be considered interface abuse but, well, it
> was simple to do because extent mapping is exactly what fiemap is
> designed to do. Hence I didn't have to write new walkers/formatters
> and I was using code I knew worked correctly.
I think the right way to handle this is to introduce a new ioctl which
uses the same structures. That way we have a reasonable interface,
without issues like which file it needs to be called on, because the
VFS glue can turn it into a superblock op.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC, PATCH 0/2] fiemap: filesystem free space mapping
2012-10-23 12:30 ` Christoph Hellwig
@ 2012-10-23 21:53 ` Dave Chinner
2012-10-24 11:47 ` Chris Mason
0 siblings, 1 reply; 15+ messages in thread
From: Dave Chinner @ 2012-10-23 21:53 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-fsdevel, xfs
On Tue, Oct 23, 2012 at 08:30:44AM -0400, Christoph Hellwig wrote:
> On Thu, Oct 18, 2012 at 04:11:17PM +1100, Dave Chinner wrote:
> > So, I was bored a few days ago, and I was sick of having to run
> > xfs_db incorrectly report free space extents when the filesystem is
> > mounted, so I decided to extend fiemap to export freespace mappings
> > to userspace so I could get the information coherently through the
> > mounted filesystem.
> >
> > Yes, this could probably be considered interface abuse but, well, it
> > was simple to do because extent mapping is exactly what fiemap is
> > designed to do. Hence I didn't have to write new walkers/formatters
> > and I was using code I knew worked correctly.
>
> I think the right way to handle this is to introduce a new ioctl which
> uses the same structures. That way we have a reasonable interface,
> without issue like which file does it need to be called on because the
> VFS glue can turn it into a superblock op.
A VFS level ioctl or an XFS ioctl?
I thought about a new ioctl, but then what's the point of having an
extensible fiemap interface if we create new ioctls with an
identical interface for doing something that the existing ioctl is
perfectly capable of doing? I'd still need special flags to control
the ioctl behaviour even though it uses struct fiemap and plumbing,
so it seemed pointless to introduce a new ioctl....
As it is, the only reason fiemap doesn't work on directories
for XFS is that it hasn't been hooked up to them. I can't see
anything in the fiemap VFS layers that prevents us from mapping
directories and we know the mapping code in XFS works on
directories. So that would make the "what file" problem go away - any
file would do as long as the user has the permissions to run the
free space mapping command....
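
For reference, a minimal sketch of how a caller sizes and fills a struct
fiemap request, which is the part of the interface that stays the same
whichever fd or ioctl ends up carrying it. It uses only the existing
definitions in <linux/fiemap.h>; the fiemap_request() helper name is
hypothetical, and the freespace flags proposed in this series are not used
because they are not in any released header.

```c
#include <stdint.h>
#include <stdlib.h>
#include <linux/fiemap.h>

/*
 * Allocate and fill a fiemap request with room for nr_extents results.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static struct fiemap *
fiemap_request(uint64_t start, uint64_t length, uint32_t flags,
	       uint32_t nr_extents)
{
	size_t size = sizeof(struct fiemap) +
		      sizeof(struct fiemap_extent) * nr_extents;
	struct fiemap *fm = calloc(1, size);

	if (!fm)
		return NULL;
	fm->fm_start = start;
	fm->fm_length = length;
	fm->fm_flags = flags;
	fm->fm_extent_count = nr_extents;
	return fm;
}
```

The filled request would then be passed as ioctl(fd, FS_IOC_FIEMAP, fm), with
FS_IOC_FIEMAP taken from <linux/fs.h>; on return fm_mapped_extents says how
many entries of fm_extents[] are valid.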
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC, PATCH 0/2] fiemap: filesystem free space mapping
2012-10-23 21:53 ` Dave Chinner
@ 2012-10-24 11:47 ` Chris Mason
2012-10-24 12:32 ` Jie Liu
2012-10-24 15:09 ` Christoph Hellwig
0 siblings, 2 replies; 15+ messages in thread
From: Chris Mason @ 2012-10-24 11:47 UTC (permalink / raw)
To: Dave Chinner
Cc: Christoph Hellwig, linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com
On Tue, Oct 23, 2012 at 03:53:13PM -0600, Dave Chinner wrote:
> On Tue, Oct 23, 2012 at 08:30:44AM -0400, Christoph Hellwig wrote:
> > On Thu, Oct 18, 2012 at 04:11:17PM +1100, Dave Chinner wrote:
> > > So, I was bored a few days ago, and I was sick of having to run
> > > xfs_db incorrectly report free space extents when the filesystem is
> > > mounted, so I decided to extend fiemap to export freespace mappings
> > > to userspace so I could get the information coherently through the
> > > mounted filesystem.
> > >
> > > Yes, this could probably be considered interface abuse but, well, it
> > > was simple to do because extent mapping is exactly what fiemap is
> > > designed to do. Hence I didn't have to write new walkers/formatters
> > > and I was using code I knew worked correctly.
> >
> > I think the right way to handle this is to introduce a new ioctl which
> > uses the same structures. That way we have a reasonable interface,
> > without issue like which file does it need to be called on because the
> > VFS glue can turn it into a superblock op.
>
> A VFS level ioctl or an XFS ioctl?
>
> I thought about a new ioctl, but then what's the point of having an
> extensible fiemap interface if we create new ioctls with an
> identical interface for doing something that the existing ioctl is
> perfectly capable of doing? I'd still need special flags to control
> the ioctl behaviour even though it uses struct fiemap and plumbing,
> so it seemed pointless to introduce a new ioctl....
This brings us one step closer to the Norton Disk Doctor defrag display.
I'm all for it in the main fiemap call, it makes much more sense for the
users I think.
-chris
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC, PATCH 0/2] fiemap: filesystem free space mapping
2012-10-24 11:47 ` Chris Mason
@ 2012-10-24 12:32 ` Jie Liu
2012-10-24 15:09 ` Christoph Hellwig
1 sibling, 0 replies; 15+ messages in thread
From: Jie Liu @ 2012-10-24 12:32 UTC (permalink / raw)
To: Dave Chinner, Chris Mason, Christoph Hellwig,
linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com
On 10/24/12 19:47, Chris Mason wrote:
> On Tue, Oct 23, 2012 at 03:53:13PM -0600, Dave Chinner wrote:
>> On Tue, Oct 23, 2012 at 08:30:44AM -0400, Christoph Hellwig wrote:
>>> On Thu, Oct 18, 2012 at 04:11:17PM +1100, Dave Chinner wrote:
>>>> So, I was bored a few days ago, and I was sick of having to run
>>>> xfs_db incorrectly report free space extents when the filesystem is
>>>> mounted, so I decided to extend fiemap to export freespace mappings
>>>> to userspace so I could get the information coherently through the
>>>> mounted filesystem.
>>>>
>>>> Yes, this could probably be considered interface abuse but, well, it
>>>> was simple to do because extent mapping is exactly what fiemap is
>>>> designed to do. Hence I didn't have to write new walkers/formatters
>>>> and I was using code I knew worked correctly.
>>> I think the right way to handle this is to introduce a new ioctl which
>>> uses the same structures. That way we have a reasonable interface,
>>> without issue like which file does it need to be called on because the
>>> VFS glue can turn it into a superblock op.
>> A VFS level ioctl or an XFS ioctl?
>>
>> I thought about a new ioctl, but then what's the point of having an
>> extensible fiemap interface if we create new ioctls with an
>> identical interface for doing something that the existing ioctl is
>> perfectly capable of doing? I'd still need special flags to control
>> the ioctl behaviour even though it uses struct fiemap and plumbing,
>> so it seemed pointless to introduce a new ioctl....
Hi Dave,
I am writing an XFS shrinkfs feature, and I really need a way to get
the free space of an XFS filesystem, since
xfs_db cannot fetch agf->agf_freeblks and agf_btreeblks against a
mounted partition to calculate it for an
online shrink operation, just as you have mentioned above.
So currently I have added a new ioctl for this purpose. It would be good if
we could have a fiemap interface to do it, so that
I can kill this new ioctl and avoid duplicated effort.
Thanks,
-Jeff
> This brings us one step close to the norton disk doctor defrag display.
> I'm all for it in the main fiemap call, it makes much more sense for the
> users I think.
>
> -chris
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC, PATCH 0/2] fiemap: filesystem free space mapping
2012-10-24 11:47 ` Chris Mason
2012-10-24 12:32 ` Jie Liu
@ 2012-10-24 15:09 ` Christoph Hellwig
2012-10-24 19:15 ` Dave Chinner
1 sibling, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2012-10-24 15:09 UTC (permalink / raw)
To: Chris Mason, Dave Chinner, Christoph Hellwig,
linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com
On Wed, Oct 24, 2012 at 07:47:17AM -0400, Chris Mason wrote:
> I'm all for it in the main fiemap call, it makes much more sense for the
> users I think.
How so? Current fiemap is per-inode information; Dave's new call is
per-fs. Making one a flag of another is a gross user interface. In
addition we're bound to get issues where filesystems fail to wire up
fiemap to the tons of different iops just for this operation, or
accidentally wire up "real" fiemap to things like special files or
pipes.
Btw, I'd like to restate that I'd really love to see this functionality in
the VFS, just not multiplexed over FIEMAP.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [RFC, PATCH 0/2] fiemap: filesystem free space mapping
2012-10-24 15:09 ` Christoph Hellwig
@ 2012-10-24 19:15 ` Dave Chinner
0 siblings, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2012-10-24 19:15 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Chris Mason, linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com
On Wed, Oct 24, 2012 at 11:09:51AM -0400, Christoph Hellwig wrote:
> On Wed, Oct 24, 2012 at 07:47:17AM -0400, Chris Mason wrote:
> > I'm all for it in the main fiemap call, it makes much more sense for the
> > users I think.
>
> How so? Current fiemap is a per-inode information, Daves new call is
> per-fs. Making one a flag of another is a gross user interface. In
> addition we're bound to get issue where filesystems fail to wire up
> fiemap to the tons of different iops just for this operation, or
> accidentally wire up "real" fiemap to things like special files or
> pipes.
>
> Btw, I'd like to restate that I really love to see this functionality in
> the VFS, just not multiplexed over FIEMAP.
That's fine. I just wanted to clarify what you were asking.
FIEMAPFS it is, then...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 15+ messages in thread