linux-fsdevel.vger.kernel.org archive mirror
* [RFC, PATCH 0/2] fiemap: filesystem free space mapping
@ 2012-10-18  5:11 Dave Chinner
  2012-10-18  5:11 ` [PATCH 1/2] fiemap: add freespace mapping to FS_IOC_FIEMAP Dave Chinner
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Dave Chinner @ 2012-10-18  5:11 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: xfs

So, I was bored a few days ago, and I was sick of having xfs_db
incorrectly report free space extents when the filesystem is
mounted, so I decided to extend fiemap to export freespace mappings
to userspace so I could get the information coherently through the
mounted filesystem.

Yes, this could probably be considered interface abuse but, well, it
was simple to do because extent mapping is exactly what fiemap is
designed to do. Hence I didn't have to write new walkers/formatters
and I was using code I knew worked correctly.

There are two methods of mapping - one is reporting free space in
ascending extent start offset order, the other in ascending extent
length order. Both are useful to have (e.g. a defragmenter might want
to know about the nearest free block to a given offset or the largest
free extent in a given region). Either way, XFS keeps indexes
ordered in both ways, so they can be exported directly with minimal
overhead.
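
As a rough sketch of how a caller might select between the two
orderings, the request could be built along these lines. The helper
name freesp_request() is made up for illustration, and the flag values
are the ones proposed in this series (they are not in mainline
headers), so treat this as an assumption rather than a settled API:

```c
#include <linux/fiemap.h>
#include <stdint.h>
#include <stdlib.h>

/* Flag values as proposed in this series; not in mainline headers. */
#ifndef FIEMAP_FLAG_FREESPACE
#define FIEMAP_FLAG_FREESPACE		0x4
#define FIEMAP_FLAG_FREESPACE_SIZE	0x8
#endif

/*
 * Hypothetical helper: allocate a zeroed fiemap request covering the
 * byte range [start, start + length) of the filesystem, with room for
 * nr_extents results. by_size selects the size-ordered walk, otherwise
 * the offset-ordered walk is requested. Returns NULL on allocation
 * failure; the caller frees the result.
 */
static struct fiemap *
freesp_request(uint64_t start, uint64_t length, uint32_t nr_extents,
	       int by_size)
{
	size_t		size = sizeof(struct fiemap) +
			       (size_t)nr_extents * sizeof(struct fiemap_extent);
	struct fiemap	*fm = calloc(1, size);

	if (!fm)
		return NULL;

	fm->fm_start = start;
	fm->fm_length = length;
	fm->fm_extent_count = nr_extents;

	/* The two orderings are mutually exclusive, so set exactly one. */
	fm->fm_flags = by_size ? FIEMAP_FLAG_FREESPACE_SIZE
			       : FIEMAP_FLAG_FREESPACE;
	return fm;
}
```

The resulting buffer would then be handed to ioctl(fd, FS_IOC_FIEMAP, fm)
on a file descriptor opened on the filesystem being queried.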

The only "interesting" abuse of the interface is really the use of
FIEMAP_EXTENT_LAST. This means that the last extent in a freespace
index is being returned, rather than the last freespace extent. This
is done because filesystems often have multiple free space indexes,
and it may be difficult to sort/scan over multiple indexes in a
single map.
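
To make that concrete, here is a minimal sketch of how a caller might
interpret one batch of returned extents under these semantics. The
enum and function names are hypothetical, not part of the patches:

```c
#include <linux/fiemap.h>
#include <stdint.h>

/* Hypothetical classification of one FS_IOC_FIEMAP freespace batch. */
enum batch_status {
	BATCH_EMPTY,		/* nothing mapped in the requested range */
	BATCH_INDEX_END,	/* FIEMAP_EXTENT_LAST: this index is done */
	BATCH_MORE,		/* the same index may hold more extents */
};

/*
 * FIEMAP_EXTENT_LAST only says that the current freespace index has
 * been exhausted. Free space may still exist in other indexes, so on
 * BATCH_INDEX_END the caller moves its range to the next index rather
 * than assuming the whole filesystem has been mapped.
 */
static enum batch_status
classify_batch(const struct fiemap_extent *ext, uint32_t nr_mapped)
{
	uint32_t	i;

	if (nr_mapped == 0)
		return BATCH_EMPTY;

	for (i = 0; i < nr_mapped; i++) {
		if (ext[i].fe_flags & FIEMAP_EXTENT_LAST)
			return BATCH_INDEX_END;
	}
	return BATCH_MORE;
}
```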

This means an application needs to keep track of what freespace has
been returned to it and adjust its fiemap ranges appropriately, or
be aware of the underlying filesystem structure to form requests that
don't span free space indexes. I don't see this as a big problem,
because any application that is digging in freespace maps needs to
know how the filesystem is structured to make any sense of the
information returned. As such, I see this interface purely for
filesystem diagnostics or utilities tightly bound to the filesystem
(e.g. xfs_fsr).
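
A sketch of both approaches for XFS, where each allocation group (AG)
has its own freespace indexes: size the request to one AG, and advance
the start offset past everything already returned. The names below are
hypothetical, and the range calculation assumes every AG is agblocks
long (the last AG in a filesystem can be shorter):

```c
#include <linux/fiemap.h>
#include <stdint.h>

/* Hypothetical byte range covering a single allocation group. */
struct byte_range {
	uint64_t	start;
	uint64_t	length;
};

/*
 * Compute the byte range of AG agno from the filesystem geometry, so
 * that a request built from it never spans two freespace indexes.
 */
static struct byte_range
ag_byte_range(uint32_t agno, uint32_t agblocks, uint32_t blocksize)
{
	struct byte_range	r;

	r.length = (uint64_t)agblocks * blocksize;
	r.start = r.length * agno;
	return r;
}

/*
 * Return the offset to restart the next request from: the highest end
 * offset of any extent returned so far. Taking the maximum matters for
 * size-ordered walks, where extents do not arrive in offset order.
 */
static uint64_t
advance_start(uint64_t cur, const struct fiemap_extent *ext, uint32_t nr)
{
	uint32_t	i;

	for (i = 0; i < nr; i++) {
		uint64_t	end = ext[i].fe_logical + ext[i].fe_length;

		if (end > cur)
			cur = end;
	}
	return cur;
}
```

This mirrors the tracking of the highest offset seen that the freesp
command in the xfsprogs patch in this thread performs in scan_ag().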

I'll attach a patch for a small utility that uses this interface to
replicate the xfs_db freespace command in a short while so people
can see how it is used. That should make it easier to comment on. :)

Cheers,

Dave.


* [PATCH 1/2] fiemap: add freespace mapping to FS_IOC_FIEMAP
  2012-10-18  5:11 [RFC, PATCH 0/2] fiemap: filesystem free space mapping Dave Chinner
@ 2012-10-18  5:11 ` Dave Chinner
  2012-11-08 16:50   ` Mark Tinguely
  2012-10-18  5:11 ` [PATCH 2/2] xfs: implement FIEMAP_FLAG_FREESPACE_* Dave Chinner
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Dave Chinner @ 2012-10-18  5:11 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: xfs

From: Dave Chinner <dchinner@redhat.com>

fiemap is used to map extents of used space on files. It's just an
array of extents, though, so there's no reason it can only index
*used* space.

There is a need for getting freespace layout information into
userspace. For example, defragmentation programs would find it
useful to be able to map the free space in the filesystem to
work out where it is best to move data to defragment it.
Alternatively, knowing where free space is enables us to identify
extents that need to be moved to defragment free space.

Hence, extend fiemap with the FIEMAP_FLAG_FREESPACE to indicate that
the caller wants to map free space in the range fm_start bytes from
the start of the filesystem for fm_length bytes.

Because XFS can report extents in size order without needing to
sort, and this information is useful to xfs_fsr, also add
FIEMAP_FLAG_FREESPACE_SIZE to tell the filesystem to return a
freespace map ordered by extent size rather than offset. If there
are multiple extents of the same size, then they are ordered by
offset.
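
For illustration only, the ordering described above corresponds to the
following userspace comparator. XFS does not sort anything here; its
by-size index is already kept in this order. The struct and function
names are hypothetical:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace view of one free space extent, in bytes. */
struct free_extent {
	uint64_t	offset;
	uint64_t	length;
};

/*
 * qsort() comparator giving the FIEMAP_FLAG_FREESPACE_SIZE ordering:
 * ascending length, with ties broken by ascending offset.
 */
static int
cmp_by_size(const void *a, const void *b)
{
	const struct free_extent	*x = a;
	const struct free_extent	*y = b;

	if (x->length != y->length)
		return x->length < y->length ? -1 : 1;
	if (x->offset != y->offset)
		return x->offset < y->offset ? -1 : 1;
	return 0;
}
```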

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 Documentation/filesystems/fiemap.txt |   37 +++++++++++++++++++++++++++++++---
 include/linux/fiemap.h               |    6 +++++-
 2 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/Documentation/filesystems/fiemap.txt b/Documentation/filesystems/fiemap.txt
index 1b805a0..45531ba 100644
--- a/Documentation/filesystems/fiemap.txt
+++ b/Documentation/filesystems/fiemap.txt
@@ -2,9 +2,9 @@
 Fiemap Ioctl
 ============
 
-The fiemap ioctl is an efficient method for userspace to get file
-extent mappings. Instead of block-by-block mapping (such as bmap), fiemap
-returns a list of extents.
+The fiemap ioctl is an efficient method for userspace to get file or
+filesystem extent mappings. Instead of block-by-block mapping (such as
+bmap), fiemap returns a list of extents.
 
 
 Request Basics
@@ -58,6 +58,37 @@ If this flag is set, the kernel will sync the file before mapping extents.
 If this flag is set, the extents returned will describe the inodes
 extended attribute lookup tree, instead of its data tree.
 
+* FIEMAP_FLAG_FREESPACE
+If this flag is set, the extents returned will describe the
+*filesystem's* free space map, with fm_start specifying the start offset
+into the filesystem's address range (in bytes) of the region to be
+mapped. fm_length is the byte range that will be mapped. Free space
+extents will be mapped in ascending offset order.
+
+Filesystems with multiple freespace indexes may return
+FIEMAP_EXTENT_LAST at the end of a specific freespace index map. Hence
+FIEMAP_EXTENT_LAST does not mean there is no more free space to be
+mapped, just that the requested range spanned multiple free space
+indexes.
+
+Hence the caller needs to be aware of the underlying filesystem
+implementation and geometry to make correct use of this call. As such,
+this functionality is only intended for use by filesystem management
+utilities (e.g. defragmentation tools) and not general purpose
+applications.
+
+* FIEMAP_FLAG_FREESPACE_SIZE
+If this flag is set, the filesystem freespace tree will be mapped
+similar to FIEMAP_FLAG_FREESPACE, but extents will be ordered from
+smallest free space extent to largest. Where extents have the same size,
+they will be ordered by ascending offset order similar to
+FIEMAP_FLAG_FREESPACE. It is up to the application to track the highest
+offset extent seen by this walk so that if it doesn't see a
+FIEMAP_EXTENT_LAST flag, the application knows what offset to start the
+next mapping from.
+
+The same caveats exist for this call for FIEMAP_EXTENT_LAST as for
+FIEMAP_FLAG_FREESPACE.
 
 Extent Mapping
 --------------
diff --git a/include/linux/fiemap.h b/include/linux/fiemap.h
index d830747..f4fbb9f 100644
--- a/include/linux/fiemap.h
+++ b/include/linux/fiemap.h
@@ -40,8 +40,12 @@ struct fiemap {
 
 #define FIEMAP_FLAG_SYNC	0x00000001 /* sync file data before map */
 #define FIEMAP_FLAG_XATTR	0x00000002 /* map extended attribute tree */
+#define FIEMAP_FLAG_FREESPACE	0x00000004 /* map fs freespace tree */
+#define FIEMAP_FLAG_FREESPACE_SIZE 0x00000008 /* map freespace in size order */
 
-#define FIEMAP_FLAGS_COMPAT	(FIEMAP_FLAG_SYNC | FIEMAP_FLAG_XATTR)
+#define FIEMAP_FLAGS_COMPAT	(FIEMAP_FLAG_SYNC | FIEMAP_FLAG_XATTR | \
+				 FIEMAP_FLAG_FREESPACE | \
+				 FIEMAP_FLAG_FREESPACE_SIZE)
 
 #define FIEMAP_EXTENT_LAST		0x00000001 /* Last extent in file. */
 #define FIEMAP_EXTENT_UNKNOWN		0x00000002 /* Data location unknown. */
-- 
1.7.10



* [PATCH 2/2] xfs: implement FIEMAP_FLAG_FREESPACE_*
  2012-10-18  5:11 [RFC, PATCH 0/2] fiemap: filesystem free space mapping Dave Chinner
  2012-10-18  5:11 ` [PATCH 1/2] fiemap: add freespace mapping to FS_IOC_FIEMAP Dave Chinner
@ 2012-10-18  5:11 ` Dave Chinner
  2012-10-18  5:27 ` [RFC, PATCH 3/2] xfsprogs: space management tool Dave Chinner
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2012-10-18  5:11 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: xfs

From: Dave Chinner <dchinner@redhat.com>

As you wish.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_alloc.c |  219 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_alloc.h |    7 ++
 fs/xfs/xfs_iops.c  |   12 ++-
 3 files changed, 237 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_alloc.c b/fs/xfs/xfs_alloc.c
index 335206a..ee680c9 100644
--- a/fs/xfs/xfs_alloc.c
+++ b/fs/xfs/xfs_alloc.c
@@ -2470,3 +2470,222 @@ error0:
 	xfs_perag_put(args.pag);
 	return error;
 }
+
+/*
+ * Walk the extents in the tree given by the cursor, and dump them all into the
+ * fieinfo. At the last extent in the tree, set the FIEMAP_EXTENT_LAST flag so
+ * that we return only free space from this tree in a given request.
+ */
+static int
+xfs_alloc_ag_freespace_map(
+	struct xfs_btree_cur    *cur,
+	struct fiemap_extent_info *fieinfo,
+	xfs_agblock_t		sagbno,
+	xfs_agblock_t		eagbno)
+{
+	int	error = 0;
+	int	i = 1;
+
+	/*
+	 * Loop until we have either filled the fiemap or reached the end of
+	 * the AG walk.
+	 */
+	while (i) {
+		xfs_agblock_t	fbno;
+		xfs_extlen_t	flen;
+		xfs_daddr_t	dbno;
+		xfs_fileoff_t	dlen;
+		int		flags = 0;
+
+		error = xfs_alloc_get_rec(cur, &fbno, &flen, &i);
+		if (error)
+			break;
+		XFS_WANT_CORRUPTED_RETURN(i == 1);
+
+		/*
+		 * move the cursor now to make it easy to continue the loop and
+		 * detect the last extent in the lookup.
+		 */
+		error = xfs_btree_increment(cur, 0, &i);
+		if (error)
+			break;
+
+		/* range check - must be wholly within requested range */
+		if (fbno < sagbno ||
+		    (eagbno != NULLAGBLOCK && fbno + flen > eagbno)) {
+			xfs_warn(cur->bc_mp, "10: %d/%d, %d/%d", sagbno, eagbno, fbno, flen);
+			continue;
+		}
+
+		/*
+		 * use daddr format for all range/len calculations as that is
+		 * the format the range/len variables are supplied in by
+		 * userspace.
+		 */
+		dbno = XFS_AGB_TO_DADDR(cur->bc_mp, cur->bc_private.a.agno, fbno);
+		dlen = XFS_FSB_TO_BB(cur->bc_mp, flen);
+
+		if (i == 0)
+			flags |= FIEMAP_EXTENT_LAST;
+		error = -fiemap_fill_next_extent(fieinfo, BBTOB(dbno),
+						BBTOB(dbno), BBTOB(dlen), flags);
+		if (error)
+			break;
+	}
+	return error;
+
+}
+
+/*
+ * Map the freespace from the requested range in the requested order.
+ *
+ * To make things simple, this function will only return the freespace from a
+ * single AG regardless of the size of the map passed in. That AG will be the AG
+ * that the first freespace is found in. In other words, FIEMAP_EXTENT_LAST does
+ * not mean the last freespace extent has been mapped, just that the last extent
+ * in a given freespace index has been mapped. The caller is responsible for
+ * moving the range to the next freespace region if it needs to query for more
+ * information.
+ *
+ * IOWs, the caller is responsible for knowing about the XFS filesystem
+ * structure and how it indexes freespace to use this call effectively.
+ */
+#define XFS_FREESP_FLAGS (FIEMAP_FLAG_FREESPACE | FIEMAP_FLAG_FREESPACE_SIZE)
+int
+xfs_alloc_freespace_map(
+	struct xfs_mount	*mp,
+	struct fiemap_extent_info *fieinfo,
+	u64			start,
+	u64			length)
+{
+	struct xfs_btree_cur	*cur;
+	struct xfs_buf		*agbp;
+	struct xfs_perag	*pag;
+	xfs_agnumber_t		agno;
+	xfs_agnumber_t		sagno;
+	xfs_agblock_t		sagbno;
+	xfs_agnumber_t		eagno;
+	xfs_agblock_t		eagbno;
+	bool			bycnt;
+	int			error = 0;
+
+	/* can only have one type of mapping */
+	if ((fieinfo->fi_flags & XFS_FREESP_FLAGS) == XFS_FREESP_FLAGS) {
+		xfs_warn(mp, "1: 0x%x\n", fieinfo->fi_flags);
+		return EINVAL;
+	}
+	bycnt = (fieinfo->fi_flags & FIEMAP_FLAG_FREESPACE_SIZE);
+
+	if (XFS_B_TO_FSB(mp, start) >= mp->m_sb.sb_dblocks) {
+		xfs_warn(mp, "2: %lld, %lld/%lld\n", start,
+				XFS_B_TO_FSB(mp, start), mp->m_sb.sb_dblocks);
+		return EINVAL;
+	}
+	if (length < mp->m_sb.sb_blocksize) {
+		xfs_warn(mp, "3: %lld, %d\n", length, mp->m_sb.sb_blocksize);
+		return EINVAL;
+	}
+	if (start + length < start) {
+		xfs_warn(mp, "4: %lld/%lld, %lld", start, length, start + length);
+		return EINVAL;
+	}
+
+	sagno = xfs_daddr_to_agno(mp, BTOBB(start));
+	sagbno = xfs_daddr_to_agbno(mp, BTOBB(start));
+
+	eagno = xfs_daddr_to_agno(mp, BTOBB(start + length));
+	eagbno = xfs_daddr_to_agbno(mp, BTOBB(start + length));
+
+	if (sagno == eagno && sagbno == eagbno) {
+		xfs_warn(mp, "5: %d/%d, %d/%d", sagno, eagno, sagbno, eagbno);
+		return EINVAL;
+	}
+
+	/*
+	 * Force out the log.  This means any transactions that might have freed
+	 * space before we took the AGF buffer lock are now on disk, and the
+	 * volatile disk cache is flushed.
+	 */
+	xfs_log_force(mp, XFS_LOG_SYNC);
+
+	/*
+	 * Do initial lookup in by-bno tree. Keep skipping AGs until we
+	 * either find a free space extent or reach the end of the search.
+	 */
+	for (agno = sagno; agno < eagno; agno++) {
+		int	i;
+		error = 0;
+
+		error = xfs_alloc_read_agf(mp, NULL, agno, 0, &agbp);
+		if (error || !agbp) {
+			xfs_warn(mp, "7: %p, %d", agbp, error);
+			goto next;
+		}
+
+		pag = xfs_perag_get(mp, agno);
+		if (pag->pagf_freeblks <= pag->pagf_flcount) {
+			/* no free space worth reporting */
+			xfs_warn(mp, "6: %d %d", pag->pagf_freeblks,
+						pag->pagf_flcount);
+			goto put_agbp;
+		}
+
+		cur = xfs_allocbt_init_cursor(mp, NULL, agbp, agno,
+					      XFS_BTNUM_BNO);
+		error = xfs_alloc_lookup_ge(cur, 0, sagbno, &i);
+		if (error) {
+			xfs_warn(mp, "8: %d/%d, %d/%d", sagno, eagno, sagbno, eagbno);
+			goto del_cursor;
+		}
+		XFS_WANT_CORRUPTED_GOTO(i == 1, del_cursor);
+
+		if (!bycnt) {
+			/*
+			 * if we are doing a bno ordered lookup, we can just
+			 * loop across the free space extents formatting them
+			 * until we get to the end of the AG, eagbno or fill the
+			 * fieinfo map.
+			 */
+			error = xfs_alloc_ag_freespace_map(cur, fieinfo, sagbno,
+					agno == eagno ? eagbno : NULLAGBLOCK);
+		} else {
+			/*
+			 * We are doing a size ordered lookup. We know there is
+			 * a free space extent somewhere past our start bno, so
+			 * just kill the current cursor and start a size
+			 * ordered scan to find all the freespace in the given
+			 * range.
+			 */
+			xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+			cur = xfs_allocbt_init_cursor(mp, NULL, agbp, agno,
+						      XFS_BTNUM_CNT);
+			error = xfs_alloc_lookup_ge(cur, 0, 1, &i);
+			if (error)
+				goto del_cursor;
+			XFS_WANT_CORRUPTED_GOTO(i == 1, del_cursor);
+
+			error = xfs_alloc_ag_freespace_map(cur, fieinfo, sagbno,
+					agno == eagno ? eagbno : NULLAGBLOCK);
+		}
+
+del_cursor:
+		xfs_btree_del_cursor(cur, error < 0 ? XFS_BTREE_ERROR
+						    : XFS_BTREE_NOERROR);
+put_agbp:
+		xfs_perag_put(pag);
+		xfs_buf_relse(agbp);
+next:
+		if (error)
+			break;
+		sagbno = 0;
+	}
+
+	/*
+	 * negative errno indicates that we hit a FIEMAP_EXTENT_LAST flag. Clear
+	 * the error in that case.
+	 */
+	if (error < 0)
+		error = 0;
+
+	return error;
+}
diff --git a/fs/xfs/xfs_alloc.h b/fs/xfs/xfs_alloc.h
index feacb06..371b02c 100644
--- a/fs/xfs/xfs_alloc.h
+++ b/fs/xfs/xfs_alloc.h
@@ -231,4 +231,11 @@ xfs_alloc_get_rec(
 	xfs_extlen_t		*len,	/* output: length of extent */
 	int			*stat);	/* output: success/failure */
 
+int
+xfs_alloc_freespace_map(
+	struct xfs_mount	*mp,
+	struct fiemap_extent_info *fieinfo,
+	u64			start,
+	u64			length);
+
 #endif	/* __XFS_ALLOC_H__ */
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 4e00cf0..4555525 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -938,7 +938,9 @@ xfs_vn_update_time(
 	return -xfs_trans_commit(tp, 0);
 }
 
-#define XFS_FIEMAP_FLAGS	(FIEMAP_FLAG_SYNC|FIEMAP_FLAG_XATTR)
+#define XFS_FIEMAP_FLAGS	(FIEMAP_FLAG_SYNC | FIEMAP_FLAG_XATTR | \
+				 FIEMAP_FLAG_FREESPACE |		\
+				 FIEMAP_FLAG_FREESPACE_SIZE)
 
 /*
  * Call fiemap helper to fill in user data.
@@ -997,6 +999,13 @@ xfs_vn_fiemap(
 	if (error)
 		return error;
 
+	if ((fieinfo->fi_flags &
+			(FIEMAP_FLAG_FREESPACE | FIEMAP_FLAG_FREESPACE_SIZE))) {
+		error = xfs_alloc_freespace_map(ip->i_mount, fieinfo,
+						start, length);
+		goto out;
+	}
+
 	/* Set up bmap header for xfs internal routine */
 	bm.bmv_offset = BTOBB(start);
 	/* Special case for whole file */
@@ -1017,6 +1026,7 @@ xfs_vn_fiemap(
 		bm.bmv_iflags |= BMV_IF_DELALLOC;
 
 	error = xfs_getbmap(ip, &bm, xfs_fiemap_format, fieinfo);
+out:
 	if (error)
 		return -error;
 
-- 
1.7.10



* [RFC, PATCH 3/2] xfsprogs: space management tool
  2012-10-18  5:11 [RFC, PATCH 0/2] fiemap: filesystem free space mapping Dave Chinner
  2012-10-18  5:11 ` [PATCH 1/2] fiemap: add freespace mapping to FS_IOC_FIEMAP Dave Chinner
  2012-10-18  5:11 ` [PATCH 2/2] xfs: implement FIEMAP_FLAG_FREESPACE_* Dave Chinner
@ 2012-10-18  5:27 ` Dave Chinner
  2012-10-18  8:10 ` [RFC, PATCH 0/2] fiemap: filesystem free space mapping Andreas Dilger
  2012-10-23 12:30 ` Christoph Hellwig
  4 siblings, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2012-10-18  5:27 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: xfs


From: Dave Chinner <dchinner@redhat.com>

xfs_spaceman is intended as a diagnostic and control tool for space
management operations within XFS. Operations like examining free
space, managing allocation policies, issuing block discards on free
space, etc.

The tool is modelled on the xfs_io interface, allowing both
interactive and command line control of the tool, enabling it to be
used in scripts and automated management tools.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 Makefile          |    3 +-
 spaceman/Makefile |   34 +++++
 spaceman/file.c   |  149 +++++++++++++++++++++
 spaceman/freesp.c |  377 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 spaceman/init.c   |  117 +++++++++++++++++
 spaceman/init.h   |   24 ++++
 spaceman/space.h  |   37 ++++++
 7 files changed, 740 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index c40fb2c..a81b8b2 100644
--- a/Makefile
+++ b/Makefile
@@ -41,7 +41,7 @@ endif
 
 LIB_SUBDIRS = libxfs libxlog libxcmd libhandle libdisk
 TOOL_SUBDIRS = copy db estimate fsck fsr growfs io logprint mkfs quota \
-		mdrestore repair rtcp m4 man doc po debian
+		mdrestore repair rtcp m4 man doc po debian spaceman
 
 SUBDIRS = include $(LIB_SUBDIRS) $(TOOL_SUBDIRS)
 
@@ -62,6 +62,7 @@ io: libxcmd libhandle
 mkfs: libxfs
 quota: libxcmd
 repair: libxfs libxlog
+spaceman: libxcmd
 
 ifneq ($(ENABLE_BLKID), yes)
 mkfs: libdisk
diff --git a/spaceman/Makefile b/spaceman/Makefile
new file mode 100644
index 0000000..612d36b
--- /dev/null
+++ b/spaceman/Makefile
@@ -0,0 +1,34 @@
+#
+# Copyright (c) 2012 Red Hat, Inc.  All Rights Reserved.
+#
+
+TOPDIR = ..
+include $(TOPDIR)/include/builddefs
+
+LTCOMMAND = xfs_spaceman
+HFILES = init.h space.h
+CFILES = init.c \
+	file.c freesp.c
+
+LLDLIBS = $(LIBXCMD)
+LTDEPENDENCIES = $(LIBXCMD)
+LLDFLAGS = -static
+
+ifeq ($(ENABLE_READLINE),yes)
+LLDLIBS += $(LIBREADLINE) $(LIBTERMCAP)
+endif
+
+ifeq ($(ENABLE_EDITLINE),yes)
+LLDLIBS += $(LIBEDITLINE) $(LIBTERMCAP)
+endif
+
+default: depend $(LTCOMMAND)
+
+include $(BUILDRULES)
+
+install: default
+	$(INSTALL) -m 755 -d $(PKG_SBIN_DIR)
+	$(LTINSTALL) -m 755 $(LTCOMMAND) $(PKG_SBIN_DIR)
+install-dev:
+
+-include .dep
diff --git a/spaceman/file.c b/spaceman/file.c
new file mode 100644
index 0000000..ea4ab0c
--- /dev/null
+++ b/spaceman/file.c
@@ -0,0 +1,149 @@
+/*
+ * Copyright (c) 2004-2005 Silicon Graphics, Inc.
+ * Copyright (c) 2012 Red Hat, Inc.
+ * All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+ */
+
+#include <xfs/xfs.h>
+#include <xfs/command.h>
+#include <xfs/input.h>
+#include <sys/mman.h>
+#include "init.h"
+#include "space.h"
+
+static cmdinfo_t print_cmd;
+
+fileio_t	*filetable;
+int		filecount;
+fileio_t	*file;
+
+static void
+print_fileio(
+	fileio_t	*file,
+	int		index,
+	int		braces)
+{
+	printf(_("%c%03d%c %-14s (%s,%s,%s%s%s)\n"),
+		braces? '[' : ' ', index, braces? ']' : ' ', file->name,
+		file->flags & O_SYNC ? _("sync") : _("non-sync"),
+		file->flags & O_DIRECT ? _("direct") : _("non-direct"),
+		file->flags & O_RDONLY ? _("read-only") : _("read-write"),
+		file->flags & O_APPEND ? _(",append-only") : "",
+		file->flags & O_NONBLOCK ? _(",non-block") : "");
+}
+
+int
+filelist_f(void)
+{
+	int		i;
+
+	for (i = 0; i < filecount; i++)
+		print_fileio(&filetable[i], i, &filetable[i] == file);
+	return 0;
+}
+
+static int
+print_f(
+	int		argc,
+	char		**argv)
+{
+	filelist_f();
+	return 0;
+}
+
+int
+openfile(
+	char		*path,
+	xfs_fsop_geom_t	*geom,
+	int		flags,
+	mode_t		mode)
+{
+	int		fd;
+
+	fd = open(path, flags, mode);
+	if (fd < 0) {
+		if ((errno == EISDIR) && (flags & O_RDWR)) {
+			/* make it as if we asked for O_RDONLY & try again */
+			flags &= ~O_RDWR;
+			flags |= O_RDONLY;
+			fd = open(path, flags, mode);
+			if (fd < 0) {
+				perror(path);
+				return -1;
+			}
+		} else {
+			perror(path);
+			return -1;
+		}
+	}
+
+	if (xfsctl(path, fd, XFS_IOC_FSGEOMETRY, geom) < 0) {
+		perror("XFS_IOC_FSGEOMETRY");
+		close(fd);
+		return -1;
+	}
+	return fd;
+}
+
+int
+addfile(
+	char		*name,
+	int		fd,
+	xfs_fsop_geom_t	*geometry,
+	int		flags)
+{
+	char		*filename;
+
+	filename = strdup(name);
+	if (!filename) {
+		perror("strdup");
+		close(fd);
+		return -1;
+	}
+
+	/* Extend the table of currently open files */
+	filetable = (fileio_t *)realloc(filetable,	/* growing */
+					++filecount * sizeof(fileio_t));
+	if (!filetable) {
+		perror("realloc");
+		filecount = 0;
+		free(filename);
+		close(fd);
+		return -1;
+	}
+
+	/* Finally, make this the new active open file */
+	file = &filetable[filecount - 1];
+	file->fd = fd;
+	file->flags = flags;
+	file->name = filename;
+	file->geom = *geometry;
+	return 0;
+}
+
+void
+file_init(void)
+{
+	print_cmd.name = "print";
+	print_cmd.altname = "p";
+	print_cmd.cfunc = print_f;
+	print_cmd.argmin = 0;
+	print_cmd.argmax = 0;
+	print_cmd.flags = CMD_FLAG_GLOBAL;
+	print_cmd.oneline = _("list current open files");
+
+	add_command(&print_cmd);
+}
diff --git a/spaceman/freesp.c b/spaceman/freesp.c
new file mode 100644
index 0000000..bfc93c9
--- /dev/null
+++ b/spaceman/freesp.c
@@ -0,0 +1,377 @@
+/*
+ * Copyright (c) 2000-2001,2005 Silicon Graphics, Inc.
+ * Copyright (c) 2012 Red Hat, Inc.
+ * All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+ */
+
+#include <xfs/xfs.h>
+#include <xfs/xfs_types.h>
+#include <xfs/command.h>
+#include <linux/fs.h>
+#include <linux/fiemap.h>
+#include "init.h"
+#include "space.h"
+
+#ifndef FIEMAP_FLAG_FREESPACE
+#define FIEMAP_FLAG_FREESPACE		0x4
+#define FIEMAP_FLAG_FREESPACE_SIZE	0x8
+#endif
+
+typedef struct histent
+{
+	int		low;
+	int		high;
+	long long	count;
+	long long	blocks;
+} histent_t;
+
+static int		agcount;
+static xfs_agnumber_t	*aglist;
+static int		countflag;
+static int		dumpflag;
+static int		equalsize;
+static histent_t	*hist;
+static int		histcount;
+static int		multsize;
+static int		seen1;
+static int		summaryflag;
+static long long	totblocks;
+static long long	totexts;
+
+static cmdinfo_t freesp_cmd;
+
+static void
+addhistent(
+	int	h)
+{
+	hist = realloc(hist, (histcount + 1) * sizeof(*hist));
+	if (h == 0)
+		h = 1;
+	hist[histcount].low = h;
+	hist[histcount].count = hist[histcount].blocks = 0;
+	histcount++;
+	if (h == 1)
+		seen1 = 1;
+}
+
+static void
+addtohist(
+	xfs_agnumber_t	agno,
+	xfs_agblock_t	agbno,
+	off64_t		len)
+{
+	int		i;
+
+	if (dumpflag)
+		printf("%8d %8d %8lld\n", agno, agbno, (long long)len);
+	totexts++;
+	totblocks += len;
+	for (i = 0; i < histcount; i++) {
+		if (hist[i].high >= len) {
+			hist[i].count++;
+			hist[i].blocks += len;
+			break;
+		}
+	}
+}
+
+static int
+hcmp(
+	const void	*a,
+	const void	*b)
+{
+	return ((histent_t *)a)->low - ((histent_t *)b)->low;
+}
+
+static void
+histinit(
+	int	maxlen)
+{
+	int	i;
+
+	if (equalsize) {
+		for (i = 1; i < maxlen; i += equalsize)
+			addhistent(i);
+	} else if (multsize) {
+		for (i = 1; i < maxlen; i *= multsize)
+			addhistent(i);
+	} else {
+		if (!seen1)
+			addhistent(1);
+		qsort(hist, histcount, sizeof(*hist), hcmp);
+	}
+	for (i = 0; i < histcount; i++) {
+		if (i < histcount - 1)
+			hist[i].high = hist[i + 1].low - 1;
+		else
+			hist[i].high = maxlen;
+	}
+}
+
+static void
+printhist(void)
+{
+	int	i;
+
+	printf("%7s %7s %7s %7s %6s\n",
+		_("from"), _("to"), _("extents"), _("blocks"), _("pct"));
+	for (i = 0; i < histcount; i++) {
+		if (hist[i].count)
+			printf("%7d %7d %7lld %7lld %6.2f\n", hist[i].low,
+				hist[i].high, hist[i].count, hist[i].blocks,
+				hist[i].blocks * 100.0 / totblocks);
+	}
+}
+
+static int
+inaglist(
+	xfs_agnumber_t	agno)
+{
+	int		i;
+
+	if (agcount == 0)
+		return 1;
+	for (i = 0; i < agcount; i++)
+		if (aglist[i] == agno)
+			return 1;
+	return 0;
+}
+
+#define NR_EXTENTS 128
+
+static void
+scan_ag(
+	xfs_agnumber_t	agno)
+{
+	struct fiemap	*fiemap;
+	off64_t		blocksize = file->geom.blocksize;
+	uint64_t	last_logical = agno * file->geom.agblocks * blocksize;
+	uint64_t	length = file->geom.agblocks * blocksize;
+	off64_t		fsbperag;
+	int		fiemap_flags;
+	int		last = 0;
+	int		map_size;
+
+
+	last_logical = (off64_t)file->geom.agblocks * blocksize * agno;
+	length = (off64_t)file->geom.agblocks * blocksize;
+	fsbperag = (off64_t)file->geom.agblocks * blocksize;
+
+	map_size = sizeof(struct fiemap) +
+		   sizeof(struct fiemap_extent) * NR_EXTENTS;
+	fiemap = malloc(map_size);
+	if (!fiemap) {
+		fprintf(stderr, _("%s: fiemap malloc failed.\n"), progname);
+		exitcode = 1;
+		return;
+	}
+	if (countflag)
+		fiemap_flags = FIEMAP_FLAG_FREESPACE_SIZE;
+	else
+		fiemap_flags = FIEMAP_FLAG_FREESPACE;
+
+	while (!last) {
+		xfs_agblock_t	agbno;
+		int		ret;
+		int		i;
+
+		memset(fiemap, 0, map_size);
+		fiemap->fm_flags = fiemap_flags;
+		fiemap->fm_start = last_logical;
+		fiemap->fm_length = length;
+		fiemap->fm_extent_count = NR_EXTENTS;
+
+		ret = ioctl(file->fd, FS_IOC_FIEMAP, (unsigned long)fiemap);
+		if (ret < 0) {
+			fprintf(stderr, "%s: ioctl(FS_IOC_FIEMAP) [\"%s\"]: "
+				"%s\n", progname, file->name, strerror(errno));
+			free(fiemap);
+			exitcode = 1;
+			return;
+		}
+
+		/* No more extents to map, exit */
+		if (!fiemap->fm_mapped_extents)
+			break;
+
+		for (i = 0; i < fiemap->fm_mapped_extents; i++) {
+			struct fiemap_extent	*extent;
+			off64_t			aglen;
+
+			extent = &fiemap->fm_extents[i];
+
+
+			agbno = (extent->fe_physical - (fsbperag * agno)) /
+								blocksize;
+			aglen = extent->fe_length / blocksize;
+
+			addtohist(agno, agbno, aglen);
+
+			/*
+			 * we have to keep track of the highest offset extent we
+			 * see when getting size ordered free space, so just do
+			 * it for all extents we get.
+			 */
+			last_logical = max(last_logical,
+					extent->fe_logical + extent->fe_length);
+
+			if (extent->fe_flags & FIEMAP_EXTENT_LAST) {
+				last = 1;
+				break;
+			}
+		}
+	}
+}
+static void
+aglistadd(
+	char	*a)
+{
+	aglist = realloc(aglist, (agcount + 1) * sizeof(*aglist));
+	aglist[agcount] = (xfs_agnumber_t)atoi(a);
+	agcount++;
+}
+
+static int
+init(
+	int		argc,
+	char		**argv)
+{
+	int		c;
+	int		speced = 0;
+
+	agcount = countflag = dumpflag = equalsize = multsize = optind = 0;
+	histcount = seen1 = summaryflag = 0;
+	totblocks = totexts = 0;
+	aglist = NULL;
+	hist = NULL;
+	while ((c = getopt(argc, argv, "a:bcde:h:m:s")) != EOF) {
+		switch (c) {
+		case 'a':
+			aglistadd(optarg);
+			break;
+		case 'b':
+			if (speced)
+				return 0;
+			multsize = 2;
+			speced = 1;
+			break;
+		case 'c':
+			countflag = 1;
+			break;
+		case 'd':
+			dumpflag = 1;
+			break;
+		case 'e':
+			if (speced)
+				return 0;
+			equalsize = atoi(optarg);
+			speced = 1;
+			break;
+		case 'h':
+			if (speced && !histcount)
+				return 0;
+			addhistent(atoi(optarg));
+			speced = 1;
+			break;
+		case 'm':
+			if (speced)
+				return 0;
+			multsize = atoi(optarg);
+			speced = 1;
+			break;
+		case 's':
+			summaryflag = 1;
+			break;
+		case '?':
+			return 0;
+		}
+	}
+	if (optind != argc)
+		return 0;
+	if (!speced)
+		multsize = 2;
+	histinit(file->geom.agblocks);
+	return 1;
+}
+
+/*
+ * Report on freespace usage in xfs filesystem.
+ */
+static int
+freesp_f(
+	int		argc,
+	char		**argv)
+{
+	xfs_agnumber_t	agno;
+
+	if (!init(argc, argv))
+		return 0;
+	for (agno = 0; agno < file->geom.agcount; agno++)  {
+		if (inaglist(agno))
+			scan_ag(agno);
+	}
+	if (histcount)
+		printhist();
+	if (summaryflag) {
+		printf(_("total free extents %lld\n"), totexts);
+		printf(_("total free blocks %lld\n"), totblocks);
+		printf(_("average free extent size %g\n"),
+			(double)totblocks / (double)totexts);
+	}
+	if (aglist)
+		free(aglist);
+	if (hist)
+		free(hist);
+	return 0;
+}
+
+static void
+freesp_help(void)
+{
+	printf(_(
+"\n"
+"Examine filesystem free space\n"
+"\n"
+"Options: [-bcds] [-a agno] [-e bsize] [-h h1]... [-m bmult]\n"
+"\n"
+" -b -- binary histogram bin size\n"
+" -c -- scan the by-count (size) ordered freespace tree\n"
+" -d -- debug output\n"
+" -s -- emit freespace summary information\n"
+" -a agno -- scan only the given AG agno\n"
+" -e bsize -- use fixed histogram bin size of bsize\n"
+" -h h1 -- use custom histogram bin size of h1. Multiple specifications allowed.\n"
+" -m bmult -- use histogram bin size multiplier of bmult\n"
+"\n"));
+
+}
+
+void
+freesp_init(void)
+{
+	freesp_cmd.name = "freesp";
+	freesp_cmd.altname = "fsp";
+	freesp_cmd.cfunc = freesp_f;
+	freesp_cmd.argmin = 0;
+	freesp_cmd.argmax = -1;
+	freesp_cmd.args = "[-bcds] [-a agno] [-e bsize] [-h h1]... [-m bmult]\n";
+	freesp_cmd.flags = CMD_FLAG_GLOBAL;
+	freesp_cmd.oneline = _("Examine filesystem free space");
+	freesp_cmd.help = freesp_help;
+
+	add_command(&freesp_cmd);
+}
+
diff --git a/spaceman/init.c b/spaceman/init.c
new file mode 100644
index 0000000..108dcd7
--- /dev/null
+++ b/spaceman/init.c
@@ -0,0 +1,117 @@
+/*
+ * Copyright (c) 2012 Red Hat, Inc
+ * All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+ */
+
+#include <xfs/xfs.h>
+#include <xfs/command.h>
+#include <xfs/input.h>
+#include "init.h"
+#include "space.h"
+
+char	*progname;
+int	exitcode;
+
+void
+usage(void)
+{
+	fprintf(stderr,
+		_("Usage: %s [-c cmd] file\n"),
+		progname);
+	exit(1);
+}
+
+static void
+init_commands(void)
+{
+	file_init();
+	freesp_init();
+	help_init();
+	quit_init();
+}
+
+static int
+init_args_command(
+	int	index)
+{
+	if (index >= filecount)
+		return 0;
+	file = &filetable[index++];
+	return index;
+}
+
+static int
+init_check_command(
+	const cmdinfo_t	*ct)
+{
+	if (!(ct->flags & CMD_FLAG_GLOBAL))
+		return 0;
+	return 1;
+}
+
+void
+init(
+	int		argc,
+	char		**argv)
+{
+	int		c, flags = 0;
+	mode_t		mode = 0600;
+	xfs_fsop_geom_t	geometry = { 0 };
+
+	progname = basename(argv[0]);
+	setlocale(LC_ALL, "");
+	bindtextdomain(PACKAGE, LOCALEDIR);
+	textdomain(PACKAGE);
+
+	while ((c = getopt(argc, argv, "c:V")) != EOF) {
+		switch (c) {
+		case 'c':
+			add_user_command(optarg);
+			break;
+		case 'V':
+			printf(_("%s version %s\n"), progname, VERSION);
+			exit(0);
+		default:
+			usage();
+		}
+	}
+
+	while (optind < argc) {
+		if ((c = openfile(argv[optind], &geometry, flags, mode)) < 0)
+			exit(1);
+		if (!platform_test_xfs_fd(c)) {
+			printf(_("Not an XFS filesystem!\n"));
+			exit(1);
+		}
+		if (addfile(argv[optind], c, &geometry, flags) < 0)
+			exit(1);
+		optind++;
+	}
+
+	init_commands();
+	add_args_command(init_args_command);
+	add_check_command(init_check_command);
+}
+
+int
+main(
+	int	argc,
+	char	**argv)
+{
+	init(argc, argv);
+	command_loop();
+	return exitcode;
+}
diff --git a/spaceman/init.h b/spaceman/init.h
new file mode 100644
index 0000000..ecd0b5d
--- /dev/null
+++ b/spaceman/init.h
@@ -0,0 +1,24 @@
+/*
+ * Copyright (c) 2012 Red Hat, Inc.
+ * All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+ */
+
+extern char	*progname;
+extern int	exitcode;
+
+#define min(a,b)	(((a)<(b))?(a):(b))
+#define max(a,b)	(((a)>(b))?(a):(b))
+
diff --git a/spaceman/space.h b/spaceman/space.h
new file mode 100644
index 0000000..c6a63fe
--- /dev/null
+++ b/spaceman/space.h
@@ -0,0 +1,37 @@
+/*
+ * Copyright (c) 2012 Red Hat, Inc.
+ * All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+ */
+
+typedef struct fileio {
+	int		fd;		/* open file descriptor */
+	int		flags;		/* flags describing file state */
+	char		*name;		/* file name at time of open */
+	xfs_fsop_geom_t	geom;		/* XFS filesystem geometry */
+} fileio_t;
+
+extern fileio_t		*filetable;	/* open file table */
+extern int		filecount;	/* number of open files */
+extern fileio_t		*file;		/* active file in file table */
+extern int filelist_f(void);
+
+extern int	openfile(char *, xfs_fsop_geom_t *, int, mode_t);
+extern int	addfile(char *, int , xfs_fsop_geom_t *, int);
+
+extern void	file_init(void);
+extern void	help_init(void);
+extern void	quit_init(void);
+extern void	freesp_init(void);

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [RFC, PATCH 0/2] fiemap: filesystem free space mapping
  2012-10-18  5:11 [RFC, PATCH 0/2] fiemap: filesystem free space mapping Dave Chinner
                   ` (2 preceding siblings ...)
  2012-10-18  5:27 ` [RFC, PATCH 3/2] xfsprogs: space management tool Dave Chinner
@ 2012-10-18  8:10 ` Andreas Dilger
  2012-10-18 21:07   ` Dave Chinner
  2012-10-23 12:30 ` Christoph Hellwig
  4 siblings, 1 reply; 15+ messages in thread
From: Andreas Dilger @ 2012-10-18  8:10 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-fsdevel, xfs

On 2012-10-17, at 11:11 PM, Dave Chinner wrote:
> So, I was bored a few days ago, and I was sick of having to run
> xfs_db incorrectly report free space extents when the filesytem is
> mounted, so I decided to extend fiemap to export freespace mappings
> to userspace so I could get the information coherently through the
> mounted filesystem.
> 
> Yes, this could probably be considered interface abuse but, well, it
> was simple to do because extent mapping is exactly what fiemap is
> designed to do. Hence I didn't have to write new walkers/formatters
> and I was using code I knew worked correctly.

One question about the usage of this interface - is the ioctl()
called on an open fd for the root inode, or is it called on any
open fd in the filesystem?  In some sense, getting the free space
on the root (or preferably block dev inode if that would work)
would make the most sense, since FIEMAP is intended to be related
to a specific file.

That said, it is a lot easier to use if it can be on any open file
handle in the filesystem, and one could consider the free space as
being related to every file in the filesystem (e.g. for the next
block allocation or defrag migration).

> There are two methods of mapping - one is reporting free space in
> ascending extent start offset order, then other in ascending extent
> length order. Both a useful to have (e.g. defragmenter might want to
> know about the nearest free block to given offset or the largest
> free extent in a given region). Either way, XFS keeps indexes
> ordered in both ways, so they can be exported directly with minimal
> overhead.
> 
> The only "interesting" abuse of the interface is really the use of
> FIEMAP_EXTENT_LAST. This means that the last extent in a freespace
> index is being returned, rather than the last freespace extent. This
> is done because filesystems often have multiple free space indexes,
> and it may be difficult to sort/scan over multiple indexes in a
> single map.

I'm not sure I understand the distinction you are trying to convey here.
Could you elaborate?

> This means an application needs to keep track of what freespace has
> been returned to it and adjust it's fiemap ranges apprpritately, or
> be aware of the underlying filesystem structure to for requests that
> don't span free space indexes. I don't see this a bug problem,
> because any application that is digging in freespace maps needs to
> know how the filesystem is structured to make any sense of the
> infomration returned. As such, I see this interface purely for
> filesystem diagnostics or utilities tightly bound to the filesystem
> (e.g. xfs_fsr).
> 
> I'll attach a patch for a small utility that uses this interace to
> replicate the xfs_db freespace command in a short while so people
> can see how it is used. that shoul dmake it easier to comment on. :)
> 
> Cheers,
> 
> Dave.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Cheers, Andreas





_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC, PATCH 0/2] fiemap: filesystem free space mapping
  2012-10-18  8:10 ` [RFC, PATCH 0/2] fiemap: filesystem free space mapping Andreas Dilger
@ 2012-10-18 21:07   ` Dave Chinner
  0 siblings, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2012-10-18 21:07 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: linux-fsdevel, xfs

On Thu, Oct 18, 2012 at 02:10:59AM -0600, Andreas Dilger wrote:
> On 2012-10-17, at 11:11 PM, Dave Chinner wrote:
> > So, I was bored a few days ago, and I was sick of having to run
> > xfs_db incorrectly report free space extents when the filesytem is
> > mounted, so I decided to extend fiemap to export freespace mappings
> > to userspace so I could get the information coherently through the
> > mounted filesystem.
> > 
> > Yes, this could probably be considered interface abuse but, well, it
> > was simple to do because extent mapping is exactly what fiemap is
> > designed to do. Hence I didn't have to write new walkers/formatters
> > and I was using code I knew worked correctly.
> 
> One question about the usage of this interface - is the ioctl()
> called on an open fd for the root inode, or is it called on any
> open fd in the filesystem?  In some sense, getting the free space
> on the root (or preferably block dev inode if that would work)
> would make the most sense, since FIEMAP is intended to be related
> to a specific file.

fiemap in XFS is currently only hooked up to files, not directories.
I didn't change that, so it needs an open regular file in the
filesystem to work. I need to change that for it to work on
directories - I think that having it work on the root dir of a
filesystem is the right thing to do, but really having it behave
like fstatfs(2) is where it should end up, I think.

> That said, it is a lot easier to use if it can be on any open file
> handle in the filesystem, and one could consider the free space as
> being related to every file in the filesystem (e.g. for the next
> block allocation or defrag migration).

*nod*

> > There are two methods of mapping - one is reporting free space in
> > ascending extent start offset order, then other in ascending extent
> > length order. Both a useful to have (e.g. defragmenter might want to
> > know about the nearest free block to given offset or the largest
> > free extent in a given region). Either way, XFS keeps indexes
> > ordered in both ways, so they can be exported directly with minimal
> > overhead.
> > 
> > The only "interesting" abuse of the interface is really the use of
> > FIEMAP_EXTENT_LAST. This means that the last extent in a freespace
> > index is being returned, rather than the last freespace extent. This
> > is done because filesystems often have multiple free space indexes,
> > and it may be difficult to sort/scan over multiple indexes in a
> > single map.
> 
> I'm not sure I understand the distinction you are trying to convey here.
> Could you elaborate?

XFS has multiple Allocation Groups with separate indexes in each AG.
It only makes sense for filesystem tools to be finding free space in
a specific region (i.e. the AG they want to allocate in). xfs_fsr
already controls the AG that the new extents are allocated in, but
it has no idea of whether that is the best AG to relocate the data
to - it just follows the kernel allocation rules based on the
location of the inode. If we want to select a new AG based on, say,
largest free extent size, then we need to know what the largest
sizes in each AG are. Hence we want to know when we reach the end of
an AG index when pulling the freespace data out of the kernel so we
can categorise it by AG.

I suspect a similar thing might be useful for btrfs, with per-device
freespace mappings...
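
As a rough illustration of the bookkeeping described above (a sketch
only, not code from the posted patches): `struct free_extent`,
`EXT_LAST` and `largest_per_ag()` are hypothetical stand-ins for
`struct fiemap_extent`, `FIEMAP_EXTENT_LAST` and whatever a real tool
would use. It assumes extents come back AG by AG, that the final extent
of each AG index carries the "last" flag, and that every AG contributes
at least one free extent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXT_LAST	0x1u	/* hypothetical stand-in for FIEMAP_EXTENT_LAST */

/* Hypothetical, cut-down stand-in for struct fiemap_extent. */
struct free_extent {
	uint64_t	start;	/* byte offset of the free extent */
	uint64_t	length;	/* length of the free extent in bytes */
	uint32_t	flags;	/* EXT_LAST marks the end of one AG index */
};

/*
 * Walk the extents in the order they were returned and record the
 * largest free extent seen in each AG. An extent flagged EXT_LAST is
 * the final one in the current AG index, so the next extent belongs to
 * the following AG. Returns the number of AGs that had extents.
 */
static size_t
largest_per_ag(const struct free_extent *ext, size_t count,
	       uint64_t *largest, size_t nags)
{
	size_t	ag = 0, seen = 0, i;

	assert(ext != NULL && largest != NULL);

	for (i = 0; i < nags; i++)
		largest[i] = 0;

	for (i = 0; i < count && ag < nags; i++) {
		if (ext[i].length > largest[ag])
			largest[ag] = ext[i].length;
		seen = ag + 1;
		if (ext[i].flags & EXT_LAST)
			ag++;	/* end of this AG's index */
	}
	return seen;
}
```

A tool picking a target AG could then scan `largest[]` for the biggest
entry. A real implementation would more likely derive the AG number
from the extent offset and the filesystem geometry rather than rely on
counting flags.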

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC, PATCH 0/2] fiemap: filesystem free space mapping
  2012-10-18  5:11 [RFC, PATCH 0/2] fiemap: filesystem free space mapping Dave Chinner
                   ` (3 preceding siblings ...)
  2012-10-18  8:10 ` [RFC, PATCH 0/2] fiemap: filesystem free space mapping Andreas Dilger
@ 2012-10-23 12:30 ` Christoph Hellwig
  2012-10-23 21:53   ` Dave Chinner
  4 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2012-10-23 12:30 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-fsdevel, xfs

On Thu, Oct 18, 2012 at 04:11:17PM +1100, Dave Chinner wrote:
> So, I was bored a few days ago, and I was sick of having to run
> xfs_db incorrectly report free space extents when the filesytem is
> mounted, so I decided to extend fiemap to export freespace mappings
> to userspace so I could get the information coherently through the
> mounted filesystem.
> 
> Yes, this could probably be considered interface abuse but, well, it
> was simple to do because extent mapping is exactly what fiemap is
> designed to do. Hence I didn't have to write new walkers/formatters
> and I was using code I knew worked correctly.

I think the right way to handle this is to introduce a new ioctl which
uses the same structures.  That way we have a reasonable interface,
without issues like which file it needs to be called on, because the
VFS glue can turn it into a superblock op.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC, PATCH 0/2] fiemap: filesystem free space mapping
  2012-10-23 12:30 ` Christoph Hellwig
@ 2012-10-23 21:53   ` Dave Chinner
  2012-10-24 11:47     ` Chris Mason
  0 siblings, 1 reply; 15+ messages in thread
From: Dave Chinner @ 2012-10-23 21:53 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-fsdevel, xfs

On Tue, Oct 23, 2012 at 08:30:44AM -0400, Christoph Hellwig wrote:
> On Thu, Oct 18, 2012 at 04:11:17PM +1100, Dave Chinner wrote:
> > So, I was bored a few days ago, and I was sick of having to run
> > xfs_db incorrectly report free space extents when the filesytem is
> > mounted, so I decided to extend fiemap to export freespace mappings
> > to userspace so I could get the information coherently through the
> > mounted filesystem.
> > 
> > Yes, this could probably be considered interface abuse but, well, it
> > was simple to do because extent mapping is exactly what fiemap is
> > designed to do. Hence I didn't have to write new walkers/formatters
> > and I was using code I knew worked correctly.
> 
> I think the right way to handle this is to introduce a new ioctl which
> uses the same structures.  That way we have a reasonable interface,
> without issue like which file does it need to be called on because the
> VFS glue can turn it into a superblock op.

A VFS level ioctl or an XFS ioctl?

I thought about a new ioctl, but then what's the point of having an
extensible fiemap interface if we create new ioctls with an
identical interface for doing something that the existing ioctl is
perfectly capable of doing?  I'd still need special flags to control
the ioctl behaviour even though it uses struct fiemap and plumbing,
so it seemed pointless to introduce a new ioctl....
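
For readers following along, this is roughly what setting up such a
request looks like from userspace. It is a sketch under assumptions,
not the interface from the patches: `struct fiemap` and its fields come
from `<linux/fiemap.h>`, the flag name `FIEMAP_FLAG_FREESPACE` is from
the patch description, but the value used below is a placeholder and
`freesp_request()` is a hypothetical helper.

```c
#include <assert.h>
#include <linux/fiemap.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * ASSUMPTION: placeholder value for illustration only. The real value
 * is whatever the kernel patch defines; it is not shown in this thread.
 */
#ifndef FIEMAP_FLAG_FREESPACE
#define FIEMAP_FLAG_FREESPACE	0x00000010
#endif

/*
 * Allocate a fiemap request asking for free space in the range of
 * `length` bytes starting `start` bytes from the start of the
 * filesystem, with room for `nextents` returned extents. The caller
 * passes the result to the mapping ioctl and frees it afterwards.
 */
static struct fiemap *
freesp_request(uint64_t start, uint64_t length, uint32_t nextents)
{
	struct fiemap	*req;

	assert(nextents > 0);

	/* header plus the trailing array of extent records, zeroed */
	req = calloc(1, sizeof(*req) +
			nextents * sizeof(struct fiemap_extent));
	if (!req)
		return NULL;

	req->fm_start = start;
	req->fm_length = length;
	req->fm_flags = FIEMAP_FLAG_FREESPACE;
	req->fm_extent_count = nextents;
	return req;
}
```

On return from the ioctl, `fm_mapped_extents` says how many entries of
`fm_extents[]` were filled in, exactly as for an ordinary file mapping.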

As it is, the only reason fiemap doesn't work on directories
for XFS is that it hasn't been hooked up to directories. I can't see
anything in the fiemap VFS layers that prevents us from mapping
directories and we know the mapping code in XFS works on
directories. So that would make the "what file" problem go away - any
file would do as long as the user has the permissions to run the
free space mapping command....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC, PATCH 0/2] fiemap: filesystem free space mapping
  2012-10-23 21:53   ` Dave Chinner
@ 2012-10-24 11:47     ` Chris Mason
  2012-10-24 12:32       ` Jie Liu
  2012-10-24 15:09       ` Christoph Hellwig
  0 siblings, 2 replies; 15+ messages in thread
From: Chris Mason @ 2012-10-24 11:47 UTC (permalink / raw)
  To: Dave Chinner
  Cc: Christoph Hellwig, linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com

On Tue, Oct 23, 2012 at 03:53:13PM -0600, Dave Chinner wrote:
> On Tue, Oct 23, 2012 at 08:30:44AM -0400, Christoph Hellwig wrote:
> > On Thu, Oct 18, 2012 at 04:11:17PM +1100, Dave Chinner wrote:
> > > So, I was bored a few days ago, and I was sick of having to run
> > > xfs_db incorrectly report free space extents when the filesytem is
> > > mounted, so I decided to extend fiemap to export freespace mappings
> > > to userspace so I could get the information coherently through the
> > > mounted filesystem.
> > > 
> > > Yes, this could probably be considered interface abuse but, well, it
> > > was simple to do because extent mapping is exactly what fiemap is
> > > designed to do. Hence I didn't have to write new walkers/formatters
> > > and I was using code I knew worked correctly.
> > 
> > I think the right way to handle this is to introduce a new ioctl which
> > uses the same structures.  That way we have a reasonable interface,
> > without issue like which file does it need to be called on because the
> > VFS glue can turn it into a superblock op.
> 
> A VFS level ioctl or an XFS ioctl?
> 
> I thought about a new ioctl, but then what's the point of having an
> extensible fiemap interface if we create new ioctls with an
> identical interface for doing something that the existing ioctl is
> perfectly capable of doing?  I'd still need special flags to control
> the ioctl behaviour even though it uses struct fiemap and plumbing,
> so it seemed pointless to introduce a new ioctl....

This brings us one step closer to the norton disk doctor defrag display.
I'm all for it in the main fiemap call, it makes much more sense for the
users I think.

-chris

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC, PATCH 0/2] fiemap: filesystem free space mapping
  2012-10-24 11:47     ` Chris Mason
@ 2012-10-24 12:32       ` Jie Liu
  2012-10-24 15:09       ` Christoph Hellwig
  1 sibling, 0 replies; 15+ messages in thread
From: Jie Liu @ 2012-10-24 12:32 UTC (permalink / raw)
  To: Dave Chinner, Chris Mason, Christoph Hellwig,
	linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com

On 10/24/12 19:47, Chris Mason wrote:
> On Tue, Oct 23, 2012 at 03:53:13PM -0600, Dave Chinner wrote:
>> On Tue, Oct 23, 2012 at 08:30:44AM -0400, Christoph Hellwig wrote:
>>> On Thu, Oct 18, 2012 at 04:11:17PM +1100, Dave Chinner wrote:
>>>> So, I was bored a few days ago, and I was sick of having to run
>>>> xfs_db incorrectly report free space extents when the filesytem is
>>>> mounted, so I decided to extend fiemap to export freespace mappings
>>>> to userspace so I could get the information coherently through the
>>>> mounted filesystem.
>>>>
>>>> Yes, this could probably be considered interface abuse but, well, it
>>>> was simple to do because extent mapping is exactly what fiemap is
>>>> designed to do. Hence I didn't have to write new walkers/formatters
>>>> and I was using code I knew worked correctly.
>>> I think the right way to handle this is to introduce a new ioctl which
>>> uses the same structures.  That way we have a reasonable interface,
>>> without issue like which file does it need to be called on because the
>>> VFS glue can turn it into a superblock op.
>> A VFS level ioctl or an XFS ioctl?
>>
>> I thought about a new ioctl, but then what's the point of having an
>> extensible fiemap interface if we create new ioctls with an
>> identical interface for doing something that the existing ioctl is
>> perfectly capable of doing?  I'd still need special flags to control
>> the ioctl behaviour even though it uses struct fiemap and plumbing,
>> so it seemed pointless to introduce a new ioctl....
Hi Dave,

I am writing an XFS shrinkfs feature, and I really need a way to get
the free space of an XFS file system, since xfs_db cannot fetch
agf->agf_freeblks and agf_btreeblks from a mounted partition to
calculate it for an online shrink operation, just as you mentioned
above.

So currently I have added a new ioctl for this purpose. It would be
fine if we can have a fiemap interface to do it, so that I can kill
this new ioctl to avoid duplicate effort.

Thanks,
-Jeff
> This brings us one step closer to the norton disk doctor defrag display.
> I'm all for it in the main fiemap call, it makes much more sense for the
> users I think.
>
> -chris

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC, PATCH 0/2] fiemap: filesystem free space mapping
  2012-10-24 11:47     ` Chris Mason
  2012-10-24 12:32       ` Jie Liu
@ 2012-10-24 15:09       ` Christoph Hellwig
  2012-10-24 19:15         ` Dave Chinner
  1 sibling, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2012-10-24 15:09 UTC (permalink / raw)
  To: Chris Mason, Dave Chinner, Christoph Hellwig,
	linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com

On Wed, Oct 24, 2012 at 07:47:17AM -0400, Chris Mason wrote:
> I'm all for it in the main fiemap call, it makes much more sense for the
> users I think.

How so?  Current fiemap is per-inode information, Dave's new call is
per-fs.  Making one a flag of another is a gross user interface.  In
addition we're bound to get issues where filesystems fail to wire up
fiemap to the tons of different iops just for this operation, or
accidentally wire up "real" fiemap to things like special files or
pipes.

Btw, I'd like to restate that I'd really love to see this functionality in
the VFS, just not multiplexed over FIEMAP.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [RFC, PATCH 0/2] fiemap: filesystem free space mapping
  2012-10-24 15:09       ` Christoph Hellwig
@ 2012-10-24 19:15         ` Dave Chinner
  0 siblings, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2012-10-24 19:15 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Chris Mason, linux-fsdevel@vger.kernel.org, xfs@oss.sgi.com

On Wed, Oct 24, 2012 at 11:09:51AM -0400, Christoph Hellwig wrote:
> On Wed, Oct 24, 2012 at 07:47:17AM -0400, Chris Mason wrote:
> > I'm all for it in the main fiemap call, it makes much more sense for the
> > users I think.
> 
> How so?  Current fiemap is per-inode information, Dave's new call is
> per-fs.  Making one a flag of another is a gross user interface.  In
> addition we're bound to get issues where filesystems fail to wire up
> fiemap to the tons of different iops just for this operation, or
> accidentally wire up "real" fiemap to things like special files or
> pipes.
> 
> Btw, I'd like to restate that I'd really love to see this functionality in
> the VFS, just not multiplexed over FIEMAP.

That's fine. I just wanted to clarify what you were asking.
FIEMAPFS it is, then...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] fiemap: add freespace mapping to FS_IOC_FIEMAP
  2012-10-18  5:11 ` [PATCH 1/2] fiemap: add freespace mapping to FS_IOC_FIEMAP Dave Chinner
@ 2012-11-08 16:50   ` Mark Tinguely
  2012-11-08 20:56     ` Dave Chinner
  0 siblings, 1 reply; 15+ messages in thread
From: Mark Tinguely @ 2012-11-08 16:50 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-fsdevel, xfs

On 10/18/12 00:11, Dave Chinner wrote:
> From: Dave Chinner<dchinner@redhat.com>
>
> fiemap is used to map extents of used space on files. it's just an
> array of extents, though, so there's no reason it can only index
> *used*  space.
>
> Ther eis need for getting freespace layout information into
> userspace. For example, defragmentation programs would find it
> useful to be able to map the free space in the filesystem to
> work out where it is best to move data to defragment it.
> Alternatively, knowing where free space is enables us to identify
> extents that need to be moved to defragment free space.
>
> Hence, extend fiemap with the FIEMAP_FLAG_FREESPACE to indicate that
> the caller wants to map free space in the range fm_start bytes from
> the start of the filesystem for fm_length bytes.
>
> Because XFS can report extents in size order without needing to
> sort, and this information is useful to xfs_fsr, also add
> FIEMAP_FLAG_FREESPACE_SIZE to tell the filesystem to return a
> freespace map ordered by extent size rather than offset. If there
> are multiple extents of the same size, then they are ordered by
> offset.
>
> Signed-off-by: Dave Chinner<dchinner@redhat.com>
> ---
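
The ordering described in the commit message (ascending size, ties
broken by offset) can be written down as a comparator. This is an
illustrative sketch only: XFS gets this order directly from its
by-size index without sorting, and `struct free_extent`,
`cmp_by_size()` and `sort_by_size()` are hypothetical names for a
userspace tool that wants the same order from offset-ordered data.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical minimal free extent record. */
struct free_extent {
	uint64_t	start;	/* byte offset of the free extent */
	uint64_t	length;	/* length of the free extent in bytes */
};

/* Order by ascending length; equal lengths by ascending offset. */
static int
cmp_by_size(const void *a, const void *b)
{
	const struct free_extent *x = a;
	const struct free_extent *y = b;

	if (x->length != y->length)
		return x->length < y->length ? -1 : 1;
	if (x->start != y->start)
		return x->start < y->start ? -1 : 1;
	return 0;
}

static void
sort_by_size(struct free_extent *ext, size_t count)
{
	assert(ext != NULL || count == 0);
	if (count > 1)
		qsort(ext, count, sizeof(*ext), cmp_by_size);
}
```

The explicit comparisons avoid subtracting 64-bit values and truncating
the result to int, which would give the wrong sign for large extents.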

...

>   --------------
> diff --git a/include/linux/fiemap.h b/include/linux/fiemap.h
> index d830747..f4fbb9f 100644
> --- a/include/linux/fiemap.h
> +++ b/include/linux/fiemap.h

   include/uabi/linux/fiemap.h
           ^^^^
other than that, it looks good.

Reviewed-by: Mark Tinguely <tinguely@sgi.com>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] fiemap: add freespace mapping to FS_IOC_FIEMAP
  2012-11-08 16:50   ` Mark Tinguely
@ 2012-11-08 20:56     ` Dave Chinner
  2012-11-08 21:01       ` Mark Tinguely
  0 siblings, 1 reply; 15+ messages in thread
From: Dave Chinner @ 2012-11-08 20:56 UTC (permalink / raw)
  To: Mark Tinguely; +Cc: linux-fsdevel, xfs

On Thu, Nov 08, 2012 at 10:50:49AM -0600, Mark Tinguely wrote:
> On 10/18/12 00:11, Dave Chinner wrote:
> >From: Dave Chinner<dchinner@redhat.com>
> >
> >fiemap is used to map extents of used space on files. it's just an
> >array of extents, though, so there's no reason it can only index
> >*used*  space.
> >
> >Ther eis need for getting freespace layout information into
> >userspace. For example, defragmentation programs would find it
> >useful to be able to map the free space in the filesystem to
> >work out where it is best to move data to defragment it.
> >Alternatively, knowing where free space is enables us to identify
> >extents that need to be moved to defragment free space.
> >
> >Hence, extend fiemap with the FIEMAP_FLAG_FREESPACE to indicate that
> >the caller wants to map free space in the range fm_start bytes from
> >the start of the filesystem for fm_length bytes.
> >
> >Because XFS can report extents in size order without needing to
> >sort, and this information is useful to xfs_fsr, also add
> >FIEMAP_FLAG_FREESPACE_SIZE to tell the filesystem to return a
> >freespace map ordered by extent size rather than offset. If there
> >are multiple extents of the same size, then they are ordered by
> >offset.
> >
> >Signed-off-by: Dave Chinner<dchinner@redhat.com>
> >---
> 
> ...
> 
> >  --------------
> >diff --git a/include/linux/fiemap.h b/include/linux/fiemap.h
> >index d830747..f4fbb9f 100644
> >--- a/include/linux/fiemap.h
> >+++ b/include/linux/fiemap.h
> 
>   include/uabi/linux/fiemap.h
>           ^^^^
> other than that, it looks good.

include/uapi/ actually, but that change was made after I posted the
patches so there's no surprise that it didn't apply.

As it is, this needs to be redone into an FS_IOC_FIEMAPFS ioctl in
response to other reviews. I've already done that work (a week ago),
I just haven't fully tested it yet so I haven't reported it...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/2] fiemap: add freespace mapping to FS_IOC_FIEMAP
  2012-11-08 20:56     ` Dave Chinner
@ 2012-11-08 21:01       ` Mark Tinguely
  0 siblings, 0 replies; 15+ messages in thread
From: Mark Tinguely @ 2012-11-08 21:01 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-fsdevel, xfs

On 11/08/12 14:56, Dave Chinner wrote:
> On Thu, Nov 08, 2012 at 10:50:49AM -0600, Mark Tinguely wrote:
>> On 10/18/12 00:11, Dave Chinner wrote:
>>> From: Dave Chinner<dchinner@redhat.com>
>>>
>>> fiemap is used to map extents of used space on files. it's just an
>>> array of extents, though, so there's no reason it can only index
>>> *used*  space.
>>>
>>> Ther eis need for getting freespace layout information into
>>> userspace. For example, defragmentation programs would find it
>>> useful to be able to map the free space in the filesystem to
>>> work out where it is best to move data to defragment it.
>>> Alternatively, knowing where free space is enables us to identify
>>> extents that need to be moved to defragment free space.
>>>
>>> Hence, extend fiemap with the FIEMAP_FLAG_FREESPACE to indicate that
>>> the caller wants to map free space in the range fm_start bytes from
>>> the start of the filesystem for fm_length bytes.
>>>
>>> Because XFS can report extents in size order without needing to
>>> sort, and this information is useful to xfs_fsr, also add
>>> FIEMAP_FLAG_FREESPACE_SIZE to tell the filesystem to return a
>>> freespace map ordered by extent size rather than offset. If there
>>> are multiple extents of the same size, then they are ordered by
>>> offset.
>>>
>>> Signed-off-by: Dave Chinner<dchinner@redhat.com>
>>> ---
>>
>> ...
>>
>>>   --------------
>>> diff --git a/include/linux/fiemap.h b/include/linux/fiemap.h
>>> index d830747..f4fbb9f 100644
>>> --- a/include/linux/fiemap.h
>>> +++ b/include/linux/fiemap.h
>>
>>    include/uabi/linux/fiemap.h
>>            ^^^^
>> other than that, it looks good.
>
> include/uapi/ actually, but that change was made after I posted the
> patches so there's no surprise that it didn't apply.
>
> As it is, this needs to be redone into an FS_IOC_FIEMAPFS ioctl in
> response to other reviews. I've already done that work (a week ago),
> I just haven't fully tested it yet so I haven't reported it...
>
> Cheers,
>
> Dave.

Okay. Thank-you for the update.

--Mark.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2012-11-08 21:01 UTC | newest]

Thread overview: 15+ messages
2012-10-18  5:11 [RFC, PATCH 0/2] fiemap: filesystem free space mapping Dave Chinner
2012-10-18  5:11 ` [PATCH 1/2] fiemap: add freespace mapping to FS_IOC_FIEMAP Dave Chinner
2012-11-08 16:50   ` Mark Tinguely
2012-11-08 20:56     ` Dave Chinner
2012-11-08 21:01       ` Mark Tinguely
2012-10-18  5:11 ` [PATCH 2/2] xfs: implement FIEMAP_FLAG_FREESPACE_* Dave Chinner
2012-10-18  5:27 ` [RFC, PATCH 3/2] xfsprogs: space management tool Dave Chinner
2012-10-18  8:10 ` [RFC, PATCH 0/2] fiemap: filesystem free space mapping Andreas Dilger
2012-10-18 21:07   ` Dave Chinner
2012-10-23 12:30 ` Christoph Hellwig
2012-10-23 21:53   ` Dave Chinner
2012-10-24 11:47     ` Chris Mason
2012-10-24 12:32       ` Jie Liu
2012-10-24 15:09       ` Christoph Hellwig
2012-10-24 19:15         ` Dave Chinner
