REVIEW: xfs_reno #2

public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed

* REVIEW: xfs_reno #2
@ 2007-10-04  4:25 Barry Naujok
  2007-10-17 15:48 ` Ruben Porras
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Barry Naujok @ 2007-10-04  4:25 UTC (permalink / raw)
  To: xfs@oss.sgi.com, xfs-dev

[-- Attachment #1: Type: text/plain, Size: 489 bytes --]

A couple changes from the first xfs_reno:

  - Major one is that symlinks are now supported, but only
    owner, group and extended attributes are copied for them
    (not times or inode attributes).

  - Man page!


To make this better, ideally we need some form of
"swap inodes" function in the kernel, where the entire
contents of the inode themselves are swapped. This form
can handle any inode and without any of the dir/file/attr/etc
copy/swap mechanisms we have in xfs_reno.

Barry.

[-- Attachment #2: xfs_reno.patch --]
[-- Type: application/octet-stream, Size: 50138 bytes --]


===========================================================================
xfsdump/Makefile
===========================================================================

--- a/xfsdump/Makefile	2007-10-04 14:16:39.000000000 +1000
+++ b/xfsdump/Makefile	2007-09-14 17:31:31.916437140 +1000
@@ -16,7 +16,7 @@
 	Logs/* built .census install.* install-dev.* *.gz
 
 SUBDIRS = include librmt \
-	common estimate fsr inventory invutil dump restore \
+	common estimate fsr inventory invutil dump reno restore \
 	m4 man doc po debian build
 
 default: $(CONFIGURE)

===========================================================================
xfsdump/man/man8/xfs_reno.8
===========================================================================

--- a/xfsdump/man/man8/xfs_reno.8	2006-06-17 00:58:24.000000000 +1000
+++ b/xfsdump/man/man8/xfs_reno.8	2007-10-04 14:10:30.316027694 +1000
@@ -0,0 +1,117 @@
+.TH xfs_reno 8
+.SH NAME
+xfs_reno \- renumber XFS inodes
+.SH SYNOPSIS
+.B xfs_reno
+[
+.B \-fnpqv
+] [
+.B \-P
+.I interval
+]
+.I path
+.br
+.B xfs_reno \-r
+.I recover_file
+.SH DESCRIPTION
+.B xfs_reno
+is applicable only to XFS filesystems.
+.PP
+.B xfs_reno
+renumbers inodes. XFS supports 64-bit inode numbers, although by
+default it will avoid creating inodes with numbers greater than
+what can be contained within a 32-bit number. If a filesystem does
+contain inode numbers greater than 32-bits, then this can conflict with
+applications that do not support them.
+To recover from this situation previously, affected files would need
+to be copied (and so get a new inode number) and the old version
+removed. This can be time consuming and impractical for very large
+files and filesystems.
+.B xfs_reno
+can be used to renumber such inodes quickly.
+.B xfs_reno
+will copy the inodes of affected files and move the data from the old
+inode to the new without having to copy the data.
+.B xfs_reno
+relies on XFS in the kernel to allocate a new inode number, so if the
+filesystem has been mounted with the
+.I inode64
+mount option, the new inodes will quite possibly have inode numbers
+greater than 32-bits.
+.PP
+.B xfs_reno
+should only be used on a filesystem where it is necessary to
+renumber inodes. Use of
+.B xfs_reno
+on a regular basis is
+.IR "not recommended" .
+Apart from application compatibility, there is no particular advantage
+to be had from renumbering inodes.
+.PP
+.B xfs_reno
+works by traversing a directory tree, scanning all the directories
+and noting which files require renumbering. Once the scanning phase
+is done, it will process the appropriate files and directories. The
+directory's absolute pathname must be given to
+.BR xfs_reno .
+The following options are accepted by
+.BR xfs_reno .
+.TP
+.B \-f
+Force conversion on all inodes, rather than just those with a 64-bit
+inode number. This is not particularly useful except for debugging
+purposes.
+.TP
+.B \-n
+Do nothing, perform a trial run.
+.TP
+.B \-v
+Increases the verbosity of progress and error messages.  Additional
+.BR \-v 's
+can be used to further increase verbosity.
+.TP
+.B \-q
+Do not report progress, only errors.
+.TP
+.B \-p
+Show progress status.
+.TP
+.BI \-P " seconds"
+Set the interval for the progress status in seconds.  The default is 1
+second.
+.TP
+.B \-r
+Recover from an interrupted run.  If
+.B xfs_reno
+is interrupted, it will leave a file called
+.I xfs_reno.recover
+in the directory specified on the command line.  This file will
+contain enough information so that
+.B xfs_reno
+can either finish processing the file it was working on when
+interrupted or back out the last change it made, depending on how far
+through the process it had progressed.
+.B xfs_reno
+will only recover the single file it was working on so it will need
+to be run again on the directory to be sure that all the appropriate
+inodes have been converted.
+.SH EXAMPLES
+To renumber inodes with 64-bit inode numbers:
+.IP
+.B # xfs_reno -p /path/to/directory
+.PP
+To recover from an interrupted run:
+.IP
+.B # xfs_reno -r /path/to/directory/xfs_reno.recover
+.PP
+.SH FILES
+.PD
+.TP
+.I /path/xfs_reno.recover
+records the state where renumbering was interrupted.
+.PD
+.SH SEE ALSO
+.BR xfs_fsr (8),
+.BR xfs_ncheck (8),
+.BR fstab (5),
+.BR xfs (5).

===========================================================================
xfsdump/reno/Makefile
===========================================================================

--- a/xfsdump/reno/Makefile	2006-06-17 00:58:24.000000000 +1000
+++ b/xfsdump/reno/Makefile	2007-10-02 17:06:18.658320738 +1000
@@ -0,0 +1,19 @@
+#
+# Copyright (c) 2007 Silicon Graphics, Inc.  All Rights Reserved.
+#
+
+TOPDIR = ..
+include $(TOPDIR)/include/builddefs
+
+LTCOMMAND = xfs_reno
+CFILES = xfs_reno.c
+LLDLIBS = $(LIBATTR)
+
+default: $(LTCOMMAND)
+
+include $(BUILDRULES)
+
+install: default
+	$(INSTALL) -m 755 -d $(PKG_BIN_DIR)
+	$(LTINSTALL) -m 755 $(LTCOMMAND) $(PKG_BIN_DIR)
+install-dev:

===========================================================================
xfsdump/reno/xfs_reno.c
===========================================================================

--- a/xfsdump/reno/xfs_reno.c	2006-06-17 00:58:24.000000000 +1000
+++ b/xfsdump/reno/xfs_reno.c	2007-10-04 14:11:43.102521161 +1000
@@ -0,0 +1,2040 @@
+/*
+ * Copyright (c) 2007 Silicon Graphics, Inc.
+ * All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+ */
+
+/*
+ * xfs_reno - renumber 64-bit inodes
+ *
+ * xfs_reno [-f] [-n] [-p] [-q] [-v] [-P seconds] path ...
+ * xfs_reno [-r] path ...
+ *
+ * Renumbers all inodes > 32 bits into 32 bit space. Requires the filesytem
+ * to be mounted with inode32.
+ *
+ *	-f		force conversion on all inodes rather than just
+ *			those with a 64bit inode number.
+ *	-n		nothing, do not renumber inodes
+ *	-p		show progress status.
+ *	-q		quiet, do not report progress, only errors.
+ *	-v		verbose, more -v's more verbose.
+ *	-P seconds	set the interval for the progress status in seconds.
+ *	-r		recover from an interrupted run.
+ */
+
+#include <xfs/xfs.h>
+
+#include <dirent.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <ftw.h>
+#include <libgen.h>
+#include <malloc.h>
+#include <signal.h>
+#include <stdint.h>
+#include <sys/ioctl.h>
+#include <attr/attributes.h>
+#include <xfs/xfs_dfrag.h>
+#include <xfs/xfs_inum.h>
+
+#define ATTRBUFSIZE	1024
+
+#define SCAN_PHASE	0x00
+#define DIR_PHASE	0x10	/* nothing done or all done */
+#define DIR_PHASE_1	0x11	/* target dir created */
+#define DIR_PHASE_2	0x12	/* temp dir created */
+#define DIR_PHASE_3	0x13	/* attributes backed up to temp */
+#define DIR_PHASE_4	0x14	/* dirents moved to target dir */
+#define DIR_PHASE_5	0x15	/* attributes applied to target dir */
+#define DIR_PHASE_6	0x16	/* src dir removed */
+#define DIR_PHASE_7	0x17	/* temp dir removed */
+#define DIR_PHASE_MAX	0x17
+#define FILE_PHASE	0x20	/* nothing done or all done */
+#define FILE_PHASE_1	0x21	/* temp file created */
+#define FILE_PHASE_2	0x22	/* swapped extents */
+#define FILE_PHASE_3	0x23	/* unlinked source */
+#define FILE_PHASE_4	0x24	/* renamed temp to source name */
+#define FILE_PHASE_MAX	0x24
+#define SLINK_PHASE	0x30	/* nothing done or all done */
+#define SLINK_PHASE_1	0x31	/* temp symlink created */
+#define SLINK_PHASE_2	0x32	/* symlink attrs copied */
+#define SLINK_PHASE_3	0x33	/* unlinked source */
+#define SLINK_PHASE_4	0x34	/* renamed temp to source name */
+#define SLINK_PHASE_MAX	0x34
+
+static void update_recoverfile(void);
+#define SET_PHASE(x)	(cur_phase = x, update_recoverfile())
+
+#define LOG_ERR		0
+#define LOG_NORMAL	1
+#define LOG_INFO	2
+#define LOG_DEBUG	3
+#define LOG_NITTY	4
+
+#define NH_BUCKETS	65536
+#define NH_HASH(ino)	(nodehash + ((ino) % NH_BUCKETS))
+
+typedef struct {
+	xfs_ino_t	ino;
+	int		ftw_flags;
+	nlink_t		numpaths;
+	char		**paths;
+} bignode_t;
+
+typedef struct {
+	bignode_t	*nodes;
+	uint64_t	listlen;
+	uint64_t	lastnode;
+} nodelist_t;
+
+static const char	*cmd_prefix = "xfs_reno_";
+
+static char		*progname;
+static int		log_level = LOG_NORMAL;
+static int		force_all;
+static nodelist_t	*nodehash;
+static int		realuid;
+static uint64_t		numdirnodes;
+static uint64_t		numfilenodes;
+static uint64_t		numslinknodes;
+static uint64_t		numdirsdone;
+static uint64_t		numfilesdone;
+static uint64_t		numslinksdone;
+static int		poll_interval;
+static time_t		starttime;
+static bignode_t	*cur_node;
+static char		*cur_target;
+static char		*cur_temp;
+static int		cur_phase;
+static int		highest_numpaths;
+static char		*recover_file;
+static int		recover_fd;
+static volatile int	poll_output;
+static int		global_rval;
+
+/*
+ * message handling
+ */
+static void
+log_message(
+	int		level,
+	char		*fmt, ...)
+{
+	char		buf[1024];
+	va_list		ap;
+
+	if (log_level < level)
+		return;
+
+	va_start(ap, fmt);
+	vsnprintf(buf, 1024, fmt, ap);
+	va_end(ap);
+
+	printf("%c%s: %s\n", poll_output ? '\n' : '\r', progname, buf);
+	poll_output = 0;
+}
+
+static void
+err_message(
+	char		*fmt, ...)
+{
+	char		buf[1024];
+	va_list		ap;
+
+	va_start(ap, fmt);
+	vsnprintf(buf, 1024, fmt, ap);
+	va_end(ap);
+
+	fprintf(stderr, "%c%s: %s\n", poll_output ? '\n' : '\r', progname, buf);
+	poll_output = 0;
+}
+
+static void
+err_nomem(void)
+{
+	err_message(_("Out of memory"));
+}
+
+static void
+err_open(
+	const char	*s)
+{
+	err_message(_("Cannot open %s: %s"), s, strerror(errno));
+}
+
+static void
+err_not_xfs(
+	const char 	*s)
+{
+	err_message(_("%s is not on an XFS filesystem"), s);
+}
+
+static void
+err_stat(
+	const char	*s)
+{
+	err_message(_("Cannot stat %s: %s\n"), s, strerror(errno));
+}
+
+/*
+ * usage message
+ */
+static void
+usage(void)
+{
+	fprintf(stderr, _("%s [-fnpqv] [-P <interval>] [-r] <path>\n"),
+			progname);
+	exit(1);
+}
+
+
+/*
+ * XFS interface functions
+ */
+
+static int
+xfs_bulkstat_single(int fd, xfs_ino_t *lastip, xfs_bstat_t *ubuffer)
+{
+	xfs_fsop_bulkreq_t  bulkreq;
+
+	bulkreq.lastip = (__u64 *)lastip;
+	bulkreq.icount = 1;
+	bulkreq.ubuffer = ubuffer;
+	bulkreq.ocount = NULL;
+	return ioctl(fd, XFS_IOC_FSBULKSTAT_SINGLE, &bulkreq);
+}
+
+static int
+xfs_swapext(int fd, xfs_swapext_t *sx)
+{
+	return ioctl(fd, XFS_IOC_SWAPEXT, sx);
+}
+
+static int
+xfs_getxattr(int fd, struct fsxattr *attr)
+{
+	return ioctl(fd, XFS_IOC_FSGETXATTR, attr);
+}
+
+static int
+xfs_setxattr(int fd, struct fsxattr *attr)
+{
+	return ioctl(fd, XFS_IOC_FSSETXATTR, attr);
+}
+
+/*
+ * A hash table of inode numbers and associated paths.
+ */
+static nodelist_t *
+init_nodehash(void)
+{
+	int		i;
+
+	nodehash = calloc(NH_BUCKETS, sizeof(nodelist_t));
+	if (nodehash == NULL) {
+		err_nomem();
+		return NULL;
+	}
+
+	for (i = 0; i < NH_BUCKETS; i++) {
+		nodehash[i].nodes = NULL;
+		nodehash[i].lastnode = 0;
+		nodehash[i].listlen = 0;
+	}
+
+	return nodehash;
+}
+
+static void
+free_nodehash(void)
+{
+	int		i, j, k;
+
+	for (i = 0; i < NH_BUCKETS; i++) {
+		bignode_t *nodes = nodehash[i].nodes;
+
+		for (j = 0; j < nodehash[i].lastnode; j++) {
+			for (k = 0; k < nodes[j].numpaths; k++) {
+				free(nodes[j].paths[k]);
+			}
+			free(nodes[j].paths);
+		}
+
+		free(nodes);
+	}
+	free(nodehash);
+}
+
+static nlink_t
+add_path(
+	bignode_t	*node,
+	const char	*path)
+{
+	node->paths = realloc(node->paths,
+			      sizeof(char *) * (node->numpaths + 1));
+	if (node->paths == NULL) {
+		err_nomem();
+		exit(1);
+	}
+
+	node->paths[node->numpaths] = strdup(path);
+	if (node->paths[node->numpaths] == NULL) {
+		err_nomem();
+		exit(1);
+	}
+
+	node->numpaths++;
+	if (node->numpaths > highest_numpaths)
+		highest_numpaths = node->numpaths;
+
+	return node->numpaths;
+}
+
+static bignode_t *
+add_node(
+	nodelist_t	*list,
+	xfs_ino_t	ino,
+	int		ftw_flags,
+	const char	*path)
+{
+	bignode_t	*node;
+
+	if (list->lastnode >= list->listlen) {
+		list->listlen += 500;
+		list->nodes = realloc(list->nodes,
+					sizeof(bignode_t) * list->listlen);
+		if (list->nodes == NULL) {
+			err_nomem();
+			return NULL;
+		}
+	}
+
+	node = list->nodes + list->lastnode;
+
+	node->ino = ino;
+	node->ftw_flags = ftw_flags;
+	node->paths = NULL;
+	node->numpaths = 0;
+	add_path(node, path);
+
+	list->lastnode++;
+
+	return node;
+}
+
+static bignode_t *
+find_node(
+	xfs_ino_t	ino)
+{
+	int		i;
+	nodelist_t	*nodelist;
+	bignode_t	*nodes;
+
+	nodelist = NH_HASH(ino);
+	nodes = nodelist->nodes;
+
+	for(i = 0; i < nodelist->lastnode; i++) {
+		if (nodes[i].ino == ino) {
+			return &nodes[i];
+		}
+	}
+
+	return NULL;
+}
+
+static bignode_t *
+add_node_path(
+	xfs_ino_t	ino,
+	int		ftw_flags,
+	const char	*path)
+{
+	nodelist_t	*nodelist;
+	bignode_t	*node;
+
+	log_message(LOG_NITTY, "add_node_path: ino %llu, path %s", ino, path);
+
+	node = find_node(ino);
+	if (node == NULL) {
+		nodelist = NH_HASH(ino);
+		return add_node(nodelist, ino, ftw_flags, path);
+	}
+
+	add_path(node, path);
+	return node;
+}
+
+static void
+dump_node(
+	char		*msg,
+	bignode_t	*node)
+{
+	int		k;
+
+	if (log_level < LOG_DEBUG)
+		return;
+
+	log_message(LOG_DEBUG, "%s: %llu %llu %s", msg, node->ino,
+			node->numpaths, node->paths[0]);
+
+	for (k = 1; k < node->numpaths; k++)
+		log_message(LOG_DEBUG, "\t%s", node->paths[k]);
+}
+
+static void
+dump_nodehash(void)
+{
+	int		i, j;
+
+	if (log_level < LOG_NITTY)
+		return;
+
+	for (i = 0; i < NH_BUCKETS; i++) {
+		bignode_t	*nodes = nodehash[i].nodes;
+		for (j = 0; j < nodehash[i].lastnode; j++, nodes++)
+			dump_node("nodehash", nodes);
+	}
+}
+
+static int
+for_all_nodes(
+	int		(*fn)(bignode_t *node),
+	int		ftw_flags,
+	int		quit_on_error)
+{
+	int		i;
+	int		j;
+	int		rval = 0;
+
+	for (i = 0; i < NH_BUCKETS; i++) {
+		bignode_t	*nodes = nodehash[i].nodes;
+
+		for (j = 0; j < nodehash[i].lastnode; j++, nodes++) {
+			if (nodes->ftw_flags == ftw_flags) {
+				rval = fn(nodes);
+				if (rval && quit_on_error)
+					goto quit;
+			}
+		}
+	}
+
+quit:
+	return rval;
+}
+
+/*
+ * Adds appropriate files to the inode hash table
+ */
+static int
+nftw_addnodes(
+	const char	*path,
+	const struct stat64 *st,
+	int		flags,
+	struct FTW	*sntfw)
+{
+	if (st->st_ino <= XFS_MAXINUMBER_32 && !force_all)
+		return 0;
+
+	if (flags == FTW_F)
+		numfilenodes++;
+	else if (flags == FTW_D)
+		numdirnodes++;
+	else if (flags == FTW_SL)
+		numslinknodes++;
+	else
+		return 0;
+
+	add_node_path(st->st_ino, flags, path);
+
+	return 0;
+}
+
+/*
+ * Attribute cloning code - most of this is here because attr_copy does not
+ * let us pick and choose which attributes we want to copy.
+ */
+
+attr_multiop_t	attr_ops[ATTR_MAX_MULTIOPS];
+
+/*
+ * Grab attributes specified in attr_ops from source file and write them
+ * out on the destination file.
+ */
+
+static int
+attr_replicate(
+	char		*source,
+	char		*target,
+	int		count)
+{
+	int		j, k;
+
+	if (attr_multi(source, attr_ops, count, ATTR_DONTFOLLOW) < 0)
+		return -1;
+
+	for (k = 0; k < count; k++) {
+		if (attr_ops[k].am_error) {
+			err_message(_("Error %d getting attribute"),
+					attr_ops[k].am_error);
+			break;
+		}
+		attr_ops[k].am_opcode = ATTR_OP_SET;
+	}
+	if (attr_multi(target, attr_ops, k, ATTR_DONTFOLLOW) < 0)
+		err_message("on attr_multif set");
+	for (j = 0; j < k; j++) {
+		if (attr_ops[j].am_error) {
+			err_message(_("Error %d setting attribute"),
+					attr_ops[j].am_error);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+/*
+ * Copy all the attributes specified from src to dst.
+ */
+
+static int
+attr_clone_copy(
+	char		*source,
+	char		*target,
+	char		*list_buf,
+	char		*attr_buf,
+	int		buf_len,
+	int		flags)
+{
+        attrlist_t 	*alist;
+        attrlist_ent_t	*attr;
+        attrlist_cursor_t cursor;
+        int		space, i, j;
+	char		*ptr;
+
+        bzero((char *)&cursor, sizeof(cursor));
+        do {
+                if (attr_list(source, list_buf, ATTRBUFSIZE,
+                		flags | ATTR_DONTFOLLOW, &cursor) < 0) {
+			err_message("on attr_listf");
+                        return -1;
+		}
+
+                alist = (attrlist_t *)list_buf;
+
+		space = buf_len;
+		ptr = attr_buf;
+                for (j = 0, i = 0; i < alist->al_count; i++) {
+                        attr = ATTR_ENTRY(list_buf, i);
+			if (space < attr->a_valuelen) {
+				if (attr_replicate(source, target, j) < 0)
+					return -1;
+				j = 0;
+				space = buf_len;
+				ptr = attr_buf;
+			}
+			attr_ops[j].am_opcode = ATTR_OP_GET;
+			attr_ops[j].am_attrname = attr->a_name;
+			attr_ops[j].am_attrvalue = ptr;
+			attr_ops[j].am_length = (int) attr->a_valuelen;
+			attr_ops[j].am_flags = flags;
+			attr_ops[j].am_error = 0;
+			j++;
+			ptr += attr->a_valuelen;
+			space -= attr->a_valuelen;
+                }
+
+		log_message(LOG_NITTY, "copying attribute %d", i);
+
+		if (j) {
+			if (attr_replicate(source, target, j) < 0)
+				return -1;
+		}
+
+        } while (alist->al_more);
+
+        return 0;
+}
+
+static int
+clone_attribs(
+	char		*source,
+	char		*target)
+{
+	char		list_buf[ATTRBUFSIZE];
+	char		*attr_buf;
+	int		rval;
+
+	attr_buf = malloc(ATTR_MAX_VALUELEN * 2);
+	if (attr_buf == NULL) {
+		err_nomem();
+		return -1;
+	}
+	rval = attr_clone_copy(source, target, list_buf, attr_buf,
+			ATTR_MAX_VALUELEN * 2, 0);
+	if (rval == 0)
+		rval = attr_clone_copy(source, target, list_buf, attr_buf,
+				ATTR_MAX_VALUELEN * 2, ATTR_ROOT);
+	if (rval == 0)
+		rval = attr_clone_copy(source, target, list_buf, attr_buf,
+				ATTR_MAX_VALUELEN * 2, ATTR_SECURE);
+	free(attr_buf);
+	return rval;
+}
+
+static int
+dup_attributes(
+	char		*source,
+	int		sfd,
+	char		*target,
+	int		tfd)
+{
+	struct stat64	st;
+	struct timeval	tv[2];
+	struct fsxattr	fsx;
+
+	if (fstat64(sfd, &st) < 0) {
+		err_stat(source);
+		return -1;
+	}
+
+	if (xfs_getxattr(sfd, &fsx) < 0) {
+		err_stat(source);
+		return -1;
+	}
+
+	tv[0].tv_sec = st.st_atim.tv_sec;
+	tv[0].tv_usec = st.st_atim.tv_nsec / 1000;
+	tv[1].tv_sec = st.st_mtim.tv_sec;
+	tv[1].tv_usec = st.st_mtim.tv_nsec / 1000;
+
+	if (futimes(tfd, tv) < 0)
+		err_message(_("%s: Cannot update target times"), target);
+
+	if (fchown(tfd, st.st_uid, st.st_gid) < 0) {
+		err_message(_("%s: Cannot change target ownership to "
+				"uid(%d) gid(%d)"), target,
+				st.st_uid, st.st_gid);
+
+		if (fchmod(tfd, st.st_mode & ~(S_ISUID | S_ISGID)) < 0)
+			err_message(_("%s: Cannot change target mode "
+					"to (%o)"), target, st.st_mode);
+	} else if (fchmod(tfd, st.st_mode) < 0)
+		err_message(_("%s: Cannot change target mode to (%o)"),
+				target, st.st_mode);
+
+	if (xfs_setxattr(tfd, &fsx) < 0)
+		err_message(_("%s: Cannet set target extended "
+				"attributes"), target);
+
+	return clone_attribs(source, target);
+}
+
+static int
+move_dirents(
+	char		*srcpath,
+	char		*targetpath,
+	int		*move_count)
+{
+	int		rval = 0;
+	DIR		*srcd;
+	struct dirent64	*dp;
+	char		srcname[PATH_MAX];
+	char		targetname[PATH_MAX];
+
+	*move_count = 0;
+
+	srcd = opendir(srcpath);
+	if (srcd == NULL) {
+		err_open(srcpath);
+		return 1;
+	}
+
+	while ((dp = readdir64(srcd)) != NULL) {
+		if (dp->d_ino == 0 || !strcmp(dp->d_name, ".") ||
+				!strcmp(dp->d_name, ".."))
+			continue;
+
+		if (strlen(srcpath) + 1 + strlen(dp->d_name) >=
+				sizeof(srcname) - 1) {
+
+			err_message(_("%s/%s: Name too long"), srcpath,
+					dp->d_name);
+			rval = 1;
+			goto quit;
+		}
+
+		sprintf(srcname, "%s/%s", srcpath, dp->d_name);
+		sprintf(targetname, "%s/%s", targetpath, dp->d_name);
+
+		rval = rename(srcname, targetname);
+		if (rval != 0) {
+			err_message(_("failed to rename: \'%s\' to \'%s\'"),
+					srcname, targetname);
+			goto quit;
+		}
+
+		log_message(LOG_DEBUG, "rename %s -> %s", srcname, targetname);
+
+		(*move_count)++;
+	}
+
+quit:
+	closedir(srcd);
+	return rval;
+}
+
+static int
+process_dir(
+	bignode_t	*node)
+{
+	int		sfd = -1;
+	int		tfd = -1;
+	int		targetfd = -1;
+	int		rval = 0;
+	int		move_count = 0;
+	char		*srcname = NULL;
+	char		*pname = NULL;
+	struct stat64	s1;
+	struct fsxattr  fsx;
+	char		target[PATH_MAX] = "";
+
+	SET_PHASE(DIR_PHASE);
+
+	dump_node("directory", node);
+
+	cur_node = node;
+	srcname = node->paths[0];
+
+	if (stat64(srcname, &s1) < 0) {
+		if (errno != ENOENT) {
+			err_stat(srcname);
+			global_rval |= 2;
+		}
+		goto quit;
+	}
+	if (s1.st_ino <= XFS_MAXINUMBER_32 && !force_all) {
+		/*
+		 * This directory has already changed ino's, probably due
+		 * to being moved during processing of a parent directory.
+		 */
+		log_message(LOG_DEBUG, "process_dir: skipping %s", srcname);
+		goto quit;
+	}
+
+	rval = 1;
+
+	sfd = open(srcname, O_RDONLY);
+	if (sfd < 0) {
+		err_open(srcname);
+		goto quit;
+	}
+
+	if (!platform_test_xfs_fd(sfd)) {
+		err_not_xfs(srcname);
+		goto quit;
+	}
+
+	if (xfs_getxattr(sfd, &fsx) < 0) {
+		err_message(_("failed to get inode attrs: %s"), srcname);
+		goto quit;
+	}
+	if (fsx.fsx_xflags & (XFS_XFLAG_IMMUTABLE | XFS_XFLAG_APPEND)) {
+		err_message(_("%s: immutable/append, ignoring"), srcname);
+		global_rval |= 2;
+		rval = 0;
+		goto quit;
+	}
+
+	/* mkdir parent/target */
+	pname = strdup(srcname);
+	if (pname == NULL) {
+		err_nomem();
+		goto quit;
+	}
+	dirname(pname);
+	sprintf(target, "%s/%sXXXXXX", pname, cmd_prefix);
+	if (mkdtemp(target) == NULL) {
+		err_message(_("Unable to create directory copy: %s"), srcname);
+		goto quit;
+	}
+	SET_PHASE(DIR_PHASE_1);
+
+	cur_target = strdup(target);
+	if (!cur_target) {
+		err_nomem();
+		goto quit;
+	}
+
+	sprintf(target, "%s/%sXXXXXX", pname, cmd_prefix);
+	if (mkdtemp(target) == NULL) {
+		err_message(_("unable to create tmp directory copy"));
+		goto quit;
+	}
+	SET_PHASE(DIR_PHASE_2);
+
+	cur_temp = strdup(target);
+	if (!cur_temp) {
+		err_nomem();
+		goto quit;
+	}
+
+	tfd = open(cur_temp, O_RDONLY);
+	if (tfd < 0) {
+		err_open(cur_temp);
+		goto quit;
+	}
+
+	targetfd = open(cur_target, O_RDONLY);
+	if (tfd < 0) {
+		err_open(cur_target);
+		goto quit;
+	}
+
+
+	/* copy timestamps, attribs and EAs, to cur_temp */
+	rval = dup_attributes(srcname, sfd, cur_temp, tfd);
+	if (rval != 0) {
+		err_message(_("unable to duplicate directory attributes: %s"),
+			    srcname);
+		goto quit_unlink;
+	}
+
+	SET_PHASE(DIR_PHASE_3);
+
+	/* move src dirents to cur_target (this changes timestamps on src) */
+	rval = move_dirents(srcname, cur_target, &move_count);
+	if (rval != 0) {
+		err_message(_("unable to move directory contents: %s to %s"),
+				srcname, cur_target);
+		/* uh oh, move everything back... */
+		if (move_count > 0)
+			goto quit_undo;
+	}
+
+	SET_PHASE(DIR_PHASE_4);
+
+	/* copy timestamps, attribs and EAs from cur_temp to cur_target */
+	rval = dup_attributes(cur_temp, tfd, cur_target, targetfd);
+	if (rval != 0) {
+		err_message(_("unable to duplicate directory attributes: %s"),
+				cur_temp);
+		goto quit_unlink;
+	}
+
+	SET_PHASE(DIR_PHASE_5);
+
+	/* rmdir src */
+	rval = rmdir(srcname);
+	if (rval != 0) {
+		err_message(_("unable to remove directory: %s"), srcname);
+		goto quit_undo;
+	}
+
+	SET_PHASE(DIR_PHASE_6);
+
+	rval = rmdir(cur_temp);
+	if (rval != 0)
+		err_message(_("unable to remove tmp directory: %s"), cur_temp);
+
+	SET_PHASE(DIR_PHASE_7);
+
+	/* rename cur_target src */
+	rval = rename(cur_target, srcname);
+	if (rval != 0) {
+		/*
+		 * we can't abort since the src dir is now gone.
+		 * let the admin clean this one up
+		 */
+		err_message(_("unable to rename directory: %s to %s"),
+				cur_target, srcname);
+	}
+	goto quit;
+
+ quit_undo:
+	if (move_dirents(cur_target, srcname, &move_count) != 0) {
+		/* oh, dear lord... let the admin clean this one up */
+		err_message(_("unable to move directory contents back: %s to %s"),
+				cur_target, srcname);
+		goto quit;
+	}
+	SET_PHASE(DIR_PHASE_3);
+
+ quit_unlink:
+	rmdir(cur_target);
+	rmdir(cur_temp);
+
+ quit:
+
+	SET_PHASE(DIR_PHASE);
+
+	if (sfd >= 0)
+		close(sfd);
+	if (tfd >= 0)
+		close(tfd);
+	if (targetfd >= 0)
+		close(targetfd);
+
+	free(pname);
+	free(cur_target);
+	free(cur_temp);
+
+	cur_target = NULL;
+	cur_temp = NULL;
+	cur_node = NULL;
+	numdirsdone++;
+	return rval;
+}
+
+static int
+process_file(
+	bignode_t	*node)
+{
+	int		sfd = -1;
+	int		tfd = -1;
+	int		i = 0;
+	int		rval = 0;
+	struct stat64	s1;
+	char		*srcname = NULL;
+	char		*pname = NULL;
+	xfs_swapext_t	sx;
+	xfs_bstat_t	bstatbuf;
+	struct fsxattr  fsx;
+	char		target[PATH_MAX] = "";
+
+	SET_PHASE(FILE_PHASE);
+
+	dump_node("file", node);
+
+	cur_node = node;
+	srcname = node->paths[0];
+
+	bzero(&s1, sizeof(s1));
+	bzero(&bstatbuf, sizeof(bstatbuf));
+	bzero(&sx, sizeof(sx));
+
+	if (stat64(srcname, &s1) < 0) {
+		if (errno != ENOENT) {
+			err_stat(srcname);
+			global_rval |= 2;
+		}
+		goto quit;
+	}
+	if (s1.st_ino <= XFS_MAXINUMBER_32 && !force_all)
+		/* this file has changed, and no longer needs processing */
+		goto quit;
+
+	/* open and sync source */
+	sfd = open(srcname, O_RDWR | O_DIRECT);
+	if (sfd < 0) {
+		err_open(srcname);
+		rval = 1;
+		goto quit;
+	}
+	if (!platform_test_xfs_fd(sfd)) {
+		err_not_xfs(srcname);
+		rval = 1;
+		goto quit;
+	}
+	if (fsync(sfd) < 0) {
+		err_message(_("sync failed: %s: %s"),
+				srcname, strerror(errno));
+		rval = 1;
+		goto quit;
+	}
+
+
+	/*
+	 * Check if a mandatory lock is set on the file to try and
+	 * avoid blocking indefinitely on the reads later. Note that
+	 * someone could still set a mandatory lock after this check
+	 * but before all reads have completed to block xfs_reno reads.
+	 * This change just closes the window a bit.
+	 */
+	if ((s1.st_mode & S_ISGID) && !(s1.st_mode & S_IXGRP)) {
+		struct flock fl;
+
+		fl.l_type = F_RDLCK;
+		fl.l_whence = SEEK_SET;
+		fl.l_start = (off_t)0;
+		fl.l_len = 0;
+		if (fcntl(sfd, F_GETLK, &fl) < 0 ) {
+			if (log_level >= LOG_DEBUG)
+				err_message("locking check failed: %s",
+						srcname);
+			global_rval |= 2;
+			goto quit;
+		}
+		if (fl.l_type != F_UNLCK) {
+			if (log_level >= LOG_DEBUG)
+				err_message("mandatory lock: %s: ignoring",
+						srcname);
+			global_rval |= 2;
+			goto quit;
+		}
+	}
+
+	if (xfs_getxattr(sfd, &fsx) < 0) {
+		err_message(_("failed to get inode attrs: %s"), srcname);
+		rval = 1;
+		goto quit;
+	}
+	if (fsx.fsx_xflags & (XFS_XFLAG_IMMUTABLE | XFS_XFLAG_APPEND)) {
+		err_message(_("%s: immutable/append, ignoring"), srcname);
+		global_rval |= 2;
+		goto quit;
+	}
+
+	rval = 1;
+
+	if (realuid != 0 && realuid != s1.st_uid) {
+		errno = EACCES;
+		err_open(srcname);
+		goto quit;
+	}
+
+	/* creat target */
+	pname = strdup(srcname);
+	if (pname == NULL) {
+		err_nomem();
+		goto quit;
+	}
+	dirname(pname);
+	sprintf(target, "%s/%sXXXXXX", pname, cmd_prefix);
+	tfd = mkstemp(target);
+	if (tfd < 0) {
+		err_message("unable to create file copy");
+		goto quit;
+	}
+	cur_target = strdup(target);
+	if (cur_target == NULL) {
+		err_nomem();
+		goto quit;
+	}
+
+	SET_PHASE(FILE_PHASE_1);
+
+	/* Setup direct I/O */
+	if (fcntl(tfd, F_SETFL, O_DIRECT) < 0 ) {
+		err_message(_("could not set O_DIRECT for %s on tmp: %s"),
+				srcname, target);
+		unlink(target);
+		goto quit;
+	}
+
+	/* copy attribs & EAs to target */
+	if (dup_attributes(srcname, sfd, target, tfd) != 0) {
+		err_message(_("unable to duplicate file attributes: %s"),
+				srcname);
+		unlink(target);
+		goto quit;
+	}
+
+	if (xfs_bulkstat_single(sfd, &s1.st_ino, &bstatbuf) < 0) {
+		err_message(_("unable to bulkstat source file: %s"),
+				srcname);
+		unlink(target);
+		goto quit;
+	}
+
+	if (bstatbuf.bs_ino != s1.st_ino) {
+		err_message(_("bulkstat of source file returned wrong inode: %s"),
+				srcname);
+		unlink(target);
+		goto quit;
+	}
+
+	ftruncate64(tfd, bstatbuf.bs_size);
+
+	/* swapextents src target */
+	sx.sx_stat     = bstatbuf; /* struct copy */
+	sx.sx_version  = XFS_SX_VERSION;
+	sx.sx_fdtarget = sfd;
+	sx.sx_fdtmp    = tfd;
+	sx.sx_offset   = 0;
+	sx.sx_length   = bstatbuf.bs_size;
+
+	/* Swap the extents */
+	rval = xfs_swapext(sfd, &sx);
+	if (rval < 0) {
+		if (log_level >= LOG_DEBUG) {
+			switch (errno) {
+			case ENOTSUP:
+				err_message("%s: file type not supported",
+					srcname);
+				break;
+			case EFAULT:
+				/* The file has changed since we started the copy */
+				err_message("%s: file modified, "
+					 "inode renumber aborted: %ld",
+					 srcname, bstatbuf.bs_size);
+				break;
+			case EBUSY:
+				/* Timestamp has changed or mmap'ed file */
+				err_message("%s: file busy", srcname);
+				break;
+			default:
+				err_message(_("Swap extents failed: %s: %s"),
+					srcname, strerror(errno));
+				break;
+			}
+		} else
+			err_message(_("Swap extents failed: %s: %s"),
+					srcname, strerror(errno));
+		goto quit;
+	}
+
+	if (bstatbuf.bs_dmevmask | bstatbuf.bs_dmstate) {
+		struct fsdmidata fssetdm;
+
+		/* Set the DMAPI Fields. */
+		fssetdm.fsd_dmevmask = bstatbuf.bs_dmevmask;
+		fssetdm.fsd_padding = 0;
+		fssetdm.fsd_dmstate = bstatbuf.bs_dmstate;
+
+		if (ioctl(tfd, XFS_IOC_FSSETDM, (void *)&fssetdm ) < 0)
+			err_message(_("attempt to set DMI attributes "
+					"of %s failed"), target);
+	}
+
+	SET_PHASE(FILE_PHASE_2);
+
+	/* unlink src */
+	rval = unlink(srcname);
+	if (rval != 0) {
+		err_message(_("unable to remove file: %s"), srcname);
+		goto quit;
+	}
+
+	SET_PHASE(FILE_PHASE_3);
+
+	/* rename target src */
+	rval = rename(target, srcname);
+	if (rval != 0) {
+		/*
+		 * we can't abort since the src file is now gone.
+		 * let the admin clean this one up
+		 */
+		err_message(_("unable to rename file: %s to %s"),
+				target, srcname);
+		goto quit;
+	}
+
+	SET_PHASE(FILE_PHASE_4);
+
+	/* for each hardlink, unlink and creat pointing to target */
+	for (i = 1; i < node->numpaths; i++) {
+		/* unlink src */
+		rval = unlink(node->paths[i]);
+		if (rval != 0) {
+			err_message(_("unable to remove file: %s"),
+				       node->paths[i]);
+			goto quit;
+		}
+
+		rval = link(srcname, node->paths[i]);
+		if (rval != 0) {
+			err_message("unable to link to file: %s", srcname);
+			goto quit;
+		}
+		numfilesdone++;
+	}
+
+ quit:
+	cur_node = NULL;
+
+	SET_PHASE(FILE_PHASE);
+
+	if (sfd >= 0)
+		close(sfd);
+	if (tfd >= 0)
+		close(tfd);
+
+	free(pname);
+	free(cur_target);
+
+	cur_target = NULL;
+
+	numfilesdone++;
+	return rval;
+}
+
+
+static int
+process_slink(
+	bignode_t	*node)
+{
+	int		i = 0;
+	int		rval = 0;
+	struct stat64	st;
+	char		*srcname = NULL;
+	char		*pname = NULL;
+	char		target[PATH_MAX] = "";
+	char		linkbuf[PATH_MAX];
+
+	SET_PHASE(SLINK_PHASE);
+
+	dump_node("symlink", node);
+
+	cur_node = node;
+	srcname = node->paths[0];
+
+	if (lstat64(srcname, &st) < 0) {
+		if (errno != ENOENT) {
+			err_stat(srcname);
+			global_rval |= 2;
+		}
+		goto quit;
+	}
+	if (st.st_ino <= XFS_MAXINUMBER_32 && !force_all)
+		/* this file has changed, and no longer needs processing */
+		goto quit;
+
+	rval = 1;
+
+	i = readlink(srcname, linkbuf, sizeof(linkbuf) - 1);
+	if (i < 0) {
+		err_message(_("unable to read symlink: %s"), srcname);
+		goto quit;
+	}
+	linkbuf[i] = '\0';
+
+	if (realuid != 0 && realuid != st.st_uid) {
+		errno = EACCES;
+		err_open(srcname);
+		goto quit;
+	}
+
+	/* create target */
+	pname = strdup(srcname);
+	if (pname == NULL) {
+		err_nomem();
+		goto quit;
+	}
+	dirname(pname);
+
+	sprintf(target, "%s/%sXXXXXX", pname, cmd_prefix);
+	if (mktemp(target) == NULL) {
+		err_message(_("unable to create temp symlink name"));
+		goto quit;
+	}
+	cur_target = strdup(target);
+	if (cur_target == NULL) {
+		err_nomem();
+		goto quit;
+	}
+
+	if (symlink(linkbuf, target) != 0) {
+		err_message(_("unable to create symlink: %s"), target);
+		goto quit;
+	}
+
+	SET_PHASE(SLINK_PHASE_1);
+
+	/* copy ownership & EAs to target */
+	if (lchown(target, st.st_uid, st.st_gid) < 0) {
+		err_message(_("%s: Cannot change target ownership to "
+				"uid(%d) gid(%d)"), target,
+				st.st_uid, st.st_gid);
+		unlink(target);
+		goto quit;
+	}
+
+	if (clone_attribs(srcname, target) != 0) {
+		err_message(_("unable to duplicate symlink attributes: %s"),
+				srcname);
+		unlink(target);
+		goto quit;
+	}
+
+	SET_PHASE(SLINK_PHASE_2);
+
+	/* unlink src */
+	rval = unlink(srcname);
+	if (rval != 0) {
+		err_message(_("unable to remove symlink: %s"), srcname);
+		goto quit;
+	}
+
+	SET_PHASE(SLINK_PHASE_3);
+
+	/* rename target src */
+	rval = rename(target, srcname);
+	if (rval != 0) {
+		/*
+		 * we can't abort since the src file is now gone.
+		 * let the admin clean this one up
+		 */
+		err_message(_("unable to rename symlink: %s to %s"),
+				target, srcname);
+		goto quit;
+	}
+
+	SET_PHASE(SLINK_PHASE_4);
+
+	/* for each hardlink, unlink and creat pointing to target */
+	for (i = 1; i < node->numpaths; i++) {
+		/* unlink src */
+		rval = unlink(node->paths[i]);
+		if (rval != 0) {
+			err_message(_("unable to remove symlink: %s"),
+				       node->paths[i]);
+			goto quit;
+		}
+
+		rval = link(srcname, node->paths[i]);
+		if (rval != 0) {
+			err_message("unable to link to symlink: %s", srcname);
+			goto quit;
+		}
+		numslinksdone++;
+	}
+
+ quit:
+	cur_node = NULL;
+
+	SET_PHASE(SLINK_PHASE);
+
+	free(pname);
+	free(cur_target);
+
+	cur_target = NULL;
+
+	numslinksdone++;
+	return rval;
+}
+
+static int
+open_recoverfile(void)
+{
+	recover_fd = open(recover_file, O_RDWR | O_SYNC | O_CREAT | O_EXCL,
+			0600);
+	if (recover_fd < 0) {
+		if (errno == EEXIST)
+			err_message(_("Recovery file already exists, either "
+				"run '%s -r %s' or remove the file."),
+				progname, recover_file);
+		else
+			err_open(recover_file);
+		return 1;
+	}
+
+	if (!platform_test_xfs_fd(recover_fd)) {
+		err_not_xfs(recover_file);
+		close(recover_fd);
+		return 1;
+	}
+
+	return 0;
+}
+
+static void
+update_recoverfile(void)
+{
+	static const char null_file[] = "0\n0\n0\n\ntarget: \ntemp: \nend\n";
+	static size_t	buf_size = 0;
+	static char	*buf = NULL;
+	int 		i, len;
+
+	if (recover_fd <= 0)
+		return;
+
+	if (cur_node == NULL || cur_phase == 0) {
+		/* inbetween processing or still scanning */
+		lseek(recover_fd, 0, SEEK_SET);
+		write(recover_fd, null_file, sizeof(null_file));
+		return;
+	}
+
+	ASSERT(highest_numpaths > 0);
+	if (buf == NULL) {
+		buf_size = (highest_numpaths + 3) * PATH_MAX;
+		buf = malloc(buf_size);
+		if (buf == NULL) {
+			err_nomem();
+			exit(1);
+		}
+	}
+
+	len = sprintf(buf, "%d\n%llu\n%d\n", cur_phase,
+			(long long)cur_node->ino, cur_node->ftw_flags);
+
+	for (i = 0; i < cur_node->numpaths; i++)
+		len += sprintf(buf + len, "%s\n", cur_node->paths[i]);
+
+	len += sprintf(buf + len, "target: %s\ntemp: %s\nend\n",
+			cur_target, cur_temp);
+
+	ASSERT(len < buf_size);
+
+	lseek(recover_fd, 0, SEEK_SET);
+	ftruncate(recover_fd, 0);
+	write(recover_fd, buf, len);
+}
+
+static void
+cleanup(void)
+{
+	log_message(LOG_NORMAL, _("Interrupted -- cleaning up..."));
+
+	free_nodehash();
+
+	log_message(LOG_NORMAL, _("Done."));
+}
+
+static void
+sighandler(int sig)
+{
+	static char	cycle[4] = "-\\|/";
+	static uint64_t	cur_cycle = 0;
+	double		percent;
+	char		*typename;
+	uint64_t	nodes, done;
+
+	alarm(0);
+
+	if (sig != SIGALRM) {
+		cleanup();
+		exit(1);
+	}
+
+	if (cur_phase == SCAN_PHASE) {
+		if (log_level >= LOG_INFO)
+			fprintf(stderr, _("\r%llu files, %llu dirs and %llu "
+				"symlinks to renumber found... %c"),
+				(long long)numfilenodes,
+				(long long)numdirnodes,
+				(long long)numslinknodes,
+				cycle[cur_cycle % 4]);
+		else
+			fprintf(stderr, "\r%c",
+				cycle[cur_cycle % 4]);
+		cur_cycle++;
+	} else {
+		if (cur_phase >= DIR_PHASE && cur_phase <= DIR_PHASE_MAX) {
+			nodes = numdirnodes;
+			done = numdirsdone;
+			typename = _("dirs");
+	 	} else
+	 	if (cur_phase >= FILE_PHASE && cur_phase <= FILE_PHASE_MAX) {
+			nodes = numfilenodes;
+			done = numfilesdone;
+			typename = _("files");
+	  	} else {
+			nodes = numslinknodes;
+			done = numslinksdone;
+			typename = _("symlinks");
+		}
+		percent = 100.0 * (double)done / (double)nodes;
+		if (percent > 100.0)
+			percent = 100.0;
+		if (log_level >= LOG_INFO)
+			fprintf(stderr, _("\r%.1f%%, %llu of %llu %s, "
+					"%u seconds elapsed"), percent,
+					(long long)done, (long long)nodes,
+					typename, (int)(time(0) - starttime));
+		else
+			fprintf(stderr, "\r%.1f%%", percent);
+	}
+	poll_output = 1;
+	signal(SIGALRM, sighandler);
+
+	if (poll_interval)
+		alarm(poll_interval);
+}
+
+static int
+read_recover_file(
+	char		*recover_file,
+	bignode_t	**node,
+	char		**target,
+	char		**temp,
+	int		*phase)
+{
+	FILE		*file;
+	int		rval = 1;
+	ino_t		ino;
+	int		ftw_flags;
+	char		buf[PATH_MAX + 10]; /* path + "target: " */
+	struct stat64	s;
+	int		first_path;
+
+	/*
+
+	A recovery file should look like:
+
+	<phase>
+	<ino number>
+	<ftw flags>
+	<first path to inode>
+	<hardlinks to inode>
+	target: <path to target dir or file>
+	temp: <path to temp dir if dir phase>
+	end
+	*/
+
+	file = fopen(recover_file, "r");
+	if (file == NULL) {
+		err_open(recover_file);
+		return 1;
+	}
+
+	/* read phase */
+	*phase = 0;
+	if (fgets(buf, PATH_MAX + 10, file) == NULL) {
+		err_message("Recovery failed: unable to read phase");
+		goto quit;
+	}
+	buf[strlen(buf) - 1] = '\0';
+	*phase = atoi(buf);
+	if (*phase == SCAN_PHASE) {
+		fclose(file);
+		return 0;
+	}
+	if ((*phase < DIR_PHASE || *phase > DIR_PHASE_MAX) &&
+			(*phase < FILE_PHASE || *phase > FILE_PHASE_MAX)) {
+		err_message("Recovery failed: failed to read valid recovery phase");
+		goto quit;
+	}
+
+	/* read inode number */
+	if (fgets(buf, PATH_MAX + 10, file) == NULL) {
+		err_message("Recovery failed: unable to read inode number");
+		goto quit;
+	}
+	buf[strlen(buf) - 1] = '\0';
+	ino = strtoull(buf, NULL, 10);
+	if (ino == 0) {
+		err_message("Recovery failed: unable to read inode number");
+		goto quit;
+	}
+
+	/* read ftw_flags */
+	if (fgets(buf, PATH_MAX + 10, file) == NULL) {
+		err_message("Recovery failed: unable to read flags");
+		goto quit;
+	}
+	buf[strlen(buf) - 1] = '\0';
+	if (buf[1] != '\0' || (buf[0] != '0' && buf[0] != '1')) {
+		err_message("Recovery failed: unable to read flags: '%s'", buf);
+		goto quit;
+	}
+	ftw_flags = atoi(buf);
+
+	/* read paths and target path */
+	*node = NULL;
+	*target = NULL;
+	first_path = 1;
+	while (fgets(buf, PATH_MAX + 10, file) != NULL) {
+		buf[strlen(buf) - 1] = '\0';
+
+		log_message(LOG_DEBUG, "path: '%s'", buf);
+
+		if (buf[0] == '/') {
+			if (stat64(buf, &s) < 0) {
+				err_message(_("Recovery failed: cannot "
+						"stat '%s'"), buf);
+				goto quit;
+			}
+			if (s.st_ino != ino) {
+				err_message(_("Recovery failed: inode "
+						"number for '%s' does not "
+						"match recorded number"), buf);
+				goto quit;
+			}
+
+			if (first_path) {
+				first_path = 0;
+				*node = add_node_path(ino, ftw_flags, buf);
+			}
+			else {
+				add_path(*node, buf);
+			}
+		}
+		else if (strncmp(buf, "target: ", 8) == 0) {
+			*target = strdup(buf + 8);
+			if (*target == NULL) {
+				err_nomem();
+				goto quit;
+			}
+			if (stat64(*target, &s) < 0) {
+				err_message(_("Recovery failed: cannot "
+						"stat '%s'"), *target);
+				goto quit;
+			}
+		}
+		else if (strncmp(buf, "temp: ", 6) == 0) {
+			*temp = strdup(buf + 6);
+			if (*temp == NULL) {
+				err_nomem();
+				goto quit;
+			}
+		}
+		else if (strcmp(buf, "end") == 0) {
+			rval = 0;
+			goto quit;
+	 	}
+	 	else {
+			err_message(_("Recovery failed: unrecognised "
+					"string: '%s'"), buf);
+			goto quit;
+		}
+	}
+
+	err_message(_("Recovery failed: end of recovery file not found"));
+
+ quit:
+	if (*node == NULL) {
+		err_message(_("Recovery failed: no valid inode or paths "
+				"specified"));
+		rval = 1;
+	}
+
+	if (*target == NULL) {
+		err_message(_("Recovery failed: no inode target specified"));
+		rval = 1;
+	}
+
+	fclose(file);
+
+	return rval;
+}
+
+int
+recover(
+	bignode_t	*node,
+	char		*target,
+	char		*tname,
+	int		phase)
+{
+	int		tfd = -1;
+	int		targetfd = -1;
+	char		*srcname = NULL;
+	int		rval = 0;
+	int		i;
+	int		move_count = 0;
+
+	dump_node("recover", node);
+	log_message(LOG_DEBUG, "target: %s, phase: %x", target, phase);
+
+	if (node)
+		srcname = node->paths[0];
+
+	switch (phase) {
+
+	case DIR_PHASE_2:
+rmtemps:
+		log_message(LOG_NORMAL, _("Removing temporary directory: '%s'"),
+				tname);
+		if (rmdir(tname) < 0 && errno != ENOENT) {
+			err_message(_("unable to remove directory: %s"), tname);
+			rval = 1;
+		}
+		/* FALL THRU */
+	case DIR_PHASE_1:
+		log_message(LOG_NORMAL, _("Removing target directory: '%s'"),
+				target);
+		if (rmdir(target) < 0 && errno != ENOENT) {
+			err_message(_("unable to remove directory: %s"),
+					target);
+			rval = 1;
+		}
+		break;
+
+	case DIR_PHASE_3:
+		log_message(LOG_NORMAL, _("Completing moving directory "
+				"contents: '%s' to '%s'"), srcname, target);
+		if (move_dirents(srcname, target, &move_count) != 0) {
+			err_message(_("unable to move directory contents: "
+					"%s to %s"), srcname, target);
+			/* uh oh, move everything back... */
+			if (move_count > 0) {
+				if (move_dirents(target, srcname,
+						&move_count) != 0) {
+					/* oh, dear lord... let the admin
+					 * clean this one up */
+					err_message(_("unable to move directory "
+						"contents back: %s to %s"),
+						target, srcname);
+					exit(1);
+				}
+			}
+			goto rmtemps;
+		}
+		/* FALL THRU */
+	case DIR_PHASE_4:
+		log_message(LOG_NORMAL, _("Setting attributes for target "
+				"directory: \'%s\'"), target);
+		tfd = open(tname, O_RDONLY);
+		if (tfd < 0) {
+			err_open(tname);
+			rval = 1;
+			break;
+		}
+		targetfd = open(target, O_RDONLY);
+		if (targetfd < 0) {
+			err_open(target);
+			rval = 1;
+			break;
+		}
+		rval = dup_attributes(tname, tfd, target, targetfd);
+		if (rval != 0) {
+			err_message(_("unable to duplicate directory "
+					"attributes: %s"), tname);
+			break;
+		}
+		close(tfd);
+		close(targetfd);
+		/* FALL THRU */
+	case DIR_PHASE_6:
+		log_message(LOG_NORMAL, _("Removing temporary directory: \'%s\'"),
+				tname);
+		if (rmdir(tname) < 0 && errno != ENOENT) {
+			err_message(_("unable to remove directory: %s"),
+					tname);
+			rval = 1;
+			break;
+		}
+		/* FALL THRU */
+	case DIR_PHASE_5:
+		log_message(LOG_NORMAL, _("Removing old directory: \'%s\'"),
+				srcname);
+		if (rmdir(srcname) < 0 && errno != ENOENT) {
+			err_message(_("unable to remove directory: %s"),
+					srcname);
+			rval = 1;
+			break;
+		}
+		/* FALL THRU */
+	case DIR_PHASE_7:
+		log_message(LOG_NORMAL, _("Renaming new directory to old "
+			"directory: \'%s\' -> \'%s\'"), target, srcname);
+		rval = rename(target, srcname);
+		if (rval != 0) {
+			/* we can't abort since the src dir is now gone.
+			 * let the admin clean this one up
+			 */
+			err_message(_("unable to rename directory: %s to %s"),
+					target, srcname);
+			break;
+		}
+		break;
+
+
+	case FILE_PHASE_1:
+	case SLINK_PHASE_1:
+		log_message(LOG_NORMAL, _("Unlinking temporary file: \'%s\'"),
+				target);
+		unlink(target);
+		break;
+
+	case FILE_PHASE_2:
+	case SLINK_PHASE_2:
+		log_message(LOG_NORMAL, _("Unlinking old file: \'%s\'"),
+				srcname);
+		rval = unlink(srcname);
+		if (rval != 0) {
+			err_message(_("unable to remove file: %s"), srcname);
+			break;
+		}
+		/* FALL THRU */
+	case FILE_PHASE_3:
+	case SLINK_PHASE_3:
+		log_message(LOG_NORMAL, _("Renaming new file to old file: "
+				"\'%s\' -> \'%s\'"), target, srcname);
+		rval = rename(target, srcname);
+		if (rval != 0) {
+			/* we can't abort since the src file is now gone.
+			 * let the admin clean this one up
+			 */
+			err_message(_("unable to rename file: %s to %s"),
+					target, srcname);
+			break;
+		}
+		/* FALL THRU */
+	case FILE_PHASE_4:
+	case SLINK_PHASE_4:
+		/* for each hardlink, unlink and creat pointing to target */
+		for (i = 1; i < node->numpaths; i++) {
+			if (i == 1)
+				log_message(LOG_NORMAL, _("Resetting hardlinks "
+						"to new file"));
+
+			rval = unlink(node->paths[i]);
+			if (rval != 0) {
+				err_message(_("unable to remove file: %s"),
+						node->paths[i]);
+				break;
+			}
+			rval = link(srcname, node->paths[i]);
+			if (rval != 0) {
+				err_message(_("unable to link to file: %s"),
+						srcname);
+				break;
+			}
+		}
+		break;
+	}
+
+	if (rval == 0) {
+		log_message(LOG_NORMAL, _("Removing recover file: \'%s\'"),
+				recover_file);
+		unlink(recover_file);
+		log_message(LOG_NORMAL, _("Recovery done."));
+	}
+	else {
+		log_message(LOG_NORMAL, _("Leaving recover file: \'%s\'"),
+				recover_file);
+		log_message(LOG_NORMAL, _("Recovery failed."));
+	}
+
+	return rval;
+}
+
+int
+main(
+	int		argc,
+	char		*argv[])
+{
+	int		c = 0;
+	int		rval = 0;
+	int		q_opt = 0;
+	int		v_opt = 0;
+	int		p_opt = 0;
+	int		n_opt = 0;
+	char		pathname[PATH_MAX];
+	struct stat64	st;
+
+	progname = basename(argv[0]);
+
+	setlocale(LC_ALL, "");
+	bindtextdomain(PACKAGE, LOCALEDIR);
+	textdomain(PACKAGE);
+
+	while ((c = getopt(argc, argv, "fnpqvP:r:")) != -1) {
+		switch (c) {
+		case 'f':
+			force_all = 1;
+			break;
+		case 'n':
+			n_opt++;
+			break;
+		case 'p':
+			p_opt++;
+			break;
+		case 'q':
+			if (v_opt)
+				err_message(_("'q' option incompatible "
+						"with 'v' option"));
+			q_opt++;
+			log_level=0;
+			break;
+		case 'v':
+			if (q_opt)
+				err_message(_("'v' option incompatible "
+						"with 'q' option"));
+			v_opt++;
+			log_level++;
+			break;
+		case 'P':
+			poll_interval = atoi(optarg);
+			break;
+		case 'r':
+			recover_file = optarg;
+			break;
+		default:
+			err_message(_("%s: illegal option -- %c\n"), c);
+			usage();
+			/* NOTREACHED */
+			break;
+		}
+	}
+
+	if (optind != argc - 1 && recover_file == NULL) {
+		usage();
+		exit(1);
+	}
+
+	realuid = getuid();
+	starttime = time(0);
+
+	init_nodehash();
+
+	signal(SIGALRM, sighandler);
+	signal(SIGABRT, sighandler);
+	signal(SIGHUP, sighandler);
+	signal(SIGINT, sighandler);
+	signal(SIGQUIT, sighandler);
+	signal(SIGTERM, sighandler);
+
+	if (p_opt && poll_interval == 0) {
+		poll_interval = 1;
+	}
+	if (poll_interval)
+		alarm(poll_interval);
+
+	if (recover_file) {
+		bignode_t	*node = NULL;
+		char		*target = NULL;
+		char		*tname = NULL;
+		int		phase = 0;
+
+		if (n_opt)
+			goto quit;
+
+		/* read node info from recovery file */
+		if (read_recover_file(recover_file, &node, &target,
+				&tname, &phase) != 0)
+			exit(1);
+
+		rval = recover(node, target, tname, phase);
+
+		free(target);
+		free(tname);
+
+		return rval;
+	}
+
+	recover_file = malloc(PATH_MAX);
+	if (recover_file == NULL) {
+		err_nomem();
+		exit(1);
+	}
+	recover_file[0] = '\0';
+
+	strcpy(pathname, argv[optind]);
+	if (pathname[0] != '/') {
+		err_message(_("pathname must begin with a slash ('/')"));
+		exit(1);
+	}
+
+	if (stat64(pathname, &st) < 0) {
+		err_stat(pathname);
+		exit(1);
+	}
+	if (S_ISREG(st.st_mode)) {
+		/* single file specified */
+		if (st.st_nlink > 1) {
+			err_message(_("cannot process single file with a "
+					"link count greater than 1"));
+			exit(1);
+		}
+
+		strcpy(recover_file, pathname);
+		dirname(recover_file);
+
+		strcpy(recover_file + strlen(recover_file), "/xfs_reno.recover");
+		if (!n_opt) {
+			if (open_recoverfile() != 0)
+				exit(1);
+		}
+		add_node_path(st.st_ino, FTW_F, pathname);
+	}
+	else if (S_ISDIR(st.st_mode)) {
+		/* directory tree specified */
+		strcpy(recover_file, pathname);
+
+		strcpy(recover_file + strlen(recover_file), "/xfs_reno.recover");
+		if (!n_opt) {
+			if (open_recoverfile() != 0)
+				exit(1);
+		}
+
+		/* directory scan */
+		log_message(LOG_INFO, _("\rScanning directory tree..."));
+		SET_PHASE(SCAN_PHASE);
+		nftw64(pathname, nftw_addnodes, 100, FTW_PHYS | FTW_MOUNT);
+	}
+	else {
+		err_message(_("pathname must be either a regular file "
+				"or directory"));
+		exit(1);
+	}
+
+	dump_nodehash();
+
+	if (n_opt) {
+		/* n flag set, don't do anything */
+		if (numdirnodes)
+			log_message(LOG_NORMAL, "\rWould process %d %s",
+					numdirnodes, numdirnodes == 1 ?
+						"directory" : "directories");
+		else
+			log_message(LOG_NORMAL, "\rNo directories to process");
+
+		if (numfilenodes)
+			/* process files */
+			log_message(LOG_NORMAL, "\rWould process %d %s",
+					numfilenodes, numfilenodes == 1 ?
+						"file" : "files");
+		else
+			log_message(LOG_NORMAL, "\rNo files to process");
+		if (numslinknodes)
+			/* process files */
+			log_message(LOG_NORMAL, "\rWould process %d %s",
+					numslinknodes, numslinknodes == 1 ?
+						"symlinx" : "symlinks");
+		else
+			log_message(LOG_NORMAL, "\rNo symlinks to process");
+	} else {
+		/* process directories */
+		if (numdirnodes) {
+			log_message(LOG_INFO, _("\rProcessing %d %s..."),
+					numdirnodes, numdirnodes == 1 ?
+					    _("directory") : _("directories"));
+			cur_phase = DIR_PHASE;
+			rval = for_all_nodes(process_dir, FTW_D, 1);
+			if (rval != 0)
+				goto quit;
+		}
+		else
+			log_message(LOG_INFO, _("\rNo directories to process..."));
+
+		if (numfilenodes) {
+			/* process files */
+			log_message(LOG_INFO, _("\rProcessing %d %s..."),
+					numfilenodes, numfilenodes == 1 ?
+						_("file") : _("files"));
+			cur_phase = FILE_PHASE;
+			for_all_nodes(process_file, FTW_F, 0);
+		}
+		else
+			log_message(LOG_INFO, _("\rNo files to process..."));
+
+		if (numslinknodes) {
+			/* process symlinks */
+			log_message(LOG_INFO, _("\rProcessing %d %s..."),
+					numslinknodes, numslinknodes == 1 ?
+						_("symlink") : _("symlinks"));
+			cur_phase = SLINK_PHASE;
+			for_all_nodes(process_slink, FTW_SL, 0);
+		}
+		else
+			log_message(LOG_INFO, _("\rNo symlinks to process..."));
+
+	}
+quit:
+	free_nodehash();
+
+	close(recover_fd);
+
+	if (rval == 0)
+		unlink(recover_file);
+
+	log_message(LOG_DEBUG, "\r%u seconds elapsed", time(0) - starttime);
+	log_message(LOG_INFO, _("\rDone.     "));
+
+	return rval | global_rval;
+}

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: REVIEW: xfs_reno #2
  2007-10-04  4:25 REVIEW: xfs_reno #2 Barry Naujok
@ 2007-10-17 15:48 ` Ruben Porras
  2007-11-16  6:04 ` Vlad Apostolov
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 15+ messages in thread
From: Ruben Porras @ 2007-10-17 15:48 UTC (permalink / raw)
  To: xfs@oss.sgi.com

Barry Naujok wrote:
> A couple changes from the first xfs_reno:
>
>  - Major one is that symlinks are now supported, but only
>    owner, group and extended attributes are copied for them
>    (not times or inode attributes).
>
>  - Man page!
>
>
> To make this better, ideally we need some form of
> "swap inodes" function in the kernel, where the entire
> contents of the inode themselves are swapped. This form
> can handle any inode and without any of the dir/file/attr/etc
> copy/swap mechanisms we have in xfs_reno.
>

If everything goes as planned I will have time next month to look at 
xfs_reno.

-- 
Rubén Porras
LinWorks GmbH

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: REVIEW: xfs_reno #2
  2007-10-04  4:25 REVIEW: xfs_reno #2 Barry Naujok
  2007-10-17 15:48 ` Ruben Porras
@ 2007-11-16  6:04 ` Vlad Apostolov
  2007-11-16  6:20   ` Timothy Shimmin
  2007-11-19  3:48 ` Lachlan McIlroy
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Vlad Apostolov @ 2007-11-16  6:04 UTC (permalink / raw)
  To: Barry Naujok; +Cc: xfs@oss.sgi.com, xfs-dev

Barry Naujok wrote:
> A couple changes from the first xfs_reno:
>
>  - Major one is that symlinks are now supported, but only
>    owner, group and extended attributes are copied for them
>    (not times or inode attributes).
>
>  - Man page!
>
>
> To make this better, ideally we need some form of
> "swap inodes" function in the kernel, where the entire
> contents of the inode themselves are swapped. This form
> can handle any inode and without any of the dir/file/attr/etc
> copy/swap mechanisms we have in xfs_reno.
>
> Barry.
Hi Barry,

The code is looking good. Some questions and minor remarks bellow.

- init_nodehash() return value is not used

- Why poll_output is volatile?

- I think you meant "exit()" instead of "goto quit" below as
"recover_fd" is not opened yet:

		if (n_opt)
			goto quit;
...			
quit:
	free_nodehash();

	close(recover_fd);


- Is dirname(xxx) used as intended? I think it should be xxx = dirname(xxx).

- Some log_message() strings don't have _("text") convention.

I see that you take care to copy the DMAPI fields as well.
Unfortunately changing the inode number in a DMAPI filesystem
would make the DMAPI handle different, which means any application
using DMAPI would not be able to access the new file anymore.

When the XFS parent pointers feature is released we would need to find
out to update the EA to point to the new inode parent directory. This may
not be that easy though.

Regards,
Vlad

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: REVIEW: xfs_reno #2
  2007-11-16  6:04 ` Vlad Apostolov
@ 2007-11-16  6:20   ` Timothy Shimmin
  2007-11-18 23:13     ` Vlad Apostolov
  2007-11-19 12:39     ` Christoph Hellwig
  0 siblings, 2 replies; 15+ messages in thread
From: Timothy Shimmin @ 2007-11-16  6:20 UTC (permalink / raw)
  To: Vlad Apostolov; +Cc: Barry Naujok, xfs@oss.sgi.com, xfs-dev

Vlad Apostolov wrote:
> 
> When the XFS parent pointers feature is released we would need to find
> out to update the EA to point to the new inode parent directory. This may
> not be that easy though.
> 
Really?
Apart from the swapping of extents, reno uses standard calls doesn't it,
in which case any movement of inodes will have the parent pointers
updated by the normal vnode ops (e.g. mkdir, rename) in the kernel.

--Tim

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: REVIEW: xfs_reno #2
  2007-11-16  6:20   ` Timothy Shimmin
@ 2007-11-18 23:13     ` Vlad Apostolov
  2007-11-18 23:19       ` Vlad Apostolov
  2007-11-19 12:39     ` Christoph Hellwig
  1 sibling, 1 reply; 15+ messages in thread
From: Vlad Apostolov @ 2007-11-18 23:13 UTC (permalink / raw)
  To: Timothy Shimmin; +Cc: Barry Naujok, xfs@oss.sgi.com, xfs-dev

Timothy Shimmin wrote:
> Vlad Apostolov wrote:
>>
>> When the XFS parent pointers feature is released we would need to find
>> out to update the EA to point to the new inode parent directory. This 
>> may
>> not be that easy though.
>>
> Really?
> Apart from the swapping of extents, reno uses standard calls doesn't it,
> in which case any movement of inodes will have the parent pointers
> updated by the normal vnode ops (e.g. mkdir, rename) in the kernel.
>
> --Tim
When a 64 bits parent inode directory is changed to 32 bits inode,
I couldn't see code that would change parent pointer EA of the
children to point to the new 32 bits parent. Please correct me if
I missed something.

Regards,
Vlad

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: REVIEW: xfs_reno #2
  2007-11-18 23:13     ` Vlad Apostolov
@ 2007-11-18 23:19       ` Vlad Apostolov
  0 siblings, 0 replies; 15+ messages in thread
From: Vlad Apostolov @ 2007-11-18 23:19 UTC (permalink / raw)
  To: Timothy Shimmin; +Cc: Barry Naujok, xfs@oss.sgi.com, xfs-dev

Vlad Apostolov wrote:
> Timothy Shimmin wrote:
>> Vlad Apostolov wrote:
>>>
>>> When the XFS parent pointers feature is released we would need to find
>>> out to update the EA to point to the new inode parent directory. 
>>> This may
>>> not be that easy though.
>>>
>> Really?
>> Apart from the swapping of extents, reno uses standard calls doesn't it,
>> in which case any movement of inodes will have the parent pointers
>> updated by the normal vnode ops (e.g. mkdir, rename) in the kernel.
>>
>> --Tim
> When a 64 bits parent inode directory is changed to 32 bits inode,
> I couldn't see code that would change parent pointer EA of the
> children to point to the new 32 bits parent. Please correct me if
> I missed something.
>
> Regards,
> Vlad
>
Or maybe renaming a file under the new parent will update the EA
parent pointer to the new parent.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: REVIEW: xfs_reno #2
  2007-10-04  4:25 REVIEW: xfs_reno #2 Barry Naujok
  2007-10-17 15:48 ` Ruben Porras
  2007-11-16  6:04 ` Vlad Apostolov
@ 2007-11-19  3:48 ` Lachlan McIlroy
  2007-11-20  1:36 ` David Chinner
  2008-03-06 16:10 ` Ruben Porras
  4 siblings, 0 replies; 15+ messages in thread
From: Lachlan McIlroy @ 2007-11-19  3:48 UTC (permalink / raw)
  To: Barry Naujok; +Cc: xfs@oss.sgi.com, xfs-dev


+#define NH_BUCKETS	65536
+#define NH_HASH(ino)	(nodehash + ((ino) % NH_BUCKETS))
Hashing addresses by a power of two often results in an uneven
distribution over the hashtable.  Have you verified your hashing
algorithm?

+static nodelist_t *
+init_nodehash(void)
+{
+	int		i;
+
+	nodehash = calloc(NH_BUCKETS, sizeof(nodelist_t));
+	if (nodehash == NULL) {
+		err_nomem();
+		return NULL;
+	}
+
+	for (i = 0; i < NH_BUCKETS; i++) {
+		nodehash[i].nodes = NULL;
+		nodehash[i].lastnode = 0;
+		nodehash[i].listlen = 0;
+	}
No need to do this, calloc() zeroed the memory.

+
+	return nodehash;
+}


+static nlink_t
+add_path(
+	bignode_t	*node,
+	const char	*path)
+{
+	node->paths = realloc(node->paths,
+			      sizeof(char *) * (node->numpaths + 1));
Lots of little allocations here, realloc()'ing for space for one more
pointer is inefficient.  Can we alloc a chunk of pointers?

Just how many path pointers do we typically need?  Can we add an array
of initial pointers into bignode_t and when we exceed that start
allocating more chunks here?

+	if (node->paths == NULL) {
+		err_nomem();
+		exit(1);
+	}
+
+	node->paths[node->numpaths] = strdup(path);
More little allocations.  Can we preallocate a chunk of memory and
strcpy() the paths into it?  The array of path pointers would then
be indexes into the memory.

+	if (node->paths[node->numpaths] == NULL) {
+		err_nomem();
+		exit(1);
+	}
+
+	node->numpaths++;
+	if (node->numpaths > highest_numpaths)
+		highest_numpaths = node->numpaths;
+
+	return node->numpaths;
+}


+static bignode_t *
+add_node(
+	nodelist_t	*list,
+	xfs_ino_t	ino,
+	int		ftw_flags,
+	const char	*path)
+{
+	bignode_t	*node;
+
+	if (list->lastnode >= list->listlen) {
+		list->listlen += 500;
+		list->nodes = realloc(list->nodes,
+					sizeof(bignode_t) * list->listlen);
Can we avoid the realloc()?  (realloc() may need to copy the data
to a new location if it cannot extend the current allocation.)
For example each chunk of 500 nodes could end in a pointer to
the next chunk.

+		if (list->nodes == NULL) {
+			err_nomem();
+			return NULL;
+		}
+	}


+static bignode_t *
+find_node(
+	xfs_ino_t	ino)
+{
+	int		i;
+	nodelist_t	*nodelist;
+	bignode_t	*nodes;
+
+	nodelist = NH_HASH(ino);
+	nodes = nodelist->nodes;
+
+	for(i = 0; i < nodelist->lastnode; i++) {
+		if (nodes[i].ino == ino) {
By any chance do we read inodes in ascending order?  Or can they
be in random order?

If they are in ascending order then we could binary search here.
If not, and we call find_node() a lot, then it might be worth
sorting each list of nodes.

+			return &nodes[i];
+		}
+	}
+
+	return NULL;
+}


+		bignode_t	*nodes = nodehash[i].nodes;
+		for (j = 0; j < nodehash[i].lastnode; j++, nodes++)
+			dump_node("nodehash", nodes);
You have this code in various places.  You may be able to save a
few cycles by dropping the loop counter.  Note I have invented
listcount to be the actual number of nodes in the list.

	bignode_t	*nodes = nodehash[i].nodes;
	bignode_t	*lastnode = nodes + nodehash[i].listcount;
	for (; nodes < lastnode; nodes++)
		dump_node("nodehash", nodes);


+static int
+clone_attribs(
+	char		*source,
+	char		*target)
+{
+	char		list_buf[ATTRBUFSIZE];
May not be an issue putting 1k on stack here but could be a global
allocated on startup.

+	char		*attr_buf;
+	int		rval;
+
+	attr_buf = malloc(ATTR_MAX_VALUELEN * 2);
Could do this allocation on startup too - one less failure case to
worry about.

+	if (attr_buf == NULL) {
+		err_nomem();
+		return -1;
+	}
+	rval = attr_clone_copy(source, target, list_buf, attr_buf,
+			ATTR_MAX_VALUELEN * 2, 0);
+	if (rval == 0)
+		rval = attr_clone_copy(source, target, list_buf, attr_buf,
+				ATTR_MAX_VALUELEN * 2, ATTR_ROOT);
+	if (rval == 0)
+		rval = attr_clone_copy(source, target, list_buf, attr_buf,
+				ATTR_MAX_VALUELEN * 2, ATTR_SECURE);
+	free(attr_buf);
+	return rval;
+}


+	SET_PHASE(DIR_PHASE_7);
+
+	/* rename cur_target src */
+	rval = rename(cur_target, srcname);
+	if (rval != 0) {
+		/*
+		 * we can't abort since the src dir is now gone.
+		 * let the admin clean this one up
+		 */
+		err_message(_("unable to rename directory: %s to %s"),
+				cur_target, srcname);
+	}
+	goto quit;
+
+ quit_undo:
+	if (move_dirents(cur_target, srcname, &move_count) != 0) {
+		/* oh, dear lord... let the admin clean this one up */
+		err_message(_("unable to move directory contents back: %s to %s"),
+				cur_target, srcname);
+		goto quit;
+	}
Can we avoid these 'leave it to the admin to clean up' problems?
Could we rename the source directory to a temporary name, rename the
target to the source name and if all that works, remove the temporary
source otherwise remove the target and rename the source back again?
Not sure if that actually buys us anything since we're back to square
one if we can't rename the temporary source back again.


+static void
+update_recoverfile(void)
+{
+	static const char null_file[] = "0\n0\n0\n\ntarget: \ntemp: \nend\n";
+	static size_t	buf_size = 0;
+	static char	*buf = NULL;
+	int 		i, len;
+
+	if (recover_fd <= 0)
+		return;
+
+	if (cur_node == NULL || cur_phase == 0) {
+		/* inbetween processing or still scanning */
+		lseek(recover_fd, 0, SEEK_SET);
+		write(recover_fd, null_file, sizeof(null_file));
+		return;
+	}
+
+	ASSERT(highest_numpaths > 0);
+	if (buf == NULL) {
+		buf_size = (highest_numpaths + 3) * PATH_MAX;
+		buf = malloc(buf_size);
+		if (buf == NULL) {
+			err_nomem();
+			exit(1);
+		}
+	}
Should you check if highest_numpaths has increased and realloc the
buffer?  Or will we have finished the scan by the time we get here?

+
+	len = sprintf(buf, "%d\n%llu\n%d\n", cur_phase,
+			(long long)cur_node->ino, cur_node->ftw_flags);
+
+	for (i = 0; i < cur_node->numpaths; i++)
+		len += sprintf(buf + len, "%s\n", cur_node->paths[i]);
+
+	len += sprintf(buf + len, "target: %s\ntemp: %s\nend\n",
+			cur_target, cur_temp);
+
+	ASSERT(len < buf_size);
Can we use snprintf() instead?

+
+	lseek(recover_fd, 0, SEEK_SET);
+	ftruncate(recover_fd, 0);
+	write(recover_fd, buf, len);
+}

What's the test plan for xfs_reno?

Barry Naujok wrote:
> A couple changes from the first xfs_reno:
> 
>  - Major one is that symlinks are now supported, but only
>    owner, group and extended attributes are copied for them
>    (not times or inode attributes).
> 
>  - Man page!
> 
> 
> To make this better, ideally we need some form of
> "swap inodes" function in the kernel, where the entire
> contents of the inode themselves are swapped. This form
> can handle any inode and without any of the dir/file/attr/etc
> copy/swap mechanisms we have in xfs_reno.
> 
> Barry.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: REVIEW: xfs_reno #2
  2007-11-16  6:20   ` Timothy Shimmin
  2007-11-18 23:13     ` Vlad Apostolov
@ 2007-11-19 12:39     ` Christoph Hellwig
  2007-11-19 15:52       ` Eric Sandeen
  1 sibling, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2007-11-19 12:39 UTC (permalink / raw)
  To: Timothy Shimmin; +Cc: Vlad Apostolov, Barry Naujok, xfs@oss.sgi.com, xfs-dev

On Fri, Nov 16, 2007 at 05:20:28PM +1100, Timothy Shimmin wrote:
> Vlad Apostolov wrote:
> >
> >When the XFS parent pointers feature is released we would need to find
> >out to update the EA to point to the new inode parent directory. This may
> >not be that easy though.
> >
> Really?
> Apart from the swapping of extents, reno uses standard calls doesn't it,
> in which case any movement of inodes will have the parent pointers
> updated by the normal vnode ops (e.g. mkdir, rename) in the kernel.

What parent pointers?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: REVIEW: xfs_reno #2
  2007-11-19 12:39     ` Christoph Hellwig
@ 2007-11-19 15:52       ` Eric Sandeen
  2007-11-19 22:08         ` Vlad Apostolov
  0 siblings, 1 reply; 15+ messages in thread
From: Eric Sandeen @ 2007-11-19 15:52 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Timothy Shimmin, Vlad Apostolov, Barry Naujok, xfs@oss.sgi.com,
	xfs-dev

Christoph Hellwig wrote:
> On Fri, Nov 16, 2007 at 05:20:28PM +1100, Timothy Shimmin wrote:
>> Vlad Apostolov wrote:
>>> When the XFS parent pointers feature is released we would need to find
>>> out to update the EA to point to the new inode parent directory. This may
>>> not be that easy though.
>>>
>> Really?
>> Apart from the swapping of extents, reno uses standard calls doesn't it,
>> in which case any movement of inodes will have the parent pointers
>> updated by the normal vnode ops (e.g. mkdir, rename) in the kernel.
> 
> What parent pointers?
> 
> 

The ones not yet released I guess :)

>> Vlad Apostolov wrote:
>>> When the XFS parent pointers feature is released...

-Eric

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: REVIEW: xfs_reno #2
  2007-11-19 15:52       ` Eric Sandeen
@ 2007-11-19 22:08         ` Vlad Apostolov
  0 siblings, 0 replies; 15+ messages in thread
From: Vlad Apostolov @ 2007-11-19 22:08 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: Christoph Hellwig, Timothy Shimmin, Barry Naujok, xfs@oss.sgi.com,
	xfs-dev

Eric Sandeen wrote:
> Christoph Hellwig wrote:
>   
>> On Fri, Nov 16, 2007 at 05:20:28PM +1100, Timothy Shimmin wrote:
>>     
>>> Vlad Apostolov wrote:
>>>       
>>>> When the XFS parent pointers feature is released we would need to find
>>>> out to update the EA to point to the new inode parent directory. This may
>>>> not be that easy though.
>>>>
>>>>         
>>> Really?
>>> Apart from the swapping of extents, reno uses standard calls doesn't it,
>>> in which case any movement of inodes will have the parent pointers
>>> updated by the normal vnode ops (e.g. mkdir, rename) in the kernel.
>>>       
>> What parent pointers?
>>
>>
>>     
>
> The ones not yet released I guess :)
>   
It is a released and existing feature on XFS for Irix, that we are back 
porting to
Linux. You will see some patches soon.

Regards,
Vlad
>   
>>> Vlad Apostolov wrote:
>>>       
>>>> When the XFS parent pointers feature is released...
>>>>         
>
> -Eric
>   

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: REVIEW: xfs_reno #2
  2007-10-04  4:25 REVIEW: xfs_reno #2 Barry Naujok
                   ` (2 preceding siblings ...)
  2007-11-19  3:48 ` Lachlan McIlroy
@ 2007-11-20  1:36 ` David Chinner
  2007-11-23 14:30   ` Ruben Porras
  2008-03-06 16:11   ` Ruben Porras
  2008-03-06 16:10 ` Ruben Porras
  4 siblings, 2 replies; 15+ messages in thread
From: David Chinner @ 2007-11-20  1:36 UTC (permalink / raw)
  To: Barry Naujok; +Cc: xfs@oss.sgi.com, xfs-dev

On Thu, Oct 04, 2007 at 02:25:16PM +1000, Barry Naujok wrote:
> A couple changes from the first xfs_reno:
> 
>  - Major one is that symlinks are now supported, but only
>    owner, group and extended attributes are copied for them
>    (not times or inode attributes).
> 
>  - Man page!
> 
> 
> To make this better, ideally we need some form of
> "swap inodes" function in the kernel, where the entire
> contents of the inode themselves are swapped. This form
> can handle any inode and without any of the dir/file/attr/etc
> copy/swap mechanisms we have in xfs_reno.

Something like the attached patch?

This is proof-of-concept. I've compiled it but I haven't tested
it. Your mission, Barry, should you choose to accept it, it to
make it work ;)

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

---
 fs/xfs/linux-2.6/xfs_ioctl.c |    4 
 fs/xfs/xfs_dfrag.c           |  313 ++++++++++++++++++++++++++++++++++++-------
 fs/xfs/xfs_dfrag.h           |   24 ++-
 fs/xfs/xfs_fs.h              |    1 
 fs/xfs/xfs_trans.h           |    3 
 fs/xfs/xfsidbg.c             |    9 -
 6 files changed, 297 insertions(+), 57 deletions(-)

Index: 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_ioctl.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/linux-2.6/xfs_ioctl.c	2007-11-16 10:27:41.000000000 +1100
+++ 2.6.x-xfs-new/fs/xfs/linux-2.6/xfs_ioctl.c	2007-11-20 11:18:45.829822690 +1100
@@ -817,6 +817,10 @@ xfs_ioctl(
 		error = xfs_swapext((struct xfs_swapext __user *)arg);
 		return -error;
 	}
+	case XFS_IOC_SWAPINO: {
+		error = xfs_swapino((struct xfs_swapino __user *)arg);
+		return -error;
+	}
 
 	case XFS_IOC_FSCOUNTS: {
 		xfs_fsop_counts_t out;
Index: 2.6.x-xfs-new/fs/xfs/xfs_dfrag.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/xfs_dfrag.c	2007-11-16 10:27:41.000000000 +1100
+++ 2.6.x-xfs-new/fs/xfs/xfs_dfrag.c	2007-11-20 11:41:28.196327293 +1100
@@ -44,6 +44,20 @@
 #include "xfs_rw.h"
 #include "xfs_vnodeops.h"
 
+
+STATIC int
+xfs_swap_fd_to_inode(int fd, struct file **filp, xfs_inode_t **ip)
+{
+	*filp = fget(fd);
+	if (!*filp)
+		return EINVAL;
+
+	*ip = XFS_I((*filp)->f_path.dentry->d_inode);
+	if (!*ip)
+		return EBADF;
+	return 0;
+}
+
 /*
  * Syssgi interface for swapext
  */
@@ -53,75 +67,85 @@ xfs_swapext(
 {
 	xfs_swapext_t	*sxp;
 	xfs_inode_t     *ip=NULL, *tip=NULL;
-	xfs_mount_t     *mp;
 	struct file	*fp = NULL, *tfp = NULL;
-	bhv_vnode_t	*vp, *tvp;
-	int		error = 0;
+	int		error;
 
+	error = ENOMEM;
 	sxp = kmem_alloc(sizeof(xfs_swapext_t), KM_MAYFAIL);
-	if (!sxp) {
-		error = XFS_ERROR(ENOMEM);
+	if (!sxp)
 		goto error0;
-	}
-
-	if (copy_from_user(sxp, sxu, sizeof(xfs_swapext_t))) {
-		error = XFS_ERROR(EFAULT);
+	error = EFAULT;
+	if (copy_from_user(sxp, sxu, sizeof(xfs_swapext_t)))
 		goto error0;
-	}
 
-	/* Pull information for the target fd */
-	if (((fp = fget((int)sxp->sx_fdtarget)) == NULL) ||
-	    ((vp = vn_from_inode(fp->f_path.dentry->d_inode)) == NULL))  {
-		error = XFS_ERROR(EINVAL);
+	error = xfs_swap_fd_to_inode((int)sxp->sx_fdtarget, &fp, &ip);
+	if (error)
 		goto error0;
-	}
-
-	ip = xfs_vtoi(vp);
-	if (ip == NULL) {
-		error = XFS_ERROR(EBADF);
+	error = xfs_swap_fd_to_inode((int)sxp->sx_fdtmp, &tfp, &tip);
+	if (error)
 		goto error0;
-	}
 
-	if (((tfp = fget((int)sxp->sx_fdtmp)) == NULL) ||
-	    ((tvp = vn_from_inode(tfp->f_path.dentry->d_inode)) == NULL)) {
-		error = XFS_ERROR(EINVAL);
+	error = EINVAL;
+	if (ip->i_mount != tip->i_mount)
 		goto error0;
-	}
-
-	tip = xfs_vtoi(tvp);
-	if (tip == NULL) {
-		error = XFS_ERROR(EBADF);
+	if (ip->i_ino == tip->i_ino)
 		goto error0;
-	}
-
-	if (ip->i_mount != tip->i_mount) {
-		error =  XFS_ERROR(EINVAL);
+	error = EIO;
+	if (XFS_FORCED_SHUTDOWN(ip->i_mount))
 		goto error0;
-	}
 
-	if (ip->i_ino == tip->i_ino) {
-		error =  XFS_ERROR(EINVAL);
-		goto error0;
-	}
+	error = xfs_swap_extents(ip, tip, sxp);
+error0:
+	if (fp != NULL)
+		fput(fp);
+	if (tfp != NULL)
+		fput(tfp);
+	if (sxp != NULL)
+		kmem_free(sxp, sizeof(xfs_swapext_t));
+	return error;
+}
 
-	mp = ip->i_mount;
 
-	if (XFS_FORCED_SHUTDOWN(mp)) {
-		error =  XFS_ERROR(EIO);
+int
+xfs_swapino(
+	xfs_swapino_t	__user *siu)
+{
+	xfs_swapino_t	*sino;
+	xfs_inode_t     *ip=NULL, *tip=NULL;
+	struct file	*fp = NULL, *tfp = NULL;
+	int		error;
+
+	error = ENOMEM;
+	sino = kmem_alloc(sizeof(xfs_swapino_t), KM_MAYFAIL);
+	if (!sino)
+		goto error0;
+	error = EFAULT;
+	if (copy_from_user(sino, siu, sizeof(xfs_swapino_t)))
 		goto error0;
-	}
 
-	error = xfs_swap_extents(ip, tip, sxp);
+	error = xfs_swap_fd_to_inode((int)sino->sx_fdtarget, &fp, &ip);
+	if (error)
+		goto error0;
+	error = xfs_swap_fd_to_inode((int)sino->sx_fdtmp, &tfp, &tip);
+	if (error)
+		goto error0;
+	error = EINVAL;
+	if (ip->i_mount != tip->i_mount)
+		goto error0;
+	if (ip->i_ino == tip->i_ino)
+		goto error0;
+	error = EIO;
+	if (XFS_FORCED_SHUTDOWN(ip->i_mount))
+		goto error0;
 
- error0:
+	error = xfs_swap_inodes(ip, tip, sino);
+error0:
 	if (fp != NULL)
 		fput(fp);
 	if (tfp != NULL)
 		fput(tfp);
-
-	if (sxp != NULL)
-		kmem_free(sxp, sizeof(xfs_swapext_t));
-
+	if (sino != NULL)
+		kmem_free(sino, sizeof(xfs_swapino_t));
 	return error;
 }
 
@@ -397,3 +421,198 @@ xfs_swap_extents(
 		kmem_free(tempifp, sizeof(xfs_ifork_t));
 	return error;
 }
+
+STATIC void
+xfs_swapino_log_fields(
+	xfs_trans_t	*tp,
+	xfs_inode_t	*ip)
+{
+	int		ilf_fields = XFS_ILOG_CORE;
+
+	switch(ip->i_d.di_format) {
+	case XFS_DINODE_FMT_EXTENTS:
+		/* If the extents fit in the inode, fix the
+		 * pointer.  Otherwise it's already NULL or
+		 * pointing to the extent.
+		 */
+		if (ip->i_d.di_nextents <= XFS_INLINE_EXTS) {
+			xfs_ifork_t	*ifp = &ip->i_df;
+			ifp->if_u1.if_extents = ifp->if_u2.if_inline_ext;
+		}
+		ilf_fields |= XFS_ILOG_DEXT;
+		break;
+	case XFS_DINODE_FMT_BTREE:
+		ilf_fields |= XFS_ILOG_DBROOT;
+		break;
+	}
+
+	switch(ip->i_d.di_aformat) {
+	case XFS_DINODE_FMT_LOCAL:
+		ilf_fields |= XFS_ILOG_ADATA;
+		break;
+	case XFS_DINODE_FMT_EXTENTS:
+		/* If the extents fit in the inode, fix the
+		 * pointer.  Otherwise it's already NULL or
+		 * pointing to the extent.
+		 */
+		if (ip->i_d.di_nextents <= XFS_INLINE_EXTS) {
+			xfs_ifork_t	*ifp = ip->i_afp;
+			ifp->if_u1.if_extents = ifp->if_u2.if_inline_ext;
+		}
+		ilf_fields |= XFS_ILOG_AEXT;
+		break;
+	case XFS_DINODE_FMT_BTREE:
+		ilf_fields |= XFS_ILOG_ABROOT;
+		break;
+	}
+	xfs_trans_log_inode(tp, ip, ilf_fields);
+}
+
+int
+xfs_swap_inodes(
+	xfs_inode_t	*ip,
+	xfs_inode_t	*tip,
+	xfs_swapino_t	*sino)
+{
+	xfs_mount_t	*mp;
+	xfs_inode_t	*ips[2];
+	xfs_trans_t	*tp;
+	xfs_icdinode_t	*dic = NULL;
+	xfs_ifork_t	*tempifp, *ifp, *tifp, *i_afp;
+	static uint	lock_flags = XFS_ILOCK_EXCL | XFS_IOLOCK_EXCL;
+	int		error;
+	char		locked = 0;
+
+	mp = ip->i_mount;
+	error = ENOMEM;
+	tempifp = kmem_alloc(sizeof(xfs_ifork_t), KM_MAYFAIL);
+	if (!tempifp)
+		goto error0;
+	dic = kmem_alloc(sizeof(xfs_dinode_core_t), KM_MAYFAIL);
+	if (!dic)
+		goto error0;
+
+	/* Lock in i_ino order */
+	if (ip->i_ino < tip->i_ino) {
+		ips[0] = ip;
+		ips[1] = tip;
+	} else {
+		ips[0] = tip;
+		ips[1] = ip;
+	}
+
+	xfs_lock_inodes(ips, 2, 0, lock_flags);
+	locked = 1;
+
+	/* Check permissions */
+	error = xfs_iaccess(ip, S_IWUSR, NULL);
+	if (error)
+		goto error0;
+	error = xfs_iaccess(tip, S_IWUSR, NULL);
+	if (error)
+		goto error0;
+
+	/* Verify that both files have the same format */
+	error = EINVAL;
+	if ((ip->i_d.di_mode & S_IFMT) != (tip->i_d.di_mode & S_IFMT))
+		goto error0;
+
+	/* Verify both files are either real-time or non-realtime */
+	if (XFS_IS_REALTIME_INODE(ip) != XFS_IS_REALTIME_INODE(tip))
+		goto error0;
+
+	if (VN_CACHED(tip->i_vnode) != 0) {
+		xfs_inval_cached_trace(tip, 0, -1, 0, -1);
+		error = xfs_flushinval_pages(tip, 0, -1,
+				FI_REMAPF_LOCKED);
+		if (error)
+			goto error0;
+	}
+
+	xfs_iunlock(ip, XFS_ILOCK_EXCL);
+	xfs_iunlock(tip, XFS_ILOCK_EXCL);
+
+	/*
+	 * There is a race condition here since we gave up the
+	 * ilock.  However, the data fork will not change since
+	 * we have the iolock (locked for truncation too) so we
+	 * are safe.  We don't really care if non-io related
+	 * fields change.
+	 */
+
+	xfs_tosspages(ip, 0, -1, FI_REMAPF);
+
+	tp = xfs_trans_alloc(mp, XFS_TRANS_SWAPINO);
+	error = xfs_trans_reserve(tp, 0, 2 * XFS_ICHANGE_LOG_RES(mp), 0, 0, 0);
+	if (error) {
+		xfs_iunlock(ip,  XFS_IOLOCK_EXCL);
+		xfs_iunlock(tip, XFS_IOLOCK_EXCL);
+		xfs_trans_cancel(tp, 0);
+		locked = 0;
+		goto error0;
+	}
+	xfs_lock_inodes(ips, 2, 0, XFS_ILOCK_EXCL);
+
+	/*
+	 * Swap the inode cores - structure copies.
+	 */
+	*dic = ip->i_d;
+	ip->i_d = tip->i_d;
+	tip->i_d = *dic;
+
+	/*
+	 * Swap the data forks of the inodes - structure copies
+	 */
+	ifp = &ip->i_df;
+	tifp = &tip->i_df;
+	*tempifp = *ifp;
+	*ifp = *tifp;
+	*tifp = *tempifp;
+
+	/*
+	 * Swap the attribute forks
+	 */
+	i_afp = ip->i_afp;
+	ip->i_afp = tip->i_afp;
+	tip->i_afp = i_afp;
+
+	/*
+	 * Increment vnode ref counts since xfs_trans_commit &
+	 * xfs_trans_cancel will both unlock the inodes and
+	 * decrement the associated ref counts.
+	 */
+	VN_HOLD(ip->i_vnode);
+	VN_HOLD(tip->i_vnode);
+	xfs_trans_ijoin(tp, ip, lock_flags);
+	xfs_trans_ijoin(tp, tip, lock_flags);
+
+
+	/*
+	 * log both entire inodes
+	 */
+	xfs_swapino_log_fields(tp, ip);
+	xfs_swapino_log_fields(tp, tip);
+
+	/*
+	 * If this is a synchronous mount, make sure that the
+	 * transaction goes to disk before returning to the user.
+	 */
+	if (mp->m_flags & XFS_MOUNT_WSYNC)
+		xfs_trans_set_sync(tp);
+
+	error = xfs_trans_commit(tp, XFS_TRANS_SWAPINO);
+	locked = 0;
+
+ error0:
+	if (locked) {
+		xfs_iunlock(ip,  lock_flags);
+		xfs_iunlock(tip, lock_flags);
+	}
+	vn_revalidate(ip->i_vnode);
+	vn_revalidate(tip->i_vnode);
+	if (dic)
+		kmem_free(dic, sizeof(xfs_icdinode_t));
+	if (tempifp)
+		kmem_free(tempifp, sizeof(xfs_ifork_t));
+	return error;
+}
Index: 2.6.x-xfs-new/fs/xfs/xfs_dfrag.h
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/xfs_dfrag.h	2007-01-16 10:54:17.000000000 +1100
+++ 2.6.x-xfs-new/fs/xfs/xfs_dfrag.h	2007-11-20 11:20:01.364010037 +1100
@@ -21,7 +21,6 @@
 /*
  * Structure passed to xfs_swapext
  */
-
 typedef struct xfs_swapext
 {
 	__int64_t	sx_version;	/* version */
@@ -38,19 +37,34 @@ typedef struct xfs_swapext
  */
 #define XFS_SX_VERSION		0
 
-#ifdef __KERNEL__
 /*
- * Prototypes for visible xfs_dfrag.c routines.
+ * Structure passed to xfs_swapext
  */
+typedef struct xfs_swapino
+{
+	__int64_t	sx_version;	/* version */
+	__int64_t	sx_fdtarget;	/* fd of target file */
+	__int64_t	sx_fdtmp;	/* fd of tmp file */
+	char		sx_pad[16];	/* pad space, unused */
+} xfs_swapino_t;
 
 /*
- * Syscall interface for xfs_swapext
+ * Version flag
+ */
+#define XFS_SI_VERSION		0
+
+#ifdef __KERNEL__
+/*
+ * Prototypes for visible xfs_dfrag.c routines.
  */
-int	xfs_swapext(struct xfs_swapext __user *sx);
 
+int	xfs_swapext(struct xfs_swapext __user *sx);
 int	xfs_swap_extents(struct xfs_inode *ip, struct xfs_inode *tip,
 		struct xfs_swapext *sxp);
 
+int	xfs_swapino(struct xfs_swapino __user *si);
+int	xfs_swap_inodes(struct xfs_inode *ip, struct xfs_inode *tip,
+		struct xfs_swapino *sino);
 #endif	/* __KERNEL__ */
 
 #endif	/* __XFS_DFRAG_H__ */
Index: 2.6.x-xfs-new/fs/xfs/xfs_fs.h
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/xfs_fs.h	2007-10-15 09:58:18.000000000 +1000
+++ 2.6.x-xfs-new/fs/xfs/xfs_fs.h	2007-11-20 11:19:54.640883392 +1100
@@ -480,6 +480,7 @@ typedef struct xfs_handle {
 #define XFS_IOC_ATTRMULTI_BY_HANDLE  _IOW ('X', 123, struct xfs_fsop_attrmulti_handlereq)
 #define XFS_IOC_FSGEOMETRY	     _IOR ('X', 124, struct xfs_fsop_geom)
 #define XFS_IOC_GOINGDOWN	     _IOR ('X', 125, __uint32_t)
+#define XFS_IOC_SWAPINO		     _IOWR('X', 126, struct xfs_swapino)
 /*	XFS_IOC_GETFSUUID ---------- deprecated 140	 */
 
 
Index: 2.6.x-xfs-new/fs/xfs/xfs_trans.h
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/xfs_trans.h	2007-11-16 11:32:26.000000000 +1100
+++ 2.6.x-xfs-new/fs/xfs/xfs_trans.h	2007-11-20 11:28:17.027542129 +1100
@@ -95,7 +95,8 @@ typedef struct xfs_trans_header {
 #define	XFS_TRANS_GROWFSRT_FREE		39
 #define	XFS_TRANS_SWAPEXT		40
 #define	XFS_TRANS_SB_COUNT		41
-#define	XFS_TRANS_TYPE_MAX		41
+#define	XFS_TRANS_SWAPINO		42
+#define	XFS_TRANS_TYPE_MAX		42
 /* new transaction types need to be reflected in xfs_logprint(8) */
 
 
Index: 2.6.x-xfs-new/fs/xfs/xfsidbg.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/xfsidbg.c	2007-11-16 11:32:26.000000000 +1100
+++ 2.6.x-xfs-new/fs/xfs/xfsidbg.c	2007-11-20 11:29:03.701447244 +1100
@@ -5911,11 +5911,12 @@ xfsidbg_print_trans_type(unsigned int t_
 	case XFS_TRANS_GROWFSRT_ALLOC:	kdb_printf("GROWFSRT_ALLOC");	break;
 	case XFS_TRANS_GROWFSRT_ZERO:	kdb_printf("GROWFSRT_ZERO");	break;
 	case XFS_TRANS_GROWFSRT_FREE:	kdb_printf("GROWFSRT_FREE");	break;
-  	case XFS_TRANS_SWAPEXT:		kdb_printf("SWAPEXT");		break;
+	case XFS_TRANS_SWAPEXT:		kdb_printf("SWAPEXT");		break;
 	case XFS_TRANS_SB_COUNT:	kdb_printf("SB_COUNT");		break;
- 	case XFS_TRANS_DUMMY1:		kdb_printf("DUMMY1");		break;
- 	case XFS_TRANS_DUMMY2:		kdb_printf("DUMMY2");		break;
- 	case XLOG_UNMOUNT_REC_TYPE:	kdb_printf("UNMOUNT");		break;
+	case XFS_TRANS_SWAPINO:		kdb_printf("SWAPINO");		break;
+	case XFS_TRANS_DUMMY1:		kdb_printf("DUMMY1");		break;
+	case XFS_TRANS_DUMMY2:		kdb_printf("DUMMY2");		break;
+	case XLOG_UNMOUNT_REC_TYPE:	kdb_printf("UNMOUNT");		break;
 	default:			kdb_printf("unknown(0x%x)", t_type); break;
 	}
 }

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: REVIEW: xfs_reno #2
  2007-11-20  1:36 ` David Chinner
@ 2007-11-23 14:30   ` Ruben Porras
  2008-03-06 16:11   ` Ruben Porras
  1 sibling, 0 replies; 15+ messages in thread
From: Ruben Porras @ 2007-11-23 14:30 UTC (permalink / raw)
  To: David Chinner; +Cc: Barry Naujok, xfs@oss.sgi.com, xfs-dev

David Chinner schrieb:
> On Thu, Oct 04, 2007 at 02:25:16PM +1000, Barry Naujok wrote:
>   
>> A couple changes from the first xfs_reno:
>>
>>  - Major one is that symlinks are now supported, but only
>>    owner, group and extended attributes are copied for them
>>    (not times or inode attributes).
>>
>>  - Man page!
>>
>>
>> To make this better, ideally we need some form of
>> "swap inodes" function in the kernel, where the entire
>> contents of the inode themselves are swapped. This form
>> can handle any inode and without any of the dir/file/attr/etc
>> copy/swap mechanisms we have in xfs_reno.
>>     
>
> Something like the attached patch?
>
> This is proof-of-concept. I've compiled it but I haven't tested
> it. Your mission, Barry, should you choose to accept it, it to
> make it work ;)
>
> Cheers,
>
> Dave.
>   
Great!

Inline are changes to xfs_reno to make it use the new ioctl. It is also 
a proof-of-concept. I've compiled it but I haven't tested it ;)

Now process_(dir|file|slink) do more or less the same, should be a good 
idea to mix them in one function as I did with them inside the "recover" 
function.

I also did s/sx/si/ on the xfs_swapino_t to make the notation consistent 
with xfs_swapext_t.

Some questions,

why do we use ftruncate on files?

before moving file inodes, XFS_XFLAG_IMMUTABLE | XFS_XFLAG_APPEND are 
checked. I extended the check also for directories but not for symlinks. 
Under which conditions are these flags setted. Can't we skip the test no 
that we do struct copy of the inode?

Have a nice weekend.

-- 
Rubén Porras
LinWorks GmbH

--
xfs_reno.c |  737 +++++++++++++++++--------------------------------------------
 1 file changed, 212 insertions(+), 525 deletions(-)

--- xfs_reno_old.c	2007-11-22 18:55:36.276029053 +0100
+++ xfs_reno.c	2007-11-23 15:00:53.283564811 +0100
@@ -50,30 +50,24 @@
 #include <xfs/xfs_dfrag.h>
 #include <xfs/xfs_inum.h>
 
-#define ATTRBUFSIZE	1024
-
 #define SCAN_PHASE	0x00
 #define DIR_PHASE	0x10	/* nothing done or all done */
-#define DIR_PHASE_1	0x11	/* target dir created */
-#define DIR_PHASE_2	0x12	/* temp dir created */
-#define DIR_PHASE_3	0x13	/* attributes backed up to temp */
-#define DIR_PHASE_4	0x14	/* dirents moved to target dir */
-#define DIR_PHASE_5	0x15	/* attributes applied to target dir */
-#define DIR_PHASE_6	0x16	/* src dir removed */
-#define DIR_PHASE_7	0x17	/* temp dir removed */
-#define DIR_PHASE_MAX	0x17
+#define DIR_PHASE_1	0x11	/* temp dir created */
+#define DIR_PHASE_2	0x12	/* swapped extents and inodes */
+#define DIR_PHASE_3	0x13	/* src dir removed */
+#define DIR_PHASE_MAX	0x13	/* renamed temp to source name */
 #define FILE_PHASE	0x20	/* nothing done or all done */
 #define FILE_PHASE_1	0x21	/* temp file created */
-#define FILE_PHASE_2	0x22	/* swapped extents */
+#define FILE_PHASE_2	0x22	/* swapped extents and inodes */
 #define FILE_PHASE_3	0x23	/* unlinked source */
-#define FILE_PHASE_4	0x24	/* renamed temp to source name */
-#define FILE_PHASE_MAX	0x24
+#define FILE_PHASE_4	0x24	/* hard links copied */
+#define FILE_PHASE_MAX	0x24	/* renamed temp to source name */
 #define SLINK_PHASE	0x30	/* nothing done or all done */
 #define SLINK_PHASE_1	0x31	/* temp symlink created */
 #define SLINK_PHASE_2	0x32	/* symlink attrs copied */
 #define SLINK_PHASE_3	0x33	/* unlinked source */
-#define SLINK_PHASE_4	0x34	/* renamed temp to source name */
-#define SLINK_PHASE_MAX	0x34
+#define SLINK_PHASE_4	0x34	/* hard links copied */
+#define SLINK_PHASE_MAX	0x34	/* renamed temp to source name */
 
 static void update_recoverfile(void);
 #define SET_PHASE(x)	(cur_phase = x, update_recoverfile())
@@ -117,7 +111,6 @@
 static time_t		starttime;
 static bignode_t	*cur_node;
 static char		*cur_target;
-static char		*cur_temp;
 static int		cur_phase;
 static int		highest_numpaths;
 static char		*recover_file;
@@ -189,6 +182,60 @@
 	err_message(_("Cannot stat %s: %s\n"), s, strerror(errno));
 }
 
+static void
+err_swapino(
+	    int err,
+	    const char *srcname)
+{
+	if (log_level >= LOG_DEBUG) {
+		switch (err) {
+		case EIO:
+			err_message(_("Filesystem is going down: %s: %s"),
+				srcname, strerror(err));
+			break;
+
+		default:
+			err_message(_("Swap inode failed: %s: %s"),
+				srcname, strerror(err));
+			break;
+		}
+	} else
+		err_message(_("Swap inode failed: %s: %s"),
+				srcname, strerror(err));
+}
+
+static void
+err_swapext(
+	    int err,
+	    const char *srcname,
+	    xfs_off_t bs_size)
+{
+	if (log_level >= LOG_DEBUG) {
+		switch (err) {
+		case ENOTSUP:
+			err_message("%s: file type not supported",
+				srcname);
+			break;
+		case EFAULT:
+			/* The file has changed since we started the copy */
+			err_message("%s: file modified, "
+				 "inode renumber aborted: %ld",
+				 srcname, bs_size);
+			break;
+		case EBUSY:
+			/* Timestamp has changed or mmap'ed file */
+			err_message("%s: file busy", srcname);
+			break;
+		default:
+			err_message(_("Swap extents failed: %s: %s"),
+				srcname, strerror(errno));
+			break;
+		}
+	} else
+		err_message(_("Swap extents failed: %s: %s"),
+				srcname, strerror(errno));
+}
+
 /*
  * usage message
  */
@@ -224,15 +271,15 @@
 }
 
 static int
-xfs_getxattr(int fd, struct fsxattr *attr)
+xfs_swapino(int fd, xfs_swapino_t *iu)
 {
-	return ioctl(fd, XFS_IOC_FSGETXATTR, attr);
+	return ioctl(fd, XFS_IOC_SWAPINO, iu);
 }
 
 static int
-xfs_setxattr(int fd, struct fsxattr *attr)
+xfs_getxattr(int fd, struct fsxattr *attr)
 {
-	return ioctl(fd, XFS_IOC_FSSETXATTR, attr);
+	return ioctl(fd, XFS_IOC_FSGETXATTR, attr);
 }
 
 /*
@@ -461,253 +508,19 @@
 	return 0;
 }
 
-/*
- * Attribute cloning code - most of this is here because attr_copy does not
- * let us pick and choose which attributes we want to copy.
- */
-
-attr_multiop_t	attr_ops[ATTR_MAX_MULTIOPS];
-
-/*
- * Grab attributes specified in attr_ops from source file and write them
- * out on the destination file.
- */
-
-static int
-attr_replicate(
-	char		*source,
-	char		*target,
-	int		count)
-{
-	int		j, k;
-
-	if (attr_multi(source, attr_ops, count, ATTR_DONTFOLLOW) < 0)
-		return -1;
-
-	for (k = 0; k < count; k++) {
-		if (attr_ops[k].am_error) {
-			err_message(_("Error %d getting attribute"),
-					attr_ops[k].am_error);
-			break;
-		}
-		attr_ops[k].am_opcode = ATTR_OP_SET;
-	}
-	if (attr_multi(target, attr_ops, k, ATTR_DONTFOLLOW) < 0)
-		err_message("on attr_multif set");
-	for (j = 0; j < k; j++) {
-		if (attr_ops[j].am_error) {
-			err_message(_("Error %d setting attribute"),
-					attr_ops[j].am_error);
-			return -1;
-		}
-	}
-
-	return 0;
-}
-
-/*
- * Copy all the attributes specified from src to dst.
- */
-
-static int
-attr_clone_copy(
-	char		*source,
-	char		*target,
-	char		*list_buf,
-	char		*attr_buf,
-	int		buf_len,
-	int		flags)
-{
-        attrlist_t 	*alist;
-        attrlist_ent_t	*attr;
-        attrlist_cursor_t cursor;
-        int		space, i, j;
-	char		*ptr;
-
-        bzero((char *)&cursor, sizeof(cursor));
-        do {
-                if (attr_list(source, list_buf, ATTRBUFSIZE,
-                		flags | ATTR_DONTFOLLOW, &cursor) < 0) {
-			err_message("on attr_listf");
-                        return -1;
-		}
-
-                alist = (attrlist_t *)list_buf;
-
-		space = buf_len;
-		ptr = attr_buf;
-                for (j = 0, i = 0; i < alist->al_count; i++) {
-                        attr = ATTR_ENTRY(list_buf, i);
-			if (space < attr->a_valuelen) {
-				if (attr_replicate(source, target, j) < 0)
-					return -1;
-				j = 0;
-				space = buf_len;
-				ptr = attr_buf;
-			}
-			attr_ops[j].am_opcode = ATTR_OP_GET;
-			attr_ops[j].am_attrname = attr->a_name;
-			attr_ops[j].am_attrvalue = ptr;
-			attr_ops[j].am_length = (int) attr->a_valuelen;
-			attr_ops[j].am_flags = flags;
-			attr_ops[j].am_error = 0;
-			j++;
-			ptr += attr->a_valuelen;
-			space -= attr->a_valuelen;
-                }
-
-		log_message(LOG_NITTY, "copying attribute %d", i);
-
-		if (j) {
-			if (attr_replicate(source, target, j) < 0)
-				return -1;
-		}
-
-        } while (alist->al_more);
-
-        return 0;
-}
-
-static int
-clone_attribs(
-	char		*source,
-	char		*target)
-{
-	char		list_buf[ATTRBUFSIZE];
-	char		*attr_buf;
-	int		rval;
-
-	attr_buf = malloc(ATTR_MAX_VALUELEN * 2);
-	if (attr_buf == NULL) {
-		err_nomem();
-		return -1;
-	}
-	rval = attr_clone_copy(source, target, list_buf, attr_buf,
-			ATTR_MAX_VALUELEN * 2, 0);
-	if (rval == 0)
-		rval = attr_clone_copy(source, target, list_buf, attr_buf,
-				ATTR_MAX_VALUELEN * 2, ATTR_ROOT);
-	if (rval == 0)
-		rval = attr_clone_copy(source, target, list_buf, attr_buf,
-				ATTR_MAX_VALUELEN * 2, ATTR_SECURE);
-	free(attr_buf);
-	return rval;
-}
-
-static int
-dup_attributes(
-	char		*source,
-	int		sfd,
-	char		*target,
-	int		tfd)
-{
-	struct stat64	st;
-	struct timeval	tv[2];
-	struct fsxattr	fsx;
-
-	if (fstat64(sfd, &st) < 0) {
-		err_stat(source);
-		return -1;
-	}
-
-	if (xfs_getxattr(sfd, &fsx) < 0) {
-		err_stat(source);
-		return -1;
-	}
-
-	tv[0].tv_sec = st.st_atim.tv_sec;
-	tv[0].tv_usec = st.st_atim.tv_nsec / 1000;
-	tv[1].tv_sec = st.st_mtim.tv_sec;
-	tv[1].tv_usec = st.st_mtim.tv_nsec / 1000;
-
-	if (futimes(tfd, tv) < 0)
-		err_message(_("%s: Cannot update target times"), target);
-
-	if (fchown(tfd, st.st_uid, st.st_gid) < 0) {
-		err_message(_("%s: Cannot change target ownership to "
-				"uid(%d) gid(%d)"), target,
-				st.st_uid, st.st_gid);
-
-		if (fchmod(tfd, st.st_mode & ~(S_ISUID | S_ISGID)) < 0)
-			err_message(_("%s: Cannot change target mode "
-					"to (%o)"), target, st.st_mode);
-	} else if (fchmod(tfd, st.st_mode) < 0)
-		err_message(_("%s: Cannot change target mode to (%o)"),
-				target, st.st_mode);
-
-	if (xfs_setxattr(tfd, &fsx) < 0)
-		err_message(_("%s: Cannet set target extended "
-				"attributes"), target);
-
-	return clone_attribs(source, target);
-}
-
-static int
-move_dirents(
-	char		*srcpath,
-	char		*targetpath,
-	int		*move_count)
-{
-	int		rval = 0;
-	DIR		*srcd;
-	struct dirent64	*dp;
-	char		srcname[PATH_MAX];
-	char		targetname[PATH_MAX];
-
-	*move_count = 0;
-
-	srcd = opendir(srcpath);
-	if (srcd == NULL) {
-		err_open(srcpath);
-		return 1;
-	}
-
-	while ((dp = readdir64(srcd)) != NULL) {
-		if (dp->d_ino == 0 || !strcmp(dp->d_name, ".") ||
-				!strcmp(dp->d_name, ".."))
-			continue;
-
-		if (strlen(srcpath) + 1 + strlen(dp->d_name) >=
-				sizeof(srcname) - 1) {
-
-			err_message(_("%s/%s: Name too long"), srcpath,
-					dp->d_name);
-			rval = 1;
-			goto quit;
-		}
-
-		sprintf(srcname, "%s/%s", srcpath, dp->d_name);
-		sprintf(targetname, "%s/%s", targetpath, dp->d_name);
-
-		rval = rename(srcname, targetname);
-		if (rval != 0) {
-			err_message(_("failed to rename: \'%s\' to \'%s\'"),
-					srcname, targetname);
-			goto quit;
-		}
-
-		log_message(LOG_DEBUG, "rename %s -> %s", srcname, targetname);
-
-		(*move_count)++;
-	}
-
-quit:
-	closedir(srcd);
-	return rval;
-}
-
 static int
 process_dir(
 	bignode_t	*node)
 {
 	int		sfd = -1;
 	int		tfd = -1;
-	int		targetfd = -1;
 	int		rval = 0;
-	int		move_count = 0;
+	struct stat64	st;
 	char		*srcname = NULL;
 	char		*pname = NULL;
-	struct stat64	s1;
+	xfs_swapino_t	si;
+	xfs_swapext_t	sx;
+	xfs_bstat_t	bstatbuf;
 	struct fsxattr  fsx;
 	char		target[PATH_MAX] = "";
 
@@ -718,14 +531,19 @@
 	cur_node = node;
 	srcname = node->paths[0];
 
-	if (stat64(srcname, &s1) < 0) {
+	bzero(&st, sizeof(st));
+	bzero(&bstatbuf, sizeof(bstatbuf));
+	bzero(&si, sizeof(si));
+	bzero(&sx, sizeof(sx));
+
+	if (stat64(srcname, &st) < 0) {
 		if (errno != ENOENT) {
 			err_stat(srcname);
 			global_rval |= 2;
 		}
 		goto quit;
 	}
-	if (s1.st_ino <= XFS_MAXINUMBER_32 && !force_all) {
+	if (st.st_ino <= XFS_MAXINUMBER_32 && !force_all) {
 		/*
 		 * This directory has already changed ino's, probably due
 		 * to being moved during processing of a parent directory.
@@ -737,7 +555,7 @@
 	rval = 1;
 
 	sfd = open(srcname, O_RDONLY);
-	if (sfd < 0) {
+	if (sfd == -1) {
 		err_open(srcname);
 		goto quit;
 	}
@@ -754,7 +572,12 @@
 	if (fsx.fsx_xflags & (XFS_XFLAG_IMMUTABLE | XFS_XFLAG_APPEND)) {
 		err_message(_("%s: immutable/append, ignoring"), srcname);
 		global_rval |= 2;
-		rval = 0;
+		goto quit;
+	}
+
+	if (realuid != 0 && realuid != st.st_uid) {
+		errno = EACCES;
+		err_open(srcname);
 		goto quit;
 	}
 
@@ -770,7 +593,11 @@
 		err_message(_("Unable to create directory copy: %s"), srcname);
 		goto quit;
 	}
-	SET_PHASE(DIR_PHASE_1);
+	tfd = open(target, O_RDONLY);
+	if (tfd == -1) {
+		err_open(target);
+		goto quit;
+	}
 
 	cur_target = strdup(target);
 	if (!cur_target) {
@@ -778,81 +605,64 @@
 		goto quit;
 	}
 
-	sprintf(target, "%s/%sXXXXXX", pname, cmd_prefix);
-	if (mkdtemp(target) == NULL) {
-		err_message(_("unable to create tmp directory copy"));
-		goto quit;
-	}
-	SET_PHASE(DIR_PHASE_2);
+	SET_PHASE(DIR_PHASE_1);
 
-	cur_temp = strdup(target);
-	if (!cur_temp) {
-		err_nomem();
-		goto quit;
-	}
+	/* swapino src target */
+	si.si_version  = XFS_SI_VERSION;
+	si.si_fdtarget = tfd;
+	si.si_fdtmp    = sfd;
 
-	tfd = open(cur_temp, O_RDONLY);
-	if (tfd < 0) {
-		err_open(cur_temp);
-		goto quit;
+	/* swap the inodes */
+	rval = xfs_swapino(tfd, &si);
+	if (rval < 0) {
+		err_swapino(rval, srcname);
+		goto quit_unlink;
 	}
 
-	targetfd = open(cur_target, O_RDONLY);
-	if (tfd < 0) {
-		err_open(cur_target);
+	if (xfs_bulkstat_single(sfd, &st.st_ino, &bstatbuf) < 0) {
+		err_message(_("unable to bulkstat source file: %s"),
+				srcname);
+		unlink(target);
 		goto quit;
 	}
 
-
-	/* copy timestamps, attribs and EAs, to cur_temp */
-	rval = dup_attributes(srcname, sfd, cur_temp, tfd);
-	if (rval != 0) {
-		err_message(_("unable to duplicate directory attributes: %s"),
+	if (bstatbuf.bs_ino != st.st_ino) {
+		err_message(_("bulkstat of source file returned wrong inode: %s"),
 			    srcname);
-		goto quit_unlink;
+		unlink(target);
+		goto quit;
 	}
 
-	SET_PHASE(DIR_PHASE_3);
-
-	/* move src dirents to cur_target (this changes timestamps on src) */
-	rval = move_dirents(srcname, cur_target, &move_count);
-	if (rval != 0) {
-		err_message(_("unable to move directory contents: %s to %s"),
-				srcname, cur_target);
-		/* uh oh, move everything back... */
-		if (move_count > 0)
-			goto quit_undo;
-	}
+	ftruncate64(tfd, bstatbuf.bs_size);
 
-	SET_PHASE(DIR_PHASE_4);
+	/* swapextents src target */
+	sx.sx_stat     = bstatbuf; /* struct copy */
+	sx.sx_version  = XFS_SX_VERSION;
+	sx.sx_fdtarget = sfd;
+	sx.sx_fdtmp    = tfd;
+	sx.sx_offset   = 0;
+	sx.sx_length   = bstatbuf.bs_size;
 
-	/* copy timestamps, attribs and EAs from cur_temp to cur_target */
-	rval = dup_attributes(cur_temp, tfd, cur_target, targetfd);
-	if (rval != 0) {
-		err_message(_("unable to duplicate directory attributes: %s"),
-				cur_temp);
+	/* Swap the extents */
+	rval = xfs_swapext(sfd, &sx);
+	if (rval < 0) {
+		err_swapext(rval, srcname, bstatbuf.bs_size);
 		goto quit_unlink;
 	}
 
-	SET_PHASE(DIR_PHASE_5);
+	SET_PHASE(DIR_PHASE_2);
 
 	/* rmdir src */
 	rval = rmdir(srcname);
 	if (rval != 0) {
 		err_message(_("unable to remove directory: %s"), srcname);
-		goto quit_undo;
+		goto quit;
 	}
 
-	SET_PHASE(DIR_PHASE_6);
-
-	rval = rmdir(cur_temp);
-	if (rval != 0)
-		err_message(_("unable to remove tmp directory: %s"), cur_temp);
-
-	SET_PHASE(DIR_PHASE_7);
+	SET_PHASE(DIR_PHASE_3);
 
 	/* rename cur_target src */
-	rval = rename(cur_target, srcname);
+	rval = rename(target, srcname);
 	if (rval != 0) {
 		/*
 		 * we can't abort since the src dir is now gone.
@@ -863,18 +673,10 @@
 	}
 	goto quit;
 
- quit_undo:
-	if (move_dirents(cur_target, srcname, &move_count) != 0) {
-		/* oh, dear lord... let the admin clean this one up */
-		err_message(_("unable to move directory contents back: %s to %s"),
-				cur_target, srcname);
-		goto quit;
-	}
-	SET_PHASE(DIR_PHASE_3);
-
  quit_unlink:
-	rmdir(cur_target);
-	rmdir(cur_temp);
+	rval = rmdir(target);
+	if (rval != 0)
+		err_message(_("unable to remove directory: %s"), target);
 
  quit:
 
@@ -884,16 +686,13 @@
 		close(sfd);
 	if (tfd >= 0)
 		close(tfd);
-	if (targetfd >= 0)
-		close(targetfd);
 
 	free(pname);
 	free(cur_target);
-	free(cur_temp);
 
 	cur_target = NULL;
-	cur_temp = NULL;
 	cur_node = NULL;
+
 	numdirsdone++;
 	return rval;
 }
@@ -906,9 +705,10 @@
 	int		tfd = -1;
 	int		i = 0;
 	int		rval = 0;
-	struct stat64	s1;
+	struct stat64	st;
 	char		*srcname = NULL;
 	char		*pname = NULL;
+	xfs_swapino_t	si;
 	xfs_swapext_t	sx;
 	xfs_bstat_t	bstatbuf;
 	struct fsxattr  fsx;
@@ -921,37 +721,36 @@
 	cur_node = node;
 	srcname = node->paths[0];
 
-	bzero(&s1, sizeof(s1));
+	bzero(&st, sizeof(st));
 	bzero(&bstatbuf, sizeof(bstatbuf));
+	bzero(&si, sizeof(si));
 	bzero(&sx, sizeof(sx));
 
-	if (stat64(srcname, &s1) < 0) {
+	if (stat64(srcname, &st) < 0) {
 		if (errno != ENOENT) {
 			err_stat(srcname);
 			global_rval |= 2;
 		}
 		goto quit;
 	}
-	if (s1.st_ino <= XFS_MAXINUMBER_32 && !force_all)
+	if (st.st_ino <= XFS_MAXINUMBER_32 && !force_all)
 		/* this file has changed, and no longer needs processing */
 		goto quit;
 
+	rval = 1;
 	/* open and sync source */
 	sfd = open(srcname, O_RDWR | O_DIRECT);
 	if (sfd < 0) {
 		err_open(srcname);
-		rval = 1;
 		goto quit;
 	}
 	if (!platform_test_xfs_fd(sfd)) {
 		err_not_xfs(srcname);
-		rval = 1;
 		goto quit;
 	}
 	if (fsync(sfd) < 0) {
 		err_message(_("sync failed: %s: %s"),
 				srcname, strerror(errno));
-		rval = 1;
 		goto quit;
 	}
 
@@ -963,7 +762,7 @@
 	 * but before all reads have completed to block xfs_reno reads.
 	 * This change just closes the window a bit.
 	 */
-	if ((s1.st_mode & S_ISGID) && !(s1.st_mode & S_IXGRP)) {
+	if ((st.st_mode & S_ISGID) && !(st.st_mode & S_IXGRP)) {
 		struct flock fl;
 
 		fl.l_type = F_RDLCK;
@@ -988,7 +787,6 @@
 
 	if (xfs_getxattr(sfd, &fsx) < 0) {
 		err_message(_("failed to get inode attrs: %s"), srcname);
-		rval = 1;
 		goto quit;
 	}
 	if (fsx.fsx_xflags & (XFS_XFLAG_IMMUTABLE | XFS_XFLAG_APPEND)) {
@@ -997,9 +795,7 @@
 		goto quit;
 	}
 
-	rval = 1;
-
-	if (realuid != 0 && realuid != s1.st_uid) {
+	if (realuid != 0 && realuid != st.st_uid) {
 		errno = EACCES;
 		err_open(srcname);
 		goto quit;
@@ -1012,9 +808,10 @@
 		goto quit;
 	}
 	dirname(pname);
+
 	sprintf(target, "%s/%sXXXXXX", pname, cmd_prefix);
 	tfd = mkstemp(target);
-	if (tfd < 0) {
+	if (tfd == -1) {
 		err_message("unable to create file copy");
 		goto quit;
 	}
@@ -1026,30 +823,26 @@
 
 	SET_PHASE(FILE_PHASE_1);
 
-	/* Setup direct I/O */
-	if (fcntl(tfd, F_SETFL, O_DIRECT) < 0 ) {
-		err_message(_("could not set O_DIRECT for %s on tmp: %s"),
-				srcname, target);
-		unlink(target);
-		goto quit;
-	}
+	/* swapino src target */
+	si.si_version  = XFS_SI_VERSION;
+	si.si_fdtarget = sfd;
+	si.si_fdtmp    = tfd;
 
-	/* copy attribs & EAs to target */
-	if (dup_attributes(srcname, sfd, target, tfd) != 0) {
-		err_message(_("unable to duplicate file attributes: %s"),
-				srcname);
-		unlink(target);
-		goto quit;
+	/* swap the inodes */
+	rval = xfs_swapino(sfd, &si);
+	if (rval < 0) {
+		err_swapino(rval, srcname);
+		goto quit_unlink;
 	}
 
-	if (xfs_bulkstat_single(sfd, &s1.st_ino, &bstatbuf) < 0) {
+	if (xfs_bulkstat_single(sfd, &st.st_ino, &bstatbuf) < 0) {
 		err_message(_("unable to bulkstat source file: %s"),
 				srcname);
 		unlink(target);
 		goto quit;
 	}
 
-	if (bstatbuf.bs_ino != s1.st_ino) {
+	if (bstatbuf.bs_ino != st.st_ino) {
 		err_message(_("bulkstat of source file returned wrong inode: %s"),
 				srcname);
 		unlink(target);
@@ -1069,44 +862,8 @@
 	/* Swap the extents */
 	rval = xfs_swapext(sfd, &sx);
 	if (rval < 0) {
-		if (log_level >= LOG_DEBUG) {
-			switch (errno) {
-			case ENOTSUP:
-				err_message("%s: file type not supported",
-					srcname);
-				break;
-			case EFAULT:
-				/* The file has changed since we started the copy */
-				err_message("%s: file modified, "
-					 "inode renumber aborted: %ld",
-					 srcname, bstatbuf.bs_size);
-				break;
-			case EBUSY:
-				/* Timestamp has changed or mmap'ed file */
-				err_message("%s: file busy", srcname);
-				break;
-			default:
-				err_message(_("Swap extents failed: %s: %s"),
-					srcname, strerror(errno));
-				break;
-			}
-		} else
-			err_message(_("Swap extents failed: %s: %s"),
-					srcname, strerror(errno));
-		goto quit;
-	}
-
-	if (bstatbuf.bs_dmevmask | bstatbuf.bs_dmstate) {
-		struct fsdmidata fssetdm;
-
-		/* Set the DMAPI Fields. */
-		fssetdm.fsd_dmevmask = bstatbuf.bs_dmevmask;
-		fssetdm.fsd_padding = 0;
-		fssetdm.fsd_dmstate = bstatbuf.bs_dmstate;
-
-		if (ioctl(tfd, XFS_IOC_FSSETDM, (void *)&fssetdm ) < 0)
-			err_message(_("attempt to set DMI attributes "
-					"of %s failed"), target);
+		err_swapext(rval, srcname, bstatbuf.bs_size);
+		goto quit_unlink;
 	}
 
 	SET_PHASE(FILE_PHASE_2);
@@ -1152,8 +909,12 @@
 		numfilesdone++;
 	}
 
+ quit_unlink:
+	rval = unlink(target);
+	if (rval != 0)
+		err_message(_("unable to remove file: %s"), target);
+
  quit:
-	cur_node = NULL;
 
 	SET_PHASE(FILE_PHASE);
 
@@ -1166,6 +927,7 @@
 	free(cur_target);
 
 	cur_target = NULL;
+	cur_node = NULL;
 
 	numfilesdone++;
 	return rval;
@@ -1177,12 +939,15 @@
 	bignode_t	*node)
 {
 	int		i = 0;
+	int		sfd = -1;
+	int		tfd = -1;
 	int		rval = 0;
 	struct stat64	st;
 	char		*srcname = NULL;
 	char		*pname = NULL;
 	char		target[PATH_MAX] = "";
 	char		linkbuf[PATH_MAX];
+	xfs_swapino_t	si;
 
 	SET_PHASE(SLINK_PHASE);
 
@@ -1191,6 +956,9 @@
 	cur_node = node;
 	srcname = node->paths[0];
 
+	bzero(&st, sizeof(st));
+	bzero(&si, sizeof(si));
+
 	if (lstat64(srcname, &st) < 0) {
 		if (errno != ENOENT) {
 			err_stat(srcname);
@@ -1204,6 +972,13 @@
 
 	rval = 1;
 
+	/* open source */
+	sfd = open(srcname, O_RDWR | O_DIRECT);
+	if (sfd < 0) {
+		err_open(srcname);
+		goto quit;
+	}
+
 	i = readlink(srcname, linkbuf, sizeof(linkbuf) - 1);
 	if (i < 0) {
 		err_message(_("unable to read symlink: %s"), srcname);
@@ -1226,7 +1001,8 @@
 	dirname(pname);
 
 	sprintf(target, "%s/%sXXXXXX", pname, cmd_prefix);
-	if (mktemp(target) == NULL) {
+	tfd = mkstemp(target);
+	if (tfd == -1) {
 		err_message(_("unable to create temp symlink name"));
 		goto quit;
 	}
@@ -1243,19 +1019,15 @@
 
 	SET_PHASE(SLINK_PHASE_1);
 
-	/* copy ownership & EAs to target */
-	if (lchown(target, st.st_uid, st.st_gid) < 0) {
-		err_message(_("%s: Cannot change target ownership to "
-				"uid(%d) gid(%d)"), target,
-				st.st_uid, st.st_gid);
-		unlink(target);
-		goto quit;
-	}
+	/* swapino src target */
+	si.si_version  = XFS_SI_VERSION;
+	si.si_fdtarget = sfd;
+	si.si_fdtmp    = tfd;
 
-	if (clone_attribs(srcname, target) != 0) {
-		err_message(_("unable to duplicate symlink attributes: %s"),
-				srcname);
-		unlink(target);
+	/* swap the inodes */
+	rval = xfs_swapino(sfd, &si);
+	if (rval < 0) {
+		err_swapino(rval, srcname);
 		goto quit;
 	}
 
@@ -1374,8 +1146,8 @@
 	for (i = 0; i < cur_node->numpaths; i++)
 		len += sprintf(buf + len, "%s\n", cur_node->paths[i]);
 
-	len += sprintf(buf + len, "target: %s\ntemp: %s\nend\n",
-			cur_target, cur_temp);
+/*	len += sprintf(buf + len, "target: %s\ntemp: %s\nend\n", */
+/*			cur_target, cur_temp); */
 
 	ASSERT(len < buf_size);
 
@@ -1468,7 +1240,7 @@
 	ino_t		ino;
 	int		ftw_flags;
 	char		buf[PATH_MAX + 10]; /* path + "target: " */
-	struct stat64	s;
+	struct stat64	st;
 	int		first_path;
 
 	/*
@@ -1543,12 +1315,12 @@
 		log_message(LOG_DEBUG, "path: '%s'", buf);
 
 		if (buf[0] == '/') {
-			if (stat64(buf, &s) < 0) {
+			if (stat64(buf, &st) < 0) {
 				err_message(_("Recovery failed: cannot "
 						"stat '%s'"), buf);
 				goto quit;
 			}
-			if (s.st_ino != ino) {
+			if (st.st_ino != ino) {
 				err_message(_("Recovery failed: inode "
 						"number for '%s' does not "
 						"match recorded number"), buf);
@@ -1569,7 +1341,7 @@
 				err_nomem();
 				goto quit;
 			}
-			if (stat64(*target, &s) < 0) {
+			if (stat64(*target, &st) < 0) {
 				err_message(_("Recovery failed: cannot "
 						"stat '%s'"), *target);
 				goto quit;
@@ -1619,12 +1391,10 @@
 	char		*tname,
 	int		phase)
 {
-	int		tfd = -1;
-	int		targetfd = -1;
 	char		*srcname = NULL;
 	int		rval = 0;
 	int		i;
-	int		move_count = 0;
+	int		dir;
 
 	dump_node("recover", node);
 	log_message(LOG_DEBUG, "target: %s, phase: %x", target, phase);
@@ -1632,137 +1402,54 @@
 	if (node)
 		srcname = node->paths[0];
 
+	dir = (phase < DIR_PHASE || phase > DIR_PHASE_MAX);
+
 	switch (phase) {
 
-	case DIR_PHASE_2:
-rmtemps:
-		log_message(LOG_NORMAL, _("Removing temporary directory: '%s'"),
-				tname);
-		if (rmdir(tname) < 0 && errno != ENOENT) {
-			err_message(_("unable to remove directory: %s"), tname);
-			rval = 1;
-		}
-		/* FALL THRU */
 	case DIR_PHASE_1:
-		log_message(LOG_NORMAL, _("Removing target directory: '%s'"),
-				target);
-		if (rmdir(target) < 0 && errno != ENOENT) {
-			err_message(_("unable to remove directory: %s"),
-					target);
-			rval = 1;
-		}
-		break;
+	case FILE_PHASE_1:
+	case SLINK_PHASE_1:
+		log_message(LOG_NORMAL, _("Unlinking temporary %s: \'%s\'"),
+			    dir ? "directory" : "file", target);
 
-	case DIR_PHASE_3:
-		log_message(LOG_NORMAL, _("Completing moving directory "
-				"contents: '%s' to '%s'"), srcname, target);
-		if (move_dirents(srcname, target, &move_count) != 0) {
-			err_message(_("unable to move directory contents: "
-					"%s to %s"), srcname, target);
-			/* uh oh, move everything back... */
-			if (move_count > 0) {
-				if (move_dirents(target, srcname,
-						&move_count) != 0) {
-					/* oh, dear lord... let the admin
-					 * clean this one up */
-					err_message(_("unable to move directory "
-						"contents back: %s to %s"),
-						target, srcname);
-					exit(1);
-				}
-			}
-			goto rmtemps;
-		}
-		/* FALL THRU */
-	case DIR_PHASE_4:
-		log_message(LOG_NORMAL, _("Setting attributes for target "
-				"directory: \'%s\'"), target);
-		tfd = open(tname, O_RDONLY);
-		if (tfd < 0) {
-			err_open(tname);
-			rval = 1;
-			break;
-		}
-		targetfd = open(target, O_RDONLY);
-		if (targetfd < 0) {
-			err_open(target);
-			rval = 1;
-			break;
-		}
-		rval = dup_attributes(tname, tfd, target, targetfd);
-		if (rval != 0) {
-			err_message(_("unable to duplicate directory "
-					"attributes: %s"), tname);
-			break;
-		}
-		close(tfd);
-		close(targetfd);
-		/* FALL THRU */
-	case DIR_PHASE_6:
-		log_message(LOG_NORMAL, _("Removing temporary directory: \'%s\'"),
-				tname);
-		if (rmdir(tname) < 0 && errno != ENOENT) {
-			err_message(_("unable to remove directory: %s"),
-					tname);
-			rval = 1;
-			break;
-		}
-		/* FALL THRU */
-	case DIR_PHASE_5:
-		log_message(LOG_NORMAL, _("Removing old directory: \'%s\'"),
-				srcname);
-		if (rmdir(srcname) < 0 && errno != ENOENT) {
-			err_message(_("unable to remove directory: %s"),
-					srcname);
-			rval = 1;
-			break;
-		}
-		/* FALL THRU */
-	case DIR_PHASE_7:
-		log_message(LOG_NORMAL, _("Renaming new directory to old "
-			"directory: \'%s\' -> \'%s\'"), target, srcname);
-		rval = rename(target, srcname);
-		if (rval != 0) {
-			/* we can't abort since the src dir is now gone.
-			 * let the admin clean this one up
-			 */
-			err_message(_("unable to rename directory: %s to %s"),
-					target, srcname);
-			break;
-		}
-		break;
+		rval = dir ? rmdir(target) : unlink(target);
 
+		if ( rval < 0 && errno != ENOENT)
+			err_message(_("unable to remove %s: %s"),
+				    dir ? "directory" : "file", target);
 
-	case FILE_PHASE_1:
-	case SLINK_PHASE_1:
-		log_message(LOG_NORMAL, _("Unlinking temporary file: \'%s\'"),
-				target);
-		unlink(target);
 		break;
 
+	case DIR_PHASE_2:
 	case FILE_PHASE_2:
 	case SLINK_PHASE_2:
-		log_message(LOG_NORMAL, _("Unlinking old file: \'%s\'"),
-				srcname);
-		rval = unlink(srcname);
-		if (rval != 0) {
-			err_message(_("unable to remove file: %s"), srcname);
+		log_message(LOG_NORMAL, _("Unlinking old %s: \'%s\'"),
+				dir ? "directory" : "file", srcname);
+
+		rval = dir ? rmdir(target) : unlink(srcname);
+
+		if (rval < 0 && errno != ENOENT) {
+			err_message(_("unable to remove %s: %s"), 
+				    dir ? "directory" : "file", srcname);
 			break;
 		}
 		/* FALL THRU */
+	case DIR_PHASE_3:
 	case FILE_PHASE_3:
 	case SLINK_PHASE_3:
-		log_message(LOG_NORMAL, _("Renaming new file to old file: "
+		log_message(LOG_NORMAL, _("Renaming: "
 				"\'%s\' -> \'%s\'"), target, srcname);
 		rval = rename(target, srcname);
 		if (rval != 0) {
 			/* we can't abort since the src file is now gone.
 			 * let the admin clean this one up
 			 */
-			err_message(_("unable to rename file: %s to %s"),
+			err_message(_("unable to rename: %s to %s"),
 					target, srcname);
 			break;
 		}
+		if (dir)
+			break;
 		/* FALL THRU */
 	case FILE_PHASE_4:
 	case SLINK_PHASE_4:

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: REVIEW: xfs_reno #2
  2007-10-04  4:25 REVIEW: xfs_reno #2 Barry Naujok
                   ` (3 preceding siblings ...)
  2007-11-20  1:36 ` David Chinner
@ 2008-03-06 16:10 ` Ruben Porras
  2008-06-03 20:34   ` Christoph Hellwig
  4 siblings, 1 reply; 15+ messages in thread
From: Ruben Porras @ 2008-03-06 16:10 UTC (permalink / raw)
  To: Barry Naujok; +Cc: xfs@oss.sgi.com


Am Donnerstag, den 04.10.2007, 14:25 +1000 schrieb Barry Naujok:
> A couple changes from the first xfs_reno:
> 
>   - Major one is that symlinks are now supported, but only
>     owner, group and extended attributes are copied for them
>     (not times or inode attributes).
> 
>   - Man page!
> 
> 
> To make this better, ideally we need some form of
> "swap inodes" function in the kernel, where the entire
> contents of the inode themselves are swapped. This form
> can handle any inode and without any of the dir/file/attr/etc
> copy/swap mechanisms we have in xfs_reno.
> 
> Barry.

+static int
+process_slink(
+	bignode_t	*node)
+{
+	int		i = 0;
+	int		rval = 0;
+	struct stat64	st;
+	char		*srcname = NULL;
+	char		*pname = NULL;
+	char		target[PATH_MAX] = "";
+	char		linkbuf[PATH_MAX];
+
+	SET_PHASE(SLINK_PHASE);
+
+	dump_node("symlink", node);
+
+	cur_node = node;
+	srcname = node->paths[0];
+
+	if (lstat64(srcname, &st) < 0) {
+		if (errno != ENOENT) {
+			err_stat(srcname);
+			global_rval |= 2;
+		}
+		goto quit;
+	}


+	if (st.st_ino <= XFS_MAXINUMBER_32 && !force_all)
+		/* this file has changed, and no longer needs processing */
+		goto quit;

This check need to go out, the same in functions process_dir and
process_file.

+	rval = 1;
+
+	i = readlink(srcname, linkbuf, sizeof(linkbuf) - 1);
+	if (i < 0) {
+		err_message(_("unable to read symlink: %s"), srcname);
+		goto quit;
+	}
+	linkbuf[i] = '\0';
+
+	if (realuid != 0 && realuid != st.st_uid) {
+		errno = EACCES;
+		err_open(srcname);
+		goto quit;
+	}
+
+	/* create target */
+	pname = strdup(srcname);
+	if (pname == NULL) {
+		err_nomem();
+		goto quit;
+	}
+	dirname(pname);
+
+	sprintf(target, "%s/%sXXXXXX", pname, cmd_prefix);
+	if (mktemp(target) == NULL) {
+		err_message(_("unable to create temp symlink name"));
+		goto quit;
+	}

do not create the file, it is done later with symlink, if the file
exists, symlink is going to fail.

+	cur_target = strdup(target);
+	if (cur_target == NULL) {
+		err_nomem();
+		goto quit;
+	}

cur_target is not needed.

+
+	if (symlink(linkbuf, target) != 0) {
+		err_message(_("unable to create symlink: %s"), target);
+		goto quit;
+	}

[...]

+	free(cur_target);
+
+	cur_target = NULL;


again, both are unnecesary.

+	numslinksdone++;
+	return rval;
+}

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: REVIEW: xfs_reno #2
  2007-11-20  1:36 ` David Chinner
  2007-11-23 14:30   ` Ruben Porras
@ 2008-03-06 16:11   ` Ruben Porras
  1 sibling, 0 replies; 15+ messages in thread
From: Ruben Porras @ 2008-03-06 16:11 UTC (permalink / raw)
  To: David Chinner; +Cc: Barry Naujok, xfs@oss.sgi.com

[-- Attachment #1: Type: text/plain, Size: 2579 bytes --]

Am Dienstag, den 20.11.2007, 12:36 +1100 schrieb David Chinner:
> On Thu, Oct 04, 2007 at 02:25:16PM +1000, Barry Naujok wrote:
> > To make this better, ideally we need some form of
> > "swap inodes" function in the kernel, where the entire
> > contents of the inode themselves are swapped. This form
> > can handle any inode and without any of the dir/file/attr/etc
> > copy/swap mechanisms we have in xfs_reno.
> 
> Something like the attached patch?
> 
> This is proof-of-concept. I've compiled it but I haven't tested
> it. Your mission, Barry, should you choose to accept it, it to

Hello again,

I have this week again time to at xfs_reno and xfs_swapino and
xfs_swap_extents functions. I adapted xfs_reno to use these ioctl
instead of the user space dir/file/attr/..., and I move successfully
files and directories (see the problem description later).

Then I run into two problems, one processing directories, and one
processing symlinks, where I do not now how to proceed, and I would like
to have advice.

Firsts, directories:

At this moment it is not possible to use xfs_swap_extents to for
directories:

(extract from xfs_dfrag.c)

        if (VN_CACHED(tvp) != 0) {
                xfs_inval_cached_trace(tip, 0, -1, 0, -1);
                error = xfs_flushinval_pages(tip, 0, -1,
                                FI_REMAPF_LOCKED);
                if (error)
                        goto error0;
        }

        /* Verify O_DIRECT for ftmp */
        if (VN_CACHED(tvp) != 0) {
                error = XFS_ERROR(EINVAL);
                goto error0;
        }

But it is not posible to do an open(2) on a directory with O_DIRECT. I
was unable to find out if this restriction comes from the kernel or from
the glibc, neither why open on dirs with O_DIRECT needs to be forbidden
(hints would be appreciated ;), but changing this snippet to 

        /* There is no O_DIRECT for directories */
        if (VN_CACHED(tvp) != 0 && VN_ISDIR(tvp) == 0) {
                error = XFS_ERROR(EINVAL);
                goto error0;
        }

does the trick. Can we do that?

Second, symlinks:

xfs_swapino and xfs_swap_extents, require the file descriptors of the
related files. However, it is not possible to get from user space the fd
of a symlink, because open(2) follow always the symlinks. I would change
this functions to accept xfs_inode as parameters instead of file
descriptors, and get them in xfs_reno with stat und lstat, but I would
like to get your opinion before changing the ioctls.

Attached the modified xfs_reno.c (process_slink not working).

Regards.

[-- Attachment #2: xfs_reno.c --]
[-- Type: text/x-csrc, Size: 37031 bytes --]

/*
 * Copyright (c) 2007 Silicon Graphics, Inc.
 * All Rights Reserved.
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License as
 * published by the Free Software Foundation.
 *
 * This program is distributed in the hope that it would be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.	 See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write the Free Software Foundation,
 * Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
 */

/*
 * xfs_reno - renumber 64-bit inodes
 *
 * xfs_reno [-f] [-n] [-p] [-q] [-v] [-P seconds] path ...
 * xfs_reno [-r] path ...
 *
 * Renumbers all inodes > 32 bits into 32 bit space. Requires the filesytem
 * to be mounted with inode32.
 *
 *	-f		force conversion on all inodes rather than just
 *			those with a 64bit inode number.
 *	-n		nothing, do not renumber inodes
 *	-p		show progress status.
 *	-q		quiet, do not report progress, only errors.
 *	-v		verbose, more -v's more verbose.
 *	-P seconds	set the interval for the progress status in seconds.
 *	-r		recover from an interrupted run.
 */

#include <xfs/xfs.h>

#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <ftw.h>
#include <libgen.h>
#include <malloc.h>
#include <signal.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <attr/attributes.h>
#include <xfs/xfs_dfrag.h>
#include <xfs/xfs_inum.h>
#include <xfs/xfs_types.h>
#include <xfs/xfs_ag.h>

#define SCAN_PHASE	0x00
#define DIR_PHASE	0x10	/* nothing done or all done */
#define DIR_PHASE_1	0x11	/* temp dir created */
#define DIR_PHASE_2	0x12	/* swapped extents and inodes */
#define DIR_PHASE_3	0x13	/* src dir removed */
#define DIR_PHASE_MAX	0x13	/* renamed temp to source name */
#define FILE_PHASE	0x20	/* nothing done or all done */
#define FILE_PHASE_1	0x21	/* temp file created */
#define FILE_PHASE_2	0x22	/* swapped extents and inodes */
#define FILE_PHASE_3	0x23	/* unlinked source */
#define FILE_PHASE_4	0x24	/* hard links copied */
#define FILE_PHASE_MAX	0x24	/* renamed temp to source name */
#define SLINK_PHASE	0x30	/* nothing done or all done */
#define SLINK_PHASE_1	0x31	/* temp symlink created */
#define SLINK_PHASE_2	0x32	/* symlink attrs copied */
#define SLINK_PHASE_3	0x33	/* unlinked source */
#define SLINK_PHASE_4	0x34	/* hard links copied */
#define SLINK_PHASE_MAX	0x34	/* renamed temp to source name */

static void update_recoverfile(void);
#define SET_PHASE(x)	(cur_phase = x, update_recoverfile())

#define LOG_ERR		0
#define LOG_NORMAL	1
#define LOG_INFO	2
#define LOG_DEBUG	3
#define LOG_NITTY	4

#define NH_BUCKETS	65536
#define NH_HASH(ino)	(nodehash + ((ino) % NH_BUCKETS))

typedef struct {
	xfs_ino_t	ino;
	int		ftw_flags;
	nlink_t		numpaths;
	char		**paths;
} bignode_t;

typedef struct {
	bignode_t	*nodes;
	uint64_t	listlen;
	uint64_t	lastnode;
} nodelist_t;

static const char	*cmd_prefix = "xfs_reno_";

static char		*progname;
static int		log_level = LOG_NORMAL;
static int		force_all;
static nodelist_t	*nodehash;
static int		realuid;
static uint64_t		numdirnodes;
static uint64_t		numfilenodes;
static uint64_t		numslinknodes;
static uint64_t		numdirsdone;
static uint64_t		numfilesdone;
static uint64_t		numslinksdone;
static int		poll_interval;
static time_t		starttime;
static bignode_t	*cur_node;
static char		*cur_target;
static int		cur_phase;
static int		highest_numpaths;
static char		*recover_file;
static int		recover_fd;
static volatile int	poll_output;
static int		global_rval;
static int *agmask;


/*
 * message handling
 */
static void
log_message(
	int		level,
	char		*fmt, ...)
{
	char		buf[1024];
	va_list		ap;

	if (log_level < level)
		return;

	va_start(ap, fmt);
	vsnprintf(buf, 1024, fmt, ap);
	va_end(ap);

	printf("%c%s: %s\n", poll_output ? '\n' : '\r', progname, buf);
	poll_output = 0;
}

static void
err_message(
	char		*fmt, ...)
{
	char		buf[1024];
	va_list		ap;

	va_start(ap, fmt);
	vsnprintf(buf, 1024, fmt, ap);
	va_end(ap);

	fprintf(stderr, "%c%s: %s\n", poll_output ? '\n' : '\r', progname, buf);
	poll_output = 0;
}

static void
err_nomem(void)
{
	err_message(_("Out of memory"));
}

static void
err_open(
	const char	*s)
{
	err_message(_("Cannot open %s: %s"), s, strerror(errno));
}

static void
err_not_xfs(
	const char	*s)
{
	err_message(_("%s is not on an XFS filesystem"), s);
}

static void
err_stat(
	const char	*s)
{
	err_message(_("Cannot stat %s: %s\n"), s, strerror(errno));
}

static void
err_swapino(
	    int err,
	    const char *srcname)
{
	if (log_level >= LOG_DEBUG) {
		switch (err) {
		case EIO:
			err_message(_("Filesystem is going down: %s: %s"),
				srcname, strerror(err));
			break;

		default:
			err_message(_("Swap inode failed: %s: %s"),
				srcname, strerror(err));
			break;
		}
	} else
		err_message(_("Swap inode failed: %s: %s"),
				srcname, strerror(err));
}

static void
err_swapext(
	    int err,
	    const char *srcname,
	    xfs_off_t bs_size)
{
	if (log_level >= LOG_DEBUG) {
		switch (err) {
		case ENOTSUP:
			err_message("%s: file type not supported",
				srcname);
			break;
		case EFAULT:
			/* The file has changed since we started the copy */
			err_message("%s: file modified, "
				 "inode renumber aborted: %ld",
				 srcname, bs_size);
			break;
		case EBUSY:
			/* Timestamp has changed or mmap'ed file */
			err_message("%s: file busy", srcname);
			break;
		default:
			err_message(_("Swap extents failed: %s: %s"),
				srcname, strerror(errno));
			break;
		}
	} else
		err_message(_("Swap extents failed: %s: %s"),
				srcname, strerror(errno));
}

/*
 * usage message
 */
static void
usage(void)
{
	fprintf(stderr, _("%s [-fnpqv] [-P <interval>] [-r] <path>\n"),
			progname);
	exit(1);
}


/*
 * XFS interface functions
 */

static int
xfs_bulkstat_single(int fd, xfs_ino_t *lastip, xfs_bstat_t *ubuffer)
{
	xfs_fsop_bulkreq_t  bulkreq;

	bulkreq.lastip = (__u64 *)lastip;
	bulkreq.icount = 1;
	bulkreq.ubuffer = ubuffer;
	bulkreq.ocount = NULL;
	return ioctl(fd, XFS_IOC_FSBULKSTAT_SINGLE, &bulkreq);
}

static int
xfs_swapext(int fd, xfs_swapext_t *sx)
{
	return ioctl(fd, XFS_IOC_SWAPEXT, sx);
}

static int
xfs_get_agflags(const char *filepath)
{
  xfs_fsop_geom_t fsgeo;
  xfs_ioc_agflags_t	ioc_flags;
  int error = 0;
  xfs_agnumber_t agno;

  int fd;

  if ((fd = open(filepath, O_RDONLY /* | O_DIRECT | O_NOATIME */)) == -1) {
    err_open(filepath);
    return -1;
    }

  if ((error = xfsctl(filepath, fd, XFS_IOC_FSGEOMETRY, &fsgeo)) < 0) {
	  fprintf(stderr, _("cannot get geometry of fs: %s\n"),
		  strerror(errno));
	  goto error0; 
  }

  agmask = (int *) calloc(fsgeo.agcount, sizeof(int));

  for (agno = 0; agno < fsgeo.agcount ; agno++) {
    ioc_flags.ag = agno;
			
    if ((error = xfsctl(filepath, fd, XFS_IOC_GET_AGF_FLAGS, &ioc_flags)) < 0) {
		fprintf(stderr,
			_("cannot get flags %d on ag %d at %s: %s\n"),
			ioc_flags.flags,
			ioc_flags.ag, filepath, strerror(errno));

		return error;
	}

	agmask[agno] = ioc_flags.flags & XFS_AGF_FLAGS_ALLOC_DENY;
  }

error0:
  close(fd);

  return error;
}

static int
xfs_swapino(int fd, xfs_swapino_t *iu)
{
	return ioctl(fd, XFS_IOC_SWAPINO, iu);
}

static int
xfs_getxattr(int fd, struct fsxattr *attr)
{
	return ioctl(fd, XFS_IOC_FSGETXATTR, attr);
}

/*
 * A hash table of inode numbers and associated paths.
 */
static nodelist_t *
init_nodehash(void)
{
	nodehash = calloc(NH_BUCKETS, sizeof(nodelist_t));
	if (nodehash == NULL) {
		err_nomem();
		return NULL;
	}

	return nodehash;
}

static int
in_ag_to_free(const char *filepath) {
  xfs_ioc_fileag_t fileag;

  int error;

  if ((fileag.fd = open(filepath, O_RDONLY /* | O_DIRECT | O_NOATIME */)) == -1) {
    err_open(filepath);
    return -1;
    }

  if ((error = xfsctl(filepath, fileag.fd, XFS_IOC_GETFILEAG, &fileag)) < 0) {
    fprintf(stderr, _("%s: cannot get the AG of the file: %s\n"),
		  filepath, strerror(errno));
    close(fileag.fd);
    return error;
  }

  close(fileag.fd);

  if (agmask[fileag.ag] == 1) 
    printf("AG: %d, MASK: %d\t", fileag.ag, agmask[fileag.ag]);

  return agmask[fileag.ag];
}

static void
free_nodehash(void)
{
	int		i, j, k;

	for (i = 0; i < NH_BUCKETS; i++) {
		bignode_t *nodes = nodehash[i].nodes;

		for (j = 0; j < nodehash[i].lastnode; j++) {
			for (k = 0; k < nodes[j].numpaths; k++) {
				free(nodes[j].paths[k]);
			}
			free(nodes[j].paths);
		}

		free(nodes);
	}
	free(nodehash);
}

static nlink_t
add_path(
	bignode_t	*node,
	const char	*path)
{
	node->paths = realloc(node->paths,
			      sizeof(char *) * (node->numpaths + 1));
	if (node->paths == NULL) {
		err_nomem();
		exit(1);
	}

	node->paths[node->numpaths] = strdup(path);
	if (node->paths[node->numpaths] == NULL) {
		err_nomem();
		exit(1);
	}

	node->numpaths++;
	if (node->numpaths > highest_numpaths)
		highest_numpaths = node->numpaths;

	return node->numpaths;
}

static bignode_t *
add_node(
	nodelist_t	*list,
	xfs_ino_t	ino,
	int		ftw_flags,
	const char	*path)
{
	bignode_t	*node;

	if (list->lastnode >= list->listlen) {
		list->listlen += 500;
		list->nodes = realloc(list->nodes,
					sizeof(bignode_t) * list->listlen);
		if (list->nodes == NULL) {
			err_nomem();
			return NULL;
		}
	}

	node = list->nodes + list->lastnode;

	node->ino = ino;
	node->ftw_flags = ftw_flags;
	node->paths = NULL;
	node->numpaths = 0;
	add_path(node, path);

	list->lastnode++;

	return node;
}

static bignode_t *
find_node(
	xfs_ino_t	ino)
{
	int		i;
	nodelist_t	*nodelist;
	bignode_t	*nodes;

	nodelist = NH_HASH(ino);
	nodes = nodelist->nodes;

	for(i = 0; i < nodelist->lastnode; i++) {
		if (nodes[i].ino == ino) {
			return &nodes[i];
		}
	}

	return NULL;
}

static bignode_t *
add_node_path(
	xfs_ino_t	ino,
	int		ftw_flags,
	const char	*path)
{
	nodelist_t	*nodelist;
	bignode_t	*node;

	log_message(LOG_NITTY, "add_node_path: ino %llu, path %s", ino, path);

	node = find_node(ino);
	if (node == NULL) {
		nodelist = NH_HASH(ino);
		return add_node(nodelist, ino, ftw_flags, path);
	}

	add_path(node, path);
	return node;
}

static void
dump_node(
	char		*msg,
	bignode_t	*node)
{
	int		k;

	if (log_level < LOG_DEBUG)
		return;

	log_message(LOG_DEBUG, "%s: %llu %llu %s", msg, node->ino,
			node->numpaths, node->paths[0]);

	for (k = 1; k < node->numpaths; k++)
		log_message(LOG_DEBUG, "\t%s", node->paths[k]);
}

static void
dump_nodehash(void)
{
	int		i, j;

	if (log_level < LOG_NITTY)
		return;

	for (i = 0; i < NH_BUCKETS; i++) {
		bignode_t	*nodes = nodehash[i].nodes;
		for (j = 0; j < nodehash[i].lastnode; j++, nodes++)
			dump_node("nodehash", nodes);
	}
}

static int
for_all_nodes(
	int		(*fn)(bignode_t *node),
	int		ftw_flags,
	int		quit_on_error)
{
	int		i;
	int		j;
	int		rval = 0;

	for (i = 0; i < NH_BUCKETS; i++) {
		bignode_t	*nodes = nodehash[i].nodes;

		for (j = 0; j < nodehash[i].lastnode; j++, nodes++) {
			if (nodes->ftw_flags == ftw_flags) {
				rval = fn(nodes);
				if (rval && quit_on_error)
					goto quit;
			}
		}
	}

quit:
	return rval;
}

/*
 * Adds appropriate files to the inode hash table
 */
static int
nftw_addnodes(
	const char	*path,
	const struct stat64 *st,
	int		flags,
	struct FTW	*sntfw)
{
	if (flags == FTW_F || flags == FTW_D) 
        if (!in_ag_to_free(path) && !force_all)
		    return 0;

    printf("%s\n", path);

	if (flags == FTW_F)
		numfilenodes++;
	else if (flags == FTW_D)
		numdirnodes++;
	else if (flags == FTW_SL)
		numslinknodes++;
	else
		return 0;

	add_node_path(st->st_ino, flags, path);

	return 0;
}

static int
process_dir(
	bignode_t	*node)
{
	int		sfd = -1;
	int		tfd = -1;
	int		rval = 0;
	struct stat64	st;
	char		*srcname = NULL;
	char		*pname = NULL;
	xfs_swapino_t	si;
	xfs_swapext_t	sx;
	xfs_bstat_t	bstatbuf;
	struct fsxattr	fsx;
	char		target[PATH_MAX] = "";

	SET_PHASE(DIR_PHASE);

	dump_node("directory", node);

	cur_node = node;
	srcname = node->paths[0];

	bzero(&st, sizeof(st));
	bzero(&bstatbuf, sizeof(bstatbuf));
	bzero(&si, sizeof(si));
	bzero(&sx, sizeof(sx));

	if (stat64(srcname, &st) < 0) {
		if (errno != ENOENT) {
			err_stat(srcname);
			global_rval |= 2;
		}
		goto quit;
	}
	if (!in_ag_to_free(srcname) && !force_all) {
		/*
		 * This directory has already changed ino's, probably due
		 * to being moved during processing of a parent directory.
		 */
		log_message(LOG_DEBUG, "process_dir: skipping %s", srcname);
		goto quit;
	}

	rval = 1;

	sfd = open(srcname, O_RDONLY);
	if (sfd == -1) {
		err_open(srcname);
		goto quit;
	}

	if (!platform_test_xfs_fd(sfd)) {
		err_not_xfs(srcname);
		goto quit;
	}

	if (xfs_getxattr(sfd, &fsx) < 0) {
		err_message(_("failed to get inode attrs: %s"), srcname);
		goto quit;
	}
	if (fsx.fsx_xflags & (XFS_XFLAG_IMMUTABLE | XFS_XFLAG_APPEND)) {
		err_message(_("%s: immutable/append, ignoring"), srcname);
		global_rval |= 2;
		goto quit;
	}

	if (realuid != 0 && realuid != st.st_uid) {
		errno = EACCES;
		err_open(srcname);
		goto quit;
	}

	/* mkdir parent/target */
	pname = strdup(srcname);
	if (pname == NULL) {
		err_nomem();
		goto quit;
	}
	dirname(pname);
	sprintf(target, "%s/%sXXXXXX", pname, cmd_prefix);
	if (mkdtemp(target) == NULL) {
		err_message(_("Unable to create directory copy: %s, %s"), srcname, strerror(errno));
		goto quit;
	}
	tfd = open(target, O_RDONLY);
	if (tfd == -1) {
		err_open(target);
		goto quit;
	}

	cur_target = strdup(target);
	if (!cur_target) {
		err_nomem();
		goto quit;
	}

	SET_PHASE(DIR_PHASE_1);

	/* swapino src target */
	si.si_version  = XFS_SI_VERSION;
	si.si_fdtarget = tfd;
	si.si_fdtmp    = sfd;

	/* swap the inodes */
	rval = xfs_swapino(tfd, &si);
	if (rval < 0) {
		err_swapino(rval, srcname);
		goto quit_unlink;
	}

	if (xfs_bulkstat_single(sfd, &st.st_ino, &bstatbuf) < 0) {
		err_message(_("unable to bulkstat source file: %s"),
				srcname);
		unlink(target);
		goto quit;
	}

	if (bstatbuf.bs_ino != st.st_ino) {
		err_message(_("bulkstat of source file returned wrong inode: %s"),
				srcname);
		unlink(target);
		goto quit;
	}

	ftruncate64(tfd, bstatbuf.bs_size);

	/* swapextents src target */
	sx.sx_stat     = bstatbuf; /* struct copy */
	sx.sx_version  = XFS_SX_VERSION;
	sx.sx_fdtarget = sfd;
	sx.sx_fdtmp    = tfd;
	sx.sx_offset   = 0;
	sx.sx_length   = bstatbuf.bs_size;

	/* Swap the extents */
	rval = xfs_swapext(sfd, &sx);
	if (rval < 0) {
		err_swapext(rval, srcname, bstatbuf.bs_size);
		goto quit_unlink;
	}

	SET_PHASE(DIR_PHASE_2);

	/* rmdir src */
	rval = rmdir(srcname);
	if (rval != 0) {
		err_message(_("unable to remove directory: %s, %s"), srcname, strerror(errno));
		goto quit;
	}

	SET_PHASE(DIR_PHASE_3);

	/* rename cur_target src */
	rval = rename(target, srcname);
	if (rval != 0) {
		/*
		 * we can't abort since the src dir is now gone.
		 * let the admin clean this one up
		 */
		err_message(_("unable to rename directory: %s to %s, %s"),
				cur_target, srcname, strerror(errno));
	}
	goto quit;

 quit_unlink:
	rval = rmdir(target);
	if (rval != 0)
		err_message(_("unable to remove directory: %s, %s"), target, strerror(errno));

 quit:

	SET_PHASE(DIR_PHASE);

	if (sfd >= 0)
		close(sfd);
	if (tfd >= 0)
		close(tfd);

	free(pname);
	free(cur_target);

	cur_target = NULL;
	cur_node = NULL;

	numdirsdone++;
	return rval;
}

static int
process_file(
	bignode_t	*node)
{
	int		sfd = -1;
	int		tfd = -1;
	int		i = 0;
	int		rval = 0;
	struct stat64	st;
	char		*srcname = NULL;
	char		*pname = NULL;
	xfs_swapino_t	si;
	xfs_swapext_t	sx;
	xfs_bstat_t	bstatbuf;
	struct fsxattr	fsx;
	char		target[PATH_MAX] = "";

	SET_PHASE(FILE_PHASE);

	dump_node("file", node);

	cur_node = node;
	srcname = node->paths[0];

	bzero(&st, sizeof(st));
	bzero(&bstatbuf, sizeof(bstatbuf));
	bzero(&si, sizeof(si));
	bzero(&sx, sizeof(sx));

	if (stat64(srcname, &st) < 0) {
		if (errno != ENOENT) {
			err_stat(srcname);
			global_rval |= 2;
		}
		goto quit;
	}
	if (!in_ag_to_free(srcname) && !force_all)
		/* this file has changed, and no longer needs processing */
		goto quit;

	rval = 1;
	/* open and sync source */
	sfd = open(srcname, O_RDWR | O_DIRECT);
	if (sfd < 0) {
		err_open(srcname);
		goto quit;
	}
	if (!platform_test_xfs_fd(sfd)) {
		err_not_xfs(srcname);
		goto quit;
	}
	if (fsync(sfd) < 0) {
		err_message(_("sync failed: %s: %s"),
				srcname, strerror(errno));
		goto quit;
	}


	/*
	 * Check if a mandatory lock is set on the file to try and
	 * avoid blocking indefinitely on the reads later. Note that
	 * someone could still set a mandatory lock after this check
	 * but before all reads have completed to block xfs_reno reads.
	 * This change just closes the window a bit.
	 */
	if ((st.st_mode & S_ISGID) && !(st.st_mode & S_IXGRP)) {
		struct flock fl;

		fl.l_type = F_RDLCK;
		fl.l_whence = SEEK_SET;
		fl.l_start = (off_t)0;
		fl.l_len = 0;
		if (fcntl(sfd, F_GETLK, &fl) < 0 ) {
			if (log_level >= LOG_DEBUG)
				err_message("locking check failed: %s",
						srcname);
			global_rval |= 2;
			goto quit;
		}
		if (fl.l_type != F_UNLCK) {
			if (log_level >= LOG_DEBUG)
				err_message("mandatory lock: %s: ignoring",
						srcname);
			global_rval |= 2;
			goto quit;
		}
	}

	if (xfs_getxattr(sfd, &fsx) < 0) {
		err_message(_("failed to get inode attrs: %s"), srcname);
		goto quit;
	}
	if (fsx.fsx_xflags & (XFS_XFLAG_IMMUTABLE | XFS_XFLAG_APPEND)) {
		err_message(_("%s: immutable/append, ignoring"), srcname);
		global_rval |= 2;
		goto quit;
	}

	if (realuid != 0 && realuid != st.st_uid) {
		errno = EACCES;
		err_open(srcname);
		goto quit;
	}

	/* creat target */
	pname = strdup(srcname);
	if (pname == NULL) {
		err_nomem();
		goto quit;
	}
	dirname(pname);

	sprintf(target, "%s/%sXXXXXX", pname, cmd_prefix);
	tfd = mkstemp(target);
	if (tfd == -1) {
		err_message("unable to create file copy: %s", strerror(errno));
		goto quit;
	}
	cur_target = strdup(target);
	if (cur_target == NULL) {
		err_nomem();
		goto quit;
	}

	SET_PHASE(FILE_PHASE_1);

	/* swapino src target */
	si.si_version  = XFS_SI_VERSION;
	si.si_fdtarget = sfd;
	si.si_fdtmp    = tfd;

	/* swap the inodes */
	rval = xfs_swapino(sfd, &si);
	if (rval < 0) {
		err_swapino(rval, srcname);
		goto quit_unlink;
	}

	if (xfs_bulkstat_single(sfd, &st.st_ino, &bstatbuf) < 0) {
		err_message(_("unable to bulkstat source file: %s"),
				srcname);
		unlink(target);
		goto quit;
	}

	if (bstatbuf.bs_ino != st.st_ino) {
		err_message(_("bulkstat of source file returned wrong inode: %s"),
				srcname);
		unlink(target);
		goto quit;
	}

	ftruncate64(tfd, bstatbuf.bs_size);

	/* swapextents src target */
	sx.sx_stat     = bstatbuf; /* struct copy */
	sx.sx_version  = XFS_SX_VERSION;
	sx.sx_fdtarget = sfd;
	sx.sx_fdtmp    = tfd;
	sx.sx_offset   = 0;
	sx.sx_length   = bstatbuf.bs_size;

	/* Swap the extents */
	rval = xfs_swapext(sfd, &sx);
	if (rval < 0) {
		err_swapext(rval, srcname, bstatbuf.bs_size);
		goto quit_unlink;
	}

	SET_PHASE(FILE_PHASE_2);

	/* unlink src */
	rval = unlink(srcname);
	if (rval != 0) {
		err_message(_("unable to remove file: %s, %s"), srcname, strerror(errno));
		goto quit;
	}

	SET_PHASE(FILE_PHASE_3);

	/* rename target src */
	rval = rename(target, srcname);
	if (rval != 0) {
		/*
		 * we can't abort since the src file is now gone.
		 * let the admin clean this one up
		 */
		err_message(_("unable to rename file: %s to %s, %s"),
				target, srcname, strerror(errno));
		goto quit;
	}

	SET_PHASE(FILE_PHASE_4);

	/* for each hardlink, unlink and creat pointing to target */
	for (i = 1; i < node->numpaths; i++) {
		/* unlink src */
		rval = unlink(node->paths[i]);
		if (rval != 0) {
			err_message(_("unable to remove file: %s, %s"),
				       node->paths[i], strerror(errno));
			goto quit;
		}

		rval = link(srcname, node->paths[i]);
		if (rval != 0) {
			err_message("unable to link to file: %s, %s", srcname, strerror(errno));
			goto quit;
		}
		numfilesdone++;
	}

 quit_unlink:
	rval = unlink(target);
	if (rval != 0)
		err_message(_("unable to remove file: %s, %s"), target, strerror(errno));

 quit:

	SET_PHASE(FILE_PHASE);

	if (sfd >= 0)
		close(sfd);
	if (tfd >= 0)
		close(tfd);

	free(pname);
	free(cur_target);

	cur_target = NULL;
	cur_node = NULL;

	numfilesdone++;
	return rval;
}


static int
process_slink(
	bignode_t	*node)
{
	int		i = 0;
	int		sfd = -1;
	int		tfd = -1;
	int		rval = 0;
	struct stat64	st;
	char		*srcname = NULL;
	char		*pname = NULL;
	char		target[PATH_MAX] = "";
	char		linkbuf[PATH_MAX];
	xfs_swapino_t	si;

	SET_PHASE(SLINK_PHASE);

	dump_node("symlink", node);

	cur_node = node;
	srcname = node->paths[0];

	bzero(&st, sizeof(st));
	bzero(&si, sizeof(si));

	if (lstat64(srcname, &st) < 0) {
		if (errno != ENOENT) {
			err_stat(srcname);
			global_rval |= 2;
		}
		goto quit;
	}

	rval = 1;

	/* open source */
	sfd = open(srcname, O_RDWR | O_DIRECT);
	if (sfd < 0) {
		err_open(srcname);
		goto quit;
	}

	i = readlink(srcname, linkbuf, sizeof(linkbuf) - 1);
	if (i < 0) {
		err_message(_("unable to read symlink: %s, %s"), srcname, strerror(errno));
		goto quit;
	}
	linkbuf[i] = '\0';

	if (realuid != 0 && realuid != st.st_uid) {
		errno = EACCES;
		err_open(srcname);
		goto quit;
	}

	/* create target */
	pname = strdup(srcname);
	if (pname == NULL) {
		err_nomem();
		goto quit;
	}
	dirname(pname);

	sprintf(target, "%s/%sXXXXXX", pname, cmd_prefix);
	tfd = mkstemp(target);
	if (tfd == -1) {
		err_message(_("unable to create temp symlink name: %s"), strerror(errno));
		goto quit;
	}
	cur_target = strdup(target);
	if (cur_target == NULL) {
		err_nomem();
		goto quit;
	}

	if (symlink(linkbuf, target) != 0) {
		err_message(_("unable to create symlink: %s, %s"), target, strerror(errno));
		goto quit;
	}

	SET_PHASE(SLINK_PHASE_1);

	/* swapino src target */
	si.si_version  = XFS_SI_VERSION;
	si.si_fdtarget = sfd;
	si.si_fdtmp    = tfd;

	/* swap the inodes */
	rval = xfs_swapino(sfd, &si);
	if (rval < 0) {
		err_swapino(rval, srcname);
		goto quit;
	}

	SET_PHASE(SLINK_PHASE_2);

	/* unlink src */
	rval = unlink(srcname);
	if (rval != 0) {
		err_message(_("unable to remove symlink: %s, %s"), srcname, strerror(errno));
		goto quit;
	}

	SET_PHASE(SLINK_PHASE_3);

	/* rename target src */
	rval = rename(target, srcname);
	if (rval != 0) {
		/*
		 * we can't abort since the src file is now gone.
		 * let the admin clean this one up
		 */
		err_message(_("unable to rename symlink: %s to %s, %s"),
				target, srcname, strerror(errno));
		goto quit;
	}

	SET_PHASE(SLINK_PHASE_4);

	/* for each hardlink, unlink and creat pointing to target */
	for (i = 1; i < node->numpaths; i++) {
		/* unlink src */
		rval = unlink(node->paths[i]);
		if (rval != 0) {
			err_message(_("unable to remove symlink: %s, %s"),
				       node->paths[i], strerror(errno));
			goto quit;
		}

		rval = link(srcname, node->paths[i]);
		if (rval != 0) {
			err_message("unable to link to symlink: %s, %s", srcname, strerror(errno));
			goto quit;
		}
		numslinksdone++;
	}

 quit:
	cur_node = NULL;

	SET_PHASE(SLINK_PHASE);

	free(pname);
	free(cur_target);

	cur_target = NULL;

	numslinksdone++;
	return rval;
}

static int
open_recoverfile(void)
{
	recover_fd = open(recover_file, O_RDWR | O_SYNC | O_CREAT | O_EXCL,
			0600);
	if (recover_fd < 0) {
		if (errno == EEXIST)
			err_message(_("Recovery file already exists, either "
				"run '%s -r %s' or remove the file."),
				progname, recover_file);
		else
			err_open(recover_file);
		return 1;
	}

	if (!platform_test_xfs_fd(recover_fd)) {
		err_not_xfs(recover_file);
		close(recover_fd);
		return 1;
	}

	return 0;
}

static void
update_recoverfile(void)
{
	static const char null_file[] = "0\n0\n0\n\ntarget: \ntemp: \nend\n";
	static size_t	buf_size = 0;
	static char	*buf = NULL;
	int		i, len;

	if (recover_fd <= 0)
		return;

	if (cur_node == NULL || cur_phase == 0) {
		/* inbetween processing or still scanning */
		lseek(recover_fd, 0, SEEK_SET);
		write(recover_fd, null_file, sizeof(null_file));
		return;
	}

	ASSERT(highest_numpaths > 0);
	if (buf == NULL) {
		buf_size = (highest_numpaths + 3) * PATH_MAX;
		buf = malloc(buf_size);
		if (buf == NULL) {
			err_nomem();
			exit(1);
		}
	}

	len = sprintf(buf, "%d\n%llu\n%d\n", cur_phase,
			(long long)cur_node->ino, cur_node->ftw_flags);

	for (i = 0; i < cur_node->numpaths; i++)
		len += sprintf(buf + len, "%s\n", cur_node->paths[i]);

/*	len += sprintf(buf + len, "target: %s\ntemp: %s\nend\n", */
/*			cur_target, cur_temp); */

	ASSERT(len < buf_size);

	lseek(recover_fd, 0, SEEK_SET);
	ftruncate(recover_fd, 0);
	write(recover_fd, buf, len);
}

static void
cleanup(void)
{
	log_message(LOG_NORMAL, _("Interrupted -- cleaning up..."));

	free_nodehash();

	log_message(LOG_NORMAL, _("Done."));
}

static void
sighandler(int sig)
{
	static char	cycle[4] = "-\\|/";
	static uint64_t	cur_cycle = 0;
	double		percent;
	char		*typename;
	uint64_t	nodes, done;

	alarm(0);

	if (sig != SIGALRM) {
		cleanup();
		exit(1);
	}

	if (cur_phase == SCAN_PHASE) {
		if (log_level >= LOG_INFO)
			fprintf(stderr, _("\r%llu files, %llu dirs and %llu "
				"symlinks to renumber found... %c"),
				(long long)numfilenodes,
				(long long)numdirnodes,
				(long long)numslinknodes,
				cycle[cur_cycle % 4]);
		else
			fprintf(stderr, "\r%c",
				cycle[cur_cycle % 4]);
		cur_cycle++;
	} else {
		if (cur_phase >= DIR_PHASE && cur_phase <= DIR_PHASE_MAX) {
			nodes = numdirnodes;
			done = numdirsdone;
			typename = _("dirs");
		} else
		if (cur_phase >= FILE_PHASE && cur_phase <= FILE_PHASE_MAX) {
			nodes = numfilenodes;
			done = numfilesdone;
			typename = _("files");
		} else {
			nodes = numslinknodes;
			done = numslinksdone;
			typename = _("symlinks");
		}
		percent = 100.0 * (double)done / (double)nodes;
		if (percent > 100.0)
			percent = 100.0;
		if (log_level >= LOG_INFO)
			fprintf(stderr, _("\r%.1f%%, %llu of %llu %s, "
					"%u seconds elapsed"), percent,
					(long long)done, (long long)nodes,
					typename, (int)(time(0) - starttime));
		else
			fprintf(stderr, "\r%.1f%%", percent);
	}
	poll_output = 1;
	signal(SIGALRM, sighandler);

	if (poll_interval)
		alarm(poll_interval);
}

static int
read_recover_file(
	char		*recover_file,
	bignode_t	**node,
	char		**target,
	char		**temp,
	int		*phase)
{
	FILE		*file;
	int		rval = 1;
	ino_t		ino;
	int		ftw_flags;
	char		buf[PATH_MAX + 10]; /* path + "target: " */
	struct stat64	st;
	int		first_path;

	/*

	A recovery file should look like:

	<phase>
	<ino number>
	<ftw flags>
	<first path to inode>
	<hardlinks to inode>
	target: <path to target dir or file>
	temp: <path to temp dir if dir phase>
	end
	*/

	file = fopen(recover_file, "r");
	if (file == NULL) {
		err_open(recover_file);
		return 1;
	}

	/* read phase */
	*phase = 0;
	if (fgets(buf, PATH_MAX + 10, file) == NULL) {
		err_message("Recovery failed: unable to read phase");
		goto quit;
	}
	buf[strlen(buf) - 1] = '\0';
	*phase = atoi(buf);
	if (*phase == SCAN_PHASE) {
		fclose(file);
		return 0;
	}
	if ((*phase < DIR_PHASE || *phase > DIR_PHASE_MAX) &&
			(*phase < FILE_PHASE || *phase > FILE_PHASE_MAX)) {
		err_message("Recovery failed: failed to read valid recovery phase");
		goto quit;
	}

	/* read inode number */
	if (fgets(buf, PATH_MAX + 10, file) == NULL) {
		err_message("Recovery failed: unable to read inode number");
		goto quit;
	}
	buf[strlen(buf) - 1] = '\0';
	ino = strtoull(buf, NULL, 10);
	if (ino == 0) {
		err_message("Recovery failed: unable to read inode number");
		goto quit;
	}

	/* read ftw_flags */
	if (fgets(buf, PATH_MAX + 10, file) == NULL) {
		err_message("Recovery failed: unable to read flags");
		goto quit;
	}
	buf[strlen(buf) - 1] = '\0';
	if (buf[1] != '\0' || (buf[0] != '0' && buf[0] != '1')) {
		err_message("Recovery failed: unable to read flags: '%s'", buf);
		goto quit;
	}
	ftw_flags = atoi(buf);

	/* read paths and target path */
	*node = NULL;
	*target = NULL;
	first_path = 1;
	while (fgets(buf, PATH_MAX + 10, file) != NULL) {
		buf[strlen(buf) - 1] = '\0';

		log_message(LOG_DEBUG, "path: '%s'", buf);

		if (buf[0] == '/') {
			if (stat64(buf, &st) < 0) {
				err_message(_("Recovery failed: cannot "
						"stat '%s'"), buf);
				goto quit;
			}
			if (st.st_ino != ino) {
				err_message(_("Recovery failed: inode "
						"number for '%s' does not "
						"match recorded number"), buf);
				goto quit;
			}

			if (first_path) {
				first_path = 0;
				*node = add_node_path(ino, ftw_flags, buf);
			}
			else {
				add_path(*node, buf);
			}
		}
		else if (strncmp(buf, "target: ", 8) == 0) {
			*target = strdup(buf + 8);
			if (*target == NULL) {
				err_nomem();
				goto quit;
			}
			if (stat64(*target, &st) < 0) {
				err_message(_("Recovery failed: cannot "
						"stat '%s'"), *target);
				goto quit;
			}
		}
		else if (strncmp(buf, "temp: ", 6) == 0) {
			*temp = strdup(buf + 6);
			if (*temp == NULL) {
				err_nomem();
				goto quit;
			}
		}
		else if (strcmp(buf, "end") == 0) {
			rval = 0;
			goto quit;
		}
		else {
			err_message(_("Recovery failed: unrecognised "
					"string: '%s'"), buf);
			goto quit;
		}
	}

	err_message(_("Recovery failed: end of recovery file not found"));

 quit:
	if (*node == NULL) {
		err_message(_("Recovery failed: no valid inode or paths "
				"specified"));
		rval = 1;
	}

	if (*target == NULL) {
		err_message(_("Recovery failed: no inode target specified"));
		rval = 1;
	}

	fclose(file);

	return rval;
}

int
recover(
	bignode_t	*node,
	char		*target,
	char		*tname,
	int		phase)
{
	char		*srcname = NULL;
	int		rval = 0;
	int		i;
	int		dir;

	dump_node("recover", node);
	log_message(LOG_DEBUG, "target: %s, phase: %x", target, phase);

	if (node)
		srcname = node->paths[0];

	dir = (phase < DIR_PHASE || phase > DIR_PHASE_MAX);

	switch (phase) {

	case DIR_PHASE_1:
	case FILE_PHASE_1:
	case SLINK_PHASE_1:
		log_message(LOG_NORMAL, _("Unlinking temporary %s: \'%s\'"),
			    dir ? "directory" : "file", target);

		rval = dir ? rmdir(target) : unlink(target);

		if ( rval < 0 && errno != ENOENT)
			err_message(_("unable to remove %s: %s, %s"),
				    dir ? "directory" : "file", target, strerror(errno));

		break;

	case DIR_PHASE_2:
	case FILE_PHASE_2:
	case SLINK_PHASE_2:
		log_message(LOG_NORMAL, _("Unlinking old %s: \'%s\'"),
				dir ? "directory" : "file", srcname);

		rval = dir ? rmdir(target) : unlink(srcname);

		if (rval < 0 && errno != ENOENT) {
			err_message(_("unable to remove %s: %s, %s"), 
				    dir ? "directory" : "file", srcname, strerror(errno));
			break;
		}
		/* FALL THRU */
	case DIR_PHASE_3:
	case FILE_PHASE_3:
	case SLINK_PHASE_3:
		log_message(LOG_NORMAL, _("Renaming: "
				"\'%s\' -> \'%s\'"), target, srcname);
		rval = rename(target, srcname);
		if (rval != 0) {
			/* we can't abort since the src file is now gone.
			 * let the admin clean this one up
			 */
			err_message(_("unable to rename: %s to %s, %s"),
					target, srcname, strerror(errno));
			break;
		}
		if (dir)
			break;
		/* FALL THRU */
	case FILE_PHASE_4:
	case SLINK_PHASE_4:
		/* for each hardlink, unlink and creat pointing to target */
		for (i = 1; i < node->numpaths; i++) {
			if (i == 1)
				log_message(LOG_NORMAL, _("Resetting hardlinks "
						"to new file"));

			rval = unlink(node->paths[i]);
			if (rval != 0) {
				err_message(_("unable to remove file: %s, %s"),
						node->paths[i], strerror(errno));
				break;
			}
			rval = link(srcname, node->paths[i]);
			if (rval != 0) {
				err_message(_("unable to link to file: %s, %s"),
						srcname, strerror(errno));
				break;
			}
		}
		break;
	}

	if (rval == 0) {
		log_message(LOG_NORMAL, _("Removing recover file: \'%s\'"),
				recover_file);
		unlink(recover_file);
		log_message(LOG_NORMAL, _("Recovery done."));
	}
	else {
		log_message(LOG_NORMAL, _("Leaving recover file: \'%s\'"),
				recover_file);
		log_message(LOG_NORMAL, _("Recovery failed."));
	}

	return rval;
}

int
main(
	int		argc,
	char		*argv[])
{
	int		c = 0;
	int		rval = 0;
	int		q_opt = 0;
	int		v_opt = 0;
	int		p_opt = 0;
	int		n_opt = 0;
	char		pathname[PATH_MAX];
	struct stat64	st;

	progname = basename(argv[0]);

	setlocale(LC_ALL, "");
	bindtextdomain(PACKAGE, LOCALEDIR);
	textdomain(PACKAGE);

	while ((c = getopt(argc, argv, "fnpqvP:r:")) != -1) {
		switch (c) {
		case 'f':
			force_all = 1;
			break;
		case 'n':
			n_opt++;
			break;
		case 'p':
			p_opt++;
			break;
		case 'q':
			if (v_opt)
				err_message(_("'q' option incompatible "
						"with 'v' option"));
			q_opt++;
			log_level=0;
			break;
		case 'v':
			if (q_opt)
				err_message(_("'v' option incompatible "
						"with 'q' option"));
			v_opt++;
			log_level++;
			break;
		case 'P':
			poll_interval = atoi(optarg);
			break;
		case 'r':
			recover_file = optarg;
			break;
		default:
			err_message(_("%s: illegal option -- %c\n"), c);
			usage();
			/* NOTREACHED */
			break;
		}
	}

	if (optind != argc - 1 && recover_file == NULL) {
		usage();
		exit(1);
	}

	realuid = getuid();
	starttime = time(0);

	init_nodehash();

	signal(SIGALRM, sighandler);
	signal(SIGABRT, sighandler);
	signal(SIGHUP, sighandler);
	signal(SIGINT, sighandler);
	signal(SIGQUIT, sighandler);
	signal(SIGTERM, sighandler);

	if (p_opt && poll_interval == 0) {
		poll_interval = 1;
	}
	if (poll_interval)
		alarm(poll_interval);

	if (recover_file) {
		bignode_t	*node = NULL;
		char		*target = NULL;
		char		*tname = NULL;
		int		phase = 0;

		if (n_opt)
			goto quit;

		/* read node info from recovery file */
		if (read_recover_file(recover_file, &node, &target,
				&tname, &phase) != 0)
			exit(1);

		rval = recover(node, target, tname, phase);

		free(target);
		free(tname);

		return rval;
	}

	recover_file = malloc(PATH_MAX);
	if (recover_file == NULL) {
		err_nomem();
		exit(1);
	}
	recover_file[0] = '\0';

	strcpy(pathname, argv[optind]);
	if (pathname[0] != '/') {
		err_message(_("pathname must begin with a slash ('/')"));
		exit(1);
	}

	if (stat64(pathname, &st) < 0) {
		err_stat(pathname);
		exit(1);
	}
	if (S_ISREG(st.st_mode)) {
		/* single file specified */
		if (st.st_nlink > 1) {
			err_message(_("cannot process single file with a "
					"link count greater than 1"));
			exit(1);
		}

		strcpy(recover_file, pathname);
		dirname(recover_file);

		strcpy(recover_file + strlen(recover_file), "/xfs_reno.recover");
		if (!n_opt) {
			if (open_recoverfile() != 0)
				exit(1);
		}
		add_node_path(st.st_ino, FTW_F, pathname);
	}
	else if (S_ISDIR(st.st_mode)) {
		/* directory tree specified */
		strcpy(recover_file, pathname);

		strcpy(recover_file + strlen(recover_file), "/xfs_reno.recover");
		if (!n_opt) {
			if (open_recoverfile() != 0)
				exit(1);
		}

		/* directory scan */
		log_message(LOG_INFO, _("\rScanning directory tree..."));
		SET_PHASE(SCAN_PHASE);
	    if (xfs_get_agflags(pathname) != 0) {
            err_message(_("Could not get non-allocatable AGs info\n"));
            exit(1);
        }

		nftw64(pathname, nftw_addnodes, 100, FTW_PHYS | FTW_MOUNT);
	}
	else {
		err_message(_("pathname must be either a regular file "
				"or directory"));
		exit(1);
	}

	dump_nodehash();

	if (n_opt) {
		/* n flag set, don't do anything */
		if (numdirnodes)
			log_message(LOG_NORMAL, "\rWould process %d %s",
					numdirnodes, numdirnodes == 1 ?
						"directory" : "directories");
		else
			log_message(LOG_NORMAL, "\rNo directories to process");

		if (numfilenodes)
			/* process files */
			log_message(LOG_NORMAL, "\rWould process %d %s",
					numfilenodes, numfilenodes == 1 ?
						"file" : "files");
		else
			log_message(LOG_NORMAL, "\rNo files to process");
		if (numslinknodes)
			/* process files */
			log_message(LOG_NORMAL, "\rWould process %d %s",
					numslinknodes, numslinknodes == 1 ?
						"symlinx" : "symlinks");
		else
			log_message(LOG_NORMAL, "\rNo symlinks to process");
	} else {
		/* process directories */
		if (numdirnodes) {
			log_message(LOG_INFO, _("\rProcessing %d %s..."),
					numdirnodes, numdirnodes == 1 ?
					    _("directory") : _("directories"));
			cur_phase = DIR_PHASE;
			rval = for_all_nodes(process_dir, FTW_D, 1);
			if (rval != 0)
				goto quit;
		}
		else
			log_message(LOG_INFO, _("\rNo directories to process..."));

		if (numfilenodes) {
			/* process files */
			log_message(LOG_INFO, _("\rProcessing %d %s..."),
					numfilenodes, numfilenodes == 1 ?
						_("file") : _("files"));
			cur_phase = FILE_PHASE;
			for_all_nodes(process_file, FTW_F, 0);
		}
		else
			log_message(LOG_INFO, _("\rNo files to process..."));

		if (numslinknodes) {
			/* process symlinks */
			log_message(LOG_INFO, _("\rProcessing %d %s..."),
					numslinknodes, numslinknodes == 1 ?
						_("symlink") : _("symlinks"));
			cur_phase = SLINK_PHASE;
			for_all_nodes(process_slink, FTW_SL, 0);
		}
		else
			log_message(LOG_INFO, _("\rNo symlinks to process..."));

	}
quit:
	free_nodehash();

	close(recover_fd);

	if (rval == 0)
		unlink(recover_file);

	log_message(LOG_DEBUG, "\r%u seconds elapsed", time(0) - starttime);
	log_message(LOG_INFO, _("\rDone.     "));

	return rval | global_rval;
}

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: REVIEW: xfs_reno #2
  2008-03-06 16:10 ` Ruben Porras
@ 2008-06-03 20:34   ` Christoph Hellwig
  0 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2008-06-03 20:34 UTC (permalink / raw)
  To: Barry Naujok, xfs@oss.sgi.com

What happened to xfs xfs_reno patch?  While the swapino functionality
would be nice to have I can't think of a reason to not have the current
code in xfsprogs.

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2008-06-03 20:34 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-10-04  4:25 REVIEW: xfs_reno #2 Barry Naujok
2007-10-17 15:48 ` Ruben Porras
2007-11-16  6:04 ` Vlad Apostolov
2007-11-16  6:20   ` Timothy Shimmin
2007-11-18 23:13     ` Vlad Apostolov
2007-11-18 23:19       ` Vlad Apostolov
2007-11-19 12:39     ` Christoph Hellwig
2007-11-19 15:52       ` Eric Sandeen
2007-11-19 22:08         ` Vlad Apostolov
2007-11-19  3:48 ` Lachlan McIlroy
2007-11-20  1:36 ` David Chinner
2007-11-23 14:30   ` Ruben Porras
2008-03-06 16:11   ` Ruben Porras
2008-03-06 16:10 ` Ruben Porras
2008-06-03 20:34   ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox