linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Gao Xiang <hsiangkao@redhat.com>
To: linux-xfs@vger.kernel.org
Cc: Gao Xiang <hsiangkao@redhat.com>
Subject: [RFC PATCH] xfsrestore: fix rootdir due to xfsdump bulkstat misuse
Date: Fri, 13 Nov 2020 20:51:27 +0800	[thread overview]
Message-ID: <20201113125127.966243-1-hsiangkao@redhat.com> (raw)

If rootino is wrong and misspecified to a subdir inode #,
the following assertion could be triggered:
  assert(ino != persp->p_rootino || hardh == persp->p_rooth);

This patch adds a '-x' option (another awkward thing is that
the codebase doesn't support long options) to address
problamatic images by searching for the real dir, the reason
that I don't enable it by default is that I'm not very confident
with the xfsrestore codebase and xfsdump bulkstat issue will
also be fixed immediately as well, so this function might be
optional and only useful for pre-exist corrupted dumps.

In details, my understanding of the original logic is
 1) xfsrestore will create a rootdir node_t (p_rooth);
 2) it will build the tree hierarchy from inomap and adopt
    the parent if needed (so inodes whose parent ino hasn't
    been detected will be in the orphan dir, p_orphh);
 3) during this period, if ino == rootino and
    hardh != persp->p_rooth (IOWs, another node_t with
    the same ino # is created), that'd be definitely wrong.

So the proposal fix is that
 - considering the xfsdump root ino # is a subdir inode, it'll
   trigger ino == rootino && hardh != persp->p_rooth condition;
 - so we log this node_t as persp->p_rooth rather than the
   initial rootdir node_t created in 1);
 - we also know that this node is actually a subdir, and after
   the whole inomap is scanned (IOWs, the tree is built),
   the real root dir will have the orphan dir parent p_orphh;
 - therefore, we walk up by the parent until some node_t has
   the p_orphh, so that's it.

Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
---
 restore/content.c |  7 +++++++
 restore/getopt.h  |  4 ++--
 restore/tree.c    | 44 +++++++++++++++++++++++++++++++++++++++++++-
 restore/tree.h    |  2 ++
 4 files changed, 54 insertions(+), 3 deletions(-)

diff --git a/restore/content.c b/restore/content.c
index 6b22965..e807a83 100644
--- a/restore/content.c
+++ b/restore/content.c
@@ -865,6 +865,7 @@ static int quotafilecheck(char *type, char *dstdir, char *quotafile);
 
 bool_t content_media_change_needed;
 bool_t restore_rootdir_permissions;
+bool_t need_fixrootdir;
 char *media_change_alert_program = NULL;
 size_t perssz;
 
@@ -964,6 +965,7 @@ content_init(int argc, char *argv[], size64_t vmsz)
 	stsz = 0;
 	interpr = BOOL_FALSE;
 	restore_rootdir_permissions = BOOL_FALSE;
+	need_fixrootdir = BOOL_FALSE;
 	optind = 1;
 	opterr = 0;
 	while ((c = getopt(argc, argv, GETOPT_CMDSTRING)) != EOF) {
@@ -1189,6 +1191,9 @@ content_init(int argc, char *argv[], size64_t vmsz)
 		case GETOPT_FMT2COMPAT:
 			tranp->t_truncategenpr = BOOL_TRUE;
 			break;
+		case GETOPT_FIXROOTDIR:
+			need_fixrootdir = BOOL_TRUE;
+			break;
 		}
 	}
 
@@ -3140,6 +3145,8 @@ applydirdump(drive_t *drivep,
 			return rv;
 		}
 
+		if (need_fixrootdir)
+			tree_fixroot();
 		persp->s.dirdonepr = BOOL_TRUE;
 	}
 
diff --git a/restore/getopt.h b/restore/getopt.h
index b5bc004..b0c0c7d 100644
--- a/restore/getopt.h
+++ b/restore/getopt.h
@@ -26,7 +26,7 @@
  * purpose is to contain that command string.
  */
 
-#define GETOPT_CMDSTRING	"a:b:c:def:himn:op:qrs:tv:wABCDEFG:H:I:JKL:M:NO:PQRS:TUVWX:Y:"
+#define GETOPT_CMDSTRING	"a:b:c:def:himn:op:qrs:tv:wxABCDEFG:H:I:JKL:M:NO:PQRS:TUVWX:Y:"
 
 #define GETOPT_WORKSPACE	'a'	/* workspace dir (content.c) */
 #define GETOPT_BLOCKSIZE        'b'     /* blocksize for rmt */
@@ -51,7 +51,7 @@
 /*				'u' */
 #define	GETOPT_VERBOSITY	'v'	/* verbosity level (0 to 4) */
 #define	GETOPT_SMALLWINDOW	'w'	/* use a small window for dir entries */
-/*				'x' */
+#define GETOPT_FIXROOTDIR	'x'	/* try to fix rootdir due to bulkstat misuse */
 /*				'y' */
 /*				'z' */
 #define	GETOPT_NOEXTATTR	'A'	/* do not restore ext. file attr. */
diff --git a/restore/tree.c b/restore/tree.c
index 0670318..c1e0461 100644
--- a/restore/tree.c
+++ b/restore/tree.c
@@ -265,6 +265,7 @@ extern void usage(void);
 extern size_t pgsz;
 extern size_t pgmask;
 extern bool_t restore_rootdir_permissions;
+extern bool_t need_fixrootdir;
 
 /* forward declarations of locally defined static functions ******************/
 
@@ -335,6 +336,36 @@ static xfs_ino_t orphino = ORPH_INO;
 
 /* definition of locally defined global functions ****************************/
 
+void
+tree_fixroot(void)
+{
+	nh_t		rooth = persp->p_rooth;
+	xfs_ino_t 	rootino;
+
+	while (1) {
+		nh_t	parh;
+		node_t *rootp = Node_map(rooth);
+
+		rootino = rootp->n_ino;
+		parh = rootp->n_parh;
+		Node_unmap(rooth, &rootp);
+
+		if (parh == rooth ||
+		/* since all new node (including non-parent) would be adopted into orphh */
+		    parh == persp->p_orphh ||
+		    parh == NH_NULL)
+			break;
+		rooth = parh;
+	}
+
+	if (rooth != persp->p_rooth) {
+		persp->p_rooth = rooth;
+		persp->p_rootino = rootino;
+
+		mlog(MLOG_NORMAL, "fix root # to %llu (bind mount?)\n", rootino);
+	}
+}
+
 /* ARGSUSED */
 bool_t
 tree_init(char *hkdir,
@@ -753,8 +784,13 @@ tree_begindir(filehdr_t *fhdrp, dah_t *dahp)
 
 	/* lookup head of hardlink list
 	 */
+	/* do not use the old fake root node to prevent potential loop */
+	if (need_fixrootdir == BOOL_TRUE && ino == persp->p_rootino && !gen)
+		gen = -1;
+
 	hardh = link_hardh(ino, gen);
-	assert(ino != persp->p_rootino || hardh == persp->p_rooth);
+	if (need_fixrootdir == BOOL_FALSE)
+		assert(ino != persp->p_rootino || hardh == persp->p_rooth);
 
 	/* already present
 	 */
@@ -824,6 +860,12 @@ tree_begindir(filehdr_t *fhdrp, dah_t *dahp)
 		*dahp = dah;
 	}
 
+	/* found the fake rootino from subdir, need fix p_rooth. */
+	if (need_fixrootdir == BOOL_TRUE &&
+	    persp->p_rootino == ino && hardh != persp->p_rooth) {
+		mlog(MLOG_NORMAL, "found fake rootino #%llu, will fix.\n", ino);
+		persp->p_rooth = hardh;
+	}
 	return hardh;
 }
 
diff --git a/restore/tree.h b/restore/tree.h
index 4f9ffe8..5d0c346 100644
--- a/restore/tree.h
+++ b/restore/tree.h
@@ -18,6 +18,8 @@
 #ifndef TREE_H
 #define TREE_H
 
+void tree_fixroot(void);
+
 /* tree_init - creates a new tree abstraction.
  */
 extern bool_t tree_init(char *hkdir,
-- 
2.18.4


             reply	other threads:[~2020-11-13 12:52 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-13 12:51 Gao Xiang [this message]
2020-11-13 14:10 ` [RFC PATCH] xfsrestore: fix rootdir due to xfsdump bulkstat misuse Eric Sandeen
2020-11-13 14:15   ` Gao Xiang
2020-11-16  8:23     ` Gao Xiang
2020-11-16  8:07 ` [RFC PATCH v2] " Gao Xiang
2020-11-16  8:13   ` Gao Xiang
2021-09-27 15:00   ` Bill O'Donnell
2021-12-20 15:14   ` Masayoshi Mizuma
2022-06-10 23:05   ` Hironori Shiina
2022-06-30 19:19     ` Hironori Shiina
2022-09-28 19:10   ` [RFC PATCH V3] " Hironori Shiina
2022-09-29 18:17     ` Darrick J. Wong
2023-04-19 14:16       ` Eric Sandeen
2023-04-19 15:40         ` Darrick J. Wong
2023-04-21  9:10           ` Carlos Maiolino
2023-04-21 16:14             ` Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201113125127.966243-1-hsiangkao@redhat.com \
    --to=hsiangkao@redhat.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).