* [PATCHSET v29.4 10/13] xfs: move orphan files to lost and found
@ 2024-02-27 2:18 Darrick J. Wong
2024-02-27 2:32 ` [PATCH 1/3] xfs: move orphan files to the orphanage Darrick J. Wong
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Darrick J. Wong @ 2024-02-27 2:18 UTC (permalink / raw)
To: djwong; +Cc: linux-xfs, hch
Hi all,
Orphaned files are defined to be files with nonzero ondisk link count
but no observable parent directory. This series enables online repair
to reparent orphaned files into the filesystem directory tree, and wires
up this reparenting ability into the directory, file link count, and
parent pointer repair functions. This is how we fix files with positive
link count that are not reachable through the directory tree.
This patch will also create the orphanage directory (lost+found) if it
is not present. In contrast to xfs_repair, we follow e2fsck in creating
the lost+found without group or other-owner access to avoid accidental
disclosure of files that were previously hidden by an 0700 directory.
That's silly security, but people have been known to do it.
If you're going to start using this code, I strongly recommend pulling
from my git trees, which are linked below.
This has been running on the djcloud for months with no problems. Enjoy!
Comments and questions are, as always, welcome.
--D
kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-orphanage
---
Commits in this patchset:
* xfs: move orphan files to the orphanage
* xfs: move files to orphanage instead of letting nlinks drop to zero
* xfs: ensure dentry consistency when the orphanage adopts a file
---
.../filesystems/xfs/xfs-online-fsck-design.rst | 20 -
fs/xfs/Makefile | 1
fs/xfs/scrub/dir_repair.c | 130 ++++
fs/xfs/scrub/nlinks.c | 11
fs/xfs/scrub/nlinks.h | 6
fs/xfs/scrub/nlinks_repair.c | 124 ++++
fs/xfs/scrub/orphanage.c | 587 ++++++++++++++++++++
fs/xfs/scrub/orphanage.h | 75 +++
fs/xfs/scrub/parent_repair.c | 98 +++
fs/xfs/scrub/repair.h | 2
fs/xfs/scrub/scrub.c | 2
fs/xfs/scrub/scrub.h | 4
fs/xfs/scrub/trace.c | 1
fs/xfs/scrub/trace.h | 96 +++
fs/xfs/xfs_inode.c | 6
fs/xfs/xfs_inode.h | 1
16 files changed, 1130 insertions(+), 34 deletions(-)
create mode 100644 fs/xfs/scrub/orphanage.c
create mode 100644 fs/xfs/scrub/orphanage.h
^ permalink raw reply [flat|nested] 12+ messages in thread* [PATCH 1/3] xfs: move orphan files to the orphanage 2024-02-27 2:18 [PATCHSET v29.4 10/13] xfs: move orphan files to lost and found Darrick J. Wong @ 2024-02-27 2:32 ` Darrick J. Wong 2024-02-28 17:21 ` Christoph Hellwig 2024-02-27 2:32 ` [PATCH 2/3] xfs: move files to orphanage instead of letting nlinks drop to zero Darrick J. Wong 2024-02-27 2:32 ` [PATCH 3/3] xfs: ensure dentry consistency when the orphanage adopts a file Darrick J. Wong 2 siblings, 1 reply; 12+ messages in thread From: Darrick J. Wong @ 2024-02-27 2:32 UTC (permalink / raw) To: djwong; +Cc: linux-xfs, hch From: Darrick J. Wong <djwong@kernel.org> When we're repairing a directory structure or fixing the dotdot entry of a subdirectory, it's possible that we won't ever find a parent for the subdirectory. When this is the case, move it to the orphanage, aka /lost+found. Signed-off-by: Darrick J. Wong <djwong@kernel.org> --- .../filesystems/xfs/xfs-online-fsck-design.rst | 19 + fs/xfs/Makefile | 1 fs/xfs/scrub/dir_repair.c | 130 +++++ fs/xfs/scrub/orphanage.c | 499 ++++++++++++++++++++ fs/xfs/scrub/orphanage.h | 75 +++ fs/xfs/scrub/parent_repair.c | 98 ++++ fs/xfs/scrub/scrub.c | 2 fs/xfs/scrub/scrub.h | 4 fs/xfs/scrub/trace.h | 28 + fs/xfs/xfs_inode.c | 6 fs/xfs/xfs_inode.h | 1 11 files changed, 843 insertions(+), 20 deletions(-) create mode 100644 fs/xfs/scrub/orphanage.c create mode 100644 fs/xfs/scrub/orphanage.h diff --git a/Documentation/filesystems/xfs/xfs-online-fsck-design.rst b/Documentation/filesystems/xfs/xfs-online-fsck-design.rst index f72e1ed2d0e5f..74069f692dfd1 100644 --- a/Documentation/filesystems/xfs/xfs-online-fsck-design.rst +++ b/Documentation/filesystems/xfs/xfs-online-fsck-design.rst @@ -4778,14 +4778,21 @@ Orphaned files are adopted by the orphanage as follows: The ``xrep_orphanage_iolock_two`` function follows the inode locking strategy discussed earlier. -3. Call ``xrep_orphanage_compute_blkres`` and ``xrep_orphanage_compute_name`` - to compute the new name in the orphanage and the block reservation required. - -4. Use ``xrep_orphanage_adoption_prep`` to reserve resources to the repair +3. Use ``xrep_adoption_trans_alloc`` to reserve resources to the repair transaction. -5. Call ``xrep_orphanage_adopt`` to reparent the orphaned file into the lost - and found, and update the kernel dentry cache. +4. Call ``xrep_orphanage_compute_name`` to compute the new name in the + orphanage. + +5. If the adoption is going to happen, call ``xrep_adoption_reparent`` to + reparent the orphaned file into the lost and found and invalidate the dentry + cache. + +6. Call ``xrep_adoption_finish`` to commit any filesystem updates, release the + orphanage ILOCK, and clean the scrub transaction. + +7. If a runtime error happens, call ``xrep_adoption_cancel`` to release all + resources. The proposed patches are in the `orphanage adoption diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile index 690e082ce93dd..464febc2f7cd2 100644 --- a/fs/xfs/Makefile +++ b/fs/xfs/Makefile @@ -206,6 +206,7 @@ xfs-y += $(addprefix scrub/, \ inode_repair.o \ newbt.o \ nlinks_repair.o \ + orphanage.o \ parent_repair.o \ rcbag_btree.o \ rcbag.o \ diff --git a/fs/xfs/scrub/dir_repair.c b/fs/xfs/scrub/dir_repair.c index dba8745767eab..0bf73e9c63fba 100644 --- a/fs/xfs/scrub/dir_repair.c +++ b/fs/xfs/scrub/dir_repair.c @@ -42,6 +42,7 @@ #include "scrub/readdir.h" #include "scrub/reap.h" #include "scrub/findparent.h" +#include "scrub/orphanage.h" /* * Directory Repair @@ -115,12 +116,21 @@ struct xrep_dir { */ struct xrep_parent_scan_info pscan; + /* + * Context information for attaching this directory to the lost+found + * if this directory does not have a parent. + */ + struct xrep_adoption adoption; + /* How many subdirectories did we find? */ uint64_t subdirs; /* How many dirents did we find? */ unsigned int dirents; + /* Should we move this directory to the orphanage? */ + bool needs_adoption; + /* Directory entry name, plus the trailing null. */ unsigned char namebuf[MAXNAMELEN]; }; @@ -147,6 +157,10 @@ xrep_setup_directory( xchk_fsgates_enable(sc, XCHK_FSGATES_DIRENTS); + error = xrep_orphanage_try_create(sc); + if (error) + return error; + error = xrep_tempfile_create(sc, S_IFDIR); if (error) return error; @@ -1139,10 +1153,8 @@ xrep_dir_set_nlink( /* * The directory is not on the incore unlinked list, which means that * it needs to be reachable via the directory tree. Update the nlink - * with our observed link count. - * - * XXX: A subsequent patch will handle parentless directories by moving - * them to the lost and found instead of aborting the repair. + * with our observed link count. If the directory has no parent, it + * will be moved to the orphanage. */ if (!xfs_inode_on_unlinked_list(dp)) goto reset_nlink; @@ -1153,6 +1165,7 @@ xrep_dir_set_nlink( * inactivate when the last reference drops. */ if (rd->dirents == 0) { + rd->needs_adoption = false; new_nlink = 0; goto reset_nlink; } @@ -1161,7 +1174,8 @@ xrep_dir_set_nlink( * The directory is on the unlinked list and we found dirents. This * directory needs to be reachable via the directory tree. Remove the * dir from the unlinked list and update nlink with the observed link - * count. + * count. If the directory has no parent, it will be moved to the + * orphanage. */ pag = xfs_perag_get(sc->mp, XFS_INO_TO_AGNO(sc->mp, dp->i_ino)); if (!pag) { @@ -1197,12 +1211,16 @@ xrep_dir_swap( return -EFSCORRUPTED; /* - * If we never found the parent for this directory, we can't fix this - * directory. + * If we never found the parent for this directory, temporarily assign + * the root dir as the parent; we'll move this to the orphanage after + * exchanging the dir contents. We hold the ILOCK of the dir being + * repaired, so we're not worried about racy updates of dotdot. */ ASSERT(sc->ilock_flags & XFS_ILOCK_EXCL); - if (rd->pscan.parent_ino == NULLFSINO) - return -EFSCORRUPTED; + if (rd->pscan.parent_ino == NULLFSINO) { + rd->needs_adoption = true; + rd->pscan.parent_ino = rd->sc->mp->m_sb.sb_rootino; + } /* * Reset the temporary directory's '..' entry to point to the parent @@ -1360,6 +1378,91 @@ xrep_dir_setup_scan( return error; } +/* + * Move the current file to the orphanage. + * + * Caller must hold IOLOCK_EXCL on @sc->ip, and no other inode locks. Upon + * successful return, the scrub transaction will have enough extra reservation + * to make the move; it will hold IOLOCK_EXCL and ILOCK_EXCL of @sc->ip and the + * orphanage; and both inodes will be ijoined. + */ +STATIC int +xrep_dir_move_to_orphanage( + struct xrep_dir *rd) +{ + struct xfs_scrub *sc = rd->sc; + xfs_ino_t orig_parent, new_parent; + int error; + + /* + * We are about to drop the ILOCK on sc->ip to lock the orphanage and + * prepare for the adoption. Therefore, look up the old dotdot entry + * for sc->ip so that we can compare it after we re-lock sc->ip. + */ + error = xchk_dir_lookup(sc, sc->ip, &xfs_name_dotdot, &orig_parent); + if (error) + return error; + + /* + * Drop the ILOCK on the scrub target and commit the transaction. + * Adoption computes its own resource requirements and gathers the + * necessary components. + */ + error = xrep_trans_commit(sc); + if (error) + return error; + xchk_iunlock(sc, XFS_ILOCK_EXCL); + + /* If we can take the orphanage's iolock then we're ready to move. */ + if (!xrep_orphanage_ilock_nowait(sc, XFS_IOLOCK_EXCL)) { + xchk_iunlock(sc, sc->ilock_flags); + error = xrep_orphanage_iolock_two(sc); + if (error) + return error; + } + + /* Grab transaction and ILOCK the two files. */ + error = xrep_adoption_trans_alloc(sc, &rd->adoption); + if (error) + return error; + + error = xrep_adoption_compute_name(&rd->adoption, rd->namebuf); + if (error) + return error; + + /* + * Now that we've reacquired the ILOCK on sc->ip, look up the dotdot + * entry again. If the parent changed or the child was unlinked while + * the child directory was unlocked, we don't need to move the child to + * the orphanage after all. + */ + error = xchk_dir_lookup(sc, sc->ip, &xfs_name_dotdot, &new_parent); + if (error) + return error; + + /* + * Attach to the orphanage if we still have a linked directory and it + * hasn't been moved. + */ + if (orig_parent == new_parent && VFS_I(sc->ip)->i_nlink > 0) { + error = xrep_adoption_move(&rd->adoption); + if (error) + return error; + } + + /* + * Launder the scrub transaction so we can drop the orphanage ILOCK + * and IOLOCK. Return holding the scrub target's ILOCK and IOLOCK. + */ + error = xrep_adoption_trans_roll(&rd->adoption); + if (error) + return error; + + xrep_orphanage_iunlock(sc, XFS_ILOCK_EXCL); + xrep_orphanage_iunlock(sc, XFS_IOLOCK_EXCL); + return 0; +} + /* * Repair the directory metadata. * @@ -1398,6 +1501,15 @@ xrep_directory( if (error) goto out_teardown; + if (rd->needs_adoption) { + if (!xrep_orphanage_can_adopt(rd->sc)) + error = -EFSCORRUPTED; + else + error = xrep_dir_move_to_orphanage(rd); + if (error) + goto out_teardown; + } + out_teardown: xrep_dir_teardown(sc); return error; diff --git a/fs/xfs/scrub/orphanage.c b/fs/xfs/scrub/orphanage.c new file mode 100644 index 0000000000000..0aedc5c70b632 --- /dev/null +++ b/fs/xfs/scrub/orphanage.c @@ -0,0 +1,499 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (c) 2021-2024 Oracle. All Rights Reserved. + * Author: Darrick J. Wong <djwong@kernel.org> + */ +#include "xfs.h" +#include "xfs_fs.h" +#include "xfs_shared.h" +#include "xfs_format.h" +#include "xfs_trans_resv.h" +#include "xfs_mount.h" +#include "xfs_log_format.h" +#include "xfs_trans.h" +#include "xfs_inode.h" +#include "xfs_ialloc.h" +#include "xfs_quota.h" +#include "xfs_trans_space.h" +#include "xfs_dir2.h" +#include "xfs_icache.h" +#include "xfs_bmap.h" +#include "xfs_bmap_btree.h" +#include "scrub/scrub.h" +#include "scrub/common.h" +#include "scrub/repair.h" +#include "scrub/trace.h" +#include "scrub/orphanage.h" +#include "scrub/readdir.h" + +#include <linux/namei.h> + +/* + * The Orphanage + * ============= + * + * If the directory tree is damaged, children of that directory become + * inaccessible via that file path. If a child has no other parents, the file + * is said to be orphaned. xfs_repair fixes this situation by creating a + * orphanage directory (specifically, /lost+found) and creating a directory + * entry pointing to the orphaned file. + * + * Online repair follows this tactic by creating a root-owned /lost+found + * directory if one does not exist. If an orphan is found, it will move that + * files into orphanage. + */ + +/* Make the orphanage owned by root. */ +STATIC int +xrep_chown_orphanage( + struct xfs_scrub *sc, + struct xfs_inode *dp) +{ + struct xfs_trans *tp; + struct xfs_mount *mp = sc->mp; + struct xfs_dquot *udqp = NULL, *gdqp = NULL, *pdqp = NULL; + struct xfs_dquot *oldu = NULL, *oldg = NULL, *oldp = NULL; + struct inode *inode = VFS_I(dp); + int error; + + error = xfs_qm_vop_dqalloc(dp, GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, 0, + XFS_QMOPT_QUOTALL, &udqp, &gdqp, &pdqp); + if (error) + return error; + + error = xfs_trans_alloc_ichange(dp, udqp, gdqp, pdqp, true, &tp); + if (error) + goto out_dqrele; + + /* + * Always clear setuid/setgid/sticky on the orphanage since we don't + * normally want that functionality on this directory and xfs_repair + * doesn't create it this way either. Leave the other access bits + * unchanged. + */ + inode->i_mode &= ~(S_ISUID | S_ISGID | S_ISVTX); + + /* + * Change the ownerships and register quota modifications + * in the transaction. + */ + if (!uid_eq(inode->i_uid, GLOBAL_ROOT_UID)) { + if (XFS_IS_UQUOTA_ON(mp)) + oldu = xfs_qm_vop_chown(tp, dp, &dp->i_udquot, udqp); + inode->i_uid = GLOBAL_ROOT_UID; + } + if (!gid_eq(inode->i_gid, GLOBAL_ROOT_GID)) { + if (XFS_IS_GQUOTA_ON(mp)) + oldg = xfs_qm_vop_chown(tp, dp, &dp->i_gdquot, gdqp); + inode->i_gid = GLOBAL_ROOT_GID; + } + if (dp->i_projid != 0) { + if (XFS_IS_PQUOTA_ON(mp)) + oldp = xfs_qm_vop_chown(tp, dp, &dp->i_pdquot, pdqp); + dp->i_projid = 0; + } + + dp->i_diflags &= ~(XFS_DIFLAG_REALTIME | XFS_DIFLAG_RTINHERIT); + xfs_trans_log_inode(tp, dp, XFS_ILOG_CORE); + + XFS_STATS_INC(mp, xs_ig_attrchg); + + if (xfs_has_wsync(mp)) + xfs_trans_set_sync(tp); + error = xfs_trans_commit(tp); + + xfs_qm_dqrele(oldu); + xfs_qm_dqrele(oldg); + xfs_qm_dqrele(oldp); + +out_dqrele: + xfs_qm_dqrele(udqp); + xfs_qm_dqrele(gdqp); + xfs_qm_dqrele(pdqp); + return error; +} + +#define ORPHANAGE "lost+found" + +/* Create the orphanage directory, and set sc->orphanage to it. */ +int +xrep_orphanage_create( + struct xfs_scrub *sc) +{ + struct xfs_mount *mp = sc->mp; + struct dentry *root_dentry, *orphanage_dentry; + struct inode *root_inode = VFS_I(sc->mp->m_rootip); + struct inode *orphanage_inode; + int error; + + if (xfs_is_shutdown(mp)) + return -EIO; + if (xfs_is_readonly(mp)) { + sc->orphanage = NULL; + return 0; + } + + ASSERT(sc->tp == NULL); + ASSERT(sc->orphanage == NULL); + + /* Find the dentry for the root directory... */ + root_dentry = d_find_alias(root_inode); + if (!root_dentry) { + error = -EFSCORRUPTED; + goto out; + } + + /* ...which is a directory, right? */ + if (!d_is_dir(root_dentry)) { + error = -EFSCORRUPTED; + goto out_dput_root; + } + + /* Try to find the orphanage directory. */ + inode_lock_nested(root_inode, I_MUTEX_PARENT); + orphanage_dentry = lookup_one_len(ORPHANAGE, root_dentry, + strlen(ORPHANAGE)); + if (IS_ERR(orphanage_dentry)) { + error = PTR_ERR(orphanage_dentry); + goto out_unlock_root; + } + + /* + * Nothing found? Call mkdir to create the orphanage. Create the + * directory without other-user access because we're live and someone + * could have been relying partly on minimal access to a parent + * directory to control access to a file we put in here. + */ + if (d_really_is_negative(orphanage_dentry)) { + error = vfs_mkdir(&nop_mnt_idmap, root_inode, orphanage_dentry, + 0750); + if (error) + goto out_dput_orphanage; + } + + /* Not a directory? Bail out. */ + if (!d_is_dir(orphanage_dentry)) { + error = -ENOTDIR; + goto out_dput_orphanage; + } + + /* + * Grab a reference to the orphanage. This /should/ succeed since + * we hold the root directory locked and therefore nobody can delete + * the orphanage. + */ + orphanage_inode = igrab(d_inode(orphanage_dentry)); + if (!orphanage_inode) { + error = -ENOENT; + goto out_dput_orphanage; + } + + /* Make sure the orphanage is owned by root. */ + error = xrep_chown_orphanage(sc, XFS_I(orphanage_inode)); + if (error) + goto out_dput_orphanage; + + /* Stash the reference for later and bail out. */ + sc->orphanage = XFS_I(orphanage_inode); + sc->orphanage_ilock_flags = 0; + +out_dput_orphanage: + dput(orphanage_dentry); +out_unlock_root: + inode_unlock(VFS_I(sc->mp->m_rootip)); +out_dput_root: + dput(root_dentry); +out: + return error; +} + +void +xrep_orphanage_ilock( + struct xfs_scrub *sc, + unsigned int ilock_flags) +{ + sc->orphanage_ilock_flags |= ilock_flags; + xfs_ilock(sc->orphanage, ilock_flags); +} + +bool +xrep_orphanage_ilock_nowait( + struct xfs_scrub *sc, + unsigned int ilock_flags) +{ + if (xfs_ilock_nowait(sc->orphanage, ilock_flags)) { + sc->orphanage_ilock_flags |= ilock_flags; + return true; + } + + return false; +} + +void +xrep_orphanage_iunlock( + struct xfs_scrub *sc, + unsigned int ilock_flags) +{ + xfs_iunlock(sc->orphanage, ilock_flags); + sc->orphanage_ilock_flags &= ~ilock_flags; +} + +/* Grab the IOLOCK of the orphanage and sc->ip. */ +int +xrep_orphanage_iolock_two( + struct xfs_scrub *sc) +{ + int error = 0; + + while (true) { + if (xchk_should_terminate(sc, &error)) + return error; + + /* + * Normal XFS takes the IOLOCK before grabbing a transaction. + * Scrub holds a transaction, which means that we can't block + * on either IOLOCK. + */ + if (xrep_orphanage_ilock_nowait(sc, XFS_IOLOCK_EXCL)) { + if (xchk_ilock_nowait(sc, XFS_IOLOCK_EXCL)) + break; + xrep_orphanage_iunlock(sc, XFS_IOLOCK_EXCL); + } + delay(1); + } + + return 0; +} + +/* Release the orphanage. */ +void +xrep_orphanage_rele( + struct xfs_scrub *sc) +{ + if (!sc->orphanage) + return; + + if (sc->orphanage_ilock_flags) + xfs_iunlock(sc->orphanage, sc->orphanage_ilock_flags); + + xchk_irele(sc, sc->orphanage); + sc->orphanage = NULL; +} + +/* Adoption moves a file into /lost+found */ + +/* Can the orphanage adopt @sc->ip? */ +bool +xrep_orphanage_can_adopt( + struct xfs_scrub *sc) +{ + ASSERT(sc->ip != NULL); + + if (!sc->orphanage) + return false; + if (sc->ip == sc->orphanage) + return false; + if (xfs_internal_inum(sc->mp, sc->ip->i_ino)) + return false; + return true; +} + +/* + * Create a new transaction to send a child to the orphanage. + * + * Allocate a new transaction with sufficient disk space to handle the + * adoption, take ILOCK_EXCL of the orphanage and sc->ip, joins them to the + * transaction, and reserve quota to reparent the latter. Caller must hold the + * IOLOCK of the orphanage and sc->ip. + */ +int +xrep_adoption_trans_alloc( + struct xfs_scrub *sc, + struct xrep_adoption *adopt) +{ + struct xfs_mount *mp = sc->mp; + unsigned int child_blkres = 0; + int error; + + ASSERT(sc->tp == NULL); + ASSERT(sc->ip != NULL); + ASSERT(sc->orphanage != NULL); + ASSERT(sc->ilock_flags & XFS_IOLOCK_EXCL); + ASSERT(sc->orphanage_ilock_flags & XFS_IOLOCK_EXCL); + ASSERT(!(sc->ilock_flags & (XFS_ILOCK_SHARED | XFS_ILOCK_EXCL))); + ASSERT(!(sc->orphanage_ilock_flags & + (XFS_ILOCK_SHARED | XFS_ILOCK_EXCL))); + + /* Compute the worst case space reservation that we need. */ + adopt->sc = sc; + adopt->orphanage_blkres = XFS_LINK_SPACE_RES(mp, MAXNAMELEN); + if (S_ISDIR(VFS_I(sc->ip)->i_mode)) + child_blkres = XFS_RENAME_SPACE_RES(mp, xfs_name_dotdot.len); + adopt->child_blkres = child_blkres; + + /* + * Allocate a transaction to link the child into the parent, along with + * enough disk space to handle expansion of both the orphanage and the + * dotdot entry of a child directory. + */ + error = xfs_trans_alloc(mp, &M_RES(mp)->tr_link, + adopt->orphanage_blkres + adopt->child_blkres, 0, 0, + &sc->tp); + if (error) + return error; + + xfs_lock_two_inodes(sc->orphanage, XFS_ILOCK_EXCL, + sc->ip, XFS_ILOCK_EXCL); + sc->ilock_flags |= XFS_ILOCK_EXCL; + sc->orphanage_ilock_flags |= XFS_ILOCK_EXCL; + + xfs_trans_ijoin(sc->tp, sc->orphanage, 0); + xfs_trans_ijoin(sc->tp, sc->ip, 0); + + /* + * Reserve enough quota in the orphan directory to add the new name. + * Normally the orphanage should have user/group/project ids of zero + * and hence is not subject to quota enforcement, but we're allowed to + * exceed quota to reattach disconnected parts of the directory tree. + */ + error = xfs_trans_reserve_quota_nblks(sc->tp, sc->orphanage, + adopt->orphanage_blkres, 0, true); + if (error) + goto out_cancel; + + /* + * Reserve enough quota in the child directory to change dotdot. + * Here we're also allowed to exceed file quota to repair inconsistent + * metadata. + */ + if (adopt->child_blkres) { + error = xfs_trans_reserve_quota_nblks(sc->tp, sc->ip, + adopt->child_blkres, 0, true); + if (error) + goto out_cancel; + } + + return 0; +out_cancel: + xchk_trans_cancel(sc); + xrep_orphanage_iunlock(sc, XFS_ILOCK_EXCL); + xrep_orphanage_iunlock(sc, XFS_IOLOCK_EXCL); + return error; +} + +/* + * Compute the xfs_name for the directory entry that we're adding to the + * orphanage. Caller must hold ILOCKs of sc->ip and the orphanage and must not + * reuse namebuf until the adoption completes or is dissolved. + */ +int +xrep_adoption_compute_name( + struct xrep_adoption *adopt, + unsigned char *namebuf) +{ + struct xfs_name *xname = &adopt->xname; + struct xfs_scrub *sc = adopt->sc; + xfs_ino_t ino; + unsigned int incr = 0; + int error = 0; + + xname->name = namebuf; + xname->len = snprintf(namebuf, MAXNAMELEN, "%llu", sc->ip->i_ino); + xname->type = xfs_mode_to_ftype(VFS_I(sc->ip)->i_mode); + + /* Make sure the filename is unique in the lost+found. */ + error = xchk_dir_lookup(sc, sc->orphanage, xname, &ino); + while (error == 0 && incr < 10000) { + xname->len = snprintf(namebuf, MAXNAMELEN, "%llu.%u", + sc->ip->i_ino, ++incr); + error = xchk_dir_lookup(sc, sc->orphanage, xname, &ino); + } + if (error == 0) { + /* We already have 10,000 entries in the orphanage? */ + return -EFSCORRUPTED; + } + + if (error != -ENOENT) + return error; + return 0; +} + +/* + * Move the current file to the orphanage under the computed name. + * + * Returns with a dirty transaction so that the caller can handle any other + * work, such as fixing up unlinked lists or resetting link counts. + */ +int +xrep_adoption_move( + struct xrep_adoption *adopt) +{ + struct xfs_scrub *sc = adopt->sc; + struct xfs_name *xname = &adopt->xname; + bool isdir = S_ISDIR(VFS_I(sc->ip)->i_mode); + int error; + + trace_xrep_adoption_reparent(sc->orphanage, &adopt->xname, + sc->ip->i_ino); + + /* Create the new name in the orphanage. */ + error = xfs_dir_createname(sc->tp, sc->orphanage, xname, sc->ip->i_ino, + adopt->orphanage_blkres); + if (error) + return error; + + /* + * Bump the link count of the orphanage if we just added a + * subdirectory, and update its timestamps. + */ + xfs_trans_ichgtime(sc->tp, sc->orphanage, + XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG); + if (isdir) + xfs_bumplink(sc->tp, sc->orphanage); + xfs_trans_log_inode(sc->tp, sc->orphanage, XFS_ILOG_CORE); + + /* Replace the dotdot entry if the child is a subdirectory. */ + if (isdir) { + error = xfs_dir_replace(sc->tp, sc->ip, &xfs_name_dotdot, + sc->orphanage->i_ino, adopt->child_blkres); + if (error) + return error; + } + + /* + * Notify dirent hooks that we moved the file to /lost+found, and + * finish all the deferred work so that we know the adoption is fully + * recorded in the log. + */ + xfs_dir_update_hook(sc->orphanage, sc->ip, 1, xname); + return 0; +} + +/* + * Roll to a clean scrub transaction so that we can release the orphanage, + * even if xrep_adoption_move was not called. + * + * Commits all the work and deferred ops attached to an adoption request and + * rolls to a clean scrub transaction. On success, returns 0 with the scrub + * context holding a clean transaction with no inodes joined. On failure, + * returns negative errno with no scrub transaction. All inode locks are + * still held after this function returns. + */ +int +xrep_adoption_trans_roll( + struct xrep_adoption *adopt) +{ + struct xfs_scrub *sc = adopt->sc; + int error; + + trace_xrep_adoption_trans_roll(sc->orphanage, sc->ip, + !!(sc->tp->t_flags & XFS_TRANS_DIRTY)); + + /* Finish all the deferred ops to commit all repairs. */ + error = xrep_defer_finish(sc); + if (error) + return error; + + /* Roll the transaction once more to detach the inodes. */ + return xfs_trans_roll(&sc->tp); +} diff --git a/fs/xfs/scrub/orphanage.h b/fs/xfs/scrub/orphanage.h new file mode 100644 index 0000000000000..9d40992583b24 --- /dev/null +++ b/fs/xfs/scrub/orphanage.h @@ -0,0 +1,75 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (c) 2021-2024 Oracle. All Rights Reserved. + * Author: Darrick J. Wong <djwong@kernel.org> + */ +#ifndef __XFS_SCRUB_ORPHANAGE_H__ +#define __XFS_SCRUB_ORPHANAGE_H__ + +#ifdef CONFIG_XFS_ONLINE_REPAIR +int xrep_orphanage_create(struct xfs_scrub *sc); + +/* + * If we're doing a repair, ensure that the orphanage exists and attach it to + * the scrub context. + */ +static inline int +xrep_orphanage_try_create( + struct xfs_scrub *sc) +{ + int error; + + ASSERT(sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR); + + error = xrep_orphanage_create(sc); + switch (error) { + case 0: + case -ENOENT: + case -ENOTDIR: + case -ENOSPC: + /* + * If the orphanage can't be found or isn't a directory, we'll + * keep going, but we won't be able to attach the file to the + * orphanage if we can't find the parent. + */ + return 0; + } + + return error; +} + +int xrep_orphanage_iolock_two(struct xfs_scrub *sc); + +void xrep_orphanage_ilock(struct xfs_scrub *sc, unsigned int ilock_flags); +bool xrep_orphanage_ilock_nowait(struct xfs_scrub *sc, + unsigned int ilock_flags); +void xrep_orphanage_iunlock(struct xfs_scrub *sc, unsigned int ilock_flags); + +void xrep_orphanage_rele(struct xfs_scrub *sc); + +/* Information about a request to add a file to the orphanage. */ +struct xrep_adoption { + /* Name structure; caller must provide a buffer separately. */ + struct xfs_name xname; + + struct xfs_scrub *sc; + + /* Block reservations for orphanage and child (if directory). */ + unsigned int orphanage_blkres; + unsigned int child_blkres; +}; + +bool xrep_orphanage_can_adopt(struct xfs_scrub *sc); + +int xrep_adoption_trans_alloc(struct xfs_scrub *sc, + struct xrep_adoption *adopt); +int xrep_adoption_compute_name(struct xrep_adoption *adopt, + unsigned char *namebuf); +int xrep_adoption_move(struct xrep_adoption *adopt); +int xrep_adoption_trans_roll(struct xrep_adoption *adopt); +#else +struct xrep_adoption { /* empty */ }; +# define xrep_orphanage_rele(sc) ((void)0) +#endif /* CONFIG_XFS_ONLINE_REPAIR */ + +#endif /* __XFS_SCRUB_ORPHANAGE_H__ */ diff --git a/fs/xfs/scrub/parent_repair.c b/fs/xfs/scrub/parent_repair.c index 826926c2bb0dd..2bb4cfbeffe82 100644 --- a/fs/xfs/scrub/parent_repair.c +++ b/fs/xfs/scrub/parent_repair.c @@ -32,6 +32,8 @@ #include "scrub/iscan.h" #include "scrub/findparent.h" #include "scrub/readdir.h" +#include "scrub/tempfile.h" +#include "scrub/orphanage.h" /* * Repairing The Directory Parent Pointer @@ -57,6 +59,12 @@ struct xrep_parent { * dotdot entry for this directory. */ struct xrep_parent_scan_info pscan; + + /* Orphanage reparenting request. */ + struct xrep_adoption adoption; + + /* Directory entry name, plus the trailing null. */ + unsigned char namebuf[MAXNAMELEN]; }; /* Tear down all the incore stuff we created. */ @@ -82,7 +90,7 @@ xrep_setup_parent( rp->sc = sc; sc->buf = rp; - return 0; + return xrep_orphanage_try_create(sc); } /* @@ -179,6 +187,91 @@ xrep_parent_reset_dotdot( return xfs_trans_roll(&sc->tp); } +/* + * Move the current file to the orphanage. + * + * Caller must hold IOLOCK_EXCL on @sc->ip, and no other inode locks. Upon + * successful return, the scrub transaction will have enough extra reservation + * to make the move; it will hold IOLOCK_EXCL and ILOCK_EXCL of @sc->ip and the + * orphanage; and both inodes will be ijoined. + */ +STATIC int +xrep_parent_move_to_orphanage( + struct xrep_parent *rp) +{ + struct xfs_scrub *sc = rp->sc; + xfs_ino_t orig_parent, new_parent; + int error; + + /* + * We are about to drop the ILOCK on sc->ip to lock the orphanage and + * prepare for the adoption. Therefore, look up the old dotdot entry + * for sc->ip so that we can compare it after we re-lock sc->ip. + */ + error = xchk_dir_lookup(sc, sc->ip, &xfs_name_dotdot, &orig_parent); + if (error) + return error; + + /* + * Drop the ILOCK on the scrub target and commit the transaction. + * Adoption computes its own resource requirements and gathers the + * necessary components. + */ + error = xrep_trans_commit(sc); + if (error) + return error; + xchk_iunlock(sc, XFS_ILOCK_EXCL); + + /* If we can take the orphanage's iolock then we're ready to move. */ + if (!xrep_orphanage_ilock_nowait(sc, XFS_IOLOCK_EXCL)) { + xchk_iunlock(sc, sc->ilock_flags); + error = xrep_orphanage_iolock_two(sc); + if (error) + return error; + } + + /* Grab transaction and ILOCK the two files. */ + error = xrep_adoption_trans_alloc(sc, &rp->adoption); + if (error) + return error; + + error = xrep_adoption_compute_name(&rp->adoption, rp->namebuf); + if (error) + return error; + + /* + * Now that we've reacquired the ILOCK on sc->ip, look up the dotdot + * entry again. If the parent changed or the child was unlinked while + * the child directory was unlocked, we don't need to move the child to + * the orphanage after all. + */ + error = xchk_dir_lookup(sc, sc->ip, &xfs_name_dotdot, &new_parent); + if (error) + return error; + + /* + * Attach to the orphanage if we still have a linked directory and it + * hasn't been moved. + */ + if (orig_parent == new_parent && VFS_I(sc->ip)->i_nlink > 0) { + error = xrep_adoption_move(&rp->adoption); + if (error) + return error; + } + + /* + * Launder the scrub transaction so we can drop the orphanage ILOCK + * and IOLOCK. Return holding the scrub target's ILOCK and IOLOCK. + */ + error = xrep_adoption_trans_roll(&rp->adoption); + if (error) + return error; + + xrep_orphanage_iunlock(sc, XFS_ILOCK_EXCL); + xrep_orphanage_iunlock(sc, XFS_IOLOCK_EXCL); + return 0; +} + /* * Commit the new parent pointer structure (currently only the dotdot entry) to * the file that we're repairing. @@ -188,7 +281,8 @@ xrep_parent_rebuild_tree( struct xrep_parent *rp) { if (rp->pscan.parent_ino == NULLFSINO) { - /* Cannot fix orphaned directories yet. */ + if (xrep_orphanage_can_adopt(rp->sc)) + return xrep_parent_move_to_orphanage(rp); return -EFSCORRUPTED; } diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c index 520d83db193c3..6417628ce26be 100644 --- a/fs/xfs/scrub/scrub.c +++ b/fs/xfs/scrub/scrub.c @@ -27,6 +27,7 @@ #include "scrub/stats.h" #include "scrub/xfile.h" #include "scrub/tempfile.h" +#include "scrub/orphanage.h" /* * Online Scrub and Repair @@ -217,6 +218,7 @@ xchk_teardown( } xrep_tempfile_rele(sc); + xrep_orphanage_rele(sc); xchk_fsgates_disable(sc); return error; } diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h index aca3e652343c1..456bb181399f4 100644 --- a/fs/xfs/scrub/scrub.h +++ b/fs/xfs/scrub/scrub.h @@ -105,6 +105,10 @@ struct xfs_scrub { /* Lock flags for @ip. */ uint ilock_flags; + /* The orphanage, for stashing files that have lost their parent. */ + uint orphanage_ilock_flags; + struct xfs_inode *orphanage; + /* A temporary file on this filesystem, for staging new metadata. */ struct xfs_inode *tempip; uint temp_ilock_flags; diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index 883accdd53f04..e6cf0e0020799 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -2582,6 +2582,34 @@ DEFINE_EVENT(xrep_dirent_class, name, \ DEFINE_XREP_DIRENT_EVENT(xrep_dir_salvage_entry); DEFINE_XREP_DIRENT_EVENT(xrep_dir_stash_createname); DEFINE_XREP_DIRENT_EVENT(xrep_dir_replay_createname); +DEFINE_XREP_DIRENT_EVENT(xrep_adoption_reparent); + +DECLARE_EVENT_CLASS(xrep_adoption_class, + TP_PROTO(struct xfs_inode *dp, struct xfs_inode *ip, bool moved), + TP_ARGS(dp, ip, moved), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(xfs_ino_t, dir_ino) + __field(xfs_ino_t, child_ino) + __field(bool, moved) + ), + TP_fast_assign( + __entry->dev = dp->i_mount->m_super->s_dev; + __entry->dir_ino = dp->i_ino; + __entry->child_ino = ip->i_ino; + __entry->moved = moved; + ), + TP_printk("dev %d:%d dir 0x%llx child 0x%llx moved? %d", + MAJOR(__entry->dev), MINOR(__entry->dev), + __entry->dir_ino, + __entry->child_ino, + __entry->moved) +); +#define DEFINE_XREP_ADOPTION_EVENT(name) \ +DEFINE_EVENT(xrep_adoption_class, name, \ + TP_PROTO(struct xfs_inode *dp, struct xfs_inode *ip, bool moved), \ + TP_ARGS(dp, ip, moved)) +DEFINE_XREP_ADOPTION_EVENT(xrep_adoption_trans_roll); DECLARE_EVENT_CLASS(xrep_parent_salvage_class, TP_PROTO(struct xfs_inode *dp, xfs_ino_t ino), diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index 4ec8a1dd7cf9a..b04f5f0e63b4e 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -936,10 +936,10 @@ xfs_droplink( /* * Increment the link count on an inode & log the change. */ -static void +void xfs_bumplink( - xfs_trans_t *tp, - xfs_inode_t *ip) + struct xfs_trans *tp, + struct xfs_inode *ip) { xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG); diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h index e15634fcb2638..cab52bcb64207 100644 --- a/fs/xfs/xfs_inode.h +++ b/fs/xfs/xfs_inode.h @@ -621,6 +621,7 @@ void xfs_end_io(struct work_struct *work); int xfs_ilock2_io_mmap(struct xfs_inode *ip1, struct xfs_inode *ip2); void xfs_iunlock2_io_mmap(struct xfs_inode *ip1, struct xfs_inode *ip2); void xfs_iunlock2_remapping(struct xfs_inode *ip1, struct xfs_inode *ip2); +void xfs_bumplink(struct xfs_trans *tp, struct xfs_inode *ip); static inline bool xfs_inode_unlinked_incomplete( ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 1/3] xfs: move orphan files to the orphanage 2024-02-27 2:32 ` [PATCH 1/3] xfs: move orphan files to the orphanage Darrick J. Wong @ 2024-02-28 17:21 ` Christoph Hellwig 0 siblings, 0 replies; 12+ messages in thread From: Christoph Hellwig @ 2024-02-28 17:21 UTC (permalink / raw) To: Darrick J. Wong; +Cc: linux-xfs, hch Looks good: Reviewed-by: Christoph Hellwig <hch@lst.de> ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 2/3] xfs: move files to orphanage instead of letting nlinks drop to zero 2024-02-27 2:18 [PATCHSET v29.4 10/13] xfs: move orphan files to lost and found Darrick J. Wong 2024-02-27 2:32 ` [PATCH 1/3] xfs: move orphan files to the orphanage Darrick J. Wong @ 2024-02-27 2:32 ` Darrick J. Wong 2024-02-28 17:23 ` Christoph Hellwig 2024-02-27 2:32 ` [PATCH 3/3] xfs: ensure dentry consistency when the orphanage adopts a file Darrick J. Wong 2 siblings, 1 reply; 12+ messages in thread From: Darrick J. Wong @ 2024-02-27 2:32 UTC (permalink / raw) To: djwong; +Cc: linux-xfs, hch From: Darrick J. Wong <djwong@kernel.org> If we encounter an inode with a nonzero link count but zero observed links, move it to the orphanage. Signed-off-by: Darrick J. Wong <djwong@kernel.org> --- .../filesystems/xfs/xfs-online-fsck-design.rst | 3 fs/xfs/scrub/nlinks.c | 11 ++ fs/xfs/scrub/nlinks.h | 6 + fs/xfs/scrub/nlinks_repair.c | 124 ++++++++++++++++++-- fs/xfs/scrub/repair.h | 2 fs/xfs/scrub/trace.c | 1 fs/xfs/scrub/trace.h | 26 ++++ 7 files changed, 158 insertions(+), 15 deletions(-) diff --git a/Documentation/filesystems/xfs/xfs-online-fsck-design.rst b/Documentation/filesystems/xfs/xfs-online-fsck-design.rst index 74069f692dfd1..adec3e7d0fd78 100644 --- a/Documentation/filesystems/xfs/xfs-online-fsck-design.rst +++ b/Documentation/filesystems/xfs/xfs-online-fsck-design.rst @@ -4789,7 +4789,8 @@ Orphaned files are adopted by the orphanage as follows: cache. 6. Call ``xrep_adoption_finish`` to commit any filesystem updates, release the - orphanage ILOCK, and clean the scrub transaction. + orphanage ILOCK, and clean the scrub transaction. Call + ``xrep_adoption_commit`` to commit the updates and the scrub transaction. 7. If a runtime error happens, call ``xrep_adoption_cancel`` to release all resources. diff --git a/fs/xfs/scrub/nlinks.c b/fs/xfs/scrub/nlinks.c index 8b9aa73093d66..b13bbbdf2ad32 100644 --- a/fs/xfs/scrub/nlinks.c +++ b/fs/xfs/scrub/nlinks.c @@ -24,6 +24,7 @@ #include "scrub/xfile.h" #include "scrub/xfarray.h" #include "scrub/iscan.h" +#include "scrub/orphanage.h" #include "scrub/nlinks.h" #include "scrub/trace.h" #include "scrub/readdir.h" @@ -44,9 +45,17 @@ int xchk_setup_nlinks( struct xfs_scrub *sc) { + int error; + xchk_fsgates_enable(sc, XCHK_FSGATES_DIRENTS); - sc->buf = kzalloc(sizeof(struct xchk_nlink_ctrs), XCHK_GFP_FLAGS); + if (xchk_could_repair(sc)) { + error = xrep_setup_nlinks(sc); + if (error) + return error; + } + + sc->buf = kvzalloc(sizeof(struct xchk_nlink_ctrs), XCHK_GFP_FLAGS); if (!sc->buf) return -ENOMEM; diff --git a/fs/xfs/scrub/nlinks.h b/fs/xfs/scrub/nlinks.h index a950f3daf204c..84020c4c2fc5a 100644 --- a/fs/xfs/scrub/nlinks.h +++ b/fs/xfs/scrub/nlinks.h @@ -28,6 +28,12 @@ struct xchk_nlink_ctrs { * from other writer threads. */ struct xfs_dir_hook dhook; + + /* Orphanage reparenting request. */ + struct xrep_adoption adoption; + + /* Directory entry name, plus the trailing null. */ + char namebuf[MAXNAMELEN]; }; /* diff --git a/fs/xfs/scrub/nlinks_repair.c b/fs/xfs/scrub/nlinks_repair.c index 23eb08c4b5ad5..1345c07a95c62 100644 --- a/fs/xfs/scrub/nlinks_repair.c +++ b/fs/xfs/scrub/nlinks_repair.c @@ -24,6 +24,7 @@ #include "scrub/xfile.h" #include "scrub/xfarray.h" #include "scrub/iscan.h" +#include "scrub/orphanage.h" #include "scrub/nlinks.h" #include "scrub/trace.h" #include "scrub/tempfile.h" @@ -38,6 +39,34 @@ * inode is locked. */ +/* Set up to repair inode link counts. */ +int +xrep_setup_nlinks( + struct xfs_scrub *sc) +{ + return xrep_orphanage_try_create(sc); +} + +/* + * Inodes that aren't the root directory or the orphanage, have a nonzero link + * count, and no observed parents should be moved to the orphanage. + */ +static inline bool +xrep_nlinks_is_orphaned( + struct xfs_scrub *sc, + struct xfs_inode *ip, + unsigned int actual_nlink, + const struct xchk_nlink *obs) +{ + struct xfs_mount *mp = ip->i_mount; + + if (obs->parents != 0) + return false; + if (ip == mp->m_rootip || ip == sc->orphanage) + return false; + return actual_nlink != 0; +} + /* Remove an inode from the unlinked list. */ STATIC int xrep_nlinks_iunlink_remove( @@ -66,6 +95,7 @@ xrep_nlinks_repair_inode( struct xfs_inode *ip = sc->ip; uint64_t total_links; uint64_t actual_nlink; + bool orphanage_available = false; bool dirty = false; int error; @@ -77,14 +107,41 @@ xrep_nlinks_repair_inode( if (xrep_is_tempfile(ip)) return 0; - xchk_ilock(sc, XFS_IOLOCK_EXCL); + /* + * If the filesystem has an orphanage attached to the scrub context, + * prepare for a link count repair that could involve @ip being adopted + * by the lost+found. + */ + if (xrep_orphanage_can_adopt(sc)) { + error = xrep_orphanage_iolock_two(sc); + if (error) + return error; - error = xfs_trans_alloc(mp, &M_RES(mp)->tr_link, 0, 0, 0, &sc->tp); - if (error) - return error; + error = xrep_adoption_trans_alloc(sc, &xnc->adoption); + if (error) { + xchk_iunlock(sc, XFS_IOLOCK_EXCL); + xrep_orphanage_iunlock(sc, XFS_IOLOCK_EXCL); + } else { + orphanage_available = true; + } + } - xchk_ilock(sc, XFS_ILOCK_EXCL); - xfs_trans_ijoin(sc->tp, ip, 0); + /* + * Either there is no orphanage or we couldn't allocate resources for + * that kind of update. Let's try again with only the resources we + * need for a simple link count update, since that's much more common. + */ + if (!orphanage_available) { + xchk_ilock(sc, XFS_IOLOCK_EXCL); + + error = xfs_trans_alloc(mp, &M_RES(mp)->tr_link, 0, 0, 0, + &sc->tp); + if (error) + return error; + + xchk_ilock(sc, XFS_ILOCK_EXCL); + xfs_trans_ijoin(sc->tp, ip, 0); + } mutex_lock(&xnc->lock); @@ -122,6 +179,42 @@ xrep_nlinks_repair_inode( goto out_trans; } + /* + * Decide if we're going to move this file to the orphanage, and fix + * up the incore link counts if we are. + */ + if (orphanage_available && + xrep_nlinks_is_orphaned(sc, ip, actual_nlink, &obs)) { + /* Figure out what name we're going to use here. */ + error = xrep_adoption_compute_name(&xnc->adoption, + xnc->namebuf); + if (error) + goto out_trans; + + /* + * Reattach this file to the directory tree by moving it to + * the orphanage per the adoption parameters that we already + * computed. + */ + error = xrep_adoption_move(&xnc->adoption); + if (error) + goto out_trans; + + /* + * Re-read the link counts since the reparenting will have + * updated our scan info. + */ + mutex_lock(&xnc->lock); + error = xfarray_load_sparse(xnc->nlinks, ip->i_ino, &obs); + mutex_unlock(&xnc->lock); + if (error) + goto out_trans; + + total_links = xchk_nlink_total(ip, &obs); + actual_nlink = VFS_I(ip)->i_nlink; + dirty = true; + } + /* * If this inode is linked from the directory tree and on the unlinked * list, remove it from the unlinked list. @@ -165,14 +258,19 @@ xrep_nlinks_repair_inode( xfs_trans_log_inode(sc->tp, ip, XFS_ILOG_CORE); error = xrep_trans_commit(sc); - xchk_iunlock(sc, XFS_ILOCK_EXCL | XFS_IOLOCK_EXCL); - return error; + goto out_unlock; out_scanlock: mutex_unlock(&xnc->lock); out_trans: xchk_trans_cancel(sc); - xchk_iunlock(sc, XFS_ILOCK_EXCL | XFS_IOLOCK_EXCL); +out_unlock: + xchk_iunlock(sc, XFS_ILOCK_EXCL); + if (orphanage_available) { + xrep_orphanage_iunlock(sc, XFS_ILOCK_EXCL); + xrep_orphanage_iunlock(sc, XFS_IOLOCK_EXCL); + } + xchk_iunlock(sc, XFS_IOLOCK_EXCL); return error; } @@ -205,10 +303,10 @@ xrep_nlinks( /* * We need ftype for an accurate count of the number of child * subdirectory links. Child subdirectories with a back link (dotdot - * entry) but no forward link are unfixable, so we cannot repair the - * link count of the parent directory based on the back link count - * alone. Filesystems without ftype support are rare (old V4) so we - * just skip out here. + * entry) but no forward link are moved to the orphanage, so we cannot + * repair the link count of the parent directory based on the back link + * count alone. Filesystems without ftype support are rare (old V4) so + * we just skip out here. */ if (!xfs_has_ftype(sc->mp)) return -EOPNOTSUPP; diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h index e53374fa54308..7e6aba7fe5586 100644 --- a/fs/xfs/scrub/repair.h +++ b/fs/xfs/scrub/repair.h @@ -93,6 +93,7 @@ int xrep_setup_ag_refcountbt(struct xfs_scrub *sc); int xrep_setup_xattr(struct xfs_scrub *sc); int xrep_setup_directory(struct xfs_scrub *sc); int xrep_setup_parent(struct xfs_scrub *sc); +int xrep_setup_nlinks(struct xfs_scrub *sc); /* Repair setup functions */ int xrep_setup_ag_allocbt(struct xfs_scrub *sc); @@ -201,6 +202,7 @@ xrep_setup_nothing( #define xrep_setup_xattr xrep_setup_nothing #define xrep_setup_directory xrep_setup_nothing #define xrep_setup_parent xrep_setup_nothing +#define xrep_setup_nlinks xrep_setup_nothing #define xrep_setup_inode(sc, imap) ((void)0) diff --git a/fs/xfs/scrub/trace.c b/fs/xfs/scrub/trace.c index 3dd281d6d1854..b2ce7b22cad34 100644 --- a/fs/xfs/scrub/trace.c +++ b/fs/xfs/scrub/trace.c @@ -24,6 +24,7 @@ #include "scrub/xfarray.h" #include "scrub/quota.h" #include "scrub/iscan.h" +#include "scrub/orphanage.h" #include "scrub/nlinks.h" #include "scrub/fscounters.h" diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index e6cf0e0020799..618f98cddc3df 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -2637,6 +2637,32 @@ DEFINE_XREP_PARENT_SALVAGE_EVENT(xrep_dir_salvaged_parent); DEFINE_XREP_PARENT_SALVAGE_EVENT(xrep_findparent_dirent); DEFINE_XREP_PARENT_SALVAGE_EVENT(xrep_findparent_from_dcache); +TRACE_EVENT(xrep_nlinks_set_record, + TP_PROTO(struct xfs_mount *mp, xfs_ino_t ino, + const struct xchk_nlink *obs), + TP_ARGS(mp, ino, obs), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(xfs_ino_t, ino) + __field(xfs_nlink_t, parents) + __field(xfs_nlink_t, backrefs) + __field(xfs_nlink_t, children) + ), + TP_fast_assign( + __entry->dev = mp->m_super->s_dev; + __entry->ino = ino; + __entry->parents = obs->parents; + __entry->backrefs = obs->backrefs; + __entry->children = obs->children; + ), + TP_printk("dev %d:%d ino 0x%llx parents %u backrefs %u children %u", + MAJOR(__entry->dev), MINOR(__entry->dev), + __entry->ino, + __entry->parents, + __entry->backrefs, + __entry->children) +); + #endif /* IS_ENABLED(CONFIG_XFS_ONLINE_REPAIR) */ #endif /* _TRACE_XFS_SCRUB_TRACE_H */ ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] xfs: move files to orphanage instead of letting nlinks drop to zero 2024-02-27 2:32 ` [PATCH 2/3] xfs: move files to orphanage instead of letting nlinks drop to zero Darrick J. Wong @ 2024-02-28 17:23 ` Christoph Hellwig 0 siblings, 0 replies; 12+ messages in thread From: Christoph Hellwig @ 2024-02-28 17:23 UTC (permalink / raw) To: Darrick J. Wong; +Cc: linux-xfs, hch Looks good: Reviewed-by: Christoph Hellwig <hch@lst.de> ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 3/3] xfs: ensure dentry consistency when the orphanage adopts a file 2024-02-27 2:18 [PATCHSET v29.4 10/13] xfs: move orphan files to lost and found Darrick J. Wong 2024-02-27 2:32 ` [PATCH 1/3] xfs: move orphan files to the orphanage Darrick J. Wong 2024-02-27 2:32 ` [PATCH 2/3] xfs: move files to orphanage instead of letting nlinks drop to zero Darrick J. Wong @ 2024-02-27 2:32 ` Darrick J. Wong 2024-02-28 17:24 ` Christoph Hellwig 2 siblings, 1 reply; 12+ messages in thread From: Darrick J. Wong @ 2024-02-27 2:32 UTC (permalink / raw) To: djwong; +Cc: linux-xfs, hch From: Darrick J. Wong <djwong@kernel.org> When the orphanage adopts a file, that file becomes a child of the orphanage. The dentry cache may have entries for the orphanage directory and the name we've chosen, so (1) make sure we abort if the dcache has a positive entry because something's not right; and (2) invalidate and purge negative dentries if the adoption goes through. Signed-off-by: Darrick J. Wong <djwong@kernel.org> --- fs/xfs/scrub/orphanage.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++ fs/xfs/scrub/trace.h | 42 ++++++++++++++++++++++ 2 files changed, 130 insertions(+) diff --git a/fs/xfs/scrub/orphanage.c b/fs/xfs/scrub/orphanage.c index 0aedc5c70b632..e1024a7bc9e96 100644 --- a/fs/xfs/scrub/orphanage.c +++ b/fs/xfs/scrub/orphanage.c @@ -418,6 +418,87 @@ xrep_adoption_compute_name( return 0; } +/* + * Make sure the dcache does not have a positive dentry for the name we've + * chosen. The caller should have checked with the ondisk directory, so any + * discrepancy is a sign that something is seriously wrong. + */ +static int +xrep_adoption_check_dcache( + struct xrep_adoption *adopt) +{ + struct qstr qname = QSTR_INIT(adopt->xname.name, + adopt->xname.len); + struct dentry *d_orphanage, *d_child; + int error = 0; + + d_orphanage = d_find_alias(VFS_I(adopt->sc->orphanage)); + if (!d_orphanage) + return 0; + + d_child = d_hash_and_lookup(d_orphanage, &qname); + if (d_child) { + trace_xrep_adoption_check_child(adopt->sc->mp, d_child); + + if (d_is_positive(d_child)) { + ASSERT(d_is_negative(d_child)); + error = -EFSCORRUPTED; + } + + dput(d_child); + } + + dput(d_orphanage); + if (error) + return error; + + /* + * Do we need to update d_parent of the dentry for the file being + * repaired? In theory there shouldn't be one since the file had + * nonzero nlink but wasn't connected to any parent dir. + */ + d_child = d_find_alias(VFS_I(adopt->sc->ip)); + if (d_child) { + trace_xrep_adoption_check_alias(adopt->sc->mp, d_child); + ASSERT(d_child->d_parent == NULL); + + dput(d_child); + return -EFSCORRUPTED; + } + + return 0; +} + +/* + * Remove all negative dentries from the dcache. There should not be any + * positive entries, since we've maintained our lock on the orphanage + * directory. + */ +static void +xrep_adoption_zap_dcache( + struct xrep_adoption *adopt) +{ + struct qstr qname = QSTR_INIT(adopt->xname.name, + adopt->xname.len); + struct dentry *d_orphanage, *d_child; + + d_orphanage = d_find_alias(VFS_I(adopt->sc->orphanage)); + if (!d_orphanage) + return; + + d_child = d_hash_and_lookup(d_orphanage, &qname); + while (d_child != NULL) { + trace_xrep_adoption_invalidate_child(adopt->sc->mp, d_child); + + ASSERT(d_is_negative(d_child)); + d_invalidate(d_child); + dput(d_child); + d_child = d_lookup(d_orphanage, &qname); + } + + dput(d_orphanage); +} + /* * Move the current file to the orphanage under the computed name. * @@ -436,6 +517,10 @@ xrep_adoption_move( trace_xrep_adoption_reparent(sc->orphanage, &adopt->xname, sc->ip->i_ino); + error = xrep_adoption_check_dcache(adopt); + if (error) + return error; + /* Create the new name in the orphanage. */ error = xfs_dir_createname(sc->tp, sc->orphanage, xname, sc->ip->i_ino, adopt->orphanage_blkres); @@ -466,6 +551,9 @@ xrep_adoption_move( * recorded in the log. */ xfs_dir_update_hook(sc->orphanage, sc->ip, 1, xname); + + /* Remove negative dentries from the lost+found's dcache */ + xrep_adoption_zap_dcache(adopt); return 0; } diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index 618f98cddc3df..7915648012c66 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -2663,6 +2663,48 @@ TRACE_EVENT(xrep_nlinks_set_record, __entry->children) ); +DECLARE_EVENT_CLASS(xrep_dentry_class, + TP_PROTO(struct xfs_mount *mp, const struct dentry *dentry), + TP_ARGS(mp, dentry), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(unsigned int, flags) + __field(unsigned long, ino) + __field(bool, positive) + __field(unsigned long, parent_ino) + __field(unsigned int, namelen) + __dynamic_array(char, name, dentry->d_name.len) + ), + TP_fast_assign( + __entry->dev = mp->m_super->s_dev; + __entry->flags = dentry->d_flags; + __entry->positive = d_is_positive(dentry); + if (dentry->d_parent && d_inode(dentry->d_parent)) + __entry->parent_ino = d_inode(dentry->d_parent)->i_ino; + else + __entry->parent_ino = -1UL; + __entry->ino = d_inode(dentry) ? d_inode(dentry)->i_ino : 0; + __entry->namelen = dentry->d_name.len; + memcpy(__get_str(name), dentry->d_name.name, dentry->d_name.len); + ), + TP_printk("dev %d:%d flags 0x%x positive? %d parent_ino 0x%lx ino 0x%lx name '%.*s'", + MAJOR(__entry->dev), MINOR(__entry->dev), + __entry->flags, + __entry->positive, + __entry->parent_ino, + __entry->ino, + __entry->namelen, + __get_str(name)) +); +#define DEFINE_REPAIR_DENTRY_EVENT(name) \ +DEFINE_EVENT(xrep_dentry_class, name, \ + TP_PROTO(struct xfs_mount *mp, const struct dentry *dentry), \ + TP_ARGS(mp, dentry)) +DEFINE_REPAIR_DENTRY_EVENT(xrep_adoption_check_child); +DEFINE_REPAIR_DENTRY_EVENT(xrep_adoption_check_alias); +DEFINE_REPAIR_DENTRY_EVENT(xrep_adoption_check_dentry); +DEFINE_REPAIR_DENTRY_EVENT(xrep_adoption_invalidate_child); + #endif /* IS_ENABLED(CONFIG_XFS_ONLINE_REPAIR) */ #endif /* _TRACE_XFS_SCRUB_TRACE_H */ ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] xfs: ensure dentry consistency when the orphanage adopts a file 2024-02-27 2:32 ` [PATCH 3/3] xfs: ensure dentry consistency when the orphanage adopts a file Darrick J. Wong @ 2024-02-28 17:24 ` Christoph Hellwig 0 siblings, 0 replies; 12+ messages in thread From: Christoph Hellwig @ 2024-02-28 17:24 UTC (permalink / raw) To: Darrick J. Wong; +Cc: linux-xfs, hch Looks good: Reviewed-by: Christoph Hellwig <hch@lst.de> ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCHSET v30.3 10/16] xfs: move orphan files to lost and found @ 2024-04-15 23:36 Darrick J. Wong 2024-04-15 23:53 ` [PATCH 1/3] xfs: move orphan files to the orphanage Darrick J. Wong 0 siblings, 1 reply; 12+ messages in thread From: Darrick J. Wong @ 2024-04-15 23:36 UTC (permalink / raw) To: chandanbabu, djwong; +Cc: Christoph Hellwig, hch, linux-xfs Hi all, Orphaned files are defined to be files with nonzero ondisk link count but no observable parent directory. This series enables online repair to reparent orphaned files into the filesystem directory tree, and wires up this reparenting ability into the directory, file link count, and parent pointer repair functions. This is how we fix files with positive link count that are not reachable through the directory tree. This patch will also create the orphanage directory (lost+found) if it is not present. In contrast to xfs_repair, we follow e2fsck in creating the lost+found without group or other-owner access to avoid accidental disclosure of files that were previously hidden by an 0700 directory. That's silly security, but people have been known to do it. If you're going to start using this code, I strongly recommend pulling from my git trees, which are linked below. This has been running on the djcloud for months with no problems. Enjoy! Comments and questions are, as always, welcome. --D kernel git tree: https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-orphanage-6.10 --- Commits in this patchset: * xfs: move orphan files to the orphanage * xfs: move files to orphanage instead of letting nlinks drop to zero * xfs: ensure dentry consistency when the orphanage adopts a file --- .../filesystems/xfs/xfs-online-fsck-design.rst | 20 - fs/xfs/Makefile | 1 fs/xfs/scrub/dir_repair.c | 130 ++++ fs/xfs/scrub/nlinks.c | 20 + fs/xfs/scrub/nlinks.h | 7 fs/xfs/scrub/nlinks_repair.c | 123 ++++ fs/xfs/scrub/orphanage.c | 589 ++++++++++++++++++++ fs/xfs/scrub/orphanage.h | 75 +++ fs/xfs/scrub/parent_repair.c | 100 +++ fs/xfs/scrub/repair.h | 2 fs/xfs/scrub/scrub.c | 2 fs/xfs/scrub/scrub.h | 4 fs/xfs/scrub/trace.c | 1 fs/xfs/scrub/trace.h | 96 +++ fs/xfs/xfs_inode.c | 6 fs/xfs/xfs_inode.h | 1 16 files changed, 1139 insertions(+), 38 deletions(-) create mode 100644 fs/xfs/scrub/orphanage.c create mode 100644 fs/xfs/scrub/orphanage.h ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/3] xfs: move orphan files to the orphanage 2024-04-15 23:36 [PATCHSET v30.3 10/16] xfs: move orphan files to lost and found Darrick J. Wong @ 2024-04-15 23:53 ` Darrick J. Wong 0 siblings, 0 replies; 12+ messages in thread From: Darrick J. Wong @ 2024-04-15 23:53 UTC (permalink / raw) To: chandanbabu, djwong; +Cc: Christoph Hellwig, hch, linux-xfs From: Darrick J. Wong <djwong@kernel.org> When we're repairing a directory structure or fixing the dotdot entry of a subdirectory, it's possible that we won't ever find a parent for the subdirectory. When this is the case, move it to the orphanage, aka /lost+found. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> --- .../filesystems/xfs/xfs-online-fsck-design.rst | 19 + fs/xfs/Makefile | 1 fs/xfs/scrub/dir_repair.c | 130 +++++ fs/xfs/scrub/orphanage.c | 498 ++++++++++++++++++++ fs/xfs/scrub/orphanage.h | 75 +++ fs/xfs/scrub/parent_repair.c | 100 ++++ fs/xfs/scrub/scrub.c | 2 fs/xfs/scrub/scrub.h | 4 fs/xfs/scrub/trace.h | 28 + fs/xfs/xfs_inode.c | 6 fs/xfs/xfs_inode.h | 1 11 files changed, 844 insertions(+), 20 deletions(-) create mode 100644 fs/xfs/scrub/orphanage.c create mode 100644 fs/xfs/scrub/orphanage.h diff --git a/Documentation/filesystems/xfs/xfs-online-fsck-design.rst b/Documentation/filesystems/xfs/xfs-online-fsck-design.rst index 3afa1bc5f47c..37dddaaeda50 100644 --- a/Documentation/filesystems/xfs/xfs-online-fsck-design.rst +++ b/Documentation/filesystems/xfs/xfs-online-fsck-design.rst @@ -4778,14 +4778,21 @@ Orphaned files are adopted by the orphanage as follows: The ``xrep_orphanage_iolock_two`` function follows the inode locking strategy discussed earlier. -3. Call ``xrep_orphanage_compute_blkres`` and ``xrep_orphanage_compute_name`` - to compute the new name in the orphanage and the block reservation required. - -4. Use ``xrep_orphanage_adoption_prep`` to reserve resources to the repair +3. Use ``xrep_adoption_trans_alloc`` to reserve resources to the repair transaction. -5. Call ``xrep_orphanage_adopt`` to reparent the orphaned file into the lost - and found, and update the kernel dentry cache. +4. Call ``xrep_orphanage_compute_name`` to compute the new name in the + orphanage. + +5. If the adoption is going to happen, call ``xrep_adoption_reparent`` to + reparent the orphaned file into the lost and found and invalidate the dentry + cache. + +6. Call ``xrep_adoption_finish`` to commit any filesystem updates, release the + orphanage ILOCK, and clean the scrub transaction. + +7. If a runtime error happens, call ``xrep_adoption_cancel`` to release all + resources. The proposed patches are in the `orphanage adoption diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile index d48646f86563..1e23d1b3cd7b 100644 --- a/fs/xfs/Makefile +++ b/fs/xfs/Makefile @@ -205,6 +205,7 @@ xfs-y += $(addprefix scrub/, \ inode_repair.o \ newbt.o \ nlinks_repair.o \ + orphanage.o \ parent_repair.o \ rcbag_btree.o \ rcbag.o \ diff --git a/fs/xfs/scrub/dir_repair.c b/fs/xfs/scrub/dir_repair.c index 34fe720fde0e..c150b2efa2c2 100644 --- a/fs/xfs/scrub/dir_repair.c +++ b/fs/xfs/scrub/dir_repair.c @@ -42,6 +42,7 @@ #include "scrub/readdir.h" #include "scrub/reap.h" #include "scrub/findparent.h" +#include "scrub/orphanage.h" /* * Directory Repair @@ -115,12 +116,21 @@ struct xrep_dir { */ struct xrep_parent_scan_info pscan; + /* + * Context information for attaching this directory to the lost+found + * if this directory does not have a parent. + */ + struct xrep_adoption adoption; + /* How many subdirectories did we find? */ uint64_t subdirs; /* How many dirents did we find? */ unsigned int dirents; + /* Should we move this directory to the orphanage? */ + bool needs_adoption; + /* Directory entry name, plus the trailing null. */ struct xfs_name xname; unsigned char namebuf[MAXNAMELEN]; @@ -148,6 +158,10 @@ xrep_setup_directory( xchk_fsgates_enable(sc, XCHK_FSGATES_DIRENTS); + error = xrep_orphanage_try_create(sc); + if (error) + return error; + error = xrep_tempfile_create(sc, S_IFDIR); if (error) return error; @@ -1137,10 +1151,8 @@ xrep_dir_set_nlink( /* * The directory is not on the incore unlinked list, which means that * it needs to be reachable via the directory tree. Update the nlink - * with our observed link count. - * - * XXX: A subsequent patch will handle parentless directories by moving - * them to the lost and found instead of aborting the repair. + * with our observed link count. If the directory has no parent, it + * will be moved to the orphanage. */ if (!xfs_inode_on_unlinked_list(dp)) goto reset_nlink; @@ -1151,6 +1163,7 @@ xrep_dir_set_nlink( * inactivate when the last reference drops. */ if (rd->dirents == 0) { + rd->needs_adoption = false; new_nlink = 0; goto reset_nlink; } @@ -1159,7 +1172,8 @@ xrep_dir_set_nlink( * The directory is on the unlinked list and we found dirents. This * directory needs to be reachable via the directory tree. Remove the * dir from the unlinked list and update nlink with the observed link - * count. + * count. If the directory has no parent, it will be moved to the + * orphanage. */ pag = xfs_perag_get(sc->mp, XFS_INO_TO_AGNO(sc->mp, dp->i_ino)); if (!pag) { @@ -1195,12 +1209,16 @@ xrep_dir_swap( return -EFSCORRUPTED; /* - * If we never found the parent for this directory, we can't fix this - * directory. + * If we never found the parent for this directory, temporarily assign + * the root dir as the parent; we'll move this to the orphanage after + * exchanging the dir contents. We hold the ILOCK of the dir being + * repaired, so we're not worried about racy updates of dotdot. */ ASSERT(sc->ilock_flags & XFS_ILOCK_EXCL); - if (rd->pscan.parent_ino == NULLFSINO) - return -EFSCORRUPTED; + if (rd->pscan.parent_ino == NULLFSINO) { + rd->needs_adoption = true; + rd->pscan.parent_ino = rd->sc->mp->m_sb.sb_rootino; + } /* * Reset the temporary directory's '..' entry to point to the parent @@ -1358,6 +1376,91 @@ xrep_dir_setup_scan( return error; } +/* + * Move the current file to the orphanage. + * + * Caller must hold IOLOCK_EXCL on @sc->ip, and no other inode locks. Upon + * successful return, the scrub transaction will have enough extra reservation + * to make the move; it will hold IOLOCK_EXCL and ILOCK_EXCL of @sc->ip and the + * orphanage; and both inodes will be ijoined. + */ +STATIC int +xrep_dir_move_to_orphanage( + struct xrep_dir *rd) +{ + struct xfs_scrub *sc = rd->sc; + xfs_ino_t orig_parent, new_parent; + int error; + + /* + * We are about to drop the ILOCK on sc->ip to lock the orphanage and + * prepare for the adoption. Therefore, look up the old dotdot entry + * for sc->ip so that we can compare it after we re-lock sc->ip. + */ + error = xchk_dir_lookup(sc, sc->ip, &xfs_name_dotdot, &orig_parent); + if (error) + return error; + + /* + * Drop the ILOCK on the scrub target and commit the transaction. + * Adoption computes its own resource requirements and gathers the + * necessary components. + */ + error = xrep_trans_commit(sc); + if (error) + return error; + xchk_iunlock(sc, XFS_ILOCK_EXCL); + + /* If we can take the orphanage's iolock then we're ready to move. */ + if (!xrep_orphanage_ilock_nowait(sc, XFS_IOLOCK_EXCL)) { + xchk_iunlock(sc, sc->ilock_flags); + error = xrep_orphanage_iolock_two(sc); + if (error) + return error; + } + + /* Grab transaction and ILOCK the two files. */ + error = xrep_adoption_trans_alloc(sc, &rd->adoption); + if (error) + return error; + + error = xrep_adoption_compute_name(&rd->adoption, &rd->xname); + if (error) + return error; + + /* + * Now that we've reacquired the ILOCK on sc->ip, look up the dotdot + * entry again. If the parent changed or the child was unlinked while + * the child directory was unlocked, we don't need to move the child to + * the orphanage after all. + */ + error = xchk_dir_lookup(sc, sc->ip, &xfs_name_dotdot, &new_parent); + if (error) + return error; + + /* + * Attach to the orphanage if we still have a linked directory and it + * hasn't been moved. + */ + if (orig_parent == new_parent && VFS_I(sc->ip)->i_nlink > 0) { + error = xrep_adoption_move(&rd->adoption); + if (error) + return error; + } + + /* + * Launder the scrub transaction so we can drop the orphanage ILOCK + * and IOLOCK. Return holding the scrub target's ILOCK and IOLOCK. + */ + error = xrep_adoption_trans_roll(&rd->adoption); + if (error) + return error; + + xrep_orphanage_iunlock(sc, XFS_ILOCK_EXCL); + xrep_orphanage_iunlock(sc, XFS_IOLOCK_EXCL); + return 0; +} + /* * Repair the directory metadata. * @@ -1396,6 +1499,15 @@ xrep_directory( if (error) goto out_teardown; + if (rd->needs_adoption) { + if (!xrep_orphanage_can_adopt(rd->sc)) + error = -EFSCORRUPTED; + else + error = xrep_dir_move_to_orphanage(rd); + if (error) + goto out_teardown; + } + out_teardown: xrep_dir_teardown(sc); return error; diff --git a/fs/xfs/scrub/orphanage.c b/fs/xfs/scrub/orphanage.c new file mode 100644 index 000000000000..41733be3ef45 --- /dev/null +++ b/fs/xfs/scrub/orphanage.c @@ -0,0 +1,498 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (c) 2021-2024 Oracle. All Rights Reserved. + * Author: Darrick J. Wong <djwong@kernel.org> + */ +#include "xfs.h" +#include "xfs_fs.h" +#include "xfs_shared.h" +#include "xfs_format.h" +#include "xfs_trans_resv.h" +#include "xfs_mount.h" +#include "xfs_log_format.h" +#include "xfs_trans.h" +#include "xfs_inode.h" +#include "xfs_ialloc.h" +#include "xfs_quota.h" +#include "xfs_trans_space.h" +#include "xfs_dir2.h" +#include "xfs_icache.h" +#include "xfs_bmap.h" +#include "xfs_bmap_btree.h" +#include "scrub/scrub.h" +#include "scrub/common.h" +#include "scrub/repair.h" +#include "scrub/trace.h" +#include "scrub/orphanage.h" +#include "scrub/readdir.h" + +#include <linux/namei.h> + +/* + * The Orphanage + * ============= + * + * If the directory tree is damaged, children of that directory become + * inaccessible via that file path. If a child has no other parents, the file + * is said to be orphaned. xfs_repair fixes this situation by creating a + * orphanage directory (specifically, /lost+found) and creating a directory + * entry pointing to the orphaned file. + * + * Online repair follows this tactic by creating a root-owned /lost+found + * directory if one does not exist. If an orphan is found, it will move that + * files into orphanage. + */ + +/* Make the orphanage owned by root. */ +STATIC int +xrep_chown_orphanage( + struct xfs_scrub *sc, + struct xfs_inode *dp) +{ + struct xfs_trans *tp; + struct xfs_mount *mp = sc->mp; + struct xfs_dquot *udqp = NULL, *gdqp = NULL, *pdqp = NULL; + struct xfs_dquot *oldu = NULL, *oldg = NULL, *oldp = NULL; + struct inode *inode = VFS_I(dp); + int error; + + error = xfs_qm_vop_dqalloc(dp, GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, 0, + XFS_QMOPT_QUOTALL, &udqp, &gdqp, &pdqp); + if (error) + return error; + + error = xfs_trans_alloc_ichange(dp, udqp, gdqp, pdqp, true, &tp); + if (error) + goto out_dqrele; + + /* + * Always clear setuid/setgid/sticky on the orphanage since we don't + * normally want that functionality on this directory and xfs_repair + * doesn't create it this way either. Leave the other access bits + * unchanged. + */ + inode->i_mode &= ~(S_ISUID | S_ISGID | S_ISVTX); + + /* + * Change the ownerships and register quota modifications + * in the transaction. + */ + if (!uid_eq(inode->i_uid, GLOBAL_ROOT_UID)) { + if (XFS_IS_UQUOTA_ON(mp)) + oldu = xfs_qm_vop_chown(tp, dp, &dp->i_udquot, udqp); + inode->i_uid = GLOBAL_ROOT_UID; + } + if (!gid_eq(inode->i_gid, GLOBAL_ROOT_GID)) { + if (XFS_IS_GQUOTA_ON(mp)) + oldg = xfs_qm_vop_chown(tp, dp, &dp->i_gdquot, gdqp); + inode->i_gid = GLOBAL_ROOT_GID; + } + if (dp->i_projid != 0) { + if (XFS_IS_PQUOTA_ON(mp)) + oldp = xfs_qm_vop_chown(tp, dp, &dp->i_pdquot, pdqp); + dp->i_projid = 0; + } + + dp->i_diflags &= ~(XFS_DIFLAG_REALTIME | XFS_DIFLAG_RTINHERIT); + xfs_trans_log_inode(tp, dp, XFS_ILOG_CORE); + + XFS_STATS_INC(mp, xs_ig_attrchg); + + if (xfs_has_wsync(mp)) + xfs_trans_set_sync(tp); + error = xfs_trans_commit(tp); + + xfs_qm_dqrele(oldu); + xfs_qm_dqrele(oldg); + xfs_qm_dqrele(oldp); + +out_dqrele: + xfs_qm_dqrele(udqp); + xfs_qm_dqrele(gdqp); + xfs_qm_dqrele(pdqp); + return error; +} + +#define ORPHANAGE "lost+found" + +/* Create the orphanage directory, and set sc->orphanage to it. */ +int +xrep_orphanage_create( + struct xfs_scrub *sc) +{ + struct xfs_mount *mp = sc->mp; + struct dentry *root_dentry, *orphanage_dentry; + struct inode *root_inode = VFS_I(sc->mp->m_rootip); + struct inode *orphanage_inode; + int error; + + if (xfs_is_shutdown(mp)) + return -EIO; + if (xfs_is_readonly(mp)) { + sc->orphanage = NULL; + return 0; + } + + ASSERT(sc->tp == NULL); + ASSERT(sc->orphanage == NULL); + + /* Find the dentry for the root directory... */ + root_dentry = d_find_alias(root_inode); + if (!root_dentry) { + error = -EFSCORRUPTED; + goto out; + } + + /* ...which is a directory, right? */ + if (!d_is_dir(root_dentry)) { + error = -EFSCORRUPTED; + goto out_dput_root; + } + + /* Try to find the orphanage directory. */ + inode_lock_nested(root_inode, I_MUTEX_PARENT); + orphanage_dentry = lookup_one_len(ORPHANAGE, root_dentry, + strlen(ORPHANAGE)); + if (IS_ERR(orphanage_dentry)) { + error = PTR_ERR(orphanage_dentry); + goto out_unlock_root; + } + + /* + * Nothing found? Call mkdir to create the orphanage. Create the + * directory without other-user access because we're live and someone + * could have been relying partly on minimal access to a parent + * directory to control access to a file we put in here. + */ + if (d_really_is_negative(orphanage_dentry)) { + error = vfs_mkdir(&nop_mnt_idmap, root_inode, orphanage_dentry, + 0750); + if (error) + goto out_dput_orphanage; + } + + /* Not a directory? Bail out. */ + if (!d_is_dir(orphanage_dentry)) { + error = -ENOTDIR; + goto out_dput_orphanage; + } + + /* + * Grab a reference to the orphanage. This /should/ succeed since + * we hold the root directory locked and therefore nobody can delete + * the orphanage. + */ + orphanage_inode = igrab(d_inode(orphanage_dentry)); + if (!orphanage_inode) { + error = -ENOENT; + goto out_dput_orphanage; + } + + /* Make sure the orphanage is owned by root. */ + error = xrep_chown_orphanage(sc, XFS_I(orphanage_inode)); + if (error) + goto out_dput_orphanage; + + /* Stash the reference for later and bail out. */ + sc->orphanage = XFS_I(orphanage_inode); + sc->orphanage_ilock_flags = 0; + +out_dput_orphanage: + dput(orphanage_dentry); +out_unlock_root: + inode_unlock(VFS_I(sc->mp->m_rootip)); +out_dput_root: + dput(root_dentry); +out: + return error; +} + +void +xrep_orphanage_ilock( + struct xfs_scrub *sc, + unsigned int ilock_flags) +{ + sc->orphanage_ilock_flags |= ilock_flags; + xfs_ilock(sc->orphanage, ilock_flags); +} + +bool +xrep_orphanage_ilock_nowait( + struct xfs_scrub *sc, + unsigned int ilock_flags) +{ + if (xfs_ilock_nowait(sc->orphanage, ilock_flags)) { + sc->orphanage_ilock_flags |= ilock_flags; + return true; + } + + return false; +} + +void +xrep_orphanage_iunlock( + struct xfs_scrub *sc, + unsigned int ilock_flags) +{ + xfs_iunlock(sc->orphanage, ilock_flags); + sc->orphanage_ilock_flags &= ~ilock_flags; +} + +/* Grab the IOLOCK of the orphanage and sc->ip. */ +int +xrep_orphanage_iolock_two( + struct xfs_scrub *sc) +{ + int error = 0; + + while (true) { + if (xchk_should_terminate(sc, &error)) + return error; + + /* + * Normal XFS takes the IOLOCK before grabbing a transaction. + * Scrub holds a transaction, which means that we can't block + * on either IOLOCK. + */ + if (xrep_orphanage_ilock_nowait(sc, XFS_IOLOCK_EXCL)) { + if (xchk_ilock_nowait(sc, XFS_IOLOCK_EXCL)) + break; + xrep_orphanage_iunlock(sc, XFS_IOLOCK_EXCL); + } + delay(1); + } + + return 0; +} + +/* Release the orphanage. */ +void +xrep_orphanage_rele( + struct xfs_scrub *sc) +{ + if (!sc->orphanage) + return; + + if (sc->orphanage_ilock_flags) + xfs_iunlock(sc->orphanage, sc->orphanage_ilock_flags); + + xchk_irele(sc, sc->orphanage); + sc->orphanage = NULL; +} + +/* Adoption moves a file into /lost+found */ + +/* Can the orphanage adopt @sc->ip? */ +bool +xrep_orphanage_can_adopt( + struct xfs_scrub *sc) +{ + ASSERT(sc->ip != NULL); + + if (!sc->orphanage) + return false; + if (sc->ip == sc->orphanage) + return false; + if (xfs_internal_inum(sc->mp, sc->ip->i_ino)) + return false; + return true; +} + +/* + * Create a new transaction to send a child to the orphanage. + * + * Allocate a new transaction with sufficient disk space to handle the + * adoption, take ILOCK_EXCL of the orphanage and sc->ip, joins them to the + * transaction, and reserve quota to reparent the latter. Caller must hold the + * IOLOCK of the orphanage and sc->ip. + */ +int +xrep_adoption_trans_alloc( + struct xfs_scrub *sc, + struct xrep_adoption *adopt) +{ + struct xfs_mount *mp = sc->mp; + unsigned int child_blkres = 0; + int error; + + ASSERT(sc->tp == NULL); + ASSERT(sc->ip != NULL); + ASSERT(sc->orphanage != NULL); + ASSERT(sc->ilock_flags & XFS_IOLOCK_EXCL); + ASSERT(sc->orphanage_ilock_flags & XFS_IOLOCK_EXCL); + ASSERT(!(sc->ilock_flags & (XFS_ILOCK_SHARED | XFS_ILOCK_EXCL))); + ASSERT(!(sc->orphanage_ilock_flags & + (XFS_ILOCK_SHARED | XFS_ILOCK_EXCL))); + + /* Compute the worst case space reservation that we need. */ + adopt->sc = sc; + adopt->orphanage_blkres = XFS_LINK_SPACE_RES(mp, MAXNAMELEN); + if (S_ISDIR(VFS_I(sc->ip)->i_mode)) + child_blkres = XFS_RENAME_SPACE_RES(mp, xfs_name_dotdot.len); + adopt->child_blkres = child_blkres; + + /* + * Allocate a transaction to link the child into the parent, along with + * enough disk space to handle expansion of both the orphanage and the + * dotdot entry of a child directory. + */ + error = xfs_trans_alloc(mp, &M_RES(mp)->tr_link, + adopt->orphanage_blkres + adopt->child_blkres, 0, 0, + &sc->tp); + if (error) + return error; + + xfs_lock_two_inodes(sc->orphanage, XFS_ILOCK_EXCL, + sc->ip, XFS_ILOCK_EXCL); + sc->ilock_flags |= XFS_ILOCK_EXCL; + sc->orphanage_ilock_flags |= XFS_ILOCK_EXCL; + + xfs_trans_ijoin(sc->tp, sc->orphanage, 0); + xfs_trans_ijoin(sc->tp, sc->ip, 0); + + /* + * Reserve enough quota in the orphan directory to add the new name. + * Normally the orphanage should have user/group/project ids of zero + * and hence is not subject to quota enforcement, but we're allowed to + * exceed quota to reattach disconnected parts of the directory tree. + */ + error = xfs_trans_reserve_quota_nblks(sc->tp, sc->orphanage, + adopt->orphanage_blkres, 0, true); + if (error) + goto out_cancel; + + /* + * Reserve enough quota in the child directory to change dotdot. + * Here we're also allowed to exceed file quota to repair inconsistent + * metadata. + */ + if (adopt->child_blkres) { + error = xfs_trans_reserve_quota_nblks(sc->tp, sc->ip, + adopt->child_blkres, 0, true); + if (error) + goto out_cancel; + } + + return 0; +out_cancel: + xchk_trans_cancel(sc); + xrep_orphanage_iunlock(sc, XFS_ILOCK_EXCL); + xrep_orphanage_iunlock(sc, XFS_IOLOCK_EXCL); + return error; +} + +/* + * Compute the xfs_name for the directory entry that we're adding to the + * orphanage. Caller must hold ILOCKs of sc->ip and the orphanage and must not + * reuse namebuf until the adoption completes or is dissolved. + */ +int +xrep_adoption_compute_name( + struct xrep_adoption *adopt, + struct xfs_name *xname) +{ + struct xfs_scrub *sc = adopt->sc; + char *namebuf = (void *)xname->name; + xfs_ino_t ino; + unsigned int incr = 0; + int error = 0; + + adopt->xname = xname; + xname->len = snprintf(namebuf, MAXNAMELEN, "%llu", sc->ip->i_ino); + xname->type = xfs_mode_to_ftype(VFS_I(sc->ip)->i_mode); + + /* Make sure the filename is unique in the lost+found. */ + error = xchk_dir_lookup(sc, sc->orphanage, xname, &ino); + while (error == 0 && incr < 10000) { + xname->len = snprintf(namebuf, MAXNAMELEN, "%llu.%u", + sc->ip->i_ino, ++incr); + error = xchk_dir_lookup(sc, sc->orphanage, xname, &ino); + } + if (error == 0) { + /* We already have 10,000 entries in the orphanage? */ + return -EFSCORRUPTED; + } + + if (error != -ENOENT) + return error; + return 0; +} + +/* + * Move the current file to the orphanage under the computed name. + * + * Returns with a dirty transaction so that the caller can handle any other + * work, such as fixing up unlinked lists or resetting link counts. + */ +int +xrep_adoption_move( + struct xrep_adoption *adopt) +{ + struct xfs_scrub *sc = adopt->sc; + bool isdir = S_ISDIR(VFS_I(sc->ip)->i_mode); + int error; + + trace_xrep_adoption_reparent(sc->orphanage, adopt->xname, + sc->ip->i_ino); + + /* Create the new name in the orphanage. */ + error = xfs_dir_createname(sc->tp, sc->orphanage, adopt->xname, + sc->ip->i_ino, adopt->orphanage_blkres); + if (error) + return error; + + /* + * Bump the link count of the orphanage if we just added a + * subdirectory, and update its timestamps. + */ + xfs_trans_ichgtime(sc->tp, sc->orphanage, + XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG); + if (isdir) + xfs_bumplink(sc->tp, sc->orphanage); + xfs_trans_log_inode(sc->tp, sc->orphanage, XFS_ILOG_CORE); + + /* Replace the dotdot entry if the child is a subdirectory. */ + if (isdir) { + error = xfs_dir_replace(sc->tp, sc->ip, &xfs_name_dotdot, + sc->orphanage->i_ino, adopt->child_blkres); + if (error) + return error; + } + + /* + * Notify dirent hooks that we moved the file to /lost+found, and + * finish all the deferred work so that we know the adoption is fully + * recorded in the log. + */ + xfs_dir_update_hook(sc->orphanage, sc->ip, 1, adopt->xname); + return 0; +} + +/* + * Roll to a clean scrub transaction so that we can release the orphanage, + * even if xrep_adoption_move was not called. + * + * Commits all the work and deferred ops attached to an adoption request and + * rolls to a clean scrub transaction. On success, returns 0 with the scrub + * context holding a clean transaction with no inodes joined. On failure, + * returns negative errno with no scrub transaction. All inode locks are + * still held after this function returns. + */ +int +xrep_adoption_trans_roll( + struct xrep_adoption *adopt) +{ + struct xfs_scrub *sc = adopt->sc; + int error; + + trace_xrep_adoption_trans_roll(sc->orphanage, sc->ip, + !!(sc->tp->t_flags & XFS_TRANS_DIRTY)); + + /* Finish all the deferred ops to commit all repairs. */ + error = xrep_defer_finish(sc); + if (error) + return error; + + /* Roll the transaction once more to detach the inodes. */ + return xfs_trans_roll(&sc->tp); +} diff --git a/fs/xfs/scrub/orphanage.h b/fs/xfs/scrub/orphanage.h new file mode 100644 index 000000000000..319179ab788d --- /dev/null +++ b/fs/xfs/scrub/orphanage.h @@ -0,0 +1,75 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (c) 2021-2024 Oracle. All Rights Reserved. + * Author: Darrick J. Wong <djwong@kernel.org> + */ +#ifndef __XFS_SCRUB_ORPHANAGE_H__ +#define __XFS_SCRUB_ORPHANAGE_H__ + +#ifdef CONFIG_XFS_ONLINE_REPAIR +int xrep_orphanage_create(struct xfs_scrub *sc); + +/* + * If we're doing a repair, ensure that the orphanage exists and attach it to + * the scrub context. + */ +static inline int +xrep_orphanage_try_create( + struct xfs_scrub *sc) +{ + int error; + + ASSERT(sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR); + + error = xrep_orphanage_create(sc); + switch (error) { + case 0: + case -ENOENT: + case -ENOTDIR: + case -ENOSPC: + /* + * If the orphanage can't be found or isn't a directory, we'll + * keep going, but we won't be able to attach the file to the + * orphanage if we can't find the parent. + */ + return 0; + } + + return error; +} + +int xrep_orphanage_iolock_two(struct xfs_scrub *sc); + +void xrep_orphanage_ilock(struct xfs_scrub *sc, unsigned int ilock_flags); +bool xrep_orphanage_ilock_nowait(struct xfs_scrub *sc, + unsigned int ilock_flags); +void xrep_orphanage_iunlock(struct xfs_scrub *sc, unsigned int ilock_flags); + +void xrep_orphanage_rele(struct xfs_scrub *sc); + +/* Information about a request to add a file to the orphanage. */ +struct xrep_adoption { + struct xfs_scrub *sc; + + /* Name used for the adoption. */ + struct xfs_name *xname; + + /* Block reservations for orphanage and child (if directory). */ + unsigned int orphanage_blkres; + unsigned int child_blkres; +}; + +bool xrep_orphanage_can_adopt(struct xfs_scrub *sc); + +int xrep_adoption_trans_alloc(struct xfs_scrub *sc, + struct xrep_adoption *adopt); +int xrep_adoption_compute_name(struct xrep_adoption *adopt, + struct xfs_name *xname); +int xrep_adoption_move(struct xrep_adoption *adopt); +int xrep_adoption_trans_roll(struct xrep_adoption *adopt); +#else +struct xrep_adoption { /* empty */ }; +# define xrep_orphanage_rele(sc) ((void)0) +#endif /* CONFIG_XFS_ONLINE_REPAIR */ + +#endif /* __XFS_SCRUB_ORPHANAGE_H__ */ diff --git a/fs/xfs/scrub/parent_repair.c b/fs/xfs/scrub/parent_repair.c index 826926c2bb0d..ebb5791bf839 100644 --- a/fs/xfs/scrub/parent_repair.c +++ b/fs/xfs/scrub/parent_repair.c @@ -32,6 +32,8 @@ #include "scrub/iscan.h" #include "scrub/findparent.h" #include "scrub/readdir.h" +#include "scrub/tempfile.h" +#include "scrub/orphanage.h" /* * Repairing The Directory Parent Pointer @@ -57,6 +59,13 @@ struct xrep_parent { * dotdot entry for this directory. */ struct xrep_parent_scan_info pscan; + + /* Orphanage reparenting request. */ + struct xrep_adoption adoption; + + /* Directory entry name, plus the trailing null. */ + struct xfs_name xname; + unsigned char namebuf[MAXNAMELEN]; }; /* Tear down all the incore stuff we created. */ @@ -80,9 +89,10 @@ xrep_setup_parent( if (!rp) return -ENOMEM; rp->sc = sc; + rp->xname.name = rp->namebuf; sc->buf = rp; - return 0; + return xrep_orphanage_try_create(sc); } /* @@ -179,6 +189,91 @@ xrep_parent_reset_dotdot( return xfs_trans_roll(&sc->tp); } +/* + * Move the current file to the orphanage. + * + * Caller must hold IOLOCK_EXCL on @sc->ip, and no other inode locks. Upon + * successful return, the scrub transaction will have enough extra reservation + * to make the move; it will hold IOLOCK_EXCL and ILOCK_EXCL of @sc->ip and the + * orphanage; and both inodes will be ijoined. + */ +STATIC int +xrep_parent_move_to_orphanage( + struct xrep_parent *rp) +{ + struct xfs_scrub *sc = rp->sc; + xfs_ino_t orig_parent, new_parent; + int error; + + /* + * We are about to drop the ILOCK on sc->ip to lock the orphanage and + * prepare for the adoption. Therefore, look up the old dotdot entry + * for sc->ip so that we can compare it after we re-lock sc->ip. + */ + error = xchk_dir_lookup(sc, sc->ip, &xfs_name_dotdot, &orig_parent); + if (error) + return error; + + /* + * Drop the ILOCK on the scrub target and commit the transaction. + * Adoption computes its own resource requirements and gathers the + * necessary components. + */ + error = xrep_trans_commit(sc); + if (error) + return error; + xchk_iunlock(sc, XFS_ILOCK_EXCL); + + /* If we can take the orphanage's iolock then we're ready to move. */ + if (!xrep_orphanage_ilock_nowait(sc, XFS_IOLOCK_EXCL)) { + xchk_iunlock(sc, sc->ilock_flags); + error = xrep_orphanage_iolock_two(sc); + if (error) + return error; + } + + /* Grab transaction and ILOCK the two files. */ + error = xrep_adoption_trans_alloc(sc, &rp->adoption); + if (error) + return error; + + error = xrep_adoption_compute_name(&rp->adoption, &rp->xname); + if (error) + return error; + + /* + * Now that we've reacquired the ILOCK on sc->ip, look up the dotdot + * entry again. If the parent changed or the child was unlinked while + * the child directory was unlocked, we don't need to move the child to + * the orphanage after all. + */ + error = xchk_dir_lookup(sc, sc->ip, &xfs_name_dotdot, &new_parent); + if (error) + return error; + + /* + * Attach to the orphanage if we still have a linked directory and it + * hasn't been moved. + */ + if (orig_parent == new_parent && VFS_I(sc->ip)->i_nlink > 0) { + error = xrep_adoption_move(&rp->adoption); + if (error) + return error; + } + + /* + * Launder the scrub transaction so we can drop the orphanage ILOCK + * and IOLOCK. Return holding the scrub target's ILOCK and IOLOCK. + */ + error = xrep_adoption_trans_roll(&rp->adoption); + if (error) + return error; + + xrep_orphanage_iunlock(sc, XFS_ILOCK_EXCL); + xrep_orphanage_iunlock(sc, XFS_IOLOCK_EXCL); + return 0; +} + /* * Commit the new parent pointer structure (currently only the dotdot entry) to * the file that we're repairing. @@ -188,7 +283,8 @@ xrep_parent_rebuild_tree( struct xrep_parent *rp) { if (rp->pscan.parent_ino == NULLFSINO) { - /* Cannot fix orphaned directories yet. */ + if (xrep_orphanage_can_adopt(rp->sc)) + return xrep_parent_move_to_orphanage(rp); return -EFSCORRUPTED; } diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c index 520d83db193c..6417628ce26b 100644 --- a/fs/xfs/scrub/scrub.c +++ b/fs/xfs/scrub/scrub.c @@ -27,6 +27,7 @@ #include "scrub/stats.h" #include "scrub/xfile.h" #include "scrub/tempfile.h" +#include "scrub/orphanage.h" /* * Online Scrub and Repair @@ -217,6 +218,7 @@ xchk_teardown( } xrep_tempfile_rele(sc); + xrep_orphanage_rele(sc); xchk_fsgates_disable(sc); return error; } diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h index d38f0b30416c..7abe498f7a46 100644 --- a/fs/xfs/scrub/scrub.h +++ b/fs/xfs/scrub/scrub.h @@ -105,6 +105,10 @@ struct xfs_scrub { /* Lock flags for @ip. */ uint ilock_flags; + /* The orphanage, for stashing files that have lost their parent. */ + uint orphanage_ilock_flags; + struct xfs_inode *orphanage; + /* A temporary file on this filesystem, for staging new metadata. */ struct xfs_inode *tempip; uint temp_ilock_flags; diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index d68ec8e2781e..7c49aa6f6b8d 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -2588,6 +2588,34 @@ DEFINE_EVENT(xrep_dirent_class, name, \ DEFINE_XREP_DIRENT_EVENT(xrep_dir_salvage_entry); DEFINE_XREP_DIRENT_EVENT(xrep_dir_stash_createname); DEFINE_XREP_DIRENT_EVENT(xrep_dir_replay_createname); +DEFINE_XREP_DIRENT_EVENT(xrep_adoption_reparent); + +DECLARE_EVENT_CLASS(xrep_adoption_class, + TP_PROTO(struct xfs_inode *dp, struct xfs_inode *ip, bool moved), + TP_ARGS(dp, ip, moved), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(xfs_ino_t, dir_ino) + __field(xfs_ino_t, child_ino) + __field(bool, moved) + ), + TP_fast_assign( + __entry->dev = dp->i_mount->m_super->s_dev; + __entry->dir_ino = dp->i_ino; + __entry->child_ino = ip->i_ino; + __entry->moved = moved; + ), + TP_printk("dev %d:%d dir 0x%llx child 0x%llx moved? %d", + MAJOR(__entry->dev), MINOR(__entry->dev), + __entry->dir_ino, + __entry->child_ino, + __entry->moved) +); +#define DEFINE_XREP_ADOPTION_EVENT(name) \ +DEFINE_EVENT(xrep_adoption_class, name, \ + TP_PROTO(struct xfs_inode *dp, struct xfs_inode *ip, bool moved), \ + TP_ARGS(dp, ip, moved)) +DEFINE_XREP_ADOPTION_EVENT(xrep_adoption_trans_roll); DECLARE_EVENT_CLASS(xrep_parent_salvage_class, TP_PROTO(struct xfs_inode *dp, xfs_ino_t ino), diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index 09d643a9e997..803a64687014 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -914,10 +914,10 @@ xfs_droplink( /* * Increment the link count on an inode & log the change. */ -static void +void xfs_bumplink( - xfs_trans_t *tp, - xfs_inode_t *ip) + struct xfs_trans *tp, + struct xfs_inode *ip) { xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG); diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h index 8157ae7f8e59..18bc3d7750a0 100644 --- a/fs/xfs/xfs_inode.h +++ b/fs/xfs/xfs_inode.h @@ -625,6 +625,7 @@ void xfs_end_io(struct work_struct *work); int xfs_ilock2_io_mmap(struct xfs_inode *ip1, struct xfs_inode *ip2); void xfs_iunlock2_io_mmap(struct xfs_inode *ip1, struct xfs_inode *ip2); void xfs_iunlock2_remapping(struct xfs_inode *ip1, struct xfs_inode *ip2); +void xfs_bumplink(struct xfs_trans *tp, struct xfs_inode *ip); static inline bool xfs_inode_unlinked_incomplete( ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCHSET v30.1 11/15] xfs: move orphan files to lost and found @ 2024-03-27 1:49 Darrick J. Wong 2024-03-27 2:05 ` [PATCH 1/3] xfs: move orphan files to the orphanage Darrick J. Wong 0 siblings, 1 reply; 12+ messages in thread From: Darrick J. Wong @ 2024-03-27 1:49 UTC (permalink / raw) To: djwong; +Cc: Christoph Hellwig, hch, linux-xfs Hi all, Orphaned files are defined to be files with nonzero ondisk link count but no observable parent directory. This series enables online repair to reparent orphaned files into the filesystem directory tree, and wires up this reparenting ability into the directory, file link count, and parent pointer repair functions. This is how we fix files with positive link count that are not reachable through the directory tree. This patch will also create the orphanage directory (lost+found) if it is not present. In contrast to xfs_repair, we follow e2fsck in creating the lost+found without group or other-owner access to avoid accidental disclosure of files that were previously hidden by an 0700 directory. That's silly security, but people have been known to do it. If you're going to start using this code, I strongly recommend pulling from my git trees, which are linked below. This has been running on the djcloud for months with no problems. Enjoy! Comments and questions are, as always, welcome. --D kernel git tree: https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-orphanage --- Commits in this patchset: * xfs: move orphan files to the orphanage * xfs: move files to orphanage instead of letting nlinks drop to zero * xfs: ensure dentry consistency when the orphanage adopts a file --- .../filesystems/xfs/xfs-online-fsck-design.rst | 20 - fs/xfs/Makefile | 1 fs/xfs/scrub/dir_repair.c | 130 ++++ fs/xfs/scrub/nlinks.c | 11 fs/xfs/scrub/nlinks.h | 6 fs/xfs/scrub/nlinks_repair.c | 124 ++++ fs/xfs/scrub/orphanage.c | 587 ++++++++++++++++++++ fs/xfs/scrub/orphanage.h | 75 +++ fs/xfs/scrub/parent_repair.c | 98 +++ fs/xfs/scrub/repair.h | 2 fs/xfs/scrub/scrub.c | 2 fs/xfs/scrub/scrub.h | 4 fs/xfs/scrub/trace.c | 1 fs/xfs/scrub/trace.h | 96 +++ fs/xfs/xfs_inode.c | 6 fs/xfs/xfs_inode.h | 1 16 files changed, 1130 insertions(+), 34 deletions(-) create mode 100644 fs/xfs/scrub/orphanage.c create mode 100644 fs/xfs/scrub/orphanage.h ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/3] xfs: move orphan files to the orphanage 2024-03-27 1:49 [PATCHSET v30.1 11/15] xfs: move orphan files to lost and found Darrick J. Wong @ 2024-03-27 2:05 ` Darrick J. Wong 0 siblings, 0 replies; 12+ messages in thread From: Darrick J. Wong @ 2024-03-27 2:05 UTC (permalink / raw) To: djwong; +Cc: Christoph Hellwig, hch, linux-xfs From: Darrick J. Wong <djwong@kernel.org> When we're repairing a directory structure or fixing the dotdot entry of a subdirectory, it's possible that we won't ever find a parent for the subdirectory. When this is the case, move it to the orphanage, aka /lost+found. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> --- .../filesystems/xfs/xfs-online-fsck-design.rst | 19 + fs/xfs/Makefile | 1 fs/xfs/scrub/dir_repair.c | 130 +++++ fs/xfs/scrub/orphanage.c | 499 ++++++++++++++++++++ fs/xfs/scrub/orphanage.h | 75 +++ fs/xfs/scrub/parent_repair.c | 98 ++++ fs/xfs/scrub/scrub.c | 2 fs/xfs/scrub/scrub.h | 4 fs/xfs/scrub/trace.h | 28 + fs/xfs/xfs_inode.c | 6 fs/xfs/xfs_inode.h | 1 11 files changed, 843 insertions(+), 20 deletions(-) create mode 100644 fs/xfs/scrub/orphanage.c create mode 100644 fs/xfs/scrub/orphanage.h diff --git a/Documentation/filesystems/xfs/xfs-online-fsck-design.rst b/Documentation/filesystems/xfs/xfs-online-fsck-design.rst index f72e1ed2d0e5f..74069f692dfd1 100644 --- a/Documentation/filesystems/xfs/xfs-online-fsck-design.rst +++ b/Documentation/filesystems/xfs/xfs-online-fsck-design.rst @@ -4778,14 +4778,21 @@ Orphaned files are adopted by the orphanage as follows: The ``xrep_orphanage_iolock_two`` function follows the inode locking strategy discussed earlier. -3. Call ``xrep_orphanage_compute_blkres`` and ``xrep_orphanage_compute_name`` - to compute the new name in the orphanage and the block reservation required. - -4. Use ``xrep_orphanage_adoption_prep`` to reserve resources to the repair +3. Use ``xrep_adoption_trans_alloc`` to reserve resources to the repair transaction. -5. Call ``xrep_orphanage_adopt`` to reparent the orphaned file into the lost - and found, and update the kernel dentry cache. +4. Call ``xrep_orphanage_compute_name`` to compute the new name in the + orphanage. + +5. If the adoption is going to happen, call ``xrep_adoption_reparent`` to + reparent the orphaned file into the lost and found and invalidate the dentry + cache. + +6. Call ``xrep_adoption_finish`` to commit any filesystem updates, release the + orphanage ILOCK, and clean the scrub transaction. + +7. If a runtime error happens, call ``xrep_adoption_cancel`` to release all + resources. The proposed patches are in the `orphanage adoption diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile index d48646f86563f..1e23d1b3cd7b1 100644 --- a/fs/xfs/Makefile +++ b/fs/xfs/Makefile @@ -205,6 +205,7 @@ xfs-y += $(addprefix scrub/, \ inode_repair.o \ newbt.o \ nlinks_repair.o \ + orphanage.o \ parent_repair.o \ rcbag_btree.o \ rcbag.o \ diff --git a/fs/xfs/scrub/dir_repair.c b/fs/xfs/scrub/dir_repair.c index bbd953017bf49..983d6519dc3f6 100644 --- a/fs/xfs/scrub/dir_repair.c +++ b/fs/xfs/scrub/dir_repair.c @@ -42,6 +42,7 @@ #include "scrub/readdir.h" #include "scrub/reap.h" #include "scrub/findparent.h" +#include "scrub/orphanage.h" /* * Directory Repair @@ -115,12 +116,21 @@ struct xrep_dir { */ struct xrep_parent_scan_info pscan; + /* + * Context information for attaching this directory to the lost+found + * if this directory does not have a parent. + */ + struct xrep_adoption adoption; + /* How many subdirectories did we find? */ uint64_t subdirs; /* How many dirents did we find? */ unsigned int dirents; + /* Should we move this directory to the orphanage? */ + bool needs_adoption; + /* Directory entry name, plus the trailing null. */ unsigned char namebuf[MAXNAMELEN]; }; @@ -147,6 +157,10 @@ xrep_setup_directory( xchk_fsgates_enable(sc, XCHK_FSGATES_DIRENTS); + error = xrep_orphanage_try_create(sc); + if (error) + return error; + error = xrep_tempfile_create(sc, S_IFDIR); if (error) return error; @@ -1139,10 +1153,8 @@ xrep_dir_set_nlink( /* * The directory is not on the incore unlinked list, which means that * it needs to be reachable via the directory tree. Update the nlink - * with our observed link count. - * - * XXX: A subsequent patch will handle parentless directories by moving - * them to the lost and found instead of aborting the repair. + * with our observed link count. If the directory has no parent, it + * will be moved to the orphanage. */ if (!xfs_inode_on_unlinked_list(dp)) goto reset_nlink; @@ -1153,6 +1165,7 @@ xrep_dir_set_nlink( * inactivate when the last reference drops. */ if (rd->dirents == 0) { + rd->needs_adoption = false; new_nlink = 0; goto reset_nlink; } @@ -1161,7 +1174,8 @@ xrep_dir_set_nlink( * The directory is on the unlinked list and we found dirents. This * directory needs to be reachable via the directory tree. Remove the * dir from the unlinked list and update nlink with the observed link - * count. + * count. If the directory has no parent, it will be moved to the + * orphanage. */ pag = xfs_perag_get(sc->mp, XFS_INO_TO_AGNO(sc->mp, dp->i_ino)); if (!pag) { @@ -1197,12 +1211,16 @@ xrep_dir_swap( return -EFSCORRUPTED; /* - * If we never found the parent for this directory, we can't fix this - * directory. + * If we never found the parent for this directory, temporarily assign + * the root dir as the parent; we'll move this to the orphanage after + * exchanging the dir contents. We hold the ILOCK of the dir being + * repaired, so we're not worried about racy updates of dotdot. */ ASSERT(sc->ilock_flags & XFS_ILOCK_EXCL); - if (rd->pscan.parent_ino == NULLFSINO) - return -EFSCORRUPTED; + if (rd->pscan.parent_ino == NULLFSINO) { + rd->needs_adoption = true; + rd->pscan.parent_ino = rd->sc->mp->m_sb.sb_rootino; + } /* * Reset the temporary directory's '..' entry to point to the parent @@ -1360,6 +1378,91 @@ xrep_dir_setup_scan( return error; } +/* + * Move the current file to the orphanage. + * + * Caller must hold IOLOCK_EXCL on @sc->ip, and no other inode locks. Upon + * successful return, the scrub transaction will have enough extra reservation + * to make the move; it will hold IOLOCK_EXCL and ILOCK_EXCL of @sc->ip and the + * orphanage; and both inodes will be ijoined. + */ +STATIC int +xrep_dir_move_to_orphanage( + struct xrep_dir *rd) +{ + struct xfs_scrub *sc = rd->sc; + xfs_ino_t orig_parent, new_parent; + int error; + + /* + * We are about to drop the ILOCK on sc->ip to lock the orphanage and + * prepare for the adoption. Therefore, look up the old dotdot entry + * for sc->ip so that we can compare it after we re-lock sc->ip. + */ + error = xchk_dir_lookup(sc, sc->ip, &xfs_name_dotdot, &orig_parent); + if (error) + return error; + + /* + * Drop the ILOCK on the scrub target and commit the transaction. + * Adoption computes its own resource requirements and gathers the + * necessary components. + */ + error = xrep_trans_commit(sc); + if (error) + return error; + xchk_iunlock(sc, XFS_ILOCK_EXCL); + + /* If we can take the orphanage's iolock then we're ready to move. */ + if (!xrep_orphanage_ilock_nowait(sc, XFS_IOLOCK_EXCL)) { + xchk_iunlock(sc, sc->ilock_flags); + error = xrep_orphanage_iolock_two(sc); + if (error) + return error; + } + + /* Grab transaction and ILOCK the two files. */ + error = xrep_adoption_trans_alloc(sc, &rd->adoption); + if (error) + return error; + + error = xrep_adoption_compute_name(&rd->adoption, rd->namebuf); + if (error) + return error; + + /* + * Now that we've reacquired the ILOCK on sc->ip, look up the dotdot + * entry again. If the parent changed or the child was unlinked while + * the child directory was unlocked, we don't need to move the child to + * the orphanage after all. + */ + error = xchk_dir_lookup(sc, sc->ip, &xfs_name_dotdot, &new_parent); + if (error) + return error; + + /* + * Attach to the orphanage if we still have a linked directory and it + * hasn't been moved. + */ + if (orig_parent == new_parent && VFS_I(sc->ip)->i_nlink > 0) { + error = xrep_adoption_move(&rd->adoption); + if (error) + return error; + } + + /* + * Launder the scrub transaction so we can drop the orphanage ILOCK + * and IOLOCK. Return holding the scrub target's ILOCK and IOLOCK. + */ + error = xrep_adoption_trans_roll(&rd->adoption); + if (error) + return error; + + xrep_orphanage_iunlock(sc, XFS_ILOCK_EXCL); + xrep_orphanage_iunlock(sc, XFS_IOLOCK_EXCL); + return 0; +} + /* * Repair the directory metadata. * @@ -1398,6 +1501,15 @@ xrep_directory( if (error) goto out_teardown; + if (rd->needs_adoption) { + if (!xrep_orphanage_can_adopt(rd->sc)) + error = -EFSCORRUPTED; + else + error = xrep_dir_move_to_orphanage(rd); + if (error) + goto out_teardown; + } + out_teardown: xrep_dir_teardown(sc); return error; diff --git a/fs/xfs/scrub/orphanage.c b/fs/xfs/scrub/orphanage.c new file mode 100644 index 0000000000000..0aedc5c70b632 --- /dev/null +++ b/fs/xfs/scrub/orphanage.c @@ -0,0 +1,499 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (c) 2021-2024 Oracle. All Rights Reserved. + * Author: Darrick J. Wong <djwong@kernel.org> + */ +#include "xfs.h" +#include "xfs_fs.h" +#include "xfs_shared.h" +#include "xfs_format.h" +#include "xfs_trans_resv.h" +#include "xfs_mount.h" +#include "xfs_log_format.h" +#include "xfs_trans.h" +#include "xfs_inode.h" +#include "xfs_ialloc.h" +#include "xfs_quota.h" +#include "xfs_trans_space.h" +#include "xfs_dir2.h" +#include "xfs_icache.h" +#include "xfs_bmap.h" +#include "xfs_bmap_btree.h" +#include "scrub/scrub.h" +#include "scrub/common.h" +#include "scrub/repair.h" +#include "scrub/trace.h" +#include "scrub/orphanage.h" +#include "scrub/readdir.h" + +#include <linux/namei.h> + +/* + * The Orphanage + * ============= + * + * If the directory tree is damaged, children of that directory become + * inaccessible via that file path. If a child has no other parents, the file + * is said to be orphaned. xfs_repair fixes this situation by creating a + * orphanage directory (specifically, /lost+found) and creating a directory + * entry pointing to the orphaned file. + * + * Online repair follows this tactic by creating a root-owned /lost+found + * directory if one does not exist. If an orphan is found, it will move that + * files into orphanage. + */ + +/* Make the orphanage owned by root. */ +STATIC int +xrep_chown_orphanage( + struct xfs_scrub *sc, + struct xfs_inode *dp) +{ + struct xfs_trans *tp; + struct xfs_mount *mp = sc->mp; + struct xfs_dquot *udqp = NULL, *gdqp = NULL, *pdqp = NULL; + struct xfs_dquot *oldu = NULL, *oldg = NULL, *oldp = NULL; + struct inode *inode = VFS_I(dp); + int error; + + error = xfs_qm_vop_dqalloc(dp, GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, 0, + XFS_QMOPT_QUOTALL, &udqp, &gdqp, &pdqp); + if (error) + return error; + + error = xfs_trans_alloc_ichange(dp, udqp, gdqp, pdqp, true, &tp); + if (error) + goto out_dqrele; + + /* + * Always clear setuid/setgid/sticky on the orphanage since we don't + * normally want that functionality on this directory and xfs_repair + * doesn't create it this way either. Leave the other access bits + * unchanged. + */ + inode->i_mode &= ~(S_ISUID | S_ISGID | S_ISVTX); + + /* + * Change the ownerships and register quota modifications + * in the transaction. + */ + if (!uid_eq(inode->i_uid, GLOBAL_ROOT_UID)) { + if (XFS_IS_UQUOTA_ON(mp)) + oldu = xfs_qm_vop_chown(tp, dp, &dp->i_udquot, udqp); + inode->i_uid = GLOBAL_ROOT_UID; + } + if (!gid_eq(inode->i_gid, GLOBAL_ROOT_GID)) { + if (XFS_IS_GQUOTA_ON(mp)) + oldg = xfs_qm_vop_chown(tp, dp, &dp->i_gdquot, gdqp); + inode->i_gid = GLOBAL_ROOT_GID; + } + if (dp->i_projid != 0) { + if (XFS_IS_PQUOTA_ON(mp)) + oldp = xfs_qm_vop_chown(tp, dp, &dp->i_pdquot, pdqp); + dp->i_projid = 0; + } + + dp->i_diflags &= ~(XFS_DIFLAG_REALTIME | XFS_DIFLAG_RTINHERIT); + xfs_trans_log_inode(tp, dp, XFS_ILOG_CORE); + + XFS_STATS_INC(mp, xs_ig_attrchg); + + if (xfs_has_wsync(mp)) + xfs_trans_set_sync(tp); + error = xfs_trans_commit(tp); + + xfs_qm_dqrele(oldu); + xfs_qm_dqrele(oldg); + xfs_qm_dqrele(oldp); + +out_dqrele: + xfs_qm_dqrele(udqp); + xfs_qm_dqrele(gdqp); + xfs_qm_dqrele(pdqp); + return error; +} + +#define ORPHANAGE "lost+found" + +/* Create the orphanage directory, and set sc->orphanage to it. */ +int +xrep_orphanage_create( + struct xfs_scrub *sc) +{ + struct xfs_mount *mp = sc->mp; + struct dentry *root_dentry, *orphanage_dentry; + struct inode *root_inode = VFS_I(sc->mp->m_rootip); + struct inode *orphanage_inode; + int error; + + if (xfs_is_shutdown(mp)) + return -EIO; + if (xfs_is_readonly(mp)) { + sc->orphanage = NULL; + return 0; + } + + ASSERT(sc->tp == NULL); + ASSERT(sc->orphanage == NULL); + + /* Find the dentry for the root directory... */ + root_dentry = d_find_alias(root_inode); + if (!root_dentry) { + error = -EFSCORRUPTED; + goto out; + } + + /* ...which is a directory, right? */ + if (!d_is_dir(root_dentry)) { + error = -EFSCORRUPTED; + goto out_dput_root; + } + + /* Try to find the orphanage directory. */ + inode_lock_nested(root_inode, I_MUTEX_PARENT); + orphanage_dentry = lookup_one_len(ORPHANAGE, root_dentry, + strlen(ORPHANAGE)); + if (IS_ERR(orphanage_dentry)) { + error = PTR_ERR(orphanage_dentry); + goto out_unlock_root; + } + + /* + * Nothing found? Call mkdir to create the orphanage. Create the + * directory without other-user access because we're live and someone + * could have been relying partly on minimal access to a parent + * directory to control access to a file we put in here. + */ + if (d_really_is_negative(orphanage_dentry)) { + error = vfs_mkdir(&nop_mnt_idmap, root_inode, orphanage_dentry, + 0750); + if (error) + goto out_dput_orphanage; + } + + /* Not a directory? Bail out. */ + if (!d_is_dir(orphanage_dentry)) { + error = -ENOTDIR; + goto out_dput_orphanage; + } + + /* + * Grab a reference to the orphanage. This /should/ succeed since + * we hold the root directory locked and therefore nobody can delete + * the orphanage. + */ + orphanage_inode = igrab(d_inode(orphanage_dentry)); + if (!orphanage_inode) { + error = -ENOENT; + goto out_dput_orphanage; + } + + /* Make sure the orphanage is owned by root. */ + error = xrep_chown_orphanage(sc, XFS_I(orphanage_inode)); + if (error) + goto out_dput_orphanage; + + /* Stash the reference for later and bail out. */ + sc->orphanage = XFS_I(orphanage_inode); + sc->orphanage_ilock_flags = 0; + +out_dput_orphanage: + dput(orphanage_dentry); +out_unlock_root: + inode_unlock(VFS_I(sc->mp->m_rootip)); +out_dput_root: + dput(root_dentry); +out: + return error; +} + +void +xrep_orphanage_ilock( + struct xfs_scrub *sc, + unsigned int ilock_flags) +{ + sc->orphanage_ilock_flags |= ilock_flags; + xfs_ilock(sc->orphanage, ilock_flags); +} + +bool +xrep_orphanage_ilock_nowait( + struct xfs_scrub *sc, + unsigned int ilock_flags) +{ + if (xfs_ilock_nowait(sc->orphanage, ilock_flags)) { + sc->orphanage_ilock_flags |= ilock_flags; + return true; + } + + return false; +} + +void +xrep_orphanage_iunlock( + struct xfs_scrub *sc, + unsigned int ilock_flags) +{ + xfs_iunlock(sc->orphanage, ilock_flags); + sc->orphanage_ilock_flags &= ~ilock_flags; +} + +/* Grab the IOLOCK of the orphanage and sc->ip. */ +int +xrep_orphanage_iolock_two( + struct xfs_scrub *sc) +{ + int error = 0; + + while (true) { + if (xchk_should_terminate(sc, &error)) + return error; + + /* + * Normal XFS takes the IOLOCK before grabbing a transaction. + * Scrub holds a transaction, which means that we can't block + * on either IOLOCK. + */ + if (xrep_orphanage_ilock_nowait(sc, XFS_IOLOCK_EXCL)) { + if (xchk_ilock_nowait(sc, XFS_IOLOCK_EXCL)) + break; + xrep_orphanage_iunlock(sc, XFS_IOLOCK_EXCL); + } + delay(1); + } + + return 0; +} + +/* Release the orphanage. */ +void +xrep_orphanage_rele( + struct xfs_scrub *sc) +{ + if (!sc->orphanage) + return; + + if (sc->orphanage_ilock_flags) + xfs_iunlock(sc->orphanage, sc->orphanage_ilock_flags); + + xchk_irele(sc, sc->orphanage); + sc->orphanage = NULL; +} + +/* Adoption moves a file into /lost+found */ + +/* Can the orphanage adopt @sc->ip? */ +bool +xrep_orphanage_can_adopt( + struct xfs_scrub *sc) +{ + ASSERT(sc->ip != NULL); + + if (!sc->orphanage) + return false; + if (sc->ip == sc->orphanage) + return false; + if (xfs_internal_inum(sc->mp, sc->ip->i_ino)) + return false; + return true; +} + +/* + * Create a new transaction to send a child to the orphanage. + * + * Allocate a new transaction with sufficient disk space to handle the + * adoption, take ILOCK_EXCL of the orphanage and sc->ip, joins them to the + * transaction, and reserve quota to reparent the latter. Caller must hold the + * IOLOCK of the orphanage and sc->ip. + */ +int +xrep_adoption_trans_alloc( + struct xfs_scrub *sc, + struct xrep_adoption *adopt) +{ + struct xfs_mount *mp = sc->mp; + unsigned int child_blkres = 0; + int error; + + ASSERT(sc->tp == NULL); + ASSERT(sc->ip != NULL); + ASSERT(sc->orphanage != NULL); + ASSERT(sc->ilock_flags & XFS_IOLOCK_EXCL); + ASSERT(sc->orphanage_ilock_flags & XFS_IOLOCK_EXCL); + ASSERT(!(sc->ilock_flags & (XFS_ILOCK_SHARED | XFS_ILOCK_EXCL))); + ASSERT(!(sc->orphanage_ilock_flags & + (XFS_ILOCK_SHARED | XFS_ILOCK_EXCL))); + + /* Compute the worst case space reservation that we need. */ + adopt->sc = sc; + adopt->orphanage_blkres = XFS_LINK_SPACE_RES(mp, MAXNAMELEN); + if (S_ISDIR(VFS_I(sc->ip)->i_mode)) + child_blkres = XFS_RENAME_SPACE_RES(mp, xfs_name_dotdot.len); + adopt->child_blkres = child_blkres; + + /* + * Allocate a transaction to link the child into the parent, along with + * enough disk space to handle expansion of both the orphanage and the + * dotdot entry of a child directory. + */ + error = xfs_trans_alloc(mp, &M_RES(mp)->tr_link, + adopt->orphanage_blkres + adopt->child_blkres, 0, 0, + &sc->tp); + if (error) + return error; + + xfs_lock_two_inodes(sc->orphanage, XFS_ILOCK_EXCL, + sc->ip, XFS_ILOCK_EXCL); + sc->ilock_flags |= XFS_ILOCK_EXCL; + sc->orphanage_ilock_flags |= XFS_ILOCK_EXCL; + + xfs_trans_ijoin(sc->tp, sc->orphanage, 0); + xfs_trans_ijoin(sc->tp, sc->ip, 0); + + /* + * Reserve enough quota in the orphan directory to add the new name. + * Normally the orphanage should have user/group/project ids of zero + * and hence is not subject to quota enforcement, but we're allowed to + * exceed quota to reattach disconnected parts of the directory tree. + */ + error = xfs_trans_reserve_quota_nblks(sc->tp, sc->orphanage, + adopt->orphanage_blkres, 0, true); + if (error) + goto out_cancel; + + /* + * Reserve enough quota in the child directory to change dotdot. + * Here we're also allowed to exceed file quota to repair inconsistent + * metadata. + */ + if (adopt->child_blkres) { + error = xfs_trans_reserve_quota_nblks(sc->tp, sc->ip, + adopt->child_blkres, 0, true); + if (error) + goto out_cancel; + } + + return 0; +out_cancel: + xchk_trans_cancel(sc); + xrep_orphanage_iunlock(sc, XFS_ILOCK_EXCL); + xrep_orphanage_iunlock(sc, XFS_IOLOCK_EXCL); + return error; +} + +/* + * Compute the xfs_name for the directory entry that we're adding to the + * orphanage. Caller must hold ILOCKs of sc->ip and the orphanage and must not + * reuse namebuf until the adoption completes or is dissolved. + */ +int +xrep_adoption_compute_name( + struct xrep_adoption *adopt, + unsigned char *namebuf) +{ + struct xfs_name *xname = &adopt->xname; + struct xfs_scrub *sc = adopt->sc; + xfs_ino_t ino; + unsigned int incr = 0; + int error = 0; + + xname->name = namebuf; + xname->len = snprintf(namebuf, MAXNAMELEN, "%llu", sc->ip->i_ino); + xname->type = xfs_mode_to_ftype(VFS_I(sc->ip)->i_mode); + + /* Make sure the filename is unique in the lost+found. */ + error = xchk_dir_lookup(sc, sc->orphanage, xname, &ino); + while (error == 0 && incr < 10000) { + xname->len = snprintf(namebuf, MAXNAMELEN, "%llu.%u", + sc->ip->i_ino, ++incr); + error = xchk_dir_lookup(sc, sc->orphanage, xname, &ino); + } + if (error == 0) { + /* We already have 10,000 entries in the orphanage? */ + return -EFSCORRUPTED; + } + + if (error != -ENOENT) + return error; + return 0; +} + +/* + * Move the current file to the orphanage under the computed name. + * + * Returns with a dirty transaction so that the caller can handle any other + * work, such as fixing up unlinked lists or resetting link counts. + */ +int +xrep_adoption_move( + struct xrep_adoption *adopt) +{ + struct xfs_scrub *sc = adopt->sc; + struct xfs_name *xname = &adopt->xname; + bool isdir = S_ISDIR(VFS_I(sc->ip)->i_mode); + int error; + + trace_xrep_adoption_reparent(sc->orphanage, &adopt->xname, + sc->ip->i_ino); + + /* Create the new name in the orphanage. */ + error = xfs_dir_createname(sc->tp, sc->orphanage, xname, sc->ip->i_ino, + adopt->orphanage_blkres); + if (error) + return error; + + /* + * Bump the link count of the orphanage if we just added a + * subdirectory, and update its timestamps. + */ + xfs_trans_ichgtime(sc->tp, sc->orphanage, + XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG); + if (isdir) + xfs_bumplink(sc->tp, sc->orphanage); + xfs_trans_log_inode(sc->tp, sc->orphanage, XFS_ILOG_CORE); + + /* Replace the dotdot entry if the child is a subdirectory. */ + if (isdir) { + error = xfs_dir_replace(sc->tp, sc->ip, &xfs_name_dotdot, + sc->orphanage->i_ino, adopt->child_blkres); + if (error) + return error; + } + + /* + * Notify dirent hooks that we moved the file to /lost+found, and + * finish all the deferred work so that we know the adoption is fully + * recorded in the log. + */ + xfs_dir_update_hook(sc->orphanage, sc->ip, 1, xname); + return 0; +} + +/* + * Roll to a clean scrub transaction so that we can release the orphanage, + * even if xrep_adoption_move was not called. + * + * Commits all the work and deferred ops attached to an adoption request and + * rolls to a clean scrub transaction. On success, returns 0 with the scrub + * context holding a clean transaction with no inodes joined. On failure, + * returns negative errno with no scrub transaction. All inode locks are + * still held after this function returns. + */ +int +xrep_adoption_trans_roll( + struct xrep_adoption *adopt) +{ + struct xfs_scrub *sc = adopt->sc; + int error; + + trace_xrep_adoption_trans_roll(sc->orphanage, sc->ip, + !!(sc->tp->t_flags & XFS_TRANS_DIRTY)); + + /* Finish all the deferred ops to commit all repairs. */ + error = xrep_defer_finish(sc); + if (error) + return error; + + /* Roll the transaction once more to detach the inodes. */ + return xfs_trans_roll(&sc->tp); +} diff --git a/fs/xfs/scrub/orphanage.h b/fs/xfs/scrub/orphanage.h new file mode 100644 index 0000000000000..9d40992583b24 --- /dev/null +++ b/fs/xfs/scrub/orphanage.h @@ -0,0 +1,75 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (c) 2021-2024 Oracle. All Rights Reserved. + * Author: Darrick J. Wong <djwong@kernel.org> + */ +#ifndef __XFS_SCRUB_ORPHANAGE_H__ +#define __XFS_SCRUB_ORPHANAGE_H__ + +#ifdef CONFIG_XFS_ONLINE_REPAIR +int xrep_orphanage_create(struct xfs_scrub *sc); + +/* + * If we're doing a repair, ensure that the orphanage exists and attach it to + * the scrub context. + */ +static inline int +xrep_orphanage_try_create( + struct xfs_scrub *sc) +{ + int error; + + ASSERT(sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR); + + error = xrep_orphanage_create(sc); + switch (error) { + case 0: + case -ENOENT: + case -ENOTDIR: + case -ENOSPC: + /* + * If the orphanage can't be found or isn't a directory, we'll + * keep going, but we won't be able to attach the file to the + * orphanage if we can't find the parent. + */ + return 0; + } + + return error; +} + +int xrep_orphanage_iolock_two(struct xfs_scrub *sc); + +void xrep_orphanage_ilock(struct xfs_scrub *sc, unsigned int ilock_flags); +bool xrep_orphanage_ilock_nowait(struct xfs_scrub *sc, + unsigned int ilock_flags); +void xrep_orphanage_iunlock(struct xfs_scrub *sc, unsigned int ilock_flags); + +void xrep_orphanage_rele(struct xfs_scrub *sc); + +/* Information about a request to add a file to the orphanage. */ +struct xrep_adoption { + /* Name structure; caller must provide a buffer separately. */ + struct xfs_name xname; + + struct xfs_scrub *sc; + + /* Block reservations for orphanage and child (if directory). */ + unsigned int orphanage_blkres; + unsigned int child_blkres; +}; + +bool xrep_orphanage_can_adopt(struct xfs_scrub *sc); + +int xrep_adoption_trans_alloc(struct xfs_scrub *sc, + struct xrep_adoption *adopt); +int xrep_adoption_compute_name(struct xrep_adoption *adopt, + unsigned char *namebuf); +int xrep_adoption_move(struct xrep_adoption *adopt); +int xrep_adoption_trans_roll(struct xrep_adoption *adopt); +#else +struct xrep_adoption { /* empty */ }; +# define xrep_orphanage_rele(sc) ((void)0) +#endif /* CONFIG_XFS_ONLINE_REPAIR */ + +#endif /* __XFS_SCRUB_ORPHANAGE_H__ */ diff --git a/fs/xfs/scrub/parent_repair.c b/fs/xfs/scrub/parent_repair.c index 826926c2bb0dd..2bb4cfbeffe82 100644 --- a/fs/xfs/scrub/parent_repair.c +++ b/fs/xfs/scrub/parent_repair.c @@ -32,6 +32,8 @@ #include "scrub/iscan.h" #include "scrub/findparent.h" #include "scrub/readdir.h" +#include "scrub/tempfile.h" +#include "scrub/orphanage.h" /* * Repairing The Directory Parent Pointer @@ -57,6 +59,12 @@ struct xrep_parent { * dotdot entry for this directory. */ struct xrep_parent_scan_info pscan; + + /* Orphanage reparenting request. */ + struct xrep_adoption adoption; + + /* Directory entry name, plus the trailing null. */ + unsigned char namebuf[MAXNAMELEN]; }; /* Tear down all the incore stuff we created. */ @@ -82,7 +90,7 @@ xrep_setup_parent( rp->sc = sc; sc->buf = rp; - return 0; + return xrep_orphanage_try_create(sc); } /* @@ -179,6 +187,91 @@ xrep_parent_reset_dotdot( return xfs_trans_roll(&sc->tp); } +/* + * Move the current file to the orphanage. + * + * Caller must hold IOLOCK_EXCL on @sc->ip, and no other inode locks. Upon + * successful return, the scrub transaction will have enough extra reservation + * to make the move; it will hold IOLOCK_EXCL and ILOCK_EXCL of @sc->ip and the + * orphanage; and both inodes will be ijoined. + */ +STATIC int +xrep_parent_move_to_orphanage( + struct xrep_parent *rp) +{ + struct xfs_scrub *sc = rp->sc; + xfs_ino_t orig_parent, new_parent; + int error; + + /* + * We are about to drop the ILOCK on sc->ip to lock the orphanage and + * prepare for the adoption. Therefore, look up the old dotdot entry + * for sc->ip so that we can compare it after we re-lock sc->ip. + */ + error = xchk_dir_lookup(sc, sc->ip, &xfs_name_dotdot, &orig_parent); + if (error) + return error; + + /* + * Drop the ILOCK on the scrub target and commit the transaction. + * Adoption computes its own resource requirements and gathers the + * necessary components. + */ + error = xrep_trans_commit(sc); + if (error) + return error; + xchk_iunlock(sc, XFS_ILOCK_EXCL); + + /* If we can take the orphanage's iolock then we're ready to move. */ + if (!xrep_orphanage_ilock_nowait(sc, XFS_IOLOCK_EXCL)) { + xchk_iunlock(sc, sc->ilock_flags); + error = xrep_orphanage_iolock_two(sc); + if (error) + return error; + } + + /* Grab transaction and ILOCK the two files. */ + error = xrep_adoption_trans_alloc(sc, &rp->adoption); + if (error) + return error; + + error = xrep_adoption_compute_name(&rp->adoption, rp->namebuf); + if (error) + return error; + + /* + * Now that we've reacquired the ILOCK on sc->ip, look up the dotdot + * entry again. If the parent changed or the child was unlinked while + * the child directory was unlocked, we don't need to move the child to + * the orphanage after all. + */ + error = xchk_dir_lookup(sc, sc->ip, &xfs_name_dotdot, &new_parent); + if (error) + return error; + + /* + * Attach to the orphanage if we still have a linked directory and it + * hasn't been moved. + */ + if (orig_parent == new_parent && VFS_I(sc->ip)->i_nlink > 0) { + error = xrep_adoption_move(&rp->adoption); + if (error) + return error; + } + + /* + * Launder the scrub transaction so we can drop the orphanage ILOCK + * and IOLOCK. Return holding the scrub target's ILOCK and IOLOCK. + */ + error = xrep_adoption_trans_roll(&rp->adoption); + if (error) + return error; + + xrep_orphanage_iunlock(sc, XFS_ILOCK_EXCL); + xrep_orphanage_iunlock(sc, XFS_IOLOCK_EXCL); + return 0; +} + /* * Commit the new parent pointer structure (currently only the dotdot entry) to * the file that we're repairing. @@ -188,7 +281,8 @@ xrep_parent_rebuild_tree( struct xrep_parent *rp) { if (rp->pscan.parent_ino == NULLFSINO) { - /* Cannot fix orphaned directories yet. */ + if (xrep_orphanage_can_adopt(rp->sc)) + return xrep_parent_move_to_orphanage(rp); return -EFSCORRUPTED; } diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c index 520d83db193c3..6417628ce26be 100644 --- a/fs/xfs/scrub/scrub.c +++ b/fs/xfs/scrub/scrub.c @@ -27,6 +27,7 @@ #include "scrub/stats.h" #include "scrub/xfile.h" #include "scrub/tempfile.h" +#include "scrub/orphanage.h" /* * Online Scrub and Repair @@ -217,6 +218,7 @@ xchk_teardown( } xrep_tempfile_rele(sc); + xrep_orphanage_rele(sc); xchk_fsgates_disable(sc); return error; } diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h index aca3e652343c1..456bb181399f4 100644 --- a/fs/xfs/scrub/scrub.h +++ b/fs/xfs/scrub/scrub.h @@ -105,6 +105,10 @@ struct xfs_scrub { /* Lock flags for @ip. */ uint ilock_flags; + /* The orphanage, for stashing files that have lost their parent. */ + uint orphanage_ilock_flags; + struct xfs_inode *orphanage; + /* A temporary file on this filesystem, for staging new metadata. */ struct xfs_inode *tempip; uint temp_ilock_flags; diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index 883accdd53f04..e6cf0e0020799 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -2582,6 +2582,34 @@ DEFINE_EVENT(xrep_dirent_class, name, \ DEFINE_XREP_DIRENT_EVENT(xrep_dir_salvage_entry); DEFINE_XREP_DIRENT_EVENT(xrep_dir_stash_createname); DEFINE_XREP_DIRENT_EVENT(xrep_dir_replay_createname); +DEFINE_XREP_DIRENT_EVENT(xrep_adoption_reparent); + +DECLARE_EVENT_CLASS(xrep_adoption_class, + TP_PROTO(struct xfs_inode *dp, struct xfs_inode *ip, bool moved), + TP_ARGS(dp, ip, moved), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(xfs_ino_t, dir_ino) + __field(xfs_ino_t, child_ino) + __field(bool, moved) + ), + TP_fast_assign( + __entry->dev = dp->i_mount->m_super->s_dev; + __entry->dir_ino = dp->i_ino; + __entry->child_ino = ip->i_ino; + __entry->moved = moved; + ), + TP_printk("dev %d:%d dir 0x%llx child 0x%llx moved? %d", + MAJOR(__entry->dev), MINOR(__entry->dev), + __entry->dir_ino, + __entry->child_ino, + __entry->moved) +); +#define DEFINE_XREP_ADOPTION_EVENT(name) \ +DEFINE_EVENT(xrep_adoption_class, name, \ + TP_PROTO(struct xfs_inode *dp, struct xfs_inode *ip, bool moved), \ + TP_ARGS(dp, ip, moved)) +DEFINE_XREP_ADOPTION_EVENT(xrep_adoption_trans_roll); DECLARE_EVENT_CLASS(xrep_parent_salvage_class, TP_PROTO(struct xfs_inode *dp, xfs_ino_t ino), diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index e49cbb1c75fb2..c0bdcf0f2448e 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -914,10 +914,10 @@ xfs_droplink( /* * Increment the link count on an inode & log the change. */ -static void +void xfs_bumplink( - xfs_trans_t *tp, - xfs_inode_t *ip) + struct xfs_trans *tp, + struct xfs_inode *ip) { xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG); diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h index 22d6fa2fcc676..66271daff5b3f 100644 --- a/fs/xfs/xfs_inode.h +++ b/fs/xfs/xfs_inode.h @@ -625,6 +625,7 @@ void xfs_end_io(struct work_struct *work); int xfs_ilock2_io_mmap(struct xfs_inode *ip1, struct xfs_inode *ip2); void xfs_iunlock2_io_mmap(struct xfs_inode *ip1, struct xfs_inode *ip2); void xfs_iunlock2_remapping(struct xfs_inode *ip1, struct xfs_inode *ip2); +void xfs_bumplink(struct xfs_trans *tp, struct xfs_inode *ip); static inline bool xfs_inode_unlinked_incomplete( ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCHSET v29.0 23/28] xfs: move orphan files to lost and found @ 2023-12-31 19:31 Darrick J. Wong 2023-12-31 20:38 ` [PATCH 1/3] xfs: move orphan files to the orphanage Darrick J. Wong 0 siblings, 1 reply; 12+ messages in thread From: Darrick J. Wong @ 2023-12-31 19:31 UTC (permalink / raw) To: djwong; +Cc: linux-xfs Hi all, Orphaned files are defined to be files with nonzero ondisk link count but no observable parent directory. This series enables online repair to reparent orphaned files into the filesystem directory tree, and wires up this reparenting ability into the directory, file link count, and parent pointer repair functions. This is how we fix files with positive link count that are not reachable through the directory tree. This patch will also create the orphanage directory (lost+found) if it is not present. In contrast to xfs_repair, we follow e2fsck in creating the lost+found without group or other-owner access to avoid accidental disclosure of files that were previously hidden by an 0700 directory. That's silly security, but people have been known to do it. If you're going to start using this code, I strongly recommend pulling from my git trees, which are linked below. This has been running on the djcloud for months with no problems. Enjoy! Comments and questions are, as always, welcome. --D kernel git tree: https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-orphanage --- .../filesystems/xfs-online-fsck-design.rst | 20 - fs/xfs/Makefile | 1 fs/xfs/scrub/dir_repair.c | 130 ++++ fs/xfs/scrub/nlinks.c | 11 fs/xfs/scrub/nlinks.h | 6 fs/xfs/scrub/nlinks_repair.c | 124 ++++ fs/xfs/scrub/orphanage.c | 587 ++++++++++++++++++++ fs/xfs/scrub/orphanage.h | 75 +++ fs/xfs/scrub/parent_repair.c | 98 +++ fs/xfs/scrub/repair.h | 2 fs/xfs/scrub/scrub.c | 2 fs/xfs/scrub/scrub.h | 4 fs/xfs/scrub/trace.c | 1 fs/xfs/scrub/trace.h | 96 +++ fs/xfs/xfs_inode.c | 6 fs/xfs/xfs_inode.h | 1 16 files changed, 1130 insertions(+), 34 deletions(-) create mode 100644 fs/xfs/scrub/orphanage.c create mode 100644 fs/xfs/scrub/orphanage.h ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/3] xfs: move orphan files to the orphanage 2023-12-31 19:31 [PATCHSET v29.0 23/28] xfs: move orphan files to lost and found Darrick J. Wong @ 2023-12-31 20:38 ` Darrick J. Wong 0 siblings, 0 replies; 12+ messages in thread From: Darrick J. Wong @ 2023-12-31 20:38 UTC (permalink / raw) To: djwong; +Cc: linux-xfs From: Darrick J. Wong <djwong@kernel.org> When we're repairing a directory structure or fixing the dotdot entry of a subdirectory, it's possible that we won't ever find a parent for the subdirectory. When this is the case, move it to the orphanage, aka /lost+found. Signed-off-by: Darrick J. Wong <djwong@kernel.org> --- .../filesystems/xfs-online-fsck-design.rst | 19 + fs/xfs/Makefile | 1 fs/xfs/scrub/dir_repair.c | 130 +++++ fs/xfs/scrub/orphanage.c | 499 ++++++++++++++++++++ fs/xfs/scrub/orphanage.h | 75 +++ fs/xfs/scrub/parent_repair.c | 98 ++++ fs/xfs/scrub/scrub.c | 2 fs/xfs/scrub/scrub.h | 4 fs/xfs/scrub/trace.h | 28 + fs/xfs/xfs_inode.c | 6 fs/xfs/xfs_inode.h | 1 11 files changed, 843 insertions(+), 20 deletions(-) create mode 100644 fs/xfs/scrub/orphanage.c create mode 100644 fs/xfs/scrub/orphanage.h diff --git a/Documentation/filesystems/xfs-online-fsck-design.rst b/Documentation/filesystems/xfs-online-fsck-design.rst index a0678101a7d02..63c78e1e85e52 100644 --- a/Documentation/filesystems/xfs-online-fsck-design.rst +++ b/Documentation/filesystems/xfs-online-fsck-design.rst @@ -4777,14 +4777,21 @@ Orphaned files are adopted by the orphanage as follows: The ``xrep_orphanage_iolock_two`` function follows the inode locking strategy discussed earlier. -3. Call ``xrep_orphanage_compute_blkres`` and ``xrep_orphanage_compute_name`` - to compute the new name in the orphanage and the block reservation required. - -4. Use ``xrep_orphanage_adoption_prep`` to reserve resources to the repair +3. Use ``xrep_adoption_trans_alloc`` to reserve resources to the repair transaction. -5. Call ``xrep_orphanage_adopt`` to reparent the orphaned file into the lost - and found, and update the kernel dentry cache. +4. Call ``xrep_orphanage_compute_name`` to compute the new name in the + orphanage. + +5. If the adoption is going to happen, call ``xrep_adoption_reparent`` to + reparent the orphaned file into the lost and found and invalidate the dentry + cache. + +6. Call ``xrep_adoption_finish`` to commit any filesystem updates, release the + orphanage ILOCK, and clean the scrub transaction. + +7. If a runtime error happens, call ``xrep_adoption_cancel`` to release all + resources. The proposed patches are in the `orphanage adoption diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile index 36e4cbbe21999..49f12fb8480e1 100644 --- a/fs/xfs/Makefile +++ b/fs/xfs/Makefile @@ -205,6 +205,7 @@ xfs-y += $(addprefix scrub/, \ inode_repair.o \ newbt.o \ nlinks_repair.o \ + orphanage.o \ parent_repair.o \ rcbag_btree.o \ rcbag.o \ diff --git a/fs/xfs/scrub/dir_repair.c b/fs/xfs/scrub/dir_repair.c index b22fd59c2f8b3..141682e2477af 100644 --- a/fs/xfs/scrub/dir_repair.c +++ b/fs/xfs/scrub/dir_repair.c @@ -42,6 +42,7 @@ #include "scrub/readdir.h" #include "scrub/reap.h" #include "scrub/findparent.h" +#include "scrub/orphanage.h" /* * Directory Repair @@ -115,12 +116,21 @@ struct xrep_dir { */ struct xrep_parent_scan_info pscan; + /* + * Context information for attaching this directory to the lost+found + * if this directory does not have a parent. + */ + struct xrep_adoption adoption; + /* How many subdirectories did we find? */ uint64_t subdirs; /* How many dirents did we find? */ unsigned int dirents; + /* Should we move this directory to the orphanage? */ + bool needs_adoption; + /* Directory entry name, plus the trailing null. */ unsigned char namebuf[MAXNAMELEN]; }; @@ -147,6 +157,10 @@ xrep_setup_directory( xchk_fsgates_enable(sc, XCHK_FSGATES_DIRENTS); + error = xrep_orphanage_try_create(sc); + if (error) + return error; + error = xrep_tempfile_create(sc, S_IFDIR); if (error) return error; @@ -1137,10 +1151,8 @@ xrep_dir_set_nlink( /* * The directory is not on the incore unlinked list, which means that * it needs to be reachable via the directory tree. Update the nlink - * with our observed link count. - * - * XXX: A subsequent patch will handle parentless directories by moving - * them to the lost and found instead of aborting the repair. + * with our observed link count. If the directory has no parent, it + * will be moved to the orphanage. */ if (!xfs_inode_on_unlinked_list(dp)) goto reset_nlink; @@ -1151,6 +1163,7 @@ xrep_dir_set_nlink( * inactivate when the last reference drops. */ if (rd->dirents == 0) { + rd->needs_adoption = false; new_nlink = 0; goto reset_nlink; } @@ -1159,7 +1172,8 @@ xrep_dir_set_nlink( * The directory is on the unlinked list and we found dirents. This * directory needs to be reachable via the directory tree. Remove the * dir from the unlinked list and update nlink with the observed link - * count. + * count. If the directory has no parent, it will be moved to the + * orphanage. */ pag = xfs_perag_get(sc->mp, XFS_INO_TO_AGNO(sc->mp, dp->i_ino)); if (!pag) { @@ -1195,12 +1209,16 @@ xrep_dir_swap( return -EFSCORRUPTED; /* - * If we never found the parent for this directory, we can't fix this - * directory. + * If we never found the parent for this directory, temporarily assign + * the root dir as the parent; we'll move this to the orphanage after + * swapping the dir contents. We hold the ILOCK of the dir being + * repaired, so we're not worried about racy updates of dotdot. */ ASSERT(sc->ilock_flags & XFS_ILOCK_EXCL); - if (rd->pscan.parent_ino == NULLFSINO) - return -EFSCORRUPTED; + if (rd->pscan.parent_ino == NULLFSINO) { + rd->needs_adoption = true; + rd->pscan.parent_ino = rd->sc->mp->m_sb.sb_rootino; + } /* * Reset the temporary directory's '..' entry to point to the parent @@ -1355,6 +1373,91 @@ xrep_dir_setup_scan( return error; } +/* + * Move the current file to the orphanage. + * + * Caller must hold IOLOCK_EXCL on @sc->ip, and no other inode locks. Upon + * successful return, the scrub transaction will have enough extra reservation + * to make the move; it will hold IOLOCK_EXCL and ILOCK_EXCL of @sc->ip and the + * orphanage; and both inodes will be ijoined. + */ +STATIC int +xrep_dir_move_to_orphanage( + struct xrep_dir *rd) +{ + struct xfs_scrub *sc = rd->sc; + xfs_ino_t orig_parent, new_parent; + int error; + + /* + * We are about to drop the ILOCK on sc->ip to lock the orphanage and + * prepare for the adoption. Therefore, look up the old dotdot entry + * for sc->ip so that we can compare it after we re-lock sc->ip. + */ + error = xchk_dir_lookup(sc, sc->ip, &xfs_name_dotdot, &orig_parent); + if (error) + return error; + + /* + * Drop the ILOCK on the scrub target and commit the transaction. + * Adoption computes its own resource requirements and gathers the + * necessary components. + */ + error = xrep_trans_commit(sc); + if (error) + return error; + xchk_iunlock(sc, XFS_ILOCK_EXCL); + + /* If we can take the orphanage's iolock then we're ready to move. */ + if (!xrep_orphanage_ilock_nowait(sc, XFS_IOLOCK_EXCL)) { + xchk_iunlock(sc, sc->ilock_flags); + error = xrep_orphanage_iolock_two(sc); + if (error) + return error; + } + + /* Grab transaction and ILOCK the two files. */ + error = xrep_adoption_trans_alloc(sc, &rd->adoption); + if (error) + return error; + + error = xrep_adoption_compute_name(&rd->adoption, rd->namebuf); + if (error) + return error; + + /* + * Now that we've reacquired the ILOCK on sc->ip, look up the dotdot + * entry again. If the parent changed or the child was unlinked while + * the child directory was unlocked, we don't need to move the child to + * the orphanage after all. + */ + error = xchk_dir_lookup(sc, sc->ip, &xfs_name_dotdot, &new_parent); + if (error) + return error; + + /* + * Attach to the orphanage if we still have a linked directory and it + * hasn't been moved. + */ + if (orig_parent == new_parent && VFS_I(sc->ip)->i_nlink > 0) { + error = xrep_adoption_move(&rd->adoption); + if (error) + return error; + } + + /* + * Launder the scrub transaction so we can drop the orphanage ILOCK + * and IOLOCK. Return holding the scrub target's ILOCK and IOLOCK. + */ + error = xrep_adoption_trans_roll(&rd->adoption); + if (error) + return error; + + xrep_orphanage_iunlock(sc, XFS_ILOCK_EXCL); + xrep_orphanage_iunlock(sc, XFS_IOLOCK_EXCL); + return 0; +} + /* * Repair the directory metadata. * @@ -1393,6 +1496,15 @@ xrep_directory( if (error) goto out_teardown; + if (rd->needs_adoption) { + if (!xrep_orphanage_can_adopt(rd->sc)) + error = -EFSCORRUPTED; + else + error = xrep_dir_move_to_orphanage(rd); + if (error) + goto out_teardown; + } + out_teardown: xrep_dir_teardown(sc); return error; diff --git a/fs/xfs/scrub/orphanage.c b/fs/xfs/scrub/orphanage.c new file mode 100644 index 0000000000000..0aedc5c70b632 --- /dev/null +++ b/fs/xfs/scrub/orphanage.c @@ -0,0 +1,499 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (c) 2021-2024 Oracle. All Rights Reserved. + * Author: Darrick J. Wong <djwong@kernel.org> + */ +#include "xfs.h" +#include "xfs_fs.h" +#include "xfs_shared.h" +#include "xfs_format.h" +#include "xfs_trans_resv.h" +#include "xfs_mount.h" +#include "xfs_log_format.h" +#include "xfs_trans.h" +#include "xfs_inode.h" +#include "xfs_ialloc.h" +#include "xfs_quota.h" +#include "xfs_trans_space.h" +#include "xfs_dir2.h" +#include "xfs_icache.h" +#include "xfs_bmap.h" +#include "xfs_bmap_btree.h" +#include "scrub/scrub.h" +#include "scrub/common.h" +#include "scrub/repair.h" +#include "scrub/trace.h" +#include "scrub/orphanage.h" +#include "scrub/readdir.h" + +#include <linux/namei.h> + +/* + * The Orphanage + * ============= + * + * If the directory tree is damaged, children of that directory become + * inaccessible via that file path. If a child has no other parents, the file + * is said to be orphaned. xfs_repair fixes this situation by creating a + * orphanage directory (specifically, /lost+found) and creating a directory + * entry pointing to the orphaned file. + * + * Online repair follows this tactic by creating a root-owned /lost+found + * directory if one does not exist. If an orphan is found, it will move that + * files into orphanage. + */ + +/* Make the orphanage owned by root. */ +STATIC int +xrep_chown_orphanage( + struct xfs_scrub *sc, + struct xfs_inode *dp) +{ + struct xfs_trans *tp; + struct xfs_mount *mp = sc->mp; + struct xfs_dquot *udqp = NULL, *gdqp = NULL, *pdqp = NULL; + struct xfs_dquot *oldu = NULL, *oldg = NULL, *oldp = NULL; + struct inode *inode = VFS_I(dp); + int error; + + error = xfs_qm_vop_dqalloc(dp, GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, 0, + XFS_QMOPT_QUOTALL, &udqp, &gdqp, &pdqp); + if (error) + return error; + + error = xfs_trans_alloc_ichange(dp, udqp, gdqp, pdqp, true, &tp); + if (error) + goto out_dqrele; + + /* + * Always clear setuid/setgid/sticky on the orphanage since we don't + * normally want that functionality on this directory and xfs_repair + * doesn't create it this way either. Leave the other access bits + * unchanged. + */ + inode->i_mode &= ~(S_ISUID | S_ISGID | S_ISVTX); + + /* + * Change the ownerships and register quota modifications + * in the transaction. + */ + if (!uid_eq(inode->i_uid, GLOBAL_ROOT_UID)) { + if (XFS_IS_UQUOTA_ON(mp)) + oldu = xfs_qm_vop_chown(tp, dp, &dp->i_udquot, udqp); + inode->i_uid = GLOBAL_ROOT_UID; + } + if (!gid_eq(inode->i_gid, GLOBAL_ROOT_GID)) { + if (XFS_IS_GQUOTA_ON(mp)) + oldg = xfs_qm_vop_chown(tp, dp, &dp->i_gdquot, gdqp); + inode->i_gid = GLOBAL_ROOT_GID; + } + if (dp->i_projid != 0) { + if (XFS_IS_PQUOTA_ON(mp)) + oldp = xfs_qm_vop_chown(tp, dp, &dp->i_pdquot, pdqp); + dp->i_projid = 0; + } + + dp->i_diflags &= ~(XFS_DIFLAG_REALTIME | XFS_DIFLAG_RTINHERIT); + xfs_trans_log_inode(tp, dp, XFS_ILOG_CORE); + + XFS_STATS_INC(mp, xs_ig_attrchg); + + if (xfs_has_wsync(mp)) + xfs_trans_set_sync(tp); + error = xfs_trans_commit(tp); + + xfs_qm_dqrele(oldu); + xfs_qm_dqrele(oldg); + xfs_qm_dqrele(oldp); + +out_dqrele: + xfs_qm_dqrele(udqp); + xfs_qm_dqrele(gdqp); + xfs_qm_dqrele(pdqp); + return error; +} + +#define ORPHANAGE "lost+found" + +/* Create the orphanage directory, and set sc->orphanage to it. */ +int +xrep_orphanage_create( + struct xfs_scrub *sc) +{ + struct xfs_mount *mp = sc->mp; + struct dentry *root_dentry, *orphanage_dentry; + struct inode *root_inode = VFS_I(sc->mp->m_rootip); + struct inode *orphanage_inode; + int error; + + if (xfs_is_shutdown(mp)) + return -EIO; + if (xfs_is_readonly(mp)) { + sc->orphanage = NULL; + return 0; + } + + ASSERT(sc->tp == NULL); + ASSERT(sc->orphanage == NULL); + + /* Find the dentry for the root directory... */ + root_dentry = d_find_alias(root_inode); + if (!root_dentry) { + error = -EFSCORRUPTED; + goto out; + } + + /* ...which is a directory, right? */ + if (!d_is_dir(root_dentry)) { + error = -EFSCORRUPTED; + goto out_dput_root; + } + + /* Try to find the orphanage directory. */ + inode_lock_nested(root_inode, I_MUTEX_PARENT); + orphanage_dentry = lookup_one_len(ORPHANAGE, root_dentry, + strlen(ORPHANAGE)); + if (IS_ERR(orphanage_dentry)) { + error = PTR_ERR(orphanage_dentry); + goto out_unlock_root; + } + + /* + * Nothing found? Call mkdir to create the orphanage. Create the + * directory without other-user access because we're live and someone + * could have been relying partly on minimal access to a parent + * directory to control access to a file we put in here. + */ + if (d_really_is_negative(orphanage_dentry)) { + error = vfs_mkdir(&nop_mnt_idmap, root_inode, orphanage_dentry, + 0750); + if (error) + goto out_dput_orphanage; + } + + /* Not a directory? Bail out. */ + if (!d_is_dir(orphanage_dentry)) { + error = -ENOTDIR; + goto out_dput_orphanage; + } + + /* + * Grab a reference to the orphanage. This /should/ succeed since + * we hold the root directory locked and therefore nobody can delete + * the orphanage. + */ + orphanage_inode = igrab(d_inode(orphanage_dentry)); + if (!orphanage_inode) { + error = -ENOENT; + goto out_dput_orphanage; + } + + /* Make sure the orphanage is owned by root. */ + error = xrep_chown_orphanage(sc, XFS_I(orphanage_inode)); + if (error) + goto out_dput_orphanage; + + /* Stash the reference for later and bail out. */ + sc->orphanage = XFS_I(orphanage_inode); + sc->orphanage_ilock_flags = 0; + +out_dput_orphanage: + dput(orphanage_dentry); +out_unlock_root: + inode_unlock(VFS_I(sc->mp->m_rootip)); +out_dput_root: + dput(root_dentry); +out: + return error; +} + +void +xrep_orphanage_ilock( + struct xfs_scrub *sc, + unsigned int ilock_flags) +{ + sc->orphanage_ilock_flags |= ilock_flags; + xfs_ilock(sc->orphanage, ilock_flags); +} + +bool +xrep_orphanage_ilock_nowait( + struct xfs_scrub *sc, + unsigned int ilock_flags) +{ + if (xfs_ilock_nowait(sc->orphanage, ilock_flags)) { + sc->orphanage_ilock_flags |= ilock_flags; + return true; + } + + return false; +} + +void +xrep_orphanage_iunlock( + struct xfs_scrub *sc, + unsigned int ilock_flags) +{ + xfs_iunlock(sc->orphanage, ilock_flags); + sc->orphanage_ilock_flags &= ~ilock_flags; +} + +/* Grab the IOLOCK of the orphanage and sc->ip. */ +int +xrep_orphanage_iolock_two( + struct xfs_scrub *sc) +{ + int error = 0; + + while (true) { + if (xchk_should_terminate(sc, &error)) + return error; + + /* + * Normal XFS takes the IOLOCK before grabbing a transaction. + * Scrub holds a transaction, which means that we can't block + * on either IOLOCK. + */ + if (xrep_orphanage_ilock_nowait(sc, XFS_IOLOCK_EXCL)) { + if (xchk_ilock_nowait(sc, XFS_IOLOCK_EXCL)) + break; + xrep_orphanage_iunlock(sc, XFS_IOLOCK_EXCL); + } + delay(1); + } + + return 0; +} + +/* Release the orphanage. */ +void +xrep_orphanage_rele( + struct xfs_scrub *sc) +{ + if (!sc->orphanage) + return; + + if (sc->orphanage_ilock_flags) + xfs_iunlock(sc->orphanage, sc->orphanage_ilock_flags); + + xchk_irele(sc, sc->orphanage); + sc->orphanage = NULL; +} + +/* Adoption moves a file into /lost+found */ + +/* Can the orphanage adopt @sc->ip? */ +bool +xrep_orphanage_can_adopt( + struct xfs_scrub *sc) +{ + ASSERT(sc->ip != NULL); + + if (!sc->orphanage) + return false; + if (sc->ip == sc->orphanage) + return false; + if (xfs_internal_inum(sc->mp, sc->ip->i_ino)) + return false; + return true; +} + +/* + * Create a new transaction to send a child to the orphanage. + * + * Allocate a new transaction with sufficient disk space to handle the + * adoption, take ILOCK_EXCL of the orphanage and sc->ip, joins them to the + * transaction, and reserve quota to reparent the latter. Caller must hold the + * IOLOCK of the orphanage and sc->ip. + */ +int +xrep_adoption_trans_alloc( + struct xfs_scrub *sc, + struct xrep_adoption *adopt) +{ + struct xfs_mount *mp = sc->mp; + unsigned int child_blkres = 0; + int error; + + ASSERT(sc->tp == NULL); + ASSERT(sc->ip != NULL); + ASSERT(sc->orphanage != NULL); + ASSERT(sc->ilock_flags & XFS_IOLOCK_EXCL); + ASSERT(sc->orphanage_ilock_flags & XFS_IOLOCK_EXCL); + ASSERT(!(sc->ilock_flags & (XFS_ILOCK_SHARED | XFS_ILOCK_EXCL))); + ASSERT(!(sc->orphanage_ilock_flags & + (XFS_ILOCK_SHARED | XFS_ILOCK_EXCL))); + + /* Compute the worst case space reservation that we need. */ + adopt->sc = sc; + adopt->orphanage_blkres = XFS_LINK_SPACE_RES(mp, MAXNAMELEN); + if (S_ISDIR(VFS_I(sc->ip)->i_mode)) + child_blkres = XFS_RENAME_SPACE_RES(mp, xfs_name_dotdot.len); + adopt->child_blkres = child_blkres; + + /* + * Allocate a transaction to link the child into the parent, along with + * enough disk space to handle expansion of both the orphanage and the + * dotdot entry of a child directory. + */ + error = xfs_trans_alloc(mp, &M_RES(mp)->tr_link, + adopt->orphanage_blkres + adopt->child_blkres, 0, 0, + &sc->tp); + if (error) + return error; + + xfs_lock_two_inodes(sc->orphanage, XFS_ILOCK_EXCL, + sc->ip, XFS_ILOCK_EXCL); + sc->ilock_flags |= XFS_ILOCK_EXCL; + sc->orphanage_ilock_flags |= XFS_ILOCK_EXCL; + + xfs_trans_ijoin(sc->tp, sc->orphanage, 0); + xfs_trans_ijoin(sc->tp, sc->ip, 0); + + /* + * Reserve enough quota in the orphan directory to add the new name. + * Normally the orphanage should have user/group/project ids of zero + * and hence is not subject to quota enforcement, but we're allowed to + * exceed quota to reattach disconnected parts of the directory tree. + */ + error = xfs_trans_reserve_quota_nblks(sc->tp, sc->orphanage, + adopt->orphanage_blkres, 0, true); + if (error) + goto out_cancel; + + /* + * Reserve enough quota in the child directory to change dotdot. + * Here we're also allowed to exceed file quota to repair inconsistent + * metadata. + */ + if (adopt->child_blkres) { + error = xfs_trans_reserve_quota_nblks(sc->tp, sc->ip, + adopt->child_blkres, 0, true); + if (error) + goto out_cancel; + } + + return 0; +out_cancel: + xchk_trans_cancel(sc); + xrep_orphanage_iunlock(sc, XFS_ILOCK_EXCL); + xrep_orphanage_iunlock(sc, XFS_IOLOCK_EXCL); + return error; +} + +/* + * Compute the xfs_name for the directory entry that we're adding to the + * orphanage. Caller must hold ILOCKs of sc->ip and the orphanage and must not + * reuse namebuf until the adoption completes or is dissolved. + */ +int +xrep_adoption_compute_name( + struct xrep_adoption *adopt, + unsigned char *namebuf) +{ + struct xfs_name *xname = &adopt->xname; + struct xfs_scrub *sc = adopt->sc; + xfs_ino_t ino; + unsigned int incr = 0; + int error = 0; + + xname->name = namebuf; + xname->len = snprintf(namebuf, MAXNAMELEN, "%llu", sc->ip->i_ino); + xname->type = xfs_mode_to_ftype(VFS_I(sc->ip)->i_mode); + + /* Make sure the filename is unique in the lost+found. */ + error = xchk_dir_lookup(sc, sc->orphanage, xname, &ino); + while (error == 0 && incr < 10000) { + xname->len = snprintf(namebuf, MAXNAMELEN, "%llu.%u", + sc->ip->i_ino, ++incr); + error = xchk_dir_lookup(sc, sc->orphanage, xname, &ino); + } + if (error == 0) { + /* We already have 10,000 entries in the orphanage? */ + return -EFSCORRUPTED; + } + + if (error != -ENOENT) + return error; + return 0; +} + +/* + * Move the current file to the orphanage under the computed name. + * + * Returns with a dirty transaction so that the caller can handle any other + * work, such as fixing up unlinked lists or resetting link counts. + */ +int +xrep_adoption_move( + struct xrep_adoption *adopt) +{ + struct xfs_scrub *sc = adopt->sc; + struct xfs_name *xname = &adopt->xname; + bool isdir = S_ISDIR(VFS_I(sc->ip)->i_mode); + int error; + + trace_xrep_adoption_reparent(sc->orphanage, &adopt->xname, + sc->ip->i_ino); + + /* Create the new name in the orphanage. */ + error = xfs_dir_createname(sc->tp, sc->orphanage, xname, sc->ip->i_ino, + adopt->orphanage_blkres); + if (error) + return error; + + /* + * Bump the link count of the orphanage if we just added a + * subdirectory, and update its timestamps. + */ + xfs_trans_ichgtime(sc->tp, sc->orphanage, + XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG); + if (isdir) + xfs_bumplink(sc->tp, sc->orphanage); + xfs_trans_log_inode(sc->tp, sc->orphanage, XFS_ILOG_CORE); + + /* Replace the dotdot entry if the child is a subdirectory. */ + if (isdir) { + error = xfs_dir_replace(sc->tp, sc->ip, &xfs_name_dotdot, + sc->orphanage->i_ino, adopt->child_blkres); + if (error) + return error; + } + + /* + * Notify dirent hooks that we moved the file to /lost+found, and + * finish all the deferred work so that we know the adoption is fully + * recorded in the log. + */ + xfs_dir_update_hook(sc->orphanage, sc->ip, 1, xname); + return 0; +} + +/* + * Roll to a clean scrub transaction so that we can release the orphanage, + * even if xrep_adoption_move was not called. + * + * Commits all the work and deferred ops attached to an adoption request and + * rolls to a clean scrub transaction. On success, returns 0 with the scrub + * context holding a clean transaction with no inodes joined. On failure, + * returns negative errno with no scrub transaction. All inode locks are + * still held after this function returns. + */ +int +xrep_adoption_trans_roll( + struct xrep_adoption *adopt) +{ + struct xfs_scrub *sc = adopt->sc; + int error; + + trace_xrep_adoption_trans_roll(sc->orphanage, sc->ip, + !!(sc->tp->t_flags & XFS_TRANS_DIRTY)); + + /* Finish all the deferred ops to commit all repairs. */ + error = xrep_defer_finish(sc); + if (error) + return error; + + /* Roll the transaction once more to detach the inodes. */ + return xfs_trans_roll(&sc->tp); +} diff --git a/fs/xfs/scrub/orphanage.h b/fs/xfs/scrub/orphanage.h new file mode 100644 index 0000000000000..9d40992583b24 --- /dev/null +++ b/fs/xfs/scrub/orphanage.h @@ -0,0 +1,75 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (c) 2021-2024 Oracle. All Rights Reserved. + * Author: Darrick J. Wong <djwong@kernel.org> + */ +#ifndef __XFS_SCRUB_ORPHANAGE_H__ +#define __XFS_SCRUB_ORPHANAGE_H__ + +#ifdef CONFIG_XFS_ONLINE_REPAIR +int xrep_orphanage_create(struct xfs_scrub *sc); + +/* + * If we're doing a repair, ensure that the orphanage exists and attach it to + * the scrub context. + */ +static inline int +xrep_orphanage_try_create( + struct xfs_scrub *sc) +{ + int error; + + ASSERT(sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR); + + error = xrep_orphanage_create(sc); + switch (error) { + case 0: + case -ENOENT: + case -ENOTDIR: + case -ENOSPC: + /* + * If the orphanage can't be found or isn't a directory, we'll + * keep going, but we won't be able to attach the file to the + * orphanage if we can't find the parent. + */ + return 0; + } + + return error; +} + +int xrep_orphanage_iolock_two(struct xfs_scrub *sc); + +void xrep_orphanage_ilock(struct xfs_scrub *sc, unsigned int ilock_flags); +bool xrep_orphanage_ilock_nowait(struct xfs_scrub *sc, + unsigned int ilock_flags); +void xrep_orphanage_iunlock(struct xfs_scrub *sc, unsigned int ilock_flags); + +void xrep_orphanage_rele(struct xfs_scrub *sc); + +/* Information about a request to add a file to the orphanage. */ +struct xrep_adoption { + /* Name structure; caller must provide a buffer separately. */ + struct xfs_name xname; + + struct xfs_scrub *sc; + + /* Block reservations for orphanage and child (if directory). */ + unsigned int orphanage_blkres; + unsigned int child_blkres; +}; + +bool xrep_orphanage_can_adopt(struct xfs_scrub *sc); + +int xrep_adoption_trans_alloc(struct xfs_scrub *sc, + struct xrep_adoption *adopt); +int xrep_adoption_compute_name(struct xrep_adoption *adopt, + unsigned char *namebuf); +int xrep_adoption_move(struct xrep_adoption *adopt); +int xrep_adoption_trans_roll(struct xrep_adoption *adopt); +#else +struct xrep_adoption { /* empty */ }; +# define xrep_orphanage_rele(sc) ((void)0) +#endif /* CONFIG_XFS_ONLINE_REPAIR */ + +#endif /* __XFS_SCRUB_ORPHANAGE_H__ */ diff --git a/fs/xfs/scrub/parent_repair.c b/fs/xfs/scrub/parent_repair.c index 430f95171b50b..2eb0dbde9c459 100644 --- a/fs/xfs/scrub/parent_repair.c +++ b/fs/xfs/scrub/parent_repair.c @@ -32,6 +32,8 @@ #include "scrub/iscan.h" #include "scrub/findparent.h" #include "scrub/readdir.h" +#include "scrub/tempfile.h" +#include "scrub/orphanage.h" /* * Repairing The Directory Parent Pointer @@ -57,6 +59,12 @@ struct xrep_parent { * dotdot entry for this directory. */ struct xrep_parent_scan_info pscan; + + /* Orphanage reparenting request. */ + struct xrep_adoption adoption; + + /* Directory entry name, plus the trailing null. */ + unsigned char namebuf[MAXNAMELEN]; }; /* Tear down all the incore stuff we created. */ @@ -82,7 +90,7 @@ xrep_setup_parent( rp->sc = sc; sc->buf = rp; - return 0; + return xrep_orphanage_try_create(sc); } /* @@ -179,6 +187,91 @@ xrep_parent_reset_dotdot( return xfs_trans_roll(&sc->tp); } +/* + * Move the current file to the orphanage. + * + * Caller must hold IOLOCK_EXCL on @sc->ip, and no other inode locks. Upon + * successful return, the scrub transaction will have enough extra reservation + * to make the move; it will hold IOLOCK_EXCL and ILOCK_EXCL of @sc->ip and the + * orphanage; and both inodes will be ijoined. + */ +STATIC int +xrep_parent_move_to_orphanage( + struct xrep_parent *rp) +{ + struct xfs_scrub *sc = rp->sc; + xfs_ino_t orig_parent, new_parent; + int error; + + /* + * We are about to drop the ILOCK on sc->ip to lock the orphanage and + * prepare for the adoption. Therefore, look up the old dotdot entry + * for sc->ip so that we can compare it after we re-lock sc->ip. + */ + error = xchk_dir_lookup(sc, sc->ip, &xfs_name_dotdot, &orig_parent); + if (error) + return error; + + /* + * Drop the ILOCK on the scrub target and commit the transaction. + * Adoption computes its own resource requirements and gathers the + * necessary components. + */ + error = xrep_trans_commit(sc); + if (error) + return error; + xchk_iunlock(sc, XFS_ILOCK_EXCL); + + /* If we can take the orphanage's iolock then we're ready to move. */ + if (!xrep_orphanage_ilock_nowait(sc, XFS_IOLOCK_EXCL)) { + xchk_iunlock(sc, sc->ilock_flags); + error = xrep_orphanage_iolock_two(sc); + if (error) + return error; + } + + /* Grab transaction and ILOCK the two files. */ + error = xrep_adoption_trans_alloc(sc, &rp->adoption); + if (error) + return error; + + error = xrep_adoption_compute_name(&rp->adoption, rp->namebuf); + if (error) + return error; + + /* + * Now that we've reacquired the ILOCK on sc->ip, look up the dotdot + * entry again. If the parent changed or the child was unlinked while + * the child directory was unlocked, we don't need to move the child to + * the orphanage after all. + */ + error = xchk_dir_lookup(sc, sc->ip, &xfs_name_dotdot, &new_parent); + if (error) + return error; + + /* + * Attach to the orphanage if we still have a linked directory and it + * hasn't been moved. + */ + if (orig_parent == new_parent && VFS_I(sc->ip)->i_nlink > 0) { + error = xrep_adoption_move(&rp->adoption); + if (error) + return error; + } + + /* + * Launder the scrub transaction so we can drop the orphanage ILOCK + * and IOLOCK. Return holding the scrub target's ILOCK and IOLOCK. + */ + error = xrep_adoption_trans_roll(&rp->adoption); + if (error) + return error; + + xrep_orphanage_iunlock(sc, XFS_ILOCK_EXCL); + xrep_orphanage_iunlock(sc, XFS_IOLOCK_EXCL); + return 0; +} + /* * Commit the new parent pointer structure (currently only the dotdot entry) to * the file that we're repairing. @@ -188,7 +281,8 @@ xrep_parent_rebuild_tree( struct xrep_parent *rp) { if (rp->pscan.parent_ino == NULLFSINO) { - /* Cannot fix orphaned directories yet. */ + if (xrep_orphanage_can_adopt(rp->sc)) + return xrep_parent_move_to_orphanage(rp); return -EFSCORRUPTED; } diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c index f9455502b4170..0d35f1f30d9be 100644 --- a/fs/xfs/scrub/scrub.c +++ b/fs/xfs/scrub/scrub.c @@ -27,6 +27,7 @@ #include "scrub/stats.h" #include "scrub/xfile.h" #include "scrub/tempfile.h" +#include "scrub/orphanage.h" /* * Online Scrub and Repair @@ -220,6 +221,7 @@ xchk_teardown( } xrep_tempfile_rele(sc); + xrep_orphanage_rele(sc); xchk_fsgates_disable(sc); return error; } diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h index 48b2fb8271499..09769af6b66a9 100644 --- a/fs/xfs/scrub/scrub.h +++ b/fs/xfs/scrub/scrub.h @@ -105,6 +105,10 @@ struct xfs_scrub { /* Lock flags for @ip. */ uint ilock_flags; + /* The orphanage, for stashing files that have lost their parent. */ + uint orphanage_ilock_flags; + struct xfs_inode *orphanage; + /* A temporary file on this filesystem, for staging new metadata. */ struct xfs_inode *tempip; uint temp_ilock_flags; diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index 5ccb92214a80d..10b8aa82c2fc8 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -2669,6 +2669,34 @@ DEFINE_EVENT(xrep_dirent_class, name, \ DEFINE_XREP_DIRENT_EVENT(xrep_dir_salvage_entry); DEFINE_XREP_DIRENT_EVENT(xrep_dir_stash_createname); DEFINE_XREP_DIRENT_EVENT(xrep_dir_replay_createname); +DEFINE_XREP_DIRENT_EVENT(xrep_adoption_reparent); + +DECLARE_EVENT_CLASS(xrep_adoption_class, + TP_PROTO(struct xfs_inode *dp, struct xfs_inode *ip, bool moved), + TP_ARGS(dp, ip, moved), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(xfs_ino_t, dir_ino) + __field(xfs_ino_t, child_ino) + __field(bool, moved) + ), + TP_fast_assign( + __entry->dev = dp->i_mount->m_super->s_dev; + __entry->dir_ino = dp->i_ino; + __entry->child_ino = ip->i_ino; + __entry->moved = moved; + ), + TP_printk("dev %d:%d dir 0x%llx child 0x%llx moved? %d", + MAJOR(__entry->dev), MINOR(__entry->dev), + __entry->dir_ino, + __entry->child_ino, + __entry->moved) +); +#define DEFINE_XREP_ADOPTION_EVENT(name) \ +DEFINE_EVENT(xrep_adoption_class, name, \ + TP_PROTO(struct xfs_inode *dp, struct xfs_inode *ip, bool moved), \ + TP_ARGS(dp, ip, moved)) +DEFINE_XREP_ADOPTION_EVENT(xrep_adoption_trans_roll); DECLARE_EVENT_CLASS(xrep_parent_salvage_class, TP_PROTO(struct xfs_inode *dp, xfs_ino_t ino), diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index 0d7dcb128f857..9618c014615f5 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -936,10 +936,10 @@ xfs_droplink( /* * Increment the link count on an inode & log the change. */ -static void +void xfs_bumplink( - xfs_trans_t *tp, - xfs_inode_t *ip) + struct xfs_trans *tp, + struct xfs_inode *ip) { xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG); diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h index f11e91c6e2182..9cee94a0de2c8 100644 --- a/fs/xfs/xfs_inode.h +++ b/fs/xfs/xfs_inode.h @@ -621,6 +621,7 @@ void xfs_end_io(struct work_struct *work); int xfs_ilock2_io_mmap(struct xfs_inode *ip1, struct xfs_inode *ip2); void xfs_iunlock2_io_mmap(struct xfs_inode *ip1, struct xfs_inode *ip2); void xfs_iunlock2_remapping(struct xfs_inode *ip1, struct xfs_inode *ip2); +void xfs_bumplink(struct xfs_trans *tp, struct xfs_inode *ip); static inline bool xfs_inode_unlinked_incomplete( ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCHSET v25.0 0/3] xfs: move orphan files to lost and found @ 2023-05-26 0:36 Darrick J. Wong 2023-05-26 1:36 ` [PATCH 1/3] xfs: move orphan files to the orphanage Darrick J. Wong 0 siblings, 1 reply; 12+ messages in thread From: Darrick J. Wong @ 2023-05-26 0:36 UTC (permalink / raw) To: djwong; +Cc: linux-xfs Hi all, Orphaned files are defined to be files with nonzero ondisk link count but no observable parent directory. This series enables online repair to reparent orphaned files into the filesystem directory tree, and wires up this reparenting ability into the directory, file link count, and parent pointer repair functions. This is how we fix files with positive link count that are not reachable through the directory tree. This patch will also create the orphanage directory (lost+found) if it is not present. In contrast to xfs_repair, we follow e2fsck in creating the lost+found without group or other-owner access to avoid accidental disclosure of files that were previously hidden by an 0700 directory. That's silly security, but people have been known to do it. If you're going to start using this mess, you probably ought to just pull from my git trees, which are linked below. This is an extraordinary way to destroy everything. Enjoy! Comments and questions are, as always, welcome. --D kernel git tree: https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-orphanage --- fs/xfs/Makefile | 1 fs/xfs/scrub/dir_repair.c | 129 +++++++++- fs/xfs/scrub/nlinks.c | 11 + fs/xfs/scrub/nlinks.h | 6 fs/xfs/scrub/nlinks_repair.c | 129 ++++++++++ fs/xfs/scrub/orphanage.c | 539 ++++++++++++++++++++++++++++++++++++++++++ fs/xfs/scrub/orphanage.h | 73 ++++++ fs/xfs/scrub/parent_repair.c | 94 +++++++ fs/xfs/scrub/repair.h | 2 fs/xfs/scrub/scrub.c | 2 fs/xfs/scrub/scrub.h | 4 fs/xfs/scrub/trace.c | 1 fs/xfs/scrub/trace.h | 81 ++++++ fs/xfs/xfs_inode.c | 6 fs/xfs/xfs_inode.h | 1 15 files changed, 1054 insertions(+), 25 deletions(-) create mode 100644 fs/xfs/scrub/orphanage.c create mode 100644 fs/xfs/scrub/orphanage.h ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/3] xfs: move orphan files to the orphanage 2023-05-26 0:36 [PATCHSET v25.0 0/3] xfs: move orphan files to lost and found Darrick J. Wong @ 2023-05-26 1:36 ` Darrick J. Wong 0 siblings, 0 replies; 12+ messages in thread From: Darrick J. Wong @ 2023-05-26 1:36 UTC (permalink / raw) To: djwong; +Cc: linux-xfs From: Darrick J. Wong <djwong@kernel.org> When we're repairing a directory structure or fixing the dotdot entry of a subdirectory, it's possible that we won't ever find a parent for the subdirectory. When this is the case, move it to the orphanage, aka /lost+found. Signed-off-by: Darrick J. Wong <djwong@kernel.org> --- fs/xfs/Makefile | 1 fs/xfs/scrub/dir_repair.c | 129 +++++++++++- fs/xfs/scrub/orphanage.c | 448 ++++++++++++++++++++++++++++++++++++++++++ fs/xfs/scrub/orphanage.h | 73 +++++++ fs/xfs/scrub/parent_repair.c | 94 ++++++++- fs/xfs/scrub/scrub.c | 2 fs/xfs/scrub/scrub.h | 4 fs/xfs/scrub/trace.h | 23 ++ fs/xfs/xfs_inode.c | 6 - fs/xfs/xfs_inode.h | 1 10 files changed, 764 insertions(+), 17 deletions(-) create mode 100644 fs/xfs/scrub/orphanage.c create mode 100644 fs/xfs/scrub/orphanage.h diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile index ac0da07428c8..e52cd4ccf8d6 100644 --- a/fs/xfs/Makefile +++ b/fs/xfs/Makefile @@ -200,6 +200,7 @@ xfs-y += $(addprefix scrub/, \ inode_repair.o \ newbt.o \ nlinks_repair.o \ + orphanage.o \ parent_repair.o \ rcbag_btree.o \ rcbag.o \ diff --git a/fs/xfs/scrub/dir_repair.c b/fs/xfs/scrub/dir_repair.c index 215a8c1ce5e3..60578de031ca 100644 --- a/fs/xfs/scrub/dir_repair.c +++ b/fs/xfs/scrub/dir_repair.c @@ -42,6 +42,7 @@ #include "scrub/readdir.h" #include "scrub/reap.h" #include "scrub/findparent.h" +#include "scrub/orphanage.h" /* * Directory Repair @@ -115,12 +116,21 @@ struct xrep_dir { */ struct xrep_parent_scan_info pscan; + /* + * Context information for attaching this directory to the lost+found + * if this directory does not have a parent. + */ + struct xrep_adoption adoption; + /* How many subdirectories did we find? */ uint64_t subdirs; /* How many dirents did we find? */ unsigned int dirents; + /* Should we move this directory to the orphanage? */ + bool move_orphanage; + /* Directory entry name, plus the trailing null. */ unsigned char namebuf[MAXNAMELEN]; }; @@ -147,6 +157,10 @@ xrep_setup_directory( xchk_fsgates_enable(sc, XCHK_FSGATES_DIRENTS); + error = xrep_orphanage_try_create(sc); + if (error) + return error; + error = xrep_tempfile_create(sc, S_IFDIR); if (error) return error; @@ -1136,10 +1150,8 @@ xrep_dir_set_nlink( /* * The directory is not on the incore unlinked list, which means that * it needs to be reachable via the directory tree. Update the nlink - * with our observed link count. - * - * XXX: A subsequent patch will handle parentless directories by moving - * them to the lost and found instead of aborting the repair. + * with our observed link count. If the directory has no parent, it + * will be moved to the orphanage. */ if (!xfs_inode_on_unlinked_list(dp)) { xrep_set_nlink(sc->ip, rd->subdirs + 2); @@ -1152,6 +1164,7 @@ xrep_dir_set_nlink( * inactivate when the last reference drops. */ if (rd->dirents == 0) { + rd->move_orphanage = false; xrep_set_nlink(sc->ip, 0); return 0; } @@ -1160,7 +1173,8 @@ xrep_dir_set_nlink( * The directory is on the unlinked list and we found dirents. This * directory needs to be reachable via the directory tree. Remove the * dir from the unlinked list and update nlink with the observed link - * count. + * count. If the directory has no parent, it will be moved to the + * orphanage. */ pag = xfs_perag_get(sc->mp, XFS_INO_TO_AGNO(sc->mp, dp->i_ino)); if (!pag) { @@ -1187,12 +1201,16 @@ xrep_dir_swap( int error = 0; /* - * If we never found the parent for this directory, we can't fix this - * directory. + * If we never found the parent for this directory, temporarily assign + * the root dir as the parent; we'll move this to the orphanage after + * swapping the dir contents. We hold the ILOCK of the dir being + * repaired, so we're not worried about racy updates of dotdot. */ ASSERT(sc->ilock_flags & XFS_ILOCK_EXCL); - if (rd->pscan.parent_ino == NULLFSINO) - return -EFSCORRUPTED; + if (rd->pscan.parent_ino == NULLFSINO) { + rd->move_orphanage = true; + rd->pscan.parent_ino = rd->sc->mp->m_sb.sb_rootino; + } /* * Reset the temporary directory's '..' entry to point to the parent @@ -1326,6 +1344,93 @@ xrep_dir_setup_scan( return error; } +/* + * Move the current file to the orphanage. + * + * Caller must hold IOLOCK_EXCL on @sc->ip, and no other inode locks. Upon + * successful return, the scrub transaction will have enough extra reservation + * to make the move; it will hold IOLOCK_EXCL and ILOCK_EXCL of @sc->ip and the + * orphanage; and both inodes will be ijoined. + */ +STATIC int +xrep_dir_move_to_orphanage( + struct xrep_dir *rd) +{ + struct xfs_scrub *sc = rd->sc; + xfs_ino_t orig_parent, new_parent; + int error; + + /* No orphanage? We can't fix this. */ + if (!sc->orphanage) + return -EFSCORRUPTED; + + /* + * We are about to drop the ILOCK on sc->ip to lock the orphanage and + * prepare for the adoption. Therefore, look up the old dotdot entry + * for sc->ip so that we can compare it after we re-lock sc->ip. + */ + error = xchk_dir_lookup(sc, sc->ip, &xfs_name_dotdot, &orig_parent); + if (error) + return error; + + /* + * We hold ILOCK_EXCL on both the directory and the tempdir after a + * successful rebuild. Before we can move the directory to the + * orphanage, we must roll to a clean unjoined transaction. + */ + error = xfs_trans_roll(&sc->tp); + if (error) + return error; + + /* + * Because the orphanage is just another directory in the filesystem, + * we must take its IOLOCK to coordinate with the VFS. We cannot take + * an IOLOCK while holding an ILOCK, so we must drop them all. We may + * have to drop the IOLOCK as well. + */ + xrep_tempfile_iunlock_both(sc); + + error = xrep_adoption_init(sc, &rd->adoption); + if (error) + return error; + + if (!xrep_orphanage_ilock_nowait(sc, XFS_IOLOCK_EXCL)) { + xchk_iunlock(sc, sc->ilock_flags); + error = xrep_orphanage_iolock_two(sc); + if (error) + goto err_adoption; + } + + /* Prepare for the adoption and lock both down. */ + error = xrep_adoption_prep(&rd->adoption); + if (error) + goto err_adoption; + + error = xrep_adoption_compute_name(&rd->adoption, rd->namebuf); + if (error) + goto err_adoption; + + /* + * Now that we've reacquired the ILOCK on sc->ip, look up the dotdot + * entry again. If the parent changed or the child was unlinked while + * the child directory was unlocked, we don't need to move the child to + * the orphanage after all. + */ + error = xchk_dir_lookup(sc, sc->ip, &xfs_name_dotdot, &new_parent); + if (error) + goto err_adoption; + if (orig_parent != new_parent || VFS_I(sc->ip)->i_nlink == 0) { + error = 0; + goto err_adoption; + } + + /* Attach to the orphanage. */ + return xrep_adoption_commit(&rd->adoption); +err_adoption: + xrep_adoption_cancel(&rd->adoption, error); + return error; +} + /* * Repair the directory metadata. * @@ -1364,6 +1469,12 @@ xrep_directory( if (error) goto out_teardown; + if (rd->move_orphanage) { + error = xrep_dir_move_to_orphanage(rd); + if (error) + goto out_teardown; + } + out_teardown: xrep_dir_teardown(sc); return error; diff --git a/fs/xfs/scrub/orphanage.c b/fs/xfs/scrub/orphanage.c new file mode 100644 index 000000000000..aece4f94398e --- /dev/null +++ b/fs/xfs/scrub/orphanage.c @@ -0,0 +1,448 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (C) 2021-2023 Oracle. All Rights Reserved. + * Author: Darrick J. Wong <djwong@kernel.org> + */ +#include "xfs.h" +#include "xfs_fs.h" +#include "xfs_shared.h" +#include "xfs_format.h" +#include "xfs_trans_resv.h" +#include "xfs_mount.h" +#include "xfs_log_format.h" +#include "xfs_trans.h" +#include "xfs_inode.h" +#include "xfs_ialloc.h" +#include "xfs_quota.h" +#include "xfs_trans_space.h" +#include "xfs_dir2.h" +#include "xfs_icache.h" +#include "xfs_bmap.h" +#include "xfs_bmap_btree.h" +#include "scrub/scrub.h" +#include "scrub/common.h" +#include "scrub/repair.h" +#include "scrub/trace.h" +#include "scrub/orphanage.h" +#include "scrub/readdir.h" + +#include <linux/namei.h> + +/* Make the orphanage owned by root. */ +STATIC int +xrep_chown_orphanage( + struct xfs_scrub *sc, + struct xfs_inode *dp) +{ + struct xfs_trans *tp; + struct xfs_mount *mp = sc->mp; + struct xfs_dquot *udqp = NULL, *gdqp = NULL, *pdqp = NULL; + struct xfs_dquot *oldu = NULL, *oldg = NULL, *oldp = NULL; + struct inode *inode = VFS_I(dp); + int error; + + error = xfs_qm_vop_dqalloc(dp, GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, 0, + XFS_QMOPT_QUOTALL, &udqp, &gdqp, &pdqp); + if (error) + return error; + + error = xfs_trans_alloc_ichange(dp, udqp, gdqp, pdqp, true, &tp); + if (error) + goto out_dqrele; + + /* + * Always clear setuid/setgid on the orphanage since we don't normally + * want that functionality on this directory and xfs_repair doesn't + * create it this way either. Leave the other access bits unchanged. + */ + inode->i_mode &= ~(S_ISUID | S_ISGID); + + /* + * Change the ownerships and register quota modifications + * in the transaction. + */ + if (!uid_eq(inode->i_uid, GLOBAL_ROOT_UID)) { + if (XFS_IS_UQUOTA_ON(mp)) + oldu = xfs_qm_vop_chown(tp, dp, &dp->i_udquot, udqp); + inode->i_uid = GLOBAL_ROOT_UID; + } + if (!gid_eq(inode->i_gid, GLOBAL_ROOT_GID)) { + if (XFS_IS_GQUOTA_ON(mp)) + oldg = xfs_qm_vop_chown(tp, dp, &dp->i_gdquot, gdqp); + inode->i_gid = GLOBAL_ROOT_GID; + } + if (dp->i_projid != 0) { + if (XFS_IS_PQUOTA_ON(mp)) + oldp = xfs_qm_vop_chown(tp, dp, &dp->i_pdquot, pdqp); + dp->i_projid = 0; + } + + dp->i_diflags &= ~(XFS_DIFLAG_REALTIME | XFS_DIFLAG_RTINHERIT); + xfs_trans_log_inode(tp, dp, XFS_ILOG_CORE); + + XFS_STATS_INC(mp, xs_ig_attrchg); + + if (xfs_has_wsync(mp)) + xfs_trans_set_sync(tp); + error = xfs_trans_commit(tp); + + xfs_qm_dqrele(oldu); + xfs_qm_dqrele(oldg); + xfs_qm_dqrele(oldp); + +out_dqrele: + xfs_qm_dqrele(udqp); + xfs_qm_dqrele(gdqp); + xfs_qm_dqrele(pdqp); + return error; +} + +#define ORPHANAGE "lost+found" + +/* Create the orphanage directory, and set sc->orphanage to it. */ +int +xrep_orphanage_create( + struct xfs_scrub *sc) +{ + struct xfs_mount *mp = sc->mp; + struct dentry *root_dentry, *orphanage_dentry; + struct inode *root_inode = VFS_I(sc->mp->m_rootip); + struct inode *orphanage_inode; + int error; + + if (xfs_is_shutdown(mp)) + return -EIO; + if (xfs_is_readonly(mp)) { + sc->orphanage = NULL; + return 0; + } + + ASSERT(sc->tp == NULL); + ASSERT(sc->orphanage == NULL); + + /* Find the dentry for the root directory... */ + root_dentry = d_find_alias(root_inode); + if (!root_dentry) { + error = -EFSCORRUPTED; + goto out; + } + + /* ...which is a directory, right? */ + if (!d_is_dir(root_dentry)) { + error = -EFSCORRUPTED; + goto out_dput_root; + } + + /* Try to find the orphanage directory. */ + inode_lock_nested(root_inode, I_MUTEX_PARENT); + orphanage_dentry = lookup_one_len(ORPHANAGE, root_dentry, + strlen(ORPHANAGE)); + if (IS_ERR(orphanage_dentry)) { + error = PTR_ERR(orphanage_dentry); + goto out_unlock_root; + } + + /* + * Nothing found? Call mkdir to create the orphanage. Create the + * directory without group or other-user access because we're live and + * someone could have been relying partly on minimal access to a parent + * directory to control access to a file we put in here. + */ + if (d_really_is_negative(orphanage_dentry)) { + error = vfs_mkdir(&nop_mnt_idmap, root_inode, orphanage_dentry, + 0700); + if (error) + goto out_dput_orphanage; + } + + /* Not a directory? Bail out. */ + if (!d_is_dir(orphanage_dentry)) { + error = -ENOTDIR; + goto out_dput_orphanage; + } + + /* + * Grab a reference to the orphanage. This /should/ succeed since + * we hold the root directory locked and therefore nobody can delete + * the orphanage. + */ + orphanage_inode = igrab(d_inode(orphanage_dentry)); + if (!orphanage_inode) { + error = -ENOENT; + goto out_dput_orphanage; + } + + /* Make sure the orphanage is owned by root. */ + error = xrep_chown_orphanage(sc, XFS_I(orphanage_inode)); + if (error) + goto out_dput_orphanage; + + /* Stash the reference for later and bail out. */ + sc->orphanage = XFS_I(orphanage_inode); + sc->orphanage_ilock_flags = 0; + +out_dput_orphanage: + dput(orphanage_dentry); +out_unlock_root: + inode_unlock(VFS_I(sc->mp->m_rootip)); +out_dput_root: + dput(root_dentry); +out: + return error; +} + +void +xrep_orphanage_ilock( + struct xfs_scrub *sc, + unsigned int ilock_flags) +{ + sc->orphanage_ilock_flags |= ilock_flags; + xfs_ilock(sc->orphanage, ilock_flags); +} + +bool +xrep_orphanage_ilock_nowait( + struct xfs_scrub *sc, + unsigned int ilock_flags) +{ + if (xfs_ilock_nowait(sc->orphanage, ilock_flags)) { + sc->orphanage_ilock_flags |= ilock_flags; + return true; + } + + return false; +} + +void +xrep_orphanage_iunlock( + struct xfs_scrub *sc, + unsigned int ilock_flags) +{ + xfs_iunlock(sc->orphanage, ilock_flags); + sc->orphanage_ilock_flags &= ~ilock_flags; +} + +/* Grab the IOLOCK of the orphanage and sc->ip. */ +int +xrep_orphanage_iolock_two( + struct xfs_scrub *sc) +{ + int error = 0; + + while (true) { + if (xchk_should_terminate(sc, &error)) + return error; + + /* + * Normal XFS takes the IOLOCK before grabbing a transaction. + * Scrub holds a transaction, which means that we can't block + * on either IOLOCK. + */ + if (xrep_orphanage_ilock_nowait(sc, XFS_IOLOCK_EXCL)) { + if (xchk_ilock_nowait(sc, XFS_IOLOCK_EXCL)) + break; + xrep_orphanage_iunlock(sc, XFS_IOLOCK_EXCL); + } + delay(1); + } + + return 0; +} + +/* + * Set up the adoption structure and compute the block reservations needed to + * add sc->ip to the orphanage. + */ +int +xrep_adoption_init( + struct xfs_scrub *sc, + struct xrep_adoption *adopt) +{ + struct xfs_mount *mp = sc->mp; + unsigned int child_blkres = 0; + + adopt->sc = sc; + adopt->orphanage_blkres = XFS_LINK_SPACE_RES(mp, MAXNAMELEN); + if (S_ISDIR(VFS_I(sc->ip)->i_mode)) + child_blkres = XFS_RENAME_SPACE_RES(sc->mp, + xfs_name_dotdot.len); + adopt->child_blkres = child_blkres; + return 0; +} + +/* + * Compute the xfs_name for the directory entry that we're adding to the + * orphanage. Caller must hold ILOCKs of sc->ip and the orphanage and must not + * reuse namebuf until the adoption completes or is cancelled. + */ +int +xrep_adoption_compute_name( + struct xrep_adoption *adopt, + unsigned char *namebuf) +{ + struct xfs_name *xname = &adopt->xname; + struct xfs_scrub *sc = adopt->sc; + xfs_ino_t ino; + unsigned int incr = 0; + int error = 0; + + xname->name = namebuf; + xname->len = snprintf(namebuf, MAXNAMELEN, "%llu", sc->ip->i_ino); + xname->type = xfs_mode_to_ftype(VFS_I(sc->ip)->i_mode); + + /* Make sure the filename is unique in the lost+found. */ + error = xchk_dir_lookup(sc, sc->orphanage, xname, &ino); + while (error == 0 && incr < 10000) { + xname->len = snprintf(namebuf, MAXNAMELEN, "%llu.%u", + sc->ip->i_ino, ++incr); + error = xchk_dir_lookup(sc, sc->orphanage, xname, &ino); + } + if (error == 0) { + /* We already have 10,000 entries in the orphanage? */ + return -EFSCORRUPTED; + } + + if (error != -ENOENT) + return error; + return 0; +} + +/* + * Prepare to send a child to the orphanage. + * + * Reserve more space in the transaction, take the ILOCKs of the orphanage and + * sc->ip, join them to the transaction, and reserve quota to reparent the + * latter. Caller must hold the IOLOCK of the orphanage and sc->ip. + */ +int +xrep_adoption_prep( + struct xrep_adoption *adopt) +{ + struct xfs_scrub *sc = adopt->sc; + int error; + + /* + * Reserve space to the transaction to handle expansion of both the + * orphanage and the child directory. + */ + error = xfs_trans_reserve_more(sc->tp, + adopt->orphanage_blkres + adopt->child_blkres, 0); + if (error) + return error; + + xfs_lock_two_inodes(sc->orphanage, XFS_ILOCK_EXCL, + sc->ip, XFS_ILOCK_EXCL); + sc->ilock_flags |= XFS_ILOCK_EXCL; + sc->orphanage_ilock_flags |= XFS_ILOCK_EXCL; + + xfs_trans_ijoin(sc->tp, sc->orphanage, 0); + xfs_trans_ijoin(sc->tp, sc->ip, 0); + + /* + * Reserve enough quota in the orphan directory to add the new name. + * Normally the orphanage should have user/group/project ids of zero + * and hence is not subject to quota enforcement, but we're allowed to + * exceed quota to reattach disconnected parts of the directory tree. + */ + error = xfs_trans_reserve_quota_nblks(sc->tp, sc->orphanage, + adopt->orphanage_blkres, 0, true); + if (error) + return error; + + /* + * Reserve enough quota in the child directory to change dotdot. + * Here we're also allowed to exceed file quota to repair inconsistent + * metadata. + */ + if (adopt->child_blkres) { + error = xfs_trans_reserve_quota_nblks(sc->tp, sc->ip, + adopt->child_blkres, 0, true); + if (error) + return error; + } + + return 0; +} + +/* + * Move the current file to the orphanage. + * + * The caller must hold the IOLOCKs and the ILOCKs for both sc->ip and the + * orphanage. The directory entry name must have been computed, and quota + * reserved. The function returns with both inodes joined and ILOCKed to the + * transaction. + */ +int +xrep_adoption_commit( + struct xrep_adoption *adopt) +{ + struct xfs_scrub *sc = adopt->sc; + struct xfs_name *xname = &adopt->xname; + bool isdir = S_ISDIR(VFS_I(sc->ip)->i_mode); + int error; + + trace_xrep_adoption_commit(sc->orphanage, &adopt->xname, sc->ip->i_ino); + + /* + * Create the new name in the orphanage, and bump the link count of + * the orphanage if we just added a directory. + */ + error = xfs_dir_createname(sc->tp, sc->orphanage, xname, sc->ip->i_ino, + adopt->orphanage_blkres); + if (error) + return error; + + xfs_trans_ichgtime(sc->tp, sc->orphanage, + XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG); + if (isdir) + xfs_bumplink(sc->tp, sc->orphanage); + xfs_trans_log_inode(sc->tp, sc->orphanage, XFS_ILOG_CORE); + + /* Replace the dotdot entry in the child directory. */ + if (isdir) { + error = xfs_dir_replace(sc->tp, sc->ip, &xfs_name_dotdot, + sc->orphanage->i_ino, adopt->child_blkres); + if (error) + return error; + } + + /* + * Notify dirent hooks that we moved the file to /lost+found, and + * finish all the deferred work so that we know the adoption is fully + * recorded in the log. + */ + xfs_dir_update_hook(sc->orphanage, sc->ip, 1, xname); + return xrep_defer_finish(sc); +} + +/* Cancel a proposed relocation of a file to the orphanage. */ +void +xrep_adoption_cancel( + struct xrep_adoption *adopt, + int error) +{ + struct xfs_scrub *sc = adopt->sc; + + /* + * Setting up (and hence cancelling) an adoption is the last thing that + * repair code does. Hence we don't bother giving back the quota or + * space reservations or unlock the inodes. Later when we have incore + * state to manage, we'll need to give that back. + */ + trace_xrep_adoption_cancel(sc->orphanage, sc->ip, error); +} + +/* Release the orphanage. */ +void +xrep_orphanage_rele( + struct xfs_scrub *sc) +{ + if (!sc->orphanage) + return; + + if (sc->orphanage_ilock_flags) + xfs_iunlock(sc->orphanage, sc->orphanage_ilock_flags); + + xchk_irele(sc, sc->orphanage); + sc->orphanage = NULL; +} diff --git a/fs/xfs/scrub/orphanage.h b/fs/xfs/scrub/orphanage.h new file mode 100644 index 000000000000..31f068198c8a --- /dev/null +++ b/fs/xfs/scrub/orphanage.h @@ -0,0 +1,73 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (C) 2021-2023 Oracle. All Rights Reserved. + * Author: Darrick J. Wong <djwong@kernel.org> + */ +#ifndef __XFS_SCRUB_ORPHANAGE_H__ +#define __XFS_SCRUB_ORPHANAGE_H__ + +#ifdef CONFIG_XFS_ONLINE_REPAIR +int xrep_orphanage_create(struct xfs_scrub *sc); + +/* + * If we're doing a repair, ensure that the orphanage exists and attach it to + * the scrub context. + */ +static inline int +xrep_orphanage_try_create( + struct xfs_scrub *sc) +{ + int error; + + ASSERT(sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR); + + error = xrep_orphanage_create(sc); + switch (error) { + case 0: + case -ENOENT: + case -ENOTDIR: + case -ENOSPC: + /* + * If the orphanage can't be found or isn't a directory, we'll + * keep going, but we won't be able to attach the file to the + * orphanage if we can't find the parent. + */ + return 0; + } + + return error; +} + +int xrep_orphanage_iolock_two(struct xfs_scrub *sc); + +/* Information about a request to add a file to the orphanage. */ +struct xrep_adoption { + /* Name structure; caller must provide a buffer separately. */ + struct xfs_name xname; + + struct xfs_scrub *sc; + + /* Block reservations for orphanage and child (if directory). */ + unsigned int orphanage_blkres; + unsigned int child_blkres; +}; + +int xrep_adoption_init(struct xfs_scrub *sc, struct xrep_adoption *adopt); +int xrep_adoption_compute_name(struct xrep_adoption *adopt, + unsigned char *namebuf); +int xrep_adoption_prep(struct xrep_adoption *adopt); +int xrep_adoption_commit(struct xrep_adoption *adopt); +void xrep_adoption_cancel(struct xrep_adoption *adopt, int error); + +void xrep_orphanage_ilock(struct xfs_scrub *sc, unsigned int ilock_flags); +bool xrep_orphanage_ilock_nowait(struct xfs_scrub *sc, + unsigned int ilock_flags); +void xrep_orphanage_iunlock(struct xfs_scrub *sc, unsigned int ilock_flags); + +void xrep_orphanage_rele(struct xfs_scrub *sc); +#else +struct xrep_adoption { /* empty */ }; +# define xrep_orphanage_rele(sc) +#endif /* CONFIG_XFS_ONLINE_REPAIR */ + +#endif /* __XFS_SCRUB_ORPHANAGE_H__ */ diff --git a/fs/xfs/scrub/parent_repair.c b/fs/xfs/scrub/parent_repair.c index 0343b1fdbeeb..e0ac7336c1ac 100644 --- a/fs/xfs/scrub/parent_repair.c +++ b/fs/xfs/scrub/parent_repair.c @@ -32,6 +32,8 @@ #include "scrub/iscan.h" #include "scrub/findparent.h" #include "scrub/readdir.h" +#include "scrub/tempfile.h" +#include "scrub/orphanage.h" /* * Repairing The Directory Parent Pointer @@ -57,6 +59,12 @@ struct xrep_parent { * dotdot entry for this directory. */ struct xrep_parent_scan_info pscan; + + /* Orphanage reparenting request. */ + struct xrep_adoption adoption; + + /* Directory entry name, plus the trailing null. */ + unsigned char namebuf[MAXNAMELEN]; }; /* Tear down all the incore stuff we created. */ @@ -82,7 +90,7 @@ xrep_setup_parent( rp->sc = sc; sc->buf = rp; - return 0; + return xrep_orphanage_try_create(sc); } /* @@ -171,6 +179,84 @@ xrep_parent_reset_dotdot( rp->pscan.parent_ino, spaceres); } +/* + * Move the current file to the orphanage. + * + * Caller must hold IOLOCK_EXCL on @sc->ip, and no other inode locks. Upon + * successful return, the scrub transaction will have enough extra reservation + * to make the move; it will hold IOLOCK_EXCL and ILOCK_EXCL of @sc->ip and the + * orphanage; and both inodes will be ijoined. + */ +STATIC int +xrep_parent_move_to_orphanage( + struct xrep_parent *rp) +{ + struct xfs_scrub *sc = rp->sc; + xfs_ino_t orig_parent, new_parent; + int error; + + /* No orphanage? We can't fix this. */ + if (!sc->orphanage) + return -EFSCORRUPTED; + + /* + * We are about to drop the ILOCK on sc->ip to lock the orphanage and + * prepare for the adoption. Therefore, look up the old dotdot entry + * for sc->ip so that we can compare it after we re-lock sc->ip. + */ + error = xchk_dir_lookup(sc, sc->ip, &xfs_name_dotdot, &orig_parent); + if (error) + return error; + + /* + * Because the orphanage is just another directory in the filesystem, + * we must take its IOLOCK to coordinate with the VFS. We cannot take + * an IOLOCK while holding an ILOCK, so we must drop the ILOCK. We + * may have to drop the IOLOCK as well. + */ + xchk_iunlock(sc, XFS_ILOCK_EXCL); + + error = xrep_adoption_init(sc, &rp->adoption); + if (error) + return error; + + /* If we can take the orphanage's iolock then we're ready to move. */ + if (!xrep_orphanage_ilock_nowait(sc, XFS_IOLOCK_EXCL)) { + xchk_iunlock(sc, sc->ilock_flags); + error = xrep_orphanage_iolock_two(sc); + if (error) + goto err_adoption; + } + + /* Prepare for the adoption and lock both down. */ + error = xrep_adoption_prep(&rp->adoption); + if (error) + goto err_adoption; + + error = xrep_adoption_compute_name(&rp->adoption, rp->namebuf); + if (error) + goto err_adoption; + + /* + * Now that we've reacquired the ILOCK on sc->ip, look up the dotdot + * entry again. If the parent changed or the child was unlinked while + * the child directory was unlocked, we don't need to move the child to + * the orphanage after all. + */ + error = xchk_dir_lookup(sc, sc->ip, &xfs_name_dotdot, &new_parent); + if (error) + goto err_adoption; + if (orig_parent != new_parent || VFS_I(sc->ip)->i_nlink == 0) { + error = 0; + goto err_adoption; + } + + return xrep_adoption_commit(&rp->adoption); +err_adoption: + xrep_adoption_cancel(&rp->adoption, error); + return error; +} + /* * Commit the new parent pointer structure (currently only the dotdot entry) to * the file that we're repairing. @@ -179,10 +265,8 @@ STATIC int xrep_parent_rebuild_tree( struct xrep_parent *rp) { - if (rp->pscan.parent_ino == NULLFSINO) { - /* Cannot fix orphaned directories yet. */ - return -EFSCORRUPTED; - } + if (rp->pscan.parent_ino == NULLFSINO) + return xrep_parent_move_to_orphanage(rp); return xrep_parent_reset_dotdot(rp); } diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c index 52404558ac99..f90223b909a2 100644 --- a/fs/xfs/scrub/scrub.c +++ b/fs/xfs/scrub/scrub.c @@ -28,6 +28,7 @@ #include "scrub/health.h" #include "scrub/xfile.h" #include "scrub/tempfile.h" +#include "scrub/orphanage.h" /* * Online Scrub and Repair @@ -221,6 +222,7 @@ xchk_teardown( } xrep_tempfile_rele(sc); + xrep_orphanage_rele(sc); xchk_fsgates_disable(sc); return error; } diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h index e3a53c3a13a3..6f23edcac5cd 100644 --- a/fs/xfs/scrub/scrub.h +++ b/fs/xfs/scrub/scrub.h @@ -105,6 +105,10 @@ struct xfs_scrub { /* Lock flags for @ip. */ uint ilock_flags; + /* The orphanage, for stashing files that have lost their parent. */ + uint orphanage_ilock_flags; + struct xfs_inode *orphanage; + /* A temporary file on this filesystem, for staging new metadata. */ struct xfs_inode *tempip; uint temp_ilock_flags; diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index 260720500440..abf8d1fad537 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -2517,6 +2517,29 @@ DEFINE_EVENT(xrep_dirent_class, name, \ DEFINE_XREP_DIRENT_CLASS(xrep_dir_salvage_entry); DEFINE_XREP_DIRENT_CLASS(xrep_dir_stash_createname); DEFINE_XREP_DIRENT_CLASS(xrep_dir_replay_createname); +DEFINE_XREP_DIRENT_CLASS(xrep_adoption_commit); + +TRACE_EVENT(xrep_adoption_cancel, + TP_PROTO(struct xfs_inode *dp, struct xfs_inode *ip, int error), + TP_ARGS(dp, ip, error), + TP_STRUCT__entry( + __field(dev_t, dev) + __field(xfs_ino_t, dir_ino) + __field(xfs_ino_t, child_ino) + __field(int, error) + ), + TP_fast_assign( + __entry->dev = dp->i_mount->m_super->s_dev; + __entry->dir_ino = dp->i_ino; + __entry->child_ino = ip->i_ino; + __entry->error = error; + ), + TP_printk("dev %d:%d dir 0x%llx child 0x%llx error %d", + MAJOR(__entry->dev), MINOR(__entry->dev), + __entry->dir_ino, + __entry->child_ino, + __entry->error) +); DECLARE_EVENT_CLASS(xrep_parent_salvage_class, TP_PROTO(struct xfs_inode *dp, xfs_ino_t ino), diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index 2fd55c2f1b82..c5d2dae9c00b 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -928,10 +928,10 @@ xfs_droplink( /* * Increment the link count on an inode & log the change. */ -static void +void xfs_bumplink( - xfs_trans_t *tp, - xfs_inode_t *ip) + struct xfs_trans *tp, + struct xfs_inode *ip) { xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG); diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h index 2fe629eede76..f5794e3045ae 100644 --- a/fs/xfs/xfs_inode.h +++ b/fs/xfs/xfs_inode.h @@ -601,6 +601,7 @@ void xfs_end_io(struct work_struct *work); int xfs_ilock2_io_mmap(struct xfs_inode *ip1, struct xfs_inode *ip2); void xfs_iunlock2_io_mmap(struct xfs_inode *ip1, struct xfs_inode *ip2); +void xfs_bumplink(struct xfs_trans *tp, struct xfs_inode *ip); void xfs_inode_count_blocks(struct xfs_trans *tp, struct xfs_inode *ip, xfs_filblks_t *dblocks, xfs_filblks_t *rblocks); ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCHSET v24.0 0/3] xfs: move orphan files to lost and found @ 2022-12-30 22:14 Darrick J. Wong 2022-12-30 22:14 ` [PATCH 1/3] xfs: move orphan files to the orphanage Darrick J. Wong 0 siblings, 1 reply; 12+ messages in thread From: Darrick J. Wong @ 2022-12-30 22:14 UTC (permalink / raw) To: djwong; +Cc: linux-xfs Hi all, Orphaned files are defined to be files with nonzero ondisk link count but no observable parent directory. This series enables online repair to reparent orphaned files into the filesystem directory tree, and wires up this reparenting ability into the directory, file link count, and parent pointer repair functions. This is how we fix files with positive link count that are not reachable through the directory tree. This patch will also create the orphanage directory (lost+found) if it is not present. In contrast to xfs_repair, we follow e2fsck in creating the lost+found without group or other-owner access to avoid accidental disclosure of files that were previously hidden by an 0700 directory. That's silly security, but people have been known to do it. If you're going to start using this mess, you probably ought to just pull from my git trees, which are linked below. This is an extraordinary way to destroy everything. Enjoy! Comments and questions are, as always, welcome. --D kernel git tree: https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=repair-orphanage --- fs/xfs/Makefile | 1 fs/xfs/scrub/dir_repair.c | 101 ++++++++ fs/xfs/scrub/nlinks.c | 11 + fs/xfs/scrub/nlinks.h | 6 + fs/xfs/scrub/nlinks_repair.c | 250 ++++++++++++++++++++- fs/xfs/scrub/orphanage.c | 504 ++++++++++++++++++++++++++++++++++++++++++ fs/xfs/scrub/orphanage.h | 79 +++++++ fs/xfs/scrub/parent.c | 10 + fs/xfs/scrub/parent_repair.c | 92 ++++++++ fs/xfs/scrub/repair.h | 4 fs/xfs/scrub/scrub.c | 2 fs/xfs/scrub/scrub.h | 4 fs/xfs/scrub/trace.c | 1 fs/xfs/scrub/trace.h | 59 +++++ fs/xfs/xfs_inode.c | 6 - fs/xfs/xfs_inode.h | 1 16 files changed, 1119 insertions(+), 12 deletions(-) create mode 100644 fs/xfs/scrub/orphanage.c create mode 100644 fs/xfs/scrub/orphanage.h ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/3] xfs: move orphan files to the orphanage 2022-12-30 22:14 [PATCHSET v24.0 0/3] xfs: move orphan files to lost and found Darrick J. Wong @ 2022-12-30 22:14 ` Darrick J. Wong 0 siblings, 0 replies; 12+ messages in thread From: Darrick J. Wong @ 2022-12-30 22:14 UTC (permalink / raw) To: djwong; +Cc: linux-xfs From: Darrick J. Wong <djwong@kernel.org> If we can't find a parent for a file, move it to the orphanage. Signed-off-by: Darrick J. Wong <djwong@kernel.org> --- fs/xfs/Makefile | 1 fs/xfs/scrub/dir_repair.c | 101 ++++++++++ fs/xfs/scrub/orphanage.c | 414 ++++++++++++++++++++++++++++++++++++++++++ fs/xfs/scrub/orphanage.h | 79 ++++++++ fs/xfs/scrub/parent.c | 10 + fs/xfs/scrub/parent_repair.c | 92 +++++++++ fs/xfs/scrub/repair.h | 2 fs/xfs/scrub/scrub.c | 2 fs/xfs/scrub/scrub.h | 4 fs/xfs/scrub/trace.h | 1 fs/xfs/xfs_inode.c | 6 - fs/xfs/xfs_inode.h | 1 12 files changed, 707 insertions(+), 6 deletions(-) create mode 100644 fs/xfs/scrub/orphanage.c create mode 100644 fs/xfs/scrub/orphanage.h diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile index 5e9ffd9f1583..6d6ca775553f 100644 --- a/fs/xfs/Makefile +++ b/fs/xfs/Makefile @@ -197,6 +197,7 @@ xfs-y += $(addprefix scrub/, \ inode_repair.o \ newbt.o \ nlinks_repair.o \ + orphanage.o \ parent_repair.o \ rcbag_btree.o \ rcbag.o \ diff --git a/fs/xfs/scrub/dir_repair.c b/fs/xfs/scrub/dir_repair.c index 871b14c09e86..7530819e1435 100644 --- a/fs/xfs/scrub/dir_repair.c +++ b/fs/xfs/scrub/dir_repair.c @@ -40,6 +40,7 @@ #include "scrub/readdir.h" #include "scrub/reap.h" #include "scrub/parent.h" +#include "scrub/orphanage.h" /* * Directory Repair @@ -91,9 +92,15 @@ struct xrep_dir { /* nlink value of the corrected directory. */ xfs_nlink_t new_nlink; + /* Should we move this directory to the orphanage? */ + bool move_orphanage; + /* Preallocated args struct for performing dir operations */ struct xfs_da_args args; + /* Orphanage reparinting request. */ + struct xrep_orphanage_req adoption; + /* Directory entry name, plus the trailing null. */ char namebuf[MAXNAMELEN]; }; @@ -108,6 +115,10 @@ xrep_setup_directory( { int error; + error = xrep_orphanage_try_create(sc); + if (error) + return error; + error = xrep_tempfile_create(sc, S_IFDIR); if (error) return error; @@ -1088,8 +1099,76 @@ xrep_dir_find_parent( if (rd->parent_ino != NULLFSINO) return 0; - /* NOTE: A future patch will deal with moving orphans. */ - return -EFSCORRUPTED; + /* + * Temporarily assign the root dir as the parent; we'll move this to + * the orphanage after swapping the dir contents. + */ + rd->move_orphanage = true; + rd->parent_ino = rd->sc->mp->m_sb.sb_rootino; + return 0; +} + +/* + * Move the current file to the orphanage. + * + * Caller must hold IOLOCK_EXCL on @sc->ip, and no other inode locks. Upon + * successful return, the scrub transaction will have enough extra reservation + * to make the move; it will hold IOLOCK_EXCL and ILOCK_EXCL of @sc->ip and the + * orphanage; and both inodes will be ijoined. + */ +STATIC int +xrep_dir_move_to_orphanage( + struct xrep_dir *rd) +{ + struct xfs_scrub *sc = rd->sc; + int error; + + /* No orphanage? We can't fix this. */ + if (!sc->orphanage) + return -EFSCORRUPTED; + + /* If we can take the orphanage's iolock then we're ready to move. */ + if (!xrep_orphanage_ilock_nowait(sc, XFS_IOLOCK_EXCL)) { + xfs_ino_t orig_parent, new_parent; + + /* + * We may have to drop the lock on sc->ip to try to lock the + * orphanage. Therefore, look up the old dotdot entry for + * sc->ip so that we can compare it after we re-lock sc->ip. + */ + orig_parent = xrep_dotdot_lookup(sc); + + xchk_iunlock(sc, sc->ilock_flags); + error = xrep_orphanage_iolock_two(sc); + if (error) + return error; + + /* + * If the parent changed or the child was unlinked while the + * child directory was unlocked, we don't need to move the + * child to the orphanage after all. + */ + new_parent = xrep_dotdot_lookup(sc); + + if (orig_parent != new_parent || VFS_I(sc->ip)->i_nlink == 0) + return 0; + } + + /* + * Move the directory to the orphanage, and let scrub teardown unlock + * everything for us. + */ + xrep_orphanage_compute_blkres(sc, &rd->adoption); + + error = xrep_orphanage_compute_name(&rd->adoption, rd->namebuf); + if (error) + return error; + + error = xrep_orphanage_adoption_prep(&rd->adoption); + if (error) + return error; + + return xrep_orphanage_adopt(&rd->adoption); } /* @@ -1167,6 +1246,24 @@ xrep_directory( /* Swap in the good contents. */ error = xrep_dir_rebuild_tree(rd); + if (error || !rd->move_orphanage) + goto out_rd; + + /* + * We hold ILOCK_EXCL on both the directory and the tempdir after a + * successful rebuild. Before we can move the directory to the + * orphanage, we must roll to a clean unjoined transaction and drop the + * ILOCKs on the dir and the temp dir. We still hold IOLOCK_EXCL on + * the dir, so nobody will be able to access it in the mean time. + */ + error = xfs_trans_roll(&sc->tp); + if (error) + goto out_rd; + + xchk_iunlock(sc, XFS_ILOCK_EXCL); + xrep_tempfile_iunlock(sc); + + error = xrep_dir_move_to_orphanage(rd); out_names: if (rd->dir_names) diff --git a/fs/xfs/scrub/orphanage.c b/fs/xfs/scrub/orphanage.c new file mode 100644 index 000000000000..1fe7935433bf --- /dev/null +++ b/fs/xfs/scrub/orphanage.c @@ -0,0 +1,414 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (C) 2022 Oracle. All Rights Reserved. + * Author: Darrick J. Wong <djwong@kernel.org> + */ +#include "xfs.h" +#include "xfs_fs.h" +#include "xfs_shared.h" +#include "xfs_format.h" +#include "xfs_trans_resv.h" +#include "xfs_mount.h" +#include "xfs_log_format.h" +#include "xfs_trans.h" +#include "xfs_inode.h" +#include "xfs_ialloc.h" +#include "xfs_quota.h" +#include "xfs_trans_space.h" +#include "xfs_dir2.h" +#include "xfs_icache.h" +#include "xfs_bmap.h" +#include "xfs_bmap_btree.h" +#include "scrub/scrub.h" +#include "scrub/common.h" +#include "scrub/repair.h" +#include "scrub/trace.h" +#include "scrub/orphanage.h" + +#include <linux/namei.h> + +/* Make the orphanage owned by root. */ +STATIC int +xrep_chown_orphanage( + struct xfs_scrub *sc, + struct xfs_inode *dp) +{ + struct xfs_trans *tp; + struct xfs_mount *mp = sc->mp; + struct xfs_dquot *udqp = NULL, *gdqp = NULL, *pdqp = NULL; + struct xfs_dquot *oldu = NULL, *oldg = NULL, *oldp = NULL; + struct inode *inode = VFS_I(dp); + int error; + + error = xfs_qm_vop_dqalloc(dp, GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, 0, + XFS_QMOPT_QUOTALL, &udqp, &gdqp, &pdqp); + if (error) + return error; + + error = xfs_trans_alloc_ichange(dp, udqp, gdqp, pdqp, true, &tp); + if (error) + goto out_dqrele; + + /* + * Always clear setuid/setgid on the orphanage since we don't normally + * want that functionality on this directory and xfs_repair doesn't + * create it this way either. Leave the other access bits unchanged. + */ + inode->i_mode &= ~(S_ISUID | S_ISGID); + + /* + * Change the ownerships and register quota modifications + * in the transaction. + */ + if (!uid_eq(inode->i_uid, GLOBAL_ROOT_UID)) { + if (XFS_IS_UQUOTA_ON(mp)) + oldu = xfs_qm_vop_chown(tp, dp, &dp->i_udquot, udqp); + inode->i_uid = GLOBAL_ROOT_UID; + } + if (!gid_eq(inode->i_gid, GLOBAL_ROOT_GID)) { + if (XFS_IS_GQUOTA_ON(mp)) + oldg = xfs_qm_vop_chown(tp, dp, &dp->i_gdquot, gdqp); + inode->i_gid = GLOBAL_ROOT_GID; + } + if (dp->i_projid != 0) { + if (XFS_IS_PQUOTA_ON(mp)) + oldp = xfs_qm_vop_chown(tp, dp, &dp->i_pdquot, pdqp); + dp->i_projid = 0; + } + + dp->i_diflags &= ~(XFS_DIFLAG_REALTIME | XFS_DIFLAG_RTINHERIT); + xfs_trans_log_inode(tp, dp, XFS_ILOG_CORE); + + XFS_STATS_INC(mp, xs_ig_attrchg); + + if (xfs_has_wsync(mp)) + xfs_trans_set_sync(tp); + error = xfs_trans_commit(tp); + + xfs_qm_dqrele(oldu); + xfs_qm_dqrele(oldg); + xfs_qm_dqrele(oldp); + +out_dqrele: + xfs_qm_dqrele(udqp); + xfs_qm_dqrele(gdqp); + xfs_qm_dqrele(pdqp); + return error; +} + +#define ORPHANAGE "lost+found" + +/* Create the orphanage directory, and set sc->orphanage to it. */ +int +xrep_orphanage_create( + struct xfs_scrub *sc) +{ + struct xfs_mount *mp = sc->mp; + struct dentry *root_dentry, *orphanage_dentry; + struct inode *root_inode = VFS_I(sc->mp->m_rootip); + struct inode *orphanage_inode; + int error; + + if (xfs_is_shutdown(mp)) + return -EIO; + if (xfs_is_readonly(mp)) { + sc->orphanage = NULL; + return 0; + } + + ASSERT(sc->tp == NULL); + ASSERT(sc->orphanage == NULL); + + /* Find the dentry for the root directory... */ + root_dentry = d_find_alias(root_inode); + if (!root_dentry) { + error = -EFSCORRUPTED; + goto out; + } + + /* ...which is a directory, right? */ + if (!d_is_dir(root_dentry)) { + error = -EFSCORRUPTED; + goto out_dput_root; + } + + /* Try to find the orphanage directory. */ + inode_lock_nested(root_inode, I_MUTEX_PARENT); + orphanage_dentry = lookup_one_len(ORPHANAGE, root_dentry, + strlen(ORPHANAGE)); + if (IS_ERR(orphanage_dentry)) { + error = PTR_ERR(orphanage_dentry); + goto out_unlock_root; + } + + /* + * Nothing found? Call mkdir to create the orphanage. Create the + * directory without group or other-user access because we're live and + * someone could have been relying partly on minimal access to a parent + * directory to control access to a file we put in here. + */ + if (d_really_is_negative(orphanage_dentry)) { + error = vfs_mkdir(&init_user_ns, root_inode, orphanage_dentry, + 0700); + if (error) + goto out_dput_orphanage; + } + + /* Not a directory? Bail out. */ + if (!d_is_dir(orphanage_dentry)) { + error = -ENOTDIR; + goto out_dput_orphanage; + } + + /* + * Grab a reference to the orphanage. This /should/ succeed since + * we hold the root directory locked and therefore nobody can delete + * the orphanage. + */ + orphanage_inode = igrab(d_inode(orphanage_dentry)); + if (!orphanage_inode) { + error = -ENOENT; + goto out_dput_orphanage; + } + + /* Make sure the orphanage is owned by root. */ + error = xrep_chown_orphanage(sc, XFS_I(orphanage_inode)); + if (error) + goto out_dput_orphanage; + + /* Stash the reference for later and bail out. */ + sc->orphanage = XFS_I(orphanage_inode); + sc->orphanage_ilock_flags = 0; + +out_dput_orphanage: + dput(orphanage_dentry); +out_unlock_root: + inode_unlock(VFS_I(sc->mp->m_rootip)); +out_dput_root: + dput(root_dentry); +out: + return error; +} + +void +xrep_orphanage_ilock( + struct xfs_scrub *sc, + unsigned int ilock_flags) +{ + sc->orphanage_ilock_flags |= ilock_flags; + xfs_ilock(sc->orphanage, ilock_flags); +} + +bool +xrep_orphanage_ilock_nowait( + struct xfs_scrub *sc, + unsigned int ilock_flags) +{ + if (xfs_ilock_nowait(sc->orphanage, ilock_flags)) { + sc->orphanage_ilock_flags |= ilock_flags; + return true; + } + + return false; +} + +void +xrep_orphanage_iunlock( + struct xfs_scrub *sc, + unsigned int ilock_flags) +{ + xfs_iunlock(sc->orphanage, ilock_flags); + sc->orphanage_ilock_flags &= ~ilock_flags; +} + +/* Grab the IOLOCK of the orphanage and sc->ip. */ +int +xrep_orphanage_iolock_two( + struct xfs_scrub *sc) +{ + int error = 0; + + while (true) { + if (xchk_should_terminate(sc, &error)) + return error; + + /* + * Normal XFS takes the IOLOCK before grabbing a transaction. + * Scrub holds a transaction, which means that we can't block + * on either IOLOCK. + */ + if (xrep_orphanage_ilock_nowait(sc, XFS_IOLOCK_EXCL)) { + if (xchk_ilock_nowait(sc, XFS_IOLOCK_EXCL)) + break; + xrep_orphanage_iunlock(sc, XFS_IOLOCK_EXCL); + } + delay(1); + } + + return 0; +} + +/* Compute block reservation needed to add sc->ip to the orphanage. */ +void +xrep_orphanage_compute_blkres( + struct xfs_scrub *sc, + struct xrep_orphanage_req *orph) +{ + struct xfs_mount *mp = sc->mp; + bool isdir = S_ISDIR(VFS_I(sc->ip)->i_mode); + + orph->sc = sc; + orph->orphanage_blkres = XFS_LINK_SPACE_RES(mp, MAXNAMELEN); + orph->child_blkres = isdir ? XFS_RENAME_SPACE_RES(mp, 2) : 0; +} + +/* + * Compute the xfs_name for the directory entry that we're adding to the + * orphanage. Caller must have the IOLOCK of the orphanage and sc->ip. + */ +int +xrep_orphanage_compute_name( + struct xrep_orphanage_req *orph, + unsigned char *namebuf) +{ + struct xfs_name *xname = &orph->xname; + struct xfs_scrub *sc = orph->sc; + xfs_ino_t ino; + unsigned int incr = 0; + int error = 0; + + xname->name = namebuf; + xname->len = snprintf(namebuf, MAXNAMELEN, "%llu", sc->ip->i_ino); + xname->type = xfs_mode_to_ftype(VFS_I(sc->ip)->i_mode); + + /* Make sure the filename is unique in the lost+found. */ + error = xfs_dir_lookup(sc->tp, sc->orphanage, xname, &ino, NULL); + while (error == 0 && incr < 10000) { + xname->len = snprintf(namebuf, MAXNAMELEN, "%llu.%u", + sc->ip->i_ino, ++incr); + error = xfs_dir_lookup(sc->tp, sc->orphanage, xname, &ino, + NULL); + } + if (error == 0) { + /* We already have 10,000 entries in the orphanage? */ + return -EFSCORRUPTED; + } + + if (error != -ENOENT) + return error; + return 0; +} + +/* + * Prepare to send a child to the orphanage. + * + * Reserve more space in the transaction, take the ILOCKs of the orphanage and + * sc->ip, join them to the transaction, and reserve quota to reparent the + * latter. + */ +int +xrep_orphanage_adoption_prep( + struct xrep_orphanage_req *orph) +{ + struct xfs_scrub *sc = orph->sc; + int error; + + /* + * Reserve space to the transaction to handle expansion of both the + * orphanage and the child directory. + */ + error = xfs_trans_reserve_more(sc->tp, + orph->orphanage_blkres + orph->child_blkres, 0); + if (error) + return error; + + xfs_lock_two_inodes(sc->orphanage, XFS_ILOCK_EXCL, + sc->ip, XFS_ILOCK_EXCL); + sc->ilock_flags |= XFS_ILOCK_EXCL; + sc->orphanage_ilock_flags |= XFS_ILOCK_EXCL; + + xfs_trans_ijoin(sc->tp, sc->orphanage, 0); + xfs_trans_ijoin(sc->tp, sc->ip, 0); + + /* + * Reserve enough quota in the orphan directory to add the new name. + * Normally the orphanage should have user/group/project ids of zero + * and hence is not subject to quota enforcement, but we're allowed to + * exceed quota to reattach disconnected parts of the directory tree. + */ + error = xfs_trans_reserve_quota_nblks(sc->tp, sc->orphanage, + orph->orphanage_blkres, 0, true); + if (error) + return error; + + /* + * Reserve enough quota in the child directory to change dotdot. + * Here we're also allowed to exceed file quota to repair inconsistent + * metadata. + */ + if (orph->child_blkres) { + error = xfs_trans_reserve_quota_nblks(sc->tp, sc->ip, + orph->child_blkres, 0, true); + if (error) + return error; + } + + return 0; +} + +/* + * Move the current file to the orphanage. + * + * The caller must hold the IOLOCKs and the ILOCKs for both sc->ip and the + * orphanage. The directory entry name must have been computed, and quota + * reserved. The function returns with both inodes joined and ILOCKed to the + * transaction. + */ +int +xrep_orphanage_adopt( + struct xrep_orphanage_req *orph) +{ + struct xfs_scrub *sc = orph->sc; + struct xfs_name *xname = &orph->xname; + bool isdir = S_ISDIR(VFS_I(sc->ip)->i_mode); + int error; + + trace_xrep_orphanage_adopt(sc->orphanage, &orph->xname, sc->ip->i_ino); + + /* + * Create the new name in the orphanage, and bump the link count of + * the orphanage if we just added a directory. + */ + error = xfs_dir_createname(sc->tp, sc->orphanage, xname, sc->ip->i_ino, + orph->orphanage_blkres); + if (error) + return error; + + xfs_trans_ichgtime(sc->tp, sc->orphanage, + XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG); + if (isdir) + xfs_bumplink(sc->tp, sc->orphanage); + xfs_trans_log_inode(sc->tp, sc->orphanage, XFS_ILOG_CORE); + + if (!isdir) + return 0; + + /* Replace the dotdot entry in the child directory. */ + return xfs_dir_replace(sc->tp, sc->ip, &xfs_name_dotdot, + sc->orphanage->i_ino, orph->child_blkres); +} + +/* Release the orphanage. */ +void +xrep_orphanage_rele( + struct xfs_scrub *sc) +{ + if (!sc->orphanage) + return; + + if (sc->orphanage_ilock_flags) + xfs_iunlock(sc->orphanage, sc->orphanage_ilock_flags); + + xchk_irele(sc, sc->orphanage); + sc->orphanage = NULL; +} diff --git a/fs/xfs/scrub/orphanage.h b/fs/xfs/scrub/orphanage.h new file mode 100644 index 000000000000..6087d68bc68f --- /dev/null +++ b/fs/xfs/scrub/orphanage.h @@ -0,0 +1,79 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (C) 2022 Oracle. All Rights Reserved. + * Author: Darrick J. Wong <djwong@kernel.org> + */ +#ifndef __XFS_SCRUB_ORPHANAGE_H__ +#define __XFS_SCRUB_ORPHANAGE_H__ + +#ifdef CONFIG_XFS_ONLINE_REPAIR +int xrep_orphanage_create(struct xfs_scrub *sc); + +/* + * If we're doing a repair, ensure that the orphanage exists and attach it to + * the scrub context. + */ +static inline int +xrep_orphanage_try_create( + struct xfs_scrub *sc) +{ + int error; + + ASSERT(sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR); + + error = xrep_orphanage_create(sc); + switch (error) { + case 0: + case -ENOENT: + case -ENOTDIR: + case -ENOSPC: + /* + * If the orphanage can't be found or isn't a directory, we'll + * keep going, but we won't be able to attach the file to the + * orphanage if we can't find the parent. + */ + return 0; + } + + return error; +} + +int xrep_orphanage_iolock_two(struct xfs_scrub *sc); + +/* Information about a request to add a file to the orphanage. */ +struct xrep_orphanage_req { + /* Name structure; caller must provide a buffer separately. */ + struct xfs_name xname; + + struct xfs_scrub *sc; + + /* Block reservations for orphanage and child (if directory). */ + unsigned int orphanage_blkres; + unsigned int child_blkres; +}; + +static inline size_t +xrep_orphanage_req_sizeof(void) +{ + return sizeof(struct xrep_orphanage_req) + MAXNAMELEN + 1; +} + +void xrep_orphanage_compute_blkres(struct xfs_scrub *sc, + struct xrep_orphanage_req *orph); +int xrep_orphanage_compute_name(struct xrep_orphanage_req *orph, + unsigned char *namebuf); +int xrep_orphanage_adoption_prep(struct xrep_orphanage_req *orph); +int xrep_orphanage_adopt(struct xrep_orphanage_req *orph); + +void xrep_orphanage_ilock(struct xfs_scrub *sc, unsigned int ilock_flags); +bool xrep_orphanage_ilock_nowait(struct xfs_scrub *sc, + unsigned int ilock_flags); +void xrep_orphanage_iunlock(struct xfs_scrub *sc, unsigned int ilock_flags); + +void xrep_orphanage_rele(struct xfs_scrub *sc); +#else +struct xrep_orphanage_req { /* empty */ }; +# define xrep_orphanage_rele(sc) +#endif /* CONFIG_XFS_ONLINE_REPAIR */ + +#endif /* __XFS_SCRUB_ORPHANAGE_H__ */ diff --git a/fs/xfs/scrub/parent.c b/fs/xfs/scrub/parent.c index dfea3102f52f..92866f1757be 100644 --- a/fs/xfs/scrub/parent.c +++ b/fs/xfs/scrub/parent.c @@ -10,6 +10,7 @@ #include "xfs_trans_resv.h" #include "xfs_mount.h" #include "xfs_log_format.h" +#include "xfs_trans.h" #include "xfs_inode.h" #include "xfs_icache.h" #include "xfs_dir2.h" @@ -18,12 +19,21 @@ #include "scrub/common.h" #include "scrub/readdir.h" #include "scrub/parent.h" +#include "scrub/repair.h" /* Set us up to scrub parents. */ int xchk_setup_parent( struct xfs_scrub *sc) { + int error; + + if (sc->sm->sm_flags & XFS_SCRUB_IFLAG_REPAIR) { + error = xrep_setup_parent(sc); + if (error) + return error; + } + return xchk_setup_inode_contents(sc, 0); } diff --git a/fs/xfs/scrub/parent_repair.c b/fs/xfs/scrub/parent_repair.c index d83948d1fd05..ffef5de0fbe2 100644 --- a/fs/xfs/scrub/parent_repair.c +++ b/fs/xfs/scrub/parent_repair.c @@ -33,6 +33,7 @@ #include "scrub/parent.h" #include "scrub/readdir.h" #include "scrub/tempfile.h" +#include "scrub/orphanage.h" struct xrep_findparent_info { /* The directory currently being scanned. */ @@ -352,6 +353,29 @@ xrep_parent_scan( return 0; } +struct xrep_parent { + struct xfs_scrub *sc; + + /* Orphanage reparinting request. */ + struct xrep_orphanage_req adoption; + + /* Directory entry name, plus the trailing null. */ + char namebuf[MAXNAMELEN]; +}; + +/* Set up for a parent repair. */ +int +xrep_setup_parent( + struct xfs_scrub *sc) +{ + /* We need a buffer for the orphanage request and a name buffer. */ + sc->buf = kvzalloc(sizeof(struct xrep_parent), XCHK_GFP_FLAGS); + if (!sc->buf) + return -ENOMEM; + + return xrep_orphanage_try_create(sc); +} + /* * If we're the root of a directory tree, we are our own parent. If we're an * unlinked directory, the parent /won't/ have a link to us. Set the parent @@ -409,14 +433,80 @@ xrep_parent_reset_dir( spaceres); } +/* + * Move the current file to the orphanage. + * + * Caller must hold IOLOCK_EXCL on @sc->ip, and no other inode locks. Upon + * successful return, the scrub transaction will have enough extra reservation + * to make the move; it will hold IOLOCK_EXCL and ILOCK_EXCL of @sc->ip and the + * orphanage; and both inodes will be ijoined. + */ +STATIC int +xrep_parent_move_to_orphanage( + struct xrep_parent *rp) +{ + struct xfs_scrub *sc = rp->sc; + int error; + + /* No orphanage? We can't fix this. */ + if (!sc->orphanage) + return -EFSCORRUPTED; + + /* If we can take the orphanage's iolock then we're ready to move. */ + if (!xrep_orphanage_ilock_nowait(sc, XFS_IOLOCK_EXCL)) { + xfs_ino_t orig_parent, new_parent; + + /* + * We may have to drop the lock on sc->ip to try to lock the + * orphanage. Therefore, look up the old dotdot entry for + * sc->ip so that we can compare it after we re-lock sc->ip. + */ + orig_parent = xrep_dotdot_lookup(sc); + + xchk_iunlock(sc, sc->ilock_flags); + error = xrep_orphanage_iolock_two(sc); + if (error) + return error; + + /* + * If the parent changed or the child was unlinked while the + * child directory was unlocked, we don't need to move the + * child to the orphanage after all. + */ + new_parent = xrep_dotdot_lookup(sc); + + if (orig_parent != new_parent || VFS_I(sc->ip)->i_nlink == 0) + return 0; + } + + /* + * Move the directory to the orphanage, and let scrub teardown unlock + * everything for us. + */ + xrep_orphanage_compute_blkres(sc, &rp->adoption); + + error = xrep_orphanage_compute_name(&rp->adoption, rp->namebuf); + if (error) + return error; + + error = xrep_orphanage_adoption_prep(&rp->adoption); + if (error) + return error; + + return xrep_orphanage_adopt(&rp->adoption); +} + int xrep_parent( struct xfs_scrub *sc) { + struct xrep_parent *rp = sc->buf; xfs_ino_t parent_ino, curr_parent; unsigned int sick, checked; int error; + rp->sc = sc; + /* * Avoid sick directories. The parent pointer scrubber dropped the * ILOCK and MMAPLOCK, but we still hold IOLOCK_EXCL on the directory. @@ -441,7 +531,7 @@ xrep_parent( if (error) return error; if (parent_ino == NULLFSINO) - return -EFSCORRUPTED; + return xrep_parent_move_to_orphanage(rp); reset_parent: /* If the '..' entry is already set to the parent inode, we're done. */ diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h index acd7fccf8bee..596993b06256 100644 --- a/fs/xfs/scrub/repair.h +++ b/fs/xfs/scrub/repair.h @@ -84,6 +84,7 @@ int xrep_setup_rtsummary(struct xfs_scrub *sc, unsigned int *resblks, size_t *bufsize); int xrep_setup_xattr(struct xfs_scrub *sc); int xrep_setup_directory(struct xfs_scrub *sc); +int xrep_setup_parent(struct xfs_scrub *sc); int xrep_xattr_reset_fork(struct xfs_scrub *sc); @@ -194,6 +195,7 @@ xrep_setup_nothing( #define xrep_setup_ag_refcountbt xrep_setup_nothing #define xrep_setup_xattr xrep_setup_nothing #define xrep_setup_directory xrep_setup_nothing +#define xrep_setup_parent xrep_setup_nothing #define xrep_setup_inode(sc, imap) ((void)0) diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c index 39ad06e6b2d0..b334fd3d7706 100644 --- a/fs/xfs/scrub/scrub.c +++ b/fs/xfs/scrub/scrub.c @@ -27,6 +27,7 @@ #include "scrub/health.h" #include "scrub/xfile.h" #include "scrub/tempfile.h" +#include "scrub/orphanage.h" /* * Online Scrub and Repair @@ -219,6 +220,7 @@ xchk_teardown( } xrep_tempfile_rele(sc); + xrep_orphanage_rele(sc); xchk_fshooks_disable(sc); return error; } diff --git a/fs/xfs/scrub/scrub.h b/fs/xfs/scrub/scrub.h index 9c26a6092c52..d606d4f370c7 100644 --- a/fs/xfs/scrub/scrub.h +++ b/fs/xfs/scrub/scrub.h @@ -105,6 +105,10 @@ struct xfs_scrub { /* Lock flags for @ip. */ uint ilock_flags; + /* The orphanage, for stashing files that have lost their parent. */ + uint orphanage_ilock_flags; + struct xfs_inode *orphanage; + /* A temporary file on this filesystem, for staging new metadata. */ struct xfs_inode *tempip; uint temp_ilock_flags; diff --git a/fs/xfs/scrub/trace.h b/fs/xfs/scrub/trace.h index d8223ec24369..ae8a5852e258 100644 --- a/fs/xfs/scrub/trace.h +++ b/fs/xfs/scrub/trace.h @@ -2510,6 +2510,7 @@ DEFINE_EVENT(xrep_dirent_class, name, \ TP_PROTO(struct xfs_inode *dp, struct xfs_name *name, xfs_ino_t ino), \ TP_ARGS(dp, name, ino)) DEFINE_XREP_DIRENT_CLASS(xrep_dir_insert_rec); +DEFINE_XREP_DIRENT_CLASS(xrep_orphanage_adopt); DECLARE_EVENT_CLASS(xrep_parent_salvage_class, TP_PROTO(struct xfs_inode *dp, xfs_ino_t ino), diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index ce55dde40d9d..85ce54a09d82 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -930,10 +930,10 @@ xfs_droplink( /* * Increment the link count on an inode & log the change. */ -static void +void xfs_bumplink( - xfs_trans_t *tp, - xfs_inode_t *ip) + struct xfs_trans *tp, + struct xfs_inode *ip) { xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG); diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h index 3f6c63304ca3..34f596deb92c 100644 --- a/fs/xfs/xfs_inode.h +++ b/fs/xfs/xfs_inode.h @@ -581,6 +581,7 @@ void xfs_end_io(struct work_struct *work); int xfs_ilock2_io_mmap(struct xfs_inode *ip1, struct xfs_inode *ip2); void xfs_iunlock2_io_mmap(struct xfs_inode *ip1, struct xfs_inode *ip2); +void xfs_bumplink(struct xfs_trans *tp, struct xfs_inode *ip); void xfs_inode_count_blocks(struct xfs_trans *tp, struct xfs_inode *ip, xfs_filblks_t *dblocks, xfs_filblks_t *rblocks); ^ permalink raw reply related [flat|nested] 12+ messages in thread
end of thread, other threads:[~2024-04-15 23:53 UTC | newest] Thread overview: 12+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-02-27 2:18 [PATCHSET v29.4 10/13] xfs: move orphan files to lost and found Darrick J. Wong 2024-02-27 2:32 ` [PATCH 1/3] xfs: move orphan files to the orphanage Darrick J. Wong 2024-02-28 17:21 ` Christoph Hellwig 2024-02-27 2:32 ` [PATCH 2/3] xfs: move files to orphanage instead of letting nlinks drop to zero Darrick J. Wong 2024-02-28 17:23 ` Christoph Hellwig 2024-02-27 2:32 ` [PATCH 3/3] xfs: ensure dentry consistency when the orphanage adopts a file Darrick J. Wong 2024-02-28 17:24 ` Christoph Hellwig -- strict thread matches above, loose matches on Subject: below -- 2024-04-15 23:36 [PATCHSET v30.3 10/16] xfs: move orphan files to lost and found Darrick J. Wong 2024-04-15 23:53 ` [PATCH 1/3] xfs: move orphan files to the orphanage Darrick J. Wong 2024-03-27 1:49 [PATCHSET v30.1 11/15] xfs: move orphan files to lost and found Darrick J. Wong 2024-03-27 2:05 ` [PATCH 1/3] xfs: move orphan files to the orphanage Darrick J. Wong 2023-12-31 19:31 [PATCHSET v29.0 23/28] xfs: move orphan files to lost and found Darrick J. Wong 2023-12-31 20:38 ` [PATCH 1/3] xfs: move orphan files to the orphanage Darrick J. Wong 2023-05-26 0:36 [PATCHSET v25.0 0/3] xfs: move orphan files to lost and found Darrick J. Wong 2023-05-26 1:36 ` [PATCH 1/3] xfs: move orphan files to the orphanage Darrick J. Wong 2022-12-30 22:14 [PATCHSET v24.0 0/3] xfs: move orphan files to lost and found Darrick J. Wong 2022-12-30 22:14 ` [PATCH 1/3] xfs: move orphan files to the orphanage Darrick J. Wong
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).