From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Allison Henderson <allison.henderson@oracle.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 07/21] xfs: repair inode btrees
Date: Sat, 30 Jun 2018 11:30:11 -0700
Message-ID: <20180630183011.GP5711@magnolia>
In-Reply-To: <b11c8c6c-fdd6-144b-1913-76a8f7123599@oracle.com>
On Sat, Jun 30, 2018 at 10:36:23AM -0700, Allison Henderson wrote:
> On 06/24/2018 12:24 PM, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> >
> > Use the rmapbt to find inode chunks, query the chunks to compute
> > hole and free masks, and with that information rebuild the inobt
> > and finobt.
> >
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> > fs/xfs/Makefile | 1
> > fs/xfs/scrub/ialloc_repair.c | 585 ++++++++++++++++++++++++++++++++++++++++++
> > fs/xfs/scrub/repair.h | 2
> > fs/xfs/scrub/scrub.c | 4
> > 4 files changed, 590 insertions(+), 2 deletions(-)
> > create mode 100644 fs/xfs/scrub/ialloc_repair.c
> >
> >
> > diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
> > index 841e0824eeb6..837fd4a95f6f 100644
> > --- a/fs/xfs/Makefile
> > +++ b/fs/xfs/Makefile
> > @@ -165,6 +165,7 @@ ifeq ($(CONFIG_XFS_ONLINE_REPAIR),y)
> > xfs-y += $(addprefix scrub/, \
> > agheader_repair.o \
> > alloc_repair.o \
> > + ialloc_repair.o \
> > repair.o \
> > )
> > endif
> > diff --git a/fs/xfs/scrub/ialloc_repair.c b/fs/xfs/scrub/ialloc_repair.c
> > new file mode 100644
> > index 000000000000..29c736466bba
> > --- /dev/null
> > +++ b/fs/xfs/scrub/ialloc_repair.c
> > @@ -0,0 +1,585 @@
> > +// SPDX-License-Identifier: GPL-2.0+
> > +/*
> > + * Copyright (C) 2018 Oracle. All Rights Reserved.
> > + * Author: Darrick J. Wong <darrick.wong@oracle.com>
> > + */
> > +#include "xfs.h"
> > +#include "xfs_fs.h"
> > +#include "xfs_shared.h"
> > +#include "xfs_format.h"
> > +#include "xfs_trans_resv.h"
> > +#include "xfs_mount.h"
> > +#include "xfs_defer.h"
> > +#include "xfs_btree.h"
> > +#include "xfs_bit.h"
> > +#include "xfs_log_format.h"
> > +#include "xfs_trans.h"
> > +#include "xfs_sb.h"
> > +#include "xfs_inode.h"
> > +#include "xfs_alloc.h"
> > +#include "xfs_ialloc.h"
> > +#include "xfs_ialloc_btree.h"
> > +#include "xfs_icache.h"
> > +#include "xfs_rmap.h"
> > +#include "xfs_rmap_btree.h"
> > +#include "xfs_log.h"
> > +#include "xfs_trans_priv.h"
> > +#include "xfs_error.h"
> > +#include "scrub/xfs_scrub.h"
> > +#include "scrub/scrub.h"
> > +#include "scrub/common.h"
> > +#include "scrub/btree.h"
> > +#include "scrub/trace.h"
> > +#include "scrub/repair.h"
> > +
> > +/*
> > + * Inode Btree Repair
> > + * ==================
> > + *
> > + * Iterate the reverse mapping records looking for OWN_INODES and OWN_INOBT
> > + * records. The OWN_INOBT records are the old inode btree blocks and will be
> > + * cleared out after we've rebuilt the tree. Each possible inode chunk within
> > + * an OWN_INODES record will be read in and the freemask calculated from the
> > + * i_mode data in the inode chunk. For sparse inodes the holemask will be
> > + * calculated by creating the properly aligned inobt record and punching out
> > + * any chunk that's missing. Inode allocations and frees grab the AGI first,
> > + * so repair protects itself from concurrent access by locking the AGI.
> > + *
> > + * Once we've reconstructed all the inode records, we can create new inode
> > + * btree roots and reload the btrees. We rebuild both inode trees at the same
> > + * time because they have the same rmap owner and it would be more complex to
> > + * figure out if the other tree isn't in need of a rebuild and which OWN_INOBT
> > + * blocks it owns. We have all the data we need to build both, so dump
> > + * everything and start over.
> > + */
> > +
> > +struct xfs_repair_ialloc_extent {
> > + struct list_head list;
> > + xfs_inofree_t freemask;
> > + xfs_agino_t startino;
> > + unsigned int count;
> > + unsigned int usedcount;
> > + uint16_t holemask;
> > +};
> > +
> > +struct xfs_repair_ialloc {
> > + struct list_head *extlist;
> > + struct xfs_repair_extent_list *btlist;
> > + struct xfs_scrub_context *sc;
> > + uint64_t nr_records;
> > +};
> > +
> > +/*
> > + * Is this inode in use? If the inode is in memory we can tell from i_mode,
> > + * otherwise we have to check di_mode in the on-disk buffer. We only care
> > + * that the high (i.e. non-permission) bits of _mode are zero. This should be
> > + * safe because repair keeps all AG headers locked until the end, and any process
> > + * trying to perform an inode allocation/free must lock the AGI.
> > + */
> > +STATIC int
> > +xfs_repair_ialloc_check_free(
> > + struct xfs_scrub_context *sc,
> > + struct xfs_buf *bp,
> > + xfs_ino_t fsino,
> > + xfs_agino_t bpino,
> > + bool *inuse)
> > +{
> > + struct xfs_mount *mp = sc->mp;
> > + struct xfs_dinode *dip;
> > + int error;
> > +
> > + /* Will the in-core inode tell us if it's in use? */
> > + error = xfs_icache_inode_is_allocated(mp, sc->tp, fsino, inuse);
> > + if (!error)
> > + return 0;
> > +
> > + /* Inode uncached or half assembled, read disk buffer */
> > + dip = xfs_buf_offset(bp, bpino * mp->m_sb.sb_inodesize);
> > + if (be16_to_cpu(dip->di_magic) != XFS_DINODE_MAGIC)
> > + return -EFSCORRUPTED;
> > +
> > + if (dip->di_version >= 3 && be64_to_cpu(dip->di_ino) != fsino)
> > + return -EFSCORRUPTED;
> > +
> > + *inuse = dip->di_mode != 0;
> > + return 0;
> > +}
> > +
> > +/*
> > + * For each cluster in this blob of inode, we must calculate the
> Ok, so I've been over this one a few times, and I still don't feel
> like I've figured out what a blob of an inode is. So I'm gonna have
> to break and ask for clarification on that one? Thx! :-)
Heh, sorry.
"For each inode cluster covering the physical extent recorded by the
rmapbt, we must calculate..."
> > + * properly aligned startino of that cluster, then iterate each
> > + * cluster to fill in used and filled masks appropriately. We
> > + * then use the (startino, used, filled) information to construct
> > + * the appropriate inode records.
> > + */
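To make the arithmetic concrete, here is a minimal userspace sketch (not the kernel code) of how a cluster's position inside its inobt record could be derived. It assumes 64 inodes per chunk and 4 inodes per holemask bit, mirroring XFS_INODES_PER_CHUNK and XFS_INODES_PER_HOLEMASK_BIT; the names cluster_geom, cluster_geometry and maskn are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define INODES_PER_CHUNK        64u	/* mirrors XFS_INODES_PER_CHUNK */
#define INODES_PER_HOLEMASK_BIT 4u	/* mirrors XFS_INODES_PER_HOLEMASK_BIT */

/* Hypothetical result type: where a cluster sits inside its inobt record. */
struct cluster_geom {
	uint32_t	startino;	/* first AG inode number of the inobt record */
	uint32_t	clusteroff;	/* cluster's inode offset within that record */
	uint16_t	fillmask;	/* holemask slots that this cluster fills */
};

/* Return a mask of n consecutive set bits starting at bit i. */
static uint16_t
maskn(unsigned int i, unsigned int n)
{
	uint32_t	m = (n >= 16) ? 0xFFFFu : ((1u << n) - 1);

	return (uint16_t)(m << i);
}

/*
 * agino:     AG inode number of the first inode in the cluster
 * rec_agino: aligned AG inode number where the rmap extent's record starts
 * nr_inodes: number of inodes in one cluster
 */
static struct cluster_geom
cluster_geometry(uint32_t agino, uint32_t rec_agino, uint32_t nr_inodes)
{
	struct cluster_geom	g;
	uint32_t		delta;

	assert(agino >= rec_agino);
	delta = agino - rec_agino;

	/* Round down to the start of the 64-inode record holding agino. */
	g.startino = rec_agino + (delta - delta % INODES_PER_CHUNK);
	g.clusteroff = agino - g.startino;
	g.fillmask = maskn(g.clusteroff / INODES_PER_HOLEMASK_BIT,
			   nr_inodes / INODES_PER_HOLEMASK_BIT);
	return g;
}
```

For example, a 16-inode cluster at agino 208 inside an extent whose aligned record starts at agino 128 lands in the record starting at 192, at offset 16, and fills holemask slots 4 through 7.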
> > +STATIC int
> > +xfs_repair_ialloc_process_cluster(
> > + struct xfs_repair_ialloc *ri,
> > + xfs_agblock_t agbno,
> > + int blks_per_cluster,
> > + xfs_agino_t rec_agino)
> > +{
> > + struct xfs_imap imap;
> > + struct xfs_repair_ialloc_extent *rie;
> > + struct xfs_dinode *dip;
> > + struct xfs_buf *bp;
> > + struct xfs_scrub_context *sc = ri->sc;
> > + struct xfs_mount *mp = sc->mp;
> > + xfs_ino_t fsino;
> > + xfs_inofree_t usedmask;
> > + xfs_agino_t nr_inodes;
> > + xfs_agino_t startino;
> > + xfs_agino_t clusterino;
> > + xfs_agino_t clusteroff;
> > + xfs_agino_t agino;
> > + uint16_t fillmask;
> > + bool inuse;
> > + int usedcount;
> > + int error;
> > +
> > + /* The per-AG inum of this inode cluster. */
> > + agino = XFS_OFFBNO_TO_AGINO(mp, agbno, 0);
> > +
> > + /* The per-AG inum of the inobt record. */
> > + startino = rec_agino + rounddown(agino - rec_agino,
> > + XFS_INODES_PER_CHUNK);
> > +
> > + /* The per-AG inum of the cluster within the inobt record. */
> > + clusteroff = agino - startino;
> > +
> > + /* Every inode in this holemask slot is filled. */
> > + nr_inodes = XFS_OFFBNO_TO_AGINO(mp, blks_per_cluster, 0);
> > + fillmask = xfs_inobt_maskn(clusteroff / XFS_INODES_PER_HOLEMASK_BIT,
> > + nr_inodes / XFS_INODES_PER_HOLEMASK_BIT);
> > +
> > + /* Grab the inode cluster buffer. */
> > + imap.im_blkno = XFS_AGB_TO_DADDR(mp, sc->sa.agno, agbno);
> > + imap.im_len = XFS_FSB_TO_BB(mp, blks_per_cluster);
> > + imap.im_boffset = 0;
> > +
> > + error = xfs_imap_to_bp(mp, sc->tp, &imap, &dip, &bp, 0,
> > + XFS_IGET_UNTRUSTED);
> > + if (error)
> > + return error;
> > +
> > + usedmask = 0;
> > + usedcount = 0;
> > + /* Which inodes within this cluster are free? */
> > + for (clusterino = 0; clusterino < nr_inodes; clusterino++) {
> > + fsino = XFS_AGINO_TO_INO(mp, sc->sa.agno, agino + clusterino);
> > + error = xfs_repair_ialloc_check_free(sc, bp, fsino,
> > + clusterino, &inuse);
> > + if (error) {
> > + xfs_trans_brelse(sc->tp, bp);
> > + return error;
> > + }
> > + if (inuse) {
> > + usedcount++;
> > + usedmask |= XFS_INOBT_MASK(clusteroff + clusterino);
> > + }
> > + }
> > + xfs_trans_brelse(sc->tp, bp);
> > +
> > + /*
> > + * If the last item in the list is our chunk record,
> > + * update that.
> > + */
> > + if (!list_empty(ri->extlist)) {
> > + rie = list_last_entry(ri->extlist,
> > + struct xfs_repair_ialloc_extent, list);
> > + if (rie->startino + XFS_INODES_PER_CHUNK > startino) {
> > + rie->freemask &= ~usedmask;
> > + rie->holemask &= ~fillmask;
> > + rie->count += nr_inodes;
> > + rie->usedcount += usedcount;
> > + return 0;
> > + }
> > + }
> > +
> > + /* New inode chunk; add to the list. */
> > + rie = kmem_alloc(sizeof(struct xfs_repair_ialloc_extent), KM_MAYFAIL);
> > + if (!rie)
> > + return -ENOMEM;
> > +
> > + INIT_LIST_HEAD(&rie->list);
> > + rie->startino = startino;
> > + rie->freemask = XFS_INOBT_ALL_FREE & ~usedmask;
> > + rie->holemask = XFS_INOBT_ALL_FREE & ~fillmask;
> > + rie->count = nr_inodes;
> > + rie->usedcount = usedcount;
> > + list_add_tail(&rie->list, ri->extlist);
> > + ri->nr_records++;
> > +
> > + return 0;
> > +}
> > +
> > +/* Record extents that belong to inode btrees. */
> > +STATIC int
> > +xfs_repair_ialloc_extent_fn(
> > + struct xfs_btree_cur *cur,
> > + struct xfs_rmap_irec *rec,
> > + void *priv)
> > +{
> > + struct xfs_repair_ialloc *ri = priv;
> > + struct xfs_mount *mp = cur->bc_mp;
> > + xfs_fsblock_t fsbno;
> > + xfs_agblock_t agbno = rec->rm_startblock;
> > + xfs_agino_t inoalign;
> > + xfs_agino_t agino;
> > + xfs_agino_t rec_agino;
> > + int blks_per_cluster;
> > + int error = 0;
> > +
> > + if (xfs_scrub_should_terminate(ri->sc, &error))
> > + return error;
> > +
> > + /* Fragment of the old btrees; dispose of them later. */
> > + if (rec->rm_owner == XFS_RMAP_OWN_INOBT) {
> > + fsbno = XFS_AGB_TO_FSB(mp, ri->sc->sa.agno, agbno);
> > + return xfs_repair_collect_btree_extent(ri->sc, ri->btlist,
> > + fsbno, rec->rm_blockcount);
> > + }
> > +
> > + /* Skip extents which are not owned by this inode and fork. */
> > + if (rec->rm_owner != XFS_RMAP_OWN_INODES)
> > + return 0;
> > +
> > + blks_per_cluster = xfs_icluster_size_fsb(mp);
> > +
> > + if (agbno % blks_per_cluster != 0)
> > + return -EFSCORRUPTED;
> > +
> > + trace_xfs_repair_ialloc_extent_fn(mp, ri->sc->sa.agno,
> > + rec->rm_startblock, rec->rm_blockcount, rec->rm_owner,
> > + rec->rm_offset, rec->rm_flags);
> > +
> > + /*
> > + * Determine the inode block alignment, and where the block
> > + * ought to start if it's aligned properly. On a sparse inode
> > + * system the rmap doesn't have to start on an alignment boundary,
> > + * but the record does. On pre-sparse filesystems, we /must/
> > + * start both rmap and inobt on an alignment boundary.
> > + */
> > + inoalign = xfs_ialloc_cluster_alignment(mp);
> > + agino = XFS_OFFBNO_TO_AGINO(mp, agbno, 0);
> > + rec_agino = XFS_OFFBNO_TO_AGINO(mp, rounddown(agbno, inoalign), 0);
> > + if (!xfs_sb_version_hassparseinodes(&mp->m_sb) && agino != rec_agino)
> > + return -EFSCORRUPTED;
> > +
> > + /* Set up the free/hole masks for each cluster in this inode chunk. */
> By chunk did you mean record? Please try to keep terminology
> consistent as best you can. Thx! :-)
Yikes, that /is/ a misleading comment.
"Set up the free/hole masks for each inode cluster that could be mapped
by this rmap record."
> > + for (;
> > + agbno < rec->rm_startblock + rec->rm_blockcount;
> > + agbno += blks_per_cluster) {
> > + error = xfs_repair_ialloc_process_cluster(ri, agbno,
> > + blks_per_cluster, rec_agino);
> > + if (error)
> > + return error;
> > + }
> > +
> > + return 0;
> > +}
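As a rough illustration of what "set up the free/hole masks for each inode cluster" amounts to, the sketch below starts a record with every inode free and every holemask slot marked as a hole, then clears bits as each cluster is processed. This follows the shape of the patch (records start at all-free and each cluster clears its bits), but it is a standalone userspace approximation. The names chunk_rec, chunk_rec_init and chunk_rec_add_cluster are hypothetical, and the per-cluster "used" bitmap stands in for the di_mode checks.

```c
#include <assert.h>
#include <stdint.h>

#define INODES_PER_CHUNK        64u	/* mirrors XFS_INODES_PER_CHUNK */
#define INODES_PER_HOLEMASK_BIT 4u	/* mirrors XFS_INODES_PER_HOLEMASK_BIT */

/* Hypothetical stand-in for the in-memory record that repair builds. */
struct chunk_rec {
	uint32_t	startino;
	uint64_t	freemask;	/* bit set: inode is free (or absent) */
	uint16_t	holemask;	/* bit set: 4-inode slot is a hole */
	unsigned int	count;		/* inodes physically present */
	unsigned int	usedcount;	/* inodes in use */
};

/* Start with everything free and every slot a hole. */
static void
chunk_rec_init(struct chunk_rec *rec, uint32_t startino)
{
	rec->startino = startino;
	rec->freemask = ~0ULL;
	rec->holemask = 0xFFFF;
	rec->count = 0;
	rec->usedcount = 0;
}

/*
 * Fold one cluster into the record.  Bit i of "used" is set if inode i
 * of the cluster is in use.
 */
static void
chunk_rec_add_cluster(struct chunk_rec *rec, unsigned int clusteroff,
		      unsigned int nr_inodes, uint64_t used)
{
	unsigned int	slots = nr_inodes / INODES_PER_HOLEMASK_BIT;
	unsigned int	first = clusteroff / INODES_PER_HOLEMASK_BIT;
	uint32_t	fillmask;
	unsigned int	i;

	assert(nr_inodes > 0 && clusteroff + nr_inodes <= INODES_PER_CHUNK);
	assert(clusteroff % INODES_PER_HOLEMASK_BIT == 0);
	assert(nr_inodes % INODES_PER_HOLEMASK_BIT == 0);

	/* Ignore any bits beyond the cluster's own inodes. */
	if (nr_inodes < 64)
		used &= (1ULL << nr_inodes) - 1;

	fillmask = ((1u << slots) - 1) << first;
	rec->freemask &= ~(used << clusteroff);	/* used inodes are not free */
	rec->holemask &= (uint16_t)~fillmask;	/* filled slots are not holes */
	rec->count += nr_inodes;
	for (i = 0; i < nr_inodes; i++)
		if (used & (1ULL << i))
			rec->usedcount++;
}
```

Feeding in two 16-inode clusters at offsets 0 and 16, with inodes 0 and 1 used in the first and inode 0 used in the second, leaves holemask 0xFF00 (upper half of the chunk still sparse), count 32 and usedcount 3, so the record's free count would be 29.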
> > +
> > +/* Compare two ialloc extents. */
> > +static int
> > +xfs_repair_ialloc_extent_cmp(
> > + void *priv,
> > + struct list_head *a,
> > + struct list_head *b)
> > +{
> > + struct xfs_repair_ialloc_extent *ap;
> > + struct xfs_repair_ialloc_extent *bp;
> > +
> > + ap = container_of(a, struct xfs_repair_ialloc_extent, list);
> > + bp = container_of(b, struct xfs_repair_ialloc_extent, list);
> > +
> > + if (ap->startino > bp->startino)
> > + return 1;
> > + else if (ap->startino < bp->startino)
> > + return -1;
> > + return 0;
> > +}
> > +
> > +/* Insert an inode chunk record into a given btree. */
> > +static int
> > +xfs_repair_iallocbt_insert_btrec(
> > + struct xfs_btree_cur *cur,
> > + struct xfs_repair_ialloc_extent *rie)
> > +{
> > + int stat;
> > + int error;
> > +
> > + error = xfs_inobt_lookup(cur, rie->startino, XFS_LOOKUP_EQ, &stat);
> > + if (error)
> > + return error;
> > + XFS_WANT_CORRUPTED_RETURN(cur->bc_mp, stat == 0);
> > + error = xfs_inobt_insert_rec(cur, rie->holemask, rie->count,
> > + rie->count - rie->usedcount, rie->freemask, &stat);
> > + if (error)
> > + return error;
> > + XFS_WANT_CORRUPTED_RETURN(cur->bc_mp, stat == 1);
> > + return error;
> > +}
> > +
> > +/* Insert an inode chunk record into both inode btrees. */
> > +static int
> > +xfs_repair_iallocbt_insert_rec(
> > + struct xfs_scrub_context *sc,
> > + struct xfs_repair_ialloc_extent *rie)
> > +{
> > + struct xfs_btree_cur *cur;
> > + int error;
> > +
> > + trace_xfs_repair_ialloc_insert(sc->mp, sc->sa.agno, rie->startino,
> > + rie->holemask, rie->count, rie->count - rie->usedcount,
> > + rie->freemask);
> > +
> > + /* Insert into the inobt. */
> > + cur = xfs_inobt_init_cursor(sc->mp, sc->tp, sc->sa.agi_bp, sc->sa.agno,
> > + XFS_BTNUM_INO);
> > + error = xfs_repair_iallocbt_insert_btrec(cur, rie);
> > + if (error)
> > + goto out_cur;
> > + xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> > +
> > + /* Insert into the finobt if chunk has free inodes. */
> > + if (xfs_sb_version_hasfinobt(&sc->mp->m_sb) &&
> > + rie->count != rie->usedcount) {
> > + cur = xfs_inobt_init_cursor(sc->mp, sc->tp, sc->sa.agi_bp,
> > + sc->sa.agno, XFS_BTNUM_FINO);
> > + error = xfs_repair_iallocbt_insert_btrec(cur, rie);
> > + if (error)
> > + goto out_cur;
> > + xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> > + }
> > +
> > + return xfs_repair_roll_ag_trans(sc);
> > +out_cur:
> > + xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
> > + return error;
> > +}
> > +
> > +/* Free every record in the inode list. */
> > +STATIC void
> > +xfs_repair_iallocbt_cancel_inorecs(
> > + struct list_head *reclist)
> > +{
> > + struct xfs_repair_ialloc_extent *rie;
> > + struct xfs_repair_ialloc_extent *n;
> > +
> > + list_for_each_entry_safe(rie, n, reclist, list) {
> > + list_del(&rie->list);
> > + kmem_free(rie);
> > + }
> > +}
> > +
> > +/*
> > + * Iterate all reverse mappings to find the inodes (OWN_INODES) and the inode
> > + * btrees (OWN_INOBT). Figure out if we have enough free space to reconstruct
> > + * the inode btrees. The caller must clean up the lists if anything goes
> > + * wrong.
> > + */
> > +STATIC int
> > +xfs_repair_iallocbt_find_inodes(
> > + struct xfs_scrub_context *sc,
> > + struct list_head *inode_records,
> > + struct xfs_repair_extent_list *old_iallocbt_blocks)
> > +{
> > + struct xfs_repair_ialloc ri;
> > + struct xfs_mount *mp = sc->mp;
> > + struct xfs_btree_cur *cur;
> > + xfs_agblock_t nr_blocks;
> > + int error;
> > +
> > + /* Collect all reverse mappings for inode blocks. */
> > + ri.extlist = inode_records;
> > + ri.btlist = old_iallocbt_blocks;
> > + ri.nr_records = 0;
> > + ri.sc = sc;
> > +
> > + cur = xfs_rmapbt_init_cursor(mp, sc->tp, sc->sa.agf_bp, sc->sa.agno);
> > + error = xfs_rmap_query_all(cur, xfs_repair_ialloc_extent_fn, &ri);
> > + if (error)
> > + goto err;
> > + xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
> > +
> > + /* Do we actually have enough space to do this? */
> > + nr_blocks = xfs_iallocbt_calc_size(mp, ri.nr_records);
> > + if (xfs_sb_version_hasfinobt(&mp->m_sb))
> > + nr_blocks *= 2;
> > + if (!xfs_repair_ag_has_space(sc->sa.pag, nr_blocks, XFS_AG_RESV_NONE))
> > + return -ENOSPC;
> > +
> > + return 0;
> > +
> > +err:
> > + xfs_btree_del_cursor(cur, XFS_BTREE_ERROR);
> > + return error;
> > +}
> > +
> > +/* Update the AGI counters. */
> > +STATIC int
> > +xfs_repair_iallocbt_reset_counters(
> > + struct xfs_scrub_context *sc,
> > + struct list_head *inode_records,
> > + int *log_flags)
> > +{
> > + struct xfs_agi *agi;
> > + struct xfs_repair_ialloc_extent *rie;
> > + unsigned int count = 0;
> > + unsigned int usedcount = 0;
> > + unsigned int freecount;
> > +
> > + /* Figure out the new counters. */
> > + list_for_each_entry(rie, inode_records, list) {
> > + count += rie->count;
> > + usedcount += rie->usedcount;
> > + }
> > +
> > + agi = XFS_BUF_TO_AGI(sc->sa.agi_bp);
> > + freecount = count - usedcount;
> > +
> > + /* XXX: trigger inode count recalculation */
> > +
> > + /* Reset the per-AG info, both incore and ondisk. */
> > + sc->sa.pag->pagi_count = count;
> > + sc->sa.pag->pagi_freecount = freecount;
> > + agi->agi_count = cpu_to_be32(count);
> > + agi->agi_freecount = cpu_to_be32(freecount);
> > + *log_flags |= XFS_AGI_COUNT | XFS_AGI_FREECOUNT;
> > +
> > + return 0;
> > +}
> > +
> > +/* Initialize new inobt/finobt roots and implant them into the AGI. */
> > +STATIC int
> > +xfs_repair_iallocbt_reset_btrees(
> > + struct xfs_scrub_context *sc,
> > + struct xfs_owner_info *oinfo,
> > + int *log_flags)
> > +{
> > + struct xfs_agi *agi;
> > + struct xfs_buf *bp;
> > + struct xfs_mount *mp = sc->mp;
> > + xfs_fsblock_t inofsb;
> > + xfs_fsblock_t finofsb;
> > + enum xfs_ag_resv_type resv;
> > + int error;
> > +
> > + agi = XFS_BUF_TO_AGI(sc->sa.agi_bp);
> > +
> > + /* Initialize new inobt root. */
> > + resv = XFS_AG_RESV_NONE;
> > + error = xfs_repair_alloc_ag_block(sc, oinfo, &inofsb, resv);
> > + if (error)
> > + return error;
> > + error = xfs_repair_init_btblock(sc, inofsb, &bp, XFS_BTNUM_INO,
> > + &xfs_inobt_buf_ops);
> > + if (error)
> > + return error;
> > + agi->agi_root = cpu_to_be32(XFS_FSB_TO_AGBNO(mp, inofsb));
> > + agi->agi_level = cpu_to_be32(1);
> > + *log_flags |= XFS_AGI_ROOT | XFS_AGI_LEVEL;
> > +
> > + /* Initialize new finobt root. */
> > + if (!xfs_sb_version_hasfinobt(&mp->m_sb))
> > + return 0;
> > +
> > + resv = mp->m_inotbt_nores ? XFS_AG_RESV_NONE : XFS_AG_RESV_METADATA;
> > + error = xfs_repair_alloc_ag_block(sc, oinfo, &finofsb, resv);
> > + if (error)
> > + return error;
> > + error = xfs_repair_init_btblock(sc, finofsb, &bp, XFS_BTNUM_FINO,
> > + &xfs_inobt_buf_ops);
> > + if (error)
> > + return error;
> > + agi->agi_free_root = cpu_to_be32(XFS_FSB_TO_AGBNO(mp, finofsb));
> > + agi->agi_free_level = cpu_to_be32(1);
> > + *log_flags |= XFS_AGI_FREE_ROOT | XFS_AGI_FREE_LEVEL;
> > +
> > + return 0;
> > +}
> > +
> > +/* Build new inode btrees and dispose of the old one. */
> > +STATIC int
> > +xfs_repair_iallocbt_rebuild_trees(
> > + struct xfs_scrub_context *sc,
> > + struct list_head *inode_records,
> > + struct xfs_owner_info *oinfo,
> > + struct xfs_repair_extent_list *old_iallocbt_blocks)
> > +{
> > + struct xfs_repair_ialloc_extent *rie;
> > + struct xfs_repair_ialloc_extent *n;
> > + int error;
> > +
> > + /* Add all records. */
> > + list_sort(NULL, inode_records, xfs_repair_ialloc_extent_cmp);
> > + list_for_each_entry_safe(rie, n, inode_records, list) {
> > + error = xfs_repair_iallocbt_insert_rec(sc, rie);
> > + if (error)
> > + return error;
> > +
> > + list_del(&rie->list);
> > + kmem_free(rie);
> > + }
> > +
> > + /* Free the old inode btree blocks if they're not in use. */
> > + return xfs_repair_reap_btree_extents(sc, old_iallocbt_blocks, oinfo,
> > + XFS_AG_RESV_NONE);
> > +}
> > +
> > +/* Repair both inode btrees. */
> > +int
> > +xfs_repair_iallocbt(
> > + struct xfs_scrub_context *sc)
> > +{
> > + struct xfs_owner_info oinfo;
> > + struct list_head inode_records;
> > + struct xfs_repair_extent_list old_iallocbt_blocks;
> > + struct xfs_mount *mp = sc->mp;
> > + int log_flags = 0;
> > + int error = 0;
> > +
> > + /* We require the rmapbt to rebuild anything. */
> > + if (!xfs_sb_version_hasrmapbt(&mp->m_sb))
> > + return -EOPNOTSUPP;
> > +
> > + xfs_scrub_perag_get(sc->mp, &sc->sa);
> > +
> > + /* Collect the free space data and find the old btree blocks. */
> > + xfs_rmap_ag_owner(&oinfo, XFS_RMAP_OWN_INOBT);
> > + INIT_LIST_HEAD(&inode_records);
> > + xfs_repair_init_extent_list(&old_iallocbt_blocks);
> > + error = xfs_repair_iallocbt_find_inodes(sc, &inode_records,
> > + &old_iallocbt_blocks);
> > + if (error)
> > + goto out;
> > +
> > + /*
> > + * Blow out the old inode btrees. This is the point at which
> > + * we are no longer able to bail out gracefully.
> > + */
> > + error = xfs_repair_iallocbt_reset_counters(sc, &inode_records,
> > + &log_flags);
> > + if (error)
> > + goto out;
> > + error = xfs_repair_iallocbt_reset_btrees(sc, &oinfo, &log_flags);
> > + if (error)
> > + goto out;
> > + xfs_ialloc_log_agi(sc->tp, sc->sa.agi_bp, log_flags);
> > +
> > + /* Invalidate all the inobt/finobt blocks in btlist. */
> > + error = xfs_repair_invalidate_blocks(sc, &old_iallocbt_blocks);
> > + if (error)
> > + goto out;
> > + error = xfs_repair_roll_ag_trans(sc);
> > + if (error)
> > + goto out;
> > +
> > + /* Now rebuild the inode information. */
> > + error = xfs_repair_iallocbt_rebuild_trees(sc, &inode_records, &oinfo,
> > + &old_iallocbt_blocks);
> > +out:
> > + xfs_repair_cancel_btree_extents(sc, &old_iallocbt_blocks);
> > + xfs_repair_iallocbt_cancel_inorecs(&inode_records);
> > + return error;
> > +}
> > diff --git a/fs/xfs/scrub/repair.h b/fs/xfs/scrub/repair.h
> > index e5f67fc68e9a..dcfa5eb18940 100644
> > --- a/fs/xfs/scrub/repair.h
> > +++ b/fs/xfs/scrub/repair.h
> > @@ -104,6 +104,7 @@ int xfs_repair_agf(struct xfs_scrub_context *sc);
> > int xfs_repair_agfl(struct xfs_scrub_context *sc);
> > int xfs_repair_agi(struct xfs_scrub_context *sc);
> > int xfs_repair_allocbt(struct xfs_scrub_context *sc);
> > +int xfs_repair_iallocbt(struct xfs_scrub_context *sc);
> > #else
> > @@ -131,6 +132,7 @@ xfs_repair_calc_ag_resblks(
> > #define xfs_repair_agfl xfs_repair_notsupported
> > #define xfs_repair_agi xfs_repair_notsupported
> > #define xfs_repair_allocbt xfs_repair_notsupported
> > +#define xfs_repair_iallocbt xfs_repair_notsupported
> > #endif /* CONFIG_XFS_ONLINE_REPAIR */
> > diff --git a/fs/xfs/scrub/scrub.c b/fs/xfs/scrub/scrub.c
> > index 7a55b20b7e4e..fec0e130f19e 100644
> > --- a/fs/xfs/scrub/scrub.c
> > +++ b/fs/xfs/scrub/scrub.c
> > @@ -238,14 +238,14 @@ static const struct xfs_scrub_meta_ops meta_scrub_ops[] = {
> > .type = ST_PERAG,
> > .setup = xfs_scrub_setup_ag_iallocbt,
> > .scrub = xfs_scrub_inobt,
> > - .repair = xfs_repair_notsupported,
> > + .repair = xfs_repair_iallocbt,
> > },
> > [XFS_SCRUB_TYPE_FINOBT] = { /* finobt */
> > .type = ST_PERAG,
> > .setup = xfs_scrub_setup_ag_iallocbt,
> > .scrub = xfs_scrub_finobt,
> > .has = xfs_sb_version_hasfinobt,
> > - .repair = xfs_repair_notsupported,
> > + .repair = xfs_repair_iallocbt,
> > },
> > [XFS_SCRUB_TYPE_RMAPBT] = { /* rmapbt */
> > .type = ST_PERAG,
> >
>
> Ok, some parts took some time to figure out, but I think I understand
> the overall idea. The comments help, and if you could add in a little
> extra detail describing the function parameters, I think it would help
> to add more supporting context to your comments. Thx!
Every time I go wandering through the ialloc code my head also gets
twisted in knots over inode chunks and inode clusters. I think for the
next round I'll try to make some ascii art diagrams that I can refer
back to the next time I have to go digging through here (which will
probably be not that long from now; rumor has it the ialloc scrub
doesn't quite work right on systems with 64K pagesize).
--D
> Allison
>
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> >