From: Brian Foster <bfoster@redhat.com>
To: xfs@oss.sgi.com
Subject: [PATCH v4 14/20] xfsprogs/repair: phase 2 finobt scan
Date: Wed, 7 May 2014 08:21:53 -0400
Message-ID: <1399465319-65066-15-git-send-email-bfoster@redhat.com>
In-Reply-To: <1399465319-65066-1-git-send-email-bfoster@redhat.com>
If one exists, scan the free inode btree in phase 2 of xfs_repair.
We use the same general infrastructure as for the inobt scan, but
trigger finobt chunk scan logic in scan_inobt() via the magic
value.
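As a rough illustration of the magic-based dispatch described above (a
standalone sketch, not code from this patch): struct counts and
account_rec() are hypothetical names, and the magic constants are
restated only so the sketch compiles on its own.

```c
#include <stdint.h>

/* Restated for a self-contained sketch; the real values live in the XFS headers. */
#define XFS_IBT_MAGIC		0x49414254	/* 'IABT' */
#define XFS_IBT_CRC_MAGIC	0x49414233	/* 'IAB3' */
#define XFS_FIBT_MAGIC		0x46494254	/* 'FIBT' */
#define XFS_FIBT_CRC_MAGIC	0x46494233	/* 'FIB3' */

#define INODES_PER_CHUNK	64

/* Hypothetical stand-in for the per-AG counters kept during the scan. */
struct counts {
	uint64_t	icount;
	uint64_t	ifreecount;
	uint32_t	fibtfreecount;
};

/*
 * Account for one btree record. Returns 1 if the record was treated as
 * a finobt record, 0 if it was treated as an inobt record.
 */
static int
account_rec(uint32_t magic, uint32_t freecount, struct counts *c)
{
	if (magic == XFS_IBT_MAGIC || magic == XFS_IBT_CRC_MAGIC) {
		c->icount += INODES_PER_CHUNK;
		c->ifreecount += freecount;
		return 0;
	}

	/* finobt: only the free count is comparable with the AGI */
	c->fibtfreecount += freecount;
	return 1;
}
```

The sketch keeps a separate finobt free counter because the finobt only
tracks chunks with free inodes, so its total inode count is not expected
to match the AGI.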
The new scan_single_finobt_chunk() function is similar to the inobt
equivalent with some finobt-specific logic. We can expect that
underlying inode chunk blocks are already marked used due to the
previous inobt scan. We can also expect to find every record
tracked by the finobt already accounted for in the in-core tree
with equivalent (and internally consistent) inobt record data.
Spit out a warning on any divergences from the above and add the
inodes referenced by the current finobt record to the appropriate
in-core tree.
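The per-inode consistency check can be pictured with the simplified
sketch below. It assumes both the finobt record and the in-core record
can be reduced to a 64-bit free mask, which is a simplification: the
patch itself uses XFS_INOBT_IS_FREE_DISK() and is_inode_free(), and
bumps 'suspect' once rather than counting mismatches.
compare_free_masks() is a hypothetical helper.

```c
#include <stdint.h>

#define INODES_PER_CHUNK	64

/*
 * Compare the free mask of a finobt record against the free mask
 * already recorded in core for the same chunk (bit j set means inode j
 * is free). Returns the number of inodes whose allocation state
 * differs and stores the number of free inodes in the finobt mask in
 * *nfree.
 */
static int
compare_free_masks(uint64_t finobt_free, uint64_t incore_free, int *nfree)
{
	int	j;
	int	mismatches = 0;

	*nfree = 0;
	for (j = 0; j < INODES_PER_CHUNK; j++) {
		int	fino_isfree = (int)((finobt_free >> j) & 1);
		int	core_isfree = (int)((incore_free >> j) & 1);

		if (fino_isfree)
			(*nfree)++;
		if (fino_isfree != core_isfree)
			mismatches++;
	}

	return mismatches;
}
```

The nfree value computed this way is what would then be compared with
the record's ir_freecount.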
Signed-off-by: Brian Foster <bfoster@redhat.com>
---
repair/scan.c | 251 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 246 insertions(+), 5 deletions(-)
diff --git a/repair/scan.c b/repair/scan.c
index 4b0ea04..1b64d8b 100644
--- a/repair/scan.c
+++ b/repair/scan.c
@@ -46,6 +46,7 @@ struct aghdr_cnts {
__uint64_t fdblocks;
__uint64_t icount;
__uint64_t ifreecount;
+ __uint32_t fibtfreecount;
};
void
@@ -897,6 +898,208 @@ _("inode rec for ino %" PRIu64 " (%d/%d) overlaps existing rec (start %d/%d)\n")
return suspect;
}
+static int
+scan_single_finobt_chunk(
+ xfs_agnumber_t agno,
+ xfs_inobt_rec_t *rp,
+ int suspect)
+{
+ xfs_ino_t lino;
+ xfs_agino_t ino;
+ xfs_agblock_t agbno;
+ int j;
+ int nfree;
+ int off;
+ int state;
+ ino_tree_node_t *first_rec, *last_rec, *ino_rec;
+
+ ino = be32_to_cpu(rp->ir_startino);
+ off = XFS_AGINO_TO_OFFSET(mp, ino);
+ agbno = XFS_AGINO_TO_AGBNO(mp, ino);
+ lino = XFS_AGINO_TO_INO(mp, agno, ino);
+
+ /*
+ * on multi-block chunks, all chunks start at the beginning of the
+ * block. with multi-chunk blocks, all chunks must start on 64-inode
+ * boundaries since each block can hold N complete chunks. if fs has
+ * aligned inodes, all chunks must start at a fs_ino_alignment*N'th
+ * agbno. skip recs with badly aligned starting inodes.
+ */
+ if (ino == 0 ||
+ (inodes_per_block <= XFS_INODES_PER_CHUNK && off != 0) ||
+ (inodes_per_block > XFS_INODES_PER_CHUNK &&
+ off % XFS_INODES_PER_CHUNK != 0) ||
+ (fs_aligned_inodes && agbno % fs_ino_alignment != 0)) {
+ do_warn(
+ _("badly aligned finobt inode rec (starting inode = %" PRIu64 ")\n"),
+ lino);
+ suspect++;
+ }
+
+ /*
+ * verify numeric validity of inode chunk first before inserting into a
+ * tree. don't have to worry about the overflow case because the
+ * starting ino number of a chunk can only get within 255 inodes of max
+ * (NULLAGINO). if it gets closer, the agino number will be illegal as
+ * the agbno will be too large.
+ */
+ if (verify_aginum(mp, agno, ino)) {
+ do_warn(
+_("bad starting inode # (%" PRIu64 " (0x%x 0x%x)) in finobt rec, skipping rec\n"),
+ lino, agno, ino);
+ return ++suspect;
+ }
+
+ if (verify_aginum(mp, agno,
+ ino + XFS_INODES_PER_CHUNK - 1)) {
+ do_warn(
+_("bad ending inode # (%" PRIu64 " (0x%x 0x%zx)) in finobt rec, skipping rec\n"),
+ lino + XFS_INODES_PER_CHUNK - 1,
+ agno,
+ ino + XFS_INODES_PER_CHUNK - 1);
+ return ++suspect;
+ }
+
+ /*
+ * cross check state of each block containing inodes referenced by the
+ * finobt against what we have already scanned from the alloc inobt.
+ */
+ if (off == 0 && !suspect) {
+ for (j = 0;
+ j < XFS_INODES_PER_CHUNK;
+ j += mp->m_sb.sb_inopblock) {
+ agbno = XFS_AGINO_TO_AGBNO(mp, ino + j);
+
+ state = get_bmap(agno, agbno);
+ if (state == XR_E_INO) {
+ continue;
+ } else if ((state == XR_E_UNKNOWN) ||
+ (state == XR_E_INUSE_FS && agno == 0 &&
+ ino + j >= first_prealloc_ino &&
+ ino + j < last_prealloc_ino)) {
+ do_warn(
+_("inode chunk claims untracked block, finobt block - agno %d, bno %d, inopb %d\n"),
+ agno, agbno, mp->m_sb.sb_inopblock);
+
+ set_bmap(agno, agbno, XR_E_INO);
+ suspect++;
+ } else {
+ do_warn(
+_("inode chunk claims used block, finobt block - agno %d, bno %d, inopb %d\n"),
+ agno, agbno, mp->m_sb.sb_inopblock);
+ return ++suspect;
+ }
+ }
+ }
+
+ /*
+ * ensure we have an incore entry for each chunk
+ */
+ find_inode_rec_range(mp, agno, ino, ino + XFS_INODES_PER_CHUNK,
+ &first_rec, &last_rec);
+
+ if (first_rec) {
+ if (suspect)
+ return suspect;
+
+ /*
+ * verify consistency between finobt record and incore state
+ */
+ if (first_rec->ino_startnum != ino) {
+ do_warn(
+_("finobt rec for ino %" PRIu64 " (%d/%u) does not match existing rec (%d/%d)\n"),
+ lino, agno, ino, agno, first_rec->ino_startnum);
+ return ++suspect;
+ }
+
+ nfree = 0;
+ for (j = 0; j < XFS_INODES_PER_CHUNK; j++) {
+ int isfree = XFS_INOBT_IS_FREE_DISK(rp, j);
+
+ if (isfree)
+ nfree++;
+
+ /*
+ * inode allocation state should be consistent between
+ * the inobt and finobt
+ */
+ if (!suspect &&
+ isfree != is_inode_free(first_rec, j))
+ suspect++;
+ }
+
+ goto check_freecount;
+ }
+
+ /*
+ * the finobt contains a record that the previous alloc inobt scan never
+ * found. insert the inodes into the appropriate tree.
+ */
+ do_warn(_("undiscovered finobt record, ino %" PRIu64 " (%d/%u)\n"),
+ lino, agno, ino);
+
+ if (!suspect) {
+ /*
+ * inodes previously inserted into the uncertain tree should be
+ * superseded by these when the uncertain tree is processed
+ */
+ nfree = 0;
+ if (XFS_INOBT_IS_FREE_DISK(rp, 0)) {
+ nfree++;
+ ino_rec = set_inode_free_alloc(mp, agno, ino);
+ } else {
+ ino_rec = set_inode_used_alloc(mp, agno, ino);
+ }
+ for (j = 1; j < XFS_INODES_PER_CHUNK; j++) {
+ if (XFS_INOBT_IS_FREE_DISK(rp, j)) {
+ nfree++;
+ set_inode_free(ino_rec, j);
+ } else {
+ set_inode_used(ino_rec, j);
+ }
+ }
+ } else {
+ /*
+ * this should handle the case where the inobt scan may have
+ * already added uncertain inodes
+ */
+ nfree = 0;
+ for (j = 0; j < XFS_INODES_PER_CHUNK; j++) {
+ if (XFS_INOBT_IS_FREE_DISK(rp, j)) {
+ add_aginode_uncertain(mp, agno, ino + j, 1);
+ nfree++;
+ } else {
+ add_aginode_uncertain(mp, agno, ino + j, 0);
+ }
+ }
+ }
+
+check_freecount:
+
+ /*
+ * Verify that the record freecount matches the actual number of free
+ * inodes counted in the record. Don't increment 'suspect' here, since
+ * we have already verified the allocation state of the individual
+ * inodes against the in-core state. This will have already incremented
+ * 'suspect' if something is wrong. If suspect hasn't been set at this
+ * point, these warnings mean that we have a simple freecount
+ * inconsistency or a stray finobt record (as opposed to a broader tree
+ * corruption). Issue a warning and continue the scan. The final btree
+ * reconstruction will correct this naturally.
+ */
+ if (nfree != be32_to_cpu(rp->ir_freecount)) {
+ do_warn(
+_("finobt ir_freecount/free mismatch, inode chunk %d/%u, freecount %d nfree %d\n"),
+ agno, ino, be32_to_cpu(rp->ir_freecount), nfree);
+ }
+
+ if (!nfree) {
+ do_warn(
+_("finobt record with no free inodes, inode chunk %d/%u\n"), agno, ino);
+ }
+
+ return suspect;
+}
/*
* this one walks the inode btrees sucking the info there into
@@ -1005,12 +1208,29 @@ _("inode btree block claimed (state %d), agno %d, bno %d, suspect %d\n"),
* the block. skip processing of bogus records.
*/
for (i = 0; i < numrecs; i++) {
- agcnts->agicount += XFS_INODES_PER_CHUNK;
- agcnts->icount += XFS_INODES_PER_CHUNK;
- agcnts->agifreecount += be32_to_cpu(rp[i].ir_freecount);
- agcnts->ifreecount += be32_to_cpu(rp[i].ir_freecount);
+ if (magic == XFS_IBT_MAGIC ||
+ magic == XFS_IBT_CRC_MAGIC) {
+ agcnts->agicount += XFS_INODES_PER_CHUNK;
+ agcnts->icount += XFS_INODES_PER_CHUNK;
+ agcnts->agifreecount +=
+ be32_to_cpu(rp[i].ir_freecount);
+ agcnts->ifreecount +=
+ be32_to_cpu(rp[i].ir_freecount);
+
+ suspect = scan_single_ino_chunk(agno, &rp[i],
+ suspect);
+ } else {
+ /*
+ * the finobt tracks records with free inodes,
+ * so only the free inode count is expected to be
+ * consistent with the agi
+ */
+ agcnts->fibtfreecount +=
+ be32_to_cpu(rp[i].ir_freecount);
- suspect = scan_single_ino_chunk(agno, &rp[i], suspect);
+ suspect = scan_single_finobt_chunk(agno, &rp[i],
+ suspect);
+ }
}
if (suspect)
@@ -1198,6 +1418,20 @@ validate_agi(
be32_to_cpu(agi->agi_root), agno);
}
+ if (xfs_sb_version_hasfinobt(&mp->m_sb)) {
+ bno = be32_to_cpu(agi->agi_free_root);
+ if (bno != 0 && verify_agbno(mp, agno, bno)) {
+ magic = xfs_sb_version_hascrc(&mp->m_sb) ?
+ XFS_FIBT_CRC_MAGIC : XFS_FIBT_MAGIC;
+ scan_sbtree(bno, be32_to_cpu(agi->agi_free_level),
+ agno, 0, scan_inobt, 1, magic, agcnts,
+ &xfs_inobt_buf_ops);
+ } else {
+ do_warn(_("bad agbno %u for finobt root, agno %d\n"),
+ be32_to_cpu(agi->agi_free_root), agno);
+ }
+ }
+
if (be32_to_cpu(agi->agi_count) != agcnts->agicount) {
do_warn(_("agi_count %u, counted %u in ag %u\n"),
be32_to_cpu(agi->agi_count), agcnts->agicount, agno);
@@ -1208,6 +1442,13 @@ validate_agi(
be32_to_cpu(agi->agi_freecount), agcnts->agifreecount, agno);
}
+ if (xfs_sb_version_hasfinobt(&mp->m_sb) &&
+ be32_to_cpu(agi->agi_freecount) != agcnts->fibtfreecount) {
+ do_warn(_("agi_freecount %u, counted %u in ag %u finobt\n"),
+ be32_to_cpu(agi->agi_freecount), agcnts->fibtfreecount,
+ agno);
+ }
+
for (i = 0; i < XFS_AGI_UNLINKED_BUCKETS; i++) {
xfs_agino_t agino = be32_to_cpu(agi->agi_unlinked[i]);
--
1.8.3.1