From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org, Brian Foster <bfoster@redhat.com>
Subject: Re: [PATCH 4/5] xfs: scrub/repair should update filesystem metadata health
Date: Tue, 16 Apr 2019 11:09:05 +1000 [thread overview]
Message-ID: <20190416010905.GJ29573@dread.disaster.area> (raw)
In-Reply-To: <155537399567.27935.2695652869908662243.stgit@magnolia>
On Mon, Apr 15, 2019 at 05:19:55PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
>
> Now that we have the ability to track sick metadata in-core, make scrub
> and repair update those health assessments after doing work.
>
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
...
> +/*
> + * Scrub and In-Core Filesystem Health Assessments
> + * ===============================================
> + *
> + * Online scrub and repair have the time and the ability to perform stronger
> + * checks than we can do from the metadata verifiers, because they can
> + * cross-reference records between data structures. Therefore, scrub is in a
> + * good position to update the online filesystem health assessments to reflect
> + * the good/bad state of the data structure.
> + *
> + * We therefore extend scrub in the following ways to achieve this:
> + *
> + * 1. Create a "sick_mask_update" field in the scrub context. When we're
> + * setting up a scrub call, set this to the default XFS_SICK_* flag(s) for the
> + * selected scrub type (call it A). Scrub and repair functions can override
> + * the default sick_mask_update value if they choose.
The name of this field seems ... non-obvious. It's the current
health state of the object as the scrubber has scanned it, right?
And that if the scrubber has detected an error, it will contain
the things that wrong with the object?
I guess it's the "update" part of the name that I don't quite
understand. "update" is an action, not a state. we update the
object with the current sickness state mask, the mask itself is not
doing any updating. Just calling it "sick_mask" gets rid of that
icky feeling I have about it's name....
> + * 2. If the scrubber returns a runtime error code, we exit making no changes
> + * to the incore sick state.
> + *
> + * 3. If the scrubber finds that A is clean, use sick_mask_update to clear the
> + * incore sick flags before exiting.
> + *
> + * 4. If the scrubber finds that A is corrupt, use sick_mask_update to set the
> + * incore sick flags. If the user didn't want to repair then we exit, leaving
> + * the metadata structure unfixed and the sick flag set.
> + *
> + * 5. Now we know that A is corrupt and the user wants to repair, so run the
> + * repairer. If the repairer returns an error code, we exit with that error
> + * code, having made no further changes to the incore sick state.
> + *
> + * 6. If repair rebuilds A correctly and the subsequent re-scrub of A is
> + * clean, use sick_mask_update to clear the incore sick flags. This should
> + * have the effect that A is no longer marked sick.
.... so it really does hold "current state" and not an "update"
operation :)
> + * 7. If repair rebuilds A incorrectly, the re-scrub will find it corrupt and
> + * use sick_mask_update to set the incore sick flags. This should have no
> + * externally visible effect since we already set them in step (4).
> + *
> + * There are some complications to this story, however. For certain types of
> + * complementary metadata indices (e.g. inobt/finobt), it is easier to rebuild
> + * both structures at the same time. The following principles apply to this
> + * type of repair strategy:
> + *
> + * 8. Any repair function that rebuilds multiple structures should update
> + * sick_mask_visible to reflect whatever other structures are rebuilt, and
> + * verify that all the rebuilt structures can pass a scrub check. The
> + * outcomes of 5-7 still apply, but with a sick_mask_update that covers
> + * everything being rebuilt.
> + */
Otherwise this all makes sense and the patch is pretty straight
forward...
> +static const struct xchk_health_map type_to_health_flag[XFS_SCRUB_TYPE_NR] = {
> + [XFS_SCRUB_TYPE_SB] = { XHG_AG, XFS_SICK_AG_SB },
> + [XFS_SCRUB_TYPE_AGF] = { XHG_AG, XFS_SICK_AG_AGF },
> + [XFS_SCRUB_TYPE_AGFL] = { XHG_AG, XFS_SICK_AG_AGFL },
> + [XFS_SCRUB_TYPE_AGI] = { XHG_AG, XFS_SICK_AG_AGI },
> + [XFS_SCRUB_TYPE_BNOBT] = { XHG_AG, XFS_SICK_AG_BNOBT },
> + [XFS_SCRUB_TYPE_CNTBT] = { XHG_AG, XFS_SICK_AG_CNTBT },
> + [XFS_SCRUB_TYPE_INOBT] = { XHG_AG, XFS_SICK_AG_INOBT },
> + [XFS_SCRUB_TYPE_FINOBT] = { XHG_AG, XFS_SICK_AG_FINOBT },
> + [XFS_SCRUB_TYPE_RMAPBT] = { XHG_AG, XFS_SICK_AG_RMAPBT },
> + [XFS_SCRUB_TYPE_REFCNTBT] = { XHG_AG, XFS_SICK_AG_REFCNTBT },
> + [XFS_SCRUB_TYPE_INODE] = { XHG_INO, XFS_SICK_INO_CORE },
> + [XFS_SCRUB_TYPE_BMBTD] = { XHG_INO, XFS_SICK_INO_BMBTD },
> + [XFS_SCRUB_TYPE_BMBTA] = { XHG_INO, XFS_SICK_INO_BMBTA },
> + [XFS_SCRUB_TYPE_BMBTC] = { XHG_INO, XFS_SICK_INO_BMBTC },
> + [XFS_SCRUB_TYPE_DIR] = { XHG_INO, XFS_SICK_INO_DIR },
> + [XFS_SCRUB_TYPE_XATTR] = { XHG_INO, XFS_SICK_INO_XATTR },
> + [XFS_SCRUB_TYPE_SYMLINK] = { XHG_INO, XFS_SICK_INO_SYMLINK },
> + [XFS_SCRUB_TYPE_PARENT] = { XHG_INO, XFS_SICK_INO_PARENT },
> + [XFS_SCRUB_TYPE_RTBITMAP] = { XHG_RT, XFS_SICK_RT_BITMAP },
> + [XFS_SCRUB_TYPE_RTSUM] = { XHG_RT, XFS_SICK_RT_SUMMARY },
> + [XFS_SCRUB_TYPE_UQUOTA] = { XHG_FS, XFS_SICK_FS_UQUOTA },
> + [XFS_SCRUB_TYPE_GQUOTA] = { XHG_FS, XFS_SICK_FS_GQUOTA },
> + [XFS_SCRUB_TYPE_PQUOTA] = { XHG_FS, XFS_SICK_FS_PQUOTA },
> +};
/me wonders if the primary superblock is XHG_FS or XHG_AG....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2019-04-16 1:09 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-04-16 0:19 [PATCH v3 0/5] xfs: scrub/repair update health tracking Darrick J. Wong
2019-04-16 0:19 ` [PATCH 1/5] xfs: refactor scrub context initialization Darrick J. Wong
2019-04-16 0:47 ` Dave Chinner
2019-04-16 0:19 ` [PATCH 2/5] xfs: collapse scrub bool state flags into a single unsigned int Darrick J. Wong
2019-04-16 0:49 ` Dave Chinner
2019-04-16 0:19 ` [PATCH 3/5] xfs: hoist the already_fixed variable to the scrub context Darrick J. Wong
2019-04-16 0:51 ` Dave Chinner
2019-04-16 0:54 ` Darrick J. Wong
2019-04-16 0:19 ` [PATCH 4/5] xfs: scrub/repair should update filesystem metadata health Darrick J. Wong
2019-04-16 1:09 ` Dave Chinner [this message]
2019-04-16 1:31 ` Darrick J. Wong
2019-04-16 1:43 ` [PATCH v2 " Darrick J. Wong
2019-04-16 7:57 ` Dave Chinner
2019-04-16 0:20 ` [PATCH 5/5] xfs: scrub should only cross-reference with healthy btrees Darrick J. Wong
2019-04-16 1:16 ` Dave Chinner
2019-04-16 1:45 ` [PATCH v2 " Darrick J. Wong
2019-04-16 7:58 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190416010905.GJ29573@dread.disaster.area \
--to=david@fromorbit.com \
--cc=bfoster@redhat.com \
--cc=darrick.wong@oracle.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox