From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 05/22] xfs: scrub in-memory metadata buffers
Date: Mon, 24 Jul 2017 22:27:40 -0700 [thread overview]
Message-ID: <20170725052740.GN4352@magnolia> (raw)
In-Reply-To: <20170725033203.GH17762@dastard>
On Tue, Jul 25, 2017 at 01:32:03PM +1000, Dave Chinner wrote:
> On Mon, Jul 24, 2017 at 05:14:33PM -0700, Darrick J. Wong wrote:
> > On Tue, Jul 25, 2017 at 09:38:13AM +1000, Dave Chinner wrote:
> > > On Mon, Jul 24, 2017 at 03:36:54PM -0700, Darrick J. Wong wrote:
> > > > On Mon, Jul 24, 2017 at 11:43:27AM +1000, Dave Chinner wrote:
> > > > > On Thu, Jul 20, 2017 at 09:39:00PM -0700, Darrick J. Wong wrote:
> > > > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > > >
> > > > > > Call the verifier function for all in-memory metadata buffers, looking
> > > > > > for memory corruption either due to bad memory or coding bugs.
> > > > >
> > > > > How does this fit into the bigger picture? We can't do an exhaustive
> > > > > search of the in memory buffer cache, because access is racy w.r.t.
> > > > > the life cycle of in memory buffers.
> > > > >
> > > > > Also, if we are doing a full scrub, we're going to hit and then
> > > > > check the cached in-memory buffers anyway, so I'm missing the
> > > > > context that explains why this code is necessary.
> > > >
> > > > Before we start scanning the filesystem (which could lead to clean
> > > > buffers being pushed out of memory and later reread), we want to check
> > > > the buffers that have been sitting around in memory to see if they've
> > > > mutated since the last time the verifiers ran.
> > >
> > > I'm not sure we need a special cache walk to do this.
> > >
> > > My thinking is that if the buffers get pushed out of memory, the
> > > verifier will be run at that time, so we don't need to run the
> > > verifier before a scrub to avoid problems here.
> >
> > Agreed.
> >
> > > Further, if we read the buffer as part of the scrub and it's found
> > > in cache, then if the scrub finds a corruption we'll know it
> > > happened between the last verifier invocation and the scrub.
> >
> > Hm. Prior to the introduction of the metabufs scanner a few weeks ago,
> > I had thought it sufficient to assume that memory won't get corrupt, so
> > as long as the read verifier ran at /some/ point in the past we didn't
> > need to recheck now.
> >
> > What if we scrap the metabufs scanner and adapt the read verifier
> > function pointer to allow scrub to bypass the crc check and return the
> > _THIS_IP_ from any failing structural test? Then scrubbers can call the
> > read verifier directly and extract failure info directly.
>
> Yeah, that would work - rather than adapting the .read_verify op
> we currently have, maybe a new op .read_verify_nocrc could be added?
> THat would mostly just be a different wrapper around the existing
> verify functions that are shared between the read and write
> verifiers...
I was going to call them ->verify_structure() or something like that.
> > > If the buffer is not in cache and scrub reads the metadata from
> > > disk, then the verifier should fire on read if the item is corrupt
> > > coming off disk. If the verifier doesn't find corruption in this
> > > case but scrub does, then we've got to think about whether the
> > > verifier has sufficient coverage.
> >
> > Scrub has more comprehensive checks (or it will when xref comes along)
> > so this is likely to happen, fyi.
>
> Yup, I expect it will. :) I also expect this to point out where the
> verifiers can be improved, because I'm sure we haven't caught all
> the "obviously wrong" cases in the verifiers yet...
Well yes. :) I have been running the "dangerous_fuzzers dangerous_scrub
dangerous_repair" tests again, as they fuzz everything to see what scrub
complains about vs. what repair actually tries to fix.
Current wtf list:
- do we need to check all the padding areas? does repair?
- problems with repair not resetting di_nlink when it moves stuff to l+f?
(x/380)
- repair doesn't correct remote symlink blocks
- scrub doesn't barf on free block problems?? (x/396)
- repair doesn't notice attr remote value block problems? (x/404)
- XFS_WANT_CORRUPTED_ are verifiers, but should not trigger asserts?
- xfs_repair segfaults when freetag is too big
(Actually, I already sent a fix a couple of days ago for that last one,
but nobody has reviewed it.)
--D
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
next prev parent reply other threads:[~2017-07-25 5:27 UTC|newest]
Thread overview: 63+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-07-21 4:38 [PATCH v8 00/22] xfs: online scrub support Darrick J. Wong
2017-07-21 4:38 ` [PATCH 01/22] xfs: query the per-AG reservation counters Darrick J. Wong
2017-07-23 16:16 ` Allison Henderson
2017-07-23 22:25 ` Dave Chinner
2017-07-24 19:07 ` Darrick J. Wong
2017-07-21 4:38 ` [PATCH 02/22] xfs: add scrub tracepoints Darrick J. Wong
2017-07-23 16:23 ` Allison Henderson
2017-07-21 4:38 ` [PATCH 03/22] xfs: create an ioctl to scrub AG metadata Darrick J. Wong
2017-07-23 16:37 ` Allison Henderson
2017-07-23 23:45 ` Dave Chinner
2017-07-24 21:14 ` Darrick J. Wong
2017-07-21 4:38 ` [PATCH 04/22] xfs: generic functions to scrub metadata and btrees Darrick J. Wong
2017-07-23 16:40 ` Allison Henderson
2017-07-24 1:05 ` Dave Chinner
2017-07-24 21:58 ` Darrick J. Wong
2017-07-24 23:15 ` Dave Chinner
2017-07-25 0:39 ` Darrick J. Wong
2017-07-21 4:39 ` [PATCH 05/22] xfs: scrub in-memory metadata buffers Darrick J. Wong
2017-07-23 16:48 ` Allison Henderson
2017-07-24 1:43 ` Dave Chinner
2017-07-24 22:36 ` Darrick J. Wong
2017-07-24 23:38 ` Dave Chinner
2017-07-25 0:14 ` Darrick J. Wong
2017-07-25 3:32 ` Dave Chinner
2017-07-25 5:27 ` Darrick J. Wong [this message]
2017-07-21 4:39 ` [PATCH 06/22] xfs: scrub the backup superblocks Darrick J. Wong
2017-07-23 16:50 ` Allison Henderson
2017-07-25 4:05 ` Dave Chinner
2017-07-25 5:42 ` Darrick J. Wong
2017-07-21 4:39 ` [PATCH 07/22] xfs: scrub AGF and AGFL Darrick J. Wong
2017-07-23 16:59 ` Allison Henderson
2017-07-21 4:39 ` [PATCH 08/22] xfs: scrub the AGI Darrick J. Wong
2017-07-23 17:02 ` Allison Henderson
2017-07-21 4:39 ` [PATCH 09/22] xfs: scrub free space btrees Darrick J. Wong
2017-07-23 17:09 ` Allison Henderson
2017-07-21 4:39 ` [PATCH 10/22] xfs: scrub inode btrees Darrick J. Wong
2017-07-23 17:15 ` Allison Henderson
2017-07-21 4:39 ` [PATCH 11/22] xfs: scrub rmap btrees Darrick J. Wong
2017-07-23 17:21 ` Allison Henderson
2017-07-21 4:39 ` [PATCH 12/22] xfs: scrub refcount btrees Darrick J. Wong
2017-07-23 17:25 ` Allison Henderson
2017-07-21 4:39 ` [PATCH 13/22] xfs: scrub inodes Darrick J. Wong
2017-07-23 17:38 ` Allison Henderson
2017-07-24 20:02 ` Darrick J. Wong
2017-07-21 4:40 ` [PATCH 14/22] xfs: scrub inode block mappings Darrick J. Wong
2017-07-23 17:41 ` Allison Henderson
2017-07-24 20:05 ` Darrick J. Wong
2017-07-21 4:40 ` [PATCH 15/22] xfs: scrub directory/attribute btrees Darrick J. Wong
2017-07-23 17:45 ` Allison Henderson
2017-07-21 4:40 ` [PATCH 16/22] xfs: scrub directory metadata Darrick J. Wong
2017-07-23 17:51 ` Allison Henderson
2017-07-21 4:40 ` [PATCH 17/22] xfs: scrub directory freespace Darrick J. Wong
2017-07-23 17:55 ` Allison Henderson
2017-07-21 4:40 ` [PATCH 18/22] xfs: scrub extended attributes Darrick J. Wong
2017-07-23 17:57 ` Allison Henderson
2017-07-21 4:40 ` [PATCH 19/22] xfs: scrub symbolic links Darrick J. Wong
2017-07-23 17:59 ` Allison Henderson
2017-07-21 4:40 ` [PATCH 20/22] xfs: scrub parent pointers Darrick J. Wong
2017-07-23 18:03 ` Allison Henderson
2017-07-21 4:40 ` [PATCH 21/22] xfs: scrub realtime bitmap/summary Darrick J. Wong
2017-07-23 18:05 ` Allison Henderson
2017-07-21 4:40 ` [PATCH 22/22] xfs: scrub quota information Darrick J. Wong
2017-07-23 18:07 ` Allison Henderson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170725052740.GN4352@magnolia \
--to=darrick.wong@oracle.com \
--cc=david@fromorbit.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox