Re: [PATCH 2/2] mdrestore: warn about corruption if log is dirty

linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: Jan Tulak <jtulak@redhat.com>,
	linux-xfs@vger.kernel.org, sandeen@sandeen.net
Subject: Re: [PATCH 2/2] mdrestore: warn about corruption if log is dirty
Date: Thu, 13 Apr 2017 12:51:05 +1000	[thread overview]
Message-ID: <20170413025105.GD12369@dastard> (raw)
In-Reply-To: <20170412110403.GB6834@bfoster.bfoster>

On Wed, Apr 12, 2017 at 07:04:04AM -0400, Brian Foster wrote:
> On Wed, Apr 12, 2017 at 08:34:05AM +1000, Dave Chinner wrote:
> > On Tue, Apr 11, 2017 at 04:12:37PM +0200, Jan Tulak wrote:
> > > A dirty log in an obfuscated dump means that a corruption can happen
> > > when replaying the log (which contains unobfuscated data). Warn the user
> > > about this possibility.
> > > 
> > > The xlog workaround is copy&paste solution from repair/phase2.c and
> > > other tools, because the function is not implemented in libxlog.
> > > 
> > > Signed-off-by: Jan Tulak <jtulak@redhat.com>
> > 
> > I think this is overkill. mdrestore is not the place
> > to be interpreting the state of the dumped image - it is a basic
> > "restore the image" program, not a "check the validity of the image"
> > program.
> > 
> 
> I think that's a reasonable argument for the mdrestore side. I'm less
> interested in seeing a warning on the restore side in general,
> personally. I was initially thinking it would have required less code
> and the whole obfuscation detection thing is getting into hackish
> territory, to be fair.
> 
> > Secondly, if people are having problems with running log recovery on
> > a restored obfuscated image and getting corruption and not knowing
> > why or what to do, then that is a /documentation and training/
> > problem, not a code problem.
> > 
> > i.e. the problem is that people who aren't developers are trying to
> > use tools that were written for developers to do forensic analysis
> > of failures. Don't dumb down the tool for clueless users - point the
> > users at the documentation that the tool requires to use correctly...
> > 
> 
> Put me in the clueless users bucket, then. This started with a customer
> with a corrupted filesystem that provided a metadump that exhibited
> filesystem corruption. A support person began the process of diagnosing
> the problem and it eventually got to me, who had to spend a nontrivial
> amount of time trying to identify what the problem was, see if I could
> reproduce it on my own to verify it was actually specific to the
> metadump, etc.
> 
> This is not an obvious "your metadump is broken" log recovery failure.
> It's a latent directory corruption that doesn't obviously have anything
> to do with log recovery in the first place. I'm sure I'll be able to
> spot it going forward for some time while it's fresh in my mind, but I
> expect to lose track of that eventually given the rarity (of debugging
> log recovery issues). It's not reasonable at all to expect regular users
> or support people to understand this enough to filter out bad images or
> know when to use or not use a certain combination of metadump options,
> because it otherwise requires a detailed understanding of XFS logging
> and directory internals.

Log recovery on an obfuscated directory is, to me, a known obvious
vector for directory corruption because we replay unobfuscated
dirents over obfuscated on-disk data. Buffer logging is done in
aligned 128 byte chunks, so it /should/ be obvious that the recovery
of directory data buffers will partially overwrite dirents on disk
even when they were not directly modified by the user.  And because
this causes an obfuscated/clear text mismatch in the dirent name,
the hash will not calculate to teh same as what the directory stored
for that dirent. Hence the corruption reports that repair will now
spew...

This was always considered a known problem for obfuscated metadump
restorations - the unobfuscated log will result in recovery issues
and name/data corruptions for dirs and xattrs.  In hindsight, this
should have been documented long ago so you didn't have to waste the
time to "rediscover" it like you did. It wasn't documented because
both developers and users were far more concerned about the data
exposure issues than they were about whether the log unobfuscated
log replayed correctly or not.

IMO - and as I said to Eric on IRC - we should not be trying to work
around institutional problems (i.e. inability to train or impart the
necessary knowledge on support engineers) with code changes.
Training support engineers properly requires documentation and
knowledge distribution processes; the code implementing the tools
they are being taught about is not the right instrument to perform
this knowledge transfer....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

next prev parent reply	other threads:[~2017-04-13  2:51 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-04-11 14:12 [PATCH 0/2] xfsprogs: metadump/mdrestore warns about dirty journal Jan Tulak
2017-04-11 14:12 ` [PATCH 1/2] metadump: warn about corruption if log is dirty Jan Tulak
2017-04-11 18:30   ` Brian Foster
2017-04-11 18:34     ` Eric Sandeen
2017-04-11 18:43       ` Brian Foster
2017-04-11 19:01         ` Eric Sandeen
2017-04-11 23:44           ` Darrick J. Wong
2017-04-12 11:03             ` Brian Foster
2017-04-12 11:24               ` Jan Tulak
2017-04-11 14:12 ` [PATCH 2/2] mdrestore: " Jan Tulak
2017-04-11 18:33   ` Brian Foster
2017-04-11 18:39     ` Eric Sandeen
2017-04-11 18:49       ` Brian Foster
2017-04-11 18:59         ` Eric Sandeen
2017-04-11 22:34   ` Dave Chinner
2017-04-11 23:43     ` Darrick J. Wong
2017-04-12  1:48       ` Eric Sandeen
2017-04-12 11:26         ` Brian Foster
2017-04-12 11:06       ` Brian Foster
2017-04-12 17:45         ` Darrick J. Wong
2017-04-13  8:12           ` Jan Tulak
2017-04-12 11:04     ` Brian Foster
2017-04-13  2:51       ` Dave Chinner [this message]
2017-04-13 13:10         ` Brian Foster
2017-04-14  0:29           ` Dave Chinner
2017-04-14  2:54             ` Brian Foster
2017-05-25 17:29 ` [PATCH 0/2] xfsprogs: metadump/mdrestore warns about dirty journal Eric Sandeen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170413025105.GD12369@dastard \
    --to=david@fromorbit.com \
    --cc=bfoster@redhat.com \
    --cc=jtulak@redhat.com \
    --cc=linux-xfs@vger.kernel.org \
    --cc=sandeen@sandeen.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).