Re: XFS crash consistency bug : Loss of fsynced metadata operation

From: Dave Chinner <david@fromorbit.com>
To: Lukas Czerner <lczerner@redhat.com>
Cc: Jayashree Mohan <jayashree2912@gmail.com>,
	linux-xfs@vger.kernel.org,
	Vijaychidambaram Velayudhan Pillai <vijay@cs.utexas.edu>,
	Ashlie Martinez <ashmrtn@utexas.edu>,
	fstests@vger.kernel.org, Theodore Ts'o <tytso@mit.edu>
Subject: Re: XFS crash consistency bug : Loss of fsynced metadata operation
Date: Thu, 15 Mar 2018 21:06:46 +1100
Message-ID: <20180315100646.GE7000@dastard>
In-Reply-To: <20180315061557.5jajfodpkjdmcrld@rh_laptop>

On Thu, Mar 15, 2018 at 07:15:57AM +0100, Lukas Czerner wrote:
> On Thu, Mar 15, 2018 at 08:24:41AM +1100, Dave Chinner wrote:
> > On Wed, Mar 14, 2018 at 02:57:52PM +0100, Lukas Czerner wrote:
> > > On Thu, Mar 15, 2018 at 12:32:58AM +1100, Dave Chinner wrote:
> > > > On Wed, Mar 14, 2018 at 02:16:59PM +0100, Lukas Czerner wrote:
> > > > > just FYI the 042 xfstest does fail on xfs with what I think is stale
> > > > > data exposure. It might not be related at all to what crashmonkey is
> > > > > reporting but there is something wrong nevertheless.
> > > > 
> > > > generic/042 is unreliable and certain operations result in a
> > > > non-zero length file because of metadata commits/writeback that
> > > > occur as a result of the fallocate operations. It got removed from
> > > > the auto group because it isn't a reliable test about 3 years ago:
> > > 
> > > Sure, I just mean that it clearly exposes stale data on xfs. That is, the
> > > resulting file contains data that was previously written to the
> > > underlying image file to catch the exposure. I am aware of the non-zero
> > > length file problem, that's not what I am pointing out though.
> > 
> > What setup are you testing on? I haven't seen it fail in some time.
> > Here, on emulated pmem:
> 
> Virtual machine with Virtio devices backed by a linear lvs consisting of
> SCSI drives, all local.
> 
> > 
> > SECTION       -- xfs
> > FSTYP         -- xfs (debug)
> > PLATFORM      -- Linux/x86_64 test4 4.16.0-rc5-dgc
> > MKFS_OPTIONS  -- -f -m rmapbt=1,reflink=1 -i sparse=1 /dev/pmem1
> > MOUNT_OPTIONS -- /dev/pmem1 /mnt/scratch
> > 
> > xfs/042 10s ... 14s
> 
> We are talking about generic/042. xfs/042 is very much a different
> test.

Ugh copy-n-paste fail. Sorry.

I was looking at the right test, just running the wrong one.

Anyway, what makes you think this:

> 
> SECTION       -- xfs
> RECREATING    -- xfs on /dev/vdc1
> FSTYP         -- xfs (non-debug)
> PLATFORM      -- Linux/x86_64 rhel7 4.16.0-rc5+
> MKFS_OPTIONS  -- -f -f /dev/vdb1
> MOUNT_OPTIONS -- -o context=system_u:object_r:root_t:s0 /dev/vdb1 /mnt/test1
> 
> generic/042	 - output mismatch (see /root/Projects/xfstests-dev/results//ext4/generic/042.out.bad)
>     --- tests/generic/042.out	2018-03-14 05:56:38.619124060 -0400
>     +++ /root/Projects/xfstests-dev/results//ext4/generic/042.out.bad	2018-03-15 02:15:02.872113819 -0400
>     @@ -5,6 +5,16 @@
>      fpunch
>      wrote 65536/65536 bytes at offset 0
>      XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>     +0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd
>     +*
>     +000f000 0000 0000 0000 0000 0000 0000 0000 0000
>     +*

exposes stale data? The command is:

$XFS_IO_PROG -f -c "pwrite -S 1 0 64k" -c "$cmd 60k 4k" $file

i.e. we wrote bytes from 0 to 64k, then punched from 60k to 64k. If
the file is 64k in length, then it should contain either all "cdcd"
pattern, or there should be "cdcd" data except for the range from
60k to 64k where there should be zeros.
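
The two acceptable layouts described above can be modelled in a few
lines of Python. This is only an illustrative sketch: expected_layouts()
and matches_expected() are hypothetical names, not part of xfstests or
xfs_io, and the fill byte is left as a parameter rather than assumed.

```python
KIB = 1024


def expected_layouts(fill, size=64 * KIB, hole_start=60 * KIB, hole_len=4 * KIB):
    """Return the two file contents treated as acceptable for a 64k file.

    Hypothetical helper for illustration only; not part of xfstests.
    """
    # Layout 1: the whole file still holds the fill pattern.
    full = bytes([fill]) * size
    # Layout 2: same pattern, but the punched range reads back as zeros.
    punched = bytearray(full)
    punched[hole_start:hole_start + hole_len] = bytes(hole_len)
    return full, bytes(punched)


def matches_expected(data, fill):
    """True if data equals one of the two acceptable layouts."""
    return data in expected_layouts(fill)
```

For example, matches_expected(contents, 0xcd) would accept a 64k buffer
that is 0xcd throughout, or 0xcd up to 60k followed by 4k of zeros, and
reject anything else.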

The latter is exactly what the diff output is saying - "cdcd" data
from 0-60k, zeros from 60k to 64k. So there's no stale data exposure
occurring here (those bugs got fixed!), it's just that the test
output is unreliable and does not match the golden output.
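
To check the offsets in the quoted diff, a small Python sketch can decode
the hexdump lines. hexdump_runs() is a hypothetical helper written for
this illustration; it assumes the default hexdump layout (hex offset,
then 16-bit words, with "*" standing for repeated lines). The quoted
excerpt ends before the final offset line, so this only recovers where
each run starts.

```python
def hexdump_runs(lines):
    """Return (offset, words) for each data line of default hexdump output.

    Hypothetical helper for illustration. Lines consisting of '*' mean
    "same as the previous line, repeated" and carry no offset, so they
    are skipped.
    """
    runs = []
    for line in lines:
        line = line.strip()
        if not line or line == "*":
            continue
        offset_hex, *words = line.split()
        runs.append((int(offset_hex, 16), words))
    return runs


# The added lines from the quoted diff, with the leading '+' removed.
diff_lines = [
    "0000000 cdcd cdcd cdcd cdcd cdcd cdcd cdcd cdcd",
    "*",
    "000f000 0000 0000 0000 0000 0000 0000 0000 0000",
    "*",
]

runs = hexdump_runs(diff_lines)
for offset, words in runs:
    print(f"run starts at byte {offset} ({offset // 1024}k): {words[0]}")
```

The second run starts at 0xf000, which is 61440 bytes, i.e. 60k, the
start of the punched range.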

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

