From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: xfs@oss.sgi.com, Viro@disappointment.disaster,
	viro@ZenIV.linux.org.uk, Al@disappointment.disaster
Subject: Re: [RFC, PATCH 0/6] xfs: delalloc, DIO and corruption....
Date: Tue, 1 Apr 2014 07:54:08 -0400
Message-ID: <20140401115408.GA21540@bfoster.bfoster>
In-Reply-To: <20140331201757.GC17603@dastard>

On Tue, Apr 01, 2014 at 07:17:57AM +1100, Dave Chinner wrote:
> On Mon, Mar 31, 2014 at 01:22:43PM -0400, Brian Foster wrote:
> > On Fri, Mar 21, 2014 at 09:11:44PM +1100, Dave Chinner wrote:
> > > Hi folks,
> > > 
> > > This patch series mostly shuts a can of worms that Al opened when he
> > > found the cause of the generic/263 fsx failures. The fix for that is
> > > patch 6 of this series, but, well, there are a bunch of other
> > > problems that need to be fixed before making that change.
> > > 
> > > Basically, the direct IO block mapping behaviour was covering up a
> > > bunch of other bugs in the delayed allocation extent/page cache
> > > state coherency mappings. Essentially, we punch out the page cache
> > > in quite a few places without first cleaning up delayed allocation
> > > extents over that range and that exposes all sorts of nasty issues
> > > once the direct IO mapping changes are made.  All of these are
> > > existing problems, most of them are very unlikely to be seen in the
> > > wild.
> > > 
> > > This patch set passes xfstests on a 4k block size/4k page size
> > > config without problems. However, there is still a fsx failure in
> > > generic/127 on 1k block size/4k page size configurations that I
> > > haven't yet tracked down. That test was failing occasionally before
> > > this patch set as well, so it may be a completely unrelated problem.
> > > 
> > > The sad fact of this patchset is it is mostly playing whack-a-mole
> > > with visible symptoms of bugs.  It drives home the fact that
> > > bufferheads and the keeping of internal filesystem state attached to
> > > the page cache simply isn't a verifiable architecture.  After
> > > spending several days of doing nothing else but tracking down these
> > > inconsistencies I can only conclude that the code is complex,
> > > fragile, and extremely difficult to verify for correct behaviour.
> > > As such, I doubt that the fixes are entirely correct, so I'm left
> > > with using fsx and fsstress to tell me if I've broken anything.
> > > 
> > > Eyeballs appreciated, as are test results.
> > > 
> > 
> > I had xfstests running against this (on for-next) over the weekend
> > and it hit the following bug on xfs/297:
> > 
> > [ 6408.168767] kernel BUG at fs/xfs/xfs_aops.c:1336!
> > [ 6408.169542] invalid opcode: 0000 [#1] SMP 
> 
> Ok, so that's found another stale delalloc range where there
> shouldn't be one. I know there were still problems when I left because
> generic/127 was failing on 1k block size filesystems, but I haven't
> yet had a chance to get back to determine if the bug was the broken
> code in xfs_check_page_types() that Dan Carpenter noticed. Were you
> running with that fix?
> 

Ah, good point. I was running with the check_page_type() rework, but not
the most recent fix. I'll plan to test again with that included.

Brian

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs


Thread overview: 15+ messages
2014-03-21 10:11 [RFC, PATCH 0/6] xfs: delalloc, DIO and corruption Dave Chinner
2014-03-21 10:11 ` [PATCH 1/6] xfs: kill buffers over failed write ranges properly Dave Chinner
2014-03-21 10:11 ` [PATCH 2/6] xfs: write failure beyond EOF truncates too much data Dave Chinner
2014-03-29 15:14   ` Brian Foster
2014-04-04 15:26     ` Brian Foster
2014-04-04 21:26       ` Dave Chinner
2014-03-21 10:11 ` [PATCH 3/6] xfs: xfs_vm_write_end truncates too much on failure Dave Chinner
2014-03-21 10:11 ` [PATCH 4/6] xfs: zeroing space needs to punch delalloc blocks Dave Chinner
2014-03-21 10:11 ` [PATCH 5/6] xfs: splitting delalloc extents can run out of reservation Dave Chinner
2014-04-04 13:37   ` Brian Foster
2014-04-04 21:31     ` Dave Chinner
2014-03-21 10:11 ` [PATCH 6/6] xfs: don't map ranges that span EOF for direct IO Dave Chinner
2014-03-31 17:22 ` [RFC, PATCH 0/6] xfs: delalloc, DIO and corruption Brian Foster
2014-03-31 20:17   ` Dave Chinner
2014-04-01 11:54     ` Brian Foster [this message]
