public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH 4/6] libxfs: reused invalidated buffers leak state and data
Date: Sat, 5 Jul 2014 08:22:10 +1000	[thread overview]
Message-ID: <20140704222210.GM9508@dastard> (raw)
In-Reply-To: <20140704141509.GB29520@infradead.org>

On Fri, Jul 04, 2014 at 07:15:09AM -0700, Christoph Hellwig wrote:
> On Fri, Jul 04, 2014 at 03:57:13PM +1000, Dave Chinner wrote:
> > @@ -632,6 +632,12 @@ libxfs_putbuf(xfs_buf_t *bp)
> >  			pthread_mutex_unlock(&bp->b_lock);
> >  		}
> >  	}
> > +	/*
> > +	 * ensure that any errors on this use of the buffer don't carry
> > +	 * over to the next user.
> > +	 */
> > +	bp->b_error = 0;
> > +
> >  	cache_node_put(libxfs_bcache, (struct cache_node *)bp);
> 
> This seems a bit fishy to me.  For one I'm pretty sure it needs to be
> done before unlocking b_lock,

Fair enough.

> second it's different behavior from the
> kernel where we explicitly clear it in the caller for the rare case
> we want to reuse a buffer that had an error (xfs_buf_iodone_callbacks
> seems to be the only one).  Any reason to do this differently in
> userspace?

The userspace code that uses the buffer cache is much less
constrained than the kernel code. The userspace code is pretty nasty
in places, especially when it comes to buffer error handling.

We can't clear errors or zero buffer contents in libxfs_getbuf-*
like we do in the kernel, because those functions are used by the
libxfs_readbuf_* functions and hence need to leave the buffers
unchanged on cache hits. This is actually the only way to gather a
write error from a libxfs_writebuf() call - you need to
get the buffer again so you can check bp->b_error field - assuming
that the buffer is still in the cache when you check, that is.

This is very different to the kernel code which idoes not release
buffers on a write so we can wait on IO and check errors. The kernel
buffer cache also guarantees a buffer of a known initial state from
xfs_buf_get() even on a cache hit.

Hence the userspace buffer cache is behaving quite differently to
the kernel buffer cache and as a result it's leaking errors from
reads, invalidations and writes through
xfs_da_get_buf/libxfs_getbuf.  Current no userspace outside libxfs
code clears bp->b_error - very little code even checks it - so th
elibxfs code is tripping on stale errors left by the usrspace code.
libxfs_writebuf() already zeros bp->b_error to prevent propagation
of stale errors into future reads, so this patch is really just
closing the hole in the other buffer release path that the code
usually takes.

Doing a full audit and addition of error handling of all the
userspace code is a little beyond my resources right now. The only
thing I can really do quickly about the problem is clear the error
when we've finished with the buffer.

I'm open to other ways of fixing this, but right now we've got to
fix xfs_repair because it's currently breaking filesystems worse
than before xfs_repair was run...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2014-07-04 22:22 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-04  5:57 [PATCH 0/6] xfsprogs: fixes for 3.2.1 Dave Chinner
2014-07-04  5:57 ` [PATCH 1/6] repair: support more than 25 ACLs Dave Chinner
2014-07-04 14:23   ` Christoph Hellwig
2014-07-04  5:57 ` [PATCH 2/6] xfs_db: write command broken on 64 bit values Dave Chinner
2014-07-04 14:08   ` Christoph Hellwig
2014-07-04  5:57 ` [PATCH 3/6] repair: handle directory block corruption in phase 6 Dave Chinner
2014-07-04 14:24   ` Christoph Hellwig
2014-07-04  5:57 ` [PATCH 4/6] libxfs: reused invalidated buffers leak state and data Dave Chinner
2014-07-04 14:15   ` Christoph Hellwig
2014-07-04 22:22     ` Dave Chinner [this message]
2014-07-05  9:48       ` Christoph Hellwig
2014-07-06 23:54         ` Dave Chinner
2014-07-07  0:09           ` Dave Chinner
2014-07-07 10:05             ` Christoph Hellwig
2014-07-04  5:57 ` [PATCH 5/6] repair: fix quota inode handling in secondary superblocks Dave Chinner
2014-07-04 14:35   ` Christoph Hellwig
2014-07-04  5:57 ` [PATCH 6/6] repair: get rid of BADFSINO Dave Chinner
2014-07-04 14:15   ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140704222210.GM9508@dastard \
    --to=david@fromorbit.com \
    --cc=hch@infradead.org \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox