public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: Eric Sandeen <sandeen@sandeen.net>
To: lachlan@sgi.com
Cc: xfs-oss <xfs@oss.sgi.com>
Subject: Re: [PATCH] Prevent lookup from finding bad buffers
Date: Wed, 25 Nov 2009 16:29:09 -0600	[thread overview]
Message-ID: <4B0DAFB5.3050003@sandeen.net> (raw)
In-Reply-To: <4B0D92EB.8030501@sandeen.net>

Eric Sandeen wrote:
> Lachlan McIlroy wrote:
>> There's a bug in _xfs_buf_find() that will cause it to return buffers
>> that failed to be initialised.
>>
>> If a thread has a buffer locked and is waiting for I/O to initialise
>> it and another thread wants the same buffer the second thread will
>> wait on the buffer lock in _xfs_buf_find().  If the initial thread
>> gets an I/O error it marks the buffer in error and releases the
>> buffer lock.  The second thread gets the buffer lock, assumes the
>> buffer has been successfully initialised, and then tries to use it.
>>
>> Some callers of xfs_buf_get_flags() will check for B_DONE, and if
>> it's not set then re-issue the I/O, bust most callers assume the
>> buffer and it's contents are good and then use the uninitialised
>> data.
>>
>> The solution I've come up with is if we lookup a buffer and find
>> it's got b_error set or has been marked stale then unhash it from
>> the buffer hashtable and retry the lookup.  Also if we fail to setup
>> the buffer correctly in xfs_buf_get_flags() then mark the buffer in
>> error and unhash it.  If the buffer is marked stale then in
>> xfs_buf_free() inform the page cache that the contents of the pages
>> are no longer valid.
> 
> I managed to come up with a sorta-kinda testcase for this.
> 
> Fragmented freespace, many files in a dir, on raid5; simply doing
> drop caches / ls in a loop triggered it.
> 
> I guess raid5 is bad in this respect; in it's make_request() we have:
> 
>                 } else {
>                         /* cannot get stripe for read-ahead, just give-up */
>                         clear_bit(BIO_UPTODATE, &bi->bi_flags);
>                         finish_wait(&conf->wait_for_overlap, &w);
>                         break;
>                 }
> 
> and this happens fairly often.  This probably explains a large
> percentage of our xfs_da_do_buf(2) errors we've seen on the list.
> 
> From my testing, I think this suffices - and interestingly, Lachlan's
> original patch doesn't seem to help...
> 
> Comments?
> 
> Maybe could clean up the logic a bit... should this only be
> tested for XBF_READ buffers as well ... or maybe an assert that
> if !uptodate, error should be set ...
> 
> diff --git a/fs/xfs/linux-2.6/xfs_buf.c b/fs/xfs/linux-2.6/xfs_buf.c
> index 965df12..cbc0541 100644
> --- a/fs/xfs/linux-2.6/xfs_buf.c
> +++ b/fs/xfs/linux-2.6/xfs_buf.c
> @@ -1142,6 +1165,8 @@ xfs_buf_bio_end_io(
>  		if (unlikely(bp->b_error)) {
>  			if (bp->b_flags & XBF_READ)
>  				ClearPageUptodate(page);
> +		} else if (!test_bit(BIO_UPTODATE, &bio->bi_flags)) {
> +			ClearPageUptodate(bp);
>  		} else if (blocksize >= PAGE_CACHE_SIZE) {
>  			SetPageUptodate(page);
>  		} else if (!PagePrivate(page) &&

Ok, so that was shoot-from-the-hip, and actually it was tested on an
older kernel; upstream didn't demonstrate the problem, thanks to:

commit c2b00852fbae4f8c45c2651530ded3bd01bde814
Author: NeilBrown <neilb@suse.de>
Date:   Sun Dec 10 02:20:51 2006 -0800

[PATCH] md: return a non-zero error to bi_end_io as appropriate in raid5

Currently raid5 depends on clearing the BIO_UPTODATE flag to signal an
error
to higher levels.  While this should be sufficient, it is safer to
explicitly
set the error code as well - less room for confusion.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>

As Neil says, xfs should probably cope too by checking uptodate itself
as well, though...

-Eric

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

      reply	other threads:[~2009-11-25 22:28 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-02-10  2:48 [PATCH] Prevent lookup from finding bad buffers Lachlan McIlroy
2009-02-15 20:12 ` Christoph Hellwig
2009-11-25 20:26 ` Eric Sandeen
2009-11-25 22:29   ` Eric Sandeen [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4B0DAFB5.3050003@sandeen.net \
    --to=sandeen@sandeen.net \
    --cc=lachlan@sgi.com \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox