public inbox for linux-xfs@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH 5/6] xfs: fix buffer shudown reference count mismatch
Date: Sat, 3 Nov 2012 10:47:41 +1100
Message-ID: <20121102234741.GB29378@dastard>
In-Reply-To: <20121102131326.GG12578@infradead.org>

On Fri, Nov 02, 2012 at 09:13:26AM -0400, Christoph Hellwig wrote:
> > The fix that I've done here means all buffers going through this
> > path will take an extra reference, but that reference is only
> > dropped on async buffers. Because all the buffers are marked stale,
> > they are removed from the LRU, and so xfs_buftarg_wait() during
> > unmount does not find them and hence the remaining reference is
> > never removed. Hence the perag reference still remains, and we
> > assert fail there.
> > 
> > Solution seems simple - set the XBF_ASYNC flag on all buffers so
> > that the last reference is taken away correctly. Testing that now.
> 
> I don't like this.  ioend processing is very different for synchronous
> writes, with the most important difference being that synchronous
> writes need to wake the submitter at I/O completion.

I think that's irrelevant here - there will *never* be an IO waiter
at this point in time.  This processing is in log buffer IO
completion context, so the buffers are still pinned in memory. Hence
anyone trying to do IO on it will be waiting in xfs_buf_wait_unpin()
and never get to xfs_buf_iowait(). And because xfs_buf_wait_unpin()
is called with the buffer lock held, we'll never do the failure
handling in xfs_buf_item_unpin until the buffer IO is completed and
it is unlocked.

FWIW, this also indicates a problem in IO submission -
xfs_buf_wait_unpin() occurs after the shutdown checks, so we could
wake a pinned waiter that then issues IO after the shutdown has
started because there are no shutdown checks after
xfs_buf_wait_unpin() is called....
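
To make the ordering problem concrete, here is a toy userspace model
of it. This is not kernel code: the toy_* names, the structures and
the recheck_after_wait switch are all assumptions made up for the
illustration, and the recheck is only one conceivable way of closing
the window, not something proposed in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model of the ordering problem; NOT kernel code. */
struct toy_mount {
	bool shutdown;		/* models the filesystem shutdown state */
};

struct toy_io {
	int  pin_count;		/* models a buffer pinned by the log */
	bool issued;		/* set if IO was actually issued */
};

/*
 * Models waiting for the buffer to be unpinned. If shutdown_during_wait
 * is set, the shutdown begins while the submitter is asleep here.
 */
static void toy_wait_unpin(struct toy_io *io, struct toy_mount *mp,
			   bool shutdown_during_wait)
{
	assert(io->pin_count >= 0);
	if (io->pin_count == 0)
		return;
	if (shutdown_during_wait)
		mp->shutdown = true;
	io->pin_count = 0;	/* the waiter is woken once unpinned */
}

/*
 * Models submission: shutdown check first, then the unpin wait.
 * recheck_after_wait is a hypothetical extra check after the wait.
 * Returns true if IO was issued.
 */
static bool toy_submit(struct toy_io *io, struct toy_mount *mp,
		       bool shutdown_during_wait, bool recheck_after_wait)
{
	if (mp->shutdown)
		return false;	/* the only check in the described ordering */
	toy_wait_unpin(io, mp, shutdown_during_wait);
	if (recheck_after_wait && mp->shutdown)
		return false;	/* hypothetical second check */
	io->issued = true;
	return true;
}
```

With recheck_after_wait false, a submitter that was asleep on a pinned
buffer when the shutdown started still issues its IO; with it true,
the submission is refused.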

> From all I can
> see your v2 patch breaks that behaviour.  For 3.7-rc I'd suggest
> taking the additional reference conditionally.

It's not a conditional reference - every buffer needs it. Right now
we have an unconditional use after free because bp->b_iodone() drops
the last reference on the buffer, then calls xfs_buf_iodone() again.
The only thing the XBF_ASYNC flag currently determines is
whether we get an assert failure by calling xfs_buf_rele() on a
freed buffer or not.

What we are doing here is ioend processing without actually having
issued an IO. i.e. we are using the ioend processing for error
handling, not for IO completion. Hence we have to set the buffer up
as though it has been under IO so that ioend processing does exactly
what we want. And what we want is the buffer to be unconditionally
freed at the end of ioend processing.

What we end up with when we hit the "remove && freed" case is
a buffer that is not pinned, is not under IO and will have
no new references because we are in a shutdown situation. We need to
mark the buffer stale to ensure it is not written, and free the
buffer when the last reference goes away. That reference is the
reference the transaction subsystem owns and it guarantees that
there are no IO completion waiters on the buffer.

IOWs, we have two options after taking an extra reference:

	1. clear the XBF_ASYNC flag unconditionally and call
	xfs_buf_relse() ourselves to unlock and free the buffer
	2. set the XBF_ASYNC flag unconditionally and let
	xfs_buf_ioend() call xfs_buf_relse() itself.

I tossed a coin, and it came down on #2. Either way will work, but
we need that extra reference on all buffers passing through this
code...
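
For anyone following along, the reference counting can be sketched as
a toy userspace model. This is not the kernel code: every toy_* name
and field below is an assumption invented for the illustration, and
the flow is heavily simplified (one reference owned by the
transaction subsystem, dropped by the completion callback, after
which ioend processing keeps using the buffer).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model of the two options; NOT kernel code. */
enum toy_option { TOY_SYNC_RELEASE = 1, TOY_ASYNC_RELEASE = 2 };

struct toy_buf {
	int  hold;		/* models the buffer reference count */
	bool async;		/* models XBF_ASYNC */
	bool stale;		/* models the stale marking */
	bool locked;		/* models the buffer lock */
	bool freed;		/* set when the last reference goes away */
	bool use_after_free;	/* set if a freed buffer is touched */
};

static struct toy_buf toy_new(void)
{
	/* One reference, owned by the transaction subsystem; locked. */
	return (struct toy_buf){ .hold = 1, .locked = true };
}

static void toy_rele(struct toy_buf *bp)
{
	if (bp->freed) {
		bp->use_after_free = true;
		return;
	}
	if (--bp->hold == 0)
		bp->freed = true;
}

/* Unlock and drop a reference. */
static void toy_relse(struct toy_buf *bp)
{
	if (bp->freed) {
		bp->use_after_free = true;
		return;
	}
	bp->locked = false;
	toy_rele(bp);
}

/* Models ioend processing used purely for error handling. */
static void toy_ioend(struct toy_buf *bp)
{
	toy_rele(bp);		/* callback drops the transaction's reference */
	if (bp->freed) {	/* ioend still touches bp after the callback */
		bp->use_after_free = true;
		return;
	}
	if (bp->async)
		toy_relse(bp);	/* option 2: ioend releases the buffer */
}

static void toy_shutdown_unpin(struct toy_buf *bp, bool extra_ref,
			       enum toy_option opt)
{
	assert(bp->locked);	/* the path is entered with the buffer locked */
	if (extra_ref)
		bp->hold++;	/* the extra reference every buffer needs */
	bp->async = (opt == TOY_ASYNC_RELEASE);
	bp->stale = true;	/* ensure the buffer is never written */
	toy_ioend(bp);
	if (opt == TOY_SYNC_RELEASE)
		toy_relse(bp);	/* option 1: release it ourselves */
}
```

In this model both options end with the buffer unlocked and freed
exactly once, provided the extra reference is taken. Without it, the
callback drops the last reference and ioend processing then touches a
freed buffer, whichever way the flag is set.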

> For 3.8 I'm going to look into simply acquiring an additional reference
> for synchronous writes during I/O submission to kill these special cases
> all over the buffer code.

Sure, but that's a different issue, not directly related to fixing
the bug at hand. It would also mean that if we choose #1, we'll need
to take 2 references for the IO for this error handling path. That
seems like a much stranger special case than just using async buffer
IO completion to do the work we need...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

