public inbox for linux-xfs@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH 1/9] xfs: synchronous buffer IO needs a reference
Date: Sat, 16 Aug 2014 09:17:36 +1000	[thread overview]
Message-ID: <20140815231736.GT26465@dastard> (raw)
In-Reply-To: <20140815131804.GA4096@laptop.bfoster>

On Fri, Aug 15, 2014 at 09:18:04AM -0400, Brian Foster wrote:
> On Fri, Aug 15, 2014 at 04:38:59PM +1000, Dave Chinner wrote:
....
> >  	if (bp->b_flags & XBF_WRITE)
> >  		xfs_buf_wait_unpin(bp);
> > +
> > +	/*
> > +	 * Take references to the buffer. For XBF_ASYNC buffers, holding a
> > +	 * reference for as long as submission takes is all that is necessary
> > +	 * here. The IO inherits the lock and hold count from the submitter,
> > +	 * and these are released during IO completion processing. Taking a hold
> > +	 * over submission ensures that the buffer is not freed until we have
> > +	 * completed all processing, regardless of when IO errors occur or are
> > +	 * reported.
> > +	 *
> > +	 * However, for synchronous IO, the IO does not inherit the submitter's
> > +	 * reference count, nor the buffer lock. Hence we need to take an extra
> > +	 * reference to the buffer for the IO context so that we can
> > +	 * guarantee the buffer is not freed until all IO completion processing
> > +	 * is done. Otherwise the caller can drop their reference while the IO
> > +	 * is still in progress and hence trigger a use-after-free situation.
> > +	 */
> >  	xfs_buf_hold(bp);
> > +	if (!(bp->b_flags & XBF_ASYNC))
> > +		xfs_buf_hold(bp);
> > +
> >  
> >  	/*
> > -	 * Set the count to 1 initially, this will stop an I/O
> > -	 * completion callout which happens before we have started
> > -	 * all the I/O from calling xfs_buf_ioend too early.
> > +	 * Set the count to 1 initially, this will stop an I/O completion
> > +	 * callout which happens before we have started all the I/O from calling
> > +	 * xfs_buf_ioend too early.
> >  	 */
> >  	atomic_set(&bp->b_io_remaining, 1);
> >  	_xfs_buf_ioapply(bp);
> > +
> >  	/*
> > -	 * If _xfs_buf_ioapply failed, we'll get back here with
> > -	 * only the reference we took above.  _xfs_buf_ioend will
> > -	 * drop it to zero, so we'd better not queue it for later,
> > -	 * or we'll free it before it's done.
> > +	 * If _xfs_buf_ioapply failed or we are doing synchronous IO that
> > +	 * completes extremely quickly, we can get back here with only the IO
> > +	 * reference we took above.  _xfs_buf_ioend will drop it to zero, so
> > +	 * we'd better run completion processing synchronously so that we
> > +	 * don't return to the caller with completion still pending. In the
> > +	 * error case, this allows the caller to check b_error safely without
> > +	 * waiting, and in the synchronous IO case it avoids unnecessary context
> > +	 * switches and latency for high-performance devices.
> >  	 */
> 
> AFAICT there is no real wait if the buf has completed at this point. The
> wait just decrements the completion counter.

If the IO has completed, then we run the completion code.

> So what's the benefit of
> "not waiting?" Where is the potential context switch?

Deferring completion processing to async work for synchronous IO
means we queue the work, then sleep in xfs_buf_iowait(). That's two
context switches, plus a workqueue execution.

> Are you referring
> to the case where error is set but I/O is not complete? Are you saying
> the advantage to the caller is it doesn't have to care about the state
> of further I/O once it has been determined at least one error has
> occurred? (If so, who cares about latency given that some operation that
> depends on this I/O is already doomed to fail?).

No, you're reading *way* too much into this. For sync IO, it's
always best to process completion inline. For async, it doesn't
matter, but if there's a submission error it is *more efficient* to
process it in the current context.

> The code looks fine, but I'm trying to understand the reasoning better
> (and I suspect we can clarify the comment).
> 
> > -	_xfs_buf_ioend(bp, bp->b_error ? 0 : 1);
> > +	if (bp->b_error || !(bp->b_flags & XBF_ASYNC))
> > +		_xfs_buf_ioend(bp, 0);
> > +	else
> > +		_xfs_buf_ioend(bp, 1);
> 
> Not related to this patch, but it seems like the problem this code tries
> to address is still possible.

The race condition is still possible - it just won't result in a
use-after-free. The race condition is not fixed until patch 8,
but as a backportable fix, this patch is much, much simpler.

> Perhaps this papers over a particular
> instance. Consider the case where an I/O fails immediately after this
> call completes, but not before. We have an extra reference now for
> completion, but we can still return to the caller with completion
> pending. I suppose its fine if we consider the "problem" to be that the
> reference goes away underneath the completion, as opposed to the caller
> caring about the status of completion.

Precisely.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com



Thread overview: 43+ messages
2014-08-15  6:38 [RFC PATCH 0/9] xfs: clean up xfs_buf io interfaces Dave Chinner
2014-08-15  6:38 ` [PATCH 1/9] xfs: synchronous buffer IO needs a reference Dave Chinner
2014-08-15 13:18   ` Brian Foster
2014-08-15 23:17     ` Dave Chinner [this message]
2014-08-18 14:15       ` Brian Foster
2014-08-29  0:18   ` Christoph Hellwig
2014-08-15  6:39 ` [PATCH 2/9] xfs: xfs_buf_ioend and xfs_buf_iodone_work duplicate functionality Dave Chinner
2014-08-15 13:18   ` Brian Foster
2014-08-15 23:21     ` Dave Chinner
2014-08-18 14:15       ` Brian Foster
2014-08-29  0:22   ` Christoph Hellwig
2014-08-29  0:55     ` Dave Chinner
2014-08-15  6:39 ` [PATCH 3/9] xfs: rework xfs_buf_bio_endio error handling Dave Chinner
2014-08-15 13:18   ` Brian Foster
2014-08-15 23:25     ` Dave Chinner
2014-08-29  0:23   ` Christoph Hellwig
2014-08-15  6:39 ` [PATCH 4/9] xfs: kill xfs_bdstrat_cb Dave Chinner
2014-08-29  0:24   ` Christoph Hellwig
2014-08-15  6:39 ` [PATCH 5/9] xfs: xfs_bioerror can die Dave Chinner
2014-08-15 14:35   ` Brian Foster
2014-08-15 23:27     ` Dave Chinner
2014-08-29  0:28   ` Christoph Hellwig
2014-08-29  1:05     ` Dave Chinner
2014-08-15  6:39 ` [PATCH 6/9] xfs: kill xfs_bioerror_relse Dave Chinner
2014-08-29  0:32   ` Christoph Hellwig
2014-08-29  1:12     ` Dave Chinner
2014-08-29 18:26       ` Christoph Hellwig
2014-08-30  0:05         ` Dave Chinner
2014-08-15  6:39 ` [PATCH 7/9] xfs: clean up xfs_trans_buf_read_map Dave Chinner
2014-08-15  6:39 ` [PATCH 8/9] xfs: introduce xfs_buf_submit[_wait] Dave Chinner
2014-08-15 13:10   ` Christoph Hellwig
2014-08-15 23:37     ` Dave Chinner
2014-08-16  4:55       ` Christoph Hellwig
2014-08-15 14:35   ` Brian Foster
2014-08-15 23:39     ` Dave Chinner
2014-08-18 14:16       ` Brian Foster
2014-08-15 16:13   ` Brian Foster
2014-08-15 23:58     ` Dave Chinner
2014-08-18 14:26       ` Brian Foster
2014-08-15  6:39 ` [PATCH 9/9] xfs: check xfs_buf_read_uncached returns correctly Dave Chinner
2014-08-15 12:56   ` Christoph Hellwig
2014-08-15 23:58     ` Dave Chinner
2014-08-29  0:37       ` Christoph Hellwig
