All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 4/4] mkfs: Use AIO for batched writeback
Date: Fri, 7 Sep 2018 10:30:16 +1000	[thread overview]
Message-ID: <20180907003016.GK27618@dastard> (raw)
In-Reply-To: <20180906133224.GD3311@bfoster>

On Thu, Sep 06, 2018 at 09:32:24AM -0400, Brian Foster wrote:
> On Wed, Sep 05, 2018 at 06:19:32PM +1000, Dave Chinner wrote:
> > +int
> > +get_write_completions(
> > +	int	threshold)
> > +{
> > +	int	error = 0;
> > +	int	i, r;
> > +
> > +	while (aio_flight > threshold) {
> > +		/* gather up some completions */
> > +		r = io_getevents(ctxp, 1, MAX_AIO_EVENTS, ioevents, NULL);
> > +		if (r < 0)  {
> > +			printf("FAIL! io_getevents returned %d\n", r);
> > +			exit(1);
> > +		}
> > +
> > +		aio_flight -= r;
> > +		for (i = 0; i < r; ++i) {
> > +			struct xfs_buf *bp = ioevents[i].data;
> > +			if (ioevents[i].res < 0)
> > +				bp->b_error = ioevents[i].res;
> > +			else if (ioevents[i].res != bp->b_bcount)
> > +				bp->b_error = -EIO;
> > +
> > +			if (--bp->b_io_count == 0 && !bp->b_error) {
> > +				bp->b_flags |= LIBXFS_B_UPTODATE;
> > +				bp->b_flags &= ~(LIBXFS_B_DIRTY | LIBXFS_B_EXIT |
> > +						 LIBXFS_B_UNCHECKED);
> 
> I'd think we need these flags updates on the submission side
> ->b_io_count decrement as well, but I guess the single-threadedness of
> the whole things dictates that I/O will only ever complete here.

Yup - I made lots of shortcuts and hacks because it is a tight
single threaded AIO dispatch/complete loop.

> 
> Which begs the question.. why elevate ->b_io_count in the delwri submit
> path at all?

So we don't drop the reference to the buffer before we've completed
all the individual IOs on a discontiguous buffer.

> FWIW, I can see writing this consistently either way: take
> the I/O ref and handle completion in both places, or rely on the fact
> that completion occurs in one place and the flush only needs to know if
> the buffer count is still elevated immediately after submission.

Sure, but for a quick and dirty AIO ring implementation like this
that's not needed. Of course, none of this code would make it into a
proper AIO engine implementation, so don't get too hung up on all
the shortcuts and simplifications I've taken in this RFCRAP code
to prove the concept is worth persuing....

> > +				libxfs_putbuf(bp);
> > +			} else if (!error) {
> > +				error = bp->b_error;
> > +			}
> > +		}
> > +		/* wait for a short while */
> > +		usleep(1000);

Like this :P

> > +	}
> > +	return error;
> > +}
> > +
> > +static int
> > +__aio_write(
> > +	struct xfs_buf  *bp,
> > +	void *buf,
> > +	int len,
> > +	off64_t offset)
> > +{
> > +	int	r, i;
> > +	int	fd = libxfs_device_to_fd(bp->b_target->dev);
> > +
> > +	i = aio_next++ % MAX_AIO_EVENTS;
> > +	aio_flight++;
> > +	bp->b_io_count++;
> > +
> > +	io_prep_pwrite(iocbs[i], fd, buf, len, offset);
> > +	iocbs[i]->data = bp;
> > +	r = io_submit(ctxp, 1, &iocbs[i]);
> > +	if (r != 1) {
> > +		aio_flight--;
> 
> Probably need to decrement ->b_io_count here as well..?

Yup, that'd be a bug.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

      reply	other threads:[~2018-09-07  5:08 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-05  8:19 [RFCRAP PATCH 0/4 v2] mkfs.xfs IO scalability Dave Chinner
2018-09-05  8:19 ` [PATCH 1/4] mkfs: stop zeroing old superblocks excessively Dave Chinner
2018-09-06 13:31   ` Brian Foster
2018-09-07  0:04     ` Dave Chinner
2018-09-07 11:05       ` Brian Foster
2018-09-05  8:19 ` [PATCH 2/4] mkfs: rework AG header initialisation ordering Dave Chinner
2018-09-06 13:31   ` Brian Foster
2018-09-07  0:08     ` Dave Chinner
2018-09-05  8:19 ` [PATCH 3/4] mkfs: introduce new delayed write buffer list Dave Chinner
2018-09-06 13:32   ` Brian Foster
2018-09-07  0:21     ` Dave Chinner
2018-09-05  8:19 ` [PATCH 4/4] mkfs: Use AIO for batched writeback Dave Chinner
2018-09-06 13:32   ` Brian Foster
2018-09-07  0:30     ` Dave Chinner [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180907003016.GK27618@dastard \
    --to=david@fromorbit.com \
    --cc=bfoster@redhat.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.