From: Kent Overstreet <koverstreet@google.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-kernel@vger.kernel.org, linux-aio@kvack.org,
linux-fsdevel@vger.kernel.org, zab@redhat.com, bcrl@kvack.org,
jmoyer@redhat.com, viro@zeniv.linux.org.uk, tytso@mit.edu
Subject: Re: [PATCH 29/32] block, aio: Batch completion for bios/kiocbs
Date: Mon, 7 Jan 2013 15:34:43 -0800
Message-ID: <20130107233443.GG26407@google.com>
In-Reply-To: <50E69F5B.8060902@kernel.dk>

On Fri, Jan 04, 2013 at 10:22:35AM +0100, Jens Axboe wrote:
> On 2012-12-27 03:00, Kent Overstreet wrote:
> > When completing a kiocb, there's some fixed overhead from touching the
> > kioctx's ring buffer the kiocb belongs to. Some newer high end block
> > devices can complete multiple IOs per interrupt, much like many network
> > interfaces have been for some time.
> >
> > This plumbs through infrastructure so we can take advantage of multiple
> > completions at the interrupt level, and complete multiple kiocbs at the
> > same time.
> >
> > Drivers have to be converted to take advantage of this, but it's a
> > simple change and the next patches will convert a few drivers.
> >
> > To use it, an interrupt handler (or any code that completes bios or
> > requests) declares and initializes a struct batch_complete:
> >
> > struct batch_complete batch;
> > batch_complete_init(&batch);
> >
> > Then, instead of calling bio_endio(), it calls
> > bio_endio_batch(bio, err, &batch). This just adds the bio to a list in
> > the batch_complete.
> >
> > At the end, it calls
> >
> > batch_complete(&batch);
> >
> > This completes all the bios all at once, building up a list of kiocbs;
> > then the list of kiocbs are completed all at once.
> >
> > Also, in order to batch up the kiocbs we have to add a different
> > bio_endio function to struct bio, that takes a pointer to the
> > batch_complete - this patch converts the dio code's bio_endio function.
> > In order to avoid changing every bio_endio function in the kernel (there
> > are many), we currently use a union and a flag to indicate what kind of
> > bio endio function to call. This is admittedly a hack, but should
> > suffice for now.
>
> It is indeed a hack... Famous last words as well, I'm sure that'll stick
> around forever if it goes in! Any ideas on how we can clean this up
> before that?

Well, I wouldn't _really_ mind changing all 200 bi_end_io uses. On the
other hand, the majority of them are either leaf nodes (filesystem code
and whatnot that's not completing anything else that could be batched),
or stuff like the dm and md code where it could be plumbed through (so
we could batch completions through md/dm) but it may take some thought
to do it right.

So I think I'd prefer to do it incrementally, for the moment. I'm always
a bit terrified of doing a cleanup that touches 50+ files, and then
changing my mind about something and going back and redoing it.

That said, I haven't forgotten about all the other block layer patches
I've got for you, as soon as I'm less swamped I'm going to finish off
that stuff so I should be around to revisit it...

> Apart from that, I think the batching makes functional sense. For the
> devices where we do get batches of completions (most of them), it's the
> right thing to do. Would be nice if it were better integrated though, not a
> side hack.
>
> Is the rbtree really faster than a basic (l)list and a sort before
> completing them? Would be simpler.

Well, depends. With one or two kioctxs? The list would definitely be
faster, but I'm loath to use an O(n^2) algorithm anywhere where the
input size isn't strictly controlled, and I know of applications out
there that use tons of kioctxs.

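To put rough numbers on that trade-off, here is a minimal userspace sketch, for illustration only. It is not code from the patch: struct fake_kiocb, ctx_id and both helpers are made-up stand-ins. It models the sort-then-complete variant (one ring "touch" per run of equal contexts) next to a naive per-completion linear search, whose comparison count grows with the number of distinct contexts.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a kiocb; not the kernel structure. */
struct fake_kiocb {
	int ctx_id;	/* which context (ring buffer) this completion targets */
	long res;	/* completion result */
};

static int cmp_ctx(const void *a, const void *b)
{
	const struct fake_kiocb *x = a, *y = b;

	return (x->ctx_id > y->ctx_id) - (x->ctx_id < y->ctx_id);
}

/*
 * Sort the batch by context, then walk it once. Each run of equal
 * ctx_id stands for a single touch of that context's ring buffer.
 * Returns the number of ring touches.
 */
static size_t complete_sorted(struct fake_kiocb *reqs, size_t nr)
{
	size_t i, touches = 0;

	qsort(reqs, nr, sizeof(*reqs), cmp_ctx);

	for (i = 0; i < nr; i++)
		if (i == 0 || reqs[i].ctx_id != reqs[i - 1].ctx_id)
			touches++;

	return touches;
}

/*
 * Naive grouping: for every completion, linearly search the contexts
 * seen so far. Returns the number of comparisons made, which is
 * quadratic in nr when every completion has its own context.
 */
static size_t naive_compares(const struct fake_kiocb *reqs, size_t nr)
{
	int *seen = malloc(nr * sizeof(*seen));
	size_t i, j, nr_seen = 0, compares = 0;

	if (!seen)
		return 0;

	for (i = 0; i < nr; i++) {
		for (j = 0; j < nr_seen; j++) {
			compares++;
			if (seen[j] == reqs[i].ctx_id)
				break;
		}
		if (j == nr_seen)
			seen[nr_seen++] = reqs[i].ctx_id;
	}

	free(seen);
	return compares;
}
```

With only a handful of contexts the naive search stays cheap. The comparison count only becomes a problem once the number of distinct contexts approaches the batch size, which is the uncontrolled-input case described above.
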
> A few small comments below.
>
> > +void bio_endio_batch(struct bio *bio, int error, struct batch_complete *batch)
> > +{
> > +	if (error)
> > +		bio->bi_error = error;
> > +
> > +	if (batch)
> > +		bio_list_add(&batch->bio, bio);
> > +	else
> > +		__bio_endio(bio, batch);
> > +
> > +}
>
> Ugh, get rid of this 'batch' checking.

The reason I did it that way is - well, look at the dio code's bi_end_io
function. It's got to be passed a struct batch_complete pointer to batch
kiocbs, but the driver that calls it may or may not have batch
completions plumbed through.

So unless every single driver gets converted (and I think that'd be
silly for all the ones that can't do any actual batching) something's
going to have to have that check, and better for it to be in generic
code than in every mid layer we plumb it through.

>
> > +static inline void bio_endio(struct bio *bio, int error)
> > +{
> > +	bio_endio_batch(bio, error, NULL);
> > +}
> > +
>
> Just make that __bio_endio().

That one could be changed... I dislike having the if (error)
bio->bi_error = error check duplicated...

Actually, it'd probably make more sense to inline bio_endio_batch(),
because often the compiler is going to either know whether batch is null
or not, or be able to lift the check out of a loop.
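
A rough userspace sketch of that shape, for illustration only. All of the fake_* names are made up, and this is a simplified model rather than the code from the patch. The error assignment lives in one inline function, the unbatched wrapper passes a constant NULL so the compiler can drop the batch branch, and batched callers queue bios and finish them in one pass.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's struct bio. */
struct fake_bio {
	int error;
	int done;
	struct fake_bio *next;
};

struct fake_batch {
	struct fake_bio *head, *tail;
};

static inline void fake_batch_init(struct fake_batch *batch)
{
	batch->head = batch->tail = NULL;
}

/* Immediate completion path (the role __bio_endio() plays above). */
static inline void fake_complete_one(struct fake_bio *bio)
{
	bio->done = 1;
}

/*
 * Inline, so a caller passing a constant NULL lets the compiler
 * remove the batch branch entirely.
 */
static inline void fake_endio_batch(struct fake_bio *bio, int error,
				    struct fake_batch *batch)
{
	if (error)
		bio->error = error;

	if (!batch) {
		fake_complete_one(bio);
		return;
	}

	/* Queue at the tail; completion is deferred to the batch. */
	bio->next = NULL;
	if (batch->tail)
		batch->tail->next = bio;
	else
		batch->head = bio;
	batch->tail = bio;
}

/* Unbatched wrapper: the error handling is not duplicated here. */
static inline void fake_endio(struct fake_bio *bio, int error)
{
	fake_endio_batch(bio, error, NULL);
}

/* Complete everything queued; returns how many bios were finished. */
static inline int fake_batch_complete(struct fake_batch *batch)
{
	struct fake_bio *bio = batch->head, *next;
	int n = 0;

	for (; bio; bio = next) {
		next = bio->next;
		fake_complete_one(bio);
		n++;
	}

	fake_batch_init(batch);
	return n;
}
```

A caller without batching plumbed through just calls fake_endio(); one that has it calls fake_endio_batch() in its loop and fake_batch_complete() once at the end.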