From: Kent Overstreet <koverstreet@google.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-kernel@vger.kernel.org, linux-aio@kvack.org,
linux-fsdevel@vger.kernel.org, zab@redhat.com, bcrl@kvack.org,
jmoyer@redhat.com, axboe@kernel.dk, viro@zeniv.linux.org.uk,
tytso@mit.edu
Subject: Re: [PATCH 25/32] aio: use xchg() instead of completion_lock
Date: Mon, 7 Jan 2013 15:21:15 -0800 [thread overview]
Message-ID: <20130107232115.GF26407@google.com> (raw)
In-Reply-To: <20130103153414.23b0b913.akpm@linux-foundation.org>
On Thu, Jan 03, 2013 at 03:34:14PM -0800, Andrew Morton wrote:
> On Wed, 26 Dec 2012 18:00:04 -0800
> Kent Overstreet <koverstreet@google.com> wrote:
>
> > So, for sticking kiocb completions on the kioctx ringbuffer, we need a
> > lock - it unfortunately can't be lockless.
> >
> > When the kioctx is shared between threads on different cpus and the rate
> > of completions is high, this lock sees quite a bit of contention - in
> > terms of cacheline contention it's the hottest thing in the aio
> > subsystem.
> >
> > That means, with a regular spinlock, we're going to take a cache miss
> > to grab the lock, then another cache miss when we touch the data the
> > lock protects - if it's on the same cacheline as the lock, other cpus
> > spinning on the lock are going to be pulling it out from under us as
> > we're using it.
> >
> > So, we use an old trick to get rid of this second forced cache miss -
> > make the data the lock protects be the lock itself, so we grab them both
> > at once.
>
> Boy I hope you got that right.
>
> Did you consider using bit_spin_lock() on the upper bit of `tail'?
> We've done that in other places and we at least know that it works.
> And it has the optimisations for CONFIG_SMP=n, understands
> CONFIG_DEBUG_SPINLOCK, has arch-specific optimisations, etc.
I hadn't thought of that - I think it'd suffer from the same problem as
a regular spinlock, where you grab the lock, then go to grab your data
but a different CPU grabbed the cacheline you need...
But the lock debugging would be nice. It'd probably work to make
something generic like bit_spinlock() that also returns some value - or,
the recent patches for making spinlocks back off will also help with
this problem. So maybe between that and batch completion this patch
could be dropped at some point.
So, yeah. The code's plenty tested and I went over the barriers, it
already had all the needed barriers due to the ringbuffer... and I've
done this sort of thing elsewhere too. But it certaintly is a hack and I
wouldn't be sad to see it go.
--
To unsubscribe, send a message with 'unsubscribe linux-aio' in
the body to majordomo@kvack.org. For more info on Linux AIO,
see: http://www.kvack.org/aio/
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
next prev parent reply other threads:[~2013-01-07 23:21 UTC|newest]
Thread overview: 77+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-12-27 1:59 [PATCH 00/32] AIO performance improvements/cleanups, v3 Kent Overstreet
2012-12-27 1:59 ` [PATCH 01/32] mm: remove old aio use_mm() comment Kent Overstreet
2012-12-27 1:59 ` [PATCH 02/32] aio: remove dead code from aio.h Kent Overstreet
2012-12-27 1:59 ` [PATCH 03/32] gadget: remove only user of aio retry Kent Overstreet
2012-12-27 1:59 ` [PATCH 04/32] aio: remove retry-based AIO Kent Overstreet
2012-12-29 7:36 ` Hillf Danton
2013-01-07 22:12 ` Kent Overstreet
2012-12-29 7:47 ` Hillf Danton
2013-01-07 22:15 ` Kent Overstreet
2012-12-27 1:59 ` [PATCH 05/32] char: add aio_{read,write} to /dev/{null,zero} Kent Overstreet
2012-12-27 1:59 ` [PATCH 06/32] aio: Kill return value of aio_complete() Kent Overstreet
2012-12-27 1:59 ` [PATCH 07/32] aio: kiocb_cancel() Kent Overstreet
2012-12-27 1:59 ` [PATCH 08/32] aio: Move private stuff out of aio.h Kent Overstreet
2012-12-27 1:59 ` [PATCH 09/32] aio: dprintk() -> pr_debug() Kent Overstreet
2012-12-27 1:59 ` [PATCH 10/32] aio: do fget() after aio_get_req() Kent Overstreet
2012-12-27 1:59 ` [PATCH 11/32] aio: Make aio_put_req() lockless Kent Overstreet
2012-12-27 1:59 ` [PATCH 12/32] aio: Refcounting cleanup Kent Overstreet
2012-12-27 1:59 ` [PATCH 13/32] wait: Add wait_event_hrtimeout() Kent Overstreet
2012-12-27 10:37 ` Fubo Chen
2013-01-03 23:08 ` Andrew Morton
2013-01-08 0:09 ` Kent Overstreet
2012-12-27 1:59 ` [PATCH 14/32] aio: Make aio_read_evt() more efficient, convert to hrtimers Kent Overstreet
2013-01-03 23:19 ` Andrew Morton
2013-01-08 0:28 ` Kent Overstreet
2013-01-08 1:00 ` Andrew Morton
2013-01-08 1:28 ` Kent Overstreet
2012-12-27 1:59 ` [PATCH 15/32] aio: Use flush_dcache_page() Kent Overstreet
2012-12-27 1:59 ` [PATCH 16/32] aio: Use cancellation list lazily Kent Overstreet
2012-12-27 1:59 ` [PATCH 17/32] aio: Change reqs_active to include unreaped completions Kent Overstreet
2012-12-27 1:59 ` [PATCH 18/32] aio: Kill batch allocation Kent Overstreet
2012-12-27 1:59 ` [PATCH 19/32] aio: Kill struct aio_ring_info Kent Overstreet
2012-12-27 1:59 ` [PATCH 20/32] aio: Give shared kioctx fields their own cachelines Kent Overstreet
2013-01-03 23:25 ` Andrew Morton
2013-01-07 23:48 ` Kent Overstreet
2012-12-27 2:00 ` [PATCH 21/32] aio: reqs_active -> reqs_available Kent Overstreet
2012-12-27 2:00 ` [PATCH 22/32] aio: percpu reqs_available Kent Overstreet
2012-12-27 2:00 ` [PATCH 23/32] Generic dynamic per cpu refcounting Kent Overstreet
2013-01-03 22:48 ` Andrew Morton
2013-01-07 23:47 ` Kent Overstreet
2013-01-08 1:03 ` [PATCH] percpu-refcount: Sparse fixes Kent Overstreet
2013-01-25 0:51 ` [PATCH 23/32] Generic dynamic per cpu refcounting Tejun Heo
2013-01-25 1:13 ` Kent Overstreet
2013-01-25 2:03 ` Tejun Heo
2013-01-25 2:09 ` Tejun Heo
2013-01-28 17:48 ` Kent Overstreet
2013-01-28 18:18 ` Tejun Heo
2013-01-25 6:15 ` Rusty Russell
2013-01-28 17:53 ` Kent Overstreet
2013-01-28 17:59 ` Tejun Heo
2013-01-28 18:32 ` Kent Overstreet
2013-01-28 18:57 ` Christoph Lameter
2013-02-08 14:44 ` Tejun Heo
2013-02-08 14:49 ` Jens Axboe
2013-02-08 17:50 ` Andrew Morton
2013-02-08 21:27 ` Kent Overstreet
2013-02-11 14:21 ` Jeff Moyer
2013-02-08 21:17 ` Kent Overstreet
2012-12-27 2:00 ` [PATCH 24/32] aio: Percpu ioctx refcount Kent Overstreet
2012-12-27 2:00 ` [PATCH 25/32] aio: use xchg() instead of completion_lock Kent Overstreet
2013-01-03 23:34 ` Andrew Morton
2013-01-07 23:21 ` Kent Overstreet [this message]
2013-01-07 23:35 ` Andrew Morton
2013-01-08 0:01 ` Kent Overstreet
2012-12-27 2:00 ` [PATCH 26/32] aio: Don't include aio.h in sched.h Kent Overstreet
2012-12-27 2:00 ` [PATCH 27/32] aio: Kill ki_key Kent Overstreet
2012-12-27 2:00 ` [PATCH 28/32] aio: Kill ki_retry Kent Overstreet
2012-12-27 2:00 ` [PATCH 29/32] block, aio: Batch completion for bios/kiocbs Kent Overstreet
2013-01-04 9:22 ` Jens Axboe
2013-01-07 23:34 ` Kent Overstreet
2013-01-08 15:33 ` Jeff Moyer
2013-01-08 16:06 ` Kent Overstreet
2013-01-08 16:15 ` Jeff Moyer
2013-01-08 16:48 ` Kent Overstreet
2012-12-27 2:00 ` [PATCH 30/32] virtio-blk: Convert to batch completion Kent Overstreet
2012-12-27 2:00 ` [PATCH 31/32] mtip32xx: " Kent Overstreet
2012-12-27 2:00 ` [PATCH 32/32] aio: Smoosh struct kiocb Kent Overstreet
2013-01-04 9:22 ` [PATCH 00/32] AIO performance improvements/cleanups, v3 Jens Axboe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130107232115.GF26407@google.com \
--to=koverstreet@google.com \
--cc=akpm@linux-foundation.org \
--cc=axboe@kernel.dk \
--cc=bcrl@kvack.org \
--cc=jmoyer@redhat.com \
--cc=linux-aio@kvack.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=tytso@mit.edu \
--cc=viro@zeniv.linux.org.uk \
--cc=zab@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).