From: Chris Mason <chris.mason@oracle.com>
To: Manfred Spraul <manfred@colorfullife.com>
Cc: Nick Piggin <npiggin@suse.de>,
zach.brown@oracle.com, jens.axboe@oracle.com,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] ipc semaphores: reduce ipc_lock contention in semtimedop
Date: Wed, 14 Apr 2010 13:33:19 -0400
Message-ID: <20100414173319.GA3228@think>
In-Reply-To: <4BC5EA75.9090803@colorfullife.com>
On Wed, Apr 14, 2010 at 06:16:53PM +0200, Manfred Spraul wrote:
> On 04/13/2010 08:19 PM, Chris Mason wrote:
> >On Wed, Apr 14, 2010 at 04:09:45AM +1000, Nick Piggin wrote:
> >>On Tue, Apr 13, 2010 at 01:39:41PM -0400, Chris Mason wrote:
> >>The other thing I don't know if your patch gets right is requeueing one
> >>of the operations. When you requeue from one list to another, then you
> >>seem to lose ordering with other pending operations, so that would
> >>seem to break the API as well (can't remember if the API strictly
> >>mandates FIFO, but anyway it can open up starvation cases).
> >I don't see anything in the docs about the FIFO order. I could add an
> >extra sort on sequence number pretty easily, but is the starvation case
> >really that bad?
> >
> How do you want to determine the sequence number?
> Is atomic_inc_return() on a per-semaphore array counter sufficiently fast?
I haven't tried yet, but hopefully it won't be a problem. A later patch
does atomics on the reference count and it doesn't show up in the
profiles.
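To make the sequence-number idea concrete, here is a minimal userspace sketch using C11 atomics. It is not code from the patch: struct sem_array_sketch, struct pending_op, sketch_next_seq() and pending_op_cmp() are hypothetical names, and atomic_fetch_add() plus one stands in for the kernel's atomic_inc_return().

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-array structure; not the kernel's struct sem_array. */
struct sem_array_sketch {
    atomic_uint_fast64_t next_seq;  /* one counter shared by every semaphore in the array */
};

/* Hypothetical pending operation, tagged with its arrival order. */
struct pending_op {
    uint64_t seq;
    int sem_num;
};

/* Userspace analogue of atomic_inc_return(): returns the value after the increment. */
static uint64_t sketch_next_seq(struct sem_array_sketch *sma)
{
    return atomic_fetch_add(&sma->next_seq, 1) + 1;
}

/* qsort()-style comparator: restores FIFO order after ops from several lists are merged. */
static int pending_op_cmp(const void *a, const void *b)
{
    const struct pending_op *x = a;
    const struct pending_op *y = b;

    return (x->seq > y->seq) - (x->seq < y->seq);
}
```

Each operation would take a number from sketch_next_seq() when it is queued, and a merged batch could be passed to qsort() with pending_op_cmp() before it is processed. Whether the shared counter is cheap enough under contention is exactly the open question above.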
>
> >>I was looking at doing a sequence number to be able to sort these, but
> >>it ended up getting over complex (and SAP was only using simple ops so
> >>it didn't seem to need much better).
> >>
> >>We want to be careful not to change semantics at all. And it gets
> >>tricky quickly :( What about Zach's simpler wakeup API?
> >Yeah, that's why my patches include code to handle userland sending
> >duplicate semids. Zach's simpler API is cooking too, but if I can get
> >this done without insane complexity it helps with more than just the
> >post/wait oracle workload.
> >
> What is the oracle workload, which multi-sembuf operations does it use?
> How many semaphores are in one array?
>
> When the last optimizations were written, I searched a bit:
> - postgres uses per-process semaphores, with small semaphore arrays.
> [process sleeps on its own semaphore and is woken up by someone
> else when it can make progress]
This is similar to Oracle (and the sembench program). Each process has
a semaphore, and when it is waiting for a commit it goes to sleep on it.
The sleepers are woken up in bulk with semtimedop calls from a single process.
But Oracle also uses semaphores for locking in the traditional sense.
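The post/wait pattern described here can be sketched with the standard SysV calls. This is an illustration only, not Oracle's or sembench's code: wake_range(), wait_on() and MAX_BATCH are hypothetical names, and plain semop() is used in place of semtimedop(), which takes the same arguments plus a timeout.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

#define MAX_BATCH 32  /* assumption for this sketch; the kernel's own limit is SEMOPM */

/* Waker side: post semaphores first..first+count-1 with a single semop() call. */
static int wake_range(int semid, unsigned short first, unsigned short count)
{
    struct sembuf ops[MAX_BATCH];

    if (count == 0 || count > MAX_BATCH) {
        errno = EINVAL;
        return -1;
    }
    for (unsigned short i = 0; i < count; i++) {
        ops[i].sem_num = (unsigned short)(first + i);
        ops[i].sem_op = 1;   /* increment: releases one sleeper on that semaphore */
        ops[i].sem_flg = 0;
    }
    return semop(semid, ops, count);
}

/* Sleeper side: block until our own semaphore has been posted. */
static int wait_on(int semid, unsigned short my_sem)
{
    struct sembuf op = { .sem_num = my_sem, .sem_op = -1, .sem_flg = 0 };

    return semop(semid, &op, 1);
}
```

Every waiter touches only its own semaphore with a one-element operation, while the waker is the one caller issuing multi-sembuf operations against the array.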
Putting the waiters into a per-semaphore list is really only part of the
speedup. The real boost comes from the patch to break up the locks into
per-semaphore locks.
We gain another 10-15% from a later patch that uses atomics on the
refcount, which lets us do sem_putref without a lock (meaning we're
lockless once we get woken up).
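As a rough picture of what an atomic refcount buys, here is a userspace sketch with C11 atomics. It is not the later patch itself: struct sem_obj_sketch, sketch_getref() and sketch_putref() are hypothetical names, and the memory orderings are this sketch's assumptions rather than the kernel's.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object; stands in for the semaphore array. */
struct sem_obj_sketch {
    atomic_int refcount;
};

/* Take a reference. The caller must already hold one, so relaxed ordering suffices. */
static void sketch_getref(struct sem_obj_sketch *obj)
{
    atomic_fetch_add_explicit(&obj->refcount, 1, memory_order_relaxed);
}

/*
 * Drop a reference without taking any lock.
 * Returns true when this call released the last reference,
 * in which case the caller is responsible for freeing the object.
 */
static bool sketch_putref(struct sem_obj_sketch *obj)
{
    return atomic_fetch_sub_explicit(&obj->refcount, 1,
                                     memory_order_acq_rel) == 1;
}
```

A woken task that only needs to drop its reference calls sketch_putref() and never touches the array lock, which is the lockless wakeup path referred to above.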
I'm cleaning up fixes based on suggestions here and will repost.
> - with google, I couldn't find anything relevant that uses
> multi-sembuf semop() calls.
>
I think this should help any workload that has more than one semaphore
per array, even if they only do one sem per call.
> And I agree with Nick: We should be careful about changing the API.
Definitely, thanks for reading through it.
-chris