From: Chris Mason <chris.mason@oracle.com>
To: Manfred Spraul <manfred@colorfullife.com>
Cc: zach.brown@oracle.com, jens.axboe@oracle.com,
linux-kernel@vger.kernel.org, Nick Piggin <npiggin@suse.de>
Subject: Re: [PATCH 1/2] ipc semaphores: reduce ipc_lock contention in semtimedop
Date: Fri, 16 Apr 2010 07:45:02 -0400 [thread overview]
Message-ID: <20100416114502.GH10828@think> (raw)
In-Reply-To: <4BC84957.9080608@colorfullife.com>
On Fri, Apr 16, 2010 at 01:26:15PM +0200, Manfred Spraul wrote:
> On 04/12/2010 08:49 PM, Chris Mason wrote:
> >I have a microbenchmark to test how quickly we can post and wait in
> >bulk. With this change, semtimedop is able do to more than twice
> >as much work in the same run. On a large numa machine, it brings
> >the IPC lock system time (reported by perf) down from 85% to 15%.
> >
> Looking at the current code:
> - update_queue() can be O(N^2) if only some of the waiting tasks are
> woken up.
> Actually: all non-woken up tasks are rescanned after a task that can
> be woken up is found.
>
> - Your test app tests the best case for the current code:
> You wake up the tasks in the same order as the called semop().
> If you invert the order (i.e.: worklist_add() adds to head instead
> of tail), I would expect an even worse performance of the current
> code.
>
> The O(N^2) is simple to fix, I've attached a patch.
Good point.
> For your micro-benchmark, the patch does not change much: you
> wake-up in-order, thus the current code does not misbehave.
>
> Do you know how Oracle wakes up the tasks?
> FIFO, LIFO, un-ordered?
Ordering in terms of the sem array? I had them try many variations ;) I
don't think it will be ordered as well as sembench most of the time.
>
> > while(unlikely(error == IN_WAKEUP)) {
> > cpu_relax();
> > error = queue.status;
> > }
> >
> >- if (error != -EINTR) {
> >+ /*
> >+ * we are lock free right here, and we could have timed out or
> >+ * gotten a signal, so we need to be really careful with how we
> >+ * play with queue.status. It has three possible states:
> >+ *
> >+ * -EINTR, which means nobody has changed it since we slept. This
> >+ * means we woke up on our own.
> >+ *
> >+ * IN_WAKEUP, someone is currently waking us up. We need to loop
> >+ * here until they change it to the operation error value. If
> >+ * we don't loop, our process could exit before they are done waking us
> >+ *
> >+ * operation error value: we've been properly woken up and can exit
> >+ * at any time.
> >+ *
> >+ * If queue.status is currently -EINTR, we are still being processed
> >+ * by the semtimedop core. Someone either has us on a list head
> >+ * or is currently poking our queue struct. We need to find that
> >+ * reference and remove it, which is what remove_queue_from_lists
> >+ * does.
> >+ *
> >+ * We always check for both -EINTR and IN_WAKEUP because we have no
> >+ * locks held. Someone could change us from -EINTR to IN_WAKEUP at
> >+ * any time.
> >+ */
> >+ if (error != -EINTR&& error != IN_WAKEUP) {
> > /* fast path: update_queue already obtained all requested
> > * resources */
> No: The code accesses a local variable. The loop above the comment
> guarantees that the error can't be IN_WAKEUP.
Whoops, thanks.
>
> >+
> >+out_putref:
> >+ sem_putref(sma);
> >+ goto out_free;
> Is it possible to move the sem_putref into wakeup_sem_queue()?
> Right now, the exit path of semtimedop doesn't touch the spinlock.
> You remove that optimization.
I'll look at this, we need to be able to go through the sma to remove
the process from the lists if it woke up on its own, but I don't see why
we can't putref in wakeup.
My current revision of the patch uses an atomic instead of the lock, so it
restores the lockless wakeup either way. Still it is better to putref in
wakeup.
-chris
next prev parent reply other threads:[~2010-04-16 11:45 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-04-12 18:49 [PATCH RFC] Optimize semtimedop Chris Mason
2010-04-12 18:49 ` [PATCH 1/2] ipc semaphores: reduce ipc_lock contention in semtimedop Chris Mason
2010-04-13 17:15 ` Manfred Spraul
2010-04-13 17:39 ` Chris Mason
2010-04-13 18:09 ` Nick Piggin
2010-04-13 18:19 ` Chris Mason
2010-04-13 18:57 ` Nick Piggin
2010-04-13 19:01 ` Chris Mason
2010-04-13 19:25 ` Nick Piggin
2010-04-13 19:38 ` Chris Mason
2010-04-13 20:05 ` Nick Piggin
2010-05-16 16:57 ` Manfred Spraul
2010-05-16 22:40 ` Chris Mason
2010-05-17 7:20 ` Nick Piggin
2010-04-14 16:16 ` Manfred Spraul
2010-04-14 17:33 ` Chris Mason
2010-04-14 19:11 ` Manfred Spraul
2010-04-14 19:50 ` Chris Mason
2010-04-15 16:33 ` Manfred Spraul
2010-04-15 16:34 ` Chris Mason
2010-04-13 18:24 ` Zach Brown
2010-04-16 11:26 ` Manfred Spraul
2010-04-16 11:45 ` Chris Mason [this message]
2010-04-12 18:49 ` [PATCH 2/2] ipc semaphores: order wakeups based on waiter CPU Chris Mason
2010-04-17 10:24 ` Manfred Spraul
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100416114502.GH10828@think \
--to=chris.mason@oracle.com \
--cc=jens.axboe@oracle.com \
--cc=linux-kernel@vger.kernel.org \
--cc=manfred@colorfullife.com \
--cc=npiggin@suse.de \
--cc=zach.brown@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox