From: Manfred Spraul <manfred@colorfullife.com>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Rik van Riel <riel@redhat.com>,
Davidlohr Bueso <davidlohr.bueso@hp.com>,
hhuang@redhat.com, Linus Torvalds <torvalds@linux-foundation.org>,
Mike Galbraith <efault@gmx.de>
Subject: Re: [PATCH 0/6] ipc/sem.c: performance improvements, FIFO
Date: Fri, 14 Jun 2013 17:38:34 +0200
Message-ID: <51BB38FA.6080607@colorfullife.com>
In-Reply-To: <1370884611-3861-1-git-send-email-manfred@colorfullife.com>
Hi all,
On 06/10/2013 07:16 PM, Manfred Spraul wrote:
> Hi Andrew,
>
> I have cleaned up/improved my updates to sysv sem.
> Could you replace my patches in -akpm with this series?
>
> - 1: cacheline align output from ipc_rcu_alloc
> - 2: cacheline align semaphore structures
> - 3: seperate-wait-for-zero-and-alter-tasks
> - 4: Always-use-only-one-queue-for-alter-operations
> - 5: Replace the global sem_otime with a distributed otime
> - 6: Rename-try_atomic_semop-to-perform_atomic
Just to keep everyone updated:
I have updated my testapp:
https://github.com/manfred-colorfu/ipcscale/blob/master/sem-waitzero.cpp
Something like this gives a nice output:
# sem-waitzero -t 5 -m 0 | grep 'Cpus' | gawk '{printf("%f - %s\n",$7/$2,$0);}' | sort -n -r
The first number is the number of operations per cpu during 5 seconds.
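For anyone who prefers to post-process saved output, the same per-cpu figure can be computed outside the shell. The following is only a minimal sketch in Python: it assumes the summary line format shown in the results below ("Cpus N, interleave I delay D: TOTAL in S secs"), and the names LINE_RE, ops_per_cpu and rank are hypothetical helpers that are not part of the test application.

```python
import re

# Assumed summary line format, taken from the sample output in this mail:
#   "Cpus 18, interleave 8 delay 0: 610715090 in 5 secs"
LINE_RE = re.compile(
    r"Cpus (\d+), interleave (\d+) delay (\d+): (\d+) in (\d+) secs"
)


def ops_per_cpu(line):
    """Return (cpus, total_ops / cpus) for one summary line, else None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    cpus = int(m.group(1))
    total = int(m.group(4))
    return cpus, total / cpus


def rank(lines):
    """Return (ops_per_cpu, cpus) pairs, highest ops per cpu first."""
    rows = []
    for line in lines:
        parsed = ops_per_cpu(line)
        if parsed is not None:
            cpus, per_cpu = parsed
            rows.append((per_cpu, cpus))
    return sorted(rows, reverse=True)
```

Like the gawk expression, this divides the total operation count by the number of cpus, so the result is the count over the whole run and not a per-second rate.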
Mike was kind enough to run it on a 32-core (4-socket) Intel system:
- master doesn't scale at all when multiple sockets are used:
interleave 4: (i.e.: use cpu 0, then 4, then 8 (2nd socket), then 12):
34,717586.000000 - Cpus 1, interleave 4 delay 0: 34717586 in 5 secs
24,507337.500000 - Cpus 2, interleave 4 delay 0: 49014675 in 5 secs
3,487540.000000 - Cpus 3, interleave 4 delay 0: 10462620 in 5 secs
2,708145.000000 - Cpus 4, interleave 4 delay 0: 10832580 in 5 secs
interleave 8: (i.e.: use cpu 0, then 8 (2nd socket)):
34,587329.000000 - Cpus 1, interleave 8 delay 0: 34587329 in 5 secs
7,746981.500000 - Cpus 2, interleave 8 delay 0: 15493963 in 5 secs
- with my patches applied, it scales linearly - but only sometimes
example for good scaling (18 threads in parallel - linear scaling):
33,928616.111111 - Cpus 18, interleave 8 delay 0: 610715090 in 5 secs
example for bad scaling:
5,829109.600000 - Cpus 5, interleave 8 delay 0: 29145548 in 5 secs
To me, it looks like a livelock somewhere:
Good example: all threads contribute the same amount to the final result:
> Result matrix:
> Thread 0: 33476433
> Thread 1: 33697100
> Thread 2: 33514249
> Thread 3: 33657413
> Thread 4: 33727959
> Thread 5: 33580684
> Thread 6: 33530294
> Thread 7: 33666761
> Thread 8: 33749836
> Thread 9: 32636493
> Thread 10: 33550620
> Thread 11: 33403314
> Thread 12: 33594457
> Thread 13: 33331920
> Thread 14: 33503588
> Thread 15: 33585348
> Cpus 16, interleave 8 delay 0: 536206469 in 5 secs
Bad example: one thread is as fast as it should be, others are slow:
> Result matrix:
> Thread 0: 31629540
> Thread 1: 5336968
> Thread 2: 6404314
> Thread 3: 9190595
> Thread 4: 9681006
> Thread 5: 9935421
> Thread 6: 9424324
> Cpus 7, interleave 8 delay 0: 81602168 in 5 secs
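The difference between the two result matrices can be reduced to a single number per run, which makes it easier to flag bad runs automatically. This is a minimal sketch in Python, assuming the "Thread N: COUNT" line format shown above; THREAD_RE, thread_counts and imbalance are hypothetical names, and where to draw the line between a good and a bad run is an arbitrary choice left to the reader.

```python
import re

# Assumed per-thread line format from the result matrix: "Thread 3: 9190595"
THREAD_RE = re.compile(r"Thread\s+(\d+):\s+(\d+)")


def thread_counts(text):
    """Extract the per-thread operation counts from a result matrix."""
    return [int(m.group(2)) for m in THREAD_RE.finditer(text)]


def imbalance(counts):
    """Return slowest / fastest thread; close to 1.0 means an even split."""
    if not counts:
        raise ValueError("no thread counts found")
    return min(counts) / max(counts)
```

For the good example above the ratio is about 0.97; for the bad example it is about 0.17, since thread 1 completes roughly one sixth of the operations that thread 0 does.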
The results are not stable: the same test is sometimes fast, sometimes slow.
I have no idea where the livelock could be and I wasn't able to notice
anything on my i3 laptop.
Thus: Who has an idea?
What I can say is that the livelock can't be in do_smart_update(): The
function is never called.
--
Manfred