From: Darren Hart <dvhltc@us.ibm.com>
To: linux-kernel@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@elte.hu>,
Eric Dumazet <eric.dumazet@gmail.com>,
"Peter W. Morreale" <pmorreale@novell.com>,
Rik van Riel <riel@redhat.com>,
Steven Rostedt <rostedt@goodmis.org>,
Gregory Haskins <ghaskins@novell.com>,
Sven-Thorsten Dietrich <sdietrich@novell.com>,
Chris Mason <chris.mason@oracle.com>,
John Cooper <john.cooper@third-harmonic.com>,
Chris Wright <chrisw@sous-sol.org>,
Ulrich Drepper <drepper@gmail.com>,
Alan Cox <alan@lxorguk.ukuu.org.uk>, Avi Kivity <avi@redhat.com>,
Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: Re: [PATCH V5 0/4][RFC] futex: FUTEX_LOCK with optional adaptive spinning
Date: Wed, 14 Apr 2010 23:13:22 -0700
Message-ID: <4BC6AE82.3070703@us.ibm.com>
In-Reply-To: <1270790121-16317-1-git-send-email-dvhltc@us.ibm.com>
dvhltc@us.ibm.com wrote:
> Now that an advantage can be shown using FUTEX_LOCK_ADAPTIVE over FUTEX_LOCK,
> the next steps as I see them are:
>
> o Try and show improvement of FUTEX_LOCK_ADAPTIVE over FUTEX_WAIT based
> implementations (pthread_mutex specifically).
I've spent a bit of time on this, and made huge improvements through
some simple optimizations of the testcase lock/unlock routines. I'll be
away for a few days and wanted to let people know where things stand
with FUTEX_LOCK_ADAPTIVE.
I ran all the tests with the following options:
-i 1000000 -p 1000 -d 20
where:
-i iterations
-p period (in instructions)
-d duty cycle (in percent)
MECHANISM                 KITERS/SEC
------------------------------------
pthread_mutex_adaptive          1562
FUTEX_LOCK_ADAPTIVE             1190
pthread_mutex                   1010
FUTEX_LOCK                       532
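To make the relative standing easier to read, here is a small sketch that turns the figures in the table into ratios. The struct, the helper names (speedup, print_relative) and the output format are made up for illustration; only the four numbers come from the runs above.

```c
#include <assert.h>
#include <stdio.h>

/* Throughput figures copied from the table, in thousands of
 * iterations per second. */
struct result {
    const char *mechanism;
    double kiters_per_sec;
};

static const struct result results[] = {
    { "pthread_mutex_adaptive", 1562 },
    { "FUTEX_LOCK_ADAPTIVE",    1190 },
    { "pthread_mutex",          1010 },
    { "FUTEX_LOCK",              532 },
};

/* Ratio of a's throughput to b's (hypothetical helper). */
static double speedup(const struct result *a, const struct result *b)
{
    assert(b->kiters_per_sec > 0);
    return a->kiters_per_sec / b->kiters_per_sec;
}

/* Print every mechanism relative to the row at index 'baseline'. */
static void print_relative(size_t baseline)
{
    size_t n = sizeof(results) / sizeof(results[0]);

    assert(baseline < n);
    for (size_t i = 0; i < n; i++)
        printf("%-24s %5.2fx\n", results[i].mechanism,
               speedup(&results[i], &results[baseline]));
}
```

Calling print_relative(3) lists everything relative to plain FUTEX_LOCK. With these figures FUTEX_LOCK_ADAPTIVE works out to roughly 2.2x FUTEX_LOCK and about 1.18x pthread_mutex, while pthread_mutex_adaptive is about 1.31x FUTEX_LOCK_ADAPTIVE.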
I also took some perf data while running each of the above tests. This
is my first pass at it, so any thoughts on getting more from perf are
appreciated. I recorded with "perf record -fg"; snippets of "perf
report" follow:
FUTEX_LOCK (not adaptive) spends a lot of time spinning on the futex
hashbucket lock.
# Overhead Command Shared Object Symbol
# ........ .......... .................. ......
#
40.76% futex_lock [kernel.kallsyms] [k] _raw_spin_lock
|
--- _raw_spin_lock
|
|--62.16%-- do_futex
| sys_futex
| system_call_fastpath
| syscall
|
|--31.05%-- futex_wake
| do_futex
| sys_futex
| system_call_fastpath
| syscall
...
14.98% futex_lock futex_lock [.] locktest
FUTEX_LOCK_ADAPTIVE spends much of its time in the test loop itself,
followed by the actual adaptive loop in the kernel. It appears much of
our savings over FUTEX_LOCK comes from not contending on the hashbucket
lock.
# Overhead Command Shared Object Symbol
# ........ .......... .................. ......
#
36.07% futex_lock futex_lock [.] locktest
|
--- locktest
|
--100.00%-- 0x400e7000000000
9.12% futex_lock perf [.] 0x00000000000eee
...
8.26% futex_lock [kernel.kallsyms] [k] futex_spin_on_owner
Pthread Mutex Adaptive spends most of its time in the glibc heuristic
spinning, as expected, followed by the test loop itself. An impressively
minimal 3.35% is spent on the hashbucket lock.
# Overhead Command Shared Object Symbol
# ........ ............... ........................ ......
#
47.88% pthread_mutex_2 libpthread-2.5.so [.] __pthread_mutex_lock_internal
|
--- __pthread_mutex_lock_internal
22.78% pthread_mutex_2 pthread_mutex_2 [.] locktest
...
15.16% pthread_mutex_2 perf [.] ...
...
3.35% pthread_mutex_2 [kernel.kallsyms] [k] _raw_spin_lock
Pthread Mutex (not adaptive) spends much of its time on the hashbucket
lock as expected, followed by the test loop.
33.89% pthread_mutex_2 [kernel.kallsyms] [k] _raw_spin_lock
|
--- _raw_spin_lock
|
|--56.90%-- futex_wake
| do_futex
| sys_futex
| system_call_fastpath
| __lll_unlock_wake
|
|--28.95%-- futex_wait_setup
| futex_wait
| do_futex
| sys_futex
| system_call_fastpath
| __lll_lock_wait
...
16.60% pthread_mutex_2 pthread_mutex_2 [.] locktest
These results mostly confirm the expected: the adaptive versions spend
more time in their spin loops and less time contending for hashbucket
locks while the non-adaptive versions take the hashbucket lock more
often, and therefore show more contention there.
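For anyone who wants the spin-versus-block tradeoff in concrete terms, here is a rough userspace sketch of a spin-then-wait lock built on FUTEX_WAIT/FUTEX_WAKE. This is neither the glibc code nor what the FUTEX_LOCK patches do (those spin in the kernel, in futex_spin_on_owner, rather than blindly retrying in userspace). The type and function names and the max_spins parameter are invented for illustration. It uses the common three-state lock word: 0 unlocked, 1 locked, 2 locked with possible waiters.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

/* 0 = unlocked, 1 = locked, 2 = locked and a waiter may be asleep. */
typedef struct { atomic_int word; } sketch_lock;

/* Thin wrapper around the raw futex syscall (no timeout, no second addr). */
static long futex(atomic_int *uaddr, int op, int val)
{
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

/* max_spins = 0 gives a plain lock: one attempt, then block. */
static void sketch_lock_acquire(sketch_lock *l, int max_spins)
{
    /* Spin phase: retry in userspace. No syscall is made here, so the
     * futex hashbucket lock is never touched. */
    for (int i = 0; i <= max_spins; i++) {
        int expected = 0;
        if (atomic_compare_exchange_strong(&l->word, &expected, 1))
            return;
    }

    /* Blocking phase: mark the lock contended and sleep in the kernel
     * until the word was observed unlocked by our exchange. */
    while (atomic_exchange(&l->word, 2) != 0)
        futex(&l->word, FUTEX_WAIT_PRIVATE, 2);
}

static void sketch_lock_release(sketch_lock *l)
{
    /* Enter the kernel only if a waiter may be asleep. */
    if (atomic_exchange(&l->word, 0) == 2)
        futex(&l->word, FUTEX_WAKE_PRIVATE, 1);
}
```

With max_spins set to 0 every contended acquire and the matching release go through the futex syscall, which is where the futex_wait_setup and futex_wake entries under _raw_spin_lock in the traces above come from. A larger bound trades that for CPU time burned in the retry loop; the value to use is workload dependent and is not something this sketch tries to tune.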
I believe I should be able to get the plain FUTEX_LOCK implementation to
be much closer in performance to the plain pthread mutex version. I
expect much of the work done to benefit FUTEX_LOCK will also benefit
FUTEX_LOCK_ADAPTIVE. If that's true, and I can make a significant
improvement to FUTEX_LOCK, it wouldn't take much to get
FUTEX_LOCK_ADAPTIVE to beat the heuristic spinning in glibc.
It could also be that this synthetic benchmark is an ideal situation for
glibc's heuristics, and a more realistic load with varying lock hold
times wouldn't favor the adaptive pthread mutex over FUTEX_LOCK_ADAPTIVE
by such a large margin.
More next week.
Thanks,
--
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team