From: Darren Hart <dvhltc@us.ibm.com>
To: linux-kernel@vger.kernel.org
Cc: Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@elte.hu>,
Eric Dumazet <eric.dumazet@gmail.com>,
"Peter W. Morreale" <pmorreale@novell.com>,
Rik van Riel <riel@redhat.com>,
Steven Rostedt <rostedt@goodmis.org>,
Gregory Haskins <ghaskins@novell.com>,
Sven-Thorsten Dietrich <sdietrich@novell.com>,
Chris Mason <chris.mason@oracle.com>,
John Cooper <john.cooper@third-harmonic.com>,
Chris Wright <chrisw@sous-sol.org>,
Ulrich Drepper <drepper@gmail.com>,
Alan Cox <alan@lxorguk.ukuu.org.uk>, Avi Kivity <avi@redhat.com>,
Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: Re: [PATCH V5 0/4][RFC] futex: FUTEX_LOCK with optional adaptive spinning
Date: Wed, 14 Apr 2010 23:13:22 -0700
Message-ID: <4BC6AE82.3070703@us.ibm.com>
In-Reply-To: <1270790121-16317-1-git-send-email-dvhltc@us.ibm.com>
dvhltc@us.ibm.com wrote:
> Now that an advantage can be shown using FUTEX_LOCK_ADAPTIVE over FUTEX_LOCK,
> the next steps as I see them are:
>
> o Try and show improvement of FUTEX_LOCK_ADAPTIVE over FUTEX_WAIT based
> implementations (pthread_mutex specifically).
I've spent a bit of time on this, and made huge improvements through
some simple optimizations of the testcase lock/unlock routines. I'll be
away for a few days and wanted to let people know where things stand
with FUTEX_LOCK_ADAPTIVE.
I ran all the tests with the following options:
-i 1000000 -p 1000 -d 20
where:
-i iterations
-p period (in instructions)
-d duty cycle (in percent)
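As an illustration of how these three options might interact, here is a minimal sketch of a lock test loop. It assumes the duty cycle is the share of each period spent holding the lock, so -p 1000 -d 20 would mean roughly 200 units of work inside the critical section and 800 outside. All sketch_* names are hypothetical, the busy loop only approximates an instruction count, and the real testcase may be structured differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical split of one period into locked and unlocked work. */
struct sketch_split {
    uint64_t locked;
    uint64_t unlocked;
};

/* Assumption: duty cycle = percent of the period spent holding the lock. */
static struct sketch_split sketch_split_period(uint64_t period, unsigned duty_pct)
{
    struct sketch_split s;

    assert(duty_pct <= 100);
    s.locked = period * duty_pct / 100;
    s.unlocked = period - s.locked;
    return s;
}

/* Burn roughly n loop iterations; volatile keeps the loop from being elided. */
static void sketch_busy(uint64_t n)
{
    for (volatile uint64_t i = 0; i < n; i++)
        ;
}

/* Placeholder lock routine that does nothing, for single-threaded dry runs. */
static void sketch_nop(void *arg)
{
    (void)arg;
}

/*
 * Run the loop with caller-supplied lock/unlock routines.
 * Returns the number of critical sections executed.
 */
static uint64_t sketch_locktest(uint64_t iterations, uint64_t period,
                                unsigned duty_pct,
                                void (*lock)(void *), void (*unlock)(void *),
                                void *arg)
{
    struct sketch_split s = sketch_split_period(period, duty_pct);
    uint64_t done = 0;

    for (uint64_t i = 0; i < iterations; i++) {
        lock(arg);
        sketch_busy(s.locked);   /* work done while holding the lock */
        done++;
        unlock(arg);
        sketch_busy(s.unlocked); /* work done outside the lock */
    }
    return done;
}
```

Passing the lock and unlock routines as function pointers lets the same loop drive each mechanism under test; sketch_nop stands in when no real lock is wired up.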
MECHANISM                 KITERS/SEC
------------------------------------
pthread_mutex_adaptive          1562
FUTEX_LOCK_ADAPTIVE             1190
pthread_mutex                   1010
FUTEX_LOCK                       532
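To put the gaps in these figures in proportion, the small sketch below computes throughput ratios from the numbers in the table. The struct and function names are hypothetical; only the four KITERS/SEC values come from the results above.

```c
#include <assert.h>
#include <stdio.h>

struct sketch_result {
    const char *mechanism;
    double kiters_per_sec;
};

/* Figures copied from the results table. */
static const struct sketch_result sketch_results[] = {
    { "pthread_mutex_adaptive", 1562 },
    { "FUTEX_LOCK_ADAPTIVE",    1190 },
    { "pthread_mutex",          1010 },
    { "FUTEX_LOCK",              532 },
};

/* Throughput of a relative to b. */
static double sketch_ratio(double a, double b)
{
    assert(b > 0);
    return a / b;
}

/* Print every mechanism's throughput relative to a chosen baseline. */
static void sketch_print_ratios(double baseline)
{
    size_t n = sizeof(sketch_results) / sizeof(sketch_results[0]);

    for (size_t i = 0; i < n; i++)
        printf("%-24s %5.2fx\n", sketch_results[i].mechanism,
               sketch_ratio(sketch_results[i].kiters_per_sec, baseline));
}
```

Calling sketch_print_ratios(1010) uses plain pthread_mutex as the baseline. By the same arithmetic, FUTEX_LOCK_ADAPTIVE runs at roughly 2.2 times the rate of FUTEX_LOCK, and pthread_mutex_adaptive at roughly 1.3 times the rate of FUTEX_LOCK_ADAPTIVE.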
I also took some perf data while running each of the above tests. Any
thoughts on getting more from perf are appreciated; this is my first
pass at it. I recorded with "perf record -fg" and snippets of "perf
report" follow:
FUTEX_LOCK (not adaptive) spends a lot of time spinning on the futex
hashbucket lock.
# Overhead Command Shared Object Symbol
# ........ .......... .................. ......
#
40.76% futex_lock [kernel.kallsyms] [k] _raw_spin_lock
|
--- _raw_spin_lock
|
|--62.16%-- do_futex
| sys_futex
| system_call_fastpath
| syscall
|
|--31.05%-- futex_wake
| do_futex
| sys_futex
| system_call_fastpath
| syscall
...
14.98% futex_lock futex_lock [.] locktest
FUTEX_LOCK_ADAPTIVE spends much of its time in the test loop itself,
followed by the actual adaptive loop in the kernel. It appears much of
our savings over FUTEX_LOCK comes from not contending on the hashbucket
lock.
# Overhead Command Shared Object Symbol
# ........ .......... .................. ......
#
36.07% futex_lock futex_lock [.] locktest
|
--- locktest
|
--100.00%-- 0x400e7000000000
9.12% futex_lock perf [.] 0x00000000000eee
...
8.26% futex_lock [kernel.kallsyms] [k] futex_spin_on_owner
Pthread Mutex Adaptive spends most of its time in the glibc heuristic
spinning, as expected, followed by the test loop itself. An impressively
minimal 3.35% is spent on the hashbucket lock.
# Overhead Command Shared Object Symbol
# ........ ............... ........................ ......
#
47.88% pthread_mutex_2 libpthread-2.5.so [.] __pthread_mutex_lock_internal
|
--- __pthread_mutex_lock_internal
22.78% pthread_mutex_2 pthread_mutex_2 [.] locktest
...
15.16% pthread_mutex_2 perf [.] ...
...
3.35% pthread_mutex_2 [kernel.kallsyms] [k] _raw_spin_lock
Pthread Mutex (not adaptive) spends much of its time on the hashbucket
lock as expected, followed by the test loop.
33.89% pthread_mutex_2 [kernel.kallsyms] [k] _raw_spin_lock
|
--- _raw_spin_lock
|
|--56.90%-- futex_wake
| do_futex
| sys_futex
| system_call_fastpath
| __lll_unlock_wake
|
|--28.95%-- futex_wait_setup
| futex_wait
| do_futex
| sys_futex
| system_call_fastpath
| __lll_lock_wait
...
16.60% pthread_mutex_2 pthread_mutex_2 [.] locktest
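The two call chains above end in futex_wait and futex_wake, which are the slow paths of a FUTEX_WAIT based mutex. As a rough illustration (not glibc's actual code), the sketch below follows the common three-state futex mutex design: the fast paths are plain atomic operations in user space, and only the contended paths enter the kernel, where the hashbucket lock is taken. The sketch_* names and the 0/1/2 state encoding are assumptions made for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* 0 = unlocked, 1 = locked, 2 = locked and a waiter may be sleeping */
typedef struct {
    atomic_int state;
} sketch_mutex;

/* The futex syscall operates on a 32-bit word. */
static_assert(sizeof(atomic_int) == 4, "futex word must be 32 bits");

static long sketch_futex(atomic_int *uaddr, int op, int val)
{
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

static void sketch_lock(sketch_mutex *m)
{
    int c = 0;

    /* Fast path: uncontended 0 -> 1 transition, no syscall. */
    if (atomic_compare_exchange_strong(&m->state, &c, 1))
        return;

    /* Slow path: mark the lock contended, then sleep while it stays 2. */
    if (c != 2)
        c = atomic_exchange(&m->state, 2);
    while (c != 0) {
        sketch_futex(&m->state, FUTEX_WAIT_PRIVATE, 2);
        c = atomic_exchange(&m->state, 2);
    }
}

static void sketch_unlock(sketch_mutex *m)
{
    /* Enter the kernel only if a waiter may be sleeping. */
    if (atomic_exchange(&m->state, 0) == 2)
        sketch_futex(&m->state, FUTEX_WAKE_PRIVATE, 1);
}
```

With no contention, neither routine makes a syscall, so the hashbucket lock is never touched. Under contention, every sleep and every wakeup passes through the kernel, which is consistent with the share of _raw_spin_lock time shown above.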
These results mostly confirm the expected: the adaptive versions spend
more time in their spin loops and less time contending for hashbucket
locks, while the non-adaptive versions take the hashbucket lock more
often and therefore show more contention there.
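A spin phase of the bounded-retry kind can be sketched in user space as below. This is neither glibc's code nor the futex_spin_on_owner loop from the patch; the retry bound max_spins and the 0/1 lock-word encoding are assumptions chosen for illustration. The caller would fall back to a blocking path (for example a futex wait) when the function returns false.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Try to take the lock by spinning at most max_spins times.
 * Returns true if the lock word went 0 -> 1, false if the caller
 * should give up spinning and block instead.
 */
static bool sketch_spin_acquire(atomic_int *word, int max_spins)
{
    assert(max_spins >= 0);

    for (int i = 0; i < max_spins; i++) {
        int expected = 0;

        /* Plain read first, so a held lock is polled without writes. */
        if (atomic_load_explicit(word, memory_order_relaxed) != 0)
            continue;
        if (atomic_compare_exchange_strong(word, &expected, 1))
            return true;
    }
    return false;
}
```

Each successful spin avoids a trip into the kernel and therefore avoids the hashbucket lock entirely; the cost is CPU time burned while the owner still holds the lock, which is the time that shows up in the spin loops in the profiles above.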
I believe I should be able to get the plain FUTEX_LOCK implementation to
be much closer in performance to the plain pthread mutex version. I
expect much of the work done to benefit FUTEX_LOCK will also benefit
FUTEX_LOCK_ADAPTIVE. If that's true, and I can make a significant
improvement to FUTEX_LOCK, it wouldn't take much to get
FUTEX_LOCK_ADAPTIVE to beat the heuristic spinning in glibc.
It could also be that this synthetic benchmark is an ideal situation for
glibc's heuristics, and a more realistic load with varying lock hold
times wouldn't favor the adaptive pthread mutex over FUTEX_LOCK_ADAPTIVE
by such a large margin.
More next week.
Thanks,
--
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team