From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Ingo Molnar <mingo@kernel.org>, Davidlohr Bueso <dave@stgolabs.net>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>,
Thomas Gleixner <tglx@linutronix.de>,
kbuild test robot <fengguang.wu@intel.com>,
Peter Zijlstra <peterz@infradead.org>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] futex: eliminate cache miss from futex_hash()
Date: Mon, 26 Oct 2015 16:22:27 +0100
Message-ID: <562E4533.2060907@linutronix.de>
In-Reply-To: <20150912095936.GA15348@gmail.com>
On 09/12/2015 11:59 AM, Ingo Molnar wrote:
>
> * Davidlohr Bueso <dave@stgolabs.net> wrote:
>
>> I think we should leave it as is.
>
> But ... given that these are shared-cached values (cached on all CPUs), this
> change would only be measurable in such a benchmark if the cache footprint of the
> test is just about to overflow the size of the CPU cache and the one extra cache
> line would cause cache thrashing. That is very unlikely.
>
> So such a change seems to make sense unless you can argue that it's _bad_ to move
> them closer to each other.
hash_futex(), ARM, gcc-5.2.1:
- three fewer instructions
- no register is pushed to / popped from the stack
--- futex_old.o_f.S
+++ futex_new.o_f.S
@@ -1,26 +1,23 @@
 00000000 <hash_futex>:
-push {lr} ; (str lr, [sp, #-4]!)
-movw r3, #48887 ; 0xbef7
 ldr r1, [r0, #8]
-movt r3, #57005 ; 0xdead
+movw r3, #48887 ; 0xbef7
 ldr r2, [r0, #4]
-movw ip, #0
+movt r3, #57005 ; 0xdead
 add r3, r1, r3
 ldr r0, [r0]
 add r2, r3, r2
-movt ip, #0
+movw ip, #0
 eor r1, r3, r2
 add r3, r3, r0
 sub r1, r1, r2, ror #18
-ldr ip, [ip]
+movt ip, #0
 eor r3, r3, r1
-movw lr, #0
+ldr r0, [ip, #4]
 sub r3, r3, r1, ror #21
-sub ip, ip, #1
+ldr ip, [ip]
 eor r2, r2, r3
-movt lr, #0
+sub r0, r0, #1
 sub r2, r2, r3, ror #7
-ldr r0, [lr]
 eor r1, r1, r2
 sub r1, r1, r2, ror #16
 eor r3, r3, r1
@@ -29,6 +26,6 @@
 sub r3, r2, r3, ror #18
 eor r1, r1, r3
 sub r3, r1, r3, ror #8
-and r3, r3, ip
-add r0, r0, r3, lsl #6
-pop {pc} ; (ldr pc, [sp], #4)
+and r0, r0, r3
+add r0, ip, r0, lsl #6
+bx lr
I guess that not executing three instructions is a good thing :)
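For readers who want the idea in C rather than in disassembly, here is a rough userspace sketch of what the new listing appears to do: both hot values live in one small, aligned object, so a single base address reaches them (the `ldr r0, [ip, #4]` / `ldr ip, [ip]` pair). This is not the patch itself. The names `hot_data`, `bucket`, `table`, `pick_bucket()` and `same_line()` are made up for illustration, the 64-byte bucket size is only inferred from the `lsl #6` in the listing, and the 16-byte alignment is an assumption that covers a pointer plus an `unsigned long` on common 32-bit and 64-bit Linux ABIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a hash bucket; 64 bytes mirrors the
 * "lsl #6" scaling seen in the listing (an inference, not a fact
 * taken from the patch). */
struct bucket {
	unsigned char pad[64];
};

/* Both hot values in one object. With 16-byte alignment and a size
 * of at most 16 bytes, the object cannot straddle a cache line whose
 * size is a multiple of 16. */
static _Alignas(16) struct {
	struct bucket *queues;   /* base of the bucket array */
	unsigned long hashsize;  /* number of buckets, a power of two */
} hot_data;

_Static_assert(sizeof(hot_data) <= 16, "hot_data must fit in 16 bytes");

static struct bucket table[8];

/* Illustrative setup: point the hot data at a small static table. */
static void hot_data_init(void)
{
	hot_data.queues = table;
	hot_data.hashsize = sizeof(table) / sizeof(table[0]);
}

/* Mask the hash with (hashsize - 1) and index the bucket array,
 * the same shape as the "sub / and / add ..., lsl #6" tail. */
static struct bucket *pick_bucket(uint32_t hash)
{
	assert((hot_data.hashsize & (hot_data.hashsize - 1)) == 0);
	return &hot_data.queues[hash & (hot_data.hashsize - 1)];
}

/* True if both addresses fall into the same line of the given size. */
static int same_line(const void *a, const void *b, size_t line)
{
	return (uintptr_t)a / line == (uintptr_t)b / line;
}
```

After calling `hot_data_init()`, `same_line(&hot_data.queues, &hot_data.hashsize, 64)` should hold for a 64-byte line, and `pick_bucket(13)` selects `table[5]` because 13 & 7 is 5. With two unrelated globals instead, the compiler has to materialise two addresses, which is presumably where the extra `movw` / `movt` pair in the old listing comes from.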
> Thanks,
>
> Ingo
>
Sebastian
Thread overview: 5+ messages
2015-09-09 21:36 [PATCH] futex: eliminate cache miss from futex_hash() Rasmus Villemoes
2015-09-10 10:22 ` Davidlohr Bueso
2015-09-12 9:59 ` Ingo Molnar
2015-10-26 15:22 ` Sebastian Andrzej Siewior [this message]
2015-09-22 14:27 ` [tip:locking/core] futex: Force hot variables into a single cache line tip-bot for Rasmus Villemoes