From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org,
"Peter Zijlstra (Intel)" <peterz@infradead.org>,
Anshul Garg <aksgarg1989@gmail.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Davidlohr Bueso <dave@stgolabs.net>,
Thomas Gleixner <tglx@linutronix.de>,
Ingo Molnar <mingo@kernel.org>, Will Deacon <will.deacon@arm.com>,
Joe Perches <joe@perches.com>, David Miller <davem@davemloft.net>,
Matthew Wilcox <mawilcox@microsoft.com>,
Kees Cook <keescook@chromium.org>,
Michael Davidson <md@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Arnd Bergmann <arnd@arndb.de>
Subject: [PATCH 4.9 24/30] lib/int_sqrt: optimize small argument
Date: Tue, 26 Mar 2019 15:30:03 +0900 [thread overview]
Message-ID: <20190326042608.381928654@linuxfoundation.org> (raw)
In-Reply-To: <20190326042607.558087893@linuxfoundation.org>
4.9-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Zijlstra <peterz@infradead.org>
commit 3f3295709edea6268ff1609855f498035286af73 upstream.
The current int_sqrt() computation is sub-optimal for the case of small
@x. Which is the interesting case when we're going to do cumulative
distribution functions on idle times, which we assume to be a random
variable, where the target residency of the deepest idle state gives an
upper bound on the variable (5e6ns on recent Intel chips).
In the case of small @x, the compute loop:
while (m != 0) {
b = y + m;
y >>= 1;
if (x >= b) {
x -= b;
y += m;
}
m >>= 2;
}
can be reduced to:
while (m > x)
m >>= 2;
Because y==0, b==m and until x>=m y will remain 0.
And while this is computationally equivalent, it runs much faster
because there's less code, in particular less branches.
cycles: branches: branch-misses:
OLD:
hot: 45.109444 +- 0.044117 44.333392 +- 0.002254 0.018723 +- 0.000593
cold: 187.737379 +- 0.156678 44.333407 +- 0.002254 6.272844 +- 0.004305
PRE:
hot: 67.937492 +- 0.064124 66.999535 +- 0.000488 0.066720 +- 0.001113
cold: 232.004379 +- 0.332811 66.999527 +- 0.000488 6.914634 +- 0.006568
POST:
hot: 43.633557 +- 0.034373 45.333132 +- 0.002277 0.023529 +- 0.000681
cold: 207.438411 +- 0.125840 45.333132 +- 0.002277 6.976486 +- 0.004219
Averages computed over all values <128k using a LFSR to generate order.
Cold numbers have a LFSR based branch trace buffer 'confuser' ran between
each int_sqrt() invocation.
Link: http://lkml.kernel.org/r/20171020164644.876503355@infradead.org
Fixes: 30493cc9dddb ("lib/int_sqrt.c: optimize square root algorithm")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Suggested-by: Anshul Garg <aksgarg1989@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Joe Perches <joe@perches.com>
Cc: David Miller <davem@davemloft.net>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michael Davidson <md@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
lib/int_sqrt.c | 3 +++
1 file changed, 3 insertions(+)
--- a/lib/int_sqrt.c
+++ b/lib/int_sqrt.c
@@ -22,6 +22,9 @@ unsigned long int_sqrt(unsigned long x)
return x;
m = 1UL << (BITS_PER_LONG - 2);
+ while (m > x)
+ m >>= 2;
+
while (m != 0) {
b = y + m;
y >>= 1;
next prev parent reply other threads:[~2019-03-26 6:32 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-03-26 6:29 [PATCH 4.9 00/30] 4.9.166-stable review Greg Kroah-Hartman
2019-03-26 6:29 ` [PATCH 4.9 01/30] mmc: pxamci: fix enum type confusion Greg Kroah-Hartman
2019-03-26 6:29 ` [PATCH 4.9 02/30] drm/vmwgfx: Dont double-free the mode stored in par->set_mode Greg Kroah-Hartman
2019-03-26 6:29 ` [PATCH 4.9 03/30] iommu/amd: fix sg->dma_address for sg->offset bigger than PAGE_SIZE Greg Kroah-Hartman
2019-03-26 6:29 ` [PATCH 4.9 04/30] libceph: wait for latest osdmap in ceph_monc_blacklist_add() Greg Kroah-Hartman
2019-03-26 6:29 ` [PATCH 4.9 05/30] udf: Fix crash on IO error during truncate Greg Kroah-Hartman
2019-03-26 6:29 ` [PATCH 4.9 06/30] mips: loongson64: lemote-2f: Add IRQF_NO_SUSPEND to "cascade" irqaction Greg Kroah-Hartman
2019-03-26 6:29 ` [PATCH 4.9 07/30] MIPS: Ensure ELF appended dtb is relocated Greg Kroah-Hartman
2019-03-26 6:29 ` [PATCH 4.9 08/30] MIPS: Fix kernel crash for R6 in jump label branch function Greg Kroah-Hartman
2019-03-26 6:29 ` [PATCH 4.9 09/30] futex: Ensure that futex address is aligned in handle_futex_death() Greg Kroah-Hartman
2019-03-26 6:29 ` [PATCH 4.9 10/30] objtool: Move objtool_file struct off the stack Greg Kroah-Hartman
2019-03-26 6:29 ` [PATCH 4.9 11/30] ext4: fix NULL pointer dereference while journal is aborted Greg Kroah-Hartman
2019-03-26 6:29 ` [PATCH 4.9 12/30] ext4: fix data corruption caused by unaligned direct AIO Greg Kroah-Hartman
2019-03-26 6:29 ` [PATCH 4.9 13/30] ext4: brelse all indirect buffer in ext4_ind_remove_space() Greg Kroah-Hartman
2019-03-26 6:29 ` [PATCH 4.9 14/30] media: v4l2-ctrls.c/uvc: zero v4l2_event Greg Kroah-Hartman
2019-03-26 6:29 ` [PATCH 4.9 15/30] Bluetooth: Fix decrementing reference count twice in releasing socket Greg Kroah-Hartman
2019-03-26 6:29 ` [PATCH 4.9 16/30] locking/lockdep: Add debug_locks check in __lock_downgrade() Greg Kroah-Hartman
2019-03-26 6:29 ` [PATCH 4.9 17/30] ALSA: hda - Record the current power state before suspend/resume calls Greg Kroah-Hartman
2019-03-26 6:29 ` [PATCH 4.9 18/30] ALSA: hda - Enforces runtime_resume after S3 and S4 for each codec Greg Kroah-Hartman
2019-03-26 6:29 ` [PATCH 4.9 19/30] tcp/dccp: drop SYN packets if accept queue is full Greg Kroah-Hartman
2019-03-26 6:29 ` [PATCH 4.9 20/30] serial: sprd: adjust TIMEOUT to a big value Greg Kroah-Hartman
2019-03-26 6:30 ` [PATCH 4.9 21/30] Hang/soft lockup in d_invalidate with simultaneous calls Greg Kroah-Hartman
2019-03-26 6:30 ` [PATCH 4.9 22/30] arm64: traps: disable irq in die() Greg Kroah-Hartman
2019-03-26 6:30 ` [PATCH 4.9 23/30] serial: sprd: clear timeout interrupt only rather than all interrupts Greg Kroah-Hartman
2019-03-26 6:30 ` Greg Kroah-Hartman [this message]
2019-03-26 6:30 ` [PATCH 4.9 25/30] USB: core: only clean up what we allocated Greg Kroah-Hartman
2019-03-30 17:18 ` Nathan Chancellor
2019-04-01 11:46 ` Greg Kroah-Hartman
2019-03-26 6:30 ` [PATCH 4.9 26/30] scsi: ufs: fix wrong command type of UTRD for UFSHCI v2.1 Greg Kroah-Hartman
2019-03-26 6:30 ` [PATCH 4.9 27/30] rtc: Fix overflow when converting time64_t to rtc_time Greg Kroah-Hartman
2019-03-26 6:30 ` [PATCH 4.9 28/30] pwm-backlight: Enable/disable the PWM before/after LCD enable toggle Greg Kroah-Hartman
2019-03-26 6:30 ` [PATCH 4.9 29/30] power: supply: charger-manager: Fix incorrect return value Greg Kroah-Hartman
2019-03-26 6:30 ` [PATCH 4.9 30/30] ath10k: avoid possible string overflow Greg Kroah-Hartman
2019-03-26 11:41 ` [PATCH 4.9 00/30] 4.9.166-stable review Naresh Kamboju
2019-03-26 12:03 ` kernelci.org bot
2019-03-26 15:18 ` Jon Hunter
2019-03-26 15:18 ` Jon Hunter
2019-03-26 17:48 ` Guenter Roeck
2019-03-26 23:16 ` shuah
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190326042608.381928654@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=akpm@linux-foundation.org \
--cc=aksgarg1989@gmail.com \
--cc=arnd@arndb.de \
--cc=dave@stgolabs.net \
--cc=davem@davemloft.net \
--cc=joe@perches.com \
--cc=keescook@chromium.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mawilcox@microsoft.com \
--cc=md@google.com \
--cc=mingo@kernel.org \
--cc=peterz@infradead.org \
--cc=stable@vger.kernel.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=will.deacon@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.