linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: christophe.leroy@csgroup.eu,akpm@linux-foundation.org,patches@lists.linux.dev,linux-mm@kvack.org,mm-commits@vger.kernel.org,torvalds@linux-foundation.org,akpm@linux-foundation.org
Subject: [patch 09/41] ilog2: force inlining of __ilog2_u32() and __ilog2_u64()
Date: Wed, 23 Mar 2022 16:05:44 -0700	[thread overview]
Message-ID: <20220323230545.35626C340E9@smtp.kernel.org> (raw)
In-Reply-To: <20220323160453.65922ced539cbf445b191555@linux-foundation.org>

From: Christophe Leroy <christophe.leroy@csgroup.eu>
Subject: ilog2: force inlining of __ilog2_u32() and __ilog2_u64()

Building a kernel with CONFIG_CC_OPTIMISE_FOR_SIZE leads to __ilog2_u32()
being duplicated 50 times and __ilog2_u64() 3 times in vmlinux on a tiny
powerpc32 config.

__ilog2_u32() being 2 instructions it is not worth being kept out of line,
so force inlining.  Allthough the u64 version is a bit bigger, there is
still a small benefit in keeping it inlined.  On a 64 bits config there's
a real benefit.

With this change the size of vmlinux text is reduced by 1 kbytes, which is
approx 50% more than the size of the removed functions.

Before the patch there is for instance:

	c00d2a94 <__ilog2_u32>:
	c00d2a94:	7c 63 00 34 	cntlzw  r3,r3
	c00d2a98:	20 63 00 1f 	subfic  r3,r3,31
	c00d2a9c:	4e 80 00 20 	blr

	c00d36d8 <__order_base_2>:
	c00d36d8:	28 03 00 01 	cmplwi  r3,1
	c00d36dc:	40 81 00 2c 	ble     c00d3708 <__order_base_2+0x30>
	c00d36e0:	94 21 ff f0 	stwu    r1,-16(r1)
	c00d36e4:	7c 08 02 a6 	mflr    r0
	c00d36e8:	38 63 ff ff 	addi    r3,r3,-1
	c00d36ec:	90 01 00 14 	stw     r0,20(r1)
	c00d36f0:	4b ff f3 a5 	bl      c00d2a94 <__ilog2_u32>
	c00d36f4:	80 01 00 14 	lwz     r0,20(r1)
	c00d36f8:	38 63 00 01 	addi    r3,r3,1
	c00d36fc:	7c 08 03 a6 	mtlr    r0
	c00d3700:	38 21 00 10 	addi    r1,r1,16
	c00d3704:	4e 80 00 20 	blr
	c00d3708:	38 60 00 00 	li      r3,0
	c00d370c:	4e 80 00 20 	blr

With the patch it has become:

	c00d356c <__order_base_2>:
	c00d356c:	28 03 00 01 	cmplwi  r3,1
	c00d3570:	40 81 00 14 	ble     c00d3584 <__order_base_2+0x18>
	c00d3574:	38 63 ff ff 	addi    r3,r3,-1
	c00d3578:	7c 63 00 34 	cntlzw  r3,r3
	c00d357c:	20 63 00 20 	subfic  r3,r3,32
	c00d3580:	4e 80 00 20 	blr
	c00d3584:	38 60 00 00 	li      r3,0
	c00d3588:	4e 80 00 20 	blr

No more need for __order_base_2() to setup a stack frame and
save/restore caller address. And the following 'add 1' is
merged in the subtract.

Another typical use of it:

	c080ff28 <hugepagesz_setup>:
	...
	c080fff8:	7f c3 f3 78 	mr      r3,r30
	c080fffc:	4b 8f 81 f1 	bl      c01081ec <__ilog2_u32>
	c0810000:	38 63 ff f2 	addi    r3,r3,-14
	...

Becomes

	c080ff1c <hugepagesz_setup>:
	...
	c080ffec:	7f c3 00 34 	cntlzw  r3,r30
	c080fff0:	20 63 00 11 	subfic  r3,r3,17
	...

Here no need to move r30 argument to r3 then substract 14 to result.  Just
work on r30 and merge the 'sub 14' with the 'sub from 31'.

Link: https://lkml.kernel.org/r/803a2ac3d923ebcfd0dd40f5886b05cae7bb0aba.1644243860.git.christophe.leroy@csgroup.eu
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/log2.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/include/linux/log2.h~ilog2-force-inlining-of-__ilog2_u32-and-__ilog2_u64
+++ a/include/linux/log2.h
@@ -18,7 +18,7 @@
  * - the arch is not required to handle n==0 if implementing the fallback
  */
 #ifndef CONFIG_ARCH_HAS_ILOG2_U32
-static inline __attribute__((const))
+static __always_inline __attribute__((const))
 int __ilog2_u32(u32 n)
 {
 	return fls(n) - 1;
@@ -26,7 +26,7 @@ int __ilog2_u32(u32 n)
 #endif
 
 #ifndef CONFIG_ARCH_HAS_ILOG2_U64
-static inline __attribute__((const))
+static __always_inline __attribute__((const))
 int __ilog2_u64(u64 n)
 {
 	return fls64(n) - 1;
_


  parent reply	other threads:[~2022-03-23 23:05 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-23 23:04 incoming Andrew Morton
2022-03-23 23:05 ` [patch 01/41] proc: alloc PATH_MAX bytes for /proc/${pid}/fd/ symlinks Andrew Morton
2022-03-23 23:05 ` [patch 02/41] proc/vmcore: fix possible deadlock on concurrent mmap and read Andrew Morton
2022-03-23 23:05 ` [patch 03/41] proc/vmcore: fix vmcore_alloc_buf() kernel-doc comment Andrew Morton
2022-03-23 23:05 ` [patch 04/41] linux/types.h: remove unnecessary __bitwise__ Andrew Morton
2022-03-23 23:05 ` [patch 05/41] Documentation/sparse: add hints about __CHECKER__ Andrew Morton
2022-03-23 23:05 ` [patch 06/41] kernel/ksysfs.c: use helper macro __ATTR_RW Andrew Morton
2022-03-23 23:05 ` [patch 07/41] Kconfig.debug: make DEBUG_INFO selectable from a choice Andrew Morton
2022-03-23 23:05 ` [patch 08/41] include: drop pointless __compiler_offsetof indirection Andrew Morton
2022-03-23 23:05 ` Andrew Morton [this message]
2022-03-23 23:05 ` [patch 10/41] bitfield: add explicit inclusions to the example Andrew Morton
2022-03-23 23:05 ` [patch 11/41] lib/Kconfig.debug: add ARCH dependency for FUNCTION_ALIGN option Andrew Morton
2022-03-23 23:05 ` [patch 12/41] lib: bitmap: fix many kernel-doc warnings Andrew Morton
2022-03-23 23:05 ` [patch 13/41] checkpatch: prefer MODULE_LICENSE("GPL") over MODULE_LICENSE("GPL v2") Andrew Morton
2022-03-23 23:05 ` [patch 14/41] checkpatch: add --fix option for some TRAILING_STATEMENTS Andrew Morton
2022-03-23 23:06 ` [patch 15/41] checkpatch: add early_param exception to blank line after struct/function test Andrew Morton
2022-03-23 23:06 ` [patch 16/41] checkpatch: use python3 to find codespell dictionary Andrew Morton
2022-03-23 23:06 ` [patch 17/41] init: use ktime_us_delta() to make initcall_debug log more precise Andrew Morton
2022-03-23 23:06 ` [patch 18/41] init.h: improve __setup and early_param documentation Andrew Morton
2022-03-23 23:06 ` [patch 19/41] init/main.c: return 1 from handled __setup() functions Andrew Morton
2022-03-23 23:06 ` [patch 20/41] fs/pipe: use kvcalloc to allocate a pipe_buffer array Andrew Morton
2022-03-23 23:06 ` [patch 21/41] fs/pipe.c: local vars have to match types of proper pipe_inode_info fields Andrew Morton
2022-03-23 23:06 ` [patch 22/41] minix: fix bug when opening a file with O_DIRECT Andrew Morton
2022-03-23 23:06 ` [patch 23/41] fat: use pointer to simple type in put_user() Andrew Morton
2022-03-23 23:06 ` [patch 24/41] cgroup: use irqsave in cgroup_rstat_flush_locked() Andrew Morton
2022-03-23 23:06 ` [patch 25/41] kexec: make crashk_res, crashk_low_res and crash_notes symbols always visible Andrew Morton
2022-03-23 23:06 ` [patch 26/41] riscv: mm: init: use IS_ENABLED(CONFIG_KEXEC_CORE) instead of #ifdef Andrew Morton
2022-03-23 23:06 ` [patch 27/41] x86/setup: " Andrew Morton
2022-03-23 23:06 ` [patch 28/41] arm64: mm: " Andrew Morton
2022-03-23 23:06 ` [patch 29/41] docs: kdump: update description about sysfs file system support Andrew Morton
2022-03-23 23:06 ` [patch 30/41] docs: kdump: add scp example to write out the dump file Andrew Morton
2022-03-23 23:06 ` [patch 31/41] panic: unset panic_on_warn inside panic() Andrew Morton
2022-03-23 23:06 ` [patch 32/41] ubsan: no need to unset panic_on_warn in ubsan_epilogue() Andrew Morton
2022-03-23 23:06 ` [patch 33/41] kasan: no need to unset panic_on_warn in end_report() Andrew Morton
2022-03-23 23:07 ` [patch 34/41] taskstats: remove unneeded dead assignment Andrew Morton
2022-03-23 23:07 ` [patch 35/41] docs: sysctl/kernel: add missing bit to panic_print Andrew Morton
2022-03-23 23:07 ` [patch 36/41] panic: add option to dump all CPUs backtraces in panic_print Andrew Morton
2022-03-23 23:07 ` [patch 37/41] panic: move panic_print before kmsg dumpers Andrew Morton
2022-03-23 23:07 ` [patch 38/41] kcov: split ioctl handling into locked and unlocked parts Andrew Morton
2022-03-23 23:07 ` [patch 39/41] kcov: properly handle subsequent mmap calls Andrew Morton
2022-03-23 23:07 ` [patch 40/41] kernel/resource: fix kfree() of bootmem memory again Andrew Morton
2022-03-23 23:07 ` [patch 41/41] Revert "ubsan, kcsan: Don't combine sanitizer with kcov on clang" Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220323230545.35626C340E9@smtp.kernel.org \
    --to=akpm@linux-foundation.org \
    --cc=christophe.leroy@csgroup.eu \
    --cc=linux-mm@kvack.org \
    --cc=mm-commits@vger.kernel.org \
    --cc=patches@lists.linux.dev \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).