All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michal Nazarewicz <mina86@mina86.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Hagen Paul Pfeifer <hagen@jauu.net>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] include: kernel.h: rewrite min3, max3 and clamp using min and max
Date: Tue, 17 Jun 2014 15:49:05 +0200	[thread overview]
Message-ID: <xa1tionzwsry.fsf@mina86.com> (raw)
In-Reply-To: <20140616161804.bb06ed842f59371c031d1252@linux-foundation.org>

On Mon, Jun 16 2014, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Mon, 16 Jun 2014 23:07:22 +0200 Michal Nazarewicz <mina86@mina86.com> wrote:
>
>> It appears that gcc is better at optimising a double call to min
>> and max rather than open coded min3 and max3.  This can be observed
>> here:
>> 
>> ...
>>
>> Furthermore, after ___make allmodconfig && make bzImage modules___ this is the
>> comparison of image and modules sizes:
>> 
>>     # Without this patch applied
>>     $ ls -l arch/x86/boot/bzImage **/*.ko |awk '{size += $5} END {print size}'
>>     350715800
>> 
>>     # With this patch applied
>>     $ ls -l arch/x86/boot/bzImage **/*.ko |awk '{size += $5} END {print size}'
>>     349856528
>
> We saved nearly a megabyte by optimising min3(), max3() and clamp()? 
>
> I'm counting a grand total of 182 callsites for those macros.  So the
> saving is 4700 bytes per invokation?  I don't believe it...

You're absolutely right.  I must have messed something up here.  This
portion of the commit message should be removed.

So I've redone this just on just the kernel image with allmodconfig:

    Linux mpn-glaptop 3.13.0-29-generic #53~precise1-Ubuntu SMP Wed Jun 4 22:06:25 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
    gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3
    Copyright (C) 2011 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    
    -rwx------ 1 mpn eng 51224656 Jun 17 14:15 vmlinux.before
    -rwx------ 1 mpn eng 51224608 Jun 17 13:57 vmlinux.after

48 bytes reduction.  The do_fault_around was a few instruction shorter
and as far as I can tell saved 12 bytes on the stack, i.e.:

    $ grep -e rsp -e pop -e push do_fault_around.*
    do_fault_around.before.s:push   %rbp
    do_fault_around.before.s:mov    %rsp,%rbp
    do_fault_around.before.s:push   %r13
    do_fault_around.before.s:push   %r12
    do_fault_around.before.s:push   %rbx
    do_fault_around.before.s:sub    $0x38,%rsp
    do_fault_around.before.s:add    $0x38,%rsp
    do_fault_around.before.s:pop    %rbx
    do_fault_around.before.s:pop    %r12
    do_fault_around.before.s:pop    %r13
    do_fault_around.before.s:pop    %rbp

    do_fault_around.after.s:push   %rbp
    do_fault_around.after.s:mov    %rsp,%rbp
    do_fault_around.after.s:push   %r12
    do_fault_around.after.s:push   %rbx
    do_fault_around.after.s:sub    $0x30,%rsp
    do_fault_around.after.s:add    $0x30,%rsp
    do_fault_around.after.s:pop    %rbx
    do_fault_around.after.s:pop    %r12
    do_fault_around.after.s:pop    %rbp

or here side-by-side:

    Before                    After
    push   %rbp               push   %rbp          
    mov    %rsp,%rbp          mov    %rsp,%rbp     
    push   %r13                                    
    push   %r12               push   %r12          
    push   %rbx               push   %rbx          
    sub    $0x38,%rsp         sub    $0x30,%rsp    
    add    $0x38,%rsp         add    $0x30,%rsp    
    pop    %rbx               pop    %rbx          
    pop    %r12               pop    %r12          
    pop    %r13                                    
    pop    %rbp               pop    %rbp          

There are also fewer branches:

    $ grep ^j do_fault_around.*
    do_fault_around.before.s:jae    ffffffff812079b7
    do_fault_around.before.s:jmp    ffffffff812079c5
    do_fault_around.before.s:jmp    ffffffff81207a14
    do_fault_around.before.s:ja     ffffffff812079f9
    do_fault_around.before.s:jb     ffffffff81207a10
    do_fault_around.before.s:jmp    ffffffff81207a63
    do_fault_around.before.s:jne    ffffffff812079df
    
    do_fault_around.after.s:jmp    ffffffff812079fd
    do_fault_around.after.s:ja     ffffffff812079e2
    do_fault_around.after.s:jb     ffffffff812079f9
    do_fault_around.after.s:jmp    ffffffff81207a4c
    do_fault_around.after.s:jne    ffffffff812079c8

And here's with allyesconfig on a different machine:

    $ uname -a; gcc --version; ls -l vmlinux.*
    Linux erwin 3.14.7-mn #54 SMP Sun Jun 15 11:25:08 CEST 2014 x86_64 AMD Phenom(tm) II X3 710 Processor AuthenticAMD GNU/Linux
    gcc (GCC) 4.8.3
    Copyright (C) 2013 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.  There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
    
    -rwx------ 1 mina86 mina86 230616126 Jun 17 15:39 vmlinux.before*
    -rwx------ 1 mina86 mina86 230614861 Jun 17 14:36 vmlinux.after*

1265 bytes reduction.

-- 
Best regards,                                         _     _
.o. | Liege of Serenely Enlightened Majesty of      o' \,=./ `o
..o | Computer Science,  Michał “mina86” Nazarewicz    (o o)
ooo +--<mpn@google.com>--<xmpp:mina86@jabber.org>--ooO--(_)--Ooo--

  parent reply	other threads:[~2014-06-17 13:49 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-16 21:07 [PATCH] include: kernel.h: rewrite min3, max3 and clamp using min and max Michal Nazarewicz
2014-06-16 23:18 ` Andrew Morton
2014-06-16 23:25   ` David Rientjes
2014-06-16 23:35     ` Andrew Morton
2014-06-16 23:54       ` David Rientjes
2014-06-17  0:21         ` Steven Rostedt
2014-06-17  4:01           ` Steven Rostedt
2014-06-17 13:49   ` Michal Nazarewicz [this message]
2014-06-17 11:37 ` Hagen Paul Pfeifer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xa1tionzwsry.fsf@mina86.com \
    --to=mina86@mina86.com \
    --cc=akpm@linux-foundation.org \
    --cc=hagen@jauu.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.