public inbox for kernel-janitors@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] lib/lz4: make arrays static const, reduces object code size
@ 2017-09-21 22:19 Colin King
  2017-09-21 23:09 ` Christophe JAILLET
  0 siblings, 1 reply; 11+ messages in thread
From: Colin King @ 2017-09-21 22:19 UTC (permalink / raw)
  To: Sven Schmidt, Andrew Morton, Arnd Bergmann; +Cc: kernel-janitors, linux-kernel

From: Colin Ian King <colin.king@canonical.com>

Don't populate the read-only arrays dec32table and dec64table on the
stack, instead make them both static const.  Makes the object code
smaller by over 10K bytes:

Before:
   text	   data	    bss	    dec	    hex	filename
  31500	      0	      0	  31500	   7b0c	lib/lz4/lz4_decompress.o

After:
   text	   data	    bss	    dec	    hex	filename
  20237	    176	      0	  20413	   4fbd	lib/lz4/lz4_decompress.o

(gcc version 7.2.0 x86_64)

Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 lib/lz4/lz4_decompress.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/lz4/lz4_decompress.c b/lib/lz4/lz4_decompress.c
index bd3574312b82..141734d255e4 100644
--- a/lib/lz4/lz4_decompress.c
+++ b/lib/lz4/lz4_decompress.c
@@ -85,8 +85,8 @@ static FORCE_INLINE int LZ4_decompress_generic(
 	const BYTE * const lowLimit = lowPrefix - dictSize;
 
 	const BYTE * const dictEnd = (const BYTE *)dictStart + dictSize;
-	const unsigned int dec32table[] = { 0, 1, 2, 1, 4, 4, 4, 4 };
-	const int dec64table[] = { 0, 0, 0, -1, 0, 1, 2, 3 };
+	static const unsigned int dec32table[] = { 0, 1, 2, 1, 4, 4, 4, 4 };
+	static const int dec64table[] = { 0, 0, 0, -1, 0, 1, 2, 3 };
 
 	const int safeDecode = (endOnInput = endOnInputSize);
 	const int checkOffset = ((safeDecode) && (dictSize < (int)(64 * KB)));
-- 
2.14.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] lib/lz4: make arrays static const, reduces object code size
  2017-09-21 22:19 [PATCH] lib/lz4: make arrays static const, reduces object code size Colin King
@ 2017-09-21 23:09 ` Christophe JAILLET
  2017-09-21 23:11   ` Colin Ian King
  0 siblings, 1 reply; 11+ messages in thread
From: Christophe JAILLET @ 2017-09-21 23:09 UTC (permalink / raw)
  To: Colin King, Sven Schmidt, Andrew Morton, Arnd Bergmann
  Cc: kernel-janitors, linux-kernel

Le 22/09/2017 à 00:19, Colin King a écrit :
> From: Colin Ian King <colin.king@canonical.com>
>
> Don't populate the read-only arrays dec32table and dec64table on the
> stack, instead make them both static const.  Makes the object code
> smaller by over 10K bytes:
10k? Wouaouh! This is way much more than what you usually win with such 
patches.
I assume we must thanks FORCE_INLINE in this case.
Nice catch!

CJ

> Before:
>     text	   data	    bss	    dec	    hex	filename
>    31500	      0	      0	  31500	   7b0c	lib/lz4/lz4_decompress.o
>
> After:
>     text	   data	    bss	    dec	    hex	filename
>    20237	    176	      0	  20413	   4fbd	lib/lz4/lz4_decompress.o
>
> (gcc version 7.2.0 x86_64)
>
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
> ---
>   lib/lz4/lz4_decompress.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/lib/lz4/lz4_decompress.c b/lib/lz4/lz4_decompress.c
> index bd3574312b82..141734d255e4 100644
> --- a/lib/lz4/lz4_decompress.c
> +++ b/lib/lz4/lz4_decompress.c
> @@ -85,8 +85,8 @@ static FORCE_INLINE int LZ4_decompress_generic(
>   	const BYTE * const lowLimit = lowPrefix - dictSize;
>   
>   	const BYTE * const dictEnd = (const BYTE *)dictStart + dictSize;
> -	const unsigned int dec32table[] = { 0, 1, 2, 1, 4, 4, 4, 4 };
> -	const int dec64table[] = { 0, 0, 0, -1, 0, 1, 2, 3 };
> +	static const unsigned int dec32table[] = { 0, 1, 2, 1, 4, 4, 4, 4 };
> +	static const int dec64table[] = { 0, 0, 0, -1, 0, 1, 2, 3 };
>   
>   	const int safeDecode = (endOnInput = endOnInputSize);
>   	const int checkOffset = ((safeDecode) && (dictSize < (int)(64 * KB)));



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] lib/lz4: make arrays static const, reduces object code size
  2017-09-21 23:09 ` Christophe JAILLET
@ 2017-09-21 23:11   ` Colin Ian King
  2017-09-22  7:48     ` Arnd Bergmann
  0 siblings, 1 reply; 11+ messages in thread
From: Colin Ian King @ 2017-09-21 23:11 UTC (permalink / raw)
  To: Christophe JAILLET, Sven Schmidt, Andrew Morton, Arnd Bergmann
  Cc: kernel-janitors, linux-kernel

On 22/09/17 00:09, Christophe JAILLET wrote:
> Le 22/09/2017 à 00:19, Colin King a écrit :
>> From: Colin Ian King <colin.king@canonical.com>
>>
>> Don't populate the read-only arrays dec32table and dec64table on the
>> stack, instead make them both static const.  Makes the object code
>> smaller by over 10K bytes:
> 10k? Wouaouh! This is way much more than what you usually win with such
> patches.

Yes, I had to triple check it because it was an unbelievable win.

> I assume we must thanks FORCE_INLINE in this case.

Yep, I believe so.

> Nice catch!
> 
> CJ
> 
>> Before:
>>     text       data        bss        dec        hex    filename
>>    31500          0          0      31500       7b0c   
>> lib/lz4/lz4_decompress.o
>>
>> After:
>>     text       data        bss        dec        hex    filename
>>    20237        176          0      20413       4fbd   
>> lib/lz4/lz4_decompress.o
>>
>> (gcc version 7.2.0 x86_64)
>>
>> Signed-off-by: Colin Ian King <colin.king@canonical.com>
>> ---
>>   lib/lz4/lz4_decompress.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/lib/lz4/lz4_decompress.c b/lib/lz4/lz4_decompress.c
>> index bd3574312b82..141734d255e4 100644
>> --- a/lib/lz4/lz4_decompress.c
>> +++ b/lib/lz4/lz4_decompress.c
>> @@ -85,8 +85,8 @@ static FORCE_INLINE int LZ4_decompress_generic(
>>       const BYTE * const lowLimit = lowPrefix - dictSize;
>>         const BYTE * const dictEnd = (const BYTE *)dictStart + dictSize;
>> -    const unsigned int dec32table[] = { 0, 1, 2, 1, 4, 4, 4, 4 };
>> -    const int dec64table[] = { 0, 0, 0, -1, 0, 1, 2, 3 };
>> +    static const unsigned int dec32table[] = { 0, 1, 2, 1, 4, 4, 4, 4 };
>> +    static const int dec64table[] = { 0, 0, 0, -1, 0, 1, 2, 3 };
>>         const int safeDecode = (endOnInput = endOnInputSize);
>>       const int checkOffset = ((safeDecode) && (dictSize < (int)(64 *
>> KB)));
> 
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] lib/lz4: make arrays static const, reduces object code size
  2017-09-21 23:11   ` Colin Ian King
@ 2017-09-22  7:48     ` Arnd Bergmann
  2017-09-22 17:21       ` Joe Perches
  0 siblings, 1 reply; 11+ messages in thread
From: Arnd Bergmann @ 2017-09-22  7:48 UTC (permalink / raw)
  To: Colin Ian King
  Cc: Christophe JAILLET, Sven Schmidt, Andrew Morton, kernel-janitors,
	Linux Kernel Mailing List

On Fri, Sep 22, 2017 at 1:11 AM, Colin Ian King
<colin.king@canonical.com> wrote:
> On 22/09/17 00:09, Christophe JAILLET wrote:
>> Le 22/09/2017 à 00:19, Colin King a écrit :
>>> From: Colin Ian King <colin.king@canonical.com>
>>>
>>> Don't populate the read-only arrays dec32table and dec64table on the
>>> stack, instead make them both static const.  Makes the object code
>>> smaller by over 10K bytes:
>> 10k? Wouaouh! This is way much more than what you usually win with such
>> patches.
>
> Yes, I had to triple check it because it was an unbelievable win.
>

I wonder whether this should be reported as a gcc bug. I tried reproducing
it here with gcc-7.1.1 and gcc-8.0.0, but I only see a 4K difference:

   text    data     bss     dec     hex filename
  18220     176       0   18396    47dc build/tmp/lib/lz4/lz4_decompress-after.o
  22297       0       0   22297    5719
build/tmp/lib/lz4/lz4_decompress-before.o

I tried x86 defconfig and allmodconfig on a slightly patched mainline kernel.

        Arnd

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] lib/lz4: make arrays static const, reduces object code size
  2017-09-22  7:48     ` Arnd Bergmann
@ 2017-09-22 17:21       ` Joe Perches
  2017-09-22 19:17         ` Arnd Bergmann
  0 siblings, 1 reply; 11+ messages in thread
From: Joe Perches @ 2017-09-22 17:21 UTC (permalink / raw)
  To: Arnd Bergmann, Colin Ian King
  Cc: Christophe JAILLET, Sven Schmidt, Andrew Morton, kernel-janitors,
	Linux Kernel Mailing List

On Fri, 2017-09-22 at 09:48 +0200, Arnd Bergmann wrote:
> On Fri, Sep 22, 2017 at 1:11 AM, Colin Ian King
> <colin.king@canonical.com> wrote:
> > On 22/09/17 00:09, Christophe JAILLET wrote:
> > > Le 22/09/2017 à 00:19, Colin King a écrit :
> > > > From: Colin Ian King <colin.king@canonical.com>
> > > > 
> > > > Don't populate the read-only arrays dec32table and dec64table on the
> > > > stack, instead make them both static const.  Makes the object code
> > > > smaller by over 10K bytes:
> > > 
> > > 10k? Wouaouh! This is way much more than what you usually win with such
> > > patches.
> > 
> > Yes, I had to triple check it because it was an unbelievable win.
> > 
> 
> I wonder whether this should be reported as a gcc bug. I tried reproducing
> it here with gcc-7.1.1 and gcc-8.0.0, but I only see a 4K difference:
> 
>    text    data     bss     dec     hex filename
>   18220     176       0   18396    47dc build/tmp/lib/lz4/lz4_decompress-after.o
>   22297       0       0   22297    5719 build/tmp/lib/lz4/lz4_decompress-before.o

Perhaps not so much a gcc bug as an opportunity
for gcc to add an additional optimization.

gcc would have to verify that the const array is
not initialized with some variable or argument like:

int foo(int a)
{
	const int array[] = {1, a};
	...
}

--
To unsubscribe from this list: send the line "unsubscribe kernel-janitors" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] lib/lz4: make arrays static const, reduces object code size
  2017-09-22 17:21       ` Joe Perches
@ 2017-09-22 19:17         ` Arnd Bergmann
  2017-09-22 19:38           ` Arnd Bergmann
                             ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Arnd Bergmann @ 2017-09-22 19:17 UTC (permalink / raw)
  To: Joe Perches
  Cc: Colin Ian King, Christophe JAILLET, Sven Schmidt, Andrew Morton,
	kernel-janitors, Linux Kernel Mailing List

On Fri, Sep 22, 2017 at 7:21 PM, Joe Perches <joe@perches.com> wrote:
> On Fri, 2017-09-22 at 09:48 +0200, Arnd Bergmann wrote:
>> On Fri, Sep 22, 2017 at 1:11 AM, Colin Ian King

>>    text    data     bss     dec     hex filename
>>   18220     176       0   18396    47dc build/tmp/lib/lz4/lz4_decompress-after.o
>>   22297       0       0   22297    5719 build/tmp/lib/lz4/lz4_decompress-before.o
>
> Perhaps not so much a gcc bug as an opportunity
> for gcc to add an additional optimization.
>
> gcc would have to verify that the const array is
> not initialized with some variable or argument like:
>
> int foo(int a)
> {
>         const int array[] = {1, a};
>         ...
> }

It depends. With a 10KB different in .text size, my guess is that this
is a case where gcc does the right optimization in principle, but
fails to do what was intended in some corner cases.

I just cross-checked by building with clang, there the patch has
no impact on code size, it is 24929 bytes with or without the patch.

Looking at other versions of (x86) gcc, I see .text sizes of

             after    before
gcc-3.4.6 10855 12977
gcc-4.0.4 11088 11088
gcc-4.1.3 10973 10973
gcc-4.2.5 11183 11183
gcc-4.3.6 15501 17724
gcc-4.4.7 13337 15693
gcc-4.5.4 13162 15491
gcc-4.6.4 14846 17302
gcc-4.7.4 14187 16294
gcc-4.8.5 16591 18730
gcc-4.9.4 19582 21995
gcc-5.4.1 18294 22510
gcc-6.1.1 20487 25172
gcc-6.3.1 20487 25172
gcc-7.0.0 20351 31789
gcc-7.0.1 20351 24966
gcc-7.1.1 20383 24982
gcc-8.0.0 20686 25065

It seems whatever happened in early versions of gcc-7 has since
improved, and it probably was a bug since older and newer versions
create similar code size (I have not looked at the actual object code).

The 5K difference in gcc-5 and higher still seems like a lot. It would
also be interesting to look at the decompression performance of
this code witth the different compilers to see if it got better or worse.

Most likely, gcc got better at inlining and unrolling parts of the
algorithm, but sometimes an object file that doubles or triples in
size is an indication that the compiler did something really bad.

       Arnd

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] lib/lz4: make arrays static const, reduces object code size
  2017-09-22 19:17         ` Arnd Bergmann
@ 2017-09-22 19:38           ` Arnd Bergmann
  2017-09-22 21:39           ` Arnd Bergmann
  2017-09-23  1:33           ` Joe Perches
  2 siblings, 0 replies; 11+ messages in thread
From: Arnd Bergmann @ 2017-09-22 19:38 UTC (permalink / raw)
  To: Joe Perches
  Cc: Colin Ian King, Christophe JAILLET, Sven Schmidt, Andrew Morton,
	kernel-janitors, Linux Kernel Mailing List

On Fri, Sep 22, 2017 at 9:17 PM, Arnd Bergmann <arnd@arndb.de> wrote:

>              after    before
> gcc-3.4.6 10855 12977
> gcc-4.0.4 11088 11088
> gcc-4.1.3 10973 10973
> gcc-4.2.5 11183 11183
> gcc-4.3.6 15501 17724
> gcc-4.4.7 13337 15693
> gcc-4.5.4 13162 15491
> gcc-4.6.4 14846 17302
> gcc-4.7.4 14187 16294
> gcc-4.8.5 16591 18730
> gcc-4.9.4 19582 21995
> gcc-5.4.1 18294 22510
> gcc-6.1.1 20487 25172
> gcc-6.3.1 20487 25172
> gcc-7.0.0 20351 31789
> gcc-7.0.1 20351 24966
> gcc-7.1.1 20383 24982
> gcc-8.0.0 20686 25065

Here is the same table for 32-bit ARM, showing results that roughly
match my expectations of what we should see here

4.3.6 15318 16574
4.4.7 13364 14388
4.5.4 12856 13912
4.6.4 14596 15124
4.7.4 14942 15850
4.8.5 15454 16074
4.9.4 15994 16742
5.4.1 15930 16686
6.3.1 17302 18154
7.0.0 18042 18958
7.0.1 18278 19098
7.1.1 18302 19106
8.0.0 19710 20490

      Arnd

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] lib/lz4: make arrays static const, reduces object code size
  2017-09-22 19:17         ` Arnd Bergmann
  2017-09-22 19:38           ` Arnd Bergmann
@ 2017-09-22 21:39           ` Arnd Bergmann
  2017-09-22 21:43             ` Colin Ian King
  2017-09-23  1:33             ` Joe Perches
  2017-09-23  1:33           ` Joe Perches
  2 siblings, 2 replies; 11+ messages in thread
From: Arnd Bergmann @ 2017-09-22 21:39 UTC (permalink / raw)
  To: Joe Perches
  Cc: Colin Ian King, Christophe JAILLET, Sven Schmidt, Andrew Morton,
	kernel-janitors, Linux Kernel Mailing List

On Fri, Sep 22, 2017 at 9:17 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Fri, Sep 22, 2017 at 7:21 PM, Joe Perches <joe@perches.com> wrote:
>> On Fri, 2017-09-22 at 09:48 +0200, Arnd Bergmann wrote:
>>> On Fri, Sep 22, 2017 at 1:11 AM, Colin Ian King
>
>>>    text    data     bss     dec     hex filename
>>>   18220     176       0   18396    47dc build/tmp/lib/lz4/lz4_decompress-after.o
>>>   22297       0       0   22297    5719 build/tmp/lib/lz4/lz4_decompress-before.o
>>
>> Perhaps not so much a gcc bug as an opportunity
>> for gcc to add an additional optimization.
>>
>> gcc would have to verify that the const array is
>> not initialized with some variable or argument like:
>>
>> int foo(int a)
>> {
>>         const int array[] = {1, a};
>>         ...
>> }
>
> It depends. With a 10KB different in .text size, my guess is that this
> is a case where gcc does the right optimization in principle, but
> fails to do what was intended in some corner cases.

I found the problem: "gcc -fsanitze=kernel-address --param asan-stack=1"
produces lots of expensive checks here with gcc-5 or higher.

Disabling it makes a big difference:

upstream:
gcc-7.0.0 --fsanitze=kernel-address --param asan-stack=1: 31789 bytes
gcc-7.0.0: 16535 bytes

patched:
gcc-7.0.0 --fsanitze=kernel-address --param asan-stack=1: 20351 bytes
gcc-7.0.0: 14490 bytes

      Arnd

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] lib/lz4: make arrays static const, reduces object code size
  2017-09-22 21:39           ` Arnd Bergmann
@ 2017-09-22 21:43             ` Colin Ian King
  2017-09-23  1:33             ` Joe Perches
  1 sibling, 0 replies; 11+ messages in thread
From: Colin Ian King @ 2017-09-22 21:43 UTC (permalink / raw)
  To: Arnd Bergmann, Joe Perches
  Cc: Christophe JAILLET, Sven Schmidt, Andrew Morton, kernel-janitors,
	Linux Kernel Mailing List

On 22/09/17 22:39, Arnd Bergmann wrote:
> On Fri, Sep 22, 2017 at 9:17 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>> On Fri, Sep 22, 2017 at 7:21 PM, Joe Perches <joe@perches.com> wrote:
>>> On Fri, 2017-09-22 at 09:48 +0200, Arnd Bergmann wrote:
>>>> On Fri, Sep 22, 2017 at 1:11 AM, Colin Ian King
>>
>>>>    text    data     bss     dec     hex filename
>>>>   18220     176       0   18396    47dc build/tmp/lib/lz4/lz4_decompress-after.o
>>>>   22297       0       0   22297    5719 build/tmp/lib/lz4/lz4_decompress-before.o
>>>
>>> Perhaps not so much a gcc bug as an opportunity
>>> for gcc to add an additional optimization.
>>>
>>> gcc would have to verify that the const array is
>>> not initialized with some variable or argument like:
>>>
>>> int foo(int a)
>>> {
>>>         const int array[] = {1, a};
>>>         ...
>>> }
>>
>> It depends. With a 10KB different in .text size, my guess is that this
>> is a case where gcc does the right optimization in principle, but
>> fails to do what was intended in some corner cases.
> 
> I found the problem: "gcc -fsanitze=kernel-address --param asan-stack=1"
> produces lots of expensive checks here with gcc-5 or higher.
> 
> Disabling it makes a big difference:
> 
> upstream:
> gcc-7.0.0 --fsanitze=kernel-address --param asan-stack=1: 31789 bytes
> gcc-7.0.0: 16535 bytes
> 
> patched:
> gcc-7.0.0 --fsanitze=kernel-address --param asan-stack=1: 20351 bytes
> gcc-7.0.0: 14490 bytes
> 
>       Arnd
> 

Nice catch!

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] lib/lz4: make arrays static const, reduces object code size
  2017-09-22 21:39           ` Arnd Bergmann
  2017-09-22 21:43             ` Colin Ian King
@ 2017-09-23  1:33             ` Joe Perches
  1 sibling, 0 replies; 11+ messages in thread
From: Joe Perches @ 2017-09-23  1:33 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Colin Ian King, Christophe JAILLET, Sven Schmidt, Andrew Morton,
	kernel-janitors, Linux Kernel Mailing List

On Fri, 2017-09-22 at 23:39 +0200, Arnd Bergmann wrote:
> On Fri, Sep 22, 2017 at 9:17 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> > On Fri, Sep 22, 2017 at 7:21 PM, Joe Perches <joe@perches.com> wrote:
> > > On Fri, 2017-09-22 at 09:48 +0200, Arnd Bergmann wrote:
> > > > On Fri, Sep 22, 2017 at 1:11 AM, Colin Ian King
> > > >    text    data     bss     dec     hex filename
> > > >   18220     176       0   18396    47dc build/tmp/lib/lz4/lz4_decompress-after.o
> > > >   22297       0       0   22297    5719 build/tmp/lib/lz4/lz4_decompress-before.o
> > > 
> > > Perhaps not so much a gcc bug as an opportunity
> > > for gcc to add an additional optimization.
> > > 
> > > gcc would have to verify that the const array is
> > > not initialized with some variable or argument like:
> > > 
> > > int foo(int a)
> > > {
> > >         const int array[] = {1, a};
> > >         ...
> > > }
> > 
> > It depends. With a 10KB different in .text size, my guess is that this
> > is a case where gcc does the right optimization in principle, but
> > fails to do what was intended in some corner cases.
> 
> I found the problem: "gcc -fsanitze=kernel-address --param asan-stack=1"
> produces lots of expensive checks here with gcc-5 or higher.
> 
> Disabling it makes a big difference:
> 
> upstream:
> gcc-7.0.0 --fsanitze=kernel-address --param asan-stack=1: 31789 bytes
> gcc-7.0.0: 16535 bytes
> 
> patched:
> gcc-7.0.0 --fsanitze=kernel-address --param asan-stack=1: 20351 bytes
> gcc-7.0.0: 14490 bytes

I think you are looking at a different issue.

There seems still a difference in size between
current and Colin's patch in compiled object size.



>       Arnd

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] lib/lz4: make arrays static const, reduces object code size
  2017-09-22 19:17         ` Arnd Bergmann
  2017-09-22 19:38           ` Arnd Bergmann
  2017-09-22 21:39           ` Arnd Bergmann
@ 2017-09-23  1:33           ` Joe Perches
  2 siblings, 0 replies; 11+ messages in thread
From: Joe Perches @ 2017-09-23  1:33 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Colin Ian King, Christophe JAILLET, Sven Schmidt, Andrew Morton,
	kernel-janitors, Linux Kernel Mailing List

On Fri, 2017-09-22 at 21:17 +0200, Arnd Bergmann wrote:
> On Fri, Sep 22, 2017 at 7:21 PM, Joe Perches <joe@perches.com> wrote:
> > On Fri, 2017-09-22 at 09:48 +0200, Arnd Bergmann wrote:
> > > On Fri, Sep 22, 2017 at 1:11 AM, Colin Ian King
> > >    text    data     bss     dec     hex filename
> > >   18220     176       0   18396    47dc build/tmp/lib/lz4/lz4_decompress-after.o
> > >   22297       0       0   22297    5719 build/tmp/lib/lz4/lz4_decompress-before.o
> > 
> > Perhaps not so much a gcc bug as an opportunity
> > for gcc to add an additional optimization.
> > 
> > gcc would have to verify that the const array is
> > not initialized with some variable or argument like:
> > 
> > int foo(int a)
> > {
> >         const int array[] = {1, a};
> >         ...
> > }
> 
> It depends. With a 10KB different in .text size, my guess is that this
> is a case where gcc does the right optimization in principle, but
> fails to do what was intended in some corner cases.

Maybe/maybe not.  
> I just cross-checked by building with clang, there the patch has
> no impact on code size, it is 24929 bytes with or without the patch.
> 
> Looking at other versions of (x86) gcc, I see .text sizes of
> 
>              after    before
> gcc-3.4.6 10855 12977
> gcc-4.0.4 11088 11088
> gcc-4.1.3 10973 10973
> gcc-4.2.5 11183 11183
> gcc-4.3.6 15501 17724

Interesting this was apparently deoptimized at version 4.3.

Glancing at the release notes doesn't seem to indicate
anything obvious.

https://gcc.gnu.org/gcc-4.3/changes.html

> gcc-4.4.7 13337 15693
> gcc-4.5.4 13162 15491
> gcc-4.6.4 14846 17302
> gcc-4.7.4 14187 16294
> gcc-4.8.5 16591 18730
> gcc-4.9.4 19582 21995
> gcc-5.4.1 18294 22510
> gcc-6.1.1 20487 25172
> gcc-6.3.1 20487 25172
> gcc-7.0.0 20351 31789
> gcc-7.0.1 20351 24966
> gcc-7.1.1 20383 24982
> gcc-8.0.0 20686 25065
> 
> It seems whatever happened in early versions of gcc-7 has since
> improved, and it probably was a bug since older and newer versions
> create similar code size (I have not looked at the actual object code).
> 
> The 5K difference in gcc-5 and higher still seems like a lot. It would
> also be interesting to look at the decompression performance of
> this code witth the different compilers to see if it got better or worse.

yup

> Most likely, gcc got better at inlining and unrolling parts of the
> algorithm, but sometimes an object file that doubles or triples in
> size is an indication that the compiler did something really bad.

yup[2]

cheers, Joe

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-09-23  1:33 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-09-21 22:19 [PATCH] lib/lz4: make arrays static const, reduces object code size Colin King
2017-09-21 23:09 ` Christophe JAILLET
2017-09-21 23:11   ` Colin Ian King
2017-09-22  7:48     ` Arnd Bergmann
2017-09-22 17:21       ` Joe Perches
2017-09-22 19:17         ` Arnd Bergmann
2017-09-22 19:38           ` Arnd Bergmann
2017-09-22 21:39           ` Arnd Bergmann
2017-09-22 21:43             ` Colin Ian King
2017-09-23  1:33             ` Joe Perches
2017-09-23  1:33           ` Joe Perches

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox