From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chris Ball
Subject: Re: [PATCH 1/3] mmc: initialize struct mmc_command at declaration time
Date: Thu, 14 Apr 2011 23:11:46 -0400
Message-ID:
References: <1302753463-31005-1-git-send-email-cjb@laptop.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Received: from void.printf.net ([89.145.121.20]:38493 "EHLO void.printf.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751666Ab1DODHO (ORCPT ); Thu, 14 Apr 2011 23:07:14 -0400
In-Reply-To: (Nicolas Pitre's message of "Thu, 14 Apr 2011 19:37:20 -0400 (EDT)")
Sender: linux-mmc-owner@vger.kernel.org
List-Id: linux-mmc@vger.kernel.org
To: Nicolas Pitre
Cc: linux-mmc@vger.kernel.org

Hi,

On Thu, Apr 14 2011, Nicolas Pitre wrote:
> Did you disassemble the resulting binary to make sure this is actually
> as performant?
>
> I'm asking because gcc used to do a horrible dumb job with such patterns
> where it would allocate two instances of the structure on the stack i.e.
> one for the named variable and one for the initializer, then fill the
> later with zeroes, and then call memcpy() to copy the initializer over
> to the named instance.

Nothing horrible as far as I can tell; in fact, the code size on ARM
(stripped mmc_core.ko) decreases by 216 bytes after the patchset here.

Here's a sample disassembly of one function on ARM with gcc-4.6.0, and
diff -y.
Explicit memset (which calls __memzero) on the left, and {0} initializer
(which calls memset) on the right:

00000698 :                               | 000006c8 :
698: e92d4010 push {r4, lr}              | 6c8: e92d4010 push {r4, lr}
69c: e2504000 subs r4, r0, #0            | 6cc: e24dd030 sub sp, sp, #48 ; 0x30
6a0: e24dd030 sub sp, sp, #48 ; 0x30     | 6d0: e1a04000 mov r4, r0
6a4: 059f0058 ldreq r0, [pc, #88] ; 704
6b4: e3530000 cmp r3, #0                 | 6e4: e3540000 cmp r4, #0
6b8: 1a000002 bne 6c8                    | 6f4: e5940000 ldr r0, [r4]
6c8: e1a0000d mov r0, sp                 | 6f8: e3500000 cmp r0, #0
6cc: e3a01030 mov r1, #48 ; 0x30         | 6fc: 1a000002 bne 70c
                                         | 700: e59f0030 ldr r0, [pc, #48] ; 738
6dc: e1a03803 lsl r3, r3, #16            | 70c: e59430c8 ldr r3, [r4, #200] ; 0xc8
6e0: e58d3004 str r3, [sp, #4]           | 710: e3a02003 mov r2, #3
6e4: e5940000 ldr r0, [r4]               | 714: e1a03803 lsl r3, r3, #16
6e8: e3a03015 mov r3, #21                | 718: e58d3004 str r3, [sp, #4]
6ec: e1a0100d mov r1, sp                 | 71c: e1a0100d mov r1, sp
6f0: e58d2000 str r2, [sp]               | 720: e3a03015 mov r3, #21
6f4: e58d3018 str r3, [sp, #24]          | 724: e58d2000 str r2, [sp]
6f8: ebfffffe bl 0                       | 728: e58d3018 str r3, [sp, #24]
6fc: e28dd030 add sp, sp, #48 ; 0x30     | 72c: ebfffffe bl 0
700: e8bd8010 pop {r4, pc}               | 730: e28dd030 add sp, sp, #48 ; 0x30
704: 0000001d .word 0x0000001d           | 734: e8bd8010 pop {r4, pc}
                                         > 738: 0000001d .word 0x0000001d

Thanks,

- Chris.
-- 
Chris Ball
One Laptop Per Child