From: Christian Franke
Date: Thu, 10 Jul 2008 21:25:37 +0200
To: The development of GRUB 2
Subject: Re: Endianness macros capitalization
Message-ID: <48766231.9050006@t-online.de>
In-Reply-To: <1215626250.26246.9.camel@dv>

Pavel Roskin wrote:
> On Wed, 2008-07-09 at 14:50 +0200, Christian Franke wrote:
>
>> Result with the test script from my last mail:
>>
>> Debian gcc 4.1.2-7:
>> inline (portable)=357, inline (asm)=126, function=104
>>
>> Cygwin gcc 3.4.4:
>> inline (portable)=340, inline (asm)=124, function=96
>>
>> Function call is still better. The only candidate for inline is probably
>> grub_swap_bytes16().
>
> But we are getting closer!

But probably not close enough in the average case, see below :-)

>>> .... And if written properly, it could work with any of
>>> the registers that allow access to the lower two bytes (%eax, %ebx,
>>> %ecx and %edx), thus giving more flexibility to the optimizer.
>>
>> This would require support to access the Rl and Rh parts of eRx for each
>> R in [a-d]. Something like:
>>
>> asm (
>>   "xchg %0:l,%0:h\n"
>>   "roll $0x10,%0\n"
>>   "xchg %0:l,%0:h\n"
>>   : "=r"(_y) : "0"(_x) \
>> );
>>
>> Do gcc or gas provide a syntax to do this?
>
> Yes. That's %b0 and %h0. Use "=q" for all registers with "upper
> halves" (%ah-%dh).

Thanks for the info.

I tried this in the same script:

#define grub_swap_bytes32(x) \
({ \
  grub_uint32_t _x = (x), _y; \
  asm ( \
    "xchgb %b0,%h0\n" \
    "roll $0x10,%0\n" \
    "xchgb %b0,%h0\n" \
    : "=q"(_y) : "0"(_x) \
  ); \
  _y; \
})

The GCC optimizer does a good job of optimizing register use, but a
function call is still better:

Debian gcc 4.1.2-7:
inline (portable)=357, asm (%%eax)=126, asm (%0)=107, function=104

Cygwin gcc 3.4.4:
inline (portable)=340, asm (%%eax)=124, asm (%0)=104, function=96

Inline asm is only better in rare cases, e.g. with this test function:

type test(type *x)
{
  return func(x[0]) + func(x[1]) + ... + func(x[N]);
}

Result with Cygwin gcc 3.4.4:
N=0: asm=14, function=11
N=1: asm=28, function=32
N=2: asm=40, function=41
N=3: asm=52, function=51
N=4: asm=64, function=61
N=5: asm=76, function=72

To test which files would possibly be affected by any size optimization
of grub_swap_bytes*(), I've done a test compilation with these macros
replaced by dummies '(x)'. Only the size of the following modules changed:

affs.mod afs.mod amiga.mod apple.mod ata.mod hfs.mod hfsplus.mod
iso9660.mod jpeg.mod png.mod sfs.mod sun.mod xfs.mod

Remaining modules and kernel.img are identical.

Christian
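
PS: For anyone who wants to check that the xchgb/roll/xchgb sequence really is a 32-bit byte swap, here is a minimal standalone sketch. It is not the GRUB code: the names swap32_portable, swap32_asm and swap32_variants_agree are made up for this illustration, and it uses uint32_t instead of grub_uint32_t. It also assumes the "Q" constraint (a, b, c, d only) instead of "q", because on x86-64 "q" may pick a register that has no high-byte half, in which case %h0 would not assemble. On non-x86 targets the asm variant simply falls back to the portable one.

```c
#include <assert.h>
#include <stdint.h>

/* Portable byte swap: shifts and masks only, works on any target. */
static inline uint32_t swap32_portable(uint32_t x)
{
  return (x << 24) | ((x & 0xFF00u) << 8) | ((x >> 8) & 0xFF00u) | (x >> 24);
}

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
/* Inline asm variant (hypothetical name): swap the low two bytes,
   rotate by 16 bits, then swap the (new) low two bytes again.
   "Q" restricts the operand to %eax, %ebx, %ecx or %edx, the
   registers whose bits 8..15 are addressable as %ah..%dh. */
static inline uint32_t swap32_asm(uint32_t x)
{
  uint32_t y;
  __asm__ ("xchgb %b0,%h0\n\t"
           "roll $0x10,%0\n\t"
           "xchgb %b0,%h0"
           : "=Q"(y) : "0"(x) : "cc");
  return y;
}
#else
/* Assumption: no x86 inline asm available, reuse the portable version. */
static inline uint32_t swap32_asm(uint32_t x)
{
  return swap32_portable(x);
}
#endif

/* Returns 1 if both variants agree on a few sample values, else 0. */
int swap32_variants_agree(void)
{
  static const uint32_t samples[] = {
    0u, 1u, 0x12345678u, 0xDEADBEEFu, 0xFFFFFFFFu
  };
  unsigned i;

  for (i = 0; i < sizeof samples / sizeof samples[0]; i++)
    if (swap32_asm(samples[i]) != swap32_portable(samples[i]))
      return 0;
  return 1;
}

/* Aborts via assert() if either variant misbehaves. */
void swap32_self_test(void)
{
  assert(swap32_portable(0x12345678u) == 0x78563412u);
  assert(swap32_variants_agree());
}
```

Calling swap32_self_test() from a small main() is enough to exercise both variants. This only checks correctness; it says nothing about code size, which is what the numbers above compare.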