From mboxrd@z Thu Jan 1 00:00:00 1970 From: Albert ARIBAUD Date: Thu, 18 Apr 2013 08:20:27 +0200 Subject: [U-Boot] [PATCH] crc32: Correct endianness of crc32 result In-Reply-To: References: <1365203470-9099-1-git-send-email-sjg@chromium.org> <20130406070429.381092002D9@gemini.denx.de> <20130417054047.9D5692001AD@gemini.denx.de> <20130417212326.4f803bfb@lilith> Message-ID: <20130418082027.4b5ea191@lilith> List-Id: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: u-boot@lists.denx.de Hi Simon, On Wed, 17 Apr 2013 13:59:48 -0700, Simon Glass wrote: > Hi Albert, > > On Wed, Apr 17, 2013 at 12:23 PM, Albert ARIBAUD > wrote: > > Hi Simon -- and sorry for the dupe. > > > > On Wed, 17 Apr 2013 11:28:07 -0700, Simon Glass > > wrote: > > > >> I tried using: > >> > >> #ifdef USE_HOSTCC > >> crc = htobe32(crc); > >> #else > >> crc = cpu_to_be32(crc); > >> #endif > >> memcpy(output, &crc, sizeof(crc)); > >> > >> > >> This is one instruction (4 bytes, 16%) smaller, but I suspect quite a > >> lot slower due to the overhead of a very small memcpy(). > >> > >> 43e2c1d8: e28d1008 add r1, sp, #8 > >> 43e2c1dc: e3a02004 mov r2, #4 > >> 43e2c1e0: e6bf0f30 rev r0, r0 > >> 43e2c1e4: e5210004 str r0, [r1, #-4]! > >> 43e2c1e8: e1a00004 mov r0, r4 > >> 43e2c1ec: eb001af7 bl 43e32dd0 > > > > How about replacing the memcpy with an explicit put_unaligned(), > > similar to what was done in > > > > http://www.mail-archive.com/u-boot at lists.denx.de/msg109555.html > > > > with get_unaligned()? The code will be longer than above, but shorter > > than the above plus the memcpy(), and faster too -- actually, I'm > > surprised that the compiler does not unroll the memcpy() on its own, > > considering the size argument is a constant. > > Do you mean like this? > > #ifdef USE_HOSTCC > crc = htobe32(crc); > memcpy(output, &crc, sizeof(crc)); > #else > crc = cpu_to_be32(crc); > put_unaligned(crc, (uint32_t *)output); > #endif > > This produces the same code as my original patch. If this is > acceptable then I will do that, although it doesn't really seem any > better. Wolfgang may not like it any more than put_unaligned_be32() as it builds upon it (and the disk patch could have used the be32 version as well). Personally, I think we should allow and use these... ... and work on optimizing their implementation for ARM; we should be able to reduce such put()s and get()s to a few instructions. And to avoid any misunderstanding, yes, I volunteer for the optimizing work. :) > Regards, > Simon > > [and sorry for my dup] Actually I'm the culprit. Amicalement, -- Albert.